Service Disruption - maps and some UI elements are unavailable
Incident Report for Auvik Networks Inc.
Postmortem

Service Disruption - Maps and data not being populated in the UI

Root Cause Analysis

Duration of incident

Discovered: Oct 26, 2024 13:13 UTC
Resolved: Oct 26, 2024 17:59 UTC
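
Taken together, the two timestamps above span 4 hours 46 minutes. A minimal sketch of that arithmetic follows; the variable names are illustrative and not part of any Auvik tooling.

```python
from datetime import datetime, timezone

# Timestamps copied from the "Duration of incident" section (both UTC).
discovered = datetime(2024, 10, 26, 13, 13, tzinfo=timezone.utc)
resolved = datetime(2024, 10, 26, 17, 59, tzinfo=timezone.utc)

# Subtracting two aware datetimes yields a timedelta.
duration = resolved - discovered

# Break the total into whole hours and leftover minutes.
hours, remainder = divmod(int(duration.total_seconds()), 3600)
minutes = remainder // 60

summary = f"{hours}h {minutes}m"
print(summary)  # 4h 46m
```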

Cause

Requests to the permission cache prevented data from syncing properly across clusters.

Effect

Key product features, like map functionality, became unavailable to users.

Action taken

All times in UTC

Oct 26, 2024

11:00 Planned upgrade of backend data started.

13:13 Issues were noted with permissions that affected UI components.

13:30 Confirmed issues with Maps during the post-upgrade check.

14:24 Attempted to restart services to enable UI updates.

15:30 Decided to perform a complete restart of services to clear any lingering issues.

15:47 Backend services were restarted, with additional care taken to ensure a clean restart.

17:59 Service fully restored.

Future consideration(s)

  • Parallelize investigations and timebox responses to avoid prolonged troubleshooting when a complete restart could resolve the issue.
  • Improve upgrade and detection protocols to catch errors earlier.
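
As an illustration of the first consideration, the following is a minimal sketch of running several investigations in parallel under a shared timebox and escalating to a full restart if none succeeds in time. Everything here is hypothetical: `run_timeboxed`, the investigation callables, and the "fixed"/"escalate" outcomes are assumptions made for the example, not Auvik's actual incident tooling.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_timeboxed(investigations, timebox_seconds):
    """Run hypothetical investigation callables in parallel.

    `investigations` maps a name to a zero-argument callable that returns
    True if it resolved the issue. Returns ("fixed", name) if one succeeds
    within the timebox, otherwise ("escalate", None) to signal that a
    complete restart should be started.
    """
    pool = ThreadPoolExecutor(max_workers=max(1, len(investigations)))
    futures = {pool.submit(fn): name for name, fn in investigations.items()}
    deadline = time.monotonic() + timebox_seconds
    pending = set(futures)
    try:
        while pending:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break  # timebox exhausted
            done, pending = wait(
                pending, timeout=remaining, return_when=FIRST_COMPLETED
            )
            for fut in done:
                # A failed or unsuccessful investigation is simply skipped.
                if fut.exception() is None and fut.result():
                    return ("fixed", futures[fut])
        return ("escalate", None)
    finally:
        # Do not block on still-running work; drop anything not yet started.
        pool.shutdown(wait=False, cancel_futures=True)


if __name__ == "__main__":
    outcome = run_timeboxed(
        {
            "restart_ui_services": lambda: False,    # hypothetical step
            "slow_cache_audit": lambda: time.sleep(0.5) or True,
        },
        timebox_seconds=0.05,
    )
    print(outcome)
```

In this sketch the caller decides what "escalate" means; the point is only that the deadline is fixed up front, so troubleshooting cannot silently run past it. Note that `cancel_futures` requires Python 3.9 or later, and threads already running are not interrupted.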
Posted Nov 01, 2024 - 12:10 EDT

Resolved
The source of the disruption has been resolved. Services have been fully restored.

A Root Cause Analysis (RCA) will follow after a full review has been completed.
Posted Oct 26, 2024 - 13:57 EDT
Update
We are continuing to monitor the system as tenants become available. We estimate that all tenants will be fully operational within the hour and will provide an update accordingly.
Posted Oct 26, 2024 - 13:23 EDT
Monitoring
The system has been restarted and services are coming back online. Tenants are starting up and becoming available. We are monitoring the system and will provide an update in 30 minutes.
Posted Oct 26, 2024 - 12:50 EDT
Identified
We’ve identified the source of the service disruption with maps and other UI elements. We will be restarting the system to recover from this incident.
Posted Oct 26, 2024 - 11:32 EDT
Investigating
We’re experiencing a disruption in which network topology and select UI data are not populating in the Auvik UI. Some data may be unavailable. We will continue to provide updates as they become available.
Posted Oct 26, 2024 - 10:51 EDT
This incident affected: Network Mgmt (us1.my.auvik.com, us2.my.auvik.com, us3.my.auvik.com, us4.my.auvik.com, eu1.my.auvik.com, eu2.my.auvik.com, au1.my.auvik.com, ca1.my.auvik.com, us5.my.auvik.com, us6.my.auvik.com).