Discovered: Feb 28, 2025, 16:32 UTC
Resolved: Feb 28, 2025, 19:30 UTC
Overload of backend resources for services on the US4 cluster.
Tenants on the US4 cluster became inaccessible.
All times in UTC
02/28/2025
16:32 - Auvik Engineering discovers several non-responsive backends on the US4 cluster, which are causing some tenants to be unresponsive. Engineering begins investigating.
17:00 - Engineering attempts to revive the non-responsive backends.
17:28 - The cluster is in distress, with more backends starting to fail.
17:45 - Engineering restarts the entire cluster.
18:10-19:30 - Engineering monitors the cluster as it restarts and returns to full functionality. The incident is declared resolved at 19:30.