Discovered: Aug 25, 2025, 18:00 UTC
Resolved: Aug 29, 2025, 14:00 UTC
A configuration rollout unexpectedly generated a large number of configuration entries, which then propagated across tenants. Processing these entries created excessive background work and memory pressure in core services, leading to degraded performance, instability, and, in some cases, brief service crashes across clusters.
Customers experienced:
- Degraded performance and elevated error rates
- Intermittent instability and, in some cases, brief service interruptions
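The failure mode described above is a fan-out problem: one rollout multiplies into per-tenant configuration entries, and an unexpectedly large batch overwhelms downstream processing. The sketch below is a minimal illustration of that shape with a simple per-tenant cap; the names, structure, and threshold are assumptions for illustration only, not the actual rollout system.

```python
# Hypothetical sketch of the fan-out described above: a single rollout
# expands into per-tenant configuration entries, and an explicit cap
# halts the rollout before an unexpectedly large batch propagates.
# All names and limits here are illustrative, not the real system.
from dataclasses import dataclass

MAX_ENTRIES_PER_TENANT = 1_000  # assumed safety threshold, not a real value


@dataclass
class ConfigEntry:
    tenant_id: str
    key: str
    value: str


class RolloutHaltedError(RuntimeError):
    """Raised when a rollout would exceed the per-tenant entry cap."""


def expand_rollout(tenants: list[str], template: dict[str, str]) -> list[ConfigEntry]:
    """Fan a rollout template out into per-tenant entries, enforcing a cap."""
    entries: list[ConfigEntry] = []
    for tenant in tenants:
        tenant_entries = [
            ConfigEntry(tenant_id=tenant, key=k, value=v) for k, v in template.items()
        ]
        if len(tenant_entries) > MAX_ENTRIES_PER_TENANT:
            raise RolloutHaltedError(
                f"rollout would create {len(tenant_entries)} entries for {tenant}, "
                f"cap is {MAX_ENTRIES_PER_TENANT}"
            )
        entries.extend(tenant_entries)
    return entries


if __name__ == "__main__":
    # A small, well-behaved rollout passes the check.
    ok = expand_rollout(["tenant-a", "tenant-b"], {"feature.flag": "on"})
    print(f"generated {len(ok)} entries")
```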
All times are in UTC
08/25/2025
18:00 — Rollout halted after error rates increased.
19:00 — Targeted service restarts restored partial availability.
22:00 — Added backend capacity and began controlled rollouts.
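The 18:00 halt and the later controlled rollouts follow a common staged-rollout pattern: apply the change to a small cohort of tenants, watch error rates, and stop before wider propagation if they climb. The sketch below illustrates only that pattern; the cohort size, threshold, and callbacks are assumptions, not the actual rollout tooling.

```python
# Hypothetical sketch of a controlled, staged rollout: apply the change
# cohort by cohort and halt if the observed error rate rises.
# Names and thresholds are illustrative assumptions.
from typing import Callable, Sequence

ERROR_RATE_THRESHOLD = 0.02  # assumed 2% halt threshold


def staged_rollout(tenants: Sequence[str],
                   apply_to_tenant: Callable[[str], None],
                   current_error_rate: Callable[[], float],
                   cohort_size: int = 10) -> bool:
    """Roll out cohort by cohort; return False if halted on elevated errors."""
    for start in range(0, len(tenants), cohort_size):
        cohort = tenants[start:start + cohort_size]
        for tenant in cohort:
            apply_to_tenant(tenant)
        if current_error_rate() > ERROR_RATE_THRESHOLD:
            print(f"halting rollout after cohort ending at tenant {cohort[-1]}")
            return False
    return True
```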
08/26–08/28/2025
Continued staged rollouts with adjusted capacity.
Cleaned up configuration entries for affected tenants (illustrated in the sketch below).
Tuned resource allocations for read/permissioning services.
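The cleanup work referenced above amounted to a per-tenant removal of the excess configuration entries. A minimal sketch of that kind of pass follows, assuming a hypothetical store interface and deleting in small, throttled batches so the cleanup itself does not add load; all identifiers are illustrative.

```python
# Hypothetical sketch of a batched per-tenant cleanup of excess
# configuration entries, in the spirit of the 08/26-08/28 remediation.
# The store interface, batch size, and pause are assumptions.
import time
from typing import Iterable, Protocol


class ConfigStore(Protocol):
    def list_excess_entry_ids(self, tenant_id: str) -> list[str]: ...
    def delete_entries(self, entry_ids: Iterable[str]) -> None: ...


def cleanup_tenant(store: ConfigStore, tenant_id: str,
                   batch_size: int = 100, pause_s: float = 0.5) -> int:
    """Delete excess entries for one tenant in small batches to limit load."""
    entry_ids = store.list_excess_entry_ids(tenant_id)
    deleted = 0
    for start in range(0, len(entry_ids), batch_size):
        batch = entry_ids[start:start + batch_size]
        store.delete_entries(batch)
        deleted += len(batch)
        time.sleep(pause_s)  # throttle so cleanup does not add memory/CPU pressure
    return deleted
```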
08/29/2025
14:00 — All clusters stabilized; monitoring confirmed normal performance.