Date: Apr 7, 2026
Date of Incident: Mar 30, 2026
Description: RCA for Directory Association Processing Delays
Summary:
Starting on March 30, 2026, at approximately 15:40 MDT, JumpCloud customers experienced significant delays in directory-related updates. Affected operations included password changes, user-to-group associations, and outbound provisioning to downstream systems. The root cause was a code deployment in our Devices service that inadvertently flooded a background processing queue with unpartitioned messages, creating a bottleneck that prevented updates from being processed in real time. The issue was fully resolved by 00:25 MDT on March 31, 2026.
What Happened:
The incident was caused by a change in how the JumpCloud agent retrieves software application configurations.
Resolution and Recovery:
Once the offending code was rolled back, the "tap" was turned off, and no further unpartitioned messages were added to the queue.
Because the bottleneck was caused by the lack of partitioning, scaling horizontally could not speed up processing of the existing backlog, since the unpartitioned messages could not be divided among additional workers. The team monitored queue throughput and determined that the safest and fastest path to recovery was to let the worker process the existing messages sequentially, rather than risk further disruption by manually manipulating the production queue.
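The effect described in the paragraph above can be illustrated with a minimal sketch. This is not JumpCloud's actual queue implementation: the partition count, the `org-N` key format, the hash-based assignment, and the function names are all assumptions chosen for illustration. It only shows why messages without a partition key pile up on a single partition, where extra workers cannot help.

```python
import hashlib
from collections import Counter
from typing import Iterable, Optional

NUM_PARTITIONS = 8  # hypothetical partition count, for illustration only


def partition_for(key: Optional[str], num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a message key to a partition index.

    Assumption: a message with no key always lands on partition 0,
    which stands in for an "unpartitioned" message.
    """
    if key is None:
        return 0
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def busiest_partition_load(
    keys: Iterable[Optional[str]], num_partitions: int = NUM_PARTITIONS
) -> int:
    """Return the number of messages on the most heavily loaded partition."""
    counts = Counter(partition_for(k, num_partitions) for k in keys)
    return max(counts.values())


# 10,000 messages without a key: every one lands on the same partition.
unpartitioned = [None] * 10_000

# 10,000 messages keyed by a hypothetical organization ID: load is spread out.
partitioned = [f"org-{i % 500}" for i in range(10_000)]

print("busiest partition, unpartitioned:", busiest_partition_load(unpartitioned))
print("busiest partition, partitioned:  ", busiest_partition_load(partitioned))
```

In this sketch, the worker that owns the single busy partition has to drain the entire backlog by itself, however many other workers are running. With keyed messages, the same volume is divided across partitions and can be processed in parallel.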
Corrective Actions:
To ensure this type of bottleneck does not occur again, we have committed to the following: