
Date: Oct 27, 2025
Date of Incident: Oct 20, 2025
Description: RCA for Service Degradation Linked to AWS US-EAST-1 Regional Disruption
Summary:
On October 20, 2025, between approximately 07:00 UTC and 10:00 UTC, the JumpCloud platform experienced significant performance degradation. This primarily affected the responsiveness of our core APIs, administrative console access, user console access, and single sign-on (SSO) functions for customers. It also resulted in failed trial creation, disrupted user updates, and the potential loss of some Directory Insights events. Residual performance degradation, affecting services like our Privileged Access Management (PAM) feature and specific Support Portal APIs, persisted until approximately 17:00 UTC, at which point all services were fully restored.
The degradation was not caused by a failure within the JumpCloud platform code or infrastructure configuration; it was a direct consequence of a severe, cascading failure within the Amazon Web Services (AWS) US-EAST-1 region. Although many of our services are deployed across multiple Availability Zones (AZs) within the region for resilience, the AWS issue impacted fundamental regional services and compromised inter-AZ communication, preventing some of our standard failover mechanisms from operating successfully. Services returned to full operational status after AWS reported that the impacted foundational services had stabilized.
Root Cause:
Based on AWS's post-incident analysis, the service disruption was not a single event but a sequence of cascading failures across three fundamental AWS services.
All affected service teams were paged by our monitoring and alerting systems, and our incident management team was engaged to coordinate efforts and assess areas where we could throttle traffic to stabilize remaining capacity.
Corrective Actions / Risk Mitigation: