Date: 2022-06-27
Date of Incident: 2022-06-22
Description: RCA for intermittent increased error rates returned on some API endpoints
Summary:
At approximately 17:15 MDT on 2022-06-22, some JumpCloud customers experienced a decrease in success rates returned from some API endpoints. This decrease lasted until approximately 20:00 MDT on 2022-06-22.
Root Cause:
A unique race condition caused a failure in our secrets engine, which in turn prohibited some services from acquiring the necessary credentials that would have allowed them to start successfully. These services were not automatically removed from service due to a gap in health checks.
Corrective Actions / Risk Mitigation: