Date: 2022-06-27
Date of Incident: 2022-06-22
Description: RCA for intermittent increased error rates returned on some API endpoints
Summary:
At approximately 07:20 MDT on 2022-06-22, some JumpCloud customers experienced an intermittent decrease in success rates returned from some API endpoints. This behavior lasted until approximately 11:00 MDT on 2022-06-22.
Root Cause:
A sharp increase in container count saturated one host causing it to intermittently exceed specific kernel parameters. This caused some services dedicated to that host to experience delayed response times causing an increased error rate.
Corrective Actions / Risk Mitigation: