Authentication Service Failures

Incident Report for JumpCloud

Postmortem

Incident Report

Date: May 5, 2025

Date of Incident: Apr 30, 2025

Description: RCA for User Portal / SSO Failed Authentication

Summary:

On April 30th at 12:35 UTC our monitors detected a significant increase in errors across our User Console API endpoints. The increased errors manifested as failed authentication for new attempts to the User Portal, SSO and other services.  Existing connections to applications continued uninterrupted. At 12:41 UTC a formal incident was declared and multiple teams were paged. The issue was resolved at 12:58 UTC

Root Cause:

A state transition anomaly occurred during the credential rotation process for a database within our authentication infrastructure, resulting in connection failures by dependent services to that layer. Consequently, end-users experienced authentication errors when attempting to establish new sessions with specific applications.

The authentication service team was able to identify a failure with secondary credentials during the rotation and quickly failed back to valid secrets.

Corrective Actions / Risk Mitigation:

  1. Fail back to primary secrets - DONE
  2. Increased automation for secrets rotation at this layer - IN PROGRESS
Posted May 05, 2025 - 20:26 MDT

Resolved

This incident has been resolved.
Posted Apr 30, 2025 - 07:15 MDT

Monitoring

We have identified an issue that was affecting authentication and SSO and have implemented a fix. Services have recovered and we are monitoring.
Posted Apr 30, 2025 - 07:05 MDT
This incident affected: LDAP, SSO, and User Console.