This article was originally published on Jo4 Blog.
We found three authentication bugs in production. Not from penetration testing. Not from a security audit. From a single user saying "I can't log in sometimes."
All three bugs were interconnected. Fixing one revealed the next. We shipped the fix in a single commit because pulling on one thread unraveled the whole chain.
Here's each bug, why it existed, and how we fixed it.
Bug 1: The 405 That Shouldn't Exist
Symptom: Sentry alerts showing HttpRequestMethodNotSupportedException — HTTP 405 "Method Not Allowed" — on endpoints that absolutely accept the methods being used.
Investigation: The stack traces pointed at bot traffic. Scanners probing random paths with random HTTP methods. PROPFIND /admin. OPTIONS /api/v1/protected/users. TRACE /oauth/token.
These should return 404 or be handled gracefully. Instead, they were hitting our impersonation filter, which assumed any request reaching it was a valid authenticated request. When a PROPFIND request for a path that only accepts GET made it past the filter, Spring threw HttpRequestMethodNotSupportedException, and nothing in our global exception handler was set up to catch it.
The fix: Add HttpRequestMethodNotSupportedException to our global exception handler:
@ExceptionHandler(HttpRequestMethodNotSupportedException.class)
public ResponseEntity<ErrorResponse> handleMethodNotAllowed(
        HttpRequestMethodNotSupportedException ex) {
    return ResponseEntity.status(HttpStatus.METHOD_NOT_ALLOWED)
            .body(ErrorResponse.of("Method not allowed"));
}
Simple. But finding it required understanding that our filter was letting garbage requests through to the controller layer.
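To make the 404-versus-405 distinction concrete, here is a minimal plain-Java sketch of the decision a dispatcher makes for each request. It is not Spring's implementation: MethodAwareRouter, SketchResponse, and the registered route are hypothetical names and values chosen for illustration. The one detail taken from the HTTP specification is that a 405 response is expected to carry an Allow header listing the methods the path does support.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for an HTTP response: a status code plus an optional Allow header.
final class SketchResponse {
    final int status;
    final String allow; // null when no Allow header is sent

    SketchResponse(int status, String allow) {
        this.status = status;
        this.allow = allow;
    }
}

// Hypothetical route table. This models the decision, not Spring's dispatcher internals.
final class MethodAwareRouter {
    private final Map<String, Set<String>> routes = new HashMap<>();

    void register(String path, String... methods) {
        // TreeSet keeps the methods sorted so the Allow header is stable.
        routes.put(path, new TreeSet<>(Set.of(methods)));
    }

    SketchResponse dispatch(String method, String path) {
        Set<String> allowed = routes.get(path);
        if (allowed == null) {
            return new SketchResponse(404, null); // unknown path
        }
        if (!allowed.contains(method)) {
            // Known path, unsupported method: 405 plus the methods that would work.
            return new SketchResponse(405, String.join(", ", allowed));
        }
        return new SketchResponse(200, null);
    }

    public static void main(String[] args) {
        MethodAwareRouter router = new MethodAwareRouter();
        router.register("/api/v1/protected/users", "GET"); // assumed route, for illustration
        SketchResponse probe = router.dispatch("PROPFIND", "/api/v1/protected/users");
        System.out.println(probe.status + " Allow: " + probe.allow);
    }
}
```

The point of the sketch is that a scanner's request ends in one of two boring outcomes, a 404 or a 405, and neither should surface as an unhandled exception.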
Which led us to Bug 2.
Bug 2: The Filter That Ran Too Early
Symptom: Admin impersonation — a feature that lets support staff act as a specific user — worked sometimes. Other times it silently failed and the admin saw their own account instead of the target user.
The architecture: We have an ImpersonationFilter that checks for an X-Impersonate-User header. If present and the caller is an admin, it swaps the security context to the target user.
The problem: The filter executed before our user sync filter.
In our Auth0 integration, the first request from a new Auth0 user triggers a "sync" — we look up the Auth0 subject in our database and create a local user record if one doesn't exist. This happens in Auth0UserSyncFilter.
The filter chain looked like this:
Request → ImpersonationFilter → Auth0UserSyncFilter → Controller
When an admin's first request included the impersonation header:
- ImpersonationFilter runs. Tries to look up the admin user. But the admin hasn't been synced yet. Lookup returns null. Impersonation silently fails.
- Auth0UserSyncFilter runs. Creates the admin user record.
- Controller runs. Admin sees their own account, not the target.
On the second request, the admin user exists. Impersonation works. Hence "works sometimes."
The fix: Reorder the filters:
Request → Auth0UserSyncFilter → ImpersonationFilter → Controller
The sync must happen before any filter that depends on the user existing in the database. We enforced this with explicit @Order annotations:
@Order(1) // Runs first — ensures user exists
public class Auth0UserSyncFilter extends OncePerRequestFilter { ... }
@Order(2) // Runs second — can now look up the user
public class ImpersonationFilter extends OncePerRequestFilter { ... }
Custom filters don't get a guaranteed relative order unless you declare one, either with @Order or, inside the Spring Security chain, with addFilterBefore/addFilterAfter. If you register filters without explicit ordering, you're at the mercy of component scanning order, which can vary between environments.
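The effect of the ordering can be reproduced without Spring. The following is a minimal plain-Java sketch, not the real filters: SketchRequest, SketchFilter, SyncStep, and ImpersonationStep are hypothetical stand-ins, the local users table is modeled as a Set of Auth0 subjects, and the admin role check is left out to keep the example short.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical request state passed along the chain (not a Spring type).
class SketchRequest {
    final String callerSub;       // Auth0 subject of the caller
    final String impersonateUser; // value of X-Impersonate-User, or null
    String actingAs;              // who the controller will see

    SketchRequest(String callerSub, String impersonateUser) {
        this.callerSub = callerSub;
        this.impersonateUser = impersonateUser;
        this.actingAs = callerSub;
    }
}

interface SketchFilter {
    int order();
    void apply(SketchRequest request, Set<String> localUsers);
}

class SyncStep implements SketchFilter {
    private final int order;
    SyncStep(int order) { this.order = order; }
    public int order() { return order; }
    public void apply(SketchRequest request, Set<String> localUsers) {
        localUsers.add(request.callerSub); // create the local record if missing
    }
}

class ImpersonationStep implements SketchFilter {
    private final int order;
    ImpersonationStep(int order) { this.order = order; }
    public int order() { return order; }
    public void apply(SketchRequest request, Set<String> localUsers) {
        // Mirrors the silent failure: a caller with no local record is not impersonated.
        if (request.impersonateUser != null && localUsers.contains(request.callerSub)) {
            request.actingAs = request.impersonateUser;
        }
    }
}

class FilterOrderDemo {
    // Runs the steps sorted by order value and returns who the controller sees.
    static String handle(List<SketchFilter> filters, Set<String> localUsers, SketchRequest request) {
        List<SketchFilter> sorted = new ArrayList<>(filters);
        sorted.sort(Comparator.comparingInt(SketchFilter::order));
        for (SketchFilter filter : sorted) {
            filter.apply(request, localUsers);
        }
        return request.actingAs;
    }

    public static void main(String[] args) {
        Set<String> localUsers = new HashSet<>();
        List<SketchFilter> buggy = List.of(new ImpersonationStep(1), new SyncStep(2));
        // Same admin, same header, two requests in a row against the buggy order.
        System.out.println(handle(buggy, localUsers, new SketchRequest("auth0|admin", "target-user")));
        System.out.println(handle(buggy, localUsers, new SketchRequest("auth0|admin", "target-user")));
    }
}
```

With impersonation ordered first, the first request falls back to the admin's own identity and only the second one impersonates. Swapping the order values so the sync step runs first makes the very first request behave correctly.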
Bug 3: The Race Condition in User Creation
Symptom: Intermittent DataIntegrityViolationException — duplicate key constraint on the users table — during peak traffic.
Root cause: The Auth0 user sync had a classic check-then-act race condition:
// Thread A                          // Thread B
user = findByAuth0Sub(sub);          user = findByAuth0Sub(sub);
// user is null                      // user is null
user = createUser(sub, email);       user = createUser(sub, email);
// SUCCESS                           // DataIntegrityViolationException!
Two concurrent requests from the same user (common on app startup — the mobile app fires multiple API calls simultaneously) both see "user doesn't exist" and both try to create the record. One succeeds. One hits the unique constraint.
The fix: Catch the constraint violation and retry the lookup:
UserEntity syncUser(String auth0Sub, String email) {
    UserEntity existing = userRepository.findByAuth0Sub(auth0Sub);
    if (existing != null) {
        return existing;
    }
    try {
        return createNewUser(auth0Sub, email);
    } catch (DataIntegrityViolationException e) {
        // Another thread created the user between our check and insert.
        // Just fetch the record they created.
        // Note: a failed insert can leave a surrounding transaction unusable,
        // so this assumes the insert is isolated (for example, in its own transaction).
        UserEntity raced = userRepository.findByAuth0Sub(auth0Sub);
        if (raced != null) {
            return raced;
        }
        throw e; // Genuine constraint violation, not a race
    }
}
This is an optimistic approach to concurrency. Instead of acquiring a lock before the check (pessimistic), we let the race happen and recover from the loser's exception. It relies on the unique constraint on the users table to reject the second insert. It's cheaper under normal load (no locking overhead) and handles the edge case gracefully.
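The race and the recovery can be reproduced deterministically in plain Java. This is a sketch under stated assumptions, not the production code: InMemoryUsers stands in for the users table (a ConcurrentHashMap whose putIfAbsent plays the role of the unique constraint), DuplicateKeyError stands in for DataIntegrityViolationException, and a CyclicBarrier forces both threads past the check before either one inserts.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for DataIntegrityViolationException.
class DuplicateKeyError extends RuntimeException {
    DuplicateKeyError(String key) { super("duplicate key: " + key); }
}

// In-memory stand-in for the users table, unique on the Auth0 subject.
class InMemoryUsers {
    private final Map<String, Integer> idBySub = new ConcurrentHashMap<>();
    private final AtomicInteger nextId = new AtomicInteger();
    final AtomicInteger rejectedInserts = new AtomicInteger();

    Integer find(String sub) { return idBySub.get(sub); }

    Integer insert(String sub) {
        Integer id = nextId.incrementAndGet();
        // putIfAbsent is atomic, so exactly one concurrent insert per key wins.
        if (idBySub.putIfAbsent(sub, id) != null) {
            rejectedInserts.incrementAndGet();
            throw new DuplicateKeyError(sub);
        }
        return id;
    }

    int size() { return idBySub.size(); }
}

class SyncRaceDemo {
    // Same shape as syncUser above, plus a barrier that makes the race happen on every run.
    static Integer syncUser(InMemoryUsers users, CyclicBarrier afterCheck, String sub) throws Exception {
        Integer existing = users.find(sub);
        afterCheck.await(); // both threads have now finished the "check"
        if (existing != null) {
            return existing;
        }
        try {
            return users.insert(sub);
        } catch (DuplicateKeyError e) {
            Integer raced = users.find(sub); // the other thread won; use its record
            if (raced != null) {
                return raced;
            }
            throw e;
        }
    }

    // Runs two concurrent syncs for the same subject and returns both resulting ids.
    static Integer[] runTwoConcurrentSyncs(InMemoryUsers users, String sub) {
        CyclicBarrier afterCheck = new CyclicBarrier(2);
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Callable<Integer> task = () -> syncUser(users, afterCheck, sub);
            Future<Integer> a = pool.submit(task);
            Future<Integer> b = pool.submit(task);
            return new Integer[] { a.get(), b.get() };
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        InMemoryUsers users = new InMemoryUsers();
        Integer[] ids = runTwoConcurrentSyncs(users, "auth0|example");
        System.out.println("ids match: " + ids[0].equals(ids[1]) + ", rows: " + users.size());
    }
}
```

Both callers end up with the same record even though one insert is rejected. The recovery only works because something atomic (here putIfAbsent, in the real system the unique constraint) decides the winner.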
Why All Three Were Connected
The 405 errors drew attention to our filter chain. Investigating the filter chain revealed the ordering bug. Fixing the ordering bug and putting more load on the sync path exposed the race condition.
It's a common pattern in production debugging: the bug you're investigating isn't the bug that matters. It's the thread that leads you to the real problem.
Lessons Learned
- Handle every HTTP method in your exception handler. Bots send PROPFIND, TRACE, and PATCH to paths that don't support them. Don't let these bubble up as unhandled exceptions.
- Spring filter ordering is not implicit. If Filter B depends on state created by Filter A, use @Order to guarantee A runs first. Don't rely on component scan order; it varies.
- Check-then-act is a race condition. If two threads can execute the "check" simultaneously, they'll both proceed to "act." Use optimistic concurrency (catch + retry) or pessimistic locking (SELECT FOR UPDATE).
- Mobile apps create concurrent requests on startup. When the app opens, it often fires 3-5 API calls in parallel (user profile, notifications, config). If your user sync runs per-request, you will hit the race condition.
- One bug leads to another. Don't stop when you fix the surface issue. Ask: "Why did this request reach this code path in the first place?"
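For contrast with catch-and-retry, the startup burst from the list above can also be handled by making check and act a single atomic step. The sketch below is an in-process analogy only, not a substitute for a database-level guard when several application instances are running. AtomicUserSync and StartupBurstDemo are hypothetical names, and the burst size of 5 is taken from the upper end of the 3-5 calls mentioned above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory user sync keyed by Auth0 subject.
class AtomicUserSync {
    private final Map<String, Integer> idBySub = new ConcurrentHashMap<>();
    final AtomicInteger creations = new AtomicInteger();

    // computeIfAbsent applies the creation function at most once per key,
    // so there is no window between "check" and "act" inside this JVM.
    Integer syncUser(String sub) {
        return idBySub.computeIfAbsent(sub, key -> creations.incrementAndGet());
    }
}

class StartupBurstDemo {
    // Fires `calls` concurrent syncs for one subject, as a mobile app might on launch.
    static List<Integer> burst(AtomicUserSync sync, String sub, int calls) {
        CountDownLatch start = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(calls);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < calls; i++) {
                futures.add(pool.submit(() -> {
                    start.await(); // hold every call until all are queued
                    return sync.syncUser(sub);
                }));
            }
            start.countDown(); // release the burst
            List<Integer> ids = new ArrayList<>();
            for (Future<Integer> future : futures) {
                ids.add(future.get());
            }
            return ids;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        AtomicUserSync sync = new AtomicUserSync();
        List<Integer> ids = burst(sync, "auth0|example", 5);
        System.out.println("ids: " + ids + ", creations: " + sync.creations.get());
    }
}
```

The trade-off is the usual one: callers for the same key briefly wait on each other instead of one of them recovering from an exception afterwards.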
What's the most interconnected set of bugs you've found in production? Share the debugging chain in the comments.
Building jo4.io — a multi-tenant platform where auth bugs are never "just" auth bugs.