Postmortem: How a C# 13 Null Reference Bug Caused 500 Errors for 5% of API Requests for 1 Hour
Executive Summary
On October 12, 2024, at 14:00 UTC, our team deployed a minor update to our core API service targeting .NET 9 (with C# 13 language features enabled) to add optional metadata to user profile responses. Within 5 minutes of deployment, we observed a 5% spike in 500 Internal Server Error responses for API requests, primarily affecting endpoints that return user profile data. The issue persisted for 1 hour until a rollback was completed at 15:00 UTC, after which error rates returned to baseline.
Investigation revealed the root cause was a C# 13 null reference bug introduced by a misconfigured nullable reference type annotation in a new helper method, which bypassed compiler null checks under specific runtime conditions. No user data was lost or corrupted, and all affected requests were retried successfully by client applications with exponential backoff.
Timeline
- 14:00 UTC: Deployment of API v2.8.1 to production, enabling C# 13 features including enhanced nullable reference type checks.
- 14:05 UTC: Monitoring alerts triggered for 500 error rate exceeding 1% threshold (baseline: 0.02%).
- 14:07 UTC: On-call engineer acknowledged alert, started investigating API logs.
- 14:15 UTC: Identified pattern: 500 errors only occurred for requests where user profile metadata was null, affecting ~5% of total traffic.
- 14:20 UTC: Isolated bug to
ProfileMetadataSerializerhelper class added in the deployment. - 14:35 UTC: Confirmed root cause: Nullable annotation mismatch causing runtime null reference exception.
- 14:40 UTC: Initiated rollback to API v2.8.0.
- 15:00 UTC: Rollback completed, error rates returned to baseline.
- 15:30 UTC: Post-rollback validation completed, all endpoints healthy.
Impact
Total affected requests: ~12,500 (5% of 250,000 hourly API requests).
Affected endpoints: /v2/users/{id}/profile, /v2/users/bulk-profiles.
No user data was modified, lost, or exposed. All client applications retried failed requests automatically, with no reported user-facing outages. 500 errors returned a standard JSON error payload, so no malformed responses were served.
Root Cause
The deployment introduced a new ProfileMetadataSerializer class to handle optional metadata fields for user profiles, compiled with C# 13's nullable reference type enhancements. The class had the following method:
public string? SerializeMetadata(UserProfile profile)
{
if (profile.Metadata is null)
return null;
return JsonSerializer.Serialize(profile.Metadata);
}
However, the calling code in the API controller incorrectly omitted a null check for the return value, assuming the method would never return null (despite the nullable return type annotation):
var metadataJson = _serializer.SerializeMetadata(userProfile);
// Bug: No null check before accessing metadataJson
var response = new ProfileResponse
{
UserId = userProfile.Id,
MetadataJson = metadataJson.ToLowerInvariant()
};
C# 13 introduced improved nullable flow analysis that would have flagged the unguarded access to the nullable metadataJson variable at compile time, but our team had not updated our CI rules to enable Nullable warnings as errors for C# 13 code. The developer also incorrectly assumed that the SerializeMetadata method would never return null, despite the explicit string? return type, leading to the unguarded access. The Metadata field was null for only 5% of user profiles, so the SerializeMetadata method returned null only for that subset, triggering the null reference exception when ToLowerInvariant() was called on the null string.
Remediation
- Immediately rolled back to the previous stable API version (v2.8.0) which did not include the new metadata feature.
-
Fixed the bug by adding a null check for
metadataJsonin the controller:
var metadataJson = _serializer.SerializeMetadata(userProfile); var response = new ProfileResponse { UserId = userProfile.Id, MetadataJson = metadataJson?.ToLowerInvariant() }; Updated the CI pipeline to set
TreatWarningsAsErrorstotruefor all C# 13 nullable reference type warnings.Re-deployed the fixed version (v2.8.2) at 16:30 UTC after full validation in staging.
Prevention Steps
- Enforce nullable reference type warnings as errors in all C# 13+ projects, with no exceptions for legacy code.
- Add mandatory null reference exception tests for all helper methods that return nullable types, using C# 13's
MemberNotNullorMemberNotNullWhenattributes where applicable. - Update code review checklists to require explicit null handling documentation for all nullable return values.
- Add a pre-deployment check that scans for nullable warning suppressions (e.g.,
#pragma warning disable nullable) and requires sign-off from a senior engineer. - Improve monitoring to alert on 500 error rate spikes for specific endpoints, rather than aggregate API error rates, to reduce time to detection.
Conclusion
This incident highlighted the importance of aligning CI pipeline configurations with new language features like C# 13's nullable enhancements. While nullable reference types reduce null reference bugs, they are only effective if compiler warnings are not suppressed. We have since updated all our pipelines to enforce nullable warnings as errors, and added additional guardrails to prevent similar issues in the future.
Top comments (0)