FlareCanary

Posted on Apr 14

How to Detect API Breaking Changes Before They Hit Production

#api #testing #monitoring #devops

Your API integration works today. Will it work tomorrow?

Most teams discover breaking API changes the hard way: a production incident, a customer complaint, or a Slack message that starts with "is anyone else seeing...?"

Here's a practical checklist for catching breaking changes before they reach production.

What counts as a "breaking change"?

Not every API change breaks your code. A useful severity model:

Breaking (immediate action required):

Field removed from response
Field type changed (integer → string, object → array)
Required field becomes absent
Enum value removed that your code relies on
Response structure reorganized (nested object flattened or vice versa)

Warning (investigate soon):

Nullable field that was never null starts returning null
New enum values your switch/case doesn't handle
Field value range changes (IDs go from 6 to 10 digits)
Date format changes (ISO 8601 date 2026-04-07 → Unix timestamp 1775520000)

Informational (usually safe):

New fields added to response
New optional parameters in request
New enum values added (if you have a default handler)

The dangerous ones are the warnings. They don't crash your app — they corrupt data silently.
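
One way to make the severity model concrete is a small lookup table. The sketch below is illustrative only: the change-kind labels (`field_removed`, `became_null`, and so on) are hypothetical names made up for this example, not the vocabulary of any particular tool.

```python
from enum import Enum


class Severity(Enum):
    BREAKING = "breaking"
    WARNING = "warning"
    INFO = "info"


# Hypothetical change-kind labels; a real diff tool would emit its own vocabulary.
SEVERITY_BY_KIND = {
    "field_removed": Severity.BREAKING,
    "type_changed": Severity.BREAKING,
    "required_field_absent": Severity.BREAKING,
    "enum_value_removed": Severity.BREAKING,
    "structure_reorganized": Severity.BREAKING,
    "became_null": Severity.WARNING,
    "value_range_changed": Severity.WARNING,
    "date_format_changed": Severity.WARNING,
    "field_added": Severity.INFO,
    "optional_param_added": Severity.INFO,
}


def classify(kind: str, has_default_handler: bool = False) -> Severity:
    """Map a detected change kind to a severity level."""
    if kind == "enum_value_added":
        # Only informational when the consumer handles unknown values explicitly.
        return Severity.INFO if has_default_handler else Severity.WARNING
    # Unrecognized kinds default to WARNING so a human takes a look.
    return SEVERITY_BY_KIND.get(kind, Severity.WARNING)
```

Note that a new enum value lands in two different buckets depending on whether your code has a default handler, which mirrors the lists above.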

The detection checklist

Layer 1: CI/CD (catch your own drift)

If you maintain OpenAPI specs:

[ ] Run spec diffing on every PR. Tools like oasdiff compare spec versions and flag breaking changes. Free, open source, GitHub Action available.
[ ] Enforce backwards compatibility in CI. Fail the build if a PR removes a field or changes a type in your public API spec.
[ ] Keep specs in sync with code. Generate the spec from a single source of truth (e.g., TypeSpec definitions, or Zod schemas via Zod-to-OpenAPI) so the spec cannot drift from the code.
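
To show what a backwards-compatibility check looks for, here is a toy sketch that compares two object schemas written in the OpenAPI style (`type` plus `properties`). It is not how oasdiff works internally and covers far fewer cases; the function name and the sample schemas are assumptions made for illustration.

```python
def breaking_changes(old: dict, new: dict, path: str = "") -> list[str]:
    """List removed fields and type changes between two object schemas."""
    problems = []
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})
    for name, old_field in old_props.items():
        field_path = f"{path}.{name}" if path else name
        if name not in new_props:
            problems.append(f"removed: {field_path}")
            continue
        new_field = new_props[name]
        if old_field.get("type") != new_field.get("type"):
            problems.append(
                f"type changed: {field_path} "
                f"({old_field.get('type')} -> {new_field.get('type')})"
            )
        elif old_field.get("type") == "object":
            # Same type on both sides: recurse into nested objects.
            problems.extend(breaking_changes(old_field, new_field, field_path))
    return problems


# Hypothetical before/after schemas for one response body.
old_spec = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "user": {"type": "object", "properties": {"email": {"type": "string"}}},
    },
}
new_spec = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "user": {"type": "object", "properties": {}},
    },
}

problems = breaking_changes(old_spec, new_spec)
# problems == ["type changed: id (integer -> string)", "removed: user.email"]
```

In a CI job you would exit with a non-zero status whenever `problems` is non-empty, which fails the build.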

Layer 2: Contract testing (catch integration drift)

[ ] Write consumer-driven contracts for critical integrations. Pact and PactFlow let you define what you expect from an API and verify it against the provider.
[ ] Accept that most third-party APIs won't run your contracts. Contract testing is powerful when both sides participate. For external APIs you don't control, you need runtime monitoring.

Layer 3: Live monitoring (catch what everything else misses)

This is where most teams have a gap:

[ ] Monitor your top 5 external API dependencies. Which APIs, if they changed silently, would cause the worst impact? Start there.
[ ] Use a tool that compares real responses to expected schemas. This catches drift even when providers don't announce changes.
[ ] Set up severity-based alerting. Breaking changes → immediate (Slack/PagerDuty). Warnings → daily digest. Info → weekly review.
[ ] Check baseline freshness. If your expected schema is 6 months old, a reported difference may reflect a stale baseline or drift in your own code rather than a recent API change.
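
The core of "compare real responses to expected schemas" can be sketched in a few lines: flatten a JSON response into a map of paths to type names, then diff it against a stored baseline. This is a minimal illustration under simplifying assumptions (arrays are sampled by their first element only, and the function names are made up here), not a description of how any specific monitoring tool is implemented.

```python
def infer_schema(value, path="$"):
    """Flatten a parsed JSON value into {path: type_name}."""
    if isinstance(value, dict):
        schema = {path: "object"}
        for key, child in value.items():
            schema.update(infer_schema(child, f"{path}.{key}"))
        return schema
    if isinstance(value, list):
        schema = {path: "array"}
        if value:
            # Simplification: only the first element is inspected.
            schema.update(infer_schema(value[0], f"{path}[]"))
        return schema
    if value is None:
        return {path: "null"}
    if isinstance(value, bool):  # must come before int: bool is a subclass of int
        return {path: "boolean"}
    if isinstance(value, int):
        return {path: "integer"}
    if isinstance(value, float):
        return {path: "number"}
    return {path: "string"}


def diff_schemas(baseline: dict, current: dict) -> list[tuple[str, str]]:
    """Return (severity, message) pairs describing how current differs."""
    changes = []
    for path, old_type in baseline.items():
        if path not in current:
            changes.append(("breaking", f"{path} removed"))
        elif current[path] == "null" and old_type != "null":
            changes.append(("warning", f"{path} became null"))
        elif current[path] != old_type:
            changes.append(("breaking", f"{path} changed {old_type} -> {current[path]}"))
    for path in sorted(current.keys() - baseline.keys()):
        changes.append(("info", f"{path} added"))
    return changes


baseline = infer_schema({"id": 42, "address": "1 Main St"})
current = infer_schema({"id": "txn_42", "address": None, "eta": 3})
changes = diff_schemas(baseline, current)
# [("breaking", "$.id changed integer -> string"),
#  ("warning", "$.address became null"),
#  ("info", "$.eta added")]
```

The severity tag on each change is what you would route on: breaking to Slack or PagerDuty, warnings to a daily digest, info to a weekly review.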

Tools for live monitoring:

FlareCanary — polls endpoints on a schedule, compares against learned baselines or OpenAPI specs, classifies changes by severity. Free for 5 endpoints.
API Drift Alert — similar concept, enterprise pricing ($149/mo+).
Rumbliq — general monitoring platform with basic JSON diffing.

Layer 4: AI agent dependencies

If your application uses AI agents that call external tools (MCP servers, function calling):

[ ] Monitor tool schemas, not just tool availability. An MCP server returning 200 OK doesn't mean the tool schema hasn't changed.
[ ] Watch for parameter renames and type changes. LLMs tend to fail silently here: they'll keep passing the old parameter name and interpret empty results as "no data" rather than "wrong schema."
[ ] Track tool catalog changes. Tools being added or removed from an MCP server changes what your agent can do.
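
A rough sketch of catalog monitoring: snapshot the tool list, fingerprint each tool's input schema, and diff snapshots over time. This assumes each tool entry is a dict with `name` and `inputSchema` keys, as in an MCP `tools/list` result; fetching the list from the server is left out, and the function names and sample tools are hypothetical.

```python
import hashlib
import json


def fingerprint(tool: dict) -> str:
    """Hash a tool's input schema in canonical form (sorted keys)."""
    canonical = json.dumps(tool.get("inputSchema", {}), sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_catalog(baseline: list[dict], current: list[dict]) -> dict:
    """Report tools that were removed, added, or whose schema changed."""
    old = {tool["name"]: fingerprint(tool) for tool in baseline}
    new = {tool["name"]: fingerprint(tool) for tool in current}
    return {
        "removed": sorted(old.keys() - new.keys()),
        "added": sorted(new.keys() - old.keys()),
        "schema_changed": sorted(
            name for name in old.keys() & new.keys() if old[name] != new[name]
        ),
    }


# Hypothetical snapshots taken a day apart.
query_schema = {"type": "object", "properties": {"query": {"type": "string"}}}
yesterday = [
    {"name": "search_documents", "inputSchema": query_schema},
    {"name": "get_user", "inputSchema": {"type": "object", "properties": {"id": {"type": "integer"}}}},
]
today = [
    {"name": "query_documents", "inputSchema": query_schema},
    {"name": "get_user", "inputSchema": {"type": "object", "properties": {"user_id": {"type": "integer"}}}},
]

report = diff_catalog(yesterday, today)
# {"removed": ["search_documents"], "added": ["query_documents"], "schema_changed": ["get_user"]}
```

A rename shows up as one removal plus one addition, and a parameter rename shows up as a schema change on a tool that is otherwise still available.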

Common drift patterns (real examples)

The silent type change:
A payment provider changes transaction_id from integer to string (prefixed: txn_12345). Your code casts it to int, silently gets 0 in a loosely typed language, and attributes the payment to transaction ID 0.
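
A defensive sketch for this case: refuse to coerce at the boundary, so a type change raises immediately instead of producing a plausible-looking value. The helper name is hypothetical.

```python
def parse_transaction_id(raw) -> int:
    """Accept only real integers; reject anything that would need coercion."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(raw, int) and not isinstance(raw, bool):
        return raw
    raise TypeError(
        f"transaction_id has unexpected type {type(raw).__name__}: {raw!r}"
    )
```

With this in place, `parse_transaction_id(12345)` returns `12345`, while `parse_transaction_id("txn_12345")` raises a `TypeError` you can alert on.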

The nullable surprise:
A geocoding API starts returning null for formatted_address on ambiguous queries. Your UI renders "null" as literal text.

The enum expansion:
A shipping API adds status "returned_to_sender". Your switch statement falls through to default, which marks the shipment as delivered.
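
One way to avoid this, sketched with a made-up status vocabulary (only "returned_to_sender" comes from the example): map known statuses explicitly and treat anything else as an error rather than a default.

```python
# Hypothetical status-to-state mapping for illustration.
STATE_BY_STATUS = {
    "in_transit": "open",
    "delivered": "closed",
}


class UnknownStatusError(ValueError):
    """Raised when the API sends a status this code has never seen."""


def shipment_state(status: str) -> str:
    try:
        return STATE_BY_STATUS[status]
    except KeyError:
        # Fail loudly instead of guessing; the caller can alert and park the record.
        raise UnknownStatusError(f"unhandled shipment status: {status!r}") from None
```

The trade-off is that a new status now interrupts processing for that shipment, which is usually preferable to recording it as delivered.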

The nested restructure:
A social API moves user.profile.avatar_url to user.avatar_url. Your code traverses the old path and silently gets undefined.
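
A small sketch of strict traversal, assuming the response has already been parsed into nested dicts; the helper name and sample payloads are hypothetical. A missing segment raises with the full path instead of quietly yielding an empty value.

```python
def require_path(data: dict, dotted_path: str):
    """Walk a dotted path through nested dicts, raising if any segment is missing."""
    current = data
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            raise KeyError(f"missing {key!r} while reading {dotted_path!r}")
        current = current[key]
    return current


old_payload = {"user": {"profile": {"avatar_url": "https://example.com/a.png"}}}
new_payload = {"user": {"avatar_url": "https://example.com/a.png"}}

avatar = require_path(old_payload, "user.profile.avatar_url")
# avatar == "https://example.com/a.png"
# require_path(new_payload, "user.profile.avatar_url") raises KeyError
```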

The MCP tool rename:
An MCP server renames search_documents to query_documents. Your agent calls the old name, gets a "tool not found" error, and tells the user "I couldn't find any documents" instead of surfacing the real issue.

The minimum viable monitoring setup

If you do nothing else:

List your 5 most critical external API dependencies
Set up daily schema checks on those endpoints (FlareCanary's free tier covers exactly this)
Route breaking changes to Slack or email
Review warnings weekly

This takes 10 minutes to set up and catches the most impactful drift before it becomes a production incident.

FlareCanary monitors REST APIs and MCP servers for schema drift. Free tier, no credit card required.

DEV Community