Last month, I watched a production incident unfold at a company I was consulting for. Their mobile app started crashing for roughly 30% of users. T...
Why do you manually write the OpenAPI spec? If there's drift, how do you consume new endpoints in the client?
I just have the BE generate it automatically so it's never out of date.
My FE deployment pipelines fetch the latest spec, then generate the client code and compile it. If there's a build error, the pipeline outputs a diff of the schema changes so it's immediately clear which field or endpoint changed (e.g. int to string would fail, while int to long would likely pass fine).
This works for native apps, Flutter, typescript Web projects etc.
You can do the same for third-party services, whether they publish an OpenAPI spec or an SDK.
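The build-time gate described above can be sketched roughly like this (a toy version; the function name and the SAFE_WIDENINGS table are my own, not from any real tool): compare two already-fetched OpenAPI-style schema maps and flag incompatible type changes for the build to fail on.

```python
# Type widenings that generated clients usually absorb without breaking.
SAFE_WIDENINGS = {("int32", "int64"), ("float", "double")}

def diff_schemas(old: dict, new: dict) -> list[str]:
    """Return human-readable diff lines; incompatible ones start with FAIL."""
    report = []
    for name, old_fields in old.items():
        new_fields = new.get(name)
        if new_fields is None:
            report.append(f"FAIL {name}: schema removed")
            continue
        for field, old_t in old_fields.items():
            new_t = new_fields.get(field)
            if new_t is None:
                report.append(f"FAIL {name}.{field}: field removed")
            elif new_t != old_t:
                tag = "OK" if (old_t, new_t) in SAFE_WIDENINGS else "FAIL"
                report.append(f"{tag} {name}.{field}: {old_t} -> {new_t}")
    return report

old = {"User": {"id": "int32", "name": "string"}}
new = {"User": {"id": "int64", "name": "integer"}}  # id widened, name changed
report = diff_schemas(old, new)
for line in report:
    print(line)
build_ok = not any(line.startswith("FAIL") for line in report)  # gate the deploy on this
```

A real pipeline would walk nested objects and arrays too, but the core idea is the same: anything not in an explicit allow-list of safe widenings fails the build.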
Great setup, auto-generated specs with client codegen and build-time validation is genuinely one of the strongest workflows out there. If you have that running end-to-end, you're ahead of 90% of teams I've worked with.
That said, here's where the problem still lives even with that pipeline:
This nails a problem I've been experiencing for years. The "green tests, broken API" scenario is especially brutal with third-party APIs you don't control: you can't even add the validation to your CI pipeline, because you're not the one deploying the changes.
One approach we've been exploring is continuous live monitoring: polling endpoints on a schedule and comparing the actual response structure against a learned baseline. Not just "are the expected fields present" but also "did any types change? Did nullability shift? Are there new fields that might indicate a structural migration?"
The severity classification piece is key too. Not every schema change is a fire. A new optional field is informational. A type change from string to string|null is a warning. A removed field is breaking. Without that classification, you drown in noise.
Curious whether your team found that the 23/47 drift cases were mostly additive (new fields) or mostly breaking (removed/changed fields)? In the data I've seen, the split tends to be ~60% additive / ~40% breaking-or-warning. Most changes aren't tragic, but still worth knowing about.
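The monitoring-plus-classification idea above can be sketched in a few lines (a toy version; the function names and severity labels are mine, not from any real product): infer a field-to-type map from a live JSON payload and classify drift against a stored baseline.

```python
def infer(payload: dict) -> dict:
    """Map each top-level field to a coarse type name ('null' for None)."""
    return {k: ("null" if v is None else type(v).__name__) for k, v in payload.items()}

def classify(baseline: dict, live: dict) -> list[tuple[str, str]]:
    """Compare a live field->type map against the baseline and grade each change."""
    events = []
    for field, t in baseline.items():
        if field not in live:
            events.append(("BREAKING", f"{field} removed"))
        elif live[field] == "null" and t != "null":
            events.append(("WARNING", f"{field} became null (was {t})"))
        elif live[field] != t:
            events.append(("BREAKING", f"{field} type {t} -> {live[field]}"))
    for field in live.keys() - baseline.keys():
        events.append(("INFO", f"new field {field}"))  # additive: informational only
    return events

baseline = infer({"id": 1, "name": "a"})
live     = infer({"id": "1", "name": None, "email": "a@b"})
for severity, msg in classify(baseline, live):
    print(severity, msg)
```

A production version would recurse into nested objects and track null rates over time rather than reacting to a single sample, but the severity buckets fall out the same way: removals and type changes are breaking, nullability shifts are warnings, additions are informational.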
This article really hits an under‑the‑radar issue! Schema drift can quietly break your API tests without you noticing, and it’s one of those problems that feels obvious in hindsight. Love the practical examples and reminders to keep test suites aligned with real API changes — definitely something every API team should be watching for. Thanks for highlighting this!
This resonates — we hit this exact problem with data payloads, not just API responses. Fields going null, types changing, cardinality shifting. We built a screening layer that sits at the ingest boundary and compares every batch against a stored schema fingerprint (SHA-256 of sorted field names + types). First batch sets the baseline, every subsequent batch is compared. Drift events get classified: new field = WARN, type change = BLOCK, null spike = WARN/BLOCK depending on severity.
The "compared to what?" problem you mention is key — we solve it by storing baselines per-source with EMA-smoothed null rate history, so gradual drift gets caught too.
If anyone's interested in the implementation: github.com/AppDevIQ/datascreeniq-p...
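Independent of the linked repo, the fingerprint idea is simple enough to sketch in a few lines (my own toy version, not the actual implementation): hash the sorted (field, type) pairs so any structural change flips the hash, regardless of key order in the payload.

```python
import hashlib
import json

def schema_fingerprint(batch: list[dict]) -> str:
    """SHA-256 over the sorted (field name, type name) pairs of a batch."""
    fields = {}
    for row in batch:
        for k, v in row.items():
            fields.setdefault(k, type(v).__name__)
    canon = json.dumps(sorted(fields.items()))  # stable, order-independent encoding
    return hashlib.sha256(canon.encode()).hexdigest()

baseline = schema_fingerprint([{"id": 1, "name": "a"}])
same     = schema_fingerprint([{"name": "b", "id": 2}])   # key order doesn't matter
drifted  = schema_fingerprint([{"id": "3", "name": "c"}]) # id type changed
print(baseline == same, baseline == drifted)  # True False
```

The nice property is that comparing batches becomes a single string equality check; only when the fingerprints differ do you pay for a full field-by-field diff to classify the drift.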
Fair point about a type change from int to string: the tests might miss it, though schema validation may catch it.
But what do the tests do if they don't fail when a property name changes or a type is switched to an array? 🫨🤔
nice!