DEV Community: Taras H

API Contract Testing: Why Safe Changes Still Break Clients

Taras H — Sat, 09 May 2026 12:00:00 +0000

APIs often break clients in ways that don’t show up in tests.

A change looks safe inside the service:

remove an unused field
tighten validation
adjust an error response

Everything still works locally. Tests pass. Deployment succeeds.

Then a mobile app crashes.

A partner integration fails.

An older frontend silently breaks.

Nothing is “wrong” in the service - but the contract changed.

The Problem: Internal Changes, External Breakage

Consider a response:

{
  "id": "ord_123",
  "status": "paid",
  "customer": {
    "id": "cus_456",
    "email": "ada@example.com"
  }
}

Removing customer.email looks like cleanup.

But for a client, that field might:

power a receipt screen
feed an export pipeline
be required in a generated SDK

From the server’s perspective: harmless

From the client’s perspective: breaking change
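To make the client's side concrete, here is a minimal sketch of a strict client-side parser. The `Order` type, `requireString`, and `parseOrder` are hypothetical names; they stand in for whatever a generated SDK or hand-written client might do with the response above.

```typescript
// Hypothetical client-side model of the order response shown above.
interface Order {
  id: string;
  status: string;
  customer: { id: string; email: string };
}

// Throws when a field the client relies on is missing or not a string.
function requireString(obj: Record<string, unknown>, key: string): string {
  const value = obj[key];
  if (typeof value !== "string") throw new Error(`missing ${key}`);
  return value;
}

// Strict parser, similar in spirit to what generated SDKs often do.
function parseOrder(raw: Record<string, unknown>): Order {
  const customer = raw["customer"];
  if (typeof customer !== "object" || customer === null) {
    throw new Error("missing customer");
  }
  const c = customer as Record<string, unknown>;
  return {
    id: requireString(raw, "id"),
    status: requireString(raw, "status"),
    customer: { id: requireString(c, "id"), email: requireString(c, "email") },
  };
}

const oldResponse = {
  id: "ord_123",
  status: "paid",
  customer: { id: "cus_456", email: "ada@example.com" },
};
// Same response after the "cleanup" that removes customer.email.
const newResponse = { id: "ord_123", status: "paid", customer: { id: "cus_456" } };

parseOrder(oldResponse); // parses without error
// parseOrder(newResponse) would throw: the server did nothing "wrong",
// but this client can no longer read the response.
```

The server's own tests never run this code, which is why the change looks safe from the inside.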

Why This Keeps Happening

Most tests focus on correctness inside the service:

business logic
database state
request handling

They don’t protect what clients depend on.

That gap is where breakage happens.

What Contract Testing Actually Protects

Contract testing focuses on the boundary between services and clients.

It answers:

“Did we change what clients rely on?”

Typical breaking changes:

removing or renaming fields
changing types
adding required inputs
changing error formats

Non-breaking (usually):

adding optional fields (clients that reject unknown fields can still break)
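The classification above can be sketched as a small diff over response fields. This is an illustration under simplifying assumptions, not a real contract-testing tool: `FieldTypes` is a hypothetical flat map from field name to type name, and only response fields are covered (request-side changes such as new required inputs are out of scope here).

```typescript
// Hypothetical flat description of a response: field name -> type name.
type FieldTypes = Record<string, string>;

interface Change {
  kind: "removed" | "type_changed" | "added";
  field: string;
  breaking: boolean;
}

function diffResponseFields(oldFields: FieldTypes, newFields: FieldTypes): Change[] {
  const changes: Change[] = [];
  for (const field of Object.keys(oldFields)) {
    if (!(field in newFields)) {
      // A rename shows up as one "removed" plus one "added".
      changes.push({ kind: "removed", field, breaking: true });
    } else if (oldFields[field] !== newFields[field]) {
      changes.push({ kind: "type_changed", field, breaking: true });
    }
  }
  for (const field of Object.keys(newFields)) {
    if (!(field in oldFields)) {
      // Usually safe, which is why it is not flagged as breaking.
      changes.push({ kind: "added", field, breaking: false });
    }
  }
  return changes;
}

const changes = diffResponseFields(
  { id: "string", status: "string", "customer.email": "string" },
  { id: "string", status: "number", note: "string" },
);
const hasBreakingChange = changes.some((c) => c.breaking);
```

A check like `hasBreakingChange` could gate a release, turning a silent change into an explicit decision.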

Why Schema Alone Isn’t Enough

Schema diffs (for example, comparing two versions of an OpenAPI document) catch structural changes.

But real systems depend on behavior:

error codes
pagination shape
idempotency responses

Those require examples, not just schemas.
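One way to use examples is to record a known-good response and compare later responses against its shape. The sketch below is a minimal illustration; `shapeMismatches` and the error format in `recordedError` are hypothetical and not taken from any particular API or library.

```typescript
// Returns the paths where `actual` no longer matches the recorded `example`.
// Extra fields in `actual` are allowed; missing fields or changed types are not.
// Arrays are compared element by element, by index.
function shapeMismatches(example: unknown, actual: unknown, path: string = "$"): string[] {
  if (example === null || typeof example !== "object") {
    const sameKind =
      typeof example === typeof actual && (example === null) === (actual === null);
    return sameKind ? [] : [path];
  }
  if (actual === null || typeof actual !== "object") return [path];
  if (Array.isArray(example) !== Array.isArray(actual)) return [path];

  const ex = example as Record<string, unknown>;
  const ac = actual as Record<string, unknown>;
  const mismatches: string[] = [];
  for (const key of Object.keys(ex)) {
    const childPath = `${path}.${key}`;
    if (key in ac) {
      mismatches.push(...shapeMismatches(ex[key], ac[key], childPath));
    } else {
      mismatches.push(childPath);
    }
  }
  return mismatches;
}

// Recorded example of a validation error (hypothetical format).
const recordedError = {
  error: { code: "validation_failed", message: "email is required", fields: ["email"] },
};
// What the service returns after someone "adjusts" the error response.
const currentError = { error: "email is required" };

const errorMismatches = shapeMismatches(recordedError, currentError);
```

This only compares structure and types. Values that clients branch on, such as specific error codes, would need explicit equality checks on top.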

The Key Insight

Most API breakages aren’t failures of code - they’re failures of assumptions at the boundary.

Contract testing makes those assumptions visible before release.

Final Thought

Small internal changes can quietly break external systems.

Contract testing turns that from a surprise into a decision.

👉 Full deep dive: https://codenotes.tech/blog/api-contract-testing-prevent-breaking-clients-before-release

How to Write API Integration Tests (That Actually Catch Bugs)

Taras H — Mon, 04 May 2026 15:05:14 +0000

API integration tests aren’t about checking 200 OK.

They exist to answer a harder question:

When a real request crosses authentication, validation, persistence, and transactions — does the system behave correctly?

Most production bugs don’t live inside a single function.

They happen between boundaries.

What Integration Tests Should Actually Prove

A useful API integration test verifies:

routing + request parsing
authentication & authorization
validation and error shape
database writes and reads
transaction boundaries
response contract
retry / duplicate handling
concurrency behavior

This sits between unit tests and end-to-end tests:

Unit → logic correctness
Integration → system behavior at boundaries
E2E → full user journey

Start With Risk, Not Coverage

Don’t write the same number of tests per endpoint.

Focus on endpoints that:

mutate important data
cross auth or tenant boundaries
involve multiple writes or transactions
depend on external systems
must handle retries or duplicates
have broken in production before

The goal isn’t coverage.

It’s risk-weighted confidence.

What a Good Test Looks Like

A strong integration test:

sends a real HTTP request
goes through real auth + validation
writes to a real database
asserts persisted state, not just response

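// Send a real HTTP request through the full middleware stack.
const res = await request(app)
  .post('/api/orders')
  .set('Authorization', `Bearer ${token}`)
  .send(body)

expect(res.status).toBe(201)

// Assert on what was actually persisted, not only on the response.
const order = await db.order.findFirst(...)
expect(order).toBeTruthy()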

If you only check the response, you’re missing half the system.

The Critical Cases People Skip

These are where integration tests provide real value:

Authorization → tenant / ownership rules
Validation → bad input blocked before writes
Rollback → no partial state on failure
Idempotency → retries don’t duplicate effects
Concurrency → overlapping requests don’t break invariants

Most bugs hide here, not in the happy path.

What Not to Mock

Avoid mocking:

database
transactions
auth middleware
validation
your own application code

Mock only external systems like payments or email.

Mock dependencies outside your system, not inside it.
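A minimal sketch of that boundary, assuming a hypothetical `PaymentGateway` interface for the external provider. Only the gateway is replaced by a fake; validation and persistence logic run for real. The `Map` plays the database role purely so the sketch runs on its own; in an actual suite that would be a real test database.

```typescript
// External boundary: the only thing replaced in tests.
interface PaymentGateway {
  charge(orderId: string, amountCents: number): Promise<boolean>;
}

// Test double for the external system. It records calls so a test can
// assert on what would have been sent to the provider.
class FakeGateway implements PaymentGateway {
  calls: Array<{ orderId: string; amountCents: number }> = [];

  async charge(orderId: string, amountCents: number): Promise<boolean> {
    this.calls.push({ orderId, amountCents });
    return true; // the fake always approves
  }
}

type OrderStore = Map<string, { amountCents: number; status: "paid" | "declined" }>;

// Application code under test: validation and persistence are NOT mocked.
// Returns an HTTP-like status code.
async function placeOrder(
  input: { id: string; amountCents: number },
  gateway: PaymentGateway,
  store: OrderStore,
): Promise<number> {
  if (!Number.isInteger(input.amountCents) || input.amountCents <= 0) {
    return 422; // bad input is rejected before any charge or write
  }
  const approved = await gateway.charge(input.id, input.amountCents);
  store.set(input.id, {
    amountCents: input.amountCents,
    status: approved ? "paid" : "declined",
  });
  return approved ? 201 : 402;
}
```

A test built on this can assert three things at once: the status code, the persisted row, and the calls the fake gateway recorded.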

The Short Version

Good API integration tests prove:

real requests cross real boundaries
correct state is persisted
failures don’t leave partial data
retries and concurrency are safe
response contracts don’t break clients

They matter because production bugs usually live between components, not inside them.

👉 Full deep dive (idempotency, concurrency, rollback examples):

https://codenotes.tech/blog/how-to-write-api-integration-tests

API Idempotency Keys: Prevent Duplicate Requests

Taras H — Sun, 26 Apr 2026 17:00:00 +0000

Duplicate requests aren’t edge cases - they’re normal behavior in distributed systems.

A client times out, retries, and suddenly your API creates:

two payments
two orders
two subscriptions

Idempotency keys exist to prevent that. But many implementations still fail under concurrent retries and ambiguous failures.

The Problem

Consider POST /payments:

Server processes the payment
Response is lost (timeout, network issue)
Client retries

Without idempotency, the retry looks like a new request → duplicate charge.

The Assumption That Breaks

A common approach is:

“Store the idempotency key and reject duplicates.”

This sounds correct, but it isn’t: checking for the key and storing it are two separate steps.

Two concurrent requests can both:

check for the key
see nothing
execute the side effect

Result: duplicates still happen.
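The interleaving can be reproduced in a few lines. This is a simulation, not production code: `seenKeys` stands in for a key store, `chargeCount` for the side effect, and `await Promise.resolve()` for any asynchronous gap such as a database round trip. All names are hypothetical.

```typescript
const seenKeys = new Set<string>();
let chargeCount = 0;

// Naive handler: "check, then act" as two separate steps.
async function naiveCharge(key: string): Promise<"charged" | "rejected"> {
  const exists = seenKeys.has(key); // 1. check for the key
  await Promise.resolve();          //    async gap: another request can run here
  if (exists) return "rejected";    // 2. both requests saw nothing
  chargeCount += 1;                 // 3. execute the side effect
  seenKeys.add(key);                //    the key is stored too late to help
  return "charged";
}

// Two requests with the same key arrive at the same time.
async function runNaiveRace(): Promise<number> {
  await Promise.all([naiveCharge("key-1"), naiveCharge("key-1")]);
  return chargeCount;
}
```

Both calls read `seenKeys` before either one writes to it, so `runNaiveRace()` resolves with a charge count of 2 even though the key was "stored".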

What Actually Works

Idempotency is not just storing keys - it’s about ownership of execution.

The critical rule:

Exactly one request is allowed to perform the operation.

This requires an atomic reservation, typically:

SQL: unique constraint + INSERT ... ON CONFLICT
Redis: SET NX

Everything else builds on top of that.
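A minimal sketch of an atomic reservation, using an in-memory `Set` as a stand-in for the unique constraint. The table name in the SQL comment and all function names are hypothetical. The point is that checking and writing happen as one indivisible step, and only the winner proceeds.

```typescript
// In-memory stand-in for a table with a UNIQUE (scope, key) constraint.
// With PostgreSQL the same step could be a single statement such as:
//   INSERT INTO idempotency_keys (scope, key) VALUES ($1, $2)
//   ON CONFLICT (scope, key) DO NOTHING
// and with Redis a SET command using the NX option.
const reserved = new Set<string>();

// Check and write happen in one synchronous step, so no other request can
// run between them. Returns true only for the request that wins the key.
function tryReserve(scope: string, key: string): boolean {
  const id = JSON.stringify([scope, key]);
  if (reserved.has(id)) return false;
  reserved.add(id);
  return true;
}

let sideEffects = 0;

async function handleRequest(scope: string, key: string): Promise<"executed" | "duplicate"> {
  if (!tryReserve(scope, key)) return "duplicate";
  await Promise.resolve(); // async gap: other requests may run here
  sideEffects += 1;        // only the owner of the reservation gets this far
  return "executed";
}

// Two concurrent requests with the same key.
async function runConcurrent(): Promise<string[]> {
  return Promise.all([
    handleRequest("payments", "key-1"),
    handleRequest("payments", "key-1"),
  ]);
}
```

In a real database the atomicity comes from the unique constraint, not from application code; the in-memory version only works because a single JavaScript thread runs `tryReserve` without interruption.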

The Minimum Safe Design

A correct implementation must:

Atomically reserve (scope, key)
Store a request fingerprint (to detect misuse)
Track state:
- in_progress
- completed
- ambiguous
Replay the original response on retries
Reject same key with different payload
Use a TTL that matches real retry behavior
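The checklist above might translate into a store like the following. This is a sketch under stated assumptions, not the article's implementation: `IdempotencyStore`, `begin`, `complete`, and the outcome names are hypothetical, the store is in-memory, and time is passed in explicitly to keep the example deterministic.

```typescript
interface IdempotencyRecord {
  state: "in_progress" | "completed" | "ambiguous";
  fingerprint: string; // e.g. a hash of the request body
  response?: string;   // stored once the operation completes
  expiresAt: number;
}

type Outcome =
  | { kind: "execute" }                  // caller owns the operation
  | { kind: "replay"; response: string } // return the original response
  | { kind: "not_finished" }             // in progress or ambiguous: do not run again
  | { kind: "payload_mismatch" };        // same key, different request

class IdempotencyStore {
  private records = new Map<string, IdempotencyRecord>();
  private ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Reserve (scope, key) or report what an earlier request already did.
  begin(scope: string, key: string, fingerprint: string, now: number): Outcome {
    const id = JSON.stringify([scope, key]);
    const existing = this.records.get(id);
    if (existing && existing.expiresAt > now) {
      if (existing.fingerprint !== fingerprint) return { kind: "payload_mismatch" };
      if (existing.state === "completed" && existing.response !== undefined) {
        return { kind: "replay", response: existing.response };
      }
      return { kind: "not_finished" };
    }
    this.records.set(id, { state: "in_progress", fingerprint, expiresAt: now + this.ttlMs });
    return { kind: "execute" };
  }

  // Store the response so retries can replay it.
  complete(scope: string, key: string, response: string): void {
    const record = this.records.get(JSON.stringify([scope, key]));
    if (record) {
      record.state = "completed";
      record.response = response;
    }
  }
}
```

A handler would call `begin` first, run the operation only on `execute`, and call `complete` with the response afterwards. With a shared database, `begin` has to be a single atomic statement rather than a read followed by a write.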

The Hard Part: Ambiguous Failures

The real failure mode isn’t duplicates - it’s uncertainty.

Example:

Payment provider accepts the charge
Your service times out
Client retries

You don’t know if the charge succeeded.

Retrying blindly can double-charge.

Safe systems:

mark the request as ambiguous
reconcile with downstream systems
only finalize once certainty is restored
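Those three steps can be modelled as explicit state transitions. The sketch below is illustrative: the `failed` state, the `ProviderAnswer` values, and both function names are assumptions, and it presumes the downstream provider offers an authoritative way to look up a charge.

```typescript
type PaymentState = "in_progress" | "completed" | "failed" | "ambiguous";

// What the downstream provider reports when asked about a charge.
// "unknown" means the provider could not be reached either.
type ProviderAnswer = "succeeded" | "not_found" | "unknown";

// Called when the provider call timed out: the outcome is unknown,
// so the record is parked instead of being retried.
function onProviderTimeout(state: PaymentState): PaymentState {
  return state === "in_progress" ? "ambiguous" : state;
}

// Called by a reconciliation step that asks the provider what happened.
function reconcile(state: PaymentState, answer: ProviderAnswer): PaymentState {
  if (state !== "ambiguous") return state;        // nothing to reconcile
  if (answer === "succeeded") return "completed"; // the charge went through
  if (answer === "not_found") return "failed";    // assumed authoritative: no charge exists
  return "ambiguous";                             // still uncertain, keep waiting
}
```

The important property is that no transition leads from `ambiguous` back to executing the charge without first hearing from the provider.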

Practical Signals

If idempotency is working, you should see:

replayed responses (normal)
occasional in-progress conflicts
rare payload mismatches

If not, expect:

duplicate writes
inconsistent downstream state
hard-to-debug production issues

The Core Insight

Idempotency keys are not a cache.

They are a correctness boundary:

they define ownership
they prevent duplicate side effects
they preserve system integrity under retries

Without atomic reservation and state modeling, they don’t actually solve the problem.

Full Article

For the complete breakdown (schema design, handler flow, TTL strategy, and failure cases):

👉 https://codenotes.tech/blog/api-idempotency-keys-prevent-duplicate-requests

Background Jobs in Production: The Problems Queues Don’t Solve

Taras H — Sun, 08 Mar 2026 11:17:45 +0000

Moving work out of the request path is one of the most common ways to
speed up backend systems.

Emails are sent asynchronously.
Invoices are generated by workers.
Webhooks are delivered through queues.
Image processing and indexing run in background jobs.

Latency improves immediately.

But many teams eventually notice something strange in production:

duplicate emails appear
retries increase system load
dead-letter queues slowly grow
workflows technically "succeed"... but the outcome is wrong

The queue is healthy.
Workers are running.

Yet the system behaves incorrectly.

Moving work to the background changes where failures happen.
It does not remove them.

This post is a shorter version of a deeper engineering write-up
originally published on CodeNotes.

The Assumption Behind Background Jobs

Background job systems are usually introduced with a simple expectation:

If a job fails, the queue will retry it until it succeeds.

Queues also provide useful features:

buffering traffic spikes
independent worker scaling
retry handling
isolation from request latency

Because of this, async processing often feels safer than synchronous
execution.

But that assumption depends on something rarely guaranteed in
production:

that running a job multiple times produces the same result as running
it once.

What "At-Least-Once Delivery" Actually Means

Most queue systems guarantee at-least-once delivery.

That means each message is delivered one or more times: the queue
redelivers until a worker acknowledges it, even if that results in duplicate execution.

It does not mean:

the job runs exactly once
side effects happen exactly once
messages are processed in order

In other words, the queue protects against message loss, not
duplicate work.

Once duplicate execution becomes possible, correctness has to come from
somewhere else.

Usually that means:

idempotent handlers
deduplication keys
explicit state transitions
retry boundaries

Without those protections, the infrastructure is reliable while the
workflow is not.

A Classic Failure Scenario

Consider a worker that sends a payment receipt:

await emailClient.send(...)

// A crash here means the email was sent but never recorded.
await db.payment.update({
  receiptSentAt: new Date()
})

If the worker crashes after sending the email but before updating
the database, the job will be retried.

Now the customer receives two receipts.

The queue behaved exactly as designed.

But the business outcome is incorrect.
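The scenario is easy to reproduce in a small simulation. Everything here is a stand-in: `sentEmails` replaces the email provider, `payments` replaces the database, and `deliver` is a deliberately minimal model of at-least-once delivery.

```typescript
const sentEmails: string[] = [];
const payments = new Map<string, { receiptSentAt: number | null }>([
  ["pay_1", { receiptSentAt: null }],
]);

// The handler from the scenario, with a switch that simulates a crash
// between the two steps.
function sendReceiptJob(paymentId: string, crashAfterSend: boolean): void {
  sentEmails.push(paymentId); // side effect: the email goes out
  if (crashAfterSend) throw new Error("worker crashed");
  const payment = payments.get(paymentId);
  if (payment) payment.receiptSentAt = Date.now(); // never reached on a crash
}

// Minimal at-least-once delivery: retry until the handler returns normally.
function deliver(paymentId: string): void {
  let attempt = 0;
  for (;;) {
    try {
      sendReceiptJob(paymentId, attempt === 0); // only the first attempt crashes
      return;
    } catch (_err) {
      attempt += 1; // no acknowledgement, so the job is redelivered
    }
  }
}

deliver("pay_1");
// sentEmails now contains "pay_1" twice: one job, two receipts.
```

Nothing in `deliver` is buggy. The duplicate comes entirely from the handler assuming it will run once.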

Why Production Systems Break Here

Background job systems introduce two things that make correctness
harder.

1. Duplicate execution

Workers can crash after performing side effects but before acknowledging
the message.

2. Time separation

Jobs may execute minutes or hours after they were created, when system
state has already changed.

Because of this, retries often interact with partial state or
outdated context.

The Design Rule Most Teams Learn Later

A background job should never be treated as a one-time action.

It should be treated as a replayable command.

Every handler should be safe if it runs:

twice
later than expected
after partial completion
out of order

If those conditions break the workflow, retries will eventually corrupt
system behavior.
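A handler that tolerates all four conditions might look like the sketch below. It assumes each job carries a dedup key (`jobId`) and a per-order `sequence` number; both are hypothetical fields, and the in-memory maps stand in for database tables that would be updated in one transaction.

```typescript
interface OrderStatusJob {
  jobId: string;    // dedup key (hypothetical)
  orderId: string;
  sequence: number; // increases with each status change for an order
  status: string;
}

const orders = new Map<string, { status: string; lastSequence: number }>();
const processedJobs = new Set<string>();

type JobResult = "applied" | "duplicate" | "stale";

function handleOrderStatus(job: OrderStatusJob): JobResult {
  // Runs twice: the same job was already handled.
  if (processedJobs.has(job.jobId)) return "duplicate";

  const row = orders.get(job.orderId) ?? { status: "new", lastSequence: 0 };

  // Late, out of order, or rerun after partial completion:
  // a newer (or the same) change has already been applied.
  if (job.sequence <= row.lastSequence) {
    processedJobs.add(job.jobId);
    return "stale";
  }

  orders.set(job.orderId, { status: job.status, lastSequence: job.sequence });
  processedJobs.add(job.jobId);
  return "applied";
}
```

If the worker dies after updating the order but before recording the job id, the rerun hits the sequence check and changes nothing. This works because the handler only writes local state; external side effects such as emails need their own deduplication.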

The Monitoring Trap

Teams often monitor queue infrastructure:

queue depth
worker throughput
retry counts
dead-letter volume

Those metrics matter - but they don't answer questions like:

Did users receive duplicate emails?
Did a payment create multiple ledger entries?
Did downstream systems receive conflicting updates?

A queue dashboard can look completely healthy while the workflow is
incorrect.
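Questions like these can be answered by checking business invariants directly. As a small illustration, the function below scans a log of sent receipts for payments that got more than one; the `ReceiptEvent` shape is a hypothetical stand-in for whatever delivery log a system keeps.

```typescript
interface ReceiptEvent {
  paymentId: string;
  sentAt: number;
}

// Returns the ids of payments that received more than one receipt.
function paymentsWithDuplicateReceipts(events: ReceiptEvent[]): string[] {
  const counts = new Map<string, number>();
  for (const event of events) {
    counts.set(event.paymentId, (counts.get(event.paymentId) ?? 0) + 1);
  }
  return Array.from(counts.entries())
    .filter(([, count]) => count > 1)
    .map(([paymentId]) => paymentId);
}

const duplicates = paymentsWithDuplicateReceipts([
  { paymentId: "pay_1", sentAt: 1 },
  { paymentId: "pay_2", sentAt: 2 },
  { paymentId: "pay_1", sentAt: 3 },
]);
```

A non-empty result is a workflow-level alert that no queue metric would raise.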

Read the Full Production Breakdown

This post only covers the core failure patterns.

The full article explains:

why retries can make outages worse
how idempotent background jobs are designed
why dead-letter queues silently grow
what production teams monitor beyond queue depth
a practical rollout checklist for new background jobs

👉 Full article:
https://codenotes.tech/blog/background-jobs-in-production

Why AI Code Review Comments Look Right but Miss Real Risks

Taras H — Fri, 27 Feb 2026 16:59:54 +0000

Many teams have added AI code review to their pull request workflow.

The promise is obvious: faster feedback, broader coverage, fewer review bottlenecks. AI scans every diff, flags suspicious code, suggests test cases, and highlights style issues in seconds.

Pull requests move faster. Review queues shrink. Everything looks healthier.

But production incidents don’t disappear.

So the practical question emerges:

If AI reviews every PR, why are high-risk issues still reaching production?

The Reasonable Assumption

It’s natural to assume:

More review coverage + faster feedback = better quality.

AI increases comment volume. It catches missing null checks. It suggests cleaner error handling. It improves surface-level consistency.

At a process level, things look better.

But review activity is not the same thing as risk reduction.

Where the Gap Appears

Most AI code review tools are excellent at:

Pattern matching
Local correctness
Code explanation
Generic best practices

They are much weaker at:

Business logic validation
Authorization boundaries
Implicit architectural constraints
Production failure modes

For example:

export async function updateUserRole(userId: string, role: string) {
  const user = await db.user.findUnique({ where: { id: userId } });

  if (!user) {
    throw new Error('User not found');
  }

  await db.user.update({ where: { id: userId }, data: { role } });
}

An AI reviewer might suggest stronger validation or clearer error handling.

But the real production risk may be completely different:

Who is allowed to change roles?
Is there audit logging?
Does this break cross-service assumptions?
What happens under concurrent updates?

These risks don’t live in the diff. They live in the system.
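For illustration only, here is how the first two questions might surface in code once someone asks them. This variant of `updateUserRole` is hypothetical: the "admins only" rule, the `actorId` parameter, and the in-memory `users` and `auditLog` stores are assumptions, not part of the original function.

```typescript
type Role = "member" | "admin";

interface User {
  id: string;
  role: Role;
}

interface AuditEntry {
  actorId: string;
  targetId: string;
  from: Role;
  to: Role;
}

// In-memory stand-ins for the database and the audit trail.
const users = new Map<string, User>([
  ["u_admin", { id: "u_admin", role: "admin" }],
  ["u_1", { id: "u_1", role: "member" }],
]);
const auditLog: AuditEntry[] = [];

function updateUserRole(actorId: string, userId: string, role: Role): void {
  const actor = users.get(actorId);
  const user = users.get(userId);
  if (!user) throw new Error("User not found");

  // Who is allowed to change roles? Assumed rule: admins only.
  if (!actor || actor.role !== "admin") throw new Error("Forbidden");

  // Is there audit logging? Record who changed what.
  auditLog.push({ actorId, targetId: userId, from: user.role, to: role });
  users.set(userId, { id: user.id, role });
}
```

Neither check could be inferred from the original diff alone; both depend on rules that live outside it.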

Why AI Feels More Effective Than It Is

Three patterns show up repeatedly:

1. Plausible Comments Create Confidence

LLMs generate comments that sound correct. That increases perceived rigor — even when the risk profile hasn’t changed.

2. Diffs Hide System Context

Pull requests rarely include architectural history, compliance constraints, or production incident lessons. Humans often carry this context implicitly. AI usually doesn’t.

3. Automation Changes Human Behavior

When AI has already “reviewed” the code, humans subtly shift from critical analysis to verification mode.

The question changes from:

“What could fail in production?”

to:

“Did we resolve the AI comments?”

That shift matters.

The Key Insight

AI expands coverage.

Humans must still own judgment.

AI is strong at local correctness. Production failures usually emerge from system interactions: retries under load, cache drift, authorization boundaries, cross-service contracts.

If the review process optimizes for comment resolution instead of failure thinking, speed improves — but risk stays constant.

If You’re Using AI Review

A useful mental model:

Let AI handle first-pass mechanical checks.
Explicitly reserve human review for system-level risk.
Measure escaped defects — not comment counts.

The real question isn’t whether AI comments are helpful.

It’s whether your review process still forces engineers to think about how systems fail in production.

If this topic resonates, the full breakdown goes deeper into why this happens and how teams misinterpret review signal vs. real risk:

👉 Full article:
https://codenotes.tech/blog/why-ai-code-review-comments-look-right-but-miss-real-risks