DEV Community

Joaquinriosheredia

From Warnings to Evidence: Reproducing Java Runtime Failures with Testcontainers

Last year I noticed something frustrating:

The more advanced the warning, the more likely developers were to ignore it.

Not because they were careless. Because the warning stopped at "this is dangerous" and never showed why.

So I tried to fix that.

The Idea: Detect → Explain → Verify

Most static analyzers stop at detection. They tell you what is wrong.

I wanted to build something that shows why it matters — reproducibly, on any machine, in under a minute:

```bash
java-vibe-guard .

❌ VIBE-001 detected
Evidence available: Java Production Lab #04
Verify locally: java-vibe-guard --verify VIBE-001
```

```bash
java-vibe-guard --verify VIBE-001

Environment Check
✓ Docker 29.x
✓ Java 21
✓ Memory OK

VIBE-001 Verification

Observed:
✓ Connection pool fully utilized (utilization: 100%)
✓ Requests blocked waiting for connections (waiting: 15)
✓ Latency amplification under concurrency (p95: 1540ms)

Matches behavior documented in:
Java Production Lab #04
```

The verifier turns a warning into evidence.

The Anti-Pattern: VIBE-001

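```java
@Transactional
public void createOrder() {
    repository.save(new Order());

    CompletableFuture
        .supplyAsync(() -> {
            try {
                Thread.sleep(500); // simulating external IO
            } catch (InterruptedException e) {
                // Supplier cannot throw checked exceptions, so restore the flag and rethrow unchecked
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            return "OK";
        })
        .join(); // DB connection still held here
}
```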

This code compiles. Unit tests pass. In development with one or two concurrent users, nothing breaks.

The problem: @Transactional acquires a HikariCP connection when the method starts and holds it until the method returns. The async work runs on a different thread and never touches that connection, but while join() blocks the calling thread, the connection sits idle and stays unavailable to other requests.

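With a pool of 5 connections and 20 concurrent users, requests are served in four batches of five. Each batch holds its connections for roughly 500ms, so every later batch waits for all the earlier ones and latency compounds. The connection pool becomes the bottleneck.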
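The batching can be reproduced without Docker or a database. The sketch below is not the Lab #04 code: it swaps the HikariCP pool for a `java.util.concurrent.Semaphore` and the transactional method for a task that holds a permit while sleeping. The class name `PoolContentionSketch` and the `run` parameters are hypothetical. With 5 permits, 20 clients and a 500ms hold, the slowest request should land near 4 × 500ms = 2000ms, though exact numbers will vary by machine.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;

class PoolContentionSketch {

    /** Returns the latency of each simulated request in milliseconds. */
    static List<Long> run(int poolSize, int clients, long holdMillis) throws Exception {
        // Assumption: a fair semaphore stands in for the connection pool.
        Semaphore pool = new Semaphore(poolSize, true);
        CountDownLatch startGate = new CountDownLatch(1);
        ExecutorService executor = Executors.newFixedThreadPool(clients);
        try {
            Callable<Long> request = () -> {
                startGate.await();                // all clients start together
                long start = System.nanoTime();
                pool.acquire();                   // transaction begins: connection taken
                try {
                    Thread.sleep(holdMillis);     // blocked on join(): connection held idle
                } finally {
                    pool.release();               // method returns: connection released
                }
                return (System.nanoTime() - start) / 1_000_000;
            };

            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 0; i < clients; i++) {
                futures.add(executor.submit(request));
            }
            startGate.countDown();

            List<Long> latencies = new ArrayList<>();
            for (Future<Long> future : futures) {
                latencies.add(future.get());
            }
            return latencies;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Pool of 5, 20 concurrent users, 500ms hold: the scenario described above.
        List<Long> latencies = run(5, 20, 500);
        long min = Collections.min(latencies);
        long max = Collections.max(latencies);
        System.out.println("min=" + min + "ms max=" + max + "ms spread=" + (max - min) + "ms");
    }
}
```

Because the only shared resource here is the semaphore, any spread between the fastest and slowest request comes from waiting for a permit, which is the effect the real pool produces.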

The Evidence: Lab #04

Before building the verifier, I ran this pattern under load using Testcontainers in Java Production Lab #04:

Min latency: 510ms (first batch, pool available)
Max latency: 2025ms (fourth batch, pool exhausted)
Spread: 1515ms

What mattered wasn't the exact numbers. What mattered was that the shape of the failure was reproducible.

That spread is the signature of connection pool contention — not hardware variance, not network noise. The pool exhausting and recovering in predictable cycles.

Building the Verifier: Decisions That Matter

Why Thread.sleep(500) and not an HTTP mock

I wanted to simulate blocking IO without introducing unrelated variables — DNS resolution, network latency, connection timeouts. All of those would obscure the phenomenon I was trying to demonstrate.

Thread.sleep(500) isolates what matters: the thread is blocked, the connection is held, the pool drains. One variable, one cause.

Why observable phenomena instead of exact metrics

Early versions checked for specific numbers: p95 > 2000ms. Brittle. Different machines, different CI runners — the numbers shifted.

The stable version checks for phenomena:

maxLatency > minLatency * 2
spread > 800ms

The verifier is designed to confirm the phenomenon, not benchmark performance. The threshold is generous enough to survive hardware variation while being strict enough to be meaningful.
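As a rough illustration, the two conditions could be evaluated like this. It is a sketch, not the verifier's actual code: the class and method names are hypothetical, and it assumes both conditions must hold at once. The sample values are the minimum and maximum latencies reported for Lab #04.

```java
import java.util.Collections;
import java.util.List;

class PhenomenonCheckSketch {

    static final long MIN_SPREAD_MS = 800; // threshold quoted above

    /** True when the latencies show the contention shape rather than a flat profile. */
    static boolean confirmsContention(List<Long> latenciesMs) {
        long min = Collections.min(latenciesMs);
        long max = Collections.max(latenciesMs);
        long spread = max - min;
        // Assumption: both conditions are required together.
        return max > min * 2 && spread > MIN_SPREAD_MS;
    }

    public static void main(String[] args) {
        // Lab #04: min 510ms, max 2025ms, spread 1515ms.
        System.out.println(confirmsContention(List.of(510L, 2025L))); // prints true
        // A flat profile with no queueing fails both conditions.
        System.out.println(confirmsContention(List.of(510L, 530L)));  // prints false
    }
}
```

Checking a ratio and a spread rather than an absolute p95 means a slower machine shifts every latency upward without flipping the result.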

Making the test invisible to normal builds

```java
@EnabledIfSystemProperty(named = "vibe.verify", matches = "true")
class Vibe001VerificationTest { ... }
```

Running mvn test never triggers this. The Node.js verifier activates it explicitly. The verification layer stays completely separate from the normal development cycle.
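For readers unfamiliar with the annotation, the gate behaves roughly like the plain-Java check below. This is a sketch of the semantics only, not JUnit's implementation, and the class and method names are hypothetical. The `matches` attribute is a regular expression that has to match the whole property value, and a missing property leaves the test disabled.

```java
class SystemPropertyGateSketch {

    /** True when the property is set and its whole value matches the regex. */
    static boolean enabled(String name, String regex) {
        String value = System.getProperty(name);
        return value != null && value.matches(regex);
    }

    public static void main(String[] args) {
        // Try it with: java -Dvibe.verify=true SystemPropertyGateSketch.java
        System.out.println("verification enabled: " + enabled("vibe.verify", "true"));
    }
}
```

How the Node.js verifier sets the property is not shown here; with Maven, passing `-Dvibe.verify=true` on the command line is one plausible way, assuming Surefire forwards the property to the test JVM.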

Unexpected Findings

Testcontainers and Docker Engine compatibility

While debugging compatibility issues with Docker Engine 29.x, I found that configuring api.version=1.44 in Maven Surefire resolved connection failures in my setup:

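```xml
<!-- maven-surefire-plugin configuration -->
<systemPropertyVariables>
    <api.version>1.44</api.version>
</systemPropertyVariables>
```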

If you're running Testcontainers 1.20.x with a recent Docker Engine and hitting unexplained connection errors, this is worth checking.

A note on CI hardening

The verifier installs testcontainers-doctor as a CI dependency. Given recent supply chain incidents in the npm ecosystem, I pinned the version and blocked install scripts:

```yaml
run: npm install -g --ignore-scripts testcontainers-doctor@1.0.0
```

Small habit. Worth keeping.

Results

| Environment | Outcome |
| --- | --- |
| Local | ✅ p95: 1521ms · pool: 100% · waiting: 15 |
| CI: mvn test | ✅ spread > 800ms confirmed |
| CI: CLI | ✅ exit code 0 |
| Clean machine (fresh clone) | ✅ p95: 1540ms · pool: 100% · waiting: 15 |

The numbers are consistent enough across environments that the phenomenon is clearly portable, not hardware-dependent.

The Interesting Part

Static analysis tells you where to look.
Evidence tells you what happens next.
The gap between those two things is where many production incidents hide.

Developers trust evidence more than warnings. A warning says "this might fail." Evidence says "here is what happens when it does."

And reproducible evidence — the kind you can run on your own machine, in a controlled experiment — is something our tooling still provides surprisingly little of.

Repo: https://github.com/Joaquinriosheredia/java-vibe-guard

testcontainers-doctor: https://www.npmjs.com/package/testcontainers-doctor

Java Production Labs: https://github.com/Joaquinriosheredia/Java-Production-Labs

Tags: #java #testing #docker #opensource #security
