Dennis Traub for AWS

Posted on Mar 27

8 Agents Wrote Perfect Components - And Nothing Worked

#ai #programming #productivity #architecture

TL;DR: Parallel AI agents don't coordinate on shared contracts such as column names, URL paths, parameter formats, or identifiers. Extract those contracts into a single reference file before generation, and run a review agent that traces end-to-end data flows once the parallel agents are done. In my case, that review found all 17 bugs in a single pass.

I launched 8 AI agents in parallel to build a full-stack app on AWS: infrastructure stacks, a React frontend, and a Java backend. Each agent owned one piece, and they all delivered clean, compiling code. The CDK type-checked, the Java backend followed Spring Boot conventions, the React UI looked nice.

But when I tried to wire them together I hit bugs at every single boundary.

The architecture

A full-stack app on AWS with a lot of moving parts. Multiple CDK stacks for the infrastructure (IAM, VPC, DB with seed functions, Cognito, CodePipeline, CloudFront/WAF), a Spring Boot backend on ECS Fargate, and a React frontend hosted on S3.

The implementation plan was thorough and covered every component. But it wasn't detailed enough for agents that need to agree on shared contracts.

The bugs

The first two block everything. Bugs 3 through 5 only show up after you fix the previous ones.

Bug 1: The Spring Boot app won't even start

The seed data function creates a schema with passenger_id and full_name, but the Spring Boot entity maps to id and name:

-- Agent 1: seed data function creates the schema
CREATE TABLE passengers (
    passenger_id   VARCHAR(64) PRIMARY KEY,
    full_name      VARCHAR(255) NOT NULL,
    ...
);

// Agent 2: The Spring Boot entity maps the table
@Column(name = "id")       // Schema says "passenger_id"
@Column(name = "name")     // Schema says "full_name"

With ddl-auto: validate, Hibernate validates the entity mapping against the actual schema on startup. The mapped columns id and name don't exist in the table, so validation fails and the ECS task crashes before serving a single request.
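A mismatch like this can be spotted without starting the app, by comparing the column names in the DDL with the names the entity maps. Here's a minimal sketch of that idea, not code from the project: the class ColumnContractCheck, its regex, and the way the inputs are passed in are all assumptions for illustration, and the parser only handles simple "name TYPE" column lines with lower-case column names.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: compares DDL column names with entity-mapped names.
final class ColumnContractCheck {

    // Matches "<lower_case_identifier> <TYPE>" at the start of a line.
    private static final Pattern COLUMN_LINE =
            Pattern.compile("^\\s*([a-z_][a-z0-9_]*)\\s+[A-Z]+", Pattern.MULTILINE);

    // Extracts column names from one well-formed CREATE TABLE statement.
    static Set<String> columnsInDdl(String ddl) {
        String body = ddl.substring(ddl.indexOf('(') + 1, ddl.lastIndexOf(')'));
        Set<String> columns = new LinkedHashSet<>();
        Matcher m = COLUMN_LINE.matcher(body);
        while (m.find()) {
            columns.add(m.group(1));
        }
        return columns;
    }

    // Returns every mapped column name that the schema does not define.
    static Set<String> missingColumns(String ddl, List<String> mappedColumns) {
        Set<String> missing = new LinkedHashSet<>(mappedColumns);
        missing.removeAll(columnsInDdl(ddl));
        return missing;
    }
}
```

Fed the two columns from the schema above and the entity's mapped names id and name, missingColumns reports both id and name as missing. That's the same answer Hibernate gives at startup, just earlier.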

Bug 2: Every call returns 404

The CDK stack registers ALB routes for /approve and /generate while the Java client sends requests to /voucher/approve and /voucher/generate:

CDK ALB routes:  /approve, /generate
Java client:     /voucher/approve, /voucher/generate

Both agents wrote correct, working code in isolation, but the CDK stack used clean paths while the Java client added a service prefix. Neither checked the other.
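The check that neither agent performed is small. Here's a sketch, assuming both sides can be reduced to plain lists of path strings; RouteContractCheck is a hypothetical name, and the comparison is an exact string match, so it ignores wildcard path patterns that real ALB rules may use.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper: finds client paths that no registered route serves.
final class RouteContractCheck {

    // Exact-match comparison only; wildcard patterns are not handled here.
    static List<String> unroutedPaths(Set<String> registeredRoutes, List<String> clientPaths) {
        return clientPaths.stream()
                .filter(path -> !registeredRoutes.contains(path))
                .collect(Collectors.toList());
    }
}
```

With the routes /approve and /generate on one side and the client paths /voucher/approve and /voucher/generate on the other, both client paths come back as unrouted.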

Bug 3: Missing request fields

A downstream service validates four required fields. The Java client sends three:

Lambda expects:  escalationId, passengerId, amount, situation
Java sends:      escalationId, passengerId, amount

Even with the URLs from bug 2 fixed, every approval returns 400.

Bug 4: User lookup doesn't work

This one was the most interesting: three systems work with the user, and each one uses its own identifier format:

Cognito custom attribute:  custom:passenger_id = "pax-a1b2c3d4-e5f6-..."
RDS seed data:             passenger_id = "PAX-a1b2c3d4-e5f6-..."
JWT subject claim:         sub = "a1b2c3d4-e5f6-..."  (Cognito UUID)

The backend uses jwt.getSubject() to look up the user. That's a Cognito UUID - neither prefixed with pax- nor with PAX-. No user lookup ever returns a result.

Three agents. Three naming conventions. Zero coordination.
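One way to make identifiers like these comparable is to agree on a canonical form and convert at every boundary. The sketch below assumes, as the example values suggest, that all three formats share the same UUID part and differ only in the pax-/PAX- prefix; if the Cognito sub is unrelated to the passenger ID, normalizing won't help and the backend would have to read the custom attribute instead. PassengerIds and its methods are hypothetical names, not code from the project.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper: converts the three identifier formats to one canonical form.
final class PassengerIds {

    private static final String PREFIX = "pax-";

    // Canonical form assumed here: lower-case, without the "pax-" prefix.
    static String canonical(String rawId) {
        String lower = rawId.trim().toLowerCase(Locale.ROOT);
        return lower.startsWith(PREFIX) ? lower.substring(PREFIX.length()) : lower;
    }

    // Looks up a value whose key may be stored in any of the raw formats.
    static <V> Optional<V> find(Map<String, V> byRawId, String rawId) {
        String wanted = canonical(rawId);
        return byRawId.entrySet().stream()
                .filter(e -> canonical(e.getKey()).equals(wanted))
                .map(Map.Entry::getValue)
                .findFirst();
    }
}
```

An exact-match lookup of a bare UUID against keys stored with a PAX- prefix finds nothing, which is what the backend did. The same lookup through find succeeds because both sides are normalized first.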

Bug 5: Every status lookup returns "not found"

A downstream service returns JSON. The Java client parses XML:

{"status": "FOUND_LOCAL", "location": "Warehouse-B-Shelf-47"}

String status = extractXmlElement(xml, "status");  // Looks for <status>...</status>

No XML tags in a JSON string. extractXmlElement returns empty for every single request.

The agent that wrote the downstream service followed one spec (JSON). The agent that wrote the Java client followed a different spec (XML).
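To see why the lookup always comes back empty, here's a small sketch of both parsing approaches side by side. The body of extractXmlElement is my own stand-in, since only the call is shown above, and extractJsonString is a hypothetical regex-based reader that only handles flat string fields without escaped quotes. In a real Spring Boot client you'd use a JSON library such as Jackson instead.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helpers: XML-style versus JSON-style extraction of one field.
final class StatusParsing {

    // Stand-in for the client's helper: returns "" when no <tag>...</tag> is found.
    static String extractXmlElement(String body, String tag) {
        String quoted = Pattern.quote(tag);
        Matcher m = Pattern.compile("<" + quoted + ">(.*?)</" + quoted + ">", Pattern.DOTALL)
                .matcher(body);
        return m.find() ? m.group(1) : "";
    }

    // Illustration only: reads a string field from a flat JSON object.
    static Optional<String> extractJsonString(String body, String field) {
        Matcher m = Pattern.compile("\"" + Pattern.quote(field) + "\"\\s*:\\s*\"([^\"]*)\"")
                .matcher(body);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

Applied to the JSON response above, extractXmlElement returns an empty string, while extractJsonString returns FOUND_LOCAL for the status field.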

Bugs 6 to 17: SSM parameter path mismatches

One CDK stack writes an SSM parameter. Another CDK stack reads it. But they never coordinated on paths:

Producer stack writes:  /${AppName}/test/data/rds-secret-arn
Consumer stack reads:   /${AppName}/${Env}/data/rds-password-secret-arn

...

Twelve SSM parameters mismatched between producer and consumer stacks. The app fails on every one of them.
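Path mismatches like the pair above can be listed mechanically once both sides are written down. The following sketch resolves the placeholders and reports every path a consumer reads that no producer writes. SsmPathCheck is a hypothetical name, and the placeholder values in the usage note (myapp, test) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: finds parameter paths that are read but never written.
final class SsmPathCheck {

    // Replaces ${Name} placeholders; unknown placeholders are left untouched.
    static String resolve(String template, Map<String, String> values) {
        String resolved = template;
        for (Map.Entry<String, String> e : values.entrySet()) {
            resolved = resolved.replace("${" + e.getKey() + "}", e.getValue());
        }
        return resolved;
    }

    // Returns resolved consumer paths with no matching producer path.
    static List<String> danglingReads(List<String> producerTemplates,
                                      List<String> consumerTemplates,
                                      Map<String, String> values) {
        Set<String> written = new HashSet<>();
        for (String template : producerTemplates) {
            written.add(resolve(template, values));
        }
        List<String> dangling = new ArrayList<>();
        for (String template : consumerTemplates) {
            String path = resolve(template, values);
            if (!written.contains(path)) {
                dangling.add(path);
            }
        }
        return dangling;
    }
}
```

With AppName set to myapp and Env set to test, the example pair still fails: the environment segment now lines up, but rds-secret-arn and rds-password-secret-arn are different names, so the consumer path is reported as dangling.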

Why parallel agents can't catch this

Each agent had context about the overall plan and its own component. But none of them could see the implementation details that the others came up with.

When I write an app, I hold the contracts in working memory. "The column is passenger_id, so I'll use that in both the migration and the entity." But an AI agent writing the migration doesn't know what the entity agent chose for its column name - and vice versa.

The plan contained all the high-level information, but the agents were reading different sections and making their own calls on the shared details.

Each agent wrote correct code that followed good conventions. But they never coordinated. Like digging a tunnel from two sides of a mountain - without ever checking in with each other.

How I found all of them at once

After generation, but before actually deploying the app, I ran an architecture review agent with a simple instruction:

Trace the actual data flow from user login through form submission to the downstream service calls, following every cross-component boundary.

It found every one of the bugs in a single pass.

The review agent started at the user-facing entry point, traced the request through every boundary, and at each one checked whether what one component sent actually matched what the next one expected. It's the same check integration tests perform after deployment, except you catch the mismatches before deploying anything.
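The reason a trace finds everything at once, while running the app reveals bugs one at a time, is that a trace doesn't stop at the first failure. Here's a minimal sketch of that idea as plain data. It doesn't show how the review agent works internally; FlowTrace and Boundary are hypothetical names, and reducing each boundary to one sent value and one expected value is a simplification.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model: a user flow as an ordered list of component boundaries.
final class FlowTrace {

    // One hop in the flow: what the caller sends and what the callee expects.
    static final class Boundary {
        final String name;
        final String sent;
        final String expected;

        Boundary(String name, String sent, String expected) {
            this.name = name;
            this.sent = sent;
            this.expected = expected;
        }
    }

    // Walks every boundary in order and collects all mismatches,
    // instead of stopping at the first one as a running app would.
    static List<String> trace(List<Boundary> flow) {
        List<String> findings = new ArrayList<>();
        for (Boundary b : flow) {
            if (!b.sent.equals(b.expected)) {
                findings.add(b.name + ": sends '" + b.sent + "' but '" + b.expected + "' is expected");
            }
        }
        return findings;
    }
}
```

At runtime, the startup crash from bug 1 hides everything behind it. In the trace, the column mismatch, the path mismatch, and the response format mismatch all land in the same list of findings.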

How to prevent seam bugs

Before launching parallel agents, pull every shared contract out of the plan into a single reference file and pass it to every agent as mandatory context.
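What goes into such a file depends on the app. As a rough sketch, here's one way to keep the contracts from the bugs above in one place and render them as plain text for each agent's context. SharedContracts, the key names, and the header line are hypothetical, and the values simply pick one side of each mismatch for illustration; the PAX-<uuid> format in particular is my assumption about the identifier layout.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical single source of truth for contracts shared between components.
final class SharedContracts {

    // Each entry pins an exact string that two or more components must agree on.
    static Map<String, String> entries() {
        Map<String, String> contracts = new LinkedHashMap<>();
        contracts.put("db.passengers.columns", "passenger_id, full_name");
        contracts.put("api.voucher.paths", "/voucher/approve, /voucher/generate");
        contracts.put("api.approve.requiredFields", "escalationId, passengerId, amount, situation");
        contracts.put("id.passenger.format", "PAX-<uuid>"); // assumed layout
        contracts.put("api.status.responseFormat", "JSON");
        return contracts;
    }

    // Renders the contracts as a text block to hand to every agent.
    static String asReference() {
        StringBuilder sb = new StringBuilder("SHARED CONTRACTS (use these exact strings)\n");
        for (Map.Entry<String, String> e : entries().entrySet()) {
            sb.append(e.getKey()).append(" = ").append(e.getValue()).append('\n');
        }
        return sb.toString();
    }
}
```

The point is less the format than the fact that every exact string lives in one place, so no agent has to guess a name that another agent also has to guess.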

Then, after your parallel agents have done their thing, run a review agent that traces a few real user flows across all the boundaries.

Fix the seam bugs in one pass, then deploy.

FAQ

What are seam bugs in AI-generated code?

Seam bugs are integration defects at the boundaries between components built by different AI agents. Each agent writes correct, working code in isolation, but the components don't fit together because the agents each made their own decisions about shared details - things like what a column is called, what path an API lives at, or what format an identifier uses.

Why does parallel AI code generation produce integration bugs?

Each agent only sees its own component and the plan it was given. When two agents need to agree on something - say, what a database column is called - they each pick a reasonable name independently. Those names often don't match. The plan says what the column should represent, but not necessarily the exact string both sides should use.

How do you catch integration bugs from parallel AI agents?

Run a single review agent after generation that traces real user flows across all the boundaries. Give it a prompt like "trace the data flow from user login through the frontend and backend to the databases and downstream service calls, checking every boundary." In my case, it caught all the mismatches in one pass.
