Google Agent Smith Writes 25% of Google's Code: What API Teams Should Know

TL;DR

Google’s internal AI coding agent, Agent Smith, now generates over 25% of the company’s new production code. Unlike autocomplete tools such as Copilot, Agent Smith works asynchronously in the background, writing, testing, and iterating on code without real-time human interaction. For API teams, this raises questions about contract stability, test coverage, documentation drift, and review workflows when a quarter of new code is machine-generated.


Introduction

During a March 2026 earnings call, Google CEO Sundar Pichai shared a number that caught the software industry’s attention: AI-generated code now accounts for more than 25% of new code produced at Google.

This is not autocomplete. It is not a developer accepting Copilot suggestions line by line. This is code generated by an AI agent, reviewed by humans, and shipped to production.

The internal tool behind it is called Agent Smith, a nod to the self-replicating antagonist from The Matrix. It reportedly became popular enough across Google’s 180,000+ employees that Google had to throttle access to manage infrastructure strain.

Agent Smith represents a different class of AI coding tool. Copilot and Claude Code usually assist while a developer is actively working. Agent Smith works asynchronously: engineers assign tasks, leave the agent to work, and later review the completed output.

For API teams, this changes the risk model. If autonomous agents can modify endpoints, schemas, validation logic, and tests, you need guardrails that do not depend on every reviewer catching every API-level change manually.

The core questions are practical:

  • How do you keep API contracts stable?
  • How do you detect schema drift?
  • How do you ensure generated tests cover existing behavior?
  • How do you keep docs, mocks, and specs synchronized?
  • How do you review AI-generated API changes without rubber-stamping them?

Apidog’s integrated API lifecycle platform helps keep API design, tests, mocks, and documentation in sync regardless of whether a change comes from a human developer or an AI coding agent.

This article explains what Agent Smith does, how it differs from other AI coding tools, and how API teams can build workflows that are safer for autonomous code generation.

What Agent Smith does

Asynchronous autonomous coding

Agent Smith does not sit in your IDE waiting for inline prompts. It runs in the background.

A typical workflow looks like this:

  1. An engineer describes a task in natural language.
  2. Agent Smith breaks the task into subtasks.
  3. It edits code across multiple files.
  4. It runs tests.
  5. It iterates on failures.
  6. The engineer reviews the completed work.

That makes Agent Smith less like autocomplete and more like a junior developer who picks up a ticket and returns later with a pull request.

For example, instead of asking for a single function implementation, an engineer might assign:

Add user notification preferences to the profile service.
Include persistence, API changes, and tests.

An asynchronous agent may then touch:

  • route handlers
  • request validators
  • database models
  • service logic
  • unit tests
  • integration tests
  • API response types

That breadth is useful, but it also means API behavior can change in places reviewers may not immediately notice.

Google engineers can reportedly delegate tasks and check progress through Google’s internal chat platform, including from mobile devices. The tool can also access relevant employee profiles and internal documentation to pull context from Google’s knowledge base.

Built on Gemini and Antigravity

Agent Smith runs on Google’s Gemini model family and is augmented with retrieval systems that give it access to Google’s internal codebase and documentation.

It is built on top of Antigravity, Google’s agentic coding platform, and extends it with autonomous task decomposition and execution.

The retrieval layer matters. Agent Smith is not generating code in isolation. It can search internal implementations, reference existing patterns, and follow Google-specific conventions.

That context is what makes production-scale output possible. It also shows why your own API workflow needs explicit contracts and validation. The better the agent’s context, the better its output. The more implicit your API rules are, the easier they are to miss.

What “25% of new code” means

Pichai’s figure needs a precise reading.

“25% of new code” refers to code that:

  • is generated by AI, not merely autocompleted
  • passes human code review
  • ships in production systems
  • is measured across Google’s engineering output

It does not mean 25% of Google’s total historical codebase is AI-generated. It means that, at the time of the statement, over 25% of newly produced code was generated by AI.

The direction is still significant: autonomous coding is moving from experiment to production workflow.

How Agent Smith differs from other AI coding tools

The AI coding tool spectrum

| Tool | Mode | Interaction | Scope | Production code? |
| --- | --- | --- | --- | --- |
| GitHub Copilot | Real-time autocomplete | Inline in IDE | Line/function level | After human acceptance |
| Claude Code | Interactive session | Conversational | Multi-file changes | After human review |
| Cursor Agent | Background + interactive | IDE-embedded | Project-level | After human review |
| Agent Smith | Asynchronous autonomous | Task delegation | Full feature implementation | After human review |
| KAIROS (unreleased) | Always-on daemon | Background monitoring | Repository-wide | TBD |

Agent Smith sits near the autonomous end of the spectrum.

The next step would be fully autonomous deployment without human review. No major tool does that yet, and for production API systems, it should not be the default.

Why asynchronous coding changes API review

With real-time AI tools, the developer usually sees each suggestion as it appears. They know what they asked for, why the code changed, and which assumptions were made.

With asynchronous agents, the reviewer sees the result after the work is done.

That creates several API-specific risks:

  • The reviewer may not know why a response format changed.
  • Contract changes may be buried inside implementation diffs.
  • Tests may validate only the new behavior.
  • Documentation, mocks, and SDK types may not be updated.
  • Breaking changes may pass code review if the reviewer focuses only on implementation correctness.

For API teams, the fix is not “review harder.” The fix is to make contracts executable and enforce them in CI.

What breaks when AI writes your API code

API contract drift

An API contract defines what consumers can rely on:

  • endpoints
  • methods
  • request schemas
  • response schemas
  • status codes
  • error formats
  • authentication requirements
  • pagination behavior
  • versioning rules

When humans modify an API, they may remember to update the OpenAPI spec, notify consumers, or version the change. Autonomous agents do not automatically know every coordination step unless your workflow enforces it.

Example scenario:

  1. Agent Smith is assigned: “Add user preferences to the profile endpoint.”
  2. It adds a preferences field to GET /api/users/{id}.
  3. Existing tests pass because they do not reject additional fields.
  4. A frontend TypeScript type does not include preferences.
  5. A mobile client with strict JSON parsing fails on the unexpected field.

The implementation may be reasonable. The tests may pass. The API contract may still be broken.
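
To make that failure mode concrete, here is a minimal sketch of a strict client-side parser rejecting the new field. It uses Ajv, and the schema and field names are assumptions that mirror the scenario above.

// Hypothetical strict client schema for GET /api/users/{id}
const Ajv = require("ajv");
const ajv = new Ajv();

const userSchema = {
  type: "object",
  required: ["id", "name", "email"],
  properties: {
    id: { type: "string" },
    name: { type: "string" },
    email: { type: "string" }
  },
  additionalProperties: false // strict parsing: unknown fields are rejected
};

const validateUser = ajv.compile(userSchema);

// Response body after the agent added "preferences" to the endpoint
const body = {
  id: "123",
  name: "Ada",
  email: "ada@example.com",
  preferences: { theme: "dark" }
};

console.log(validateUser(body)); // false — the unexpected field breaks the strict client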

Test coverage gaps

AI agents can generate tests, but those tests often validate what the agent just built. They may not protect existing behavior.

For APIs, missing coverage often includes:

  • exact response schemas
  • standard error formats
  • authentication edge cases
  • authorization failures
  • rate limiting
  • pagination consistency
  • sorting behavior
  • backward compatibility
  • latency expectations
  • idempotency rules

A generated test like this is useful but incomplete:

it("returns user preferences", async () => {
  const response = await request(app).get("/api/users/123");

  expect(response.status).toBe(200);
  expect(response.body.preferences).toBeDefined();
});

It confirms the new field exists. It does not confirm the full response still matches the contract.
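
For instance, a test for the error path — something generated suites rarely include — might look like the sketch below. The error shape shown here is an assumption; substitute your team's standard error schema.

it("returns the standard error format for a missing user", async () => {
  const response = await request(app).get("/api/users/does-not-exist");

  expect(response.status).toBe(404);
  // Assumed standard error shape: { error, code }
  expect(response.body).toEqual(
    expect.objectContaining({
      error: expect.any(String),
      code: expect.any(String)
    })
  );
});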

Documentation drift

If your API docs are generated directly from OpenAPI, contract updates can flow into documentation automatically.

But many teams still maintain some API docs separately:

  • endpoint descriptions
  • examples
  • onboarding guides
  • migration notes
  • SDK usage snippets
  • consumer-specific caveats

When an agent changes an endpoint, those docs may not be updated unless the task explicitly includes it or your workflow requires it.

Even generated docs need human context. An agent can describe what an endpoint returns. It may not know why the endpoint exists, which consumers depend on it, or what migration constraints apply.

Review fatigue

AI-generated code often looks clean. It is formatted, consistent, and plausible.

That makes review harder, not easier.

Reviewers need to look beyond syntax and ask:

  • Does this match the API contract?
  • Does this preserve consumer expectations?
  • Does this follow versioning rules?
  • Are error responses consistent?
  • Are docs, mocks, tests, and schemas updated together?

If 25% of code is generated by agents, review volume increases. Without automated API checks, teams risk gradually rubber-stamping changes that look fine but break consumers.

How to build agent-proof API workflows

1. Make the API contract the source of truth

Design-first API development is the strongest defense against agent-induced drift.

Without a contract-first workflow:

Code change → Tests pass → Ship → Consumer breakage discovered later

With a contract-first workflow:

OpenAPI spec defines contract → Code must match spec → CI catches drift

Use your OpenAPI spec to define:

  • paths
  • methods
  • parameters
  • request bodies
  • response bodies
  • status codes
  • error schemas
  • auth requirements

Then validate implementation behavior against that spec.
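
As a rough illustration, the sketch below loads the declared 200 response schema for one endpoint from the OpenAPI file and checks a live response against it. The file path, endpoint, and local URL are assumptions; it uses js-yaml and Ajv, and Node 18+ for the global fetch.

const fs = require("fs");
const yaml = require("js-yaml");
const Ajv = require("ajv");

const spec = yaml.load(fs.readFileSync("openapi/openapi.yaml", "utf8"));

// Pull the declared 200 response schema for GET /api/users/{id}
const schema =
  spec.paths["/api/users/{id}"].get.responses["200"].content["application/json"].schema;

// strict: false tolerates OpenAPI-specific keywords and unknown formats in the schema
const ajv = new Ajv({ allErrors: true, strict: false });
const validate = ajv.compile(schema);

async function checkImplementation() {
  const res = await fetch("http://localhost:3000/api/users/123");
  const body = await res.json();

  if (!validate(body)) {
    console.error("Implementation drifted from the spec:", validate.errors);
    process.exit(1);
  }
}

checkImplementation();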

Apidog’s visual API designer lets teams define endpoints, schemas, and response formats before implementation. When Agent Smith or another agent generates code, you validate the output against the spec instead of relying only on generated tests.

2. Use contract tests, not only unit tests

Unit tests validate internal behavior. Contract tests validate the agreement between your service and its consumers.

For AI-generated API changes, contract tests catch issues unit tests often miss.

Example using a strict response schema:

// toMatchSchema is provided by a schema matcher such as jest-json-schema.
const { matchers } = require("jest-json-schema");
expect.extend(matchers);

// This test fails if the response shape changes,
// even if the new shape looks reasonable.
describe("GET /api/users/:id contract", () => {
  it("returns the expected schema", async () => {
    const response = await request(app).get("/api/users/123");

    expect(response.body).toMatchSchema({
      type: "object",
      required: ["id", "name", "email", "created_at"],
      properties: {
        id: { type: "string" },
        name: { type: "string" },
        email: { type: "string", format: "email" },
        created_at: { type: "string", format: "date-time" }
      },
      additionalProperties: false
    });
  });
});

The important line is:

additionalProperties: false

Without it, an agent can add response fields and still pass the test. With it, any schema change must be intentional and reflected in the contract.

Apidog can automate contract testing from your API spec, so responses are validated against the declared schema during manual testing and CI/CD runs.

3. Gate deployments on spec validation

Add API contract validation to your CI/CD pipeline.

A basic pipeline step should fail the build if the running implementation does not match the declared API contract.

Example:

- name: Validate API contract
  run: |
    if ! apidog run --test-scenario-id CONTRACT_TESTS; then
      echo "API contract violation detected. Review API changes."
      exit 1
    fi

This gives you a hard deployment gate for both human-written and AI-generated code.

The goal is simple: no implementation ships unless it matches the spec.

4. Require spec updates for API behavior changes

Create a team rule:

Any PR that changes API behavior must include the corresponding OpenAPI update.

This should apply to:

  • new endpoints
  • removed endpoints
  • new request fields
  • changed request validation
  • new response fields
  • changed response fields
  • changed status codes
  • changed error formats
  • auth or permission changes
  • pagination changes

For AI-generated PRs, the agent must update the spec, or a human must update it before merge.
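
One way to enforce the rule automatically is a small CI script that fails when API code changes without a matching spec change. This is a minimal sketch: the directory paths and base branch are assumptions, and it assumes the base branch has been fetched in CI.

const { execSync } = require("child_process");

// Files changed in this PR relative to the assumed base branch
const changed = execSync("git diff --name-only origin/main...HEAD")
  .toString()
  .trim()
  .split("\n")
  .filter(Boolean);

const apiCodeChanged = changed.some((file) => file.startsWith("src/api/"));
const specChanged = changed.some((file) => file.startsWith("openapi/"));

if (apiCodeChanged && !specChanged) {
  console.error("API code changed without an OpenAPI spec update. Blocking merge.");
  process.exit(1);
}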

In Apidog, spec changes can propagate to:

  • API documentation
  • mock server responses
  • test assertions
  • client SDK types

That reduces the chance that code, docs, tests, and mocks drift apart.

5. Add API-specific CI checks

General test suites are not enough. Add checks that focus on API compatibility.

Useful CI checks include:

  • OpenAPI linting
  • OpenAPI diff checks
  • Contract tests
  • Backward compatibility checks
  • Mock validation
  • Generated SDK type checks
  • Error schema validation

For example, your CI workflow could include:

name: API validation

on:
  pull_request:
    paths:
      - "src/api/**"
      - "openapi/**"
      - "tests/contract/**"

jobs:
  validate-api:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Lint OpenAPI spec
        run: |
          npx @redocly/cli lint openapi/openapi.yaml

      - name: Run contract tests
        run: |
          npm run test:contract

      - name: Validate API scenarios
        run: |
          apidog run --test-scenario-id CONTRACT_TESTS

The exact commands depend on your stack. The principle is what matters: API behavior should be validated separately from implementation logic.
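
For example, a very small backward-compatibility check could compare the required response fields declared in the base spec against the PR's spec. Dedicated diff tools cover far more cases; this sketch only illustrates the idea, and the file names and endpoint are assumptions.

const fs = require("fs");
const yaml = require("js-yaml");

// Required fields for the 200 response of GET on the given path
function requiredFields(spec, path) {
  const response = spec.paths?.[path]?.get?.responses?.["200"];
  return response?.content?.["application/json"]?.schema?.required ?? [];
}

const baseSpec = yaml.load(fs.readFileSync("openapi/openapi.base.yaml", "utf8"));
const headSpec = yaml.load(fs.readFileSync("openapi/openapi.yaml", "utf8"));

const path = "/api/users/{id}";
const removed = requiredFields(baseSpec, path).filter(
  (field) => !requiredFields(headSpec, path).includes(field)
);

if (removed.length > 0) {
  console.error(`Breaking change on ${path}: required fields removed: ${removed.join(", ")}`);
  process.exit(1);
}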

6. Monitor API behavior in production

Pre-production checks reduce risk, but production monitoring still matters.

Track signals such as:

  • responses that do not match the declared schema
  • unexpected fields appearing in responses
  • missing required fields
  • error rate changes
  • status code distribution changes
  • latency changes
  • new endpoint traffic patterns
  • increased validation failures
  • consumer-specific failures

For example, you can log schema validation failures at the edge:

// "logger" is assumed to be your application logger (e.g. pino or winston).
const Ajv = require("ajv");
const ajv = new Ajv({ allErrors: true });

function validateResponse(schema, body, route) {
  const valid = ajv.validate(schema, body);

  if (!valid) {
    logger.warn({
      route,
      errors: ajv.errors,
      message: "API response schema violation"
    });
  }

  return valid;
}

Do not rely on consumers to discover contract issues first.
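
If your service uses Express, one way to wire in the check is log-only response middleware. This is a sketch under that assumption; schemaForPath is a hypothetical lookup from request path to declared response schema, and validateResponse is the function above.

function schemaMonitor(schemaForPath) {
  return (req, res, next) => {
    const originalJson = res.json.bind(res);

    res.json = (body) => {
      const schema = schemaForPath(req.path);
      if (schema) {
        // Log-only: record violations without blocking the response
        validateResponse(schema, body, req.path);
      }
      return originalJson(body);
    };

    next();
  };
}

// app.use(schemaMonitor(lookUpResponseSchema));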

7. Separate API review from code review

Code review asks:

Does this implementation work?

API review asks:

Does this change affect consumers?

For AI-generated API changes, use a dedicated checklist.

Example API review checklist:

## API review checklist

- [ ] Does this PR add, remove, or modify an endpoint?
- [ ] Is the OpenAPI spec updated?
- [ ] Are request and response schemas accurate?
- [ ] Are status codes documented?
- [ ] Are error responses consistent with the existing error format?
- [ ] Are backward-incompatible changes versioned?
- [ ] Are contract tests updated?
- [ ] Are mocks updated?
- [ ] Are API docs and examples updated?
- [ ] Have downstream consumers been notified if needed?

Put this checklist in your pull request template so it applies to both human and AI-generated changes.

8. Give agents better API instructions

If your team uses autonomous coding tools, encode API rules in the repository.

Examples:

/api-guidelines.md
/openapi/openapi.yaml
/docs/api-review-checklist.md
/tests/contract/
/examples/api-responses/

Your agent instructions should be explicit:

When modifying API behavior:

1. Update the OpenAPI spec.
2. Add or update contract tests.
3. Preserve backward compatibility unless the task explicitly requests a breaking change.
4. Do not add undocumented response fields.
5. Use the standard error schema.
6. Update examples and mocks.
7. Mention API behavior changes in the PR summary.

Autonomous agents work better when conventions are written down. If your API rules live only in senior engineers’ heads, agents will miss them.

The trajectory: where autonomous coding is heading

Agent Smith today vs. tomorrow

Agent Smith at 25% is likely the starting point, not the endpoint.

Sergey Brin called AI agents a “big focus” during a March 2026 sales town hall. As models improve, access restrictions loosen, and engineering workflows adapt, the percentage of AI-generated code is likely to grow.

Other companies are moving in similar directions:

  • Claude Code’s KAIROS, reportedly leaked in source code, suggests an always-on daemon with GitHub webhook subscriptions and background workers.
  • GitHub Copilot Agent Mode supports multi-step coding tasks with autonomous file editing.
  • Amazon’s CodeWhisperer has been expanding from autocomplete toward more agentic workflows.

The trend is clear: AI coding tools are moving from assistant to autonomous contributor to background infrastructure.

For API teams, the question is not whether AI will touch your API code. It is how safely your workflow handles it.

What API teams should prepare for now

Design-first development is becoming more important. When agents write implementation code, the API spec becomes the stable artifact reviewers and automation can trust.

Contract testing is also becoming mandatory. Unit tests are useful, but they do not fully encode consumer expectations. Contract tests make those expectations explicit.

Integrated tooling matters too. Disconnected tools create drift:

  • Separate API client
  • Separate test runner
  • Separate mock server
  • Separate docs generator
  • Separate SDK generator

Each disconnected artifact is another thing an AI agent may forget to update.

Platforms like Apidog help keep specs, tests, mocks, and docs synchronized so API changes are easier to validate and review.

FAQ

What is Google Agent Smith?

Agent Smith is Google’s internal AI coding agent built on the Gemini model family and the Antigravity platform. It works asynchronously in the background: engineers assign tasks, and Agent Smith writes, tests, and iterates on code without real-time human interaction. It generated over 25% of Google’s new production code as of March 2026.

Is Agent Smith available outside Google?

No. Agent Smith is an internal tool restricted to Google employees. Google has not announced plans for a public release. The technology is similar to Copilot Agent Mode and Claude Code, but it is more deeply integrated with Google’s internal codebase and documentation systems.

Does AI-generated code break API contracts?

It can. AI agents write code that passes tests, but tests may not cover all parts of your API contract. Schema changes, new response fields, different error formats, and behavioral changes can pass tests while breaking downstream consumers. Contract testing and design-first development reduce this risk.

Should API teams worry about Agent Smith?

Not about Agent Smith specifically, since it is Google-internal. But API teams should pay attention to the trend it represents. Similar autonomous coding tools are reaching normal development workflows. Preparing now with design-first APIs, contract testing, and integrated tooling makes adoption safer.

How do I prevent AI agents from breaking my APIs?

Use the OpenAPI spec as the source of truth. Add strict contract tests, including additionalProperties: false where appropriate. Gate deployments on spec validation. Require spec updates for API behavior changes. Use tooling such as Apidog to synchronize specs, tests, mocks, and documentation.

What is the difference between AI-assisted and AI-generated code?

AI-assisted code is produced with real-time human oversight, such as Copilot suggestions or interactive Claude Code sessions. The developer sees and approves changes as they happen.

AI-generated code, in the Agent Smith model, is produced asynchronously. The developer assigns a task and reviews completed work later. That separation changes review dynamics and increases the need for automated validation.

Will AI agents replace API developers?

No. Agent Smith still requires human task definition, code review, and deployment approval. A March 2026 MIT study confirmed that AI augments developer productivity but does not replace the judgment, context awareness, and architectural thinking that humans provide. The role shifts toward defining tasks, reviewing output, and maintaining system coherence.

Key takeaways

  • Google’s Agent Smith generates over 25% of new production code through asynchronous autonomous operation.
  • This marks a shift from AI-assisted coding to AI-generated code.
  • API contract drift is one of the biggest risks when autonomous agents modify endpoints and schemas.
  • Design-first development with OpenAPI as the source of truth helps prevent contract breakage.
  • Contract testing catches API changes that unit tests often miss.
  • Deployment gates should validate implementation behavior against the declared spec.
  • API review should be separate from general code review.
  • Integrated platforms like Apidog help synchronize specs, tests, mocks, and docs.
  • Autonomous coding is accelerating, so API teams should build agent-proof workflows now.

Agent Smith at 25% is the beginning. Teams that build reliable API contracts, automated validation, and synchronized API workflows today will be better prepared to use autonomous coding tools safely tomorrow.
