InstaTunnel Team
Published by our engineering team
Stop Writing Docs: How AI Is Auto-Generating Your API Schema from Live Traffic
The age-old developer grievance — “the docs are outdated again” — is finally meeting its match. Not through better discipline or stricter processes, but through a fundamental shift in how documentation gets made in the first place.
We are moving from writing docs to observing them into existence.
The Documentation Crisis Is Real, and the Numbers Prove It
API documentation has always been the unpaid technical debt of software projects. But the scale of the problem has become undeniable.
Postman’s 2025 State of the API Report surveyed thousands of developers and found that 93% of API teams face collaboration blockers — and the most common cause is inconsistent, outdated, or missing documentation. Context gets lost when specs live in Confluence, feedback happens in Slack, and examples are buried in someone’s personal GitHub repo. The result is a scavenger hunt every time someone needs to understand what an API actually does.
The 2024 edition of the same report found that 39% of developers cite inconsistent docs as the single biggest roadblock to working with APIs, even though 58% of teams rely on internal documentation tools. In other words: the tools exist, but the docs still fall apart. And 44% of developers resort to digging through source code directly just to understand an API’s behavior.
The problem is structural, not motivational. Developers aren’t lazy — they’re just fast. And manual documentation can’t keep pace with CI/CD velocity.
Three Approaches That Failed Us
Before we get to what’s working, it’s worth being honest about what didn’t.
Code-first annotations — decorating controllers with @schema and @ApiResponse tags — bloated source code and created a tight coupling between documentation accuracy and developer discipline. When logic changed under deadline pressure, the annotations rarely followed.
Design-first YAML — writing the OpenAPI spec before the code — was architecturally elegant but operationally fragile. The spec became a bottleneck, and developers under crunch would ship features the spec didn’t describe, creating drift the moment code hit production.
Postman Collections — great for testing, weak as formal contracts. They were often incomplete, missed edge cases, and lacked the structural rigor needed for automated client generation or compliance review.
The 2024 Postman report put it plainly: “APIs are no longer an afterthought but the foundation of development, with between 26 and 50 APIs powering the average application.” That level of API surface area cannot be maintained by hand.
The Shift: From Documentation as Task to Documentation as Observation
The approach gaining real traction in 2025 and 2026 is traffic-based API documentation — generating OpenAPI and Postman specs directly from live or pre-production traffic, rather than from developer annotations or manually maintained YAML.
The lead example of this in production is Levo.ai, which uses eBPF (extended Berkeley Packet Filter) — the same kernel-level technology used by Datadog, New Relic, Palo Alto Networks, Cilium, and Sysdig — to passively capture API traffic without code changes, SDK integrations, configuration changes, or sidecar proxies.
Here is how the process actually works:
- Passive traffic capture at the kernel level
Levo’s eBPF sensor installs via a single Helm Chart for Kubernetes or a single Docker command for other environments. Once installed, it captures every API request and response passing through the system — REST, GraphQL, gRPC, and SOAP — without being inline with the workload and without adding latency.
Because eBPF works at the Linux kernel level, it is language-agnostic and framework-agnostic. It doesn’t matter if your backend is Django, Spring Boot, or Express. The network traffic tells the truth regardless.
- Schema inference from observed payloads
The system analyzes the captured traffic to infer types, required vs. optional fields, authentication schemes, status code patterns, and error structures. When it sees a field like "created_at": "2026-04-05T14:30:00Z" repeatedly, it identifies it as an ISO 8601 date-time. When it sees a usr_ prefix on IDs consistently, it captures that pattern. Multiple observations of the same endpoint allow it to distinguish fields that always appear from those that are conditionally present.
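As a rough illustration of this inference step (a minimal sketch under simplifying assumptions, not Levo's actual algorithm), merging several observed payloads for one endpoint might look like this:

```python
import re
from collections import defaultdict

# Matches ISO 8601 date-times like "2026-04-05T14:30:00Z"
ISO_DATETIME = re.compile(
    r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:\d{2})$"
)

def infer_type(value):
    """Map one observed JSON value to a schema type hint."""
    if isinstance(value, bool):   # check bool before int: bool is an int subclass
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string (date-time)" if ISO_DATETIME.match(value) else "string"
    return "object"

def infer_schema(observations):
    """Merge observed payloads for one endpoint into field -> types + required flag.

    A field is treated as required only if it appeared in every observation;
    anything seen in just some responses is flagged as conditionally present.
    """
    types = defaultdict(set)
    counts = defaultdict(int)
    for payload in observations:
        for field, value in payload.items():
            types[field].add(infer_type(value))
            counts[field] += 1
    return {
        field: {
            "types": sorted(types[field]),
            "required": counts[field] == len(observations),
        }
        for field in types
    }
```

Running it over two captured responses, one of which omits an optional `nickname` field, marks `created_at` as a required date-time string and `nickname` as conditionally present.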
- OpenAPI spec generation with AI-enriched metadata
Once enough traffic is observed, the platform generates an OpenAPI-compliant spec that includes endpoint paths, HTTP methods, request and response schemas, query parameter types, authentication requirements, rate limit information, status codes, and error handling patterns. Levo reports that this approach can improve documentation accuracy by up to 95% compared to manually maintained specs, and can reduce the 20–30% drift that typically plagues hand-written documentation.
Crucially, AI-generated human-readable summaries are added to each endpoint — not just field names and types, but context about what the endpoint does and how it should be used. This is documentation that a developer (or an AI agent) can actually act on.
- PII detection before anything leaves your environment
Before any payload data is analyzed, a scrubbing layer identifies and masks sensitive data — emails, credit card numbers, passwords, and other PII, PSI, and PHI fields. Levo’s architecture ensures that less than 1% of your data is ever sent to its SaaS platform, and no PII leaves your environment. Only metadata and OpenAPI specs are transmitted.
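A drastically simplified sketch of such a scrubbing layer might recursively mask values before inference runs. The patterns and key list below are illustrative only; a production detector would be far more thorough:

```python
import re

# Illustrative patterns only; real PII detection covers many more cases
PII_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "<EMAIL>"),
    (re.compile(r"\b(?:\d[ -]?){13,16}\b"), "<CARD_NUMBER>"),
]
SENSITIVE_KEYS = {"password", "ssn", "token"}

def scrub(payload):
    """Mask sensitive values in a decoded JSON payload before schema inference."""
    if isinstance(payload, dict):
        return {
            k: "<REDACTED>" if k.lower() in SENSITIVE_KEYS else scrub(v)
            for k, v in payload.items()
        }
    if isinstance(payload, list):
        return [scrub(v) for v in payload]
    if isinstance(payload, str):
        for pattern, mask in PII_PATTERNS:
            payload = pattern.sub(mask, payload)
        return payload
    return payload
```

The key property is that masking happens on the raw values while leaving field names and structure intact, so the schema can still be inferred from the scrubbed payload.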
The Developer Laptop Use Case
One important detail that gets overlooked: this approach works locally, not just in production or staging.
Levo’s dev laptop support — available as a free tier — lets developers spin up a local sensor via Docker Compose on macOS or Windows, point their browser or API client at it, and generate OpenAPI specs just by using the application. Run your Jest, Pytest, or integration test suite, and the traffic from those tests automatically builds your documentation.
This matters because it means documentation can be generated at the point of development — before anything is merged, before staging, before production. The spec is a side effect of writing tests, not a separate deliverable.
What the Broader Tooling Landscape Looks Like
Traffic-based generation is one approach, but the AI documentation ecosystem has expanded significantly. The tools worth knowing about in 2026:
Levo.ai — The most technically rigorous traffic-based solution. Auto-discovers shadow APIs (undocumented endpoints that still receive traffic), zombie APIs (deprecated endpoints still being called), and internal APIs, in addition to documented ones. Integrates with GitHub, GitLab, Jenkins, Jira, AWS API Gateway, Postman, and Burp Suite. Strong compliance story for PCI, HIPAA, and ISO 27001.
Apidog — Takes a design-first approach: design the API, then generate docs automatically from the living specification. Supports REST, GraphQL, WebSocket, gRPC, SOAP, and Server-Sent Events. Replaces Postman, Swagger Editor, Swagger UI, Stoplight, and mock tools in a single platform. Free plan available; paid plans start at $12/user/month.
Mintlify — The documentation platform of choice for companies like Cursor, Perplexity, Coinbase, and Anthropic. AI-native with git sync, WYSIWYG editing, LLM-optimized output via /llms.txt, and an MCP Server generator that makes your API docs directly accessible to AI coding assistants. Designed for developer experience above all else.
Ferndesk — An AI agent (named Fern) that reads your codebase, support tickets, changelogs, and product videos to draft and update documentation continuously. Auto-syncs OpenAPI specs every 6 hours. Upgrades Swagger 2.0 specs to OpenAPI 3.x automatically.
Knowl.ai — Reads code directly from GitHub, Bitbucket, or GitLab and generates documentation that updates whenever the code changes. Continuous and codebase-integrated.
The Agent-Readiness Dimension
There is a dimension to this shift that goes beyond developer convenience.
According to Postman’s 2025 State of the API Report, 51% of organizations have already deployed AI agents, with another 35% planning to do so within two years. AI agents do not read documentation the way humans do — they parse it, reason over parameters, and issue API calls autonomously, without waiting for human confirmation.
This changes the quality bar for documentation dramatically. An agent working from an outdated or incomplete spec will call the wrong endpoints, pass malformed parameters, or fail to handle error states correctly. The spec is no longer a reference for humans — it is an instruction set for autonomous systems.
The 2025 Postman report found that 89% of developers now use generative AI tools in their daily work, and 41% use AI tools specifically to generate API documentation. But AI-generated documentation from a language model working on source code still depends on the code being accurately annotated and up to date. Traffic-based generation sidesteps this entirely: the spec reflects what the API does in practice, not what someone wrote about it six months ago.
Mintlify describes this succinctly: the best API documentation must be skimmable for humans and machine-readable for agents. Tools that publish at /llms.txt and generate MCP servers for their specs are positioning APIs to be consumed by AI systems as naturally as they are consumed by developers today.
What This Means in Practice
The workflow is shifting from a documentation phase to documentation as an emergent property of development and testing.
If you run your integration tests, the traffic generates the spec. If you push to production, the spec updates. If you deprecate an endpoint that still receives traffic, the system flags it — not because someone remembered to update a YAML file, but because the network doesn’t lie.
Levo estimates this approach can reclaim 30–50% of developer hours previously spent on documentation maintenance, and reduce partner onboarding time by up to 40% through always-accurate, always-current specs.
The documentation crisis was never really about effort. It was about timing: documentation was always being written after the fact, by a different person, in a different tool, against a moving target. Traffic-based, AI-enriched documentation generation collapses that gap entirely.
The spec becomes a continuous reflection of reality — because it is built from reality, not assembled from memory.
Comparison: Traditional vs. Traffic-Based Documentation
| Dimension | Traditional (Manual/Annotation) | Traffic-Based (AI-Observed) |
| --- | --- | --- |
| Effort | High; requires developer time per endpoint | Near zero; generated from test and production traffic |
| Accuracy | Prone to drift; reflects intent, not behavior | Reflects actual wire behavior |
| Update cadence | Manual; often forgotten after release | Continuous; updates with every deployment |
| Shadow API coverage | None | Full; discovers undocumented endpoints automatically |
| PII handling | Manual review required | Automated scrubbing before schema inference |
| Agent-readiness | Depends on human completeness | Structured, machine-readable from generation |
| Security posture | Separate audit process | Integrated; flags misconfigurations out of the box |
Getting Started
If you want to experiment with traffic-based documentation today:
Levo.ai offers a free forever tier for developer laptops. Spin up the sensor with Docker Compose, run your local tests or use your API client as normal, and OpenAPI specs are auto-generated in your Levo dashboard. No code changes required.
Apidog has a free plan with full API design, testing, and documentation features for teams getting started with a design-first approach.
Mintlify is the right choice if you already have specs and need them published beautifully and made AI-accessible.
The question is no longer whether your documentation will be automated. It’s whether you’ll make the shift before your API documentation falls so far behind that it becomes a liability.
Stop writing docs. Start observing them.