
SurfaceDocs

The Missing Layer in Every AI Pipeline

You built the pipeline. The model works. The output is good — surprisingly good, actually. You paste it into Slack to show your team.

Then someone asks: "Cool, but where do users see this?"

And you realize you have no idea.

The Duct-Tape Gallery

If you've built anything with LLMs, you've done at least one of these:

  • print(response.choices[0].message.content) and called it a day
  • Written output to a local markdown file that only you can access
  • Built a "quick" React app that took three weeks and still doesn't handle tables
  • Piped structured output into a Google Doc via a script that breaks every time Google changes their API
  • Dumped JSON into Notion through an integration that rate-limits you after 3 requests
  • Stored everything in S3 and built a viewer on top because apparently that's your job now

Here's the thing — every single one of these is a document rendering and hosting problem disguised as an application feature. You're not building output infrastructure because you want to. You're building it because nothing exists.

Meanwhile, your actual pipeline looks like this:

# The part you spent weeks on
response = openai.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    response_format={"type": "json_schema", "json_schema": schema},
)

# The part you spent months on
output = response.choices[0].message.content
# ... now what?

That # ... now what? comment is doing a lot of heavy lifting in production codebases.

This Is an Architecture Problem

Think about any well-designed system. There's always a three-layer pattern:

Input → Processing → Output

In web apps: HTTP requests → business logic → rendered HTML. In data engineering: sources → transformations → warehouses and dashboards. In CI/CD: commits → build/test → artifacts and deployments.

Now look at a typical AI pipeline:

Input (APIs, databases, user prompts) → Processing (LLMs, agents, chains, RAG) → Output (???)

The output layer is conspicuously absent. Not under-invested — absent. Pull up any AI architecture diagram on the internet. You'll see boxes for vector databases, embedding models, orchestration frameworks, prompt management, evaluation suites, guardrails, observability. The arrow at the end just says "response" and points at nothing.

We've built an entire ecosystem around getting data into models and making models smarter. The ecosystem for getting structured output out of models and into the hands of users essentially doesn't exist.

This isn't a convenience problem. It's a missing architectural layer.

What an Output Layer Actually Requires

Once you frame it as infrastructure, the requirements become obvious:

API-first. Machines are generating this content, not humans. The primary interface needs to be a programmatic one. If the first step is "open a browser," you've already failed.

Structured content handling. LLM output isn't plain text. It's headings, code blocks, tables, lists, diagrams. The output layer needs to understand document structure natively, not treat everything as a string.

Instant shareability. The output needs a URL the moment it's created. No deploy step, no build process, no "publish" button. Generated → shareable, in one call.

Zero infrastructure. If I have to provision a server, set up a database, or configure a CDN to display AI-generated content, I'm solving the wrong problem. The output layer should be as invisible as the logging layer.

Built for machine-generated volume. Humans write a few documents a day. AI agents can generate hundreds per hour. The output layer needs to handle this without rate limits designed for human typing speed.

None of these requirements are exotic. They're just... not met by anything that currently exists in the AI toolchain.
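Stripped down, those requirements describe a very small surface area. As a rough sketch — `OutputLayer`, `InMemoryOutputLayer`, and the URL format here are illustrative stand-ins, not the real SDK — the whole contract fits in a protocol with one method:

```python
from dataclasses import dataclass
from typing import Protocol
import uuid


@dataclass
class SaveResult:
    url: str


class OutputLayer(Protocol):
    """The entire contract: structured content in, shareable URL out."""

    def save(self, content: dict) -> SaveResult: ...


class InMemoryOutputLayer:
    """Toy stand-in: stores documents in a dict and mints a URL per save.

    A real output layer would render and host the document; the point is
    that callers only ever see save() -> URL.
    """

    def __init__(self) -> None:
        self._docs: dict[str, dict] = {}

    def save(self, content: dict) -> SaveResult:
        doc_id = uuid.uuid4().hex[:8]
        self._docs[doc_id] = content
        return SaveResult(url=f"https://example.invalid/d/{doc_id}")
```

Rendering, hosting, and rate limiting all live behind that one call — which is exactly what makes it infrastructure rather than an application feature.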

The Output Layer, Concretely

This is why we built SurfaceDocs — as the output layer for AI pipelines. Not a document editor. Not a collaboration tool. Infrastructure. A place machines publish, not where humans type.

The core integration is three lines:

from surfacedocs import SurfaceDocs, DOCUMENT_SCHEMA, SYSTEM_PROMPT

docs = SurfaceDocs(api_key="sd_live_...")
result = docs.save(llm_output)
print(result.url)  # https://app.surfacedocs.dev/d/abc123

The SDK exports two things that make integration with any LLM trivial: DOCUMENT_SCHEMA (a JSON schema that tells the model how to structure its output) and SYSTEM_PROMPT (instructions that guide the model to generate well-structured documents). You wire these into whatever LLM you're already using.
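Since the model hands back a JSON string that should conform to DOCUMENT_SCHEMA, a defensive parse before saving is cheap insurance against malformed generations. A minimal sketch — the shape checked below ("title" plus "blocks") is a hypothetical stand-in; the real DOCUMENT_SCHEMA defines considerably more:

```python
import json


def parse_document(llm_output: str) -> dict:
    """Parse model output and sanity-check its shape before publishing.

    Hypothetical minimal shape: an object with a string "title" and a
    "blocks" list. The actual DOCUMENT_SCHEMA is richer than this.
    """
    doc = json.loads(llm_output)  # raises json.JSONDecodeError on bad output
    if not isinstance(doc, dict):
        raise ValueError("document must be a JSON object")
    if not isinstance(doc.get("title"), str):
        raise ValueError("document needs a string 'title'")
    if not isinstance(doc.get("blocks"), list):
        raise ValueError("document needs a 'blocks' list")
    return doc
```

Catching a bad generation here, where you can retry the model call, beats discovering it after the document has shipped to a user.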

With OpenAI

from surfacedocs import SurfaceDocs, DOCUMENT_SCHEMA, SYSTEM_PROMPT
from openai import OpenAI

openai = OpenAI()
docs = SurfaceDocs()

response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "Write documentation for user authentication"},
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {"name": "document", "schema": DOCUMENT_SCHEMA},
    },
)

result = docs.save(response.choices[0].message.content)
print(f"Saved: {result.url}")

With Anthropic

from surfacedocs import SurfaceDocs, DOCUMENT_SCHEMA, SYSTEM_PROMPT
import anthropic

client = anthropic.Anthropic()
docs = SurfaceDocs()

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=4096,
    system=SYSTEM_PROMPT,
    messages=[{"role": "user", "content": "Write documentation for user authentication"}],
    tools=[{
        "name": "create_document",
        "description": "Create a structured document",
        "input_schema": DOCUMENT_SCHEMA,
    }],
    tool_choice={"type": "tool", "name": "create_document"},
)

tool_use = next(b for b in response.content if b.type == "tool_use")
result = docs.save(tool_use.input)
print(f"Saved: {result.url}")

With Gemini

from surfacedocs import SurfaceDocs, DOCUMENT_SCHEMA, SYSTEM_PROMPT
import google.generativeai as genai

genai.configure(api_key="...")
docs = SurfaceDocs()

model = genai.GenerativeModel(
    model_name="gemini-2.0-flash",
    system_instruction=SYSTEM_PROMPT,
    generation_config=genai.GenerationConfig(
        response_mime_type="application/json",
        response_schema=DOCUMENT_SCHEMA,
    ),
)

response = model.generate_content("Write documentation for user authentication")
result = docs.save(response.text)
print(f"Saved: {result.url}")

Same pattern every time: give the model the schema, get structured output back, save it, get a URL. The document supports headings, paragraphs, code blocks, lists, quotes, tables, images, dividers, Mermaid diagrams, and even Slidev presentations. Inline markdown just works.

Think of it as S3 for AI documents, but with a built-in viewer. You don't build a viewer for files stored in S3 from scratch every time — and you shouldn't have to build one for AI-generated documents either.

This Will Become Standard

Logging wasn't always standard infrastructure. There was a time when every team rolled their own log files, their own rotation scripts, their own search tools. Then centralized logging became a thing and nobody went back.

Monitoring followed the same arc. Error tracking. Feature flags. Auth. Each started as something teams built custom, then crystallized into dedicated infrastructure that everyone just uses.

The output layer for AI is at the beginning of that same curve. Right now, every team is building their own duct-tape solution. Custom React apps, Notion integrations, Google Docs hacks. It works, barely, and it's a massive distraction from the actual product.

The teams that adopt a real output layer early will have cleaner architectures, faster iteration, and fewer Sunday nights spent debugging why their document rendering broke after a dependency update.

The pipeline doesn't end at the model. It ends where the user sees the result. Build accordingly.
