SurfaceDocs

Posted on Mar 6

SurfaceDocs + CrewAI: Publishable Agent Work Products

#ai #python #machinelearning #productivity

The Output Problem

CrewAI crews produce real work. Research reports, competitive analyses, technical documentation, content drafts. You define your agents, wire up the tasks, call crew.kickoff(), and get back... a string.

result = crew.kickoff()
print(result)  # 800 words of competitive analysis, dumped to stdout

Now what? You copy-paste it into a Google Doc. Or write it to a markdown file that lives on your laptop. Maybe you pipe it into Slack. The work product your agents spent tokens and time producing gets treated like log output — ephemeral, unstructured, unshareable.

This is a gap in the agent tooling ecosystem. We've invested heavily in making agents do work — better prompting, tool use, multi-agent orchestration. But we've barely thought about what happens to the output. Every team running CrewAI crews in production ends up building some bespoke viewer, or manually reformatting agent output into something presentable. It's the last mile problem nobody's solving at the framework level.

The Fix: A Publishing Layer

SurfaceDocs is a hosted document service built for machine-generated content — the output layer for AI pipelines. pip install surfacedocs, call docs.save(), get a shareable URL. That's the whole integration surface.

Here's a complete CrewAI crew that researches a topic and publishes the result:

import os
from crewai import Agent, Task, Crew
from crewai.tools import BaseTool
from surfacedocs import SurfaceDocs, SYSTEM_PROMPT, OPENAI_DOCUMENT_SCHEMA

docs = SurfaceDocs(api_key=os.environ["SURFACEDOCS_API_KEY"])

class PublishTool(BaseTool):
    name: str = "publish_document"
    description: str = "Publish a structured document and return a shareable URL."

    def _run(self, content: str) -> str:
        result = docs.save(content, folder_id=os.environ.get("SURFACEDOCS_FOLDER_ID"))
        return f"Published: {result.url}"

researcher = Agent(
    role="Senior Research Analyst",
    goal="Produce a comprehensive market research report",
    backstory="You're a seasoned analyst who writes clear, data-driven reports.",
    llm="gpt-4o",
)

publisher = Agent(
    role="Report Publisher",
    goal="Format and publish the final report as a shareable document",
    backstory=f"""You take research output and structure it for publication.

{SYSTEM_PROMPT}""",
    tools=[PublishTool()],
    llm="gpt-4o",
)

research_task = Task(
    description="Research the current state of AI agent frameworks. "
                "Cover CrewAI, LangGraph, AutoGen, and Swarm. "
                "Include strengths, weaknesses, and adoption trends.",
    expected_output="A detailed research report with sections and analysis.",
    agent=researcher,
)

publish_task = Task(
    description="Take the research report and publish it as a structured document. "
                "Use the publish_document tool with the formatted content.",
    expected_output="A shareable URL to the published report.",
    agent=publisher,
)

crew = Crew(
    agents=[researcher, publisher],
    tasks=[research_task, publish_task],
)

result = crew.kickoff()
print(result)  # https://app.surfacedocs.dev/d/abc123

Instead of a string that evaporates, you get a hosted URL. Send it to your team, your client, your stakeholders. The document is rendered, shareable, and persistent.

How the Schema Approach Works

The key to getting well-structured output from SurfaceDocs is the schema approach. Two imports do the heavy lifting:

SYSTEM_PROMPT — Instructions that tell the LLM how to structure content in SurfaceDocs' block format (headings, paragraphs, code blocks, lists, Mermaid diagrams)
OPENAI_DOCUMENT_SCHEMA — A JSON schema for OpenAI-compatible structured output

The flow looks like this:

CrewAI Task Output (raw text)
    → Publisher Agent (with SYSTEM_PROMPT in backstory)
        → Structured document (matching DOCUMENT_SCHEMA)
            → docs.save(structured_content)
                → Shareable URL

When the publisher agent has SYSTEM_PROMPT in its context, it knows how to format content into SurfaceDocs' block structure. The LLM does the formatting work — you just save the result.

For tighter control, you can define a Pydantic model that mirrors the SurfaceDocs schema and use CrewAI's output_pydantic:

from pydantic import BaseModel

class SurfaceDocument(BaseModel):
    title: str
    blocks: list[dict]

publish_task = Task(
    description="Publish the research report as a structured document.",
    expected_output="A structured document with title and blocks.",
    output_pydantic=SurfaceDocument,  # Force schema compliance
    agent=publisher,
)

This guarantees the task output is structured before you pass it to docs.save().

The Callback Approach: Automatic Publishing

The tool-based approach gives agents control over when to publish. But sometimes you just want every crew run to produce a document automatically. CrewAI's task callbacks make this clean:

from surfacedocs import SurfaceDocs, SYSTEM_PROMPT

docs = SurfaceDocs(api_key=os.environ["SURFACEDOCS_API_KEY"])
FOLDER_ID = os.environ.get("SURFACEDOCS_FOLDER_ID")

def publish_on_complete(output):
    """Auto-publish task output to SurfaceDocs."""
    result = docs.save(output.raw, folder_id=FOLDER_ID)
    print(f"📄 Published: {result.url}")

research_task = Task(
    description="Research AI agent frameworks and write a report. "
                "Structure your output with clear headings and sections.",
    expected_output="A well-structured research report.",
    agent=researcher,
    callback=publish_on_complete,  # Auto-publish when task completes
)

No publisher agent needed. The callback fires when the task finishes, saves the output, and you have your URL. This is the simpler pattern when you don't need a dedicated formatting step.

Advanced Patterns

Multiple Documents Per Crew

Real crews produce multiple deliverables. A due diligence crew might generate a research report and a risk assessment and an executive summary. Each task can publish its own document:

def make_publisher(doc_type: str):
    def callback(output):
        result = docs.save(output.raw, folder_id=FOLDER_ID)
        print(f"📄 {doc_type}: {result.url}")
    return callback

research_task = Task(
    description="Deep-dive research on the target company.",
    agent=researcher,
    callback=make_publisher("Research Report"),
)

risk_task = Task(
    description="Identify and assess key risks based on the research.",
    agent=analyst,
    callback=make_publisher("Risk Assessment"),
)

summary_task = Task(
    description="Write an executive summary of findings and risks.",
    agent=writer,
    callback=make_publisher("Executive Summary"),
)

Three tasks, three published documents, three shareable URLs. Each lands in the same folder for easy organization.

Organizing by Project

SurfaceDocs folders map naturally to projects or clients. Create a folder per engagement, pass the folder ID, and every crew run for that project is automatically organized:

PROJECT_FOLDERS = {
    "acme-corp": "folder_abc123",
    "globex": "folder_def456",
    "initech": "folder_ghi789",
}

def run_research_crew(project: str, topic: str):
    folder_id = PROJECT_FOLDERS[project]

    def publish(output):
        result = docs.save(output.raw, folder_id=folder_id)
        return result.url

    # ... crew setup with callback=publish ...
    crew.kickoff()

Your client deliverables are organized before anyone has to think about it.

Post-Kickoff Publishing

If you'd rather publish the crew's final output as one document instead of per-task, just handle it after kickoff():

crew = Crew(
    agents=[researcher, analyst, writer],
    tasks=[research_task, analysis_task, summary_task],
)

output = crew.kickoff()
result = docs.save(output.raw, folder_id=FOLDER_ID)
print(f"📄 Crew output: {result.url}")

Simple. The crew runs, you publish the final result, done.

What This Unlocks

The shift is small in code but significant in practice. Your agents stop producing text blobs and start producing deliverables — documents with URLs you can drop into a Slack thread, an email, or a client portal.

It also changes how you think about crew design. When the output has a destination, you design tasks around publishable artifacts. The research task becomes a report. The analysis task becomes a dashboard. The crew isn't just processing — it's producing.

pip install surfacedocs and give your crews somewhere to put their work.

DEV Community