How I built a real-time multi-step AI agent that investigates production incidents, showcases 10+ AI SDK 7 features, and streams results live to the browser.
Production incidents are stressful. An alert fires, the dashboard turns red, and someone has to dig through logs, query databases, check service health, and piece together what went wrong — all while the clock is ticking.
What if an AI agent could do that legwork? Not replace the engineer, but handle the grunt work: scanning logs, running queries, checking services, and producing a structured report with evidence.
That's exactly what Serverless Detective does. It's an open-source demo built with AI SDK 7 and Next.js 15 that showcases a new paradigm in LLM application development — durable, multi-step tool-using agents.
What Is Serverless Detective?
Serverless Detective is an interactive AI agent that:
- Ingests a production incident log (2MB, 20,000 lines)
- Searches for error patterns using regex/keyword matching
- Queries a simulated database for affected records
- Checks microservice health endpoints
- Escalates by paging on-call engineers or rolling back deployments
- Reports findings with a full incident summary and performance metrics
All of this happens in an automated loop: the LLM decides what tool to call, processes the result, and plans the next step — until the investigation is complete.
And the user sees every step as it happens, streamed live via Server-Sent Events.
Architecture Overview
┌──────────────────────────────────────────────────────────┐
│  Browser (Next.js)                                       │
│                                                          │
│  page.tsx (React 19)                                     │
│    └─ fetch("/api/detective")                            │
│         └─ ReadableStream.getReader()                    │
│              └─ Parse SSE data: lines                    │
│                   └─ React setState updates per step     │
└──────────────────────────────────────────────────────────┘
                             │ SSE stream
                             ▼
┌──────────────────────────────────────────────────────────┐
│  Next.js API Route (Node.js)                             │
│                                                          │
│  route.ts                                                │
│    ├─ new TransformStream()                              │
│    ├─ agent.generate() in background                     │
│    │    ├─ onStepEnd → writer.write()                    │
│    │    └─ onEnd → writer.close()                        │
│    └─ return Response(stream.readable)                   │
└──────────────────────────────────────────────────────────┘
                             │
                             ▼
┌──────────────────────────────────────────────────────────┐
│  ToolLoopAgent (AI SDK 7)                                │
│                                                          │
│  Step loop:                                              │
│    LLM call → tool calls → execute → results → repeat    │
│                                                          │
│  Features:                                               │
│    ├── tool() API with Zod schemas                       │
│    ├── reasoning: high                                   │
│    ├── Timeouts (total/step/tool)                        │
│    ├── Tool approvals (human-in-the-loop)                │
│    ├── runtimeContext (typed shared state)               │
│    ├── isStepCount() stop condition                      │
│    ├── onStepEnd / onEnd callbacks                       │
│    ├── File-based snapshot persistence                   │
│    └── OpenTelemetry tracing                             │
└──────────────────────────────────────────────────────────┘
                             │
                             ▼
                    ┌─────────────────┐
                    │  Groq / OpenAI  │
                    └─────────────────┘
Deep Dive: How It Works
1. The Agent Core (agent/detective.ts)
The heart of the application is a ToolLoopAgent — one of AI SDK 7's built-in agent implementations. It runs a loop:
- Call the LLM with the current context
- If the LLM returns tool calls, execute them
- Feed results back to the LLM
- Repeat until a stop condition is met
export function createDetectiveAgent(callbacks?: AgentCallbacks) {
  return new ToolLoopAgent<never, DetectiveTools, DetectiveRuntimeContext>({
    id: "serverless-detective",
    model: createModel(),
    tools,
    instructions: `You are "The Serverless Detective" — a senior SRE...`,
    runtimeContext: { incidentId, status: "ingesting", ... },
    // 2 min for the whole run, 30 s per step, 10 s per tool call
    timeout: { totalMs: 120000, stepMs: 30000, toolMs: 10000 },
    toolApproval: { pageOnCall: "user-approval", ... },
    stopWhen: isStepCount(10),
    maxOutputTokens: 4096,
  });
}
Key AI SDK 7 concepts at play:
- tool() API: Declarative tools with Zod-validated input schemas and typed execute functions
- runtimeContext: Typed mutable state that persists across all steps. The agent tracks which services it has checked, its current investigation phase, and hypotheses it has formed
- timeout: Three-tier timeout: total for the whole investigation, per-step, and per-tool execution. The rollback tool gets a longer timeout since it's a heavy operation
- toolApproval: Sensitive actions like paging an on-call engineer require user confirmation. This is set at the agent level but can be overridden per invocation
- isStepCount(): A built-in stop condition that caps the agent at 10 steps, preventing runaway investigations
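To make the loop and the step cap concrete, here is a minimal sketch that does not use the SDK at all. It is not how ToolLoopAgent is implemented; `runToolLoop`, `ModelTurn`, `Model`, and the scripted model are hypothetical names used only for illustration.

```typescript
// Hypothetical stand-ins; none of these names come from the AI SDK.
type ToolCall = { toolName: string; input: unknown };
type ModelTurn = { text: string; toolCalls: ToolCall[] };
type Model = (history: string[]) => Promise<ModelTurn>;
type ToolSet = Record<string, (input: unknown) => Promise<unknown>>;

async function runToolLoop(
  model: Model,
  tools: ToolSet,
  prompt: string,
  maxSteps: number,
): Promise<{ steps: number; text: string }> {
  const history: string[] = [prompt];
  let text = "";
  let steps = 0;
  // The step cap plays the role of the stop condition
  while (steps < maxSteps) {
    const reply = await model(history); // call the model with the current context
    steps += 1;
    text = reply.text;
    if (reply.toolCalls.length === 0) break; // no tool calls: the run is finished
    for (const call of reply.toolCalls) {
      const tool = tools[call.toolName];
      // Execute the requested tool; unknown tools are reported back as errors
      const result = tool ? await tool(call.input) : { error: `unknown tool ${call.toolName}` };
      // Feed the result back so the next model call can see it
      history.push(`${call.toolName} -> ${JSON.stringify(result)}`);
    }
  }
  return { steps, text };
}

// Usage: a scripted "model" that requests one health check, then answers.
const script: ModelTurn[] = [
  { text: "", toolCalls: [{ toolName: "checkServiceHealth", input: { service: "api" } }] },
  { text: "api is degraded", toolCalls: [] },
];
let turn = 0;
const scriptedModel: Model = async () => script[turn++];
const outcome = runToolLoop(
  scriptedModel,
  { checkServiceHealth: async () => "degraded" },
  "Investigate the incident",
  10,
);
```

With a model that never stops asking for tools, the same function returns after exactly `maxSteps` model calls, which is the runaway protection the step cap is there for.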
2. The Investigation Tools (agent/tools.ts)
Five tools give the detective its capabilities:
searchLogs(pattern, maxLines) → Lines matching the pattern
queryDatabase(table, where, limit) → Simulated DB rows with error status
checkServiceHealth(service) → "healthy" or "degraded"
pageOnCall(severity, summary) → Creates a PagerDuty incident (requires approval)
rollbackDeployment(service, version) → Rolls back to a stable version (requires approval)
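The two tools marked "requires approval" pause until a human agrees. The sketch below shows the idea as a plain wrapper; it is an assumption about the behavior, not the SDK's mechanism. `runWithApproval`, `askUser`, and the "auto" policy name are hypothetical, while "user-approval" is the value from the agent config.

```typescript
// "user-approval" appears in the agent config; "auto" is an assumed name
// for tools that run without confirmation.
type ApprovalPolicy = "auto" | "user-approval";
type ApprovalOutcome<O> = { status: "executed"; output: O } | { status: "denied" };

async function runWithApproval<I, O>(
  policy: ApprovalPolicy,
  input: I,
  execute: (input: I) => Promise<O>,
  askUser: (input: I) => Promise<boolean>,
): Promise<ApprovalOutcome<O>> {
  // Only sensitive tools wait for a human decision
  if (policy === "user-approval" && !(await askUser(input))) {
    return { status: "denied" }; // the tool body never runs
  }
  return { status: "executed", output: await execute(input) };
}

// Usage: the user declines, so no page is sent.
let pagesSent = 0;
const pageAttempt = runWithApproval(
  "user-approval",
  { severity: "high", summary: "checkout errors spiking" },
  async () => {
    pagesSent += 1;
    return "incident created";
  },
  async () => false, // simulates clicking "deny"
);
```

Returning a "denied" outcome instead of throwing lets the model see the refusal as a normal tool result and plan around it.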
Each tool is defined using the tool() API with Zod schemas:
import { tool } from "ai";
import { z } from "zod";

export const searchLogsTool = tool({
  description: "Search the incident log for patterns",
  inputSchema: z.object({
    pattern: z.string().describe("Regex or keyword to search for"),
    maxLines: z.number().optional().default(20),
  }),
  execute: async ({ pattern, maxLines }) => {
    // Read log file, filter lines, return matches
    return { totalMatches, lines, summary };
  },
});
The inputSchema field replaces the older parameters field (the rename goes back to AI SDK 5). It accepts any Zod schema, and each field's .describe() text travels with the schema sent to the LLM, so it acts as per-field prompt context.
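The body of execute is elided above. One way the matching could work is sketched below as a pure function over the log text. This is an assumption, not the demo's code: `searchLogText` and `LogSearchResult` are hypothetical names, and the fallback to keyword search for invalid regexes is my own choice.

```typescript
type LogSearchResult = { totalMatches: number; lines: string[]; summary: string };

function searchLogText(logText: string, pattern: string, maxLines = 20): LogSearchResult {
  let matches: (line: string) => boolean;
  try {
    // No "g" flag, so RegExp.test stays stateless across lines
    const regex = new RegExp(pattern, "i");
    matches = (line) => regex.test(line);
  } catch {
    // Not a valid regex: fall back to a case-insensitive keyword search
    const needle = pattern.toLowerCase();
    matches = (line) => line.toLowerCase().includes(needle);
  }
  const hits = logText.split("\n").filter(matches);
  return {
    totalMatches: hits.length,
    lines: hits.slice(0, maxLines), // cap what is sent back to the model
    summary: `${hits.length} lines matched "${pattern}"`,
  };
}
```

Capping `lines` while still reporting `totalMatches` keeps the tool result small without hiding how widespread the pattern is. Since the pattern comes from the LLM, a real tool would probably also want to guard against pathological regexes.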
3. SSE Streaming (app/api/detective/route.ts)
This is where the magic of real-time UX meets AI SDK 7's callback system.
The API route creates a TransformStream, wires the agent's lifecycle callbacks to write SSE events, and returns the readable half as the HTTP response:
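export async function GET() {
  const stream = new TransformStream();
  const writer = stream.writable.getWriter();
  const encoder = new TextEncoder();
  // Write one JSON payload as a single SSE frame
  const send = (payload: unknown) =>
    writer.write(encoder.encode(`data: ${JSON.stringify(payload)}\n\n`));
  const onStepEnd = (event) => {
    send({
      type: "step",
      stepNumber: event.stepNumber + 1,
      toolNames: event.toolCalls?.map((tc) => tc.toolName),
      durationMs: event.performance?.stepTimeMs,
      tokens: event.usage?.outputTokens,
    });
  };
  // onEnd closes the stream (per the architecture diagram); the payload of the
  // "complete" event the client listens for is not shown here, so it is omitted
  const onEnd = () => writer.close();
  // Start agent in background: don't await, so the response returns immediately
  (async () => {
    try {
      const agent = createDetectiveAgent({ onStepEnd, onEnd });
      await agent.generate({ prompt: "..." });
    } catch (err) {
      // Surface the failure to the client instead of failing silently
      await send({ type: "error", message: String(err) });
      await writer.close();
    }
  })().catch(() => {}); // the stream may already be closed
  return new Response(stream.readable, {
    headers: { "Content-Type": "text/event-stream" },
  });
}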
The key insight: agent.generate() runs in a background promise, while the response is returned immediately. The agent's onStepEnd callback writes to the same stream that the HTTP response reads from. This creates a live pipeline from the LLM to the browser.
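The wire format itself is small enough to show in isolation. The helper below is a sketch with a hypothetical name (`formatSseFrame`); it only illustrates how one JSON payload becomes one SSE event.

```typescript
// Hypothetical helper: serialize one payload as a single SSE frame.
function formatSseFrame(payload: unknown): string {
  // JSON.stringify escapes newlines inside strings, so the payload always fits
  // on one "data:" line; the blank line ("\n\n") marks the end of the event.
  return `data: ${JSON.stringify(payload)}\n\n`;
}

const stepFrame = formatSseFrame({ type: "step", stepNumber: 1, toolNames: ["searchLogs"] });
```

Because every frame ends with a blank line, the client can split the stream on "\n\n" and parse each piece independently.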
4. The Web UI (app/page.tsx)
The client uses the native ReadableStream API to consume the SSE stream:
const res = await fetch("/api/detective");
const reader = res.body!.getReader();
const decoder = new TextDecoder();
let buffer = "";
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });
  // Events end with a blank line; keep any trailing partial event in the
  // buffer so it is neither parsed early nor processed twice
  const parts = buffer.split("\n\n");
  buffer = parts.pop() ?? "";
  for (const part of parts) {
    const match = part.match(/^data: (.+)$/m);
    if (match) {
      const event = JSON.parse(match[1]);
      // event.type === "step" → add step card
      // event.type === "complete" → show report + performance
      // event.type === "error" → show error
    }
  }
}
Each SSE event updates React state, which renders a new step card with tool name badges, duration, and token count. The result is a live-updating UI that feels like watching a real investigation unfold.
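The state update can be written as a pure reducer, which keeps it testable outside React. The sketch below is an assumption about the shape of the state, not the demo's code. The "step" fields mirror the payload written by the API route, while `report`, `message`, and all type and function names are hypothetical.

```typescript
// "report" and "message" are assumed field names; the article does not show them.
type DetectiveEvent =
  | { type: "step"; stepNumber: number; toolNames?: string[]; durationMs?: number; tokens?: number }
  | { type: "complete"; report: string }
  | { type: "error"; message: string };

type StepCard = { stepNumber: number; toolNames: string[]; durationMs?: number; tokens?: number };

type InvestigationState = {
  phase: "running" | "complete" | "error";
  steps: StepCard[];
  report?: string;
  error?: string;
};

const initialInvestigation: InvestigationState = { phase: "running", steps: [] };

// Pure reducer: always returns a new object so React can detect the change.
function applyDetectiveEvent(state: InvestigationState, ev: DetectiveEvent): InvestigationState {
  switch (ev.type) {
    case "step": {
      const card: StepCard = {
        stepNumber: ev.stepNumber,
        toolNames: ev.toolNames ?? [],
        durationMs: ev.durationMs,
        tokens: ev.tokens,
      };
      return { ...state, steps: [...state.steps, card] };
    }
    case "complete":
      return { ...state, phase: "complete", report: ev.report };
    case "error":
      return { ...state, phase: "error", error: ev.message };
  }
}
```

In a component this would plug into React's useReducer, with each parsed SSE event passed to dispatch.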
5. Workflow Snapshots (agent/workflow-store.ts)
For durability, each step is persisted to a JSON file:
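onStepEnd: async (event) => {
  // Persist progress after every step so a killed process can resume
  saveSnapshot({
    id: runtimeContext.incidentId,
    completedSteps: event.stepNumber + 1, // stepNumber is zero-based
    runtimeContext,
    // ...
  });
}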
This enables the crash-and-resume demo: if the process is killed mid-investigation (e.g., kill -9), the next run can find the snapshot and resume from where it left off.
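For a file-based store to survive a kill -9, the write itself has to be safe to interrupt. The sketch below shows one way to do that with a temp file and a rename. It is an assumption, not the demo's workflow-store.ts: the `Snapshot` shape is reduced to the fields shown above, and the explicit `dir` parameter is my addition.

```typescript
import { existsSync, mkdirSync, mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

type Snapshot = { id: string; completedSteps: number; runtimeContext: Record<string, unknown> };

function saveSnapshot(dir: string, snapshot: Snapshot): void {
  mkdirSync(dir, { recursive: true });
  const target = join(dir, `${snapshot.id}.json`);
  const temp = `${target}.tmp`;
  writeFileSync(temp, JSON.stringify(snapshot));
  // Renaming within one directory replaces the file in a single step on POSIX
  // filesystems, so a crash mid-write cannot leave a truncated snapshot behind
  renameSync(temp, target);
}

function loadSnapshot(dir: string, id: string): Snapshot | null {
  const target = join(dir, `${id}.json`);
  if (!existsSync(target)) return null; // first run: nothing to resume
  return JSON.parse(readFileSync(target, "utf8")) as Snapshot;
}

// Usage: save after step 3, then "restart" and look the snapshot up again.
const snapshotDir = mkdtempSync(join(tmpdir(), "detective-"));
saveSnapshot(snapshotDir, { id: "inc-1", completedSteps: 3, runtimeContext: { status: "investigating" } });
const resumed = loadSnapshot(snapshotDir, "inc-1");
const nextStep = (resumed?.completedSteps ?? 0) + 1;
```

A resumed run would start at `nextStep` and restore `runtimeContext` from the snapshot instead of starting from scratch.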
Challenges & Learnings
Provider Compatibility
AI SDK 7's @ai-sdk/openai v2 defaults to the OpenAI Responses API format. Groq and many other OpenAI-compatible providers expect the Chat Completions format instead, so the default does not work with them. The fix was to create the model with provider.chat(modelId) instead of provider(modelId).
Type Safety Across Versions
The StreamTextResult.text property in AI SDK 7 is PromiseLike<string> (not a plain string), so JSON.stringify serializes the unresolved value as {}. Rendering that object caused a confusing "Objects are not valid as a React child" error. I switched to agent.generate(), which returns GenerateTextResult.text as a plain string.
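The serialization behavior is easy to reproduce without the SDK. In this sketch a plain Promise stands in for the PromiseLike `text` property; `streamingLike` and `toJson` are hypothetical names.

```typescript
// A plain Promise stands in for a PromiseLike<string> `text` property.
const streamingLike = { text: Promise.resolve("Root cause: connection pool exhausted") };

// Serializing the unresolved promise loses the text, because a Promise has no
// enumerable fields of its own.
const brokenJson = JSON.stringify(streamingLike); // '{"text":{}}'

// Awaiting first yields a plain string that serializes as expected.
async function toJson(source: { text: PromiseLike<string> }): Promise<string> {
  return JSON.stringify({ text: await source.text });
}
```

The empty object is what ended up in the SSE payload, and React refuses to render an object as a child.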
SSE Multiplexing
Streaming agent events through a TransformStream required careful error handling: the background promise must catch every error, write it to the stream as an error event, and then close the stream cleanly. If a rejection escapes the background promise, nothing is written and the client is left waiting on a stream that never reports the failure.
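That pattern can be captured in a small wrapper. The sketch below is a hypothetical helper, not code from the demo. `EventSink` abstracts the stream writer so the logic can be exercised without a real HTTP response.

```typescript
// Minimal view of the stream writer; in the route this would wrap
// writer.write(encoder.encode(frame)) and writer.close().
interface EventSink {
  write(frame: string): Promise<void>;
  close(): Promise<void>;
}

function runInBackground(task: () => Promise<void>, sink: EventSink): Promise<void> {
  return Promise.resolve()
    .then(task)
    .catch(async (err: unknown) => {
      // Turn any failure into an error event the client can display
      const message = err instanceof Error ? err.message : String(err);
      await sink.write(`data: ${JSON.stringify({ type: "error", message })}\n\n`);
    })
    .catch(() => {}) // the write itself failed (client gone); nothing left to report
    .finally(() => sink.close().catch(() => {})); // always release the stream
}

// Usage with an in-memory sink and a task that fails.
const sentFrames: string[] = [];
let sinkClosed = false;
const memorySink: EventSink = {
  write: async (frame) => {
    sentFrames.push(frame);
  },
  close: async () => {
    sinkClosed = true;
  },
};
const settled = runInBackground(async () => {
  throw new Error("model request failed");
}, memorySink);
```

The returned promise never rejects, so calling it without await in a route handler cannot produce an unhandled rejection.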
Running the Project
npm install
npm run generate-logs
npm run dev
# Open http://localhost:3000
Configure your API key in .env.local:
# OpenAI
OPENAI_API_KEY=sk-...
OPENAI_MODEL=gpt-4o
# Or Groq
OPENAI_BASE_URL=https://api.groq.com/openai/v1
OPENAI_API_KEY=gsk_...
OPENAI_MODEL=llama-3.3-70b-versatile
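Reading these variables in one place makes a missing key fail early with a clear message. The helper below is a sketch with hypothetical names (`readModelSettings`, `ModelSettings`). The demo's actual createModel() is not shown in this post, and the "gpt-4o" default is my assumption based on the OpenAI example.

```typescript
type ModelSettings = { apiKey: string; modelId: string; baseURL?: string };

// Hypothetical helper: reads the three variables used in .env.local.
function readModelSettings(env: Record<string, string | undefined>): ModelSettings {
  const apiKey = env.OPENAI_API_KEY;
  if (!apiKey) throw new Error("OPENAI_API_KEY is not set; add it to .env.local");
  return {
    apiKey,
    modelId: env.OPENAI_MODEL ?? "gpt-4o", // assumed default
    baseURL: env.OPENAI_BASE_URL, // undefined means the provider's default endpoint
  };
}

const groqSettings = readModelSettings({
  OPENAI_BASE_URL: "https://api.groq.com/openai/v1",
  OPENAI_API_KEY: "gsk_example",
  OPENAI_MODEL: "llama-3.3-70b-versatile",
});
```

In the app, `process.env` would be passed in, and the result could feed createOpenAI({ apiKey, baseURL }) from @ai-sdk/openai followed by provider.chat(modelId), as described under Provider Compatibility.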
What's Next
This demo barely scratches the surface of what's possible with AI SDK 7. Here are ideas I'd love to see the community explore:
- Multi-agent investigations using WorkflowAgent from @ai-sdk/workflow: one agent for logs, another for DB, a coordinator to synthesize findings
- Real tool integrations: PagerDuty, Datadog, CloudWatch, Slack, Jira
- Persistent storage: SQLite or Postgres instead of file-based snapshots
- Chat interface: Replace the single button with useChat from @ai-sdk/react for interactive follow-up questions
- Authentication: User sessions with investigation history
Code & more: https://www.dailybuild.xyz/project/174-serverless-detective
