Search for "Next.js AI chat" in 2026 and Vercel AI SDK still comes up as the de facto standard. Nothing wrong with it, but relying on the SDK means you often don't understand what's happening under the hood — how streaming actually works, what the Route Handler is doing behind the scenes.
I built this from scratch in a sandbox using only the Anthropic SDK. The entire flow: create-next-app, adding @anthropic-ai/sdk, implementing the Route Handler. It turned out simpler than I expected. About 50 lines gets you a working, deployable streaming chat backend (with the caveats listed under Limitations).
One thing I noticed while doing this: create-next-app@latest now installs Next.js 16. Most tutorials you'll find are still targeting Next.js 14 or 15. This post reflects what you actually get in May 2026.
What We're Building
The app structure:
- Next.js 16.2.6 + App Router
- Route Handler (/api/chat) calling the Claude API server-side
- SSE (Server-Sent Events) delivering streaming responses to the client
- React 19 "use client" component rendering the stream in real time
The key point: the API key is read only on the server and never included in the client bundle. This is a direct consequence of how Next.js App Router separates server and client code.
Actual build output from the sandbox:
▲ Next.js 16.2.6 (Turbopack)
✓ Compiled successfully in 1874ms
Route (app)
┌ ○ / (Static) prerendered as static content
└ ƒ /api/chat (Dynamic) server-rendered on demand
Project Structure
When finished, the structure looks like this. Two files are the core; the rest is generated by create-next-app.
nextjs-claude-chat/
├── src/
│   └── app/
│       ├── api/
│       │   └── chat/
│       │       └── route.ts   ← Claude API streaming endpoint (core)
│       ├── page.tsx           ← Chat UI (core)
│       ├── layout.tsx         ← Auto-generated
│       └── globals.css        ← Auto-generated
├── .env.local                 ← ANTHROPIC_API_KEY goes here
├── package.json
└── tsconfig.json
Two files. route.ts is server code; page.tsx is client code. api/chat/route.ts ends up in the server bundle only, while page.tsx with its "use client" directive goes to the client bundle. This separation is what makes API key security work.
Prerequisites
- Node.js 20.9+ (Next.js 16 dropped support for Node.js 18)
- Anthropic API key (sk-ant-...), available at console.anthropic.com
- Basic TypeScript knowledge
- Basic Next.js App Router understanding (you can follow along without it)
Step 1: Create the Project and Install Dependencies
npx create-next-app@latest nextjs-claude-chat \
--typescript \
--tailwind \
--eslint \
--app \
--src-dir \
--import-alias "@/*"
cd nextjs-claude-chat
npm install @anthropic-ai/sdk
As of May 2026, create-next-app@latest installs Next.js 16.2.6 with React 19.2.4. Existing tutorials using Next.js 14/15 may have minor differences.
Key dependencies after installation:
{
  "dependencies": {
    "@anthropic-ai/sdk": "^0.97.1",
    "next": "16.2.6",
    "react": "19.2.4",
    "react-dom": "19.2.4"
  }
}
Anthropic SDK 0.97.x is the current latest. Earlier versions (0.20.x and below) had a different messages.stream() API, so pin your version if you're migrating.
Step 2: Implement the Claude API Route Handler
The core file. Create src/app/api/chat/route.ts:
import Anthropic from "@anthropic-ai/sdk";

// The key is read on the server only; it never reaches the client bundle.
const client = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

export async function POST(req: Request) {
  const { messages } = await req.json();

  // messages.stream() returns a MessageStream synchronously, so no await is needed.
  const stream = client.messages.stream({
    model: "claude-opus-4-7",
    max_tokens: 1024,
    messages,
  });

  const encoder = new TextEncoder();
  const readable = new ReadableStream({
    async start(controller) {
      for await (const chunk of stream) {
        // Forward only the events that carry text.
        if (
          chunk.type === "content_block_delta" &&
          chunk.delta.type === "text_delta"
        ) {
          controller.enqueue(
            encoder.encode(
              `data: ${JSON.stringify({ text: chunk.delta.text })}\n\n`
            )
          );
        }
      }
      controller.enqueue(encoder.encode("data: [DONE]\n\n"));
      controller.close();
    },
  });

  return new Response(readable, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
      Connection: "keep-alive",
    },
  });
}
Two things worth noting here.
First, client.messages.stream() returns a MessageStream, which is async-iterable. The for await...of loop receives events one at a time and pushes the text deltas to the client. When the stream ends, a [DONE] signal is sent and the controller closes.
Second, ReadableStream and TextEncoder are web-standard APIs (Web Streams and Encoding). Next.js Route Handlers use Web Streams, not the Node.js stream module. This is why the code looks different from FastAPI or Express streaming implementations. new ReadableStream may feel unfamiliar, but it's the standard across modern JavaScript runtimes: Cloudflare Workers, Deno, and Bun all work the same way.
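To see the Web Streams pieces in isolation, here is a minimal sketch that runs outside Next.js (Node.js 18+ exposes ReadableStream, TextEncoder, and TextDecoder as globals). The helper names toSSEStream and readAll are made up for this illustration, and fakeTokens is a stand-in for the Anthropic stream, not part of any SDK.

```typescript
// Hypothetical helper: wrap any async source of text pieces in an SSE byte stream.
function toSSEStream(source: AsyncIterable<string>): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  return new ReadableStream<Uint8Array>({
    async start(controller) {
      for await (const text of source) {
        controller.enqueue(encoder.encode(`data: ${JSON.stringify({ text })}\n\n`));
      }
      controller.enqueue(encoder.encode("data: [DONE]\n\n"));
      controller.close();
    },
  });
}

// Hypothetical helper: drain a byte stream back into one string.
async function readAll(stream: ReadableStream<Uint8Array>): Promise<string> {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  let out = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    out += decoder.decode(value, { stream: true });
  }
  return out + decoder.decode(); // flush any bytes still buffered in the decoder
}

// Stand-in for the model stream (an assumption for the demo).
async function* fakeTokens(): AsyncGenerator<string> {
  yield "Hello";
  yield ", World";
}

readAll(toSSEStream(fakeTokens())).then((wire) => console.log(wire));
```

The producer side has the same shape as the Route Handler: an async start() that enqueues encoded bytes and closes the controller when the source is exhausted.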
The filter on content_block_delta events: Anthropic's streaming protocol emits multiple event types (message_start, content_block_start, content_block_delta, message_delta, message_stop). Only text_delta typed content_block_delta events carry actual text content.
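As a rough illustration of that filter, the sketch below runs it over a hand-written event sequence. The StreamEvent type is a simplified stand-in written for this example, not the SDK's real type, and the sample events only approximate what the API emits.

```typescript
// Simplified, hypothetical event shapes (the SDK's real types carry more fields).
type StreamEvent =
  | { type: "message_start" }
  | { type: "content_block_start"; index: number }
  | {
      type: "content_block_delta";
      index: number;
      delta:
        | { type: "text_delta"; text: string }
        | { type: "input_json_delta"; partial_json: string };
    }
  | { type: "message_delta" }
  | { type: "message_stop" };

// Collect only the text carried by text_delta events; everything else is skipped.
function collectText(events: StreamEvent[]): string {
  let text = "";
  for (const event of events) {
    if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
      text += event.delta.text;
    }
  }
  return text;
}

const sampleEvents: StreamEvent[] = [
  { type: "message_start" },
  { type: "content_block_start", index: 0 },
  { type: "content_block_delta", index: 0, delta: { type: "text_delta", text: "Hello" } },
  { type: "content_block_delta", index: 0, delta: { type: "text_delta", text: ", World" } },
  { type: "message_delta" },
  { type: "message_stop" },
];

console.log(collectText(sampleEvents)); // prints "Hello, World"
```

Checking delta.type as well as the event type matters once tool use is involved, because content_block_delta events can then carry non-text deltas.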
Step 3: Environment Variables and Security
Create .env.local in the project root (same level as package.json):
ANTHROPIC_API_KEY=sk-ant-your-actual-key-here
Never use the NEXT_PUBLIC_ prefix. This is core to Next.js security.
| Variable format | Accessible from | Use for |
|---|---|---|
| ANTHROPIC_API_KEY | Server only (Route Handler, Server Component) | ✓ API keys |
| NEXT_PUBLIC_API_KEY | Client-public (included in browser bundle) | ✗ Never use for API keys |
NEXT_PUBLIC_ variables get inlined into the JavaScript bundle at build time. Anyone can see them in browser DevTools. Without the prefix, the variable is server-only: client code that references it gets undefined at runtime instead of the value, so the key never ends up in the bundle.
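One way to fail fast when the key is missing is a small lookup helper. This is only a sketch: requireEnv is a hypothetical name, and passing the environment in as a parameter is there to make the function easy to test. In a Route Handler you would pass process.env.

```typescript
// Hypothetical helper: read a server-side variable or throw a clear error.
function requireEnv(
  name: string,
  env: Record<string, string | undefined>
): string {
  // Guard against accidentally storing a secret under a browser-exposed name.
  if (name.startsWith("NEXT_PUBLIC_")) {
    throw new Error(`${name} would be exposed to the browser; use an unprefixed name for secrets`);
  }
  const value = env[name];
  if (!value) {
    throw new Error(`Missing environment variable: ${name}`);
  }
  return value;
}

// Demo with a fake environment object (the key value is a placeholder).
const demoEnv = { ANTHROPIC_API_KEY: "sk-ant-example" };
console.log(requireEnv("ANTHROPIC_API_KEY", demoEnv)); // prints "sk-ant-example"
```

A missing key then surfaces as a readable error on the server instead of an authentication failure from the API on the first request.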
Step 4: Client Chat UI
Create src/app/page.tsx with streaming state management:
"use client";

import { useState, useRef, useEffect } from "react";

type Message = {
  role: "user" | "assistant";
  content: string;
};

export default function ChatPage() {
  const [messages, setMessages] = useState<Message[]>([]);
  const [input, setInput] = useState("");
  const [isLoading, setIsLoading] = useState(false);
  const bottomRef = useRef<HTMLDivElement>(null);

  // Keep the newest message in view as text streams in.
  useEffect(() => {
    bottomRef.current?.scrollIntoView({ behavior: "smooth" });
  }, [messages]);

  const sendMessage = async () => {
    if (!input.trim() || isLoading) return;

    const userMessage: Message = { role: "user", content: input };
    const updatedMessages = [...messages, userMessage];
    setMessages(updatedMessages);
    setInput("");
    setIsLoading(true);

    // Add placeholder for the assistant response
    const assistantMessage: Message = { role: "assistant", content: "" };
    setMessages([...updatedMessages, assistantMessage]);

    try {
      const res = await fetch("/api/chat", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ messages: updatedMessages }),
      });
      if (!res.body) return;

      const reader = res.body.getReader();
      const decoder = new TextDecoder();
      let buffer = "";

      while (true) {
        const { done, value } = await reader.read();
        if (done) break;

        // stream: true keeps multi-byte characters intact across chunk boundaries.
        buffer += decoder.decode(value, { stream: true });

        // A network chunk can end mid-line, so keep the unfinished tail for the next read.
        const lines = buffer.split("\n");
        buffer = lines.pop() ?? "";

        for (const line of lines) {
          if (line.startsWith("data: ") && line !== "data: [DONE]") {
            const data = JSON.parse(line.slice(6));
            setMessages((prev) => {
              const last = prev[prev.length - 1];
              return [
                ...prev.slice(0, -1),
                { ...last, content: last.content + data.text },
              ];
            });
          }
        }
      }
    } finally {
      // Re-enable the input even if the request or the stream fails.
      setIsLoading(false);
    }
  };

  return (
    <main className="flex flex-col h-screen max-w-2xl mx-auto p-4">
      <h1 className="text-2xl font-bold mb-4">Claude Chat</h1>
      <div className="flex-1 overflow-y-auto space-y-4 mb-4">
        {messages.map((msg, i) => (
          <div
            key={i}
            className={`p-3 rounded-lg ${
              msg.role === "user"
                ? "bg-blue-100 ml-auto max-w-xs"
                : "bg-gray-100 mr-auto max-w-md"
            }`}
          >
            <span className="text-xs text-gray-500 block mb-1">
              {msg.role === "user" ? "You" : "Claude"}
            </span>
            <p className="whitespace-pre-wrap">{msg.content}</p>
          </div>
        ))}
        <div ref={bottomRef} />
      </div>
      <div className="flex gap-2">
        <input
          type="text"
          value={input}
          onChange={(e) => setInput(e.target.value)}
          onKeyDown={(e) => e.key === "Enter" && sendMessage()}
          placeholder="Type a message..."
          className="flex-1 border rounded-lg px-3 py-2 focus:outline-none focus:ring-2 focus:ring-blue-400"
        />
        <button
          onClick={sendMessage}
          disabled={isLoading || !input.trim()}
          className="bg-blue-500 text-white px-4 py-2 rounded-lg disabled:opacity-50"
        >
          Send
        </button>
      </div>
    </main>
  );
}
Two implementation details to highlight. First, the line !== "data: [DONE]" check: without it, the loop tries to parse [DONE] as JSON and throws an error. Second, the functional setMessages((prev) => ...) update: inside an async loop, closures capture stale state. Using prev ensures you're always appending to the latest message content.
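The update passed to setMessages can also be pulled out into a pure function, which makes the "replace the last message immutably" step easier to see and to test without React. appendToLast and ChatMessage are hypothetical names for this sketch; the type mirrors Message in page.tsx.

```typescript
type ChatMessage = { role: "user" | "assistant"; content: string };

// Return a new array whose last message has `text` appended; the input is not mutated.
function appendToLast(prev: ChatMessage[], text: string): ChatMessage[] {
  if (prev.length === 0) return prev;
  const last = prev[prev.length - 1];
  return [...prev.slice(0, -1), { ...last, content: last.content + text }];
}

// Apply updates one after another, each starting from the previous result,
// which is the guarantee a functional state update gives you.
let state: ChatMessage[] = [
  { role: "user", content: "Hi" },
  { role: "assistant", content: "" },
];
for (const piece of ["Hel", "lo"]) {
  state = appendToLast(state, piece);
}

console.log(state[1].content); // prints "Hello"
```

In the component this would be used as setMessages((prev) => appendToLast(prev, data.text)).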
Step 5: Build and Run
npm run dev
# ▲ Next.js 16.2.6 (Turbopack)
# ✓ Ready in 337ms
# Local: http://localhost:3000
npm run build
# ✓ Compiled successfully in 1874ms
# ƒ /api/chat (Dynamic)
337ms dev server startup is noticeably faster than Webpack-based builds. The production build also runs TypeScript checks automatically — type errors fail the build, which is the right behavior for a typed codebase.
Limitations of This Implementation
I'll be straight: don't ship this to production as-is.
No error handling. When the Claude API fails — rate limit, network error, invalid key — the stream just drops. The user sees nothing. Real services need try/catch blocks and error SSE events:
// Route Handler with error handling
export async function POST(req: Request) {
  try {
    const { messages } = await req.json();
    const stream = client.messages.stream({ /* ... */ });
    const encoder = new TextEncoder();

    const readable = new ReadableStream({
      async start(controller) {
        try {
          for await (const chunk of stream) { /* ... */ }
        } catch (streamError) {
          // Tell the client something went wrong instead of silently dropping the stream.
          controller.enqueue(
            encoder.encode(`data: ${JSON.stringify({ error: "Stream error" })}\n\n`)
          );
        } finally {
          controller.enqueue(encoder.encode("data: [DONE]\n\n"));
          controller.close();
        }
      },
    });

    return new Response(readable, { /* headers */ });
  } catch (err) {
    return new Response(JSON.stringify({ error: "Request failed" }), {
      status: 500,
      headers: { "Content-Type": "application/json" },
    });
  }
}
No conversation length limit. The full message history goes to the API on every request. Long conversations will eventually exceed the context window or drive up costs. Production apps need to trim to the last N messages or manage token counts.
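A minimal sketch of the "last N messages" approach follows. trimHistory and HistoryMessage are hypothetical names, and the rule that the trimmed history should begin with a user turn is my assumption about what the Messages API expects. Counting messages is a crude proxy; a real implementation would more likely budget by tokens.

```typescript
type HistoryMessage = { role: "user" | "assistant"; content: string };

// Keep at most `maxMessages` of the most recent messages.
// Assumption: the conversation sent to the API should start with a user turn,
// so any leading assistant messages left after slicing are dropped.
function trimHistory(messages: HistoryMessage[], maxMessages: number): HistoryMessage[] {
  // slice(-0) would return the whole array, so handle non-positive limits explicitly.
  const recent = maxMessages > 0 ? messages.slice(-maxMessages) : [];
  const firstUser = recent.findIndex((m) => m.role === "user");
  return firstUser === -1 ? [] : recent.slice(firstUser);
}

const history: HistoryMessage[] = [
  { role: "user", content: "q1" },
  { role: "assistant", content: "a1" },
  { role: "user", content: "q2" },
  { role: "assistant", content: "a2" },
  { role: "user", content: "q3" },
];

console.log(trimHistory(history, 3).map((m) => m.content)); // prints [ 'q2', 'a2', 'q3' ]
```

In the Route Handler this would sit between req.json() and the API call, so the client can keep its full history for display while the server caps what it forwards.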
No concurrent request management. Rapid messages or multiple tabs cause streaming collisions. AbortController logic for canceling previous requests is missing.
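One possible shape for the missing cancellation logic is a small helper that owns the current AbortController. RequestGate is a hypothetical name for this sketch; AbortController and AbortSignal are standard globals in browsers and in Node.js.

```typescript
// Hypothetical helper: hands out one AbortSignal at a time and aborts the previous one.
class RequestGate {
  private current: AbortController | null = null;

  // Cancel whatever is in flight and return a signal for the new request.
  next(): AbortSignal {
    this.current?.abort();
    this.current = new AbortController();
    return this.current.signal;
  }

  // Cancel the in-flight request, for example from a Stop button.
  cancel(): void {
    this.current?.abort();
    this.current = null;
  }
}

const gate = new RequestGate();
const first = gate.next();
const second = gate.next(); // starting a second request aborts the first

console.log(first.aborted, second.aborted); // prints "true false"
```

On the client, the signal would be passed as fetch("/api/chat", { ..., signal: gate.next() }). An aborted fetch rejects with an AbortError, so the caller needs to catch that and treat it as a normal cancellation.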
Deploying to Vercel
A few things I hit when deploying:
Environment variables: Add ANTHROPIC_API_KEY in Vercel's Project Settings → Environment Variables. The .env.local file stays local.
Runtime: Explicitly set Node.js Runtime in your Route Handler to avoid Edge Runtime compatibility issues:
export const runtime = 'nodejs';
Function timeout: Vercel Hobby plan has a 10-second limit. For longer responses, add this to vercel.json:
{
  "functions": {
    "src/app/api/chat/route.ts": {
      "maxDuration": 60
    }
  }
}
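Next.js also lets a Route Handler declare these settings itself through route segment config, which keeps them next to the code. A sketch, assuming the same 60-second value and that your Vercel plan allows it:

```typescript
// src/app/api/chat/route.ts
// Route segment config: Next.js reads these exports at build time.
export const runtime = "nodejs";
export const maxDuration = 60; // seconds; the permitted maximum depends on your plan
```

Either approach should work; I would pick one place for the setting so the two values cannot drift apart.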
How SSE Works Under the Hood
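Server-Sent Events is a one-way (server-to-client) streaming protocol running over plain HTTP. Unlike WebSockets, it needs no protocol upgrade, so it generally passes through proxies, CDNs, and firewalls without special handling, although an intermediary that buffers responses can delay chunks.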
SSE message format:
data: {"text": "Hello"}\n\n
data: {"text": ", World"}\n\n
data: [DONE]\n\n
Each message starts with data: and ends with two newlines. The TextEncoder/TextDecoder pair converts between strings and Uint8Array (bytes) — Web Streams API operates at the byte level. This same pattern works across Next.js, Cloudflare Workers, Deno, and Bun.
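The framing rule can be sketched as a pair of small functions. formatSSE and createSSEParser are hypothetical names, and the parser handles only the single-line data: messages used in this post, not the full SSE grammar (event:, id:, multi-line data, comments). The point it illustrates is that network chunks do not have to line up with message boundaries, so a parser has to buffer.

```typescript
// Serialize one payload as an SSE message: "data: <json>" followed by a blank line.
function formatSSE(payload: unknown): string {
  return `data: ${JSON.stringify(payload)}\n\n`;
}

// Incremental parser: feed it decoded text in arbitrary pieces and get back the
// data payloads of every message completed so far. Unfinished text stays buffered.
function createSSEParser() {
  let buffer = "";
  return function feed(chunk: string): string[] {
    buffer += chunk;
    const messages = buffer.split("\n\n");
    buffer = messages.pop() ?? ""; // the last piece is an unfinished message (or empty)
    return messages
      .filter((m) => m.startsWith("data: "))
      .map((m) => m.slice("data: ".length));
  };
}

const wire =
  formatSSE({ text: "Hello" }) + formatSSE({ text: ", World" }) + "data: [DONE]\n\n";

// Split the wire text mid-message to imitate how the network may chunk it.
const feed = createSSEParser();
const received = [...feed(wire.slice(0, 10)), ...feed(wire.slice(10))];

console.log(received); // three payloads: two JSON strings and "[DONE]"
```

The first feed call returns nothing because no message is complete yet; the second returns all three payloads.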
Raw API vs. Vercel AI SDK
| Aspect | Raw Anthropic SDK | Vercel AI SDK |
|---|---|---|
| Code volume | More (~50-line Route Handler) | Less (useChat one-liner) |
| Customization | Completely free | Within SDK abstractions |
| Debuggability | SSE flow is transparent | Internal logic is opaque |
| Learning value | Forces you to understand Web Streams and SSE | Use immediately |
| Best for | Understanding streaming mechanics | Rapid prototyping |
My recommendation: build it the raw way once, then use the SDK. You'll understand what the SDK is actually handling for you. See Building a Claude Streaming Agent with Vercel AI SDK for the SDK-based comparison.
Next Steps
- Add Tool Use — Give Claude function-calling ability → Claude Agent SDK Complete Guide
- Prompt Caching — Cut API costs up to 90% → Claude API Prompt Caching in Practice
- Stronger error handling — AbortController, retry logic, error SSE events
- Stream cancellation — Cancel button to stop generation mid-stream
- Vercel deployment — Apply the notes above and go live
A deeper guide covering those production gaps is coming in a follow-up post.
