Jangwook Kim

Posted on May 20 • Originally published at jangwook.net

Building a Streaming AI Chat App with Next.js 16 + Claude API — Complete App Router Guide

#nextjs #claudeapi #typescript #streaming

Search for "Next.js AI chat" in 2026 and Vercel AI SDK still comes up as the de facto standard. Nothing wrong with it, but relying on the SDK means you often don't understand what's happening under the hood — how streaming actually works, what the Route Handler is doing behind the scenes.

I built this from scratch in a sandbox using only the Anthropic SDK. The entire flow: create-next-app, adding @anthropic-ai/sdk, implementing the Route Handler. It turned out simpler than I expected. About 50 lines gets you a working streaming chat backend that builds and deploys cleanly, though it still needs the hardening covered in the limitations section before real users touch it.

One thing I noticed while doing this: create-next-app@latest now installs Next.js 16. Most tutorials you'll find are still targeting Next.js 14 or 15. This post reflects what you actually get in May 2026.

What We're Building

The app structure:

Next.js 16.2.6 + App Router
Route Handler (/api/chat) calling Claude API server-side
SSE (Server-Sent Events) delivering streaming responses to the client
React 19 "use client" component rendering the stream in real time

The key point: the API key is read only on the server and never included in the client bundle. This is a direct consequence of how Next.js App Router separates server and client code.

Actual build output from the sandbox:

▲ Next.js 16.2.6 (Turbopack)
✓ Compiled successfully in 1874ms

Route (app)
┌ ○ /           (Static)  prerendered as static content
└ ƒ /api/chat   (Dynamic) server-rendered on demand

Project Structure

When finished, the structure looks like this. Two files are the core; the rest is generated by create-next-app.

nextjs-claude-chat/
├── src/
│   └── app/
│       ├── api/
│       │   └── chat/
│       │       └── route.ts    ← Claude API streaming endpoint (core)
│       ├── page.tsx             ← Chat UI (core)
│       ├── layout.tsx           ← Auto-generated
│       └── globals.css          ← Auto-generated
├── .env.local                   ← ANTHROPIC_API_KEY goes here
├── package.json
└── tsconfig.json

route.ts is server code; page.tsx is client code. api/chat/route.ts ends up in the server bundle only, while page.tsx with its "use client" directive is shipped to the browser. This separation is what keeps the API key out of client code.

Prerequisites

Node.js 20.9+ (Next.js 16 no longer supports Node.js 18)
Anthropic API key (sk-ant-...) — get one at console.anthropic.com
Basic TypeScript knowledge
Basic Next.js App Router understanding (you can follow along without it)

Step 1: Create the Project and Install Dependencies

npx create-next-app@latest nextjs-claude-chat \
  --typescript \
  --tailwind \
  --eslint \
  --app \
  --src-dir \
  --import-alias "@/*"

cd nextjs-claude-chat
npm install @anthropic-ai/sdk

As of May 2026, create-next-app@latest installs Next.js 16.2.6 with React 19.2.4. Existing tutorials using Next.js 14/15 may have minor differences.

Key dependencies after installation:

{
  "dependencies": {
    "@anthropic-ai/sdk": "^0.97.1",
    "next": "16.2.6",
    "react": "19.2.4",
    "react-dom": "19.2.4"
  }
}

Anthropic SDK 0.97.x is the latest release at the time of writing. Earlier versions (0.20.x and below) had a different messages.stream() API, so pin your version if you're migrating.

Step 2: Implement the Claude API Route Handler

The core file. Create src/app/api/chat/route.ts:

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

export async function POST(req: Request) {
  const { messages } = await req.json();

  const stream = await client.messages.stream({
    model: "claude-opus-4-7",
    max_tokens: 1024,
    messages,
  });

  const encoder = new TextEncoder();
  const readable = new ReadableStream({
    async start(controller) {
      for await (const chunk of stream) {
        if (
          chunk.type === "content_block_delta" &&
          chunk.delta.type === "text_delta"
        ) {
          controller.enqueue(
            encoder.encode(
              `data: ${JSON.stringify({ text: chunk.delta.text })}\n\n`
            )
          );
        }
      }
      controller.enqueue(encoder.encode("data: [DONE]\n\n"));
      controller.close();
    },
  });

  return new Response(readable, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
      Connection: "keep-alive",
    },
  });
}

Two things worth noting here.

First, client.messages.stream() returns a MessageStream, which is async-iterable (the await in front of the call is harmless but not required). The for await...of loop receives events one at a time and forwards the text deltas to the client. When the stream ends, a [DONE] signal is sent and the controller closes.

Second, ReadableStream and TextEncoder are web platform standards (the Streams and Encoding APIs). Next.js Route Handlers use Web Streams, not the Node.js stream module. This is why the code looks different from FastAPI streaming or Express implementations. new ReadableStream may feel unfamiliar, but it's the standard across modern JavaScript runtimes: Cloudflare Workers, Deno, and Bun all work the same way.

The filter on content_block_delta events: Anthropic's streaming protocol emits multiple event types (message_start, content_block_start, content_block_delta, message_delta, message_stop). Only text_delta typed content_block_delta events carry actual text content.
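To make that filter concrete, here is a minimal sketch that runs it over a hand-written event sequence instead of a live stream. The StreamEvent type is a simplified local stand-in I wrote for illustration, not the SDK's own type definitions, and textFromEvent is a hypothetical helper name.

```typescript
// Simplified local model of the stream events named above. This is an
// assumption for illustration, not the SDK's real type definitions.
type StreamEvent =
  | { type: "message_start" }
  | { type: "content_block_start"; index: number }
  | {
      type: "content_block_delta";
      index: number;
      delta:
        | { type: "text_delta"; text: string }
        | { type: "input_json_delta"; partial_json: string };
    }
  | { type: "message_delta" }
  | { type: "message_stop" };

// Returns the text carried by an event, or null for every other event.
function textFromEvent(event: StreamEvent): string | null {
  if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
    return event.delta.text;
  }
  return null;
}

// A hand-written sequence standing in for a real stream.
const sampleEvents: StreamEvent[] = [
  { type: "message_start" },
  { type: "content_block_start", index: 0 },
  { type: "content_block_delta", index: 0, delta: { type: "text_delta", text: "Hello" } },
  { type: "content_block_delta", index: 0, delta: { type: "text_delta", text: ", World" } },
  { type: "message_delta" },
  { type: "message_stop" },
];

const collected = sampleEvents
  .map(textFromEvent)
  .filter((t): t is string => t !== null)
  .join("");

console.log(collected); // prints "Hello, World"
```

Only two of the six events contribute text; everything else is bookkeeping that the Route Handler can safely skip.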

Step 3: Environment Variables and Security

Create .env.local in the project root (same level as package.json):

ANTHROPIC_API_KEY=sk-ant-your-actual-key-here

Never use the NEXT_PUBLIC_ prefix. This is core to Next.js security.

Variable format	Accessible from	Use for
`ANTHROPIC_API_KEY`	Server only (Route Handler, Server Component)	✓ API keys
`NEXT_PUBLIC_API_KEY`	Client-public (included in browser bundle)	✗ Never use for API keys

NEXT_PUBLIC_ variables get inlined into the JavaScript bundle at build time. Anyone can see them in browser DevTools. Without the prefix, the variable is server-only: Next.js does not inline it into client code, so a client component that reads process.env.ANTHROPIC_API_KEY gets undefined rather than the key.
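One way to catch a missing or misnamed key early is a small guard on the server side. This is a sketch, not something Next.js or the Anthropic SDK provides: requireServerEnv and the Env type are hypothetical names, and the environment is passed in as a parameter so the function is easy to test.

```typescript
// Hypothetical helper (not part of Next.js or the Anthropic SDK): read a
// required server-side variable and fail fast with a clear message.
type Env = Record<string, string | undefined>;

function requireServerEnv(name: string, env: Env): string {
  if (name.startsWith("NEXT_PUBLIC_")) {
    // NEXT_PUBLIC_ values are inlined into the browser bundle, so a secret
    // should never live under that prefix.
    throw new Error(`${name} is client-exposed; do not use it for secrets`);
  }
  const value = env[name];
  if (!value) {
    throw new Error(`Missing environment variable: ${name}`);
  }
  return value;
}

// In route.ts this would be called as:
//   const apiKey = requireServerEnv("ANTHROPIC_API_KEY", process.env);
const demoEnv: Env = { ANTHROPIC_API_KEY: "sk-ant-example" };
console.log(requireServerEnv("ANTHROPIC_API_KEY", demoEnv)); // prints "sk-ant-example"
```

Failing at startup with a named variable is usually easier to debug than an authentication error from the API on the first request.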

Step 4: Client Chat UI

Create src/app/page.tsx with streaming state management:

"use client";

import { useState, useRef, useEffect } from "react";

type Message = {
  role: "user" | "assistant";
  content: string;
};

export default function ChatPage() {
  const [messages, setMessages] = useState<Message[]>([]);
  const [input, setInput] = useState("");
  const [isLoading, setIsLoading] = useState(false);
  const bottomRef = useRef<HTMLDivElement>(null);

  useEffect(() => {
    bottomRef.current?.scrollIntoView({ behavior: "smooth" });
  }, [messages]);

  const sendMessage = async () => {
    if (!input.trim() || isLoading) return;

    const userMessage: Message = { role: "user", content: input };
    const updatedMessages = [...messages, userMessage];
    setMessages(updatedMessages);
    setInput("");
    setIsLoading(true);

    // Add placeholder for the assistant response
    const assistantMessage: Message = { role: "assistant", content: "" };
    setMessages([...updatedMessages, assistantMessage]);

    const res = await fetch("/api/chat", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ messages: updatedMessages }),
    });

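    if (!res.ok || !res.body) {
      // Release the loading flag so the input is usable again
      setIsLoading(false);
      return;
    }

    const reader = res.body.getReader();
    const decoder = new TextDecoder();
    let buffer = ""; // holds a partial SSE line until the rest arrives

    while (true) {
      const { done, value } = await reader.read();
      if (done) break;

      // stream: true keeps multi-byte characters intact across chunk boundaries
      buffer += decoder.decode(value, { stream: true });
      const lines = buffer.split("\n");
      buffer = lines.pop() ?? ""; // the last piece may be an incomplete line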

      for (const line of lines) {
        if (line.startsWith("data: ") && line !== "data: [DONE]") {
          const data = JSON.parse(line.slice(6));
          setMessages((prev) => {
            const last = prev[prev.length - 1];
            return [
              ...prev.slice(0, -1),
              { ...last, content: last.content + data.text },
            ];
          });
        }
      }
    }

    setIsLoading(false);
  };

  return (
    <main className="flex flex-col h-screen max-w-2xl mx-auto p-4">
      <h1 className="text-2xl font-bold mb-4">Claude Chat</h1>
      <div className="flex-1 overflow-y-auto space-y-4 mb-4">
        {messages.map((msg, i) => (
          <div key={i} className={`p-3 rounded-lg ${
            msg.role === "user"
              ? "bg-blue-100 ml-auto max-w-xs"
              : "bg-gray-100 mr-auto max-w-md"
          }`}>
            <span className="text-xs text-gray-500 block mb-1">
              {msg.role === "user" ? "You" : "Claude"}
            </span>
            <p className="whitespace-pre-wrap">{msg.content}</p>
          </div>
        ))}
        <div ref={bottomRef} />
      </div>
      <div className="flex gap-2">
        <input
          type="text"
          value={input}
          onChange={(e) => setInput(e.target.value)}
          onKeyDown={(e) => e.key === "Enter" && sendMessage()}
          placeholder="Type a message..."
          className="flex-1 border rounded-lg px-3 py-2 focus:outline-none focus:ring-2 focus:ring-blue-400"
        />
        <button
          onClick={sendMessage}
          disabled={isLoading || !input.trim()}
          className="bg-blue-500 text-white px-4 py-2 rounded-lg disabled:opacity-50"
        >
          Send
        </button>
      </div>
    </main>
  );
}

Two implementation details to highlight. First, the line !== "data: [DONE]" check: without it, the loop tries to parse [DONE] as JSON and throws an error. Second, the functional setMessages((prev) => ...) update: inside an async loop, closures capture stale state. Using prev ensures you're always appending to the latest message content.
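Network chunks don't necessarily line up with SSE line boundaries, so a parser generally has to hold on to a partial line until the rest arrives. Here is a minimal sketch of that parsing step as a standalone function; createSseParser and ChatDelta are hypothetical names of mine, and the input is assumed to be text that has already been decoded.

```typescript
// Hypothetical helper: accumulates decoded text and returns the payloads of
// every complete "data:" line seen so far. "[DONE]" is skipped, not parsed.
type ChatDelta = { text: string };

function createSseParser() {
  let buffer = ""; // trailing partial line carried over between chunks

  return function push(chunk: string): ChatDelta[] {
    buffer += chunk;
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // the last piece may be incomplete

    const deltas: ChatDelta[] = [];
    for (const line of lines) {
      if (!line.startsWith("data: ")) continue; // blank separator lines
      const payload = line.slice(6);
      if (payload === "[DONE]") continue; // end sentinel, not JSON
      deltas.push(JSON.parse(payload) as ChatDelta);
    }
    return deltas;
  };
}

// The first event is deliberately split across two chunks.
const push = createSseParser();
const first = push('data: {"text":"Hel');
const second = push('lo"}\n\ndata: {"text":"!"}\n\ndata: [DONE]\n\n');

console.log(first.length); // prints 0
console.log(second.map((d) => d.text).join("")); // prints "Hello!"
```

Keeping the parser free of React state also makes it straightforward to unit test separately from the component.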

Step 5: Build and Run

npm run dev
# ▲ Next.js 16.2.6 (Turbopack)
# ✓ Ready in 337ms
# Local: http://localhost:3000

npm run build
# ✓ Compiled successfully in 1874ms
# ƒ /api/chat  (Dynamic)

337ms dev server startup is noticeably faster than Webpack-based builds. The production build also runs TypeScript checks automatically — type errors fail the build, which is the right behavior for a typed codebase.

Limitations of This Implementation

I'll be straight: don't ship this to production as-is.

No error handling. When the Claude API fails — rate limit, network error, invalid key — the stream just drops. The user sees nothing. Real services need try/catch blocks and error SSE events:

// Route Handler with error handling
export async function POST(req: Request) {
  try {
    const { messages } = await req.json();
    const stream = await client.messages.stream({ /* ... */ });
    const readable = new ReadableStream({
      async start(controller) {
        try {
          for await (const chunk of stream) { /* ... */ }
        } catch (streamError) {
          controller.enqueue(
            encoder.encode(`data: ${JSON.stringify({ error: "Stream error" })}\n\n`)
          );
        } finally {
          controller.enqueue(encoder.encode("data: [DONE]\n\n"));
          controller.close();
        }
      },
    });
    return new Response(readable, { /* headers */ });
  } catch (err) {
    return new Response(JSON.stringify({ error: "Request failed" }), {
      status: 500,
      headers: { "Content-Type": "application/json" },
    });
  }
}
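One failure the sketch above treats as a generic 500 is a malformed request body. A small validator lets the handler answer with a 400 before calling the API at all. This is an illustrative sketch: parseChatBody and ChatMessage are hypothetical names, and the accepted shape is simply what page.tsx sends.

```typescript
type ChatMessage = { role: "user" | "assistant"; content: string };

// Hypothetical validator: returns the messages if the body has the expected
// shape, or null so the handler can respond with a 400.
function parseChatBody(body: unknown): ChatMessage[] | null {
  if (typeof body !== "object" || body === null) return null;
  const messages = (body as { messages?: unknown }).messages;
  if (!Array.isArray(messages) || messages.length === 0) return null;

  const valid = messages.every(
    (m) =>
      typeof m === "object" &&
      m !== null &&
      (m.role === "user" || m.role === "assistant") &&
      typeof m.content === "string"
  );
  return valid ? (messages as ChatMessage[]) : null;
}

console.log(parseChatBody({ messages: [{ role: "user", content: "hi" }] })?.length); // prints 1
console.log(parseChatBody({ messages: "nope" })); // prints null

// In the Route Handler, roughly:
//   const messages = parseChatBody(await req.json());
//   if (!messages) {
//     return new Response(JSON.stringify({ error: "Invalid request body" }), {
//       status: 400,
//       headers: { "Content-Type": "application/json" },
//     });
//   }
```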

No conversation length limit. The full message history goes to the API on every request. Long conversations will eventually exceed the context window or drive up costs. Production apps need to trim to the last N messages or manage token counts.
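The "last N messages" approach can be sketched in a few lines. The helper name trimHistory is hypothetical, and I'm assuming the conversation should still begin with a user turn after trimming, so leading assistant messages are dropped. Counting messages is a crude proxy for tokens; treat this as a starting point.

```typescript
type ChatMessage = { role: "user" | "assistant"; content: string };

// Hypothetical helper: keep at most maxMessages of the most recent messages,
// then drop leading assistant turns so the history starts with a user turn.
function trimHistory(messages: ChatMessage[], maxMessages: number): ChatMessage[] {
  // slice(-0) would return the whole array, so handle non-positive limits
  const recent = maxMessages > 0 ? messages.slice(-maxMessages) : [];
  const firstUser = recent.findIndex((m) => m.role === "user");
  return firstUser === -1 ? [] : recent.slice(firstUser);
}

const history: ChatMessage[] = [
  { role: "user", content: "q1" },
  { role: "assistant", content: "a1" },
  { role: "user", content: "q2" },
  { role: "assistant", content: "a2" },
  { role: "user", content: "q3" },
];

console.log(trimHistory(history, 4).map((m) => m.content).join(",")); // prints "q2,a2,q3"
```

In the Route Handler this would wrap the incoming array, for example messages: trimHistory(messages, 20), where 20 is an arbitrary limit to tune against cost.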

No cancellation. The isLoading flag blocks a second send within one tab, but nothing lets the user stop a response mid-stream, and nothing cancels an in-flight request when a newer one should replace it. The AbortController logic for that is missing.
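A minimal sketch of the missing piece, using the standard AbortController API. createRequestCanceller is a hypothetical helper of mine: it hands out one signal per request and aborts the previous one whenever a new request starts.

```typescript
// Hypothetical helper: one AbortSignal per request; starting a new request
// aborts the previous one.
function createRequestCanceller() {
  let current: AbortController | null = null;

  return {
    // Call before each fetch and pass the result as fetch's `signal` option.
    next(): AbortSignal {
      current?.abort();
      current = new AbortController();
      return current.signal;
    },
    // Wire this to a Stop button.
    cancel(): void {
      current?.abort();
      current = null;
    },
  };
}

const canceller = createRequestCanceller();
const firstSignal = canceller.next();
const secondSignal = canceller.next(); // starting a second request aborts the first

console.log(firstSignal.aborted, secondSignal.aborted); // prints true false
canceller.cancel();
console.log(secondSignal.aborted); // prints true
```

On the client, the signal would go into the fetch options, as in fetch("/api/chat", { method: "POST", body, signal }). An aborted fetch rejects with an AbortError, so the calling code needs a try/catch that treats that case as a normal stop and not a failure.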

Deploying to Vercel

A few things I hit when deploying:

Environment variables: Add ANTHROPIC_API_KEY in Vercel's Project Settings → Environment Variables. The .env.local file stays local.

Runtime: Node.js is already the default runtime for Route Handlers, but setting it explicitly documents the intent and guards against Edge Runtime compatibility issues if someone changes it later:

export const runtime = 'nodejs';

Function timeout: Vercel Hobby plan has a 10-second limit. For longer responses, add this to vercel.json:

{
  "functions": {
    "src/app/api/chat/route.ts": {
      "maxDuration": 60
    }
  }
}
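As an alternative to vercel.json, Next.js also reads a maxDuration route segment config exported from the route file itself, which keeps the setting next to the code it affects. A sketch, assuming your Vercel plan permits a 60-second duration:

```typescript
// src/app/api/chat/route.ts
export const runtime = "nodejs"; // same setting as shown above
export const maxDuration = 60; // seconds; the allowed maximum depends on your plan
```

Pick one of the two mechanisms so the timeout has a single source of truth.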

How SSE Works Under the Hood

Server-Sent Events is a one-way streaming protocol running over plain HTTP. Unlike WebSockets, it needs no protocol upgrade, so it usually passes through proxies, CDNs, and firewalls without special handling, although a proxy that buffers responses can delay events.

SSE message format:

data: {"text": "Hello"}\n\n
data: {"text": ", World"}\n\n
data: [DONE]\n\n

Each message starts with data: and ends with two newlines. The TextEncoder/TextDecoder pair converts between strings and Uint8Array (bytes) — Web Streams API operates at the byte level. This same pattern works across Next.js, Cloudflare Workers, Deno, and Bun.
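The framing and the string-to-bytes conversion can be checked in isolation. This sketch uses only TextEncoder and TextDecoder; formatSseEvent is a hypothetical helper name that mirrors what the Route Handler enqueues.

```typescript
// Hypothetical helper: frame one JSON payload as an SSE message and encode
// it to bytes, the same shape the Route Handler enqueues.
const encoder = new TextEncoder();
const decoder = new TextDecoder();

function formatSseEvent(payload: unknown): Uint8Array {
  return encoder.encode(`data: ${JSON.stringify(payload)}\n\n`);
}

const bytes = formatSseEvent({ text: "Hello" });
const wire = decoder.decode(bytes);

console.log(bytes instanceof Uint8Array); // prints true
console.log(JSON.stringify(wire)); // prints "data: {\"text\":\"Hello\"}\n\n"

// Newlines inside the text are escaped by JSON.stringify, so a multi-line
// reply still occupies a single "data:" line on the wire.
const multiline = decoder.decode(formatSseEvent({ text: "a\nb" }));
console.log(multiline.split("\n").length); // prints 3
```

Sending JSON instead of raw text is what keeps the framing safe: a literal newline in the model's output would otherwise end the SSE message early.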

Raw API vs. Vercel AI SDK

Aspect	Raw Anthropic SDK	Vercel AI SDK
Code volume	More (~50-line Route Handler)	Less (`useChat` one-liner)
Customization	Completely free	Within SDK abstractions
Debuggability	SSE flow is transparent	Internal logic is opaque
Learning value	Forces you to understand Web Streams and SSE	Use immediately
Best for	Understanding streaming mechanics	Rapid prototyping

My recommendation: build it the raw way once, then use the SDK. You'll understand what the SDK is actually handling for you. See Building a Claude Streaming Agent with Vercel AI SDK for the SDK-based comparison.

Next Steps

Add Tool Use — Give Claude function-calling ability → Claude Agent SDK Complete Guide
Prompt Caching — Cut API costs up to 90% → Claude API Prompt Caching in Practice
Stronger error handling — AbortController, retry logic, error SSE events
Stream cancellation — Cancel button to stop generation mid-stream
Vercel deployment — Apply the notes above and go live

A deeper guide covering those production gaps is coming in a follow-up post.
