
Jangwook Kim

Posted on • Originally published at jangwook.net

Building a Streaming AI Chat App with Next.js 16 + Claude API — Complete App Router Guide

Search for "Next.js AI chat" in 2026 and the Vercel AI SDK still comes up as the de facto standard. Nothing wrong with it, but relying on the SDK means you often don't understand what's happening under the hood: how streaming actually works, and what the Route Handler is doing behind the scenes.

I built this from scratch in a sandbox using only the Anthropic SDK. The entire flow is three steps: run create-next-app, add @anthropic-ai/sdk, and implement the Route Handler. It turned out simpler than I expected. About 50 lines gets you a deployable streaming chat backend (with caveats covered in the limitations section).

One thing I noticed while doing this: create-next-app@latest now installs Next.js 16. Most tutorials you'll find are still targeting Next.js 14 or 15. This post reflects what you actually get in May 2026.

What We're Building

The app structure:

  • Next.js 16.2.6 + App Router
  • Route Handler (/api/chat) calling Claude API server-side
  • SSE (Server-Sent Events) delivering streaming responses to the client
  • React 19 "use client" component rendering the stream in real time

The key point: the API key is read only on the server and never included in the client bundle. This is a direct consequence of how Next.js App Router separates server and client code.

Actual build output from the sandbox:

▲ Next.js 16.2.6 (Turbopack)
✓ Compiled successfully in 1874ms

Route (app)
┌ ○ /           (Static)  prerendered as static content
└ ƒ /api/chat   (Dynamic) server-rendered on demand

[Figure: Next.js 16 + Claude API architecture diagram]

Project Structure

When finished, the structure looks like this. Two files are the core; the rest is generated by create-next-app.

nextjs-claude-chat/
├── src/
│   └── app/
│       ├── api/
│       │   └── chat/
│       │       └── route.ts    ← Claude API streaming endpoint (core)
│       ├── page.tsx             ← Chat UI (core)
│       ├── layout.tsx           ← Auto-generated
│       └── globals.css          ← Auto-generated
├── .env.local                   ← ANTHROPIC_API_KEY goes here
├── package.json
└── tsconfig.json

Two files. route.ts is server code; page.tsx is client code. api/chat/route.ts ends up in the server bundle only, while page.tsx with its "use client" directive goes to the client bundle. This separation is what makes API key security work.

Prerequisites

  • Node.js 20.9+ (Next.js 16 no longer supports Node.js 18)
  • Anthropic API key (sk-ant-...) — get one at console.anthropic.com
  • Basic TypeScript knowledge
  • Basic Next.js App Router understanding (you can follow along without it)

Step 1: Create the Project and Install Dependencies

npx create-next-app@latest nextjs-claude-chat \
  --typescript \
  --tailwind \
  --eslint \
  --app \
  --src-dir \
  --import-alias "@/*"

cd nextjs-claude-chat
npm install @anthropic-ai/sdk

As of May 2026, create-next-app@latest installs Next.js 16.2.6 with React 19.2.4. Tutorials written for Next.js 14 or 15 may differ in minor details.

Key dependencies after installation:

{
  "dependencies": {
    "@anthropic-ai/sdk": "^0.97.1",
    "next": "16.2.6",
    "react": "19.2.4",
    "react-dom": "19.2.4"
  }
}

Anthropic SDK 0.97.x is the current latest. Earlier versions (0.20.x and below) had a different messages.stream() API, so pin your version if you're migrating.

Step 2: Implement the Claude API Route Handler

The core file. Create src/app/api/chat/route.ts:

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

export async function POST(req: Request) {
  const { messages } = await req.json();

  const stream = await client.messages.stream({
    model: "claude-opus-4-7",
    max_tokens: 1024,
    messages,
  });

  const encoder = new TextEncoder();
  const readable = new ReadableStream({
    async start(controller) {
      for await (const chunk of stream) {
        if (
          chunk.type === "content_block_delta" &&
          chunk.delta.type === "text_delta"
        ) {
          controller.enqueue(
            encoder.encode(
              `data: ${JSON.stringify({ text: chunk.delta.text })}\n\n`
            )
          );
        }
      }
      controller.enqueue(encoder.encode("data: [DONE]\n\n"));
      controller.close();
    },
  });

  return new Response(readable, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
      Connection: "keep-alive",
    },
  });
}

Two things worth noting here.

First, client.messages.stream() returns a MessageStream, which is async-iterable. The for await...of loop receives events one at a time and forwards the text deltas to the client. When the stream ends, the handler sends a [DONE] sentinel and closes the controller.

Second, ReadableStream and TextEncoder are standard web APIs (Web Streams and Encoding). Next.js Route Handlers use Web Streams, not the Node.js stream module. This is why the code looks different from FastAPI streaming or Express implementations. new ReadableStream may feel unfamiliar, but it's the standard across modern JavaScript runtimes: Cloudflare Workers, Deno, and Bun all work the same way.
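
To see the Web Streams pieces in isolation, here is a minimal sketch that wraps any async iterable of strings in a ReadableStream of SSE-framed bytes, with no Anthropic SDK involved. The names toSseStream, readAll, and fakeModelOutput are hypothetical helpers made up for illustration; they are not part of Next.js or the SDK.

```typescript
// Hypothetical helper: wrap an async iterable of text pieces in a
// ReadableStream that emits SSE-framed bytes, as the Route Handler does.
function toSseStream(source: AsyncIterable<string>): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  return new ReadableStream<Uint8Array>({
    async start(controller) {
      for await (const text of source) {
        controller.enqueue(
          encoder.encode(`data: ${JSON.stringify({ text })}\n\n`)
        );
      }
      controller.enqueue(encoder.encode("data: [DONE]\n\n"));
      controller.close();
    },
  });
}

// Hypothetical helper: drain a byte stream back into a single string.
async function readAll(stream: ReadableStream<Uint8Array>): Promise<string> {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  let out = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    out += decoder.decode(value, { stream: true });
  }
  return out + decoder.decode(); // flush any bytes still buffered
}

// Stand-in for the model stream: yields two text fragments.
async function* fakeModelOutput(): AsyncGenerator<string> {
  yield "Hello";
  yield ", World";
}
```

Calling readAll(toSseStream(fakeModelOutput())) resolves to the same wire format the Route Handler produces, which makes the framing easy to inspect without spending API credits.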

Why filter on content_block_delta? Anthropic's streaming protocol emits several event types (message_start, content_block_start, content_block_delta, content_block_stop, message_delta, message_stop). Only content_block_delta events whose delta has type text_delta carry actual text.
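
The filter can be pulled out into a small pure function, which makes it easy to check against a recorded event sequence. This is a sketch under assumptions: the StreamEvent type below is a simplified, hypothetical stand-in for the SDK's real event types, and extractText is a name invented here.

```typescript
// Simplified, hypothetical event shapes for illustration; the real SDK
// types carry more fields and more variants.
type StreamEvent =
  | { type: "message_start" }
  | { type: "content_block_start"; index: number }
  | {
      type: "content_block_delta";
      index: number;
      delta:
        | { type: "text_delta"; text: string }
        | { type: "input_json_delta"; partial_json: string };
    }
  | { type: "content_block_stop"; index: number }
  | { type: "message_delta" }
  | { type: "message_stop" };

// Return the text carried by an event, or null if it carries none.
function extractText(event: StreamEvent): string | null {
  if (
    event.type === "content_block_delta" &&
    event.delta.type === "text_delta"
  ) {
    return event.delta.text;
  }
  return null;
}

const events: StreamEvent[] = [
  { type: "message_start" },
  { type: "content_block_start", index: 0 },
  { type: "content_block_delta", index: 0, delta: { type: "text_delta", text: "Hel" } },
  { type: "content_block_delta", index: 0, delta: { type: "text_delta", text: "lo" } },
  { type: "content_block_stop", index: 0 },
  { type: "message_delta" },
  { type: "message_stop" },
];

// Keep only the text-bearing events and join their fragments.
const fullText = events
  .map(extractText)
  .filter((t): t is string => t !== null)
  .join("");
```

Of the seven events in this sample sequence, only the two text_delta events contribute to fullText; everything else is protocol bookkeeping.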

Step 3: Environment Variables and Security

Create .env.local in the project root (the same directory as package.json):

ANTHROPIC_API_KEY=sk-ant-your-actual-key-here

Never use the NEXT_PUBLIC_ prefix for this key. That prefix is how Next.js decides which variables are safe to expose to the browser.

Variable format     | Accessible from                               | Use for
ANTHROPIC_API_KEY   | Server only (Route Handler, Server Component) | ✓ API keys
NEXT_PUBLIC_API_KEY | Client-public (included in browser bundle)    | ✗ Never use for API keys

NEXT_PUBLIC_ variables get inlined into the JavaScript bundle at build time, so anyone can read them in browser DevTools. Without the prefix, the variable is only available on the server: it is not inlined into the client bundle, so client code that references it gets undefined rather than the key.
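
A missing or misnamed variable tends to surface late, as an authentication error from the API. One way to fail earlier is a small guard around the lookup. The sketch below is an assumption about how you might structure it: requireServerEnv and the Env type are hypothetical names, and the environment is passed in as a parameter so the function can be exercised without touching process.env.

```typescript
// Hypothetical helper: read a required server-side variable and fail
// loudly if it is missing, instead of passing undefined as an API key.
type Env = Record<string, string | undefined>;

function requireServerEnv(env: Env, name: string): string {
  if (name.startsWith("NEXT_PUBLIC_")) {
    throw new Error(`${name} is exposed to the browser; do not use it for secrets`);
  }
  const value = env[name];
  if (!value) {
    throw new Error(`Missing environment variable: ${name}`);
  }
  return value;
}

// In a Route Handler this would be called as:
//   const apiKey = requireServerEnv(process.env, "ANTHROPIC_API_KEY");
```

The NEXT_PUBLIC_ check is a belt-and-braces measure: it refuses to treat a browser-visible variable as a secret even if someone renames the key later.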

Step 4: Client Chat UI

Create src/app/page.tsx with streaming state management:

"use client";

import { useState, useRef, useEffect } from "react";

type Message = {
  role: "user" | "assistant";
  content: string;
};

export default function ChatPage() {
  const [messages, setMessages] = useState<Message[]>([]);
  const [input, setInput] = useState("");
  const [isLoading, setIsLoading] = useState(false);
  const bottomRef = useRef<HTMLDivElement>(null);

  useEffect(() => {
    bottomRef.current?.scrollIntoView({ behavior: "smooth" });
  }, [messages]);

  const sendMessage = async () => {
    if (!input.trim() || isLoading) return;

    const userMessage: Message = { role: "user", content: input };
    const updatedMessages = [...messages, userMessage];
    setMessages(updatedMessages);
    setInput("");
    setIsLoading(true);

    // Add placeholder for the assistant response
    const assistantMessage: Message = { role: "assistant", content: "" };
    setMessages([...updatedMessages, assistantMessage]);

    const res = await fetch("/api/chat", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ messages: updatedMessages }),
    });

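    if (!res.ok || !res.body) {
      setIsLoading(false); // reset so the UI doesn't stay locked
      return;
    }

    const reader = res.body.getReader();
    const decoder = new TextDecoder();
    let buffer = ""; // carries a partial SSE line over to the next chunk

    while (true) {
      const { done, value } = await reader.read();
      if (done) break;

      // stream: true keeps multi-byte characters intact across chunk boundaries
      buffer += decoder.decode(value, { stream: true });
      const lines = buffer.split("\n");
      buffer = lines.pop() ?? ""; // the last element may be an incomplete line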

      for (const line of lines) {
        if (line.startsWith("data: ") && line !== "data: [DONE]") {
          const data = JSON.parse(line.slice(6));
          setMessages((prev) => {
            const last = prev[prev.length - 1];
            return [
              ...prev.slice(0, -1),
              { ...last, content: last.content + data.text },
            ];
          });
        }
      }
    }

    setIsLoading(false);
  };

  return (
    <main className="flex flex-col h-screen max-w-2xl mx-auto p-4">
      <h1 className="text-2xl font-bold mb-4">Claude Chat</h1>
      <div className="flex-1 overflow-y-auto space-y-4 mb-4">
        {messages.map((msg, i) => (
          <div key={i} className={`p-3 rounded-lg ${
            msg.role === "user"
              ? "bg-blue-100 ml-auto max-w-xs"
              : "bg-gray-100 mr-auto max-w-md"
          }`}>
            <span className="text-xs text-gray-500 block mb-1">
              {msg.role === "user" ? "You" : "Claude"}
            </span>
            <p className="whitespace-pre-wrap">{msg.content}</p>
          </div>
        ))}
        <div ref={bottomRef} />
      </div>
      <div className="flex gap-2">
        <input
          type="text"
          value={input}
          onChange={(e) => setInput(e.target.value)}
          onKeyDown={(e) => e.key === "Enter" && sendMessage()}
          placeholder="Type a message..."
          className="flex-1 border rounded-lg px-3 py-2 focus:outline-none focus:ring-2 focus:ring-blue-400"
        />
        <button
          onClick={sendMessage}
          disabled={isLoading || !input.trim()}
          className="bg-blue-500 text-white px-4 py-2 rounded-lg disabled:opacity-50"
        >
          Send
        </button>
      </div>
    </main>
  );
}

Two implementation details to highlight. First, the line !== "data: [DONE]" check: without it, the loop tries to parse [DONE] as JSON and throws an error. Second, the functional setMessages((prev) => ...) update: inside an async loop, closures capture stale state. Using prev ensures you're always appending to the latest message content.
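
A third detail is worth keeping in mind: network chunks do not have to line up with SSE messages, so a single data: line can arrive in two pieces. A minimal sketch of a parser that tolerates this is shown here. createSseParser is a hypothetical helper, not something from the SDK or Next.js, and it only handles the single-line data: format this post uses.

```typescript
// Hypothetical helper: buffer incoming text and return only the payloads
// of complete "data: ..." lines. Incomplete trailing text is kept for the
// next call.
function createSseParser() {
  let buffer = "";
  return function push(chunk: string): string[] {
    buffer += chunk;
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // the last piece may be a partial line
    const payloads: string[] = [];
    for (const line of lines) {
      if (line.startsWith("data: ") && line !== "data: [DONE]") {
        payloads.push(line.slice(6));
      }
    }
    return payloads;
  };
}

const push = createSseParser();
// One event arrives split across two network chunks.
const first = push('data: {"text":"Hel');
const second = push('lo"}\n\ndata: [DONE]\n\n');
```

Here first is empty because the line is not finished yet, and second contains the one complete JSON payload, which is then safe to hand to JSON.parse.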

Step 5: Build and Run

npm run dev
# ▲ Next.js 16.2.6 (Turbopack)
# ✓ Ready in 337ms
# Local: http://localhost:3000

npm run build
# ✓ Compiled successfully in 1874ms
# ƒ /api/chat  (Dynamic)

337ms dev server startup is noticeably faster than Webpack-based builds. The production build also runs TypeScript checks automatically — type errors fail the build, which is the right behavior for a typed codebase.

Limitations of This Implementation

I'll be straight: don't ship this to production as-is.

No error handling. When the Claude API fails — rate limit, network error, invalid key — the stream just drops. The user sees nothing. Real services need try/catch blocks and error SSE events:

// Route Handler with error handling
export async function POST(req: Request) {
  try {
    const { messages } = await req.json();
    const stream = await client.messages.stream({ /* ... */ });
    const encoder = new TextEncoder();
    const readable = new ReadableStream({
      async start(controller) {
        try {
          for await (const chunk of stream) { /* ... */ }
        } catch (streamError) {
          controller.enqueue(
            encoder.encode(`data: ${JSON.stringify({ error: "Stream error" })}\n\n`)
          );
        } finally {
          controller.enqueue(encoder.encode("data: [DONE]\n\n"));
          controller.close();
        }
      },
    });
    return new Response(readable, { /* headers */ });
  } catch (err) {
    return new Response(JSON.stringify({ error: "Request failed" }), {
      status: 500,
      headers: { "Content-Type": "application/json" },
    });
  }
}

No conversation length limit. The full message history goes to the API on every request. Long conversations will eventually exceed the context window or drive up costs. Production apps need to trim to the last N messages or manage token counts.
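
The simplest version of trimming is a cap on message count. The sketch below assumes the API expects the history to begin with a user turn, so it drops any leading assistant message left over after the cut. trimHistory is a hypothetical helper, and counting messages is only a rough proxy for counting tokens.

```typescript
type Message = { role: "user" | "assistant"; content: string };

// Hypothetical helper: keep at most maxMessages recent messages, then drop
// leading assistant turns so the history still starts with a user message.
function trimHistory(messages: Message[], maxMessages: number): Message[] {
  // slice(-0) would return the whole array, so guard the zero case.
  const recent = maxMessages > 0 ? messages.slice(-maxMessages) : [];
  const firstUser = recent.findIndex((m) => m.role === "user");
  return firstUser === -1 ? [] : recent.slice(firstUser);
}

const history: Message[] = [
  { role: "user", content: "q1" },
  { role: "assistant", content: "a1" },
  { role: "user", content: "q2" },
  { role: "assistant", content: "a2" },
  { role: "user", content: "q3" },
];

const lastThree = trimHistory(history, 3); // q2, a2, q3
const lastTwo = trimHistory(history, 2); // a2 is dropped, leaving q3
```

On the server this could be applied to messages right after req.json(), before the array is passed to the API call.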

No concurrent request management. Rapid messages or multiple tabs cause streaming collisions. AbortController logic for canceling previous requests is missing.
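
Within a single tab, the missing piece could look roughly like this: keep one AbortController per in-flight request and abort it when a new request starts. LatestRequest is a hypothetical class for illustration. It does nothing about multiple tabs, which would need coordination on the server side.

```typescript
// Hypothetical helper: hand out one AbortSignal per request and abort the
// previous request whenever a new one starts.
class LatestRequest {
  private current: AbortController | null = null;

  // Abort any in-flight request and return a signal for the new one.
  start(): AbortSignal {
    this.current?.abort();
    this.current = new AbortController();
    return this.current.signal;
  }

  // Cancel the in-flight request, for example from a Stop button.
  cancel(): void {
    this.current?.abort();
    this.current = null;
  }
}

const requests = new LatestRequest();
const firstSignal = requests.start();
const secondSignal = requests.start(); // aborts the first request

// In the chat component the signal would be passed to fetch:
//   fetch("/api/chat", { method: "POST", body, signal: requests.start() })
```

In a React component the instance would live in a useRef so it survives re-renders. An aborted fetch rejects with an AbortError, which the caller should catch and treat as a normal cancellation rather than a failure.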

Deploying to Vercel

A few things I hit when deploying:

Environment variables: Add ANTHROPIC_API_KEY in Vercel's Project Settings → Environment Variables. The .env.local file stays local.

Runtime: Route Handlers default to the Node.js runtime, but declaring it explicitly in your Route Handler guards against Edge Runtime compatibility issues:

export const runtime = 'nodejs';

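Function timeout: Vercel caps how long a function may run, and the cap depends on your plan (I hit a 10-second limit on Hobby; check the current limits for your account). For longer responses, add this to vercel.json: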

{
  "functions": {
    "src/app/api/chat/route.ts": {
      "maxDuration": 60
    }
  }
}
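
An alternative worth knowing about is the maxDuration route segment config, which keeps the setting next to the handler instead of in vercel.json. This is a sketch under the assumption that your host honors the Next.js segment config; it was not part of the sandbox build shown earlier, and the effective ceiling still depends on the plan.

```typescript
// src/app/api/chat/route.ts
// Route segment config: request up to 60 seconds per invocation.
// The hosting plan may enforce a lower ceiling than this value.
export const maxDuration = 60;
```

Pick one of the two mechanisms rather than both, so there is a single place to look when a long response gets cut off.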

How SSE Works Under the Hood

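Server-Sent Events is a one-way (server to client) streaming protocol running over plain HTTP. Unlike WebSockets, it needs no protocol upgrade, so it generally passes through proxies, CDNs, and firewalls without special configuration, although a proxy that buffers responses can delay chunks.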

SSE message format:

data: {"text": "Hello"}\n\n
data: {"text": ", World"}\n\n
data: [DONE]\n\n

Each message starts with data: and ends with two newlines. The TextEncoder/TextDecoder pair converts between strings and Uint8Array (bytes) — Web Streams API operates at the byte level. This same pattern works across Next.js, Cloudflare Workers, Deno, and Bun.
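
The framing and the byte conversion can be sketched in a few lines. formatSseEvent is a hypothetical helper name; TextEncoder and TextDecoder are the standard globals.

```typescript
// Hypothetical helper: frame a JSON-serializable object as one SSE message.
function formatSseEvent(payload: Record<string, unknown>): string {
  return `data: ${JSON.stringify(payload)}\n\n`;
}

const encoder = new TextEncoder();
const decoder = new TextDecoder();

// String -> bytes, as the server does before enqueueing a chunk.
const bytes: Uint8Array = encoder.encode(formatSseEvent({ text: "Hello" }));

// Bytes -> string, as the client does after reader.read().
const roundTripped: string = decoder.decode(bytes);

// Newlines inside the text are escaped by JSON.stringify, so the payload
// stays on a single data: line.
const multiline = formatSseEvent({ text: "line1\nline2" });
```

Wrapping the text in JSON is a design choice that matters here: a raw newline in the model output would otherwise be read as the end of the data: line.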

Raw API vs. Vercel AI SDK

Aspect         | Raw Anthropic SDK                            | Vercel AI SDK
Code volume    | More (~50-line Route Handler)                | Less (useChat one-liner)
Customization  | Completely free                              | Within SDK abstractions
Debuggability  | SSE flow is transparent                      | Internal logic is opaque
Learning value | Forces you to understand Web Streams and SSE | Use immediately
Best for       | Understanding streaming mechanics            | Rapid prototyping

My recommendation: build it the raw way once, then use the SDK. You'll understand what the SDK is actually handling for you. See Building a Claude Streaming Agent with Vercel AI SDK for the SDK-based comparison.

Next Steps

  1. Add Tool Use — Give Claude function-calling ability → Claude Agent SDK Complete Guide
  2. Prompt Caching — Cut API costs up to 90% → Claude API Prompt Caching in Practice
  3. Stronger error handling — AbortController, retry logic, error SSE events
  4. Stream cancellation — Cancel button to stop generation mid-stream
  5. Vercel deployment — Apply the notes above and go live

A deeper guide covering those production gaps is coming in a follow-up post.
