Most AI integrations are overengineered.
Even basic tutorials on adding an AI assistant get tangled up in LangGraph workflows and FastAPI servers. You don't need any of that.
Today, we will build one without any external agent framework or orchestration layers. Just an AI that knows what's in your app and can actually do things.
You will also learn how to implement the Generative UI pattern, which lets you render actual components rather than just text responses.
Let's jump in.
Why AI Integrations Get Complicated
Adding AI to an app sounds simple until you actually try it.
The moment you go beyond a basic chat interface, you run into a fundamental problem: LLMs on their own can't do anything. They can only generate text.
To make an AI that actually takes actions (reading your app's state, updating data, calling APIs), you need to build the plumbing yourself.
That plumbing is what an AI agent is. At its core, every agent runs the same loop:
- Observe - take in context (user input, tool results, memory)
- Reason - decide what to do next
- Act - call a tool, write to a file, hit an API, whatever's needed
Simple enough in theory. But in practice, you end up writing the orchestration layer that manages this loop, tool registries that expose your app's functions to the LLM, state management to track what the agent knows and error handling for when the model goes sideways.
```python
context = [initial_event]

while True:
    next_step = await llm.determine_next_step(context)
    context.append(next_step)

    if next_step.intent == "done":
        return next_step.final_answer

    result = await execute_step(next_step)
    context.append(result)
```
In a full agent setup, the LLM sits at the center - orchestrating across data sources, tools, models, and external services.
That's why tutorials reach for LangGraph and FastAPI servers. And for complex automation pipelines, that complexity is justified.
But if your goal is just to add an AI assistant inside your product (one that understands your UI, reads your data, and can take actions on behalf of the user), you are building a lot of infrastructure to solve a much smaller problem.
CopilotKit handles the loop, the context, the streaming and the frontend integration so you don't have to wire any of it yourself.
For this tutorial, we will use CopilotKit without any external agent framework, just hooks and your LLM of choice.
What CopilotKit Actually Does
CopilotKit is an open-source framework for adding AI agents to your app, and it abstracts away everything you just read about.
Instead of building your own agent loop, tool registry and streaming layer, you get a set of hooks and pre-built components that plug directly into your existing frontend.
The mental model is simple: you provide the AI with two things:
- Your app's state - so it understands what's on screen and what the user is working with
- A set of tools - so it can actually do something useful, not just respond with text
In practice, this maps to two hooks:
- `useAgentContext` - share app state with the AI. Whatever you pass in becomes part of the LLM's context: current user, selected items, loaded data - the AI knows what your app knows.
- `useFrontendTool` - define actions the AI can trigger. You give it a name, a description, and a handler. The LLM decides when to call it based on what the user asks.
The result is that adding an AI assistant to your app becomes a frontend problem, not an infrastructure problem.
Your UI, your agent, and its tools all live in a single interaction loop.
How to wire everything (1 minute)
You can follow along with the official docs or just stick with this. I will walk through exactly what each piece does and why it's there.
I'm using Next.js with TypeScript, but this works with any React-based setup.
```bash
# create a Next.js app in the current directory
npx create-next-app@latest .
```
Install the three CopilotKit packages you need:
```bash
npm install @copilotkit/react-core @copilotkit/react-ui @copilotkit/runtime
```
- `@copilotkit/react-core/v2`: hooks and built-in chat components
- `@copilotkit/react-ui/v2`: styles
- `@copilotkit/runtime`: backend runtime and LLM adapters
Create the API route at `app/api/copilotkit/route.ts`.
It's the single endpoint that receives messages from your UI, runs them through the agent loop, and streams responses back.
- `BuiltInAgent` is what runs the observe → reason → act loop
- `CopilotRuntime` manages the session, streaming, and tool execution
That's your entire backend.
```typescript
import { CopilotRuntime, copilotRuntimeNextJSAppRouterEndpoint } from "@copilotkit/runtime";
import { BuiltInAgent } from "@copilotkit/runtime/v2";
import { NextRequest } from "next/server";

const builtInAgent = new BuiltInAgent({
  model: "openai:gpt-5",
  // apiKey: process.env.OPENAI_API_KEY,
});

const runtime = new CopilotRuntime({
  agents: { default: builtInAgent },
});

export const POST = async (req: NextRequest) => {
  const { handleRequest } = copilotRuntimeNextJSAppRouterEndpoint({
    runtime,
    endpoint: "/api/copilotkit",
  });

  return handleRequest(req);
};
```
Create `.env.local` in the project root and add your OpenAI API key:

```
OPENAI_API_KEY=sk-proj-...
```
If you want to switch to any other LLM provider, just pass a different model string to `BuiltInAgent` in your API route. Everything else stays the same.
```typescript
// Anthropic
const builtInAgent = new BuiltInAgent({ model: "anthropic:claude-sonnet-4-5" });

// Google
const builtInAgent = new BuiltInAgent({ model: "google:gemini-2.0-flash" });
```
Add the matching key to `.env.local` and you are done. For custom models like Azure OpenAI, AWS Bedrock, or Ollama, check the model selection docs.
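For example, if you switched to Anthropic, the matching entry in `.env.local` would look something like the snippet below (the `ANTHROPIC_API_KEY` variable name follows the usual provider convention; verify the exact name against the model selection docs, and the value is a placeholder):

```
# .env.local
ANTHROPIC_API_KEY=sk-ant-...
```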
Now, connect the frontend to that backend by wrapping your app in `app/layout.tsx`.
```tsx
import { CopilotKit } from "@copilotkit/react-core";
import "@copilotkit/react-ui/v2/styles.css";

export default function RootLayout({ children }: { children: React.ReactNode }) {
  return (
    <html lang="en">
      <body>
        <CopilotKit runtimeUrl="/api/copilotkit">
          {children}
        </CopilotKit>
      </body>
    </html>
  );
}
```
`CopilotKit` is the context provider that every hook below it depends on. `runtimeUrl` points to the route you just created.
There are several built-in components, like `CopilotSidebar`, `CopilotChat`, and `CopilotPopup`. For this tutorial, I will add the chat sidebar.
```tsx
import { CopilotSidebar } from "@copilotkit/react-core/v2";

export default function Page() {
  return (
    <main>
      <h1>Your App</h1>
      <CopilotSidebar />
    </main>
  );
}
```
Run `npm run dev` and you now have a working AI sidebar assistant! 🎉
It can respond to anything, but it has no idea what's actually in your app. It doesn't know your data, your UI state, or what the user is looking at. The next two steps fix that.
Give the AI Context
By default, the AI knows nothing about what's in your app. You can ask it, "how much did I spend on food?" and it has no idea - it only sees the conversation, not your app's state.
`useAgentContext` solves that. It pushes your React state into the agent's context window on every turn, so the AI always has a current snapshot of what's in your UI.
Let's build a simple expense tracker to see how this works.
"use client";
import { useState } from "react";
import { useAgentContext, CopilotSidebar } from "@copilotkit/react-core/v2";
type Expense = {
id: number;
description: string;
amount: number;
category: string;
};
const initialExpenses: Expense[] = [
{ id: 1, description: "Groceries", amount: 85, category: "Food" },
{ id: 2, description: "Netflix", amount: 15, category: "Entertainment" },
{ id: 3, description: "Uber", amount: 22, category: "Transport" },
];
export default function Page() {
const [expenses, setExpenses] = useState<Expense[]>(initialExpenses);
useAgentContext({
description: "The user's current expense list. Each item has an id, description, amount in dollars, and category.",
value: expenses,
});
return (
<main className="p-8">
<h1 className="text-2xl font-bold mb-4">My Expenses</h1>
<ul className="space-y-2">
{expenses.map((e) => (
<li key={e.id} className="flex justify-between border-b py-2">
<span>
{e.description}{" "}
<span className="text-gray-400 text-sm">({e.category})</span>
</span>
<span>${e.amount}</span>
</li>
))}
</ul>
<CopilotSidebar />
</main>
);
}
`expenses` is your normal React state. `useAgentContext` takes it and injects it into the LLM's context on every turn.
Now, if the user asks "what's my biggest expense?", the AI is looking at the same list your UI renders.
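Conceptually, what the model receives on each turn looks something like the snippet below. This is an illustration of the idea only, not CopilotKit's actual wire format:

```json
{
  "context": [
    {
      "description": "The user's current expense list. Each item has an id, description, amount in dollars, and category.",
      "value": [
        { "id": 1, "description": "Groceries", "amount": 85, "category": "Food" },
        { "id": 2, "description": "Netflix", "amount": 15, "category": "Entertainment" },
        { "id": 3, "description": "Uber", "amount": 22, "category": "Transport" }
      ]
    }
  ],
  "messages": [{ "role": "user", "content": "what's my biggest expense?" }]
}
```

Because the snapshot is rebuilt from your React state every turn, any edit the user makes in the UI is visible to the model on the very next message.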
Let the AI Do Things
Awareness is one thing. Being able to act is what makes an agent actually useful.
`useFrontendTool` lets you hand the AI a set of actions it can trigger. You define a tool with a name, a description, and a handler; the LLM decides when to call it based on what the user asks.
The LLM reads the `description` fields to understand what each tool does and what each parameter expects. Write them clearly, and the model will fill them accurately even from casual conversational input.
Add `useFrontendTool` inside the same page.
`parameters` takes a Zod schema (Zod is a TypeScript-first validation library). If you haven't used it before, the idea is simple: you define the shape of your data with `z.object({...})`, and each field gets a `.describe()` call that tells the LLM what it means.
Install it with `npm install zod`.
"use client";
import { useState } from "react";
import { useAgentContext, useFrontendTool, CopilotSidebar } from "@copilotkit/react-core/v2";
import { z } from "zod";
// ... type and initialExpenses stay the same
export default function Page() {
const [expenses, setExpenses] = useState<Expense[]>(initialExpenses);
// ... useAgentContext
useFrontendTool({
name: "addExpense",
description:
"Add a new expense when the user mentions spending money on something.",
parameters: z.object({
description: z
.string()
.describe("What the expense was for, e.g. Lunch, Taxi, Coffee"),
amount: z.number().describe("How much was spent in dollars"),
category: z
.string()
.describe("Category: Food, Transport, Entertainment, Health, or Other"),
}),
handler: async ({ description, amount, category }) => {
setExpenses((prev) => [
...prev,
{ id: Date.now(), description, amount, category },
]);
},
});
return (
// ... remains the same
);
}
Try a sample query like "I spent $40 on dinner last night." The agent calls `addExpense` with `{ description: "Dinner", amount: 40, category: "Food" }`, your handler updates `expenses`, and the new item appears in the list instantly.
This is the observe → reason → act loop running end to end.
The agent observed your message and your current expenses via useAgentContext, reasoned that addExpense is the right call, and acted by invoking your handler with the right parameters. CopilotKit handled everything in between.
Bonus: Generative UI
The AI can read your data and update it. But so far, it only responds in text.
Generative UI is a simple idea: instead of the agent describing results in text, it renders actual UI. For example, if someone asks about their spending, the app displays a breakdown card.
CopilotKit supports this through the `render` property on `useFrontendTool`. Instead of a plain text reply, the tool renders a React component inline in the chat, using your own design system.
Let's add a summary tool inside the same component.
```tsx
import { ToolCallStatus, useFrontendTool } from "@copilotkit/react-core/v2";

useFrontendTool({
  name: "showSpendingSummary",
  description:
    "Call this when the user asks for a summary or overview of their expenses.",
  parameters: z.object({}),
  handler: async () => {
    const summary = expenses.reduce(
      (acc, e) => {
        acc[e.category] = (acc[e.category] ?? 0) + e.amount;
        return acc;
      },
      {} as Record<string, number>,
    );
    const total = expenses.reduce((sum, e) => sum + e.amount, 0);
    return JSON.stringify({ summary, total });
  },
  render: ({ result, status }) => {
    // Parse the handler's JSON result once, instead of on every access
    const data =
      status === ToolCallStatus.Complete && result
        ? (JSON.parse(result) as { summary: Record<string, number>; total: number })
        : null;

    return (
      <div className="rounded-lg border p-4 mt-2 space-y-3">
        <p className="font-semibold text-sm">
          {status === ToolCallStatus.InProgress ? "Calculating..." : "Spending Breakdown"}
        </p>
        {data && (
          <>
            {Object.entries(data.summary).map(([category, amount]) => (
              <div key={category} className="flex justify-between text-sm">
                <span className="text-gray-600">{category}</span>
                <span className="font-medium">${amount}</span>
              </div>
            ))}
            <div className="flex justify-between text-sm font-semibold border-t pt-2">
              <span>Total</span>
              <span>${data.total}</span>
            </div>
          </>
        )}
      </div>
    );
  },
});
```
What's happening here:
- `parameters: z.object({})` is empty. The LLM doesn't need to pass anything; it just needs to decide when to call the tool
- `handler` does the actual work: it calculates category totals from the `expenses` state and returns them
- `ToolCallStatus` gives you the exact lifecycle states of the tool call
- `result` is the JSON string returned by your handler, parsed back into an object inside `render`
When you ask "summarize my spending," the LLM reads the query, looks at the context, recognizes that `showSpendingSummary` is the right tool to call, and triggers it.
That's the pattern: the LLM decides when to act, your code does the work, your components display the result.
The working implementation of this minimal expense tracker is available in this GitHub repo in the `example/basic` branch.
The main branch takes the same pattern further with a full kanban board:
- Move tasks between columns by asking the AI
- Add, delete and reassign tasks from natural language
- Get a visual board summary with generative UI
Here is the demo!
Most AI features that feel magical aren't. They are just an app that knows what's on screen and has a few actions wired up. That's it.
The hard part was always the infrastructure. Turns out it doesn't have to be.
You can check my work at anmolbaranwal.com. Thank you for reading! 🥰
Follow CopilotKit on Twitter and say hi, and if you'd like to build something cool, join the Discord community.