Kyng Mclendon

How I Built My First AI-Powered App (Without a PhD)

A beginner-to-advanced guide to building real-world AI applications using modern APIs and tools.


Introduction

A year ago, "AI" felt like something reserved for researchers with massive GPU clusters and decades of experience. Today, you can build a production-ready AI-powered app in an afternoon.

This tutorial walks you through building a real AI application from scratch — a smart document summarizer — using the Anthropic Claude API. No machine learning theory required. Whether you're a curious beginner or a seasoned backend dev who's never touched AI, this guide is for you.

By the end, you'll understand:

  • How to call an AI API from your own app
  • How to structure prompts for reliable, high-quality results
  • How to handle streaming responses for a better UX
  • How to think about AI integration at different levels of complexity

What We're Building

A Document Summarizer web app that:

  1. Accepts any text input (paste an article, paste a legal doc, paste anything)
  2. Sends it to Claude via the Anthropic API
  3. Returns a structured summary with key points, tone, and a one-line TLDR
  4. Streams the response token-by-token (like ChatGPT does)

Let's go.


Prerequisites

  • Node.js 18+ installed
  • A free Anthropic API key → console.anthropic.com
  • Basic familiarity with JavaScript / TypeScript

That's it. No GPU. No Python. No ML frameworks.


Step 1: Set Up Your Project

mkdir ai-summarizer && cd ai-summarizer
npm init -y
npm install @anthropic-ai/sdk express dotenv

Create a .env file:

ANTHROPIC_API_KEY=your_api_key_here

And a basic server.js:

import Anthropic from "@anthropic-ai/sdk";
import express from "express";
import dotenv from "dotenv";

dotenv.config();

const app = express();
app.use(express.json());

const client = new Anthropic(); // Reads ANTHROPIC_API_KEY from process.env by default

app.listen(3000, () => console.log("Server running on port 3000"));

Step 2: Understand the Anatomy of a Prompt

This is where most tutorials gloss over the most important part. Prompt engineering is the skill that separates a flaky AI feature from a reliable one.

A prompt has three parts:

| Part          | Purpose                         | Example                                               |
| ------------- | ------------------------------- | ----------------------------------------------------- |
| System prompt | Sets the AI's role and rules    | "You are a document analyst. Always respond in JSON." |
| User message  | The actual input                | The text to summarize                                 |
| Constraints   | Format, length, tone guardrails | "Respond with 3 bullet points max."                   |

The Golden Rules of Prompting

1. Be specific, not vague

❌ "Summarize this text"

✅ "Summarize this text into: (1) a one-sentence TLDR, (2) 3-5 bullet point key insights, (3) the overall tone (formal/casual/technical)."

2. Tell it what format to return

If you want JSON back, say so explicitly. Claude will comply.

3. Give it a persona

"You are a senior editor at The Economist" produces very different results than just asking for a summary.

4. Use XML tags for complex inputs

Summarize the following document:

<document>
{{USER_TEXT}}
</document>

Respond only with valid JSON.

This helps the model clearly separate instructions from content — especially important with long documents.
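As a concrete sketch, here's a tiny helper that assembles that instructions-plus-document prompt (the function name `buildSummaryPrompt` is my own, not anything from the Anthropic SDK):

```javascript
// Wraps user text in XML tags so the model can cleanly separate
// your instructions from the (possibly very long) document content.
function buildSummaryPrompt(userText) {
  return [
    "Summarize the following document:",
    "",
    "<document>",
    userText,
    "</document>",
    "",
    "Respond only with valid JSON.",
  ].join("\n");
}
```

Pass the result as the `content` of your user message, exactly as in the endpoint below.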


Step 3: Build the Summarize Endpoint

app.post("/summarize", async (req, res) => {
  const { text } = req.body;

  if (!text || text.trim().length === 0) {
    return res.status(400).json({ error: "No text provided" });
  }

  const systemPrompt = `You are a world-class editor and analyst. 
Your job is to produce concise, accurate document summaries.
Always respond with valid JSON in this exact shape:
{
  "tldr": "one sentence summary",
  "key_points": ["point 1", "point 2", "point 3"],
  "tone": "formal | casual | technical | emotional",
  "word_count_estimate": 123
}`;

  try {
    const message = await client.messages.create({
      model: "claude-opus-4-5",
      max_tokens: 1024,
      system: systemPrompt,
      messages: [
        {
          role: "user",
          content: `Summarize the following document:\n\n<document>\n${text}\n</document>`,
        },
      ],
    });

    const raw = message.content[0].text;
    const parsed = JSON.parse(raw);
    res.json(parsed);
  } catch (err) {
    console.error(err);
    res.status(500).json({ error: "Something went wrong" });
  }
});

Test it:

curl -X POST http://localhost:3000/summarize \
  -H "Content-Type: application/json" \
  -d '{"text": "Artificial intelligence is transforming every industry..."}'

You'll get back structured JSON almost every time; on the rare occasion the model strays from the format, `JSON.parse` throws and the catch block returns a 500. That's the power (and the limit) of a well-crafted prompt.


Step 4: Add Streaming for Better UX

Nobody wants to stare at a spinner for 10 seconds. Streaming lets you show the response as it's being generated — token by token.

app.post("/summarize-stream", async (req, res) => {
  const { text } = req.body;

  if (!text || text.trim().length === 0) {
    return res.status(400).json({ error: "No text provided" });
  }

  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache");
  res.setHeader("Connection", "keep-alive");

  const stream = await client.messages.stream({
    model: "claude-opus-4-5",
    max_tokens: 1024,
    system: "You are a concise document summarizer. Write clearly and directly.",
    messages: [
      {
        role: "user",
        content: `Summarize this:\n\n${text}`,
      },
    ],
  });

  for await (const chunk of stream) {
    if (
      chunk.type === "content_block_delta" &&
      chunk.delta.type === "text_delta"
    ) {
      res.write(`data: ${chunk.delta.text}\n\n`);
    }
  }

  res.write("data: [DONE]\n\n");
  res.end();
});

On the frontend, consume it with EventSource or fetch with a readable stream. The UX difference is dramatic.
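Note that `EventSource` only supports GET requests, so for a POST endpoint like this one you'll want `fetch` with a readable stream. Here's a sketch of the client side (the `parseSSE` helper is my own illustration, not a standard API):

```javascript
// Minimal parser for the "data: ...\n\n" frames the server writes.
// Returns completed events plus any trailing partial frame to keep buffering.
function parseSSE(buffer) {
  const frames = buffer.split("\n\n");
  const rest = frames.pop(); // last element may be an incomplete frame
  const events = [];
  for (const frame of frames) {
    if (frame.startsWith("data: ")) {
      events.push(frame.slice("data: ".length));
    }
  }
  return { events, rest };
}

// Sketch: consume the /summarize-stream endpoint in the browser.
async function consumeSummaryStream(text, onToken) {
  const res = await fetch("/summarize-stream", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text }),
  });

  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const { events, rest } = parseSSE(buffer);
    buffer = rest;
    for (const ev of events) {
      if (ev === "[DONE]") return;
      onToken(ev); // append to the UI as tokens arrive
    }
  }
}
```

Wire `onToken` to append text into a `<div>` and you get the familiar typewriter effect.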


Step 5: Level Up — Context Windows and Long Documents

Here's where it gets interesting for intermediate/senior devs.

Claude has a 200,000 token context window. That's roughly 150,000 words — longer than most novels. But you still need to be smart about what you send.
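A rough rule of thumb for budgeting is about 4 characters per token for English text. That heuristic (an approximation, not a real tokenizer) is enough to decide whether to chunk:

```javascript
// Very rough token estimate for English text (~4 chars per token on average).
// Good enough for budgeting; use a real tokenizer if you need precision.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

estimateTokens("a".repeat(800)); // → 200
```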

Chunking Strategy for Very Long Docs

If a document exceeds your comfortable token budget, chunk it:

function chunkText(text, maxChars = 50000) {
  const chunks = [];
  let start = 0;

  while (start < text.length) {
    let end = Math.min(start + maxChars, text.length);

    // Try to break at a paragraph boundary
    const lastParagraph = text.lastIndexOf("\n\n", end);
    if (lastParagraph > start) end = lastParagraph;

    chunks.push(text.slice(start, end));
    start = end;
  }

  return chunks;
}

Then summarize each chunk, and do a final "summary of summaries" pass. This is called the Map-Reduce prompting pattern, and it's widely used in production AI systems.
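Here's what that map-reduce pass can look like. The orchestration below is a sketch, not a library API: `summarize` is any async `(text) => string`, for example a thin wrapper around the `client.messages.create` call from Step 3.

```javascript
// Map: summarize each chunk independently.
// Reduce: summarize the partial summaries into one final result.
async function mapReduceSummarize(chunks, summarize) {
  if (chunks.length === 1) return summarize(chunks[0]);

  // Map phase: one partial summary per chunk.
  // Sequential on purpose, to stay friendly to rate limits.
  const partials = [];
  for (const chunk of chunks) {
    partials.push(await summarize(chunk));
  }

  // Reduce phase: the final summary-of-summaries pass.
  return summarize(
    "Combine these partial summaries into one coherent summary:\n\n" +
      partials.join("\n\n")
  );
}
```

Feed it `chunkText(longDoc)` and your summarize function. If a document is so long that even the partial summaries overflow your budget, recurse on the partials.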


Step 6: Error Handling and Rate Limits

Production AI apps fail gracefully. Here's what to handle:

import Anthropic from "@anthropic-ai/sdk";

async function callWithRetry(fn, retries = 3) {
  for (let i = 0; i < retries; i++) {
    try {
      return await fn();
    } catch (err) {
      if (
        err instanceof Anthropic.RateLimitError ||
        err instanceof Anthropic.APIConnectionError
      ) {
        const delay = Math.pow(2, i) * 1000; // Exponential backoff
        console.log(`Retryable error. Waiting ${delay}ms...`);
        await new Promise((r) => setTimeout(r, delay));
      } else if (err instanceof Anthropic.APIError) {
        console.error("API error:", err.status, err.message);
        throw err; // Don't retry on other API errors (likely a bad request)
      } else {
        throw err;
      }
    }
    }
  }
  throw new Error("Max retries exceeded");
}

Key error types to handle:

  • RateLimitError → Retry with backoff
  • APIError (4xx) → Likely a bad prompt or input; don't retry
  • APIConnectionError → Network issue; retry
  • JSON parse failures → Your prompt didn't enforce the format well enough; refine it
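For that last bullet, besides tightening the prompt, a small defensive parser helps. This sketch (my own helper, not part of the SDK) pulls the outermost JSON object out of the raw text before parsing, which survives the common failure mode where the model wraps JSON in code fences or adds a leading sentence:

```javascript
// Defensive JSON extraction: find the outermost {...} in the model's
// output and parse only that, ignoring fences or surrounding prose.
function extractJson(raw) {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end === -1 || end < start) {
    throw new Error("No JSON object found in model output");
  }
  return JSON.parse(raw.slice(start, end + 1));
}
```

In the `/summarize` endpoint, `extractJson(message.content[0].text)` is a drop-in replacement for the bare `JSON.parse` call.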

What's Next?

You've built a working AI app. Here's where to go from here:

🔵 Beginner next steps

  • Add a simple HTML frontend with a textarea + button
  • Try different system prompts and see how the output changes
  • Experiment with the temperature parameter to see how it changes output randomness

🟡 Intermediate next steps

  • Add conversation history (multi-turn chat)
  • Use tool use / function calling to let Claude trigger real actions in your app
  • Implement semantic caching to reduce API costs

🔴 Advanced next steps

  • Build an agent loop where Claude can take actions, observe results, and retry
  • Add RAG (Retrieval Augmented Generation) with a vector database
  • Fine-tune prompts using systematic evaluation pipelines

Key Takeaways

  1. Prompts are code. Treat them with the same rigor as your application logic. Version-control them.
  2. Structured output is your friend. Always ask for JSON when you need to parse the result.
  3. Streaming dramatically improves perceived performance. Use it for anything that takes more than 2 seconds.
  4. Start simple, then layer complexity. A direct API call beats a complicated agent system until you actually need the agent.
  5. Error handling matters. Rate limits and timeouts will happen in production.


Written by a developer, for developers. Drop any questions in the comments — I read them all.
