A beginner-to-advanced guide to building real-world AI applications using modern APIs and tools.
Introduction
A year ago, "AI" felt like something reserved for researchers with massive GPU clusters and decades of experience. Today, you can build a production-ready AI-powered app in an afternoon.
This tutorial walks you through building a real AI application from scratch — a smart document summarizer — using the Anthropic Claude API. No machine learning theory required. Whether you're a curious beginner or a seasoned backend dev who's never touched AI, this guide is for you.
By the end, you'll understand:
- How to call an AI API from your own app
- How to structure prompts for reliable, high-quality results
- How to handle streaming responses for a better UX
- How to think about AI integration at different levels of complexity
What We're Building
A Document Summarizer web app that:
- Accepts any text input (paste an article, paste a legal doc, paste anything)
- Sends it to Claude via the Anthropic API
- Returns a structured summary with key points, tone, and a one-line TLDR
- Streams the response token-by-token (like ChatGPT does)
Let's go.
Prerequisites
- Node.js 18+ installed
- An Anthropic API key → console.anthropic.com
- Basic familiarity with JavaScript / TypeScript
That's it. No GPU. No Python. No ML frameworks.
Step 1: Set Up Your Project
mkdir ai-summarizer && cd ai-summarizer
npm init -y
npm install @anthropic-ai/sdk express dotenv
Create a .env file:
ANTHROPIC_API_KEY=your_api_key_here
And a basic server.js (add `"type": "module"` to your package.json so the import syntax below works):
import Anthropic from "@anthropic-ai/sdk";
import express from "express";
import dotenv from "dotenv";
dotenv.config();
const app = express();
app.use(express.json());
const client = new Anthropic(); // reads ANTHROPIC_API_KEY from process.env
app.listen(3000, () => console.log("Server running on port 3000"));
Step 2: Understand the Anatomy of a Prompt
This is where most tutorials gloss over the most important part. Prompt engineering is the skill that separates a flaky AI feature from a reliable one.
A prompt has three parts:
| Part | Purpose | Example |
|---|---|---|
| System prompt | Sets the AI's role and rules | "You are a document analyst. Always respond in JSON." |
| User message | The actual input | The text to summarize |
| Constraints | Format, length, tone guardrails | "Respond with 3 bullet points max." |
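Concretely, here's how those three parts map onto a single request payload. This is a sketch: `buildRequest` is our own helper (not part of the Anthropic SDK), and the system string and constraints are illustrative.

```javascript
// Sketch: the three prompt parts assembled into one request payload.
// buildRequest is our own helper, not an SDK function.
function buildRequest(documentText) {
  return {
    model: "claude-opus-4-5",
    max_tokens: 1024,
    // System prompt: sets the role and rules
    system: "You are a document analyst. Always respond in JSON.",
    messages: [
      {
        role: "user",
        // User message plus constraints (format and length guardrails)
        content: `Summarize this text. Respond with 3 bullet points max.\n\n${documentText}`,
      },
    ],
  };
}
```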
The Golden Rules of Prompting
1. Be specific, not vague
❌ "Summarize this text"
✅ "Summarize this text into: (1) a one-sentence TLDR, (2) 3-5 bullet point key insights, (3) the overall tone (formal/casual/technical)."
2. Tell it what format to return
If you want JSON back, say so explicitly. Claude will comply.
3. Give it a persona
"You are a senior editor at The Economist" produces very different results than just asking for a summary.
4. Use XML tags for complex inputs
Summarize the following document:
<document>
{{USER_TEXT}}
</document>
Respond only with valid JSON.
This helps the model clearly separate instructions from content — especially important with long documents.
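A tiny helper makes this pattern reusable. `wrapDocument` is our own utility, assumed here for illustration, not an SDK function:

```javascript
// Wrap untrusted input in XML tags so instructions and content
// stay clearly separated in the prompt.
function wrapDocument(text) {
  return [
    "Summarize the following document:",
    "",
    "<document>",
    text,
    "</document>",
    "",
    "Respond only with valid JSON.",
  ].join("\n");
}
```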
Step 3: Build the Summarize Endpoint
app.post("/summarize", async (req, res) => {
const { text } = req.body;
if (!text || text.trim().length === 0) {
return res.status(400).json({ error: "No text provided" });
}
const systemPrompt = `You are a world-class editor and analyst.
Your job is to produce concise, accurate document summaries.
Always respond with valid JSON in this exact shape:
{
"tldr": "one sentence summary",
"key_points": ["point 1", "point 2", "point 3"],
"tone": "formal | casual | technical | emotional",
"word_count_estimate": 123
}`;
try {
const message = await client.messages.create({
model: "claude-opus-4-5",
max_tokens: 1024,
system: systemPrompt,
messages: [
{
role: "user",
content: `Summarize the following document:\n\n<document>\n${text}\n</document>`,
},
],
});
const raw = message.content[0].text;
const parsed = JSON.parse(raw);
res.json(parsed);
} catch (err) {
console.error(err);
res.status(500).json({ error: "Something went wrong" });
}
});
Test it:
curl -X POST http://localhost:3000/summarize \
-H "Content-Type: application/json" \
-d '{"text": "Artificial intelligence is transforming every industry..."}'
You'll get back structured JSON almost every time. That's the power of a well-crafted prompt (and why the endpoint still wraps JSON.parse in a try/catch).
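One caveat: models occasionally wrap the JSON in a markdown code fence. A small defensive parser (our own helper, sketched under that assumption) strips the fence before parsing:

```javascript
// Strip an optional markdown code fence before parsing model output.
// A defensive fallback; not an official SDK utility.
function parseModelJson(raw) {
  const cleaned = raw
    .trim()
    .replace(/^```(?:json)?\s*/i, "")
    .replace(/\s*```$/, "");
  return JSON.parse(cleaned);
}
```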
Step 4: Add Streaming for Better UX
Nobody wants to stare at a spinner for 10 seconds. Streaming lets you show the response as it's being generated — token by token.
app.post("/summarize-stream", async (req, res) => {
const { text } = req.body;
res.setHeader("Content-Type", "text/event-stream");
res.setHeader("Cache-Control", "no-cache");
res.setHeader("Connection", "keep-alive");
const stream = await client.messages.stream({
model: "claude-opus-4-5",
max_tokens: 1024,
system: "You are a concise document summarizer. Write clearly and directly.",
messages: [
{
role: "user",
content: `Summarize this:\n\n${text}`,
},
],
});
for await (const chunk of stream) {
if (
chunk.type === "content_block_delta" &&
chunk.delta.type === "text_delta"
) {
res.write(`data: ${chunk.delta.text}\n\n`);
}
}
res.write("data: [DONE]\n\n");
res.end();
});
On the frontend, consume it with fetch and a readable stream (EventSource only supports GET requests, so it won't work with this POST endpoint). The UX difference is dramatic.
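A minimal fetch-based consumer might look like this. `parseSseData` is our own helper for pulling the text out of `data:` lines; the endpoint path matches the server above:

```javascript
// Extract the text payload from one or more SSE "data: ..." lines.
function parseSseData(chunk) {
  return chunk
    .split("\n")
    .filter((line) => line.startsWith("data: "))
    .map((line) => line.slice("data: ".length))
    .filter((data) => data !== "[DONE]")
    .join("");
}

// Browser-side consumer: POST the text, then read the body stream
// chunk by chunk and hand each piece of text to the UI.
async function consumeSummary(text, onToken) {
  const res = await fetch("/summarize-stream", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text }),
  });
  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    onToken(parseSseData(decoder.decode(value, { stream: true })));
  }
}
```

Note that this sketch assumes each read delivers whole `data:` lines; production code should buffer partial lines across reads.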
Step 5: Level Up — Context Windows and Long Documents
Here's where it gets interesting for intermediate/senior devs.
Claude has a 200,000 token context window. That's roughly 150,000 words — longer than most novels. But you still need to be smart about what you send.
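A rough rule of thumb (an assumption, not an official tokenizer: English prose averages about 4 characters per token) lets you sanity-check input size before sending. For exact counts, use the API's token counting endpoint instead.

```javascript
// Heuristic token estimate: ~4 characters per token for English prose.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

const CONTEXT_BUDGET = 200000;

// Leave room for the model's output inside the context window.
function fitsInContext(text, reservedForOutput = 1024) {
  return estimateTokens(text) + reservedForOutput <= CONTEXT_BUDGET;
}
```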
Chunking Strategy for Very Long Docs
If a document exceeds your comfortable token budget, chunk it:
function chunkText(text, maxChars = 50000) {
const chunks = [];
let start = 0;
while (start < text.length) {
let end = Math.min(start + maxChars, text.length);
// Try to break at a paragraph boundary
const lastParagraph = text.lastIndexOf("\n\n", end);
if (lastParagraph > start) end = lastParagraph;
chunks.push(text.slice(start, end));
start = end;
}
return chunks;
}
Then summarize each chunk, and do a final "summary of summaries" pass. This is called the map-reduce prompting pattern, and it's used in production by many serious AI apps.
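Sketched as code: `mapReduceSummarize` is our own orchestrator. It takes the chunks plus any async summarize function (e.g. a wrapper around `client.messages.create`), so the control flow stays easy to test without network calls:

```javascript
// Map-reduce summarization: summarize chunks independently (map),
// then summarize the combined partial summaries (reduce).
async function mapReduceSummarize(chunks, summarize) {
  const partials = await Promise.all(chunks.map((c) => summarize(c)));
  if (partials.length === 1) return partials[0];
  return summarize(partials.join("\n\n"));
}
```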
Step 6: Error Handling and Rate Limits
Production AI apps fail gracefully. Here's what to handle:
import Anthropic from "@anthropic-ai/sdk";
async function callWithRetry(fn, retries = 3) {
for (let i = 0; i < retries; i++) {
try {
return await fn();
} catch (err) {
if (err instanceof Anthropic.RateLimitError) {
const delay = Math.pow(2, i) * 1000; // Exponential backoff
console.log(`Rate limited. Waiting ${delay}ms...`);
await new Promise((r) => setTimeout(r, delay));
} else if (err instanceof Anthropic.APIError) {
console.error("API error:", err.status, err.message);
throw err; // Don't retry on non-rate-limit errors
} else {
throw err;
}
}
}
throw new Error("Max retries exceeded");
}
Key error types to handle:
- RateLimitError → Retry with backoff
- APIError (4xx) → Likely a bad prompt or input; don't retry
- APIConnectionError → Network issue; retry
- JSON parse failures → Your prompt didn't enforce the format well enough; refine it
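One refinement worth knowing: a fixed 2^i backoff can make many clients retry in lockstep. Adding jitter spreads the retries out. `backoffDelay` is our own helper, and the full-jitter variant here is a common pattern, not something the SDK provides:

```javascript
// Exponential backoff with full jitter: pick a random delay in
// [0, min(cap, 2^attempt * base)). The RNG is injectable for testing.
function backoffDelay(attempt, baseMs = 1000, capMs = 30000, random = Math.random) {
  const exp = Math.min(capMs, Math.pow(2, attempt) * baseMs);
  return Math.floor(random() * exp);
}
```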
What's Next?
You've built a working AI app. Here's where to go from here:
🔵 Beginner next steps
- Add a simple HTML frontend with a textarea + button
- Try different system prompts and see how the output changes
- Experiment with the temperature parameter to trade off consistency against creativity
🟡 Intermediate next steps
- Add conversation history (multi-turn chat)
- Use tool use / function calling to let Claude trigger real actions in your app
- Implement semantic caching to reduce API costs
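For the caching idea, an exact-match cache is the easy first step. This is a sketch with our own `cacheKey` helper and an in-memory Map; true semantic caching would compare embeddings instead of hashes, and a production app would use Redis or similar:

```javascript
import { createHash } from "node:crypto";

const cache = new Map();

// Key on everything that affects the output: model, system prompt, input.
function cacheKey(model, system, text) {
  return createHash("sha256").update(`${model}|${system}|${text}`).digest("hex");
}

// Wrap any async summarize function with a cache lookup.
async function cachedSummarize(text, summarize, model = "claude-opus-4-5", system = "") {
  const key = cacheKey(model, system, text);
  if (cache.has(key)) return cache.get(key);
  const result = await summarize(text);
  cache.set(key, result);
  return result;
}
```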
🔴 Advanced next steps
- Build an agent loop where Claude can take actions, observe results, and retry
- Add RAG (Retrieval Augmented Generation) with a vector database
- Fine-tune prompts using systematic evaluation pipelines
Key Takeaways
- Prompts are code. Treat them with the same rigor as your application logic. Version-control them.
- Structured output is your friend. Always ask for JSON when you need to parse the result.
- Streaming dramatically improves perceived performance. Use it for anything that takes more than 2 seconds.
- Start simple, then layer complexity. A direct API call beats a complicated agent system until you actually need the agent.
- Error handling matters. Rate limits and timeouts will happen in production.
Written by a developer, for developers. Drop any questions in the comments — I read them all.