DEV Community

RapidKit
Build Your First AI Agent with RapidKit in 10 Minutes

The Problem with AI Integration

You want to add AI to your FastAPI backend.

What usually happens:

  • Install 5 different packages (openai, anthropic, langchain, etc.)
  • Write wrapper code for each provider
  • Handle streaming, caching, retries manually
  • Debug configuration hell
  • Spend 2 days integrating what should take 10 minutes

There's a better way.


What We're Building

A production-ready AI Assistant Service with:

  • ✅ Multiple AI providers (Echo, Template, OpenAI-ready)
  • ✅ Chat completions API
  • ✅ Streaming responses
  • ✅ Conversation history
  • ✅ Response caching
  • ✅ Health monitoring
  • ✅ Provider switching

Time to build: 10 minutes

Lines of custom code: ~50

Source code: https://github.com/getrapidkit/rapidkit-examples/tree/main/my-ai-workspace

Let's start.


Prerequisites

You need:

  • Node.js 20+ (node --version)
  • Python 3.10+ (python3 --version)
  • A terminal
  • (Optional) OpenAI API key for real AI

Time: 1 minute to verify


Minute 0-2: Create Workspace and Project

Step 1: Create workspace (handles Python/Poetry setup automatically)

npx rapidkit my-ai-workspace

Prompts you'll see:

? Author name: developer
? Select Python version: 3.10
? Install method: 🎯 Poetry (Recommended)

What happens:

  • Creates workspace directory
  • Installs Poetry if needed
  • Creates virtualenv in workspace root
  • Installs rapidkit-core

Step 2: Create FastAPI project

cd my-ai-workspace
rapidkit create project fastapi.standard ai-agent

Prompts:

Project name: ai-agent
Install essential modules? Y

Step 3: Initialize project

cd ai-agent
rapidkit init

Your structure:

my-ai-workspace/
├── .venv/                  # Workspace virtualenv
├── pyproject.toml
└── ai-agent/
    ├── .venv/             # Project virtualenv
    ├── src/
    │   ├── main.py
    │   └── modules/
    ├── tests/
    └── pyproject.toml

Minute 2-3: Add AI Assistant Module

rapidkit add module ai_assistant

What gets installed:

src/
├── modules/
│   └── free/
│       └── ai/
│           └── ai_assistant/
│               ├── ai_assistant.py          # Runtime engine
│               ├── ai_assistant_types.py    # Type definitions
│               └── routers/
│                   └── ai/
│                       └── ai_assistant.py  # FastAPI routes
└── health/
    └── ai_assistant.py                      # Health endpoint

tests/
└── modules/
    └── free/
        └── integration/
            └── ai/
                └── ai_assistant/
                    └── test_ai_assistant_integration.py

Key Components:

  1. Runtime (ai_assistant.py):

    • Provider management
    • Chat completion logic
    • Streaming support
    • Conversation history
    • Response caching
  2. Routes (routers/ai/ai_assistant.py):

    • POST /ai/assistant/completions — Get completions
    • POST /ai/assistant/stream — Stream responses
    • GET /ai/assistant/providers — List providers
    • POST /ai/assistant/cache/clear — Clear cache
    • GET /ai/assistant/health — Health check
  3. Types (ai_assistant_types.py):

    • AssistantMessage — Message structure
    • AssistantResponse — Response structure
    • ProviderConfig — Provider configuration
    • AiAssistantConfig — Runtime config

Minute 3-4: Configure Providers

Create config/ai_assistant.yaml:

# AI Assistant Configuration
providers:
  # Echo Provider (for testing)
  - name: echo
    provider_type: echo
    enabled: true
    options:
      prefix: "[AI] "
      suffix: " (powered by Echo)"
      mirror_context: true

  # Template Provider (pre-defined responses)
  - name: support
    provider_type: template
    enabled: true
    options:
      responses:
        - "Thank you for contacting support. {prompt}"
        - "I understand your concern about {prompt}. Let me help."
        - "Based on '{context}', regarding {prompt}, here's what I suggest..."

  # OpenAI Provider (ready for real AI)
  # - name: openai
  #   provider_type: openai
  #   enabled: false
  #   options:
  #     api_key: ${OPENAI_API_KEY}
  #     model: gpt-4
  #     temperature: 0.7

default_provider: echo
conversation_window: 20
cache_enabled: true
request_timeout_seconds: 30
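The conversation_window setting caps how many prior messages are retained when building provider context. Conceptually it works like this (a sketch, not RapidKit internals; `trimmed_history` is a hypothetical helper name):

```python
from collections import deque

def trimmed_history(messages: list, window: int = 20) -> list:
    """Keep only the most recent `window` messages, the way a
    conversation_window-style setting would (illustrative sketch)."""
    return list(deque(messages, maxlen=window))

# With window=20, only the 20 newest of 30 messages survive.
recent = trimmed_history(list(range(30)), window=20)
```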

Provider Types Explained:

Echo Provider

Perfect for testing. Echoes back your prompt with optional prefix/suffix.

# Example response:
# Input: "Hello"
# Output: "[AI] Hello (powered by Echo)"
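The behaviour is simple enough to sketch in a few lines (illustrative only; this is not RapidKit's actual echo implementation):

```python
def echo_generate(prompt: str,
                  prefix: str = "[AI] ",
                  suffix: str = " (powered by Echo)") -> str:
    """Mirror the prompt back with the configured prefix/suffix."""
    return f"{prefix}{prompt}{suffix}"

print(echo_generate("Hello"))  # [AI] Hello (powered by Echo)
```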

Template Provider

Returns pre-configured responses in rotation. Great for FAQ bots.

# First request: "How do I reset password?"
# Response: "Thank you for contacting support. How do I reset password?"
# 
# Second request: "Where is my order?"
# Response: "I understand your concern about Where is my order?. Let me help."
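Rotation can be pictured as cycling through the configured templates and formatting in the prompt (a sketch; `TemplateSketch` is a hypothetical name, not the module's class):

```python
from itertools import cycle

class TemplateSketch:
    """Rotate through response templates, substituting {prompt}/{context}."""
    def __init__(self, responses: list[str]) -> None:
        self._responses = cycle(responses)

    def generate(self, prompt: str, context: str = "") -> str:
        return next(self._responses).format(prompt=prompt, context=context)

bot = TemplateSketch([
    "Thank you for contacting support. {prompt}",
    "I understand your concern about {prompt}. Let me help.",
])
print(bot.generate("How do I reset password?"))
print(bot.generate("Where is my order?"))
```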

OpenAI Provider (Real AI)

Integrates with the OpenAI API. You'll implement this in the advanced section below.


Minute 4-5: Integrate into FastAPI

Update src/main.py:

from fastapi import FastAPI
from src.modules.free.ai.ai_assistant.routers.ai.ai_assistant import (
    register_ai_assistant,
    build_config,
)

# Load config from YAML
import yaml
with open("config/ai_assistant.yaml") as f:
    config_data = yaml.safe_load(f)

app = FastAPI(
    title="AI Agent API",
    version="1.0.0",
    description="Production AI Assistant powered by RapidKit"
)

# Register AI Assistant
register_ai_assistant(app, config=config_data)

@app.get("/")
async def root():
    return {
        "message": "AI Agent API is running",
        "endpoints": {
            "completions": "/ai/assistant/completions",
            "stream": "/ai/assistant/stream",
            "providers": "/ai/assistant/providers",
            "health": "/ai/assistant/health"
        }
    }

Install PyYAML:

poetry add pyyaml

Minute 5-6: Test Basic Functionality

Start the server:

rapidkit dev

Server output:

INFO:     Uvicorn running on http://127.0.0.1:8000
INFO:     Application startup complete.

Open browser: http://localhost:8000/docs

You'll see Swagger UI with AI endpoints!


Minute 6-7: Test Echo Provider

List available providers:

curl http://localhost:8000/ai/assistant/providers

Response:

["echo", "support"]

Test echo completion:

curl -X POST http://localhost:8000/ai/assistant/completions \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "What is RapidKit?",
    "provider": "echo"
  }'

Response:

{
  "provider": "echo",
  "content": "[AI] What is RapidKit? (powered by Echo)",
  "latency_ms": 0.234,
  "cached": false,
  "usage": {
    "prompt_tokens": 3,
    "completion_tokens": 8,
    "total_tokens": 11
  },
  "metadata": null
}

Minute 7-8: Test Template Provider

curl -X POST http://localhost:8000/ai/assistant/completions \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "How do I reset my password?",
    "provider": "support"
  }'

Response:

{
  "provider": "support",
  "content": "Thank you for contacting support. How do I reset my password?",
  "latency_ms": 0.156,
  "cached": false,
  "usage": {
    "prompt_tokens": 6,
    "completion_tokens": 8,
    "total_tokens": 14
  }
}

Test with conversation context:

curl -X POST http://localhost:8000/ai/assistant/completions \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Follow up question",
    "provider": "support",
    "context": [
      {"role": "user", "content": "I forgot my password"},
      {"role": "assistant", "content": "I can help reset your password"}
    ]
  }'

Response includes context:

{
  "provider": "support",
  "content": "Based on 'I forgot my password | I can help reset your password', regarding Follow up question, here's what I suggest...",
  "latency_ms": 0.189,
  "cached": false,
  "usage": {...}
}

Minute 8-9: Test Streaming

curl -X POST http://localhost:8000/ai/assistant/stream \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Tell me about AI agents",
    "provider": "echo"
  }'

Response streams word by word:

[AI] Tell me about AI agents (powered by Echo)
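Word-by-word streaming like this can be sketched as a plain generator (illustrative; real providers yield model tokens rather than whitespace-split words):

```python
from typing import Iterator

def stream_words(text: str) -> Iterator[str]:
    """Yield a response one word at a time, preserving spacing on join."""
    for i, word in enumerate(text.split()):
        yield (" " if i else "") + word

chunks = list(stream_words("[AI] Tell me about AI agents (powered by Echo)"))
print("".join(chunks))
```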

With Python client:

import requests

response = requests.post(
    "http://localhost:8000/ai/assistant/stream",
    json={"prompt": "Explain RapidKit", "provider": "echo"},
    stream=True
)

for chunk in response.iter_content(chunk_size=None, decode_unicode=True):
    print(chunk, end="", flush=True)

Minute 9-10: Test Caching

First request (not cached):

curl -X POST http://localhost:8000/ai/assistant/completions \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "What is caching?",
    "provider": "echo"
  }'

Response:

{
  "cached": false,
  "latency_ms": 0.245
}

Second identical request (cached):

# Same request again
curl -X POST http://localhost:8000/ai/assistant/completions \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "What is caching?",
    "provider": "echo"
  }'

Response:

{
  "cached": true,
  "latency_ms": 0.245
}

Notice cached: true — the stored response (with its originally recorded latency) is returned from the cache rather than recomputed.

Clear cache:

curl -X POST http://localhost:8000/ai/assistant/cache/clear

Health Monitoring

Check AI assistant health:

curl http://localhost:8000/ai/assistant/health

Response:

{
  "vendor": {
    "module": "ai_assistant",
    "status": "ok",
    "version": "0.1.7"
  },
  "runtime": {
    "module": "ai_assistant",
    "status": "ok",
    "providers": [
      {
        "name": "echo",
        "status": "ok",
        "latency_ms": 0.245,
        "details": {
          "prefix": "[AI] ",
          "suffix": " (powered by Echo)",
          "mirror_context": true
        }
      },
      {
        "name": "support",
        "status": "ok",
        "latency_ms": 0.189,
        "details": {
          "responses": 3
        }
      }
    ],
    "cache_entries": 5,
    "history_length": 12,
    "timestamp": "2026-02-10T10:30:00"
  },
  "status": "ok"
}

Key metrics:

  • Provider status and latency
  • Cache hit count
  • Conversation history length
  • Overall health status

Advanced: Add OpenAI Provider

Step 1: Install OpenAI SDK

poetry add openai

Step 2: Create custom provider

Create src/modules/free/ai/ai_assistant/providers/openai_provider.py:

"""OpenAI provider implementation."""

from __future__ import annotations
import os
from typing import Any, Iterable, Mapping, Sequence

from openai import OpenAI
from src.modules.free.ai.ai_assistant.ai_assistant import ChatProvider
from src.modules.free.ai.ai_assistant.ai_assistant_types import (
    AssistantMessage,
    ProviderStatus,
)


class OpenAIProvider:
    """OpenAI chat completion provider."""

    def __init__(
        self,
        *,
        name: str = "openai",
        api_key: str | None = None,
        model: str = "gpt-3.5-turbo",
        temperature: float = 0.7,
    ) -> None:
        self.name = name
        self._model = model
        self._temperature = temperature
        self._client = OpenAI(api_key=api_key or os.getenv("OPENAI_API_KEY"))
        self._last_latency_ms: float | None = None

    def generate(
        self,
        prompt: str,
        *,
        conversation: Sequence[AssistantMessage],
        settings: Mapping[str, Any] | None = None,
    ) -> str:
        """Generate completion using OpenAI API."""
        messages = [
            {"role": msg.role, "content": msg.content}
            for msg in conversation
        ]
        messages.append({"role": "user", "content": prompt})

        response = self._client.chat.completions.create(
            model=settings.get("model", self._model) if settings else self._model,
            messages=messages,
            temperature=settings.get("temperature", self._temperature) if settings else self._temperature,
        )

        return response.choices[0].message.content

    def stream(
        self,
        prompt: str,
        *,
        conversation: Sequence[AssistantMessage],
        settings: Mapping[str, Any] | None = None,
    ) -> Iterable[str]:
        """Stream completion tokens from OpenAI."""
        messages = [
            {"role": msg.role, "content": msg.content}
            for msg in conversation
        ]
        messages.append({"role": "user", "content": prompt})

        stream = self._client.chat.completions.create(
            model=settings.get("model", self._model) if settings else self._model,
            messages=messages,
            temperature=settings.get("temperature", self._temperature) if settings else self._temperature,
            stream=True,
        )

        for chunk in stream:
            if chunk.choices[0].delta.content:
                yield chunk.choices[0].delta.content

    def note_latency(self, latency_ms: float) -> None:
        """Record latency for health metrics."""
        self._last_latency_ms = latency_ms

    def health(self) -> ProviderStatus:
        """Return provider health status."""
        return ProviderStatus(
            name=self.name,
            status="ok",
            latency_ms=self._last_latency_ms,
            details={
                "model": self._model,
                "temperature": self._temperature,
            },
        )

Step 3: Register OpenAI provider

Update src/main.py:

from fastapi import FastAPI
from src.modules.free.ai.ai_assistant.routers.ai.ai_assistant import (
    register_ai_assistant,
)
from src.modules.free.ai.ai_assistant.providers.openai_provider import OpenAIProvider

import os
import yaml
with open("config/ai_assistant.yaml") as f:
    config_data = yaml.safe_load(f)

app = FastAPI(title="AI Agent API")

# Register assistant
assistant = register_ai_assistant(app, config=config_data)

# Add OpenAI provider manually
openai_provider = OpenAIProvider(
    name="openai",
    api_key=os.getenv("OPENAI_API_KEY"),
    model="gpt-4",
    temperature=0.7,
)
assistant.register_provider(openai_provider)
assistant.set_default_provider("openai")

If OPENAI_API_KEY is not set, keep echo/support as the default so local and demo flows still work.

Step 4: Set API key

export OPENAI_API_KEY="sk-..."

Step 5: Test real AI

curl -X POST http://localhost:8000/ai/assistant/completions \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Explain RapidKit in one sentence",
    "provider": "openai"
  }'

Response (real GPT-4):

{
  "provider": "openai",
  "content": "RapidKit is a developer toolkit that accelerates backend API development by providing pre-built modules, workspace management, and production-ready templates for FastAPI and NestJS frameworks.",
  "latency_ms": 1247.5,
  "cached": false,
  "usage": {
    "prompt_tokens": 7,
    "completion_tokens": 31,
    "total_tokens": 38
  }
}

Real-World Use Case: Customer Support Agent

Let's build a customer support AI agent:

Create src/agents/support_agent.py:

"""Customer support AI agent."""

from typing import Sequence
from src.modules.free.ai.ai_assistant.ai_assistant import AiAssistant
from src.modules.free.ai.ai_assistant.ai_assistant_types import (
    AssistantMessage,
    AssistantResponse,
)


class SupportAgent:
    """AI-powered customer support agent."""

    def __init__(self, assistant: AiAssistant) -> None:
        self._assistant = assistant
        self._system_prompt = """You are a helpful customer support agent.
Be concise, friendly, and solution-oriented.
If you don't know the answer, escalate to human support."""

    def handle_ticket(
        self,
        customer_message: str,
        ticket_history: Sequence[AssistantMessage] | None = None,
    ) -> AssistantResponse:
        """Process customer support ticket."""
        context = [
            AssistantMessage(role="system", content=self._system_prompt)
        ]
        if ticket_history:
            context.extend(ticket_history)

        return self._assistant.chat(
            customer_message,
            provider="openai",  # or "support" for template
            context=context,
        )

    def classify_urgency(self, message: str) -> str:
        """Classify ticket urgency using AI."""
        prompt = f"Classify urgency (low/medium/high): {message}"
        response = self._assistant.chat(
            prompt,
            provider="openai",
            settings={"temperature": 0.1},  # More deterministic
        )
        return response.content.strip().lower()
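classify_urgency trusts the model to answer with exactly one word; in practice replies vary, so a small normalizer is useful (`parse_urgency` is a hypothetical helper, not part of the module):

```python
def parse_urgency(raw: str, default: str = "medium") -> str:
    """Map a free-text model reply onto low/medium/high."""
    text = raw.strip().lower()
    for level in ("high", "medium", "low"):
        if level in text:
            return level
    return default

print(parse_urgency("Urgency: HIGH."))  # high
print(parse_urgency("no idea"))         # medium
```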

Add route in src/main.py:

from fastapi import Request
from src.agents.support_agent import SupportAgent

@app.post("/support/ticket")
async def handle_support_ticket(
    message: str,
    request: Request,
) -> dict:
    assistant = request.app.state.ai_assistant
    agent = SupportAgent(assistant)

    # Classify urgency
    urgency = agent.classify_urgency(message)

    # Generate response
    response = agent.handle_ticket(message)

    return {
        "ticket_id": "TKT-12345",
        "urgency": urgency,
        "ai_response": response.content,
        "latency_ms": response.latency_ms,
        "next_action": "escalate" if urgency == "high" else "monitor"
    }

Test support agent:

curl -X POST "http://localhost:8000/support/ticket?message=My%20payment%20failed"

Response:

{
  "ticket_id": "TKT-12345",
  "urgency": "high",
  "ai_response": "I understand your payment failed. Let me help you resolve this. Can you provide your order ID and the error message you received?",
  "latency_ms": 1156.3,
  "next_action": "escalate"
}

What You Built

A production-ready AI Agent API with:

Core Features:

  • ✅ Multi-provider AI assistant (Echo, Template, OpenAI)
  • ✅ Chat completion endpoint
  • ✅ Streaming responses
  • ✅ Conversation context handling
  • ✅ Response caching (instant cache hits)
  • ✅ Health monitoring with metrics
  • ✅ Provider switching at runtime

Advanced Features:

  • ✅ Custom OpenAI integration
  • ✅ Customer support agent
  • ✅ Urgency classification
  • ✅ Ticket routing logic

Production Ready:

  • ✅ Type-safe with Pydantic models
  • ✅ Health checks for monitoring
  • ✅ Performance metrics (latency, cache hits)
  • ✅ Error handling
  • ✅ Async/streaming support
  • ✅ Integration tests included

Total time: 10 minutes

Custom code: ~50 lines

Everything else: Generated by RapidKit


Key Takeaways

1. Provider-Agnostic Design

Switch AI providers without changing your code:

# Use OpenAI
response = assistant.chat(prompt, provider="openai")

# Switch to Anthropic (when you add it)
response = assistant.chat(prompt, provider="anthropic")

# Fallback to template
response = assistant.chat(prompt, provider="support")
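Provider switching also makes fallback chains cheap to build. A pattern sketch (`assistant_chat` stands in for `assistant.chat`; the names are illustrative, not a RapidKit API):

```python
def chat_with_fallback(assistant_chat, prompt: str,
                       providers=("openai", "support", "echo")):
    """Try each provider in order; return the first successful response."""
    last_exc: Exception | None = None
    for name in providers:
        try:
            return assistant_chat(prompt, provider=name)
        except Exception as exc:  # provider down, quota hit, timeout...
            last_exc = exc
    raise last_exc
```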

2. Built-in Production Features

No need to implement yourself:

  • Conversation history tracking
  • Response caching (faster + cheaper)
  • Streaming support
  • Health monitoring
  • Performance metrics

3. Module Architecture Benefits

The ai_assistant module follows RapidKit's standard:

  • module.yaml — Version and metadata
  • templates/ — FastAPI and NestJS variants
  • docs/usage.md — Usage guide
  • Health endpoints built-in
  • Testing included

4. Extensibility

Easy to add providers:

  1. Implement ChatProvider protocol
  2. Register with assistant.register_provider()
  3. Done
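Those two steps mean any object with the right methods can be a provider. Here's a toy provider matching the shape used in the OpenAI example above (the exact ChatProvider protocol may differ; this is inferred from the article's code):

```python
from typing import Iterable

class ShoutProvider:
    """Toy provider: uppercases the prompt. Would be registered with
    assistant.register_provider(ShoutProvider())."""
    name = "shout"

    def generate(self, prompt: str, *, conversation=(), settings=None) -> str:
        return prompt.upper()

    def stream(self, prompt: str, *, conversation=(), settings=None) -> Iterable[str]:
        for i, word in enumerate(prompt.upper().split()):
            yield (" " if i else "") + word

p = ShoutProvider()
print(p.generate("hello rapidkit"))  # HELLO RAPIDKIT
```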

Example providers you can add:

  • Anthropic Claude
  • Google PaLM
  • Azure OpenAI
  • Cohere
  • Local models (Ollama, LM Studio)

Next Steps

Enhance your AI agent:

  1. Add authentication:

   rapidkit add module auth_core

  2. Add database for chat history:

   rapidkit add module db_postgres

  3. Add Redis for distributed caching:

   rapidkit add module redis

  4. Add observability:

   rapidkit add module observability

  5. Deploy:

   docker build -t ai-agent .
   docker run -p 8000:8000 ai-agent

Module Customization

Override configuration via environment:

export RAPIDKIT_AI_ASSISTANT_LOG_LEVEL=DEBUG
export RAPIDKIT_AI_ASSISTANT_TIMEOUT=60
export RAPIDKIT_AI_ASSISTANT_FEATURE_FLAG=true

Custom overrides in overrides.py:

See /src/modules/free/ai/ai_assistant/overrides.py for runtime hooks.


Testing

Run integration tests:

rapidkit test

Test specific module:

poetry run pytest tests/modules/free/integration/ai/ai_assistant/ -v

Module verification:

rapidkit modules status

Troubleshooting

Provider Not Found

ProviderNotFoundError: Provider 'openai' is not registered

Solution: Check that the provider is registered, either in config/ai_assistant.yaml or manually via assistant.register_provider().

Timeout Errors

TimeoutError: Request exceeded 30 seconds

Solution: Increase timeout:

request_timeout_seconds: 60

OpenAI API Errors

Authentication Error: Invalid API key

Solution: Check environment variable:

echo $OPENAI_API_KEY



You just built a production AI agent in 10 minutes.

Now scale it.
