The Problem with AI Integration
You want to add AI to your FastAPI backend.
What usually happens:
- Install 5 different packages (openai, anthropic, langchain, etc.)
- Write wrapper code for each provider
- Handle streaming, caching, retries manually
- Debug configuration hell
- Spend 2 days integrating what should take 10 minutes
There's a better way.
What We're Building
A production-ready AI Assistant Service with:
- ✅ Multiple AI providers (Echo, Template, OpenAI-ready)
- ✅ Chat completions API
- ✅ Streaming responses
- ✅ Conversation history
- ✅ Response caching
- ✅ Health monitoring
- ✅ Provider switching
Time to build: 10 minutes
Lines of custom code: ~50
Source code: https://github.com/getrapidkit/rapidkit-examples/tree/main/my-ai-workspace
Published article links:
- Medium: https://rapidkit.medium.com/build-your-first-ai-agent-with-rapidkit-in-10-minutes-f38a6a12088d
- Dev.to: https://dev.to/rapidkit/build-your-first-ai-agent-with-rapidkit-in-10-minutes-3dj6
Let's start.
Prerequisites
You need:
- Node.js 20+ (node --version)
- Python 3.10+ (python3 --version)
- A terminal
- (Optional) OpenAI API key for real AI
Time: 1 minute to verify
Minute 0-2: Create Workspace and Project
Step 1: Create workspace (handles Python/Poetry setup automatically)
npx rapidkit my-ai-workspace
Prompts you'll see:
? Author name: developer
? Select Python version: 3.10
? Install method: 🎯 Poetry (Recommended)
What happens:
- Creates workspace directory
- Installs Poetry if needed
- Creates virtualenv in workspace root
- Installs rapidkit-core
Step 2: Create FastAPI project
cd my-ai-workspace
rapidkit create project fastapi.standard ai-agent
Prompts:
Project name: ai-agent
Install essential modules? Y
Step 3: Initialize project
cd ai-agent
rapidkit init
Your structure:
my-ai-workspace/
├── .venv/              # Workspace virtualenv
├── pyproject.toml
└── ai-agent/
    ├── .venv/          # Project virtualenv
    ├── src/
    │   ├── main.py
    │   └── modules/
    ├── tests/
    └── pyproject.toml
Minute 2-3: Add AI Assistant Module
rapidkit add module ai_assistant
What gets installed:
src/
├── modules/
│   └── free/
│       └── ai/
│           └── ai_assistant/
│               ├── ai_assistant.py          # Runtime engine
│               ├── ai_assistant_types.py    # Type definitions
│               └── routers/
│                   └── ai/
│                       └── ai_assistant.py  # FastAPI routes
└── health/
    └── ai_assistant.py                      # Health endpoint
tests/
└── modules/
    └── free/
        └── integration/
            └── ai/
                └── ai_assistant/
                    └── test_ai_assistant_integration.py
Key Components:
- Runtime (ai_assistant.py):
  - Provider management
  - Chat completion logic
  - Streaming support
  - Conversation history
  - Response caching
- Routes (routers/ai/ai_assistant.py):
  - POST /ai/assistant/completions: get completions
  - POST /ai/assistant/stream: stream responses
  - GET /ai/assistant/providers: list providers
  - POST /ai/assistant/cache/clear: clear cache
  - GET /ai/assistant/health: health check
- Types (ai_assistant_types.py):
  - AssistantMessage: message structure
  - AssistantResponse: response structure
  - ProviderConfig: provider configuration
  - AiAssistantConfig: runtime config
Minute 3-4: Configure Providers
Create config/ai_assistant.yaml:
# AI Assistant Configuration
providers:
  # Echo Provider (for testing)
  - name: echo
    provider_type: echo
    enabled: true
    options:
      prefix: "[AI] "
      suffix: " (powered by Echo)"
      mirror_context: true
  # Template Provider (pre-defined responses)
  - name: support
    provider_type: template
    enabled: true
    options:
      responses:
        - "Thank you for contacting support. {prompt}"
        - "I understand your concern about {prompt}. Let me help."
        - "Based on '{context}', regarding {prompt}, here's what I suggest..."
  # OpenAI Provider (ready for real AI)
  # - name: openai
  #   provider_type: openai
  #   enabled: false
  #   options:
  #     api_key: ${OPENAI_API_KEY}
  #     model: gpt-4
  #     temperature: 0.7
default_provider: echo
conversation_window: 20
cache_enabled: true
request_timeout_seconds: 30
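One detail worth checking before enabling the OpenAI entry: yaml.safe_load returns a ${OPENAI_API_KEY} placeholder as a literal string and does not substitute the environment variable. Whether RapidKit expands it afterwards is not stated here, so the sketch below shows one way to do it yourself on the loaded dict. The helper name expand_env and the DEMO_API_KEY variable are hypothetical; unknown variables are left untouched so the problem stays visible.

```python
import os
import re
from typing import Any

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: Any) -> Any:
    """Recursively replace ${VAR} in strings with values from os.environ.

    Hypothetical helper, not part of RapidKit. Unset variables are left as-is.
    """
    if isinstance(value, str):
        return _PLACEHOLDER.sub(
            lambda m: os.environ.get(m.group(1), m.group(0)), value
        )
    if isinstance(value, dict):
        return {key: expand_env(item) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item) for item in value]
    return value


# Stand-in for the dict that yaml.safe_load would return.
os.environ["DEMO_API_KEY"] = "sk-demo"
config = {
    "providers": [
        {"name": "openai", "options": {"api_key": "${DEMO_API_KEY}"}},
    ],
    "conversation_window": 20,
}
expanded = expand_env(config)
print(expanded["providers"][0]["options"]["api_key"])
```

In src/main.py this would sit between yaml.safe_load(f) and the call that registers the assistant.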
Provider Types Explained:
Echo Provider
Perfect for testing. Echoes back your prompt with optional prefix/suffix.
# Example response:
# Input: "Hello"
# Output: "[AI] Hello (powered by Echo)"
Template Provider
Returns pre-configured responses in rotation. Great for FAQ bots.
# First request: "How do I reset password?"
# Response: "Thank you for contacting support. How do I reset password?"
#
# Second request: "Where is my order?"
# Response: "I understand your concern about Where is my order?. Let me help."
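To make the rotation concrete, here is a minimal sketch of how a template-style provider could cycle through its responses. The class TemplateSketch is an illustration only, not the module's implementation; it assumes each template is filled with str.format using the {prompt} and {context} placeholders from the config.

```python
from itertools import cycle


class TemplateSketch:
    """Illustrative rotation over canned responses (not RapidKit's code)."""

    def __init__(self, responses: list[str]) -> None:
        if not responses:
            raise ValueError("at least one response template is required")
        self._templates = cycle(responses)

    def generate(self, prompt: str, context: str = "") -> str:
        # str.format ignores unused keyword arguments, so a template may
        # reference {prompt}, {context}, both, or neither.
        return next(self._templates).format(prompt=prompt, context=context)


support = TemplateSketch([
    "Thank you for contacting support. {prompt}",
    "I understand your concern about {prompt}. Let me help.",
])
first = support.generate("How do I reset password?")
second = support.generate("Where is my order?")
third = support.generate("Anything else?")  # wraps back to the first template
print(first, second, third, sep="\n")
```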
OpenAI Provider (Real AI)
Integrates with the OpenAI API. You'll implement this in the "Advanced: Add OpenAI Provider" section below.
Minute 4-5: Integrate into FastAPI
Update src/main.py:
import yaml
from fastapi import FastAPI

from src.modules.free.ai.ai_assistant.routers.ai.ai_assistant import (
    register_ai_assistant,
    build_config,
)

# Load config from YAML
with open("config/ai_assistant.yaml") as f:
    config_data = yaml.safe_load(f)

app = FastAPI(
    title="AI Agent API",
    version="1.0.0",
    description="Production AI Assistant powered by RapidKit",
)

# Register AI Assistant
register_ai_assistant(app, config=config_data)


@app.get("/")
async def root():
    return {
        "message": "AI Agent API is running",
        "endpoints": {
            "completions": "/ai/assistant/completions",
            "stream": "/ai/assistant/stream",
            "providers": "/ai/assistant/providers",
            "health": "/ai/assistant/health",
        },
    }
Install PyYAML:
poetry add pyyaml
Minute 5-6: Test Basic Functionality
Start the server:
rapidkit dev
Server output:
INFO: Uvicorn running on http://127.0.0.1:8000
INFO: Application startup complete.
Open browser: http://localhost:8000/docs
You'll see Swagger UI with AI endpoints!
Minute 6-7: Test Echo Provider
List available providers:
curl http://localhost:8000/ai/assistant/providers
Response:
["echo", "support"]
Test echo completion:
curl -X POST http://localhost:8000/ai/assistant/completions \
-H "Content-Type: application/json" \
-d '{
"prompt": "What is RapidKit?",
"provider": "echo"
}'
Response:
{
"provider": "echo",
"content": "[AI] What is RapidKit? (powered by Echo)",
"latency_ms": 0.234,
"cached": false,
"usage": {
"prompt_tokens": 3,
"completion_tokens": 8,
"total_tokens": 11
},
"metadata": null
}
Minute 7-8: Test Template Provider
curl -X POST http://localhost:8000/ai/assistant/completions \
-H "Content-Type: application/json" \
-d '{
"prompt": "How do I reset my password?",
"provider": "support"
}'
Response:
{
"provider": "support",
"content": "Thank you for contacting support. How do I reset my password?",
"latency_ms": 0.156,
"cached": false,
"usage": {
"prompt_tokens": 6,
"completion_tokens": 8,
"total_tokens": 14
}
}
Test with conversation context:
curl -X POST http://localhost:8000/ai/assistant/completions \
-H "Content-Type: application/json" \
-d '{
"prompt": "Follow up question",
"provider": "support",
"context": [
{"role": "user", "content": "I forgot my password"},
{"role": "assistant", "content": "I can help reset your password"}
]
}'
Response includes context:
{
"provider": "support",
"content": "Based on 'I forgot my password | I can help reset your password', regarding Follow up question, here's what I suggest...",
"latency_ms": 0.189,
"cached": false,
"usage": {...}
}
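The response above suggests that the context messages are joined with " | " and substituted into the {context} placeholder. The sketch below reproduces that output under two assumptions: the separator is exactly the one visible in the response, and conversation_window keeps only the most recent N messages. Message and render_context are hypothetical names standing in for the module's own types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical stand-in for AssistantMessage."""

    role: str
    content: str


def render_context(messages: list[Message], window: int = 20) -> str:
    """Join the contents of the most recent `window` messages with ' | '."""
    # messages[-0:] would return the whole list, so handle window <= 0 explicitly.
    recent = messages[-window:] if window > 0 else []
    return " | ".join(m.content for m in recent)


history = [
    Message("user", "I forgot my password"),
    Message("assistant", "I can help reset your password"),
]
template = "Based on '{context}', regarding {prompt}, here's what I suggest..."
reply = template.format(
    context=render_context(history), prompt="Follow up question"
)
print(reply)
```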
Minute 8-9: Test Streaming
curl -X POST http://localhost:8000/ai/assistant/stream \
-H "Content-Type: application/json" \
-d '{
"prompt": "Tell me about AI agents",
"provider": "echo"
}'
Response streams word by word:
[AI] Tell me about AI agents (powered by Echo)
With Python client:
import requests

response = requests.post(
    "http://localhost:8000/ai/assistant/stream",
    json={"prompt": "Explain RapidKit", "provider": "echo"},
    stream=True,
)
# Print each chunk as soon as it arrives instead of waiting for the full body.
for chunk in response.iter_content(chunk_size=None, decode_unicode=True):
    print(chunk, end="", flush=True)
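On the server side, a word-by-word stream is usually a generator that yields small pieces of text. The sketch below shows only that chunking step. The stream_words helper is hypothetical and is an assumption about how an echo-style provider might split its output, not the module's code.

```python
import re
from typing import Iterator


def stream_words(text: str) -> Iterator[str]:
    """Yield text one word at a time, keeping surrounding whitespace."""
    for match in re.finditer(r"\s*\S+\s*", text):
        yield match.group(0)


text = "[AI] Tell me about AI agents (powered by Echo)"
chunks = list(stream_words(text))
print(chunks[:3])
print("".join(chunks))
```

In FastAPI, a generator like this can be passed to fastapi.responses.StreamingResponse, which sends each yielded piece to the client as it is produced.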
Minute 9-10: Test Caching
First request (not cached):
curl -X POST http://localhost:8000/ai/assistant/completions \
-H "Content-Type: application/json" \
-d '{
"prompt": "What is caching?",
"provider": "echo"
}'
Response:
{
"cached": false,
"latency_ms": 0.245
}
Second identical request (cached):
# Same request again
curl -X POST http://localhost:8000/ai/assistant/completions \
-H "Content-Type: application/json" \
-d '{
"prompt": "What is caching?",
"provider": "echo"
}'
Response:
{
"cached": true,
"latency_ms": 0.245
}
Notice cached: true and the identical latency_ms value: the response was served from the cache, so the stored result is returned rather than recomputed.
Clear cache:
curl -X POST http://localhost:8000/ai/assistant/cache/clear
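The behaviour above can be pictured as a small in-memory cache keyed by the request. The sketch below is illustrative only: ResponseCache is a hypothetical class, and the choice of key (provider, prompt, and context hashed together) is an assumption, since the module's real cache key is not described here.

```python
import hashlib
import json
from typing import Callable


class ResponseCache:
    """Hypothetical in-memory cache keyed by provider, prompt, and context."""

    def __init__(self) -> None:
        self._entries: dict[str, str] = {}

    @staticmethod
    def _key(provider: str, prompt: str, context: list[str]) -> str:
        payload = json.dumps([provider, prompt, context])
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_or_compute(
        self,
        provider: str,
        prompt: str,
        context: list[str],
        compute: Callable[[], str],
    ) -> tuple[str, bool]:
        """Return (content, cached); call compute() only on a miss."""
        key = self._key(provider, prompt, context)
        if key in self._entries:
            return self._entries[key], True
        content = compute()
        self._entries[key] = content
        return content, False

    def clear(self) -> int:
        """Drop all entries and return how many were removed."""
        removed = len(self._entries)
        self._entries.clear()
        return removed


cache = ResponseCache()
calls: list[int] = []


def fake_provider() -> str:
    """Stand-in for a real provider call; records each invocation."""
    calls.append(1)
    return "[AI] What is caching? (powered by Echo)"


first = cache.get_or_compute("echo", "What is caching?", [], fake_provider)
second = cache.get_or_compute("echo", "What is caching?", [], fake_provider)
print(first[1], second[1], len(calls))
```

With a paid provider the second call is where the savings come from: the provider function is never invoked on a hit.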
Health Monitoring
Check AI assistant health:
curl http://localhost:8000/ai/assistant/health
Response:
{
  "vendor": {
    "module": "ai_assistant",
    "status": "ok",
    "version": "0.1.7"
  },
  "runtime": {
    "module": "ai_assistant",
    "status": "ok",
    "providers": [
      {
        "name": "echo",
        "status": "ok",
        "latency_ms": 0.245,
        "details": {
          "prefix": "[AI] ",
          "suffix": " (powered by Echo)",
          "mirror_context": true
        }
      },
      {
        "name": "support",
        "status": "ok",
        "latency_ms": 0.189,
        "details": {
          "responses": 3
        }
      }
    ],
    "cache_entries": 5,
    "history_length": 12,
    "timestamp": "2026-02-10T10:30:00"
  },
  "status": "ok"
}
Key metrics:
- Provider status and latency
- Number of cached entries (cache_entries)
- Conversation history length
- Overall health status
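A monitoring script can turn that payload into an alert list. The sketch below relies only on the field names visible in the sample response (runtime.providers, name, status, latency_ms). The find_unhealthy helper, the 2000 ms latency budget, and the extra "openai" entry in the sample are assumptions made for illustration.

```python
from typing import Any


def find_unhealthy(
    payload: dict[str, Any], max_latency_ms: float = 2000.0
) -> list[str]:
    """List provider names that are not 'ok' or exceed the latency budget."""
    problems = []
    for provider in payload.get("runtime", {}).get("providers", []):
        latency = provider.get("latency_ms")
        too_slow = latency is not None and latency > max_latency_ms
        if provider.get("status") != "ok" or too_slow:
            problems.append(provider.get("name", "<unnamed>"))
    return problems


# Trimmed sample; the slow "openai" entry is invented for the demonstration.
sample = {
    "runtime": {
        "providers": [
            {"name": "echo", "status": "ok", "latency_ms": 0.245},
            {"name": "support", "status": "ok", "latency_ms": 0.189},
            {"name": "openai", "status": "ok", "latency_ms": 5000.0},
        ]
    },
    "status": "ok",
}
print(find_unhealthy(sample))
```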
Advanced: Add OpenAI Provider
Step 1: Install OpenAI SDK
poetry add openai
Step 2: Create custom provider
Create src/modules/free/ai/ai_assistant/providers/openai_provider.py:
"""OpenAI provider implementation."""
from __future__ import annotations

import os
from typing import Any, Iterable, Mapping, Sequence

from openai import OpenAI

from src.modules.free.ai.ai_assistant.ai_assistant_types import (
    AssistantMessage,
    ProviderStatus,
)


class OpenAIProvider:
    """OpenAI chat completion provider."""

    def __init__(
        self,
        *,
        name: str = "openai",
        api_key: str | None = None,
        model: str = "gpt-3.5-turbo",
        temperature: float = 0.7,
    ) -> None:
        self.name = name
        self._model = model
        self._temperature = temperature
        self._client = OpenAI(api_key=api_key or os.getenv("OPENAI_API_KEY"))
        self._last_latency_ms: float | None = None

    def _request_args(
        self,
        prompt: str,
        conversation: Sequence[AssistantMessage],
        settings: Mapping[str, Any] | None,
    ) -> dict[str, Any]:
        """Build the arguments shared by generate() and stream()."""
        settings = settings or {}
        messages = [
            {"role": msg.role, "content": msg.content}
            for msg in conversation
        ]
        messages.append({"role": "user", "content": prompt})
        return {
            "model": settings.get("model", self._model),
            "messages": messages,
            "temperature": settings.get("temperature", self._temperature),
        }

    def generate(
        self,
        prompt: str,
        *,
        conversation: Sequence[AssistantMessage],
        settings: Mapping[str, Any] | None = None,
    ) -> str:
        """Generate completion using OpenAI API."""
        response = self._client.chat.completions.create(
            **self._request_args(prompt, conversation, settings)
        )
        # message.content is optional in the SDK, so fall back to "".
        return response.choices[0].message.content or ""

    def stream(
        self,
        prompt: str,
        *,
        conversation: Sequence[AssistantMessage],
        settings: Mapping[str, Any] | None = None,
    ) -> Iterable[str]:
        """Stream completion tokens from OpenAI."""
        stream = self._client.chat.completions.create(
            **self._request_args(prompt, conversation, settings),
            stream=True,
        )
        for chunk in stream:
            # Some chunks carry no choices or no text; skip them.
            if chunk.choices and chunk.choices[0].delta.content:
                yield chunk.choices[0].delta.content

    def note_latency(self, latency_ms: float) -> None:
        """Record latency for health metrics."""
        self._last_latency_ms = latency_ms

    def health(self) -> ProviderStatus:
        """Return provider health status."""
        return ProviderStatus(
            name=self.name,
            status="ok",
            latency_ms=self._last_latency_ms,
            details={
                "model": self._model,
                "temperature": self._temperature,
            },
        )
Step 3: Register OpenAI provider
Update src/main.py:
import os

import yaml
from fastapi import FastAPI

from src.modules.free.ai.ai_assistant.routers.ai.ai_assistant import (
    register_ai_assistant,
)
from src.modules.free.ai.ai_assistant.providers.openai_provider import OpenAIProvider

with open("config/ai_assistant.yaml") as f:
    config_data = yaml.safe_load(f)

app = FastAPI(title="AI Agent API")

# Register assistant
assistant = register_ai_assistant(app, config=config_data)

# Add the OpenAI provider only when a key is available, so that
# echo/support remain the default for local and demo runs.
if os.getenv("OPENAI_API_KEY"):
    openai_provider = OpenAIProvider(
        name="openai",
        api_key=os.getenv("OPENAI_API_KEY"),
        model="gpt-4",
        temperature=0.7,
    )
    assistant.register_provider(openai_provider)
    assistant.set_default_provider("openai")
If OPENAI_API_KEY is not set, keep echo/support as default so local/demo flows still work.
Step 4: Set API key
export OPENAI_API_KEY="sk-..."
Step 5: Test real AI
curl -X POST http://localhost:8000/ai/assistant/completions \
-H "Content-Type: application/json" \
-d '{
"prompt": "Explain RapidKit in one sentence",
"provider": "openai"
}'
Response (real GPT-4):
{
"provider": "openai",
"content": "RapidKit is a developer toolkit that accelerates backend API development by providing pre-built modules, workspace management, and production-ready templates for FastAPI and NestJS frameworks.",
"latency_ms": 1247.5,
"cached": false,
"usage": {
"prompt_tokens": 7,
"completion_tokens": 31,
"total_tokens": 38
}
}
Real-World Use Case: Customer Support Agent
Let's build a customer support AI agent:
Create src/agents/support_agent.py:
"""Customer support AI agent."""
from typing import Sequence

from src.modules.free.ai.ai_assistant.ai_assistant import AiAssistant
from src.modules.free.ai.ai_assistant.ai_assistant_types import (
    AssistantMessage,
    AssistantResponse,
)


class SupportAgent:
    """AI-powered customer support agent."""

    def __init__(self, assistant: AiAssistant) -> None:
        self._assistant = assistant
        self._system_prompt = (
            "You are a helpful customer support agent.\n"
            "Be concise, friendly, and solution-oriented.\n"
            "If you don't know the answer, escalate to human support."
        )

    def handle_ticket(
        self,
        customer_message: str,
        ticket_history: Sequence[AssistantMessage] | None = None,
    ) -> AssistantResponse:
        """Process customer support ticket."""
        context = [
            AssistantMessage(role="system", content=self._system_prompt)
        ]
        if ticket_history:
            context.extend(ticket_history)
        return self._assistant.chat(
            customer_message,
            provider="openai",  # or "support" for template
            context=context,
        )

    def classify_urgency(self, message: str) -> str:
        """Classify ticket urgency using AI."""
        prompt = f"Classify urgency (low/medium/high): {message}"
        response = self._assistant.chat(
            prompt,
            provider="openai",
            settings={"temperature": 0.1},  # More deterministic
        )
        return response.content.strip().lower()
Add route in src/main.py:
from fastapi import Request

from src.agents.support_agent import SupportAgent


# Declared with plain "def": the agent calls below are blocking, and FastAPI
# runs non-async endpoints in a thread pool instead of on the event loop.
@app.post("/support/ticket")
def handle_support_ticket(
    message: str,
    request: Request,
) -> dict:
    assistant = request.app.state.ai_assistant
    agent = SupportAgent(assistant)

    # Classify urgency
    urgency = agent.classify_urgency(message)

    # Generate response
    response = agent.handle_ticket(message)

    return {
        "ticket_id": "TKT-12345",
        "urgency": urgency,
        "ai_response": response.content,
        "latency_ms": response.latency_ms,
        "next_action": "escalate" if urgency == "high" else "monitor",
    }
Test support agent:
curl -X POST "http://localhost:8000/support/ticket?message=My%20payment%20failed"
Response:
{
"ticket_id": "TKT-12345",
"urgency": "high",
"ai_response": "I understand your payment failed. Let me help you resolve this. Can you provide your order ID and the error message you received?",
"latency_ms": 1156.3,
"next_action": "escalate"
}
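One caveat with the routing logic: classify_urgency returns whatever text the model produces, lowercased. A reply such as "High." or "Urgency: high" does not equal "high", so the ticket would silently fall through to "monitor". The sketch below shows one way to map free-form output onto the three expected levels. The normalize_urgency helper and its "medium" fallback are assumptions, not part of the module.

```python
import re


def normalize_urgency(raw: str, default: str = "medium") -> str:
    """Map free-form model output onto 'low', 'medium', or 'high'.

    Accepts the answer only when exactly one level is mentioned;
    anything ambiguous or empty falls back to `default`.
    """
    found = set(re.findall(r"\b(low|medium|high)\b", raw.lower()))
    return found.pop() if len(found) == 1 else default


for sample in ["High.", "Urgency: LOW", "It could be low or high", ""]:
    print(repr(sample), "->", normalize_urgency(sample))
```

Wrapping the return value of classify_urgency in a helper like this keeps the "escalate" branch reachable even when the model adds punctuation or extra words.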
What You Built
A production-ready AI Agent API with:
Core Features:
- ✅ Multi-provider AI assistant (Echo, Template, OpenAI)
- ✅ Chat completion endpoint
- ✅ Streaming responses
- ✅ Conversation context handling
- ✅ Response caching (instant cache hits)
- ✅ Health monitoring with metrics
- ✅ Provider switching at runtime
Advanced Features:
- ✅ Custom OpenAI integration
- ✅ Customer support agent
- ✅ Urgency classification
- ✅ Ticket routing logic
Production Ready:
- ✅ Type-safe with Pydantic models
- ✅ Health checks for monitoring
- ✅ Performance metrics (latency, cache hits)
- ✅ Error handling
- ✅ Async/streaming support
- ✅ Integration tests included
Total time: 10 minutes
Custom code: ~50 lines
Everything else: Generated by RapidKit
Key Takeaways
1. Provider-Agnostic Design
Switch AI providers without changing your code:
# Use OpenAI
response = assistant.chat(prompt, provider="openai")
# Switch to Anthropic (when you add it)
response = assistant.chat(prompt, provider="anthropic")
# Fallback to template
response = assistant.chat(prompt, provider="support")
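Because the provider is just a name, a fallback chain takes only a few lines. The sketch below tries providers in order and returns the first successful answer. chat_with_fallback is a hypothetical helper, and fake_chat stands in for assistant.chat(prompt, provider=...) so the example runs without the module or an API key.

```python
from typing import Callable, Sequence


def chat_with_fallback(
    prompt: str,
    providers: Sequence[str],
    chat: Callable[[str, str], str],
) -> tuple[str, str]:
    """Try providers in order; return (provider_name, content) on first success."""
    errors = []
    for name in providers:
        try:
            return name, chat(prompt, name)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def fake_chat(prompt: str, provider: str) -> str:
    """Stand-in for assistant.chat; the 'openai' provider always times out."""
    if provider == "openai":
        raise TimeoutError("simulated timeout")
    return f"[{provider}] {prompt}"


used, content = chat_with_fallback("Hello", ["openai", "support"], fake_chat)
print(used, content)
```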
2. Built-in Production Features
No need to implement yourself:
- Conversation history tracking
- Response caching (faster + cheaper)
- Streaming support
- Health monitoring
- Performance metrics
3. Module Architecture Benefits
The ai_assistant module follows RapidKit's standard:
- module.yaml: version and metadata
- templates/: FastAPI and NestJS variants
- docs/usage.md: usage guide
- Health endpoints built-in
- Testing included
4. Extensibility
Easy to add providers:
- Implement the ChatProvider protocol
- Register with assistant.register_provider()
- Done
Example providers you can add:
- Anthropic Claude
- Google PaLM
- Azure OpenAI
- Cohere
- Local models (Ollama, LM Studio)
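A protocol here means structural typing: a provider does not need to inherit from anything, it only needs the right attributes and methods. The sketch below illustrates the idea with typing.Protocol. ChatProviderSketch is an assumed, reduced shape inferred from the OpenAIProvider class earlier in this article (a name plus generate), and Registry is a toy stand-in for the assistant; the module's real ChatProvider may require more methods, such as stream and health.

```python
from typing import Optional, Protocol, runtime_checkable


@runtime_checkable
class ChatProviderSketch(Protocol):
    """Assumed minimal provider shape; not the module's actual protocol."""

    name: str

    def generate(
        self, prompt: str, *, conversation: list, settings: Optional[dict] = None
    ) -> str: ...


class UppercaseProvider:
    """Toy provider: satisfies the protocol without inheriting from it."""

    name = "upper"

    def generate(
        self, prompt: str, *, conversation: list, settings: Optional[dict] = None
    ) -> str:
        return prompt.upper()


class Registry:
    """Toy stand-in for the assistant's provider registry."""

    def __init__(self) -> None:
        self._providers: dict = {}

    def register_provider(self, provider: object) -> None:
        # isinstance on a runtime_checkable protocol checks member presence only.
        if not isinstance(provider, ChatProviderSketch):
            raise TypeError("provider must define 'name' and 'generate'")
        self._providers[provider.name] = provider

    def chat(self, prompt: str, provider: str) -> str:
        return self._providers[provider].generate(prompt, conversation=[])


registry = Registry()
registry.register_provider(UppercaseProvider())
print(registry.chat("hello", provider="upper"))
```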
Next Steps
Enhance your AI agent:
- Add authentication:
rapidkit add module auth_core
- Add database for chat history:
rapidkit add module db_postgres
- Add Redis for distributed caching:
rapidkit add module redis
- Add observability:
rapidkit add module observability
- Deploy:
docker build -t ai-agent .
docker run -p 8000:8000 ai-agent
Module Customization
Override configuration via environment:
export RAPIDKIT_AI_ASSISTANT_LOG_LEVEL=DEBUG
export RAPIDKIT_AI_ASSISTANT_TIMEOUT=60
export RAPIDKIT_AI_ASSISTANT_FEATURE_FLAG=true
Custom overrides in overrides.py:
See /src/modules/free/ai/ai_assistant/overrides.py for runtime hooks.
Testing
Run integration tests:
rapidkit test
Test specific module:
poetry run pytest tests/modules/free/integration/ai/ai_assistant/ -v
Module verification:
rapidkit modules status
Troubleshooting
Provider Not Found
ProviderNotFoundError: Provider 'openai' is not registered
Solution: Check that the provider is enabled in config/ai_assistant.yaml or registered manually with assistant.register_provider().
Timeout Errors
TimeoutError: Request exceeded 30 seconds
Solution: Increase timeout:
request_timeout_seconds: 60
OpenAI API Errors
Authentication Error: Invalid API key
Solution: Check environment variable:
echo $OPENAI_API_KEY
Learn More
- 📁 Tutorial source workspace
- 📰 Published on Medium
- 📰 Published on Dev.to
- 🌐 RapidKit Website
- 📦 npm Package
- 🐍 PyPI Package
- 🧩 VS Code Extension
- 📚 AI Assistant Module Docs
You just built a production AI agent in 10 minutes.
Now scale it.