DEV Community

iws technical

FastAPI LLM Master Series - Episode 1: Building Your First LLM-Ready API from Scratch

Welcome to the FastAPI LLM Master Series! This comprehensive series will take you from FastAPI beginner to building production-ready LLM applications. In this first episode, we'll build a solid foundation with FastAPI fundamentals.

🎯 What You'll Learn

By the end of this tutorial, you'll have:

  • ✅ A complete understanding of FastAPI basics
  • ✅ Your first FastAPI application running locally
  • ✅ Automatic API documentation with Swagger UI
  • ✅ A text processing API ready for LLM integration
  • ✅ Proper project structure for scaling

🛠️ Prerequisites

  • Python 3.11+ installed on your system
  • Basic Python knowledge (functions, classes, decorators)
  • A code editor (VS Code recommended)
  • Terminal/Command Prompt access

📋 Table of Contents

  1. What is FastAPI and Why It's Perfect for LLM Apps
  2. Setting Up Your Development Environment
  3. Your First FastAPI Application
  4. Understanding Automatic Documentation
  5. Building a Text Processing API
  6. Project Structure Best Practices
  7. Next Steps

🚀 What is FastAPI and Why It's Perfect for LLM Apps {#what-is-fastapi}

FastAPI is a modern, fast web framework for building APIs with Python. Here's why it's ideal for LLM applications:

Key Benefits for LLM Development:

🏃‍♂️ High Performance

  • Built on Starlette and Pydantic
  • Comparable performance to NodeJS and Go
  • Perfect for handling concurrent LLM requests

🔒 Automatic Validation

  • Input validation prevents malformed requests to expensive LLM APIs
  • Type safety reduces bugs in production
  • Built-in security features

📚 Auto-Generated Documentation

  • Interactive API docs with Swagger UI
  • Makes testing LLM endpoints effortless
  • Perfect for team collaboration

🔄 Async/Await Support

  • Native support for asynchronous operations
  • Handle multiple LLM requests concurrently
  • Non-blocking I/O operations
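To see why this matters for LLM workloads, here's a minimal, self-contained sketch (the `fake_llm_call` coroutine is a hypothetical stand-in, not a real client) showing how `asyncio.gather` overlaps several slow upstream calls so total latency stays close to that of a single call:

```python
import asyncio
import time

async def fake_llm_call(prompt: str) -> str:
    # Stand-in for a slow upstream LLM request (hypothetical, ~100 ms each)
    await asyncio.sleep(0.1)
    return f"response to: {prompt}"

async def handle_batch(prompts: list[str]) -> list[str]:
    # gather() runs all the coroutines concurrently, so wall time is
    # roughly one call's latency rather than the sum of all of them
    return await asyncio.gather(*(fake_llm_call(p) for p in prompts))

start = time.perf_counter()
results = asyncio.run(handle_batch(["a", "b", "c"]))
elapsed = time.perf_counter() - start
print(f"{len(results)} responses in {elapsed:.2f}s")
```

Three sequential calls would take roughly 0.3 s; run concurrently they finish in about 0.1 s, which is exactly what an `async def` FastAPI endpoint buys you under load.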

🔧 Setting Up Your Development Environment {#setup-environment}

Let's create a proper development environment for our FastAPI LLM series.

Step 1: Create Project Directory

# Create main project directory
mkdir fastapi-llm-series
cd fastapi-llm-series

# Create episode 1 directory
mkdir episode-01-fundamentals
cd episode-01-fundamentals

Step 2: Set Up Virtual Environment

# Create virtual environment
python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate

You should see (venv) in your terminal prompt, indicating the virtual environment is active.

Step 3: Install Required Packages

# Install FastAPI and Uvicorn (ASGI server)
pip install fastapi uvicorn

# Install additional packages we'll use
pip install python-multipart pydantic

# Create requirements file
pip freeze > requirements.txt

Your requirements.txt should look something like this (exact versions will vary depending on when you install):

annotated-types==0.6.0
anyio==4.2.0
click==8.1.7
fastapi==0.109.0
h11==0.14.0
idna==3.6
pydantic==2.5.3
pydantic_core==2.14.6
python-multipart==0.0.6
sniffio==1.3.0
starlette==0.35.1
typing_extensions==4.9.0
uvicorn==0.27.0

🏗️ Your First FastAPI Application {#first-app}

Let's build your first FastAPI application step by step.

Step 1: Create the Main Application File

Create a file named main.py:

# main.py
from fastapi import FastAPI

# Create FastAPI instance
app = FastAPI()

# Basic root endpoint
@app.get("/")
def read_root():
    return {"message": "Welcome to FastAPI LLM Series!"}

# Health check endpoint
@app.get("/health")
def health_check():
    return {"status": "healthy", "service": "fastapi-llm-api"}

Step 2: Run Your First API

# Run the development server
uvicorn main:app --reload

You should see output like this:

INFO:     Will watch for changes in these directories: ['/path/to/your/project']
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO:     Started reloader process [12345] using WatchFiles
INFO:     Started server process [12346]
INFO:     Waiting for application startup.
INFO:     Application startup complete.

Step 3: Test Your API

Open your browser and visit:

  • http://127.0.0.1:8000/ (the welcome message from the root endpoint)
  • http://127.0.0.1:8000/health (the health check response)

Congratulations! 🎉 You've created your first FastAPI application.


📖 Understanding Automatic Documentation {#documentation}

One of FastAPI's killer features is automatic API documentation. Let's explore this.

Step 1: Access Swagger UI

Visit http://127.0.0.1:8000/docs in your browser. You'll see:

  • Interactive API documentation
  • Ability to test endpoints directly
  • Automatic request/response schemas
  • Authentication testing capabilities

Step 2: Access ReDoc

Visit http://127.0.0.1:8000/redoc for an alternative documentation view:

  • Clean, readable format
  • Great for sharing with team members
  • Mobile-friendly design

Step 3: OpenAPI Schema

Visit http://127.0.0.1:8000/openapi.json to see the raw OpenAPI schema that powers the documentation.


🔤 Building a Text Processing API {#text-processing}

Now let's build a practical text processing API that will serve as the foundation for LLM integration.

Step 1: Create Pydantic Models

Update your main.py:

# main.py
from fastapi import FastAPI
from pydantic import BaseModel, Field
from typing import Optional
import re
from datetime import datetime

# Create FastAPI instance with metadata
app = FastAPI(
    title="FastAPI LLM Series - Episode 1",
    description="A text processing API built for LLM integration",
    version="1.0.0",
    docs_url="/docs",
    redoc_url="/redoc"
)

# Pydantic models for request/response
class TextInput(BaseModel):
    text: str = Field(..., min_length=1, max_length=10000, description="Text to process")
    options: Optional[dict] = Field(default={}, description="Processing options")

class TextOutput(BaseModel):
    original_text: str
    processed_text: str
    statistics: dict
    timestamp: datetime
    processing_time_ms: float

class HealthResponse(BaseModel):
    status: str
    service: str
    timestamp: datetime

# Basic endpoints
@app.get("/", tags=["Root"])
def read_root():
    return {"message": "Welcome to FastAPI LLM Series!", "episode": 1}

@app.get("/health", response_model=HealthResponse, tags=["Health"])
def health_check():
    return HealthResponse(
        status="healthy",
        service="fastapi-llm-api",
        timestamp=datetime.now()
    )

Step 2: Add Text Processing Endpoints

Add these functions to your main.py:

# Text processing functions
def analyze_text(text: str) -> dict:
    """Analyze text and return statistics"""
    words = text.split()
    sentences = text.split('.')
    paragraphs = text.split('\n\n')

    return {
        "character_count": len(text),
        "character_count_no_spaces": len(text.replace(' ', '')),
        "word_count": len(words),
        "sentence_count": len([s for s in sentences if s.strip()]),
        "paragraph_count": len([p for p in paragraphs if p.strip()]),
        "average_word_length": sum(len(word) for word in words) / len(words) if words else 0,
        "reading_time_minutes": len(words) / 200  # Average reading speed
    }

def clean_text(text: str, options: dict | None = None) -> str:
    """Clean and normalize text"""
    options = options or {}  # avoid a shared mutable default argument
    cleaned = text

    # Remove extra whitespace
    if options.get("remove_extra_whitespace", True):
        cleaned = re.sub(r'\s+', ' ', cleaned)

    # Remove special characters
    if options.get("remove_special_chars", False):
        cleaned = re.sub(r'[^\w\s]', '', cleaned)

    # Convert to lowercase
    if options.get("lowercase", False):
        cleaned = cleaned.lower()

    # Remove numbers
    if options.get("remove_numbers", False):
        cleaned = re.sub(r'\d+', '', cleaned)

    return cleaned.strip()

# Text processing endpoints
@app.post("/process-text", response_model=TextOutput, tags=["Text Processing"])
async def process_text(input_data: TextInput):
    """
    Process text with cleaning and analysis

    This endpoint will be the foundation for LLM preprocessing in future episodes.
    """
    import time
    start_time = time.time()

    # Process the text
    processed_text = clean_text(input_data.text, input_data.options)

    # Analyze the text
    stats = analyze_text(input_data.text)

    # Calculate processing time
    processing_time = (time.time() - start_time) * 1000

    return TextOutput(
        original_text=input_data.text,
        processed_text=processed_text,
        statistics=stats,
        timestamp=datetime.now(),
        processing_time_ms=round(processing_time, 2)
    )

@app.post("/analyze-text", tags=["Text Processing"])
async def analyze_text_endpoint(input_data: TextInput):
    """
    Analyze text and return detailed statistics

    Useful for understanding text before sending to LLM APIs.
    """
    stats = analyze_text(input_data.text)

    return {
        "text_preview": input_data.text[:100] + "..." if len(input_data.text) > 100 else input_data.text,
        "statistics": stats,
        "llm_ready": len(input_data.text.split()) > 10 and len(input_data.text) < 8000,
        "estimated_tokens": len(input_data.text.split()) * 1.3  # Rough estimation
    }
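The `* 1.3` multiplier above is only a rule of thumb (many English-text tokenizers average roughly 1.3 tokens per word); for billing-accurate counts you would use your provider's actual tokenizer. As a standalone sketch of the heuristic:

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~1.3 tokens per whitespace-separated word.
    # Real counts vary by tokenizer and by language.
    return round(len(text.split()) * 1.3)

print(estimate_tokens("FastAPI makes building LLM APIs fast"))  # → 8
```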

Step 3: Add Error Handling

Add proper error handling to your API:

from fastapi.responses import JSONResponse

# Add this after your existing imports
@app.exception_handler(ValueError)
async def value_error_handler(request, exc):
    return JSONResponse(
        status_code=400,
        content={"error": "Invalid input", "detail": str(exc)}
    )

# Add a text validation endpoint
@app.post("/validate-text", tags=["Text Processing"])
async def validate_text(input_data: TextInput):
    """
    Validate text for LLM processing

    Checks for common issues that might cause problems with LLM APIs.
    """
    issues = []

    # Check text length
    if len(input_data.text) < 10:
        issues.append("Text too short (minimum 10 characters)")

    if len(input_data.text) > 8000:
        issues.append("Text too long (maximum 8000 characters for most LLMs)")

    # Check for potential prompt injection
    suspicious_patterns = [
        r"ignore previous instructions",
        r"system prompt",
        r"you are now",
        r"forget everything",
    ]

    for pattern in suspicious_patterns:
        if re.search(pattern, input_data.text, re.IGNORECASE):
            issues.append(f"Potential prompt injection detected: {pattern}")

    return {
        "is_valid": len(issues) == 0,
        "issues": issues,
        "text_stats": analyze_text(input_data.text)
    }

Step 4: Test Your Text Processing API

Restart your server:

uvicorn main:app --reload

Now visit http://127.0.0.1:8000/docs and test these endpoints:

  1. POST /process-text: Test with sample text
  2. POST /analyze-text: Analyze text statistics
  3. POST /validate-text: Check text for LLM readiness

Sample test data:

{
  "text": "This is a sample text for processing. It contains multiple sentences and should demonstrate the API capabilities nicely!",
  "options": {
    "remove_extra_whitespace": true,
    "lowercase": false,
    "remove_special_chars": false
  }
}

📁 Project Structure Best Practices {#project-structure}

Let's organize our code properly for scaling. Create this file structure:

episode-01-fundamentals/
├── app/
│   ├── __init__.py
│   ├── main.py
│   ├── models.py
│   ├── services/
│   │   ├── __init__.py
│   │   └── text_processing.py
│   └── routers/
│       ├── __init__.py
│       └── text.py
├── tests/
│   ├── __init__.py
│   └── test_main.py
├── requirements.txt
└── README.md

Step 1: Create the App Structure

# Create directories
mkdir app app/services app/routers tests

# Create __init__.py files (on Windows, where touch is unavailable, create these empty files in your editor)
touch app/__init__.py app/services/__init__.py app/routers/__init__.py tests/__init__.py

Step 2: Move Models to Separate File

Create app/models.py:

# app/models.py
from pydantic import BaseModel, Field
from typing import Optional
from datetime import datetime

class TextInput(BaseModel):
    text: str = Field(..., min_length=1, max_length=10000, description="Text to process")
    options: Optional[dict] = Field(default={}, description="Processing options")

class TextOutput(BaseModel):
    original_text: str
    processed_text: str
    statistics: dict
    timestamp: datetime
    processing_time_ms: float

class HealthResponse(BaseModel):
    status: str
    service: str
    timestamp: datetime

class TextValidationResponse(BaseModel):
    is_valid: bool
    issues: list
    text_stats: dict

Step 3: Create Text Processing Service

Create app/services/text_processing.py:

# app/services/text_processing.py
import re
from typing import Dict

class TextProcessor:
    """Service class for text processing operations"""

    @staticmethod
    def analyze_text(text: str) -> Dict:
        """Analyze text and return statistics"""
        words = text.split()
        sentences = text.split('.')
        paragraphs = text.split('\n\n')

        return {
            "character_count": len(text),
            "character_count_no_spaces": len(text.replace(' ', '')),
            "word_count": len(words),
            "sentence_count": len([s for s in sentences if s.strip()]),
            "paragraph_count": len([p for p in paragraphs if p.strip()]),
            "average_word_length": sum(len(word) for word in words) / len(words) if words else 0,
            "reading_time_minutes": len(words) / 200
        }

    @staticmethod
    def clean_text(text: str, options: Dict | None = None) -> str:
        """Clean and normalize text"""
        options = options or {}  # avoid a shared mutable default argument
        cleaned = text

        if options.get("remove_extra_whitespace", True):
            cleaned = re.sub(r'\s+', ' ', cleaned)

        if options.get("remove_special_chars", False):
            cleaned = re.sub(r'[^\w\s]', '', cleaned)

        if options.get("lowercase", False):
            cleaned = cleaned.lower()

        if options.get("remove_numbers", False):
            cleaned = re.sub(r'\d+', '', cleaned)

        return cleaned.strip()

    @staticmethod
    def validate_for_llm(text: str) -> Dict:
        """Validate text for LLM processing"""
        issues = []

        if len(text) < 10:
            issues.append("Text too short (minimum 10 characters)")

        if len(text) > 8000:
            issues.append("Text too long (maximum 8000 characters)")

        suspicious_patterns = [
            r"ignore previous instructions",
            r"system prompt",
            r"you are now",
            r"forget everything",
        ]

        for pattern in suspicious_patterns:
            if re.search(pattern, text, re.IGNORECASE):
                issues.append(f"Potential prompt injection detected: {pattern}")

        return {
            "is_valid": len(issues) == 0,
            "issues": issues
        }

Step 4: Create Router

Create app/routers/text.py:

# app/routers/text.py
from fastapi import APIRouter
from app.models import TextInput, TextOutput, TextValidationResponse
from app.services.text_processing import TextProcessor
from datetime import datetime
import time

router = APIRouter(prefix="/text", tags=["Text Processing"])

@router.post("/process", response_model=TextOutput)
async def process_text(input_data: TextInput):
    """Process text with cleaning and analysis"""
    start_time = time.time()

    processor = TextProcessor()
    processed_text = processor.clean_text(input_data.text, input_data.options)
    stats = processor.analyze_text(input_data.text)

    processing_time = (time.time() - start_time) * 1000

    return TextOutput(
        original_text=input_data.text,
        processed_text=processed_text,
        statistics=stats,
        timestamp=datetime.now(),
        processing_time_ms=round(processing_time, 2)
    )

@router.post("/analyze")
async def analyze_text(input_data: TextInput):
    """Analyze text and return detailed statistics"""
    processor = TextProcessor()
    stats = processor.analyze_text(input_data.text)

    return {
        "text_preview": input_data.text[:100] + "..." if len(input_data.text) > 100 else input_data.text,
        "statistics": stats,
        "llm_ready": len(input_data.text.split()) > 10 and len(input_data.text) < 8000,
        "estimated_tokens": len(input_data.text.split()) * 1.3
    }

@router.post("/validate", response_model=TextValidationResponse)
async def validate_text(input_data: TextInput):
    """Validate text for LLM processing"""
    processor = TextProcessor()
    validation_result = processor.validate_for_llm(input_data.text)
    text_stats = processor.analyze_text(input_data.text)

    return TextValidationResponse(
        is_valid=validation_result["is_valid"],
        issues=validation_result["issues"],
        text_stats=text_stats
    )

Step 5: Update Main Application

Update app/main.py:

# app/main.py
from fastapi import FastAPI
from fastapi.responses import JSONResponse
from app.models import HealthResponse
from app.routers import text
from datetime import datetime

# Create FastAPI instance
app = FastAPI(
    title="FastAPI LLM Series - Episode 1",
    description="A text processing API built for LLM integration",
    version="1.0.0",
    docs_url="/docs",
    redoc_url="/redoc"
)

# Include routers
app.include_router(text.router)

# Root endpoints
@app.get("/", tags=["Root"])
def read_root():
    return {
        "message": "Welcome to FastAPI LLM Series!",
        "episode": 1,
        "title": "FastAPI Fundamentals",
        "endpoints": [
            "/docs - Interactive API documentation",
            "/health - Health check",
            "/text/process - Process text",
            "/text/analyze - Analyze text",
            "/text/validate - Validate text for LLM"
        ]
    }

@app.get("/health", response_model=HealthResponse, tags=["Health"])
def health_check():
    return HealthResponse(
        status="healthy",
        service="fastapi-llm-api",
        timestamp=datetime.now()
    )

# Error handlers
@app.exception_handler(ValueError)
async def value_error_handler(request, exc):
    return JSONResponse(
        status_code=400,
        content={"error": "Invalid input", "detail": str(exc)}
    )

Step 6: Create README

Create README.md:

# FastAPI LLM Series - Episode 1: Fundamentals

A text processing API built with FastAPI, designed as the foundation for LLM integration.

## Features

- ✅ Text processing and cleaning
- ✅ Text analysis and statistics
- ✅ LLM-ready text validation
- ✅ Automatic API documentation
- ✅ Type-safe request/response models

## Installation

    pip install -r requirements.txt

## Running the Application

    uvicorn app.main:app --reload

## API Documentation

Visit `http://127.0.0.1:8000/docs` for interactive API documentation.

## Endpoints

- `GET /` - Welcome message
- `GET /health` - Health check
- `POST /text/process` - Process and clean text
- `POST /text/analyze` - Analyze text statistics
- `POST /text/validate` - Validate text for LLM processing

## Next Episode

Episode 2 will cover advanced request handling and data validation.

Step 7: Run Your Refactored Application

# Run from the project root
uvicorn app.main:app --reload

🎯 Next Steps {#next-steps}

Congratulations! You've built a solid foundation for LLM applications with FastAPI. Here's what you've accomplished:

✅ What You Built

  • Complete FastAPI application with proper structure
  • Text processing service ready for LLM integration
  • Automatic API documentation
  • Error handling and validation
  • Scalable project architecture

🔄 What's Coming in Episode 2

  • Advanced Pydantic models and validation
  • File upload handling
  • Custom exception handling
  • Input sanitization for LLM security
  • Request/response middleware

🚀 Challenge Yourself

Try these exercises before Episode 2:

  1. Add a word count endpoint that returns only word count
  2. Implement text summarization using basic extractive methods
  3. Add rate limiting to prevent API abuse
  4. Create unit tests for your text processing functions


💡 Key Takeaways

  1. FastAPI's automatic validation prevents malformed requests from reaching expensive LLM APIs
  2. Type hints make your code more maintainable and catch errors early
  3. Proper project structure is crucial for scaling LLM applications
  4. Built-in documentation makes API testing and team collaboration effortless

Ready for Episode 2? We'll dive deep into advanced request handling and build more sophisticated validation systems that will protect your LLM endpoints from malicious inputs.


This is part of the FastAPI LLM Master Series. Follow along for the complete journey from basics to production-ready LLM applications!

Next Episode Preview: Advanced Request Handling & Data Validation - Building bulletproof APIs for LLM integration 🛡️


🏷️ Tags

#FastAPI #Python #LLM #API #WebDevelopment #MachineLearning #Tutorial #Beginners
