DEV Community

iws technical

FastAPI LLM Master Series - Episode 1: Building Your First LLM-Ready API from Scratch

Welcome to the FastAPI LLM Master Series! This comprehensive series will take you from FastAPI beginner to building production-ready LLM applications. In this first episode, we'll build a solid foundation with FastAPI fundamentals.

🎯 What You'll Learn

By the end of this tutorial, you'll have:

  • ✅ A complete understanding of FastAPI basics
  • ✅ Your first FastAPI application running locally
  • ✅ Automatic API documentation with Swagger UI
  • ✅ A text processing API ready for LLM integration
  • ✅ Proper project structure for scaling

🛠️ Prerequisites

  • Python 3.11+ installed on your system
  • Basic Python knowledge (functions, classes, decorators)
  • A code editor (VS Code recommended)
  • Terminal/Command Prompt access

📋 Table of Contents

  1. What is FastAPI and Why It's Perfect for LLM Apps
  2. Setting Up Your Development Environment
  3. Your First FastAPI Application
  4. Understanding Automatic Documentation
  5. Building a Text Processing API
  6. Project Structure Best Practices
  7. Next Steps

🚀 What is FastAPI and Why It's Perfect for LLM Apps {#what-is-fastapi}

FastAPI is a modern, fast web framework for building APIs with Python. Here's why it's ideal for LLM applications:

Key Benefits for LLM Development:

🏃‍♂️ High Performance

  • Built on Starlette and Pydantic
  • Comparable performance to NodeJS and Go
  • Perfect for handling concurrent LLM requests

🔒 Automatic Validation

  • Input validation prevents malformed requests to expensive LLM APIs
  • Type safety reduces bugs in production
  • Built-in security features

📚 Auto-Generated Documentation

  • Interactive API docs with Swagger UI
  • Makes testing LLM endpoints effortless
  • Perfect for team collaboration

🔄 Async/Await Support

  • Native support for asynchronous operations
  • Handle multiple LLM requests concurrently
  • Non-blocking I/O operations
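To see why this matters for LLM workloads, here's a minimal, self-contained sketch (the `fake_llm_call` coroutine is a hypothetical stand-in, not a real client) showing how `asyncio.gather` overlaps several slow upstream calls so total latency stays close to that of a single call:

```python
import asyncio
import time

async def fake_llm_call(prompt: str) -> str:
    # Stand-in for a slow upstream LLM request (hypothetical, ~100 ms each)
    await asyncio.sleep(0.1)
    return f"response to: {prompt}"

async def handle_batch(prompts: list[str]) -> list[str]:
    # gather() runs all the coroutines concurrently, so wall time is
    # roughly one call's latency rather than the sum of all of them
    return await asyncio.gather(*(fake_llm_call(p) for p in prompts))

start = time.perf_counter()
results = asyncio.run(handle_batch(["a", "b", "c"]))
elapsed = time.perf_counter() - start
print(f"{len(results)} responses in {elapsed:.2f}s")
```

Three sequential calls would take roughly 0.3 s; run concurrently they finish in about 0.1 s, which is exactly what an `async def` FastAPI endpoint buys you under load.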

🔧 Setting Up Your Development Environment {#setup-environment}

Let's create a proper development environment for our FastAPI LLM series.

Step 1: Create Project Directory

# Create main project directory
mkdir fastapi-llm-series
cd fastapi-llm-series

# Create episode 1 directory
mkdir episode-01-fundamentals
cd episode-01-fundamentals

Step 2: Set Up Virtual Environment

# Create virtual environment
python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate

You should see (venv) in your terminal prompt, indicating the virtual environment is active.

Step 3: Install Required Packages

# Install FastAPI and Uvicorn (ASGI server)
pip install fastapi uvicorn

# Install additional packages we'll use
pip install python-multipart pydantic

# Create requirements file
pip freeze > requirements.txt

Your requirements.txt should look something like this (exact versions will vary depending on when you install):

annotated-types==0.6.0
anyio==4.2.0
click==8.1.7
fastapi==0.109.0
h11==0.14.0
idna==3.6
pydantic==2.5.3
pydantic_core==2.14.6
python-multipart==0.0.6
sniffio==1.3.0
starlette==0.35.1
typing_extensions==4.9.0
uvicorn==0.27.0

🏗️ Your First FastAPI Application {#first-app}

Let's build your first FastAPI application step by step.

Step 1: Create the Main Application File

Create a file named main.py:

# main.py
from fastapi import FastAPI

# Create FastAPI instance
app = FastAPI()

# Basic root endpoint
@app.get("/")
def read_root():
    return {"message": "Welcome to FastAPI LLM Series!"}

# Health check endpoint
@app.get("/health")
def health_check():
    return {"status": "healthy", "service": "fastapi-llm-api"}

Step 2: Run Your First API

# Run the development server
uvicorn main:app --reload

You should see output like this:

INFO:     Will watch for changes in these directories: ['/path/to/your/project']
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO:     Started reloader process [12345] using WatchFiles
INFO:     Started server process [12346]
INFO:     Waiting for application startup.
INFO:     Application startup complete.

Step 3: Test Your API

Open your browser and visit:

  • http://127.0.0.1:8000/ (the welcome message from the root endpoint)
  • http://127.0.0.1:8000/health (the health check response)

Congratulations! 🎉 You've created your first FastAPI application.


📖 Understanding Automatic Documentation {#documentation}

One of FastAPI's killer features is automatic API documentation. Let's explore this.

Step 1: Access Swagger UI

Visit http://127.0.0.1:8000/docs in your browser. You'll see:

  • Interactive API documentation
  • Ability to test endpoints directly
  • Automatic request/response schemas
  • Authentication testing capabilities

Step 2: Access ReDoc

Visit http://127.0.0.1:8000/redoc for an alternative documentation view:

  • Clean, readable format
  • Great for sharing with team members
  • Mobile-friendly design

Step 3: OpenAPI Schema

Visit http://127.0.0.1:8000/openapi.json to see the raw OpenAPI schema that powers the documentation.


🔤 Building a Text Processing API {#text-processing}

Now let's build a practical text processing API that will serve as the foundation for LLM integration.

Step 1: Create Pydantic Models

Update your main.py:

# main.py
from fastapi import FastAPI
from pydantic import BaseModel, Field
from typing import Optional
import re
from datetime import datetime

# Create FastAPI instance with metadata
app = FastAPI(
    title="FastAPI LLM Series - Episode 1",
    description="A text processing API built for LLM integration",
    version="1.0.0",
    docs_url="/docs",
    redoc_url="/redoc"
)

# Pydantic models for request/response
class TextInput(BaseModel):
    text: str = Field(..., min_length=1, max_length=10000, description="Text to process")
    options: Optional[dict] = Field(default={}, description="Processing options")

class TextOutput(BaseModel):
    original_text: str
    processed_text: str
    statistics: dict
    timestamp: datetime
    processing_time_ms: float

class HealthResponse(BaseModel):
    status: str
    service: str
    timestamp: datetime

# Basic endpoints
@app.get("/", tags=["Root"])
def read_root():
    return {"message": "Welcome to FastAPI LLM Series!", "episode": 1}

@app.get("/health", response_model=HealthResponse, tags=["Health"])
def health_check():
    return HealthResponse(
        status="healthy",
        service="fastapi-llm-api",
        timestamp=datetime.now()
    )

Step 2: Add Text Processing Endpoints

Add these functions to your main.py:

# Text processing functions
def analyze_text(text: str) -> dict:
    """Analyze text and return statistics"""
    words = text.split()
    sentences = text.split('.')
    paragraphs = text.split('\n\n')

    return {
        "character_count": len(text),
        "character_count_no_spaces": len(text.replace(' ', '')),
        "word_count": len(words),
        "sentence_count": len([s for s in sentences if s.strip()]),
        "paragraph_count": len([p for p in paragraphs if p.strip()]),
        "average_word_length": sum(len(word) for word in words) / len(words) if words else 0,
        "reading_time_minutes": len(words) / 200  # Average reading speed
    }

def clean_text(text: str, options: dict | None = None) -> str:
    """Clean and normalize text"""
    options = options or {}  # avoid a shared mutable default argument
    cleaned = text

    # Remove extra whitespace
    if options.get("remove_extra_whitespace", True):
        cleaned = re.sub(r'\s+', ' ', cleaned)

    # Remove special characters
    if options.get("remove_special_chars", False):
        cleaned = re.sub(r'[^\w\s]', '', cleaned)

    # Convert to lowercase
    if options.get("lowercase", False):
        cleaned = cleaned.lower()

    # Remove numbers
    if options.get("remove_numbers", False):
        cleaned = re.sub(r'\d+', '', cleaned)

    return cleaned.strip()

# Text processing endpoints
@app.post("/process-text", response_model=TextOutput, tags=["Text Processing"])
async def process_text(input_data: TextInput):
    """
    Process text with cleaning and analysis

    This endpoint will be the foundation for LLM preprocessing in future episodes.
    """
    import time
    start_time = time.time()

    # Process the text
    processed_text = clean_text(input_data.text, input_data.options)

    # Analyze the text
    stats = analyze_text(input_data.text)

    # Calculate processing time
    processing_time = (time.time() - start_time) * 1000

    return TextOutput(
        original_text=input_data.text,
        processed_text=processed_text,
        statistics=stats,
        timestamp=datetime.now(),
        processing_time_ms=round(processing_time, 2)
    )

@app.post("/analyze-text", tags=["Text Processing"])
async def analyze_text_endpoint(input_data: TextInput):
    """
    Analyze text and return detailed statistics

    Useful for understanding text before sending to LLM APIs.
    """
    stats = analyze_text(input_data.text)

    return {
        "text_preview": input_data.text[:100] + "..." if len(input_data.text) > 100 else input_data.text,
        "statistics": stats,
        "llm_ready": len(input_data.text.split()) > 10 and len(input_data.text) < 8000,
        "estimated_tokens": len(input_data.text.split()) * 1.3  # Rough estimation
    }
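The `* 1.3` multiplier above is only a rule of thumb (many English-text tokenizers average roughly 1.3 tokens per word); for billing-accurate counts you would use your provider's actual tokenizer. As a standalone sketch of the heuristic:

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~1.3 tokens per whitespace-separated word.
    # Real counts vary by tokenizer and by language.
    return round(len(text.split()) * 1.3)

print(estimate_tokens("FastAPI makes building LLM APIs fast"))  # → 8
```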

Step 3: Add Error Handling

Add proper error handling to your API:

from fastapi.responses import JSONResponse

# Add this after your existing imports
@app.exception_handler(ValueError)
async def value_error_handler(request, exc):
    return JSONResponse(
        status_code=400,
        content={"error": "Invalid input", "detail": str(exc)}
    )

# Add a text validation endpoint
@app.post("/validate-text", tags=["Text Processing"])
async def validate_text(input_data: TextInput):
    """
    Validate text for LLM processing

    Checks for common issues that might cause problems with LLM APIs.
    """
    issues = []

    # Check text length
    if len(input_data.text) < 10:
        issues.append("Text too short (minimum 10 characters)")

    if len(input_data.text) > 8000:
        issues.append("Text too long (maximum 8000 characters for most LLMs)")

    # Check for potential prompt injection
    suspicious_patterns = [
        r"ignore previous instructions",
        r"system prompt",
        r"you are now",
        r"forget everything",
    ]

    for pattern in suspicious_patterns:
        if re.search(pattern, input_data.text, re.IGNORECASE):
            issues.append(f"Potential prompt injection detected: {pattern}")

    return {
        "is_valid": len(issues) == 0,
        "issues": issues,
        "text_stats": analyze_text(input_data.text)
    }

Step 4: Test Your Text Processing API

Restart your server:

uvicorn main:app --reload

Now visit http://127.0.0.1:8000/docs and test these endpoints:

  1. POST /process-text: Test with sample text
  2. POST /analyze-text: Analyze text statistics
  3. POST /validate-text: Check text for LLM readiness

Sample test data:

{
  "text": "This is a sample text for processing. It contains multiple sentences and should demonstrate the API capabilities nicely!",
  "options": {
    "remove_extra_whitespace": true,
    "lowercase": false,
    "remove_special_chars": false
  }
}

📁 Project Structure Best Practices {#project-structure}

Let's organize our code properly for scaling. Create this file structure:

episode-01-fundamentals/
├── app/
│   ├── __init__.py
│   ├── main.py
│   ├── models.py
│   ├── services/
│   │   ├── __init__.py
│   │   └── text_processing.py
│   └── routers/
│       ├── __init__.py
│       └── text.py
├── tests/
│   ├── __init__.py
│   └── test_main.py
├── requirements.txt
└── README.md

Step 1: Create the App Structure

# Create directories
mkdir app app/services app/routers tests

# Create __init__.py files (on Windows, where touch is unavailable, create these empty files in your editor)
touch app/__init__.py app/services/__init__.py app/routers/__init__.py tests/__init__.py

Step 2: Move Models to Separate File

Create app/models.py:

# app/models.py
from pydantic import BaseModel, Field
from typing import Optional
from datetime import datetime

class TextInput(BaseModel):
    text: str = Field(..., min_length=1, max_length=10000, description="Text to process")
    options: Optional[dict] = Field(default={}, description="Processing options")

class TextOutput(BaseModel):
    original_text: str
    processed_text: str
    statistics: dict
    timestamp: datetime
    processing_time_ms: float

class HealthResponse(BaseModel):
    status: str
    service: str
    timestamp: datetime

class TextValidationResponse(BaseModel):
    is_valid: bool
    issues: list
    text_stats: dict

Step 3: Create Text Processing Service

Create app/services/text_processing.py:

# app/services/text_processing.py
import re
from typing import Dict

class TextProcessor:
    """Service class for text processing operations"""

    @staticmethod
    def analyze_text(text: str) -> Dict:
        """Analyze text and return statistics"""
        words = text.split()
        sentences = text.split('.')
        paragraphs = text.split('\n\n')

        return {
            "character_count": len(text),
            "character_count_no_spaces": len(text.replace(' ', '')),
            "word_count": len(words),
            "sentence_count": len([s for s in sentences if s.strip()]),
            "paragraph_count": len([p for p in paragraphs if p.strip()]),
            "average_word_length": sum(len(word) for word in words) / len(words) if words else 0,
            "reading_time_minutes": len(words) / 200
        }

    @staticmethod
    def clean_text(text: str, options: Dict | None = None) -> str:
        """Clean and normalize text"""
        options = options or {}  # avoid a shared mutable default argument
        cleaned = text

        if options.get("remove_extra_whitespace", True):
            cleaned = re.sub(r'\s+', ' ', cleaned)

        if options.get("remove_special_chars", False):
            cleaned = re.sub(r'[^\w\s]', '', cleaned)

        if options.get("lowercase", False):
            cleaned = cleaned.lower()

        if options.get("remove_numbers", False):
            cleaned = re.sub(r'\d+', '', cleaned)

        return cleaned.strip()

    @staticmethod
    def validate_for_llm(text: str) -> Dict:
        """Validate text for LLM processing"""
        issues = []

        if len(text) < 10:
            issues.append("Text too short (minimum 10 characters)")

        if len(text) > 8000:
            issues.append("Text too long (maximum 8000 characters)")

        suspicious_patterns = [
            r"ignore previous instructions",
            r"system prompt",
            r"you are now",
            r"forget everything",
        ]

        for pattern in suspicious_patterns:
            if re.search(pattern, text, re.IGNORECASE):
                issues.append(f"Potential prompt injection detected: {pattern}")

        return {
            "is_valid": len(issues) == 0,
            "issues": issues
        }

Step 4: Create Router

Create app/routers/text.py:

# app/routers/text.py
from fastapi import APIRouter
from app.models import TextInput, TextOutput, TextValidationResponse
from app.services.text_processing import TextProcessor
from datetime import datetime
import time

router = APIRouter(prefix="/text", tags=["Text Processing"])

@router.post("/process", response_model=TextOutput)
async def process_text(input_data: TextInput):
    """Process text with cleaning and analysis"""
    start_time = time.time()

    processor = TextProcessor()
    processed_text = processor.clean_text(input_data.text, input_data.options)
    stats = processor.analyze_text(input_data.text)

    processing_time = (time.time() - start_time) * 1000

    return TextOutput(
        original_text=input_data.text,
        processed_text=processed_text,
        statistics=stats,
        timestamp=datetime.now(),
        processing_time_ms=round(processing_time, 2)
    )

@router.post("/analyze")
async def analyze_text(input_data: TextInput):
    """Analyze text and return detailed statistics"""
    processor = TextProcessor()
    stats = processor.analyze_text(input_data.text)

    return {
        "text_preview": input_data.text[:100] + "..." if len(input_data.text) > 100 else input_data.text,
        "statistics": stats,
        "llm_ready": len(input_data.text.split()) > 10 and len(input_data.text) < 8000,
        "estimated_tokens": len(input_data.text.split()) * 1.3
    }

@router.post("/validate", response_model=TextValidationResponse)
async def validate_text(input_data: TextInput):
    """Validate text for LLM processing"""
    processor = TextProcessor()
    validation_result = processor.validate_for_llm(input_data.text)
    text_stats = processor.analyze_text(input_data.text)

    return TextValidationResponse(
        is_valid=validation_result["is_valid"],
        issues=validation_result["issues"],
        text_stats=text_stats
    )

Step 5: Update Main Application

Update app/main.py:

# app/main.py
from fastapi import FastAPI
from fastapi.responses import JSONResponse
from app.models import HealthResponse
from app.routers import text
from datetime import datetime

# Create FastAPI instance
app = FastAPI(
    title="FastAPI LLM Series - Episode 1",
    description="A text processing API built for LLM integration",
    version="1.0.0",
    docs_url="/docs",
    redoc_url="/redoc"
)

# Include routers
app.include_router(text.router)

# Root endpoints
@app.get("/", tags=["Root"])
def read_root():
    return {
        "message": "Welcome to FastAPI LLM Series!",
        "episode": 1,
        "title": "FastAPI Fundamentals",
        "endpoints": [
            "/docs - Interactive API documentation",
            "/health - Health check",
            "/text/process - Process text",
            "/text/analyze - Analyze text",
            "/text/validate - Validate text for LLM"
        ]
    }

@app.get("/health", response_model=HealthResponse, tags=["Health"])
def health_check():
    return HealthResponse(
        status="healthy",
        service="fastapi-llm-api",
        timestamp=datetime.now()
    )

# Error handlers
@app.exception_handler(ValueError)
async def value_error_handler(request, exc):
    return JSONResponse(
        status_code=400,
        content={"error": "Invalid input", "detail": str(exc)}
    )

Step 6: Create README

Create README.md:

# FastAPI LLM Series - Episode 1: Fundamentals

A text processing API built with FastAPI, designed as the foundation for LLM integration.

## Features

- ✅ Text processing and cleaning
- ✅ Text analysis and statistics
- ✅ LLM-ready text validation
- ✅ Automatic API documentation
- ✅ Type-safe request/response models

## Installation

    pip install -r requirements.txt

## Running the Application

    uvicorn app.main:app --reload

## API Documentation

Visit `http://127.0.0.1:8000/docs` for interactive API documentation.

## Endpoints

- `GET /` - Welcome message
- `GET /health` - Health check
- `POST /text/process` - Process and clean text
- `POST /text/analyze` - Analyze text statistics
- `POST /text/validate` - Validate text for LLM processing

## Next Episode

Episode 2 will cover advanced request handling and data validation.

Step 7: Run Your Refactored Application

# Run from the project root
uvicorn app.main:app --reload

🎯 Next Steps {#next-steps}

Congratulations! You've built a solid foundation for LLM applications with FastAPI. Here's what you've accomplished:

✅ What You Built

  • Complete FastAPI application with proper structure
  • Text processing service ready for LLM integration
  • Automatic API documentation
  • Error handling and validation
  • Scalable project architecture

🔄 What's Coming in Episode 2

  • Advanced Pydantic models and validation
  • File upload handling
  • Custom exception handling
  • Input sanitization for LLM security
  • Request/response middleware

🚀 Challenge Yourself

Try these exercises before Episode 2:

  1. Add a word count endpoint that returns only word count
  2. Implement text summarization using basic extractive methods
  3. Add rate limiting to prevent API abuse
  4. Create unit tests for your text processing functions


💡 Key Takeaways

  1. FastAPI's automatic validation prevents malformed requests from reaching expensive LLM APIs
  2. Type hints make your code more maintainable and catch errors early
  3. Proper project structure is crucial for scaling LLM applications
  4. Built-in documentation makes API testing and team collaboration effortless

Ready for Episode 2? We'll dive deep into advanced request handling and build more sophisticated validation systems that will protect your LLM endpoints from malicious inputs.


This is part of the FastAPI LLM Master Series. Follow along for the complete journey from basics to production-ready LLM applications!

Next Episode Preview: Advanced Request Handling & Data Validation - Building bulletproof APIs for LLM integration 🛡️


🏷️ Tags

#FastAPI #Python #LLM #API #WebDevelopment #MachineLearning #Tutorial #Beginners
