Most FastAPI tutorials stop at "Hello World" and leave you stranded when it's time to deploy something real. This guide bridges that gap — here's everything you need to build, secure, and ship a production-grade FastAPI application that won't embarrass you at 3 AM when the pager goes off.
FastAPI has cemented itself as the go-to Python API framework, and for good reason: it's fast, it's type-safe, and its automatic OpenAPI documentation alone saves hours of developer time. But running it in production requires a layered approach that most developers only learn the hard way. Let's shortcut that process.
Project Structure That Scales
Before writing a single endpoint, get your structure right. The most common mistake developers make is starting with a flat file and refactoring under pressure.
```
app/
├── api/
│   ├── v1/
│   │   ├── endpoints/
│   │   │   ├── users.py
│   │   │   ├── products.py
│   │   │   └── auth.py
│   │   └── router.py
│   └── dependencies.py
├── core/
│   ├── config.py
│   ├── security.py
│   └── logging.py
├── db/
│   ├── base.py
│   ├── session.py
│   └── models/
├── schemas/
│   ├── user.py
│   └── product.py
├── services/
│   ├── user_service.py
│   └── product_service.py
├── tests/
│   ├── conftest.py
│   └── api/
├── main.py
└── pyproject.toml
```
This structure enforces separation of concerns: routers handle HTTP logic, services handle business logic, and models handle data persistence. Your endpoints never touch the database directly — that's what services are for.
Configuration Management Done Right
Hard-coded settings are a production incident waiting to happen. Use Pydantic Settings to enforce type-safe configuration with validation at startup.
```python
# app/core/config.py
from functools import lru_cache
from typing import Literal

from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        case_sensitive=False,
    )

    # App
    app_name: str = "My API"
    environment: Literal["development", "staging", "production"] = "development"
    debug: bool = False
    api_v1_prefix: str = "/api/v1"

    # Database
    database_url: str
    db_pool_size: int = 10
    db_max_overflow: int = 20

    # Security
    secret_key: str
    access_token_expire_minutes: int = 30
    refresh_token_expire_days: int = 7
    algorithm: str = "HS256"

    # Redis
    redis_url: str = "redis://localhost:6379"

    # Rate limiting
    rate_limit_requests: int = 100
    rate_limit_window_seconds: int = 60


@lru_cache
def get_settings() -> Settings:
    return Settings()
```
The @lru_cache decorator is critical here — it ensures the settings object is instantiated once and reused across the application, rather than re-reading environment variables on every request.
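For reference, a matching .env might look like the sketch below (every value is a placeholder). Note that database_url and secret_key have no defaults in the Settings class, so Pydantic refuses to start the app at all if they are missing, which is exactly the fail-fast behavior you want:

```ini
# .env -- example values only, never commit real secrets
ENVIRONMENT=production
DEBUG=false
DATABASE_URL=postgresql+asyncpg://user:pass@db:5432/app
SECRET_KEY=change-me-generate-with-openssl-rand-hex-32
REDIS_URL=redis://redis:6379/0
```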
Async Database Sessions with SQLAlchemy 2.x
The async story for SQLAlchemy has matured significantly. Here's a session management pattern that plays well with FastAPI's dependency injection:
```python
# app/db/session.py
from collections.abc import AsyncGenerator

from sqlalchemy.ext.asyncio import (
    AsyncSession,
    async_sessionmaker,
    create_async_engine,
)

from app.core.config import get_settings

settings = get_settings()

engine = create_async_engine(
    settings.database_url,
    pool_size=settings.db_pool_size,
    max_overflow=settings.db_max_overflow,
    pool_pre_ping=True,  # Detect stale connections before handing them out
    echo=settings.debug,
)

AsyncSessionLocal = async_sessionmaker(
    engine,
    class_=AsyncSession,
    expire_on_commit=False,
    autoflush=False,
)


async def get_db() -> AsyncGenerator[AsyncSession, None]:
    # Commit when the request handler succeeds; roll back on any exception.
    async with AsyncSessionLocal() as session:
        try:
            yield session
            await session.commit()
        except Exception:
            await session.rollback()
            raise
```
Notice pool_pre_ping=True. In production, idle connections get killed by load balancers and firewalls. Without this flag, you'll see cryptic OperationalError exceptions that are painful to diagnose.
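The commit-on-success, roll-back-on-error behavior of the dependency is worth verifying in isolation. The sketch below drives the same generator shape by hand with a stub session; StubSession, get_db_with, and run_request are illustrative test helpers, not app code (in a real request, FastAPI drives the generator for you):

```python
import asyncio


class StubSession:
    """Stands in for AsyncSession so the dependency's control flow is testable."""

    def __init__(self) -> None:
        self.committed = False
        self.rolled_back = False

    async def commit(self) -> None:
        self.committed = True

    async def rollback(self) -> None:
        self.rolled_back = True


async def get_db_with(session: StubSession):
    # Same shape as get_db above: yield, commit on success, roll back on error.
    try:
        yield session
        await session.commit()
    except Exception:
        await session.rollback()
        raise


async def run_request(session: StubSession, fail: bool) -> None:
    gen = get_db_with(session)
    await gen.__anext__()  # the endpoint receives the session here
    if fail:
        try:
            # Simulate the handler raising mid-request
            await gen.athrow(RuntimeError("handler blew up"))
        except RuntimeError:
            pass  # the dependency re-raised after rolling back
    else:
        try:
            await gen.__anext__()  # resuming past the yield runs the commit
        except StopAsyncIteration:
            pass
```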
Authentication: JWT with Refresh Tokens
A single access token without rotation is a security liability. Here's a complete auth flow with refresh token support:
```python
# app/core/security.py
from datetime import datetime, timedelta, timezone

from jose import JWTError, jwt
from passlib.context import CryptContext

from app.core.config import get_settings

settings = get_settings()
pwd_context = CryptContext(schemes=["bcrypt"], deprecated="auto")


def hash_password(password: str) -> str:
    return pwd_context.hash(password)


def verify_password(plain: str, hashed: str) -> bool:
    return pwd_context.verify(plain, hashed)


def create_access_token(subject: str) -> str:
    expire = datetime.now(timezone.utc) + timedelta(
        minutes=settings.access_token_expire_minutes
    )
    payload = {"sub": subject, "exp": expire, "type": "access"}
    return jwt.encode(payload, settings.secret_key, algorithm=settings.algorithm)


def create_refresh_token(subject: str) -> str:
    expire = datetime.now(timezone.utc) + timedelta(
        days=settings.refresh_token_expire_days
    )
    payload = {"sub": subject, "exp": expire, "type": "refresh"}
    return jwt.encode(payload, settings.secret_key, algorithm=settings.algorithm)


def verify_token(token: str, token_type: str) -> str:
    try:
        payload = jwt.decode(
            token, settings.secret_key, algorithms=[settings.algorithm]
        )
    except JWTError:
        raise ValueError("Could not validate token")
    if payload.get("type") != token_type:
        raise ValueError("Invalid token type")
    subject = payload.get("sub")
    if subject is None:
        raise ValueError("Missing subject")
    return subject
```
```python
# app/api/dependencies.py
from fastapi import Depends, HTTPException, status
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer
from sqlalchemy.ext.asyncio import AsyncSession

from app.core.security import verify_token
from app.db.models.user import User  # import paths follow the layout above
from app.db.session import get_db
from app.services import user_service

bearer_scheme = HTTPBearer()


async def get_current_user(
    credentials: HTTPAuthorizationCredentials = Depends(bearer_scheme),
    db: AsyncSession = Depends(get_db),
) -> User:
    try:
        user_id = verify_token(credentials.credentials, token_type="access")
    except ValueError:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Invalid or expired token",
            headers={"WWW-Authenticate": "Bearer"},
        )
    user = await user_service.get_by_id(db, user_id=user_id)
    if not user:
        # 401 rather than 404: don't confirm whether an account exists
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Invalid or expired token",
        )
    return user
```
Structured Logging for Observability
print() statements don't cut it in production. You need structured, searchable logs that integrate with systems like Datadog, Grafana Loki, or AWS CloudWatch.
```python
# app/core/logging.py
import logging
import sys
import time

from fastapi import Request
from pythonjsonlogger import jsonlogger

from app.core.config import get_settings

settings = get_settings()


def setup_logging() -> None:
    log_level = logging.DEBUG if settings.debug else logging.INFO
    handler = logging.StreamHandler(sys.stdout)
    formatter = jsonlogger.JsonFormatter(
        fmt="%(asctime)s %(name)s %(levelname)s %(message)s",
        datefmt="%Y-%m-%dT%H:%M:%S",
    )
    handler.setFormatter(formatter)
    root_logger = logging.getLogger()
    root_logger.setLevel(log_level)
    root_logger.addHandler(handler)


# Middleware for request logging
async def logging_middleware(request: Request, call_next):
    start_time = time.perf_counter()
    logger = logging.getLogger("api.request")
    response = await call_next(request)
    duration_ms = (time.perf_counter() - start_time) * 1000
    logger.info(
        "Request processed",
        extra={
            "method": request.method,
            "path": request.url.path,
            "status_code": response.status_code,
            "duration_ms": round(duration_ms, 2),
            # request.client can be None (e.g. under some test clients)
            "client_ip": request.client.host if request.client else None,
        },
    )
    return response
```
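To see what this produces on the wire, here is a stdlib-only stand-in for the JSON formatter (TinyJsonFormatter is illustrative; python-json-logger does this job in the real app). The key behavior: anything passed via extra= lands as a plain attribute on the LogRecord, and the formatter lifts the non-standard attributes into the JSON document:

```python
import io
import json
import logging


class TinyJsonFormatter(logging.Formatter):
    # Attribute names every LogRecord carries; anything else came in via `extra=`
    STANDARD = set(logging.LogRecord("", 0, "", 0, "", (), None).__dict__)

    def format(self, record: logging.LogRecord) -> str:
        doc = {"level": record.levelname, "message": record.getMessage()}
        doc.update(
            {k: v for k, v in record.__dict__.items() if k not in self.STANDARD}
        )
        return json.dumps(doc)


stream = io.StringIO()  # capture output so we can inspect it
handler = logging.StreamHandler(stream)
handler.setFormatter(TinyJsonFormatter())

logger = logging.getLogger("api.request.demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo record off the root logger

logger.info(
    "Request processed",
    extra={"method": "GET", "path": "/api/v1/users", "status_code": 200},
)
line = json.loads(stream.getvalue())
```

Each request becomes one JSON object per line, which is what Datadog, Loki, and CloudWatch agents expect to ingest.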
Rate Limiting with Redis
Without rate limiting, a single misbehaving client can bring down your API. Use Redis with a simple fixed-window counter (INCR plus EXPIRE):
```python
# app/api/dependencies.py (additions)
import redis.asyncio as redis
from fastapi import HTTPException, Request, status

from app.core.config import get_settings

settings = get_settings()
# Reuse one client (and its connection pool) instead of connecting per request
redis_client = redis.from_url(settings.redis_url)


async def rate_limiter(request: Request) -> None:
    client_ip = request.client.host if request.client else "unknown"
    key = f"rate_limit:{client_ip}"

    current = await redis_client.incr(key)
    if current == 1:
        # First request in this window: start the expiry clock. Setting the
        # TTL only once keeps the window fixed instead of sliding forever.
        await redis_client.expire(key, settings.rate_limit_window_seconds)
    if current > settings.rate_limit_requests:
        raise HTTPException(
            status_code=status.HTTP_429_TOO_MANY_REQUESTS,
            detail="Rate limit exceeded. Try again later.",
            headers={"Retry-After": str(settings.rate_limit_window_seconds)},
        )
```
Apply it selectively to expensive or sensitive endpoints:
```python
@router.post(
    "/auth/login",
    dependencies=[Depends(rate_limiter)],
)
async def login(credentials: LoginSchema, db: AsyncSession = Depends(get_db)):
    ...
```
Health Checks and Readiness Probes
Kubernetes and load balancers need to know your app is healthy. A superficial health check that just returns {"status": "ok"} is worse than useless — it lies to your orchestration layer.
```python
# app/api/v1/endpoints/health.py
import redis.asyncio as redis
from fastapi import APIRouter, Depends
from fastapi.responses import JSONResponse
from sqlalchemy import text
from sqlalchemy.ext.asyncio import AsyncSession

from app.core.config import get_settings
from app.db.session import get_db

router = APIRouter()


@router.get("/health/live")
async def liveness():
    """Kubernetes liveness probe: is the process alive?"""
    return {"status": "alive"}


@router.get("/health/ready")
async def readiness(db: AsyncSession = Depends(get_db)):
    """Readiness probe: can we actually serve traffic?"""
    checks = {}

    # Database check
    try:
        await db.execute(text("SELECT 1"))
        checks["database"] = "healthy"
    except Exception as e:
        checks["database"] = f"unhealthy: {e}"

    # Redis check
    settings = get_settings()
    try:
        async with redis.from_url(settings.redis_url) as r:
            await r.ping()
        checks["redis"] = "healthy"
    except Exception as e:
        checks["redis"] = f"unhealthy: {e}"

    all_healthy = all("unhealthy" not in v for v in checks.values())
    return JSONResponse(
        content={"status": "ready" if all_healthy else "degraded", "checks": checks},
        status_code=200 if all_healthy else 503,
    )
```
Putting It Together: The Application Factory
```python
# app/main.py
from contextlib import asynccontextmanager

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from fastapi.middleware.gzip import GZipMiddleware

from app.api.v1.router import api_router
from app.core.config import get_settings
from app.core.logging import logging_middleware, setup_logging
from app.db.session import engine


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Startup
    setup_logging()
    # Initialize connection pools, warm caches, etc.
    yield
    # Shutdown: clean up resources gracefully
    await engine.dispose()


def create_application() -> FastAPI:
    settings = get_settings()
    app = FastAPI(
        title=settings.app_name,
        docs_url="/docs" if settings.environment != "production" else None,
        redoc_url=None,
        lifespan=lifespan,
    )
    app.add_middleware(
        CORSMiddleware,
        allow_origins=["https://yourdomain.com"],
        allow_methods=["GET", "POST", "PUT", "DELETE"],
        allow_headers=["Authorization", "Content-Type"],
    )
    app.add_middleware(GZipMiddleware, minimum_size=1000)
    app.middleware("http")(logging_middleware)
    app.include_router(api_router, prefix=settings.api_v1_prefix)
    return app


app = create_application()
```
Note how docs_url falls back to None in production. Exposing your OpenAPI schema publicly is a free gift to attackers who want a map of your attack surface.
Deployment: Gunicorn + Uvicorn Workers
For production, don't run Uvicorn directly. Use Gunicorn as the process manager with Uvicorn workers — you get multi-process stability with async performance:
```bash
gunicorn app.main:app \
  --workers 4 \
  --worker-class uvicorn.workers.UvicornWorker \
  --bind 0.0.0.0:8000 \
  --timeout 30 \
  --keepalive 5 \
  --access-logfile - \
  --error-logfile -
```
A safe starting point for worker count is (2 × CPU cores) + 1. For a 2-core container, that's 5 workers.
Conclusion
Building a production-ready FastAPI application isn't about any single feature — it's about the compound effect of doing many small things correctly. Type-safe configuration prevents deployment surprises. Proper async session management prevents connection pool exhaustion. Real health checks prevent ghost traffic from a half-dead pod. Structured logging means you can actually debug the 3 AM incident.
The takeaway: Treat your FastAPI app like infrastructure, not a script. Every decision — from project layout to how you manage database connections — either compounds into reliability or compounds into technical debt. Start with the patterns in this guide, and you'll spend less time firefighting and more time shipping features.
Tags: fastapi python api-development backend web-development