Lyazid LAMRIBAH

FastAPI in Layers: A Production Structure That Makes Testing Trivial


What started all of this was wanting to build an LLM-powered application.

Not a toy. Something with real moving parts — a backend that could handle API calls, talk to a database, manage configuration, and eventually wire into an AI layer on top. I had a rough idea of what I wanted to build, but every time I sat down to start, I felt like I was building on sand.

For the longest time before that, my API projects started clean and then quietly fell apart. Not broken — they worked fine. But after a few weeks of adding features, I'd open the codebase and spend more time figuring out where something lived than actually writing code. Business logic bleeding into route handlers. Database calls scattered everywhere. A utils.py that became a graveyard for things I didn't know where else to put.

I kept hearing "structure your project properly" but nobody explained why a structure works — just what folders to create. So I paused on the LLM part and went back to basics.

I want to say something about how I approached that, because I think it matters. Right now, the easy path is to vibe-code your way through a project — describe what you want to an AI, get code back, paste it in, repeat. A lot of people are doing this and skipping over the fundamentals entirely. I get why. It's fast and it often works. But I wanted to use AI to actually understand — to have things explained until they clicked, to ask "why does this work this way" and get a real answer. Because the ability to reproduce something from scratch, debug it when it breaks, and extend it confidently — that's what being competent actually means. Generated code you don't understand is a liability that compounds over time.

This article is what I ended up with after going through that process. Not a reference guide — just what I figured out, explained the way I wish someone had explained it to me.


The Basic Layout

app/
├── api/           # HTTP boundary — routes, request/response shapes
├── core/
│   └── settings.py  # App configuration, loaded once at startup
├── services/      # Business logic
├── repositories/  # Everything that touches the database
├── clients/       # Third-party API wrappers (Stripe, SendGrid, etc.)
├── dependencies.py # FastAPI Depends() wiring — auth, current user, etc.
└── database.py    # Connection pool management
main.py            # App factory, lifespan, router registration

What made this land for me was thinking about each layer as having a job — and more importantly, things it's not allowed to do. The folders are almost secondary to that idea.


Settings: Stop Scattering os.getenv() Everywhere

The first thing I figured out was that having os.getenv("DATABASE_URL") called in three different files is a slow-burning problem. If the variable name changes, or you want to add a default, or you want to understand what config the app actually needs — you're hunting.

The fix is a single app/core/settings.py that owns all of it:

from pydantic_settings import BaseSettings, SettingsConfigDict

class AppSettings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        case_sensitive=False,
        env_prefix="APP_",
        extra="forbid",
        frozen=True,
    )

    database_url: str
    debug: bool = False
    max_connections: int = 10

The thing that really won me over here was extra="forbid". If you have a typo in your .env file — APP_DATABSE_URL instead of APP_DATABASE_URL — the app crashes immediately at startup with a validation error. Before I knew about this, I'd spend ages debugging a silent failure caused by a misspelled env variable. Now it just tells me.

frozen=True is another one I didn't appreciate at first. It makes the settings object immutable after it's created — if any code tries to change a setting after startup, it gets a TypeError immediately. The idea is simple: config should be read once when the app starts, and nothing should be able to change it mid-flight. Without this, some code could accidentally mutate a shared settings object and silently change behaviour for every subsequent caller.

Pydantic Settings resolves values in a layered priority order: init arguments win over environment variables, which win over .env file values, which win over defaults. So if you have APP_DATABASE_URL in both your shell environment and your .env, the shell always wins. This is the behaviour you want — local .env for development, real environment variables injected in production.
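This first-match-wins layering isn't how pydantic-settings is implemented internally, but the concept can be sketched with the standard library's `ChainMap`, which scans its mappings left to right and returns the first hit:

```python
from collections import ChainMap

# Each mapping stands in for one settings source, highest priority first.
init_args = {}                                            # nothing passed to AppSettings()
env_vars = {"database_url": "postgres://prod-host/db"}    # shell environment
dotenv = {"database_url": "postgres://localhost/dev", "debug": "true"}
defaults = {"debug": "false", "max_connections": "10"}

# First value found wins, scanning left to right.
resolved = ChainMap(init_args, env_vars, dotenv, defaults)

assert resolved["database_url"] == "postgres://prod-host/db"  # shell beats .env
assert resolved["debug"] == "true"                            # only .env sets it
assert resolved["max_connections"] == "10"                    # falls through to default
```

Same outcome as the article describes: the shell's `APP_DATABASE_URL` shadows the `.env` value, while anything only the `.env` defines still comes through.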

To make sure settings are only parsed once, you wrap the instantiation in lru_cache:

from functools import lru_cache

@lru_cache(maxsize=1)
def get_settings() -> AppSettings:
    return AppSettings()

lru_cache is a standard library decorator that memoises the return value of the function. maxsize=1 means it caches exactly one result — the settings instance — and returns that same object on every subsequent call without touching the filesystem again. The .env is read exactly once. In tests, you call get_settings.cache_clear() to flush the cached instance and force a fresh read with different environment values.
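You can watch both behaviours with a toy stand-in (a bare `object()` in place of `AppSettings`, since the point is just that construction happens once):

```python
from functools import lru_cache

@lru_cache(maxsize=1)
def get_settings():
    # Stand-in for AppSettings(); imagine this re-reads .env on every call.
    return object()

a = get_settings()
b = get_settings()
assert a is b            # same cached instance — the "file" was read once

get_settings.cache_clear()
c = get_settings()
assert c is not a        # cache flushed, a fresh instance is constructed
```

The identity check (`is`, not `==`) is the point: every caller shares one object until `cache_clear()` forces a rebuild, which is exactly the hook tests use to swap in different environment values.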

You can plug get_settings directly into FastAPI's dependency injection:

from fastapi import Depends

def some_route(settings: AppSettings = Depends(get_settings)):
    ...

Routes: The Narrowest Job in the App

I used to put a lot of logic in route handlers. It felt natural — the route is where the request comes in, so why not handle everything there?

The problem is that as soon as you do that, you can't reuse any of that logic. And it's hard to test without spinning up a full HTTP client.

The thing that shifted my thinking: a route handler's only job is to speak HTTP. Receive a request, validate the input, call something else, return a response with the right status code. That's it.

@router.post("/users", status_code=status.HTTP_201_CREATED)
async def create_user(
    payload: CreateUserRequest,
    user_service: UserService = Depends(get_user_service),
):
    user = await user_service.create(payload)
    return UserResponse.model_validate(user)

Depends(get_user_service) is FastAPI's dependency injection. When a request comes in, FastAPI calls get_user_service and injects the result into your handler. This is how you avoid manually constructing services and repositories inside every route — you declare what you need, and FastAPI resolves it. The same mechanism handles auth, database sessions, rate limiting, and more. It's one of the things that makes FastAPI so composable.
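`get_user_service` itself isn't shown above. The names here are hypothetical, but conceptually it's just a chain of providers, each constructing its layer from the one below — FastAPI walks the `Depends()` chain for you per request, but done by hand it's nothing more than:

```python
# Hypothetical wiring sketch, mirroring what a Depends() chain resolves.
class Database: ...                      # stand-in for app/database.py

class UserRepository:
    def __init__(self, db: Database):
        self.db = db

class UserService:
    def __init__(self, user_repo: UserRepository):
        self.user_repo = user_repo

db = Database()                          # the module-level singleton

def get_db() -> Database:
    return db

def get_user_repository() -> UserRepository:
    return UserRepository(get_db())

def get_user_service() -> UserService:
    return UserService(get_user_repository())

service = get_user_service()
assert service.user_repo.db is db       # same shared database all the way down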

UserResponse.model_validate(user) explicitly controls what goes back to the caller. Even if the User model internally has fields like hashed_password or internal_flags, they only leave the server if UserResponse includes them. This is intentional — you never want to accidentally serialise the wrong fields because you forgot to exclude something.
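A minimal sketch of that filtering, with hypothetical models (the `from_attributes` config is what lets a response schema read fields off an arbitrary object rather than a dict):

```python
from pydantic import BaseModel, ConfigDict

class UserInDB(BaseModel):               # hypothetical internal model
    id: int
    email: str
    hashed_password: str                 # must never leave the server

class UserResponse(BaseModel):
    # from_attributes lets model_validate read from objects, not just dicts
    model_config = ConfigDict(from_attributes=True)
    id: int
    email: str

user = UserInDB(id=1, email="a@example.com", hashed_password="$argon2id$...")
out = UserResponse.model_validate(user).model_dump()

assert out == {"id": 1, "email": "a@example.com"}
assert "hashed_password" not in out      # dropped, because UserResponse never declared it
```

The response schema acts as an allowlist: fields not declared on `UserResponse` simply don't exist in the output, no matter what the internal model carries.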

Every project I build now also gets two health endpoints:

@router.get("/health")
async def get_health():
    return {"status": "healthy", "version": "1.0.0"}

@router.get("/health/db")
async def get_health_db(db: Database = Depends(get_db)):
    try:
        await db.check_connection()
        return {"status": "healthy", "dependencies": {"database": "connected"}}
    except Exception as e:
        raise HTTPException(
            status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
            detail={"status": "unhealthy", "error": str(e)},
        )

/health is for "is the process running." /health/db is for "is the process actually useful right now." I didn't understand why you'd need both until I started deploying to environments where the server would start up just fine but silently fail to reach the database. The second endpoint catches that.


Services: Where the Actual Logic Goes

This is the layer that made everything else make sense when I finally got it.

A service is just a class that does a thing. It doesn't know it's being called over HTTP. No Request objects, no status codes, no serialization. Just: take clean data, do something with it, return a result or raise a meaningful exception.

class UserService:
    def __init__(self, user_repo: UserRepository):
        self.user_repo = user_repo

    async def create(self, payload: CreateUserRequest) -> User:
        existing = await self.user_repo.find_by_email(payload.email)
        if existing:
            raise EmailAlreadyExistsError(payload.email)

        hashed_pw = hash_password(payload.password)
        return await self.user_repo.create(payload.email, hashed_pw)

Because it doesn't know about HTTP, it's trivially testable. You just pass in data and assert the output. No test client, no mock requests. That alone made me want to keep things this way.

The other thing I came to appreciate: this same service can be called from a background job, a CLI command, or a scheduled task. The logic lives once.
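That testability claim is easy to demonstrate: a unit test for the service needs nothing but an in-memory fake repository. This is a sketch — the exception class, the hashing (here just a prefixed string), and the trimmed `create(email, password)` signature are stand-ins for the article's own helpers:

```python
import asyncio

class EmailAlreadyExistsError(Exception):   # stand-in for the app's exception
    pass

class FakeUserRepository:
    """In-memory double for UserRepository — no database, no pool."""
    def __init__(self):
        self.users = {}

    async def find_by_email(self, email):
        return self.users.get(email)

    async def create(self, email, hashed_pw):
        self.users[email] = {"email": email, "hashed_password": hashed_pw}
        return self.users[email]

class UserService:                          # trimmed copy of the article's service
    def __init__(self, user_repo):
        self.user_repo = user_repo

    async def create(self, email, password):
        if await self.user_repo.find_by_email(email):
            raise EmailAlreadyExistsError(email)
        return await self.user_repo.create(email, f"hashed:{password}")

async def main():
    service = UserService(FakeUserRepository())
    user = await service.create("a@example.com", "s3cret")
    assert user["email"] == "a@example.com"

    try:
        await service.create("a@example.com", "other")
        raise AssertionError("duplicate email should have been rejected")
    except EmailAlreadyExistsError:
        pass                                # the duplicate check fired, as expected

asyncio.run(main())
```

No `TestClient`, no override machinery — the fake satisfies the same interface the service expects, so both the happy path and the duplicate-email branch are covered in plain Python.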


Repositories: The Only Place SQL Belongs

Before I understood this pattern, my SQL queries lived all over the place. In services, in route handlers, sometimes both. It made any schema change a scavenger hunt.

The idea is simple: one layer is allowed to touch the database, and everything else goes through it.

from typing import Optional
from psycopg.rows import dict_row

class UserRepository:
    def __init__(self, db: Database):
        self.db = db

    async def find_by_email(self, email: str) -> Optional[User]:
        async with self.db.pool.connection() as conn:
            # dict_row makes fetchone() return a dict keyed by column name
            # (the default is a plain tuple), so it feeds straight into model_validate
            async with conn.cursor(row_factory=dict_row) as cur:
                await cur.execute(
                    "SELECT id, email, created_at FROM users WHERE email = %s",
                    (email,)
                )
                row = await cur.fetchone()
                return User.model_validate(row) if row else None

The payoff I've actually felt: when a table gets renamed or a column changes, I go to one file. The service doesn't care — it still calls find_by_email and gets a User back. For more complex domains you might eventually split repositories further or introduce other patterns — but that's a problem worth having later, once your app has actually grown into it.


The Database Class: Managing the Connection Pool

I didn't fully understand connection pools when I started. I was naively opening a new connection on every request, which works fine locally and starts falling apart under real load.

A connection pool keeps a number of database connections open and warm so requests can borrow one, use it, and return it without the overhead of a full connection handshake each time.

from psycopg_pool import AsyncConnectionPool
from typing import Optional

class Database:
    def __init__(self):
        self._pool: Optional[AsyncConnectionPool] = None

    async def connect(self):
        settings = get_settings()
        self._pool = AsyncConnectionPool(settings.database_url, open=False)
        await self._pool.open()

    async def disconnect(self):
        if self._pool:
            await self._pool.close()
            self._pool = None

    @property
    def pool(self) -> AsyncConnectionPool:
        if self._pool is None:
            raise RuntimeError("Database not connected. Call connect() first.")
        return self._pool

    async def check_connection(self):
        async with self.pool.connection() as conn:
            async with conn.cursor() as cur:
                await cur.execute("SELECT 1")

db = Database()

The open=False argument was something I had to look up. It tells the pool not to open any connections at instantiation time — you control when it opens by calling await self._pool.open() explicitly, inside the lifespan. This way, connections aren't being created at import time before the app is ready.

The pool property with the RuntimeError guard is something I added after hitting a confusing bug where something tried to use the pool too early and I got a NoneType error deep inside the driver. The guard makes the mistake obvious immediately.


main.py: Just the Wiring

Once everything else is in place, main.py ends up almost boring — which I now think is a good sign.

from contextlib import asynccontextmanager
from app.database import db
from fastapi import FastAPI
from app.api.routes import router

@asynccontextmanager
async def lifespan(app: FastAPI):
    await db.connect()
    yield
    await db.disconnect()

app = FastAPI(lifespan=lifespan)
app.include_router(router)

if __name__ == "__main__":
    import uvicorn
    uvicorn.run("main:app", host="0.0.0.0", port=8000, reload=True)

The lifespan context manager is where startup and shutdown logic goes. Everything before yield runs when the app starts. Everything after runs when it shuts down. Opening the database pool here felt like the right call once I understood the pattern — it's explicit, it's ordered, and you know exactly when connections exist and when they don't.

The if __name__ == "__main__" block is just for running locally with python main.py. In production it gets ignored entirely.


Testing: The Part That Took Me the Longest to Actually Understand

I want to spend more time on this than the other sections, because I think testing is both the most important thing to get right and the hardest to actually grasp — not the mechanics, but the thinking behind it.

The structure I described above was built, in large part, so things could be tested in isolation. All those layers with narrow jobs — they exist partly so you can swap out the real database for a fake one in tests without your app noticing. Understanding that changed how I looked at the whole architecture.

There are three kinds of tests and they do different things.

Unit tests hit a service or utility in total isolation. No database, no HTTP. You construct a service, pass in fake dependencies, call a method, assert the result. Fast, precise, and they tell you exactly what broke.

Integration tests hit a route via HTTP with a real (test) database. They check that your layers are wired together correctly. Slower, but they catch a class of bugs that unit tests can't.

End-to-end tests drive the whole app like an external client would. They're the most realistic but also the most expensive to run and maintain. Usually reserved for critical paths.

Here's how testing actually looks in practice, starting with the health endpoints:

import pytest
from unittest.mock import AsyncMock
from fastapi.testclient import TestClient
from main import app
from app.dependencies import get_db


def test_health_endpoint():
    client = TestClient(app)
    response = client.get("/health")

    assert response.status_code == 200
    data = response.json()
    assert data["status"] == "healthy"
    assert data["version"] == "1.0.0"

This is straightforward — spin up TestClient, hit the endpoint, assert the response. TestClient is a thin wrapper around httpx that runs your ASGI app in-process. No actual server is started, no port is opened. It's fast.

The more interesting part is what happens when you need to test an endpoint that depends on the database:

def test_health_db_success():
    mock_db = AsyncMock()
    mock_db.check_connection.return_value = None

    app.dependency_overrides[get_db] = lambda: mock_db

    client = TestClient(app)
    response = client.get("/health/db")

    app.dependency_overrides.clear()

    assert response.status_code == 200
    data = response.json()
    assert data["status"] == "healthy"
    assert data["dependencies"]["database"] == "connected"

This is where dependency_overrides comes in, and it took me a while to really understand what's happening here. When FastAPI processes a request, it resolves dependencies by calling the functions registered with Depends(). dependency_overrides lets you tell FastAPI: "for this test, whenever you would call get_db, call this lambda instead and use the mock it returns."

You're not patching the database itself — you're patching the injection point. The route handler receives mock_db exactly as it would receive a real Database instance, and it can't tell the difference. check_connection is an AsyncMock, so awaiting it works without complaining, and .return_value = None means it resolves successfully — simulating a healthy connection.
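It's worth seeing `AsyncMock` in isolation once, because "awaiting it just works" was the part that confused me. Pure standard library:

```python
import asyncio
from unittest.mock import AsyncMock

mock_db = AsyncMock()
mock_db.check_connection.return_value = None

async def probe():
    # Calling a mocked async method returns a coroutine;
    # awaiting it resolves to return_value.
    return await mock_db.check_connection()

assert asyncio.run(probe()) is None
mock_db.check_connection.assert_awaited_once()   # you can also assert it was awaited
```

Every attribute of an `AsyncMock` is itself an `AsyncMock`, so `check_connection` is awaitable without any setup beyond the `return_value` — and `assert_awaited_once` gives you the await-tracking that plain `MagicMock` can't.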

The failure case is just as important to test, and it's where side_effect comes in:

def test_health_db_failure():
    mock_db = AsyncMock()
    mock_db.check_connection.side_effect = Exception("Database connection failed")

    app.dependency_overrides[get_db] = lambda: mock_db

    client = TestClient(app)
    response = client.get("/health/db")

    app.dependency_overrides.clear()

    assert response.status_code == 503
    data = response.json()
    assert data["detail"]["status"] == "unhealthy"
    assert "error" in data["detail"]

side_effect tells AsyncMock to raise that exception when check_connection is awaited, instead of returning a value. This tests the catch block in your route — that a real failure actually returns a 503 and the right error shape. Without this test, you could break your error handling and not know it until a real database failure hit production.

Two smaller tests that I've found consistently useful:

def test_health_response_structure():
    client = TestClient(app)
    response = client.get("/health")
    data = response.json()

    expected_keys = {"status", "version"}
    assert set(data.keys()) == expected_keys
    assert isinstance(data["status"], str)
    assert isinstance(data["version"], str)


@pytest.mark.parametrize("endpoint", ["/health", "/health/db"])
def test_health_endpoints_accept_get_only(endpoint):
    client = TestClient(app)
    for method in ["POST", "PUT", "DELETE", "PATCH"]:
        response = client.request(method, endpoint)
        assert response.status_code == 405

test_health_response_structure pins the exact shape of the response. If someone adds a field, removes one, or renames a key, the test breaks — which is what you want, because the shape is a contract with anything that calls you.

@pytest.mark.parametrize runs the same test function against a list of inputs. Rather than writing two nearly identical test functions, you declare the varying input as a parameter and pytest handles the rest. It also reports failures per-parameter, so you know exactly which endpoint broke.

The big thing I kept getting confused about when I started: app.dependency_overrides.clear() at the end of each test isn't optional cleanup. If you forget it, the override from one test bleeds into the next, and you end up with tests that pass or fail based on execution order — one of the most confusing bugs you can have in a test suite. In practice, use a pytest fixture to handle this automatically:

@pytest.fixture(autouse=True)
def clear_overrides():
    yield
    app.dependency_overrides.clear()

autouse=True means this fixture runs for every test in scope automatically, without you having to remember to include it. Everything after yield is teardown — it runs after the test completes, whether it passed or failed.


Things I Haven't Needed Yet (But Know Where They'd Go)

Part of figuring this out was learning not to add things before they're needed. These are the extensions I know about but only reach for when something tells me to:

Background tasks — when something like sending a confirmation email would otherwise block the HTTP response. FastAPI has a lightweight BackgroundTasks for simple cases. Celery + Redis when you need retries, scheduling, or a proper worker process.

Cache layer — when a read-heavy endpoint is slow under load and the data barely changes. Redis in front of the service call. I haven't needed this yet on the projects I've worked on, but I know where to slot it in.

External clients (app/clients/) — any time my app calls a third-party API (Stripe, SendGrid, internal services), they get their own file under clients/. Keeps that stuff out of the services so services stay testable.

Migrations — Alembic, from the first moment there's real data. This is one I learned the hard way. Managing schema changes without migration files is fine until it suddenly very much isn't.


What I Actually Took Away From All This

The thing that changed how I think about this wasn't any specific folder structure — it was the idea that each layer should have a clear job and clear limits on what it's allowed to do.

Routes translate HTTP. Services hold logic. Repositories hold queries. Settings holds config. When each piece stays in its lane, the codebase stays navigable and things are easy to find. When they don't, you end up where I started: code that works but takes 20 minutes to orient yourself in.

If you've structured this differently — particularly around the service/repository boundary — I'd genuinely love to hear how you've approached it in the comments.


Documentation and Further Reading

Everything I covered here has official docs worth reading. These are the ones I kept open while figuring this out:

FastAPI

Pydantic & Pydantic Settings

Database

Testing
