Ravi Gupta

Building a Production-Ready Task Management API with FastAPI: Testing, Deployment & Production (Part 3)

Part 3 of 3: After the architecture and development phases, I thought deployment would be straightforward. I was wrong. This is the final chapter of my journey from local code to live production system.

The API worked perfectly on my laptop.

50+ endpoints responding.
Authentication solid.
Clean architecture.
All tests passing locally.

I was ready to deploy.

Then I clicked "Deploy."

That's when everything broke.

CORS errors I'd never seen.
Environment variables mysteriously missing.
Database migrations failing for no reason.
Rate limiting crashing the entire app.
Cold starts taking 30 seconds.

This article covers Phase 3: Testing, Deployment & Production - where theory meets reality, and "working" becomes "production-ready."

Not just what worked.
What broke, what surprised me, and what I learned debugging a live system at midnight.



📚 What You'll Learn

  • Testing async FastAPI with pytest (the setup that actually works)
  • Docker optimization strategies (1.2GB → 380MB)
  • Free-tier deployment with Render + Neon PostgreSQL
  • 5 production bugs that only happen after deployment
  • Real metrics from 30 days of uptime
  • What I'd do differently if I started over
  • Cost breakdown of running a production API ($0!)

Reading time: ~15 minutes

GitHub Repository: https://github.com/ravigupta97/task_management_api

Live API: https://task-management-api-a775.onrender.com/docs


🎯 The Journey So Far

Phase 1: Spent days designing the perfect architecture.
Phase 2: Built features, fought async bugs, implemented JWT from scratch.
Phase 3: Time to deploy. How hard could it be?

Spoiler: Very hard.

In Part 1, I designed a clean architecture:

  • FastAPI + PostgreSQL
  • Repository pattern
  • Service layer separation
  • Clean project structure

In Part 2, I built the actual system:

  • JWT authentication
  • 50+ endpoints
  • Rate limiting
  • Advanced features

On paper, everything looked solid.

In development, everything worked.

But there's a gap between "works on my machine" and "works in production."

A massive gap.

This phase was about:

  • Writing tests that catch real bugs (not just make CI green)
  • Containerizing with Docker (and learning why 1.2GB is unacceptable)
  • Deploying to actual infrastructure (free tier!)
  • Debugging production issues at midnight
  • Measuring real performance (not synthetic benchmarks)

The honest truth: This phase took almost as long as building the features.

Because deployment isn't just "push to Render."

It's environment variables you forgot. CORS configurations that work locally but fail remotely. Database connection pools that exhaust mysteriously. Cold starts that make your API feel broken. Migrations that succeed locally but fail in production.

Every bug taught me something tutorials never mentioned.

This is where you stop being a tutorial follower and start being an engineer.


🧪 Testing Strategy

The Wake-Up Call

I thought I could skip comprehensive testing.

"I manually tested everything. It works."

Then I added a new feature.

Broke 3 existing endpoints.

Didn't notice until I tried to create a task and got a 500 error.

That's when I learned: Manual testing doesn't scale.

You need automated tests. Not because they're trendy.

Because your memory isn't perfect.

The Goal: 85%+ Coverage

Target: 85%+ test coverage

Not arbitrary.

85% forces you to test:

  • Happy paths (obviously)
  • Error cases (what happens when things fail)
  • Edge cases (empty strings, null values, boundary conditions)
  • Integration points (do layers work together?)

Below 85%? You're probably skipping important scenarios.

Above 90%? Diminishing returns (testing getters/setters, framework code).

My target: 85-87%. Test what matters.


The Challenge: Async Testing

Coming from Spring Boot, testing was familiar:

```java
@Test
public void testGetTasks() {
    List<Task> tasks = taskService.getTasks();
    assertEquals(5, tasks.size());
}
```

Synchronous. Simple. Works.

In FastAPI with async?

Nothing worked initially.

```python
# This failed
def test_get_tasks():
    response = client.get("/tasks")  # Error: can't await
```

```python
# This also failed
async def test_get_tasks():
    response = await client.get("/tasks")  # Error: client not async
```

Async testing requires:

  • Async test client (not regular TestClient)
  • Async fixtures
  • Event loop management
  • Database session handling

It's not intuitive.

But once you understand it, it's powerful.


The Setup That Finally Worked

After hours of trial and error, here's the pytest configuration that actually works:

```python
# conftest.py
import pytest
import asyncio
from httpx import ASGITransport, AsyncClient
from sqlalchemy.ext.asyncio import create_async_engine, AsyncSession
from sqlalchemy.orm import sessionmaker

from app.main import app
from app.database import get_db, Base

# Test database URL
TEST_DATABASE_URL = "postgresql+asyncpg://test_user:test_pass@localhost/test_db"

# Create test engine
test_engine = create_async_engine(TEST_DATABASE_URL, echo=False)

# Create test session factory
TestingSessionLocal = sessionmaker(
    test_engine,
    class_=AsyncSession,
    expire_on_commit=False
)

@pytest.fixture(scope="session")
def event_loop():
    """Create event loop for async tests."""
    loop = asyncio.get_event_loop_policy().new_event_loop()
    yield loop
    loop.close()

@pytest.fixture(scope="function")
async def db_session():
    """Create fresh test database for each test."""
    async with test_engine.begin() as conn:
        await conn.run_sync(Base.metadata.create_all)

    async with TestingSessionLocal() as session:
        yield session

    async with test_engine.begin() as conn:
        await conn.run_sync(Base.metadata.drop_all)

@pytest.fixture(scope="function")
async def client(db_session):
    """Create test client with database override."""
    async def override_get_db():
        yield db_session

    app.dependency_overrides[get_db] = override_get_db

    # Recent httpx removed the app= shortcut; route requests through ASGITransport
    async with AsyncClient(transport=ASGITransport(app=app), base_url="http://test") as ac:
        yield ac

    app.dependency_overrides.clear()
```

Why this setup works:

  1. Separate test database - Never pollute your dev database
  2. Fresh database per test - Isolation prevents flaky tests
  3. Async fixtures - Match your application's async nature
  4. Dependency override - FastAPI's killer feature for testing

Time to figure this out: 4 hours of Stack Overflow and trial and error.

Worth it? Absolutely. Tests are now fast, isolated, and reliable.
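One detail the conftest above leaves implicit: pytest-asyncio has to be installed and told how to treat async tests and fixtures. A minimal pytest.ini sketch (the `auto` mode here is my assumption; with it, `@pytest.mark.asyncio` markers like the ones in the examples below become optional):

```ini
[pytest]
asyncio_mode = auto
testpaths = tests
```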


Testing Authentication: The Fixture Pattern

Challenge: How do you test protected endpoints without logging in every time?

Solution: Create authenticated test client fixture.

```python
# conftest.py
import pytest
from app.core.security import create_access_token
from app.models.user import User

@pytest.fixture
async def test_user(db_session):
    """Create a test user."""
    user = User(
        email="test@example.com",
        username="testuser",
        hashed_password="$2b$12$...",  # hashed "testpassword"
        is_active=True,
        is_verified=True
    )
    db_session.add(user)
    await db_session.commit()
    await db_session.refresh(user)
    return user

@pytest.fixture
async def authenticated_client(client, test_user):
    """Create authenticated test client."""
    access_token = create_access_token(data={"sub": str(test_user.id)})
    client.headers = {
        **client.headers,
        "Authorization": f"Bearer {access_token}"
    }
    return client
```

Now testing is clean:

```python
@pytest.mark.asyncio
async def test_create_task(authenticated_client):
    """Test creating a task."""
    response = await authenticated_client.post(
        "/api/v1/tasks/",
        json={
            "title": "Test Task",
            "description": "Test Description",
            "status": "TODO",
            "priority": "HIGH"
        }
    )

    assert response.status_code == 201
    data = response.json()
    assert data["title"] == "Test Task"
    assert data["priority"] == "HIGH"
```

No manual login. No token management. Just test the business logic.

This pattern saved me hundreds of lines of repetitive setup code.


Integration Tests: The Full Journey

Integration tests verify the entire flow works together.

```python
@pytest.mark.asyncio
async def test_complete_task_lifecycle(authenticated_client):
    """Test full CRUD flow for tasks."""

    # Create
    create_response = await authenticated_client.post(
        "/api/v1/tasks/",
        json={"title": "Integration Test", "status": "TODO", "priority": "LOW"}
    )
    assert create_response.status_code == 201
    task_id = create_response.json()["id"]

    # Read
    get_response = await authenticated_client.get(f"/api/v1/tasks/{task_id}")
    assert get_response.status_code == 200
    assert get_response.json()["title"] == "Integration Test"

    # Update
    update_response = await authenticated_client.put(
        f"/api/v1/tasks/{task_id}",
        json={"status": "COMPLETED"}
    )
    assert update_response.status_code == 200
    assert update_response.json()["status"] == "COMPLETED"

    # Delete
    delete_response = await authenticated_client.delete(f"/api/v1/tasks/{task_id}")
    assert delete_response.status_code == 204

    # Verify deletion
    verify_response = await authenticated_client.get(f"/api/v1/tasks/{task_id}")
    assert verify_response.status_code == 404
```

What this caught:

  • Forgot to add CASCADE DELETE (got constraint violation errors)
  • Token expiry during long test runs
  • Session closure issues in UPDATE operations

Integration tests = real-world scenarios = real bugs found.


Test Coverage: The Results

After writing tests for all critical paths:

```
$ pytest --cov=app tests/

---------- coverage: platform linux, python 3.11.16 -----------
Name                              Stmts   Miss  Cover
-----------------------------------------------------
app/main.py                          45      2    96%
app/api/v1/auth.py                   89      8    91%
app/api/v1/tasks.py                 102     10    90%
app/services/task_service.py        124     15    88%
app/core/security.py                 52      3    94%
-----------------------------------------------------
TOTAL                              1247    158    87%
```

87% coverage achieved.

What I didn't test (the remaining 13%):

  • Some error handling edge cases
  • Specific database constraint failures
  • Rate limiter timing dependencies

Why that's okay: 87% covers all critical paths. The remaining 13% is mostly defensive code and edge cases that are hard to reliably test.

Time invested in testing: 1 week

Bugs caught before production: 15+

Worth it? Absolutely.


๐Ÿณ Docker: From 1.2GB to 380MB

My First Docker Image Was Embarrassing

1.2GB.

For a Python API.

I pushed it to Docker Hub, proud of myself.

Then I tried to deploy it to Render.

"Build timeout: Image too large"

Panic.

That's when I learned: Docker image size matters.

Not just for build time.
For deployment speed.
For resource usage.
For everything.

Time to optimize.


The Naive Approach

My first Dockerfile was... simple:

```dockerfile
FROM python:3.11
COPY . /app
WORKDIR /app
RUN pip install -r requirements.txt
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
```

Image size: 1.2GB

Problems:

  • Includes full Python image (300MB+ of unnecessary tools)
  • Build dependencies stay in final image
  • pip cache included
  • Unnecessary files copied

Build time: 8 minutes

Deployment: Timed out on Render's free tier

Not acceptable.


Multi-Stage Builds: The Solution

After researching Docker best practices, I discovered multi-stage builds.

The concept: Build in one image, run in another.

```dockerfile
# Stage 1: Builder - Install dependencies
FROM python:3.11-slim AS builder

WORKDIR /app

# Install build dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    gcc \
    postgresql-client \
    && rm -rf /var/lib/apt/lists/*

# Copy and install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir --user -r requirements.txt

# Stage 2: Runtime - Small, clean image
FROM python:3.11-slim

WORKDIR /app

# Install only runtime dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    libpq5 \
    && rm -rf /var/lib/apt/lists/*

# Copy installed packages from builder
COPY --from=builder /root/.local /root/.local

# Copy application code
COPY ./app ./app
COPY alembic.ini .
COPY ./alembic ./alembic

# Ensure pip packages are in PATH
ENV PATH=/root/.local/bin:$PATH

EXPOSE 8000

# Run migrations and start server
CMD alembic upgrade head && uvicorn app.main:app --host 0.0.0.0 --port 8000
```

Final image size: 380MB

Reduction: 68% smaller!

Key optimizations:

  1. Multi-stage build - Builder stage separate from runtime
  2. Slim base image - python:3.11-slim instead of full python:3.11
  3. No cache in pip - --no-cache-dir flag
  4. Clean apt lists - rm -rf /var/lib/apt/lists/* after installs
  5. Only runtime dependencies - gcc and build tools left in builder stage

Build time: 3 minutes (down from 8!)

Deployment: Successful on Render free tier

That felt good.
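One more easy win worth mentioning: a .dockerignore keeps the build context small, so `COPY . /app`-style instructions don't drag tests, git history, or local secrets into the image. A minimal sketch (entries are my assumptions based on a typical FastAPI project layout):

```
.git
.venv/
__pycache__/
*.pyc
.env
tests/
.pytest_cache/
```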


Local Development: docker-compose

For local development, I needed:

  • PostgreSQL database
  • API server
  • Hot reload
  • Easy setup

docker-compose made it simple:

```yaml
version: '3.8'

services:
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: taskapi
      POSTGRES_PASSWORD: taskapi
      POSTGRES_DB: taskapi_dev
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data

  api:
    build: .
    command: uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
    volumes:
      - ./app:/app/app  # Hot reload
    ports:
      - "8000:8000"
    environment:
      DATABASE_URL: postgresql+asyncpg://taskapi:taskapi@db:5432/taskapi_dev
      SECRET_KEY: dev-secret-key
    depends_on:
      - db

volumes:
  postgres_data:
```

One command to start everything:

```shell
docker-compose up
```

Benefits:

  • Consistent environment across team
  • No "works on my machine" issues
  • Easy onboarding for new developers
  • Matches production setup

This decision saved hours of environment debugging.


🚀 Deployment: Free Tier Reality

Platform Choice: The $0 Constraint

I had a strict budget for infrastructure.

$0.

Not "limited." Zero dollars.

This eliminated most options immediately.

The Contenders:

| Platform | Free Tier | PostgreSQL | Catches |
| --- | --- | --- | --- |
| Render | ✅ Yes | ✅ Neon integration | Sleeps after 15 min |
| Heroku | ❌ No longer free | ✅ Yes | Killed free tier |
| AWS | ⚠️ 12 months only | ✅ Yes | Complex setup |
| Railway | ✅ Limited hours | ✅ Yes | $5 credit/month |
| Fly.io | ✅ Yes | ✅ Yes | Credit system |

Decision: Render

Why?

  • Truly free forever (not trial)
  • Docker support
  • Auto-deploy from GitHub
  • Simple configuration
  • Good documentation

Trade-off: Cold starts after 15 minutes of inactivity.

Acceptable for a learning project.


Database: Neon PostgreSQL

The Problem: Most free PostgreSQL offerings have catches.

  • Heroku: Limited to 10K rows (not enough)
  • Supabase: Pauses after 7 days inactivity (annoying)
  • AWS RDS: Only 12 months free (not sustainable)

Neon's free tier:

  • ✅ Actually free forever
  • ✅ 3GB storage
  • ✅ Serverless (auto-scales)
  • ✅ Branch-based development (like git!)
  • ✅ No sleep/pause

Setup time: 5 minutes

  1. Create Neon account
  2. Create database
  3. Copy connection string
  4. Add to Render environment variables

Done.

This was the easiest part of deployment.

The rest? Not so much.


First Deployment: Everything Broke

I pushed to GitHub.

Render detected the Dockerfile.

Build started.

Then:

```
Error: Environment variable DATABASE_URL not found
Error: SECRET_KEY is None
Error: Application failed to start
```

What I forgot:

Environment variables don't automatically transfer from .env file to Render.

You have to set them manually in Render's dashboard.

Lesson learned: Environment configuration is separate from code.

Never assume.


The Health Check Endpoint

Critical for free tier deployments.

Render's free tier sleeps after 15 minutes of inactivity.

Health checks can:

  • Wake it up automatically
  • Verify it's actually running
  • Test database connectivity

```python
# app/main.py
from fastapi import Depends, FastAPI, status
from sqlalchemy import text
from sqlalchemy.ext.asyncio import AsyncSession

from app.database import get_db

@app.get("/health", status_code=status.HTTP_200_OK)
async def health_check(db: AsyncSession = Depends(get_db)):
    """
    Health check endpoint.
    Tests database connectivity.
    """
    try:
        await db.execute(text("SELECT 1"))
        return {
            "status": "healthy",
            "database": "connected",
            "environment": settings.ENVIRONMENT
        }
    except Exception as e:
        return {
            "status": "unhealthy",
            "database": "disconnected",
            "error": str(e)
        }
```

Benefits:

  • Confirms app is running
  • Tests database connection
  • Used by monitoring services
  • Simple debugging tool

This endpoint became my best friend during deployment debugging.


🚧 Production Bugs: The 2 AM Stories

Bug #1: The 30-Second Cold Start

The Problem:

I sent the API link to a friend.

"Check out my live project!"

I waited.

30 seconds passed.

Still loading.

I refreshed. 30 more seconds.

Internal panic.

"Is it down? Did I break something??"

Checked Render logs: "Spinning up service from sleep..."

Oh.

Free tier sleeps after 15 minutes.

Every. Single. Time.

The Solution:

Can't eliminate cold starts on free tier.

But I could handle the UX:

  1. Health check endpoint - Wake it up faster
  2. Frontend loading state - "Server waking up, please wait..."
  3. Set expectations - Document it's free tier

Better solution: Upgrade to paid tier ($7/month for always-on).

But this is a learning project. Free tier it is.

Lesson learned: Constraints force creative solutions. Or at least, creative messaging.


Bug #2: The Case-Sensitive Environment Variable

The Problem:

API deployed successfully.

Opened the docs page.

Everything loaded.

Tried to login.

```json
{
  "detail": "Internal Server Error"
}
```

Checked logs:

```
AttributeError: 'NoneType' object has no attribute 'decode'
```

What?

Spent an hour debugging.

Checked JWT code. Looked fine.
Checked database connection. Working.
Checked everything. Nothing made sense.

Then I saw it.

Environment variable in Render dashboard:

SECRECT_KEY instead of SECRET_KEY

I misspelled "SECRET" as "SECRECT".

JWT library tried to decode with None as the key.

Crashed.

The Fix:

  1. Fixed the typo
  2. Added startup validation:
```python
# app/main.py
@app.on_event("startup")
async def validate_config():
    """Validate required configuration on startup."""
    required = ["SECRET_KEY", "REFRESH_SECRET_KEY", "DATABASE_URL"]

    missing = [key for key in required if not getattr(settings, key, None)]

    if missing:
        raise ValueError(f"Missing required config: {missing}")

    logger.info("✅ Configuration validated")
```

Now if a required variable is missing, the app fails immediately on startup.

Not after the first request.

Lesson learned: Fail fast. Validate early. Save yourself hours of debugging.


Bug #3: Database Connection Pool Exhaustion

The Problem:

API ran fine for a few hours.

Then started throwing errors:

```
asyncpg.exceptions.TooManyConnectionsError: too many connections
```

What?

I wasn't doing anything different.

No traffic spike.

Just... ran out of connections.

The Root Cause:

Some error paths weren't closing database sessions.

Session leaked.
Connection leaked.
Pool exhausted.

The Fix:

  1. Ensure session cleanup in the dependency:

```python
# app/database.py
async def get_db() -> AsyncSession:
    async with AsyncSessionLocal() as session:
        try:
            yield session
        finally:
            await session.close()  # Always close, even on error
```

  2. Configure connection pool limits:

```python
# app/database.py
engine = create_async_engine(
    settings.DATABASE_URL,
    pool_size=5,          # Max persistent connections
    max_overflow=10,      # Extra connections allowed under load
    pool_pre_ping=True,   # Verify connections before use
    pool_recycle=3600     # Recycle connections after 1 hour
)
```

After the fix: No more connection errors.

Lesson learned: Connection pools are finite resources. Manage them carefully.
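With those settings, the arithmetic is worth spelling out: each application instance can hold at most pool_size + max_overflow connections, and every instance counts against the database server's global limit. A quick sketch (the single-instance assumption matches Render's free tier; scale-out multiplies the number):

```python
# Connection budget per application instance
pool_size = 5
max_overflow = 10
instances = 1  # Render free tier runs a single instance

max_app_connections = (pool_size + max_overflow) * instances
print(max_app_connections)  # 15 connections at peak
```

If you later run multiple instances, this product has to stay below the database's connection limit, or you're back to TooManyConnectionsError.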


Bug #4: Alembic Migration Conflicts

The Problem:

Deployed new code with database migration.

Render build failed:

```
alembic.util.exc.CommandError:
Can't locate revision identified by 'abc123'
```

Why?

I created migrations locally.
Pushed code.
But Render's database didn't have the previous migrations.

The Fix:

Run migrations as part of deployment:

```dockerfile
# In Dockerfile CMD
CMD alembic upgrade head && uvicorn app.main:app --host 0.0.0.0 --port 8000
```

This ensures:

  • Migrations run before app starts
  • Database is always up to date
  • No manual migration steps

Also added:

```shell
# Before creating a new migration locally
alembic current       # Check current state
alembic upgrade head  # Apply all pending migrations
```

Lesson learned: Migrations are sequential. Treat them like git commits. Linear history matters.


📊 Performance: 30 Days of Real Data

The Numbers

After 30 days of production traffic, here's how it actually performed:

Response Times:

| Endpoint | Average | P95 | P99 |
| --- | --- | --- | --- |
| Health check | 45ms | 89ms | 142ms |
| Login | 156ms | 278ms | 421ms |
| Get tasks | 87ms | 156ms | 289ms |
| Create task | 112ms | 198ms | 334ms |

Overall average: ~190ms

My initial target: 100ms

Reality: Almost 2x slower.

Why?

  • Neon serverless adds latency (~20ms)
  • Free tier CPU is shared
  • No caching layer
  • Some queries not optimized

Is 190ms acceptable?

For a free-tier learning project? Yes.

For a production SaaS? No. I'd need caching and optimization.


Uptime Tracking

Tool: UptimeRobot (free tier)

30-day results:

  • ✅ 99.2% uptime
  • ❌ 3 outages (total: 5 hours 47 minutes)
  • ⏱️ Average response: 180ms
  • 📈 Checks performed: 43,200

Outage breakdown:

  1. Render platform maintenance (2 hours)
  2. Neon database scaling issue (30 minutes)
  3. My deployment bug (15 minutes - the ENV variable typo)

99.2% uptime on $0 infrastructure?

Honestly better than I expected.

For comparison, AWS promises 99.99% (that's 4 minutes downtime per month).

I had ~5 hours downtime in 30 days.

Not bad for free.


Database Query Performance

Biggest optimization: Fixing N+1 queries.

Before:

```python
# Get tasks - N+1 problem
tasks = await task_repository.get_by_user(user_id)
for task in tasks:
    print(task.category.name)  # Separate query for each!
```

After:

```python
# Eager load relationships
from sqlalchemy import select
from sqlalchemy.orm import selectinload

result = await db.execute(
    select(Task)
    .options(selectinload(Task.category))
    .where(Task.user_id == user_id)
)
tasks = result.scalars().all()
```

Performance improvement: 3x faster for lists with categories.

Lesson learned: Async doesn't make bad queries good. Optimize your database access.


💰 Cost Analysis: The $0 Stack

Monthly Infrastructure Costs

| Service | Plan | Cost |
| --- | --- | --- |
| Render | Free tier | $0 |
| Neon PostgreSQL | Free tier | $0 |
| Domain (optional) | Namecheap | ~$1/month |
| Monitoring | UptimeRobot free | $0 |
| Total | | $0/month |

Yearly cost: $0 (or $12 if you buy a domain)


Free Tier Constraints

Render Free Tier:

  • ✅ 750 hours/month (enough for one service)
  • ✅ Auto-deploy from Git
  • ✅ SSL certificate included
  • ❌ Sleeps after 15 min inactivity
  • ❌ Limited to 512MB RAM
  • ❌ Shared CPU

Neon Free Tier:

  • ✅ 3GB storage
  • ✅ Unlimited databases
  • ✅ Serverless auto-scaling
  • ❌ Limited compute units
  • ❌ Shared resources

When to Upgrade?

Stay on free tier if:

  • Learning project
  • Portfolio piece
  • Low traffic (<1000 requests/day)
  • Can tolerate cold starts
  • No SLA requirements

Upgrade to paid ($7-10/month) if:

  • Real users depending on it
  • Need consistent performance
  • Can't afford downtime
  • Traffic grows
  • Professional project

For this project: Free tier is perfect.

It does everything I need for a portfolio piece.


💡 Lessons Learned

1. Tests Save More Time Than They Take

My initial thought: "Tests slow me down. I'll skip them."

Reality: Tests caught 15+ bugs before production.

Time spent writing tests: 1 week
Time saved debugging production: Probably 2+ weeks

ROI: Massively positive.

Tests aren't overhead. They're insurance.


2. Docker Isn't Optional

Why Docker matters:

Reason 1: "Works on my machine" becomes "works everywhere"
Reason 2: Deployment is just "push image"
Reason 3: Forces you to think about dependencies

Time invested learning Docker: 2 days
Time saved in deployment issues: Countless hours

Every modern backend project should use Docker.

Not optional.


3. Monitoring Isn't Optional Either

Without monitoring, you're blind.

Questions you can't answer:

  • Is the API down?
  • Why is it slow?
  • Where are errors happening?
  • What's the actual performance?

With basic monitoring:

  • Health check endpoint
  • Logging to stdout
  • UptimeRobot for uptime
  • Manual log review

Cost: $0
Value: Priceless

You can't fix what you can't measure.
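"Logging to stdout" deserves one concrete line of setup: both Docker and Render capture stdout, so the app should never write log files of its own. A minimal sketch of what that configuration could look like (the logger name and format string are my assumptions, not the project's exact code):

```python
import logging
import sys

def configure_logging(level: int = logging.INFO) -> logging.Logger:
    """Send all app logs to stdout so the platform captures them."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(levelname)s %(name)s %(message)s"
    ))
    logger = logging.getLogger("app")
    logger.setLevel(level)
    logger.handlers = [handler]  # Replace any default handlers
    return logger
```

Call it once at startup; every module then logs through `logging.getLogger("app.some_module")` and inherits the handler.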


4. Free Tier Forces Better Decisions

Constraints I faced:

  • No Redis โ†’ In-memory rate limiting
  • Cold starts โ†’ UX handling + health checks
  • Limited resources โ†’ Query optimization

Result: Better architecture.

Paid tiers let you throw money at problems.

Free tiers force you to solve them properly.

The best learning happens under constraints.


5. Production Is the Real Teacher

Local development teaches:

  • How to write code
  • How to structure projects
  • How to use frameworks

Production teaches:

  • How systems fail
  • How to debug without IDE
  • How to handle real constraints
  • How to think about users

The gap between these two is where engineers are made.


🔄 What I'd Do Differently

1. Write Tests Earlier (TDD)

What I did: Built features, then wrote tests.

Problem: Had to refactor code to make it testable.

Better approach: Test-Driven Development

  • Write test first
  • Implement feature
  • Refactor

Why: Forces good design from the start.


2. Set Up CI/CD Immediately

What I did: Manually ran tests before deploying.

Problem: Forgot twice. Deployed broken code.

Better approach: GitHub Actions

```yaml
# Simple workflow
name: Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - run: pytest --cov=app tests/
```

Benefit: Tests run automatically. Can't merge broken code.


3. Add Caching Earlier

What I did: Direct database queries for everything.

Problem: Repeated queries for same data.

Better approach: Redis caching for:

  • User sessions
  • Frequently accessed data
  • API rate limit counters

Why I didn't: Free tier doesn't include Redis.

Workaround: functools.lru_cache for config/static data.
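In practice the workaround is tiny. A minimal sketch, assuming settings are static for the lifetime of the process (note that lru_cache has no expiry, so it suits config, never user data):

```python
from functools import lru_cache

@lru_cache(maxsize=1)
def get_settings() -> dict:
    """Build the settings object once; later calls return the cached one."""
    # In the real app this would read environment variables.
    return {"app_name": "task-api", "page_size": 20}
```

FastAPI's own documentation uses this same pattern for dependency-injected settings objects.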


4. Implement Database Backups

What I did: Relied on Neon's automatic backups.

Problem: No control over backup schedule or retention.

Better approach:

  • Scheduled pg_dump to S3
  • Version-controlled schema
  • Tested restore process

Why it matters: Data loss is unacceptable. Even for learning projects.
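For reference, the scheduled dump could be as small as one crontab line (the paths and schedule are assumptions, and the S3 sync would be a second job; note that `%` must be escaped in crontab entries):

```
# Nightly logical backup at 02:00, date-stamped and compressed
0 2 * * * pg_dump "$DATABASE_URL" | gzip > /backups/taskapi-$(date +\%F).sql.gz
```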


5. Use Feature Flags

What I did: Deploy features directly to production.

Problem: Can't disable buggy features without rollback.

Better approach:

```python
import os

# Simple feature flags
class FeatureFlags:
    ENABLE_EMAIL_VERIFICATION = os.getenv("FEATURE_EMAIL", "true") == "true"
    ENABLE_ADVANCED_SEARCH = os.getenv("FEATURE_SEARCH", "false") == "true"

# Usage
if FeatureFlags.ENABLE_ADVANCED_SEARCH:
    ...  # New feature logic
else:
    ...  # Old feature logic
```

Benefit: Enable/disable features without deploying.


🎯 Final Thoughts

What This Project Taught Me

1. Architecture matters โ€” but implementation teaches you why

2. Tests aren't optional โ€” they're the difference between hobby and professional

3. Production is different โ€” cold starts, CORS, connection pools, migrations... none of this shows up in tutorials

4. Free tier is enough โ€” to learn, to build portfolio, to prove you can ship

5. Building in public works โ€” documenting this journey kept me accountable


The Numbers

What I built:

  • ✅ 50+ API endpoints
  • ✅ JWT authentication
  • ✅ 87% test coverage
  • ✅ Clean architecture
  • ✅ Dockerized
  • ✅ Deployed (99.2% uptime)
  • ✅ $0 infrastructure cost

What I learned:

  • Way more than any tutorial could teach
  • The gap between "working" and "production-ready"
  • How to debug systems you can't see
  • How to make architectural decisions under constraints
  • How to ship something real

Was It Worth It?

Absolutely.

Not just for the portfolio.

Not just for the resume.

For the learning.

There's a massive difference between:

  • Following a tutorial
  • Building from scratch
  • Deploying to production

This project forced me through all three.

Every bug taught me something.
Every constraint forced creativity.
Every deployment taught me production.

That's the real value.


🚀 Try It Yourself

Live API: https://task-management-api-a775.onrender.com/docs

(First request might take 30 seconds if it's sleeping - free tier!)

GitHub: https://github.com/ravigupta97/task_management_api

Clone and run:

```shell
git clone https://github.com/ravigupta97/task_management_api
cd task_management_api
docker-compose up
```

Open http://localhost:8000/docs

Everything you need is in the README.


💬 Discussion

Questions for you:

  1. What's your testing strategy for async APIs?
  2. How do you handle cold starts on free tiers?
  3. What was your biggest production surprise?
  4. What free tier tools do you use?

Let's learn together - comment below! 👇


This completes the 3-part series on building a production-ready FastAPI application.

What's next for me?

Follow along: LinkedIn | GitHub

#FastAPI #Python #Testing #Docker #Deployment #DevOps #BackendDevelopment #BuildInPublic #SoftwareEngineering
