Web Architecture Patterns for Modern Full-Stack Applications

#webarchitecture #performance #saas

I've shipped CitizenApp to production with 9 AI features across millions of API calls. The architecture decisions I made at months 1, 6, and 12 were completely different—and that's the point. This post covers the real trade-offs you'll face scaling from MVP to 10k+ users, without the consultant speak.

The Monolith vs Microservices False Choice

Everyone warns you about monoliths. I built CitizenApp as a monolith deliberately. Here's why: microservices don't solve business problems, they solve organizational problems.

At 50 users, a monolith deployed to Render costs $12/month. A microservices setup with 3 services, separate databases, and observability? $800+/month minimum, plus 6 months of engineering time fighting eventual consistency bugs.

I structured my monolith to feel modular:

// apps/api/src/modules/
// ├── auth/
// ├── documents/
// ├── ai-features/
// └── billing/

// Each module is self-contained with clear boundaries
// src/modules/documents/routes.ts
import { Router } from "express";
import { DocumentService } from "./service";
import { documentSchema } from "./schema";
import { authenticate } from "@/middleware/auth";

const router = Router();
const service = new DocumentService();

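router.post("/", authenticate, async (req, res, next) => {
  try {
    // parse() is synchronous for Zod-style schemas and throws on invalid input
    const validated = documentSchema.parse(req.body);
    const result = await service.create(validated, req.user.id);
    res.json(result);
  } catch (err) {
    next(err); // Express 4 does not forward rejections from async handlers on its own
  }
});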

export default router;

This structure lets me:

Test modules independently
Move a module to a separate service later (I moved billing to its own service at 8k users)
Onboard developers without explaining distributed systems
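
To make the "move a module later" point concrete, here is a minimal sketch in Python of the kind of boundary that keeps a later split cheap. It is not CitizenApp's code: `BillingGateway`, `InProcessBilling`, `DocumentsModule`, and the 500-cent charge are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Invoice:
    tenant_id: str
    amount_cents: int


class BillingGateway(Protocol):
    """The only billing surface other modules are allowed to depend on."""

    def create_invoice(self, tenant_id: str, amount_cents: int) -> Invoice: ...


class InProcessBilling:
    """Monolith implementation: a plain method call inside the same process."""

    def create_invoice(self, tenant_id: str, amount_cents: int) -> Invoice:
        if amount_cents <= 0:
            raise ValueError("amount_cents must be positive")
        return Invoice(tenant_id=tenant_id, amount_cents=amount_cents)


class DocumentsModule:
    """Depends on the gateway interface, never on billing internals."""

    def __init__(self, billing: BillingGateway) -> None:
        self._billing = billing

    def charge_for_export(self, tenant_id: str) -> Invoice:
        return self._billing.create_invoice(tenant_id, amount_cents=500)


# Wiring happens in one place, so swapping the implementation is a one-line change.
documents = DocumentsModule(billing=InProcessBilling())
invoice = documents.charge_for_export("tenant-a")
```

If billing later becomes its own service, a class with the same `create_invoice` method that makes an HTTP call could replace `InProcessBilling` in the wiring, and `DocumentsModule` would not need to change.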

When to actually split: When a single team owns a module full-time AND it has fundamentally different scaling requirements. For CitizenApp, AI processing scaled independently, so I moved it to FastAPI workers. Everything else stayed together.

API Design: REST vs GraphQL vs RPC

I prefer REST with strong contracts over GraphQL. Here's what burned me: GraphQL's flexibility is a business liability in production.

With REST, a client request for /documents is explicit:

# FastAPI backend
@router.get("/documents")
async def list_documents(
    skip: int = 0,
    limit: int = 100,
    tenant_id: str = Depends(get_tenant),
    current_user: User = Depends(get_current_user)
):
    return await db.query(Document).filter(
        Document.tenant_id == tenant_id,
        Document.user_id == current_user.id
    ).offset(skip).limit(limit).all()

I can add rate limiting, caching, and analytics at the route level. I know exactly what queries clients will make.
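
As a rough illustration of route-level rate limiting, here is a minimal fixed-window limiter in plain Python. It is a sketch under assumptions, not what CitizenApp runs: `FixedWindowLimiter` and the `tenant:route` key format are hypothetical, and the state lives in process memory, so it only works for a single worker.

```python
import time
from typing import Callable


class FixedWindowLimiter:
    """Allow at most `limit` calls per `window_seconds` for each key."""

    def __init__(
        self,
        limit: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._clock = clock
        # key -> (window start time, calls counted in that window)
        self._windows: dict[str, tuple[float, int]] = {}

    def allow(self, key: str) -> bool:
        now = self._clock()
        start, count = self._windows.get(key, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # window expired, start a fresh one
        if count >= self.limit:
            return False
        self._windows[key] = (start, count + 1)
        return True


# Usage with a fake clock so the behaviour is deterministic.
fake_now = [0.0]
limiter = FixedWindowLimiter(limit=2, window_seconds=60, clock=lambda: fake_now[0])
key = "tenant-a:GET /documents"
first_three = [limiter.allow(key) for _ in range(3)]  # third call is rejected
fake_now[0] = 61.0
after_window = limiter.allow(key)  # a new window has started
```

Because the route is explicit, the key can be as coarse or as fine as needed (per tenant, per user, per route). With several workers the counters would have to move to a shared store.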

With GraphQL, clients write arbitrary queries. I've watched 20 different clients each optimize for their use case, which means 20 database queries that could have been one. Yes, you can solve this with dataloaders and query complexity scoring, but that's complexity you're adding after shipping.
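
For a sense of what the dataloader fix involves, here is a minimal sketch of the batching idea using only asyncio. `BatchLoader`, `fetch_users`, and `resolve_owner_email` are hypothetical names; real GraphQL servers ship their own dataloader implementations, and this is not one of them.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class BatchLoader:
    """Collect keys requested in the same event-loop pass and fetch them in one call."""

    def __init__(self, batch_fn: Callable[[list[str]], Awaitable[dict[str, str]]]) -> None:
        self._batch_fn = batch_fn
        self._pending: dict[str, asyncio.Future] = {}
        self._task: Optional[asyncio.Task] = None

    def load(self, key: str) -> asyncio.Future:
        loop = asyncio.get_running_loop()
        if key not in self._pending:
            self._pending[key] = loop.create_future()
        if self._task is None:
            # Runs after the resolvers that are already scheduled have queued their keys.
            self._task = loop.create_task(self._dispatch())
        return self._pending[key]

    async def _dispatch(self) -> None:
        pending, self._pending = self._pending, {}
        self._task = None
        try:
            rows = await self._batch_fn(list(pending))
        except Exception as exc:
            for future in pending.values():
                future.set_exception(exc)
            return
        for key, future in pending.items():
            future.set_result(rows.get(key))


async def resolve_owner_email(loader: BatchLoader, user_id: str) -> str:
    return await loader.load(user_id)


async def main() -> tuple[list[str], list[list[str]]]:
    calls: list[list[str]] = []

    async def fetch_users(ids: list[str]) -> dict[str, str]:
        calls.append(sorted(ids))  # stand-in for one SELECT ... WHERE id IN (...)
        return {user_id: f"{user_id}@example.com" for user_id in ids}

    loader = BatchLoader(fetch_users)
    emails = await asyncio.gather(
        *(resolve_owner_email(loader, uid) for uid in ["u1", "u2", "u1"])
    )
    return list(emails), calls


emails, calls = asyncio.run(main())
```

Three resolver calls collapse into a single batch fetch with two distinct keys. It works, but it is exactly the kind of machinery that a fixed REST route never needs.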

My rule: Use REST for business-critical APIs. Use GraphQL (or tRPC) for internal tools only.

Here's my preferred API shape for CitizenApp:

// POST /api/documents/{id}/ai-process
interface ProcessRequest {
  featureId: string;
  parameters: Record<string, unknown>;
  webhookUrl?: string; // async processing
}

interface ProcessResponse {
  jobId: string;
  status: "queued" | "processing" | "completed" | "failed";
  result?: unknown;
  error?: string;
}

This is explicit, versioning is clear, and I can rate-limit per feature. No client can accidentally request cross-tenant data by writing a weird query.
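
One benefit of a closed set of status values is that the backend can reject impossible transitions. A minimal sketch in Python follows; the lifecycle in `ALLOWED` is an assumption for illustration (a setup with retries might also allow failed back to queued), and `JobStatus` and `transition` are hypothetical names.

```python
from enum import Enum


class JobStatus(str, Enum):
    QUEUED = "queued"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed lifecycle: queued -> processing -> completed, with failure possible
# from either non-terminal state. Terminal states allow no further moves.
ALLOWED: dict[JobStatus, set[JobStatus]] = {
    JobStatus.QUEUED: {JobStatus.PROCESSING, JobStatus.FAILED},
    JobStatus.PROCESSING: {JobStatus.COMPLETED, JobStatus.FAILED},
    JobStatus.COMPLETED: set(),
    JobStatus.FAILED: set(),
}


def transition(current: JobStatus, target: JobStatus) -> JobStatus:
    """Return the new status, or raise if the move is not in the allowed table."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target


status = transition(JobStatus.QUEUED, JobStatus.PROCESSING)
status = transition(status, JobStatus.COMPLETED)
```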

Data Flow: Where Does Business Logic Live?

This is where I see most architectures fail. Business logic ends up in three places at once: the backend, the database (triggers), and the frontend (optimistic updates). Then someone changes one place and nothing breaks visibly until production.

I enforce a rule: Business logic lives in the backend only. The database is a data store, not a server.

# src/modules/documents/service.py
from datetime import datetime, timezone

class DocumentService:
    async def process_with_ai(
        self,
        document_id: str,
        feature_id: str,
        tenant_id: str,
        user_id: str
    ) -> ProcessResult:
        # 1. Authorization check (backend only)
        doc = await self.db.get_document(document_id, tenant_id)
        if not doc:
            raise PermissionError("Document not found")

        if not await self.check_access(user_id, doc.id):
            raise PermissionError("No access")

        # 2. Business logic
        feature = await self.get_feature(feature_id)
        if not feature.is_enabled_for_tenant(tenant_id):
            raise ValueError("Feature not available")

        # 3. State mutation (atomic)
        async with self.db.transaction():
            doc.status = "processing"
            doc.updated_at = datetime.now(timezone.utc)  # timezone-aware UTC, not server-local time
            await self.db.update(doc)

            job = Job(
                document_id=doc.id,
                feature_id=feature_id,
                tenant_id=tenant_id,
                status="queued"
            )
            await self.db.create(job)

        # 4. Side effects (after mutation)
        await self.queue.enqueue(
            "process_document",
            job_id=job.id,
            retry=3
        )

        return ProcessResult(job_id=job.id, status="queued")

The frontend is dumb:

// React component
const [isProcessing, setIsProcessing] = useState(false);

const handleProcess = async () => {
  setIsProcessing(true);
  try {
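    const res = await fetch(`/api/documents/${docId}/ai-process`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ featureId, parameters: {} }),
    });
    if (!res.ok) throw new Error(`Request failed: ${res.status}`);

    const data = await res.json();
    // No business logic here: record the job and start watching it
    setJobId(data.jobId);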
    pollJobStatus(data.jobId); // SSE or polling
  } finally {
    setIsProcessing(false);
  }
};

This means:

Authorization is always checked (no auth bypasses)
State mutations are atomic
Audit logs are trivial to add
Frontend bugs can't corrupt your database
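
To show why audit logging gets easy once every mutation goes through a service method, here is a minimal sketch of an audit decorator in Python. The `audited` decorator, the in-memory `AUDIT_LOG`, and the simplified `process_with_ai` function are hypothetical stand-ins, not CitizenApp's implementation.

```python
import asyncio
import functools
from datetime import datetime, timezone
from typing import Any, Awaitable, Callable

AUDIT_LOG: list[dict[str, Any]] = []  # stand-in for a real audit table


def audited(action: str) -> Callable:
    """Record who did what, for which tenant, and whether it succeeded."""

    def decorator(fn: Callable[..., Awaitable[Any]]) -> Callable[..., Awaitable[Any]]:
        @functools.wraps(fn)
        async def wrapper(*args: Any, user_id: str, tenant_id: str, **kwargs: Any) -> Any:
            entry = {
                "action": action,
                "user_id": user_id,
                "tenant_id": tenant_id,
                "at": datetime.now(timezone.utc).isoformat(),
                "outcome": "ok",
            }
            try:
                return await fn(*args, user_id=user_id, tenant_id=tenant_id, **kwargs)
            except Exception as exc:
                entry["outcome"] = f"error: {type(exc).__name__}"
                raise
            finally:
                AUDIT_LOG.append(entry)  # written on success and on failure

        return wrapper

    return decorator


@audited("documents.process_with_ai")
async def process_with_ai(document_id: str, *, user_id: str, tenant_id: str) -> str:
    if document_id == "missing":
        raise PermissionError("Document not found")
    return "queued"


async def demo() -> str:
    result = await process_with_ai("doc-1", user_id="u1", tenant_id="t1")
    try:
        await process_with_ai("missing", user_id="u1", tenant_id="t1")
    except PermissionError:
        pass
    return result


result = asyncio.run(demo())
```

Both the successful call and the rejected one leave an entry, which only works because no other code path can mutate documents.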

Scaling Pain Points You'll Hit

Problem 1: Tenant isolation in multi-tenancy

At 3k users across 50 tenants, I added row-level tenant isolation the hard way, after a data leak. The tenant ID is now resolved once on the server, and every query filters on it:

# Middleware that runs on every request
async def enforce_tenant_isolation(request: Request, call_next):
    tenant_id = extract_tenant_id(request)
    # Store in context, NOT in headers the client can fake
    request.state.tenant_id = tenant_id
    request.state.current_user = await verify_token(request)

    return await call_next(request)

# Every database query checks tenant_id
await db.query(Document).filter(
    Document.tenant_id == request.state.tenant_id
).all()
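
Relying on every query author to remember the tenant filter is fragile, so one option is to make the filter impossible to skip. Below is a minimal sketch of that idea using Python's `contextvars`; `TenantScopedRepo`, `_current_tenant`, and the in-memory rows are hypothetical, and a real version would wrap the database session instead of a list.

```python
from contextvars import ContextVar
from dataclasses import dataclass

# Set once per request by middleware; asyncio gives each task its own copy of the context.
_current_tenant: ContextVar[str] = ContextVar("current_tenant")


@dataclass(frozen=True)
class Document:
    id: str
    tenant_id: str
    title: str


class TenantScopedRepo:
    """Every read goes through the tenant filter; there is no unscoped method."""

    def __init__(self, rows: list[Document]) -> None:
        self._rows = rows

    def list_documents(self) -> list[Document]:
        # .get() raises LookupError if no tenant was set, so a missing
        # tenant fails loudly instead of returning everyone's data.
        tenant_id = _current_tenant.get()
        return [row for row in self._rows if row.tenant_id == tenant_id]


repo = TenantScopedRepo(
    [
        Document("d1", "tenant-a", "Spec"),
        Document("d2", "tenant-b", "Notes"),
    ]
)

token = _current_tenant.set("tenant-a")  # what the middleware would do
try:
    visible = repo.list_documents()
finally:
    _current_tenant.reset(token)  # back to "no tenant" after the request
```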

Problem 2: Your ORM will betray you at scale

SQLAlchemy is great until you have 100 requests/second and every query either drags along 5 eager loads or triggers lazy loads inside a loop. I switched critical paths to raw SQL:

# Instead of:
# documents = session.query(Document).filter(...).all()
# for doc in documents:
#     print(doc.user.email)  # N+1 query hell

# Use:
documents = await db.execute("""
    SELECT d.id, d.title, u.email
    FROM documents d
    JOIN users u ON d.user_id = u.id
    WHERE d.tenant_id = $1
""", [tenant_id])

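The difference is easy to measure. Here is a small, self-contained sketch using the standard-library `sqlite3` module that counts statements for both approaches. The schema mirrors the query above, while the sample rows and the helper names (`run`, `emails_n_plus_one`, `emails_joined`) are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE documents (
        id INTEGER PRIMARY KEY, title TEXT, user_id INTEGER, tenant_id TEXT
    );
    INSERT INTO users VALUES (1, 'a@example.com'), (2, 'b@example.com');
    INSERT INTO documents VALUES
        (1, 'Spec', 1, 't1'), (2, 'Notes', 2, 't1'), (3, 'Other', 1, 't2');
    """
)

query_count = 0


def run(sql: str, params: tuple = ()) -> list[tuple]:
    """Execute one statement and count it."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def emails_n_plus_one(tenant_id: str) -> list[str]:
    # One query for the documents, then one more per document for its owner.
    docs = run(
        "SELECT id, user_id FROM documents WHERE tenant_id = ? ORDER BY id",
        (tenant_id,),
    )
    return [
        run("SELECT email FROM users WHERE id = ?", (user_id,))[0][0]
        for _, user_id in docs
    ]


def emails_joined(tenant_id: str) -> list[str]:
    # A single statement, with the join done by the database.
    rows = run(
        """
        SELECT u.email
        FROM documents d
        JOIN users u ON d.user_id = u.id
        WHERE d.tenant_id = ?
        ORDER BY d.id
        """,
        (tenant_id,),
    )
    return [email for (email,) in rows]


query_count = 0
slow = emails_n_plus_one("t1")
n_plus_one_queries = query_count  # 1 + number of documents

query_count = 0
fast = emails_joined("t1")
joined_queries = query_count  # always 1
```

Both functions return the same emails, but the first issues one statement per document on top of the initial query. With two documents that is 3 versus 1; with a page of 100 it is 101 versus 1.
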
Gotcha: The Cost of "Flexibility"

I built CitizenApp with JWT auth that could validate tokens without a database lookup. Sounded great for scale. Then a user's account got compromised and I couldn't revoke their token instantly—tokens were valid for 1 hour.

Now I do:

# Validation includes a fast cache check
async def verify_token(token: str):
    # Always pass an explicit algorithm allow-list; HS256 is an assumption here
    payload = jwt.decode(token, SECRET_KEY, algorithms=["HS256"])

    # Redis blacklist check (instant revocation)
    if await redis.exists(f"token_blacklist:{payload['jti']}"):
        raise InvalidToken()

    return payload

This costs ~5ms per request but is non-negotiable for security.
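
One detail worth getting right is that a blacklist entry only needs to live as long as the token it blocks, otherwise the list grows forever. Here is a minimal sketch of that expiry logic with an in-memory stand-in for Redis. `RevocationList` is a hypothetical name, and in Redis the same effect would come from setting the key with an expiry.

```python
import time
from typing import Callable


class RevocationList:
    """In-memory stand-in for a key-value store with per-key expiry."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._expires_at: dict[str, float] = {}  # jti -> token expiry (epoch seconds)

    def revoke(self, jti: str, token_exp: float) -> None:
        # An already-expired token is rejected by signature validation anyway,
        # so there is nothing to store for it.
        if token_exp > self._clock():
            self._expires_at[jti] = token_exp

    def is_revoked(self, jti: str) -> bool:
        expires_at = self._expires_at.get(jti)
        if expires_at is None:
            return False
        if expires_at <= self._clock():
            del self._expires_at[jti]  # entry outlived its token, drop it
            return False
        return True


# Usage with a fake clock: a token issued at t=0 that is valid for one hour.
fake_now = [0.0]
revoked = RevocationList(clock=lambda: fake_now[0])
revoked.revoke("jti-123", token_exp=3600.0)

blocked_now = revoked.is_revoked("jti-123")      # inside the token lifetime
other_token = revoked.is_revoked("jti-456")      # never revoked
fake_now[0] = 3601.0
blocked_later = revoked.is_revoked("jti-123")    # token expired, entry cleaned up
```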

The Actual Advice

Start with a monolith on Render ($12). Use REST APIs. Put logic in the backend. Add complexity only when you measure it matters. At 10k users, you'll have real data about what to split. That's worth more than architecture diagrams.