Nick Gojaman

Posted on Dec 24, 2025

Building a Production-Grade Serverless API on AWS

#aws #serverless #cloud #architecture

Most tutorials focus on building features. This project focused on operating a real backend system in production.

Intellpulse is a backend-first, API-only service designed to generate quantitative trading signals (BUY / HOLD / SELL) with explainability, quotas, and safe deployments. There is intentionally no UI — the goal was to design and ship a system that behaves like internal fintech infrastructure rather than a demo app.

The project emphasizes:

end-to-end system design

security and access control

rate limiting and usage tracking

safe CI/CD and deployment discipline

This article walks through the architecture, key design decisions, and tradeoffs involved in building a production-grade serverless API on AWS — without overengineering.

Architecture Overview

Intellpulse is implemented as a serverless, container-based API with a clear separation between runtime request flow and CI/CD deployment flow.

Request flow

Clients access the API over HTTPS using a Lambda Function URL

Requests are handled by a FastAPI application running inside a containerized AWS Lambda function

Authentication and quota enforcement are applied before any signal logic executes

Usage and rate-limit state is stored in DynamoDB

The API returns structured JSON responses containing signals and explainability metadata
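
The ordering in this request flow can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the Intellpulse code: the in-memory dictionaries stand in for DynamoDB, and `API_KEYS`, `DAILY_LIMIT`, `compute_signal`, and the response field names are hypothetical.

```python
import json

# Hypothetical stand-ins: a real deployment would keep keys and counters in DynamoDB.
API_KEYS = {"demo-key": "demo-client"}
DAILY_LIMIT = 2
usage = {}  # client id -> requests used today


def compute_signal(symbol: str) -> dict:
    # Placeholder logic: always HOLD, with the reasoning attached to the result.
    return {
        "symbol": symbol,
        "signal": "HOLD",
        "explanation": {"reason": "placeholder rule", "inputs": {}},
    }


def handle_request(api_key: str, symbol: str) -> tuple[int, str]:
    """Return (status_code, json_body); auth and quota run before signal logic."""
    client = API_KEYS.get(api_key)
    if client is None:
        return 401, json.dumps({"error": "invalid API key"})

    used = usage.get(client, 0)
    if used >= DAILY_LIMIT:
        return 429, json.dumps({"error": "daily quota exceeded"})
    usage[client] = used + 1

    # Signal logic is reached only after both checks have passed.
    return 200, json.dumps(compute_signal(symbol))
```

Rejected requests return before `compute_signal` is called, so unauthenticated or over-quota traffic never reaches the signal logic.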

Deployment flow

Code changes trigger a CI/CD pipeline

A container image is built and pushed to Amazon ECR

The Lambda function is updated using a digest-pinned image reference (@sha256)

Deployments are promoted through staging and production environments

This design keeps the runtime path minimal and predictable, while allowing deployments to be performed safely without configuration drift or accidental rollbacks.

🧠 Key Design Decisions

1️⃣ Why Lambda Function URLs instead of API Gateway

For this project, I intentionally used AWS Lambda Function URLs instead of Amazon API Gateway.

The goal was to expose a small, controlled API surface without introducing additional infrastructure or configuration overhead. Lambda Function URLs provide native HTTPS access, integrate cleanly with Lambda-based authentication logic, and are sufficient when advanced API Gateway features (custom domains, request mapping templates, usage plans) are not required.

This decision kept the architecture simpler while still supporting secure, production-grade access patterns.
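
To make the Function URL approach concrete, here is a rough sketch of a bare Lambda handler that checks an API key itself. It is a stand-in for the FastAPI application, not the project's code: the `x-api-key` header name, the `INTELLPULSE_API_KEY` environment variable, and the echoed response body are assumptions. The event and response shapes follow the payload format (version 2.0) that Function URLs use.

```python
import hmac
import json
import os

# Hypothetical configuration: the header name and environment variable are assumptions.
API_KEY_HEADER = "x-api-key"
EXPECTED_KEY = os.environ.get("INTELLPULSE_API_KEY", "local-dev-key")


def _response(status: int, payload: dict) -> dict:
    # Response shape accepted by Lambda Function URLs.
    return {
        "statusCode": status,
        "headers": {"content-type": "application/json"},
        "body": json.dumps(payload),
    }


def handler(event: dict, context: object) -> dict:
    # Normalize header names so the lookup does not depend on casing.
    headers = {k.lower(): v for k, v in (event.get("headers") or {}).items()}
    supplied = headers.get(API_KEY_HEADER, "")

    # compare_digest avoids leaking key contents through timing differences.
    if not hmac.compare_digest(supplied.encode(), EXPECTED_KEY.encode()):
        return _response(401, {"error": "unauthorized"})

    method = event.get("requestContext", {}).get("http", {}).get("method", "")
    path = event.get("rawPath", "/")
    return _response(200, {"method": method, "path": path})
```

With no gateway in front of the function, the application code is the only place where unauthenticated requests get rejected, which is why the check runs first.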

2️⃣ Why container-based Lambda for FastAPI

Rather than adapting FastAPI to a zip-based Lambda deployment, the service runs as a containerized Lambda function stored in Amazon ECR.

Containers provided:

full control over Python dependencies

consistent local and cloud execution environments

easier iteration without zip packaging constraints (container images can be up to 10 GB, versus 250 MB unzipped for zip deployments)

This approach allowed FastAPI to run naturally without framework-specific workarounds, while still benefiting from Lambda’s serverless execution model.

3️⃣ Why DynamoDB for rate limiting and quotas

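Rate limiting and daily usage quotas are enforced using Amazon DynamoDB rather than in-memory or middleware-based solutions. Concurrent Lambda invocations run in separate execution environments that share no memory, so an in-process counter would only see a fraction of the traffic.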

DynamoDB was chosen because it:

scales automatically with request volume

provides predictable performance under load

enables per-key usage tracking with TTL-based expiry

This design ensures that quota enforcement remains stateful, durable, and horizontally scalable, even as traffic increases.
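
One way to enforce a daily quota in DynamoDB is a conditional atomic counter. The sketch below is an assumption about how this could be done, not the project's actual schema: the key layout (`pk` as API key plus UTC date) and the attribute names `request_count` and `expires_at` are hypothetical. The `client` argument is expected to behave like a boto3 DynamoDB client, whose `update_item` call accepts these parameters.

```python
import time


def build_quota_update(table: str, api_key: str, limit: int, now: float) -> dict:
    """Build UpdateItem parameters that count one request, or fail once the limit is hit."""
    day = time.strftime("%Y-%m-%d", time.gmtime(now))
    return {
        "TableName": table,
        "Key": {"pk": {"S": f"{api_key}#{day}"}},
        # ADD creates the counter if it is missing, then increments it atomically.
        "UpdateExpression": "SET #exp = if_not_exists(#exp, :exp) ADD #cnt :one",
        # The write is rejected when the counter has already reached the limit.
        "ConditionExpression": "attribute_not_exists(#cnt) OR #cnt < :limit",
        "ExpressionAttributeNames": {"#cnt": "request_count", "#exp": "expires_at"},
        "ExpressionAttributeValues": {
            ":one": {"N": "1"},
            ":limit": {"N": str(limit)},
            # TTL attribute in epoch seconds, two days out so the item outlives its day.
            ":exp": {"N": str(int(now) + 2 * 86400)},
        },
    }


def try_consume(client, table: str, api_key: str, limit: int, now=None) -> bool:
    """Return True if the request fits within today's quota, False otherwise."""
    params = build_quota_update(table, api_key, limit, time.time() if now is None else now)
    try:
        client.update_item(**params)
        return True
    except Exception as exc:
        # botocore's ClientError carries the error code in exc.response.
        code = getattr(exc, "response", {}).get("Error", {}).get("Code")
        if code == "ConditionalCheckFailedException":
            return False
        raise
```

The date is encoded in the key because DynamoDB removes expired items in the background rather than at the exact expiry time. TTL here only cleans up old counters; the daily reset comes from the key changing at midnight UTC.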

4️⃣ Why digest-pinned deployments (@sha256)

CI/CD deployments update Lambda functions using digest-pinned container images (@sha256) rather than mutable tags such as latest.

This guarantees that:

each deployment is deterministic

rollbacks reference known artifacts

production never pulls unintended image versions

While this adds a small amount of complexity to the pipeline, it significantly improves deployment safety and traceability.
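
A digest-pinned update might look roughly like the following. This is a sketch, not the project's pipeline: the function and repository names passed in are up to the caller, and `lambda_client` is assumed to be a boto3 Lambda client, whose `update_function_code` call accepts `FunctionName` and `ImageUri`.

```python
import re

_DIGEST = re.compile(r"sha256:[0-9a-f]{64}")


def pinned_image_uri(repository_uri: str, digest: str) -> str:
    """Return an immutable image reference of the form <repo>@sha256:<hex>."""
    if not _DIGEST.fullmatch(digest):
        raise ValueError(f"not a sha256 digest: {digest!r}")
    # Reject inputs that already carry a tag or digest, such as "repo/app:latest".
    if "@" in repository_uri or ":" in repository_uri.rsplit("/", 1)[-1]:
        raise ValueError("repository URI must not already carry a tag or digest")
    return f"{repository_uri}@{digest}"


def deploy(lambda_client, function_name: str, repository_uri: str, digest: str) -> str:
    """Point the function at one exact image and return the reference that was used."""
    image_uri = pinned_image_uri(repository_uri, digest)
    lambda_client.update_function_code(FunctionName=function_name, ImageUri=image_uri)
    return image_uri
```

Because the returned reference names the image by content rather than by tag, recording it per deployment gives a rollback target that cannot silently change.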


5️⃣ Why separate staging and production environments

Even for a relatively small backend service, separate staging and production environments were maintained.

Staging allows:

validation of CI/CD changes

safe testing of deployment logic

confidence before production promotion

This mirrors patterns used in larger systems and reinforces habits that scale beyond single projects.
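
The promotion step can be reduced to a small gate. The sketch below assumes a hypothetical `deploy(environment, digest)` callable and a `smoke_test(environment)` check supplied by the pipeline; neither name comes from the project.

```python
from typing import Callable


def promote(
    digest: str,
    deploy: Callable[[str, str], None],
    smoke_test: Callable[[str], bool],
) -> str:
    """Deploy one image digest to staging, then to production only if staging passes."""
    deploy("staging", digest)
    if not smoke_test("staging"):
        return "stopped-at-staging"
    # Production receives the same digest that staging just validated.
    deploy("production", digest)
    return "promoted"
```

Passing the digest through unchanged means the artifact validated in staging is exactly the artifact that reaches production.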

⚖️ Tradeoffs and Lessons Learned

Building Intellpulse reinforced the importance of intentional tradeoffs when designing production systems.

A backend-only approach meant there was no visual demo for non-technical users, but it allowed full focus on correctness, security, and operational discipline. This tradeoff was acceptable because the target audience was API consumers rather than end users.

Choosing Lambda Function URLs simplified the architecture, but also meant accepting fewer built-in features compared to API Gateway. This reinforced the need to clearly understand service boundaries and avoid defaulting to heavier components when they are not required.

Finally, implementing CI/CD safety mechanisms such as digest-pinned deployments added upfront complexity, but significantly reduced the risk of accidental production regressions. This tradeoff proved worthwhile even at small scale.

If I were extending this system further, the next improvements would focus on operational maturity and developer ergonomics, rather than additional features.

Planned improvements include:

Managing infrastructure using Infrastructure as Code (Terraform or AWS CDK)

Adding structured observability (metrics and tracing)

Introducing a lightweight UI or dashboard for internal usage

Supporting additional signal strategies and historical queries

Importantly, these enhancements build on a stable foundation rather than compensating for architectural gaps.