Building a social platform where users can post about anything seems simple at first glance, until you reach the hardest problem: distributing content efficiently.
Have you ever wondered what happens after you publish a blog post on a platform? On most platforms, several systems react to that single action: the content gets moderated, notifications about your activity are sent, and your followers' feeds are updated. The real challenge is doing all of this without blocking the user or coupling the services together.
In this post, I'm going to introduce the fan-out architecture pattern: a powerful pattern for decoupling services and processing the same data in multiple services at the same time.
The pattern is applied inside a full-stack blogging platform, similar to Medium, dev.to and others. The full-stack application itself will be covered in the next post.
The code for the entire backend is available via the link here.
Core AWS Services and Their Roles
At a high level, the system follows a simple rule:
APIs handle commands. Events handle consequences.
The backend is implemented using AWS CDK (Python) and relies exclusively on managed, serverless services. There are no always-on servers, no shared stateful services, and no synchronous dependency chains beyond the API boundary.
When a user publishes a post, the API does only three things:
- Validate the request
- Persist the post
- Emit a Post Published event
Everything else (fetching the user's feed, search indexing, notifications, post content moderation, analytics) happens in parallel and independently of the API.
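To make that flow concrete, here's a minimal sketch of what the publish handler could look like. The environment variable names, the item shape, and the `PENDING` status are my own illustration, not the repository's exact code:

```python
import json
import os
import uuid


def build_post_item(author_id: str, title: str, content: str) -> dict:
    """Shape the post record that gets persisted and then broadcast."""
    return {
        "postId": str(uuid.uuid4()),
        "authorId": author_id,
        "title": title,
        "content": content,
        "status": "PENDING",  # moderation hasn't run yet
    }


def handler(event, context):
    # boto3 is imported lazily so the helper above stays testable offline
    import boto3

    body = json.loads(event["body"])
    post = build_post_item(body["authorId"], body["title"], body["content"])

    # 1. Persist the post
    boto3.resource("dynamodb").Table(os.environ["POSTS_TABLE"]).put_item(Item=post)

    # 2. Emit the PostPublished event; every downstream worker reacts on its own
    boto3.client("sns").publish(
        TopicArn=os.environ["POST_PUBLISHED_TOPIC_ARN"],
        Message=json.dumps(post),
    )
    return {"statusCode": 201, "body": json.dumps({"postId": post["postId"]})}
```

Notice that the handler never calls another service's API directly; it only writes one item and publishes one message.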
This architecture uses the following serverless AWS services:
- AWS Lambda (Python 3.12) for compute
- Amazon Cognito for user management
- Amazon DynamoDB as the database
- Amazon SNS as the event bus (think of it as a "broadcast channel")
- Amazon SQS for isolated, scalable workers
- Amazon API Gateway for exposing the Lambdas to the Internet
- Amazon Bedrock for post content moderation via its available LLMs
Here is the architecture diagram of the whole full-stack project, which we'll take a closer look at in the next blog post:
The Fan-Out Pattern: The Heart of the System
When a post is published, a single event is emitted to an SNS topic, which fans the published post's information out to multiple SQS queues, as you can see in the architecture diagram.
Each downstream service subscribes via its own SQS queue, backed by a dedicated Lambda function worker.
This design has several important properties:
- Loose coupling: Workers know nothing about each other and are completely independent
- Fault isolation: One failure does not affect others
- Independent scaling: Each worker scales based on its own load
- Extensibility: New consumers can be added without changing existing code; just subscribe the new worker's SQS queue to the topic
This is not just event-driven; it's fan-out by construction, which makes it easy to run multiple processes in parallel.
Another important detail: every SQS queue has its own dead-letter queue (DLQ), which collects messages that could not be processed. This makes debugging failures much easier and the system more mature and pleasant to operate. We'll take a look at how DLQs work in the next blog post.
In the CDK code, creating the SNS topic and connecting the SQS queues looks like this:
```python
self.post_published_topic = sns.Topic(self, "PostPublishedTopic")

def mk_queue(base: str) -> tuple[sqs.Queue, sqs.Queue]:
    dlq = sqs.Queue(self, f"{base}Dlq", retention_period=Duration.days(14))
    q = sqs.Queue(
        self,
        base,
        visibility_timeout=Duration.seconds(90),
        dead_letter_queue=sqs.DeadLetterQueue(queue=dlq, max_receive_count=5),
    )
    self.post_published_topic.add_subscription(subs.SqsSubscription(q))
    return q, dlq

self.feed_q, self.feed_dlq = mk_queue("FeedQueue")
self.search_q, self.search_dlq = mk_queue("SearchQueue")
self.email_q, self.email_dlq = mk_queue("EmailQueue")
self.moderation_q, self.moderation_dlq = mk_queue("ModerationQueue")
self.analytics_q, self.analytics_dlq = mk_queue("AnalyticsQueue")
```
Multiple Independent Workers Triggered by One Event
Every worker reacts to the same event but serves a different purpose:
- Feed Worker: Distributes content to followers
- Search Worker: Updates the inverted index
- Email Worker: Sends notifications via SES
- Moderation Worker: Runs AI moderation
Each worker:
- Has its own IAM role
- Has its own retry and DLQ configuration
- Can fail without cascading impact
Connecting a Lambda worker to its SQS queue is done inside the CDK code like this:
```python
# Any Lambda worker; just copy, paste and adjust!
def mk_worker(name: str, path: str, queue, extra_env=None):
    fn = _lambda.Function(
        self,
        name,
        runtime=_lambda.Runtime.PYTHON_3_12,
        handler="handler.handler",
        code=_lambda.Code.from_asset(path),
        timeout=Duration.seconds(30),
        memory_size=256,
        environment={**common_env, **(extra_env or {})},
        layers=[powertools_layer],
    )
    fn.add_event_source(sources.SqsEventSource(queue, batch_size=10))
    return fn
```
Now, let's look at what each Lambda worker does in more detail.
Feed Worker
This Lambda takes the incoming data about the published post and updates the feeds of the author's followers. A user's feed is stored in a DynamoDB table.
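Here's a hypothetical sketch of that worker. The table names, attribute names, and the SNS-over-SQS message shape are my assumptions for illustration:

```python
import json


def build_feed_entries(post: dict, follower_ids: list[str]) -> list[dict]:
    """One feed-table item per follower: partition key = the follower,
    sort key = publish time, so reading a feed later is a single Query."""
    return [
        {
            "userId": follower,
            "publishedAt": post["publishedAt"],
            "postId": post["postId"],
            "title": post["title"],
        }
        for follower in follower_ids
    ]


def handler(event, context):
    import os
    import boto3  # imported lazily so build_feed_entries stays testable offline
    from boto3.dynamodb.conditions import Key

    db = boto3.resource("dynamodb")
    follows = db.Table(os.environ["FOLLOWS_TABLE"])
    feed = db.Table(os.environ["FEED_TABLE"])

    for record in event["Records"]:
        # Without raw message delivery, the SQS body wraps the SNS envelope
        # and the post itself sits in its "Message" field
        post = json.loads(json.loads(record["body"])["Message"])
        followers = follows.query(
            KeyConditionExpression=Key("authorId").eq(post["authorId"])
        )["Items"]
        with feed.batch_writer() as batch:
            for entry in build_feed_entries(
                post, [f["followerId"] for f in followers]
            ):
                batch.put_item(Item=entry)
```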
Moderation Worker
Content moderation is handled by a dedicated worker using Amazon Bedrock's Nova Micro model, chosen for its low price; it works wonders for this use case.
Moderation is:
- Asynchronous
- Non-blocking
- Cheap enough to run on every post
The worker takes the incoming post data (title, content and tags), builds a prompt from it, and sends it to the Nova Micro model for analysis. The LLM returns either APPROVED or REJECTED, and the verdict is written to the DynamoDB table so that REJECTED posts are never shown to users.
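A sketch of the moderation step, using Bedrock's Converse API. The prompt wording and the fail-closed fallback are my own assumptions, not necessarily what the repository does:

```python
def build_moderation_prompt(post: dict) -> str:
    """Assumed prompt format; the repo's actual prompt may differ."""
    return (
        "You are a content moderator for a blogging platform.\n"
        "Reply with exactly one word: APPROVED or REJECTED.\n\n"
        f"Title: {post['title']}\n"
        f"Tags: {', '.join(post.get('tags', []))}\n"
        f"Content: {post['content']}"
    )


def moderate(post: dict) -> str:
    import boto3  # lazy import keeps build_moderation_prompt testable offline

    resp = boto3.client("bedrock-runtime").converse(
        modelId="amazon.nova-micro-v1:0",
        messages=[
            {"role": "user", "content": [{"text": build_moderation_prompt(post)}]}
        ],
    )
    verdict = resp["output"]["message"]["content"][0]["text"].strip().upper()
    # Fail closed: anything that isn't an explicit APPROVED gets rejected
    return "APPROVED" if verdict.startswith("APPROVED") else "REJECTED"
```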
Email Worker
This one is very simple: take the incoming post data and send it via SES to the followers found in the DynamoDB table.
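As a sketch, the SES part could look like this. The sender address and the notification wording are placeholders I made up:

```python
def build_notification(post: dict) -> dict:
    """Subject and body for the follower notification (illustrative format)."""
    return {
        "subject": f"New post: {post['title']}",
        "body": f"{post['authorId']} just published \"{post['title']}\". Check it out!",
    }


def notify_followers(post: dict, follower_emails: list[str]) -> None:
    import boto3  # lazy import keeps build_notification testable offline

    ses = boto3.client("ses")
    msg = build_notification(post)
    for email in follower_emails:
        ses.send_email(
            Source="no-reply@example.com",  # must be an SES-verified sender
            Destination={"ToAddresses": [email]},
            Message={
                "Subject": {"Data": msg["subject"]},
                "Body": {"Text": {"Data": msg["body"]}},
            },
        )
```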
Search Worker
Instead of using OpenSearch or another dedicated search engine, I've chosen DynamoDB for search indexing. The reason is very simple: DynamoDB gives you similar functionality for a fraction of the cost.
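To show the idea, here's a minimal sketch of such an inverted index in DynamoDB: the token is the partition key and the post ID is the sort key, so looking up a search term is one Query. The tokenization rules and attribute names are my own illustration:

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase, split on non-alphanumerics, drop very short tokens."""
    return {t for t in re.split(r"[^a-z0-9]+", text.lower()) if len(t) > 2}


def build_index_items(post: dict) -> list[dict]:
    """One item per token: partition key = token, sort key = postId,
    so searching a term is a single Query on that token's partition."""
    tokens = tokenize(post["title"]) | {t.lower() for t in post.get("tags", [])}
    return [{"token": tok, "postId": post["postId"]} for tok in sorted(tokens)]
```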
All Lambda code is available in the GitHub repository linked at the beginning of this post.
Now that we've covered the heart of the application, let's look at the architectural decisions behind the DynamoDB data model and the API endpoints exposed to users.
Data Modeling: Designing for Access Patterns
The data layer is intentionally simple and optimized around how the application reads data, not around a relational model. It's covered here alongside the fan-out pattern because it's crucial for the application's performance.
Posts Table
Stores the canonical representation of each post, indexed for:
- Fetching a post by ID
- Listing posts by author
- Filtering by moderation status
Follows Table
Represents the social graph:
- Partitioned by `authorId`
- Enables fast lookup of followers when a post is published
Feed Table
This table exists to serve one purpose: fast feed reads.
Each user has their own partition containing pre-computed feed entries, sorted by publication time. There is no aggregation or computation at read time; fetching a feed is a single DynamoDB query.
This design only works because of one key decision: fan-out on write.
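As a sketch, a feed read under these assumptions is a single Query against the caller's partition (attribute and environment variable names are illustrative):

```python
def feed_query_params(user_id: str, limit: int = 20) -> dict:
    """Kwargs for one DynamoDB Query: the caller's own partition,
    newest entries first. Attribute names are my own illustration."""
    return {
        "KeyConditionExpression": "userId = :uid",
        "ExpressionAttributeValues": {":uid": user_id},
        "ScanIndexForward": False,  # sort key is publishedAt, so newest first
        "Limit": limit,
    }


def fetch_feed(user_id: str) -> list[dict]:
    import os
    import boto3  # lazy import keeps feed_query_params testable offline

    table = boto3.resource("dynamodb").Table(os.environ["FEED_TABLE"])
    return table.query(**feed_query_params(user_id))["Items"]
```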
Search Index Table
Instead of using OpenSearch, the system implements a lightweight inverted index in DynamoDB by tokenizing titles and tags. It’s intentionally limited but dramatically cheaper and simpler for MVP-level search requirements.
The API Layer: Thin by Design
The API exposes a small set of endpoints:
- Publish a post
- Retrieve a post
- Fetch a user’s feed
- Search content
Very simple, very straightforward. To expose a Lambda to the Internet inside the CDK code, you would do it like this:
```python
# Define the Lambda resource
get_post_fn = _lambda.Function(
    self,
    "GetPostFn",
    runtime=_lambda.Runtime.PYTHON_3_12,
    handler="get_post.handler",
    code=_lambda.Code.from_asset("services/api/handlers"),
    timeout=Duration.seconds(10),
    memory_size=256,
    environment={
        "POSTS_TABLE": storage.posts_table.table_name,
        "POWERTOOLS_LOG_LEVEL": "INFO",
    },
    layers=[powertools_layer],
)

# Create the root API via API Gateway
http_api = apigw.HttpApi(
    self,
    "HttpApi",
    cors_preflight=cors_preflight,
)

# Add a route to the root API
http_api.add_routes(
    path="/posts/{postId}",
    methods=[apigw.HttpMethod.GET],
    integration=integrations.HttpLambdaIntegration("GetPostInt", get_post_fn),
)
```
Security and Identity
Authentication is handled by Amazon Cognito with:
- Email-based signup
- JWT-based authorization
- OAuth 2.0 support
All endpoints open to the Internet are protected by a Cognito authorizer; that way, API Gateway handles protecting our endpoints for us.
In addition, each Lambda function inside this system operates under least-privilege IAM, with access only to the resources it requires. Workers and API Lambdas cannot accidentally read or modify unrelated data.
To protect the endpoint, you'll need a reference to the Cognito User Pool created earlier. We'll reuse the same Lambda we exposed through API Gateway and just modify the add_routes arguments:
```python
# Create the Cognito authorizer.
# HttpUserPoolAuthorizer lives in aws_cdk.aws_apigatewayv2_authorizers and
# takes the user pool construct itself, not its ID string
cognito_authorizer = authorizers.HttpUserPoolAuthorizer(
    "CognitoAuthorizer",
    user_pool,  # the cognito.UserPool created earlier
)

# Add a route to the root API
# and protect the endpoint with the Cognito authorizer
http_api.add_routes(
    path="/posts/{postId}",
    methods=[apigw.HttpMethod.GET],
    integration=integrations.HttpLambdaIntegration("GetPostInt", get_post_fn),
    authorizer=cognito_authorizer,
)
```
Cost Characteristics
All of the services used in this project fall within the AWS Free Tier, and if your traffic is low to medium (say, a few thousand users), most of them will stay within the Free Tier. The only service that might cost you a couple of dollars is Amazon Bedrock, since the project runs an LLM on every published post.
Conclusion
You've now seen the power of this pattern and how easy it makes running multiple processes in parallel on AWS serverless services.
The fan-out pattern turns a single publish action into a scalable pipeline of independent reactions. It keeps APIs fast, workers isolated, and the system easy to extend.
Most importantly, it aligns perfectly with the realities of a publishing platform:
- Reads must be instant
- Side effects must not block users
- Features will grow over time, and new ones are easy to add as SNS subscriptions
By making fan-out a first-class architectural choice, the backend remains simple even as it scales.
In the next post, we're going to take a closer look at all the functionality of the full-stack application.