Cyril Sebastian

Building Production Email Infrastructure with Postfix + AWS SES: Architecture & Design

Learn how to build a scalable, cost-effective email infrastructure using Postfix and AWS SES with complete bounce tracking. Part 1 covers architecture decisions, security, and design patterns.

Part 1: Architecture & Design Decisions

Series Navigation

Part 1: Architecture & Design ← You are here

Part 2: Implementation Guide

Part 3: Operations & Troubleshooting


TL;DR

What we're building: A production email system combining Postfix SMTP relay with AWS SES, complete with real-time bounce tracking, all visible in unified logs.

Why it matters: You can track every email from send to delivery or bounce in one log file, run the whole stack in private subnets, and pay about $30/month.

Who it's for: DevOps engineers, backend developers, and SREs managing email infrastructure.


Introduction

Have you ever sent an email and wondered: "Did it actually reach the inbox? Or did it bounce? When?"

Traditional SMTP relays tell you when they sent the email, but not when it was delivered. This gap creates a blind spot in your infrastructure.

What We're Building

An email infrastructure that provides:

  • Complete visibility: Both "sent" and "delivered" statuses

  • Real-time bounce tracking: Know immediately when emails fail

  • Unified logging: Everything in one log file (grep-friendly!)

  • Private subnet compatible: No public endpoints needed

  • Cost-effective: ~$30/month for 50,000 emails

  • Enterprise deliverability: AWS SES's 99.9% delivery rate

Who This Series Is For

DevOps Engineers looking to build a reliable email infrastructure

Backend Developers integrating email into applications

SREs who need observability into email delivery

Series Overview

Part 1 (this post): Understand the architecture and design decisions

Part 2: Step-by-step implementation guide

Part 3: Operations, monitoring, and troubleshooting


The Problem: Email Observability

Traditional SMTP Relay Limitations

When you send an email through a standard SMTP relay:

postfix: status=sent (250 Ok)

But "sent" doesn't mean "delivered"! It just means your mail server handed the email to the next server.

What You Don't Know

- Did it reach the recipient's inbox?

- Did it bounce?

- Was it marked as spam?

- How long did the delivery take?

- Which ISP was slow?

This creates a visibility gap in your infrastructure.

Solution: The Best of Both Worlds

Your App → Postfix → SES → Recipient
     ↓         ↓        ↓
   Logs    Logs    SNS→SQS→Logger
                         ↓
                  Unified Logs ✅

This approach delivers:

Both statuses: "sent" (Postfix) AND "delivered" (SES)

Private subnet: No public endpoints required

Cost-effective: ~$30/month

Standard SMTP: No code changes needed

Unix-friendly: Logs searchable with grep/awk

Full control: Own your infrastructure


Architecture Overview

The Big Picture

╔══════════════════════════════════╗
║     Application Layer            ║
║  "Send email to user@example.com"║
╚════════════╤═════════════════════╝
             │ SMTP • Port 25
             ▼
╔══════════════════════════════════╗
║      Postfix (Private Subnet)    ║
║  ├─ ✓ Sender authorized          ║
║  ├─ ➡ Forwarding to SES...       ║
║  └─ 📋 LOG: status=sent          ║
╚════════════╤═════════════════════╝
             │ SMTP + TLS • Port 587
             ▼
╔══════════════════════════════════╗
║     AWS Simple Email Service     ║
║  "Delivering to recipient..."    ║
╚════════════╤═════════════════════╝
             │
     ┌───────┴───────┐
     ▼               ▼
╔═════════════╗ ╔════════════════════╗
║  Recipient  ║ ║     Event Flow     ║
║ Mail Server ║ ║ SNS → SQS → Logger ║
╚═════════════╝ ╚══════════╤═════════╝
                           │
                           ▼
                ╔════════════════════╗
                ║    Your Logs ✓     ║
                ║  ├─ sent           ║
                ║  ├─ delivered      ║
                ║  └─ bounced        ║
                ╚════════════════════╝

Data Flow Summary

  1. Application sends email to Postfix via SMTP

  2. Postfix validates sender, relays to SES

  3. SES delivers email to recipient

  4. SES publishes event (delivery/bounce) to SNS

  5. SNS forwards event to SQS queue

  6. Python logger polls SQS, writes to syslog

  7. Result: Both "sent" and "delivered" in your logs!


Core Components Explained

1. Postfix: The Smart Relay

Role: SMTP relay with sender validation and forwarding logic

What it does:

  • Receives emails from your application

  • Validates sender addresses against the whitelist

  • Forwards to AWS SES via authenticated SMTP

  • Logs "sent" status immediately

Why Postfix?

Industry standard: Powers millions of servers

Highly configurable: Fine-grained control over routing

Excellent logging: Detailed, parseable log format

Battle-tested: Decades of production use

Performance: Handles thousands of concurrent connections

Key Configuration:

# Sender validation (whitelist)
smtpd_sender_restrictions = 
    check_sender_access hash:/etc/postfix/allowed_senders,
    reject

# Only approved senders can use relay
# Example whitelist:
# info@example.com    OK
# noreply@example.com OK
# @example.com        REJECT

Log Output Example:

Feb 25 00:05:15 mail postfix/smtp[123]: ABC123: 
  to=<user@example.com>, 
  relay=email-smtp.ap-south-1.amazonaws.com:587, 
  delay=0.14, 
  dsn=2.0.0, 
  status=sent (250 Ok 0109019c...)

What this tells you:

  • Postfix accepted the email

  • Email forwarded to SES

  • SES accepted the email (250 Ok)

  • Took 0.14 seconds
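To make the fields in that log line usable by scripts, they can be pulled out with a small parser. The following is a minimal sketch, not part of the original setup: `parse_postfix_line` is a hypothetical helper, it assumes the entry sits on one physical line (the article wraps it for readability), and it assumes the short hexadecimal queue ID format shown above.

```python
import re

# Matches the Postfix smtp client prefix and captures the queue ID.
# Assumption: short hexadecimal queue IDs such as "ABC123".
LINE_RE = re.compile(r"postfix/smtp\[\d+\]: (?P<queue_id>[0-9A-F]+): (?P<fields>.*)$")
# Matches the trailing status field and the remote server's response.
STATUS_RE = re.compile(r"status=(?P<status>\w+) \((?P<response>.*)\)$")


def parse_postfix_line(line):
    """Return a dict of fields from one postfix/smtp log line, or None."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    fields = match.group("fields")
    record = {"queue_id": match.group("queue_id")}
    # Collect the comma-terminated key=value pairs (to, relay, delay, dsn).
    for key, value in re.findall(r"(\w+)=([^,\s]+),", fields):
        record[key] = value.strip("<>")
    status = STATUS_RE.search(fields)
    if status:
        record["status"] = status.group("status")
        record["response"] = status.group("response")
    return record


sample = (
    "Feb 25 00:05:15 mail postfix/smtp[123]: ABC123: to=<user@example.com>, "
    "relay=email-smtp.ap-south-1.amazonaws.com:587, delay=0.14, dsn=2.0.0, "
    "status=sent (250 Ok 0109019c...)"
)
record = parse_postfix_line(sample)
print(record["to"], record["status"], record["delay"])
```

The `response` field keeps the SES reply verbatim, which is useful later because it carries the SES message ID.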


2. AWS SES: The Delivery Engine

Role: Actual email delivery to recipients

What it does:

  • Delivers emails to recipient mail servers

  • Handles DKIM signing for authentication

  • Manages IP reputation

  • Publishes delivery/bounce events

Why SES?

99.9% deliverability: Enterprise-grade infrastructure

Global reach: AWS's worldwide network

Pay-as-you-go: $0.10 per 1,000 emails

Built-in auth: Support for SPF, DKIM, and DMARC alignment (with the matching DNS records on your domain)

Event publishing: Real-time delivery notifications

Scalable: From 100 to millions of emails

Event Types:

  1. Delivery - Recipient's mail server accepted the email (inbox placement is not reported)

  2. Bounce - Email rejected (permanent or temporary)

  3. Complaint - Recipient marked as spam
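Each event type carries its recipients in a different place, so a logger has to branch on the type. The sketch below is one way to do that, assuming the SNS notification layout used in this article (`notificationType`, with `delivery`, `bounce`, and `complaint` objects). `summarize_ses_event` and the status labels are illustrative names, and the fallback to `eventType` is there because configuration-set event publishing uses that key instead.

```python
def summarize_ses_event(event):
    """Return (status, recipients, detail) for one SES notification dict."""
    kind = event.get("notificationType") or event.get("eventType")
    if kind == "Delivery":
        delivery = event["delivery"]
        return "delivered", delivery.get("recipients", []), delivery.get("smtpResponse", "")
    if kind == "Bounce":
        bounce = event["bounce"]
        recipients = [r["emailAddress"] for r in bounce.get("bouncedRecipients", [])]
        # bounceType is "Permanent", "Transient", or "Undetermined"
        return "bounced", recipients, bounce.get("bounceType", "")
    if kind == "Complaint":
        complaint = event["complaint"]
        recipients = [r["emailAddress"] for r in complaint.get("complainedRecipients", [])]
        return "complained", recipients, complaint.get("complaintFeedbackType", "")
    return "unknown", [], ""


# Illustrative bounce event, trimmed to the fields the function reads.
bounce_event = {
    "notificationType": "Bounce",
    "bounce": {
        "bounceType": "Permanent",
        "bouncedRecipients": [{"emailAddress": "user@example.com"}],
    },
}
print(summarize_ses_event(bounce_event))
```

Treating permanent and transient bounces differently (for example, suppressing only permanent ones) is a policy decision left to the caller.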

Why not use SES directly?

While SES has an API, using Postfix as a relay provides:

  • Standard SMTP interface (no code changes)

  • Sender validation

  • Easy provider switching

  • Centralized configuration

  • Better logging


3. The Event Pipeline: SNS → SQS → Logger

This is the secret sauce that brings delivery confirmations into your logs.

SNS (Simple Notification Service)

Role: Event broadcaster from SES

Flow:

SES delivers email
    ↓
SES publishes event to SNS
    ↓
SNS broadcasts to subscribers

Why SNS?

  • Real-time notifications

  • Fan-out to multiple destinations

  • Native SES integration

  • Filter by event type

SQS (Simple Queue Service)

Role: Message buffer between SNS and logger

Sample Event:

{
  "notificationType": "Delivery",
  "mail": {
    "messageId": "0109019c...",
    "destination": ["user@example.com"]
  },
  "delivery": {
    "timestamp": "2026-02-25T00:05:18.000Z",
    "smtpResponse": "250 ok dirdel",
    "processingTimeMillis": 3558
  }
}

Why SQS instead of HTTP webhooks?

This is a critical design decision:

HTTP Webhook Approach:

SES → SNS → HTTP POST to your server
                      ↓
                Need public endpoint
                Need ALB/Load balancer  
                Security concerns
                Webhook authentication

SQS Polling Approach:

SES → SNS → SQS Queue
              ↓
      Python script polls (outbound only)
              ↓
        No public endpoint needed!
        Works in private subnet
        Messages buffered if logger down
        IAM-based authentication

Benefits of SQS:

Private subnet compatible: Polling is outbound-only

Resilient: Messages buffered for up to 14 days (the SQS maximum retention; the default is 4 days)

No ALB needed: Saves $18/month

More reliable: No missed webhooks

IAM auth: No webhook secrets to manage
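The resilience benefit depends on one detail: a message should be deleted only after it has been handled successfully, so anything that fails stays in the queue and is retried. A minimal sketch of that pattern follows. `drain_once` is a hypothetical helper that accepts any object exposing the boto3 SQS client's `receive_message` and `delete_message` calls, and `FakeSQS` is an in-memory stand-in so the example runs without AWS access.

```python
def drain_once(sqs, queue_url, handle):
    """Receive one batch; delete each message only after `handle` succeeds."""
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,  # SQS maximum per call
        WaitTimeSeconds=20,      # long polling
    )
    processed = 0
    for message in response.get("Messages", []):
        try:
            handle(message["Body"])
        except Exception:
            # Not deleted: the message becomes visible again after the
            # visibility timeout and is picked up by a later poll.
            continue
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
        processed += 1
    return processed


class FakeSQS:
    """In-memory stand-in for a boto3 SQS client (illustration only)."""

    def __init__(self, bodies):
        self.messages = [{"Body": b, "ReceiptHandle": str(i)} for i, b in enumerate(bodies)]
        self.deleted = []

    def receive_message(self, **kwargs):
        return {"Messages": list(self.messages)}

    def delete_message(self, QueueUrl, ReceiptHandle):
        self.deleted.append(ReceiptHandle)


def handle(body):
    if body == "bad":
        raise ValueError("cannot parse event")


fake = FakeSQS(["ok-1", "bad", "ok-2"])
count = drain_once(fake, "https://example.invalid/queue", handle)
print(count, fake.deleted)
```

In a real deployment the first argument would be `boto3.client("sqs")`. Messages that keep failing are usually routed to a dead-letter queue, which is outside this sketch.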


4. The Logger: Python + Syslog

Role: Poll SQS and write events to Postfix logs

What it does:

  • Long-polls SQS (each request waits up to 20 seconds for messages)

  • Parses SES delivery/bounce events

  • Writes to syslog (same facility as Postfix)

  • Deletes processed messages from the queue

Code Snippet:

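import json
import syslog

import boto3

# Placeholder: replace with your own queue URL
queue_url = "https://sqs.ap-south-1.amazonaws.com/<account-id>/ses-events-queue"

# Map SES notification types to the status written to the log
STATUS = {'Delivery': 'delivered', 'Bounce': 'bounced', 'Complaint': 'complained'}

# Initialize SQS client (uses IAM role automatically)
sqs = boto3.client('sqs', region_name='ap-south-1')

# Open syslog once, on the mail facility so entries land next to Postfix's
syslog.openlog('postfix/ses-events',
               logoption=syslog.LOG_PID,
               facility=syslog.LOG_MAIL)

# Poll queue (long polling reduces API calls)
response = sqs.receive_message(
    QueueUrl=queue_url,
    MaxNumberOfMessages=10,
    WaitTimeSeconds=20  # Wait up to 20s for messages
)

# Process each event
for message in response.get('Messages', []):
    # The SQS body is the SNS envelope; its "Message" field holds the SES event
    event = json.loads(json.loads(message['Body'])['Message'])
    msg_id = event['mail']['messageId']
    status = STATUS.get(event.get('notificationType'), 'unknown')

    # Write to syslog (appears in Postfix logs!)
    # Simplified: every destination is logged with the event's status
    for recipient in event['mail']['destination']:
        syslog.syslog(syslog.LOG_INFO,
                      f"{msg_id}: to=<{recipient}>, "
                      f"status={status}")

    # Remove from queue only after the event has been logged
    sqs.delete_message(QueueUrl=queue_url,
                       ReceiptHandle=message['ReceiptHandle'])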

Why Syslog?

Same log file: Appears alongside Postfix logs

Auto-rotation: System handles log management

Searchable: Standard Unix tools (grep, awk)

Integration: Works with existing log aggregators

Familiar format: Same as Postfix log entries

Log Output:

Feb 25 00:05:19 mail postfix/ses-events[456]: 0109019c...: 
  to=<user@example.com>, 
  relay=amazonses.com, 
  dsn=2.0.0, 
  status=delivered, 
  delay=3607ms,
  response=(250 ok dirdel)

What this tells you:

  • Email successfully delivered

  • Took 3.6 seconds from SES to the inbox

  • Final SMTP response from recipient server
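A line in that shape can be assembled directly from the fields of an SES Delivery event. The sketch below shows one possible formatter. `format_delivery_lines` is a hypothetical name, and the fixed `relay=amazonses.com` and `dsn=2.0.0` values simply mirror the log output above; the Delivery event itself does not carry a DSN code, so hard-coding it is an assumption.

```python
def format_delivery_lines(event):
    """Build one Postfix-style log message per delivered recipient."""
    mail = event["mail"]
    delivery = event["delivery"]
    # Fall back to mail.destination when delivery.recipients is absent.
    recipients = delivery.get("recipients", mail["destination"])
    return [
        f"{mail['messageId']}: to=<{recipient}>, relay=amazonses.com, "
        f"dsn=2.0.0, status=delivered, "
        f"delay={delivery['processingTimeMillis']}ms, "
        f"response=({delivery['smtpResponse']})"
        for recipient in recipients
    ]


# The sample Delivery event from the SQS section, as a Python dict.
sample_event = {
    "notificationType": "Delivery",
    "mail": {"messageId": "0109019c...", "destination": ["user@example.com"]},
    "delivery": {
        "timestamp": "2026-02-25T00:05:18.000Z",
        "smtpResponse": "250 ok dirdel",
        "processingTimeMillis": 3558,
    },
}
for line in format_delivery_lines(sample_event):
    print(line)
```

Keeping the `key=value` layout identical to Postfix's own lines means existing grep and awk one-liners work on both kinds of entry.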


The Complete Email Journey

Let's trace a single email through the entire system. The T+ offsets in the headings are approximate and illustrative.

T+0ms: Application Sends Email

import smtplib
from email.mime.text import MIMEText

msg = MIMEText("Hello World")
msg['From'] = "info@example.com"
msg['To'] = "user@example.com"

s = smtplib.SMTP("10.0.0.23", 25)
s.sendmail("info@example.com", "user@example.com", msg.as_string())
s.quit()

What happens: SMTP connection to Postfix relay


T+10ms: Postfix Validates & Queues

Checks performed:

  1. Is sender in whitelist? (info@example.com → OK)

  2. Is sender from allowed network? (10.0.0.0/16 → OK)

  3. Queue email for delivery
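The first two checks amount to a set lookup and a CIDR membership test. The following sketch models that decision in Python purely as an illustration; it is not how Postfix implements its restrictions. `may_relay` is a hypothetical function, and the allowed senders and network are taken from the examples in this article.

```python
import ipaddress

# Values from the article's examples; adjust to your own environment.
ALLOWED_SENDERS = {"info@example.com", "noreply@example.com"}
ALLOWED_NETWORK = ipaddress.ip_network("10.0.0.0/16")


def may_relay(sender, client_ip):
    """Return True only if both the sender and the client network are allowed."""
    sender_ok = sender.lower() in ALLOWED_SENDERS
    network_ok = ipaddress.ip_address(client_ip) in ALLOWED_NETWORK
    return sender_ok and network_ok


print(may_relay("info@example.com", "10.0.0.125"))   # allowed sender, allowed network
print(may_relay("other@example.com", "10.0.0.125"))  # sender not on the whitelist
print(may_relay("info@example.com", "192.168.1.5"))  # client outside the network
```

Requiring both conditions means a compromised host inside the network still cannot send as an arbitrary address.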

Log entries:

postfix/smtpd[123]: connect from ip-10-0-0-125
postfix/smtpd[123]: ABC123: client=ip-10-0-0-125
postfix/cleanup[124]: ABC123: message-id=<...>
postfix/qmgr[125]: ABC123: from=<info@example.com>, size=432

T+150ms: Postfix → SES Relay

Process:

  1. Establish TLS 1.3 connection to SES

  2. Authenticate with SMTP credentials

  3. Transmit email content

  4. Receive confirmation

Log entry (FIRST "sent" status!):

postfix/smtp[126]: Trusted TLS connection established to 
  email-smtp.ap-south-1.amazonaws.com:587: 
  TLSv1.3 with cipher TLS_AES_256_GCM_SHA384

postfix/smtp[126]: ABC123: to=<user@example.com>, 
  relay=email-smtp.ap-south-1.amazonaws.com:587, 
  delay=0.14, 
  status=sent (250 Ok 0109019c...)

What you know at this point:

  • Email left your infrastructure

  • SES accepted the email

  • Total time: 150ms


T+1500ms: SES Processes & Delivers

SES internal process:

  1. Add DKIM signature

  2. Set the envelope sender (MAIL FROM) that receiving servers evaluate for SPF/DMARC

  3. Select optimal sending IP

  4. Connect to the recipient's mail server

  5. Deliver email

  6. Receive final confirmation

This happens entirely within AWS - you don't see these steps


T+1550ms: SES Publishes Event to SNS

Event generated:

{
  "notificationType": "Delivery",
  "mail": {
    "timestamp": "2026-02-25T00:05:15.140Z",
    "messageId": "0109019c...",
    "source": "info@example.com",
    "destination": ["user@example.com"]
  },
  "delivery": {
    "timestamp": "2026-02-25T00:05:18.698Z",
    "recipients": ["user@example.com"],
    "smtpResponse": "250 ok dirdel",
    "processingTimeMillis": 3558,
    "remoteMtaIp": "74.198.68.21",
    "reportingMTA": "a8-123.smtp-out.amazonses.com"
  }
}

Published to SNS topic: ses-events-topic
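The two timestamps in this event explain where `processingTimeMillis` comes from: it is the gap between SES accepting the message (`mail.timestamp`) and the recipient's server accepting it (`delivery.timestamp`). A small sketch of that arithmetic, using hypothetical helper names and only the fields shown above:

```python
from datetime import datetime, timedelta


def parse_ses_timestamp(value):
    """Parse an SES ISO 8601 timestamp such as 2026-02-25T00:05:18.698Z."""
    # Replace the trailing Z so older Python versions can parse it too.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def delivery_latency_ms(event):
    """Milliseconds between SES accepting the mail and final delivery."""
    accepted = parse_ses_timestamp(event["mail"]["timestamp"])
    delivered = parse_ses_timestamp(event["delivery"]["timestamp"])
    return (delivered - accepted) // timedelta(milliseconds=1)


event = {
    "mail": {"timestamp": "2026-02-25T00:05:15.140Z"},
    "delivery": {
        "timestamp": "2026-02-25T00:05:18.698Z",
        "processingTimeMillis": 3558,
    },
}
latency = delivery_latency_ms(event)
print(latency)  # 3558, the same value as processingTimeMillis
```

Computing the value independently is mainly useful as a sanity check, or for events where you want latency relative to a different starting point such as the Postfix queue time.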


T+1600ms: SNS → SQS Forward

SNS wraps the event:

{
  "Type": "Notification",
  "MessageId": "...",
  "TopicArn": "arn:aws:sns:ap-south-1:...:ses-events-topic",
  "Message": "{\"notificationType\":\"Delivery\",...}",
  "Timestamp": "2026-02-25T00:05:18.750Z"
}

Delivered to SQS queue: ses-events-queue
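Because of this wrapping, the logger has to decode JSON twice: once for the SNS envelope and once for the SES event stored as a string in its `Message` field. A minimal sketch, assuming raw message delivery is not enabled on the SNS subscription (`unwrap_sns` is a hypothetical helper name):

```python
import json


def unwrap_sns(sqs_body):
    """Decode the SNS envelope from an SQS body and return the inner SES event."""
    envelope = json.loads(sqs_body)
    if envelope.get("Type") != "Notification":
        raise ValueError(f"unexpected SNS message type: {envelope.get('Type')!r}")
    # "Message" is itself a JSON document serialized as a string.
    return json.loads(envelope["Message"])


# Build a body shaped like the envelope above, with a trimmed SES event inside.
inner = {"notificationType": "Delivery", "mail": {"messageId": "0109019c..."}}
body = json.dumps({
    "Type": "Notification",
    "TopicArn": "arn:aws:sns:ap-south-1:...:ses-events-topic",
    "Message": json.dumps(inner),
})
ses_event = unwrap_sns(body)
print(ses_event["notificationType"], ses_event["mail"]["messageId"])
```

The type check guards against subscription confirmation messages, which use the same envelope but carry no SES event.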


T+5000ms: Logger Polls & Processes

The logger's long-poll request returns as soon as a message is available (it waits up to 20 seconds only when the queue is empty)

Process:

  1. Retrieve message from SQS

  2. Parse SNS wrapper

  3. Extract SES event

  4. Format for syslog

  5. Write to log file

  6. Delete message from queue

Log entry (second status: "delivered"!):

Feb 25 00:05:19 mail postfix/ses-events[456]: 0109019c...: 
  to=<user@example.com>, 
  relay=amazonses.com, 
  dsn=2.0.0, 
  status=delivered, 
  delay=3558ms,
  response=(250 ok dirdel)

What you know now:

  • Recipient's mail server accepted the email

  • Delivery took about 3.6 seconds (3558 ms)

  • Final confirmation came from the recipient's mail server


Final Result: Unified Logs

Complete journey in logs:

# T+150ms - Postfix → SES
Feb 25 00:05:15 mail postfix/smtp[126]: ABC123: 
  to=<user@example.com>, 
  status=sent (250 Ok)

# T+5000ms - SES → Inbox confirmed
Feb 25 00:05:19 mail postfix/ses-events[456]: 0109019c...: 
  to=<user@example.com>, 
  status=delivered, 
  delay=3558ms

Search for any email:

grep "user@example.com" /var/log/postfix/*.log

Output:

postfix.log:  status=sent (handed to SES)
mail.log:     status=delivered (reached inbox)

Complete visibility from send to delivery!
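Grep finds both entries by recipient, but the two lines use different identifiers: the Postfix line is keyed by queue ID, the SES line by SES message ID. What links them is the ID that SES returns in its `250 Ok` response. The sketch below joins the two on that value. `correlate` and both regular expressions are illustrative, and they assume each entry is on a single line in the format shown in this article.

```python
import re

# Postfix "sent" line: capture queue ID, recipient, and the SES ID after "250 Ok".
SENT_RE = re.compile(
    r"postfix/smtp\[\d+\]: (\w+): to=<([^>]+)>.*status=sent \(250 Ok (\S+)\)"
)
# Logger "delivered" line: capture the SES message ID and recipient.
DELIVERED_RE = re.compile(
    r"postfix/ses-events\[\d+\]: (\S+): to=<([^>]+)>.*status=delivered"
)


def correlate(lines):
    """Map (ses_message_id, recipient) to what the logs say about it."""
    journeys = {}
    for line in lines:
        sent = SENT_RE.search(line)
        if sent:
            queue_id, recipient, ses_id = sent.groups()
            journeys.setdefault((ses_id, recipient), {})["queue_id"] = queue_id
            continue
        delivered = DELIVERED_RE.search(line)
        if delivered:
            ses_id, recipient = delivered.groups()
            journeys.setdefault((ses_id, recipient), {})["delivered"] = True
    return journeys


log_lines = [
    "Feb 25 00:05:15 mail postfix/smtp[126]: ABC123: to=<user@example.com>, "
    "relay=email-smtp.ap-south-1.amazonaws.com:587, delay=0.14, "
    "status=sent (250 Ok 0109019c...)",
    "Feb 25 00:05:19 mail postfix/ses-events[456]: 0109019c...: "
    "to=<user@example.com>, status=delivered, delay=3558ms",
]
journeys = correlate(log_lines)
# Anything sent but never confirmed shows up without a "delivered" flag.
pending = [key for key, info in journeys.items() if not info.get("delivered")]
print(journeys, pending)
```

Listing the entries that never received a `delivered` flag is a quick way to spot mail that SES accepted but has not yet confirmed.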


Why Private Subnet?

Security Benefits:

Reduced attack surface: No public IP = can't be scanned

No direct internet access: Blocks many attack vectors

Network-level isolation: Additional security layer

AWS best practice: Recommended architecture

Compliance-friendly: Easier to meet security requirements

How it works:

┌─────────────────────────────────┐
│     Private Subnet             │
│                                 │
│  ┌──────────┐                  │
│  │ Postfix  │ (No public IP)   │
│  └────┬─────┘                  │
│       │ Outbound: 587 + 443    │
└───────┼─────────────────────────┘
        │
        ↓ Via NAT Gateway
  ┌─────────────┐
  │   AWS SES   │
  │   AWS SQS   │
  └─────────────┘

Key point: Every connection is initiated from inside the subnet:

  • Postfix initiates connection to SES (outbound)

  • Logger initiates connection to SQS (outbound)

  • No inbound connections needed!


SQS Polling Benefits:

SES → SNS → SQS ← Python polls (outbound only)

Simple infrastructure:

  1. No public endpoints

  2. No TLS cert management

  3. No inbound firewall rules

  4. IAM authentication (built-in)

  5. Automatic retry (queue buffering)


Security Architecture

Network Security

Multi-layer defense:

┌─────────────────────────────────────┐
│        Private Subnet               │
│  ┌──────────────────────────┐      │
│  │ Security Group           │      │
│  │ - Port 25: 10.0.0.0/21 │      │
│  │ - Port 22: Admin IPs     │      │
│  │ - Outbound: All          │      │
│  │                          │      │
│  │   ┌──────────┐          │      │
│  │   │ Postfix  │          │      │
│  │   │ (Private)│          │      │
│  │   └────┬─────┘          │      │
│  └────────┼─────────────────┘      │
└───────────┼─────────────────────────┘
            │ Outbound only
            ↓ (TLS 1.3)
      ┌─────────────┐
      │   AWS SES   │
      │  (Public)   │
      └─────────────┘

Security controls:

  1. Network isolation: Private subnet, no public IP

  2. Sender validation: Whitelist checks before relay

  3. Encryption: TLS 1.3 to SES

  4. Authentication: SMTP credentials for SES



About This Series

This is Part 1 of a 3-part series on building production email infrastructure:

  • Part 1: Architecture & Design ← You just read this

  • Part 2: Implementation Guide - Step-by-step setup

  • Part 3: Operations & Troubleshooting - Day-to-day management


Next in series: Part 2: Implementation Guide

🔗 If this helped or resonated with you, connect with me on LinkedIn. Let’s learn and grow together.

👉 Stay tuned for more behind-the-scenes write-ups and system design breakdowns.
