Learn how to build a scalable, cost-effective email infrastructure using Postfix and AWS SES with complete bounce tracking. Part 1 covers architecture decisions, security, and design patterns.
Part 1: Architecture & Design Decisions
Series Navigation
Part 1: Architecture & Design ← You are here
Part 2: Implementation Guide
Part 3: Operations & Troubleshooting
TL;DR
What we're building: A production email system combining Postfix SMTP relay with AWS SES, complete with real-time bounce tracking, all visible in unified logs.
Why it matters: You can track every email from send → delivery → bounce in one log file, the setup works in private subnets, and it costs ~$30/month.
Who it's for: DevOps engineers, backend developers, and SREs managing email infrastructure.
Introduction
Have you ever sent an email and wondered: "Did it actually reach the inbox? Or did it bounce? When?"
Traditional SMTP relays tell you when they sent the email, but not when it was delivered. This gap creates a blind spot in your infrastructure.
What We're Building
An email infrastructure that provides:
Complete visibility: Both "sent" and "delivered" statuses
Real-time bounce tracking: Know immediately when emails fail
Unified logging: Everything in one log file (grep-friendly!)
Private subnet compatible: No public endpoints needed
Cost-effective: ~$30/month for 50,000 emails
Enterprise deliverability: AWS SES's 99.9% delivery rate
Who This Series Is For
DevOps Engineers looking to build a reliable email infrastructure
Backend Developers integrating email into applications
SREs who need observability into email delivery
Series Overview
Part 1 (this post): Understand the architecture and design decisions
Part 2: Step-by-step implementation guide
Part 3: Operations, monitoring, and troubleshooting
The Problem: Email Observability
Traditional SMTP Relay Limitations
When you send an email through a standard SMTP relay:
postfix: status=sent (250 Ok)
But "sent" doesn't mean "delivered"! It just means your mail server handed the email to the next server.
What You Don't Know
- Did it reach the recipient's inbox?
- Did it bounce?
- Was it marked as spam?
- How long did the delivery take?
- Which ISP was slow?
This creates a visibility gap in your infrastructure.
Solution: The Best of Both Worlds
Your App → Postfix → SES → Recipient
↓ ↓ ↓
Logs Logs SNS→SQS→Logger
↓
Unified Logs ✅
This approach delivers:
Both statuses: "sent" (Postfix) AND "delivered" (SES)
Private subnet: No public endpoints required
Cost-effective: ~$30/month
Standard SMTP: No code changes needed
Unix-friendly: Logs searchable with grep/awk
Full control: Own your infrastructure
Architecture Overview
The Big Picture
╔══════════════════════════════════╗
║        Application Layer         ║
║ "Send email to user@example.com" ║
╚════════════╤═════════════════════╝
             │ SMTP • Port 25
             ▼
╔══════════════════════════════════╗
║     Postfix (Private Subnet)     ║
║ ├─ ✓ Sender authorized           ║
║ ├─ ➡ Forwarding to SES...        ║
║ └─ 📋 LOG: status=sent           ║
╚════════════╤═════════════════════╝
             │ SMTP + TLS • Port 587
             ▼
╔══════════════════════════════════╗
║     AWS Simple Email Service     ║
║   "Delivering to recipient..."   ║
╚════════════╤═════════════════════╝
             │
     ┌───────┴───────┐
     ▼               ▼
╔════════════╗ ╔════════════════════╗
║            ║ ║ Event Flow         ║
║ Recipient  ║ ║ SNS → SQS → Logger ║
║ Mail Server║ ╚══════════╤═════════╝
╚════════════╝            │
                          ▼
               ╔════════════════════╗
               ║ Your Logs ✓        ║
               ║ ├─ sent            ║
               ║ ├─ delivered       ║
               ║ └─ bounced         ║
               ╚════════════════════╝
Data Flow Summary
Application sends email to Postfix via SMTP
Postfix validates sender, relays to SES
SES delivers email to recipient
SES publishes event (delivery/bounce) to SNS
SNS forwards event to SQS queue
Python logger polls SQS, writes to syslog
Result: Both "sent" and "delivered" in your logs!
Core Components Explained
1. Postfix: The Smart Relay
Role: SMTP relay with sender validation and forwarding logic
What it does:
Receives emails from your application
Validates sender addresses against the whitelist
Forwards to AWS SES via authenticated SMTP
Logs "sent" status immediately
Why Postfix?
Industry standard: Powers millions of servers
Highly configurable: Fine-grained control over routing
Excellent logging: Detailed, parseable log format
Battle-tested: Decades of production use
Performance: Handles thousands of concurrent connections
Key Configuration:
# Sender validation (whitelist)
smtpd_sender_restrictions =
    check_sender_access hash:/etc/postfix/allowed_senders,
    reject
# Only approved senders can use relay
# Example whitelist:
# info@example.com OK
# noreply@example.com OK
# @example.com REJECT
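To make the allowlist decision concrete, here is a small Python model of it. This is only an illustrative sketch of the rule the configuration encodes (a listed sender is accepted, everything else falls through to `reject`), not Postfix's actual lookup code. The `ALLOWED_SENDERS` table and the `is_sender_allowed` helper are hypothetical names.

```python
# Illustrative model of the sender allowlist; not Postfix's real implementation.
ALLOWED_SENDERS = {
    "info@example.com": "OK",
    "noreply@example.com": "OK",
}


def is_sender_allowed(sender: str, table: dict = ALLOWED_SENDERS) -> bool:
    """Return True only when the exact address is listed with OK.

    Any sender that is not listed falls through to the trailing `reject`.
    """
    return table.get(sender.strip().lower()) == "OK"


print(is_sender_allowed("info@example.com"))      # True
print(is_sender_allowed("attacker@example.com"))  # False
```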
Log Output Example:
Feb 25 00:05:15 mail postfix/smtp[123]: ABC123:
to=<user@example.com>,
relay=email-smtp.ap-south-1.amazonaws.com:587,
delay=0.14,
dsn=2.0.0,
status=sent (250 Ok 0109019c...)
What this tells you:
Postfix accepted the email
Email forwarded to SES
SES accepted the email (250 Ok)
Took 0.14 seconds
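Because the log format is regular, these fields are easy to pull out programmatically. The sketch below assumes each entry sits on a single line (it is wrapped above only for readability) and that queue IDs are short uppercase hex-style IDs like `ABC123`. The `LOG_PATTERN` and `parse_postfix_line` names are made up for this example.

```python
import re
from typing import Optional

# Matches only the fields shown in the sample entry; other fields are skipped.
LOG_PATTERN = re.compile(
    r"(?P<queue_id>[0-9A-F]+): "
    r"to=<(?P<recipient>[^>]+)>,\s*"
    r"relay=(?P<relay>[^,]+),\s*"
    r"delay=(?P<delay>[\d.]+),\s*"
    r".*?status=(?P<status>\w+)"
)


def parse_postfix_line(line: str) -> Optional[dict]:
    """Extract queue ID, recipient, relay, delay (seconds), and status."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["delay"] = float(fields["delay"])
    return fields


sample = (
    "Feb 25 00:05:15 mail postfix/smtp[123]: ABC123: to=<user@example.com>, "
    "relay=email-smtp.ap-south-1.amazonaws.com:587, delay=0.14, "
    "dsn=2.0.0, status=sent (250 Ok 0109019c...)"
)
print(parse_postfix_line(sample))
```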
2. AWS SES: The Delivery Engine
Role: Actual email delivery to recipients
What it does:
Delivers emails to recipient mail servers
Handles DKIM signing for authentication
Manages IP reputation
Publishes delivery/bounce events
Why SES?
99.9% deliverability: Enterprise-grade infrastructure
Global reach: AWS's worldwide network
Pay-as-you-go: $0.10 per 1,000 emails
Built-in auth: Automatic SPF, DKIM, DMARC
Event publishing: Real-time delivery notifications
Scalable: From 100 to millions of emails
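As a quick sanity check on the pay-as-you-go number, here is the arithmetic for the 50,000 emails/month used in this post. The helper name is hypothetical, and the rate is simply the $0.10 per 1,000 quoted above. The rest of the ~$30/month estimate presumably comes from the other infrastructure, which this part does not break down.

```python
SES_PRICE_PER_1000 = 0.10  # USD per 1,000 emails, the rate quoted in this post


def ses_sending_cost(emails: int) -> float:
    """Return the SES sending charge in USD for a number of emails."""
    return emails / 1000 * SES_PRICE_PER_1000


print(f"${ses_sending_cost(50_000):.2f}")  # $5.00
```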
Event Types:
Delivery - The recipient's mail server accepted the email (inbox placement is not reported)
Bounce - Email rejected (permanent or temporary)
Complaint - Recipient marked as spam
Why not use SES directly?
While SES has an API, using Postfix as a relay provides:
Standard SMTP interface (no code changes)
Sender validation
Easy provider switching
Centralized configuration
Better logging
3. The Event Pipeline: SNS → SQS → Logger
This is the secret sauce that brings delivery confirmations into your logs.
SNS (Simple Notification Service)
Role: Event broadcaster from SES
Flow:
SES delivers email
↓
SES publishes event to SNS
↓
SNS broadcasts to subscribers
Why SNS?
Real-time notifications
Fan-out to multiple destinations
Native SES integration
Filter by event type
SQS (Simple Queue Service)
Role: Message buffer between SNS and logger
Sample Event:
{
"notificationType": "Delivery",
"mail": {
"messageId": "0109019c...",
"destination": ["user@example.com"]
},
"delivery": {
"timestamp": "2026-02-25T00:05:18.000Z",
"smtpResponse": "250 ok dirdel",
"processingTimeMillis": 3558
}
}
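Here is a hedged sketch of how a consumer might reduce an event like this to the few fields the logs need. It assumes the SES notification layout shown above, plus the `bouncedRecipients` and `complainedRecipients` lists that SES includes in bounce and complaint notifications. The `summarize_ses_event` name is made up for this example.

```python
def summarize_ses_event(event: dict) -> dict:
    """Reduce an SES notification to message ID, status, and recipients."""
    kind = event["notificationType"]
    if kind == "Delivery":
        status = "delivered"
        # Fall back to mail.destination when delivery.recipients is absent.
        recipients = event["delivery"].get(
            "recipients", event["mail"]["destination"]
        )
    elif kind == "Bounce":
        status = "bounced"
        recipients = [
            r["emailAddress"] for r in event["bounce"]["bouncedRecipients"]
        ]
    elif kind == "Complaint":
        status = "complaint"
        recipients = [
            r["emailAddress"] for r in event["complaint"]["complainedRecipients"]
        ]
    else:
        raise ValueError(f"unexpected notificationType: {kind!r}")
    return {
        "message_id": event["mail"]["messageId"],
        "status": status,
        "recipients": recipients,
    }


sample_event = {
    "notificationType": "Delivery",
    "mail": {"messageId": "0109019c...", "destination": ["user@example.com"]},
    "delivery": {"smtpResponse": "250 ok dirdel", "processingTimeMillis": 3558},
}
print(summarize_ses_event(sample_event))
```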
Why SQS instead of HTTP webhooks?
This is a critical design decision:
HTTP Webhook Approach:
SES → SNS → HTTP POST to your server
↓
Need public endpoint
Need ALB/Load balancer
Security concerns
Webhook authentication
SQS Polling Approach:
SES → SNS → SQS Queue
↓
Python script polls (outbound only)
↓
No public endpoint needed!
Works in private subnet
Messages buffered if logger down
IAM-based authentication
Benefits of SQS:
Private subnet compatible: Polling is outbound-only
Resilient: Messages buffered for up to 14 days (the SQS maximum retention; the default is 4 days)
No ALB needed: Saves $18/month
More reliable: No missed webhooks
IAM auth: No webhook secrets to manage
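With IAM-based authentication, the logger's access comes down to a policy on the instance role. Below is a minimal sketch of what such a policy could look like, built as a Python dict so it can be printed as JSON. The account ID `123456789012` is a placeholder, the queue name is the one used later in this post, and the `logger_queue_policy` helper is hypothetical. Your setup may need more permissions (for example, if the queue is encrypted).

```python
import json


def logger_queue_policy(queue_arn: str) -> dict:
    """Build a least-privilege policy: read and delete on one queue only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["sqs:ReceiveMessage", "sqs:DeleteMessage"],
                "Resource": queue_arn,
            }
        ],
    }


# Placeholder ARN: substitute your own account ID and queue name.
policy = logger_queue_policy(
    "arn:aws:sqs:ap-south-1:123456789012:ses-events-queue"
)
print(json.dumps(policy, indent=2))
```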
4. The Logger: Python + Syslog
Role: Poll SQS and write events to Postfix logs
What it does:
Long-polls SQS, waiting up to 20 seconds per request for new messages
Parses SES delivery/bounce events
Writes to syslog (same facility as Postfix)
Deletes processed messages from the queue
Code Snippet:
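import json
import syslog

import boto3

# Replace with your queue URL
queue_url = "https://sqs.ap-south-1.amazonaws.com/<account-id>/ses-events-queue"
STATUS = {"Delivery": "delivered", "Bounce": "bounced", "Complaint": "complaint"}

# Initialize SQS client (uses IAM role automatically)
sqs = boto3.client('sqs', region_name='ap-south-1')
# Open syslog once, using the same facility as Postfix
syslog.openlog('postfix/ses-events', facility=syslog.LOG_MAIL)

# Poll queue (long polling reduces API calls)
response = sqs.receive_message(
    QueueUrl=queue_url,
    MaxNumberOfMessages=10,
    WaitTimeSeconds=20,  # Wait up to 20s for messages
)

# Process each event
for message in response.get('Messages', []):
    # The SQS body is the SNS envelope; its "Message" field holds the SES event
    event = json.loads(json.loads(message['Body'])['Message'])
    msg_id = event['mail']['messageId']
    status = STATUS.get(event['notificationType'], 'unknown')
    # Write to syslog (appears in Postfix logs!)
    # Simplification: logs every address in mail.destination
    for recipient in event['mail']['destination']:
        syslog.syslog(syslog.LOG_INFO,
                      f"{msg_id}: to=<{recipient}>, status={status}")
    # Remove from queue
    sqs.delete_message(QueueUrl=queue_url,
                       ReceiptHandle=message['ReceiptHandle'])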
Why Syslog?
Same log file: Appears alongside Postfix logs
Auto-rotation: System handles log management
Searchable: Standard Unix tools (grep, awk)
Integration: Works with existing log aggregators
Familiar format: Same as Postfix log entries
Log Output:
Feb 25 00:05:19 mail postfix/ses-events[456]: 0109019c...:
to=<user@example.com>,
relay=amazonses.com,
dsn=2.0.0,
status=delivered,
delay=3558ms,
response=(250 ok dirdel)
What this tells you:
Email successfully delivered
Took about 3.6 seconds from SES accepting the message to the recipient's mail server accepting it
Final SMTP response from recipient server
The Complete Email Journey
Let's trace a single email through the entire system. The T+ offsets are approximate and meant to show ordering, not exact latency.
T+0ms: Application Sends Email
import smtplib
from email.mime.text import MIMEText
msg = MIMEText("Hello World")
msg['From'] = "info@example.com"
msg['To'] = "user@example.com"
s = smtplib.SMTP("10.0.0.23", 25)
s.sendmail("info@example.com", "user@example.com", msg.as_string())
s.quit()
What happens: SMTP connection to Postfix relay
T+10ms: Postfix Validates & Queues
Checks performed:
Is sender in whitelist? (info@example.com → OK)
Is sender from allowed network? (10.0.0.0/16 → OK)
Queue email for delivery
Log entries:
postfix/smtpd[123]: connect from ip-10-0-0-125
postfix/smtpd[123]: ABC123: client=ip-10-0-0-125
postfix/cleanup[124]: ABC123: message-id=<...>
postfix/qmgr[125]: ABC123: from=<info@example.com>, size=432
T+150ms: Postfix → SES Relay
Process:
Establish TLS 1.3 connection to SES
Authenticate with SMTP credentials
Transmit email content
Receive confirmation
Log entry (FIRST "sent" status!):
postfix/smtp[126]: Trusted TLS connection established to
email-smtp.ap-south-1.amazonaws.com:587:
TLSv1.3 with cipher TLS_AES_256_GCM_SHA384
postfix/smtp[126]: ABC123: to=<user@example.com>,
relay=email-smtp.ap-south-1.amazonaws.com:587,
delay=0.14,
status=sent (250 Ok 0109019c...)
What you know at this point:
Email left your infrastructure
SES accepted the email
Total time: 150ms
T+1500ms: SES Processes & Delivers
SES internal process:
Add DKIM signature
Apply the MAIL FROM domain used for SPF (the recipient's server runs the actual SPF/DKIM/DMARC checks)
Select optimal sending IP
Connect to the recipient's mail server
Deliver email
Receive final confirmation
This happens entirely within AWS - you don't see these steps
T+1550ms: SES Publishes Event to SNS
Event generated:
{
"notificationType": "Delivery",
"mail": {
"timestamp": "2026-02-25T00:05:15.140Z",
"messageId": "0109019c...",
"source": "info@example.com",
"destination": ["user@example.com"]
},
"delivery": {
"timestamp": "2026-02-25T00:05:18.698Z",
"recipients": ["user@example.com"],
"smtpResponse": "250 ok dirdel",
"processingTimeMillis": 3558,
"remoteMtaIp": "74.198.68.21",
"reportingMTA": "a8-123.smtp-out.amazonses.com"
}
}
Published to SNS topic: ses-events-topic
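The two timestamps in this event are enough to work out delivery latency yourself. Here is a small worked example using only the values shown above. The helper names are made up, and the trailing `Z` is rewritten to `+00:00` so the parsing also works on Python versions older than 3.11.

```python
from datetime import datetime


def parse_ses_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing 'Z' as UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def delivery_latency_ms(event: dict) -> int:
    """Milliseconds between SES accepting the mail and delivering it."""
    accepted = parse_ses_timestamp(event["mail"]["timestamp"])
    delivered = parse_ses_timestamp(event["delivery"]["timestamp"])
    return round((delivered - accepted).total_seconds() * 1000)


event = {
    "mail": {"timestamp": "2026-02-25T00:05:15.140Z"},
    "delivery": {"timestamp": "2026-02-25T00:05:18.698Z"},
}
print(delivery_latency_ms(event))  # 3558
```

The result matches the `processingTimeMillis` value that SES reports in the same event.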
T+1600ms: SNS → SQS Forward
SNS wraps the event:
{
"Type": "Notification",
"MessageId": "...",
"TopicArn": "arn:aws:sns:ap-south-1:...:ses-events-topic",
"Message": "{\"notificationType\":\"Delivery\",...}",
"Timestamp": "2026-02-25T00:05:18.750Z"
}
Delivered to SQS queue: ses-events-queue
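Note the double encoding: the SES event is a JSON string inside the SNS envelope's `Message` field, so the consumer has to decode twice. A minimal sketch follows, assuming raw message delivery is not enabled on the subscription (which is what the envelope above implies). The `unwrap_sns` name is hypothetical.

```python
import json


def unwrap_sns(body: str) -> dict:
    """Return the SES event carried inside an SQS message body."""
    envelope = json.loads(body)
    if envelope.get("Type") == "Notification":
        # Second decode: "Message" is itself a JSON-encoded string.
        return json.loads(envelope["Message"])
    # No envelope present (for example, raw message delivery was enabled).
    return envelope


body = json.dumps({
    "Type": "Notification",
    "Message": json.dumps({"notificationType": "Delivery"}),
})
print(unwrap_sns(body)["notificationType"])  # Delivery
```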
T+5000ms: Logger Polls & Processes
The logger's long-poll request (it waits up to 20 seconds per request) picks up the message
Process:
Retrieve message from SQS
Parse SNS wrapper
Extract SES event
Format for syslog
Write to log file
Delete message from queue
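The ordering of these steps matters: deleting only after a successful write means a crash in the middle leaves the message on the queue to be retried. The sketch below shows one way to structure that. `drain_once`, `FakeSQS`, and `demo_handle` are hypothetical names, and `FakeSQS` is just an in-memory stand-in so the example runs without AWS. In the real logger you would pass a boto3 SQS client instead.

```python
from typing import Callable


def drain_once(sqs, queue_url: str, handle: Callable[[str], None]) -> int:
    """Receive one batch, handle each body, delete only what was handled."""
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=20,
    )
    handled = 0
    for message in response.get("Messages", []):
        try:
            handle(message["Body"])
        except Exception:
            # Leave it on the queue; SQS makes it visible again after the
            # visibility timeout, so it gets retried later.
            continue
        sqs.delete_message(
            QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"]
        )
        handled += 1
    return handled


class FakeSQS:
    """In-memory stand-in for the boto3 client, for demonstration only."""

    def __init__(self, bodies):
        self.messages = [
            {"Body": b, "ReceiptHandle": str(i)} for i, b in enumerate(bodies)
        ]
        self.deleted = []

    def receive_message(self, **kwargs):
        return {"Messages": list(self.messages)}

    def delete_message(self, QueueUrl, ReceiptHandle):
        self.deleted.append(ReceiptHandle)


def demo_handle(body: str) -> None:
    if body == "bad":
        raise ValueError("cannot parse")


fake = FakeSQS(["ok-1", "bad", "ok-2"])
print(drain_once(fake, "https://example.invalid/queue", demo_handle))  # 2
print(fake.deleted)  # ['0', '2']
```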
Log entry (SECOND "delivered" status!):
Feb 25 00:05:19 mail postfix/ses-events[456]: 0109019c...:
to=<user@example.com>,
relay=amazonses.com,
dsn=2.0.0,
status=delivered,
delay=3558ms,
response=(250 ok dirdel)
What you know now:
Email accepted by the recipient's mail server
Delivery took about 3.6 seconds
Final SMTP response from the recipient's mail server (250 ok)
Final Result: Unified Logs
Complete journey in logs:
# T+150ms - Postfix → SES
Feb 25 00:05:15 mail postfix/smtp[126]: ABC123:
to=<user@example.com>,
status=sent (250 Ok)
# T+5000ms - SES → Inbox confirmed
Feb 25 00:05:19 mail postfix/ses-events[456]: 0109019c...:
to=<user@example.com>,
status=delivered,
delay=3558ms
Search for any email:
grep "user@example.com" /var/log/postfix/*.log
Output:
postfix.log: status=sent (handed to SES)
mail.log: status=delivered (reached inbox)
Complete visibility from send to delivery!
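Beyond grepping by recipient, the two entries can be joined on the SES message ID: it appears in the `250 Ok ...` response on the "sent" line and as the prefix of the "delivered" line. Here is a rough sketch of that correlation, assuming each log entry is on a single line (the entries above are wrapped for readability). The regexes and the `correlate` helper are made up for this example.

```python
import re

# "sent" line: the SES message ID follows "250 Ok" in the response.
SENT = re.compile(r"status=sent \(250 Ok (?P<ses_id>[^)\s]+)\)")
# Event line written by the logger: the SES message ID leads the entry.
EVENT = re.compile(
    r"postfix/ses-events\[\d+\]: (?P<ses_id>[^:\s]+): .*status=(?P<status>\w+)"
)


def correlate(lines):
    """Map each SES message ID to the statuses seen for it, in log order."""
    journey = {}
    for line in lines:
        match = EVENT.search(line)
        if match:
            journey.setdefault(match["ses_id"], []).append(match["status"])
            continue
        match = SENT.search(line)
        if match:
            journey.setdefault(match["ses_id"], []).append("sent")
    return journey


log_lines = [
    "Feb 25 00:05:15 mail postfix/smtp[126]: ABC123: to=<user@example.com>, "
    "relay=email-smtp.ap-south-1.amazonaws.com:587, delay=0.14, "
    "status=sent (250 Ok 0109019c...)",
    "Feb 25 00:05:19 mail postfix/ses-events[456]: 0109019c...: "
    "to=<user@example.com>, status=delivered, delay=3558ms",
]
print(correlate(log_lines))
```

An ID whose list stops at "sent" is an email SES accepted but has not yet confirmed, which is exactly the blind spot this architecture is meant to close.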
Why Private Subnet?
Security Benefits:
Reduced attack surface: No public IP = can't be scanned
No direct internet access: Blocks many attack vectors
Network-level isolation: Additional security layer
AWS best practice: Recommended architecture
Compliance-friendly: Easier to meet security requirements
How it works:
┌─────────────────────────────────┐
│         Private Subnet          │
│                                 │
│  ┌──────────┐                   │
│  │ Postfix  │ (No public IP)    │
│  └────┬─────┘                   │
│       │ Outbound SMTP/HTTPS only│
└───────┼─────────────────────────┘
        │
        ↓ Via NAT Gateway
 ┌─────────────┐
 │   AWS SES   │
 │   AWS SQS   │
 └─────────────┘
Key point: Both connections are initiated from inside the subnet:
Postfix initiates connection to SES (outbound)
Logger initiates connection to SQS (outbound)
No inbound connections needed!
SQS Polling Benefits:
SES → SNS → SQS ← Python polls (outbound only)
Simple infrastructure:
No public endpoints
No TLS cert management
No inbound firewall rules
IAM authentication (built-in)
Automatic retry (queue buffering)
Security Architecture
Network Security
Multi-layer defense:
┌─────────────────────────────────────┐
│           Private Subnet            │
│  ┌──────────────────────────┐       │
│  │ Security Group           │       │
│  │ - Port 25: 10.0.0.0/21   │       │
│  │ - Port 22: Admin IPs     │       │
│  │ - Outbound: All          │       │
│  │                          │       │
│  │   ┌──────────┐           │       │
│  │   │ Postfix  │           │       │
│  │   │ (Private)│           │       │
│  │   └────┬─────┘           │       │
│  └────────┼─────────────────┘       │
└───────────┼─────────────────────────┘
            │ Outbound only
            ↓ (TLS 1.3)
     ┌─────────────┐
     │   AWS SES   │
     │  (Public)   │
     └─────────────┘
Security controls:
Network isolation: Private subnet, no public IP
Sender validation: Whitelist checks before relay
Encryption: TLS 1.3 to SES
Authentication: SMTP credentials for SES
About This Series
This is Part 1 of a 3-part series on building production email infrastructure:
Part 1: Architecture & Design ← You just read this
Part 2: Implementation Guide - Step-by-step setup
Part 3: Operations & Troubleshooting - Day-to-day management
Next in series: Part 2: Implementation Guide