Managing email infrastructure like a pro.
Daily Operations (5 Minutes)
The Morning Health Check
Create this script and run it daily:
cat > ~/email-health.sh <<'EOF'
#!/bin/bash
YESTERDAY=$(date -d "yesterday" +"%b %d")
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "Email Health ($YESTERDAY)"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
# Service status
systemctl is-active --quiet postfix && echo "Postfix" || echo "Postfix DOWN"
systemctl is-active --quiet ses-logger && echo "Logger" || echo "Logger DOWN"
# Email stats
SENT=$(grep "$YESTERDAY" /var/log/postfix/postfix.log 2>/dev/null | grep -c "status=sent")
DELIVERED=$(grep "$YESTERDAY" /var/log/postfix/mail.log 2>/dev/null | grep -c "status=delivered")
BOUNCED=$(grep "$YESTERDAY" /var/log/postfix/mail.log 2>/dev/null | grep -c "status=bounced")
echo ""
echo "📊 Volume"
echo " Sent: $SENT"
echo " Delivered: $DELIVERED"
echo " Bounced: $BOUNCED"
if [ "$SENT" -gt 0 ]; then
  DELIVERY_RATE=$((DELIVERED * 100 / SENT))
  BOUNCE_RATE=$((BOUNCED * 100 / SENT))
  echo ""
  echo "📈 Rates"
  echo " Delivery: ${DELIVERY_RATE}%"
  echo " Bounce: ${BOUNCE_RATE}%"
  [ "$BOUNCE_RATE" -gt 5 ] && echo " ⚠️ High bounce rate!"
fi
# Queue status
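QUEUE_SUMMARY=$(mailq | tail -1)
echo ""
if echo "$QUEUE_SUMMARY" | grep -q "is empty"; then
  echo "Queue empty"
else
  # Last line looks like "-- 12 Kbytes in 3 Requests."; field 5 is the message count
  echo "⚠️ Queue: $(echo "$QUEUE_SUMMARY" | awk '{print $5}') messages"
fi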
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
EOF
chmod +x ~/email-health.sh
Run it:
./email-health.sh
Automate it (runs at 9 AM, emails you results):
(crontab -l 2>/dev/null; echo "0 9 * * * ~/email-health.sh | mail -s 'Email Health Report' admin@yourdomain.com") | crontab -
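The script's `$(( ))` arithmetic is integer-only, so 7 bounces out of 1,000 sent shows up as a 0% bounce rate. If you want one decimal place, here is a minimal sketch that hands the division to `awk`. The helper name `pct` is hypothetical and not part of the original script.

```shell
# pct NUMERATOR DENOMINATOR
# Prints the percentage with one decimal, or "n/a" when the denominator is 0.
pct() {
  awk -v n="$1" -v d="$2" 'BEGIN {
    if (d + 0 == 0) { print "n/a"; exit }
    printf "%.1f\n", n * 100 / d
  }'
}

pct 994 1000   # prints 99.4
pct 7 1000     # prints 0.7
pct 0 0        # prints n/a
```

Inside the health script you could then write `echo " Bounce: $(pct "$BOUNCED" "$SENT")%"`.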
Essential Monitoring
Real-Time Log Watching
Monitor live email flow:
# Watch everything
sudo tail -f /var/log/postfix/*.log
# Watch only delivered emails
sudo tail -f /var/log/postfix/mail.log | grep --line-buffered "status=delivered"
# Watch bounces
sudo tail -f /var/log/postfix/mail.log | grep --line-buffered "status=bounced"
Key Metrics to Track
| Metric | Target | Alert If |
|---|---|---|
| Delivery Rate | >95% | <90% |
| Bounce Rate | <5% | >10% |
| Queue Size | 0 | >100 |
| Service Uptime | 99.9% | Any downtime |
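To make the table actionable, the thresholds can be turned into a small classifier. This is only a sketch: the function name `check_metric` and the intermediate `WARN` state (between target and alert) are assumptions of mine, not something the setup ships with. Rates are whole percentages, and queue size is a message count.

```shell
# check_metric NAME VALUE
# Prints OK, WARN or ALERT using the targets and alert levels from the table.
check_metric() {
  case "$1" in
    delivery)   # target >95%, alert <90%
      if   [ "$2" -lt 90 ]; then echo ALERT
      elif [ "$2" -gt 95 ]; then echo OK
      else echo WARN; fi ;;
    bounce)     # target <5%, alert >10%
      if   [ "$2" -gt 10 ]; then echo ALERT
      elif [ "$2" -lt 5 ];  then echo OK
      else echo WARN; fi ;;
    queue)      # target 0, alert >100
      if   [ "$2" -gt 100 ]; then echo ALERT
      elif [ "$2" -eq 0 ];   then echo OK
      else echo WARN; fi ;;
    *) echo "unknown metric: $1" >&2; return 2 ;;
  esac
}

check_metric delivery 97   # prints OK
check_metric bounce 12     # prints ALERT
check_metric queue 40      # prints WARN
```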
Quick metric checks:
# Today's delivery rate
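SENT=$(grep "$(date +%b\ %d)" /var/log/postfix/postfix.log | grep -c "status=sent")
DELIVERED=$(grep "$(date +%b\ %d)" /var/log/postfix/mail.log | grep -c "status=delivered")
# Guard against division by zero before any mail has been sent today
[ "$SENT" -gt 0 ] && echo "Delivery rate: $((DELIVERED * 100 / SENT))%" || echo "No mail sent yet today"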
# Average delivery time (milliseconds)
grep "status=delivered" /var/log/postfix/mail.log | \
grep -oP 'delay=\K\d+' | \
awk '{sum+=$1; n++} END {if (n) print "Avg delay: " sum/n "ms"}'
# Top recipient domains
grep "status=delivered" /var/log/postfix/mail.log | \
grep -oP 'to=<[^@]+@\K[^>]+' | \
sort | uniq -c | sort -rn | head -5
Common Operations
Adding New Senders
# 1. Edit whitelist
sudo vim /etc/postfix/allowed_senders
# Add line:
# newsender@yourdomain.com OK
# 2. Rebuild database
sudo postmap /etc/postfix/allowed_senders
# 3. Reload (no restart needed!)
sudo systemctl reload postfix
# 4. Test
echo "Test" | mail -s "Test" -r newsender@yourdomain.com test@example.com
No downtime! Reload picks up changes instantly.
Removing Senders
# 1. Comment out or remove from whitelist
sudo vim /etc/postfix/allowed_senders
# #oldsender@yourdomain.com OK
# 2. Rebuild and reload
sudo postmap /etc/postfix/allowed_senders
sudo systemctl reload postfix
# 3. Verify rejection
echo "Test" | mail -s "Test" -r oldsender@yourdomain.com test@example.com
# Should see: "Sender address rejected"
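Before sending a test mail, it can help to confirm what the whitelist file actually says. The sketch below reads the plain-text file only, so it does not prove that `postmap` has been re-run. It assumes the `address OK` line format shown above, and `sender_listed` is a hypothetical helper name.

```shell
# sender_listed FILE ADDRESS
# Succeeds if ADDRESS has an active (uncommented) "OK" entry in FILE.
sender_listed() {
  awk -v addr="$2" '
    /^[[:space:]]*#/ { next }                                  # skip commented-out entries
    tolower($1) == tolower(addr) && $2 == "OK" { found = 1 }   # case-insensitive address match
    END { exit !found }
  ' "$1"
}

# Usage:
#   sender_listed /etc/postfix/allowed_senders oldsender@yourdomain.com \
#     && echo "still listed" || echo "not listed"
```

To check the compiled map that Postfix actually consults, query it with `sudo postmap -q oldsender@yourdomain.com /etc/postfix/allowed_senders`; no output means no match.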
Managing Mail Queue
View queue:
mailq
Flush queue (retry all deferred emails):
sudo postqueue -f
Delete specific email:
# Get queue ID from mailq
sudo postsuper -d QUEUE_ID
Delete all queued emails:
sudo postsuper -d ALL
Delete only deferred emails:
sudo postsuper -d ALL deferred
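Deleting everything is a blunt tool. A narrower option is to remove only the mail queued from one sender. The sketch below adapts the paragraph-mode `awk` idiom from the `postsuper` man page; it assumes the standard `mailq` layout (queue ID first, envelope sender in the seventh field), and `queue_ids_for_sender` is a hypothetical name.

```shell
# queue_ids_for_sender SENDER
# Reads `mailq` output on stdin and prints the queue IDs whose envelope sender is SENDER.
queue_ids_for_sender() {
  tail -n +2 |              # drop the column header line
    grep -v '^ *(' |        # drop the "(reason)" lines shown for deferred mail
    awk -v sender="$1" '
      BEGIN { RS = "" }          # paragraph mode: one record per queue entry
      $7 == sender { print $1 }  # field 7 is the envelope sender
    ' |
    tr -d '*!'              # strip the active (*) and hold (!) markers
}

# Preview the IDs first, then feed them to postsuper on stdin:
#   mailq | queue_ids_for_sender oldsender@yourdomain.com
#   mailq | queue_ids_for_sender oldsender@yourdomain.com | sudo postsuper -d -
```

Running the preview step first is a deliberate choice here, since `postsuper -d` cannot be undone.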
Searching Email History
Find specific email:
grep "user@example.com" /var/log/postfix/*.log
Find by sender:
grep "from=<sender@yourdomain.com>" /var/log/postfix/postfix.log
Find bounces to specific domain:
grep "gmail.com" /var/log/postfix/mail.log | grep "bounced"
Get complete email journey:
# Get message ID from sent log
MSG_ID=$(grep "user@example.com" /var/log/postfix/postfix.log | grep -oP 'status=sent \(250 Ok \K[^)]+' | head -1)
# Find all events for that message
grep "$MSG_ID" /var/log/postfix/*.log
Troubleshooting Guide
Problem 1: Postfix Won't Start
Symptoms:
sudo systemctl start postfix
# Job for postfix.service failed
Fix:
# 1. Check config syntax
sudo postfix check
# 2. View detailed error
sudo journalctl -u postfix -n 20 --no-pager
# 3. Common issues:
# Port in use?
sudo lsof -i :25
# Kill conflicting process: sudo systemctl stop sendmail
# Permission issue?
sudo chown -R postfix:postfix /var/log/postfix
# Do not chown the spool recursively: some of its directories must stay owned by root or the postdrop group
sudo postfix set-permissions
# Check line number from 'postfix check' output
sudo vim /etc/postfix/main.cf +LINE_NUMBER
Problem 2: Emails Stuck in Queue
Diagnosis:
mailq # Shows queued emails
sudo tail -100 /var/log/postfix/postfix.log | grep "status=deferred"
Common causes and fixes:
Wrong SES credentials:
# Verify credentials
sudo postmap -q "[email-smtp.ap-south-1.amazonaws.com]:587" /etc/postfix/sasl_passwd
# Update if needed
sudo vim /etc/postfix/sasl_passwd
sudo postmap /etc/postfix/sasl_passwd
sudo systemctl restart postfix
Network blocked:
# Test SES connectivity
telnet email-smtp.ap-south-1.amazonaws.com 587
# Check security group allows outbound 587
# Check route table has internet gateway
SES quota exceeded:
aws ses get-send-quota --region ap-south-1
# If near limit, wait or request increase
After fixing, flush the queue:
sudo postqueue -f
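When the queue is large, it is quicker to group deferrals by status code than to read them one by one. A rough sketch, assuming your Postfix log lines carry the usual `dsn=` field; `deferred_by_dsn` is a hypothetical helper name.

```shell
# deferred_by_dsn
# Reads Postfix log lines on stdin and prints "COUNT DSN", most frequent first.
deferred_by_dsn() {
  grep 'status=deferred' |
    sed -n 's/.*dsn=\([0-9.]*\).*/\1/p' |   # keep only the DSN code
    sort | uniq -c | sort -rn |
    awk '{ print $1, $2 }'                  # normalise the padding uniq adds
}

# Usage:
#   sudo cat /var/log/postfix/postfix.log | deferred_by_dsn
```

As a rule of thumb from the enhanced status code classes, 4.4.x codes point at network or routing problems and 4.7.x codes at security or policy problems, which maps roughly onto the causes listed above.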
Problem 3: Logger Service Keeps Crashing
Check logs:
sudo journalctl -u ses-logger -n 50 --no-pager
sudo tail -50 /var/log/ses-logger-error.log
Common fixes:
boto3 missing:
python3 -c "import boto3" || sudo yum install -y python3-boto3
sudo systemctl restart ses-logger
Wrong queue URL:
# Get correct URL
QUEUE_URL=$(aws sqs get-queue-url --queue-name ses-events-queue --region ap-south-1 --query 'QueueUrl' --output text)
# Update service
sudo sed -i "s|Environment=\"SQS_QUEUE_URL=.*\"|Environment=\"SQS_QUEUE_URL=$QUEUE_URL\"|" /etc/systemd/system/ses-logger.service
sudo systemctl daemon-reload
sudo systemctl restart ses-logger
IAM permissions:
# Verify role attached
aws sts get-caller-identity
# Should show: PostfixSESLogger role
# If not, reattach IAM instance profile
Problem 4: No Delivery Events in Logs
Diagnosis:
# 1. Check SQS queue has messages
aws sqs get-queue-attributes \
--queue-url "$(aws sqs get-queue-url --queue-name ses-events-queue --region ap-south-1 --query 'QueueUrl' --output text)" \
--attribute-names ApproximateNumberOfMessages \
--region ap-south-1
If messages are accumulating:
Logger not processing → Check: sudo journalctl -u ses-logger
Restart logger → sudo systemctl restart ses-logger
If no messages in queue:
# 2. Verify SES publishing to SNS
aws ses get-identity-notification-attributes \
--identities yourdomain.com \
--region ap-south-1
# Should show all three topics configured
# 3. Reconfigure if needed
SNS_ARN=$(aws sns list-topics --region ap-south-1 --query "Topics[?contains(TopicArn, 'ses-events-topic')].TopicArn | [0]" --output text)
for EVENT in Delivery Bounce Complaint; do
aws ses set-identity-notification-topic \
--identity yourdomain.com \
--notification-type $EVENT \
--sns-topic "$SNS_ARN" \
--region ap-south-1
done
Problem 5: High Bounce Rate (>10%)
Analyze bounce reasons:
grep "status=bounced" /var/log/postfix/mail.log | \
grep -oP 'reason=\(\K[^\)]+' | \
sort | uniq -c | sort -rn | head -10
Common reasons:
"User unknown" (invalid addresses):
# Extract bounced addresses
grep "status=bounced" /var/log/postfix/mail.log | \
grep "bounce_type=Permanent" | \
grep -oP 'to=<\K[^>]+' | \
sort -u > bounced_addresses.txt
# Remove from your mailing list
"Mailbox full":
Temporary issue, will resolve
Retry after 24 hours
"550 Spam":
Review email content
Check SPF/DKIM/DMARC setup
Verify sender reputation
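For the "User unknown" case, the removal step depends on where your list lives. If it is a plain text file with one address per line, a sketch like this would do it. The file name `mailing_list.txt` and the helper `prune_list` are hypothetical; only `bounced_addresses.txt` comes from the extraction step above.

```shell
# prune_list LIST BOUNCED
# Prints LIST without any address that appears in BOUNCED.
# grep flags: -v invert, -x whole line, -i ignore case, -F fixed strings, -f patterns from file
prune_list() {
  if [ -s "$2" ]; then
    grep -vxiFf "$2" "$1"
  else
    cat "$1"        # nothing bounced: keep the list unchanged
  fi
}

# Usage (writes a new file so the original list stays untouched):
#   prune_list mailing_list.txt bounced_addresses.txt > mailing_list.clean.txt
```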
Problem 6: Emails Going to Spam
Verification checklist:
# 1. Check SPF
dig +short TXT yourdomain.com | grep spf
# Should include: include:amazonses.com
# 2. Check DKIM
aws ses get-identity-dkim-attributes \
--identities yourdomain.com \
--region ap-south-1
# Should show: DkimEnabled=true, Status=Success
# 3. Check DMARC
dig +short TXT _dmarc.yourdomain.com
# Should return DMARC policy
# 4. Check that account-level sending is enabled (this reports the sending switch, not reputation metrics)
aws ses get-account-sending-enabled --region ap-south-1
# Should show: Enabled=true
Content checklist:
Avoid spam trigger words (FREE!, ACT NOW!)
Include unsubscribe link
Balance text/image ratio (60% text minimum)
Use a consistent "From" name and address
Authenticate with SPF/DKIM/DMARC
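The SPF and DMARC checks from the verification checklist are easy to script so they can run alongside the morning health check. A minimal sketch that works on the text `dig +short` returns; the helper names `spf_has_ses` and `dmarc_policy` are hypothetical.

```shell
# spf_has_ses
# Reads TXT records on stdin; succeeds if an SPF record includes amazonses.com.
spf_has_ses() {
  grep -i 'v=spf1' | grep -qi 'include:amazonses\.com'
}

# dmarc_policy
# Reads TXT records on stdin; prints the p= policy of the first DMARC record (none, quarantine or reject).
dmarc_policy() {
  sed -n 's/.*v=DMARC1.*[; ]p=\([a-z]*\).*/\1/p' | head -1
}

# Usage:
#   dig +short TXT yourdomain.com | spf_has_ses && echo "SPF includes SES" || echo "SPF is missing the SES include"
#   dig +short TXT _dmarc.yourdomain.com | dmarc_policy
```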
Performance Optimization
Postfix Tuning
For higher throughput:
sudo vim /etc/postfix/main.cf
Add/update:
# Increase concurrent deliveries
default_destination_concurrency_limit = 50
default_destination_recipient_limit = 50
# Reduce queue lifetime
maximal_queue_lifetime = 1d
bounce_queue_lifetime = 1d
# Connection caching
smtp_connection_cache_on_demand = yes
smtp_connection_cache_destinations = email-smtp.ap-south-1.amazonaws.com
Reload:
sudo systemctl reload postfix
Logger Optimization
For high volume (>1000 events/min):
Edit /usr/local/bin/ses_logger.py:
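# SQS returns at most 10 messages per receive_message call, so keep the batch
# at that maximum and rely on long polling; to go faster, run more consumers
response = sqs.receive_message(
    QueueUrl=queue_url,
    MaxNumberOfMessages=10,  # API maximum; larger values are rejected
    WaitTimeSeconds=20       # long polling, also the API maximum
)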
Restart:
sudo systemctl restart ses-logger
Scaling Strategies
When to Scale
| Metric | Scale Trigger |
|---|---|
| CPU Usage | Sustained >70% |
| Emails/day | >40,000 (80% of quota) |
| Queue Size | Sustained >100 |
| Memory | >80% used |
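The emails/day trigger can be checked against the numbers SES itself reports. The sketch below assumes you feed it the `SentLast24Hours` and `Max24HourSend` fields from `aws ses get-send-quota`; the helper names `quota_pct` and `needs_scaling` are hypothetical.

```shell
# quota_pct SENT MAX
# Prints the whole-number percentage of the 24-hour quota already used.
quota_pct() {
  awk -v s="$1" -v m="$2" 'BEGIN { if (m > 0) printf "%d\n", s * 100 / m; else print 0 }'
}

# needs_scaling SENT MAX
# Succeeds when usage is above the 80% trigger from the table.
needs_scaling() {
  [ "$(quota_pct "$1" "$2")" -gt 80 ]
}

# Usage:
#   SENT=$(aws ses get-send-quota --region ap-south-1 --query 'SentLast24Hours' --output text)
#   MAX=$(aws ses get-send-quota --region ap-south-1 --query 'Max24HourSend' --output text)
#   needs_scaling "$SENT" "$MAX" && echo "Above 80% of quota: request an increase or scale"
```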
Vertical Scaling (Bigger Instance)
Current performance by instance:
| Instance | vCPU | RAM | Emails/day |
|---|---|---|---|
| t3a.small | 2 | 2GB | 10,000 |
| t3a.medium | 2 | 4GB | 50,000 |
| t3a.large | 2 | 8GB | 100,000 |
| c6a.xlarge | 4 | 8GB | 500,000 |
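Read as a lookup, the table gives the smallest instance for a target volume. This sketch treats each Emails/day figure as a ceiling for that instance, which is my reading of the table rather than a measured limit; `suggest_instance` is a hypothetical helper.

```shell
# suggest_instance EMAILS_PER_DAY
# Prints the smallest instance from the table whose daily figure covers the volume.
suggest_instance() {
  if   [ "$1" -le 10000 ];  then echo t3a.small
  elif [ "$1" -le 50000 ];  then echo t3a.medium
  elif [ "$1" -le 100000 ]; then echo t3a.large
  elif [ "$1" -le 500000 ]; then echo c6a.xlarge
  else
    echo "volume exceeds the table" >&2
    return 1
  fi
}

suggest_instance 8000     # prints t3a.small
suggest_instance 75000    # prints t3a.large
```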
Security Hardening
Restrict Relay Access
Tighten network access:
sudo vim /etc/postfix/main.cf
# Only specific IPs
mynetworks = 127.0.0.1, 10.10.3.125
# Or specific subnet
mynetworks = 10.10.0.0/21
Rate Limiting
Prevent abuse:
sudo vim /etc/postfix/main.cf
# Max 100 connections/min per client
smtpd_client_connection_rate_limit = 100
# Max 100 emails/min per client
smtpd_client_message_rate_limit = 100
Monitor IAM Usage
Enable CloudTrail for audit:
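aws cloudtrail create-trail \
  --name email-infrastructure-audit \
  --s3-bucket-name my-audit-logs
# A trail created from the CLI records nothing until logging is started
aws cloudtrail start-logging --name email-infrastructure-audit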
Series Complete! 🎉
Part 3: Operations ← You just finished this