Deploying a new version of a Lambda function sounds simple — upload code, done. But in production, you never want 100% of traffic hitting an untested version simultaneously.
How does Lambda route traffic between versions? How do you do a canary release that shifts 5% of traffic to a new version and automatically rolls back on errors? How does async traffic flow differently from sync traffic?
This article covers Lambda's traffic routing model from the inside out.
Lambda Versions and Aliases: The Foundation
Before traffic routing makes sense, you need to understand Lambda's versioning model.
Versions
Every time you publish a Lambda function, AWS creates an immutable version — a snapshot of your code and configuration at that point in time.
$LATEST → always points to the latest unpublished code (mutable)
:1 → first published version (immutable)
:2 → second published version (immutable)
:3 → third published version (immutable)
# Publish a new version via boto3
import boto3
lambda_client = boto3.client('lambda')
response = lambda_client.publish_version(
    FunctionName='brand-api',
    Description='v2.1.0 - faster logo lookup with DynamoDB cache'
)
version_arn = response['FunctionArn']
version_number = response['Version']
print(f'Published version {version_number}: {version_arn}')
# → Published version 42: arn:aws:lambda:us-east-1:123:function:brand-api:42
Versions are immutable — you cannot change the code of :42 after it's published. This is the foundation of safe deployments.
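To make the qualifier scheme concrete, here is a minimal sketch that splits a function ARN into its name and qualifier and reports whether that qualifier is a published (immutable) version. The helper names `split_lambda_arn` and `is_immutable` are illustrative and not part of boto3.

```python
def split_lambda_arn(arn: str) -> tuple[str, str]:
    """Return (function_name, qualifier); an unqualified ARN means $LATEST."""
    # arn:aws:lambda:<region>:<account>:function:<name>[:<qualifier>]
    parts = arn.split(':')
    if len(parts) not in (7, 8) or parts[2] != 'lambda' or parts[5] != 'function':
        raise ValueError(f'Not a Lambda function ARN: {arn}')
    qualifier = parts[7] if len(parts) == 8 else '$LATEST'
    return parts[6], qualifier


def is_immutable(qualifier: str) -> bool:
    """Only numeric qualifiers are published versions; $LATEST and aliases can move."""
    return qualifier.isdigit()


name, qualifier = split_lambda_arn('arn:aws:lambda:us-east-1:123:function:brand-api:42')
print(name, qualifier, is_immutable(qualifier))
```

A deploy script could use a check like this to refuse to point production traffic at anything that is not a numbered version.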
Aliases
An alias is a named pointer to a specific version. Your API Gateway, EventBridge rules, and other triggers should always point to an alias — never to $LATEST or a version number directly.
brand-api:prod → points to :42 (production traffic)
brand-api:staging → points to :43 (staging traffic)
brand-api:canary → points to :42 (95%) + :43 (5%) ← weighted routing
# Create or update an alias
lambda_client.create_alias(
    FunctionName='brand-api',
    Name='prod',
    FunctionVersion='42',
    Description='Production alias'
)
# Update alias to point to new version
lambda_client.update_alias(
    FunctionName='brand-api',
    Name='prod',
    FunctionVersion='43'
)
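When inspecting an alias, it helps to see the full split rather than only the primary version. The sketch below assumes the response shape returned by boto3's `get_alias` (`FunctionVersion` plus an optional `RoutingConfig.AdditionalVersionWeights`); the helper name `alias_weights` is made up for illustration.

```python
def alias_weights(alias: dict) -> dict:
    """Map each version behind an alias to its share of traffic (0.0 to 1.0)."""
    extra = (alias.get('RoutingConfig') or {}).get('AdditionalVersionWeights') or {}
    # The primary version receives whatever the additional version does not
    primary_share = round(1.0 - sum(extra.values()), 6)
    return {alias['FunctionVersion']: primary_share, **extra}


# Sample dictionaries shaped like get_alias responses (not real API output)
print(alias_weights({'FunctionVersion': '42'}))
print(alias_weights({
    'FunctionVersion': '42',
    'RoutingConfig': {'AdditionalVersionWeights': {'43': 0.05}},
}))
```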
Traffic Splitting: Canary Deployments with Weighted Aliases
The most powerful traffic routing feature in Lambda is weighted aliases — you can split traffic between two versions with any percentage split.
brand-api:prod
├── version :42 → 95% of traffic
└── version :43 → 5% of traffic ← canary
This is Lambda's equivalent of what Knative achieves with Istio VirtualService traffic splitting — but built natively into the Lambda service.
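Weighted routing is probabilistic per request, not a strict round-robin. The following is a local simulation of that idea, not Lambda's actual implementation; `pick_version` is a hypothetical helper, and the limit of one additional version mirrors the two-version constraint on aliases.

```python
import random


def pick_version(primary: str, additional_weights: dict, rng: random.Random) -> str:
    """Choose the version for one request given the alias weights."""
    if len(additional_weights) > 1:
        raise ValueError('an alias can route to at most one additional version')
    for version, weight in additional_weights.items():
        if not 0.0 <= weight <= 1.0:
            raise ValueError(f'weight out of range: {weight}')
        if rng.random() < weight:
            return version
    return primary


rng = random.Random(7)  # seeded so the simulation is repeatable
picks = [pick_version('42', {'43': 0.05}, rng) for _ in range(10_000)]
print(f"v43 handled {picks.count('43') / len(picks):.1%} of simulated requests")
```

Because each request is an independent draw, a low-traffic function may see a canary share that differs noticeably from the configured weight over short windows.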
Implementing a Canary Deployment
# deploy_canary.py
import boto3

lambda_client = boto3.client('lambda')


def deploy_canary(function_name: str, new_version: str, canary_percent: int = 5):
    """
    Deploy a new Lambda version as a canary.
    Routes canary_percent% of traffic to the new version.
    """
    # Get the current prod alias
    alias = lambda_client.get_alias(
        FunctionName=function_name,
        Name='prod'
    )
    current_version = alias['FunctionVersion']
    print(f'Current prod version: {current_version}')
    print(f'Deploying canary: version {new_version} at {canary_percent}%')
    # Update the alias with weighted routing
    lambda_client.update_alias(
        FunctionName=function_name,
        Name='prod',
        FunctionVersion=current_version,  # stable version gets the majority
        RoutingConfig={
            'AdditionalVersionWeights': {
                new_version: canary_percent / 100  # e.g., 0.05 = 5%
            }
        }
    )
    print(f'Canary deployed: {100 - canary_percent}% → v{current_version}, '
          f'{canary_percent}% → v{new_version}')


def promote_canary(function_name: str, new_version: str):
    """Promote canary to 100% (full deployment)"""
    lambda_client.update_alias(
        FunctionName=function_name,
        Name='prod',
        FunctionVersion=new_version,
        RoutingConfig={
            'AdditionalVersionWeights': {}  # clear weighted routing
        }
    )
    print(f'Canary promoted: 100% traffic now on version {new_version}')


def rollback_canary(function_name: str, stable_version: str):
    """Roll back: remove canary, restore 100% to stable version"""
    lambda_client.update_alias(
        FunctionName=function_name,
        Name='prod',
        FunctionVersion=stable_version,
        RoutingConfig={
            'AdditionalVersionWeights': {}  # clear canary
        }
    )
    print(f'Rolled back: 100% traffic restored to version {stable_version}')


# Usage
deploy_canary('brand-api', new_version='43', canary_percent=5)
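The deploy, promote, and rollback steps can be chained into one loop. This is a rough sketch under several assumptions: `is_healthy` is a caller-supplied check (for example, a CloudWatch query), the step percentages and soak time are arbitrary, and `RecordingClient` is a hypothetical stub so the loop can be dry-run without AWS. In real use you would pass `boto3.client('lambda')` instead.

```python
import time
from typing import Callable, Sequence


class RecordingClient:
    """Hypothetical stand-in for boto3's Lambda client, for dry runs only."""

    def __init__(self):
        self.calls = []

    def update_alias(self, **kwargs):
        self.calls.append(kwargs)
        return kwargs


def shift_traffic(client, function_name: str, stable_version: str,
                  new_version: str, steps: Sequence[int],
                  is_healthy: Callable[[], bool],
                  wait: Callable[[float], None] = time.sleep,
                  soak_seconds: float = 300, alias: str = 'prod') -> bool:
    """Walk the canary through `steps` percent; roll back on the first failed check."""

    def set_alias(version: str, weights: dict) -> None:
        client.update_alias(
            FunctionName=function_name,
            Name=alias,
            FunctionVersion=version,
            RoutingConfig={'AdditionalVersionWeights': weights},
        )

    for percent in steps:
        set_alias(stable_version, {new_version: percent / 100})
        wait(soak_seconds)  # let metrics accumulate before judging
        if not is_healthy():
            set_alias(stable_version, {})  # rollback: 100% to stable
            return False
    set_alias(new_version, {})  # promote: 100% to the new version
    return True


# Dry run against the stub, with no real waiting
client = RecordingClient()
ok = shift_traffic(client, 'brand-api', '42', '43', steps=[5, 25, 50],
                   is_healthy=lambda: True, wait=lambda seconds: None)
print(ok, client.calls[-1]['FunctionVersion'])
```

Injecting `wait` and `is_healthy` keeps the loop testable; the same structure is what CodeDeploy automates in the next section.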
Automated Canary with CloudWatch Alarms (CodeDeploy)
Manually managing canary percentages is error-prone. AWS CodeDeploy integrates with Lambda to automate the shift — and automatically roll back if CloudWatch alarms fire.
# serverless.yml: automated canary deployment
# (deploymentSettings is provided by the serverless-plugin-canary-deployments plugin)
plugins:
  - serverless-plugin-canary-deployments
provider:
  name: aws
  deploymentMethod: direct
functions:
  brandApi:
    handler: handler.handler
    deploymentSettings:
      type: Canary10Percent5Minutes  # shift 10% now, 100% after 5 minutes
      alias: prod
      alarms:
        - BrandApiErrorRateAlarm  # rollback if this alarm fires
        - BrandApiLatencyAlarm
# CloudFormation: define the rollback alarms
resources:
  Resources:
    BrandApiErrorRateAlarm:
      Type: AWS::CloudWatch::Alarm
      Properties:
        AlarmName: brand-api-error-rate-canary
        MetricName: Errors
        Namespace: AWS/Lambda
        Dimensions:
          - Name: FunctionName
            Value: brand-api
          - Name: Resource
            Value: brand-api:prod  # monitor the alias, not a specific version
        Statistic: Sum
        Period: 60
        EvaluationPeriods: 2
        Threshold: 5  # rollback if >5 errors in each of 2 consecutive 1-minute periods
        ComparisonOperator: GreaterThanThreshold
    BrandApiLatencyAlarm:
      Type: AWS::CloudWatch::Alarm
      Properties:
        AlarmName: brand-api-p99-latency-canary
        MetricName: Duration
        Namespace: AWS/Lambda
        Dimensions:
          - Name: FunctionName
            Value: brand-api
          - Name: Resource
            Value: brand-api:prod
        ExtendedStatistic: p99
        Period: 60
        EvaluationPeriods: 2
        Threshold: 2000  # rollback if p99 > 2000 ms for 2 consecutive periods
        ComparisonOperator: GreaterThanThreshold
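The interaction between Period, EvaluationPeriods, and Threshold is easy to misread, so here is a simplified local model of it. It assumes every evaluation period must breach (the default when DatapointsToAlarm is not set) and ignores missing-data handling; `alarm_fires` is an illustrative helper, not a CloudWatch API.

```python
def alarm_fires(period_sums: list, threshold: float, evaluation_periods: int) -> bool:
    """True once `evaluation_periods` consecutive periods all exceed `threshold`."""
    streak = 0
    for value in period_sums:
        # GreaterThanThreshold: a value equal to the threshold does not breach
        streak = streak + 1 if value > threshold else 0
        if streak >= evaluation_periods:
            return True
    return False


# Errors per 1-minute period, using Threshold=5 and EvaluationPeriods=2
print(alarm_fires([3, 3], threshold=5, evaluation_periods=2))        # 6 errors in total, no period breaches
print(alarm_fires([6, 0, 6, 0], threshold=5, evaluation_periods=2))  # breaches are not consecutive
print(alarm_fires([1, 6, 7], threshold=5, evaluation_periods=2))     # two consecutive breaches
```

Under this model a short error burst confined to one minute would not trigger a rollback, which may or may not be what you want for a canary.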
CodeDeploy deployment types for Lambda:
| Type | Behavior |
|---|---|
| AllAtOnce | 100% of traffic shifts immediately (no canary) |
| Canary10Percent5Minutes | 10% for 5 min, then 100% |
| Canary10Percent10Minutes | 10% for 10 min, then 100% |
| Canary10Percent15Minutes | 10% for 15 min, then 100% |
| Linear10PercentEvery1Minute | +10% every minute until 100% |
| Linear10PercentEvery2Minutes | +10% every 2 minutes until 100% |
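The type names encode their own schedule. As a rough illustration, the sketch below parses a name from the table into (minute, percent on new version) steps. It is a local interpretation of the table, assuming the first increment happens at minute 0; `traffic_schedule` is not a CodeDeploy API, and exact timing in a real deployment may differ.

```python
import re


def traffic_schedule(deployment_type: str) -> list:
    """Return [(minute, percent_on_new_version), ...] for a deployment type name."""
    if deployment_type == 'AllAtOnce':
        return [(0, 100)]

    canary = re.fullmatch(r'Canary(\d+)Percent(\d+)Minutes', deployment_type)
    if canary:
        percent, minutes = int(canary.group(1)), int(canary.group(2))
        return [(0, percent), (minutes, 100)]

    linear = re.fullmatch(r'Linear(\d+)PercentEvery(\d+)Minutes?', deployment_type)
    if linear:
        step, interval = int(linear.group(1)), int(linear.group(2))
        if step <= 0:
            raise ValueError('step must be positive')
        schedule, percent, minute = [], step, 0
        while percent < 100:
            schedule.append((minute, percent))
            percent += step
            minute += interval
        schedule.append((minute, 100))  # final increment completes the shift
        return schedule

    raise ValueError(f'Unrecognized deployment type: {deployment_type}')


for name in ('Canary10Percent5Minutes', 'Linear10PercentEvery2Minutes'):
    print(name, traffic_schedule(name))
```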
How Traffic Flows: Sync vs Async
Traffic routing in Lambda isn't just about version weights — the entire flow differs between synchronous and asynchronous invocations.
Synchronous Traffic Flow (API Gateway)
Client Request
│
▼
API Gateway
│ (points to alias: brand-api:prod)
▼
Lambda Service (weighted routing)
├── 95% → Execution Environment running v42
└── 5% → Execution Environment running v43
│
▼
Response returned to API Gateway → Client
Key characteristics:
- Direct path: client waits for the response
- No buffering: if Lambda is throttled, it rejects the invocation immediately with a 429 (TooManyRequestsException) instead of queuing it, and the client receives an error
- Version routing: Lambda's weighted alias determines which version handles each request
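Since nothing buffers a synchronous call, retrying is the caller's job. Below is a minimal sketch of exponential backoff with full jitter. `Throttled` is a hypothetical exception standing in for however your client surfaces a 429, and the attempt count and base delay are arbitrary.

```python
import random
import time


class Throttled(Exception):
    """Hypothetical stand-in for a throttling (429) error from the caller's client."""


def call_with_backoff(call, max_attempts: int = 4, base_delay: float = 0.1,
                      sleep=time.sleep, rng=random.random):
    """Retry `call` on Throttled, sleeping up to base_delay * 2**attempt between tries."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Throttled:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the throttle to the caller
            # Full jitter: a random fraction of the exponential ceiling
            sleep(base_delay * (2 ** attempt) * rng())


# Demo: a call that is throttled twice before succeeding
attempts = []

def flaky_call():
    attempts.append(1)
    if len(attempts) < 3:
        raise Throttled()
    return 'ok'

print(call_with_backoff(flaky_call, sleep=lambda seconds: None))
```

Jitter matters here because many clients retrying on the same schedule would hit the concurrency limit again in lockstep.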
# handler.py: use context to log which version is handling the request
def handler(event, context):
    # Log version info for canary monitoring
    function_version = context.function_version
    print(f'Handled by version: {function_version}')
    # Your business logic (get_brand is assumed to be defined elsewhere)
    brand_id = event['pathParameters']['brandId']
    return get_brand(brand_id)
Asynchronous Traffic Flow (S3 / EventBridge)
Async traffic introduces a buffer layer between the event source and Lambda execution. This is the key architectural difference.
Event Source (S3 upload / EventBridge rule)
│
▼
Lambda Internal Queue ← traffic is buffered here
│
▼ (Lambda polls the queue)
Lambda Service (weighted routing)
├── 95% → Execution Environment running v42
└── 5% → Execution Environment running v43
│
▼
Result → CloudWatch Logs
→ Success destination (SNS/SQS/EventBridge/Lambda)
→ Failure destination (DLQ) on repeated failures
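The buffer is critical: it decouples the event producer from Lambda's availability. If Lambda is throttled or scaling out, events wait in the queue and are processed when capacity becomes available. The buffer is not unbounded, though: by default Lambda keeps an event for up to 6 hours, and an event that exceeds that age or exhausts its retries is discarded unless a failure destination or DLQ is configured.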
# handler.py: async handler with destination routing
from urllib.parse import unquote_plus


def handler(event, context):
    """
    Async handler: processes S3 upload events.
    On success: result routed to the success-destination SQS queue.
    On failure: after 2 retries, routed to the DLQ.
    """
    processed = []
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        # Object keys arrive URL-encoded in S3 event notifications
        key = unquote_plus(record['s3']['object']['key'])
        try:
            # process_brand_asset is assumed to be defined elsewhere
            result = process_brand_asset(bucket, key)
        except Exception as e:
            print(f'Failed to process {key}: {e}')
            raise  # re-raise to trigger Lambda retry + eventual DLQ routing
        print(f'Successfully processed: {key}')
        processed.append({'processed': key, 'result': result})
    # Return after the loop so every record in the event is handled
    return processed
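The retry-then-destination behavior described in the docstring can be modeled locally. This is a toy simulation, not Lambda's implementation: it ignores the delay between retries and the maximum event age, and `deliver_async` is an illustrative name.

```python
def deliver_async(handler, event, max_retries: int = 2) -> dict:
    """Invoke `handler` up to 1 + max_retries times, then route the outcome."""
    last_error = None
    for attempt in range(1, max_retries + 2):
        try:
            result = handler(event)
            return {'destination': 'onSuccess', 'attempts': attempt, 'result': result}
        except Exception as exc:  # any unhandled error counts as a failed attempt
            last_error = exc
    return {
        'destination': 'onFailure',
        'attempts': max_retries + 1,
        'error': repr(last_error),
    }


print(deliver_async(lambda event: event['key'].upper(), {'key': 'logo.png'}))
print(deliver_async(lambda event: 1 / 0, {'key': 'logo.png'}))
```

With `max_retries=2`, a permanently failing event is attempted three times in total before it reaches the failure destination.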
# serverless.yml: configure async destinations
functions:
  processBrandAsset:
    handler: handler.handler
    destinations:
      onSuccess: arn:aws:sqs:us-east-1:123:brand-asset-success
      onFailure: arn:aws:sqs:us-east-1:123:brand-asset-dlq
    maximumRetryAttempts: 2
    events:
      - s3:
          bucket: brand-assets
          event: s3:ObjectCreated:*
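The same settings can be applied outside the Serverless Framework with boto3's `put_function_event_invoke_config`. The sketch below only builds and validates the keyword arguments, so it runs without AWS access; the helper name `event_invoke_config` and the choice of the `prod` qualifier are assumptions for illustration.

```python
def event_invoke_config(function_name: str, qualifier: str,
                        on_success_arn: str, on_failure_arn: str,
                        max_retries: int = 2,
                        max_event_age_seconds: int = 21600) -> dict:
    """Build kwargs for lambda_client.put_function_event_invoke_config."""
    if not 0 <= max_retries <= 2:
        raise ValueError('MaximumRetryAttempts must be between 0 and 2')
    if not 60 <= max_event_age_seconds <= 21600:
        raise ValueError('MaximumEventAgeInSeconds must be between 60 and 21600')
    return {
        'FunctionName': function_name,
        'Qualifier': qualifier,  # scope the config to one alias or version
        'MaximumRetryAttempts': max_retries,
        'MaximumEventAgeInSeconds': max_event_age_seconds,
        'DestinationConfig': {
            'OnSuccess': {'Destination': on_success_arn},
            'OnFailure': {'Destination': on_failure_arn},
        },
    }


kwargs = event_invoke_config(
    'processBrandAsset', 'prod',
    'arn:aws:sqs:us-east-1:123:brand-asset-success',
    'arn:aws:sqs:us-east-1:123:brand-asset-dlq',
)
print(kwargs)
# With credentials configured:
# boto3.client('lambda').put_function_event_invoke_config(**kwargs)
```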
Concurrency Control at the Traffic Layer
In Knative's model, the queue-proxy sidecar acts as a per-pod concurrency limiter — it queues excess requests locally before forwarding to the user container, and reports metrics to the autoscaler.
AWS Lambda implements an equivalent mechanism natively, without requiring a sidecar:
Per-Function Concurrency Limiting
# Set maximum concurrency — Lambda queues excess async requests
lambda_client.put_function_concurrency(
    FunctionName='brand-logo-processor',
    ReservedConcurrentExecutions=50  # max 50 simultaneous executions
)
For synchronous invocations: requests beyond the concurrency limit are immediately throttled (429).
For asynchronous invocations: requests beyond the concurrency limit are queued in Lambda's internal event queue (up to 6 hours) and retried as capacity becomes available.
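The difference between the two paths can be seen in a toy model of a concurrency limit. This is a deliberately simplified simulation (no retries, no event expiry); the class `ConcurrencyGate` and its status codes are illustrative only.

```python
from collections import deque


class ConcurrencyGate:
    """Toy model of a function with reserved concurrency `limit`."""

    def __init__(self, limit: int):
        self.limit = limit
        self.in_flight = 0
        self.queue = deque()  # stands in for Lambda's internal async queue

    def invoke_sync(self, event) -> int:
        if self.in_flight >= self.limit:
            return 429  # throttled: the caller must retry
        self.in_flight += 1
        return 200

    def invoke_async(self, event) -> int:
        if self.in_flight >= self.limit:
            self.queue.append(event)  # held until capacity frees up
        else:
            self.in_flight += 1
        return 202  # accepted either way

    def finish(self) -> None:
        """One execution completes; a queued event, if any, takes its slot."""
        self.in_flight -= 1
        if self.queue:
            self.queue.popleft()
            self.in_flight += 1


gate = ConcurrencyGate(limit=1)
print(gate.invoke_sync({'id': 1}), gate.invoke_sync({'id': 2}), gate.invoke_async({'id': 3}))
```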
Per-Alias Concurrency (Provisioned Concurrency on Aliases)
You can apply Provisioned Concurrency specifically to an alias, ensuring the production alias always has warm environments while the canary alias uses on-demand scaling:
# Apply provisioned concurrency to prod alias only
lambda_client.put_provisioned_concurrency_config(
    FunctionName='brand-api',
    Qualifier='prod',  # the alias name
    ProvisionedConcurrentExecutions=20
)
# Canary alias uses on-demand (may cold start, but that's acceptable for 5% traffic)
# No provisioned concurrency set on 'canary' alias
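Provisioned environments take time to initialize, so a deploy script may want to wait before sending traffic. The sketch below polls `get_provisioned_concurrency_config` until its `Status` is `READY`. The poll interval and limit are arbitrary, and `ScriptedClient` is a hypothetical stub that replays statuses so the loop can run without AWS.

```python
import time


class ScriptedClient:
    """Hypothetical stub that replays a fixed sequence of statuses."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)

    def get_provisioned_concurrency_config(self, FunctionName, Qualifier):
        return {'Status': next(self._statuses)}


def wait_until_ready(client, function_name: str, qualifier: str,
                     wait=time.sleep, poll_seconds: float = 5,
                     max_polls: int = 60) -> dict:
    """Poll until provisioned concurrency on `qualifier` reports READY."""
    for _ in range(max_polls):
        config = client.get_provisioned_concurrency_config(
            FunctionName=function_name,
            Qualifier=qualifier,
        )
        status = config['Status']  # IN_PROGRESS, READY, or FAILED
        if status == 'READY':
            return config
        if status == 'FAILED':
            raise RuntimeError(config.get('StatusReason', 'allocation failed'))
        wait(poll_seconds)
    raise TimeoutError(f'{function_name}:{qualifier} not ready after {max_polls} polls')


stub = ScriptedClient(['IN_PROGRESS', 'IN_PROGRESS', 'READY'])
print(wait_until_ready(stub, 'brand-api', 'prod', wait=lambda seconds: None))
```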
Blue/Green Deployment Pattern
For changes that are too risky for gradual canary (e.g., breaking schema changes), use a full blue/green deployment:
Blue environment: brand-api:prod → version :42 (100% traffic)
Green environment: brand-api:green → version :43 (0% traffic, fully tested)
After validation:
Blue environment: brand-api:prod → version :43 (100% traffic, instant cutover)
Green environment: brand-api:green → version :42 (kept for instant rollback)
# blue_green_deploy.py
import boto3

lambda_client = boto3.client('lambda')


def blue_green_cutover(function_name: str, new_version: str):
    """
    Instant traffic cutover from current prod version to new version.
    Previous version kept on 'previous' alias for instant rollback.
    """
    # Get current prod version (this becomes 'blue' / previous)
    current = lambda_client.get_alias(
        FunctionName=function_name,
        Name='prod'
    )
    current_version = current['FunctionVersion']
    # Preserve current version on 'previous' alias for rollback
    try:
        lambda_client.update_alias(
            FunctionName=function_name,
            Name='previous',
            FunctionVersion=current_version
        )
    except lambda_client.exceptions.ResourceNotFoundException:
        # First deployment: the 'previous' alias does not exist yet
        lambda_client.create_alias(
            FunctionName=function_name,
            Name='previous',
            FunctionVersion=current_version
        )
    # Cut over prod to new version (instant, no gradual shift)
    lambda_client.update_alias(
        FunctionName=function_name,
        Name='prod',
        FunctionVersion=new_version,
        RoutingConfig={'AdditionalVersionWeights': {}}
    )
    print(f'Cutover complete: prod now on v{new_version}')
    print(f'Rollback available: run instant_rollback() to restore v{current_version}')


def instant_rollback(function_name: str):
    """Roll back to previous version instantly"""
    previous = lambda_client.get_alias(
        FunctionName=function_name,
        Name='previous'
    )
    previous_version = previous['FunctionVersion']
    lambda_client.update_alias(
        FunctionName=function_name,
        Name='prod',
        FunctionVersion=previous_version,
        RoutingConfig={'AdditionalVersionWeights': {}}
    )
    print(f'Rolled back: prod restored to v{previous_version}')
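The "fully tested" step before cutover can include invoking the new version directly by its qualifier, which bypasses the alias entirely. This is a minimal sketch: the payload is made up, `smoke_test` is an illustrative helper, and `StubInvokeClient` is a hypothetical stub that mimics the shape of boto3's `invoke` response so the check can be exercised offline.

```python
import io
import json


class StubInvokeClient:
    """Hypothetical stub mimicking the shape of boto3's invoke() response."""

    def __init__(self, function_error=None):
        self.function_error = function_error
        self.last_kwargs = None

    def invoke(self, **kwargs):
        self.last_kwargs = kwargs
        response = {'StatusCode': 200, 'Payload': io.BytesIO(b'{"ok": true}')}
        if self.function_error:
            response['FunctionError'] = self.function_error
        return response


def smoke_test(client, function_name: str, version: str, payload: dict) -> bool:
    """Invoke one specific version synchronously and report whether it succeeded."""
    response = client.invoke(
        FunctionName=function_name,
        Qualifier=version,  # target the version itself, not the prod alias
        InvocationType='RequestResponse',
        Payload=json.dumps(payload).encode('utf-8'),
    )
    body = response['Payload'].read().decode('utf-8')
    if response.get('FunctionError'):
        # The function raised or timed out; the body carries the error details
        print(f'Version {version} failed the smoke test: {body}')
        return False
    return response['StatusCode'] == 200


print(smoke_test(StubInvokeClient(), 'brand-api', '43',
                 {'pathParameters': {'brandId': 'demo'}}))
```

Running a check like this before `blue_green_cutover` keeps production traffic off the new version until it has handled at least one real invocation.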
Deployment Strategy Decision Guide
How risky is this deployment?
│
├── Low risk (config change, minor bug fix)
│ └── AllAtOnce — deploy directly to 100%
│
├── Medium risk (new feature, refactor)
│ └── Canary — start at 5–10%, monitor errors/latency,
│ auto-promote or rollback via CodeDeploy alarms
│
├── High risk (breaking change, new external dependency)
│ └── Blue/Green — full parallel environment,
│ instant cutover after validation, instant rollback
│
└── Schema/data migration (irreversible changes)
└── Feature flags in code + gradual rollout
(decouple deployment from feature activation)
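For the last branch, a percentage-based feature flag lets the code ship dark and activate gradually. The sketch below buckets users deterministically by hashing; the flag name, user IDs, and the `in_rollout` helper are all hypothetical, and a real system would load the percentage from configuration rather than hard-code it.

```python
import hashlib


def in_rollout(flag: str, user_id: str, percent: int) -> bool:
    """Deterministically place a user in one of 100 buckets for this flag."""
    digest = hashlib.sha256(f'{flag}:{user_id}'.encode('utf-8')).digest()
    bucket = int.from_bytes(digest[:4], 'big') % 100
    # A user in bucket b is enabled once percent exceeds b, and stays enabled as it grows
    return bucket < percent


users = [f'user-{n}' for n in range(1000)]
enabled = sum(in_rollout('new-schema', user, 10) for user in users)
print(f'{enabled} of {len(users)} users see the new code path at 10%')
```

Hashing the flag name together with the user ID keeps each user's experience stable across invocations while giving different flags independent cohorts.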
Summary
| Concept | AWS Lambda Implementation |
|---|---|
| Traffic splitting | Weighted aliases (e.g., 95% v42 / 5% v43) |
| Canary deployment | CodeDeploy + Lambda aliases + CloudWatch alarms |
| Blue/Green | Two aliases pointing to different versions, instant cutover |
| Async traffic buffering | Lambda internal event queue (up to 6 hours) |
| Concurrency control | Reserved concurrency + Provisioned Concurrency per alias |
| Automatic rollback | CodeDeploy monitors alarms, rolls back if threshold breached |
The key insight: Lambda's alias + versioning system is its traffic routing layer. Every production Lambda function should be invoked via an alias — never via $LATEST. This single practice unlocks canary deployments, blue/green releases, and instant rollbacks.
Next in this series: **Part 5 — Event-Driven Automation: Building a Serverless Maintenance Bot with Lambda & EventBridge**