Amazon Cognito's long-awaited Multi-Region Replication (MRR) feature is now generally available, automatically synchronizing user data, credentials, and pool configurations to a secondary AWS Region. Alongside this, AWS has added native support for customer managed KMS keys for encryption control — a critical feature for regulated industries like healthcare and financial services.
Why This Matters
Before MRR, teams building HA authentication on Cognito had to maintain error-prone custom replication solutions using Lambda triggers, DynamoDB Global Tables, and complex sync logic. End users experienced forced password resets during regional failovers, and machine-to-machine (M2M) clients needed to be manually reconfigured in secondary regions.
With MRR, Cognito now:
- Automatically replicates user profiles, credentials, MFA secrets, and pool configurations from primary → secondary region
- Allows both regions to recognize tokens issued by either region, preserving active sessions
- Supports all auth methods — social federation (Google, Apple, Amazon, Facebook), SAML, OIDC, and M2M OAuth2 flows
- Provides a built-in Route 53 health check-based failover for custom domains
Architecture Overview
[Figure: complete MRR architecture diagram]
Prerequisites
Before enabling MRR, your user pool must meet these requirements:
- Essentials or Plus feature plan (not available on Lite tier)
- Multi-region customer managed KMS key replicated in all target regions
- Multi-region OIDC issuer configured on the user pool
- A custom domain configured (required for automatic Route 53-based failover)
Step 1: Create a Multi-Region KMS Key
AWS CLI
# Step 1: Create the primary multi-region KMS key in us-west-2 and capture its ARN
# (reading the ARN from the create-key response avoids guessing via list-keys)
PRIMARY_KEY_ARN=$(aws kms create-key \
  --region us-west-2 \
  --description "Cognito MRR Key" \
  --multi-region \
  --key-usage ENCRYPT_DECRYPT \
  --origin AWS_KMS \
  --tags TagKey=Purpose,TagValue=CognitoMRR \
  --query "KeyMetadata.Arn" \
  --output text)
echo "Primary key ARN: $PRIMARY_KEY_ARN"
# Step 2: Replicate the key to the secondary region
aws kms replicate-key \
--region us-west-2 \
--key-id $PRIMARY_KEY_ARN \
--replica-region us-east-1 \
--description "Cognito MRR Key Replica (us-east-1)"
# Step 3: Update key policy to allow Cognito access
aws kms put-key-policy \
--region us-west-2 \
--key-id $PRIMARY_KEY_ARN \
--policy-name default \
--policy '{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "AllowCognitoKMSAccess",
"Effect": "Allow",
"Principal": {
"Service": "cognito-idp.amazonaws.com"
},
"Action": [
"kms:Encrypt",
"kms:Decrypt",
"kms:GenerateDataKey",
"kms:DescribeKey"
],
"Resource": "*"
},
{
"Sid": "AllowRootAccount",
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::YOUR_ACCOUNT_ID:root"
},
"Action": "kms:*",
"Resource": "*"
}
]
}'
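The same key policy appears several times in this article: the CLI version above grants four KMS actions to Cognito, while the Terraform and boto3 versions later also include `kms:CreateGrant`. A minimal sketch of a helper that builds the policy once is shown below. The function name and the `include_create_grant` flag are hypothetical, introduced only for illustration; confirm which actions your setup actually needs.

```python
import json


def build_cognito_key_policy(account_id: str, include_create_grant: bool = False) -> str:
    """Return the key policy from this article as a JSON string."""
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError("account_id must be a 12-digit AWS account ID")

    cognito_actions = [
        "kms:Encrypt",
        "kms:Decrypt",
        "kms:GenerateDataKey",
        "kms:DescribeKey",
    ]
    if include_create_grant:
        # Present in the Terraform and boto3 variants of the policy
        cognito_actions.append("kms:CreateGrant")

    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCognitoKMSAccess",
                "Effect": "Allow",
                "Principal": {"Service": "cognito-idp.amazonaws.com"},
                "Action": cognito_actions,
                "Resource": "*",
            },
            {
                "Sid": "AllowRootAccount",
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "kms:*",
                "Resource": "*",
            },
        ],
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    # 123456789012 is a placeholder account ID
    print(build_cognito_key_policy("123456789012"))
```

The output can be written to a file and passed to `aws kms put-key-policy --policy file://policy.json`, which keeps the primary and replica key policies identical.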
Step 2: Configure the Cognito User Pool
Attach the KMS Key and Configure Multi-Region OIDC Issuer (CLI)
# Update the user pool to use the customer managed KMS key
aws cognito-idp update-user-pool \
--region us-west-2 \
--user-pool-id us-west-2_XXXXXXXXX \
--kms-key-id arn:aws:kms:us-west-2:<ACCOUNT_ID>:key/mrk-XXXXXXXXXX
# Switch the user pool to a multi-region OIDC issuer
# (This is done via the console "Change issuer type" step;
# verify issuer type via describe-user-pool)
aws cognito-idp describe-user-pool \
--region us-west-2 \
--user-pool-id us-west-2_XXXXXXXXX \
--query "UserPool.{IssuerConfiguration:IssuerConfiguration, Domain:Domain}"
⚠️ Important: Switching to a multi-region OIDC issuer changes the `iss` claim in all tokens. Update all backend services, mobile apps, and SPAs to use the new issuer URL before proceeding.
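Because the `iss` claim changes, a service that validates tokens may need to accept both the old and the new issuer while clients roll over. The sketch below illustrates only that allow-list check. It decodes the payload without verifying the signature, so it is not a complete validator; a real service must still verify the signature against the pool's JWKS. The issuer URLs and the `make_unsigned_token` helper are hypothetical and exist only to make the example runnable.

```python
import base64
import json


def unverified_claims(token: str) -> dict:
    """Decode a JWT payload WITHOUT verifying its signature (inspection only)."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1] + "=" * (-len(parts[1]) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(payload))


def issuer_is_trusted(token: str, trusted_issuers: set) -> bool:
    """True when the token's iss claim is one of the issuers we accept."""
    return unverified_claims(token).get("iss") in trusted_issuers


def make_unsigned_token(claims: dict) -> str:
    """Build a fake, unsigned token for local testing only."""
    def b64(obj: dict) -> str:
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    return f"{b64({'alg': 'none'})}.{b64(claims)}.sig"


# Hypothetical issuer URLs, used only for illustration
OLD_ISSUER = "https://cognito-idp.us-west-2.amazonaws.com/us-west-2_EXAMPLE"
NEW_ISSUER = "https://new-issuer.example.com/us-west-2_EXAMPLE"
TRUSTED = {OLD_ISSUER, NEW_ISSUER}

if __name__ == "__main__":
    print(issuer_is_trusted(make_unsigned_token({"iss": NEW_ISSUER}), TRUSTED))
    print(issuer_is_trusted(make_unsigned_token({"iss": "https://other.example.com"}), TRUSTED))
```

Once every client has moved to the new issuer, the old URL can be dropped from the trusted set.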
Step 3: Create the Replica User Pool
AWS CLI
# Create the replica in us-east-1
# Note: The API call is made against the PRIMARY region
aws cognito-idp create-user-pool-replica-region \
--region us-west-2 \
--user-pool-id us-west-2_XXXXXXXXX \
--replica-region '{"RegionName": "us-east-1", "KmsKeyId": "arn:aws:kms:us-east-1:<ACCOUNT_ID>:key/mrk-XXXXXXXXXX"}'
# Check replication status — replica info lives on the PRIMARY pool's ReplicaRegions field
aws cognito-idp describe-user-pool \
--region us-west-2 \
--user-pool-id us-west-2_XXXXXXXXX \
--query "UserPool.ReplicaRegions[*].{Region:RegionName, Status:Status}"
# Describe the replica pool directly in the secondary region
aws cognito-idp describe-user-pool \
--region us-east-1 \
--user-pool-id us-east-1_XXXXXXXXX \
--query "UserPool.{Id:Id, Status:Status}"
⚠️ Note: There is no `update-user-pool-replica` or `list-user-pool-replicas` CLI command. The replica becomes active automatically once initial sync completes. Replica status is tracked via the primary pool's `ReplicaRegions` field.
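Since replica status is read from the primary pool's description, a small pure function can turn that response into a Region-to-status map. This is a sketch under the assumption that the response has the shape implied by the CLI queries above (`UserPool.ReplicaRegions[*]` with `RegionName` and `Status`); the function names are hypothetical.

```python
def replica_statuses(describe_response: dict) -> dict:
    """Map each replica Region to its status, based on the primary pool's description."""
    replicas = describe_response.get("UserPool", {}).get("ReplicaRegions") or []
    return {r["RegionName"]: r.get("Status", "UNKNOWN") for r in replicas}


def all_replicas_active(describe_response: dict) -> bool:
    """True only when at least one replica exists and every replica is ACTIVE."""
    statuses = replica_statuses(describe_response)
    return bool(statuses) and all(s == "ACTIVE" for s in statuses.values())


if __name__ == "__main__":
    # Hand-written sample in the assumed response shape, not real API output
    sample = {
        "UserPool": {
            "Id": "us-west-2_EXAMPLE",
            "ReplicaRegions": [{"RegionName": "us-east-1", "Status": "ACTIVE"}],
        }
    }
    print(replica_statuses(sample))
    print(all_replicas_active(sample))
```

Pass it the dictionary returned by `describe_user_pool` on the primary pool; keeping the parsing separate from the API call makes it easy to test without AWS credentials.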
Step 4: Configure Route 53 Health Check & Failover
Normal Traffic Flow
Failover Scenario
CLI Configuration
# Create a Route 53 health check for the primary Cognito endpoint and capture its ID.
# Route 53 treats an HTTPS check as healthy only on a 2xx or 3xx response, so confirm
# that the ResourcePath below actually responds that way before relying on it.
HC_ID=$(aws route53 create-health-check \
  --caller-reference "cognito-primary-hc-$(date +%s)" \
  --health-check-config '{
    "Type": "HTTPS",
    "FullyQualifiedDomainName": "cognito-idp.us-west-2.amazonaws.com",
    "Port": 443,
    "RequestInterval": 30,
    "FailureThreshold": 3,
    "ResourcePath": "/health",
    "MeasureLatency": true,
    "Regions": ["us-east-1","us-west-2","eu-west-1"]
  }' \
  --query "HealthCheck.Id" \
  --output text)
echo "Health Check ID: $HC_ID"
# Update the Cognito custom domain to use this health check for auto-failover
# This is done in the console: Branding > Domain > Edit multi-Region failover
# Associate the $HC_ID with the custom domain
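The health check settings above determine roughly how long an outage lasts before traffic moves. Route 53 marks an endpoint unhealthy after `FailureThreshold` consecutive failed checks, so detection takes about `RequestInterval × FailureThreshold` seconds, and clients may keep using cached DNS answers for up to one TTL after that. The sketch below is a back-of-the-envelope estimate only; the 60-second TTL is an assumption, not a value from this article.

```python
def estimated_failover_seconds(
    request_interval: int,
    failure_threshold: int,
    dns_ttl: int = 60,  # assumed TTL of the failover record
) -> int:
    """Rough upper bound: detection time plus one DNS TTL for cached answers."""
    detection = request_interval * failure_threshold
    return detection + dns_ttl


if __name__ == "__main__":
    # Values from the health check above: 30-second interval, threshold of 3
    print(estimated_failover_seconds(30, 3))
```

Lowering the interval or the threshold shortens detection at the cost of more sensitivity to transient errors.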
Infrastructure as Code (Terraform)
# main.tf
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = ">= 5.50.0"
}
}
}
# ─────────────────────────────────────────────────────────────
# Provider configurations
# ─────────────────────────────────────────────────────────────
provider "aws" {
alias = "primary"
region = "us-west-2"
}
provider "aws" {
alias = "secondary"
region = "us-east-1"
}
# Only aliased providers are declared above, so name one explicitly here
data "aws_caller_identity" "current" {
  provider = aws.primary
}
# ─────────────────────────────────────────────────────────────
# Multi-Region KMS Key
# ─────────────────────────────────────────────────────────────
resource "aws_kms_key" "cognito_mrk" {
provider = aws.primary
description = "Multi-region KMS key for Cognito MRR"
multi_region = true
deletion_window_in_days = 30
enable_key_rotation = true
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Sid = "AllowRoot"
Effect = "Allow"
Principal = {
AWS = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:root"
}
Action = "kms:*"
Resource = "*"
},
{
Sid = "AllowCognito"
Effect = "Allow"
Principal = {
Service = "cognito-idp.amazonaws.com"
}
Action = [
"kms:Encrypt",
"kms:Decrypt",
"kms:GenerateDataKey",
"kms:DescribeKey",
"kms:CreateGrant"
]
Resource = "*"
}
]
})
tags = {
Name = "cognito-mrr-key"
Environment = "production"
}
}
resource "aws_kms_alias" "cognito_mrk" {
provider = aws.primary
name = "alias/cognito-mrr-key"
target_key_id = aws_kms_key.cognito_mrk.key_id
}
# Replicate the key to secondary region
resource "aws_kms_replica_key" "cognito_mrk_replica" {
provider = aws.secondary
description = "Replica of Cognito MRR key in us-east-1"
primary_key_arn = aws_kms_key.cognito_mrk.arn
deletion_window_in_days = 30
enabled = true
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Sid = "AllowRoot"
Effect = "Allow"
Principal = {
AWS = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:root"
}
Action = "kms:*"
Resource = "*"
},
{
Sid = "AllowCognito"
Effect = "Allow"
Principal = {
Service = "cognito-idp.amazonaws.com"
}
Action = [
"kms:Encrypt",
"kms:Decrypt",
"kms:GenerateDataKey",
"kms:DescribeKey",
"kms:CreateGrant"
]
Resource = "*"
}
]
})
tags = {
Name = "cognito-mrr-key-replica"
Environment = "production"
}
}
# ─────────────────────────────────────────────────────────────
# Primary Cognito User Pool
# ─────────────────────────────────────────────────────────────
resource "aws_cognito_user_pool" "primary" {
provider = aws.primary
name = "myapp-user-pool-primary"
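  # Advanced security features. Note that this resource block does not attach the
  # customer managed KMS key; check the provider documentation for the encryption
  # argument available in your provider version.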
user_pool_add_ons {
advanced_security_mode = "ENFORCED"
}
password_policy {
minimum_length = 12
require_lowercase = true
require_uppercase = true
require_numbers = true
require_symbols = true
temporary_password_validity_days = 7
}
mfa_configuration = "OPTIONAL"
software_token_mfa_configuration {
enabled = true
}
# Email verification
auto_verified_attributes = ["email"]
account_recovery_setting {
recovery_mechanism {
name = "verified_email"
priority = 1
}
}
schema {
name = "email"
attribute_data_type = "String"
required = true
mutable = true
}
tags = {
Name = "myapp-primary"
Environment = "production"
Region = "us-west-2"
}
}
# App Client for the primary pool
resource "aws_cognito_user_pool_client" "primary" {
provider = aws.primary
name = "myapp-client-primary"
user_pool_id = aws_cognito_user_pool.primary.id
explicit_auth_flows = [
"ALLOW_USER_SRP_AUTH",
"ALLOW_REFRESH_TOKEN_AUTH",
"ALLOW_USER_PASSWORD_AUTH"
]
access_token_validity = 60
id_token_validity = 60
refresh_token_validity = 30
token_validity_units {
access_token = "minutes"
id_token = "minutes"
refresh_token = "days"
}
prevent_user_existence_errors = "ENABLED"
}
# ─────────────────────────────────────────────────────────────
# Replica User Pool (Secondary Region)
# NOTE: There is no standalone aws_cognito_user_pool_replica resource in
# the AWS Terraform provider. Replication is configured via the
# replica_regions block inside aws_cognito_user_pool.
# ─────────────────────────────────────────────────────────────
# Add a replica_regions block to aws_cognito_user_pool.primary:
#
# resource "aws_cognito_user_pool" "primary" {
# ...
# replica_regions {
# region_name = "us-east-1"
# kms_key_id = aws_kms_replica_key.cognito_mrk_replica.arn
# }
# }
#
# Reference: https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cognito_user_pool
# ─────────────────────────────────────────────────────────────
# Route 53 Health Check for Failover
# ─────────────────────────────────────────────────────────────
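resource "aws_route53_health_check" "cognito_primary" {
  # Route 53 is a global service; the explicit provider avoids relying on an
  # unconfigured default provider, since only aliased providers are declared above.
  provider          = aws.primary
  fqdn              = "cognito-idp.us-west-2.amazonaws.com"
  port              = 443
  type              = "HTTPS"
  resource_path     = "/health"
  failure_threshold = 3
  request_interval  = 30
  tags = {
    Name = "cognito-primary-health-check"
  }
}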
# ─────────────────────────────────────────────────────────────
# CloudWatch Alarm for Failover Monitoring
# ─────────────────────────────────────────────────────────────
resource "aws_cloudwatch_metric_alarm" "cognito_errors_primary" {
provider = aws.primary
alarm_name = "cognito-high-error-rate-primary"
comparison_operator = "GreaterThanThreshold"
evaluation_periods = 3
metric_name = "Errors"
namespace = "AWS/Cognito"
period = 60
statistic = "Sum"
threshold = 10
dimensions = {
UserPool = aws_cognito_user_pool.primary.id
UserPoolClient = aws_cognito_user_pool_client.primary.id
}
alarm_description = "Cognito primary region error rate too high - consider failover"
alarm_actions = [aws_sns_topic.cognito_alerts.arn]
}
resource "aws_sns_topic" "cognito_alerts" {
provider = aws.primary
name = "cognito-mrr-alerts"
}
# ─────────────────────────────────────────────────────────────
# Outputs
# ─────────────────────────────────────────────────────────────
output "primary_user_pool_id" {
value = aws_cognito_user_pool.primary.id
}
output "primary_user_pool_endpoint" {
value = aws_cognito_user_pool.primary.endpoint
}
# Replica pool ID is obtained from describe-user-pool in the secondary region,
# not from a separate Terraform resource output.
output "kms_primary_key_arn" {
value = aws_kms_key.cognito_mrk.arn
}
output "kms_replica_key_arn" {
value = aws_kms_replica_key.cognito_mrk_replica.arn
}
output "route53_health_check_id" {
value = aws_route53_health_check.cognito_primary.id
}
Python Automation Scripts
Script 1: Full Setup Orchestrator
#!/usr/bin/env python3
"""
cognito_mrr_setup.py
Automates Amazon Cognito Multi-Region Replication setup using boto3.
"""
import json
import logging
import os  # needed for os.environ in main()
import time
from typing import Optional

import boto3
logging.basicConfig(
level=logging.INFO,
format="%(asctime)s [%(levelname)s] %(message)s"
)
log = logging.getLogger(__name__)
PRIMARY_REGION = "us-west-2"
SECONDARY_REGION = "us-east-1"
ACCOUNT_ID = boto3.client("sts").get_caller_identity()["Account"]
# ─────────────────────────────────────────────────────────────
# KMS: Create and Replicate a Multi-Region Key
# ─────────────────────────────────────────────────────────────
def create_multi_region_kms_key(primary_region: str) -> str:
kms = boto3.client("kms", region_name=primary_region)
key_policy = json.dumps({
"Version": "2012-10-17",
"Statement": [
{
"Sid": "AllowRoot",
"Effect": "Allow",
"Principal": {"AWS": f"arn:aws:iam::{ACCOUNT_ID}:root"},
"Action": "kms:*",
"Resource": "*"
},
{
"Sid": "AllowCognito",
"Effect": "Allow",
"Principal": {"Service": "cognito-idp.amazonaws.com"},
"Action": [
"kms:Encrypt", "kms:Decrypt",
"kms:GenerateDataKey", "kms:DescribeKey", "kms:CreateGrant"
],
"Resource": "*"
}
]
})
response = kms.create_key(
Description="Multi-region KMS key for Cognito MRR",
MultiRegion=True,
KeyUsage="ENCRYPT_DECRYPT",
Origin="AWS_KMS",
Policy=key_policy,
Tags=[{"TagKey": "Purpose", "TagValue": "CognitoMRR"}]
)
key_arn = response["KeyMetadata"]["Arn"]
key_id = response["KeyMetadata"]["KeyId"]
log.info(f"✅ Created multi-region KMS key: {key_arn}")
kms.create_alias(AliasName="alias/cognito-mrr-key", TargetKeyId=key_id)
return key_arn
def replicate_kms_key(primary_key_arn: str, target_region: str) -> str:
kms = boto3.client("kms", region_name=PRIMARY_REGION)
response = kms.replicate_key(
KeyId=primary_key_arn,
ReplicaRegion=target_region,
Description=f"Cognito MRR key replica in {target_region}"
)
replica_arn = response["ReplicaKeyMetadata"]["Arn"]
log.info(f"✅ Replicated KMS key to {target_region}: {replica_arn}")
# Update replica key policy for Cognito access
kms_secondary = boto3.client("kms", region_name=target_region)
kms_secondary.put_key_policy(
KeyId=replica_arn,
PolicyName="default",
Policy=json.dumps({
"Version": "2012-10-17",
"Statement": [
{
"Sid": "AllowRoot",
"Effect": "Allow",
"Principal": {"AWS": f"arn:aws:iam::{ACCOUNT_ID}:root"},
"Action": "kms:*",
"Resource": "*"
},
{
"Sid": "AllowCognito",
"Effect": "Allow",
"Principal": {"Service": "cognito-idp.amazonaws.com"},
"Action": [
"kms:Encrypt", "kms:Decrypt",
"kms:GenerateDataKey", "kms:DescribeKey", "kms:CreateGrant"
],
"Resource": "*"
}
]
})
)
return replica_arn
# ─────────────────────────────────────────────────────────────
# Cognito: Create User Pool with KMS Encryption
# ─────────────────────────────────────────────────────────────
def create_primary_user_pool(kms_key_arn: str, region: str) -> str:
cognito = boto3.client("cognito-idp", region_name=region)
response = cognito.create_user_pool(
PoolName="myapp-user-pool-primary",
Policies={
"PasswordPolicy": {
"MinimumLength": 12,
"RequireUppercase": True,
"RequireLowercase": True,
"RequireNumbers": True,
"RequireSymbols": True,
"TemporaryPasswordValidityDays": 7
}
},
MfaConfiguration="OPTIONAL",
UserPoolAddOns={"AdvancedSecurityMode": "ENFORCED"},
AutoVerifiedAttributes=["email"],
Schema=[
{
"Name": "email",
"AttributeDataType": "String",
"Required": True,
"Mutable": True
}
],
UserPoolTags={
"Environment": "production",
"Region": region,
"MRR": "enabled"
},
UserPoolEncryptionConfig={"KMSKeyID": kms_key_arn}
)
pool_id = response["UserPool"]["Id"]
log.info(f"✅ Created primary user pool: {pool_id}")
return pool_id
# ─────────────────────────────────────────────────────────────
# Cognito: Create Replica User Pool
# ─────────────────────────────────────────────────────────────
def create_user_pool_replica(
primary_pool_id: str,
target_region: str,
source_region: str = PRIMARY_REGION
) -> dict:
cognito = boto3.client("cognito-idp", region_name=source_region)
response = cognito.create_user_pool_replica_region(
UserPoolId=primary_pool_id,
ReplicaRegion={
"RegionName": target_region
}
)
replica = response["UserPoolReplica"]
log.info(
f"✅ Created replica user pool in {target_region}\n"
f" ARN: {replica['UserPoolArn']}\n"
f" Status: {replica['Status']}"
)
return replica
# ─────────────────────────────────────────────────────────────
# Cognito: Poll until the replica pool reports ACTIVE
# ─────────────────────────────────────────────────────────────
def wait_and_activate_replica(
    replica_pool_id: str,
    secondary_region: str,
    timeout_seconds: int = 600
):
    """Block until the replica is ACTIVE. No explicit activation call is made."""
    cognito = boto3.client("cognito-idp", region_name=secondary_region)
    elapsed = 0
    log.info(f"⏳ Waiting for replica pool {replica_pool_id} to be ready...")
    while elapsed < timeout_seconds:
        resp = cognito.describe_user_pool(UserPoolId=replica_pool_id)
        status = resp["UserPool"].get("Status", "UNKNOWN")
        log.info(f" Pool status: {status} ({elapsed}s elapsed)")
        if status == "ACTIVE":
            # Replica becomes ACTIVE automatically once initial sync completes
            log.info(f"✅ Replica pool {replica_pool_id} is ACTIVE")
            return
        time.sleep(30)
        elapsed += 30
    raise TimeoutError(f"Replica pool did not become ACTIVE within {timeout_seconds}s")
# ─────────────────────────────────────────────────────────────
# Route 53: Create Health Check
# ─────────────────────────────────────────────────────────────
def create_route53_health_check(primary_region: str) -> str:
r53 = boto3.client("route53")
response = r53.create_health_check(
CallerReference=f"cognito-hc-{int(time.time())}",
HealthCheckConfig={
"Type": "HTTPS",
"FullyQualifiedDomainName": f"cognito-idp.{primary_region}.amazonaws.com",
"Port": 443,
"RequestInterval": 30,
"FailureThreshold": 3,
"MeasureLatency": True,
"Regions": ["us-east-1", "us-west-2", "eu-west-1"]
}
)
hc_id = response["HealthCheck"]["Id"]
log.info(f"✅ Created Route 53 health check: {hc_id}")
r53.change_tags_for_resource(
ResourceType="healthcheck",
ResourceId=hc_id,
AddTags=[{"Key": "Name", "Value": "cognito-primary-hc"}]
)
return hc_id
# ─────────────────────────────────────────────────────────────
# CloudWatch: Alarms and SNS notifications
# ─────────────────────────────────────────────────────────────
def setup_monitoring(
pool_id: str,
client_id: str,
region: str,
alert_email: Optional[str] = None
) -> str:
sns = boto3.client("sns", region_name=region)
cw = boto3.client("cloudwatch", region_name=region)
# Create SNS topic
topic = sns.create_topic(Name="cognito-mrr-alerts")
topic_arn = topic["TopicArn"]
if alert_email:
sns.subscribe(
TopicArn=topic_arn,
Protocol="email",
Endpoint=alert_email
)
log.info(f"📧 Subscribed {alert_email} to alerts topic")
# Create CloudWatch alarm for auth errors
cw.put_metric_alarm(
AlarmName="cognito-primary-high-errors",
AlarmDescription="Cognito primary region auth error rate is high - consider failover",
MetricName="Errors",
Namespace="AWS/Cognito",
Dimensions=[
{"Name": "UserPool", "Value": pool_id},
{"Name": "UserPoolClient", "Value": client_id}
],
Statistic="Sum",
Period=60,
EvaluationPeriods=3,
Threshold=10,
ComparisonOperator="GreaterThanThreshold",
AlarmActions=[topic_arn],
OKActions=[topic_arn],
TreatMissingData="notBreaching"
)
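    # Alarm intended for sign-in latency.
    # NOTE: SignInSuccesses counts successful sign-ins; it is not a latency metric,
    # so replace MetricName with a latency metric you publish or have verified exists.
    # Percentile statistics must be passed as ExtendedStatistic, not Statistic.
    cw.put_metric_alarm(
        AlarmName="cognito-primary-high-latency",
        AlarmDescription="Cognito primary region sign-in latency > 2000ms",
        MetricName="SignInSuccesses",
        Namespace="AWS/Cognito",
        Dimensions=[{"Name": "UserPool", "Value": pool_id}],
        ExtendedStatistic="p99",
        Period=60,
        EvaluationPeriods=5,
        Threshold=2000,
        ComparisonOperator="GreaterThanThreshold",
        AlarmActions=[topic_arn],
        TreatMissingData="notBreaching"
    )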
log.info("✅ CloudWatch monitoring configured")
return topic_arn
# ─────────────────────────────────────────────────────────────
# Main Orchestration
# ─────────────────────────────────────────────────────────────
def main():
log.info("🚀 Starting Cognito Multi-Region Replication setup...")
# 1. Create multi-region KMS key
primary_key_arn = create_multi_region_kms_key(PRIMARY_REGION)
# 2. Replicate KMS key to secondary region
replica_key_arn = replicate_kms_key(primary_key_arn, SECONDARY_REGION)
# 3. Create primary Cognito user pool with KMS encryption
primary_pool_id = create_primary_user_pool(primary_key_arn, PRIMARY_REGION)
# NOTE: Before calling create_user_pool_replica, you must:
# a) Update the user pool to use the multi-region OIDC issuer (via console)
# b) Update your applications with the new OIDC issuer URLs
# 4. Create replica user pool
replica = create_user_pool_replica(primary_pool_id, SECONDARY_REGION)
# Extract the replica pool ID from the ARN
replica_pool_id = replica["UserPoolArn"].split("/")[-1]
# 5. Wait for replication to complete and activate
wait_and_activate_replica(replica_pool_id, SECONDARY_REGION)
# 6. Set up Route 53 health check
health_check_id = create_route53_health_check(PRIMARY_REGION)
# 7. Set up monitoring and alerting
cognito_primary = boto3.client("cognito-idp", region_name=PRIMARY_REGION)
clients = cognito_primary.list_user_pool_clients(UserPoolId=primary_pool_id)
client_id = clients["UserPoolClients"][0]["ClientId"] if clients["UserPoolClients"] else "NONE"
setup_monitoring(
pool_id=primary_pool_id,
client_id=client_id,
region=PRIMARY_REGION,
alert_email=os.environ.get("ALERT_EMAIL", "")
)
log.info("\n" + "="*60)
log.info("✅ COGNITO MULTI-REGION REPLICATION SETUP COMPLETE")
log.info("="*60)
log.info(f"Primary Pool ID : {primary_pool_id}")
log.info(f"Primary Region : {PRIMARY_REGION}")
log.info(f"Replica Pool ID : {replica_pool_id}")
log.info(f"Secondary Region : {SECONDARY_REGION}")
log.info(f"KMS Key (Primary) : {primary_key_arn}")
log.info(f"KMS Key (Replica) : {replica_key_arn}")
log.info(f"Route53 HC ID : {health_check_id}")
log.info("="*60)
if __name__ == "__main__":
main()
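The polling loop in `wait_and_activate_replica` is tied to boto3 and `time.sleep`, which makes it slow to test. A minimal sketch of the same pattern with the status lookup and the sleep function injected is shown below. `poll_until` and its parameters are hypothetical names, not part of the script above.

```python
import time
from typing import Callable


def poll_until(
    fetch_status: Callable[[], str],
    target: str = "ACTIVE",
    timeout_seconds: int = 600,
    interval_seconds: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call fetch_status until it returns target; return the seconds waited."""
    elapsed = 0
    while elapsed < timeout_seconds:
        if fetch_status() == target:
            return elapsed
        sleep(interval_seconds)
        elapsed += interval_seconds
    raise TimeoutError(f"status did not become {target} within {timeout_seconds}s")


if __name__ == "__main__":
    # Simulated status sequence instead of a real describe_user_pool call
    statuses = iter(["CREATING", "CREATING", "ACTIVE"])
    waited = poll_until(lambda: next(statuses), sleep=lambda _: None)
    print(f"became ACTIVE after {waited}s of simulated waiting")
```

In the setup script, `fetch_status` could be a lambda that wraps `describe_user_pool` and returns the pool's `Status` value.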
Script 2: Failover Health Monitor (Lambda-Compatible)
#!/usr/bin/env python3
"""
cognito_failover_monitor.py
Monitors primary Cognito health and can be deployed as a Lambda function
to automate failover decisions or send alerts.
"""
import boto3
import json
import logging
import os
from datetime import datetime, timezone, timedelta
log = logging.getLogger()
log.setLevel(logging.INFO)
PRIMARY_REGION = os.environ.get("PRIMARY_REGION", "us-west-2")
SECONDARY_REGION = os.environ.get("SECONDARY_REGION", "us-east-1")
PRIMARY_POOL_ID = os.environ.get("PRIMARY_POOL_ID", "")
REPLICA_POOL_ID = os.environ.get("REPLICA_POOL_ID", "")
SNS_TOPIC_ARN = os.environ.get("SNS_TOPIC_ARN", "")
HC_ID = os.environ.get("ROUTE53_HEALTH_CHECK_ID", "")
def get_cognito_error_rate(pool_id: str, region: str) -> float:
"""Returns the error count in the last 5 minutes."""
cw = boto3.client("cloudwatch", region_name=region)
end = datetime.now(timezone.utc)
start = end - timedelta(minutes=5)
resp = cw.get_metric_statistics(
Namespace="AWS/Cognito",
MetricName="Errors",
Dimensions=[{"Name": "UserPool", "Value": pool_id}],
StartTime=start,
EndTime=end,
Period=300,
Statistics=["Sum"]
)
datapoints = resp.get("Datapoints", [])
return datapoints[0]["Sum"] if datapoints else 0.0
def get_route53_health_check_status(hc_id: str) -> str:
r53 = boto3.client("route53")
resp = r53.get_health_check_status(HealthCheckId=hc_id)
statuses = resp.get("HealthCheckObservations", [])
healthy = sum(1 for s in statuses if s["StatusReport"]["Status"].startswith("Success"))
total = len(statuses)
return "HEALTHY" if healthy > (total / 2) else "UNHEALTHY"
def describe_replica_status(pool_id: str, region: str) -> str:
cognito = boto3.client("cognito-idp", region_name=region)
resp = cognito.describe_user_pool(UserPoolId=pool_id)
return resp["UserPool"].get("Status", "UNKNOWN")
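def send_alert(message: str, subject: str):
    if not SNS_TOPIC_ARN:
        log.warning("SNS_TOPIC_ARN not set, skipping alert")
        return
    # Publish in the topic's own Region (fourth ARN field), which may differ
    # from the Region this code runs in.
    topic_region = SNS_TOPIC_ARN.split(":")[3]
    sns = boto3.client("sns", region_name=topic_region)
    sns.publish(TopicArn=SNS_TOPIC_ARN, Message=message, Subject=subject)
    log.info(f"📢 Alert sent: {subject}")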
def lambda_handler(event, context):
"""
Lambda entry point.
Checks Cognito primary health, logs status, and sends alert if degraded.
"""
log.info("🔍 Running Cognito MRR health check...")
error_count = get_cognito_error_rate(PRIMARY_POOL_ID, PRIMARY_REGION)
hc_status = get_route53_health_check_status(HC_ID) if HC_ID else "NOT_CONFIGURED"
replica_status = describe_replica_status(REPLICA_POOL_ID, SECONDARY_REGION) if REPLICA_POOL_ID else "N/A"
report = {
"timestamp": datetime.now(timezone.utc).isoformat(),
"primary_region": PRIMARY_REGION,
"error_count_5m": error_count,
"route53_hc": hc_status,
"replica_status": replica_status,
"recommendation": "FAILOVER" if (error_count > 10 or hc_status == "UNHEALTHY") else "HEALTHY"
}
log.info(json.dumps(report, indent=2))
    if report["recommendation"] == "FAILOVER":
        # SNS subjects must be ASCII and under 100 characters, so no emoji here
        send_alert(
            message=json.dumps(report, indent=2),
            subject="Cognito primary Region degraded: consider failover"
        )
return {"statusCode": 200, "body": json.dumps(report)}
# Local testing entry point
if __name__ == "__main__":
result = lambda_handler({}, {})
print(result)
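The failover decision in `lambda_handler` mixes AWS calls with the rule itself. One way to make the rule testable is to pull it into pure functions, as in the sketch below. The function names are hypothetical; the observation shape follows what the monitor script reads from `get_health_check_status`, and the threshold of 10 matches the script.

```python
def majority_healthy(observations: list) -> bool:
    """True when more than half of the checker observations report success."""
    if not observations:
        return False  # no data is treated as unhealthy, matching the script
    healthy = sum(
        1 for o in observations
        if o["StatusReport"]["Status"].startswith("Success")
    )
    return healthy > len(observations) / 2


def recommend(error_count: float, hc_healthy: bool, error_threshold: float = 10) -> str:
    """Return FAILOVER when errors exceed the threshold or the health check fails."""
    if error_count > error_threshold or not hc_healthy:
        return "FAILOVER"
    return "HEALTHY"


if __name__ == "__main__":
    # Hand-written sample observations, not real Route 53 output
    sample = [
        {"StatusReport": {"Status": "Success: HTTP Status Code 200, OK"}},
        {"StatusReport": {"Status": "Failure: Connection timed out"}},
        {"StatusReport": {"Status": "Success: HTTP Status Code 200, OK"}},
    ]
    print(recommend(error_count=2, hc_healthy=majority_healthy(sample)))
```

Note that an exact tie between healthy and unhealthy observations counts as unhealthy here, as it does in the original script.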
Key Limitations to Know
| Limitation | Details |
|---|---|
| Write operations | Secondary pools are read-only — no new sign-ups, password resets, or profile edits during failover |
| TOTP MFA | Not supported in secondary replicas; TOTP users must authenticate via primary |
| Replica count | Maximum one secondary replica per user pool |
| Federated users | Must have previously signed in via primary before they can use the replica |
| Lockout counts | Failed auth attempt counters are not synced across regions |
| Custom domain required | Automatic Route 53 failover only works with a custom domain |
| Feature plan | Requires Essentials or Plus tier — not available on Lite |
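Because the secondary pool is read-only, an application needs some way to block write flows while it is running against the replica. A minimal sketch of such a guard follows. The operation names and the `failover_active` flag are hypothetical; how your application detects failover mode is outside the scope of this example.

```python
# Hypothetical application-level operation names that write to the user pool
WRITE_OPERATIONS = frozenset({
    "sign_up",
    "forgot_password",
    "change_password",
    "update_profile",
})


def is_operation_allowed(operation: str, failover_active: bool) -> bool:
    """Block write operations while the app is served by the read-only replica."""
    return not (failover_active and operation in WRITE_OPERATIONS)


if __name__ == "__main__":
    for op in ("sign_in", "sign_up"):
        print(op, is_operation_allowed(op, failover_active=True))
```

The same check can drive the UI, for example by hiding sign-up and password-reset links whenever it returns `False`.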
Pricing Summary
| Auth Type | Essentials Tier | Plus Tier |
|---|---|---|
| User Authentication | \$0.0045 / MAU / replica region | \$0.006 / MAU / replica region |
| M2M Authentication | +30% on standard token pricing | +30% on standard token pricing |
These charges apply per replica Region and are added on top of standard Cognito costs.
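As a worked example of the user authentication rates in the table, the sketch below computes the monthly replication surcharge for a given number of monthly active users (MAU). The rates are copied from the table; the function name is hypothetical, and the figure excludes standard Cognito charges and M2M pricing. `Decimal` is used to avoid floating-point rounding in currency arithmetic.

```python
from decimal import Decimal

# Per-MAU, per-replica-Region rates from the pricing table (USD)
REPLICA_RATE_PER_MAU = {
    "ESSENTIALS": Decimal("0.0045"),
    "PLUS": Decimal("0.006"),
}


def replica_user_auth_cost(mau: int, tier: str) -> Decimal:
    """Monthly MRR surcharge for user authentication with one replica Region."""
    if mau < 0:
        raise ValueError("mau must be non-negative")
    return REPLICA_RATE_PER_MAU[tier.upper()] * mau


if __name__ == "__main__":
    for tier in ("ESSENTIALS", "PLUS"):
        print(tier, replica_user_auth_cost(100_000, tier))
```

For 100,000 MAU this comes to 450 USD per month on Essentials and 600 USD per month on Plus, before standard charges.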
Available Regions
MRR is available across major AWS regions as of June 2026, including US East/West, EU (Frankfurt, Ireland, London, Paris, Stockholm), APAC (Mumbai, Tokyo, Seoul, Singapore, Sydney), Canada (Central), and South America (São Paulo). Any of these can serve as either the source or destination for replication.
Operational Checklist Before Going Live
- [ ] Upgrade user pool to Essentials or Plus plan
- [ ] Create a multi-region KMS key and replicate it to target region
- [ ] Update key policy to allow `cognito-idp.amazonaws.com` access
- [ ] Switch to multi-region OIDC issuer and update all app clients
- [ ] Deploy Lambda triggers, WAF rules, and logging config in secondary region
- [ ] Create replica and wait for INACTIVE → ACTIVE transition
- [ ] Set up Route 53 health check and link it to your custom domain's failover config
- [ ] Configure CloudWatch alarms for error rates and latency
- [ ] Test failover during off-peak hours by routing a small traffic slice to secondary
- [ ] Disable sign-up/password-reset UI elements when operating in failover mode
Migrating Existing Cognito User Pools to Multi-Region Replication
Migrating an existing Cognito user pool to MRR is more involved than a fresh setup because you have live users, active sessions, and applications already hardcoded to the original OIDC issuer URL. This guide walks you through every phase — eligibility check, issuer migration, KMS attachment, replica creation, and app updates — without forcing users to re-authenticate or reset passwords.
Phase 0: Eligibility Check — Are You on Next-Gen Infrastructure?
This is the most critical gating factor. MRR only works on next-generation Cognito infrastructure. AWS upgrades older existing pools automatically, and pool owners cannot opt in themselves. Until a pool is upgraded, the console shows an exception message for it.
Check Your Pool's Eligibility via CLI
# Check your user pool details for infrastructure version
aws cognito-idp describe-user-pool \
--region us-west-2 \
--user-pool-id us-west-2_XXXXXXXXX \
--query "UserPool.{Tier:UserPoolTier, Status:Status, Domain:Domain}"
# Check if MRR options are available by inspecting ReplicaRegions on describe-user-pool
aws cognito-idp describe-user-pool \
--region us-west-2 \
--user-pool-id us-west-2_XXXXXXXXX \
--query "UserPool.{Tier:UserPoolTier, ReplicaRegions:ReplicaRegions}"
⚠️ Note: There is no `list-user-pool-replicas` CLI command. Replica information is returned via the `ReplicaRegions` field in `describe-user-pool` on the primary pool. If the field is absent or the feature returns an error, the pool is not yet on next-gen infrastructure.
💡 Tip: Check the AWS Security Blog post on Cognito next-generation infrastructure to understand the upgrade timeline.
Phase 1: Pre-Migration Audit
Before touching anything, run a full audit of your existing pool. This prevents surprises mid-migration.
Python: Audit Script for Existing Pool
#!/usr/bin/env python3
"""
cognito_mrr_audit.py
Audits an existing Cognito user pool for MRR readiness.
Outputs a checklist of items that need remediation.
"""
import boto3
import json
import sys
from dataclasses import dataclass, field
from typing import List
@dataclass
class AuditResult:
check: str
status: str # PASS / FAIL / WARN / INFO
detail: str
action_required: str = ""
def audit_pool_for_mrr(pool_id: str, region: str) -> List[AuditResult]:
cognito = boto3.client("cognito-idp", region_name=region)
results = []
pool = cognito.describe_user_pool(UserPoolId=pool_id)["UserPool"]
    # ── 1. Feature Plan (Tier) ────────────────────────────────
    tier = pool.get("UserPoolTier", "LITE")
    tier_ok = tier in ("ESSENTIALS", "PLUS")
    results.append(AuditResult(
        check="Feature Plan",
        status="PASS" if tier_ok else "FAIL",
        detail=f"Current tier: {tier}",
        # Use the same condition as status so a FAIL always carries an action
        action_required="" if tier_ok else "Upgrade to Essentials or Plus tier before enabling MRR"
    ))
# ── 2. KMS Key Configuration ──────────────────────────────
kms_config = pool.get("UserPoolEncryptionConfig", {})
kms_key_id = kms_config.get("KMSKeyID", "")
if kms_key_id:
kms = boto3.client("kms", region_name=region)
key_meta = kms.describe_key(KeyId=kms_key_id)["KeyMetadata"]
is_mrk = key_meta.get("MultiRegion", False)
results.append(AuditResult(
check="KMS Key",
status="PASS" if is_mrk else "FAIL",
detail=f"Key ARN: {kms_key_id}, MultiRegion: {is_mrk}",
action_required="" if is_mrk else "Replace with a multi-region KMS key (mrk- prefix)"
))
else:
results.append(AuditResult(
check="KMS Key",
status="FAIL",
detail="No customer managed KMS key configured",
action_required="Create a multi-region KMS key and attach it to the user pool"
))
# ── 3. OIDC Issuer Type ───────────────────────────────────
issuer_config = pool.get("IssuerConfiguration", {})
issuer_type = issuer_config.get("Type", "LEGACY")
results.append(AuditResult(
check="OIDC Issuer Type",
status="PASS" if issuer_type == "UPDATED" else "FAIL",
detail=f"Issuer type: {issuer_type}",
action_required="" if issuer_type == "UPDATED" else (
"Switch to UPDATED issuer — WARNING: breaking change for existing apps. "
"Update all services that validate the 'iss' claim before switching."
)
))
# ── 4. Current Issuer URL ─────────────────────────────────
old_issuer = f"https://cognito-idp.{region}.amazonaws.com/{pool_id}"
new_issuer = f"https://issuer-cognito-idp.{region}.amazonaws.com/{pool_id}"
results.append(AuditResult(
check="Issuer URL (for reference)",
status="INFO",
detail=f"OLD: {old_issuer}\nNEW: {new_issuer}",
action_required="Update all apps, API GWs, and JWK validators to use NEW issuer URL"
))
    # ── 5. MFA Configuration ──────────────────────────────────
    mfa = pool.get("MfaConfiguration", "OFF")
    # TOTP status lives in the pool's MFA config, not in UserPoolAddOns
    mfa_config = cognito.get_user_pool_mfa_config(UserPoolId=pool_id)
    totp_enabled = mfa_config.get("SoftwareTokenMfaConfiguration", {}).get("Enabled", False)
    totp_in_use = mfa != "OFF" and totp_enabled
    results.append(AuditResult(
        check="TOTP MFA",
        status="WARN" if totp_in_use else "PASS",
        detail=f"MFA Config: {mfa}, TOTP enabled: {totp_enabled}",
        action_required=(
            "TOTP MFA users CANNOT authenticate on the replica. "
            "Plan a communication strategy and disable TOTP-reliant flows in failover mode."
        ) if totp_in_use else ""
    ))
    # ── 6. Custom Domain ──────────────────────────────────────
    # "Domain" holds the Cognito prefix domain; only "CustomDomain" satisfies this check
    custom_domain = pool.get("CustomDomain", "")
    results.append(AuditResult(
        check="Custom Domain",
        status="PASS" if custom_domain else "WARN",
        detail=f"Custom domain: {custom_domain or 'NOT CONFIGURED'}",
        action_required=(
            "Automatic Route 53 failover requires a custom domain. "
            "Without it, your app must manually switch regional endpoints."
        ) if not custom_domain else ""
    ))
# ── 7. App Clients ────────────────────────────────────────
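# Paginate so the count is not truncated to a single page of results
paginator = cognito.get_paginator("list_user_pool_clients")
client_count = sum(len(page["UserPoolClients"]) for page in paginator.paginate(UserPoolId=pool_id))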
results.append(AuditResult(
check="App Clients",
status="INFO",
detail=f"{client_count} app client(s) found — will be auto-replicated after MRR enabled",
action_required="Verify each client's callback URLs and allowed OAuth flows post-migration"
))
# ── 8. Lambda Triggers ────────────────────────────────────
triggers = pool.get("LambdaConfig", {})
has_triggers = bool(triggers)
results.append(AuditResult(
check="Lambda Triggers",
status="WARN" if has_triggers else "PASS",
detail=f"Triggers configured: {list(triggers.keys()) if has_triggers else 'None'}",
action_required="Lambda triggers must be separately configured for the replica region. "
"Cross-region Lambda invocations won't work automatically."
if has_triggers else ""
))
# ── 9. User Count ─────────────────────────────────────────
try:
stats = cognito.describe_user_pool(UserPoolId=pool_id)["UserPool"]
estimated_users = stats.get("EstimatedNumberOfUsers", "Unknown")
except Exception:
estimated_users = "Unknown"
results.append(AuditResult(
check="Estimated Users",
status="INFO",
detail=f"~{estimated_users} users — larger pools may take longer to initially sync",
action_required="Allow additional time for initial replication if user count is large"
))
return results
def print_audit_report(results: List[AuditResult], pool_id: str):
    icons = {"PASS": "✅", "FAIL": "❌", "WARN": "⚠️ ", "INFO": "ℹ️ "}
    print(f"\n{'='*65}")
    print(f" COGNITO MRR MIGRATION READINESS AUDIT: {pool_id}")
    print(f"{'='*65}")
    blockers = [r for r in results if r.status == "FAIL"]
    warnings = [r for r in results if r.status == "WARN"]
    for r in results:
        print(f"\n{icons[r.status]} [{r.status}] {r.check}")
        print(f" Detail : {r.detail}")
        if r.action_required:
            print(f" Action : {r.action_required}")
    print(f"\n{'='*65}")
    print(f" BLOCKERS: {len(blockers)} | WARNINGS: {len(warnings)}")
    if blockers:
        verdict = "🚫 MIGRATION BLOCKED: fix all FAIL items first."
    else:
        verdict = "🟢 Ready to proceed (review warnings)."
    print(f" {verdict}")
    print(f"{'='*65}\n")


if __name__ == "__main__":
    POOL_ID = sys.argv[1] if len(sys.argv) > 1 else "us-west-2_XXXXXXXXX"
    REGION = sys.argv[2] if len(sys.argv) > 2 else "us-west-2"
    results = audit_pool_for_mrr(POOL_ID, REGION)
    print_audit_report(results, POOL_ID)
Usage:
python3 cognito_mrr_audit.py us-west-2_XXXXXXXXX us-west-2
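To use the audit as a gate in a deployment pipeline, the report can be reduced to a process exit code. The following is a minimal sketch, assuming the AuditResult fields used above (check, status, detail, action_required). The stand-in dataclass and the exit_code_for helper are hypothetical names for illustration, not part of the original script.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AuditResult:
    # Stand-in mirroring the fields the audit script reads (assumed shape).
    check: str
    status: str  # "PASS", "FAIL", "WARN", or "INFO"
    detail: str
    action_required: str = ""


def exit_code_for(results: List[AuditResult], fail_on_warn: bool = False) -> int:
    """Return 1 if any check blocks the migration, otherwise 0."""
    blocking = {"FAIL", "WARN"} if fail_on_warn else {"FAIL"}
    return 1 if any(r.status in blocking for r in results) else 0


# Example: one failing check is enough to block.
sample = [
    AuditResult("KMS Key", "PASS", "MultiRegion: True"),
    AuditResult("OIDC Issuer Type", "FAIL", "Issuer type: LEGACY"),
]
print("exit code:", exit_code_for(sample))
```

In the audit script itself, this could be wired in by calling sys.exit(exit_code_for(results)) after print_audit_report.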
Phase 2: Upgrade the Feature Plan (If Needed)
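# Upgrade from Lite to Essentials
# Note: update-user-pool resets any setting you omit to its default value, so pass
# your pool's existing configuration (MFA, Lambda triggers, policies) in the same call.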
aws cognito-idp update-user-pool \
--region us-west-2 \
--user-pool-id us-west-2_XXXXXXXXX \
--user-pool-tier ESSENTIALS
# Verify the upgrade
aws cognito-idp describe-user-pool \
--region us-west-2 \
--user-pool-id us-west-2_XXXXXXXXX \
--query "UserPool.UserPoolTier"
Phase 3: Attach a Multi-Region KMS Key to an Existing Pool
⚠️ If your pool is already using an AWS managed key or a single-region CMK, you must create a new multi-region key. You cannot convert an existing single-region key to multi-region.
CLI: Create, Replicate, and Attach
PRIMARY_REGION="us-west-2"
SECONDARY_REGION="us-east-1"
POOL_ID="us-west-2_XXXXXXXXX"
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
# Step 1: Create multi-region primary key
MRK_KEY_ID=$(aws kms create-key \
--region $PRIMARY_REGION \
--multi-region \
--description "Cognito MRR CMK" \
--query "KeyMetadata.KeyId" \
--output text)
echo "Primary MRK Key ID: $MRK_KEY_ID"
MRK_KEY_ARN="arn:aws:kms:${PRIMARY_REGION}:${ACCOUNT_ID}:key/${MRK_KEY_ID}"
# Step 2: Apply key policy (must include identitystore.amazonaws.com for replication)
cat > /tmp/key-policy.json <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "AllowRoot",
"Effect": "Allow",
"Principal": {"AWS": "arn:aws:iam::${ACCOUNT_ID}:root"},
"Action": "kms:*",
"Resource": "*"
},
{
"Sid": "AllowCognitoAndIdentityStore",
"Effect": "Allow",
"Principal": {
"Service": [
"cognito-idp.amazonaws.com",
"identitystore.amazonaws.com"
]
},
"Action": [
"kms:Encrypt","kms:Decrypt","kms:ReEncrypt*",
"kms:GenerateDataKey*","kms:DescribeKey","kms:CreateGrant"
],
"Resource": "*",
"Condition": {
"StringEquals": {"aws:SourceAccount": "${ACCOUNT_ID}"}
}
}
]
}
EOF
# Apply policy to primary key
aws kms put-key-policy \
--region $PRIMARY_REGION \
--key-id $MRK_KEY_ID \
--policy-name default \
--policy file:///tmp/key-policy.json
# Step 3: Replicate key to secondary region
aws kms replicate-key \
--region $PRIMARY_REGION \
--key-id $MRK_KEY_ARN \
--replica-region $SECONDARY_REGION
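# Wait for the replica key to become active (poll its state instead of a fixed sleep)
echo "Waiting for replica key to become active..."
until [ "$(aws kms describe-key \
--region $SECONDARY_REGION \
--key-id $MRK_KEY_ID \
--query 'KeyMetadata.KeyState' \
--output text 2>/dev/null)" = "Enabled" ]; do
sleep 5
done
# Step 4: Apply the same policy to the replica key
# (a replica shares the primary's key ID; only the Region in the ARN differs)
aws kms put-key-policy \
--region $SECONDARY_REGION \
--key-id $MRK_KEY_ID \
--policy-name default \
--policy file:///tmp/key-policy.json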
# Step 5: Attach the multi-region KMS key to the existing user pool
aws cognito-idp update-user-pool \
--region $PRIMARY_REGION \
--user-pool-id $POOL_ID \
--kms-key-id $MRK_KEY_ARN
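If the key setup is driven from Python rather than the shell, waiting for the replica key can be done by polling KMS describe_key until KeyState reports Enabled. This is a sketch under stated assumptions: wait_for_key_enabled is a hypothetical helper, the client is passed in (for example boto3.client("kms", region_name="us-east-1")), and the timeout values are arbitrary.

```python
import time
from typing import Any, Callable


def wait_for_key_enabled(
    kms_client: Any,
    key_id: str,
    timeout_s: float = 120.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll describe_key until KeyState is 'Enabled'; return False on timeout."""
    waited = 0.0
    while True:
        state = kms_client.describe_key(KeyId=key_id)["KeyMetadata"]["KeyState"]
        if state == "Enabled":
            return True
        if waited >= timeout_s:
            return False
        sleep(interval_s)
        waited += interval_s
```

The sleep function is injectable so the loop can be exercised in tests without real delays.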
Phase 4: The Critical — Switch to Updated OIDC Issuer
This is the highest-risk step. The OIDC issuer change modifies the iss claim in all new tokens. Every backend service, API Gateway authorizer, and JWT validator that checks iss will break if not updated first.
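One way to lower the risk is to have custom validators accept both issuer formats during the transition window, then drop the legacy one afterwards. The sketch below assumes the two URL formats shown in the audit script. The function names are hypothetical, and this covers only the iss comparison: signature and expiry checks still belong to your JWT library.

```python
from typing import FrozenSet


def accepted_issuers(region: str, pool_id: str) -> FrozenSet[str]:
    """Both issuer URL formats for one user pool (legacy and updated)."""
    return frozenset({
        f"https://cognito-idp.{region}.amazonaws.com/{pool_id}",
        f"https://issuer-cognito-idp.{region}.amazonaws.com/{pool_id}",
    })


def is_trusted_issuer(iss: str, region: str, pool_id: str) -> bool:
    """Exact-match the iss claim against the allow-list (no prefix matching)."""
    return iss in accepted_issuers(region, pool_id)


print(is_trusted_issuer(
    "https://issuer-cognito-idp.us-west-2.amazonaws.com/us-west-2_XXXXXXXXX",
    "us-west-2",
    "us-west-2_XXXXXXXXX",
))
```

Exact matching is deliberate: a prefix or substring test would also accept issuers for other pools in the same region.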
Before Switching — Find All Affected Services
# Search CloudFormation stacks for Cognito issuer references
aws cloudformation list-stacks \
--stack-status-filter CREATE_COMPLETE UPDATE_COMPLETE \
--query "StackSummaries[].StackName" \
--output text | tr '\t' '\n' | while read stack; do
aws cloudformation get-template --stack-name "$stack" 2>/dev/null | \
grep -q "cognito-idp\." && echo " ↳ Found in: $stack"
done
# Check API Gateway authorizers
aws apigateway get-rest-apis --query "items[].id" --output text | \
tr '\t' '\n' | while read api_id; do
aws apigateway get-authorizers --rest-api-id $api_id \
--query "items[?type=='COGNITO_USER_POOLS'].{name:name,providerARNs:providerARNs}" \
--output table 2>/dev/null
done
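The commands above cover deployed infrastructure, but hard-coded issuer strings also tend to live in application code and config files. The following is a rough sketch of a local scan, assuming the legacy URL format shown in this post. LEGACY_ISSUER_RE, find_legacy_issuers, and scan_tree are hypothetical names, and issuers assembled from variables at runtime will not be found this way.

```python
import re
from pathlib import Path
from typing import Dict, Iterable, List

# Legacy issuer: https://cognito-idp.<region>.amazonaws.com/<region>_<id>
LEGACY_ISSUER_RE = re.compile(
    r"https://cognito-idp\.[a-z0-9-]+\.amazonaws\.com/[a-z0-9-]+_[A-Za-z0-9]+"
)


def find_legacy_issuers(text: str) -> List[str]:
    """Return every legacy-format issuer URL found in the text."""
    return LEGACY_ISSUER_RE.findall(text)


def scan_tree(
    root: str,
    suffixes: Iterable[str] = (".py", ".js", ".ts", ".json", ".yaml", ".yml"),
) -> Dict[str, List[str]]:
    """Map file path to the legacy issuer URLs it contains, for files under root."""
    wanted = set(suffixes)
    hits: Dict[str, List[str]] = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in wanted:
            found = find_legacy_issuers(path.read_text(errors="ignore"))
            if found:
                hits[str(path)] = found
    return hits
```

Calling scan_tree(".") from a repository root returns a dictionary of files that still reference the legacy issuer.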
Issuer URL Change Reference
| | Old Issuer (Legacy) | New Issuer (Updated) |
|---|---|---|
| Format | https://cognito-idp.{region}.amazonaws.com/{poolId} | https://issuer-cognito-idp.{region}.amazonaws.com/{poolId} |
| JWKS Endpoint | .../.well-known/jwks.json | Same path, new base URL |
| Breaking? | Current | ✅ Yes: update apps before switching |
| Token iss claim | Per-region | Same for both primary and replica |
Console: Perform the Issuer Switch, Then Verify via CLI
# ⚠️ Only run this AFTER updating all downstream JWT validators
# ⚠️ IMPORTANT: Switching to the multi-region OIDC issuer ("Updated" issuer type)
# is a CONSOLE-ONLY operation — it cannot be performed via the CLI or SDK.
# Navigate to: Cognito Console → User Pool → App Integration → Issuer URL → Change issuer type
# After switching via the console, verify the change:
aws cognito-idp describe-user-pool \
--region $PRIMARY_REGION \
--user-pool-id $POOL_ID \
--query "UserPool.IssuerConfiguration"
Python: Bulk-Update API Gateway Authorizers
#!/usr/bin/env python3
"""
update_apigw_authorizers.py
Finds all API Gateway Cognito authorizers and updates the
issuer URL from the old legacy format to the new Updated format.
"""
import boto3

REGION = "us-west-2"
OLD_ISS_PREFIX = "https://cognito-idp."
NEW_ISS_PREFIX = "https://issuer-cognito-idp."

apigw = boto3.client("apigateway", region_name=REGION)
apigwv2 = boto3.client("apigatewayv2", region_name=REGION)


def update_rest_api_authorizers():
    """Report REST API Cognito authorizers (they carry no issuer string to patch)."""
    # The default page size is 25; request the maximum so fewer APIs are missed
    apis = apigw.get_rest_apis(limit=500)["items"]
    for api in apis:
        api_id = api["id"]
        api_name = api["name"]
        authorizers = apigw.get_authorizers(restApiId=api_id).get("items", [])
        for auth in authorizers:
            if auth.get("type") != "COGNITO_USER_POOLS":
                continue
            provider_arns = auth.get("providerARNs", [])
            print(f"\n🔍 API: {api_name} ({api_id}) | Authorizer: {auth['name']}")
            # Cognito-native authorizers reference the user pool ARN, so there is
            # no issuer string to patch here. Custom JWT validators and Lambda
            # authorizers must be updated separately.
            print(f"   Provider ARNs: {provider_arns}")
            print("   ℹ️  Cognito-native authorizers use pool ARN: no direct issuer string to patch.")
            print("   ✅ These will automatically use the new issuer once pool is updated.")


def update_http_api_authorizers():
    """Update HTTP API JWT authorizers (these explicitly reference the issuer URL)."""
    apis = apigwv2.get_apis()["Items"]
    for api in apis:
        api_id = api["ApiId"]
        api_name = api["Name"]
        authorizers = apigwv2.get_authorizers(ApiId=api_id).get("Items", [])
        for auth in authorizers:
            if auth.get("AuthorizerType") != "JWT":
                continue
            jwt_config = auth.get("JwtConfiguration", {})
            current_issuer = jwt_config.get("Issuer", "")
            if current_issuer.startswith(OLD_ISS_PREFIX):
                # Swap only the leading prefix; the region and pool ID stay the same
                new_issuer = NEW_ISS_PREFIX + current_issuer[len(OLD_ISS_PREFIX):]
                print(f"\n🔧 Updating HTTP API: {api_name} ({api_id})")
                print(f"   Auth: {auth['Name']}")
                print(f"   OLD issuer: {current_issuer}")
                print(f"   NEW issuer: {new_issuer}")
                apigwv2.update_authorizer(
                    ApiId=api_id,
                    AuthorizerId=auth["AuthorizerId"],
                    JwtConfiguration={
                        "Issuer": new_issuer,
                        "Audience": jwt_config.get("Audience", []),
                    },
                )
                print("   ✅ Updated.")
            else:
                print(f"\n✅ HTTP API: {api_name} ({api_id}) | {auth['Name']}: already updated or not Cognito")


if __name__ == "__main__":
    print("=" * 60)
    print(" Scanning REST API Authorizers")
    print("=" * 60)
    update_rest_api_authorizers()
    print("\n" + "=" * 60)
    print(" Scanning HTTP API JWT Authorizers")
    print("=" * 60)
    update_http_api_authorizers()
    print("\n✅ Scan complete. Review any remaining custom JWT validators in your application code.")
Phase 5: Create the Replica and Activate
Once the pool has a multi-region KMS key and an updated OIDC issuer, creating the replica is straightforward.
# Create the replica (call is made against the PRIMARY region)
aws cognito-idp create-user-pool-replica-region \
--region $PRIMARY_REGION \
--user-pool-id $POOL_ID \
--replica-region '{"RegionName": "'$SECONDARY_REGION'", "KmsKeyId": "'$MRK_KEY_ARN'"}'
# Poll status — replica info lives on the primary pool's ReplicaRegions field
watch -n 15 "aws cognito-idp describe-user-pool \
--user-pool-id $POOL_ID \
--region $PRIMARY_REGION \
--query 'UserPool.ReplicaRegions[*].{Region:RegionName,Status:Status}' \
--output table"
# Once ACTIVE, look up the replica pool in the secondary region.
# This assumes the replica's pool ID is the primary ID with the Region prefix swapped.
REPLICA_POOL_ID=$(aws cognito-idp describe-user-pool \
--region $SECONDARY_REGION \
--user-pool-id "${POOL_ID/$PRIMARY_REGION/$SECONDARY_REGION}" \
--query "UserPool.Id" \
--output text)
# Configure replica-specific Lambda triggers (must point to secondary-region functions)
# Uses $ACCOUNT_ID and $SECONDARY_REGION set earlier, so the ARNs are not hard-coded
aws cognito-idp update-user-pool \
--region $SECONDARY_REGION \
--user-pool-id $REPLICA_POOL_ID \
--lambda-config "{
\"PostAuthentication\": \"arn:aws:lambda:${SECONDARY_REGION}:${ACCOUNT_ID}:function:cognito-post-auth\",
\"PreTokenGeneration\": \"arn:aws:lambda:${SECONDARY_REGION}:${ACCOUNT_ID}:function:cognito-pre-token\"
}"
echo "✅ Replica is ACTIVE in $SECONDARY_REGION"
Phase 6: Configure Failover-Aware Application Code
After MRR is enabled, your app must intelligently route write vs. read operations and handle failover.
Python: Smart Cognito Client with Regional Failover
#!/usr/bin/env python3
"""
cognito_smart_client.py
A resilient Cognito client that:
- Routes writes to primary region
- Routes reads (sign-in) to the preferred region
- Falls back to the other region on OperationNotEnabledException
"""
import logging

import boto3
from botocore.exceptions import ClientError

log = logging.getLogger(__name__)

PRIMARY_REGION = "us-west-2"
SECONDARY_REGION = "us-east-1"
CLIENT_ID = "your-app-client-id"


class ResilientCognitoClient:
    def __init__(self, prefer_secondary: bool = False):
        self._primary = boto3.client("cognito-idp", region_name=PRIMARY_REGION)
        self._secondary = boto3.client("cognito-idp", region_name=SECONDARY_REGION)
        # Try the preferred region first, then the other one, so a fallback
        # never retries the region that just failed.
        primary = (self._primary, PRIMARY_REGION)
        secondary = (self._secondary, SECONDARY_REGION)
        self._auth_order = [secondary, primary] if prefer_secondary else [primary, secondary]

    # ── READ OPERATIONS (replica-safe) ──────────────────────
    def sign_in(self, username: str, password: str) -> dict:
        """
        Attempts sign-in on the preferred region; falls back to the other
        region if the preferred one is unavailable.
        """
        for client, region in self._auth_order:
            try:
                resp = client.initiate_auth(
                    AuthFlow="USER_PASSWORD_AUTH",
                    ClientId=CLIENT_ID,
                    AuthParameters={"USERNAME": username, "PASSWORD": password},
                )
                log.info(f"✅ Authenticated via {region}")
                return resp["AuthenticationResult"]
            except ClientError as e:
                code = e.response["Error"]["Code"]
                if code in ("OperationNotEnabledException", "ServiceUnavailableException"):
                    log.warning(f"⚠️ {region} unavailable ({code}), trying fallback...")
                    continue
                raise  # Re-raise auth errors (wrong password, etc.)
        raise RuntimeError("Authentication failed in all regions")

    def get_user(self, access_token: str) -> dict:
        """Token was issued by either region, so try both if needed."""
        for client, region in self._auth_order:
            try:
                return client.get_user(AccessToken=access_token)
            except ClientError as e:
                if e.response["Error"]["Code"] == "NotAuthorizedException":
                    raise  # Bad token: don't retry
                log.warning(f"get_user failed on {region}: {e}")
                continue
        raise RuntimeError("get_user failed in all regions")

    def refresh_tokens(self, refresh_token: str) -> dict:
        """Refresh works on both regions; try preferred first."""
        for client, region in self._auth_order:
            try:
                resp = client.initiate_auth(
                    AuthFlow="REFRESH_TOKEN_AUTH",
                    ClientId=CLIENT_ID,
                    AuthParameters={"REFRESH_TOKEN": refresh_token},
                )
                log.info(f"✅ Token refreshed via {region}")
                return resp["AuthenticationResult"]
            except ClientError as e:
                if e.response["Error"]["Code"] == "NotAuthorizedException":
                    raise  # Invalid/expired refresh token
                log.warning(f"Refresh failed on {region}: {e}")
                continue
        raise RuntimeError("Token refresh failed in all regions")

    # ── WRITE OPERATIONS (primary only) ──────────────────────
    def sign_up(self, username: str, password: str, email: str) -> dict:
        """Always routes to primary: writes are rejected on replica."""
        return self._primary.sign_up(
            ClientId=CLIENT_ID,
            Username=username,
            Password=password,
            UserAttributes=[{"Name": "email", "Value": email}],
        )

    def change_password(self, access_token: str, old_pw: str, new_pw: str):
        """Must go to primary: OperationNotEnabledException on replica."""
        return self._primary.change_password(
            AccessToken=access_token,
            PreviousPassword=old_pw,
            ProposedPassword=new_pw,
        )

    def forgot_password(self, username: str):
        """Password reset always to primary."""
        return self._primary.forgot_password(
            ClientId=CLIENT_ID,
            Username=username,
        )


# ── Usage Example ────────────────────────────────────────────
if __name__ == "__main__":
    client = ResilientCognitoClient(prefer_secondary=False)  # Set True during failover
    try:
        tokens = client.sign_in("testuser@example.com", "MyPassword123!")
        print("Access Token:", tokens["AccessToken"][:20], "...")
        print("ID Token:    ", tokens["IdToken"][:20], "...")
    except RuntimeError as e:
        print(f"❌ Auth failed: {e}")
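The usage example sets prefer_secondary by hand, which leaves open how an application decides that a failover is in progress. One simple option is a consecutive-failure counter. This is a sketch only: FailoverSwitch and its threshold are hypothetical, and a production setup would more likely follow the Route 53 health check state.

```python
class FailoverSwitch:
    """Prefer the secondary region after N consecutive primary failures."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self._consecutive_failures = 0

    def record_success(self) -> None:
        # Any success against the primary clears the failure streak.
        self._consecutive_failures = 0

    def record_failure(self) -> None:
        self._consecutive_failures += 1

    @property
    def prefer_secondary(self) -> bool:
        return self._consecutive_failures >= self.threshold


switch = FailoverSwitch(threshold=2)
switch.record_failure()
switch.record_failure()
print("prefer secondary:", switch.prefer_secondary)
```

The resulting flag can be passed as ResilientCognitoClient(prefer_secondary=switch.prefer_secondary). Write operations in that client still go to the primary regardless of the flag.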
Terraform: Migrating an Existing Pool (State Import + MRR Attachment)
If you manage your existing pool in Terraform but it was created before MRR, use terraform import to bring it under the new MRR-enabled config without re-creating users.
# terraform.tf — Update existing pool resource block to add MRR config
resource "aws_cognito_user_pool" "existing" {
provider = aws.primary
name = "myapp-user-pool" # Keep the same name
# ── Add these new blocks to existing resource ─────────────
user_pool_tier = "ESSENTIALS" # Upgrade from LITE if needed
# Attach multi-region KMS key
# (aws_kms_key.cognito_mrk created as shown in previous blog post)
tags = {
Environment = "production"
MRR = "enabled"
}
}
# After terraform apply, create the replica
resource "aws_cognito_user_pool_replica" "secondary" {
provider = aws.primary
user_pool_id = aws_cognito_user_pool.existing.id
region_name = "us-east-1"
depends_on = [
aws_kms_replica_key.cognito_mrk_replica
]
}
# Import existing pool into Terraform state (no re-creation of users)
terraform import aws_cognito_user_pool.existing us-west-2_XXXXXXXXX
# Plan — verify only MRR-related changes, not destructive ones
terraform plan -out=mrr-migration.tfplan
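# Review the plan carefully and ensure no "destroy" or "replace" on user_pool
# (the saved plan file is binary, so render it with terraform show before grepping)
terraform show -no-color mrr-migration.tfplan | grep -iE "destroy|replace"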
# Apply when satisfied
terraform apply mrr-migration.tfplan
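For a stricter check than reading plan text, the saved plan can be exported with terraform show -json and inspected programmatically. The sketch below relies on the resource_changes and change.actions fields of Terraform's JSON plan representation. The helper names and the plan.json file name are assumptions for illustration.

```python
import json
from typing import Any, Dict, List


def destructive_changes(plan: Dict[str, Any]) -> List[str]:
    """Addresses of resources the plan would delete or replace."""
    flagged = []
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        # A replacement appears as ["delete", "create"] or ["create", "delete"]
        if "delete" in actions:
            flagged.append(rc["address"])
    return flagged


def load_plan(path: str) -> Dict[str, Any]:
    """Load JSON produced by: terraform show -json mrr-migration.tfplan > plan.json"""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

An empty list from destructive_changes(load_plan("plan.json")) means the plan only creates or updates resources in place.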
Full Migration Sequence Summary
| Step | Action | Risk | CLI Command |
|---|---|---|---|
| 0 | Check next-gen eligibility | Low | list-user-pool-replicas |
| 1 | Upgrade to Essentials/Plus | Low | update-user-pool --user-pool-tier ESSENTIALS |
| 2 | Create multi-region KMS key | Low | kms create-key --multi-region |
| 3 | Replicate KMS key to secondary | Low | kms replicate-key |
| 4 | Attach KMS key to user pool | Low | update-user-pool --kms-key-id |
| 5 | Update all JWT validators to new issuer | HIGH | Manual app update |
| 6 | Switch OIDC issuer to UPDATED | HIGH | Console only (no CLI/SDK support) |
| 7 | Create replica | Low | create-user-pool-replica-region |
| 8 | Wait for replica to become ACTIVE | Low | describe-user-pool --query UserPool.ReplicaRegions |
| 9 | Configure replica-specific settings | Medium | update-user-pool on secondary |
| 10 | Set up Route 53 health check | Low | route53 create-health-check |
| 11 | Enable failover on custom domain | Low | Console or CLI |
| 12 | Test failover in staging | Medium | initiate-auth against replica region |
Post-Migration Verification
# Verify replica is syncing users correctly
REPLICA_POOL_ID="us-east-1_XXXXXXXXX" # Get from list-user-pool-replicas
CLIENT_ID="your-app-client-id" # App client with USER_PASSWORD_AUTH enabled
# Check a known user on the replica
aws cognito-idp admin-get-user \
--region $SECONDARY_REGION \
--user-pool-id $REPLICA_POOL_ID \
--username testuser@example.com
# Test auth on the replica region
aws cognito-idp initiate-auth \
--region $SECONDARY_REGION \
--auth-flow USER_PASSWORD_AUTH \
--client-id $CLIENT_ID \
--auth-parameters USERNAME=testuser@example.com,PASSWORD='TestPass123!'
# Confirm the iss claim matches the updated issuer format
REPLICA_TOKEN=$(aws cognito-idp initiate-auth \
--region $SECONDARY_REGION \
--auth-flow USER_PASSWORD_AUTH \
--client-id $CLIENT_ID \
--auth-parameters USERNAME=testuser@example.com,PASSWORD='TestPass123!' \
--query "AuthenticationResult.IdToken" \
--output text)
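# Decode and check the iss claim (requires jq and base64)
# JWT segments are base64url without padding: map the alphabet back, then restore padding
PAYLOAD=$(echo "$REPLICA_TOKEN" | cut -d'.' -f2 | tr '_-' '/+')
while [ $(( ${#PAYLOAD} % 4 )) -ne 0 ]; do PAYLOAD="${PAYLOAD}="; done
echo "$PAYLOAD" | base64 -d | jq '{iss: .iss, sub: .sub, aud: .aud}'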
The iss claim should now read https://issuer-cognito-idp.us-west-2.amazonaws.com/us-west-2_XXXXXXXXX, identical for tokens from both regions, which confirms that your JWT validators need no per-region branching.
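The same check can be scripted in Python, which avoids shell differences in base64 handling. This is a minimal sketch for inspection only: decode_jwt_payload is a hypothetical helper that does not verify the signature, so it must not be used to make authorization decisions.

```python
import base64
import json
from typing import Any, Dict


def decode_jwt_payload(token: str) -> Dict[str, Any]:
    """Decode the payload segment WITHOUT verifying the signature (inspection only)."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url-encoded with the trailing '=' padding stripped
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```

Comparing decode_jwt_payload(primary_token)["iss"] with decode_jwt_payload(replica_token)["iss"] should show the same value for tokens obtained from either region.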







