DEV Community

Rishi Vachhani


How to Automatically Replicate AWS SSM Parameters Across Regions Using EventBridge and Lambda

AWS Systems Manager Parameter Store is regional by design.

If you create a parameter in ap-south-1, it doesn't automatically exist in us-east-1.

For teams running disaster recovery environments, active-passive architectures, or multi-region applications, maintaining identical configuration across regions quickly becomes an operational challenge.

In this guide, we'll build a fully serverless solution that automatically replicates SSM parameters whenever they are created, updated, or deleted.

The solution uses:

  • AWS Systems Manager Parameter Store
  • AWS CloudTrail
  • Amazon EventBridge
  • AWS Lambda
  • Python (boto3)

The end result is a low-cost, event-driven synchronization mechanism that keeps regions aligned automatically.


Why I Built This

While working on a disaster recovery setup recently, I noticed something that often gets overlooked.

Infrastructure was deployed in multiple regions.

Application containers were deployed.

Databases were replicated.

Monitoring was configured.

Everything looked good.

Until we started validating failover.

Several services immediately failed because critical SSM parameters only existed in the primary region.

Things like:

/prod/app/DB_PASSWORD
/prod/app/API_KEY
/prod/api/JWT_SECRET

were missing in the secondary region.

AWS Parameter Store is regional, which means your applications only see parameters that exist within the region they're running in.

The obvious solution is manually maintaining copies.

The practical solution is automating synchronization.


Architecture Overview

Whenever someone creates, updates, or deletes a parameter:

  1. CloudTrail records the API call.
  2. EventBridge captures the event.
  3. Lambda is invoked.
  4. Lambda determines the operation type.
  5. Lambda replicates the change to one or more destination regions.
Create / Update / Delete Parameter
                │
                ▼
           CloudTrail
                │
                ▼
           EventBridge
                │
                ▼
              Lambda
                │
                ▼
      Destination Region(s)

No servers.

No polling.

No scheduled jobs.

No manual intervention.
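To make the flow concrete, here is a trimmed-down sketch of the kind of event the Lambda receives for a PutParameter call, along with the three fields the replicator cares about. The values are made up for illustration, and a real event carries many more fields.

```python
# Hypothetical, trimmed-down EventBridge event for a PutParameter call.
# Only the fields the replicator reads are shown; all values are made up.
sample_event = {
    "detail-type": "AWS API Call via CloudTrail",
    "source": "aws.ssm",
    "region": "ap-south-1",
    "detail": {
        "eventSource": "ssm.amazonaws.com",
        "eventName": "PutParameter",
        "requestParameters": {"name": "/prod/app/DB_HOST"},
    },
}


def summarize(event: dict) -> tuple[str, str, str]:
    """Return (operation, parameter name, source region) from the event."""
    detail = event.get("detail", {})
    request = detail.get("requestParameters") or {}
    return detail.get("eventName", ""), request.get("name", ""), event.get("region", "")


print(summarize(sample_event))
```

Note that the Lambda only takes the parameter name from the event. The value itself is read from the source region at replication time.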


Why Replicate Deletes?

In many organizations, disaster recovery environments are expected to mirror production as closely as possible.

If a parameter is intentionally removed from the source region, keeping it in destination regions can introduce configuration drift.

By replicating delete operations as well:

  • Regions remain fully synchronized
  • Stale parameters are automatically removed
  • Configuration remains consistent across environments
  • Operational overhead is reduced

⚠️ Production Note:

If accidental deletions are a concern in your organization, consider removing delete replication and handling deletions manually through change management processes.
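One way to make that choice configurable is a feature flag. The sketch below assumes a hypothetical REPLICATE_DELETES environment variable, which is not part of the Lambda shown later in this post. You would call should_replicate() before dispatching an action.

```python
import os


def deletes_enabled(env=None) -> bool:
    """Read the hypothetical REPLICATE_DELETES flag; defaults to enabled."""
    env = os.environ if env is None else env
    return env.get("REPLICATE_DELETES", "true").strip().lower() == "true"


def should_replicate(action: str, env=None) -> bool:
    """PUT is always replicated; DELETE only when the flag allows it."""
    if action == "DELETE":
        return deletes_enabled(env)
    return action == "PUT"
```

With REPLICATE_DELETES=false, creates and updates still flow through, while deletions stay a manual step.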


Prerequisites

Before getting started, ensure you have:

  • AWS CLI configured
  • Python 3.10+
  • boto3 installed
  • CloudTrail enabled
  • Lambda execution role with SSM permissions

Install boto3:

pip install boto3

Step 1 - Enable CloudTrail

EventBridge receives SSM API activity through CloudTrail.

Without a trail that logs management events, EventBridge never receives the "AWS API Call via CloudTrail" events that the rule in Step 2 matches.

Navigate to:

AWS Console
→ CloudTrail
→ Trails
→ Create Trail

Recommended settings:

Management Events : Enabled
API Activity      : Write
CloudWatch Logs   : Enabled

Once enabled, CloudTrail begins recording Parameter Store API calls.
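If you prefer to confirm this from code rather than the console, a small check like the following can help. It is a sketch: the function name is my own, and it expects a boto3 CloudTrail client to be passed in.

```python
def logging_trails(cloudtrail) -> list[str]:
    """Return the names of trails that are currently logging.

    `cloudtrail` is a boto3 CloudTrail client created for the source region.
    """
    active = []
    for trail in cloudtrail.describe_trails().get("trailList", []):
        # get_trail_status accepts either the trail name or its ARN
        status = cloudtrail.get_trail_status(Name=trail["TrailARN"])
        if status.get("IsLogging"):
            active.append(trail["Name"])
    return active
```

Call it with boto3.client("cloudtrail", region_name="ap-south-1"). An empty list suggests no trail is recording in that region, so the rule in the next step would have nothing to match.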


Step 2 - Create the EventBridge Rule

Navigate to:

AWS Console
→ EventBridge
→ Rules
→ Create Rule

Use the following event pattern:

{
  "source": ["aws.ssm"],
  "detail-type": ["AWS API Call via CloudTrail"],
  "detail": {
    "eventSource": ["ssm.amazonaws.com"],
    "eventName": [
      "PutParameter",
      "DeleteParameter",
      "DeleteParameters"
    ]
  }
}

This ensures that parameter creations, updates, and deletions trigger the replication Lambda.
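The same rule can also be created with boto3 instead of the console. The sketch below assumes the Lambda already exists. The rule name, target Id, and statement Id are placeholder names of my own, and the clients are passed in so the function stays easy to test.

```python
import json

# Same pattern as above, expressed as a Python dict
EVENT_PATTERN = {
    "source": ["aws.ssm"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["ssm.amazonaws.com"],
        "eventName": ["PutParameter", "DeleteParameter", "DeleteParameters"],
    },
}


def create_replication_rule(events, lambda_client, function_arn,
                            rule_name="ssm-param-replication"):
    """Create the rule, point it at the Lambda, and allow EventBridge to invoke it."""
    rule_arn = events.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(EVENT_PATTERN),
        State="ENABLED",
    )["RuleArn"]

    events.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "replicator", "Arn": function_arn}],
    )

    # Without this resource-based permission, EventBridge cannot invoke the function
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=f"{rule_name}-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )
    return rule_arn
```

Both clients should be created in the source region, for example boto3.client("events", region_name="ap-south-1") and boto3.client("lambda", region_name="ap-south-1"), because the rule has to live where the parameter changes happen.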


Step 3 - Deploy the Lambda Replicator

Environment Variables

Variable          Description                            Example
DEST_REGIONS      Destination AWS regions                us-east-1,eu-west-1
PARAM_PATHS       Parameter paths to replicate           /prod/app,/prod/api
DEST_KMS_KEY_ID   KMS key for SecureString parameters    alias/prod-key
DRY_RUN           Preview mode without writes            false

Example Configuration

DEST_REGIONS=us-east-1,eu-west-1
PARAM_PATHS=/prod/app,/prod/api
DEST_KMS_KEY_ID=alias/prod-key
DRY_RUN=false

Lambda Function

The Lambda performs the following actions:

  1. Receives EventBridge events
  2. Extracts parameter names
  3. Determines create, update, or delete operations
  4. Validates watched paths
  5. Reads source parameter values
  6. Replicates values and tags
  7. Deletes parameters in destination regions when required
  8. Produces structured CloudWatch logs

Supported Features

  • PutParameter
  • DeleteParameter
  • DeleteParameters
  • Native Parameter Store Change Events
  • Multiple destination regions
  • SecureString replication
  • Tag replication
  • Dry-run mode
"""
SSM Parameter Store Cross-Region Replicator
============================================
Triggered by: EventBridge rule watching SSM Parameter Store events in source region
              (PutParameter, DeleteParameter, DeleteParameters)

Environment Variables (set in Lambda config):
  DEST_REGIONS      : Comma-separated destination regions  e.g. "us-east-1,eu-west-1"
  PARAM_PATHS       : Comma-separated path prefixes to watch e.g. "/prod/app,/prod/api"
  DEST_PARAM_TYPE   : (Optional) Force a type — String | SecureString | StringList
                      Default: preserves the source parameter's original type
  DEST_KMS_KEY_ID   : (Optional) KMS Key ID/ARN to use in destination for SecureString
                      Default: uses AWS managed key (alias/aws/ssm)
  DRY_RUN           : (Optional) Set to "true" to log actions without writing. Default: false

IAM permissions required on Lambda execution role:
  Source region:
    ssm:GetParameter
    ssm:GetParametersByPath
    ssm:DescribeParameters
    ssm:ListTagsForResource

  Destination region(s):
    ssm:PutParameter
    ssm:DeleteParameter
    ssm:AddTagsToResource

  KMS (if using SecureString with custom key):
    kms:GenerateDataKey
    kms:Decrypt
"""

import boto3
import json
import logging
import os
import re
from botocore.exceptions import ClientError, BotoCoreError

# ── Logging ──────────────────────────────────────────────────────────────────
logger = logging.getLogger()
logger.setLevel(logging.INFO)

def log_info(msg, **kwargs):
    logger.info(json.dumps({"level": "INFO", "message": msg, **kwargs}))

def log_warn(msg, **kwargs):
    logger.warning(json.dumps({"level": "WARN", "message": msg, **kwargs}))

def log_error(msg, **kwargs):
    logger.error(json.dumps({"level": "ERROR", "message": msg, **kwargs}))


# ── Config ───────────────────────────────────────────────────────────────────

class Config:
    """Reads and validates all environment variables at cold-start."""

    def __init__(self):
        self.dest_regions   = self._parse_list("DEST_REGIONS")
        self.param_paths    = self._parse_list("PARAM_PATHS")
        self.dest_param_type = os.environ.get("DEST_PARAM_TYPE", "").strip() or None
        self.dest_kms_key   = os.environ.get("DEST_KMS_KEY_ID", "").strip() or None
        self.dry_run        = os.environ.get("DRY_RUN", "false").strip().lower() == "true"
        self._validate()

    def _parse_list(self, env_var):
        raw = os.environ.get(env_var, "").strip()
        if not raw:
            raise EnvironmentError(f"Required environment variable '{env_var}' is not set or empty.")
        return [item.strip() for item in raw.split(",") if item.strip()]

    def _validate(self):
        valid_regions_pattern = re.compile(r'^[a-z]{2}-[a-z]+-\d+$')
        for region in self.dest_regions:
            if not valid_regions_pattern.match(region):
                raise ValueError(f"Invalid region format: '{region}'. Expected format like 'us-east-1'.")

        for path in self.param_paths:
            if not path.startswith("/"):
                raise ValueError(f"PARAM_PATHS entry '{path}' must start with '/'. e.g. '/prod/app'")

        valid_types = {"String", "StringList", "SecureString"}
        if self.dest_param_type and self.dest_param_type not in valid_types:
            raise ValueError(f"DEST_PARAM_TYPE '{self.dest_param_type}' is invalid. Choose from: {valid_types}")

        if self.dry_run:
            log_warn("DRY_RUN is enabled — no parameters will be written.")

        log_info("Configuration loaded",
                 dest_regions=self.dest_regions,
                 param_paths=self.param_paths,
                 dest_param_type=self.dest_param_type or "preserve-source",
                 dest_kms_key=self.dest_kms_key or "aws-managed-default",
                 dry_run=self.dry_run)


# Load config once at cold-start so errors surface immediately in CloudWatch
try:
    CONFIG = Config()
except (EnvironmentError, ValueError) as e:
    log_error("Configuration error — Lambda will not process events until fixed.", error=str(e))
    CONFIG = None


# ── Helpers ──────────────────────────────────────────────────────────────────

def normalize_path(path: str) -> str:
    """Ensure path starts with / and has no trailing slash."""
    path = path.strip()
    if not path.startswith("/"):
        path = "/" + path
    return path.rstrip("/")


def path_is_watched(param_name: str, watched_paths: list) -> bool:
    """Return True if param_name is under one of the watched path prefixes."""
    for prefix in watched_paths:
        if param_name == prefix or param_name.startswith(prefix + "/"):
            return True
    return False


def get_source_parameter(ssm_source, param_name: str) -> dict | None:
    """
    Fetch the current value + metadata of a parameter from the source region.
    Returns None if the parameter does not exist (e.g. was deleted).
    """
    try:
        response = ssm_source.get_parameter(Name=param_name, WithDecryption=True)
        param = response["Parameter"]
        log_info("Fetched source parameter",
                 name=param_name,
                 type=param.get("Type"),
                 version=param.get("Version"))
        return param
    except ClientError as e:
        code = e.response["Error"]["Code"]
        if code == "ParameterNotFound":
            log_warn("Parameter not found in source region — may have been deleted.", name=param_name)
            return None
        log_error("Failed to fetch source parameter",
                  name=param_name,
                  error_code=code,
                  error=str(e))
        raise


def get_parameter_tags(ssm_source, param_name: str) -> list:
    """Fetch tags from the source parameter. Returns empty list on failure."""
    try:
        response = ssm_source.list_tags_for_resource(
            ResourceType="Parameter",
            ResourceId=param_name
        )
        return response.get("TagList", [])
    except ClientError as e:
        log_warn("Could not fetch tags for parameter — tags will not be replicated.",
                 name=param_name,
                 error=str(e))
        return []


def replicate_put(ssm_dest, dest_region: str, param: dict, tags: list, config: Config):
    """Write a parameter to the destination region."""
    name  = param["Name"]
    value = param["Value"]
    ptype = config.dest_param_type or param.get("Type", "SecureString")
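    # GetParameter does not return a Description field, so this normally falls
    # back to the default text below rather than the source description.
    desc  = param.get("Description", "Replicated by SSM cross-region replicator Lambda")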

    put_kwargs = dict(
        Name=name,
        Value=value,
        Type=ptype,
        Description=desc,
        Overwrite=True,
        Tier="Standard",
    )

    if ptype == "SecureString" and config.dest_kms_key:
        put_kwargs["KeyId"] = config.dest_kms_key

    log_info("Replicating parameter",
             action="PUT",
             name=name,
             dest_region=dest_region,
             type=ptype,
             dry_run=config.dry_run)

    if config.dry_run:
        return

    try:
        response = ssm_dest.put_parameter(**put_kwargs)
        log_info("Parameter replicated successfully",
                 name=name,
                 dest_region=dest_region,
                 version=response.get("Version"))

        # Replicate tags if any
        if tags:
            ssm_dest.add_tags_to_resource(
                ResourceType="Parameter",
                ResourceId=name,
                Tags=tags
            )
            log_info("Tags replicated", name=name, dest_region=dest_region, tag_count=len(tags))

    except ClientError as e:
        log_error("Failed to replicate parameter",
                  name=name,
                  dest_region=dest_region,
                  error_code=e.response["Error"]["Code"],
                  error=str(e))
        raise


def replicate_delete(ssm_dest, dest_region: str, param_name: str, config: Config):
    """Delete a parameter in the destination region."""
    log_info("Replicating delete",
             action="DELETE",
             name=param_name,
             dest_region=dest_region,
             dry_run=config.dry_run)

    if config.dry_run:
        return

    try:
        ssm_dest.delete_parameter(Name=param_name)
        log_info("Parameter deleted successfully",
                 name=param_name,
                 dest_region=dest_region)
    except ClientError as e:
        code = e.response["Error"]["Code"]
        if code == "ParameterNotFound":
            log_warn("Parameter already absent in destination region — nothing to delete.",
                     name=param_name,
                     dest_region=dest_region)
            return
        log_error("Failed to delete parameter",
                  name=param_name,
                  dest_region=dest_region,
                  error_code=code,
                  error=str(e))
        raise


def parse_event(event: dict) -> tuple[str, list, str]:
    """
    Extract (action, param_names, source_region) from the EventBridge event.

    param_names is always a list — most events carry a single parameter,
    but the console's "Delete" button (and any batch delete) calls the
    DeleteParameters (plural) API, which can carry multiple names in one
    CloudTrail event.

    Supported event sources:
      - EventBridge + CloudTrail: detail-type = "AWS API Call via CloudTrail"
      - EventBridge native SSM events: detail-type = "Parameter Store Change"
    """
    detail      = event.get("detail", {})
    detail_type = event.get("detail-type", "")
    source_region = event.get("region", os.environ.get("AWS_REGION", ""))

    # ── CloudTrail-based event ────────────────────────────────────────────────
    if detail_type == "AWS API Call via CloudTrail":
        event_name = detail.get("eventName", "")
        request_params = detail.get("requestParameters", {}) or {}

        action_map = {
            "PutParameter"     : "PUT",
            "DeleteParameter"  : "DELETE",
            "DeleteParameters" : "DELETE",
        }
        action = action_map.get(event_name)
        if not action:
            raise ValueError(f"Unsupported or ignored CloudTrail eventName: '{event_name}'")

        if event_name == "DeleteParameters":
            # Batch delete — names come back as a list
            raw_names = request_params.get("names") or request_params.get("Names") or []
            if not raw_names:
                raise ValueError(f"Could not extract parameter names from CloudTrail event: {json.dumps(detail)}")
            param_names = [normalize_path(n) for n in raw_names]
        else:
            # Single put/delete — name comes back as a string
            raw_name = request_params.get("name") or request_params.get("Name") or ""
            if not raw_name:
                raise ValueError(f"Could not extract parameter name from CloudTrail event: {json.dumps(detail)}")
            param_names = [normalize_path(raw_name)]

        return action, param_names, source_region

    # ── Native SSM EventBridge event ──────────────────────────────────────────
    if detail_type == "Parameter Store Change":
        operation = detail.get("operation", "")
        param_name = detail.get("name", "")

        if not param_name:
            raise ValueError(f"Could not extract parameter name from SSM event: {json.dumps(detail)}")

        action_map = {
            "Create" : "PUT",
            "Update" : "PUT",
            "Delete" : "DELETE",
            "LabelParameterVersion": None,  # Ignored
        }
        action = action_map.get(operation)
        if action is None:
            raise ValueError(f"Unsupported or ignored SSM operation: '{operation}'")

        return action, [normalize_path(param_name)], source_region

    raise ValueError(f"Unsupported event detail-type: '{detail_type}'. "
                     f"Expected 'AWS API Call via CloudTrail' or 'Parameter Store Change'.")


# ── Lambda Handler ────────────────────────────────────────────────────────────

def lambda_handler(event, context):
    log_info("Lambda invoked",
             request_id=context.aws_request_id,
             function=context.function_name,
             remaining_ms=context.get_remaining_time_in_millis())

    # Config failed at cold-start — abort early with a clear message
    if CONFIG is None:
        log_error("Aborting: Lambda has invalid configuration. Fix environment variables and redeploy.")
        return {"statusCode": 500, "body": "Invalid Lambda configuration — check CloudWatch logs."}

    # ── Parse the incoming event ──────────────────────────────────────────────
    try:
        action, param_names, source_region = parse_event(event)
    except ValueError as e:
        log_error("Event parsing failed — skipping.", error=str(e), raw_event=json.dumps(event))
        return {"statusCode": 400, "body": str(e)}

    log_info("Event parsed",
             action=action,
             param_names=param_names,
             source_region=source_region)

    # ── Filter down to only the parameters under a watched path ────────────────
    watched_param_names = [p for p in param_names if path_is_watched(p, CONFIG.param_paths)]
    skipped_param_names = [p for p in param_names if p not in watched_param_names]

    if skipped_param_names:
        log_info("Some parameters are not under a watched path — skipping them.",
                 skipped=skipped_param_names,
                 watched_paths=CONFIG.param_paths)

    if not watched_param_names:
        log_info("No parameters in this event are under a watched path — skipping.",
                 param_names=param_names,
                 watched_paths=CONFIG.param_paths)
        return {"statusCode": 200, "body": "No parameters in watched paths — skipped."}

    # ── Source SSM client ─────────────────────────────────────────────────────
    ssm_source = boto3.client("ssm", region_name=source_region)

    all_results = {}

    for param_name in watched_param_names:
        results = {}

        # ── Replicate to each destination region ──────────────────────────────
        for dest_region in CONFIG.dest_regions:

            if dest_region == source_region:
                log_warn("Destination region is the same as source — skipping to avoid overwrite.",
                         region=dest_region)
                results[dest_region] = "skipped-same-region"
                continue

            ssm_dest = boto3.client("ssm", region_name=dest_region)

            try:
                if action == "DELETE":
                    replicate_delete(ssm_dest, dest_region, param_name, CONFIG)
                    results[dest_region] = "deleted"
                    continue

                # Fetch latest value from source
                param = get_source_parameter(ssm_source, param_name)
                if param is None:
                    log_warn("Source parameter vanished before replication — skipping.",
                             name=param_name,
                             dest_region=dest_region)
                    results[dest_region] = "skipped-source-not-found"
                    continue

                tags = get_parameter_tags(ssm_source, param_name)
                replicate_put(ssm_dest, dest_region, param, tags, CONFIG)
                results[dest_region] = "replicated"

            except (ClientError, BotoCoreError) as e:
                log_error("Replication failed for destination region",
                          dest_region=dest_region,
                          param_name=param_name,
                          error=str(e))
                results[dest_region] = f"error: {str(e)}"

        all_results[param_name] = results

    log_info("Replication complete", action=action, results=all_results)

    return {
        "statusCode": 200,
        "body": json.dumps({
            "action": action,
            "results": all_results
        })
    }


IAM Permissions

The Lambda execution role requires the following permissions.

Source Region

{
  "Effect": "Allow",
  "Action": [
    "ssm:GetParameter",
    "ssm:GetParametersByPath",
    "ssm:DescribeParameters",
    "ssm:ListTagsForResource"
  ],
  "Resource": "*"
}

Destination Regions

{
  "Effect": "Allow",
  "Action": [
    "ssm:PutParameter",
    "ssm:DeleteParameter",
    "ssm:AddTagsToResource"
  ],
  "Resource": "*"
}

If SecureString parameters use a customer-managed KMS key:

{
  "Effect": "Allow",
  "Action": [
    "kms:Decrypt",
    "kms:GenerateDataKey"
  ],
  "Resource": "*"
}

Step 4 - Bulk Sync Existing Parameters

The Lambda only handles future changes.

What about parameters that already exist?

For that we use a one-time bulk synchronization script.

This script:

  • Reads all parameters under configured prefixes
  • Preserves parameter values
  • Preserves SecureString encryption
  • Preserves descriptions
  • Preserves tags
  • Writes everything to the destination region
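The full bulk_sync_ssm.py is not reproduced in this post, so the following is only a minimal sketch of its core loop, under my own assumptions. The function names are hypothetical, and the sketch copies values and types only, leaving out descriptions and tags. It relies on the get_parameters_by_path paginator to walk each prefix.

```python
def iter_parameters(ssm_source, prefix):
    """Yield every parameter under prefix, decrypting SecureString values."""
    paginator = ssm_source.get_paginator("get_parameters_by_path")
    for page in paginator.paginate(Path=prefix, Recursive=True, WithDecryption=True):
        yield from page.get("Parameters", [])


def bulk_sync(ssm_source, ssm_dest, prefixes, dry_run=True):
    """Copy parameters under each prefix to the destination; return the names handled."""
    handled = []
    for prefix in prefixes:
        for param in iter_parameters(ssm_source, prefix):
            name = param["Name"]
            if dry_run:
                # Preview only: nothing is written to the destination
                print(f"[DRY-RUN] Would write → {name}")
            else:
                ssm_dest.put_parameter(
                    Name=name,
                    Value=param["Value"],
                    Type=param["Type"],
                    Overwrite=True,
                )
                print(f"[OK] {name}")
            handled.append(name)
    return handled
```

A wrapper would read SOURCE_REGION, DESTINATION_REGION, PARAMETER_PREFIXES, and DRY_RUN from the environment, create one boto3 SSM client per region, and pass them in. Because no KeyId is set here, SecureString values would be re-encrypted with the destination region's default key.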

Preview Before Writing

Always start with a dry run.

DRY_RUN=true \
SOURCE_REGION=ap-south-1 \
DESTINATION_REGION=us-east-1 \
PARAMETER_PREFIXES=/prod/app,/prod/api \
python bulk_sync_ssm.py

Example output:

[DRY-RUN] Would write → /prod/app/DB_HOST
[DRY-RUN] Would write → /prod/app/DB_PASSWORD
[DRY-RUN] Would write → /prod/api/JWT_SECRET

No parameters are modified.


Execute the Synchronization

Once satisfied:

SOURCE_REGION=ap-south-1 \
DESTINATION_REGION=us-east-1 \
PARAMETER_PREFIXES=/prod/app,/prod/api \
python bulk_sync_ssm.py

Example output:

[OK] /prod/app/DB_HOST
[OK] /prod/app/DB_PASSWORD
[OK] /prod/api/JWT_SECRET

Testing the Setup

Test Parameter Creation

aws ssm put-parameter \
  --name "/prod/app/DB_HOST" \
  --value "database.example.com" \
  --type String \
  --overwrite \
  --region ap-south-1

Verify:

aws ssm get-parameter \
  --name "/prod/app/DB_HOST" \
  --region us-east-1

Test Parameter Update

aws ssm put-parameter \
  --name "/prod/app/DB_HOST" \
  --value "new-database.example.com" \
  --type String \
  --overwrite \
  --region ap-south-1

Verify the updated value exists in the destination region.
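Replication is asynchronous, since each change travels through CloudTrail, EventBridge, and Lambda before it lands. A check run immediately after the write may therefore still see the old value. A small polling helper can make the test less flaky. This is a sketch, and the attempt count and delay are arbitrary defaults rather than measured values.

```python
import time


def wait_for_value(read_value, expected, attempts=10, delay_seconds=3.0, sleep=time.sleep):
    """Poll read_value() until it returns expected; return True on success."""
    for attempt in range(attempts):
        if read_value() == expected:
            return True
        # Skip the final sleep: there is no further attempt to wait for
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False
```

Here read_value could be a lambda that calls get_parameter on a boto3 SSM client for us-east-1 and returns response["Parameter"]["Value"]. For a parameter that has never been replicated, that call raises ParameterNotFound, so the caller would need to catch it and return None.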


Test Parameter Deletion

aws ssm delete-parameter \
  --name "/prod/app/DB_HOST" \
  --region ap-south-1

Verify:

aws ssm get-parameter \
  --name "/prod/app/DB_HOST" \
  --region us-east-1

Expected result:

ParameterNotFound

Common Issues

Lambda Never Triggers

Usually caused by:

  • CloudTrail disabled
  • EventBridge rule in the wrong region

AccessDeniedException

Ensure the Lambda role has:

ssm:GetParameter
ssm:GetParametersByPath
ssm:DescribeParameters
ssm:PutParameter
ssm:DeleteParameter
ssm:AddTagsToResource
ssm:ListTagsForResource

InvalidKeyId

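The KMS key named in DEST_KMS_KEY_ID does not exist in the destination region. KMS keys and aliases are regional, so the key must exist in every region listed in DEST_REGIONS.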

Either:

  • Create the key
  • Use AWS-managed encryption
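To catch this before the first replication attempt, you can check each destination region up front. The helper below is a sketch with a name of my own choosing. It takes a boto3 KMS client and relies on describe_key, which accepts a key ID, key ARN, or alias name.

```python
def kms_key_exists(kms, key_id):
    """Return True if key_id (key ID, ARN, or alias/...) resolves in this client's region."""
    try:
        kms.describe_key(KeyId=key_id)
        return True
    except kms.exceptions.NotFoundException:
        return False
```

Run it once per destination, for example with boto3.client("kms", region_name="us-east-1") and "alias/prod-key". Other failures, such as missing kms:DescribeKey permission, are deliberately not swallowed and will surface as exceptions.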

Parameters Not Replicating

Verify the parameter path matches one of the configured prefixes:

/prod/app
/prod/api
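The prefix check is stricter than a plain startswith, which is easy to trip over. The snippet below reproduces the same comparison the Lambda uses so you can try names locally. The sample names are made up.

```python
def path_is_watched(param_name, watched_paths):
    """Same check the Lambda uses: exact match or a child of the prefix."""
    return any(
        param_name == prefix or param_name.startswith(prefix + "/")
        for prefix in watched_paths
    )


WATCHED = ["/prod/app", "/prod/api"]

for name in ["/prod/app/DB_HOST", "/prod/application/KEY", "/Prod/app/DB_HOST", "/prod/api"]:
    print(f"{name}: {path_is_watched(name, WATCHED)}")
```

/prod/application/KEY is rejected because the prefix must be followed by a slash, and /Prod/app/DB_HOST is rejected because the comparison is case-sensitive.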

Security Recommendations

Use SecureString

Encrypt sensitive values:

  • Database passwords
  • API keys
  • Access tokens
  • JWT secrets

Restrict IAM Permissions

Avoid:

ssm:* on *

Instead:

arn:aws:ssm:*:*:parameter/prod/*
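You can go one step further and scope each statement to specific regions and prefixes. The sketch below generates such statements. The account ID is a placeholder, the helper name is my own, and the action lists cover only the calls the Lambda actually makes.

```python
import json


def scoped_statement(actions, regions, account_id, prefixes):
    """Build an Allow statement limited to parameters under the given prefixes."""
    resources = [
        # A parameter named /prod/app/x has the ARN suffix parameter/prod/app/x
        f"arn:aws:ssm:{region}:{account_id}:parameter{prefix}/*"
        for region in regions
        for prefix in prefixes
    ]
    return {"Effect": "Allow", "Action": actions, "Resource": resources}


policy = {
    "Version": "2012-10-17",
    "Statement": [
        scoped_statement(
            ["ssm:GetParameter", "ssm:ListTagsForResource"],
            ["ap-south-1"], "123456789012", ["/prod/app", "/prod/api"],
        ),
        scoped_statement(
            ["ssm:PutParameter", "ssm:DeleteParameter", "ssm:AddTagsToResource"],
            ["us-east-1", "eu-west-1"], "123456789012", ["/prod/app", "/prod/api"],
        ),
    ],
}

print(json.dumps(policy, indent=2))
```

The trailing /* means a parameter named exactly /prod/app would not be covered, so add that ARN separately if you use one.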

Enable Monitoring

Monitor:

  • Lambda errors
  • Replication failures
  • Unexpected Parameter Store changes

Operational visibility becomes invaluable when troubleshooting.
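One thing to keep in mind: the handler above records per-region failures in its results dictionary but still returns statusCode 200, so a failed replication does not count as a failed invocation. If you would rather have failures show up as Lambda errors, one option is to raise at the end of the handler. The helpers below are a sketch with names of my own, and they rely on the "error: ..." strings the handler writes into its results.

```python
def failed_destinations(all_results):
    """Return (parameter, region, message) for every entry marked as an error."""
    return [
        (param, region, outcome)
        for param, per_region in all_results.items()
        for region, outcome in per_region.items()
        if outcome.startswith("error:")
    ]


def raise_on_failure(all_results):
    """Fail the invocation if any destination region could not be updated."""
    failures = failed_destinations(all_results)
    if failures:
        raise RuntimeError(f"{len(failures)} replication(s) failed: {failures}")
```

Calling raise_on_failure(all_results) just before the return statement would mark the invocation as failed, which an alarm on the function's error count can then pick up.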


Cost Considerations

This solution is surprisingly inexpensive.

You're only using:

  • EventBridge
  • Lambda
  • CloudTrail
  • Parameter Store

There are no always-running resources.

For most teams, the monthly cost is negligible compared to the operational effort saved.


Complete Workflow

Day 0
│
├─ Enable CloudTrail
├─ Deploy Lambda
├─ Create EventBridge Rule
├─ Run bulk_sync_ssm.py
└─ Verify replication

Day 1+
│
├─ Developer creates parameter
│     └─ Replicated
│
├─ Developer updates parameter
│     └─ Replicated
│
├─ Developer deletes parameter
│     └─ Deleted from destination regions
│
└─ Regions remain synchronized automatically

Final Thoughts

AWS doesn't currently provide native cross-region replication for Parameter Store.

Fortunately, combining CloudTrail, EventBridge, and Lambda makes it easy to build your own event-driven synchronization mechanism.

The biggest lesson from implementing this in production is that configuration drift is rarely caused by technology—it is usually caused by manual processes.

By automating parameter replication for creates, updates, and deletions, you remove an entire category of operational risk and ensure that all regions remain synchronized automatically.

If you're running multi-region AWS workloads, this is one of those small automations that quietly saves hours of troubleshooting later.

Happy building 🚀


Acknowledgements

A special thanks to @vikram_patel for collaboration, testing, and valuable feedback during the implementation of this solution.
