How to Use Amazon SNS Data Protection Policies to Prevent Sensitive Data Leakage

#aws #security #data #privacy

When we build things using event-driven architecture, we almost always run into Amazon SNS and for good reason. It’s simple, scalable, and makes it incredibly easy to fan out messages to multiple subscribers.

Imagine a fintech or healthcare application that sends transaction alerts or patient updates via SMS or email. These messages may accidentally include sensitive information such as account details, names, or dates of birth. Encrypting SNS topics and applying strict access controls helps ensure compliance, but it’s equally important to prevent sensitive data from leaking into messages themselves.
That’s where **SNS Message Data Protection** comes in: your architecture isn’t just about fast delivery, it’s also about secure delivery.

In this blog, I will walk through how we can protect personal and sensitive information while sending notifications through SNS using Data Protection Policies.

Data Protection in SNS

Amazon SNS uses data protection policies to identify and manage sensitive data (such as PII and PHI) in message payloads. Detection relies on predefined or custom data identifiers, which use machine learning and pattern matching to find sensitive values.

Each policy allows you to define operations based on detection:

Audit – Log findings without interrupting delivery
De-identify – Mask or redact sensitive data
Deny – Block messages containing sensitive data

A policy is defined in JSON format and includes elements like:

DataDirection (Inbound/Outbound)
Principal (IAM identity publishing/subscribing)
DataIdentifier (e.g., name, phone number)
Operation (Audit, De-identify, Deny)

Only one data protection policy per SNS topic is allowed, but it can have multiple statements. This helps organizations enforce privacy controls and reduce compliance risks.
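As a rough illustration of how one policy can carry several statements, the sketch below assembles a policy document in Python. The helper names (`build_statement`, `build_policy`) and the policy name are hypothetical; the field names, identifier ARNs, and the `2021-06-01` version are taken from the sample policies later in this post.

```python
import json

EMAIL = "arn:aws:dataprotection::aws:data-identifier/EmailAddress"
CARD = "arn:aws:dataprotection::aws:data-identifier/CreditCardNumber"


def build_statement(sid, identifiers, operation, direction="Inbound", principal="*"):
    """Return one statement of a data protection policy (hypothetical helper)."""
    return {
        "Sid": sid,
        "DataDirection": direction,
        "Principal": [principal],
        "DataIdentifier": list(identifiers),
        "Operation": operation,
    }


def build_policy(name, statements, description=""):
    """Wrap a list of statements in the top-level policy document."""
    return {
        "Name": name,
        "Description": description,
        "Version": "2021-06-01",
        "Statement": list(statements),
    }


# One policy, two statements: mask email addresses, block credit card numbers.
policy = build_policy(
    "example-multi-statement-policy",  # hypothetical name
    [
        build_statement(
            "MaskEmail",
            [EMAIL],
            {"Deidentify": {"MaskConfig": {"MaskWithCharacter": "#"}}},
        ),
        build_statement("DenyCards", [CARD], {"Deny": {}}),
    ],
)
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

Building the document in code rather than by hand makes it easier to keep the single per-topic policy consistent as statements are added.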

Why should I use message data protection?

Introducing SNS Data Protection into your governance, risk, and compliance programs helps you automatically detect, prevent, and control data leakage. It safeguards regulated data (PII/PHI) and reduces the overhead of building your own detection or masking pipeline.

Defining SNS topics with policy

We define three SNS topics, each configured with its own data protection policy. Audit logs sensitive content without stopping delivery, De-identify masks regulated fields before the message reaches subscribers, and Deny blocks messages that contain sensitive data.

The three sample policies below show how SNS handles the same sensitive data under each rule.
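One way to attach a policy is at topic creation time. The sketch below assumes a client that behaves like `boto3.client("sns")`, whose `create_topic` call accepts the policy as a JSON string in `DataProtectionPolicy`. The helper name `create_protected_topic` is hypothetical, and the client is passed in so the function can be exercised without AWS access.

```python
import json


def create_protected_topic(sns_client, topic_name, policy):
    """Create an SNS topic with a data protection policy attached.

    `sns_client` is expected to behave like boto3.client("sns").
    `policy` is a dict shaped like the sample policies in this post.
    """
    response = sns_client.create_topic(
        Name=topic_name,
        # The API expects the policy serialized as a JSON string.
        DataProtectionPolicy=json.dumps(policy),
    )
    return response["TopicArn"]
```

With a real client this might be called as `create_protected_topic(boto3.client("sns"), "sns-deny-topic", deny_policy)`, where the topic name is only an example. For a topic that already exists, boto3 also provides `put_data_protection_policy(ResourceArn=..., DataProtectionPolicy=...)`.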

Audit - Data protection policy

This policy detects email addresses, dates of birth, and credit card numbers.
When a match is found, the finding is logged to CloudWatch Logs and delivery continues.

One important note: the CloudWatch log group name must start with the prefix /aws/vendedlogs/.

{
  "Description": "Audit sensitive data without blocking delivery",
  "Version": "2021-06-01",
  "Statement": [
    {
      "DataDirection": "Inbound",
      "DataIdentifier": [
        "arn:aws:dataprotection::aws:data-identifier/EmailAddress",
        "arn:aws:dataprotection::aws:data-identifier/DateOfBirth",
        "arn:aws:dataprotection::aws:data-identifier/CreditCardNumber"
      ],
      "Operation": {
        "Audit": {
          "FindingsDestination": {
            "CloudWatchLogs": {
              "LogGroup": "/aws/vendedlogs/sns-audit/"
            }
          },
          "SampleRate": "99"
        }
      },
      "Principal": [
        "*"
      ],
      "Sid": "AuditSensitiveData"
    }
  ],
  "Name": "sns-audit-policy"
}
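Because the log group prefix is easy to get wrong, a small pre-flight check can catch it before the policy is applied. The sketch below is a hypothetical helper, not part of any AWS SDK: it walks a policy dict and reports any Audit log group that does not start with `/aws/vendedlogs/`.

```python
VENDED_PREFIX = "/aws/vendedlogs/"


def audit_log_groups(policy):
    """Collect every CloudWatch log group referenced by Audit operations."""
    groups = []
    for statement in policy.get("Statement", []):
        audit = statement.get("Operation", {}).get("Audit")
        if audit is None:
            continue  # not an Audit statement
        destination = audit.get("FindingsDestination", {})
        log_group = destination.get("CloudWatchLogs", {}).get("LogGroup")
        if log_group:
            groups.append(log_group)
    return groups


def invalid_audit_log_groups(policy):
    """Return the log groups that lack the required prefix."""
    return [g for g in audit_log_groups(policy) if not g.startswith(VENDED_PREFIX)]


# Trimmed-down version of the audit policy above.
audit_policy = {
    "Statement": [
        {
            "Operation": {
                "Audit": {
                    "FindingsDestination": {
                        "CloudWatchLogs": {"LogGroup": "/aws/vendedlogs/sns-audit/"}
                    },
                    "SampleRate": "99",
                }
            }
        }
    ]
}
print(invalid_audit_log_groups(audit_policy))  # prints []
```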

De-identify - Data protection policy

This policy masks detected sensitive values with # characters, so subscribers receive the masked text instead of the original data.

{
  "Description": "Mask or redact sensitive data",
  "Version": "2021-06-01",
  "Statement": [
    {
      "DataDirection": "Inbound",
      "DataIdentifier": [
        "arn:aws:dataprotection::aws:data-identifier/EmailAddress",
        "arn:aws:dataprotection::aws:data-identifier/DateOfBirth",
        "arn:aws:dataprotection::aws:data-identifier/CreditCardNumber"
      ],
      "Operation": {
        "Deidentify": {
          "MaskConfig": {
            "MaskWithCharacter": "#"
          }
        }
      },
      "Principal": [
        "*"
      ],
      "Sid": "DeidentifySensitiveData"
    }
  ],
  "Name": "sns-deidentify-policy"
}
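The policy above uses masking, but De-identify is described as "mask or redact". The sketch below is a hypothetical helper that builds the `Operation` block for either choice. The `MaskConfig` shape comes from the policy above; the `RedactConfig` key is an assumption based on the SNS documentation and is not exercised anywhere in this post, so verify it before relying on it.

```python
def deidentify_operation(mask_char=None):
    """Build the Operation block for a De-identify statement.

    With `mask_char` set, matched data is replaced by that character.
    With `mask_char=None`, the block requests redaction instead
    (assumed key name: RedactConfig).
    """
    if mask_char is None:
        return {"Deidentify": {"RedactConfig": {}}}
    if len(mask_char) != 1:
        raise ValueError("mask_char must be a single character")
    return {"Deidentify": {"MaskConfig": {"MaskWithCharacter": mask_char}}}


print(deidentify_operation("#"))
print(deidentify_operation())
```

Masking keeps the length of the original value visible, while redaction removes it entirely; which one fits depends on what downstream subscribers need.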

Deny - Data protection policy

This policy blocks the publish request entirely if sensitive data is detected.

{
  "Description": "Block messages containing sensitive data",
  "Version": "2021-06-01",
  "Statement": [
    {
      "DataDirection": "Inbound",
      "DataIdentifier": [
        "arn:aws:dataprotection::aws:data-identifier/EmailAddress",
        "arn:aws:dataprotection::aws:data-identifier/DateOfBirth",
        "arn:aws:dataprotection::aws:data-identifier/CreditCardNumber"
      ],
      "Operation": {
        "Deny": {}
      },
      "Principal": [
        "*"
      ],
      "Sid": "DenySensitiveData"
    }
  ],
  "Name": "sns-deny-policy"
}

Demo

I created a simple Lambda function to test these topics:

import boto3
import os
import json
import logging

sns = boto3.client('sns')
logger = logging.getLogger()
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
    message = {
        "patientId": "PAT123456",
        "name": "John Doe",
        "dob": "12-01-2012",
        "diagnosis": "Flu"
    }

    topics = ["AUDIT_TOPIC_ARN", "DEIDENTIFY_TOPIC_ARN", "DENY_TOPIC_ARN"]
    results = {}

    for topic_env in topics:
        topic_arn = os.environ.get(topic_env)

        try:
            response = sns.publish(
                TopicArn=topic_arn,
                Message=json.dumps(message)
            )
            results[topic_env] = {
                "status": "success",
                "messageId": response.get("MessageId")
            }
            logger.info(f"Published to {topic_env}: {response.get('MessageId')}")

        except sns.exceptions.InvalidParameterException as e:
            # Malformed request, e.g. an invalid topic ARN
            logger.error(f"[{topic_env}] Invalid parameter: {e}")
            results[topic_env] = {"status": "failed", "error": "Invalid parameter"}

        except sns.exceptions.AuthorizationErrorException as e:
            # Raised for IAM denials and also when the Deny data protection
            # policy rejects the message (see the test output below)
            logger.error(f"[{topic_env}] Access denied: {e}")
            results[topic_env] = {"status": "failed", "error": "Access denied"}

        except Exception as e:
            # Catch any unexpected exception
            logger.error(f"[{topic_env}] Unexpected error: {str(e)}")
            results[topic_env] = {"status": "failed", "error": str(e)}

    return {
        "status": "completed",
        "results": results
    }

I'm sending the JSON below, which includes a date of birth as personal information.

{
  "patientId": "PAT123456",
  "name": "John Doe",
  "dob": "12-01-2012",
  "diagnosis": "Flu"
}

Testing gives the following results:

For Audit, the message is delivered and the finding is logged to the CloudWatch log group.

For De-identify, the message is delivered with the date of birth masked.

And for Deny, the publish call fails with an AuthorizationError, which the function logs as access denied:

> Access denied: An error occurred (AuthorizationError) when calling the Publish operation: One or more data identifiers were found
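Since the same `AuthorizationError` code can also come from an IAM denial, a publisher may want to tell the two cases apart. The sketch below is a hypothetical helper that inspects the `Error` mapping of a botocore `ClientError` (that is, `exc.response["Error"]`). Matching on the message text is an assumption based only on the single error message shown above, so treat it as brittle.

```python
def is_data_protection_denial(error):
    """Guess whether an SNS error came from a Deny data protection policy.

    `error` is the "Error" mapping from a botocore ClientError response,
    with "Code" and "Message" keys.
    """
    code = error.get("Code", "")
    message = error.get("Message", "")
    # Assumption: the Deny policy message mentions "data identifiers",
    # as in the test output shown in this post.
    return code == "AuthorizationError" and "data identifiers" in message.lower()


sample_error = {
    "Code": "AuthorizationError",
    "Message": "One or more data identifiers were found",
}
print(is_data_protection_denial(sample_error))  # prints True
```

In the Lambda handler, this could be called inside the `AuthorizationErrorException` branch as `is_data_protection_denial(e.response["Error"])` to record a more specific failure reason.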

Below is a summary of the three data protection policy types:

| Policy Type | Purpose | What Happens When Sensitive Data Is Detected? | Impact on Message Delivery | Ideal Use Case |
| --- | --- | --- | --- | --- |
| Audit | Monitor sensitive data exposure | Logs findings to a CloudWatch log group under `/aws/vendedlogs/` | ✔️ Delivered | Compliance monitoring, security insights, debugging sensitive data flow |
| De-identify | Mask or redact sensitive data before delivery | Sensitive fields are replaced with `#` or a chosen mask | ✔️ Delivered (masked) | Sending events to analytics systems, external subscribers, or downstream apps that shouldn't see PII |
| Deny | Prevent sensitive data from being published | Publish request is blocked with an `AuthorizationError` | ❌ Not delivered | Strict compliance environments (PCI/HIPAA), preventing accidental PII leakage |

I hope this walkthrough gives you a clear understanding of how SNS Data Protection policies work and where they can be applied across real-world scenarios. By using these capabilities, teams can significantly reduce compliance risks, strengthen data governance, and build more secure, trustworthy systems—without sacrificing the speed or scalability of their event-driven architecture.

Stay secure. Stay responsible. Keep building.