DEV Community

Vivek0712 for AWS Heroes

Blur Personally Identifiable Information (PII) in Text, Images and Videos

Ever worried that your Personally Identifiable Information (PII), such as your name or credit card number, is exposed in personal documents, images or videos?

With the help of AWS AI services, we will see how to safeguard your personal information.

How does it work?

As you may know, PII detection is already available for text documents through Amazon Comprehend. I have extended it to images and videos using Amazon Rekognition. The PII types that can be detected are listed here.

[Image: architecture]

Text Documents:

The PII is identified and redacted using the Amazon Comprehend PII Detection API.
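As a minimal sketch of this step (not the code from the repository), the snippet below calls the Comprehend `detect_pii_entities` operation through Boto3 and replaces each reported span with its entity type. The helper names `redact_pii` and `detect_and_redact`, the default region and the bracketed replacement format are assumptions made for illustration.

```python
def redact_pii(text, entities):
    """Replace each PII span with its entity type in brackets, e.g. [NAME].

    `entities` uses the shape returned by Comprehend: dicts with
    "Type", "BeginOffset" and "EndOffset" keys.
    """
    redacted = text
    # Work from the end of the string so earlier offsets stay valid.
    for entity in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
        start, end = entity["BeginOffset"], entity["EndOffset"]
        redacted = redacted[:start] + f"[{entity['Type']}]" + redacted[end:]
    return redacted


def detect_and_redact(text, language_code="en", region_name="us-east-1"):
    """Ask Amazon Comprehend for PII entities and redact them (needs AWS credentials)."""
    import boto3  # imported here so redact_pii can be used without AWS access

    comprehend = boto3.client("comprehend", region_name=region_name)
    response = comprehend.detect_pii_entities(Text=text, LanguageCode=language_code)
    return redact_pii(text, response["Entities"])
```

Keeping the offset handling in `redact_pii` separate from the API call makes it possible to check the replacement logic offline with hand-written entities.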

Images and Videos:

First, the text present in the image is detected using the Amazon Rekognition Text Detection API. The API returns each piece of detected text along with its bounding box and other relevant details.
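The detection step might look roughly like the sketch below. It calls the Rekognition `detect_text` operation and converts each word's bounding box, which Rekognition reports as ratios of the image size, into pixel coordinates. The function names, the choice to keep only `WORD` detections and the default region are assumptions for illustration, not taken from the repository.

```python
def to_pixel_box(bounding_box, image_width, image_height):
    """Convert a ratio-based Rekognition BoundingBox to (left, top, right, bottom) pixels."""
    left = int(bounding_box["Left"] * image_width)
    top = int(bounding_box["Top"] * image_height)
    right = int((bounding_box["Left"] + bounding_box["Width"]) * image_width)
    bottom = int((bounding_box["Top"] + bounding_box["Height"]) * image_height)
    # Boxes can extend slightly past the image edge, so clamp them.
    return max(left, 0), max(top, 0), min(right, image_width), min(bottom, image_height)


def word_boxes(text_detections, image_width, image_height):
    """Return (text, pixel_box) pairs for WORD detections, skipping LINE entries."""
    return [
        (d["DetectedText"],
         to_pixel_box(d["Geometry"]["BoundingBox"], image_width, image_height))
        for d in text_detections
        if d["Type"] == "WORD"
    ]


def detect_words(image_bytes, image_width, image_height, region_name="us-east-1"):
    """Run Rekognition text detection on raw image bytes (needs AWS credentials)."""
    import boto3  # imported here so the helpers above work without AWS access

    rekognition = boto3.client("rekognition", region_name=region_name)
    response = rekognition.detect_text(Image={"Bytes": image_bytes})
    return word_boxes(response["TextDetections"], image_width, image_height)
```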

The detected text is assembled into a document and passed to the Amazon Comprehend PII Detection API, and all PII found in the text is collected. The matching text regions in the image are then blurred using the bounding boxes from Rekognition and OpenCV.
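One way to connect the two services is sketched below, under some assumptions: the detected words are joined with single spaces, each word's character span is recorded, and any word whose span overlaps a PII entity offset is blurred with OpenCV's `GaussianBlur`. The helper names, the space-joined document layout and the kernel size are illustrative choices, not necessarily what the repository does.

```python
def build_document(words):
    """Join words with spaces; return the text and each word's (start, end) span."""
    spans, cursor = [], 0
    for word in words:
        spans.append((cursor, cursor + len(word)))
        cursor += len(word) + 1  # +1 for the joining space
    return " ".join(words), spans


def pii_word_indexes(spans, entities):
    """Indexes of words whose character span overlaps any PII entity span."""
    hits = []
    for index, (start, end) in enumerate(spans):
        if any(start < e["EndOffset"] and e["BeginOffset"] < end for e in entities):
            hits.append(index)
    return hits


def blur_boxes(image, boxes, kernel=(51, 51)):
    """Gaussian-blur each (left, top, right, bottom) pixel box of an OpenCV image in place."""
    import cv2  # requires opencv-python; imported here so the helpers above stay dependency-free

    for left, top, right, bottom in boxes:
        if right > left and bottom > top:  # skip empty regions
            region = image[top:bottom, left:right]
            image[top:bottom, left:right] = cv2.GaussianBlur(region, kernel, 0)
    return image
```

The indexes returned by `pii_word_indexes` select which word boxes to pass to `blur_boxes`. The kernel dimensions must be odd; a larger kernel gives a stronger blur.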

The same technique is applied to every frame of the video.
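A frame-by-frame loop could be sketched as follows, using OpenCV's `VideoCapture` and `VideoWriter`. The `redact_frame` callback is a hypothetical function that takes one frame and returns it with PII blurred; the `mp4v` codec and the 30 fps fallback are assumptions as well.

```python
def process_frames(read_frame, write_frame, redact_frame):
    """Read frames until the source is exhausted, redact each one, and write it out."""
    count = 0
    while True:
        ok, frame = read_frame()
        if not ok:  # no more frames
            break
        write_frame(redact_frame(frame))
        count += 1
    return count


def blur_video(source_path, destination_path, redact_frame):
    """Apply redact_frame to every frame of a stored video; returns the frame count."""
    import cv2  # requires opencv-python

    capture = cv2.VideoCapture(source_path)
    fps = capture.get(cv2.CAP_PROP_FPS) or 30.0  # fall back if the FPS is unknown
    size = (int(capture.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(capture.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    writer = cv2.VideoWriter(destination_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, size)
    try:
        return process_frames(capture.read, writer.write, redact_frame)
    finally:
        # Release the handles even if redaction fails part-way through.
        capture.release()
        writer.release()
```

Because each frame triggers its own API calls in this sketch, long videos can be slow and costly to process.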

Results

Text

Original Text:

Hello Zhang Wei, I am John. Your AnyCompany Financial Services, LLC credit card account 1111-0000-1111-0008 has a minimum payment of $24.53 that is due by July 31st. Based on your autopay settings, we will withdraw your payment on the due date from your bank account number XXXXXX1111 with the routing number XXXXX0000.
Your latest statement was mailed to 2200 West Cypress Creek Road, 1st Floor, Fort Lauderdale, Florida, 33309.
After your payment is received, you will receive a confirmation text message at 206-555-0100.
If you have questions about your bill, AnyCompany Customer Service is available by phone at 206-555-0199 or email at support@anycompany.com.

PII Redacted Text:

Hello [NAME], I am [NAME]. Your AnyCompany Financial Services, LLC credit card account [CREDIT_DEBIT_NUMBER] has a minimum payment of $24.53 that is due by [DATE_TIME]. Based on your autopay settings, we will withdraw your payment on the due date from your bank account number [BANK_ACCOUNT_NUMBER] with the routing number [BANK_ROUTING].
Your latest statement was mailed to [ADDRESS].
After your payment is received, you will receive a confirmation text message at [PHONE].
If you have questions about your bill, AnyCompany Customer Service is available by phone at [PHONE] or email at [EMAIL].

Image:

Original Image:

[Image: original image]

PII Blurred Image:

[Image: PII-blurred image]

Video:

Original Video:

[Image: original video]

PII Blurred Video:

[Image: PII-blurred video]

Code Samples:

All of the above functionality for blurring PII in text, images and videos is available using the AWS SDK for Python (Boto3). The source and destination can be your local file system or an S3 bucket.
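To illustrate how a single input argument might cover both cases, here is a small sketch that treats anything starting with `s3://` as an S3 object and everything else as a local path. The `s3://bucket/key` convention and the helper names are assumptions for this example; the repository may take its inputs differently.

```python
from urllib.parse import urlparse


def split_s3_uri(location):
    """Return (bucket, key) for an s3:// URI, or None for a local path."""
    parsed = urlparse(location)
    if parsed.scheme != "s3":
        return None
    return parsed.netloc, parsed.path.lstrip("/")


def read_bytes(location):
    """Load a file's bytes from S3 or from the local file system."""
    target = split_s3_uri(location)
    if target is None:
        with open(location, "rb") as handle:
            return handle.read()
    import boto3  # only needed for S3 sources

    bucket, key = target
    return boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"].read()
```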

GitHub: Vivek0712 / blur-pii-aws-ai

PII Detection and Blurring in Text, Images and Video using AWS AI Services

Future Enhancements:

  • The approach and architecture for stored videos will be improved, and support for streaming videos will be added.
  • Other personal information, such as faces, can also be easily integrated.
  • Support for custom PII entity detection can be added by creating a custom entity recognizer with Amazon Comprehend.
  • The feature can evolve into browser extensions, plugins, etc.

Now you can easily detect and blur PII and share text documents, images and videos without risking your privacy. Do share this post if you found it useful. If you have any questions, or would like to collaborate in the future, reach out to me through my social handles.
