IronSoftware

How to Properly Redact PDFs in C#

Many government documents released as "redacted" PDFs, including FBI and CIA files, can be unredacted with a simple Ctrl+A and copy-paste. The "redacted" text is still in the PDF; it's just visually hidden.

This isn't redaction. It's cosmetic covering.

Here's the difference between visual redaction (drawing a black box over text, a common mistake in tools like Adobe Acrobat) and true redaction (removing data permanently), and how to implement secure redaction in C# using IronPDF.

What is PDF Redaction?

PDF redaction is the permanent removal of sensitive information from documents.

What gets redacted:

  • Social Security numbers
  • Credit card numbers
  • Classified information
  • Personal health information (PHI)
  • Financial data
  • Legal discovery materials

A minimal example with IronPDF:

using IronPdf;
// Install via NuGet: Install-Package IronPdf

var pdf = PdfDocument.FromFile("classified-document.pdf");

// TRUE redaction: Permanently removes text
pdf.RedactTextOnAllPages("SECRET");

pdf.SaveAs("redacted-document.pdf");

Result: The word "SECRET" is permanently removed from the PDF. It cannot be recovered.

Visual Redaction vs True Redaction

Visual Redaction (Adobe Acrobat's Mistake)

What Adobe Acrobat often does:

  1. Draws a black rectangle over sensitive text
  2. Leaves the original text underneath
  3. Text is still selectable, searchable, and copyable

Example of Adobe's "redaction":

Original PDF text: "John Doe, SSN: 123-45-6789"
After Adobe "redaction": [BLACK BOX] (but text still exists in PDF)
Select All (Ctrl+A): Shows "John Doe, SSN: 123-45-6789"

This is not redaction. It's visual covering.

True Redaction (What Should Happen)

Proper redaction:

  1. Finds sensitive text
  2. Deletes text from PDF structure
  3. Replaces with empty space or black box
  4. Text cannot be recovered by any means

Example of true redaction:

Original PDF text: "John Doe, SSN: 123-45-6789"
After true redaction: "John Doe, SSN: [REDACTED]"
Select All (Ctrl+A): Shows "John Doe, SSN: [REDACTED]"

The original text is gone forever.
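
The difference can be sketched with a toy model. This is not how a PDF is actually stored; it only assumes that a page holds text runs plus drawn shapes, and that text extraction reads the text runs and ignores the shapes. All names here are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model (not a real PDF structure): a page has text runs and drawn shapes.
var textRuns = new List<string> { "John Doe, SSN: ", "123-45-6789" };
var drawnBoxes = new List<string>();

// Text extraction only reads text runs; it never looks at shapes.
string Extract(List<string> runs) => string.Concat(runs);

// Visual redaction: add a black box, leave the text run in place.
drawnBoxes.Add("black rectangle over the SSN");
string afterVisual = Extract(textRuns);

// True redaction: replace the text run itself.
var redactedRuns = textRuns.Select(r => r == "123-45-6789" ? "[REDACTED]" : r).ToList();
string afterTrue = Extract(redactedRuns);

Console.WriteLine(afterVisual); // John Doe, SSN: 123-45-6789
Console.WriteLine(afterTrue);   // John Doe, SSN: [REDACTED]
```

The black box changes what the page looks like, but nothing about what a copy-paste or text extractor sees.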

Real-World Failures: FBI and CIA Documents

The Manafort Court Filing (2019)

What happened:

  • Paul Manafort's defense lawyers filed a redacted PDF in federal court
  • Journalists copied the "redacted" text and pasted it into a text editor
  • The text revealed details of the special counsel investigation that were meant to stay sealed

The failure:

  • Black boxes were drawn over the text (visual redaction)
  • Original text remained in PDF structure
  • Simple copy-paste exposed the contents

CIA Torture Report (2014)

What happened:

  • CIA released redacted torture report
  • Researchers discovered unredacted text in PDF metadata
  • Exposed names of CIA operatives

The failure:

  • Metadata not removed
  • Visual redaction only
  • Security breach

DOJ Court Filings (2018-2023)

Ongoing problem:

  • Lawyers file redacted PDFs with courts
  • Opposing counsel extracts "redacted" text
  • Confidential settlement amounts, trade secrets exposed

The failure:

  • Relying on Adobe's default redaction tool
  • Not verifying text is truly removed
  • Legal malpractice claims filed

How Adobe Acrobat "Redaction" Fails

Problem 1: Black Boxes Over Text

A common Acrobat workflow that fails:

Step 1: Select text
Step 2: Draw a black rectangle or highlight over it
Step 3: Save the PDF without applying the Redact tool

What this does:

  • Adds a black rectangle to the PDF
  • Original text remains underneath
  • Text is still searchable via Find (Ctrl+F)

Test:

  1. Open "redacted" PDF in Adobe Reader
  2. Press Ctrl+A (Select All)
  3. Press Ctrl+C (Copy)
  4. Paste into Notepad
  5. See "redacted" text in plain text

Problem 2: Copy-Paste Exposes Text

Example:

// This is what Adobe "redacted" PDFs contain:

PDF Layer 1: Black rectangle (visual)
PDF Layer 2: "Social Security Number: 123-45-6789" (hidden but accessible)

Anyone with Adobe Reader can:

  • Select the black box
  • Copy underlying text
  • Paste into any text editor

Problem 3: OCR Can Read "Redacted" Scans

Scenario:

  1. Print document
  2. Use marker to black out sensitive text
  3. Scan to PDF
  4. Release PDF

What attackers do:

  1. Run OCR (Optical Character Recognition) on scanned image
  2. Enhance image contrast
  3. Reveal text under black marker

This works because:

  • Scanned images capture ink bleed-through
  • OCR can read faint text
  • Image enhancement reveals hidden text

How to Properly Redact PDFs in C#

Option 1: IronPDF True Redaction

IronPDF removes text permanently:

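using System.Linq;
using System.Text.RegularExpressions;
using IronPdf;
// Install via NuGet: Install-Package IronPdf

var pdf = PdfDocument.FromFile("sensitive-document.pdf");

// Redact specific text
pdf.RedactTextOnAllPages("123-45-6789"); // SSN
pdf.RedactTextOnAllPages("CONFIDENTIAL");

// Redact all SSNs: find them with .NET Regex, then redact each literal match.
// This works even if RedactTextOnAllPages treats its argument as literal text.
var ssns = Regex.Matches(pdf.ExtractAllText(), @"\b\d{3}-\d{2}-\d{4}\b")
    .Cast<Match>().Select(m => m.Value).Distinct().ToList();
foreach (var ssn in ssns)
    pdf.RedactTextOnAllPages(ssn);

pdf.SaveAs("redacted-document.pdf");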

What this does:

  1. Finds text matching the pattern
  2. Deletes text from PDF structure
  3. Replaces with empty space
  4. Text cannot be recovered

Verification:

  1. Open redacted PDF
  2. Press Ctrl+A (Select All)
  3. Press Ctrl+C (Copy)
  4. Paste into Notepad
  5. Text is gone (shows "[REDACTED]" or empty space)

Option 2: Redact with Visual Placeholder

Replace redacted text with visible marker:

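using IronPdf;
// Install via NuGet: Install-Package IronPdf

var pdf = PdfDocument.FromFile("document.pdf");

// Redact and replace with placeholder.
// Argument order assumed here: text, caseSensitive, onlyMatchWholeWords,
// drawRectangles, replacementText. Check it against your installed IronPDF version.
pdf.RedactTextOnAllPages("SECRET", true, true, false, "[REDACTED]");

pdf.SaveAs("redacted.pdf");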

Result:

Before: "The password is SECRET and should not be shared."
After: "The password is [REDACTED] and should not be shared."

Option 3: Redact Specific Pages

Redact only certain pages:

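using System.Linq;
using System.Text.RegularExpressions;
using IronPdf;
// Install via NuGet: Install-Package IronPdf

var pdf = PdfDocument.FromFile("multi-page-report.pdf");

// Redact page 3 only (page indexes are zero-based)
pdf.RedactTextOnPage(2, "CLASSIFIED");

// Redact SSNs on pages 5-10 (indexes 4-9)
var ssnPattern = new Regex(@"\b\d{3}-\d{2}-\d{4}\b");
for (int i = 4; i < 10; i++)
{
    // Match on the page's extracted text, then redact each literal match on that page
    var ssns = ssnPattern.Matches(pdf.ExtractTextFromPage(i))
        .Cast<Match>().Select(m => m.Value).Distinct().ToList();
    foreach (var ssn in ssns)
        pdf.RedactTextOnPage(i, ssn);
}

pdf.SaveAs("partially-redacted.pdf");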

Option 4: Redact by Bounding Box (Area-Based)

Redact a specific area of the page:

using IronPdf;
using IronSoftware.Drawing;
// Install via NuGet: Install-Package IronPdf

var pdf = PdfDocument.FromFile("form.pdf");

// Redact rectangular area on the first page (x, y, width, height in points)
var region = new RectangleF(100, 200, 300, 50);
pdf.RedactRegionsOnPage(0, region); // Redacts specific region

pdf.SaveAs("area-redacted.pdf");

Use case: Forms where sensitive data is always in the same location.
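
If you measure a form field on paper or in a design tool, you need to convert to points before passing coordinates. A minimal sketch, assuming the standard PDF unit of 72 points per inch; the helper names and the field position are hypothetical, and whether y is measured from the top or the bottom of the page depends on the library, so test on a sample page first.

```csharp
using System;

// PDF user space uses points: 72 points = 1 inch, 1 inch = 25.4 mm.
const double PointsPerInch = 72.0;
const double MmPerInch = 25.4;

double InchesToPoints(double inches) => inches * PointsPerInch;
double MmToPoints(double mm) => mm / MmPerInch * PointsPerInch;

// Hypothetical field: 1.5 in across, 3 in along the vertical axis, 4 in wide, 0.5 in tall.
double x = InchesToPoints(1.5);
double y = InchesToPoints(3.0);
double width = InchesToPoints(4.0);
double height = InchesToPoints(0.5);

Console.WriteLine($"{x}, {y}, {width}, {height}"); // 108, 216, 288, 36
Console.WriteLine(MmToPoints(25.4));               // 72
```

It is safer to make the box slightly larger than the field than to risk leaving the edge of a character visible.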

Redacting Different Types of Sensitive Data

Social Security Numbers

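using System.Linq;
using System.Text.RegularExpressions;
using IronPdf;
// Install via NuGet: Install-Package IronPdf

var pdf = PdfDocument.FromFile("hr-records.pdf");

// Regex for SSN formats: 123-45-6789 or 123456789
var ssnPattern = new Regex(@"\b\d{3}-?\d{2}-?\d{4}\b");

// Find matches in the extracted text, then redact each literal match.
// This works even if RedactTextOnAllPages treats its argument as literal text.
foreach (var ssn in ssnPattern.Matches(pdf.ExtractAllText()).Cast<Match>().Select(m => m.Value).Distinct().ToList())
    pdf.RedactTextOnAllPages(ssn);

pdf.SaveAs("redacted-hr-records.pdf");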
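
It's worth testing an SSN pattern on sample text before trusting it. A pattern like `\d{3}-?\d{2}-?\d{4}` without word boundaries also matches inside longer digit runs. The sketch below uses a made-up sample string and only .NET's built-in Regex, no PDF library.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string sample = "SSN 123-45-6789, alt 987654321, order 1234567890, phone 555-12-34567";

// Unanchored pattern: can match a slice of a longer number.
var loose = new Regex(@"\d{3}-?\d{2}-?\d{4}");
// Same pattern with word boundaries on both sides.
var strict = new Regex(@"\b\d{3}-?\d{2}-?\d{4}\b");

string[] Find(Regex rx) => rx.Matches(sample).Cast<Match>().Select(m => m.Value).ToArray();

Console.WriteLine(string.Join(" | ", Find(loose)));  // 123-45-6789 | 987654321 | 123456789 | 555-12-3456
Console.WriteLine(string.Join(" | ", Find(strict))); // 123-45-6789 | 987654321
```

Even the stricter pattern flags any standalone 9-digit number. For redaction that is usually the right trade-off: a false positive hides too much, while a miss leaks data.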

Credit Card Numbers

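using System.Linq;
using System.Text.RegularExpressions;
using IronPdf;
// Install via NuGet: Install-Package IronPdf

var pdf = PdfDocument.FromFile("payment-records.pdf");

// Regex for credit card numbers (4-4-4-4 or 16 digits)
var cardPattern = new Regex(@"\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b");

// Find matches in the extracted text, then redact each literal match.
// This works even if RedactTextOnAllPages treats its argument as literal text.
foreach (var card in cardPattern.Matches(pdf.ExtractAllText()).Cast<Match>().Select(m => m.Value).Distinct().ToList())
    pdf.RedactTextOnAllPages(card);

pdf.SaveAs("redacted-payments.pdf");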
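
A 16-digit pattern also matches tracking numbers and order IDs. One way to tell them apart in a review report is the Luhn checksum that payment card numbers satisfy. This is a sketch with a made-up sample string; `PassesLuhn` is a hypothetical helper, not part of IronPDF.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Luhn checksum: from the right, double every second digit (subtracting 9 if the
// result exceeds 9), sum everything, and require the total to be divisible by 10.
bool PassesLuhn(string candidate)
{
    var digits = candidate.Where(c => c >= '0' && c <= '9')
                          .Select(c => c - '0')
                          .Reverse()
                          .ToArray();
    int sum = 0;
    for (int i = 0; i < digits.Length; i++)
    {
        int d = digits[i];
        if (i % 2 == 1)
        {
            d *= 2;
            if (d > 9) d -= 9;
        }
        sum += d;
    }
    return digits.Length > 0 && sum % 10 == 0;
}

string sample = "Card 4111 1111 1111 1111, tracking 1234-5678-9012-3456";
var cardPattern = new Regex(@"\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b");

foreach (Match m in cardPattern.Matches(sample))
{
    Console.WriteLine($"{m.Value}: {(PassesLuhn(m.Value) ? "likely card" : "fails Luhn")}");
}
// 4111 1111 1111 1111: likely card
// 1234-5678-9012-3456: fails Luhn
```

Use the checksum to prioritize manual review rather than to skip redaction: a mistyped card number fails Luhn but may still be sensitive.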

Email Addresses

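using System.Linq;
using System.Text.RegularExpressions;
using IronPdf;
// Install via NuGet: Install-Package IronPdf

var pdf = PdfDocument.FromFile("communications.pdf");

// Regex for email addresses
var emailPattern = new Regex(@"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}");

// Find matches in the extracted text, then redact each literal match.
// This works even if RedactTextOnAllPages treats its argument as literal text.
foreach (var email in emailPattern.Matches(pdf.ExtractAllText()).Cast<Match>().Select(m => m.Value).Distinct().ToList())
    pdf.RedactTextOnAllPages(email);

pdf.SaveAs("redacted-emails.pdf");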

Phone Numbers

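using System.Linq;
using System.Text.RegularExpressions;
using IronPdf;
// Install via NuGet: Install-Package IronPdf

var pdf = PdfDocument.FromFile("contact-list.pdf");

// Regex for US phone numbers
var phonePattern = new Regex(@"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}");

// Find matches in the extracted text, then redact each literal match.
// This works even if RedactTextOnAllPages treats its argument as literal text.
foreach (var phone in phonePattern.Matches(pdf.ExtractAllText()).Cast<Match>().Select(m => m.Value).Distinct().ToList())
    pdf.RedactTextOnAllPages(phone);

pdf.SaveAs("redacted-phones.pdf");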

Compliance Requirements for Redaction

HIPAA (Healthcare)

Requirement: Permanent removal of Protected Health Information (PHI)

What must be redacted:

  • Patient names
  • Social Security numbers
  • Medical record numbers
  • Dates of service
  • Addresses, phone numbers

IronPDF approach:

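using System.Linq;
using System.Text.RegularExpressions;
using IronPdf;
// Install via NuGet: Install-Package IronPdf

var pdf = PdfDocument.FromFile("patient-record.pdf");
var text = pdf.ExtractAllText();

// Redact every distinct regex match and replace it with a label.
// Argument order assumed here: text, caseSensitive, onlyMatchWholeWords,
// drawRectangles, replacementText. Check it against your installed IronPDF version.
void RedactMatches(string pattern, string label)
{
    foreach (var hit in Regex.Matches(text, pattern).Cast<Match>().Select(m => m.Value).Distinct().ToList())
        pdf.RedactTextOnAllPages(hit, true, true, false, label);
}

// Redact PHI
RedactMatches(Regex.Escape("John Doe"), "[PATIENT NAME]");
RedactMatches(@"\b\d{3}-\d{2}-\d{4}\b", "[SSN]");           // SSN
RedactMatches(@"\b\d{1,2}/\d{1,2}/\d{4}\b", "[DATE]");      // Dates such as 03/07/2024 or 3/7/2024

pdf.SaveAs("hipaa-compliant-redacted.pdf");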

Legal Discovery

Requirement: Redact privileged communications, trade secrets

Best practice:

  • Redact attorney-client communications
  • Remove confidential settlement amounts
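  • Hide trade secret formulas

IronPDF approach:

using System.Linq;
using System.Text.RegularExpressions;
using IronPdf;
// Install via NuGet: Install-Package IronPdf

var pdf = PdfDocument.FromFile("discovery-document.pdf");

// Redact legal privilege markers
pdf.RedactTextOnAllPages("Attorney-Client Privileged");
pdf.RedactTextOnAllPages("Work Product");

// Redact dollar amounts, including cents (e.g. $1,250,000.50).
// Matches are found with .NET Regex and redacted as literal text, longest first.
var amounts = Regex.Matches(pdf.ExtractAllText(), @"\$\d[\d,]*(\.\d{2})?")
    .Cast<Match>().Select(m => m.Value).Distinct().OrderByDescending(a => a.Length).ToList();
foreach (var amount in amounts)
    pdf.RedactTextOnAllPages(amount);

pdf.SaveAs("privilege-redacted.pdf");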
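
Dollar-amount patterns are easy to get subtly wrong. A pattern such as `\$[\d,]+` stops at the decimal point, so the cents would survive redaction. The sketch below compares it with a variant that allows an optional two-digit fraction; the sample sentence is made up.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string sample = "Settlement of $1,250,000.50 plus $300 in fees.";

// Digits and commas only: stops before ".50".
var wholeDollars = new Regex(@"\$[\d,]+");
// Allows an optional two-digit cents part.
var withCents = new Regex(@"\$\d[\d,]*(\.\d{2})?");

string[] Find(Regex rx) => rx.Matches(sample).Cast<Match>().Select(m => m.Value).ToArray();

Console.WriteLine(string.Join(" | ", Find(wholeDollars))); // $1,250,000 | $300
Console.WriteLine(string.Join(" | ", Find(withCents)));    // $1,250,000.50 | $300
```

Neither pattern covers amounts written out in words or in other currencies, so a manual pass over settlement documents is still needed.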

Government Classifications

Requirement: Remove classified markings and content

using IronPdf;
// Install via NuGet: Install-Package IronPdf

var pdf = PdfDocument.FromFile("classified-report.pdf");

// Redact classification markers
pdf.RedactTextOnAllPages("TOP SECRET");
pdf.RedactTextOnAllPages("CONFIDENTIAL");
pdf.RedactTextOnAllPages("FOR OFFICIAL USE ONLY");

pdf.SaveAs("declassified-report.pdf");

How to Verify Redaction Worked

Test 1: Copy-Paste Test

  1. Open redacted PDF
  2. Press Ctrl+A (Select All)
  3. Press Ctrl+C (Copy)
  4. Paste into Notepad
  5. Verify redacted text is not present

Test 2: Search Test

  1. Open redacted PDF
  2. Press Ctrl+F (Find)
  3. Search for redacted text
  4. Verify "No results found"

Test 3: Metadata Inspection

Check PDF properties:

using IronPdf;
// Install via NuGet: Install-Package IronPdf

var pdf = PdfDocument.FromFile("redacted.pdf");
var metadata = pdf.MetaData;

Console.WriteLine($"Author: {metadata.Author}");
Console.WriteLine($"Title: {metadata.Title}");
Console.WriteLine($"Subject: {metadata.Subject}");

// Redact metadata if needed
pdf.MetaData.Author = "[REDACTED]";
pdf.MetaData.Title = "Redacted Document";

pdf.SaveAs("metadata-cleaned.pdf");

Test 4: Text Extraction

using IronPdf;
// Install via NuGet: Install-Package IronPdf

var pdf = PdfDocument.FromFile("redacted.pdf");
var extractedText = pdf.ExtractAllText();

Console.WriteLine(extractedText);

// Verify sensitive text is not in extracted text
if (extractedText.Contains("SENSITIVE"))
{
    Console.WriteLine("WARNING: Redaction failed!");
}
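
A single keyword check is easy to extend into a scan for every pattern you meant to remove. The sketch below runs on a plain string standing in for the extracted text, so it has no PDF dependency; `FindLeftovers` and the pattern set are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Returns the label of every pattern that still matches the text.
List<string> FindLeftovers(string text, Dictionary<string, string> checks)
{
    var leftovers = new List<string>();
    foreach (var (label, pattern) in checks)
    {
        if (Regex.IsMatch(text, pattern, RegexOptions.IgnoreCase))
            leftovers.Add(label);
    }
    return leftovers;
}

var patterns = new Dictionary<string, string>
{
    ["SSN"] = @"\b\d{3}-\d{2}-\d{4}\b",
    ["Keyword"] = @"\bconfidential\b",
};

// Stand-in for the text extracted from the redacted file.
string extractedText = "Employee: [REDACTED], SSN: 123-45-6789";

var found = FindLeftovers(extractedText, patterns);
Console.WriteLine(found.Count == 0 ? "Clean" : "Still present: " + string.Join(", ", found));
// Still present: SSN
```

Running a check like this in a build or release step means a failed redaction blocks the file from going out instead of being noticed later.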

The Bottom Line: Don't Trust Visual Redaction

Drawing black boxes in Adobe Acrobat instead of applying true redaction often fails because:

  • It uses visual covering (black boxes)
  • Original text remains in PDF structure
  • Text is copyable, searchable, extractable

Real-world consequences:

  • FBI and CIA documents exposed
  • Legal malpractice lawsuits
  • HIPAA violations
  • National security breaches

True redaction requires:

  1. Permanent deletion of text from PDF structure
  2. Verification via copy-paste test
  3. Metadata cleaning
  4. Compliance with HIPAA, legal discovery, classification requirements

IronPDF provides true redaction:

using IronPdf;
// Install via NuGet: Install-Package IronPdf

var pdf = PdfDocument.FromFile("sensitive.pdf");

// Permanently removes text (cannot be recovered)
pdf.RedactTextOnAllPages("CONFIDENTIAL");

pdf.SaveAs("truly-redacted.pdf");

If sensitive data is in your PDFs, don't rely on visual covering. Use true redaction.

Learn more: IronPDF Redaction Guide


Written by Jacob Mellor, CTO at Iron Software. Jacob created IronPDF and leads a team of 50+ engineers building .NET document processing libraries.