
Mohammad Waseem


Securing Legacy Test Environments: Preventing PII Leaks with TypeScript

In large organizations with legacy codebases, test environments often harbor sensitive Personally Identifiable Information (PII), increasing the risk of leaks that can lead to privacy violations and compliance issues. Addressing this challenge requires a strategic approach that can be integrated into the existing infrastructure without overhauling the entire system.

As a senior developer, I recently tackled this problem by implementing a runtime PII detection and masking layer in TypeScript. Legacy codebases tend to be heterogeneous and often lack type safety. TypeScript helps here with static analysis and type guards, but its types are erased at compile time, so the detection itself has to be written as ordinary runtime checks.

The first step involved identifying common PII patterns (such as email addresses, phone numbers, and social security numbers). Instead of relying solely on static code review, I decided to introduce a dynamic interception mechanism that can scan data before it leaves the application or is written to test logs.

Here's a foundational example of how to implement a PII detector for JSON data structures:

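// PII detection regex patterns (heuristics; unanchored, so they also match PII embedded in longer strings)
const patterns = {
  email: /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/,
  ssn: /\d{3}-\d{2}-\d{4}/,
  // The country code group is optional, so both "+1-555-123-4567" and "555-123-4567" match
  phone: /(?:\+?\d{1,3}[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}/
};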

function containsPII(value: any): boolean {
  if (typeof value === 'string') {
    return Object.values(patterns).some((regex) => regex.test(value));
  }
  if (typeof value === 'object' && value !== null) {
    return Object.values(value).some(containsPII);
  }
  return false;
}

// Example data object
const sampleData = {
  name: "John Doe",
  email: "john.doe@example.com",
  phone: "+1-555-123-4567",
  address: "123 Main St"
};

// Check for PII
console.log(containsPII(sampleData)); // true

This function recursively scans JSON-like objects for patterns that match PII indicators. When a match is found, we can implement masking or alerting.
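For the alerting side, a boolean is often not enough, because you also want to know which field tripped the check. The following is a minimal sketch of that idea, not code from the project described here: `findPII`, `PIIFinding`, and `piiPatterns` are hypothetical names, and the patterns are assumptions made for this example (the phone pattern treats the country code as optional).

```typescript
// Hypothetical helper for illustration: names and report shape are assumptions.
type PIIKind = 'email' | 'ssn' | 'phone';

interface PIIFinding {
  path: string;  // dotted path to the offending field, e.g. "user.email"
  kind: PIIKind; // which pattern matched
}

// Assumed heuristic patterns for this sketch
const piiPatterns: Record<PIIKind, RegExp> = {
  email: /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/,
  ssn: /\d{3}-\d{2}-\d{4}/,
  phone: /(?:\+?\d{1,3}[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}/,
};

// Walks objects and arrays recursively and records where each pattern matched.
function findPII(value: unknown, path = ''): PIIFinding[] {
  const findings: PIIFinding[] = [];
  if (typeof value === 'string') {
    for (const kind of Object.keys(piiPatterns) as PIIKind[]) {
      if (piiPatterns[kind].test(value)) findings.push({ path, kind });
    }
  } else if (typeof value === 'object' && value !== null) {
    const record = value as Record<string, unknown>;
    for (const key of Object.keys(record)) {
      findings.push(...findPII(record[key], path ? `${path}.${key}` : key));
    }
  }
  return findings;
}

const findings = findPII({
  user: { name: 'John Doe', email: 'john.doe@example.com' },
  contacts: ['+1-555-123-4567'],
});
console.log(findings);
// [ { path: 'user.email', kind: 'email' }, { path: 'contacts.0', kind: 'phone' } ]
```

A report like this can feed a test assertion or a CI warning, which makes it easier to trace a leak back to the fixture or handler that produced it.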

To mitigate exposure, I introduced an automatic masking function:

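function maskPII(value: any): any {
  if (typeof value === 'string') {
    // Mask every match in place, so strings that mix PII with other text
    // (or contain several kinds of PII) are fully covered
    return value
      .replace(new RegExp(patterns.email.source, 'g'), '***@****')
      .replace(new RegExp(patterns.ssn.source, 'g'), '***-**-****')
      .replace(new RegExp(patterns.phone.source, 'g'), '***-***-****');
  }
  if (Array.isArray(value)) {
    // Keep arrays as arrays instead of turning them into index-keyed objects
    return value.map((item) => maskPII(item));
  }
  if (typeof value === 'object' && value !== null) {
    const maskedObject: any = {};
    // Object.keys skips inherited properties, unlike for...in
    for (const key of Object.keys(value)) {
      maskedObject[key] = maskPII(value[key]);
    }
    return maskedObject;
  }
  return value;
}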

const sanitizedData = maskPII(sampleData);
console.log(sanitizedData);
// Output:
// { name: 'John Doe', email: '***@****', phone: '***-***-****', address: '123 Main St' }

Integrating these checks into the test suite involves wrapping data serialization functions or API response handlers. For legacy systems, adding a middleware or interceptor that inspects data before logging or testing can drastically reduce the risk of leaking PII.
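As a rough illustration of such an interceptor, the sketch below wraps a log sink and masks strings during serialization by passing a replacer function to JSON.stringify. The names `maskRules`, `maskString`, `safeStringify`, and `createSafeLogger` are hypothetical, and `sink` stands in for whatever function the legacy code actually logs through; the patterns are assumptions made for this example.

```typescript
// Assumed [pattern, mask] pairs for this sketch; the 'g' flag masks every match.
const maskRules: Array<[RegExp, string]> = [
  [/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g, '***@****'],
  [/\d{3}-\d{2}-\d{4}/g, '***-**-****'],
  [/(?:\+?\d{1,3}[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}/g, '***-***-****'],
];

function maskString(text: string): string {
  let masked = text;
  for (const [regex, mask] of maskRules) {
    masked = masked.replace(regex, mask);
  }
  return masked;
}

// JSON.stringify calls the replacer for every key/value pair, including
// nested objects and array items, so no manual recursion is needed.
function safeStringify(data: unknown): string {
  return JSON.stringify(data, (_key, value) =>
    typeof value === 'string' ? maskString(value) : value
  );
}

// Hypothetical wrapper: masks both the message and the attached data
// before anything reaches the underlying sink.
function createSafeLogger(sink: (line: string) => void) {
  return (message: string, data?: unknown): void => {
    const head = maskString(message);
    sink(data === undefined ? head : `${head} ${safeStringify(data)}`);
  };
}

const log = createSafeLogger((line) => console.log(line));
log('created user john.doe@example.com', { ssn: '123-45-6789' });
// created user ***@**** {"ssn":"***-**-****"}
```

Wrapping the sink rather than each call site keeps the change to the legacy code small: existing log calls stay as they are and only the logger's construction changes.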

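Additionally, TypeScript’s strict mode and explicit type annotations help ensure that data structures adhere to expected formats, catching potential issues at compile time. Type assertions (the as keyword) do not help here: they override the checker without validating anything, so they are best kept to a minimum. It's crucial to augment static safety with runtime checks, especially in legacy scenarios where input data may be unpredictable.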
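One way to pair the two is a user-defined type guard: a runtime check whose result the compiler also understands. The sketch below is an assumption-laden example rather than part of the original setup. `TestUser`, `isTestUser`, and `assertSyntheticUser` are hypothetical names, and the rule that fixture emails must use a reserved example domain is a policy invented for this illustration.

```typescript
interface TestUser {
  name: string;
  email: string;
}

// User-defined type guard: validates the shape at runtime and narrows
// the type for the compiler when it returns true.
function isTestUser(value: unknown): value is TestUser {
  if (typeof value !== 'object' || value === null) return false;
  const record = value as Record<string, unknown>;
  return typeof record.name === 'string' && typeof record.email === 'string';
}

// Hypothetical policy for this sketch: fixtures may only use the reserved
// example.com / example.org / example.net domains.
function assertSyntheticUser(value: unknown): TestUser {
  if (!isTestUser(value)) {
    throw new TypeError('fixture is not a TestUser');
  }
  if (!/@example\.(com|org|net)$/.test(value.email)) {
    throw new Error('fixture email is not on a reserved example domain');
  }
  return value;
}

// Data of unknown shape (e.g. parsed JSON) is checked before use.
const fixture = assertSyntheticUser(
  JSON.parse('{"name":"John Doe","email":"john.doe@example.com"}')
);
console.log(fixture.email);
```

After the guard passes, the rest of the test code can rely on the TestUser type without any assertions.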

In conclusion, by combining TypeScript's static checks with runtime pattern matching for PII, organizations can substantially reduce the risk of sensitive data leaking during testing. The patterns are heuristics, so some formats will slip through, but the approach fits into existing systems without requiring extensive rewrites, making it a practical security measure for legacy codebases.

Implementing such runtime safeguards can significantly enhance compliance posture, reduce risk, and foster a culture of privacy-aware development.

Tags: security, typescript, legacy, pii, datahandling


🛠️ QA Tip

To test this safely without using real user data, I use TempoMail USA.
