DEV Community

Mohammad Waseem


Securing Test Environments: Preventing PII Leaks with Go in Enterprise Testing

In enterprise software development, protecting Personally Identifiable Information (PII) during testing is critical for complying with regulations such as GDPR and CCPA. Leaking PII in test environments not only jeopardizes customer trust but can also lead to hefty fines. As a Lead QA Engineer, I addressed this problem by building a PII filter in Go, chosen for its performance, concurrency support, and simplicity.

Identifying the Issue

Many organizations use synthetic data or anonymized datasets; however, PII can still leak during test setup or through logs, especially in large-scale, distributed environments. The primary goal was to intercept data at the source, sanitize it, and prevent leaks without adding significant latency or complexity.

Designing a Go-based Filtering Solution

Go's native support for concurrent processing makes it a good fit for a high-throughput filter with low overhead. The core idea was to create middleware that intercepts data streams, scans them for PII patterns, and masks or removes sensitive information before it travels any further.
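The same scan-and-mask idea can be applied to log output, one of the leak paths mentioned earlier. The following is a minimal sketch, not the production filter: `redactingWriter` and `redact` are hypothetical names, and the two patterns are the illustrative email and phone expressions used throughout this post.

```go
package main

import (
	"io"
	"log"
	"os"
	"regexp"
)

// Illustrative patterns only: emails and US-style phone numbers.
var patterns = []*regexp.Regexp{
	regexp.MustCompile(`[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}`),
	regexp.MustCompile(`\b\d{3}[-.]?\d{3}[-.]?\d{4}\b`),
}

// redact returns a copy of b with every pattern match replaced by a placeholder.
func redact(b []byte) []byte {
	for _, re := range patterns {
		b = re.ReplaceAll(b, []byte("[REDACTED]"))
	}
	return b
}

// redactingWriter (hypothetical name) masks PII before forwarding to dst.
type redactingWriter struct {
	dst io.Writer
}

func (w redactingWriter) Write(p []byte) (int, error) {
	if _, err := w.dst.Write(redact(p)); err != nil {
		return 0, err
	}
	// Report the original length so callers do not see a short write.
	return len(p), nil
}

func main() {
	logger := log.New(redactingWriter{dst: os.Stderr}, "", 0)
	logger.Println("login failed for jane.doe@example.com, callback 555-123-4567")
}
```

Passing the same writer to `log.SetOutput` would route the standard logger through it. This works because the `log` package emits each entry in a single `Write` call; a writer fed arbitrary chunks could see a match split across two writes and miss it.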

Data Stream Interception

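Assuming a microservice architecture with REST APIs, we implemented middleware that wraps HTTP handlers. The middleware shown here reads and sanitizes the request body before the wrapped handler sees it; responses can be inspected in the same way by wrapping the http.ResponseWriter.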

package main

import (
    "io"
    "log"
    "net/http"
    "regexp"
    "strings"
)

// PII regex patterns (e.g., emails, phone numbers)
var piiRegex = []*regexp.Regexp{
    regexp.MustCompile(`[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}`), // email
    regexp.MustCompile(`\b\d{3}[-.]?\d{3}[-.]?\d{4}\b`), // phone
}

// Middleware to sanitize PII in request bodies
func piiSanitizer(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        bodyBytes, err := io.ReadAll(r.Body)
        r.Body.Close()
        if err != nil {
            log.Println("Error reading request body:", err)
            http.Error(w, "Invalid request", http.StatusBadRequest)
            return
        }
        // Sanitize data and hand the cleaned body to the next handler
        sanitizedBody := sanitizePII(string(bodyBytes))
        r.Body = io.NopCloser(strings.NewReader(sanitizedBody))
        // Masking changes the body size, so keep ContentLength in sync
        r.ContentLength = int64(len(sanitizedBody))
        next.ServeHTTP(w, r)
    })
}

// Function to mask PII
func sanitizePII(data string) string {
    for _, regex := range piiRegex {
        data = regex.ReplaceAllStringFunc(data, func(matched string) string {
            return maskPII(matched)
        })
    }
    return data
}

// Mask function
func maskPII(matched string) string {
    return "[REDACTED]"
}
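The middleware above only rewrites the request body. One way to cover responses as well is to buffer what the wrapped handler writes and mask it before it reaches the client. This is a sketch under assumptions: `bufferedWriter`, `sanitizeResponses`, and `demo` are hypothetical names, only the email pattern is applied, and the approach suits small text payloads rather than streamed or compressed bodies.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"net/http/httptest"
	"regexp"
	"strconv"
)

var emailRe = regexp.MustCompile(`[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}`)

// bufferedWriter (hypothetical) captures the response instead of sending it.
type bufferedWriter struct {
	http.ResponseWriter
	buf    bytes.Buffer
	status int
}

func (b *bufferedWriter) WriteHeader(code int)        { b.status = code }
func (b *bufferedWriter) Write(p []byte) (int, error) { return b.buf.Write(p) }

// sanitizeResponses buffers the downstream response, masks matches, and then
// sends the result with a corrected Content-Length.
func sanitizeResponses(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		bw := &bufferedWriter{ResponseWriter: w, status: http.StatusOK}
		next.ServeHTTP(bw, r)
		clean := emailRe.ReplaceAll(bw.buf.Bytes(), []byte("[REDACTED]"))
		w.Header().Set("Content-Length", strconv.Itoa(len(clean)))
		w.WriteHeader(bw.status)
		w.Write(clean)
	})
}

// demo runs a sample handler through the middleware with httptest and
// returns the body the client would receive.
func demo() string {
	h := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, `{"email":"jane.doe@example.com"}`)
	})
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/user", nil)
	sanitizeResponses(h).ServeHTTP(rec, req)
	return rec.Body.String()
}

func main() {
	fmt.Println(demo()) // prints {"email":"[REDACTED]"}
}
```

Buffering trades memory and latency for the ability to rewrite the body, so it is worth limiting this wrapper to routes known to return modest JSON or text responses.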

Offline Data Sanitization

For datasets loaded into test environments, we developed a Go CLI tool that scans CSV, JSON, or database dumps and applies the same pattern-based masking. The version shown here handles CSV input.

package main

import (
    "encoding/csv"
    "flag"
    "log"
    "os"
    "regexp"
)

var piiPatterns = []*regexp.Regexp{
    regexp.MustCompile(`[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}`),
    regexp.MustCompile(`\b\d{3}[-.]?\d{3}[-.]?\d{4}\b`),
}

func main() {
    inputFile := flag.String("input", "", "Input dataset file")
    outputFile := flag.String("output", "", "Output sanitized dataset")
    flag.Parse()

    if *inputFile == "" || *outputFile == "" {
        log.Fatal("Please specify input and output files")
    }

    file, err := os.Open(*inputFile)
    if err != nil {
        log.Fatalf("Error opening file: %v", err)
    }
    defer file.Close()

    reader := csv.NewReader(file)
    records, err := reader.ReadAll()
    if err != nil {
        log.Fatalf("Error reading dataset: %v", err)
    }

    for i, record := range records {
        for j, field := range record {
            records[i][j] = sanitizeField(field)
        }
    }

    out, err := os.Create(*outputFile)
    if err != nil {
        log.Fatalf("Error creating output file: %v", err)
    }
    defer out.Close()

    writer := csv.NewWriter(out)
    writer.WriteAll(records)
    if err := writer.Error(); err != nil {
        log.Fatalf("Error writing sanitized dataset: %v", err)
    }
}

func sanitizeField(field string) string {
    for _, regex := range piiPatterns {
        field = regex.ReplaceAllStringFunc(field, func(matched string) string {
            return "[REDACTED]"
        })
    }
    return field
}
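For JSON datasets, one possible approach is to decode the document into generic values, mask every string, and re-encode it. The sketch below assumes Go 1.18 or later (for `any`); `sanitizeValue`, `sanitizeJSON`, and `maskString` are hypothetical helper names, not part of the CLI shown above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"regexp"
)

var piiPatterns = []*regexp.Regexp{
	regexp.MustCompile(`[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}`),
	regexp.MustCompile(`\b\d{3}[-.]?\d{3}[-.]?\d{4}\b`),
}

// maskString replaces every pattern match in s with a placeholder.
func maskString(s string) string {
	for _, re := range piiPatterns {
		s = re.ReplaceAllString(s, "[REDACTED]")
	}
	return s
}

// sanitizeValue walks a decoded JSON value and masks every string value.
// Object keys, numbers, booleans, and null are left untouched.
func sanitizeValue(v any) any {
	switch t := v.(type) {
	case string:
		return maskString(t)
	case []any:
		for i := range t {
			t[i] = sanitizeValue(t[i])
		}
		return t
	case map[string]any:
		for k, val := range t {
			t[k] = sanitizeValue(val)
		}
		return t
	default:
		return v
	}
}

// sanitizeJSON decodes a document, masks its strings, and re-encodes it.
func sanitizeJSON(in []byte) ([]byte, error) {
	var doc any
	if err := json.Unmarshal(in, &doc); err != nil {
		return nil, err
	}
	return json.Marshal(sanitizeValue(doc))
}

func main() {
	in := []byte(`{"name":"Jane","contacts":["jane@example.com","555-123-4567"],"age":30}`)
	out, err := sanitizeJSON(in)
	if err != nil {
		log.Fatalf("Error sanitizing JSON: %v", err)
	}
	fmt.Println(string(out))
}
```

Two limitations are worth keeping in mind: phone numbers stored as JSON numbers are not strings and pass through unmasked, and re-encoding sorts object keys, so the output is semantically equal but not byte-identical to the input.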

Performance and Compliance

Go’s concurrency model helps the sanitization process scale: net/http handles requests concurrently in separate goroutines, and compiled regular expressions are safe for concurrent use, so the middleware needs no extra locking under the high-throughput scenarios typical of enterprise environments. By combining real-time data filtering with offline sanitization tools, organizations can significantly reduce the risk of PII leaks.
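The offline tool shown earlier processes rows sequentially. As a rough illustration of how the same work could be spread across goroutines, the sketch below uses a small worker pool; `sanitizeRecords` and the worker count are assumptions for this example, and whether it is actually faster depends on dataset size, so it should be measured before adoption.

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
)

var piiPatterns = []*regexp.Regexp{
	regexp.MustCompile(`[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}`),
	regexp.MustCompile(`\b\d{3}[-.]?\d{3}[-.]?\d{4}\b`),
}

// sanitizeField masks every pattern match in a single field.
func sanitizeField(field string) string {
	for _, re := range piiPatterns {
		field = re.ReplaceAllString(field, "[REDACTED]")
	}
	return field
}

// sanitizeRecords masks all fields in place, distributing rows across workers.
// Each row index is received by exactly one worker, so no two goroutines
// write to the same row.
func sanitizeRecords(records [][]string, workers int) {
	if workers < 1 {
		workers = 1
	}
	rows := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range rows {
				for j, field := range records[i] {
					records[i][j] = sanitizeField(field)
				}
			}
		}()
	}
	for i := range records {
		rows <- i
	}
	close(rows)
	wg.Wait()
}

func main() {
	records := [][]string{
		{"id", "email", "phone"},
		{"1", "jane@example.com", "555.123.4567"},
		{"2", "sam@example.org", "5559876543"},
	}
	sanitizeRecords(records, 4)
	for _, row := range records {
		fmt.Println(row)
	}
}
```

Sending one row index per channel operation keeps the sketch simple; for very large files, handing out batches of rows would likely reduce channel overhead.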

Conclusion

A proactive, code-embedded approach to PII masking using Go supports compliance efforts and strengthens trust and data security. Pattern-based masking only catches the formats it has been told about, so the pattern list needs to be maintained as new data types appear. With modest performance overhead, this method lets QA teams use test environments with far less risk of sensitive data leaks.


This strategy exemplifies how leveraging Go's strengths in concurrent processing and simplicity can deliver robust, enterprise-grade data protection solutions.


