DEV Community

KILLALLSKYWALKER
KILLALLSKYWALKER

Posted on

The Phantom Records: Hiding Sensitive Data for Debugging

When I was debugging a patch for our job portal, I found out the only way to be sure it worked was to run it against production like data.At that time our fake seeders also not solid and don't have kind of data that messy and edge cased that real user can create .

But I also couldn’t just pull raw production into my laptop. That database was full of personal information. If any of it slipped into logs or screenshots, it would be a really big problem .

Thanks PostgreSQL Anonymizer , you make my life easier at that time and till today :)

What is PostgreSQL Anonyimzer

An extension to mask or replace personally identifiable information (PII) or commercially sensitive data from a Postgres database by using declarative approach of anonymization .

You just need to ensure the masking rules implemented directly inside the database schema.

Masking Method

Anonymous Dumps : Simply export the masked data into an SQL file
Static Masking : Remove the PII according to the rules
Dynamic Masking : Hide PII only for the masked users
Masking Views : Build dedicated views for the masked users
Masking Data Wrappers : Apply masking rules on external data

You can read more about this in the documentation PostgreSQL Anonymizer .

Demo

First run this postgresSQL with this . For demo we just use this image to make thing simple . But if in real environment you can follow the step in the doc based on how you host your postgreSQL

docker run -d -e POSTGRES_PASSWORD=password -p 6543:5432 registry.gitlab.com/dalibo/postgresql_anonymizer
Enter fullscreen mode Exit fullscreen mode

Once the container already up , run this to create a table and activate the extension

CREATE TABLE public.candidates (
    id BIGSERIAL PRIMARY KEY,
    first_name VARCHAR(100) NOT NULL,
    last_name VARCHAR(100) NOT NULL,
    email VARCHAR(150) UNIQUE NOT NULL,
    phone_number VARCHAR(20),
    date_of_birth DATE,
    national_id VARCHAR(50),   
    address TEXT,
    city VARCHAR(100),
    state VARCHAR(100),
    postal_code VARCHAR(20),
    country VARCHAR(100),
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW()
);
Enter fullscreen mode Exit fullscreen mode

Then insert some data that containing PII

INSERT INTO public.candidates 
(first_name, last_name, email, phone_number, date_of_birth, national_id, address, city, state, postal_code, country) 
VALUES
('Aisha', 'Rahman', 'aisha.rahman@example.com', '+60123456789', '1990-04-15', '900415-10-1234', '12 Jalan Bukit Bintang', 'Kuala Lumpur', 'Wilayah Persekutuan', '55100', 'Malaysia'),
('John', 'Tan', 'john.tan@example.com', '+6598765432', '1985-11-22', '851122-08-5678', '55 Orchard Road', 'Singapore', 'Singapore', '238880', 'Singapore'),
('Mei', 'Chen', 'mei.chen@example.com', '+60179876543', '1993-06-09', '930609-14-1122', '8 Taman Sutera', 'Johor Bahru', 'Johor', '80250', 'Malaysia'),
('Arjun', 'Patel', 'arjun.patel@example.com', '+919812345678', '1992-02-28', 'AADHAR1234567890', '22 MG Road', 'Bengaluru', 'Karnataka', '560001', 'India'),
('Sarah', 'Lim', 'sarah.lim@example.com', '+60123459876', '1995-09-05', '950905-05-3344', '19 Jalan Gasing', 'Petaling Jaya', 'Selangor', '46000', 'Malaysia');
Enter fullscreen mode Exit fullscreen mode

Okay now you have the data to test . Now lets ensure the extension is there and enable it and create user for anonymize .

ALTER DATABASE postgres SET session_preload_libraries TO 'anon';

CREATE EXTENSION IF NOT EXISTS anon;

SELECT anon.init();

CREATE ROLE anonymize_user LOGIN PASSWORD 'password';
ALTER ROLE anonymize_user SET anon.transparent_dynamic_masking = True;
SECURITY LABEL FOR anon ON ROLE anonymize_user IS 'MASKED';

Enter fullscreen mode Exit fullscreen mode

Now you can set the rule for masking , i just add one for sample , you can add depending on what you want to mask .

SECURITY LABEL FOR anon ON COLUMN candidates.last_name IS 'MASKED WITH FUNCTION anon.dummy_last_name()';
Enter fullscreen mode Exit fullscreen mode

Now the final step just use the pg_dump

pg_dump postgres --user anonymize_user --no-security-labels --exclude-extension="anon" --file=postgres_anonymized.sql
Enter fullscreen mode Exit fullscreen mode

Now you will get the dump with mask data based on your rules . On top of that if you using anonymize_user , you will see the masking data instead the real data . There's a lot of thing and the way you can use this extension . Give it a try . I will write another usage in my previous work for this in next article .

Top comments (0)