Privacy Policies Are Designed to Be Unread. I Built a Tool to Make Them Honest.

#privacy #python #ethics #opensource

Nobody reads privacy policies. That is the point.

The average privacy policy runs about 4,000 words and takes about 18 minutes to read. Researchers estimate that an American who actually read the policy for every service they use would spend 76 work days a year doing it.

Companies know this. The length is not accidental. The legal jargon is not accidental. The buried clauses about selling your data to third parties, about using your content to train AI models, about sharing your location data with "partners" — none of it is accidental.

You click "I agree" because you have no real choice. The alternative is not using the service.

But what if you could get the plain-English version in ten seconds?

I built consentmap to do exactly that.

What consentmap does

consentmap fetches a website's privacy policy, runs it through a rule-based pattern matcher, and outputs a structured plain-English summary. No LLM required. The only network traffic is the HTTP request that fetches the policy; the policy text itself never leaves your machine.

consentmap scan spotify.com

Output:

CONSENTMAP REPORT — spotify.com
================================
RISK SCORE: CRITICAL (230)

Data Collection
  [CRITICAL] Location data collected
  [CRITICAL] Device identifiers collected
  [WARNING]  Usage analytics collected

Data Sharing
  [CRITICAL] Data sold to third parties
  [CRITICAL] Advertising partners mentioned
  [WARNING]  Data broker relationships

AI Training
  [CRITICAL] Content used to train AI models
  [WARNING]  Vague "improve our services" language

Your Rights
  [OK] Opt-out available
  [OK] Deletion rights mentioned
  [OK] GDPR rights acknowledged

Run 'consentmap scan spotify.com --json' for machine-readable output.

The scoring system

Risk is scored numerically. Each finding adds or subtracts points, and the total maps to one of four levels:

Level	Score range
LOW	0–30
MEDIUM	31–70
HIGH	71–120
CRITICAL	121+

Critical findings (data selling, AI training, biometric collection) add 40 points each. Warnings add 20. User rights — deletion rights, opt-out mechanisms, data portability — reduce the score by 10 each. A company that collects a lot but gives you real control scores lower than one that collects the same amount and makes opt-out impossible.
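The arithmetic is simple enough to sketch in a few lines of Python. This is an illustration of the rules described above, not consentmap's actual source: the `Finding` class, the function names, and the choice to clamp the total at zero are all assumptions made for the example.

```python
from dataclasses import dataclass

# Point values and level boundaries taken from the description above.
POINTS = {"critical": 40, "warning": 20, "right": -10}
LEVELS = [(121, "CRITICAL"), (71, "HIGH"), (31, "MEDIUM"), (0, "LOW")]


@dataclass
class Finding:  # hypothetical container, not consentmap's real type
    label: str
    severity: str  # "critical", "warning", or "right"


def risk_score(findings):
    """Sum the points for every finding.

    Clamping at zero is an assumption; the text only says LOW starts at 0.
    """
    total = sum(POINTS[f.severity] for f in findings)
    return max(total, 0)


def risk_level(score):
    """Map a numeric score to the first level whose threshold it reaches."""
    for threshold, name in LEVELS:
        if score >= threshold:
            return name
    return "LOW"


# A made-up policy: two critical findings, one warning, two user rights.
findings = [
    Finding("Data sold to third parties", "critical"),
    Finding("Content used to train AI models", "critical"),
    Finding("Usage analytics collected", "warning"),
    Finding("Opt-out available", "right"),
    Finding("Deletion rights mentioned", "right"),
]
score = risk_score(findings)
print(score, risk_level(score))  # 80 HIGH
```

Here the two rights pull the total from 100 down to 80. Both values fall in the HIGH band, so it takes more than a couple of rights to move a policy to a lower level.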

The compare feature

The most interesting use is side-by-side comparison:

consentmap compare spotify.com netflix.com

This produces a table showing both companies across every category. You can see at a glance which one sells data and which one does not, which one trains AI on your content and which one does not.
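To give a feel for what such a table involves, here is a rough Python sketch that lines up two sets of findings by category. The domains, the categories, and the `compare_table` and `mark` helpers are invented for illustration; consentmap's real output format may differ.

```python
def mark(value):
    """Render a finding as yes/no, or '?' when the category was not checked."""
    if value is None:
        return "?"
    return "yes" if value else "no"


def compare_table(name_a, flags_a, name_b, flags_b):
    """Build a plain-text table with one row per category found in either scan."""
    categories = sorted(set(flags_a) | set(flags_b))
    first = max(len(c) for c in categories + ["Category"])
    second = max(len(name_a), 3)
    rows = [("Category", name_a, name_b)]
    for cat in categories:
        rows.append((cat, mark(flags_a.get(cat)), mark(flags_b.get(cat))))
    return "\n".join(f"{cat:<{first}}  {a:<{second}}  {b}" for cat, a, b in rows)


# Hypothetical scan results for two made-up services.
service_a = {"Sells data": True, "AI training": False}
service_b = {"Sells data": False, "AI training": True, "Biometric collection": False}

table = compare_table("service-a.example", service_a, "service-b.example", service_b)
print(table)
```

Taking the union of categories matters: a practice that shows up in only one policy still gets a row, with a "?" for the side where nothing was recorded.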

I ran this on a dozen streaming services. The differences are larger than you would expect between companies that feel similar.

How it works: pattern matching, not AI

consentmap does not use an LLM to interpret policies. It uses curated regex patterns for specific phrases that reliably signal data practices:

"we sell" / "we may sell" / "sale of personal information" — data selling
"train" / "fine-tune" / "machine learning model" combined with "your content" — AI training
"biometric" / "face recognition" / "fingerprint" — biometric collection
"right to delete" / "right to erasure" / "opt out" — user rights

This approach has a clear limitation: it can miss novel phrasing. A lawyer who writes "we may leverage your inputs to enhance our intelligent systems" instead of "we use your data to train AI" would evade detection. The README documents this honestly.

But for the major companies with standard legal templates, pattern matching catches the important clauses reliably — and it does so without sending your policy text to any third-party API.
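A stripped-down version of this idea looks something like the following. The regexes are simplified stand-ins written for this post, not the curated patterns the tool ships with, and requiring "your content" to appear in the same sentence as a training term is one assumed reading of "combined with".

```python
import re

# Simplified, illustrative patterns; the real curated set is larger.
PATTERNS = {
    "data_selling": re.compile(
        r"\bwe (?:may )?sell\b|\bsale of personal information\b", re.I
    ),
    "biometric": re.compile(r"\bbiometric|\bface recognition\b|\bfingerprint", re.I),
    "user_rights": re.compile(r"\bright to (?:delete|erasure)\b|\bopt[- ]out\b", re.I),
}

# AI training needs a training term AND "your content" in the same sentence.
AI_TERMS = re.compile(
    r"\btrain(?:s|ed|ing)?\b|\bfine-tun(?:e|ing)\b|\bmachine learning models?\b", re.I
)
YOUR_CONTENT = re.compile(r"\byour content\b", re.I)


def scan(text):
    """Return the sorted names of every practice whose pattern matches."""
    found = set()
    # Naive sentence split on terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        for name, pattern in PATTERNS.items():
            if pattern.search(sentence):
                found.add(name)
        if AI_TERMS.search(sentence) and YOUR_CONTENT.search(sentence):
            found.add("ai_training")
    return sorted(found)


standard = (
    "We may sell personal information to partners. "
    "We use your content to train our models. "
    "You have the right to delete your account."
)
evasive = "We may leverage your inputs to enhance our intelligent systems."

print(scan(standard))  # ['ai_training', 'data_selling', 'user_rights']
print(scan(evasive))   # []
```

The second call shows the limitation in action: the evasive sentence describes the same practice but contains none of the trigger phrases, so nothing is flagged.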

V1 limitation

consentmap does not execute JavaScript. Some companies render their privacy policies entirely via JS — in those cases, you may get incomplete results. The tool will tell you when this happens. A future version will add a headless browser option for JS-heavy policies.
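One plausible way to detect this case, sketched with only the standard library: strip scripts and styles, count the visible words that remain, and warn when the page is suspiciously empty. The 200-word threshold and the `looks_js_rendered` helper are assumptions for illustration; the check consentmap actually performs may work differently.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text nodes, ignoring anything inside script/style/noscript."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def looks_js_rendered(html, min_words=200):
    """Heuristic: a policy page with very little static text was probably
    rendered client-side. The threshold is an arbitrary assumption."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    words = sum(len(chunk.split()) for chunk in parser.chunks)
    return words < min_words


js_shell = '<html><body><div id="root"></div><script>render()</script></body></html>'
static_page = "<html><body><p>" + "word " * 300 + "</p></body></html>"

print(looks_js_rendered(js_shell))     # True
print(looks_js_rendered(static_page))  # False
```

A word count is crude, and a short but genuine policy would trigger a false warning, which is why it makes sense to report it as a warning and not as a failure.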

Install

pip install consentmap

# Scan a domain
consentmap scan spotify.com

# Export as JSON
consentmap scan netflix.com --json

# Compare two companies
consentmap compare spotify.com netflix.com

# Scan a specific URL
consentmap scan --url https://example.com/privacy-policy