Dark patterns manipulate users into actions they didn't intend — from hidden subscription traps to guilt-tripping opt-outs. What if you could automatically detect them? In this guide, we'll build a Python tool that scrapes websites and flags common UI/UX anti-patterns.
What Are Dark Patterns?
Dark patterns are deceptive design choices: confirmshaming ("No thanks, I don't want to save money"), hidden costs revealed at checkout, roach motels (easy to enter, impossible to leave), and forced continuity. The EU's Digital Services Act and California's CCPA (as amended by the CPRA, and enforced by the California Privacy Protection Agency) now regulate these practices.
Architecture Overview
Our detector will:
- Crawl target pages and extract DOM structure
- Analyze text for manipulation patterns (confirmshaming, urgency)
- Detect visual tricks (hidden checkboxes, tiny unsubscribe links)
- Generate a dark pattern audit report
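Before writing the detectors, it helps to fix the shape of a finding. The detectors below emit plain dicts; this `TypedDict` is just my own convention for documenting the keys they share, not something from any library:

```python
from typing import TypedDict

class Finding(TypedDict, total=False):
    """Shared shape every detector emits (a hypothetical schema)."""
    type: str      # e.g. "confirmshaming", "urgency_scarcity"
    text: str      # the offending on-page text
    element: str   # truncated HTML of the matched node
    severity: str  # "low" | "medium" | "high"

f: Finding = {"type": "confirmshaming", "text": "No, I don't want to save money", "severity": "high"}
print(f["type"])
```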
Setting Up the Scraper
```python
import requests
from bs4 import BeautifulSoup
import re

API_KEY = "YOUR_SCRAPERAPI_KEY"

def scrape_page(url):
    payload = {
        "api_key": API_KEY,
        "url": url,
        "render": "true",  # JavaScript rendering for dynamic UIs
    }
    response = requests.get(
        "https://api.scraperapi.com", params=payload, timeout=60
    )
    response.raise_for_status()  # fail fast on blocked or broken requests
    return BeautifulSoup(response.text, "html.parser")
```
ScraperAPI handles the JavaScript rendering for us, which matters because dark patterns often live in dynamically injected UI elements such as modals, popups, and countdown banners.
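If you'd rather not route through ScraperAPI, a direct `requests` fetch is a reasonable fallback for static pages; it won't execute JavaScript, so dynamically injected patterns can slip through. (`scrape_page_basic` and `parse_html` are my own helpers, not part of ScraperAPI.)

```python
import requests
from bs4 import BeautifulSoup

def parse_html(html: str) -> BeautifulSoup:
    """Parsing is split out so detectors can be tested on canned HTML."""
    return BeautifulSoup(html, "html.parser")

def scrape_page_basic(url: str) -> BeautifulSoup:
    """Direct fetch: no JS rendering, so dynamic UIs may be incomplete."""
    response = requests.get(
        url,
        headers={"User-Agent": "Mozilla/5.0 (DarkPatternAudit/0.1)"},
        timeout=30,
    )
    response.raise_for_status()
    return parse_html(response.text)
```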
Detecting Confirmshaming
Confirmshaming uses guilt to push users away from opting out:
```python
SHAME_PATTERNS = [
    r"no\s*,?\s*i\s+don'?t\s+want",
    r"i('?m|\s+am)\s+(not interested|fine without)",
    r"no\s+thanks,?\s+i('?d|\s+would)\s+rather",
    r"i\s+prefer\s+not\s+to",
    r"i\s+don'?t\s+(need|like|care)",
]

def detect_confirmshaming(soup):
    findings = []
    for link in soup.find_all(["a", "button", "span"]):
        text = link.get_text(strip=True).lower()
        for pattern in SHAME_PATTERNS:
            if re.search(pattern, text):
                findings.append({
                    "type": "confirmshaming",
                    "text": link.get_text(strip=True),
                    "element": str(link)[:200],
                    "severity": "high",
                })
                break  # one finding per element, even if several patterns match
    return findings
```
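To see the detector fire without scraping anything, run it over a canned modal. This is a condensed, self-contained version of the code above (only two of the patterns), so the snippet runs on its own:

```python
import re
from bs4 import BeautifulSoup

SHAME_PATTERNS = [
    r"no\s*,?\s*i\s+don'?t\s+want",
    r"i\s+don'?t\s+(need|like|care)",
]

def detect_confirmshaming(soup):
    findings = []
    for el in soup.find_all(["a", "button", "span"]):
        text = el.get_text(strip=True).lower()
        if any(re.search(p, text) for p in SHAME_PATTERNS):
            findings.append({"type": "confirmshaming", "text": el.get_text(strip=True)})
    return findings

html = """
<div class="modal">
  <button id="accept">Yes, sign me up!</button>
  <a href="#" class="dismiss">No, I don't want to save money</a>
</div>
"""
findings = detect_confirmshaming(BeautifulSoup(html, "html.parser"))
print(findings)  # flags the dismiss link, not the button
```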
Detecting Hidden Pre-Checked Boxes
Pre-checked checkboxes for newsletters or terms are a classic trick:
```python
def detect_prechecked_boxes(soup):
    findings = []
    for checkbox in soup.find_all("input", {"type": "checkbox"}):
        # A bare `checked` attribute parses as "", so test for presence
        if not checkbox.has_attr("checked"):
            continue
        label = ""
        label_el = soup.find("label", {"for": checkbox.get("id", "")})
        if label_el:
            label = label_el.get_text(strip=True)
        if any(kw in label.lower() for kw in
               ["newsletter", "marketing", "partner", "third party", "agree"]):
            findings.append({
                "type": "pre_checked_consent",
                "label": label,
                "severity": "high",
            })
    return findings
```
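A quick sanity check on a sample signup form, with the detector repeated here so the snippet runs standalone. An unchecked box should pass; pre-checked consent boxes should be flagged:

```python
from bs4 import BeautifulSoup

CONSENT_KEYWORDS = ["newsletter", "marketing", "partner", "third party", "agree"]

def detect_prechecked_boxes(soup):
    findings = []
    for checkbox in soup.find_all("input", {"type": "checkbox"}):
        if not checkbox.has_attr("checked"):
            continue
        label_el = soup.find("label", {"for": checkbox.get("id", "")})
        label = label_el.get_text(strip=True) if label_el else ""
        if any(kw in label.lower() for kw in CONSENT_KEYWORDS):
            findings.append({"type": "pre_checked_consent", "label": label})
    return findings

html = """
<form>
  <input type="checkbox" id="tos" checked><label for="tos">I agree to the terms</label>
  <input type="checkbox" id="news" checked>
  <label for="news">Subscribe to our partner newsletter</label>
  <input type="checkbox" id="gift"><label for="gift">Gift wrap my order</label>
</form>
"""
findings = detect_prechecked_boxes(BeautifulSoup(html, "html.parser"))
print([f["label"] for f in findings])  # the two pre-checked consent boxes
```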
Urgency and Scarcity Detection
Fake urgency ("Only 2 left!") pressures users into impulse decisions:
```python
URGENCY_PATTERNS = [
    r"only\s+\d+\s+(left|remaining|available)",
    r"(offer|deal|sale)\s+(ends|expires)\s+(soon|today|in\s+\d+)",
    r"\d+\s+people\s+(are\s+)?(viewing|watching|looking)",
    r"limited\s+(time|stock|availability)",
    r"act\s+(now|fast|quickly)",
    r"don'?t\s+miss\s+(out|this)",
]

def detect_urgency(soup):
    findings = []
    # Separator stops words from adjacent elements fusing into false matches
    page_text = soup.get_text(" ", strip=True)
    for pattern in URGENCY_PATTERNS:
        for match in re.finditer(pattern, page_text, re.IGNORECASE):
            findings.append({
                "type": "urgency_scarcity",
                "text": match.group(),
                "severity": "medium",
            })
    return findings
```
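You can sanity-check the patterns without a live site by running them over canned page text. This is a trimmed-down, text-only variant of `detect_urgency` using three of the patterns above:

```python
import re

URGENCY_PATTERNS = [
    r"only\s+\d+\s+(left|remaining|available)",
    r"\d+\s+people\s+(are\s+)?(viewing|watching|looking)",
    r"act\s+(now|fast|quickly)",
]

def detect_urgency_text(text):
    findings = []
    for pattern in URGENCY_PATTERNS:
        for match in re.finditer(pattern, text, re.IGNORECASE):
            findings.append({"type": "urgency_scarcity", "text": match.group()})
    return findings

page_text = "Hurry! Only 2 left in stock. 14 people are viewing this item. Act now!"
findings = detect_urgency_text(page_text)
print([f["text"] for f in findings])  # ['Only 2 left', '14 people are viewing', 'Act now']
```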
Running the Full Audit
```python
def audit_dark_patterns(url):
    soup = scrape_page(url)
    results = {
        "url": url,
        "confirmshaming": detect_confirmshaming(soup),
        "prechecked_boxes": detect_prechecked_boxes(soup),
        "urgency_scarcity": detect_urgency(soup),
    }
    total = sum(len(v) for v in results.values() if isinstance(v, list))
    results["total_findings"] = total
    results["risk_level"] = (
        "high" if total > 5 else "medium" if total > 2 else "low"
    )
    return results

# Audit multiple e-commerce sites
targets = [
    "https://example-store.com/checkout",
    "https://example-saas.com/pricing",
]
for target in targets:
    report = audit_dark_patterns(target)
    print(f"{report['url']}: {report['total_findings']} findings ({report['risk_level']})")
```
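For an audit trail, each report can also be written to disk as JSON with a timestamp. This is a minimal sketch; the filename and the `audited_at` field are my own conventions:

```python
import json
from datetime import datetime, timezone

def save_report(report: dict, path: str) -> None:
    """Write one audit report to disk with a UTC timestamp added."""
    stamped = {**report, "audited_at": datetime.now(timezone.utc).isoformat()}
    with open(path, "w", encoding="utf-8") as f:
        json.dump(stamped, f, indent=2)

report = {"url": "https://example-store.com/checkout",
          "total_findings": 4, "risk_level": "medium"}
save_report(report, "audit_example-store.json")
```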
Scaling with Proxy Rotation
When auditing many sites, rotate proxies to avoid blocks. ThorData provides residential proxies with geo-targeting — useful when dark patterns vary by region. ScrapeOps offers a proxy aggregator that auto-rotates between providers.
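Provider APIs differ, but the core idea is the same regardless of vendor: cycle through a pool of proxy endpoints using requests' standard `proxies` parameter. The proxy URLs below are placeholders, and `fetch_via_proxy` is my own helper, not any provider's SDK:

```python
import itertools
import requests

# Placeholder endpoints; substitute your provider's gateway URLs
PROXIES = [
    "http://user:pass@proxy1.example.com:8000",
    "http://user:pass@proxy2.example.com:8000",
]
_pool = itertools.cycle(PROXIES)

def fetch_via_proxy(url: str) -> requests.Response:
    """Route each request through the next proxy in the rotation."""
    proxy = next(_pool)
    return requests.get(
        url, proxies={"http": proxy, "https": proxy}, timeout=60
    )

rotation = [next(_pool) for _ in range(3)]  # wraps back to the first proxy
```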
Extending the Detector
Consider adding:
- Cookie consent analysis: Detect reject buttons that are harder to find than accept
- Price comparison: Scrape the same product from different sessions to detect dynamic pricing
- Accessibility dark patterns: Hidden elements only visible to screen readers
- Screenshot comparison: Use Playwright to capture visual hierarchy differences
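As a starting point for the first idea, one crude heuristic is to compare how prominent the accept and reject controls are, e.g. whether "reject" is demoted to a plain link while "accept" gets a button. This is my own sketch, a prominence proxy rather than a legal test:

```python
from bs4 import BeautifulSoup

def consent_asymmetry(soup):
    """Flag banners where 'accept' is a button but 'reject' is a mere link."""
    accept = soup.find(lambda t: t.name in ("a", "button")
                       and "accept" in t.get_text(strip=True).lower())
    reject = soup.find(lambda t: t.name in ("a", "button")
                       and "reject" in t.get_text(strip=True).lower())
    if accept is None or reject is None:
        return None  # no obvious consent controls found
    if accept.name == "button" and reject.name == "a":
        return {"type": "consent_asymmetry", "severity": "medium"}
    return None

html = '<div><button>Accept all</button><a href="#settings">Reject non-essential</a></div>'
finding = consent_asymmetry(BeautifulSoup(html, "html.parser"))
print(finding)
```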
Legal and Ethical Notes
Dark pattern detection is a legitimate audit tool. Many compliance teams use similar techniques to ensure their own sites meet DSA and CPPA requirements. Always respect robots.txt and terms of service.
Dark patterns cost consumers billions annually. Building detection tools isn't just technically interesting — it contributes to a more transparent web. The techniques here can be adapted for compliance auditing, competitive analysis, or consumer advocacy.
Happy scraping!