I wanted a practical answer to one question:
How do we measure web tracking signals in a way that is reproducible, explainable, and non-invasive?
This post walks through the approach, what we built, and what we learned from a 10-site batch run.
## TL;DR
FlowLens-Web is a TypeScript CLI that:
- records browser sessions with Playwright + HAR,
- extracts identifier-like request signals,
- scores evidence levels (L1-L5),
- reports cross-domain reuse and cross-run persistence,
- outputs Markdown + Mermaid summaries.
It is a research/measurement tool, not a blocker.
## Architecture
Core stack:
- Node.js + TypeScript
- Playwright (Chromium)
- tldts (eTLD+1 classification)
- SHA-256 hashing for safe identifier matching
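The eTLD+1 grouping is what decides whether a request counts as first- or third-party. A minimal sketch of that classification — note this uses a naive "last two labels" guess as an illustrative stand-in for tldts, which resolves real public suffixes correctly:

```typescript
// Sketch: first- vs third-party classification relative to the page origin.
// FlowLens uses tldts for real eTLD+1 resolution; the naive guess below
// (last two hostname labels) is an illustrative stand-in and is wrong for
// multi-part public suffixes such as co.uk.
function naiveRegistrableDomain(hostname: string): string {
  return hostname.split(".").slice(-2).join(".");
}

function isThirdParty(pageUrl: string, requestUrl: string): boolean {
  const pageDomain = naiveRegistrableDomain(new URL(pageUrl).hostname);
  const reqDomain = naiveRegistrableDomain(new URL(requestUrl).hostname);
  return pageDomain !== reqDomain;
}
```

This is exactly the kind of edge case that justifies pulling in tldts: `shop.example.co.uk` and `ads.example.co.uk` share an eTLD+1 only if you know `co.uk` is a public suffix.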
Pipeline:
- run scripted browsing scenario
- save HAR
- parse entries + normalize request metadata
- extract candidate identifier fields
- compute reuse/persistence signals
- assign evidence levels
- generate reports (case, matrix, A/B, funnel, longitudinal)
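As a rough sketch of the parse/normalize step — the HAR field names follow the HAR 1.2 format (`log.entries[].request`), but the `NormalizedRequest` shape here is a simplification, not the actual FlowLens type:

```typescript
// Sketch: pull the minimal request metadata the later analysis stages need
// out of a HAR log. HAR 1.2 stores each captured request under
// log.entries[].request.
interface HarEntry {
  request: {
    url: string;
    queryString: { name: string; value: string }[];
  };
}

interface NormalizedRequest {
  hostname: string;
  path: string;
  params: Record<string, string>;
}

function normalizeEntries(har: { log: { entries: HarEntry[] } }): NormalizedRequest[] {
  return har.log.entries.map((e) => {
    const u = new URL(e.request.url);
    const params: Record<string, string> = {};
    for (const { name, value } of e.request.queryString) params[name] = value;
    return { hostname: u.hostname, path: u.pathname, params };
  });
}
```

Everything downstream (candidate extraction, reuse detection) works off these normalized records rather than the raw HAR.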
## Evidence Model
We use explicit confidence tiers:
- L1: third-party domain observed
- L2: identifier-like field observed
- L3: repeated within run
- L4: cross-domain hash reuse
- L5: cross-run persistence
This keeps interpretation honest: a higher level means stronger network-level evidence, not proof of the platform's internal ad-decision logic.
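The tiers can be read as a simple ladder over observed properties of a signal. A hypothetical sketch — the flag names are illustrative, not the real FlowLens fields:

```typescript
// Sketch: map observed properties of a candidate signal to an evidence tier.
// Each tier presumes the ones below it: a cross-run match (L5) is also an
// identifier-like observation that was repeated and reused.
interface SignalObservation {
  thirdParty: boolean;          // L1: seen on a third-party domain
  identifierLike: boolean;      // L2: field value looks like an identifier
  repeatedInRun: boolean;       // L3: same value repeated within one run
  crossDomainReuse: boolean;    // L4: same hash seen on multiple eTLD+1s
  crossRunPersistence: boolean; // L5: same hash across independent runs
}

function evidenceLevel(s: SignalObservation): number {
  if (s.crossRunPersistence) return 5;
  if (s.crossDomainReuse) return 4;
  if (s.repeatedInRun) return 3;
  if (s.identifierLike) return 2;
  if (s.thirdParty) return 1;
  return 0; // no tracking-relevant observation
}
```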
## CLI Workflows
### Matrix (multi-site)

```bash
npm run flowlens -- study-matrix \
  --sites https://www.google.com,https://www.youtube.com \
  --scenarios baseline,engaged,ad-click \
  --runs 3
```
### A/B (causal contrast)

```bash
npm run flowlens -- study-ab \
  --url https://www.youtube.com \
  --control baseline \
  --treatment ad-click \
  --runs 3
```
### Funnel (stage deltas)

```bash
npm run flowlens -- study-funnel \
  --url https://www.google.com \
  --query running+shoes \
  --runs 3
```
### Longitudinal (stability over samples)

```bash
npm run flowlens -- study-longitudinal \
  --url https://www.wikipedia.org \
  --samples 7 \
  --runs 1
```
## Full-Batch Findings (Current Run)
Batch design:
- 10 sites
- 3 scenarios
- target 3 runs/scenario
Outcome:
- 9/10 sites produced complete scenario outputs
- Amazon repeatedly failed under the runtime constraints of this environment (timeouts, session closure) and was recorded as an explicit failure case rather than silently dropped
Pattern-level observations:
- signal intensity varied strongly by site/scenario
- deeper interaction stages often increased observed signal metrics
- some content-centric cases remained low-signal across repeated runs
## Why the Redaction Layer Matters
Raw tokens are not published.
Instead, FlowLens stores:
- redacted preview
- token length
- stable hash for equality/reuse checks
That gives us reproducibility without leaking sensitive raw values.
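A minimal sketch of that redaction record, using Node's built-in crypto module (the field names here are illustrative):

```typescript
import { createHash } from "node:crypto";

// Sketch: keep enough about a token to compare it later without ever
// persisting the raw value. Hash equality implies reuse of the same token.
interface RedactedToken {
  preview: string; // first few characters only, for human-readable reports
  length: number;  // original token length
  sha256: string;  // stable hash used for reuse/persistence matching
}

function redact(raw: string, previewLen = 4): RedactedToken {
  return {
    preview: raw.slice(0, previewLen) + "…",
    length: raw.length,
    sha256: createHash("sha256").update(raw).digest("hex"),
  };
}
```

Two redacted records match exactly when their hashes match, which is all the cross-domain reuse and cross-run persistence checks need.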
## What You Can Claim Responsibly
From this tooling and dataset, you can claim:
- network-observed data-flow signals vary by context,
- controlled behavior changes can shift measured signals,
- reuse/persistence patterns are measurable in a repeatable way.
You cannot claim from network traces alone:
- definitive platform-internal ad decision logic,
- person-level identity resolution.
## Engineering Notes
What worked well:
- the modular analysis pipeline
- the evidence-level abstraction, which made findings easier to communicate
- the matrix, funnel, A/B, and longitudinal studies complementing each other
What remains hard:
- large-site reliability under fixed timeouts
- anti-bot/session constraints
- balancing coverage vs runtime cost
## Read the Full Materials
- Repository: https://github.com/yul761/FlowLens
- Full-batch summary: data/reports/published/formal-v1-full-overall-summary.md
- Academic-style article: data/reports/published/public-v1-academic-article.md
## If You Want to Build on This
Next useful extensions:
- stronger single-variable controls (consent, login, click-id toggles)
- bootstrap confidence intervals on key deltas
- cross-environment runs (device profile/region)
- publication-grade data manifests
## Closing
A lot of tracking debates are stuck between oversimplified claims and opaque internals.
A HAR-first, evidence-tier approach gives a practical middle path: measurable, repeatable, and honest about uncertainty.