Most lead scoring assumes you know who the visitor is. We wanted to score how likely an anonymous session is to convert without storing anything that identifies a person: no third-party cookies, no enrichment vendor, no PII. Here is how we did it for our analytics tool, including the parts that did not work.
## The goal
Given a live, anonymous session, output a number from 0 to 100, where a higher number means "pay attention to this one". The constraint: we never store who the person is. We score the behavior of a session ID that rotates, not a profile.
## The signals that actually correlate
We tested a lot of inputs against real conversion data. The ones that held up:
```ts
type SessionFeatures = {
  returnSessions: number; // distinct prior sessions; strongest signal by far
  viewedPricing: boolean;
  pricingThenReturned: boolean; // viewed pricing, left, came back
  docsDepth: number; // scroll + pages on docs/integrations
  pathShape: "browse" | "direct" | "compare";
};
```
The ones we dropped because they were noise: raw time on page (a backgrounded tab destroys it), single-session scroll depth on its own, and device type (much weaker than people claim).
## The scoring function
For low-volume sites we do not use a model. Weighted rules beat a model until you have enough data to train on:
```ts
function scoreSession(f: SessionFeatures): number {
  let score = 0;
  score += Math.min(f.returnSessions, 5) * 12; // return visits dominate
  if (f.viewedPricing) score += 10;
  if (f.pricingThenReturned) score += 20; // intent
  score += Math.min(f.docsDepth, 100) * 0.2;
  if (f.pathShape === "compare") score += 8;
  return Math.min(Math.round(score), 100);
}
```
It is deliberately boring. A heuristic you can explain beats a black box you cannot, especially when a customer asks "why is this session a 78".
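To make the "why is this session a 78" question concrete, here is a minimal sketch of a helper that applies the same weights as `scoreSession` but keeps each rule's contribution. `explainScore` and the `Contribution` type are hypothetical names for illustration, not part of the shipped product.

```ts
type SessionFeatures = {
  returnSessions: number;
  viewedPricing: boolean;
  pricingThenReturned: boolean;
  docsDepth: number;
  pathShape: "browse" | "direct" | "compare";
};

// Hypothetical type: one line of the explanation shown to a customer.
type Contribution = { rule: string; points: number };

// Hypothetical helper: same weights as scoreSession, but itemized.
function explainScore(f: SessionFeatures): { score: number; parts: Contribution[] } {
  const parts: Contribution[] = [
    { rule: "return sessions (capped at 5)", points: Math.min(f.returnSessions, 5) * 12 },
    { rule: "viewed pricing", points: f.viewedPricing ? 10 : 0 },
    { rule: "viewed pricing, then returned", points: f.pricingThenReturned ? 20 : 0 },
    { rule: "docs depth (capped at 100)", points: Math.min(f.docsDepth, 100) * 0.2 },
    { rule: "comparison path", points: f.pathShape === "compare" ? 8 : 0 },
  ];
  const raw = parts.reduce((sum, p) => sum + p.points, 0);
  return {
    score: Math.min(Math.round(raw), 100),
    parts: parts.filter((p) => p.points > 0), // only show rules that fired
  };
}

const example = explainScore({
  returnSessions: 3,
  viewedPricing: true,
  pricingThenReturned: true,
  docsDepth: 60,
  pathShape: "browse",
});
console.log(example.score); // 78 = 36 (returns) + 10 + 20 + 12 (docs)
console.log(example.parts);
```

Because every point comes from a named rule, the answer to the customer is just the list of parts.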
## Running it at the edge on the event stream
Scoring runs on each event as it arrives, so the score updates in near real time instead of in a nightly batch:
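```ts
export default {
  async fetch(req: Request, env: Env): Promise<Response> {
    // Each request carries one analytics event for a single session.
    const event = await req.json<AnalyticsEvent>();

    // Load the session's current features, fold in the new event, and rescore.
    const features = await loadSessionFeatures(env, event.sessionId);
    const updated = applyEvent(features, event);
    const score = scoreSession(updated);

    await env.SESSIONS.put(event.sessionId, JSON.stringify({ ...updated, score }), {
      expirationTtl: 60 * 30, // 30 minutes: sessions expire, nothing persists about a person
    });

    return new Response(JSON.stringify({ score }), {
      headers: { "Content-Type": "application/json" },
    });
  },
};
```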
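The handler calls `loadSessionFeatures` and `applyEvent` without showing them. The sketch below is one plausible shape for both, not the shipped code. The `AnalyticsEvent` fields, the `/pricing` and `/docs` path prefixes, the 10-point step per docs page, and the default features are all assumptions made for illustration; only `viewedPricing` and `docsDepth` are updated here, and the other features are left as stored.

```ts
type SessionFeatures = {
  returnSessions: number;
  viewedPricing: boolean;
  pricingThenReturned: boolean;
  docsDepth: number;
  pathShape: "browse" | "direct" | "compare";
};

// Assumed event shape; the post does not define AnalyticsEvent.
type AnalyticsEvent = { sessionId: string; type: "pageview" | "scroll"; path: string };

// Minimal stand-in for the key-value binding used in the handler.
type SessionStore = { get(key: string): Promise<string | null> };
type Env = { SESSIONS: SessionStore };

const DOCS_PAGE_POINTS = 10; // hypothetical step per docs pageview

// Turn the stored JSON (or nothing) into features, falling back to defaults.
function parseStoredFeatures(raw: string | null): SessionFeatures {
  const empty: SessionFeatures = {
    returnSessions: 0,
    viewedPricing: false,
    pricingThenReturned: false,
    docsDepth: 0,
    pathShape: "browse", // assumed default
  };
  if (raw === null) return empty;
  try {
    const stored = JSON.parse(raw) as Partial<SessionFeatures>;
    // Pick fields explicitly so the stored `score` is not carried along.
    return {
      returnSessions: stored.returnSessions ?? 0,
      viewedPricing: stored.viewedPricing ?? false,
      pricingThenReturned: stored.pricingThenReturned ?? false,
      docsDepth: stored.docsDepth ?? 0,
      pathShape: stored.pathShape ?? "browse",
    };
  } catch {
    return empty; // unreadable record: treat as a fresh session
  }
}

async function loadSessionFeatures(env: Env, sessionId: string): Promise<SessionFeatures> {
  return parseStoredFeatures(await env.SESSIONS.get(sessionId));
}

// Pure update: returns a new object and leaves the input untouched.
function applyEvent(prev: SessionFeatures, event: AnalyticsEvent): SessionFeatures {
  const next: SessionFeatures = { ...prev };
  if (event.type === "pageview") {
    if (event.path.startsWith("/pricing")) next.viewedPricing = true;
    if (event.path.startsWith("/docs")) {
      next.docsDepth = Math.min(next.docsDepth + DOCS_PAGE_POINTS, 100);
    }
  }
  return next;
}

// Usage with an in-memory stand-in for the store.
const fakeEnv: Env = { SESSIONS: { get: async () => null } };
loadSessionFeatures(fakeEnv, "s1").then((f) => console.log(f.viewedPricing)); // false
```

Keeping `applyEvent` pure makes the rules easy to unit test without a running worker.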
## When we switch to a learned model
Only once a site has enough conversions to train on. Below that, a model overfits to a handful of events and is worse than the rules. We gate it on volume, not on whether it sounds impressive.
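The volume gate can be as small as one comparison. This is a sketch with hypothetical names, and the threshold in the usage lines is an arbitrary placeholder: the right value depends on your own conversion data, and this post does not prescribe one.

```ts
type ScorerKind = "rules" | "model";

// Hypothetical gate: use the learned model only once a site has enough
// observed conversions to train on; otherwise stay on the weighted rules.
function chooseScorer(conversionsSeen: number, minConversionsForModel: number): ScorerKind {
  return conversionsSeen >= minConversionsForModel ? "model" : "rules";
}

const PLACEHOLDER_MIN_CONVERSIONS = 500; // arbitrary, for illustration only
console.log(chooseScorer(40, PLACEHOLDER_MIN_CONVERSIONS)); // rules
console.log(chooseScorer(2000, PLACEHOLDER_MIN_CONVERSIONS)); // model
```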
## The honest limitations
- You are scoring a session, not a person. A high score might be a competitor doing research.
- No identity means no true cross-device tracking. We do not pretend otherwise.
- Low-volume sites get scores barely better than a coin flip, and the product says so rather than faking confidence.

## Takeaway
You can get genuinely useful visitor scoring without cookies or PII, but start with explainable weighted rules, not a model. The model is the last step, not the first.
Disclosure: my co-founder and I build a web analytics tool called Zenovay that does this. The approach above is what we actually shipped. See zenovay.com.