I have been writing a connection app for a year. Last week I open-sourced the matching engine, and the only design choice I want to walk through is the one that took the longest to talk myself into: the matcher does not have access to photos. Not "it ignores them." Not "it deprioritizes them." It cannot see them. The TypeScript build fails if you try.
If you only want the punchline, here it is.
```typescript
// soulmate-core/src/rank.ts
export function rank(viewer: Profile, candidate: Profile): number;

type Profile = {
  prompts: PromptAnswers;  // five short text answers
  voice: VoiceTranscript;  // ~30 sec, kept as text
  intent: Intent;          // friendship | relationship | community
  meta: ProfileMeta;       // age band, city, language, etc.
};
// no photo field. anywhere on this type.
```
The image bytes live in a different table, behind a different read path, behind a `mutualVibe` boolean. The function above has no reference to that table and no way to obtain one through normal app wiring. The constraint is enforced by the compiler.
The repo is at github.com/donnowyu/soulmate-core if you want to read along.
## Why type-level, not flag-level
The natural shape of this is a feature flag: `if (allow_photo_in_ranking) { ... }`. Several products are built on this shape. I think it is the wrong shape, for three reasons.
1. Flags get flipped by people who weren't in the room when the principle was set. A future engineer, looking at the engagement dashboard on a tired evening, will propose a "secondary signal" A/B test. They will be right that the metric will move. They will be wrong that what is being measured is what we said we cared about. A flag does not survive that conversation. A type signature does.
2. The constraint should live in the artifact, not the documentation. A README that says "do not use photos in ranking" is a memo. A type that has no photo field is a build error. Banks do not enforce referential integrity with memos.
3. It is honest in a way anyone can verify in public. The repo is open. You can look at the entry-point type and convince yourself in 60 seconds. You do not have to take my word for anything.
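To make the second point concrete, here is a minimal sketch of what "enforced by the compiler" means. The type and field names below are simplified stand-ins I made up for illustration, not the repo's actual definitions.

```typescript
// Illustrative only: simplified stand-ins for the repo's actual types.
type Profile = {
  prompts: string[];                                    // five short text answers
  voice: string;                                        // transcript text
  intent: "friendship" | "relationship" | "community";
};

function rank(viewer: Profile, candidate: Profile): number {
  // Inside this function, only text-derived fields are reachable.
  // Reading `candidate.photo` is not a runtime check that fails; it is
  // code that does not compile:
  //   Property 'photo' does not exist on type 'Profile'.
  return viewer.intent === candidate.intent ? 1 : 0;
}

const a: Profile = { prompts: ["tea over coffee"], voice: "short transcript", intent: "friendship" };
const b: Profile = { prompts: ["night walks"], voice: "short transcript", intent: "friendship" };

console.log(rank(a, b)); // 1
```

The same goes for the call site: passing a literal with an extra `photoUrl` field trips TypeScript's excess property check at build time.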
## The cost of doing it this way
I will not pretend this was free.
The most expensive part was the data model. I had to design the schema so that the photo entity has its own service, its own access control, its own read path. The image upload pipeline never returns to the matching service. The "show me a face" step is a separate request, gated server-side on the existence of a `mutualVibe` row keyed by both user IDs. That is not a refactor you do in an afternoon.
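A sketch of that gate, with in-memory structures standing in for the real tables. Every name here is an assumption on my part; the photo service is not part of the open-sourced engine.

```typescript
// Hypothetical sketch of the server-side photo gate.
// Table and function names are assumptions, not the repo's actual code.
const mutualVibes = new Set<string>();        // one row per mutual vibe
const photos = new Map<string, Uint8Array>(); // photo bytes, separate store

// Key is order-independent, so one row covers both directions.
const vibeKey = (a: string, b: string) => [a, b].sort().join(":");

function recordMutualVibe(a: string, b: string): void {
  mutualVibes.add(vibeKey(a, b));
}

function getPhoto(requesterId: string, targetId: string): Uint8Array | null {
  // No mutualVibe row, no face. The matcher never calls this path at all.
  if (!mutualVibes.has(vibeKey(requesterId, targetId))) return null;
  return photos.get(targetId) ?? null;
}

photos.set("u2", new Uint8Array([1, 2, 3]));
console.log(getPhoto("u1", "u2")); // null: no mutual vibe yet
recordMutualVibe("u2", "u1");
console.log(getPhoto("u1", "u2")); // now returns the stored bytes
```

The point of the order-independent key is that "mutual" is a property of the pair, not of either direction, so the check cannot be satisfied by one side alone.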
The second cost was deciding what `Profile` should contain so that ranking still works. I tried a lot of things. The current shape (five prompts plus a transcribed voice clip plus intent metadata) is the smallest set I found that produces matches I can defend on inspection. Most of a year was spent reducing it to that.
The third cost is a soft one. There is a class of user who, on the existing apps, sorts mostly by face. They will look at this product and bounce. That is fine. They were not the users I was trying to find.
## The trick of the embedding
The text answers and the voice transcript get concatenated into a single document per user. That document is embedded into a 1536-dim vector. Ranking is cosine similarity over those vectors, with two soft rerankers (ideology distance, shared-passion overlap) breaking ties.
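The document-assembly step is the simplest piece of the pipeline. A sketch, with an illustrative function name that is my assumption rather than the repo's actual API:

```typescript
// Sketch: concatenating the prompts and the voice transcript into the
// single text document that gets embedded. Names are illustrative.
function buildDocument(prompts: string[], voiceTranscript: string): string {
  // One blank line between sections keeps the embedding input readable.
  return [...prompts, voiceTranscript].join("\n\n");
}

const doc = buildDocument(["I collect field recordings"], "mostly I talk about hiking");
console.log(doc.split("\n\n").length); // 2
```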
This is not exotic. The trick is not in the math. The trick is in the input. By construction, the model has never seen a pixel. By construction, the model has no learned latent dimension that correlates with attractiveness, because nothing in the training distribution ever encoded one. The rerank loop is small enough to read.
```typescript
// rerank pseudocode
const baseline = cosine(viewerEmb, candidateEmb);
const ideologyPenalty = distance(viewer.ideology, candidate.ideology);
const passionBoost = jaccard(viewer.passions, candidate.passions);
return baseline - 0.15 * ideologyPenalty + 0.10 * passionBoost;
```
You can argue with the coefficients. I have. The coefficients are not the point of the post.
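For readers who want to run the rerank step, here it is fleshed out with naive implementations of the three helpers. The helper bodies and the 1-D ideology axis are my assumptions for illustration; only the final formula mirrors the pseudocode above.

```typescript
// Naive, runnable version of the rerank step. Helper implementations
// are assumptions; check the repo for the real ones.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Ideology modeled as a point on a 1-D axis; distance is absolute difference.
const ideologyDistance = (x: number, y: number) => Math.abs(x - y);

// Jaccard overlap of two tag sets: |A ∩ B| / |A ∪ B|.
function jaccard(a: Set<string>, b: Set<string>): number {
  const inter = Array.from(a).filter((t) => b.has(t)).length;
  return inter / (a.size + b.size - inter);
}

function rerank(
  viewerEmb: number[], candidateEmb: number[],
  viewerIdeology: number, candidateIdeology: number,
  viewerPassions: Set<string>, candidatePassions: Set<string>,
): number {
  const baseline = cosine(viewerEmb, candidateEmb);
  const ideologyPenalty = ideologyDistance(viewerIdeology, candidateIdeology);
  const passionBoost = jaccard(viewerPassions, candidatePassions);
  return baseline - 0.15 * ideologyPenalty + 0.10 * passionBoost;
}

// Identical embeddings, identical ideology, full passion overlap:
const score = rerank([1, 0], [1, 0], 0.2, 0.2, new Set(["hiking"]), new Set(["hiking"]));
console.log(score); // 1.1
```

Note that the baseline dominates by construction: the penalty and boost are bounded, so they reorder near-ties rather than overturn the cosine ranking.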
## Why I am writing this on Dev.to
Because the type-system argument is the part of the project that is interesting to people who write code for a living, and because most of the press around "no photo dating apps" handles the question at the marketing layer, where it is much less interesting. The interesting question is whether the constraint is structural, and structural constraints are something a dev audience can read in source. I wanted that audience to be able to verify the claim without me in the room.
If you want the long-form essay version of this argument, it is on the product site at byvibration.com/essays/why-matching-layer-is-physically-blind. If you want the code, the repo link is at the top. If you want to push back on any of the choices, the comments are open and I will be in them.
I work on byvibration. The matching engine is open source. I am writing about it here because I think the type-signature framing is a transferable idea: constraints you want to honor across a long time should be expressed in the artifact, not the team's memory.

---
title: A dating algorithm that physically cannot read photos (and why I wrote it that way)
published: false
canonical_url: https://byvibration.com/essays/why-matching-layer-is-physically-blind
tags: typescript, webdev, discuss, architecture
---