I was using Algolia for search and recommendations on POSH, my ecommerce app. It worked great, but the bill kept growing. Every search request, every recommendation call—it all adds up when users are constantly browsing products.
So I built my own recommendation system using a two-tower model. It's the same approach YouTube and Google use: one tower represents products as vectors, the other represents users based on their behavior. To get recommendations, you just find products closest to the user's vector.
Here's how I built it.
Data Pipeline
Everything starts with user behavior. I use Firebase Analytics to track how users interact with products:
- Product viewed — just browsing
- Product clicked — showed interest
- Added to cart — strong intent
Not all interactions are equal. Someone adding a product to cart is way more valuable than a passing view. So I weight them:
| Event | Weight |
|---|---|
| View | 0.1 |
| Click | 2.0 |
| Add to cart | 5.0 |
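In code this is just a small lookup applied whenever an interaction is scored. A minimal sketch — only `product_clicked` appears verbatim in my event payloads below; the other event names here are illustrative:

```python
# Interaction weights keyed by event name.
# "product_clicked" matches the payload shown later; the other names are illustrative.
EVENT_WEIGHTS = {
    "product_viewed": 0.1,
    "product_clicked": 2.0,
    "product_added_to_cart": 5.0,
}

def interaction_weight(event_name: str) -> float:
    # Unknown events fall back to the weakest signal.
    return EVENT_WEIGHTS.get(event_name, 0.1)
```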
Product Vectorization
All my products live in Elasticsearch. To make recommendations work, I need to represent each product as a vector — a list of 384 numbers that captures what the product is about.
I use the all-MiniLM-L6-v2 model from Sentence Transformers. It's fast, lightweight, and works well for semantic similarity.
For each product, I combine its attributes into a single text string:
Nike Air Max | by Nike | Shoes | Sneakers | Running | Blue color | premium
This includes:
- Product name
- Merchant name
- Category hierarchy (parent → category → subcategory)
- Color
- Price tier (budget / mid-range / premium / luxury)
The model turns this text into a 384-dimensional vector. Products with similar attributes end up close together in vector space — a blue Nike sneaker will be near other blue sneakers, other Nike products, and other premium shoes.
These vectors get stored back in Elasticsearch as a dense_vector field, ready for similarity search.
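The encoding step is only a few lines. Here is a sketch using sentence-transformers and the Python Elasticsearch client; the index name, field names, and product dict keys are placeholders rather than my exact schema:

```python
from elasticsearch import Elasticsearch
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim embeddings
es = Elasticsearch("http://localhost:9200")

def product_to_text(p: dict) -> str:
    # Combine the attributes that describe the product into one string.
    parts = [
        p["name"],
        f"by {p['merchant']}",
        " | ".join(p["categories"]),   # parent → category → subcategory
        f"{p['color']} color",
        p["price_tier"],               # budget / mid-range / premium / luxury
    ]
    return " | ".join(parts)

def index_product_vector(p: dict) -> None:
    vector = model.encode(product_to_text(p), normalize_embeddings=True)
    es.update(
        index="products",
        id=p["id"],
        doc={"product_vector": vector.tolist()},  # dense_vector field, dims: 384
    )
```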
User Tower Architecture
This is the core of the system. The user tower takes someone's interaction history and outputs a single vector that represents their preferences.
Input: up to 20 recent interactions, each with:
- The product's vector (384 dims)
- The interaction type (view, click, or add-to-cart)
Output: one user vector (384 dims) that lives in the same space as product vectors
How it works
The model combines each product vector with an embedding for the interaction type. A clicked product gets different treatment than a viewed one.
Then it runs through a multi-head attention layer — this lets the model figure out which interactions matter most. Maybe that one add-to-cart from yesterday is more important than ten views from last week.
I also add recency decay. Newer interactions get higher weight. Someone's taste from yesterday matters more than what they looked at two weeks ago.
Finally, everything gets pooled into a single vector and normalized. This user vector now sits in the same 384-dimensional space as all the products.
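Here is a simplified PyTorch sketch of that tower. It is not my exact model, but it shows the shape: event-type embeddings added to product vectors, multi-head attention over the sequence, exponential recency decay, and a normalized pooled output. The decay constant and head count are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UserTower(nn.Module):
    def __init__(self, dim: int = 384, num_event_types: int = 3, num_heads: int = 4):
        super().__init__()
        # One learned embedding per interaction type: view, click, add-to-cart.
        self.event_emb = nn.Embedding(num_event_types, dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.proj = nn.Linear(dim, dim)

    def forward(self, product_vecs, event_ids, ages_hours, pad_mask):
        # product_vecs: (B, T, 384)  event_ids: (B, T)  ages_hours: (B, T)
        # pad_mask: (B, T) bool, True where the slot is padding
        x = product_vecs + self.event_emb(event_ids)           # fuse product + event type
        attended, _ = self.attn(x, x, x, key_padding_mask=pad_mask)

        # Recency decay: newer interactions get higher weight (24h time constant, illustrative).
        decay = torch.exp(-ages_hours / 24.0).unsqueeze(-1)
        decay = decay.masked_fill(pad_mask.unsqueeze(-1), 0.0)

        pooled = (attended * decay).sum(dim=1) / decay.sum(dim=1).clamp(min=1e-6)
        user_vec = self.proj(pooled)
        return F.normalize(user_vec, dim=-1)                   # same space as product vectors
```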
Training
I trained the model using contrastive learning. For each user:
- Positive: the next product they actually interacted with
- Negatives: 10 random products they didn't interact with
The model learns to push the user vector closer to products they'll engage with, and away from ones they won't.
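The loss is a standard sampled-softmax style contrastive objective: score the user vector against the positive and the sampled negatives, then run cross-entropy with the positive as the target. A sketch, assuming all vectors are already L2-normalized:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(user_vec, pos_vec, neg_vecs, temperature: float = 0.1):
    # user_vec: (B, 384)  pos_vec: (B, 384)  neg_vecs: (B, 10, 384), all L2-normalized
    pos_score = (user_vec * pos_vec).sum(dim=-1, keepdim=True)            # (B, 1)
    neg_score = torch.bmm(neg_vecs, user_vec.unsqueeze(-1)).squeeze(-1)   # (B, 10)
    logits = torch.cat([pos_score, neg_score], dim=1) / temperature       # (B, 11)
    # The positive sits at index 0, so the target class is 0 for every row.
    labels = torch.zeros(user_vec.size(0), dtype=torch.long, device=user_vec.device)
    return F.cross_entropy(logits, labels)
```

Minimizing this pulls the user vector toward the positive product and away from the negatives, which is exactly the geometry the cosine-similarity lookup relies on later.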
Real-Time Updates
Training the model is a one-time thing. But user preferences change constantly — someone might discover a new brand or shift from sneakers to boots. The system needs to keep up.
I use AWS SQS to handle this. When a user interacts with a product, Firebase sends an event, and a message lands in my queue:
{
  "customer_id": 12345,
  "product_id": 5678,
  "event_name": "product_clicked"
}
An SQS consumer picks it up and:
- Fetches the product's vector from Elasticsearch
- Loads the user's recent interaction history
- Runs it through the trained user tower model
- Saves the new user vector back to Elasticsearch
The whole thing takes milliseconds. By the time the user scrolls to the next page, their recommendations are already updated.
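A stripped-down version of the consumer looks like this. It uses boto3 long polling; the queue URL and the helper calls inside `handle_event` are placeholders standing in for the four steps above:

```python
import json

import boto3
import torch

sqs = boto3.client("sqs")
QUEUE_URL = "..."  # SQS queue that receives the Firebase events (placeholder)

def handle_event(event: dict) -> None:
    # The helpers below are placeholders for the steps described above.
    product_vec = fetch_product_vector(event["product_id"])       # from Elasticsearch
    history = load_recent_interactions(event["customer_id"])      # last 20 interactions
    history.append({"vector": product_vec, "event": event["event_name"]})

    with torch.no_grad():
        user_vec = run_user_tower(history)                         # trained UserTower
    save_user_vector(event["customer_id"], user_vec)               # back to Elasticsearch

# Long-poll the queue and process events as they arrive.
while True:
    resp = sqs.receive_message(
        QueueUrl=QUEUE_URL, MaxNumberOfMessages=10, WaitTimeSeconds=20
    )
    for msg in resp.get("Messages", []):
        handle_event(json.loads(msg["Body"]))
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```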
I also prune old interactions — anything older than 2 days gets dropped. This keeps the model focused on recent behavior, not what someone browsed months ago.
Recommendations with Cosine Similarity
Now the fun part — actually recommending products.
Both user vectors and product vectors live in the same 384-dimensional space. To find relevant products, I just look for the ones closest to the user's vector.
When a user browses products, the API checks if they have a stored vector. If they do, Elasticsearch uses script_score to rank products by cosine similarity:
script: {
  source: "cosineSimilarity(params.user_vector, 'product_vector') + 1.0",
  params: { user_vector: userVector }
}
The + 1.0 shifts scores to positive range since cosine similarity can be negative.
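The full personalized query is just that script wrapped in a script_score query. A sketch with the Python Elasticsearch client, reusing the placeholder index and field names from earlier:

```python
def personalized_feed(es, user_vector, page: int = 0, size: int = 20):
    # Rank products by cosine similarity to the user's vector, shifted to a positive range.
    return es.search(
        index="products",
        query={
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    "source": "cosineSimilarity(params.user_vector, 'product_vector') + 1.0",
                    "params": {"user_vector": user_vector},
                },
            }
        },
        from_=page * size,
        size=size,
    )
```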
If the user has no vector yet (new user, not enough interactions), it falls back to default sorting — popularity score and recency. Same goes if they explicitly sort by price.
The result: logged-in users with interaction history get a personalized feed. Everyone else still gets a sensible default. No hard-coded limits on recommendations — it works with the existing pagination, just reordered by relevance to that user.
Results & Learnings
I'll be honest — I'm not a data scientist. This was my first time building anything like this. I just knew Algolia was too expensive and figured there had to be a way to do it myself.
Turns out there was.
I self-hosted everything — Elasticsearch, the PyTorch model, the SQS consumers. No managed ML services, no third-party recommendation APIs. Just my own infrastructure.
An unexpected bonus: latency dropped. When everything runs on the same private network, there's no round-trip to external APIs. My app server talks to Elasticsearch over the local subnet — way faster than hitting Algolia's servers.
Since launching the two-tower model:
- 40% increase in app orders
- 10% increase in user retention
Users are finding products they actually want, and they're coming back more often.
What's next
The model works, but there's room to improve:
- More events — adding `product_favorited`, `product_shared`, and `product_purchased` to capture stronger intent signals
- Product labels — tagging products with attributes like "vintage", "handmade", "streetwear" and using those labels to fine-tune the model
Takeaway
You don't need a machine learning team to build personalized recommendations. The two-tower architecture is well-documented, PyTorch is approachable, and tools like Elasticsearch and SQS handle the infrastructure. If your recommendation costs are eating into your margins, it might be worth building your own.
If you've built something similar or have suggestions to improve this approach, I'd love to hear from you.