<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Carles</title>
    <description>The latest articles on DEV Community by Carles (@carles_2b45749f26609ec400).</description>
    <link>https://dev.to/carles_2b45749f26609ec400</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3879272%2F62169c7d-d420-4893-8d2a-ffaf7f56a245.jpg</url>
      <title>DEV Community: Carles</title>
      <link>https://dev.to/carles_2b45749f26609ec400</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/carles_2b45749f26609ec400"/>
    <language>en</language>
    <item>
      <title>How We Rerank 565K Products Using Deep Learning</title>
      <dc:creator>Carles</dc:creator>
      <pubDate>Tue, 14 Apr 2026 20:28:52 +0000</pubDate>
      <link>https://dev.to/carles_2b45749f26609ec400/how-we-rerank-565k-products-using-deep-learning-oen</link>
      <guid>https://dev.to/carles_2b45749f26609ec400/how-we-rerank-565k-products-using-deep-learning-oen</guid>
      <description>&lt;p&gt;At &lt;a href="https://es.seestocks.com" rel="noopener noreferrer"&gt;SeeStocks&lt;/a&gt;, we run a price comparison engine that tracks over 565,000 products across multiple retailers in Spain. One of our biggest challenges? Making sure that when a user lands on a category page, the most relevant products appear first — not just sorted by price, but ranked by actual relevance to what they're looking for.&lt;br&gt;
This is the story of how we built a multi-stage reranking pipeline using deep learning, and what we learned along the way.&lt;br&gt;
The Problem With Naive Sorting&lt;br&gt;
Early on, our category pages were simple: pull all products tagged under a category, sort by price, done. But this quickly broke down:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A "pepper" category surfaced hot sauce bottles before actual peppercorns.&lt;/li&gt;
&lt;li&gt;A "tool bags" page showed backpacks that happened to share the same parent taxonomy.&lt;/li&gt;
&lt;li&gt;Products with misleading titles floated to the top simply because they were cheap.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We needed something smarter than keyword matching and price sorting.&lt;br&gt;
Our Approach: A Three-Stage Pipeline&lt;br&gt;
We settled on a multi-stage architecture that balances speed with accuracy:&lt;br&gt;
Stage 1: Candidate Retrieval (Fast, Broad)&lt;br&gt;
We maintain a vector index of all products using embeddings from a fine-tuned vision-language model. When a user hits a category page, we first retrieve a broad set of candidates using approximate nearest neighbor search against the category centroid — a pre-computed embedding that represents the "ideal" product in that category.&lt;br&gt;
This stage is optimized for recall over precision. We intentionally cast a wide net, pulling 3-5x more candidates than we'll ultimately display.&lt;br&gt;
Stage 2: Cross-Encoder Reranking (Slow, Precise)&lt;br&gt;
The candidates from Stage 1 are then passed through a cross-encoder model that scores each product against the category context. Unlike the bi-encoder in Stage 1 (which computes embeddings independently), the cross-encoder processes the product and category jointly, capturing fine-grained interactions.&lt;br&gt;
We encode several signals:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Visual similarity: How well does the product image match the expected visual prototype for this category?&lt;/li&gt;
&lt;li&gt;Taxonomic distance: How far is the product's assigned category from the target category in our taxonomy tree?&lt;/li&gt;
&lt;li&gt;Title-category coherence: Does the product title semantically align with the category name and its parent path?&lt;/li&gt;
&lt;li&gt;Price distribution fit: Is the product priced within a reasonable range for this category, or is it a statistical outlier?&lt;/li&gt;
&lt;/ul&gt;
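&lt;p&gt;To make one of these signals concrete, here is a minimal Python sketch of a price distribution fit score. It is an illustration under assumptions rather than the production implementation: the function name &lt;code&gt;price_fit_score&lt;/code&gt;, the choice of a median/MAD robust z-score on log prices, and the 3.5 cutoff are all hypothetical.&lt;/p&gt;

```python
import math
import statistics


def price_fit_score(price: float, category_prices: list[float], cutoff: float = 3.5) -> float:
    """Return a score in [0, 1]; 1.0 means the price sits at the category median.

    Assumption: prices are compared in log space using a robust (median/MAD) z-score.
    """
    logs = [math.log(p) for p in category_prices if p > 0]
    if not (price > 0 and len(logs) >= 3):
        return 0.0  # Too little evidence to call the price plausible.

    median = statistics.median(logs)
    # Median absolute deviation: a spread estimate that outliers barely move.
    mad = statistics.median([abs(x - median) for x in logs])
    if mad == 0:
        # All reference prices are identical, so only a matching price fits.
        return 1.0 if math.isclose(math.log(price), median) else 0.0

    # 0.6745 rescales the MAD so the z-score is comparable to a standard deviation.
    z = 0.6745 * abs(math.log(price) - median) / mad
    return max(0.0, 1.0 - z / cutoff)


pepper_prices = [1.8, 1.9, 2.0, 2.1, 2.2, 2.3, 2.5]  # made-up sample data
print(price_fit_score(2.1, pepper_prices))   # 1.0 (exactly the median)
print(price_fit_score(45.0, pepper_prices))  # 0.0 (far outside the usual range)
```

&lt;p&gt;Working in log space is a design choice: it treats a price at twice the median and a price at half the median as equally unusual.&lt;/p&gt;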

&lt;p&gt;Each signal produces a score, and we combine the scores with a lightweight gradient-boosted model trained on human relevance judgments.&lt;br&gt;
Stage 3: Business Rules &amp;amp; Diversity&lt;br&gt;
The final stage applies hard constraints:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Deduplicate near-identical products from different retailers (keeping the cheapest).&lt;/li&gt;
&lt;li&gt;Ensure retailer diversity (no single store dominates the top positions).&lt;/li&gt;
&lt;li&gt;Apply freshness decay (products not seen in recent crawls get penalized).&lt;/li&gt;
&lt;li&gt;Enforce minimum confidence thresholds.&lt;/li&gt;
&lt;/ul&gt;
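&lt;p&gt;The four rules above can be sketched in a few lines of Python. This is a simplified illustration, not the production code: the &lt;code&gt;Candidate&lt;/code&gt; fields, the &lt;code&gt;group_id&lt;/code&gt; near-duplicate key, the exponential half-life decay, and every default threshold are assumptions made for the example.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    product_id: str
    group_id: str         # hypothetical key shared by near-identical products
    retailer: str
    price: float
    score: float          # relevance score coming out of the reranking stage
    days_since_seen: int  # days since the product last appeared in a crawl


def apply_business_rules(candidates, min_score=0.5, half_life_days=7.0,
                         max_per_retailer=2, top_n=10):
    # 1. Deduplicate: keep the cheapest offer in each near-duplicate group.
    cheapest = {}
    for c in candidates:
        best = cheapest.get(c.group_id)
        if best is None or best.price > c.price:
            cheapest[c.group_id] = c

    # 2. Freshness decay: halve the score every `half_life_days` without a sighting.
    # 3. Confidence threshold: drop anything whose decayed score is below `min_score`.
    scored = []
    for c in cheapest.values():
        decayed = c.score * 0.5 ** (c.days_since_seen / half_life_days)
        if decayed >= min_score:
            scored.append((decayed, c))
    scored.sort(key=lambda pair: pair[0], reverse=True)

    # 4. Retailer diversity: cap how many of the top N slots one retailer can take.
    top, overflow, per_retailer = [], [], {}
    for _, c in scored:
        count = per_retailer.get(c.retailer, 0)
        if top_n > len(top) and max_per_retailer > count:
            top.append(c)
            per_retailer[c.retailer] = count + 1
        else:
            overflow.append(c)
    return top + overflow
```

&lt;p&gt;Products pushed out by the retailer cap are not discarded in this sketch. They are appended after the top positions, still in score order.&lt;/p&gt;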

&lt;p&gt;The Taxonomy Challenge&lt;br&gt;
Our product taxonomy follows the Google product taxonomy, with 5,700+ categories organized in a deep hierarchy. One thing we learned: flat classification doesn't work for ecommerce at this scale.&lt;br&gt;
A product image of black ground pepper could reasonably match:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Food &amp;gt; Condiments &amp;gt; Spices &amp;gt; Pepper ✅&lt;/li&gt;
&lt;li&gt;Food &amp;gt; Condiments &amp;gt; Spices &amp;gt; Seasoning Mixes ❌ (close, but wrong)&lt;/li&gt;
&lt;li&gt;Food &amp;gt; Condiments ❌ (too broad)&lt;/li&gt;
&lt;/ul&gt;
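&lt;p&gt;For paths written like the ones above, distance in the taxonomy tree can be computed by counting edges through the deepest shared ancestor. The sketch below is a hypothetical helper (&lt;code&gt;taxonomy_distance&lt;/code&gt; is an illustrative name) and assumes paths use &lt;code&gt; &amp;gt; &lt;/code&gt; as the separator.&lt;/p&gt;

```python
def taxonomy_distance(path_a: str, path_b: str, sep: str = " > ") -> int:
    """Count the edges between two nodes of a taxonomy tree given their full paths."""
    a = path_a.split(sep)
    b = path_b.split(sep)
    # Length of the common prefix = depth of the deepest shared ancestor.
    shared = 0
    for x, y in zip(a, b):
        if x != y:
            break
        shared += 1
    # Walk up from each node to the shared ancestor and add the two climbs.
    return (len(a) - shared) + (len(b) - shared)


target = "Food > Condiments > Spices > Pepper"
print(taxonomy_distance(target, target))                                         # 0
print(taxonomy_distance(target, "Food > Condiments > Spices > Seasoning Mixes"))  # 2
print(taxonomy_distance(target, "Food > Condiments"))                             # 2
```

&lt;p&gt;Both wrong candidates sit two edges from the target, so edge count alone does not separate a sibling from an ancestor that is too broad.&lt;/p&gt;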

&lt;p&gt;We built what we call a hierarchical disambiguation layer: when the model is uncertain between sibling categories, we generate discriminative text prompts that highlight the differences between them and re-score. This reduced misclassification between sibling categories by 34%.&lt;br&gt;
What We Run in Production&lt;br&gt;
The full pipeline runs on a single GPU server:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Candidate retrieval: ~15ms per category (pre-computed index)&lt;/li&gt;
&lt;li&gt;Cross-encoder reranking: ~120ms for 200 candidates&lt;/li&gt;
&lt;li&gt;Business rules: ~5ms&lt;/li&gt;
&lt;li&gt;Total latency: under 200ms end-to-end&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We pre-compute rankings for our 1,347 active category pages in a nightly batch job, so users never wait for the ML pipeline: every page is served from cache.&lt;br&gt;
Results&lt;br&gt;
After deploying the reranking pipeline:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Product relevance score (human-evaluated): 71% → 94%&lt;/li&gt;
&lt;li&gt;Category pages with misclassified products in top 10: 23% → 3%&lt;/li&gt;
&lt;li&gt;User engagement (click-through to retailer): +41%&lt;/li&gt;
&lt;li&gt;Bounce rate on category pages: -28%&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Lessons Learned&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Your taxonomy is your moat. We spent more time curating our taxonomy tree and training discriminators between confusing categories than on any model architecture decision.&lt;/li&gt;
&lt;li&gt;Embeddings are just the beginning. The bi-encoder gets you 80% of the way there. The last 20% — which is what users actually notice — comes from cross-encoder reranking and business logic.&lt;/li&gt;
&lt;li&gt;Batch &amp;gt; real-time for this use case. We initially tried to run the full pipeline on every request. Switching to a nightly batch job with a cache cut our GPU costs by 90% and simplified everything.&lt;/li&gt;
&lt;li&gt;Outlier detection matters more than ranking. Removing the wrong products from a category page had more impact than perfecting the order of the right ones.&lt;/li&gt;
&lt;/ol&gt;
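&lt;p&gt;Lesson 3 boils down to a simple pattern: compute rankings offline, write them to a cache, and keep the request path free of model calls. Here is a minimal sketch, assuming one JSON file per category as the cache and a hypothetical &lt;code&gt;rank_category&lt;/code&gt; callable standing in for the full pipeline. The function names and file layout are illustrative only.&lt;/p&gt;

```python
import json
from pathlib import Path
from typing import Callable


def run_nightly_batch(category_ids: list[str],
                      rank_category: Callable[[str], list[str]],
                      cache_dir: Path) -> int:
    """Recompute every category ranking and write one JSON file per category."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    for category_id in category_ids:
        ranking = rank_category(category_id)  # the expensive pipeline would run here
        tmp = cache_dir / f"{category_id}.json.tmp"
        tmp.write_text(json.dumps(ranking))
        # Rename only after writing, so a reader never sees a half-written file.
        tmp.replace(cache_dir / f"{category_id}.json")
    return len(category_ids)


def serve_category(category_id: str, cache_dir: Path) -> list[str]:
    """Request path: read the precomputed ranking and never call the model."""
    path = cache_dir / f"{category_id}.json"
    if not path.exists():
        return []  # Assumption: an uncached category renders empty instead of blocking.
    return json.loads(path.read_text())
```

&lt;p&gt;In this arrangement the GPU is only needed while &lt;code&gt;run_nightly_batch&lt;/code&gt; runs, and &lt;code&gt;serve_category&lt;/code&gt; is a plain file read.&lt;/p&gt;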

&lt;p&gt;We're continuing to iterate on this system. Next on our roadmap: using multimodal LLMs for attribute extraction (color, material, size) to enable smarter filtering within categories.&lt;br&gt;
If you're working on similar problems in ecommerce search or product categorization, I'd love to hear how you approach it. Drop a comment or find us at &lt;a href="https://es.seestocks.com" rel="noopener noreferrer"&gt;es.seestocks.com&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>deeplearning</category>
      <category>python</category>
      <category>machinelearning</category>
    </item>
  </channel>
</rss>
