Ali Afana

My AI Kept Recommending Pajamas for Date Night — Here's Why

I'm Ali, building Provia — an AI-powered sales platform — from Gaza. This is one of the bugs that taught me the most.


The Problem

A customer typed "show me something for a date night" and my AI chatbot returned the "Cozy Night Deluxe Loungewear Set" (pajamas) as the top result. The reason: "night" in "date night" sits semantically close to "night" in "Cozy Night." Vector similarity search doesn't understand context. It understands distance between points in a 1536-dimensional space, and in that space, pajama night and date night are neighbors.

This wasn't just an annoyance. The loungewear set was matching nearly every query that included common words. "Night out outfit" — pajamas. "Good night cream" (wrong category entirely) — pajamas. "Something nice for tonight" — pajamas. The product had become a black hole, sucking in every vaguely related search because its name and description contained high-frequency semantic tokens.

The Context

Provia uses OpenAI's text-embedding-3-small model to generate 1536-dimensional vectors for every product. When a customer sends a message with product intent, the system generates an embedding for their query and runs a similarity search against the product catalog using a Supabase PostgreSQL function.
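For readers who want to see the embedding step concretely, here is a minimal sketch of a generateEmbedding helper, the function name called later in this post. The body is an assumption, not Provia's actual code: it posts to OpenAI's embeddings REST endpoint, reads the key from a hypothetical OPENAI_API_KEY environment variable, and accepts an injectable fetchImpl so it can be exercised without a network call.

```javascript
// Sketch only: the helper body, the env var name, and the fetchImpl option
// are assumptions made for illustration.
async function generateEmbedding(
  text,
  { apiKey = process.env.OPENAI_API_KEY, fetchImpl = fetch } = {}
) {
  const response = await fetchImpl("https://api.openai.com/v1/embeddings", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model: "text-embedding-3-small", input: text }),
  });

  if (!response.ok) {
    throw new Error(`Embedding request failed with status ${response.status}`);
  }

  // The response carries one embedding per input, in the same order.
  const payload = await response.json();
  return payload.data[0].embedding;
}
```

The same helper would be used for product text at indexing time and for customer queries at search time, so both land in the same vector space.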

Here's the original search function:

CREATE OR REPLACE FUNCTION search_products(
  query_embedding vector(1536),
  match_threshold float DEFAULT 0.1,
  match_count int DEFAULT 5,
  p_store_id uuid DEFAULT NULL
)
RETURNS TABLE (
  id uuid,
  name text,
  description text,
  price numeric,
  category text,
  similarity float
)
LANGUAGE plpgsql
AS $$
BEGIN
  RETURN QUERY
  SELECT
    p.id,
    p.name,
    p.description,
    p.price,
    p.category,
    1 - (p.embedding <=> query_embedding) AS similarity
  FROM products p
  WHERE
    (p_store_id IS NULL OR p.store_id = p_store_id)
    AND 1 - (p.embedding <=> query_embedding) > match_threshold
  ORDER BY p.embedding <=> query_embedding
  LIMIT match_count;
END;
$$;

The match_threshold was set to 0.1. That's basically saying "return anything that isn't completely random." In a catalog of 15 products, almost everything would clear that bar for any query containing a common English word.
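To make the threshold concrete: in pgvector, <=> is the cosine distance operator, so 1 - (a <=> b) is cosine similarity. The sketch below reproduces that calculation and the threshold filter in plain JavaScript. The function names and the 2-dimensional toy vectors are made up for illustration; real embeddings here have 1536 dimensions.

```javascript
// Cosine similarity: 1 means same direction, 0 means unrelated (orthogonal).
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Mirrors the WHERE and ORDER BY clauses of search_products in memory.
function filterByThreshold(queryVec, products, threshold) {
  return products
    .map((p) => ({ name: p.name, similarity: cosineSimilarity(queryVec, p.embedding) }))
    .filter((p) => p.similarity > threshold)
    .sort((x, y) => y.similarity - x.similarity);
}

// Toy data, not real embeddings.
const toyProducts = [
  { name: "Product A", embedding: [1, 0] },
  { name: "Product B", embedding: [0.6, 0.8] },
  { name: "Product C", embedding: [0, 1] },
];
console.log(filterByThreshold([1, 0], toyProducts, 0.1));
```

The lower the threshold, the more loosely related products survive the filter, which is exactly what went wrong at 0.1.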

The Attempts

Attempt 1: Raise the threshold to 0.3.

The obvious fix. If 0.1 is too loose, make it tighter.

const { data: results } = await supabase.rpc("search_products", {
  query_embedding: embedding,
  match_threshold: 0.3,
  match_count: 5,
  p_store_id: storeId,
});

Result: This killed the pajama problem but also killed legitimate matches. "Show me jackets" returned zero results because the similarity between the query "show me jackets" and a product named "Classic Cool Denim Jacket" was 0.28. The threshold was too aggressive for short, simple queries.

Attempt 2: Two-tier threshold system.

I tried a near-match tier. Products above 0.3 were "strong matches" and products between 0.2 and 0.3 were "near matches" shown as suggestions:

const strongMatches = results.filter(r => r.similarity >= 0.3);
const nearMatches = results.filter(r => r.similarity >= 0.2 && r.similarity < 0.3);

if (strongMatches.length > 0) {
  return { products: strongMatches, tier: "strong" };
} else if (nearMatches.length > 0) {
  return { products: nearMatches, tier: "near" };
} else {
  return { products: [], tier: "none" };
}

Result: This made things worse. The near-match tier was basically the old problem with extra steps. "Date night outfit" would return pajamas as a "near match" and the bot would say "I found something that might work..." and show the loungewear set. The customer experience was the same — irrelevant pajamas.

Attempt 3: A middle threshold with more results.

Threshold at 0.25 (between the original 0.1 and the 0.3 from Attempt 1), but return 10 results instead of 5, hoping the relevant ones would be in there somewhere.

Result: The pajamas were still in the results. More results just meant more noise. The loungewear set would appear alongside the actually relevant products, and sometimes the bot would mention it because it was in the context.

The fundamental issue was that vector similarity alone couldn't solve this. The semantic space doesn't understand shopping intent. It just measures distance between concept clusters, and "night" creates a bridge between concepts that should be separate.
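One way to see the trade-off across the three attempts is to sweep thresholds over a few hand-labelled examples. The sketch below is hypothetical: the sweepThresholds helper is not from Provia, and only the 0.28 jacket score is reported in this post. The pajama score of 0.27 is invented for illustration, chosen to fall in the "near match" band described above.

```javascript
// Hand-labelled (query, product) pairs with their similarity scores.
const labelled = [
  // Score reported in Attempt 1.
  { query: "show me jackets", product: "Classic Cool Denim Jacket", similarity: 0.28, relevant: true },
  // Invented score, for illustration only.
  { query: "date night outfit", product: "Cozy Night Deluxe Loungewear Set", similarity: 0.27, relevant: false },
];

// For each threshold, count good matches lost and bad matches kept.
// Uses a strict ">" to mirror the comparison in search_products.
function sweepThresholds(examples, thresholds) {
  return thresholds.map((threshold) => ({
    threshold,
    relevantDropped: examples.filter((e) => e.relevant && e.similarity <= threshold).length,
    irrelevantKept: examples.filter((e) => !e.relevant && e.similarity > threshold).length,
  }));
}

console.table(sweepThresholds(labelled, [0.1, 0.25, 0.3]));
```

With these example numbers, 0.1 and 0.25 both keep the pajamas, while 0.3 drops the jacket. When a relevant and an irrelevant score sit this close together, any threshold that separates them is fragile.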

The Solution

I killed the two-tier system and built a fallback chain instead. Three search strategies, tried in order, stopping at the first one that returns results.

Step 1: Tightened semantic search.

Raised the threshold to 0.3 and accepted that some queries would return nothing. That's fine — that's what the fallback is for.

CREATE OR REPLACE FUNCTION search_products(
  query_embedding vector(1536),
  match_threshold float DEFAULT 0.3,
  match_count int DEFAULT 5,
  p_store_id uuid DEFAULT NULL
)
RETURNS TABLE (
  id uuid,
  name text,
  description text,
  price numeric,
  category text,
  similarity float
)
LANGUAGE plpgsql
AS $$
BEGIN
  RETURN QUERY
  SELECT
    p.id,
    p.name,
    p.description,
    p.price,
    p.category,
    1 - (p.embedding <=> query_embedding) AS similarity
  FROM products p
  WHERE
    (p_store_id IS NULL OR p.store_id = p_store_id)
    AND 1 - (p.embedding <=> query_embedding) > match_threshold
  ORDER BY p.embedding <=> query_embedding
  LIMIT match_count;
END;
$$;

Step 2: ILIKE fallback for keyword matching.

If semantic search returns nothing, fall back to plain text matching. This catches cases where the customer uses the exact product name or category but the embedding similarity is below threshold:

// Filler words that carry no product meaning; extend for your own catalog.
const STOP_WORDS = new Set(["show", "me", "find", "the", "for", "and", "with"]);

async function searchWithFallback(query, storeId) {
  // 1. Try semantic search first
  const embedding = await generateEmbedding(query);
  const { data: semanticResults } = await supabase.rpc("search_products", {
    query_embedding: embedding,
    match_threshold: 0.3,
    match_count: 5,
    p_store_id: storeId,
  });

  if (semanticResults && semanticResults.length > 0) {
    return { results: semanticResults, method: "semantic" };
  }

  // 2. Fall back to ILIKE keyword search.
  // Strip punctuation first so "jackets?" still matches "jackets", and so
  // commas, parentheses, or % in the query cannot break the .or() filter string.
  const keywords = query
    .toLowerCase()
    .replace(/[^\p{L}\p{N}\s]/gu, " ")
    .split(/\s+/)
    .filter(w => w.length > 2 && !STOP_WORDS.has(w));

  let keywordResults = [];
  for (const keyword of keywords) {
    const { data } = await supabase
      .from("products")
      .select("id, name, description, price, category")
      .eq("store_id", storeId)
      .or(`name.ilike.%${keyword}%,description.ilike.%${keyword}%,category.ilike.%${keyword}%`)
      .limit(5);

    if (data && data.length > 0) {
      keywordResults.push(...data);
    }
  }

  // Deduplicate by product id (a product can match several keywords)
  const unique = [...new Map(keywordResults.map(r => [r.id, r])).values()];

  if (unique.length > 0) {
    return { results: unique.slice(0, 5), method: "keyword" };
  }

  // 3. Fall back to category browsing
  const { data: categories } = await supabase
    .from("products")
    .select("category")
    .eq("store_id", storeId)
    .not("category", "is", null);

  // data is null when the request fails, so default to an empty list
  const uniqueCategories = [...new Set((categories ?? []).map(c => c.category))];

  return { results: [], method: "none", availableCategories: uniqueCategories };
}

Step 3: Category fallback for total misses.

If both semantic and keyword search fail, the bot gets a list of available categories and can ask the customer to browse. "I couldn't find an exact match, but we have items in Jackets, Dresses, Accessories, and Loungewear. Which category interests you?"
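As a rough sketch of how that message could be assembled from the availableCategories list, here is one possible formatter. The buildCategoryPrompt and joinWithAnd names are hypothetical, and the wording for an empty category list is an assumption, since the post doesn't cover that case.

```javascript
// Joins items as "A", "A and B", or "A, B, and C".
function joinWithAnd(items) {
  if (items.length === 1) return items[0];
  if (items.length === 2) return `${items[0]} and ${items[1]}`;
  return `${items.slice(0, -1).join(", ")}, and ${items[items.length - 1]}`;
}

function buildCategoryPrompt(categories) {
  if (!categories || categories.length === 0) {
    // Assumed wording for a store with no categorized products.
    return "I couldn't find an exact match. Could you tell me more about what you're looking for?";
  }
  return (
    `I couldn't find an exact match, but we have items in ${joinWithAnd(categories)}. ` +
    "Which category interests you?"
  );
}

console.log(buildCategoryPrompt(["Jackets", "Dresses", "Accessories", "Loungewear"]));
```

In a chatbot, the category list could also be handed to the language model as context and phrased by the model itself; a fixed template like this one is just the most predictable option.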

The chain works like this:

  1. Semantic search (threshold 0.3) — catches queries where the intent is clear and the embedding is close
  2. ILIKE keyword search — catches queries using exact product words that embeddings missed
  3. Category browsing — catches everything else with a graceful fallback
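The same "try in order, stop at the first hit" idea can be written as a small generic runner, which makes it easy to add or reorder strategies. This is a sketch under assumptions, not the code used in Provia: runFallbackChain and the shape of the strategies array are hypothetical, and each search function is assumed to resolve to an array of products.

```javascript
// Tries each strategy in order and returns the first non-empty result set.
async function runFallbackChain(query, strategies) {
  for (const { method, search } of strategies) {
    const results = await search(query);
    if (Array.isArray(results) && results.length > 0) {
      return { results, method };
    }
  }
  // Nothing matched; the caller can switch to category browsing.
  return { results: [], method: "none" };
}

// Usage with stand-in search functions (replace with real ones).
const demoStrategies = [
  { method: "semantic", search: async () => [] },
  { method: "keyword", search: async (q) => [{ id: 1, name: `Match for ${q}` }] },
];

runFallbackChain("jackets", demoStrategies).then((out) => console.log(out.method));
```

Returning the method alongside the results keeps the information the bot needs to phrase its answer differently for a semantic hit and a keyword hit.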

The Result

Before the fix:

"date night outfit"        → Cozy Night Deluxe Loungewear Set (pajamas)
"something for tonight"    → Cozy Night Deluxe Loungewear Set (pajamas)
"night out look"           → Cozy Night Deluxe Loungewear Set (pajamas)
"show me jackets"          → Cozy Night Deluxe Loungewear Set (pajamas + jackets mixed)

After the fix:

"date night outfit"        → Elegant Evening Dress, Statement Heels (semantic, 0.42)
"something for tonight"    → Elegant Evening Dress, Bold Blazer (semantic, 0.35)
"night out look"           → Bold Blazer, Statement Heels (semantic, 0.38)
"show me jackets"          → Classic Cool Denim Jacket, Vintage Leather Bomber (keyword fallback)
"cozy loungewear"          → Cozy Night Deluxe Loungewear Set (semantic, 0.67)

The pajamas now only appear when someone actually asks for loungewear or pajamas. The fallback chain catches queries that the tighter threshold would have dropped. And when nothing matches, the bot asks about categories instead of guessing wrong.

The Lesson

Vector similarity search is powerful but naive. It measures distance in embedding space without understanding intent, context, or shopping behavior. A 0.1 threshold in a small catalog means everything matches everything. A 0.3 threshold means some legitimate queries return nothing. There's no single threshold that works for all queries.

The solution isn't finding the perfect threshold; it's accepting that no single search method works for everything. Build a fallback chain. Start with the most precise method, fall back to the broadest. Semantic search handles roughly 70% of queries, the ones where intent is clear. Keyword search handles roughly 20%, where the customer uses exact product terms. Category browsing handles the remaining 10% or so, where the query is too vague or unusual for any automated matching.

And test with real product names. I never would have found the pajama problem if my test catalog only had products with unique, distinct names. The bug only appeared because "night" was a common word that bridged unrelated concepts. Your catalog probably has the same issue with words like "classic," "premium," "comfort," or "style." Check your embeddings. Your search is probably returning pajamas too.
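One lightweight way to act on this is a small regression check that runs known-tricky queries through the search and flags products that should or should not appear. The sketch below is hypothetical: checkSearchCases and the case format are made up for illustration, and searchFn is assumed to resolve to an object with a results array, like searchWithFallback above.

```javascript
// Runs each query and collects human-readable failures.
async function checkSearchCases(searchFn, cases) {
  const failures = [];
  for (const { query, mustInclude = [], mustNotInclude = [] } of cases) {
    const { results } = await searchFn(query);
    const names = results.map((r) => r.name);
    for (const name of mustNotInclude) {
      if (names.includes(name)) {
        failures.push(`"${query}" returned unwanted product: ${name}`);
      }
    }
    for (const name of mustInclude) {
      if (!names.includes(name)) {
        failures.push(`"${query}" is missing expected product: ${name}`);
      }
    }
  }
  return failures;
}

// Example cases built from the queries in this post.
const searchCases = [
  { query: "date night outfit", mustNotInclude: ["Cozy Night Deluxe Loungewear Set"] },
  { query: "show me jackets", mustInclude: ["Classic Cool Denim Jacket"] },
  { query: "cozy loungewear", mustInclude: ["Cozy Night Deluxe Loungewear Set"] },
];
```

To run it against the real search, wrap the store id, for example checkSearchCases((q) => searchWithFallback(q, storeId), searchCases), and treat a non-empty failures array as a failed check.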


I'm documenting my entire journey building an AI sales platform from Gaza. Follow me @AliMAfana for more real bugs from a real product.
