Anton Pronin
How We Built AI Search for WooCommerce Using RAG

TL;DR: Default WooCommerce search is a glorified blog search engine. We built a RAG-based AI search layer that understands natural language queries like "running shoes under $120" and translates them into structured product retrieval. Here's the full architecture, code, and lessons learned.

WooCommerce powers over 6 million online stores.

Yet one core feature still behaves almost the same way it did a decade ago: product search.
Most stores rely on the default WordPress search system, which was never designed for e-commerce. It was designed for finding blog posts.

Customers search like this:

running shoes under $120
red dress size M
cheap laptop for travel
laptop good for video editing

But the default search engine expects something closer to:

Nike Pegasus 40
MacBook Air M2
Levi's 501 Red W8 L32

When queries don't match product titles exactly, results fail — silently. The customer sees an empty page or irrelevant products and leaves.

We ran into this problem repeatedly while building WooCommerce stores. So we built Assist My Shop — an AI-powered search layer for WooCommerce that understands what customers actually mean.
This article explains the full architecture: why default search fails, how RAG solves it, and the specific technical decisions we made along the way.

The Problem with WooCommerce Search

WooCommerce search is essentially WordPress post search with a product type filter.
Under the hood, it behaves roughly like this:

SELECT *
FROM wp_posts
WHERE post_type = 'product'
  AND post_status = 'publish'
  AND (
    post_title LIKE '%running shoes%'
    OR post_content LIKE '%running shoes%'
  )
ORDER BY post_date DESC;

This causes several fundamental problems.

Price Is Not Searchable

If a user searches for a cheap laptop, the engine looks for the word "cheap" in product titles and descriptions.
But price is structured metadata:

-- Price lives here, not in post_title
SELECT meta_value
FROM wp_postmeta
WHERE meta_key = '_price'
  AND post_id = 1421;

The search system has no mechanism to interpret under $120 as a price filter. These are completely separate systems with no connection.
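To make the disconnect concrete, here's a toy sketch (hypothetical product data, not the real WooCommerce schema): titles and prices live in separate structures, and keyword matching only ever consults the titles.

```python
# Hypothetical mini-catalog: the "posts" and "postmeta" tables, simplified.
products = {
    101: "Lenovo ThinkPad X1 Carbon",
    102: "Acer Aspire 5 Laptop",
    103: "Dell XPS 13",
}
prices = {101: 1899.00, 102: 449.00, 103: 1299.00}  # lives in a separate table

def keyword_search(query: str) -> list[int]:
    """Mimics `post_title LIKE '%query%'` — the prices dict is never consulted."""
    q = query.lower()
    return [pid for pid, title in products.items() if q in title.lower()]

# "cheap laptop" matches nothing: no title contains the literal phrase,
# and the $449 Acer is invisible because price is never looked at.
print(keyword_search("cheap laptop"))  # → []
print(keyword_search("laptop"))        # → [102]
```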

Product Attributes Live in Taxonomy Tables

Attributes like size, color, and brand are stored across multiple joined tables:

SELECT p.ID, p.post_title, t.name AS attribute_value
FROM wp_posts p
JOIN wp_term_relationships tr ON p.ID = tr.object_id
JOIN wp_term_taxonomy tt ON tr.term_taxonomy_id = tt.term_taxonomy_id
JOIN wp_terms t ON tt.term_id = t.term_id
WHERE p.post_type = 'product'
  AND tt.taxonomy = 'pa_color'
  AND t.name = 'red';

A simple customer query such as "red dress size M" requires multiple joins across wp_terms, wp_term_taxonomy, and wp_term_relationships. Most WooCommerce stores never implement this logic — and the default search certainly doesn't.

Search Has No Concept of Semantic Meaning

These three queries mean very different things:

monitor for gaming        → high refresh rate, low latency, large screen
monitor for design work   → high color accuracy, wide color gamut
monitor for office use    → compact, low price, basic specs

WordPress search treats all three identically. It looks for the word monitor in product titles. It returns the same results for all three queries. The semantic difference — the intent — is completely invisible to it.
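A toy sketch makes this visible (hypothetical titles, simplified matching): any-word keyword matching reduces all three intents to the token "monitor" and returns identical result sets.

```python
# Hypothetical catalog of three very different monitors.
titles = [
    "UltraSharp 27 Monitor",
    "Gaming Monitor 240Hz",
    "Budget Office Monitor",
]

def wp_style_search(query: str) -> list[str]:
    # WordPress-style matching: any title containing any query word.
    words = query.lower().split()
    return [t for t in titles if any(w in t.lower() for w in words)]

queries = [
    "monitor for gaming",
    "monitor for design work",
    "monitor for office use",
]
results = [wp_style_search(q) for q in queries]

# Every title contains "monitor", so all three intents collapse
# into the same result list — the intent is invisible.
assert results[0] == results[1] == results[2]
```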

What We Wanted to Build

Our goal was straightforward:
Allow customers to search in the way they naturally think.

running shoes under $120

Should be interpreted as:

{
  "category": "running shoes",
  "filters": {
    "price_max": 120
  }
}

Not as a literal string match against product titles.
To achieve this we built a RAG-based search architecture.

System Architecture Overview

Assist My Shop acts as a search layer that sits between the customer and WooCommerce. WooCommerce remains the source of truth — we never modify it. Our system handles product indexing, query understanding, and retrieval.

Customer Query
     ↓
Search UI (Chat Widget / Search Bar)
     ↓
Assist My Shop API (FastAPI)
     ↓
┌────────────────────────────────┐
│  1. Embedding Generation       │  ← OpenAI / text-embedding-3-small
│  2. Vector Search              │  ← Pinecone / similarity search
│  3. Structured Filter Parsing  │  ← price, size, stock, category
│  4. Hybrid Retrieval           │  ← vector + filters combined
│  5. LLM Reasoning (optional)   │  ← GPT-4o-mini, grounded only
└────────────────────────────────┘
     ↓
WooCommerce Product IDs
     ↓
Store renders products normally

Step 1 — Product Indexing

The first challenge: WooCommerce product data is fragmented across many tables.
We solve this by building a normalized product document at sync time — one clean JSON object per product that contains everything needed for search.

# product_indexer.py

import openai
import json
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class ProductDocument:
    id: int
    title: str
    description: str
    short_description: str
    price: float
    regular_price: float
    sale_price: Optional[float]
    category: str
    tags: list[str]
    attributes: dict
    in_stock: bool
    stock_quantity: Optional[int]
    sku: str
    embedding_text: str  # Denormalized text used for embedding

def build_embedding_text(product: dict) -> str:
    """
    Construct a rich text representation of the product
    for embedding generation. More context = better retrieval.
    """
    parts = [
        product["title"],
        product.get("short_description", ""),
        product.get("category", ""),
        " ".join(product.get("tags", [])),
    ]

    # Add attributes as natural language
    for key, value in product.get("attributes", {}).items():
        if isinstance(value, list):
            parts.append(f"{key}: {', '.join(value)}")
        else:
            parts.append(f"{key}: {value}")

    return " | ".join(filter(None, parts))


def generate_embedding(text: str) -> list[float]:
    client = openai.OpenAI()
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return response.data[0].embedding


def index_product(product: dict, pinecone_index) -> None:
    embedding_text = build_embedding_text(product)
    embedding = generate_embedding(embedding_text)

    # Store vector with full metadata for filtered retrieval
    pinecone_index.upsert(vectors=[{
        "id": str(product["id"]),
        "values": embedding,
        "metadata": {
            "title": product["title"],
            "price": float(product.get("price", 0)),
            "category": product.get("category", ""),
            "in_stock": product.get("in_stock", True),
            "attributes": json.dumps(product.get("attributes", {})),
            "wc_product_id": product["id"],
        }
    }])

Why denormalize into embedding text?

Embedding quality depends heavily on input richness. A product titled "Nike Air Zoom Pegasus 40" in isolation doesn't embed as well as:

Nike Air Zoom Pegasus 40 | Lightweight running shoes for daily training |
running shoes | brand: Nike, color: black, size: 8 9 10

The more context the embedding sees, the better it captures semantic meaning.
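A quick way to sanity-check the indexer's output — restating build_embedding_text here so the snippet runs standalone, with a hypothetical product dict:

```python
def build_embedding_text(product: dict) -> str:
    # Same logic as in product_indexer.py above.
    parts = [
        product["title"],
        product.get("short_description", ""),
        product.get("category", ""),
        " ".join(product.get("tags", [])),
    ]
    for key, value in product.get("attributes", {}).items():
        if isinstance(value, list):
            parts.append(f"{key}: {', '.join(value)}")
        else:
            parts.append(f"{key}: {value}")
    return " | ".join(filter(None, parts))

# Hypothetical product payload as it would arrive from the sync endpoint.
product = {
    "title": "Nike Air Zoom Pegasus 40",
    "short_description": "Lightweight running shoes for daily training",
    "category": "running shoes",
    "tags": [],
    "attributes": {"brand": "Nike", "color": "black"},
}
print(build_embedding_text(product))
# → Nike Air Zoom Pegasus 40 | Lightweight running shoes for daily training | running shoes | brand: Nike | color: black
```

Note how empty fields (the tags here) are dropped by `filter(None, ...)` rather than leaving stray separators in the embedding text.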

Step 2 — Query Processing and Intent Parsing

When a customer submits a query, we do two things in parallel:

  1. Generate a vector embedding of the query
  2. Parse structured constraints (price, size, stock, category)

# query_processor.py

import re
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ParsedQuery:
    raw: str
    semantic_text: str        # Cleaned text for embedding
    price_max: Optional[float] = None
    price_min: Optional[float] = None
    in_stock_only: bool = True
    size_filter: Optional[str] = None
    color_filter: Optional[str] = None

def parse_query(raw_query: str) -> ParsedQuery:
    query = raw_query.lower().strip()
    parsed = ParsedQuery(raw=raw_query, semantic_text=query)

    # Extract price constraints
    # Matches: "under $120", "below 100", "less than $80", "up to $200"
    price_max_match = re.search(
        r'(?:under|below|less than|up to|max|cheaper than)\s*\$?(\d+)',
        query
    )
    if price_max_match:
        parsed.price_max = float(price_max_match.group(1))
        parsed.semantic_text = re.sub(price_max_match.re, '', parsed.semantic_text)

    # Matches: "over $50", "above $100", "more than $200", "at least $30"
    price_min_match = re.search(
        r'(?:over|above|more than|at least|starting from)\s*\$?(\d+)',
        query
    )
    if price_min_match:
        parsed.price_min = float(price_min_match.group(1))
        parsed.semantic_text = re.sub(price_min_match.re, '', parsed.semantic_text)

    # Extract size (US clothing/shoe sizes)
    size_match = re.search(r'\bsize\s+([smlxl\d]+)\b', query)
    if size_match:
        parsed.size_filter = size_match.group(1).upper()

    # Extract color
    colors = ['red', 'blue', 'black', 'white', 'green', 'yellow',
              'pink', 'purple', 'orange', 'grey', 'gray', 'brown']
    for color in colors:
        if re.search(rf'\b{color}\b', query):
            parsed.color_filter = color
            break

    parsed.semantic_text = parsed.semantic_text.strip()
    return parsed
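As a quick sanity check, here's the price-extraction step condensed into a standalone function (same regex as parse_query above, trimmed to just the upper-bound case for illustration):

```python
import re
from typing import Optional

def extract_price_max(query: str) -> tuple[str, Optional[float]]:
    """Pull an upper price bound out of the query text; return the
    cleaned semantic text and the extracted bound (or None)."""
    match = re.search(
        r'(?:under|below|less than|up to|max|cheaper than)\s*\$?(\d+)',
        query.lower()
    )
    if not match:
        return query, None
    # match.re is the compiled pattern; sub() strips every occurrence.
    cleaned = re.sub(match.re, '', query.lower()).strip()
    return cleaned, float(match.group(1))

print(extract_price_max("running shoes under $120"))
# → ('running shoes', 120.0)
```

The price phrase is removed before embedding so the vector captures only the semantic part ("running shoes"), while the constraint moves into the metadata filter.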

Step 3 — Hybrid Search

Vector similarity alone is not enough for e-commerce. Customers include hard constraints that must be respected precisely. A $120 budget means $120 — not $140.
We combine vector similarity with metadata filters:

# search.py

from pinecone import Pinecone
from typing import Optional

def hybrid_search(
    query_embedding: list[float],
    price_max: Optional[float] = None,
    price_min: Optional[float] = None,
    in_stock_only: bool = True,
    color_filter: Optional[str] = None,
    top_k: int = 10
) -> list[dict]:

    # PINECONE_API_KEY and PINECONE_INDEX_NAME are loaded from config (e.g. env vars)
    pc = Pinecone(api_key=PINECONE_API_KEY)
    index = pc.Index(PINECONE_INDEX_NAME)

    # Build metadata filter
    filter_conditions = {}

    if in_stock_only:
        filter_conditions["in_stock"] = {"$eq": True}

    if price_max is not None and price_min is not None:
        filter_conditions["price"] = {
            "$gte": price_min,
            "$lte": price_max
        }
    elif price_max is not None:
        filter_conditions["price"] = {"$lte": price_max}
    elif price_min is not None:
        filter_conditions["price"] = {"$gte": price_min}

    results = index.query(
        vector=query_embedding,
        top_k=top_k,
        filter=filter_conditions if filter_conditions else None,
        include_metadata=True
    )

    return [
        {
            "wc_product_id": int(match.metadata["wc_product_id"]),
            "title": match.metadata["title"],
            "score": match.score,
            "price": match.metadata["price"],
        }
        for match in results.matches
    ]

The key insight: vector search handles what the customer wants. Metadata filters handle constraints on that want. Neither alone is sufficient — together they're very powerful.
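To see what the constraint side looks like in isolation, here's the filter-building branch of hybrid_search pulled out into its own function (same logic as above, hypothetical helper name):

```python
from typing import Optional

def build_filter(price_min: Optional[float], price_max: Optional[float],
                 in_stock_only: bool = True) -> dict:
    """Mirrors the metadata-filter branch in hybrid_search above."""
    conditions: dict = {}
    if in_stock_only:
        conditions["in_stock"] = {"$eq": True}
    if price_max is not None and price_min is not None:
        conditions["price"] = {"$gte": price_min, "$lte": price_max}
    elif price_max is not None:
        conditions["price"] = {"$lte": price_max}
    elif price_min is not None:
        conditions["price"] = {"$gte": price_min}
    return conditions

# "running shoes under $120", in-stock only:
print(build_filter(None, 120.0))
# → {'in_stock': {'$eq': True}, 'price': {'$lte': 120.0}}
```

This dict is passed straight to Pinecone's `filter` argument, so the $120 bound is enforced exactly at retrieval time rather than hoped for via similarity.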

Step 4 — The RAG Layer

For conversational queries, we add an optional LLM reasoning step on top of retrieved results.
Critical rule: the LLM never invents products. It only works with what was retrieved.

# rag_layer.py

import openai
import json

def generate_response(
    user_query: str,
    retrieved_products: list[dict],
    conversation_history: list[dict]
) -> str:
    """
    Use an LLM to reason about retrieved products and
    generate a helpful, grounded response.
    """

    products_context = json.dumps(retrieved_products, indent=2)

    system_prompt = """You are a helpful shopping assistant for a WooCommerce store.

IMPORTANT RULES:
- You ONLY recommend products from the Retrieved Products list below.
- Never invent, guess, or hallucinate products that are not in the list.
- If no products match the query, say so honestly.
- Keep responses concise and focused on helping the customer find what they need.
- If the customer's request is ambiguous, ask ONE clarifying question.

Retrieved Products:
{products}
""".format(products=products_context)

    messages = [
        {"role": "system", "content": system_prompt},
        *conversation_history,
        {"role": "user", "content": user_query}
    ]

    client = openai.OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        temperature=0.3,   # Low temperature = consistent, factual responses
        max_tokens=400
    )

    return response.choices[0].message.content

Why temperature=0.3?

Higher temperatures make LLMs more creative — which is the last thing you want when recommending products. You want consistency and factual accuracy. We settled on 0.3 after testing: low enough to be reliable, high enough to handle varied phrasings naturally.

Step 5 — WooCommerce Plugin Integration

The WordPress plugin handles three responsibilities:
Product Sync

// assist-my-shop-sync.php

/**
 * Hook into WooCommerce product save/update/delete events
 * and push changes to the Assist My Shop API index.
 */

add_action('woocommerce_update_product', 'ams_sync_product');
add_action('woocommerce_new_product', 'ams_sync_product');

function ams_sync_product(int $product_id): void {
    $product = wc_get_product($product_id);
    if (!$product) return;

    $payload = ams_build_product_payload($product);

    wp_remote_post(AMS_API_URL . '/products/sync', [
        'headers' => [
            'Authorization' => 'Bearer ' . get_option('ams_api_key'),
            'Content-Type'  => 'application/json',
        ],
        'body'    => wp_json_encode($payload),
        'timeout' => 15,
    ]);
}

function ams_build_product_payload(WC_Product $product): array {
    $attributes = [];
    foreach ($product->get_attributes() as $key => $attr) {
        $attributes[$key] = $attr->get_options();
    }

    $categories = wp_get_post_terms(
        $product->get_id(),
        'product_cat',
        ['fields' => 'names']
    );

    return [
        'id'                => $product->get_id(),
        'title'             => $product->get_name(),
        'description'       => wp_strip_all_tags($product->get_description()),
        'short_description' => wp_strip_all_tags($product->get_short_description()),
        'price'             => (float) $product->get_price(),
        'regular_price'     => (float) $product->get_regular_price(),
        'sale_price'        => (float) $product->get_sale_price() ?: null,
        'sku'               => $product->get_sku(),
        'in_stock'          => $product->is_in_stock(),
        'stock_quantity'    => $product->get_stock_quantity(),
        'category'          => implode(', ', $categories),
        'attributes'        => $attributes,
        'permalink'         => get_permalink($product->get_id()),
    ];
}

// Handle product deletion
add_action('before_delete_post', function(int $post_id): void {
    if (get_post_type($post_id) !== 'product') return;

    wp_remote_request(AMS_API_URL . '/products/' . $post_id, [
        'method'  => 'DELETE',
        'headers' => [
            'Authorization' => 'Bearer ' . get_option('ams_api_key'),
        ],
    ]);
});

FastAPI Endpoint

# main.py

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()

class SearchRequest(BaseModel):
    query: str
    conversation_history: list[dict] = []
    top_k: int = 8

@app.post("/search")
async def search(request: SearchRequest):
    # 1. Parse structured constraints from query
    parsed = parse_query(request.query)

    # 2. Generate embedding for semantic part
    embedding = generate_embedding(parsed.semantic_text)

    # 3. Hybrid retrieval: vector + filters
    products = hybrid_search(
        query_embedding=embedding,
        price_max=parsed.price_max,
        price_min=parsed.price_min,
        in_stock_only=True,
        color_filter=parsed.color_filter,
        top_k=request.top_k
    )

    if not products:
        return {
            "products": [],
            "message": "No products found matching your search."
        }

    # 4. Optional: LLM reasoning layer for conversational responses
    response_text = generate_response(
        user_query=request.query,
        retrieved_products=products,
        conversation_history=request.conversation_history
    )

    return {
        "products": products,
        "message": response_text
    }

Performance

Typical request timeline on our production stack:

| Step                        | Time         |
|-----------------------------|--------------|
| Query parsing               | < 5 ms       |
| Embedding generation        | 80–150 ms    |
| Vector retrieval (Pinecone) | 10–30 ms     |
| LLM reasoning (gpt-4o-mini) | 400–800 ms   |
| Total (without LLM)         | ~150–200 ms  |
| Total (with LLM)            | ~600–1000 ms |

For the search bar (instant results), we skip the LLM layer and return raw vector results — fast enough for real-time search. For the conversational chat widget, we include the LLM layer and users tolerate the slightly longer response time because they expect a chat interaction.

What We Learned

1. Embedding text quality matters more than model choice

We spent weeks testing different embedding models. The biggest improvement came not from switching models, but from improving the input text. Concatenating title + description + category + attributes into a rich text string improved retrieval quality dramatically compared to embedding just the product title.

2. Structured filter parsing catches most edge cases

We initially over-engineered the filter parsing with an LLM. It was slower, more expensive, and not more accurate for the common patterns. A well-tuned regex parser handles 95% of real customer queries: price ranges, sizes, colors, stock status. Save the LLM for the response layer, not the parsing layer.

3. The LLM must be strictly grounded

Early prototypes let the LLM answer general product questions from its training data. It would occasionally recommend products that didn't exist in the store, or quote wrong prices. The fix: the system prompt must explicitly prohibit the LLM from going outside the retrieved product list. temperature=0.3 helps but the system prompt constraint is non-negotiable.

4. Sync latency is a UX problem

When a store owner updates a product price, they expect search to reflect that immediately. Our initial architecture had a sync delay of up to 5 minutes. We fixed this with direct webhook hooks on woocommerce_update_product — now sync completes within seconds of any product change.

5. Mobile query patterns differ significantly from desktop

Mobile users send shorter, more fragmented queries:

  • Desktop
    running shoes for flat feet under $100

  • Mobile
    flat feet shoes $100

The semantic search handles both well. But mobile users also send more typos and informal language. Embedding models handle this gracefully — far better than keyword search ever could.

Current Stack

| Component  | Technology                    |
|------------|-------------------------------|
| API        | FastAPI (Python)              |
| Embeddings | OpenAI text-embedding-3-small |
| Vector DB  | Pinecone                      |
| LLM        | GPT-4o-mini                   |
| WP Plugin  | PHP / WooCommerce hooks       |

Final Thoughts

The core problem with WooCommerce search isn't the search algorithm — it's the combination of:

  • Fragmented product data across multiple database tables
  • A keyword-based query system with no concept of semantic meaning
  • No mechanism to translate natural-language constraints into structured filters

RAG-based search addresses all three. Products are normalized into rich documents and indexed as vectors. Queries are parsed for structured constraints and embedded for semantic similarity. Retrieval combines both. The LLM reasons only over what was found.
The result: a search experience that handles running shoes under $120 the same way a knowledgeable sales associate would — understanding what the customer means, not just what they typed.
We built Assist My Shop to bring this to any WooCommerce store without requiring a data engineering team to build it from scratch. If you're working on something similar, or have questions about the architecture, drop a comment below.


Free plan available · WooCommerce plugin on GitHub
