tzedaka

Keyword bot vs. LLM agent for e-commerce Q&A: a technical breakdown

Most automation tools for MercadoLibre sellers fall into one of two categories: keyword-based chatbots (MercadoBot, Yobot, JaimeBot) or LLM-powered agents. From the outside, they look similar — buyer asks a question, system sends a reply. Under the hood, they're completely different architectures with very different failure modes.

This post breaks down how each works technically, where each fails, and why the difference matters at scale.


How a keyword bot works

The core logic is a rule engine. At setup, the seller configures a list of keyword → response pairs. At runtime:

```python
def keyword_bot_reply(question: str, rules: list[dict]) -> str | None:
    question_lower = question.lower()
    for rule in rules:
        if any(kw.lower() in question_lower for kw in rule["keywords"]):
            return rule["response"]
    return None  # no match → fallback or no reply
```

That's essentially it. The system checks if the incoming text contains a preconfigured string. If it does, it returns the associated template. If it doesn't, the question goes unanswered or gets a generic fallback.

The structural problem: the bot has no access to the product listing. It responds with text the seller wrote at configuration time — which may be outdated, incomplete, or wrong for a specific variant. The seller has to manually anticipate every phrasing a buyer might use.

At small scale with simple catalogs (10 products, predictable questions), this works fine. At scale or with technical products, it breaks.


How an LLM agent works

An LLM agent doesn't pattern-match — it reasons. The key architectural difference is context injection: before generating a reply, the agent retrieves real data from the product listing and injects it into the prompt.

```python
import json

# mercadolibre_api, get_seller_profile, and llm are placeholders for the
# seller's MeLi API client, profile lookup, and LLM client of choice.
def agent_reply(question: str, listing_id: str) -> str:
    # 1. Fetch live listing context from the MeLi API
    listing = mercadolibre_api.get_listing(listing_id)
    context = {
        "title": listing["title"],
        "attributes": listing["attributes"],       # voltage, dimensions, compatibility...
        "variations": listing["variations"],       # sizes, colors, models
        "description": listing["description"],
        "seller_profile": get_seller_profile(),    # warranty, shipping, return policy
    }

    # 2. Build the prompt with injected context
    prompt = f"""
    You are a sales assistant for a MercadoLibre seller.

    Product context:
    {json.dumps(context, ensure_ascii=False)}

    Buyer question: {question}

    Answer the question using only the product context provided.
    If the information is not available in the context, say so clearly.
    """

    # 3. Call the LLM
    return llm.complete(prompt)
```

The agent can answer questions about voltage, compatibility, variants, and warranty without any preconfigured rules — because it reads the actual listing before generating the reply.


Failure modes by question type

This is where the architectural difference becomes practical:

| Question type | Example | Keyword bot | LLM agent |
| --- | --- | --- | --- |
| Exact keyword match | "¿tiene garantía?" | ✅ works | ✅ works |
| Synonym / paraphrase | "¿qué cobertura tiene en fallas?" | ❌ no match | ✅ reads warranty from profile |
| Technical spec | "¿sirve para 220V?" | ❌ unless pre-configured | ✅ reads voltage attribute |
| Variant combination | "¿el azul también viene en XL?" | ❌ no match | ✅ reads variations |
| Compatibility | "¿funciona con mi HP 14s?" | ❌ no match | ✅ reads compatibility attributes |
| Out-of-scope | "¿hacen instalación a domicilio?" | ❌ generic fallback | ✅ returns "not available" cleanly |

For catalogs with simple, predictable questions, keyword bots cover 60–70% of cases. For technical catalogs (electronics, tools, auto parts), the unmatched rate is typically 30–50%.


The context window problem

LLM agents have their own failure mode: context quality.

The agent is only as good as the data injected into the prompt. If the listing has incomplete attributes — no voltage listed, missing dimensions, vague description — the agent has nothing to work with and will either hallucinate or return a non-answer.

```python
# Bad listing attributes
attributes = [
    {"id": "BRAND", "value": "Samsung"},
    {"id": "MODEL", "value": "Galaxy Tab"},
]
# Agent can't answer "¿funciona con 220V?" — voltage not in context

# Good listing attributes
attributes = [
    {"id": "BRAND", "value": "Samsung"},
    {"id": "MODEL", "value": "Galaxy Tab"},
    {"id": "VOLTAGE", "value": "110V/220V"},
    {"id": "CONNECTIVITY", "value": "WiFi, Bluetooth 5.0"},
    {"id": "COMPATIBLE_DEVICES", "value": "Android, Windows, Mac"},
]
# Agent answers voltage and compatibility questions correctly
```

This means the quality of automated replies is directly tied to how complete the listing data is — which is actually a useful forcing function for sellers to maintain better catalog hygiene.
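One way to operationalize that forcing function is a pre-flight check before enabling auto-reply on a listing. This is a sketch, not a real API: `REQUIRED_ATTRS` is an example set the seller would choose per category, and the listing dict mirrors the attribute shape shown above.

```python
# Hypothetical pre-flight check: flag listings missing the attributes
# buyers most often ask about before enabling auto-reply on them.
REQUIRED_ATTRS = {"VOLTAGE", "DIMENSIONS", "COMPATIBLE_DEVICES"}  # example set

def missing_attributes(listing: dict) -> set[str]:
    present = {a["id"] for a in listing.get("attributes", [])}
    return REQUIRED_ATTRS - present

listing = {
    "attributes": [
        {"id": "BRAND", "value": "Samsung"},
        {"id": "MODEL", "value": "Galaxy Tab"},
    ]
}
print(sorted(missing_attributes(listing)))
# → ['COMPATIBLE_DEVICES', 'DIMENSIONS', 'VOLTAGE']
```

Listings that fail the check can be routed to manual replies until the attributes are filled in, so the agent never generates from an empty context.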


MercadoLibre's native AI: a hybrid case

MeLi introduced an AI suggestion feature in 2024–2025. It reads the listing and generates a suggested reply — which is genuinely good quality for standard questions.

The catch: it doesn't auto-send. The seller has to approve each suggestion manually. Architecturally, it's an LLM agent without the final delivery step. For sellers with off-hours volume (60% of questions arrive outside business hours per Ventiapp 2025 data), this doesn't solve the automation problem.

The delta between MeLi's native AI and external agents like Shopao is one step in the pipeline: auto-send vs. manual approval.


When to use each

Keyword bot: small catalog, simple questions, budget-constrained, seller willing to invest setup time per product category.

LLM agent: technical catalog, high off-hours volume, multi-variant products, seller wants zero configuration per question type.

Hybrid (keyword first, LLM fallback): high volume scenarios where you want to minimize LLM API calls for trivially answerable questions. The keyword layer handles FAQ-type questions cheaply; the LLM handles everything else.

```python
def hybrid_reply(question: str, listing_id: str, rules: list[dict]) -> str:
    # Try cheap keyword match first
    quick_reply = keyword_bot_reply(question, rules)
    if quick_reply:
        return quick_reply

    # Fall back to LLM for complex/unmatched questions
    return agent_reply(question, listing_id)
```

Takeaway

The choice isn't really "chatbot vs. AI" — it's about whether your reply system has access to real product context at inference time. A keyword bot configured with correct answers can outperform a poorly prompted LLM agent. But a keyword bot fundamentally cannot answer questions it wasn't configured for, while an LLM agent with good listing data can handle the full distribution of buyer questions without any manual rule configuration.

For technical catalogs on MercadoLibre, the unmatched question rate with keyword bots is high enough that the difference in conversion is measurable.


Full comparison including pricing and real case studies: shopao.io/blog


Tags: #ai #machinelearning #ecommerce #python
