Paulo Fox

Building a Mercado Livre Reply Bot with Claude AI: 28-Second Response Time, 24/7

I built a bot that reads every question posted to a Mercado Livre store and replies automatically — in under 30 seconds, 24/7, using Claude AI. Here's exactly how it works and why response time is a competitive moat in Brazilian e-commerce.


Why Mercado Livre Questions Are a Ranking Signal

Mercado Livre's reputation algorithm ("Reputação") scores sellers across five metrics. Response time is one of them — specifically, answering questions in under 1 hour moves your badge from "yellow" to "green". Green sellers get:

  • Higher placement in search results
  • "MercadoLíder" eligibility (unlocks free shipping, lower fees)
  • Higher conversion (buyers trust responsive sellers)

The problem: a human checking questions every hour is expensive. Missing nights and weekends costs rankings. FoxReply automates this entirely.


Architecture

```
Mercado Livre API ──polling──► FoxReply Worker
                                      │
                               Claude API (Sonnet)
                                      │
                               ML Answer Endpoint
```
  • Python + FastAPI — lightweight async service
  • Claude claude-sonnet-4-6 — generates replies grounded in product catalog
  • ML Official API — polling /questions/search every 60s
  • Redis — deduplication (never reply twice to the same question)
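The glue between these pieces is one polling cycle: fetch unanswered questions, skip duplicates, generate a reply, post it. A minimal sketch with the four steps injected as callables (the function and parameter names here are illustrative, not FoxReply's actual internals):

```python
from typing import Awaitable, Callable

async def run_once(
    fetch: Callable[[], Awaitable[list[dict]]],     # poll ML for UNANSWERED questions
    is_new: Callable[[int], Awaitable[bool]],       # Redis dedup check
    answer: Callable[[str, dict], Awaitable[str]],  # Claude reply generation
    post: Callable[[int, str], Awaitable[None]],    # ML answer endpoint
) -> int:
    """One polling cycle. Returns the number of replies sent."""
    sent = 0
    for q in await fetch():
        if not await is_new(q["id"]):
            continue  # already handled in an earlier cycle
        reply = await answer(q["text"], q["item"])
        await post(q["id"], reply)
        sent += 1
    return sent
```

Injecting the steps keeps the cycle trivially testable with fakes, and it is the same shape the multi-marketplace abstraction later relies on.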

Lesson 1: ML API Has No Webhooks — Poll Efficiently

Mercado Livre does not send webhooks for new questions. You must poll their API:

```
GET /questions/search?seller_id={seller_id}&status=UNANSWERED&limit=50
Authorization: Bearer {access_token}
```

The trap: if you poll too aggressively you hit rate limits (429). If you poll too slowly you miss the 1-hour SLA.

Our solution — adaptive polling:

```python
from datetime import datetime, time

def poll_interval() -> int:
    """Return the polling interval in seconds based on time of day."""
    now = datetime.now().time()
    # Business hours: every 30s
    if time(8, 0) <= now <= time(22, 0):
        return 30
    # Night: every 5 minutes (questions are rare, SLA resets at 8am anyway)
    return 300
```

This keeps us within rate limits while hitting the 1-hour window during business hours.
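On top of the adaptive interval, it also pays to back off when a 429 does slip through. A sketch of the idea, with the HTTP call injected as a callable so the backoff math stays separate (names and the exponential policy are illustrative, not FoxReply's exact code):

```python
import asyncio
from typing import Awaitable, Callable

def next_delay(base: int, attempt: int, cap: int = 600) -> int:
    """Exponential backoff after consecutive 429s, capped at `cap` seconds."""
    return min(base * (2 ** attempt), cap)

async def poll_forever(fetch: Callable[[], Awaitable[int]], base: int = 30) -> None:
    """`fetch` performs one poll and returns the HTTP status code."""
    attempt = 0
    while True:
        status = await fetch()
        # Rate limited: lengthen the delay. Success: snap back to the base interval.
        attempt = attempt + 1 if status == 429 else 0
        await asyncio.sleep(next_delay(base, attempt))
```

Resetting `attempt` on the first success keeps the loop responsive as soon as ML stops throttling.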


Lesson 2: The Answer Must Sound Human

ML buyers become suspicious if replies are robotic. We tested several approaches:

| Approach | Acceptance rate |
| --- | --- |
| Direct GPT completion | 61% (buyers felt ignored) |
| Template + keyword match | 44% (too generic) |
| Claude with product context | 89% |

The difference: Claude receives the full product listing (title, description, attributes, FAQ) as context. The reply is grounded, specific, and uses the seller's natural tone.

```python
from anthropic import AsyncAnthropic

anthropic = AsyncAnthropic()  # reads ANTHROPIC_API_KEY from the environment

async def generate_reply(question: str, product: dict, tone: str = "professional") -> str:
    prompt = f"""You are a helpful store assistant for a Brazilian e-commerce seller.

Product: {product['title']}
Description: {product['description'][:500]}
Attributes: {format_attributes(product['attributes'])}

Customer question: {question}

Reply in Brazilian Portuguese, {tone} tone, max 3 sentences.
Be specific — reference the product details above.
Never say "I don't know" — if unsure, offer to check and get back."""

    response = await anthropic.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=200,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text
```

Key: max_tokens=200 keeps replies concise. ML buyers read on mobile — walls of text get ignored.
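The `format_attributes` helper used in the prompt might look like this, assuming the attribute shape ML items expose (a list of dicts with `name` and `value_name` fields); treat the exact field names as an assumption to verify against the item payload:

```python
def format_attributes(attributes: list[dict]) -> str:
    """Render ML item attributes as 'Name: value' lines for the prompt."""
    lines = []
    for attr in attributes:
        name = attr.get("name")
        value = attr.get("value_name")
        if name and value:  # skip attributes with no human-readable value
            lines.append(f"{name}: {value}")
    return "\n".join(lines)
```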


Lesson 3: Access Token Refresh Is Mandatory

ML OAuth tokens expire in 6 hours. If your worker doesn't refresh proactively, it silently stops answering — and you lose 6 hours of SLA while the queue piles up.

```python
import time

import httpx

class MLTokenManager:
    def __init__(self, client_id: str, client_secret: str, refresh_token: str):
        self.client_id = client_id
        self.client_secret = client_secret
        self.refresh_token = refresh_token
        self._access_token: str | None = None
        self._expires_at: float = 0

    async def get_token(self) -> str:
        # Refresh 5 minutes before expiry (not after)
        if time.time() >= self._expires_at - 300:
            await self._refresh()
        return self._access_token

    async def _refresh(self) -> None:
        async with httpx.AsyncClient() as client:
            resp = await client.post(
                "https://api.mercadolibre.com/oauth/token",
                data={
                    "grant_type": "refresh_token",
                    "client_id": self.client_id,
                    "client_secret": self.client_secret,
                    "refresh_token": self.refresh_token,
                },
            )
        resp.raise_for_status()
        data = resp.json()
        self._access_token = data["access_token"]
        self._expires_at = time.time() + data["expires_in"]
        # IMPORTANT: ML rotates refresh tokens — always save the new one
        self.refresh_token = data["refresh_token"]
        await save_refresh_token(self.refresh_token)  # persist to DB
```

Critical: ML rotates the refresh_token on every use. If you don't save the new one, the next refresh fails and you're locked out until manual re-authorization.
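Even with proactive refresh, a worker can occasionally hit a 401 (e.g. the token was revoked server-side). A sketch of the retry-once wrapper we'd put around every ML call, with the token plumbing injected as callables (names are illustrative):

```python
from typing import Awaitable, Callable

async def with_token_retry(
    get_token: Callable[[], Awaitable[str]],   # e.g. MLTokenManager.get_token
    call: Callable[[str], Awaitable[int]],     # performs the request, returns status
    refresh: Callable[[], Awaitable[None]],    # forces a token refresh
) -> int:
    """Run an ML API call; on a 401, refresh the token once and retry."""
    status = await call(await get_token())
    if status == 401:
        await refresh()  # token was stale despite the proactive window
        status = await call(await get_token())
    return status
```

Retrying exactly once avoids a hot loop if credentials are genuinely broken, while healing the transient case automatically.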


Lesson 4: Deduplication With Redis

The polling loop can see the same unanswered question across multiple cycles. Without deduplication, you send multiple replies to the same question — ML penalizes this.

```python
# `redis` is a redis.asyncio.Redis client created at startup

async def should_reply(question_id: int) -> bool:
    key = f"replied:{question_id}"
    # SET NX EX — atomic check-and-set, expire after 7 days
    result = await redis.set(key, "1", nx=True, ex=604800)
    return result is True  # True = we set it (first time), None = already exists

async def process_questions(questions: list[dict]) -> None:
    for q in questions:
        if not await should_reply(q["id"]):
            continue  # already replied or in-flight
        reply = await generate_reply(q["text"], q["item"])
        await post_reply(q["id"], reply)
```
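One subtlety of claiming the key before posting: if `post_reply` then fails, the question stays marked as handled and is never retried. A compensating delete fixes that; here is a sketch with the Redis and reply steps injected as callables (names are illustrative):

```python
from typing import Awaitable, Callable

async def reply_safely(
    question: dict,
    claim: Callable[[int], Awaitable[bool]],    # SET NX on replied:{id}
    release: Callable[[int], Awaitable[None]],  # DEL the key so a later cycle retries
    send: Callable[[dict], Awaitable[None]],    # generate + post the reply
) -> bool:
    if not await claim(question["id"]):
        return False  # another cycle already handled it
    try:
        await send(question)
    except Exception:
        await release(question["id"])  # posting failed: undo the claim
        raise
    return True
```

The claim-then-release pattern keeps the at-most-once guarantee for successes while restoring at-least-once delivery for failures.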

Lesson 5: Multi-Marketplace Abstraction

After ML, we added Nuvemshop and Shopee. Each has a different API shape, but the core loop is identical. We abstracted it:

```python
from abc import ABC, abstractmethod

class MarketplaceAdapter(ABC):
    @abstractmethod
    async def get_unanswered_questions(self) -> list[Question]:
        ...

    @abstractmethod
    async def post_reply(self, question_id: str, text: str) -> bool:
        ...

class MercadoLivreAdapter(MarketplaceAdapter):
    # ML-specific polling + OAuth
    ...

class NuvemshopAdapter(MarketplaceAdapter):
    # Nuvemshop is webhook-based (they DO have webhooks)
    ...
```

The worker loop is marketplace-agnostic — each adapter implements the contract.
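To make that concrete, here is a sketch of the shared cycle driving any adapter, with `Question` as an assumed minimal shape and the Claude call injected (all names here are illustrative, not FoxReply's actual classes):

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Protocol

@dataclass
class Question:
    id: str
    text: str
    product: dict

class MarketplaceAdapter(Protocol):
    async def get_unanswered_questions(self) -> list[Question]: ...
    async def post_reply(self, question_id: str, text: str) -> bool: ...

async def run_cycle(
    adapter: MarketplaceAdapter,
    answer: Callable[[str, dict], Awaitable[str]],  # wraps the Claude call
) -> int:
    """One marketplace-agnostic cycle; returns the number of replies posted."""
    sent = 0
    for q in await adapter.get_unanswered_questions():
        if await adapter.post_reply(q.id, await answer(q.text, q.product)):
            sent += 1
    return sent
```

Because the loop only touches the contract, adding a new marketplace means writing one adapter class, never touching the worker.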


Results After 3 Months

  • Average reply time: 28 seconds (vs 4.2 hours manually)
  • Reputation badge: yellow → green (all test stores)
  • Question-to-conversion rate: +23% (buyers who got fast replies converted more)
  • Operator time saved: ~3h/day per store

What's Next

FoxReply is live at foxreply.centralfox.online. Current roadmap:

  • Shopify integration (using their Storefront API)
  • Proactive follow-up messages (track buyers who asked but didn't purchase)
  • Sentiment analysis — flag hostile questions for human review

If you're building marketplace automation in Brazil or have questions about the ML API, Claude integration, or async polling patterns — drop a comment.

Built with: Python 3.12 · FastAPI · Claude claude-sonnet-4-6 · Redis · Docker · Claude Code (Anthropic)

🔗 foxreply.centralfox.online | Reddit u/foxdigitaldev
