Ricardo Cuba

Posted on Jun 24

VTEX, Magento, and Retail APIs — A Developer's Guide to Commerce Data

#api #backend #tutorial #webdev

What we learned integrating 41 LATAM retailers — and how you can do it without losing your mind.

If you've ever tried to integrate a VTEX or Magento store's API, you know the drill: read the docs (which may or may not be current), get an API key (which may or may not work), figure out the product schema (which is different for every store), handle rate limits (undocumented), and pray nothing breaks when the retailer updates their frontend next week.

We've done this 41 times. Here's what we learned.

VTEX: The Good, The Bad, and The API

VTEX powers most major retailers in LATAM — Wong, Metro, Plaza Vea, Carrefour, Éxito, and dozens more. It has a well-documented REST API.

What works

Search API (/api/catalog_system/pub/products/search) — reliable, structured product data
Pagination: range-based via the _from and _to query parameters (up to 50 items per request), predictable
Product details — consistent schema for price, name, description
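
As a rough sketch of how paging through the search endpoint can look: to our knowledge the VTEX catalog search takes `_from` and `_to` query parameters with a window of at most 50 items per request. The helper below only computes those index windows. `page_windows` and its defaults are illustrative names of ours, not part of VTEX's API.

```python
from typing import Iterator, Tuple

def page_windows(total: int, page_size: int = 50) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (_from, _to) index pairs covering `total` items."""
    if page_size < 1:
        raise ValueError("page_size must be positive")
    for start in range(0, total, page_size):
        # _to is inclusive, so a 50-item page starting at 0 ends at 49.
        yield start, min(start + page_size, total) - 1

# Each pair maps to query params such as {"_from": 0, "_to": 49}.
windows = list(page_windows(120))
```

Each window would be sent as query parameters on the search URL. Stop early when a store returns an empty page, since the total is often not known up front.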

What doesn't

Auth varies by store: some use the X-VTEX-API-AppKey and X-VTEX-API-AppToken header pair, others custom headers
Rate limits are undisclosed — you discover them by getting 429'd
Product IDs are not globally unique — same SKU, different IDs across stores
Image URLs expire — cached URLs break after hours/days
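
Two of these quirks can be absorbed by a small per-store config layer. The sketch below is a minimal illustration, not our production code: the config shape (`auth`, `key`, `token`, `custom_headers`) and both function names are hypothetical. It builds auth headers per store and namespaces product IDs by store so they stay unique across retailers.

```python
def build_headers(store: dict) -> dict:
    """Return auth headers for one store config (hypothetical config shape)."""
    if store.get("auth") == "vtex_app":
        return {
            "X-VTEX-API-AppKey": store["key"],
            "X-VTEX-API-AppToken": store["token"],
        }
    # Fallback: the store declares its own header names and values.
    return dict(store.get("custom_headers", {}))

def global_product_key(domain: str, product_id: str) -> str:
    """Prefix the store domain, since product IDs are only unique per store."""
    return f"{domain}:{product_id}"
```

Keeping this in config rather than in code means a store that changes its auth scheme needs a data change, not a deploy.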

Code: querying VTEX search

python
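import httpx
from urllib.parse import quote

async def search_vtex(domain: str, query: str, key: str, token: str) -> list:
    """Search one VTEX store's catalog and return the parsed product list."""
    # Percent-encode the term so spaces and accents survive in the URL path.
    url = f"https://{domain}/api/catalog_system/pub/products/search/{quote(query, safe='')}"
    headers = {"X-VTEX-API-AppKey": key, "X-VTEX-API-AppToken": token}
    async with httpx.AsyncClient(timeout=30) as client:
        resp = await client.get(url, headers=headers)
        # Raise on 4xx/5xx (e.g. 429) instead of parsing an error body as JSON.
        resp.raise_for_status()
        return resp.json()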

Magento: GraphQL When It Works

Magento stores expose a GraphQL endpoint. Flexible in theory, fragmented in practice:

Pros: GraphQL introspection, single endpoint, field selection.
Cons: version fragmentation (2.3 vs 2.4), auth complexity (customer and admin tokens alongside OAuth 1.0a integrations), some stores disable introspection, and deeply nested pricing (price_range.minimum_price.regular_price.value; yes, that's a real path).
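
To make the nested pricing concrete, here is a small sketch of a product search query and a defensive extractor for that price path. The query follows the standard Magento `products` GraphQL field as we understand it. `PRODUCTS_QUERY`, `build_payload`, and `regular_price` are illustrative names, and individual stores may expose different fields.

```python
import json
from typing import Optional

PRODUCTS_QUERY = """
query Search($term: String!, $size: Int!) {
  products(search: $term, pageSize: $size) {
    items {
      sku
      name
      price_range {
        minimum_price {
          regular_price { value currency }
        }
      }
    }
  }
}
"""

def build_payload(term: str, size: int = 20) -> str:
    """Serialize the JSON body to POST to the store's GraphQL endpoint."""
    return json.dumps({"query": PRODUCTS_QUERY, "variables": {"term": term, "size": size}})

def regular_price(item: dict) -> Optional[float]:
    """Walk price_range.minimum_price.regular_price.value, tolerating gaps."""
    node = item
    for key in ("price_range", "minimum_price", "regular_price", "value"):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node
```

Returning `None` instead of raising keeps one malformed item from failing a whole page of results.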

The normalization problem

The hardest part isn't querying 41 APIs. It's normalizing the results:

"1kg arroz" ≠ "arroz 1000g" ≠ "arroz x1kg" ≠ "Arroz Extra 1Kg"

Our normalizer handles unit conversion (kg/g/lb/oz/L/mL), fuzzy product matching across Spanish/Portuguese/English, brand extraction, and always returns price_per_kg.
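
The unit-conversion piece can be sketched in a few lines. This is a simplified illustration, not our actual normalizer: it only handles weight units, picks the first size it finds in the title, and treats a comma as a decimal separator. `weight_kg` and `price_per_kg` are illustrative names.

```python
import re
from typing import Optional

# Conversion factors to kilograms. Volume units (L/mL) are left out here
# because converting them to kg needs a density assumption.
TO_KG = {"kg": 1.0, "g": 0.001, "lb": 0.45359237, "oz": 0.028349523125}

# A number followed by a unit, e.g. "1kg", "1000 g", "x1Kg", "2,5 kg".
SIZE_RE = re.compile(r"(\d+(?:[.,]\d+)?)\s*(kg|g|lb|oz)\b", re.IGNORECASE)

def weight_kg(title: str) -> Optional[float]:
    """Extract the first weight found in a product title, in kilograms."""
    match = SIZE_RE.search(title)
    if match is None:
        return None
    amount = float(match.group(1).replace(",", "."))
    return amount * TO_KG[match.group(2).lower()]

def price_per_kg(price: float, title: str) -> Optional[float]:
    """Return price per kilogram, or None when no usable weight is found."""
    kg = weight_kg(title)
    return price / kg if kg else None
```

With this, all four rice titles above resolve to the same 1 kg weight, which is what makes their prices comparable.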

Error handling across 41 stores

When you query 41 stores, something is always broken:

Failure mode	Frequency	Our handling
Timeout (>30s)	~5%	Retry 3x with exponential backoff
429 Rate Limit	~3%	Queue and retry after Retry-After
503/502 Error	~2%	Skip store, log, retry next cycle
Schema change	~1%/week	Alert → manual fix → update collector
Auth expired	~1%/month	Rotate tokens, alert ops
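
The first two rows of the table can be sketched as one retry wrapper. This is a minimal version under stated assumptions: `RetryableError` is a hypothetical exception your collector would raise for timeouts and 429s, with `retry_after` already parsed into seconds, and the 1s base and 60s cap are arbitrary defaults.

```python
import asyncio
import random
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")

class RetryableError(Exception):
    """Hypothetical error for timeouts and 429s; retry_after is in seconds."""
    def __init__(self, message: str, retry_after: Optional[float] = None):
        super().__init__(message)
        self.retry_after = retry_after

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff before jitter: base * 2**attempt, capped."""
    return min(cap, base * (2 ** attempt))

async def with_retries(
    call: Callable[[], Awaitable[T]],
    retries: int = 3,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Await call(), retrying up to `retries` times on RetryableError."""
    for attempt in range(retries + 1):
        try:
            return await call()
        except RetryableError as err:
            if attempt == retries:
                raise  # out of retries: the caller skips the store and logs
            # Prefer the server's Retry-After hint; otherwise back off with jitter.
            delay = err.retry_after
            if delay is None:
                delay = backoff_delay(attempt) * random.uniform(0.5, 1.0)
            await sleep(delay)
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the retry logic can be tested without actually waiting.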

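The collector runs every 4 hours. Even if about 5% of store queries fail in a given cycle, roughly 95% of the data is at most one cycle old, and failed stores get another attempt in the next cycle. Over a 7-day window, coverage is 100%.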
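
A back-of-the-envelope check of why weekly coverage ends up complete. It assumes failures are independent across cycles, which is optimistic: a schema change or expired token keeps a store down until someone fixes it.

```python
HOURS_PER_CYCLE = 4
FAILURE_RATE = 0.05  # per-store, per-cycle failure rate quoted above

# 7 days * 24 hours / 4 hours per cycle = 42 collection attempts per week.
cycles_per_week = 7 * 24 // HOURS_PER_CYCLE

# Chance one store misses two cycles in a row (data older than 8 hours).
p_two_misses = FAILURE_RATE ** 2

# Chance one store fails every single cycle in a week, IF failures were independent.
p_stale_all_week = FAILURE_RATE ** cycles_per_week
```

Under that assumption a store has 42 chances per week to be collected, so transient failures alone essentially never leave a store uncovered. The persistent failure modes are the ones that need alerting.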

What we'd do differently

Start with GraphQL introspection — saves hours of doc-reading for Magento
Build the normalizer first — collection is easy, normalization is hard
Assume auth will break — build token rotation from day 1
Log everything — when a retailer changes their schema at 3am, you want logs
One async HTTP client, many stores — httpx.AsyncClient with connection pooling
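
The last point, fanning out to many stores without letting one failure sink the batch, can be sketched with only the standard library. `collect_all` and `fetch_one` are illustrative names. In practice `fetch_one` would wrap a single shared httpx.AsyncClient created once per run, and the concurrency limit of 10 is an arbitrary default.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

async def collect_all(
    fetch_one: Callable[[str], Awaitable[list]],
    domains: List[str],
    max_concurrency: int = 10,
) -> Dict[str, object]:
    """Run fetch_one per domain; map each domain to its result or its exception."""
    gate = asyncio.Semaphore(max_concurrency)

    async def guarded(domain: str):
        # The semaphore caps how many stores are queried at the same time.
        async with gate:
            return await fetch_one(domain)

    # return_exceptions=True keeps one broken store from cancelling the rest.
    results = await asyncio.gather(
        *(guarded(d) for d in domains), return_exceptions=True
    )
    return dict(zip(domains, results))
```

Callers can then separate successes from failures with an `isinstance(result, Exception)` check and log the failed stores for the next cycle.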

Self-serve retailer onboarding (coming soon)

We're building a flow where retailers register their VTEX/Magento store and appear in CLI Market searches within 24 hours. No technical integration — just your domain and API credentials.

If you run a VTEX or Magento store in LATAM: cli-market.dev/retailers

CLI Market — Commerce infrastructure for AI agents. 41 retailers. 8 countries. One API.

DEV Community