DEV Community

Aman Patel

Did you know ChatGPT gives more than 70% of product information wrong?

Did you know ChatGPT, Gemini, and other AI agents still struggle with accurate real-time product information?

Not because the models are bad - but because commerce data itself is extremely difficult to process in real time.

Most AI systems and shopping agents still rely on:
1) Traditional scrapers
2) Static page parsing
3) Generic web extraction tools
The problem? They often miss how e-commerce actually works.
For example, the same product can show different prices, delivery times, availability, sellers, and discounts depending purely on the buyer's location or pincode.
But most scraping systems are not location-aware.
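To make the location problem concrete, here is a minimal Python sketch of what "location-aware" could mean at the data level. Everything in it is hypothetical (the `Offer` fields, the SKU, the pincodes, the sellers, and the prices are made up for illustration); it only shows that offers have to be keyed by product and pincode, not by product alone.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Offer:
    """One seller's offer for a product at one pincode (hypothetical schema)."""
    seller: str
    price: float        # in the listing's local currency
    delivery_days: int
    in_stock: bool


# Hypothetical sample data: the same product at two pincodes, with different offers.
OFFERS = {
    ("phone-128gb", "110001"): [
        Offer("SellerA", 799.0, 1, True),
        Offer("SellerB", 779.0, 3, True),
    ],
    ("phone-128gb", "560001"): [
        Offer("SellerA", 819.0, 2, False),
        Offer("SellerC", 829.0, 4, True),
    ],
}


def best_offer(product_id: str, pincode: str) -> Optional[Offer]:
    """Return the cheapest in-stock offer for this pincode, or None if there is none."""
    candidates = [o for o in OFFERS.get((product_id, pincode), []) if o.in_stock]
    return min(candidates, key=lambda o: o.price, default=None)
```

A lookup that ignores the pincode would have to pick one of these answers arbitrarily, and an unknown pincode should return nothing instead of a guess.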

Another major issue: product variations. A single iPhone listing may contain different colors, storage options, prices, and availability statuses.
Most systems treat this as one static page. Reality is far more dynamic.
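As a rough sketch of the alternative, a listing can be modeled as a set of variants, each with its own price and stock status. The class names, fields, and sample values below are assumptions made for illustration, not any real store's schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Variant:
    """One purchasable combination of options (hypothetical schema)."""
    color: str
    storage_gb: int
    price: float
    in_stock: bool


@dataclass
class Listing:
    """A product page that groups several variants under one title."""
    title: str
    variants: List[Variant] = field(default_factory=list)

    def find(self, color: str, storage_gb: int) -> Optional[Variant]:
        """Return the variant matching both options, or None if it does not exist."""
        for v in self.variants:
            if v.color == color and v.storage_gb == storage_gb:
                return v
        return None

    def price_range(self) -> Optional[Tuple[float, float]]:
        """Return (min, max) price across in-stock variants, or None if all are out."""
        prices = [v.price for v in self.variants if v.in_stock]
        return (min(prices), max(prices)) if prices else None


# Made-up sample listing with three variants.
listing = Listing("Example Phone", [
    Variant("black", 128, 799.0, True),
    Variant("black", 256, 899.0, False),
    Variant("blue", 128, 809.0, True),
])
```

With this shape, "what does the phone cost?" has no single answer until the color and storage are known, which is exactly the detail a one-page-one-price scraper loses.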

And finally - many traditional scrapers still return raw HTML or markdown instead of structured commerce intelligence.
That makes life harder for ChatGPT, Gemini, AI agents, shopping copilots, analytics systems, and pricing engines alike.
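To illustrate the difference in output shape, the sketch below turns a small HTML fragment into a typed record using only Python's standard library. The HTML snippet, class names, and `data-` attributes are invented for this example, and real product pages are far messier; it is a toy illustration of "structured output", not a working extraction approach.

```python
from html.parser import HTMLParser

# Hypothetical markup; real pages will not look this clean.
RAW_HTML = """
<div class="product" data-sku="phone-128gb-black">
  <span class="price" data-currency="USD">799.00</span>
  <span class="stock">In stock</span>
</div>
"""


class ProductParser(HTMLParser):
    """Collects a few known fields into a dict instead of passing HTML along."""

    def __init__(self):
        super().__init__()
        self.record = {}
        self._field = None  # which field the next text node belongs to

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        css_class = attrs.get("class")
        if css_class == "product":
            self.record["sku"] = attrs.get("data-sku")
        elif css_class == "price":
            self.record["currency"] = attrs.get("data-currency")
            self._field = "price"
        elif css_class == "stock":
            self._field = "stock"

    def handle_data(self, data):
        text = data.strip()
        if not text or self._field is None:
            return
        if self._field == "price":
            self.record["price"] = float(text)
        elif self._field == "stock":
            self.record["in_stock"] = text.lower() == "in stock"
        self._field = None


def to_structured(html: str) -> dict:
    """Parse the fragment and return the extracted record."""
    parser = ProductParser()
    parser.feed(html)
    return parser.record
```

A downstream agent can compare `record["price"]` as a number and check `record["in_stock"]` as a boolean, which it cannot do reliably with a blob of HTML or markdown.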
We believe a large percentage of commerce data consumed by AI today is still incomplete or inaccurate.
So we started building infrastructure to solve this - properly.
Turns out, commerce intelligence is much harder than it looks.

Top comments (2)

Harjot Singh

That stat (if it holds up) is a flashing warning light for the whole "AI as product oracle" trend - and the root cause is structural: a base model answers product questions from stale training data and pattern-matching, not from your live catalog, so it confidently invents specs, prices, and availability. It's not that the model is dumb; it's that it's answering from memory when it should be answering from your data. Ungrounded = guessing, however fluent.

The fix is the same one that keeps coming up: ground answers in a live, authoritative source (RAG over your actual product DB) and make the model cite/abstain rather than improvise. A model constrained to "answer only from retrieved product data, say 'I don't know' otherwise" goes from 70% wrong to mostly-right-or-silent. That grounding-and-verify discipline is core to how I build with Moonshift (a multi-agent pipeline that ships a prompt to a deployed SaaS) - a confident-but-ungrounded answer is treated as a failure, not output. Important PSA-style post. Is the 70% from base-model answers specifically, or did it persist even with retrieval/grounding? Because if grounding doesn't fix it, that's a much scarier finding.
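A minimal Python sketch of the "answer only from retrieved product data, say 'I don't know' otherwise" rule might look like the following. It is an illustration only: the catalog contents, the function name, and the answer format are hypothetical, and no real model or retrieval system is involved.

```python
# Hypothetical stand-in for a live product database.
CATALOG = {
    "phone-128gb-black": {"price": 799.0, "currency": "USD", "in_stock": True},
}

ABSTAIN = "I don't know."


def grounded_answer(sku: str, field: str) -> str:
    """Answer from the catalog record and cite it; abstain if the data is missing."""
    record = CATALOG.get(sku)
    if record is None or field not in record:
        return ABSTAIN
    return f"{field} of {sku} is {record[field]} (source: catalog record {sku})"
```

The point of the sketch is the failure path: an unknown SKU or a missing field produces an explicit abstention, never an improvised value.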

Aman Patel

Exactly - that’s the key difference. Most failures happen when models answer from memory instead of live commerce data.

Even retrieval helps only if the underlying commerce data is accurate, structured, location-aware, and variation-aware. That’s where things still break today.

A lot of current “grounded” systems are still grounding on incomplete or badly parsed product data.
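One simple guard, sketched here in Python under assumed field names, is to check a record for the fields that make it location-aware and variation-aware before a model is allowed to ground on it. The required-field list below is a hypothetical example, not a standard.

```python
# Hypothetical minimum set of fields a record needs before it is used for grounding.
REQUIRED_FIELDS = ("sku", "price", "currency", "in_stock", "pincode", "variant")


def missing_fields(record: dict) -> list:
    """Return the required fields that are absent or None in this record."""
    return [name for name in REQUIRED_FIELDS if record.get(name) is None]


def is_groundable(record: dict) -> bool:
    """A record is usable for grounding only if no required field is missing."""
    return not missing_fields(record)
```

A record that fails this check is better dropped or flagged than passed to the model as if it were complete.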

And yes, the 70% issue was primarily around ungrounded, base-model-style answers. Proper retrieval improves things massively, but commerce infrastructure itself still needs a big upgrade.