The Problem Nobody Talks About
Here's a dirty secret in e-commerce: 90% of new products fail. Not because they're bad products, but because founders rely on gut feeling instead of data.
The traditional playbook looks like this:
- Browse AliExpress or attend trade shows
- Pick products that "look good"
- Order 500 units
- Run ads and pray
We've watched brands burn through $10K–$50K on inventory that ends up collecting dust. The issue isn't the product — it's the validation methodology (or lack thereof).
What If You Could Predict Success Before Investing?
That's the question that led us to build Lexi — an AI-powered market intelligence platform that validates products before you commit to inventory.
The core idea: instead of guessing, measure real consumer behavior.
The Architecture Behind It
Our stack is Laravel 12 + Vue 3 (Inertia SSR) + Python microservices for ML. Here's how the system works at a high level:
```
┌─────────────────┐     ┌──────────────┐     ┌─────────────────┐
│ Trend Discovery │────▶│ AI Filtering │────▶│   Validation    │
│ (Social Scraper)│     │   (GPT-4V)   │     │ (SCS Algorithm) │
└─────────────────┘     └──────────────┘     └─────────────────┘
        │                      │                      │
   Instagram            Removes noise           Predicts if
   TikTok               and spam                product will
   Pinterest            programmatically        scale profitably
```
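Stitched together, the three stages are just functions chained in sequence. A stubbed sketch of the flow (hypothetical names and toy data, not our actual services):

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    image_url: str
    caption: str
    source: str  # "instagram" | "tiktok" | "pinterest"

def discover_trends(sources: list[str]) -> list[Candidate]:
    """Stage 1: pull public posts from social platforms (stubbed here)."""
    return [Candidate("https://example.com/dress.jpg", "summer dress haul", s)
            for s in sources]

def filter_with_vision(candidates: list[Candidate]) -> list[Candidate]:
    """Stage 2: keep only images the vision model rates as product shots (stubbed)."""
    return [c for c in candidates if "dress" in c.caption]

def score_scalability(candidate: Candidate) -> float:
    """Stage 3: compute the SCS composite score (stubbed with a constant)."""
    return 78.0

candidates = discover_trends(["instagram", "tiktok", "pinterest"])
shortlist = filter_with_vision(candidates)
scores = {c.source: score_scalability(c) for c in shortlist}
```

Each stage is a separate microservice in production; the point is that every stage narrows the funnel before any money is spent.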
1. Trend Discovery: Finding Signals in Noise
Most "spy tools" show you what's already saturated. We focused on a different approach: detecting products at their inflection point, the moment between early adoption and mainstream, where the real money is.
We scrape public content from Instagram, TikTok, and Pinterest using ethical scraping practices. But raw data is chaos. A search for "summer dress" returns lifestyle photos, memes, influencer selfies — and occasionally, actual products going viral.
2. AI Content Filtering: GPT-4 Vision as a Bouncer
This is where it gets interesting. We feed every scraped image through GPT-4 Vision with a carefully engineered prompt:
"Analyze this image. Is it a commercial product photograph suitable for e-commerce? Rate confidence 0-100. Extract: product category, dominant colors, material estimate, price range estimate, target demographic."
This single step eliminates ~70% of noise. What remains is a curated feed of actual products gaining traction in real time.
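A minimal sketch of that bouncer step, using the payload shape of the OpenAI Chat Completions API with an image attachment. The model name, response JSON keys, and the 70-point threshold are assumptions for illustration, not our production values:

```python
import json

FILTER_PROMPT = (
    "Analyze this image. Is it a commercial product photograph suitable for "
    "e-commerce? Rate confidence 0-100. Extract: product category, dominant "
    "colors, material estimate, price range estimate, target demographic. "
    "Respond as JSON with keys: is_product, confidence, category, colors, "
    "material, price_range, demographic."
)

def build_vision_request(image_url: str) -> dict:
    """Chat Completions payload pairing the prompt with one scraped image."""
    return {
        "model": "gpt-4o",  # assumption: any GPT-4-class vision model
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": FILTER_PROMPT},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
        "response_format": {"type": "json_object"},
    }

def passes_filter(raw_response: str, threshold: int = 70) -> bool:
    """Keep an image only if the model is confident it's a product shot."""
    data = json.loads(raw_response)
    return bool(data.get("is_product")) and data.get("confidence", 0) >= threshold
```

Keeping the parse step separate from the API call makes the threshold easy to tune per category without touching the prompt.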
3. The SCS Algorithm: Our Predictive Engine
The Scalability Confidence Score (SCS) is the core IP. It's a composite score (0-100) that predicts commercial viability by combining four sub-scores:
| Component | What it Measures |
|---|---|
| SVS (Social Validation Score) | Engagement quality: saves, shares, "where can I buy this?" comments vs. generic likes |
| CHS (Creative Hook Score) | How well the product stops the scroll — visual distinctiveness in a feed |
| ISS (Intent Signal Score) | NLP analysis of comments to detect purchase intent vs. casual browsing |
| PES (Price Efficiency Score) | Estimated margin viability based on perceived value vs. sourcing cost |
The formula weights these dynamically based on the product category. Fashion products lean heavily on CHS and SVS. Tech gadgets weight ISS higher.
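A toy version of that dynamic weighting. The numbers below are illustrative, not the production weights; the shape of the computation is what matters:

```python
# Category-specific weights (illustrative; each row sums to 1.0).
WEIGHTS = {
    "fashion": {"svs": 0.35, "chs": 0.35, "iss": 0.15, "pes": 0.15},
    "tech":    {"svs": 0.20, "chs": 0.15, "iss": 0.40, "pes": 0.25},
}

def scs(category: str, svs: float, chs: float, iss: float, pes: float) -> float:
    """Weighted sum of the four sub-scores, each on a 0-100 scale."""
    w = WEIGHTS[category]
    return round(w["svs"] * svs + w["chs"] * chs + w["iss"] * iss + w["pes"] * pes, 1)
```

Because each sub-score stays visible, the composite can always be decomposed back into its parts when a brand asks why a product scored the way it did.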
Current accuracy: 85% at predicting which products achieve positive ROAS within 14 days.
4. AI Image Generation: Zero-Cost Catalogs
Once a product passes validation, brands need catalog images. Traditional product photography costs $500–$2,000 per SKU.
We use Gemini 2.5 Flash to generate photorealistic catalog images:
- Studio-quality product shots on clean backgrounds
- Lifestyle context images (product in use)
- Virtual fashion models with diverse body types and ethnicities
All generated assets come with full commercial licensing.
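A sketch of that generation step, assuming the `google-genai` Python SDK and the `gemini-2.5-flash-image` model name (both assumptions; swap in whatever client and model you actually use). The prompt builder is the deterministic part; the API call itself needs an API key:

```python
def catalog_prompt(product: str, style: str) -> str:
    """Compose a generation prompt for one catalog shot style."""
    styles = {
        "studio": "studio-quality product shot on a clean white background, soft lighting",
        "lifestyle": "lifestyle photo of the product in everyday use, natural light",
        "model": "worn by a fashion model, full-body shot, diverse casting",
    }
    return f"Photorealistic e-commerce image: {product}, {styles[style]}."

def generate_catalog(product: str):
    """Request one image per style from Gemini (sketch; needs credentials)."""
    from google import genai  # assumption: google-genai SDK is installed
    client = genai.Client()
    return [
        client.models.generate_content(
            model="gemini-2.5-flash-image",
            contents=catalog_prompt(product, style),
        )
        for style in ("studio", "lifestyle", "model")
    ]
```

Generating one image per style from the same product description keeps the three asset types (studio, lifestyle, on-model) visually consistent for a SKU.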
The Technical Decisions That Shaped Us
Why Inertia SSR Instead of a Separate API + SPA?
SEO and AI crawlability. Our /learn pages (25+ feature pages) need to be indexable by Google, ChatGPT, Claude, and Perplexity. With Inertia SSR:
- Server renders the initial HTML with full content
- Vue hydrates for interactivity
- AI crawlers get complete, semantic HTML on first request
- We maintain a single codebase (no API duplication)
We also implemented llms.txt and llms-full.txt following the proposed standard to help LLMs understand our platform structure.
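For context, the proposed llms.txt format is plain Markdown: an H1 title, a one-line blockquote summary, then sections of annotated links. A minimal illustrative version (paths and wording here are placeholders, not our actual file):

```markdown
# Lexi

> AI-powered market intelligence that validates e-commerce products
> before brands commit to inventory.

## Features

- [Trend Discovery](/learn/trend-discovery): how we surface products in early growth
- [SCS Algorithm](/learn/scs): the Scalability Confidence Score explained

## Docs

- [Methodology](/learn/methodology): our open-sourced validation methodology
```

The `llms-full.txt` variant inlines the full page content instead of linking to it, so an LLM can ingest everything in one request.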
Why a Composite Score Instead of a Single ML Model?
Interpretability. When we tell a brand "this product scored 78/100", they inevitably ask why. A black-box model can't answer that.
With SCS, we can say: "Social validation is strong (SVS: 89), but the creative hook is below average for this category (CHS: 62). Consider testing with a more visually distinctive angle."
This makes the score actionable, not just informative.
Sentiment Analysis: Beyond Positive/Negative
Standard sentiment analysis tells you if a comment is "positive" or "negative". Useless for e-commerce.
We built a custom classification layer on top of LLMs that detects purchase intent:
| Comment | Standard Sentiment | Our Classification |
|---|---|---|
| "So cute! 😍" | Positive | Low Intent (generic appreciation) |
| "Does this come in blue?" | Neutral | Medium Intent (specific interest) |
| "TAKE MY MONEY where do I buy" | Positive | High Intent (ready to purchase) |
| "Bought it, arriving Tuesday" | Positive | Confirmed Purchase |
This distinction is what separates a vanity metric from a revenue signal.
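The classification layer boils down to a constrained-label prompt plus a parser that maps each label to an Intent Signal Score contribution. A sketch with illustrative weights (the prompt wording and the per-label weights are assumptions, not our production values):

```python
INTENT_PROMPT = """Classify the purchase intent of this e-commerce comment.
Labels: low_intent, medium_intent, high_intent, confirmed_purchase.
Comment: "{comment}"
Answer with the label only."""

LABELS = {"low_intent", "medium_intent", "high_intent", "confirmed_purchase"}

def parse_intent(model_output: str) -> str:
    """Normalize the LLM's free-text answer to one of the four labels."""
    label = model_output.strip().lower().replace(" ", "_")
    if label not in LABELS:
        raise ValueError(f"unexpected label: {model_output!r}")
    return label

def intent_weight(label: str) -> float:
    """How much one comment contributes to the ISS (illustrative weights)."""
    return {"low_intent": 0.1, "medium_intent": 0.4,
            "high_intent": 0.8, "confirmed_purchase": 1.0}[label]
```

Constraining the model to a fixed label set and rejecting anything else keeps a single hallucinated label from silently skewing the aggregate score.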
What's Next
We're currently building:
- Drill-down validation: When a product category wins, automatically test sub-variables (color, material, price point)
- Shopify one-click sync: Push validated products directly to a Shopify store with AI-generated copy and images
- Multi-company support: Isolated workspaces for agencies managing multiple brands
If you're interested in the intersection of AI, e-commerce, and data-driven product development, check out our feature documentation — we've open-sourced our methodology there.
Questions? Drop them below. I'll dive deeper into any of these systems in follow-up posts.
Tags: #ai #ecommerce #machinelearning #webdev