Last year, a friend of mine was pitching his startup to investors. He needed market research — TAM/SAM/SOM numbers, competitor analysis, the whole nine yards. A consulting firm quoted him $18,000 and a 6-week timeline.
He couldn't afford it. So he did what most founders do: he asked ChatGPT.
He pasted ChatGPT's confident numbers and citations straight into the deck. The investor checked them. Half were fabricated. The "sources" linked to pages that didn't exist. The pitch died in that room.
That conversation haunted me. Not because my friend lost the deal — but because I realized millions of founders face this exact problem every day. They need real market data, but they're stuck between a $20K agency and a hallucinating chatbot.
So I built something in between.
The Problem Nobody Talks About
Here's a dirty secret about AI-generated research: ChatGPT hallucinates roughly 37% of its citations. I tested this myself. I asked it for market size data on the European SaaS market. It gave me confident numbers, complete with source URLs. Three out of five URLs returned 404 errors. One linked to an article that said something completely different.
Perplexity is better — but it gives you raw text. No structure. No frameworks. No print-ready report you can hand to an investor or a client. You still spend hours formatting, cross-referencing, and building the narrative.
I wanted something that does the entire job: pulls data from real sources, verifies it, applies proper analytical frameworks, and outputs a professional report — all in under 5 minutes.
Building the Research Engine
I started with a simple question: what does a $15,000 market research report actually contain?
I tracked down three reports from different agencies (older editions available through university access). I dissected them. Every single one followed the same pattern:
- Market sizing (TAM/SAM/SOM)
- Competitive landscape (10-15 players with pricing)
- Target audience segmentation (3-5 personas)
- Demand validation (search trends, growth signals)
- Unit economics and go-to-market strategy
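That five-section pattern maps naturally onto a small data model. Here's a minimal sketch of the report skeleton (all class and field names are illustrative, not the tool's actual code):

```python
from dataclasses import dataclass, field

@dataclass
class Section:
    """One of the five standard sections every agency report contained."""
    title: str
    findings: list[str] = field(default_factory=list)

@dataclass
class Report:
    query: str
    sections: list[Section] = field(default_factory=list)

# The recurring five-section skeleton from the dissected agency reports
REPORT_SKELETON = [
    "Market sizing (TAM/SAM/SOM)",
    "Competitive landscape",
    "Target audience segmentation",
    "Demand validation",
    "Unit economics and go-to-market",
]

def new_report(query: str) -> Report:
    # Every report starts from the same empty skeleton; agents fill it in.
    return Report(query, [Section(t) for t in REPORT_SKELETON])
```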
So I built five parallel research agents — one for each section. They work simultaneously, each pulling from different source types:
- Agent 1 (Market): Statista, IBISWorld, World Bank, OECD databases
- Agent 2 (Competitors): Crunchbase, G2, company websites, pricing pages
- Agent 3 (Audience): Google Trends, Reddit, industry surveys, demographic data
- Agent 4 (Demand): Search volume data, job postings, funding rounds, news mentions
- Agent 5 (Unit Economics): SaaS benchmarks, industry margins, CAC/LTV ratios
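The "simultaneously" part is the whole point: total latency is the slowest agent, not the sum of all five. A minimal sketch of that fan-out with `asyncio` (the agent bodies are placeholders; the real ones hit the source types listed above):

```python
import asyncio

async def run_agent(name: str, query: str) -> dict:
    # Placeholder for real I/O: API calls, scraping, database lookups.
    await asyncio.sleep(0)
    return {"agent": name, "query": query, "findings": []}

AGENTS = ["market", "competitors", "audience", "demand", "unit_economics"]

async def run_all(query: str) -> list[dict]:
    # gather() runs all five agents concurrently and preserves order.
    return await asyncio.gather(*(run_agent(a, query) for a in AGENTS))

results = asyncio.run(run_all("EV charging in Southeast Asia"))
```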
Each agent has a 4-level source verification system:
Level A: Government/institutional sources (World Bank, Census, OECD)
Level B: Industry databases (Statista, IBISWorld, Gartner)
Level C: Media & analyst reports (TechCrunch, Bloomberg)
Level D: Expert opinions & estimates (clearly labeled as such)
Every single fact in the report gets tagged with its source level. If the system can't find verified data, it says "data not available in open sources" instead of making something up. Zero hallucinations by design.
The "Holy Crap" Moment
The first time I ran it on a real query — "Electric vehicle charging infrastructure market in Southeast Asia" — I expected garbage. What came back was a 42-page HTML report with:
- TAM of $4.2B (2025) growing to $12.8B (2030), sourced from BloombergNEF
- 14 competitors mapped with pricing tiers and market share
- Three audience segments with pain points and willingness-to-pay data
- SWOT analysis, Porter's Five Forces, and JTBD framework applied
- Every number linked to its source with reliability grade
It took 4 minutes and 22 seconds.
I sent it to my friend — the one who lost the investor pitch. His response: "This is better than what I paid $3,000 for last time."
That's when I knew this wasn't just a side project.
What Actually Surprised Me
1. The formatting matters more than you think.
Early versions output plain Markdown. Nobody cared. The moment I switched to professional HTML — A4 format, print-ready, with a table of contents, executive summary, and branded layout — people started taking it seriously. Same data, completely different perception.

2. Multi-language was a game-changer.
I added support for 5 languages (English, Russian, Chinese, Spanish, French), and suddenly the tool was useful for cross-border market analysis. A Spanish startup researching the Chinese EV market? Done. A Russian VC evaluating Latin American fintech? Done.
3. People don't want raw data — they want a story.
The biggest improvement came when I added narrative sections. Instead of just "TAM: $4.2B" — the report now explains why the market is growing, what's driving the shift, and where the opportunity gaps are. Frameworks like JTBD and Porter's Five Forces turned raw numbers into strategic insight.
4. Speed changes behavior.
When a report takes 6 weeks, you commission one and pray it's the right market. When it takes 5 minutes, you test five markets before breakfast. I've seen founders run 10+ reports in a day, rapidly narrowing from "I want to do something in healthtech" to "I'm going after remote patient monitoring for rural clinics in India."
The Architecture (For Fellow Builders)
I built this as an MCP (Model Context Protocol) server — meaning any AI agent running on Claude, Cursor, or any MCP-compatible client can call it directly. One line in your MCP config and your AI assistant can generate market research on demand.
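For context, an MCP client entry typically looks something like this (the server name, command, and package are placeholders — the actual values come from the Apify Store listing, not from this snippet):

```json
{
  "mcpServers": {
    "market-research": {
      "command": "npx",
      "args": ["-y", "<your-mcp-server-package>"]
    }
  }
}
```

Once the client restarts, the research tools show up alongside the assistant's other capabilities and can be invoked in plain conversation.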
The tech stack is deliberately simple:
- MCP server handling tool calls
- 5 specialized research agents running in parallel
- Web scraping layer for real-time data (Apify infrastructure)
- Source verification pipeline with deduplication
- HTML report generator with professional templates
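The deduplication step matters more than it sounds: agents pulling from overlapping sources will fetch the same page under slightly different URLs. A minimal sketch of URL-level dedup (an assumption about the approach, not the tool's actual pipeline):

```python
from urllib.parse import urlsplit

def normalize(url: str) -> str:
    # Strip scheme differences, the "www." prefix, and trailing slashes so
    # the same page fetched twice counts as one source.
    parts = urlsplit(url)
    return parts.netloc.lower().removeprefix("www.") + parts.path.rstrip("/")

def dedupe(sources: list[str]) -> list[str]:
    seen: set[str] = set()
    unique: list[str] = []
    for url in sources:
        key = normalize(url)
        if key not in seen:
            seen.add(key)
            unique.append(url)  # keep the first occurrence verbatim
    return unique
```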
The whole thing is available on Apify Store (search for "Market Research MCP Server") and the core methodology is open-sourced on GitHub.
What I'd Do Differently
If I started over:
- Start with the output format, not the data pipeline. I spent weeks on data collection before realizing the HTML template was what made people say "wow."
- Charge from day one. I gave away 50+ free reports before charging. Should have validated willingness-to-pay earlier.
- Focus on one vertical first. "All market research" is a positioning nightmare. "Market research for SaaS startups" would have been a sharper wedge.
The Numbers
In the first month:
- 147 reports generated across 23 industries
- Average generation time: 4 minutes 38 seconds
- Most requested: SaaS, fintech, healthtech, e-commerce, AI/ML
- Languages: 60% English, 20% Chinese, 10% Spanish, 5% Russian, 5% French
The tool pulls from 20+ source types and applies analytical frameworks that would take a human analyst days to compile. It's not perfect — no AI tool is. But it gives you 80% of the insight at 0.1% of the cost.
Try It Yourself
I'm offering a deal for the Dev.to community:
Want a custom market research report for your industry? I'll generate a full 30-50 page report with verified data, competitor analysis, TAM/SAM/SOM, and strategic frameworks — tailored to your specific market.
$20 for a complete report bundle (2 full reports for any industries you choose).
Just drop a comment below with your industry/market, and I'll show you a preview of what the report covers before you commit.
You can also:
- Try the tool yourself on Apify Store (search "Market Research MCP Server")
- Check out the methodology on GitHub
- Or just ask me anything in the comments — happy to share more about the build process
This tool was born from a real problem: the gap between expensive agency research and unreliable AI chat. If you've ever needed market data and felt stuck between those two extremes — I built this for you.