Building with Synthetic Survey Data: How We Made 16,500 AI Personas Answer Market Research Questions
Traditional market research takes weeks and costs thousands. We built an API that gives you answers in seconds — grounded in real data, not hallucinations.
The Problem
You have a product idea. You want to know:
- Would German professionals aged 30-45 pay €99/month for this?
- Does this messaging resonate with sustainability-conscious parents?
- How do urban vs. rural Europeans feel about remote work?
Your options: commission a panel study (€3,000+, 4 weeks) or... guess.
Our Approach: Survey-Grounded AI Personas
Every persona in AskAudience maps to a real, individual-level survey record from the European Social Survey (ESS) and World Values Survey (WVS). We don't average across respondents — each persona carries 80+ measured attributes:
- Demographics (age, gender, education, income, location)
- Political orientation and trust levels
- Media consumption and technology adoption
- Environmental attitudes and religiosity
- Work values and life satisfaction
When you ask a persona a question, the LLM is constrained by that person's actual measured attributes. The result includes a Grounding Score (0–1) that tells you how much of the response comes from real data vs. model inference.
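To make the grounding idea concrete, here is a minimal sketch of how a persona's survey record could be injected into a prompt as hard constraints. This is purely illustrative — the field names and prompt wording are our assumptions, not AskAudience's actual implementation.

```python
# Hypothetical sketch: constraining an LLM prompt with a persona's
# survey record. Field names are illustrative, not the real schema.

def build_persona_prompt(record: dict, question: str) -> str:
    """Render measured attributes as constraints the model must honor."""
    attributes = "\n".join(f"- {key}: {value}" for key, value in sorted(record.items()))
    return (
        "You are answering as one specific survey respondent.\n"
        "Your measured attributes (do not contradict these):\n"
        f"{attributes}\n\n"
        f"Question: {question}\n"
        "Answer in first person, consistent with every attribute above."
    )

record = {
    "age": 34,
    "country": "DE",
    "education": "tertiary",
    "trust_in_institutions": "low",
    "environmental_concern": "high",
}
prompt = build_persona_prompt(
    record, "Would you pay €99/month for an AI writing assistant?"
)
print(prompt)
```

The key design point is that attributes are stated as constraints the model must not contradict, rather than averaged into a vague audience description.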
How It Works: A Quick API Example
# 1. Create a target audience
curl -X POST https://askaudience.de/api/v1/audiences \
  -H "Authorization: Bearer aa_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "German Tech Professionals 25-40",
    "filters": {
      "countryCode": "DE",
      "ageRange": {"min": 25, "max": 40},
      "jobSearch": "tech"
    },
    "sampleSize": 30
  }'
# 2. Ask your audience a question
curl -X POST https://askaudience.de/api/v1/audiences/{id}/ask \
  -H "Authorization: Bearer aa_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "Would you pay €99/month for an AI writing assistant?",
    "responseFormat": "likert_5",
    "sampleSize": 20
  }'
Response includes individual answers from each persona, an aggregated distribution, and average confidence + grounding scores.
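As a rough sketch of what you can do with the individual answers, here is how a `likert_5` distribution and mean could be aggregated client-side. The answer values and helper are our assumptions about the response shape, not the documented API schema.

```python
from collections import Counter

# Hypothetical sketch: aggregating individual likert_5 answers
# (1 = strongly disagree ... 5 = strongly agree) into the kind of
# distribution the API returns. Field names are assumptions.

def aggregate_likert(answers: list[int]) -> dict:
    """Return the share of each scale point plus the mean answer."""
    counts = Counter(answers)
    n = len(answers)
    return {
        "distribution": {point: counts.get(point, 0) / n for point in range(1, 6)},
        "mean": sum(answers) / n,
    }

answers = [2, 3, 4, 4, 5, 1, 3, 4, 2, 4]
summary = aggregate_likert(answers)
print(summary["mean"])  # 3.2
```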
The Grounding Score
This is what makes AskAudience different from "just asking ChatGPT to pretend to be a persona."
The Grounding Score (0–1) quantifies how much of each response is attributable to real survey data:
- 0.85+: Response strongly determined by measured attributes
- 0.5–0.85: Mix of real data and model inference
- < 0.5: Treat with skepticism — model is extrapolating
We're transparent because synthetic research has limits. Treat it as pre-validation: filter 100 ideas down to the 5 worth testing with real people.
MCP Integration for Claude Code
We ship an MCP server so you can use AskAudience directly in Claude Code:
npx @askaudience/mcp-server
Then in Claude Code:
> Create an audience of sustainability-conscious parents in Germany
and ask them about organic food pricing willingness
Claude handles the API calls, formats results, and runs follow-up comparisons.
What We Measured: 94% Directional Accuracy
In our benchmark against real panel responses (n=165, matched demographics), we measured 94% directional accuracy — the synthetic audience's majority opinion matched the real panel in 94% of questions.
This doesn't mean individual answers are 94% accurate. It means: if you want to know "which direction does my audience lean?", synthetic research gets it right almost every time.
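To make the metric precise, here is a sketch of how directional accuracy is computed as described: for each question, compare the majority lean (agree vs. disagree) of the synthetic audience against the real panel. The data below is made up for illustration.

```python
# Sketch of the directional-accuracy metric: per question, does the
# synthetic majority lean match the real panel's majority lean?

def majority_direction(answers: list[int], neutral: int = 3) -> str:
    """Majority lean of likert_5 answers: 'agree', 'disagree', or 'split'."""
    agree = sum(1 for a in answers if a > neutral)
    disagree = sum(1 for a in answers if a < neutral)
    if agree > disagree:
        return "agree"
    if disagree > agree:
        return "disagree"
    return "split"

def directional_accuracy(synthetic: list[list[int]], real: list[list[int]]) -> float:
    """Share of questions where both audiences lean the same way."""
    matches = sum(
        majority_direction(s) == majority_direction(r)
        for s, r in zip(synthetic, real)
    )
    return matches / len(real)

synthetic = [[4, 5, 4], [2, 1, 2], [4, 4, 2]]
real = [[5, 4, 3], [1, 2, 2], [2, 1, 2]]
print(directional_accuracy(synthetic, real))  # two of three questions match
```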
Try It
- API Docs: askaudience.de/docs
- MCP Server: npx @askaudience/mcp-server
- Claude Code Plugin: @askaudience/claude-plugin
- 14-day free trial: askaudience.de/pricing
Self-serve from €79/month. No sales calls, no minimum commitment.
I'd love feedback — especially on the Grounding Score approach. Is transparency about AI limitations a feature or a bug in your view?