DEV Community

AskAudience

Building with Synthetic Survey Data: How We Made 16,500 AI Personas Answer Market Research Questions

Traditional market research takes weeks and costs thousands. We built an API that gives you answers in seconds — grounded in real data, not hallucinations.

The Problem

You have a product idea. You want to know:

  • Would German professionals aged 30-45 pay €99/month for this?
  • Does this messaging resonate with sustainability-conscious parents?
  • How do urban vs. rural Europeans feel about remote work?

Your options: commission a panel study (€3,000+, 4 weeks) or... guess.

Our Approach: Survey-Grounded AI Personas

Every persona in AskAudience maps to a real, individual-level survey record from the European Social Survey (ESS) and World Values Survey (WVS). We don't average across respondents — each persona carries 80+ measured attributes:

  • Demographics (age, gender, education, income, location)
  • Political orientation and trust levels
  • Media consumption and technology adoption
  • Environmental attitudes and religiosity
  • Work values and life satisfaction

When you ask a persona a question, the LLM is constrained by that person's actual measured attributes. The result includes a Grounding Score (0–1) that tells you how much of the response comes from real data vs. model inference.
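To make that concrete, here's a rough sketch of what a survey-grounded persona record and a simple audience filter might look like. The field names and scales are illustrative assumptions for this post, not the actual AskAudience schema:

```python
# Hypothetical shape of a survey-grounded persona record.
# Field names and scales are illustrative, not the real AskAudience schema.
persona = {
    "source": "ESS",                # originating survey (ESS or WVS)
    "country_code": "DE",
    "age": 34,
    "gender": "female",
    "education": "tertiary",
    "income_decile": 7,
    "political_orientation": 4.0,   # 0 = left, 10 = right
    "institutional_trust": 6.5,     # 0-10 scale
    "environmental_concern": 8.0,   # 0-10 scale
    "life_satisfaction": 7.0,       # 0-10 scale
}

def matches_filters(p, filters):
    """Check whether a persona record satisfies simple audience filters."""
    age_range = filters.get("ageRange", {})
    if "min" in age_range and p["age"] < age_range["min"]:
        return False
    if "max" in age_range and p["age"] > age_range["max"]:
        return False
    if "countryCode" in filters and p["country_code"] != filters["countryCode"]:
        return False
    return True
```

The real system carries 80+ such attributes per record; the point is that filtering selects real survey respondents, not averaged composites.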

How It Works: A Quick API Example

# 1. Create a target audience
curl -X POST https://askaudience.de/api/v1/audiences \
  -H "Authorization: Bearer aa_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "German Tech Professionals 25-40",
    "filters": {
      "countryCode": "DE",
      "ageRange": {"min": 25, "max": 40},
      "jobSearch": "tech"
    },
    "sampleSize": 30
  }'

# 2. Ask your audience a question
curl -X POST https://askaudience.de/api/v1/audiences/{id}/ask \
  -H "Authorization: Bearer aa_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "Would you pay €99/month for an AI writing assistant?",
    "responseFormat": "likert_5",
    "sampleSize": 20
  }'

The response includes each persona's individual answer, an aggregated distribution, and average confidence and grounding scores.
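As a sketch of what you'd do with that payload, here's a minimal client-side aggregation over a hypothetical response. The JSON field names (`likert`, `confidence`, `grounding`) are assumptions for illustration, not the documented schema:

```python
# Hypothetical /ask response payload; field names are assumptions.
response = {
    "answers": [
        {"personaId": "p1", "likert": 2, "confidence": 0.8, "grounding": 0.9},
        {"personaId": "p2", "likert": 4, "confidence": 0.7, "grounding": 0.6},
        {"personaId": "p3", "likert": 2, "confidence": 0.9, "grounding": 0.85},
    ],
}

def aggregate(answers):
    """Build a likert distribution plus average confidence and grounding."""
    dist = {}
    for a in answers:
        dist[a["likert"]] = dist.get(a["likert"], 0) + 1
    n = len(answers)
    return {
        "distribution": dist,
        "avgConfidence": sum(a["confidence"] for a in answers) / n,
        "avgGrounding": sum(a["grounding"] for a in answers) / n,
    }
```

The API returns the aggregation for you; this just shows how little glue code is needed if you want to re-slice the raw answers yourself.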

The Grounding Score

This is what makes AskAudience different from "just asking ChatGPT to pretend to be a persona."

The Grounding Score (0–1) quantifies how much of each response is attributable to real survey data:

  • 0.85+: Response strongly determined by measured attributes
  • 0.5–0.85: Mix of real data and model inference
  • < 0.5: Treat with skepticism — model is extrapolating

We're transparent because synthetic research has limits. It's pre-validation — filter 100 ideas to the 5 worth testing with real people.
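The interpretation bands above translate directly into a triage rule. A minimal sketch (the labels are mine, not API output):

```python
def grounding_label(score):
    """Map a Grounding Score (0-1) to the interpretation bands above."""
    if score >= 0.85:
        return "strongly data-grounded"
    if score >= 0.5:
        return "mixed data and inference"
    return "model extrapolation, treat with skepticism"
```

In practice you'd filter or down-weight answers in the lowest band before acting on an aggregate.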

MCP Integration for Claude Code

We ship an MCP server so you can use AskAudience directly in Claude Code:

npx @askaudience/mcp-server

Then in Claude Code:

> Create an audience of sustainability-conscious parents in Germany
  and ask them about organic food pricing willingness

Claude handles the API calls, formats results, and runs follow-up comparisons.

What We Measured: 94% Directional Accuracy

In our benchmark against real panel responses (n=165, matched demographics), we measured 94% directional accuracy — the synthetic audience's majority opinion matched the real panel in 94% of questions.

This doesn't mean individual answers are 94% accurate. It means: if you want to know "which direction does my audience lean?", synthetic research gets it right almost every time.
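To show what "directional accuracy" means as a metric, here's a sketch under one reasonable definition of direction on a 5-point likert scale (1-2 negative, 3 neutral, 4-5 positive). This is my illustration of the metric, not the actual benchmark code:

```python
def majority_direction(likert_answers):
    """Majority lean on a 5-point likert: 1-2 negative, 4-5 positive."""
    pos = sum(1 for a in likert_answers if a >= 4)
    neg = sum(1 for a in likert_answers if a <= 2)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"

def directional_accuracy(synthetic_by_q, real_by_q):
    """Share of questions where synthetic and real majorities agree in direction."""
    matches = sum(
        majority_direction(synthetic_by_q[q]) == majority_direction(real_by_q[q])
        for q in real_by_q
    )
    return matches / len(real_by_q)
```

A 94% score under a definition like this says the synthetic panel leans the same way as the real one on almost every question, even when the exact distributions differ.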

Try It

Self-serve from €79/month. No sales calls, no minimum commitment.


I'd love feedback — especially on the Grounding Score approach. Is transparency about AI limitations a feature or a bug in your view?
