Using Google Gemini to generate follow-up questions based on what each person says — and what's missing from the dataset.
I wrote about the concept behind Dyadem when I built it for the Gemini 3 Hackathon. Since then I've iterated on it — added an AI question authoring step, deployed it for a real community questionnaire, and learned a lot about what works. This article goes into the engineering.
Most surveys ask everyone the same questions. You get shallow data and bored respondents.
I built Dyadem — an open-source survey platform where AI generates follow-up questions for each respondent based on two things: what they just said, and what gaps exist across all responses so far. Early respondents get broad exploratory questions. By response 50, the AI is asking about specific things nobody's mentioned yet.
Here's how it works and what I learned.
The Flow
```
User answers 3 fixed questions
  → App gathers dataset context (aggregate stats, themes, gaps)
  → Builds a prompt: this person's answers + what everyone else said
  → Gemini generates 1-2 follow-up questions
  → User answers them
  → Everything submitted together
```
Stack: Next.js 14.2, TypeScript, PostgreSQL (Neon), Drizzle ORM, Google Gemini.
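The "gathers dataset context" step in the flow above is ordinary counting and averaging. A self-contained sketch of the idea — the field names are illustrative, and the real app computes this against Postgres via Drizzle rather than in memory:

```typescript
interface SurveyResponse {
  pressure: string;
  change: number; // 1 = much better, 5 = much worse
  sacrifice: string;
}

interface DatasetContext {
  totalResponses: number;
  topPressure: string;
  topPressurePct: number;
  avgChange: number;
}

// Summarise all responses so far into the stats the prompt needs
function summarize(responses: SurveyResponse[]): DatasetContext {
  const counts = new Map<string, number>();
  for (const r of responses) {
    counts.set(r.pressure, (counts.get(r.pressure) ?? 0) + 1);
  }
  let topPressure = "";
  let topCount = 0;
  for (const [pressure, count] of counts) {
    if (count > topCount) {
      topPressure = pressure;
      topCount = count;
    }
  }
  const avg = responses.reduce((sum, r) => sum + r.change, 0) / responses.length;
  return {
    totalResponses: responses.length,
    topPressure,
    topPressurePct: Math.round((topCount / responses.length) * 100),
    avgChange: Math.round(avg * 10) / 10,
  };
}
```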
The Prompt That Makes It Work
This is the bit I found most interesting to design. The prompt gives Gemini two types of context: what the individual said, and a statistical picture of the whole dataset.
```
You are designing follow-up questions for an anonymous survey.

This person just answered:
- Biggest financial pressure: {{biggest_pressure}}
- How things have changed (1=much better, 5=much worse): {{change_direction}}
- What they've sacrificed: "{{sacrifice}}"

Current dataset ({{total_responses}} responses so far):
- Top pressure: {{top_pressure}} ({{top_pressure_pct}}%)
- Average change score: {{avg_change}}/5
- Most common sacrifice themes: {{sacrifice_themes}}
{{emerging_gap_line}}

Generate 1-2 follow-up questions that:
1. DIG DEEPER into this person's specific situation
2. FILL GAPS — ask about something the community hasn't covered yet
3. Are QUICK to answer (choice, scale, or short text)
4. Feel CONVERSATIONAL, not clinical
5. NEVER ask for identifying information

{{volume_guidance}}
```
The {{volume_guidance}} bit shifts the AI's strategy based on how much data exists:
```typescript
let volumeGuidance: string;

if (context.totalResponses < 10) {
  volumeGuidance =
    "We have very few responses. Ask broader questions " +
    "to establish baseline understanding.";
} else if (context.totalResponses < 50) {
  volumeGuidance =
    "We're building a picture. Start probing for " +
    "nuance within the dominant themes.";
} else {
  volumeGuidance =
    "We have substantial data. Ask targeted questions " +
    "to uncover the most surprising or underreported patterns.";
}
```
This means the survey changes character as it collects data. The first few people get "tell us about you" questions. Later respondents get "nobody's mentioned X yet — what's your experience?" It's a small amount of logic that makes a big difference to data quality.
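Wiring the computed values into the template is plain placeholder substitution. A generic sketch — the helper name is mine, not from the repo:

```typescript
// Replace each {{key}} placeholder with its value.
// Unknown keys are left intact so missing data is visible in logs.
function fillTemplate(
  template: string,
  vars: Record<string, string | number>
): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match, key) =>
    key in vars ? String(vars[key]) : match
  );
}

const line = fillTemplate(
  "Current dataset ({{total_responses}} responses so far):",
  { total_responses: 42 }
);
// line === "Current dataset (42 responses so far):"
```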
Getting Structured Output From an LLM
The other problem worth solving: Gemini returns text, but I need a question object I can render in React. Zod + Gemini's structured output mode handles this.
Define what a valid question looks like:
```typescript
export const adaptiveQuestionSchema = z.object({
  questions: z
    .array(
      z.object({
        question_text: z.string(),
        input_type: z.enum(["single_choice", "scale", "short_text"]),
        options: z.array(z.string()).optional(),
        scale_min_label: z.string().optional(),
        scale_max_label: z.string().optional(),
        reasoning: z.string(),
      })
    )
    .min(1)
    .max(2),
});
```
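For reference, here's the kind of object that schema accepts — the values are invented for illustration:

```typescript
// The TypeScript shape implied by adaptiveQuestionSchema
type AdaptiveQuestions = {
  questions: Array<{
    question_text: string;
    input_type: "single_choice" | "scale" | "short_text";
    options?: string[];
    scale_min_label?: string;
    scale_max_label?: string;
    reasoning: string;
  }>;
};

const example: AdaptiveQuestions = {
  questions: [
    {
      question_text:
        "You mentioned cutting back on heating. How many months of the year does that apply?",
      input_type: "single_choice",
      options: ["1-3", "4-6", "7+"],
      reasoning: "Digs into the specific sacrifice this respondent named.",
    },
  ],
};
```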
Pass that schema directly to Gemini:
```typescript
const response = await getAI().models.generateContent({
  model: MODELS.flash,
  contents: prompt,
  config: {
    responseMimeType: "application/json",
    responseJsonSchema: zodToJsonSchema(adaptiveQuestionSchema),
  },
});
```
No parsing. No regex. No "please return JSON" in the prompt and hoping for the best. The model is constrained at the API level to return exactly this shape.
The reasoning field is worth calling out — it forces the model to explain why it chose each question. This works like chain-of-thought prompting but baked into the output schema. It improves question quality and gives you something to debug with.
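Since `reasoning` is for the model and the server logs, not the respondent, it's worth stripping before the question reaches the client. A small sketch — the function name is mine:

```typescript
interface GeneratedQuestion {
  question_text: string;
  input_type: "single_choice" | "scale" | "short_text";
  options?: string[];
  reasoning: string;
}

// Log the model's rationale server-side, then drop it from the client payload
function toClientQuestion(
  q: GeneratedQuestion
): Omit<GeneratedQuestion, "reasoning"> {
  console.log(`[adaptive] ${q.input_type}: ${q.reasoning}`);
  const { reasoning, ...rest } = q;
  return rest;
}
```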
Cost: Cheaper Than You'd Think
I used three different Gemini models matched to the task:
| Task | Model | Cost per call |
|---|---|---|
| Content moderation | Flash Lite | ~$0.0005 |
| Adaptive questions | Flash | ~$0.001–0.003 |
| Theme extraction & reports | Pro | ~$0.01–0.03 |
Total per survey submission: about $0.002–0.004. Cheap enough for a community project with no budget.
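Sanity-checking that figure: one moderation call plus one adaptive-question call per submission, with the Pro-tier report amortised across all respondents. A rough sketch — rates are the mid-points of the table above, and the amortisation split is my assumption:

```typescript
// Per-call costs in USD, mid-range where the table gave a range
const MODERATION = 0.0005; // Flash Lite
const ADAPTIVE = 0.002;    // Flash, midpoint of $0.001-0.003
const REPORT = 0.02;       // Pro, midpoint of $0.01-0.03

// Cost attributed to a single submission, with the one report
// spread across the whole respondent pool
function costPerSubmission(totalRespondents: number): number {
  return MODERATION + ADAPTIVE + REPORT / totalRespondents;
}

costPerSubmission(100); // ≈ $0.0027, inside the quoted $0.002-0.004 range
```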
If the AI Breaks, the Survey Still Works
One design rule I'd push for any AI-enhanced feature: the AI should improve the experience, never block it.
```typescript
try {
  const res = await fetch(`/api/s/${slug}/adaptive-questions`, {
    method: "POST",
    body: JSON.stringify(answers),
  });
  if (res.ok) {
    const data = await res.json();
    if (data.questions?.length > 0) {
      setAdaptiveQuestions(data);
      setStage("adaptive");
      return;
    }
  }
} catch {
  // AI failed — skip to submission
}
doSubmit(answers, null);
```
If Gemini is down or slow, people still submit their survey. You lose the follow-ups, not the whole product.
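The same rule argues for a hard timeout, not just error handling: a slow Gemini call shouldn't hold the respondent hostage either. A generic sketch — the helper and the 5-second budget are my choices, not from the repo:

```typescript
// Resolve with the fallback if the work doesn't finish within `ms`
function withTimeout<T>(work: Promise<T>, ms: number, fallback: T): Promise<T> {
  const timeout = new Promise<T>((resolve) =>
    setTimeout(() => resolve(fallback), ms)
  );
  return Promise.race([work, timeout]);
}

// Usage: give the adaptive-question fetch a 5s budget, then submit anyway
// const data = await withTimeout(fetchAdaptiveQuestions(answers), 5000, null);
```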
Next Steps
Response quality scoring. Use the AI to score how substantive each free-text answer is. Vague responses get a gentler follow-up encouraging more detail. Detailed answers get acknowledged and move on. Adaptive effort, not just adaptive questions.
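This doesn't have to start with a model call. A cheap heuristic pre-filter could decide whether a free-text answer even warrants a scoring call — a sketch of the idea, with arbitrary thresholds; this is my suggestion, not Dyadem's implementation:

```typescript
// Crude substance check: long enough, and containing at least one
// concrete detail (a digit, or a reasonable variety of distinct words)
function looksSubstantive(answer: string): boolean {
  const words = answer.trim().split(/\s+/).filter(Boolean);
  const hasNumber = /\d/.test(answer);
  const distinct = new Set(words.map((w) => w.toLowerCase())).size;
  return words.length >= 8 && (hasNumber || distinct >= 8);
}

looksSubstantive("idk"); // false
looksSubstantive("Cut heating to 2 hours a day last winter to save money"); // true
```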
Live gap visualisation. A real-time dashboard showing survey creators which topics are well-covered and which have gaps as responses come in — with the ability to manually nudge the AI's priorities. "Stop asking about rent. Start asking about childcare."
Respondent clustering. After enough responses, identify respondent archetypes and tailor the follow-up strategy per cluster, not just per individual answer.
Research-grade export. Structured export that preserves the adaptive question context — not just a CSV. A researcher needs to know why each person got different follow-ups to interpret the data properly.
Takeaways
Give the AI more than just the current input. The follow-up questions are good because Gemini sees the individual's answers and the aggregate dataset. Without that broader context, you just get generic follow-ups.
Constrain output with schemas, not prompts. Asking an LLM to "please return JSON" is fragile. Gemini's responseJsonSchema with Zod validation is reliable.
Volume-aware prompting matters. Shifting the AI's behaviour based on how much data exists is a small detail that changes the character of the whole system.
Dyadem is open source — GitHub repo here if you want to look at the full implementation.