DEV Community

Michael O

Posted on • Originally published at xeroaiagency.com

How to Do Customer Research With an AI Agent

Traditional customer discovery advice assumes you have time. Talk to 50 people. Run a survey. Build a persona. Schedule interviews. That process works if you have two co-founders and no day job. Most solo founders have neither.

There is a faster path. AI agents can scan thousands of real customer conversations on Reddit, App Store reviews, and niche forums and pull the signal out in a fraction of the time. Not summaries of what AI thinks people say. Actual quotes, actual complaints, actual questions people typed in public.

This is how the Xero operating system does it internally, and it is the same logic behind Xero Scout.


What Is AI-Assisted Customer Research?

AI-assisted customer research means using an agent to collect and synthesize publicly available conversations at scale. It does not mean replacing human judgment with generated assumptions. You point the agent at real data from Reddit threads, forum posts, and review sites. The agent reads and extracts. You decide what to build next.

The key distinction: this is not asking AI to imagine what customers might say. Reddit posts, forum threads, and review text are primary sources, so every finding traces back to something a real person wrote.

Four things it can do well: find recurring complaints, surface exact language people use about a problem, flag what frustrates users about existing tools, and group patterns across hundreds of comments faster than any human.

It cannot replace a real customer conversation. But it can help you walk into that conversation knowing you have already read the market.


Where Does the Real Customer Signal Live?

The real customer signal lives in unfiltered public complaint threads on Reddit, in one-star app reviews, and in niche forums where people talk without a sales audience watching. The difference in signal quality is significant enough to change what you build.

Most founders start with Google. That is the wrong move for customer research, because Google surfaces polished SEO content rather than raw buyer psychology.

The real signal is in places where people complain publicly without a sales agenda:

Reddit is the most valuable. Subreddits like r/SaaS, r/Entrepreneur, r/solopreneur, r/webdev, and r/smallbusiness have thousands of posts from people describing real problems, failed tools, and specific frustrations. Nobody is pitching. They are just talking. According to Similarweb, Reddit receives over 1.5 billion visits per month globally, with a significant share coming from people actively searching for product recommendations and solutions.

App Store and G2 reviews are gold for competitive research. One-star reviews tell you exactly what the market's existing solutions get wrong. Filter for "I switched because" and "I wish it could" to surface high-intent pain.

Indie Hackers comments carry strong signal for B2B SaaS. Founders discuss what broke their product, what they would pay for, and what they have tried.

Niche forums and Discord servers are harder to scrape systematically but worth monitoring for specific verticals.

The pattern across all of them: you are looking for the same complaint appearing in different threads from different people. One complaint is an edge case. The same complaint in ten threads is a real problem worth solving.
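
The review filter mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the Xero implementation: the review dictionaries with `stars` and `text` keys and the `high_intent_reviews` helper are assumptions made for the example, and only the two phrases come from the text.

```python
import re

# The two phrases come from the article; extend the list for your own market.
HIGH_INTENT_PATTERNS = [r"\bI switched because\b", r"\bI wish it could\b"]


def high_intent_reviews(reviews, max_stars=None):
    """Return reviews whose text contains a high-intent phrase.

    `reviews` is assumed to be a list of dicts with "stars" and "text" keys
    (a hypothetical shape; adapt it to whatever your export provides).
    """
    compiled = [re.compile(p, re.IGNORECASE) for p in HIGH_INTENT_PATTERNS]
    hits = []
    for review in reviews:
        # Optionally keep only low-star reviews, e.g. max_stars=1.
        if max_stars is not None and review["stars"] > max_stars:
            continue
        if any(p.search(review["text"]) for p in compiled):
            hits.append(review)
    return hits
```

Calling `high_intent_reviews(reviews, max_stars=1)` narrows the result to one-star reviews that also contain one of the phrases.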


How Does the Agent Workflow Actually Work?

The workflow has five steps: define your query set, collect threads, extract and label each comment, group patterns, then do a human review to decide what matters. A first run takes about 90 minutes, and repeat runs take under 30. This process runs weekly inside the Evo operating system.

Here is the full structure:

Step 1: Define the query set

Give the agent your product URL or a short description of the problem you are solving. The agent generates 10 to 20 search queries it will use to find relevant threads. These are variations of the pain, not the product name. "I wish there was a tool that" or "anyone else struggling with" framing finds better signal than searching for competitor names.
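
In the workflow above a language model generates the queries. As a rough stand-in, the same idea can be sketched with fixed templates. The frames below adapt the phrasings quoted in this step, and `build_queries` and its inputs are hypothetical names used only for illustration.

```python
# Pain-framed templates. The first two adapt phrasings quoted in this step;
# the third is an added assumption.
PAIN_FRAMES = [
    "I wish there was a tool for {pain}",
    "anyone else struggling with {pain}",
    "how do you deal with {pain}",
]


def build_queries(pains, frames=PAIN_FRAMES):
    """Expand short pain descriptions into search queries, without duplicates."""
    queries = []
    seen = set()
    for pain in pains:
        for frame in frames:
            query = frame.format(pain=pain.strip())
            key = query.lower()
            if key not in seen:  # case-insensitive de-duplication
                seen.add(key)
                queries.append(query)
    return queries
```

Five or six pain phrases with three frames land inside the 10 to 20 query range. Note that the inputs describe the symptom, never a product or competitor name.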

Step 2: Collect threads

The agent searches Reddit (or the target platform) and collects the top threads, comments, and replies matching those queries. Aim for 100 to 200 unique comments minimum before drawing conclusions. Smaller samples produce misleading patterns.
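
A minimal sketch of the bookkeeping around collection, under stated assumptions: the URL builder targets Reddit's public JSON search listing (check the current API terms and rate limits before relying on it), and the comment dictionaries with `id` and `body` keys are a hypothetical shape. No network call is made here.

```python
from urllib.parse import urlencode

MIN_COMMENTS = 100  # lower bound suggested in the text


def reddit_search_url(subreddit, query, limit=25):
    """Build a search URL for one subreddit (assumes the public JSON listing)."""
    params = urlencode(
        {"q": query, "restrict_sr": 1, "sort": "relevance", "limit": limit}
    )
    return f"https://www.reddit.com/r/{subreddit}/search.json?{params}"


def dedupe_comments(comments):
    """Drop comments that repeat an id or repeat the same normalized text."""
    seen_ids, seen_text, unique = set(), set(), []
    for comment in comments:
        # Normalize case and whitespace so reposts count once.
        text_key = " ".join(comment["body"].lower().split())
        if comment["id"] in seen_ids or text_key in seen_text:
            continue
        seen_ids.add(comment["id"])
        seen_text.add(text_key)
        unique.append(comment)
    return unique


def enough_sample(comments, minimum=MIN_COMMENTS):
    """True once the number of unique comments reaches the minimum."""
    return len(dedupe_comments(comments)) >= minimum
```

Counting unique comments rather than raw results matters because the same thread often matches several queries.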

Step 3: Extract and label

The agent reads each comment and assigns a label: complaint, feature request, comparison question, churn reason, or success story. It pulls the exact quote and notes the thread context. No paraphrasing at this stage. You want the raw language.
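
One way to enforce the "no paraphrasing" rule is to reject any extracted quote that does not appear verbatim in the source comment. The sketch below assumes the five labels listed in this step. The `LabeledComment` type and `make_labeled` helper are hypothetical names, not part of any library.

```python
from dataclasses import dataclass

# The five labels named in this step.
LABELS = {
    "complaint",
    "feature request",
    "comparison question",
    "churn reason",
    "success story",
}


@dataclass(frozen=True)
class LabeledComment:
    quote: str
    label: str
    thread_context: str


def make_labeled(comment_body, quote, label, thread_context):
    """Build a LabeledComment, rejecting unknown labels and paraphrased quotes."""
    if label not in LABELS:
        raise ValueError(f"unknown label: {label!r}")
    # The quote must be an exact substring of what the person typed.
    if quote not in comment_body:
        raise ValueError("quote is not verbatim text from the comment")
    return LabeledComment(quote=quote, label=label, thread_context=thread_context)
```

A failed check is a useful signal in itself: it usually means the model summarized instead of extracting, and that comment should be re-run or reviewed by hand.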

Step 4: Pattern grouping

Across all labeled comments, the agent groups similar complaints together. You end up with something like: "18 people mentioned that [Tool X] does not support [Feature Y]" or "12 people asked whether this works without [Dependency Z]."
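
The counting side of this step can be sketched as follows. It assumes the agent has already attached a `cluster` name to each labeled comment (the semantic grouping itself is the model's job), and the `cluster`, `thread_id`, and `quote` keys are a hypothetical record shape.

```python
from collections import defaultdict


def group_patterns(labeled):
    """Summarize clusters by distinct threads and total mentions.

    `labeled` is assumed to be a list of dicts with "cluster", "thread_id",
    and "quote" keys.
    """
    threads = defaultdict(set)
    quotes = defaultdict(list)
    for item in labeled:
        threads[item["cluster"]].add(item["thread_id"])
        quotes[item["cluster"]].append(item["quote"])
    rows = [
        {
            "cluster": name,
            "threads": len(threads[name]),   # distinct threads
            "mentions": len(quotes[name]),   # total comments
            "quotes": quotes[name],
        }
        for name in threads
    ]
    # Most widespread clusters first; ties broken by mentions, then name.
    rows.sort(key=lambda r: (-r["threads"], -r["mentions"], r["cluster"]))
    return rows
```

Ranking by distinct threads rather than raw mentions keeps one heated thread from looking like a market-wide pattern.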

Step 5: Human review

You read the grouped output and decide what is signal. The agent finds patterns. You decide which ones matter for your specific product strategy. Do not skip this step.

That entire loop takes about 90 minutes the first time, and less than 30 on repeat runs once the queries are refined.


What Should You Do With the Research Output?

The output should drive four things: sharper positioning copy, better cold outreach, a prioritized product roadmap, and reply angles for community distribution. The bigger move is letting the exact language customers use rewrite your messaging and your feature backlog.

Most founders stop at the insight. The leverage is in using the output to drive the next decisions:

Sharpen the positioning. If the research keeps surfacing a specific frustration that your product actually solves better than alternatives, that is your headline. Not your feature list. The exact words people use to describe the pain.

Write better cold outreach. If you know the specific complaint, you can open with it instead of a generic pitch. "Noticed a thread where founders said [exact complaint] is their biggest frustration with [competitor]. We built [product] because of that specific problem." That is not spam. That is proof you did your homework.

Prioritize the roadmap. Grouping complaints by frequency gives you a signal-weighted backlog. Build the thing that 18 people are actively asking for before the thing that one person mentioned once.

Generate reply angles for distribution. If you know the forums where your target customers are posting, you can draft helpful replies to those threads. Not pitching. Answering the question they asked, with the credibility of someone who actually solved it. This is exactly what Xero Scout automates.


What Tools Do You Actually Need to Build This?

You do not need an expensive stack. A working agent needs four things: a search layer, a language model with a long context window, a structured output format, and a storage layer. Free or near-free options exist for each component. Total setup time is a few hours.

Here is how each piece works in practice:

A search layer: something that can query Reddit or pull from the API. Reddit's official API is available, though rate-limited on free tiers. The Reddit API documentation covers the available endpoints. For manual work, Google queries restricted with site:reddit.com, combined with a small Python scraper, cover most cases without needing API access.

A language model with a long context window: you want to feed it 100 to 200 comments at once and get a structured extraction back. GPT-4 class models handle this well. Claude works too, and tends to be better at maintaining structured output consistency across large batches.

A structured output format: have the model return JSON with fields for quote, category, source URL, and thread context. Flat text dumps are hard to act on.
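
Model output is not guaranteed to follow the requested shape, so it helps to validate it before storing anything. A minimal sketch, assuming the model returns a JSON array of objects; the snake_case field names are one possible spelling of the four fields listed above, and `parse_extractions` is a hypothetical helper.

```python
import json

# One possible spelling of the four fields named in the text.
REQUIRED_FIELDS = ("quote", "category", "source_url", "thread_context")


def parse_extractions(raw):
    """Split model output into valid and rejected extraction objects."""
    data = json.loads(raw)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of extraction objects")
    valid, rejected = [], []
    for item in data:
        # Every required field must be a non-empty string.
        ok = isinstance(item, dict) and all(
            isinstance(item.get(field), str) and item[field].strip()
            for field in REQUIRED_FIELDS
        )
        (valid if ok else rejected).append(item)
    return valid, rejected
```

Keeping the rejected items, instead of silently dropping them, shows how often the model drifts from the format across large batches.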

Storage: a simple spreadsheet or Airtable base to accumulate runs over time. You want to see how the patterns shift as you refine your queries.
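
If a spreadsheet is the storage layer, a CSV file that each run appends to is enough to start. The sketch below uses only the standard library. The column names and the `append_run` helper are assumptions for illustration, and the file can be imported into a spreadsheet or Airtable later.

```python
import csv
import os
from datetime import date

FIELDS = ["run_date", "quote", "category", "source_url", "thread_context"]


def append_run(path, rows, run_date=None):
    """Append one run's extractions to a CSV file, stamping each row with a date.

    `rows` is assumed to be a list of dicts holding the four extraction fields.
    """
    run_date = run_date or date.today().isoformat()
    # Write the header only when the file is new or empty.
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        for row in rows:
            writer.writerow({"run_date": run_date, **row})
    return len(rows)
```

The `run_date` column is what makes it possible to see how patterns shift between runs.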

If you want this running as a recurring loop rather than a one-off project, wire it to a cron job. The AI agent task scheduling post covers the infrastructure.


What Are the Most Common Mistakes Founders Make?

The most expensive mistake is searching for the product name instead of the pain, which surfaces review content rather than real buyer frustrations. The second is treating a single vivid complaint as a confirmed pattern before checking how many other people said the same thing. Both mistakes lead to building the wrong feature with false confidence.

Searching for the product, not the pain. "Reddit posts about [Competitor]" finds review content. "Reddit posts where people are frustrated about [problem]" finds buyers. Search for the symptom, not the category.

Reading the output without counting. One vivid complaint feels important. But if only one person said it in 200 comments, it is not a pattern. Count the clusters before you prioritize.

Treating it as a replacement for talking to people. This research is pre-qualification, not a substitute. Use it to identify the five people most worth calling, not to avoid calling anyone.

Ignoring the language. The most valuable output from this research is often not the insight but the exact phrasing. When someone says "I just want to know if it actually works without me having to monitor it every day," that sentence is your landing page copy. Use their words, not yours.

Stopping at one subreddit. The same buyer persona exists in multiple communities. A solo founder building a SaaS might be in r/solopreneur, r/SaaS, r/Entrepreneur, and a vertical-specific subreddit all at once. Run the same queries across all of them.


What Does This Look Like in Practice at Xero?

The Xero operating system runs this research loop weekly for each active product. The output feeds the content calendar, the reply queue, and the positioning documents directly. When a complaint cluster appears three weeks in a row, it becomes either a product feature or a piece of content. Nothing sits idle in a spreadsheet.
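
The "three weeks in a row" rule is simple to check once each weekly run is reduced to a set of cluster names. This is a sketch of that rule only, not the internal Xero code; `persistent_clusters` and its input shape are hypothetical.

```python
def persistent_clusters(weekly_runs, weeks=3):
    """Return clusters present in each of the last `weeks` runs.

    `weekly_runs` is assumed to be a list of sets of cluster names,
    ordered oldest first.
    """
    if len(weekly_runs) < weeks:
        return set()  # not enough history yet
    recent = weekly_runs[-weeks:]
    return set.intersection(*(set(run) for run in recent))
```

This only works if cluster names stay stable between runs, so it is worth having the agent reuse existing names when a complaint matches an earlier cluster.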

For CarCloser, the agent monitors automotive sales subreddits. For the main Xero AI audience, it watches r/solopreneur, r/SaaS, and r/Entrepreneur.

Scout was built as the productized version of this internal workflow because other founders kept asking how we were finding the right Reddit conversations to join. The answer was the agent loop above, wrapped in a cleaner interface.

If you want to start doing this without building the infrastructure from scratch, the AI starter guide walks through the actual agent setup, including the research workflow.


Customer research has always been the highest-leverage activity for early-stage products. The founders who understand what their market is actually saying before they build have a real advantage. AI agents do not replace that work. They just make it fast enough to fit into a Saturday morning.


Published by Michael Olivieri / Xero AI


