The epistemological problem general AI tools do not solve, the architecture that does, and what this means for anyone building research-adjacent applications.
By Dr. Susanne Friese · Founder, QInsights · Qualitative Research Methodologist
The Problem With How Most Teams Use AI for Qualitative Research Right Now
If you have ever watched a researcher paste forty interview transcripts into ChatGPT and ask for themes, you have seen the failure mode up close.
The output looks credible. It sounds like analysis. It uses the right vocabulary. And it is nearly impossible to verify because there is no traceable connection between the findings and what participants actually said.
This is not a hallucination problem in the traditional sense. The model is not making things up from nothing. It is doing what it was designed to do: producing fluent, coherent, plausible-sounding text. The problem is that qualitative research has a specific evidentiary standard that general language models were not built to meet.
Every finding in a qualitative study must be traceable to a specific source. Not a paraphrase of what a participant meant. Not a summary of the general direction of the interviews. The actual quote, from the actual transcript, at the actual moment the participant said it. Without that traceability, you cannot defend your analysis in peer review, in an ethics board submission, or in a client presentation where someone asks "but where does that come from?"
General AI tools produce outputs. Qualitative research requires evidence chains.
What Conversational Analysis with AI Actually Is
The methodology published by Dr. Susanne Friese on SSRN (2025) formalises what she calls Conversational Analysis with AI. Here is the technical version.
Traditional QDAS workflows look like this:
Raw transcripts
→ Manual reading
→ Code assignment per passage
→ Code tree construction
→ Theme identification across codes
→ Finding synthesis
→ Write-up
Each step in that pipeline is performed manually. For a forty-interview corpus averaging 8,000 words per transcript, that is 320,000 words of data that a researcher must read, mark, sort, and synthesise. At full analytical attention, that takes weeks. In practice it takes longer because no human researcher sustains full analytical attention across that volume.
Conversational Analysis with AI restructures the pipeline:
Raw transcripts
→ Upload to structured research environment
→ Researcher engages in dialogue with data
    - "What did participants say about X?"
    - "Show me contradictions in how people described Y"
    - "Which participants mentioned Z unprompted?"
→ AI surfaces patterns with source citations
→ Researcher interprets, challenges, refines
→ Findings emerge from dialogue, not from sorting
The critical architectural requirement is what happens between the AI response and the source data. Every claim the AI makes must be anchored to a specific passage. The system must be able to show the researcher exactly which transcript, which participant, and which moment produced each finding.
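One way to make that requirement concrete is to treat a finding as invalid unless every citation can be located verbatim in its source. The sketch below is illustrative only and is not QInsights code: the `Citation` and `Finding` types, the `unverified_citations` helper, and the idea of holding the corpus as a dictionary of transcript text are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    participant: str  # e.g. "P12"
    transcript: str   # transcript identifier, e.g. a file name
    passage: str      # verbatim quote, not a paraphrase


@dataclass(frozen=True)
class Finding:
    claim: str
    citations: tuple[Citation, ...]


def normalise(text: str) -> str:
    """Collapse whitespace so line wrapping does not break verbatim matching."""
    return " ".join(text.split())


def unverified_citations(finding: Finding, corpus: dict[str, str]) -> list[Citation]:
    """Return the citations whose passage is not found verbatim in its transcript.

    `corpus` maps a transcript identifier to its full text (an assumed layout).
    A finding with no citations at all is rejected outright.
    """
    if not finding.citations:
        raise ValueError("finding has no citations")
    missing = []
    for citation in finding.citations:
        source = normalise(corpus.get(citation.transcript, ""))
        quote = normalise(citation.passage)
        # An empty quote, an unknown transcript, or a paraphrase all fail here.
        if not quote or quote not in source:
            missing.append(citation)
    return missing
```

An empty return value means every quote was found where the citation says it is. Anything else is a claim the researcher should not accept until it is checked by hand.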
The Architecture That Makes This Work in QInsights
QInsights is purpose-built around this evidentiary requirement. Here is how the system handles it technically.
Document ingestion and chunking
Uploaded transcripts are chunked into semantically coherent units rather than arbitrary token windows. Interview data has a specific structure: questions, responses, probes, tangents, returns. Chunking that respects conversational structure produces retrieval results that are more contextually accurate than chunking by token count.
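As a rough illustration of structure-aware chunking, the sketch below starts a new chunk at every interviewer turn and keeps the replies attached to the question that prompted them. It is a simplification under stated assumptions, not the QInsights implementation: it assumes each turn begins with a speaker label such as `Interviewer:` or `P12:`, it treats every interviewer turn (including probes) as the start of a new chunk, and the names `Exchange` and `chunk_by_exchange` are hypothetical.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Exchange:
    question: str
    responses: list[tuple[str, str]] = field(default_factory=list)  # (speaker, text)


def chunk_by_exchange(transcript: str, interviewer: str = "Interviewer") -> list[Exchange]:
    """Split a transcript into question-plus-replies chunks.

    Assumed line format: "Interviewer: ..." or "P<number>: ...". Lines without
    a speaker label are treated as a continuation of the previous turn.
    """
    speaker_line = re.compile(rf"^({re.escape(interviewer)}|P\d+):\s*(.*)$")
    chunks: list[Exchange] = []
    last = None  # "question" or "response": where continuation lines belong
    for raw in transcript.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = speaker_line.match(line)
        if match and match[1] == interviewer:
            chunks.append(Exchange(question=match[2]))
            last = "question"
        elif match:
            if not chunks:
                chunks.append(Exchange(question=""))  # reply before any question
            chunks[-1].responses.append((match[1], match[2]))
            last = "response"
        elif last == "question":
            chunks[-1].question += " " + line
        elif last == "response":
            speaker, text = chunks[-1].responses[-1]
            chunks[-1].responses[-1] = (speaker, text + " " + line)
    return chunks
```

Because each chunk carries its own question, a retrieved reply such as "Long, honestly" arrives with the context needed to interpret it, which a fixed token window would not guarantee.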
Retrieval-augmented generation with citation anchoring
Every response the system generates is grounded in retrieved passages from the uploaded corpus. The retrieval step runs before generation, not after. The model does not generate a finding and then look for supporting quotes. It retrieves relevant passages first, then generates a response that is constrained by what was actually retrieved.
User query: "What did participants say about trust in the onboarding flow?"
Step 1: Semantic search across chunked transcript corpus
Step 2: Retrieve top-k passages most relevant to query
Step 3: Generate response constrained to retrieved passages
Step 4: Return response with source citations intact
Output:
{
  "finding": "Participants described trust as conditional on transparency about data use",
  "citations": [
    {
      "participant": "P12",
      "transcript": "session_12.txt",
      "passage": "I would have been fine with it if they'd just told me upfront what they were collecting",
      "timestamp": "00:23:41"
    },
    {
      "participant": "P07",
      "transcript": "session_07.txt",
      "passage": "The problem wasn't the data, it was not knowing",
      "timestamp": "00:41:18"
    }
  ]
}
This citation structure is what separates qualitative-specific AI from general AI applied to qualitative data. The researcher can verify every finding against its source. The evidence chain is intact.
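The retrieve-then-generate ordering can be sketched in a few lines. This is a minimal illustration under loud assumptions, not the QInsights pipeline: plain token-overlap cosine similarity stands in for semantic search, `generate` is a hypothetical callable standing in for the model call, and the `Chunk` type and function names are invented for the example.

```python
import math
import re
from collections import Counter
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Chunk:
    participant: str
    transcript: str
    passage: str
    timestamp: str


def tokens(text: str) -> Counter:
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, corpus: list[Chunk], k: int = 3) -> list[Chunk]:
    """Steps 1-2: rank chunks against the query, keep the top k that overlap at all."""
    q = tokens(query)
    scored = [(cosine(q, tokens(c.passage)), c) for c in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for score, chunk in scored[:k] if score > 0]


def answer(query: str, corpus: list[Chunk], generate, k: int = 3) -> dict:
    """Steps 3-4: generate only from retrieved passages and return them as citations.

    `generate(query, passages)` is a stand-in for a model call (an assumption).
    """
    hits = retrieve(query, corpus, k)
    if not hits:
        # Nothing was retrieved, so nothing is asserted and the model is not called.
        return {"finding": None, "citations": []}
    finding = generate(query, [c.passage for c in hits])
    return {"finding": finding, "citations": [asdict(c) for c in hits]}
```

The design point is the order of operations: the generator only ever sees passages that were retrieved, and those same passages are returned as the citations, so a finding never ships without the material it was built from.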
Contradiction detection
One of the most valuable and underused capabilities in qualitative analysis is identifying where participants contradict each other or where the same participant holds contradictory positions. QInsights surfaces these explicitly as part of the analytical dialogue rather than smoothing them into a false consensus.
User query: "Are there contradictions in how participants described the onboarding experience?"
System response: "Yes. P03 and P09 describe the length as appropriate while P14, P22, and P31 describe it as too long. Notably, P14 also described the process as 'fine' in a later exchange [timestamp 01:02:33], suggesting the length concern may be context-dependent."
Citations: [P03/session_03.txt, P09/session_09.txt, P14/session_14.txt, P22/session_22.txt, P31/session_31.txt]
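A minimal sketch of the bookkeeping behind an answer like that, assuming each statement has already been given a stance label upstream (by the researcher or by a model). The labels, the `Statement` type, and `find_contradictions` are hypothetical and do not describe how QInsights detects contradictions internally; the point is only that positions are grouped and kept apart rather than averaged away.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    participant: str
    timestamp: str
    stance: str  # hypothetical label assigned upstream, e.g. "too_long"
    quote: str


def find_contradictions(statements: list[Statement]) -> dict:
    """Group statements about one topic by stance instead of merging them."""
    by_stance = defaultdict(set)       # stance -> participants who expressed it
    by_participant = defaultdict(set)  # participant -> stances they expressed
    for s in statements:
        by_stance[s.stance].add(s.participant)
        by_participant[s.participant].add(s.stance)

    return {
        # More than one key here means the corpus contains disagreement.
        "positions": {stance: sorted(people) for stance, people in by_stance.items()},
        # One participant holding several stances: a within-person contradiction.
        "within_participant": {
            person: sorted(stances)
            for person, stances in by_participant.items()
            if len(stances) > 1
        },
    }
```

Keeping the timestamp and quote on every `Statement` means each side of a contradiction can still be traced back to the transcript, the same evidentiary standard applied to any other finding.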
Why This Matters for Developers Building Research-Adjacent Applications
If you are building anything that involves processing user interview data, customer feedback, open-ended survey responses, or unstructured qualitative data at scale, the architectural decisions QInsights represents are directly relevant.
The citation requirement is not optional for serious research use cases. If you are building a research tool and your AI outputs do not anchor to specific source passages, your users will face a credibility problem the moment they try to present findings to anyone who asks "where does that come from?"
Chunking strategy matters more than model choice for qualitative data. The quality of retrieval in a RAG system for interview data is more sensitive to how the documents are chunked than to which model generates the response. Conversational data has structure. Preserve it.
Contradiction is signal, not noise. Most RAG implementations try to produce coherent, consensus outputs. For qualitative research, the contradictions between participants are often the most analytically significant finding. Build systems that surface them rather than resolve them.
The researcher is not replaceable, just better equipped. Any system that positions AI as doing the analysis rather than supporting the researcher will produce lower-quality outputs and will fail the epistemological standards that serious research requires. The researcher's interpretive judgment is the product. The AI handles the parts of the job that do not require it.
Practical Integration Notes
QInsights is available at qinsights.ai. Monthly webinars run live with real data demos. You can also book a direct session with Dr. Friese to discuss your specific research context.
The platform handles:
- Individual and group interview transcripts
- Focus group recordings and transcripts
- Open-ended survey response sets
- Ethnographic field notes and observational data
- Document analysis corpora
Supported research contexts: academic research, UX research, market research, healthcare research, policy analysis, enterprise insight functions.
Data privacy is built into the architecture for institutional research requirements. Participant data does not leave the secure research environment.
Further Reading
- Friese, S. (2025). Conversational Analysis with AI: Rethinking Coding in Qualitative Analysis. SSRN. Available at ssrn.com/abstract=5232579
- Hayes, A. (2025). Conversing with qualitative data: enhancing qualitative research through large language models. International Journal of Qualitative Methods, 24.
- Morgan, D. (2025). Query-Based Analysis: A strategy for analyzing qualitative data using ChatGPT.
Dr. Susanne Friese is a qualitative research methodologist with over thirty years of experience, founder of QInsights and Qeludra B.V., author of Qualitative Data Analysis with ATLAS.ti (SAGE Publications), and keynote speaker on AI in qualitative research globally.