Lopty ads

Posted on May 29

Why Qualitative Researchers Are Replacing Manual Coding With Conversational AI - And How It Works Technically

#ai #data #computerscience

The epistemological problem general AI tools do not solve, the architecture that does, and what this means for anyone building research-adjacent applications.

By Dr. Susanne Friese · Founder, QInsights · Qualitative Research Methodologist


The Problem With How Most Teams Use AI for Qualitative Research Right Now

If you have ever watched a researcher paste forty interview transcripts into ChatGPT and ask for themes, you have seen the failure mode up close.

The output looks credible. It sounds like analysis. It uses the right vocabulary. And it is nearly impossible to verify because there is no traceable connection between the findings and what participants actually said.

This is not a hallucination problem in the traditional sense. The model is not making things up from nothing. It is doing what it was designed to do: producing fluent, coherent, plausible-sounding text. The problem is that qualitative research has a specific evidentiary standard that general language models were not built to meet.

Every finding in a qualitative study must be traceable to a specific source. Not a paraphrase of what a participant meant. Not a summary of the general direction of the interviews. The actual quote, from the actual transcript, at the actual moment the participant said it. Without that traceability, you cannot defend your analysis in peer review, in an ethics board submission, or in a client presentation where someone asks "but where does that come from?"

General AI tools produce outputs. Qualitative research requires evidence chains.

What Conversational Analysis with AI Actually Is

The methodology published by Dr. Susanne Friese on SSRN (2025) formalises what she calls Conversational Analysis with AI. Here is the technical version.

Traditional QDAS workflows look like this:

Raw transcripts
    → Manual reading
    → Code assignment per passage
    → Code tree construction
    → Theme identification across codes
    → Finding synthesis
    → Write-up

Each step in that pipeline is performed manually. For a forty-interview corpus averaging 8,000 words per transcript, that is 320,000 words of data that a researcher must read, mark, sort, and synthesise. At full analytical attention, that takes weeks. In practice it takes longer because no human researcher sustains full analytical attention across that volume.

Conversational Analysis with AI restructures the pipeline:

Raw transcripts
    → Upload to structured research environment
    → Researcher engages in dialogue with data
        - "What did participants say about X?"
        - "Show me contradictions in how people described Y"
        - "Which participants mentioned Z unprompted?"
    → AI surfaces patterns with source citations
    → Researcher interprets, challenges, refines
    → Findings emerge from dialogue, not from sorting

The critical architectural requirement is what happens between the AI response and the source data. Every claim the AI makes must be anchored to a specific passage. The system must be able to show the researcher exactly which transcript, which participant, and which moment produced each finding.
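One way to make that requirement concrete is to encode it in the data model, so that a finding without a source cannot be constructed at all. The sketch below is illustrative only: the class and field names (`Citation`, `Finding`, `sources`) are assumptions chosen to mirror the fields discussed in this article, not QInsights' actual schema.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    participant: str  # e.g. "P12"
    transcript: str   # source file the quote comes from
    passage: str      # the verbatim quote, not a paraphrase
    timestamp: str    # position in the recording, "HH:MM:SS"


@dataclass(frozen=True)
class Finding:
    claim: str
    citations: tuple[Citation, ...]

    def __post_init__(self) -> None:
        # An unanchored claim is an output, not evidence: refuse to build it.
        if not self.claim.strip():
            raise ValueError("a finding needs a claim")
        if not self.citations:
            raise ValueError("a finding needs at least one citation")

    def sources(self) -> list[str]:
        """Pointers a researcher can follow back to the source material."""
        return [
            f"{c.participant} / {c.transcript} @ {c.timestamp}"
            for c in self.citations
        ]
```

Putting the check in the constructor means every downstream component (storage, export, UI) can assume the evidence chain exists instead of re-checking it.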

The Architecture That Makes This Work in QInsights

QInsights is purpose-built around this evidentiary requirement. Here is how the system handles it technically.

Document ingestion and chunking

Uploaded transcripts are chunked into semantically coherent units rather than arbitrary token windows. Interview data has a specific structure: questions, responses, probes, tangents, returns. Chunking that respects conversational structure produces retrieval results that are more contextually accurate than chunking by token count.
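As a rough illustration of structure-aware chunking, the sketch below groups a transcript into exchanges: one interviewer turn plus every response that follows it. It assumes a simple `Speaker: text` line format and an interviewer label of `Interviewer`; both are assumptions made for the example, and the function name `chunk_by_exchange` is hypothetical. A production chunker would also need to recognise probes and tangents, which this version treats as new exchanges.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field

# Assumed line format: "Speaker: text". A line without a speaker label is
# treated as a continuation of the previous turn. Note that a sentence that
# happens to contain an early colon would be misread as a new turn.
TURN = re.compile(r"^(?P<speaker>[^:]{1,40}):\s*(?P<text>.*)$")


@dataclass
class Chunk:
    question: str
    responses: list[tuple[str, str]] = field(default_factory=list)


def chunk_by_exchange(transcript: str, interviewer: str = "Interviewer") -> list[Chunk]:
    chunks: list[Chunk] = []
    for raw in transcript.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = TURN.match(line)
        if match is None:
            # Continuation line: append to the most recent turn, if any.
            if chunks and chunks[-1].responses:
                speaker, text = chunks[-1].responses[-1]
                chunks[-1].responses[-1] = (speaker, f"{text} {line}")
            elif chunks:
                chunks[-1].question = f"{chunks[-1].question} {line}".strip()
            continue
        speaker, text = match["speaker"].strip(), match["text"].strip()
        if speaker == interviewer:
            chunks.append(Chunk(question=text))  # a new exchange starts here
        elif chunks:
            chunks[-1].responses.append((speaker, text))
        else:
            # Participant speech before any question gets its own chunk.
            chunks.append(Chunk(question="", responses=[(speaker, text)]))
    return chunks
```

Because each chunk keeps the question next to its answers, a retrieved response arrives with the prompt that produced it, which is the context a token-window split tends to cut off.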

Retrieval-augmented generation (RAG) with citation anchoring

Every response the system generates is grounded in retrieved passages from the uploaded corpus. The retrieval step runs before generation, not after. The model does not generate a finding and then look for supporting quotes. It retrieves relevant passages first, then generates a response that is constrained by what was actually retrieved.

User query: "What did participants say about trust in the onboarding flow?"

Step 1: Semantic search across chunked transcript corpus
Step 2: Retrieve top-k passages most relevant to query
Step 3: Generate response constrained to retrieved passages
Step 4: Return response with source citations intact

Output:
{
  "finding": "Participants described trust as conditional on transparency about data use",
  "citations": [
    {
      "participant": "P12",
      "transcript": "session_12.txt",
      "passage": "I would have been fine with it if they'd just told me upfront what they were collecting",
      "timestamp": "00:23:41"
    },
    {
      "participant": "P07",
      "transcript": "session_07.txt",
      "passage": "The problem wasn't the data, it was not knowing",
      "timestamp": "00:41:18"
    }
  ]
}

This citation structure is what separates qualitative-specific AI from general AI applied to qualitative data. The researcher can verify every finding against its source. The evidence chain is intact.
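The four steps above can be sketched in a few lines of Python. This is a toy version under stated assumptions, not the QInsights implementation: a bag-of-words cosine similarity stands in for real semantic search, and `generate` is a hypothetical callable representing whatever model call produces the finding. The point it illustrates is the ordering: the generator only ever sees passages that were retrieved, and those same passages are returned as citations.

```python
from __future__ import annotations

import math
import re
from collections import Counter
from typing import Callable


def bag_of_words(text: str) -> Counter:
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, passages: list[dict], k: int = 2) -> list[dict]:
    """Steps 1-2: score every passage against the query, keep the top k."""
    q = bag_of_words(query)
    scored = [(cosine(q, bag_of_words(p["passage"])), p) for p in passages]
    relevant = [(score, p) for score, p in scored if score > 0.0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in relevant[:k]]


def answer(
    query: str,
    passages: list[dict],
    generate: Callable[[str, list[dict]], str],
    k: int = 2,
) -> dict:
    retrieved = retrieve(query, passages, k)
    if not retrieved:
        # Nothing retrieved means nothing to ground a finding in.
        return {"finding": None, "citations": []}
    finding = generate(query, retrieved)  # Step 3: constrained to `retrieved`
    return {"finding": finding, "citations": retrieved}  # Step 4
```

Returning no finding when retrieval comes back empty is a deliberate choice in this sketch: it is the code-level equivalent of refusing to report a theme that no participant actually voiced.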

Contradiction detection

One of the most valuable and underused capabilities in qualitative analysis is identifying where participants contradict each other or where the same participant holds contradictory positions. QInsights surfaces these explicitly as part of the analytical dialogue rather than smoothing them into a false consensus.

User query: "Are there contradictions in how participants 
             described the onboarding experience?"

System response: "Yes. P03 and P09 describe the length 
                  as appropriate while P14, P22, and P31 
                  describe it as too long. Notably, P14 
                  also described the process as 'fine' 
                  in a later exchange [timestamp 01:02:33], 
                  suggesting the length concern may be 
                  context-dependent."

Citations: [P03/session_03.txt, P09/session_09.txt, 
           P14/session_14.txt, P22/session_22.txt, 
           P31/session_31.txt]
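A minimal sketch of how such contradictions could be surfaced rather than averaged away is shown below. It assumes each statement has already been tagged with a stance label, whether by the researcher or by an upstream classifier; that tagging step, the label values, and the function name `surface_contradictions` are all hypothetical and not part of the example above.

```python
from __future__ import annotations

from collections import defaultdict


def surface_contradictions(statements: list[dict]) -> dict:
    """Group stance-tagged statements and report where they disagree."""
    by_stance: dict[str, set[str]] = defaultdict(set)
    by_participant: dict[str, set[str]] = defaultdict(set)
    for s in statements:
        by_stance[s["stance"]].add(s["participant"])
        by_participant[s["participant"]].add(s["stance"])

    # Disagreement across the sample: more than one stance is present.
    between = {}
    if len(by_stance) > 1:
        between = {stance: sorted(people) for stance, people in by_stance.items()}

    # Disagreement within one person: the same participant holds several stances.
    within = {
        p: sorted(stances)
        for p, stances in by_participant.items()
        if len(stances) > 1
    }
    return {"between_participants": between, "within_participant": within}


# Stance labels here are an illustrative reading of the example above.
statements = [
    {"participant": "P03", "stance": "appropriate"},
    {"participant": "P09", "stance": "appropriate"},
    {"participant": "P14", "stance": "too_long"},
    {"participant": "P22", "stance": "too_long"},
    {"participant": "P31", "stance": "too_long"},
    {"participant": "P14", "stance": "appropriate"},  # the later 'fine' remark
]
report = surface_contradictions(statements)
```

Keeping the two kinds of disagreement separate matters analytically: a split across the sample suggests different groups of participants, while a split within one participant points to context-dependence.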

Why This Matters for Developers Building Research-Adjacent Applications

If you are building anything that involves processing user interview data, customer feedback, open-ended survey responses, or unstructured qualitative data at scale, the architectural decisions QInsights represents are directly relevant.

The citation requirement is not optional for serious research use cases. If you are building a research tool and your AI outputs do not anchor to specific source passages, your users will face a credibility problem the moment they try to present findings to anyone who asks "where does that come from?"

Chunking strategy matters more than model choice for qualitative data. The quality of retrieval in a RAG system for interview data is more sensitive to how the documents are chunked than to which model generates the response. Conversational data has structure. Preserve it.

Contradiction is signal, not noise. Most RAG implementations try to produce coherent, consensus outputs. For qualitative research, the contradictions between participants are often the most analytically significant finding. Build systems that surface them rather than resolve them.

The researcher is not replaceable, just better equipped. Any system that positions AI as doing the analysis rather than supporting the researcher will produce lower quality outputs and will fail the epistemological standards that serious research requires. The researcher's interpretive judgment is the product. The AI handles the parts of the job that do not require it.
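Returning to the citation requirement, one inexpensive safeguard for anyone building such a tool is to check each cited passage against the transcript it claims to come from before showing a finding to the user. The sketch below assumes findings shaped like the JSON example earlier in this article and an in-memory mapping of transcript names to text; the function names are hypothetical. Matching is verbatim after whitespace and case normalisation, so it will not catch differences such as curly versus straight quotes.

```python
from __future__ import annotations

import re


def normalise(text: str) -> str:
    """Collapse whitespace and lowercase so line wrapping does not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def unverified_citations(finding: dict, transcripts: dict[str, str]) -> list[dict]:
    """Return citations whose passage is not found in the named transcript."""
    failures = []
    for citation in finding.get("citations", []):
        source = transcripts.get(citation["transcript"])
        passage = normalise(citation.get("passage", ""))
        if source is None or not passage or passage not in normalise(source):
            failures.append(citation)
    return failures


def is_defensible(finding: dict, transcripts: dict[str, str]) -> bool:
    """A finding passes only if it has citations and all of them verify."""
    return bool(finding.get("citations")) and not unverified_citations(
        finding, transcripts
    )
```

A finding that fails this check can be withheld or flagged for the researcher, which keeps paraphrased or invented quotes from reaching a report.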

Practical Integration Notes

QInsights is available at qinsights.ai. Monthly webinars run live with real-data demos, and you can book a direct session with Dr. Friese to discuss your specific research context.

The platform handles:

Individual and group interview transcripts
Focus group recordings and transcripts
Open-ended survey response sets
Ethnographic field notes and observational data
Document analysis corpora

Supported research contexts: academic research, UX research, market research, healthcare research, policy analysis, enterprise insight functions.

Data privacy is built into the architecture to meet institutional research requirements. Participant data does not leave the secure research environment.
