DEV Community

Cheetu AI
How to Turn Meeting Transcripts Into a Searchable AI Knowledge Base

Most meeting tools stop at the transcript.

They capture what was said.

That is useful.

But a transcript alone is not always easy to use.

A one-hour meeting can produce thousands of words. A team with daily calls can quickly generate hundreds of transcripts. A student may have dozens of lectures. A sales team may have hundreds of customer conversations.

The problem becomes obvious:

The information exists, but it is still hard to find.

At Cheetu AI, we think the next step is not simply better transcription.

The next step is conversation memory.

That means turning meetings, calls, lectures, and notes into a searchable knowledge layer that users can ask questions about later.

This post walks through how that kind of system can be designed.


The Problem With Raw Transcripts

Raw transcripts are valuable, but they have limitations.

They are often:

  • Too long to scan
  • Hard to search semantically
  • Missing useful structure
  • Disconnected from summaries
  • Difficult to compare across meetings
  • Not always clear about who said what
  • Not always connected to timestamps or source context

A transcript tells you what happened.

But it does not always answer:

What did we decide?
Who owns the next step?
What risks were mentioned?
What did customers say about pricing?
Did we already discuss this topic before?

To answer those questions, we need to treat transcripts as structured knowledge.


Step 1: Capture Real-Time Transcript Segments
The first layer is real-time transcription.
Instead of storing one large transcript file, it is more useful to store the conversation as timestamped segments.
For example:

{
  "session_id": "meeting_2026_05_18",
  "segments": [
    {
      "segment_id": "seg_001",
      "speaker": "Speaker A",
      "start_time": "00:01:12",
      "end_time": "00:01:18",
      "text": "Let's prioritize onboarding improvements for Q3.",
      "language": "en"
    },
    {
      "segment_id": "seg_002",
      "speaker": "Speaker B",
      "start_time": "00:01:19",
      "end_time": "00:01:27",
      "text": "Agreed. The setup flow is still too manual for new teams.",
      "language": "en"
    }
  ]
}

This structure gives us the basic building blocks for memory:

  • Session ID
  • Segment ID
  • Speaker
  • Start time
  • End time
  • Transcript text
  • Language

That makes it easier to retrieve not just the answer, but also the original source.
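
As a rough illustration, the session payload above could be loaded into typed records. This is a minimal sketch in Python, not Cheetu AI's implementation: the names `Segment`, `to_seconds`, and `load_segments` are assumptions made for this example, and the field names simply mirror the JSON shown earlier.

```python
from dataclasses import dataclass


def to_seconds(timestamp: str) -> int:
    """Convert an "HH:MM:SS" string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in timestamp.split(":"))
    return hours * 3600 + minutes * 60 + seconds


@dataclass(frozen=True)
class Segment:
    # Hypothetical record type; fields mirror the JSON example in this post.
    session_id: str
    segment_id: str
    speaker: str
    start_time: str
    end_time: str
    text: str
    language: str

    @property
    def duration_seconds(self) -> int:
        return to_seconds(self.end_time) - to_seconds(self.start_time)


def load_segments(session: dict) -> list[Segment]:
    """Flatten a session payload so every segment carries its session ID."""
    return [
        Segment(session_id=session["session_id"], **raw)
        for raw in session["segments"]
    ]


session = {
    "session_id": "meeting_2026_05_18",
    "segments": [
        {
            "segment_id": "seg_001",
            "speaker": "Speaker A",
            "start_time": "00:01:12",
            "end_time": "00:01:18",
            "text": "Let's prioritize onboarding improvements for Q3.",
            "language": "en",
        },
        {
            "segment_id": "seg_002",
            "speaker": "Speaker B",
            "start_time": "00:01:19",
            "end_time": "00:01:27",
            "text": "Agreed. The setup flow is still too manual for new teams.",
            "language": "en",
        },
    ],
}

segments = load_segments(session)
for segment in segments:
    print(segment.segment_id, segment.speaker, segment.duration_seconds, "s")
```

Copying the session ID onto each segment means a single segment can later be traced back to its meeting without needing the enclosing payload.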

Step 2: Add Metadata Early
Metadata is what makes transcripts useful later.
Without metadata, everything becomes a wall of text.
Useful metadata may include:

  • Meeting title
  • Meeting type
  • Participants
  • Speaker labels
  • Language
  • Topic tags
  • Timestamp range
  • Customer or account name
  • Project name
  • Summary section
  • Action item owner
  • Due date

A segment with metadata might look like this:

{
  "segment_id": "seg_002",
  "session_id": "meeting_2026_05_18",
  "meeting_title": "Product Review",
  "meeting_type": "internal",
  "speaker": "Speaker B",
  "timestamp": "00:01:19",
  "text": "The setup flow is still too manual for new teams.",
  "language": "en",
  "topics": ["onboarding", "activation", "user experience"],
  "entities": ["new teams"],
  "importance": "medium"
}

This gives the system more ways to filter and retrieve information.

For example:

Find onboarding feedback from product meetings.
Show customer objections about setup complexity.
What activation risks were mentioned last month?
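
A request like the first one could be narrowed with plain metadata filters before any AI model is involved. The sketch below is illustrative only: the `matches` helper is an assumption made for this example, and the second segment is invented sample data.

```python
def matches(segment: dict, **filters) -> bool:
    """Return True when the segment satisfies every filter.

    Scalar fields (such as meeting_type) must equal the wanted value.
    List fields (such as topics) must contain the wanted value.
    """
    for name, wanted in filters.items():
        value = segment.get(name)
        if isinstance(value, list):
            if wanted not in value:
                return False
        elif value != wanted:
            return False
    return True


segments = [
    {
        "segment_id": "seg_002",
        "meeting_title": "Product Review",
        "meeting_type": "internal",
        "text": "The setup flow is still too manual for new teams.",
        "topics": ["onboarding", "activation", "user experience"],
    },
    {
        # Invented record, used only to show a non-matching segment.
        "segment_id": "seg_031",
        "meeting_title": "Customer Call - Acme",
        "meeting_type": "customer_call",
        "text": "The annual plan is hard to justify for a small team.",
        "topics": ["pricing"],
    },
]

onboarding_feedback = [
    s for s in segments
    if matches(s, meeting_type="internal", topics="onboarding")
]
for s in onboarding_feedback:
    print(s["segment_id"], "|", s["text"])
```

Filtering first keeps the later ranking step small, and it makes it easy to explain why a segment was or was not considered.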


Step 3: Support Live Translation

For global teams, transcripts are not always in one language.
A meeting may include English, Spanish, Mandarin, Japanese, Korean, Arabic, French, or multiple languages in the same session.
If the product supports live translation, it is useful to store both the original text and the translated text.
For example:

{
  "segment_id": "seg_014",
  "speaker": "Speaker C",
  "timestamp": "00:08:42",
  "original": {
    "language": "es",
    "text": "Necesitamos una cronología clara para el lanzamiento."
  },
  "translation": {
    "language": "en",
    "text": "We need a clear timeline for the launch."
  }
}

This supports two important use cases.
First, users can follow the conversation in real time.
Second, users can search later in their preferred language.
A user might ask in English:

What did the team say about the launch timeline?

That question should still find the right passage, even if the original discussion happened partly in Spanish.

That matters because conversation memory should not be limited by the language of the original meeting.
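
One way to picture this is a lookup that picks the text in the reader's language and keeps the original alongside it. This is a simplified sketch: `text_for` and `keyword_search` are hypothetical helpers, and the substring matching stands in for whatever search the real system uses.

```python
def text_for(segment: dict, preferred_language: str) -> tuple[str, bool]:
    """Return (text, was_translated) for the reader's preferred language.

    Falls back to the original text when no matching translation is stored.
    """
    original = segment["original"]
    translation = segment.get("translation")
    if original["language"] == preferred_language:
        return original["text"], False
    if translation and translation["language"] == preferred_language:
        return translation["text"], True
    return original["text"], False


def keyword_search(segments: list[dict], terms: list[str], preferred_language: str) -> list[dict]:
    """Match terms against the reader-language text and keep the original."""
    hits = []
    for segment in segments:
        text, was_translated = text_for(segment, preferred_language)
        lowered = text.lower()
        if all(term.lower() in lowered for term in terms):
            hits.append({
                "segment_id": segment["segment_id"],
                "timestamp": segment["timestamp"],
                "text": text,
                "translated": was_translated,
                "original_text": segment["original"]["text"],
            })
    return hits


segment = {
    "segment_id": "seg_014",
    "speaker": "Speaker C",
    "timestamp": "00:08:42",
    "original": {
        "language": "es",
        "text": "Necesitamos una cronología clara para el lanzamiento.",
    },
    "translation": {
        "language": "en",
        "text": "We need a clear timeline for the launch.",
    },
}

for hit in keyword_search([segment], ["launch", "timeline"], "en"):
    print(hit["timestamp"], hit["text"], "| original:", hit["original_text"])
```

Returning the `translated` flag with each hit lets the interface tell the user that the quoted text is a translation and offer the original wording.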

Step 4: Generate Structured Summaries

A summary should not just compress a transcript.
It should create structure.
For meeting workflows, useful summaries often include:

  • Key points
  • Decisions
  • Risks
  • Open questions
  • Action items
  • Owners
  • Due dates

A structured summary might look like this:

## Key Points

* The team reviewed onboarding friction for new users.
* Setup complexity is affecting activation.
* Documentation may need to be simplified.

## Decisions

* Prioritize onboarding improvements in Q3.
* Move analytics improvements behind onboarding work.

## Risks

* Current setup flow may create drop-off during activation.
* New users may not understand how to invite teammates.

## Action Items

* Maya: Audit onboarding steps by Friday.
* Alex: Review activation data from the last cohort.
* Sam: Draft simplified setup documentation by next Tuesday.

## Open Questions

* Should enterprise users get a separate onboarding flow?
* Should onboarding guidance happen in-product or by email?

This kind of summary is easier to review.
It is also easier to index.
For example, action items can be searched differently from decisions.
Risks can be grouped across multiple meetings.
Open questions can be tracked over time.
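
To make a summary indexable by section, it can be split on its headings. The following is a minimal sketch that assumes the summary uses `## ` headings and `* ` bullets as in the example above, and that action items follow an "Owner: task" pattern. `parse_summary` and `parse_action_item` are names invented for this illustration.

```python
from collections import defaultdict


def parse_summary(markdown: str) -> dict[str, list[str]]:
    """Group bullet items under the "## " heading they appear beneath."""
    sections = defaultdict(list)
    current = None
    for line in markdown.splitlines():
        line = line.strip()
        if line.startswith("## "):
            current = line[3:].strip()
        elif line.startswith("* ") and current is not None:
            sections[current].append(line[2:].strip())
    return dict(sections)


def parse_action_item(item: str) -> dict:
    """Split "Owner: task" into its parts; owner is None when no colon is present."""
    owner, separator, task = item.partition(":")
    if not separator:
        return {"owner": None, "task": item.strip()}
    return {"owner": owner.strip(), "task": task.strip()}


summary = """\
## Decisions

* Prioritize onboarding improvements in Q3.
* Move analytics improvements behind onboarding work.

## Action Items

* Maya: Audit onboarding steps by Friday.
* Alex: Review activation data from the last cohort.
* Sam: Draft simplified setup documentation by next Tuesday.
"""

sections = parse_summary(summary)
action_items = [parse_action_item(item) for item in sections["Action Items"]]
for item in action_items:
    print(item["owner"], "->", item["task"])
```

Once each section is a separate list, decisions, risks, and action items can be stored under different types and queried independently.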


Step 5: Chunk the Conversation

It is difficult to retrieve the right passage from a large, undivided transcript.
A better approach is to break the conversation into chunks.
Chunks can be based on:

  • Time windows
  • Speaker turns
  • Topic shifts
  • Summary sections
  • Question-and-answer pairs
  • Action item boundaries

A simple chunk object could look like this:

{
  "chunk_id": "chunk_001",
  "session_id": "meeting_2026_05_18",
  "title": "Onboarding friction discussion",
  "start_time": "00:10:12",
  "end_time": "00:16:48",
  "text": "The team discussed setup complexity, activation drop-off, and onboarding documentation.",
  "topics": ["onboarding", "activation", "documentation"],
  "speakers": ["Speaker A", "Speaker B", "Speaker C"]
}

Good chunks should be:

  • Small enough to retrieve accurately
  • Large enough to preserve context
  • Connected to timestamps
  • Connected to source transcript segments
  • Tagged with useful metadata

Chunking is one of the most important design decisions in a conversation memory system.
Bad chunks create bad retrieval.
Good chunks make answers more accurate and easier to verify.
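
As one possible starting point, segments can be grouped by a time window while keeping links back to the source segments. This sketch covers only the time-window strategy from the list above. Topic-shift detection is not attempted, the 60-second limit is an arbitrary assumption, and the third segment is invented sample data.

```python
def to_seconds(timestamp: str) -> int:
    """Convert an "HH:MM:SS" string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in timestamp.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def build_chunk(group: list[dict], session_id: str, index: int) -> dict:
    """Merge a run of segments into one chunk that remembers its sources."""
    return {
        "chunk_id": f"chunk_{index:03d}",
        "session_id": session_id,
        "start_time": group[0]["start_time"],
        "end_time": group[-1]["end_time"],
        "text": " ".join(segment["text"] for segment in group),
        "speakers": sorted({segment["speaker"] for segment in group}),
        "segment_ids": [segment["segment_id"] for segment in group],
    }


def chunk_segments(segments: list[dict], session_id: str, max_seconds: int = 60) -> list[dict]:
    """Start a new chunk whenever adding a segment would exceed max_seconds."""
    groups, current = [], []
    for segment in segments:
        if current:
            span = to_seconds(segment["end_time"]) - to_seconds(current[0]["start_time"])
            if span > max_seconds:
                groups.append(current)
                current = []
        current.append(segment)
    if current:
        groups.append(current)
    return [build_chunk(group, session_id, i) for i, group in enumerate(groups, start=1)]


segments = [
    {"segment_id": "seg_001", "speaker": "Speaker A",
     "start_time": "00:01:12", "end_time": "00:01:18",
     "text": "Let's prioritize onboarding improvements for Q3."},
    {"segment_id": "seg_002", "speaker": "Speaker B",
     "start_time": "00:01:19", "end_time": "00:01:27",
     "text": "Agreed. The setup flow is still too manual for new teams."},
    # Invented segment, added so the example produces a second chunk.
    {"segment_id": "seg_003", "speaker": "Speaker A",
     "start_time": "00:03:05", "end_time": "00:03:14",
     "text": "Next, let's look at the analytics backlog."},
]

chunks = chunk_segments(segments, "meeting_2026_05_18")
for chunk in chunks:
    print(chunk["chunk_id"], chunk["start_time"], chunk["end_time"], chunk["segment_ids"])
```

Keeping `segment_ids` on every chunk is what lets an answer link back to the exact transcript lines, regardless of which chunking strategy is used.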

Step 6: Build a Search Index

Once the conversation is chunked, it can be indexed.
A useful search system may combine:

  • Keyword search
  • Vector search
  • Metadata filters
  • Time filters
  • Speaker filters
  • Meeting-type filters
  • Language filters

A simplified search request might look like this:

{
  "query": "What did customers say about onboarding friction?",
  "filters": {
    "meeting_type": "customer_call",
    "date_range": "last_90_days",
    "topics": ["onboarding"]
  },
  "top_k": 5
}

The system should return answers with source context:

{
  "answer": "Customers said onboarding felt too manual, especially during workspace setup and teammate invitation.",
  "sources": [
    {
      "meeting_title": "Customer Call - Acme",
      "timestamp": "00:12:44",
      "speaker": "Customer",
      "text": "The setup process takes too long when inviting multiple teammates."
    },
    {
      "meeting_title": "Customer Call - Northstar",
      "timestamp": "00:27:10",
      "speaker": "Customer",
      "text": "We had to ask support for help during initial setup."
    }
  ]
}

Returning the sources alongside the answer is important.
The goal is not only to generate an answer.
The goal is to help the user trust the answer.
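
The combination of filters and ranking could be sketched as follows. Several things here are assumptions for illustration only: the `max_age_days` filter stands in for a date range such as the last 90 days, the dates are invented, and `cosine_score` uses plain term counts as a placeholder for a real embedding model.

```python
import math
import re
from collections import Counter
from datetime import date, timedelta


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_score(query: str, text: str) -> float:
    """Fraction of distinct query terms that also appear in the text."""
    terms = set(tokenize(query))
    return len(terms & set(tokenize(text))) / len(terms) if terms else 0.0


def cosine_score(query: str, text: str) -> float:
    """Cosine similarity of term counts; a placeholder for real embeddings."""
    a, b = Counter(tokenize(query)), Counter(tokenize(text))
    dot = sum(count * b[term] for term, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values()) * sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def passes_filters(chunk: dict, filters: dict, today: date) -> bool:
    """Apply the metadata filters before any scoring happens."""
    if "meeting_type" in filters and chunk["meeting_type"] != filters["meeting_type"]:
        return False
    if "max_age_days" in filters:
        if chunk["date"] < today - timedelta(days=filters["max_age_days"]):
            return False
    if "topics" in filters and not set(filters["topics"]) & set(chunk["topics"]):
        return False
    return True


def search(chunks, query, filters, top_k, today, keyword_weight=0.5):
    """Filter, score with a weighted blend, and return the top_k chunks."""
    scored = []
    for chunk in chunks:
        if not passes_filters(chunk, filters, today):
            continue
        score = (keyword_weight * keyword_score(query, chunk["text"])
                 + (1 - keyword_weight) * cosine_score(query, chunk["text"]))
        if score > 0:
            scored.append((score, chunk))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]


chunks = [
    {"meeting_title": "Customer Call - Acme", "meeting_type": "customer_call",
     "date": date(2026, 5, 4), "topics": ["onboarding"],
     "text": "The setup process takes too long when inviting multiple teammates."},
    {"meeting_title": "Customer Call - Northstar", "meeting_type": "customer_call",
     "date": date(2026, 4, 20), "topics": ["onboarding"],
     "text": "We had to ask support for help during initial setup."},
    {"meeting_title": "Product Review", "meeting_type": "internal",
     "date": date(2026, 5, 18), "topics": ["onboarding"],
     "text": "The setup flow is still too manual for new teams."},
]

results = search(
    chunks,
    query="problems during setup",
    filters={"meeting_type": "customer_call", "max_age_days": 90, "topics": ["onboarding"]},
    top_k=5,
    today=date(2026, 5, 18),
)
for chunk in results:
    print(chunk["meeting_title"], "|", chunk["text"])
```

Note that this lexical stand-in cannot connect a query about "onboarding friction" to a sentence about the "setup process", because the two share no words. Closing that gap is the job of the vector search component, which is why a production system would likely swap in a real embedding model.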

Step 7: Make Answers Source-Grounded

AI-generated answers can be useful, but without visible sources a reader cannot tell whether a claim came from the meeting or from the model.
For conversation memory, source grounding is critical.

A good answer should show:

  • Which meeting it came from
  • Who said it
  • When it was said
  • What the original transcript said
  • Whether the text was translated
  • How the answer connects to the source

For example:

Answer:
Customers were concerned that onboarding required too many manual steps.

Sources:
1. Customer Call - Acme
   00:12:44
   Speaker: Customer
   "The setup process takes too long when inviting multiple teammates."

2. Customer Call - Northstar
   00:27:10
   Speaker: Customer
   "We had to ask support for help during initial setup."

This turns the system from a black box into an assistant users can inspect.
That matters for trust, especially when the content comes from meetings that affect product, sales, hiring, support, education, or strategy.
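
A renderer for this kind of answer might refuse to show anything that has no sources attached. The sketch below is one hypothetical way to enforce that rule: `render_answer` and the optional `translated_from` field are assumptions made for this example.

```python
def render_answer(answer: str, sources: list[dict]) -> str:
    """Format an answer with its sources; reject answers that cite nothing."""
    if not sources:
        raise ValueError("a source-grounded answer needs at least one source")
    lines = ["Answer:", answer, "", "Sources:"]
    for number, source in enumerate(sources, start=1):
        lines.append(f"{number}. {source['meeting_title']}")
        lines.append(f"   {source['timestamp']}")
        lines.append(f"   Speaker: {source['speaker']}")
        lines.append('   "' + source["text"] + '"')
        # Hypothetical field: flag quotes that are translations of the original.
        if source.get("translated_from"):
            lines.append(f"   (translated from {source['translated_from']})")
        lines.append("")
    return "\n".join(lines).rstrip()


sources = [
    {
        "meeting_title": "Customer Call - Acme",
        "timestamp": "00:12:44",
        "speaker": "Customer",
        "text": "The setup process takes too long when inviting multiple teammates.",
    },
    {
        "meeting_title": "Customer Call - Northstar",
        "timestamp": "00:27:10",
        "speaker": "Customer",
        "text": "We had to ask support for help during initial setup.",
    },
]

print(render_answer(
    "Customers were concerned that onboarding required too many manual steps.",
    sources,
))
```

Raising an error on an empty source list is a deliberate choice in this sketch: it is safer for the assistant to say it found nothing than to present an unsupported claim.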


A Simple Architecture
At a high level, the system might look like this:

Audio Stream
    ↓
Real-Time Transcription
    ↓
Speaker Labels + Timestamps
    ↓
Live Translation
    ↓
Structured Summary
    ↓
Chunking + Metadata
    ↓
Search Index
    ↓
Source-Grounded AI Answers

Each layer adds value.

The technical challenge is making this pipeline reliable.
The product challenge is making it feel simple.

What This Enables
A searchable conversation memory system can support many workflows.
Product Teams
Product teams can ask:

What onboarding problems did customers mention this month?
Which feature requests came up most often?
What risks did we identify during the last roadmap review?

Sales Teams
Sales teams can ask:

What pricing objections came up in recent calls?
Which customers asked about enterprise security?
What follow-ups were promised to this account?

Students and Researchers
Students and researchers can ask:

Summarize the main points from my last three lectures.
Find the part where the professor explained reinforcement learning.
What open questions came up during the research discussion?

Global Teams
Global teams can ask:

What was decided in the multilingual planning meeting?
Show the original and translated version of that discussion.
Which regional launch risks were mentioned?

Why No Meeting Bot Matters
Many AI meeting tools use a bot that joins the call.
That can work.
But it can also create friction.

A visible bot can make participants feel watched.
External guests may ask who joined.
Some teams may restrict meeting bots.
Sensitive conversations may feel less natural.
The assistant can become another participant in the room.

A no-bot approach feels lighter.
The assistant supports the user without changing the meeting dynamic.
For Cheetu AI, this is an important design principle:

Help people capture, understand, summarize, and search conversations without requiring an AI bot to join the meeting.

The technology should support the conversation.
It should not become the conversation.

Design Principles for Conversation Memory
When building this kind of system, these principles are important.

1. Preserve source context
Every answer should be traceable back to the original conversation.

2. Keep real-time output readable
Live transcription and translation should be fast, clean, and easy to scan.

3. Structure summaries around decisions
Users need outcomes, not just paragraphs.

4. Make search natural
People should be able to ask questions in plain language.

5. Support multilingual knowledge
Users should be able to search and review across languages.

6. Avoid unnecessary meeting friction
The assistant should not make the meeting feel crowded or unnatural.


Final Thought

The future of meeting tools is not only about recording more conversations.

It is about making conversations easier to understand, reuse, and trust.

A transcript is a starting point.

A summary is a useful layer.

But the bigger opportunity is conversation memory:

  • Real-time transcription
  • Live translation
  • Structured AI summaries
  • Searchable archives
  • Source-grounded answers
  • A no-bot meeting experience

That is the direction we are exploring with Cheetu AI.

The question is no longer:

  • Can AI take better meeting notes?

The better question is:

  • Can AI help us remember the conversations that matter?

Learn more: Cheetu AI
