Cheetu AI

# Building a Personal Conversation Memory Layer Without Adding a Meeting Bot

Most AI meeting tools follow the same pattern:

1. A bot joins your meeting.
2. It records the conversation.
3. It generates a transcript.
4. It sends a summary afterward.

That workflow can be useful, but it also creates friction.

A visible meeting bot can make people feel watched. Some teams have strict meeting permissions. External guests may not know who the bot is. And in many cases, the value of the tool only arrives after the meeting is already over.

At **Cheetu AI**, we have been exploring a different question:

> What if conversations became useful in real time, and searchable afterward — without adding another participant to the meeting?

That idea led us to think about meetings not just as events, but as personal and team knowledge streams.

---

## Meetings Are Knowledge Streams

A meeting is usually treated as a temporary event.

People join, talk, decide, assign tasks, and leave. Afterward, the useful information often gets scattered across recordings, transcripts, chat messages, notes, or someone’s memory.

But from a product design perspective, a meeting contains structured knowledge:

- Who said what
- When something was said
- What decisions were made
- Which questions are still open
- Which action items were assigned
- Who owns each next step
- What risks or objections were raised
- What language each participant was most comfortable using

If we treat conversations as knowledge streams, the goal becomes bigger than “generate meeting notes.”

The goal becomes:

> Capture the conversation, make it understandable in real time, summarize it clearly, and make it searchable later.

---

## 1. Real-time Transcription: The Foundation

The first layer is real-time transcription.

Transcription is not only useful because it creates notes. It changes the experience while the conversation is happening.

For example, live transcription helps when:

- A participant misses a sentence
- A non-native speaker needs text support
- A host wants to focus instead of taking notes
- A student wants to listen instead of typing everything
- An interviewer wants to stay engaged with the speaker

The key design factor is latency.

If transcription arrives too late, it becomes a record.

If transcription arrives in real time, it becomes part of the meeting interface.

A simplified transcript structure might look like this:

```json
{
  "session_id": "meeting_123",
  "segments": [
    {
      "speaker": "Speaker 1",
      "start_time": "00:03:12",
      "end_time": "00:03:18",
      "text": "Let's move the launch date to next Tuesday.",
      "language": "en"
    }
  ]
}
```

This gives the system something useful to work with later: speaker labels, timestamps, language metadata, and searchable text.
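As a rough illustration, the transcript structure above could be loaded into typed objects so that later layers can work with timestamps as numbers. This is a minimal Python sketch, not Cheetu AI's implementation; the `Segment` class and the `to_seconds` and `load_segments` helpers are hypothetical names introduced here, and the timestamp format is assumed to be `HH:MM:SS` as in the example.

```python
import json
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcript segment, mirroring the fields in the JSON example."""
    speaker: str
    start_time: str  # assumed "HH:MM:SS" format
    end_time: str
    text: str
    language: str

    @staticmethod
    def to_seconds(timestamp: str) -> int:
        """Convert an "HH:MM:SS" string into a number of seconds."""
        hours, minutes, seconds = (int(part) for part in timestamp.split(":"))
        return hours * 3600 + minutes * 60 + seconds

    @property
    def duration_seconds(self) -> int:
        """Length of the segment in seconds."""
        return self.to_seconds(self.end_time) - self.to_seconds(self.start_time)


def load_segments(raw: str) -> list[Segment]:
    """Parse a transcript JSON document into Segment objects."""
    payload = json.loads(raw)
    return [Segment(**item) for item in payload["segments"]]


raw_transcript = """
{
  "session_id": "meeting_123",
  "segments": [
    {
      "speaker": "Speaker 1",
      "start_time": "00:03:12",
      "end_time": "00:03:18",
      "text": "Let's move the launch date to next Tuesday.",
      "language": "en"
    }
  ]
}
"""

segments = load_segments(raw_transcript)
first = segments[0]
print(first.speaker, first.duration_seconds)
```

Keeping the original timestamp strings and deriving seconds on demand means the stored record stays readable while search and playback code can still do arithmetic on it.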


## 2. Live Translation: Making Global Conversations Easier

Transcription answers:

> What was said?

Translation answers:

> Can everyone understand it comfortably enough to participate?

In global teams, language ability is rarely equal.

One person may be fluent. Another may understand most of the conversation but need extra processing time. Someone else may avoid asking questions because they are still translating mentally.

Live translated captions can reduce that gap.

A useful live translation interface should support:

- Original captions
- Translated captions
- Fast language switching
- Minimal interruption
- Clear speaker context
- Real-time readability

For Cheetu AI, the goal is to show both original and translated captions on screen, while allowing viewers to switch languages with one click.
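One way to picture original and translated captions side by side, with one-click language switching, is a caption record that carries its translations and falls back to the original text when a language is not available yet. The Python sketch below is an assumption for illustration only: the `Caption` class, the `text_for` method, the `render` helper, and the Spanish sample text are all hypothetical and not taken from Cheetu AI.

```python
from dataclasses import dataclass, field


@dataclass
class Caption:
    """A single caption line with optional translations (hypothetical model)."""
    speaker: str
    original_language: str
    original_text: str
    # Maps a language code to translated text; filled in as translations arrive.
    translations: dict[str, str] = field(default_factory=dict)

    def text_for(self, language: str) -> str:
        """Return the caption in the requested language, or the original as a fallback."""
        if language == self.original_language:
            return self.original_text
        return self.translations.get(language, self.original_text)


def render(caption: Caption, viewer_language: str) -> str:
    """Build what a viewer sees: the original line, plus a translation if it differs."""
    lines = [f"{caption.speaker}: {caption.original_text}"]
    translated = caption.text_for(viewer_language)
    if translated != caption.original_text:
        lines.append(f"  [{viewer_language}] {translated}")
    return "\n".join(lines)


caption = Caption(
    speaker="Speaker 1",
    original_language="en",
    original_text="Let's move the launch date to next Tuesday.",
    translations={"es": "Movamos la fecha de lanzamiento al próximo martes."},
)

print(render(caption, "es"))  # original line plus the Spanish line
print(render(caption, "en"))  # original line only
```

Switching languages in this model is only a change to `viewer_language`; the caption data itself does not need to be fetched again.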

The product goal is not to turn every meeting into a formal interpretation session.

It is to make multilingual collaboration feel natural.


## 3. AI Summaries Should Be Structured

A common mistake in AI meeting tools is treating a summary as a shorter transcript.

But users usually do not want the same meeting in fewer words.

They want structure.

A useful meeting recap should answer:

- What were the key points?
- What decisions were made?
- What risks came up?
- What questions are still open?
- Who owns the next step?
- When is it due?

For example, instead of this:

> The team discussed onboarding and agreed that improvements were needed.

A more useful summary looks like this:


```markdown
## Decisions

- Simplify the onboarding checklist before launch.
- Prioritize user guidance for first-time users.

## Action Items

- Maya: Remove duplicate setup steps by Friday.
- Alex: Review activation metrics from the last cohort.
- Sam: Prepare updated help docs before next Tuesday.

## Open Questions

- Should enterprise customers get a separate onboarding flow?
- Do we need in-product tips for the setup process?
```

This is the difference between a passive summary and an execution layer.
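To make "structure" concrete, a recap can be held as data first and rendered as Markdown second, so owners and due dates stay available to other tools. The following Python sketch assumes a hypothetical `Recap` model with `ActionItem` entries and a `to_markdown` helper; none of these names come from Cheetu AI, and the step that extracts the items from a transcript is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ActionItem:
    owner: str
    task: str
    due: Optional[str] = None  # free-text due date, e.g. "Friday"


@dataclass
class Recap:
    """Structured meeting recap (hypothetical model)."""
    decisions: list[str] = field(default_factory=list)
    action_items: list[ActionItem] = field(default_factory=list)
    open_questions: list[str] = field(default_factory=list)


def _section(title: str, lines: list[str]) -> str:
    """Render one heading followed by a bullet list."""
    bullets = "\n".join(f"- {line}" for line in lines)
    return f"## {title}\n\n{bullets}"


def to_markdown(recap: Recap) -> str:
    """Render the recap as Markdown, skipping sections that are empty."""
    action_lines = [
        f"{item.owner}: {item.task}" + (f" (due {item.due})" if item.due else "")
        for item in recap.action_items
    ]
    named_sections = [
        ("Decisions", recap.decisions),
        ("Action Items", action_lines),
        ("Open Questions", recap.open_questions),
    ]
    return "\n\n".join(
        _section(title, lines) for title, lines in named_sections if lines
    )


recap = Recap(
    decisions=["Simplify the onboarding checklist before launch."],
    action_items=[ActionItem("Maya", "Remove duplicate setup steps", due="Friday")],
)
print(to_markdown(recap))
```

Because the owner and due date are separate fields rather than free text, the same recap could also feed a task tracker or a reminder without re-parsing the summary.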

Good AI summaries should help teams move from:

> We talked about this.

to:

> Here is what we decided, what is still unresolved, and what happens next.


## 4. Searchable Memory Is the Most Valuable Layer

Transcription captures the conversation.

Translation makes it understandable.

Summarization makes it reviewable.

But retrieval makes it reusable.

Most teams have valuable knowledge trapped inside conversations:

- Customer calls
- Sales demos
- Product reviews
- User interviews
- Internal meetings
- Lectures
- Research discussions
- Support calls

The problem is not that this knowledge does not exist.

The problem is that it is hard to find.

A searchable conversation archive changes the interface from:

> Find the recording.

to:

> Ask the knowledge base.

For example:


```text
What did the customer say about pricing?
```

```text
Which action items did we assign in the last product review?
```

```text
Summarize all open risks mentioned in meetings this week.
```

```text
Find the part where we discussed Spanish captions.
```

The most important design requirement here is source context.

If an AI answer comes from a past conversation, users should be able to see where it came from.

That source might include:

- Meeting title
- Speaker
- Timestamp
- Transcript segment
- Original language
- Translated text

A simplified retrieval result might look like this:


```json
{
  "answer": "The customer was concerned that onboarding required too many manual setup steps.",
  "sources": [
    {
      "session": "Customer Call - April 18",
      "timestamp": "00:14:32",
      "speaker": "Customer"
    }
  ]
}
```

This helps users trust the answer and return to the original moment when needed.
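The source-context requirement can be sketched with a deliberately simple retriever: every hit carries the session, timestamp, and speaker it came from. The Python below uses plain keyword overlap as a stand-in for whatever retrieval method a real system would use, and it returns sources only, without generating an answer. The `SourceSegment` class, the `search` function, and the "Product Review" sample session are hypothetical and introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class SourceSegment:
    """A stored transcript segment with the metadata needed to cite it."""
    session: str
    timestamp: str
    speaker: str
    text: str


def _tokens(text: str) -> set[str]:
    """Lowercase words with surrounding punctuation stripped."""
    words = {word.lower().strip("?.,!") for word in text.split()}
    words.discard("")
    return words


def search(segments: list[SourceSegment], query: str, limit: int = 3) -> list[dict]:
    """Rank segments by shared words with the query and return their source metadata."""
    query_words = _tokens(query)
    scored = []
    for segment in segments:
        score = len(query_words & _tokens(segment.text))
        if score > 0:
            scored.append((score, segment))
    # Highest score first; the sort is stable, so ties keep archive order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [
        {
            "session": segment.session,
            "timestamp": segment.timestamp,
            "speaker": segment.speaker,
            "text": segment.text,
        }
        for _, segment in scored[:limit]
    ]


archive = [
    SourceSegment("Customer Call - April 18", "00:14:32", "Customer",
                  "Onboarding required too many manual setup steps."),
    SourceSegment("Product Review", "00:05:10", "Speaker 2",
                  "We should add Spanish captions next quarter."),
]

results = search(archive, "Find the part where we discussed Spanish captions.")
print(results[0]["session"], results[0]["timestamp"])
```

Whatever ranking method replaces the keyword overlap, the return shape is the point: a result that cannot name its session and timestamp gives the user no way to verify it.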


## 5. Why No Meeting Bot Matters

Many AI meeting assistants rely on a bot joining the meeting.

That can be convenient, but it can also create social and operational friction.

A meeting bot may cause problems when:

- Participants feel monitored
- External guests do not recognize the bot
- Meeting platforms restrict third-party bots
- Organizations have strict security policies
- The bot distracts from the conversation
- Permissions become complicated

A no-bot approach feels lighter.

The assistant supports the user without becoming another participant in the room.

That is one of the core product ideas behind Cheetu AI: real-time transcription, live translation, AI summaries, and searchable memory without requiring an AI bot to join the meeting.


## 6. What This Unlocks

Once conversations become structured, translated, summarized, and searchable, many workflows become easier.

### For product teams

- Search across user interviews
- Track repeated customer pain points
- Review product decisions
- Find risks mentioned in past meetings

### For sales and customer success

- Review objections from customer calls
- Find commitments made during meetings
- Generate follow-up notes
- Track account risks

### For students and researchers

- Search lecture notes
- Summarize long discussions
- Ask questions across past sessions
- Return to exact source moments

### For global teams

- Collaborate across languages
- Review original and translated context
- Reduce meeting misunderstandings
- Make participation more equal

## Final Thought

The next generation of meeting tools should do more than create prettier notes.

They should help people understand conversations as they happen and reuse that knowledge afterward.

That means combining:

- Real-time transcription
- Live translation
- Structured AI summaries
- Searchable conversation memory
- Source-grounded answers
- A low-friction meeting experience

That is the direction we are exploring with Cheetu AI.

If your work depends on meetings, calls, lectures, interviews, or multilingual conversations, the real opportunity is not just to record more.

It is to remember better.

Learn more at Cheetu AI.
