bsorrentino

Posted on Nov 21, 2024

From Audio to Diagram

#devchallenge #assemblyaichallenge #ai #api

This is a submission for the AssemblyAI Challenge: Sophisticated Speech-to-Text.

What I Built

An application that takes an audio recording of a discussion, a meeting, etc. and generates a "meaningful mind-map diagram" representing the key points that were touched on. Combined with a summary, this representation provides more complete and understandable information.

Demo

The application is available here. To access its full functionality you need both an AssemblyAI API key and an OpenAI API key. Below are some representative screenshots.

Settings

Upload Audio

Transcribe Audio

Generate Mindmap Diagram

Journey

To implement the process from audio to diagram, I developed several "skilled agents", described below:

transcribe-from-audio: this agent uses the AssemblyAI transcripts API to transcribe the provided audio.
keypoints-from-transcript: this agent uses OpenAI (gpt-4o-mini) to extract the key points from the given transcription.
summary-to-mindmap: this agent uses OpenAI (gpt-4o-mini) to arrange the key points into a kind of ontology, providing a hierarchical representation of the information.
mindmap-to-mermaid: the last agent transforms the mind-map representation into Mermaid syntax, ready for visualization.
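
To make the last step more concrete, here is a minimal sketch of what turning a hierarchical mind map into Mermaid text could look like. It is not the code used in the application: the `Node` class, the `_clean` helper and the `mindmap_to_mermaid` function are hypothetical names introduced only for illustration, and the sketch assumes the mind map is already available as a simple tree of labels. It relies on the fact that Mermaid's `mindmap` diagram expresses hierarchy through indentation.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One mind-map node: a label plus optional children (hypothetical shape)."""
    label: str
    children: list["Node"] = field(default_factory=list)


def _clean(label: str) -> str:
    # Collapse whitespace and drop brackets, which Mermaid would otherwise
    # interpret as node-shape delimiters.
    text = " ".join(label.split())
    for ch in "()[]{}":
        text = text.replace(ch, "")
    return text


def mindmap_to_mermaid(root: Node) -> str:
    """Render the tree as Mermaid `mindmap` text, nesting nodes by indentation."""
    lines = ["mindmap", f"  root(({_clean(root.label)}))"]

    def walk(node: Node, depth: int) -> None:
        # Each level is indented two spaces deeper than its parent.
        for child in node.children:
            lines.append("  " * depth + _clean(child.label))
            walk(child, depth + 1)

    walk(root, 2)
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up example data, standing in for the output of the previous agent.
    tree = Node("Sprint review", [
        Node("Decisions", [Node("Ship v2 (beta)")]),
        Node("Risks"),
    ])
    print(mindmap_to_mermaid(tree))
```

The resulting text can be pasted into any Mermaid renderer. Stripping brackets from labels is a deliberately blunt choice to keep the sketch short; a real implementation might prefer to escape or quote them instead.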

Diagram of Agentic Architecture

Top comments (5)

Bill • Nov 22 '24

I can think of several long-winded, recorded meetings in which key ideas presented in a schema like this would be highly beneficial. Nice job

bsorrentino • Nov 22 '24

Hi @bill_ec71da0eaea845fff0d0, thank you. However, this is just one of the possible use cases of this process; the possibilities are truly limitless.

Vincenzo Sorrentino • Nov 27 '24 • Edited

I think any knowledge like discussions, brainstorming, issues, meeting, voice recording and wiki could gain value having a "thinking tool" of this kind by immediately highlighting the key points in the mind map diagram. Great job!

bsorrentino • Nov 28 '24

Thank you @vsorrentino, I think this idea could be extended further by enabling different kinds of transcript analysis, for example Problem-Solution Mapping and others.

Vincenzo Sorrentino • Nov 28 '24

Yes, of course