MiguelMJ

Hackathon submission - An audio search engine powered by Deepgram

Overview of My Submission

This submission consists of a CLI application that searches for a word (whole or partial) across several audio sources, including Telegram chats.

Our conversations are stored in chats across all kinds of instant messaging services. How many questions, declarations, confessions, apologies, reminders and decisions have been exchanged today in text form? However, these conversations are constantly interleaved with voicemails. Talking is faster than writing, and we grasp the emotion of a message much better when we hear it spoken. And yet, when we use the search option in our chats to find past messages, only the text is considered. Audio is pushed into a secondary place, when it should be just as important.

Cover image by Wallace Chuck from Pexels

Submission Category:

Analytics Ambassadors
Wacky Wildcards

Link to Code on GitHub

MiguelMJ / AudioSearchEngine

Search engine for audio files. Submission for Deepgram + DEV hackathon.

Audio Search Engine

Search inside audio files

Search for words inside audio files or Telegram voicemails. Powered by Deepgram. Requires API keys from Deepgram and optionally Telegram. Submission for Deepgram+DEV hackathon, 2022.

You might want to read the submission post.

Get the API keys

  • Deepgram (required): Create an account at deepgram.com and get an API key.
  • Telegram (optional): Create an account in Telegram and follow the steps here: Obtaining api_id

Store them in files named deepgramApiKey, telegramApiId and telegramApiHash in the root folder or pass them directly in the CLI using the --deepgram-api-key, --telegram-api-id and --telegram-api-hash arguments.
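The lookup order described above (CLI argument first, then a file in the root folder) could be sketched roughly like this. The function name resolve_credential and its signature are hypothetical and not taken from the repository; only the file names come from the README.

```python
from pathlib import Path
from typing import Optional


def resolve_credential(cli_value: Optional[str], filename: str,
                       root: Path = Path(".")) -> Optional[str]:
    """Hypothetical helper: prefer the CLI value, else read root/filename.

    Returns None when neither source provides a value, so the caller can
    decide whether the credential is required (Deepgram) or optional (Telegram).
    """
    if cli_value:
        return cli_value
    path = root / filename
    if path.is_file():
        # Strip the trailing newline that editors usually add to key files.
        return path.read_text(encoding="utf-8").strip()
    return None
```

With a helper like this, something like resolve_credential(args.deepgram_api_key, "deepgramApiKey") would cover both ways of supplying the key.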

Features

  • Tune the voice recognition process by passing Deepgram's query parameters for pre-recorded audio transcription with -P|--param KEY=VALUE arguments.
  • Search directly in local files by passing them as arguments after the search term.
  • Automatically download audios from chats in Telegram with one or more -T|--telegram-chat CHAT_ID arguments.
  • Downloads and results are…
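As an illustration of the -P|--param KEY=VALUE feature, here is a minimal sketch of how such arguments might be turned into a query string for Deepgram's pre-recorded transcription endpoint. The helper names parse_param and build_url are hypothetical, and this is not necessarily how the project builds its requests.

```python
from typing import List, Tuple
from urllib.parse import urlencode

# Deepgram's pre-recorded transcription endpoint.
DEEPGRAM_LISTEN_URL = "https://api.deepgram.com/v1/listen"


def parse_param(text: str) -> Tuple[str, str]:
    """Split one KEY=VALUE argument (hypothetical helper)."""
    key, sep, value = text.partition("=")
    if not sep or not key:
        raise ValueError(f"expected KEY=VALUE, got {text!r}")
    return key, value


def build_url(params: List[str]) -> str:
    """Append the parsed parameters to the endpoint as a query string."""
    pairs = [parse_param(p) for p in params]
    if not pairs:
        return DEEPGRAM_LISTEN_URL
    return DEEPGRAM_LISTEN_URL + "?" + urlencode(pairs)
```

For example, build_url(["language=es", "punctuate=true"]) yields the endpoint followed by ?language=es&punctuate=true.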

Additional Resources / Info

The documentation of the project is in the repository's README.

I like to make my programs as customizable as possible, so I use the argparse library to automatically parse the command line arguments and also build this nice help message.

usage: main.py [OPTIONS] TERM FILES...

Search engine for audios with support for several audio sources. Powered by Deepgram.

positional arguments:
  TERM                  Word to search
  FILES                 Files to perform the search

optional arguments:
  -h, --help            show this help message and exit
  --no-ansi             Don't display color in the output
  -L NUM, --log-level NUM
                        log level. -1=quiet, 0=errors, 1=warnings, 2=info (default=2)
  -C NUM, --context NUM
                        number of words to surround the search hits in the output (default=2)
  -W, --whole-word      search for whole words only
  -o FILE, --output-file FILE
                        file to store the results of the search in a JSON format

Deepgram options:
  --deepgram-api-key X  Deepgram API key. By default, get it from a file named deepgramApiKey
  -P X=Y, --param X=Y   parameter for the Deepgram URL
  -F, --ignore-cache    ignore cached transcriptions and force an API call

Telegram options:
  --telegram-api-id X   Telegram API key. By default, get it from a file named telegramApiId
  --telegram-api-hash X
                        Telegram API hash. By default, get it from a file named telegramApiHash
  -T X, --telegram-chat X
                        chat from Telegram to retrieve messages from
  -M NUM, --messages NUM
                        number of messages to retrieve while looking for audios in each Telegram chat (default=100)

Source code: https://github.com/MiguelMJ/AudioSearchEngine
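For readers who haven't used argparse, a reduced sketch of a parser that would produce a similar help message might look like this. It covers only a handful of the options above, and details such as nargs="*" for FILES and action="append" for -P are my assumptions rather than the project's actual code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a cut-down parser modelled on the help message above."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        usage="%(prog)s [OPTIONS] TERM FILES...",
        description="Search engine for audios with support for several "
                    "audio sources. Powered by Deepgram.",
    )
    parser.add_argument("term", metavar="TERM", help="Word to search")
    # Assumption: FILES may be empty, e.g. when only Telegram chats are searched.
    parser.add_argument("files", metavar="FILES", nargs="*",
                        help="Files to perform the search")
    parser.add_argument("-C", "--context", metavar="NUM", type=int, default=2,
                        help="number of words to surround the search hits "
                             "in the output (default=2)")
    parser.add_argument("-W", "--whole-word", action="store_true",
                        help="search for whole words only")

    # Argument groups give the separate "Deepgram options:" heading in --help.
    deepgram = parser.add_argument_group("Deepgram options")
    deepgram.add_argument("-P", "--param", metavar="X=Y", action="append",
                          default=[], help="parameter for the Deepgram URL")
    deepgram.add_argument("-F", "--ignore-cache", action="store_true",
                          help="ignore cached transcriptions and force an API call")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

The help text is generated from the help= strings, so the documentation and the parser can't drift apart.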

This is an example execution to search among local files:

Example search among local files

And this one is an execution that searches among the audios in the "me" chat in Telegram.

Example search among Telegram audios

The screenshot doesn't capture all the output, but you get the idea.
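To give a rough idea of what a hit with surrounding context means (the -C and -W options), here is a small sketch that works on a transcript already split into words. The function find_hits and the shape of its result are hypothetical; the real tool's output format may differ.

```python
from typing import Dict, List


def find_hits(words: List[str], term: str, context: int = 2,
              whole_word: bool = False) -> List[Dict]:
    """Return each matching word with `context` words on either side."""
    term = term.lower()
    hits = []
    for i, word in enumerate(words):
        w = word.lower()
        # Partial match by default; exact match when whole_word is set.
        matched = (w == term) if whole_word else (term in w)
        if matched:
            start = max(0, i - context)
            hits.append({
                "index": i,
                "word": word,
                "context": " ".join(words[start:i + context + 1]),
            })
    return hits
```

Called with the words of "tomorrow we have the meeting at ten sharp", the term "meet" and context=1, this returns a single hit whose context is "the meeting at".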

Some things to consider if you want to try it:

  • I've used the -F option for showcase purposes; you don't need to.
  • The default language is Spanish (my native language), so you will probably have to either change that little line of code or use the -P language=X argument.
  • All logging goes to stderr, so you can safely pipe the program into another command and get only the JSON output of the search.
  • The Telegram integration is optional if you are only going to search among local files. If you want to use it, you must install the telethon dependency and have the API id and API hash provided by Telegram.
  • In any case, a Deepgram API key is required.
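The stderr/stdout split mentioned in the list above is a common pattern for pipe-friendly CLIs. A minimal sketch, with a hypothetical logger name and result shape, could be:

```python
import json
import logging
import sys

# Send every log record to stderr so stdout carries nothing but JSON.
logging.basicConfig(stream=sys.stderr, level=logging.INFO,
                    format="%(levelname)s: %(message)s")
log = logging.getLogger("audiosearch")  # hypothetical logger name


def render_results(results: list) -> str:
    """Log a summary to stderr and return the results as a JSON string."""
    log.info("found %d hit(s)", len(results))
    # ensure_ascii=False keeps accented characters (e.g. Spanish) readable.
    return json.dumps(results, ensure_ascii=False)


if __name__ == "__main__":
    # Only this line writes to stdout.
    print(render_results([{"file": "example.ogg", "word": "hola"}]))
```

Because the log lines never reach stdout, the output can be piped straight into a JSON consumer such as python -m json.tool.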

Possible future improvements

  • Add more remote audio sources, apart from Telegram chats (maybe Discord?).
  • Make the search process more flexible using an edit-distance based match, instead of only exact matches.
  • Allow more complex queries: multiple words, regular expressions, etc.
  • If you can think of another one, feel free to make a PR!
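As a starting point for the edit-distance idea in the list above, here is a sketch of the classic Levenshtein distance with a small tolerance check on top. Nothing like this is implemented in the project yet, and the names and the default threshold are only illustrative.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def fuzzy_match(word: str, term: str, max_distance: int = 1) -> bool:
    """True when word is within max_distance edits of term, ignoring case."""
    return edit_distance(word.lower(), term.lower()) <= max_distance
```

A tolerance of one or two edits could help with small transcription errors, at the cost of more false positives on short words.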

If you like the project, consider contributing or giving it a star. I'm really excited to know what you think. Let me know in the comments!

Top comments (4)

Mr. Unity Buddy

Cool, I'm gonna definitely try it out! And a hard, but an amazing future improvement— A simple User Interface 💻🔥

MiguelMJ

Thanks! Feedback of any kind will be appreciated, either on GitHub or here!
Ah, GUIs are my Achilles heel, but I'll write that down... CLI applications tend to be more developer focused, but this one should be more user friendly, so I guess that would be better.
Would you recommend any GUI library for Python? I tried tkinter in the past but I don't know if there are better alternatives.

Mr. Unity Buddy

Tkinter is always my first choice, and the second one would be PyQt5, which is somewhat more complex than Tkinter.

Have you heard of Tkinter Designer, which can be used to convert Figma designs to Tkinter code? Most of the time, that tool is really useful. Take a look at it too!

MiguelMJ

I will, thanks!