DEV Community

Riccardo

Building an audio search engine with Quarkus and pgvector

Hey everyone!

This is my first post, and I wanted to start by sharing this project with you all.

I've been experimenting with audio embeddings recently to see if I could build a self-hosted search tool for music.

The result is a prototype called (for now) Agnostic Intelligence Layer. It's a semantic audio search engine designed to run entirely offline without reliance on external cloud APIs.

The Stack & Architecture
I wanted something fast and efficient, so I decided to mix Java and Python:

  • Java Quarkus: Handles the core engine pipeline and container efficiency.
  • Python: Manages the actual AI heavy lifting using CLAP neural networks to extract audio features into 512-dimensional vectors.
  • PostgreSQL + pgvector: Stores the vectors and finds acoustically similar tracks using cosine similarity.
  • MinIO: Handles fast and easy local storage.
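To make the similarity step concrete, here is a minimal sketch in plain Python of ranking tracks by cosine similarity over 512-dimensional vectors. It is not taken from the repository: the function names, the toy library, and the `tracks` / `embedding` table and column names in the SQL string are all assumptions for illustration. The only pgvector-specific detail it relies on is the `<=>` operator, which returns cosine distance (so similarity is `1 - distance`).

```python
import math

EMBEDDING_DIM = 512  # vector size mentioned in the post


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_tracks(query, library, top_k=3):
    """Score every track in `library` against `query`, best match first.

    `library` is a hypothetical in-memory stand-in for the database:
    a dict mapping track name -> embedding vector.
    """
    scored = [(name, cosine_similarity(query, vec)) for name, vec in library.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Roughly equivalent query on the database side. Table and column names
# are assumptions; `<=>` is pgvector's cosine distance operator.
PGVECTOR_QUERY = """
SELECT id, title, 1 - (embedding <=> %(query)s::vector) AS similarity
FROM tracks
ORDER BY embedding <=> %(query)s::vector
LIMIT %(top_k)s;
"""


def unit_like(*leading):
    """Build a toy 512-dimensional vector from a few leading values."""
    return list(leading) + [0.0] * (EMBEDDING_DIM - len(leading))


if __name__ == "__main__":
    library = {
        "track_a": unit_like(2.0),       # same direction as the query
        "track_b": unit_like(0.0, 1.0),  # orthogonal to the query
        "track_c": unit_like(1.0, 1.0),  # 45 degrees from the query
    }
    for name, score in rank_tracks(unit_like(1.0), library):
        print(f"{name}: {score:.4f}")
```

In a real deployment the ranking would happen inside PostgreSQL rather than in application code; the in-memory version only shows what the `ORDER BY embedding <=> ...` clause computes.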

Everything starts up from a single docker-compose file, so you don't have to waste time configuring external services.

Why I'm sharing this
It's still a prototype, but the core pipeline works fine, and I'm going to continue working on it. I wanted to share it early to get some eyes on the code and see if the architecture makes sense to other devs.

The repository has a quick start guide if you want to check out the code or test it locally:
👉 https://github.com/BothBasilisk/agnostic-audio-engine.git

Feel free to leave your thoughts, critiques, or any tips on how to improve the pipeline!

Top comments (2)

Asym

This is probably something great, and the fact that you're just experimenting for fun stands out to me. Honestly, that's what most devs had in the first place.

Riccardo

Thanks, appreciated!