As a developer, I spend a lot of time building on the internet.
And because I'm online so much, Discord has become the place where I think out loud. It’s where I dump notes, sketch ideas, and talk through progress on whatever I’m working on at the moment.
The idea was simple: I have a dev channel full of thoughts. I should be able to ask questions and retrieve those thoughts like an archive that understands context.
Discord’s built-in search doesn’t really help with that. It’s not semantic, it’s barely chronological, and once a channel gets busy, it’s effectively useless.
The idea came up between dev projects: I had just finished building an Upwork automation tool that ran through a Chrome extension, and I wanted to reference a particular build method I had noted down.
I had two options: dig through my codebase and logs, or scroll back through thousands of Discord messages hoping I’d recognize the right one.
What I actually wanted was to ask my own Discord channel a question in plain English and get back the answer, with context, and ideally with citations pointing to where it came from.
That’s when the idea clicked: if Discord isn’t going to give me semantic search, I’ll just build it myself.
Can’t be hard, can it?
Truth is, it’s actually pretty simple.
There are plenty of tutorials out there on how to build things like this, and there’s also plenty of information on what the APIs can and cannot do.
In the age of vibe-coding, I’ve never felt more empowered.
If you’re looking to prompt this out, just feed this main concept to your LLM of choice:
"Scrape, index, recall."
That’s the entire pipeline.
“Build a Discord RAG tool that uses API endpoints for communications, stores the scraped messages with an encrypt function on ingestion, and then embeds and stores the saved vectors. Create an endpoint that can be accessed via a slash command and accepts the channel and the search query as arguments. When a request is received, embed the query, run a cosine similarity search against the stored embeddings, return the top-k matches, and send them to the LLM for a response. Keep a relationship between the vectors and the scraped messages. Please stream the response. Thank you.”
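To make the recall step in that prompt concrete, here's a minimal sketch of the cosine similarity search in plain Python. It's not what my backend actually runs; `IndexedMessage`, `cosine_similarity`, and `top_k` are names I made up for illustration, and the embeddings are assumed to already exist as plain lists of floats from whatever embedding model you use. The point is the shape: each vector stays attached to the message it came from, so the answer can cite its source.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class IndexedMessage:
    """One scraped message plus its vector (hypothetical record shape)."""
    message_id: str         # Discord message ID, kept so answers can cite it
    content: str            # the message text (decrypted, if you encrypt at rest)
    embedding: List[float]  # vector produced by your embedding model


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_embedding: List[float],
          index: List[IndexedMessage],
          k: int = 3) -> List[Tuple[float, IndexedMessage]]:
    """Score every indexed message against the query and keep the k best."""
    scored = [(cosine_similarity(query_embedding, m.embedding), m) for m in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

The top-k messages, with their IDs, are what gets passed to the LLM as context. A brute-force scan like this is fine for one channel; a vector database does the same comparison with an index so it stays fast as the archive grows.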
Invite the bot with particular channel access, have it scrape the channel, store the data in the backend, and embed it. Then, for recall, use a slash command to ask your question. You can use a webhook, or you can use the actual Discord library for a more fluid experience. Either way, it’s really only two actions.
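If you go the webhook route, the slash command arrives at your endpoint as a JSON interaction payload. Here's a rough sketch of handling it, assuming the command's two options are named `channel` and `query` (my naming, adjust to yours). The numeric values are Discord's interaction and response types; `handle_interaction` and `parse_options` are hypothetical helpers, and request signature verification, which Discord requires for HTTP interaction endpoints, is left out to keep it short.

```python
# Discord interaction types (incoming) and response types (outgoing).
PING = 1
APPLICATION_COMMAND = 2
PONG = 1
CHANNEL_MESSAGE_WITH_SOURCE = 4
DEFERRED_CHANNEL_MESSAGE_WITH_SOURCE = 5


def parse_options(interaction: dict) -> dict:
    """Flatten the slash command options into a {name: value} dict."""
    options = interaction.get("data", {}).get("options", [])
    return {opt["name"]: opt["value"] for opt in options}


def handle_interaction(interaction: dict):
    """Return (response_for_discord, search_request_or_None)."""
    kind = interaction.get("type")
    if kind == PING:
        # Discord pings the endpoint to check that it is alive.
        return {"type": PONG}, None
    if kind == APPLICATION_COMMAND:
        args = parse_options(interaction)
        if "channel" not in args or "query" not in args:
            error = {"content": "Need both a channel and a query."}
            return {"type": CHANNEL_MESSAGE_WITH_SOURCE, "data": error}, None
        # Defer first: embedding + search + LLM will not finish in time
        # for an immediate reply, so the real answer is sent as a follow-up.
        request = {"channel_id": args["channel"], "query": args["query"]}
        return {"type": DEFERRED_CHANNEL_MESSAGE_WITH_SOURCE}, request
    return {"type": CHANNEL_MESSAGE_WITH_SOURCE,
            "data": {"content": "Unsupported interaction."}}, None
```

Deferring matters because Discord expects an initial response within about three seconds. The returned `request` is what you hand to the recall pipeline, and the finished answer then edits the deferred message.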
If you don’t want to ingest and embed each message as it arrives, you can run the ingestion as a cron job every couple of hours instead.
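For the scheduled version, the trick is to remember the last message ID you ingested and only pull what's newer. Discord's channel messages endpoint takes an `after` message ID and a `limit` of up to 100, which is what this sketch leans on. `ingest_new_messages` and the `fetch_page` callable are hypothetical; `fetch_page` stands in for whatever makes the actual authenticated API call in your setup.

```python
def ingest_new_messages(fetch_page, last_seen_id=None, page_size=100):
    """Collect every message newer than last_seen_id.

    fetch_page(after=..., limit=...) is assumed to return a list of
    message dicts, each with a string "id" (a Discord snowflake).
    Returns (messages sorted oldest to newest, new cursor to persist).
    """
    new_messages = []
    cursor = last_seen_id
    while True:
        page = fetch_page(after=cursor, limit=page_size)
        if not page:
            break
        new_messages.extend(page)
        # Snowflake IDs grow over time, so the largest ID is the newest
        # message, regardless of the order the page came back in.
        cursor = str(max(int(m["id"]) for m in page))
        if len(page) < page_size:
            break  # a short page means we have caught up
    new_messages.sort(key=lambda m: int(m["id"]))
    return new_messages, cursor
```

Each run hands the new messages to the encrypt-and-embed step and saves the returned cursor for next time, so a run with nothing new costs one request.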
The best part of all this is that, beyond the host for my worker, I use exactly one main tool. It handles my business logic, API endpoints, and database, and I didn’t have to write any of it.
If I want to think about security, I can, but the platform is already hardened.
Scaling is similarly easy. The platform handles it, much like Railway does, but at the enterprise level. It’s designed for war.
I know a lot of people like Supabase for its utility in the web environment, but for me it doesn’t compare to Xano, a lesser-known but more powerful tool that is capable of much more than I’ve described (it’s like opening up a toy box and discovering a new favorite toy each time).
Ultimately, my database, row-level access, and vector database all live within the same tool, which cuts out a lot of unnecessary external connections. Multiple layers are fine, but limiting the number of connections and keeping everything inside a single environment makes development easy, and is especially kind to Claude’s context window.
And, probably something most people appreciate: it’s a single subscription.
If there’s a single platform that’s ever done it better, I’d like to know. Until then, Xano is my go-to tool.
And now, with the power of Xano, I have a working semantic search inside my Discord server.
