DEV Community

Ama

I deleted my entire AI microservice and just used Postgres (here is why) 🐘⚑

A few months ago, I needed to build a feature that everyone is asking for right now: "Let our users chat with their messy data." (In my case, it was a massive dump of chaotic customer support tickets).

If you Google how to build this, the internet will immediately try to sell you a sprawling multi-service architecture:

  • You need a Vector DB for embeddings.
  • You need a graph database for relationships.
  • You need LangChain to glue it together.
  • You need a separate Python microservice to run it all.

I fell for it. I built it.
And two weeks later, I was dealing with the most annoying bug in modern engineering: State mismatch.

A user would delete a ticket in our main database, but the vector representation of that ticket still lived in our Vector DB. The AI kept hallucinating answers based on deleted data.

The "Aha" Moment πŸ’‘
Syncing data between a relational database and a dedicated vector database is a nightmare. You have to write custom webhooks, retry failed deliveries, and pay for two separate servers.
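To make the failure mode concrete, here is a minimal sketch of the dual-write problem, with plain dicts standing in for the two stores (all names here are illustrative, not from my actual codebase):

```python
# Minimal sketch of the dual-write problem: two stores, no shared transaction.
# Plain dicts stand in for Postgres and the external vector DB.

class VectorStoreDown(Exception):
    """Simulates a failed network call to the external vector DB."""

relational_db = {"ticket-1": "Printer on fire"}    # source of truth
vector_db = {"ticket-1": [0.12, 0.98, -0.33]}      # embedding copy

def delete_ticket(ticket_id: str, vector_store_available: bool) -> None:
    # Step 1: delete from the relational DB (always succeeds here).
    relational_db.pop(ticket_id, None)
    # Step 2: delete from the vector DB over the network. This can fail,
    # and no transaction spans both systems, so step 1 cannot roll back.
    if not vector_store_available:
        raise VectorStoreDown("vector DB unreachable, embedding orphaned")
    vector_db.pop(ticket_id, None)

try:
    delete_ticket("ticket-1", vector_store_available=False)
except VectorStoreDown:
    pass

# The ticket is gone, but its embedding survived: the AI can still
# retrieve and answer from data the user believes is deleted.
print("ticket-1" in relational_db)  # False
print("ticket-1" in vector_db)      # True
```

You can paper over this with retry queues and reconciliation jobs, but that is exactly the machinery I ended up deleting.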

So, I just threw it all away.

I stopped treating AI like some magical entity that requires a bespoke ecosystem, and went back to the most reliable tool in the backend world: Postgres.

Postgres + pgvector = Peace of Mind 🧘‍♂️ (and it's not a joke)
Instead of sending data across the internet to a third-party vector store, I just enabled the pgvector extension.

My data and my embeddings now live in the exact same table:

CREATE EXTENSION vector;

CREATE TABLE support_tickets (
  id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id uuid REFERENCES users(id) ON DELETE CASCADE,
  issue_text text,
  embedding vector(1536) -- OpenAI's embedding size (e.g. text-embedding-3-small)
);

Look at that ON DELETE CASCADE.
If a user deletes their account, their tickets disappear. Because the tickets disappear, the embeddings disappear. Instant, ACID-compliant state management without writing a single line of sync logic.
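The same guarantee covers the write path: if a row and its embedding are written in one transaction, they can never drift apart. A minimal sketch, assuming the psycopg 3 driver and an `embed()` helper (an illustrative name, not a real library call) that wraps your embedding API:

```python
# Sketch: write a ticket and its embedding in a single transaction,
# so the row and its vector can never drift apart.
# Assumptions (not from the post): psycopg 3 as the driver, and an
# embed() callable wrapping your embedding provider's API.

def to_vector_literal(values: list[float]) -> str:
    """Format a Python list as a pgvector input literal, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(v) for v in values) + "]"

def save_ticket(conn, user_id: str, issue_text: str, embed) -> None:
    embedding = embed(issue_text)  # e.g. a 1536-dim list of floats
    with conn.transaction():       # psycopg 3 transaction block
        conn.execute(
            "INSERT INTO support_tickets (user_id, issue_text, embedding) "
            "VALUES (%s, %s, %s::vector)",
            (user_id, issue_text, to_vector_literal(embedding)),
        )
```

If the embedding call fails, nothing is written; if the insert fails, nothing is embedded-but-orphaned. One system, one commit.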

Dumping the heavy frameworks πŸ—‘οΈ
Once everything was in Postgres, I realized I didn't need LangChain or LlamaIndex either.

Most AI frameworks try to do too much. They hide the actual API calls behind layers of abstraction, making debugging impossible.

Now, my entire "RAG" pipeline is just a raw SQL query and a standard API fetch:

  • User asks a question.
  • Turn the question into an embedding.
  • Run a cosine similarity search directly in SQL:
SELECT issue_text
FROM support_tickets
ORDER BY embedding <=> $1 -- <=> is pgvector's cosine distance operator
LIMIT 5;

  • Feed those 5 results into the OpenAI/Anthropic API with a strict JSON schema.
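Stitched together, the steps above fit in one function. A sketch under assumptions (psycopg 3 as the driver, the official `openai` client; the model names and the `build_prompt()` helper are illustrative, not from my production code):

```python
# The entire "RAG" pipeline as one function: embed, search, generate.
# Assumptions (illustrative): psycopg 3 for Postgres, the official
# openai client, and hypothetical model choices.

def build_prompt(question: str, tickets: list[str]) -> str:
    """Pure helper: join retrieved tickets into a grounded prompt."""
    context = "\n---\n".join(tickets)
    return (
        "Answer using ONLY the support tickets below.\n\n"
        f"Tickets:\n{context}\n\nQuestion: {question}"
    )

def answer(conn, client, question: str) -> str:
    # 1. Turn the question into an embedding.
    emb = client.embeddings.create(
        model="text-embedding-3-small", input=question
    ).data[0].embedding
    # 2. Cosine-distance search directly in SQL.
    rows = conn.execute(
        "SELECT issue_text FROM support_tickets "
        "ORDER BY embedding <=> %s::vector LIMIT 5",
        ("[" + ",".join(map(str, emb)) + "]",),
    ).fetchall()
    # 3. Feed the top 5 rows to the chat API.
    reply = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": build_prompt(question, [r[0] for r in rows]),
        }],
    )
    return reply.choices[0].message.content
```

Every API call is visible, every query is plain SQL, and there is nothing to debug through except your own thirty lines.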

That's it. No microservices. No $80/month Vector DB bills. Just a regular monolithic backend doing its job efficiently.

Why I'm sharing this (and how you can support) β˜•
The tech industry loves hype. We are constantly pushed to adopt the newest, most complex tools, even when a boring 20-year-old technology does it better.

I spend a lot of my time testing these architectures, making the mistakes, and stripping away the marketing fluff so you can just build things that actually work in production.

If this post just saved you from an unnecessary architecture rewrite, a massive SaaS bill, or a weekend of debugging sync errors, consider sponsoring my work on GitHub:

πŸ‘‰ Sponsor AmaLS367 on GitHub

Sponsorships give me the freedom to keep experimenting, breaking things, and open-sourcing production-ready boilerplates that save you time.

TL;DR
  • Don't over-engineer your AI features.
  • Dedicated Vector DBs often introduce data-sync nightmares.
  • pgvector keeps your embeddings ACID-compliant.
  • Ditch heavy AI frameworks; raw API calls + SQL are easier to debug and maintain.

I’m really curious β€” how are you managing your embeddings right now? Are you using a separate DB or keeping it monolithic? Let me know below! πŸ‘‡
