Originally published on rohitraj.tech
Stuffing a 200-table schema into a prompt = burns tokens, hurts accuracy. pgvector + embeddings retrieve only relevant tables per query.
Concrete numbers from my build: token cost down 87%, query accuracy up from 64% → 91%.
Full implementation + code: https://rohitraj.tech/en/notes/rag-for-sql
Read the full version with code samples and diagrams: https://rohitraj.tech/en/notes/rag-for-sql
Top comments (0)