Implementing a RAG system inside an RDBMS: Sqlite and Postgres with Sqlite-vec, PGVector.

#llm #rag #rdbms #embedding

tl;dr: A discussion of the pros and cons of storing embedding data with the data it represents vs an external vector database. I'll be following this post up shortly with a walkthrough the boilerplate required to create and search embeddings in SQlite and PostGres.

"Vector Databases: They are single-purpose DBMSs with indexes to accelerate nearest-neighbor search. RM DBMSs should soon provide native support for these data structures and search methods using their extendable type system that will *render such specialized databases unnecessary" - Stonebraker & Pavlo, 2024.*

I've been exploring RAG techniques and embeddings and as part of that, I've been checking out effective embedding generation and storage/retrieval options. As long as I'm able to perform vector searches, I don't see the value in storing embedding data separately from the relational data it represents.

Do dedicated vector databases make sense?

Vector storage and search is at this point, essentially commoditized: going forward it's not clear to me (or others) how dedicated vector databases can differentiate themselves from bog-standard relational databases with Vector search enabled.

Storing vector embeddings with the data that they represent is convenient and allows for succinct access to the results of vector-based searches. Some people are finding that keeping a separate vector DB in synch can be "painful at best, even for prototype applications." That said, vector database providers are understandably keen to provide value. But, given that both storing vectors and searching them are solved problems, there doesn't appear to be much room in which they could make any improvements.

So, even if adding vector storage to your existing database won't work for you, making adding a secondary database your the least-worst option, there's no obvious reason not to consider Postgres with PGVec, (or PGVec-scale) for that role.

Bonus: LLMS can speak SQL.

Many LLMs compose SQL well, which brings up an interesting possibility: LLM agents that can compose their own vector-based search queries. The use cases where this would make sense might be minimal at the moment: but interesting avenue nonetheless. I want to play around with that.

The contenders

Currently, I am creating embeddings for a dataset, storing them in Sqlite and Postgres, and performing connecting them to a local LLM, via a couple of extensions. Given that Sqlite and Postgres aren't really competing databases, this isn't going to be much of a comparison as much as a walkthrough of encoding and retrieving vector embeddings from SQL compatible databases. There's Sqlite-vec, and PG-Vector.

No love for mySql

I couldn't find an equivalent for mySQL, so if you are using mySQL, adding an secondary vector database appears to be your only straightforward option.

Of course, this isn't likely to be the case for long: nearest-neighbour search is a solved problem, it just needs to be implemented for mySQL. There's also been at least one stab at building one: MySQLvss. At the moment, this doesn't seem to be a maintained: the last commit was six months ago: perhaps it can provide a starting off point, should you decide to build your own mySQL nearest-neighbour search.

Fwiw, many cloud services, such as both Oracle and Google are offering vector search functionality as part of their managed mySQL services.

Sqlite: Sqlite-vec, -lembed & -rembed

Sqlite-vec is a new database extension learned about during the AI Engineer World's Fair keynote. It performs vector search allows the storage and retrieval of vector embeddings It seemed like something nice and shiny, and yet practical program to add to my RAG toolset.

Enabling extensions in sqlite can be less than straightforward, if you don't have easy access to the sqlite C api. Enabling extensions via python sqlite library requires recompiling python with sqlite feature flags enabled, or just using the package from Homebrew, which comes with the feature enabled.

SqLite-Vec comes peer extensions that allow the execution of embedding models locally, or from a model running on a server or 3rd party service. PGVector is just provides yjr search function, so we'll have to create the embeddings before we can use it. It's a small convenience, but I can see how it would be useful for programmatic generation of embeddings (such as providing an agent semantic search over their interactions with the user and other behaviours).

Additionally, SqLite-Vec is built with WASM, so this can power AI running in the browser, or on embedded devices.

Postgres: PG-Vector, PG-Vector-scale

PGVector provides a nearest-neighbour vector search PostGres. As such, I expect it the one I'll be reaching for more often. Additionally, it provide more sophisticated search algorithms than Sqlite: while Sqlite only implements cosine similarity, PG-Vector also search algorithms such as HNSW (Hierarchical Navigable Small World), IVFFlat (Inverted File Flat) and Euclidean distance.

PG-Vector-scale is a super fast iteration on PGVector, intended for really large deployments where the PGVector might hit performance or scale limitations. It is also addresses scalability for large datasets, distributed indexing and querying and handling billions of vectors with efficiency. As I write this, it's not clear whether there are circumstances where it makes sense to use PGVector at all, hopefully as I dig into it, it will become clear.

Ok, so, these are the tools I'm currently playing with. Next post, I'm going to get into implementing them and working with them.

DEV Community