Ce Gao

Posted on Aug 7, 2023 • Originally published at modelz.ai

20x Faster as the Beginning: Introducing pgvecto.rs extension written in Rust

#vectordatabase #machinelearning #ai #llm

We are thrilled to announce the release of pgvecto.rs, a powerful Postgres extension for vector similarity search written in Rust. Its HNSW algorithm is 20x faster than pgvector at 90% recall. But speed is just the start: pgvecto.rs is architected so that new algorithms are easy to add. Its extensible architecture lets contributors implement new indexes with ease, and we look forward to the open source community driving pgvecto.rs to new heights!

Why Rust?

Pgvecto.rs is implemented in Rust rather than C like many existing Postgres extensions. It is built on top of the pgrx framework for writing Postgres extensions in Rust. Rust provides many advantages for an extension like pgvecto.rs. Rust's strict compile-time checks guarantee memory safety, which helps avoid entire classes of bugs and security issues that can plague C extensions. Just as importantly, Rust provides modern developer ergonomics with great documentation, package management, and excellent error messages. This makes pgvecto.rs more approachable for developers to use and contribute to compared to sprawling C codebases. The safety and ease of use of Rust make it an ideal language for building the next generation of Postgres extensions like pgvecto.rs on top of pgrx.

Extensible Architectures

Pgvecto.rs is designed with an extensible architecture that makes it easy to add support for new index types. At the core is a set of traits that define the required behaviors for a vector index, like building, saving, loading, and querying. Implementing a new index is as straightforward as creating a struct for that index type and implementing the required traits. Pgvecto.rs currently comes with two built-in index types - HNSW for maximum search speed, and ivfflat for quantization-based approximate search. But the doors are open for anyone to create additional indexes like RHNSW, NGT, or custom types tailored to specific use cases. The extensible architecture makes pgvecto.rs adaptable as new vector search algorithms emerge. And it lets you select the right index for your data and performance needs. Pgvecto.rs provides the framework for making vector search in Postgres as flexible and future-proof as possible.
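To make the trait-based design concrete, here is a minimal sketch of what "a struct plus the required traits" can look like. The names VectorIndex, FlatIndex, and squared_l2 are hypothetical and chosen for illustration; they are not the actual pgvecto.rs traits, and saving and loading are left out to keep the example short.

```rust
/// Hypothetical trait describing what an index must provide.
/// Illustrative only: these are NOT the real pgvecto.rs trait names.
pub trait VectorIndex {
    /// Build an index over a set of vectors.
    fn build(vectors: Vec<Vec<f32>>) -> Self
    where
        Self: Sized;

    /// Return (position, distance) pairs for the k vectors nearest to `query`.
    fn search(&self, query: &[f32], k: usize) -> Vec<(usize, f32)>;
}

/// Squared Euclidean distance between two vectors of equal length.
fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// A trivial exact index that scans every stored vector on each query.
pub struct FlatIndex {
    vectors: Vec<Vec<f32>>,
}

impl VectorIndex for FlatIndex {
    fn build(vectors: Vec<Vec<f32>>) -> Self {
        FlatIndex { vectors }
    }

    fn search(&self, query: &[f32], k: usize) -> Vec<(usize, f32)> {
        // Score every vector, then keep the k smallest distances.
        let mut scored: Vec<(usize, f32)> = self
            .vectors
            .iter()
            .enumerate()
            .map(|(i, v)| (i, squared_l2(v, query)))
            .collect();
        scored.sort_by(|a, b| a.1.total_cmp(&b.1));
        scored.truncate(k);
        scored
    }
}
```

Under this sketch, a new index type would be another struct with its own impl of the same trait, and callers would only depend on build and search.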

Speed and Performance

Benchmarks show pgvecto.rs offers massive speed improvements over existing Postgres extensions like pgvector. In tests, its HNSW index demonstrates search performance up to 25x faster than pgvector's ivfflat index. The flexible architecture also allows using different indexing algorithms to optimize for either maximum throughput or precision. We're working on HNSW with quantization now, so please stay tuned!
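Speed figures for approximate indexes such as HNSW are usually quoted at a given recall, as with the 90% recall mentioned in the introduction. As a rough sketch of how that number is commonly measured (the function name recall_at_k is hypothetical, and this is not the benchmark code used for the results above), recall compares the ids returned by the approximate index against the exact nearest neighbors:

```rust
use std::collections::HashSet;

/// Fraction of the exact nearest-neighbor ids that the approximate
/// search also returned. Returns 0.0 when there is no ground truth.
fn recall_at_k(exact: &[u64], approximate: &[u64]) -> f64 {
    if exact.is_empty() {
        return 0.0;
    }
    // Sets make the comparison independent of order and of duplicates.
    let truth: HashSet<u64> = exact.iter().copied().collect();
    let found: HashSet<u64> = approximate.iter().copied().collect();
    let hits = truth.intersection(&found).count();
    hits as f64 / truth.len() as f64
}
```

For example, if the exact top 10 ids are 1 through 10 and the index returns nine of them plus one other id, the recall is 0.9.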

Persistence and Management

The earlier pg_embedding extension did a great job implementing HNSW indexes, but it lacked support for persistence and proper CRUD operations. pgvecto.rs adds those two core functionalities. Vector indexes in pgvecto.rs are properly persisted using WAL (write-ahead logging). pgvecto.rs handles saving, loading, rebuilding, and updating indexes automatically behind the scenes. You get durable indexes that don't require external management while fitting cleanly into current Postgres deployments and workflows.

Getting Started

Let's assume you've created a table using the following SQL command:

CREATE TABLE items (id bigserial PRIMARY KEY, emb vector(4));

Here, vector(4) denotes the vector data type, with 4 representing the dimension of the vector. You can use vector without specifying a dimension, but be aware that you cannot create an index on a vector type without a specified dimension.

You can insert data at any time, like this:

INSERT INTO items (emb)
VALUES ('[1.1, 2.2, 3.3, 4.4]');

To create an index on the emb vector column using squared Euclidean distance, you can use the following command:

CREATE INDEX ON items USING vectors (emb l2_ops)
WITH (options = $$
capacity = 2097152
size_ram = 4294967296
storage_vectors = "ram"
[algorithm.hnsw]
storage = "ram"
m = 32
ef = 256
$$);

If you want to retrieve the top 10 vectors closest to the origin, you can use the following SQL command:

SELECT *, emb <-> '[0, 0, 0, 0]' AS score
FROM items
ORDER BY emb <-> '[0, 0, 0, 0]'
LIMIT 10;
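For intuition about the score column, here is a small sketch of squared Euclidean distance, the metric named for the index above. It assumes the <-> operator returns that same squared distance; the function name squared_euclidean is made up for this example and is not part of the extension.

```rust
/// Squared Euclidean distance: the sum of squared per-dimension
/// differences, without the final square root.
fn squared_euclidean(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum()
}
```

Under that assumption, the row inserted earlier, [1.1, 2.2, 3.3, 4.4], would score roughly 36.3 against the origin. Because the square root is monotonic, ordering by the squared distance gives the same ranking as ordering by the true Euclidean distance.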

Conclusion

pgvecto.rs represents an exciting step forward for vector search in Postgres. Its implementation in Rust and extensible architecture provide key advantages over existing extensions, such as speed, safety, and flexibility. We're thrilled to release pgvecto.rs as an open source project under the Apache 2.0 license and can't wait to see what the community builds on top of it. There's ample room for pgvecto.rs to expand - adding new index types and algorithms, optimizing for different data distributions and use cases, and integrating with existing Postgres workflows.

We encourage you to try out pgvecto.rs on GitHub, benchmark it against your workloads, and contribute your own indexing innovations. Together, we can make pgvecto.rs the best vector search extension Postgres has ever seen! The potential is vast, and we're just getting started. Please join us on this journey to bring unprecedented vector search capabilities to the Postgres ecosystem. Join our Discord community to connect with the developers and other users working to improve pgvecto.rs!

Advertisement Time

The mission of ModelZ is to simplify the process of taking machine learning models into production. With experience from AWS, TikTok, and Kubeflow, our team has extensive expertise in MLOps engineering. So if you have any questions related to putting models into production, please feel free to reach out by joining our Discord or emailing modelz-support@tensorchord.ai. We're happy to draw on our background building MLOps platforms across companies to provide guidance on any part of the workflow, from model development to deployment.

More products with ModelZ:

ModelZ - A managed serverless GPU platform to deploy your own models
Mosec - A high-performance serving framework for ML models that offers dynamic batching and CPU/GPU pipelines to fully exploit your compute machine. A simpler and faster alternative to NVIDIA Triton.
envd - A command-line tool that helps you create container-based environments for AI/ML, from development to production. Python is all you need to know to use this tool.
ModelZ-llm - OpenAI compatible API for LLMs and embeddings (LLaMA, Vicuna, ChatGLM and many others)
