Alain Airom

Posted on Jan 14

Jvector And Python! (Part 4 of jvector series)

#jvector #python #java #vectorsearch

Is it possible to use jvector and Python?

Introduction

I’ve spent the last few weeks exploring the power of jvector in Java, but one question kept surfacing: Can we unlock this speed for Python developers? Today, we’re (maybe) building that bridge.

Since JVector is a pure Java library, there is no official native Python port (like pip install jvector). JVector’s core strength is its use of the Java Vector API (SIMD), which is specific to the JVM.

However, if we want to use JVector from Python, we can “bridge” the two using PyJnius. This allows us to call Java classes directly from Python code.

Implementation via PyJnius

To make this work, we need the JVector .jar file and the PyJnius library.

Prerequisites: first, install PyJnius.

pip install pyjnius

I also had to make sure a JDK (17+) was installed and that my JAVA_HOME environment variable was set.
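Before starting the JVM from Python, it can help to confirm that the environment is what PyJnius expects. The helper below is a minimal sketch using only the standard library; the function name check_java_env is my own, and on Windows the launcher would be java.exe rather than java.

```python
import os
import shutil

def check_java_env(environ=None):
    """Report whether JAVA_HOME is set and whether a `java` launcher is reachable."""
    env = os.environ if environ is None else environ
    java_home = env.get("JAVA_HOME")
    # Look for the launcher inside JAVA_HOME (Unix layout assumed)
    candidate = os.path.join(java_home, "bin", "java") if java_home else None
    return {
        "java_home": java_home,
        "java_home_has_launcher": bool(candidate and os.path.exists(candidate)),
        # Also check whether a launcher is visible on PATH
        "java_on_path": shutil.which("java", path=env.get("PATH")),
    }

if __name__ == "__main__":
    print(check_java_env())
```

If java_home is None or java_home_has_launcher is False, fixing that first usually saves a confusing stack trace later.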

Python Code Example: an example demonstrating how to initialize the Java Virtual Machine from Python, load the JVector classes, and perform a basic setup.

import jnius_config

# 1. Add the JVector jar to the classpath (must happen before jnius is imported)
jnius_config.set_classpath('.', './path/to/jvector-4.0.0.jar')

from jnius import autoclass

# 2. Access JVector Java classes
VectorSimilarityFunction = autoclass('io.github.jbellis.jvector.vector.VectorSimilarityFunction')
GraphIndexBuilder = autoclass('io.github.jbellis.jvector.graph.GraphIndexBuilder')
# VectorFloat is an interface and lives in the vector.types package
VectorFloat = autoclass('io.github.jbellis.jvector.vector.types.VectorFloat')

def run_jvector_demo():
    # Example parameters (not used yet; they reappear in the index-building example)
    dimension = 128
    max_degree = 16
    search_depth = 100

    # Reaching this point means the JVM started and the classes resolved
    print("JVector classes successfully loaded in Python!")
    # toString() returns the enum constant's name
    print(f"Similarity Function: {VectorSimilarityFunction.COSINE.toString()}")

run_jvector_demo()

Should we use JVector in Python?

While the bridge works, it is usually not the best choice for a pure Python project. Here is a quick guide on when to use what:

Use JVector if the application logic is already in Java/Kotlin/Scala. It is built to solve the “Java gap” where standard libraries like Lucene are too slow for massive vector datasets.
Use FAISS or ScaNN if we are working in Python. These libraries have native Python bindings and are highly optimized for that ecosystem.
Use JVector via Python only if we need to stay consistent with a Java-based backend (e.g., we are building a Python tool to inspect or debug an index created by a Java-based Apache Cassandra or Astra DB instance).

In summary, keep the following in mind ⬇️

| Goal                        | Recommended Library | Language       |
| --------------------------- | ------------------- | -------------- |
| **High-Perf Java Search**   | **JVector**         | Java           |
| **High-Perf Python Search** | **FAISS / ScaNN**   | Python / C++   |
| **PostgreSQL Integrated**   | **pgvector**        | SQL / Python   |
| **Distributed Scaling**     | **Milvus / Qdrant** | Multi-language |

Additional Points to Keep in Mind: Conversion of Arrays

As JVector is designed for the JVM, we’ll need to handle the conversion of NumPy arrays (Python’s standard for vectors) into Java float arrays or VectorFloat objects.

Python + JVector Implementation Example

Make sure the needed packages are installed:

pip install pyjnius numpy

The code:

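import numpy as np
import jnius_config

# 1. Add the JVector jar to the classpath (before jnius is imported)
jnius_config.set_classpath('.', './jvector-4.0.0.jar')

from jnius import autoclass

# 2. Load the essential JVector classes
VectorSimilarityFunction = autoclass('io.github.jbellis.jvector.vector.VectorSimilarityFunction')
VectorizationProvider = autoclass('io.github.jbellis.jvector.vector.VectorizationProvider')

# VectorFloat is an interface, so vectors are created through the type-support factory
vts = VectorizationProvider.getInstance().getVectorTypeSupport()

def numpy_to_jvector(np_array):
    """
    Converts a NumPy array into a JVector VectorFloat object.
    """
    # Ensure it's float32 for JVector compatibility
    arr_32 = np_array.astype(np.float32)
    # Assumption: PyJnius converts the Python list of floats into a Java float[]
    # for createFloatVector(Object); verify this with your PyJnius version
    return vts.createFloatVector(arr_32.tolist())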

# --- Usage Example ---

# Create a sample vector in NumPy
my_vector = np.array([0.1, 0.5, 0.8, -0.2], dtype=np.float32)
query_vector = np.array([0.2, 0.4, 0.9, 0.1], dtype=np.float32)

# Convert to JVector format
jv_vector = numpy_to_jvector(my_vector)
jv_query = numpy_to_jvector(query_vector)

# Calculate similarity using JVector's engine
similarity_fn = VectorSimilarityFunction.COSINE
score = similarity_fn.compare(jv_vector, jv_query)

print(f"Cosine Similarity Score: {score}")
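To sanity-check the value printed by the bridge, the same two vectors can be compared in plain NumPy. This is only a sketch. One assumption to verify against the Javadoc of your JVector version: as far as I know, the COSINE function returns a score rescaled into [0, 1] as (1 + cos) / 2, similar to Lucene, so the bridge output may match the rescaled value rather than the raw cosine.

```python
import numpy as np

def cosine(a, b):
    """Raw cosine similarity of two 1-D vectors, computed in float32."""
    a = np.asarray(a, dtype=np.float32)
    b = np.asarray(b, dtype=np.float32)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

my_vector = [0.1, 0.5, 0.8, -0.2]
query_vector = [0.2, 0.4, 0.9, 0.1]

raw = cosine(my_vector, query_vector)
# Assumed rescaling into [0, 1]; check it against your JVector version
rescaled = (1.0 + raw) / 2.0

print(f"raw cosine: {raw:.4f}, rescaled score: {rescaled:.4f}")
```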

Important Considerations

Performance Overhead: Passing data between Python and Java via PyJnius involves some overhead. If we do millions of small searches from Python, this might be slower than a native Python library like FAISS.
Vector API (SIMD): To get JVector’s full speed, we must ensure the JVM is started with the Vector API enabled. Pass the JVM option via jnius_config before jnius is imported for the first time, because options cannot be changed once the JVM is running:

jnius_config.add_options('--add-modules', 'jdk.incubator.vector')

As I understand, most users only use JVector in Python if they are building tools to manage a Cassandra or Astra DB database that uses JVector under the hood. For pure Python AI research, native libraries are usually preferred.

A second example: turning a NumPy matrix (a 2D array of vectors) into a JVector graph index from Python.

import numpy as np
import jnius_config

# 1. Initialize JVM with SIMD enabled
jnius_config.add_options('--add-modules', 'jdk.incubator.vector')
jnius_config.set_classpath('.', './jvector-4.0.0.jar')

from jnius import autoclass, PythonJavaClass, java_method

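# Java class imports
VectorSimilarityFunction = autoclass('io.github.jbellis.jvector.vector.VectorSimilarityFunction')
GraphIndexBuilder = autoclass('io.github.jbellis.jvector.graph.GraphIndexBuilder')
BuildScoreProvider = autoclass('io.github.jbellis.jvector.graph.similarity.BuildScoreProvider')
VectorizationProvider = autoclass('io.github.jbellis.jvector.vector.VectorizationProvider')

# VectorFloat is an interface; instances come from the type-support factory
vts = VectorizationProvider.getInstance().getVectorTypeSupport()

class NumpyVectorValues(PythonJavaClass):
    """
    Bridges a NumPy matrix to the Java RandomAccessVectorValues interface.
    """
    __javainterfaces__ = ['io/github/jbellis/jvector/graph/RandomAccessVectorValues']

    def __init__(self, matrix):
        super().__init__()
        self.matrix = matrix.astype(np.float32)
        self.dim = matrix.shape[1]

    @java_method('()I')
    def dimension(self):
        return self.dim

    @java_method('()I')
    def size(self):
        return self.matrix.shape[0]

    @java_method('(I)Lio/github/jbellis/jvector/vector/types/VectorFloat;')
    def getVector(self, ordinal):
        # Convert the specific row to a Java VectorFloat
        # (assumes PyJnius turns the list of floats into a Java float[])
        return vts.createFloatVector(self.matrix[ordinal].tolist())

    @java_method('()Z')
    def isValueShared(self):
        # Every call builds a fresh vector, so values are not shared
        return False

    @java_method('()Lio/github/jbellis/jvector/graph/RandomAccessVectorValues;')
    def copy(self):
        # The wrapper keeps no per-thread state, so it can hand back itself
        return self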

# --- Execution ---

# 1. Generate dummy data: 10,000 vectors of 128 dimensions
data_matrix = np.random.random((10000, 128)).astype(np.float32)

# 2. Wrap the matrix for Java
ravv = NumpyVectorValues(data_matrix)

# 3. Setup the Index Builder
# BuildScoreProvider handles how distances are calculated during the build
score_provider = BuildScoreProvider.randomAccessScoreProvider(
    ravv, 
    VectorSimilarityFunction.COSINE
)

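# Assumption: JVector 4.x constructors may expect an extra boolean (hierarchy flag)
# after alpha; check the constructor list for the jar you are using.
builder = GraphIndexBuilder(
    score_provider,
    ravv.dimension(),
    16,   # M: max connections per node
    100,  # Beam width used during construction
    1.2,  # Degree overflow allowed during construction
    1.2   # Alpha: relaxes the neighbor diversity requirement
)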

# 4. Build the index
print("Building JVector index from NumPy matrix...")
index = builder.build(ravv)
print(f"Index complete! Nodes: {index.size()}")

Code explanation:

Numpy Matrix: the data lives in Python’s memory as a contiguous block.
PyJnius Proxy: the NumpyVectorValues class acts as a translator. When JVector (in Java) asks for "vector number 500," Python intercepts the call, slices the matrix, and hands over a Java-compatible object.
Vamana Graph: JVector’s GraphIndexBuilder takes these vectors and organizes them into a high-speed navigation graph.
SIMD: because we added the --add-modules flag, JVector can run its distance computations with the CPU's SIMD instructions through the Java Vector API.

If we have millions of vectors, converting each row to a list (via .tolist()) inside the getVector method can become a bottleneck. For enterprise-scale data, it is faster to save the NumPy matrix to a binary file and have JVector read it directly using its OnDiskGraphIndex and SimpleMappedReader classes.
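To get a feel for that bottleneck without starting a JVM, the sketch below times the Python half of the work only: building one list per row (what getVector does on every callback) versus writing the whole matrix in a single call. The function names are illustrative, the timings vary by machine, and the real bridge adds JNI call overhead on top of this.

```python
import os
import tempfile
import time

import numpy as np

def per_row_lists(matrix):
    """Mimics the bridge: one Python list is built per requested row."""
    return [matrix[i].tolist() for i in range(matrix.shape[0])]

def bulk_export(matrix, path):
    """Writes the whole matrix as raw bytes in one call and returns the file size."""
    matrix.tofile(path)
    return os.path.getsize(path)

matrix = np.random.random((5000, 128)).astype(np.float32)

start = time.perf_counter()
rows = per_row_lists(matrix)
t_rows = time.perf_counter() - start

with tempfile.TemporaryDirectory() as tmp:
    start = time.perf_counter()
    nbytes = bulk_export(matrix, os.path.join(tmp, "vectors.bin"))
    t_bulk = time.perf_counter() - start

print(f"per-row conversion: {t_rows:.4f}s, bulk export of {nbytes} bytes: {t_bulk:.4f}s")
```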

Achieving High-Performance

For high-performance, large-scale datasets, the “bridge” approach where Python feeds Java row-by-row is inefficient. A Disk-First strategy is a common way to bridge Python data-science workflows with Java infrastructure.

In this approach, we export our NumPy data into a raw binary format that JVector can “memory map” directly. This eliminates the overhead of Python-to-Java object conversion.

Exporting NumPy to JVector Binary: JVector expects a simple, flat binary file of 32-bit floats, and NumPy’s .tofile() creates one in a single call.

import numpy as np

# Create 1 million vectors (128-dim)
data = np.random.random((1000000, 128)).astype(np.float32)

# Save as raw binary (no header)
data.tofile("vectors.bin")

print("Exported 1M vectors to vectors.bin (approx 512MB)")
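Because .tofile() writes no header, the file carries neither the shape nor the byte order: NumPy uses the array's own byte order (little-endian on most machines), while Java's ByteBuffer defaults to big-endian. Whoever reads the file must therefore know both the dimension and the byte order. The sketch below makes both explicit and verifies a round trip; the helper names are my own, and which byte order a given JVector reader expects is something to confirm for your version.

```python
import os
import tempfile

import numpy as np

def export_vectors(data, path, big_endian=False):
    """Writes a 2-D array as raw float32 with an explicit byte order."""
    dtype = ">f4" if big_endian else "<f4"
    np.ascontiguousarray(data, dtype=dtype).tofile(path)
    return os.path.getsize(path)

def load_vectors(path, dim, big_endian=False):
    """Reads the raw file back; the caller must supply dim and byte order."""
    dtype = ">f4" if big_endian else "<f4"
    flat = np.fromfile(path, dtype=dtype)
    if flat.size % dim != 0:
        raise ValueError("file size is not a multiple of the vector dimension")
    return flat.reshape(-1, dim)

data = np.random.random((1000, 128)).astype(np.float32)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "vectors.bin")
    size = export_vectors(data, path, big_endian=True)
    restored = load_vectors(path, dim=128, big_endian=True)

# 4 bytes per float32 value, no header
print(size == data.shape[0] * data.shape[1] * 4, np.array_equal(restored, data))
```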

Configuring JVector for Direct Memory Access: we tell JVector to “map” this file. Instead of loading it into the JVM heap, JVector will treat the file on the SSD as if it were RAM, using the OS page cache for lightning-fast access.

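import jnius_config

# The jar must be on the classpath before jnius is imported
jnius_config.set_classpath('.', './jvector-4.0.0.jar')

from jnius import autoclass

# Load JVector on-disk classes.
# Assumption: the reader is obtained through the nested SimpleMappedReader.Supplier
# class and OnDiskGraphIndex lives in graph.disk; check the Javadoc of your version.
ReaderSupplier = autoclass('io.github.jbellis.jvector.disk.SimpleMappedReader$Supplier')
OnDiskGraphIndex = autoclass('io.github.jbellis.jvector.graph.disk.OnDiskGraphIndex')
File = autoclass('java.io.File')

# The supplier takes a java.nio.file.Path; get() returns a memory-mapped reader
vector_reader = ReaderSupplier(File('vectors.bin').toPath()).get()

print("JVector is now mapped to the binary file created by Python.")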

➡️ Using the Disk-First approach provides three massive advantages:

Zero-Copy: the vectors are not copied into Python or Java objects; they stay in the OS page cache, and JVector reads them straight from the mapped file.
Memory Safety: we can search a 100GB vector file on a machine with only 8GB of RAM. The OS pages pieces of the file in and out as the graph search traverses different nodes.
Persistence: our data is already saved. If the Python script crashes or the Java service restarts, the index doesn’t need to be “re-uploaded” — it just re-maps the file and is ready in milliseconds.
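The same memory-mapping idea can be tried from the Python side with np.memmap, which is a handy way to inspect a large vector file without loading it. This is a sketch of the concept rather than of JVector's own reader, and read_row is a hypothetical helper name.

```python
import os
import tempfile

import numpy as np

def read_row(path, ordinal, dim):
    """Maps the file and copies out a single row; the rest stays on disk."""
    n_rows = os.path.getsize(path) // (dim * 4)  # 4 bytes per float32
    mapped = np.memmap(path, dtype=np.float32, mode="r", shape=(n_rows, dim))
    # np.array() copies only the requested row into regular memory
    return np.array(mapped[ordinal])

data = np.random.random((1000, 128)).astype(np.float32)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "vectors.bin")
    data.tofile(path)
    row = read_row(path, 500, dim=128)

print(row.shape, np.array_equal(row, data[500]))
```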

Moving To Production

Data Prep: The “Clean Room” (NumPy/Pandas)

Python is unrivaled for data manipulation, and this is where we ensure JVector receives high-quality input.

Normalization: The cosine similarity of two unit-length vectors equals their dot product. If we pre-normalize our vectors to unit length in Python and use Dot Product in Java, JVector can skip the norm computation on every comparison and run up to 50% faster.
Type Casting: JVector works with 32-bit floats, while NumPy defaults to float64. Cast arrays explicitly with data.astype(np.float32) to avoid type mismatches during the bridge transfer.
Batching: If we are processing millions of records, we can use Pandas to chunk the data. This prevents the Python environment from hitting “Out of Memory” errors before the data even reaches JVector.
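The three preparation steps above can be sketched in a few lines. For brevity this version chunks with plain NumPy slicing instead of Pandas, and the helper names normalize_rows and iter_chunks are my own.

```python
import numpy as np

def normalize_rows(matrix):
    """Casts to float32 and scales every row to unit length."""
    m = np.asarray(matrix, dtype=np.float32)
    norms = np.linalg.norm(m, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # leave all-zero rows untouched
    return m / norms

def iter_chunks(matrix, chunk_size):
    """Yields consecutive row blocks so only one block is processed at a time."""
    for start in range(0, matrix.shape[0], chunk_size):
        yield matrix[start:start + chunk_size]

rng = np.random.default_rng(0)
data = rng.random((1000, 128))  # float64 by default

unit = np.concatenate([normalize_rows(chunk) for chunk in iter_chunks(data, 256)])

# On unit vectors, cosine similarity and dot product give the same number
a, b = unit[0], unit[1]
cos = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
dot = float(a @ b)
print(unit.dtype, abs(cos - dot) < 1e-5)
```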

Storage: The “Zero-Copy” Bridge (.tofile())

Passing millions of objects row-by-row between Python and Java creates a massive bottleneck. We should bypass the bridge for the data itself.

Raw Binary Export: Use NumPy’s .tofile() to export the matrix as a flat binary file.
Memory Mapping: In Java, use JVector’s SimpleMappedReader to map that file directly. This allows JVector to treat the SSD as extended RAM, searching datasets much larger than the available memory without the overhead of Java object conversion.

Search Engine: The Java Wrapper (Spring Boot/Micronaut)

Instead of running JVector inside our Python script, host it in its native environment — the JVM.

JVM Optimization: Running JVector in a dedicated service allows us to easily enable the Java Vector API (Project Panama) with the --add-modules jdk.incubator.vector flag.
Concurrency: Use a framework like Spring Boot to manage JVector’s GraphIndexBuilder. JVector features lock-free concurrent construction, so it can use all available CPU cores to index the data with near-linear scaling.
Health & Persistence: A standalone service can manage the lifecycle of the index (saving/loading from disk) independently of the Python application’s restarts.

Communication: The “Fast Lane” (gRPC/FastAPI)

Finally, we need a high-speed way for the Python frontend to send queries to the Java search engine.

gRPC (Recommended): Use gRPC for its support of Protobuf, which is much faster than JSON for sending query vectors. It provides a typed contract between the Python client and Java server.
FastAPI / REST: For simplicity, use FastAPI in Python to call a REST endpoint on the Java service. While slightly slower than gRPC, it is easier to debug and integrate with standard web tools.
Arrow Flight: For returning massive amounts of data (e.g., returning 1,000+ vectors for re-ranking), we should look into Apache Arrow Flight, which is designed specifically for high-speed transport of large datasets between languages.
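A quick way to see why a binary encoding helps is to compare payload sizes for one 128-dimensional query. The sketch below uses only the standard library: the JSON side is a plain json.dumps, and the binary side packs the floats as 4-byte little-endian values, which is how the body of a packed repeated float field is laid out (the real Protobuf message adds a few bytes of tags and lengths on top).

```python
import json
import random
import struct

def json_payload(vector, top_k=10):
    """The query as a JSON request body."""
    return json.dumps({"query_vector": vector, "top_k": top_k}).encode("utf-8")

def packed_payload(vector):
    """The vector as raw little-endian float32 values, 4 bytes each."""
    return struct.pack(f"<{len(vector)}f", *vector)

random.seed(0)
vector = [random.random() for _ in range(128)]

json_size = len(json_payload(vector))
packed_size = len(packed_payload(vector))
print(f"JSON: {json_size} bytes, packed float32: {packed_size} bytes")
```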

Integration Strategies

| Feature         | JNI (PyJnius)         | Microservice (gRPC)       |
| --------------- | --------------------- | ------------------------- |
| **Complexity**  | High (JVM management) | Medium (Network setup)    |
| **Stability**   | Risk of JVM Segfaults | Isolated & Fault-tolerant |
| **Performance** | Lowest Latency        | Scalable Throughput       |

To implement the gRPC communication layer, first we define a strict contract (the .proto file) that both Python and Java understand. This ensures that the high-dimensional vector data is serialized efficiently using Protobuf instead of bulky JSON.

The Service Definition (vector_service.proto): this sample defines the “handshake” between a Python client and the JVector Java service. Both sides generate their code from this one file.

syntax = "proto3";

package jvector;

option java_package = "com.example.jvector.grpc";
option java_multiple_files = true;

// The request containing the query vector and search parameters
message SearchRequest {
  repeated float query_vector = 1;
  int32 top_k = 2;
}

// A single result from the JVector index
message SearchResult {
  int32 id = 1;
  float score = 2;
}

// The response containing the list of nearest neighbors
message SearchResponse {
  repeated SearchResult results = 1;
}

service JVectorService {
  rpc Search(SearchRequest) returns (SearchResponse);
}

The Python Client (Front-end): on the Python side, we’ll use the grpcio and grpcio-tools libraries to generate the client code from the .proto file. This allows us to treat the remote JVector search as a local function call.

import grpc
import vector_service_pb2
import vector_service_pb2_grpc
import numpy as np

def run_search(query_np_array, k=10):
    # 1. Connect to the Java JVector Service
    with grpc.insecure_channel('localhost:50051') as channel:
        stub = vector_service_pb2_grpc.JVectorServiceStub(channel)

        # 2. Convert NumPy array to Protobuf-compatible list
        # Ensure it is float32 as JVector requires
        query_list = query_np_array.astype(np.float32).tolist()

        # 3. Create the request
        request = vector_service_pb2.SearchRequest(
            query_vector=query_list,
            top_k=k
        )

        # 4. Execute the remote search
        response = stub.Search(request)

        print(f"Top {k} Results:")
        for res in response.results:
            print(f"ID: {res.id}, Score: {res.score}")

# Usage with a random query vector
query = np.random.random(128)
run_search(query)
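The vector_service_pb2 and vector_service_pb2_grpc modules imported by the client are generated files, so they have to be produced from the .proto first. A minimal sketch, assuming grpcio-tools is installed and the .proto sits in the current directory (the helper names are my own):

```python
def protoc_args(proto_file="vector_service.proto", include_dir=".", out_dir="."):
    """Builds the argument list for the protoc entry point shipped with grpcio-tools."""
    return [
        "grpc_tools.protoc",             # argv[0] placeholder
        f"-I{include_dir}",              # where to look for .proto files
        f"--python_out={out_dir}",       # emits vector_service_pb2.py
        f"--grpc_python_out={out_dir}",  # emits vector_service_pb2_grpc.py
        proto_file,
    ]

def generate_stubs(proto_file="vector_service.proto"):
    """Runs the code generator; requires `pip install grpcio-tools`."""
    from grpc_tools import protoc  # imported lazily so the helper above works without it
    return protoc.main(protoc_args(proto_file))
```

Calling generate_stubs() once (or running the equivalent python -m grpc_tools.protoc command with the same arguments) should leave both generated modules next to the client script.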

To implement the Java “Server-Side” logic, we’ll need to create a service that implements the gRPC interface we defined in the .proto file and connects it to the JVector engine.

In a production scenario, we would most probably use a framework like Spring Boot with the grpc-spring-boot-starter to handle the boilerplate of starting the server and managing dependencies.

Java Dependencies (pom.xml): we need the gRPC libraries and the JVector dependency. Java 21+ is a good choice because JVector can take advantage of Project Panama features such as the Foreign Function & Memory API for maximum speed.

<dependencies>
    <dependency>
        <groupId>net.devh</groupId>
        <artifactId>grpc-spring-boot-starter</artifactId>
        <version>2.15.0.RELEASE</version>
    </dependency>
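    <dependency>
        <!-- Keep this in line with the jar used in the Python examples; confirm the exact coordinates on Maven Central -->
        <groupId>io.github.jbellis</groupId>
        <artifactId>jvector</artifactId>
        <version>4.0.0</version>
    </dependency>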
</dependencies>

The Java Service Implementation: this class does the heavy lifting: it receives the float array from Python, wraps it into a JVector-compatible object, and performs the search.

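import com.example.jvector.grpc.*;
import io.github.jbellis.jvector.graph.GraphIndex;
import io.github.jbellis.jvector.graph.GraphSearcher;
import io.github.jbellis.jvector.graph.RandomAccessVectorValues;
import io.github.jbellis.jvector.util.Bits;
import io.github.jbellis.jvector.vector.VectorSimilarityFunction;
import io.github.jbellis.jvector.vector.VectorizationProvider;
import io.github.jbellis.jvector.vector.types.VectorFloat;
import io.github.jbellis.jvector.vector.types.VectorTypeSupport;
import io.grpc.stub.StreamObserver;
import net.devh.boot.grpc.server.service.GrpcService;
import java.util.Arrays;
import java.util.stream.Collectors;

@GrpcService
public class JVectorGrpcService extends JVectorServiceGrpc.JVectorServiceImplBase {

    // VectorFloat is an interface; instances are created through the type-support factory
    private static final VectorTypeSupport vts =
        VectorizationProvider.getInstance().getVectorTypeSupport();

    // In a real app, these would be pre-loaded from disk and injected.
    // Assumption: the index type is named GraphIndex in the JVector version in use.
    private final GraphIndex index;
    private final RandomAccessVectorValues ravv;

    public JVectorGrpcService(GraphIndex index, RandomAccessVectorValues ravv) {
        this.index = index;
        this.ravv = ravv;
    }

    @Override
    public void search(SearchRequest request, StreamObserver<SearchResponse> responseObserver) {
        try {
            // 1. Convert the Protobuf "repeated float" to a JVector VectorFloat
            float[] queryArray = new float[request.getQueryVectorCount()];
            for (int i = 0; i < request.getQueryVectorCount(); i++) {
                queryArray[i] = request.getQueryVector(i);
            }
            VectorFloat<?> queryVector = vts.createFloatVector(queryArray);

            // 2. Perform the search using GraphSearcher's static convenience method
            //    (Bits.ALL means no node is filtered out)
            var result = GraphSearcher.search(queryVector, request.getTopK(), ravv,
                    VectorSimilarityFunction.COSINE, index, Bits.ALL);

            // 3. Map JVector results back to Protobuf messages
            //    (getNodes() returns an array, hence Arrays.stream)
            var pbResults = Arrays.stream(result.getNodes())
                .map(ns -> SearchResult.newBuilder()
                    .setId(ns.node)
                    .setScore(ns.score)
                    .build())
                .collect(Collectors.toList());

            // 4. Send the response back to Python
            SearchResponse response = SearchResponse.newBuilder()
                .addAllResults(pbResults)
                .build();

            responseObserver.onNext(response);
            responseObserver.onCompleted();
        } catch (Exception e) {
            responseObserver.onError(e);
        }
    }
}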

Code Wrap-up

Hardware Acceleration: By running the searcher on the JVM, we can start it with the --add-modules jdk.incubator.vector flag. This allows JVector to use AVX-512 or ARM Neon instructions for the distance computations, something Python's interpreter cannot do natively for a custom graph traversal.
Thread Safety: JVector is designed for high concurrency. We can have dozens of Python workers hitting this single Java service simultaneously, and JVector will handle the parallel graph traversals efficiently.
Language Independence: Non-Java developers and data science teams can iterate on the Python frontend (changing UI, re-ranking logic, etc.) without ever touching the high-performance indexing logic in Java.
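From the Python side, hitting the service from several workers at once can be sketched with a thread pool. The search function here is a placeholder standing in for the gRPC call; a real version would return the results of stub.Search(...) instead of printing them, and the names search_many and fake_search are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor

def search_many(search_fn, queries, max_workers=8):
    """Runs search_fn over all queries concurrently and keeps the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(search_fn, queries))

def fake_search(query):
    """Placeholder for the remote call: returns the query length instead of neighbors."""
    return len(query)

queries = [[0.0] * 128 for _ in range(5)]
results = search_many(fake_search, queries)
print(results)
```

Threads fit here because each worker spends most of its time waiting on the network, not running Python code.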

Conclusion

By bridging the high-performance world of JVector with the flexibility of Python, one is no longer forced to choose between developer velocity and raw execution speed. By following the “Production Stack” — cleaning data in NumPy, leveraging binary memory-mapping for zero-copy storage, and wrapping the engine in a Java-based gRPC service — we create a system that is both scalable and incredibly lean. This architecture allows us to harness the power of DiskANN and SIMD acceleration while keeping our application logic in the Python ecosystem!

Disclaimer: this series of articles and the extensive research underpinning them are a testament to the power of collaborative innovation. This work would not have been possible without the invaluable contributions of our DataStax colleagues, who recently joined our ranks. Their willingness to share deep domain expertise and technical insights has been a true catalyst for these breakthroughs. By bridging our collective knowledge, we have been able to push the boundaries of what is possible in the AI and data landscape, and I am profoundly grateful for their partnership, mentorship, and the spirit of excellence they bring to our team.
