Cies Breijs

Postgres pipelines from the JVM with Bpdbi

Every time your JVM app talks to Postgres, something wasteful happens. Your code sends a query, waits for the response, sends the next query, waits again. Each wait is a full network round-trip with a typical latency of 0.5-2ms in a cloud environment. For a simple transaction with a handful of queries, that can easily add up to 8-10ms of just... waiting.
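The back-of-envelope math is simple: sequential queries pay the round-trip latency once per query, a pipeline pays it once per batch. A small sketch (the 1ms round-trip is an assumed figure, matching the benchmark setup later in this post):

```java
// Latency cost of N queries: one round-trip each vs. one round-trip total.
public class LatencyMath {
    // JDBC-style sequential execution: every query waits for its response.
    static double sequentialMs(int queries, double rttMs) {
        return queries * rttMs;
    }

    // Pipelined execution: all queries go out in one batch, so the
    // network latency is paid only once (server processing time ignored).
    static double pipelinedMs(double rttMs) {
        return rttMs;
    }

    public static void main(String[] args) {
        System.out.println(sequentialMs(5, 1.0) + " ms vs " + pipelinedMs(1.0) + " ms");
    }
}
```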

This problem is exacerbated by the additional queries that are commonly added when using Postgres' Row-Level Security (RLS). For instance, when using Supabase, roles and JWT claims need to be set before executing the actual query. This usually needs to happen more than once per HTTP request, as some database queries need to run under a different role than others.

Bpdbi is a new JVM database library for Postgres. It exposes pipelining, a capability of the Postgres wire protocol that client libraries have officially exposed since Postgres 14 (libpq's pipeline mode).

JDBC cannot do it. Bpdbi can.

 

Where this came from

Bpdbi started as a port of the Vert.x SQL Client — the database layer behind Quarkus and one of the fastest JVM database drivers. Vert.x already speaks the Postgres wire protocol directly (no JDBC) and exposes pipelining.

The problem is that Vert.x is fully reactive (async/non-blocking). Reactive code is not for everyone: it adds significant complexity ¹ ².

Reactive code looks like this:

// Reactive — same logic, now good luck
public Uni<Invoice> generateInvoice(long orderId) {
  return orderRepo.findById(orderId)
      .flatMap(order ->
          Uni.combine().all()
              .unis(
                  customerRepo.findById(order.getCustomerId()),
                  itemRepo.findByOrderId(orderId))
              .asTuple()
              .map(tuple -> invoiceService.build(
                  order, tuple.getItem1(), tuple.getItem2())));
}

The blocking equivalent looks like this:

public Invoice generateInvoice(long orderId) {
  Order order = orderRepo.findById(orderId);
  Customer customer = customerRepo.findById(order.getCustomerId());
  List<Item> items = itemRepo.findByOrderId(orderId);
  return invoiceService.build(order, customer, items);
}

Reactive allows for higher throughput in high-traffic scenarios, but comes at a cost: less readable code, useless stack traces, and a paradigm that infects your code base. With Java 21's virtual threads, blocking I/O became much cheaper — you get thousands of concurrent connections without tying up platform threads.
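To make that concrete, here is a minimal virtual-threads sketch (requires Java 21): a thousand tasks all block at once, something that would exhaust a platform-thread pool, yet runs fine because each blocked virtual thread is parked cheaply.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// 1,000 concurrently blocking tasks on virtual threads. The sleep is a
// stand-in for blocking socket I/O, like waiting on a database response.
public class VirtualThreadsDemo {
    static long run(int tasks) throws Exception {
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                final int n = i;
                futures.add(executor.submit(() -> {
                    Thread.sleep(10); // blocking call, cheap on a virtual thread
                    return n;
                }));
            }
            long sum = 0;
            for (var f : futures) sum += f.get();
            return sum;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(1_000)); // sums 0..999
    }
}
```

This is straight-line, debuggable code — no `flatMap`, no tuples — which is exactly the style Bpdbi is built for.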

So Bpdbi takes the Vert.x SQL Client and strips out all its async/reactive machinery. Where Vert.x SQL uses Netty to connect with the database, Bpdbi uses a good old java.net.Socket.

Bpdbi employs Postgres' binary protocol and pipelines for all db queries, even single queries without parameters. This results in a library with a much smaller footprint, while being very performant (as shown in the benchmarks).

TL;DR: Bpdbi provides blocking, pipelined, small-footprint and performant Postgres access for the JVM.

 

Pipelining in practice

Here's the core idea. Say you need to start a transaction, set some config, and run a query. With JDBC, that's four separate round-trips:

conn.createStatement().execute("BEGIN");
conn.createStatement().execute("SET statement_timeout TO '5s'");
conn.createStatement().execute("SET LOCAL role TO 'authenticated'");
PreparedStatement ps = conn.prepareStatement("SELECT * FROM orders WHERE id = ?");
ps.setInt(1, 42);
ResultSet rs = ps.executeQuery();

With Bpdbi, it's one:

conn.enqueue("BEGIN");
conn.enqueue("SET statement_timeout TO '5s'");
conn.enqueue("SET LOCAL role TO 'authenticated'");
RowSet result = conn.query("SELECT * FROM orders WHERE id = $1", 42);

enqueue() buffers statements locally. query() flushes everything (all the enqueued statements plus itself) in a single TCP write. The Postgres instance processes them all and sends all responses back at once. This feature is called pipelining.
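The enqueue/flush idea itself is simple enough to sketch in a few lines. This is a toy model of the pattern, not Bpdbi's actual implementation (the real thing writes binary protocol messages, not SQL text):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

// Toy pipeline: statements accumulate locally and go out in one write.
class ToyPipeline {
    private final List<String> pending = new ArrayList<>();
    private final OutputStream wire;

    ToyPipeline(OutputStream wire) { this.wire = wire; }

    // Buffer the statement; return its index so the caller can find
    // the matching result after the batch completes.
    int enqueue(String sql) {
        pending.add(sql);
        return pending.size() - 1;
    }

    // One write for the whole batch: a single syscall, a single TCP send.
    void flush() throws IOException {
        StringBuilder batch = new StringBuilder();
        for (String sql : pending) batch.append(sql).append(';');
        wire.write(batch.toString().getBytes());
        pending.clear();
    }
}
```

Usage would look like `int idx = p.enqueue("BEGIN"); p.flush();` — everything between two flushes shares one round-trip.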

When you need all the results

Sometimes you want results from multiple pipelined queries. flush() returns them all:

conn.enqueue("BEGIN");
int aliceIdx = conn.enqueue("INSERT INTO users (name) VALUES ($1) RETURNING id", "Alice");
int bobIdx   = conn.enqueue("INSERT INTO users (name) VALUES ($1) RETURNING id", "Bob");
conn.enqueue("COMMIT");
List<RowSet> results = conn.flush();

long aliceId = results.get(aliceIdx).first().getLong("id");
long bobId   = results.get(bobIdx).first().getLong("id");

Four statements, one round-trip. Each enqueue() returns an index so you know which result is which.

The benchmark numbers

We ran JMH benchmarks using Toxiproxy to add 1ms of network latency per direction (a 2ms round-trip) — roughly what you'd see talking to a database in the same cloud region.

Scenario                              Bpdbi      JDBC (pgjdbc)  Speedup
10 SELECTs (pipelined vs sequential)  310 ops/s   18 ops/s      17x
Transaction (BEGIN+SELECT+COMMIT)     360 ops/s  185 ops/s      ~2x
10 INSERTs in a transaction           116 ops/s   18 ops/s      6.5x
Cursor fetch (1000 rows)              281 ops/s   30 ops/s      9.3x
Bulk insert (100 rows)                313 ops/s  171 ops/s      1.8x
Single row lookup                     370 ops/s  370 ops/s      on par
Multi-row fetch (10 rows)             358 ops/s  358 ops/s      on par

The pattern is clear: anything that touches the network more than once gets a massive speedup, while single-query performance stays on par with JDBC + pgjdbc.

 

Postgres-only, and that's the point

Bpdbi only supports Postgres, the only open source database that truly supports pipelining. This is intentional, and it buys us a lot.

Binary protocol everywhere

The Postgres extended query protocol lets you request results in binary format. An integer comes back as four raw bytes instead of the text string "12345". A UUID is 16 bytes instead of 36 characters. No string allocation, no string parsing.

Most JDBC drivers use the text format for simple queries (the "simple query protocol") and only switch to binary for prepared statements. Bpdbi uses the extended query protocol with binary format for everything — even BEGIN, COMMIT, and SET. This is what makes uniform pipelining possible: every statement uses the same wire protocol, so they can all be batched in a pipeline.
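The size difference is easy to verify. A small sketch of the binary encodings involved (illustrative only — the actual wire traffic adds Postgres message framing on top of these payloads):

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Payload sizes: binary format vs. text format for common Postgres types.
public class BinaryVsText {
    // Binary int4 is always exactly 4 bytes, regardless of the value.
    static byte[] encodeInt4(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    // Binary uuid is always exactly 16 bytes; the text form is 36 chars.
    static byte[] encodeUuid(UUID u) {
        return ByteBuffer.allocate(16)
                .putLong(u.getMostSignificantBits())
                .putLong(u.getLeastSignificantBits())
                .array();
    }

    public static void main(String[] args) {
        System.out.println("int4:  " + encodeInt4(12345).length
                + " bytes binary vs " + String.valueOf(12345).getBytes().length + " bytes text");
        UUID id = UUID.randomUUID();
        System.out.println("uuid: " + encodeUuid(id).length
                + " bytes binary vs " + id.toString().length() + " chars text");
    }
}
```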

Small footprint

Bpdbi's Postgres driver is about 1,400 lines of Java. The whole library is under 200KB and that includes a connection pool.

Compare that to a typical Postgres/Jdbi/HikariCP stack:

Component                    Size
pgjdbc (JDBC driver)         ~1.1 MB
Jdbi (developer experience)  ~1 MB
HikariCP (connection pool)   ~160 KB
Total                        ~2.3 MB

Bpdbi (everything)           < 200 KB

And that's the modest stack. Hibernate is ~15MB, jOOQ is ~15MB.
The Vert.x SQL Client with Netty clocks in at 5MB+.
Bpdbi has no transitive dependencies beyond SCRAM-client (a small cryptography library used for connection authentication).

No Netty, no event loops

Plain java.net.Socket with unsynchronized buffered I/O streams.
No channel pipelines, no allocator frameworks, no thread pools. This works because Bpdbi connections are single-threaded by design — just like JDBC connections. With virtual threads, blocking on a socket is cheap. The simplicity pays off in readability, debugging, and startup time.
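Dropping the lock from a buffered stream is a one-class job. Here's a minimal sketch of the idea (not Bpdbi's actual class): the same buffering as `BufferedOutputStream`, minus the synchronization, safe only because a single thread owns the connection.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Unsynchronized buffered output: like BufferedOutputStream, but no lock
// is taken on write(). Only correct for single-threaded use — which is
// exactly the contract a single-owner database connection gives you.
class UnsyncBufferedOut extends OutputStream {
    private final OutputStream out;
    private final byte[] buf;
    private int count;

    UnsyncBufferedOut(OutputStream out, int size) {
        this.out = out;
        this.buf = new byte[size];
    }

    @Override public void write(int b) throws IOException {
        if (count == buf.length) flushBuffer(); // buffer full: drain first
        buf[count++] = (byte) b;
    }

    private void flushBuffer() throws IOException {
        if (count > 0) {
            out.write(buf, 0, count);
            count = 0;
        }
    }

    @Override public void flush() throws IOException {
        flushBuffer();
        out.flush();
    }
}
```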

GraalVM native-image ready

The core library uses zero reflection. native-image just works, no configuration needed.

 

Developer experience: Jdbi-level, not JDBC-level

Directly using the JDBC API in your application code is low-level and verbose. That's why libraries like Jdbi, Spring's JdbcTemplate and Sql2o exist. They provide named parameters, row mapping, and pluggable data-type binders, row mappers, and JSON mappers.

Bpdbi has all these features built-in: you don't need an additional library. It comes with add-on modules: bpdbi-record-mapper for mapping rows to Java records, and bpdbi-bean-mapper for mapping rows to JavaBeans. The bpdbi-kotlin add-on contains a mapper for mapping rows to Kotlin data classes using kotlinx.serialization which, unlike the other row mappers, does not use reflection.

 

Under the hood

The performance doesn't just come from pipelining. Bpdbi borrows optimization ideas from pgjdbc, Vert.x, and Jdbi, and even adds some of its own:

  • Column-oriented storage. A 100K-row, 10-column result creates 10 byte arrays instead of 1,000,000. Each Row is a lightweight view (buffer reference + row index) with no per-row allocation.
  • Lazy decoding. Rows store raw wire bytes. Columns you never read are never decoded. Your SELECT * that only reads 3 columns out of 20? Only those 3 get decoded.
  • Binary parameter encoding. Sending an int4 as 4 raw bytes instead of the ASCII string "12345" saves wire bandwidth and server CPU.
  • Unsynchronized I/O. Java's BufferedOutputStream acquires a lock on every write. Since Bpdbi connections are single-threaded, the buffered streams skip all synchronization.
  • Prepared statement cache. An LRU cache avoids re-parsing the same SQL. Oversized queries that would flush the cache are rejected outright.
  • Deadlock prevention. Large pipelined batches can deadlock if both TCP buffers fill up. Bpdbi estimates response sizes and inserts mid-pipeline syncs to prevent this — transparently.
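The column-oriented layout from the first bullet can be sketched in a few lines. This is a toy model of the idea, not Bpdbi's actual classes — the real thing stores raw wire bytes per column and decodes lazily on first access:

```java
// Column-oriented result storage: one array per column, and each Row is
// just a (table, index) view — no per-row object graph is allocated.
class ColumnTable {
    final long[] ids;       // column 0
    final String[] names;   // column 1

    ColumnTable(long[] ids, String[] names) {
        this.ids = ids;
        this.names = names;
    }

    Row row(int i) { return new Row(this, i); }
}

// A lightweight view: two references' worth of state, nothing copied.
record Row(ColumnTable table, int index) {
    long id()     { return table.ids[index]; }
    String name() { return table.names[index]; }
}
```

A 100K-row result with this layout allocates one array per column; reading `row(57_000).name()` indexes straight into the column array.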

 

Who is Bpdbi for?

Requirements for using Bpdbi:

  • Use Postgres (it's the only database that has pipelines).
  • Use Java 21+ (virtual threads make blocking I/O practical at scale).

Bpdbi makes a lot of sense if you:

  • Want to use Postgres' pipelining.
  • Use RLS (most Supabase users) or otherwise make a lot of "prefix" queries.
  • Prefer simple, blocking code over reactive/async/non-blocking code.
  • Care about small dependencies and fast startup (GraalVM, serverless, CLI tools).
  • Want to write SQL by hand (not ORM).

If you're already happy with Hibernate or jOOQ and their compile-time SQL validation, Bpdbi is probably not what you need.

 

Web application libraries/frameworks that work well with Bpdbi

Bpdbi uses blocking I/O and is designed for virtual threads.
It pairs well with HTTP frameworks that aren't reactive-only and don't mandate JDBC:

  • http4k — Functional, zero-reflection, tiny. The philosophical twin of Bpdbi on the HTTP side.
  • Javalin — Minimal Jetty wrapper with built-in virtual thread support. Very popular in both Java and Kotlin.
  • Helidon SE 4+ — Oracle's lightweight framework. Versions 1–3 were reactive (Reactive Streams); 4.x was rewritten around virtual threads and blocking I/O.
  • Undertow — Embedded, low-level. Blocking handlers run on a worker thread pool (or virtual threads).
  • Micronaut — Compile-time DI, GraalVM-first. Supports both reactive and imperative, controller methods can simply return values.
  • Spark — Dead-simple Java micro-framework with the same "just enough" philosophy.
  • Jooby — Modular micro-framework, explicit about dependencies, virtual thread support.
  • com.sun.net.httpserver — The JDK's built-in HTTP server. Zero dependencies, pairs naturally with Bpdbi's minimalism.

Frameworks like Spring Boot are opinionated about their own data stacks (Spring Data, Hibernate) and assume a JDBC DataSource integration for transactions, health checks, and connection management.

 

Give it a spin!

Everything you need to get started can be found on Bpdbi's GitHub repository. If you miss something, raise an issue to let us know.
