Nuno Silva

Posted on Feb 20 • Originally published at edge-case.hashnode.dev

Precision Data Access in Spring Data JPA: A Guide to Projections

#database #java #springboot #performance

As an application matures, its domain model inevitably grows heavier. What started as a simple Order entity evolves into a dense, interconnected graph of LineItem, CustomerProfile, PaymentHistory, and ShippingManifest objects. That complexity is necessary — your core business logic genuinely needs it. But it creates a hidden tax on every read operation in your system.

The problem isn't the entity itself. The problem is using the same heavy entity fetch for every use case, regardless of what the caller actually needs.

Consider an Order entity with a dozen relationships. An invoice generation process needs all of it: the full entity graph, all lazy-loaded associations, the complete picture. A status monitoring job sitting next to it needs two fields: orderId and status. If both use findById() or findAll(), the monitoring job does exactly the same work as the invoice process: hydrating a full entity, triggering Hibernate's dirty-tracking machinery, and risking N+1 fetches on eagerly mapped relationships it never touches.

Spring Data JPA Projections solve this directly. They let you define exactly what data a caller needs and have the repository return precisely that — nothing more. This guide covers the projection types available in Spring Data JPA, when each is the right fit, and where each one breaks down.

The Problem Projections Solve

Before looking at the solutions, it's worth being precise about what's actually expensive when you fetch a full entity unnecessarily.

When Hibernate loads a managed entity, it does more than execute a SELECT. It:

Registers the entity in the first-level cache, holding a reference for the duration of the Session
Takes a state snapshot for dirty checking, so it can detect changes and generate targeted UPDATEs on flush
Initialises proxy objects for every lazy relationship declared on the entity, even ones the caller will never touch

That machinery is essential when you're going to modify the entity and persist changes. When you're only reading two fields and discarding the result, you're paying for infrastructure you don't use.
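To make the cost of that machinery concrete, here is a deliberately simplified model of a persistence context in plain Java. It is not Hibernate's implementation; ToyOrder and ToyPersistenceContext are hypothetical names, and the "snapshot" is a single field rather than a full state array. It only shows the two things a managed load adds on top of the SELECT: a cached reference and a copy of the loaded state that is compared again on flush.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical mutable entity, standing in for a JPA-managed Order.
class ToyOrder {
    final long id;
    String status;

    ToyOrder(long id, String status) {
        this.id = id;
        this.status = status;
    }
}

// Toy persistence context: illustrates the idea only, not Hibernate's internals.
class ToyPersistenceContext {
    // First-level cache: every loaded entity stays referenced until the context is closed.
    private final Map<Long, ToyOrder> firstLevelCache = new HashMap<>();
    // Snapshot of the state at load time, kept for dirty checking.
    private final Map<Long, String> loadedStatus = new HashMap<>();

    ToyOrder load(long id, String statusFromDb) {
        ToyOrder order = new ToyOrder(id, statusFromDb);
        firstLevelCache.put(id, order);
        loadedStatus.put(id, statusFromDb);
        return order;
    }

    // On flush, every managed entity is compared against its snapshot,
    // including the ones the caller only ever read.
    List<Long> flush() {
        List<Long> dirtyIds = new ArrayList<>();
        for (ToyOrder order : firstLevelCache.values()) {
            if (!order.status.equals(loadedStatus.get(order.id))) {
                dirtyIds.add(order.id);
            }
        }
        return dirtyIds;
    }

    int managedCount() {
        return firstLevelCache.size();
    }
}

class LifecycleCostDemo {
    public static void main(String[] args) {
        ToyPersistenceContext ctx = new ToyPersistenceContext();
        ctx.load(1L, "NEW");
        ToyOrder second = ctx.load(2L, "NEW");
        second.status = "SHIPPED";
        System.out.println(ctx.flush()); // [2]
    }
}
```

A read-only caller pays for both maps and for the comparison loop without ever producing a dirty entity, which is the overhead a projection avoids.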

There's also the SQL itself. A standard findAll() on a complex entity selects every mapped column. The difference between a full entity fetch and a projection is not just what arrives in your JVM — it's what travels across the wire on every single row:

// Full entity fetch
DB ──► id, status, created_at, updated_at,
       customer_id, billing_addr, shipping_addr,
       currency, discount, tax_rate, subtotal,
       total, notes, internal_ref, ...           ──► JVM
       (28 columns the caller will never read)

// Projection fetch (OrderStatusSummary)
DB ──► id, status                                ──► JVM

Projections fix both problems. They scope the SQL to the columns you actually need, and because the result isn't a managed entity, Hibernate skips the lifecycle overhead entirely.

The Version Decision: One Rule

Before covering the projection types, here's the rule stated plainly so you can skip to what applies to your stack:

Java 16+: Use Records. They're stable, concise, and compiler-enforced.
Java 11 or below: Use class-based DTOs, with Lombok's @Value if it's on your classpath.
Java 14–15: Records exist behind --enable-preview, but preview features carry compatibility risk. Treat your stack as Java 11 for production purposes.

Both approaches use the same JPQL constructor expression syntax and generate identical SQL. The difference is purely in how much boilerplate you write to define the projection type.

Projection Types at a Glance

Type	Java Version	Boilerplate	Type Safety	Best For
Interface projection	Any	None	Compile-time	Simple root-entity field subsets
Class-based DTO	Java 8+	High (or Lombok)	Runtime (JPQL string)	Multi-table joins on Java 11 or below
Record projection	Java 16+ (stable)	None	Runtime (JPQL string)	Multi-table joins on Java 16+
Dynamic projection	Any	None	Runtime	Consolidating multiple fetch shapes into one repository method

1. Interface Projections

The simplest form of projection is an interface that declares getter methods for the fields you want. Spring Data JPA generates a proxy at runtime that maps the query result to your interface.

Suppose your status monitoring job only needs the order ID and its current status:

public interface OrderStatusSummary {
    Long getId();
    String getStatus();
}

You use it directly as a return type in your repository:

public interface OrderRepository extends JpaRepository<Order, Long> {

    List<OrderStatusSummary> findByStatus(String status);
}

Spring inspects the interface at startup, derives the required fields from the getter names, and generates a SQL query scoped to those columns:

-- What Spring actually generates — not SELECT *
SELECT o.id, o.status
FROM orders o
WHERE o.status = ?

No joins fired speculatively. No unneeded columns transferred. No entity lifecycle initialised. For a monitoring job running against a table with millions of rows, the difference in data transferred and query execution time is measurable.

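Note: The SQL Spring generates is derived directly from your getter names, so they must match the entity's mapped property names. How a mismatch surfaces depends on how the query is defined. With a derived query method, an unknown property typically fails when the repository is created. With a custom @Query, a getter that doesn't match a selected alias will typically return null silently rather than throw. Any time a projection returns unexpected nulls, enable SQL logging and verify the generated query; the mismatch is usually obvious from the column list.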
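The proxy idea, and the way a name mismatch can turn into a silent null, can be modelled with the JDK's own dynamic proxies. This is a rough sketch under stated assumptions, not Spring's actual code: ToyProjectionFactory is a hypothetical helper, and the row is modelled as a Map keyed by property name.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

// Same shape as the projection interface above.
interface OrderStatusSummary {
    Long getId();
    String getStatus();
}

// Hypothetical helper: maps a result row onto a projection interface.
class ToyProjectionFactory {

    @SuppressWarnings("unchecked")
    static <T> T project(Class<T> projectionType, Map<String, Object> row) {
        InvocationHandler handler = (proxy, method, args) -> {
            String name = method.getName();
            if (name.startsWith("get") && name.length() > 3 && method.getParameterCount() == 0) {
                // getStatus -> status
                String property = Character.toLowerCase(name.charAt(3)) + name.substring(4);
                // A key that is not in the row yields null rather than an exception.
                return row.get(property);
            }
            // toString(), equals() and hashCode() are not handled in this sketch.
            throw new UnsupportedOperationException(name);
        };
        return (T) Proxy.newProxyInstance(
                projectionType.getClassLoader(), new Class<?>[] {projectionType}, handler);
    }
}

class ProxyProjectionDemo {
    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("id", 42L);
        row.put("state", "SHIPPED"); // key does not match getStatus()

        OrderStatusSummary summary = ToyProjectionFactory.project(OrderStatusSummary.class, row);
        System.out.println(summary.getId());     // 42
        System.out.println(summary.getStatus()); // null
    }
}
```

Nothing in the call site hints that getStatus() is unmapped, which is why checking the generated SQL is the quickest diagnosis.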

Where Interface Projections Break Down

Interface projections work cleanly when the fields you need map directly to columns on the root entity. They get dangerous when you need data from related entities.

You can traverse relationships using nested interfaces:

public interface OrderStatusSummary {
    Long getId();
    String getStatus();
    CustomerSummary getCustomer();  // nested projection

    interface CustomerSummary {
        String getName();
    }
}

This looks clean, but it hides a serious trap. When you back this with a derived query method like findByStatus(), Spring Data is not guaranteed to fetch the nested data in a single scoped join. Depending on your Spring Data JPA and Hibernate versions, it can fetch the root rows and then issue a separate SELECT for each row's related entity to populate the nested proxy, which is the exact N+1 problem this approach was supposed to avoid.

If you need a nested interface projection, you must back it with an explicit @EntityGraph or a @Query with a JOIN. The derived query and the entity graph version are not equivalent:

// ❌ Generates N+1 silently.
// Spring fetches orders, then fires one SELECT per row to load the customer.
public interface OrderRepository extends JpaRepository<Order, Long> {
    List<OrderStatusSummary> findByStatus(String status);
}

// ✅ Forces a join — one query, no surprises.
public interface OrderRepository extends JpaRepository<Order, Long> {

    @EntityGraph(attributePaths = {"customer"})
    List<OrderStatusSummary> findByStatus(String status);
}

Enable SQL logging (spring.jpa.show-sql=true) and verify the generated output any time you introduce a nested interface projection. If you see repeated identical SELECTs, the join isn't being applied.
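Spotting repeated SELECTs by eye gets tedious in a long log, so a small helper can do the counting. The sketch below is plain Java and makes two assumptions: the statements have already been captured as strings (how you capture them is up to your logging setup), and bind parameters appear as ? so that repeated statements are textually identical. RepeatedSelectDetector is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RepeatedSelectDetector {

    // Returns each statement that appears at least `threshold` times, with its count.
    static Map<String, Integer> findRepeats(List<String> loggedSql, int threshold) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : loggedSql) {
            // Normalise whitespace so formatting differences do not hide repeats.
            String normalised = line.trim().replaceAll("\\s+", " ");
            counts.merge(normalised, 1, Integer::sum);
        }
        // Keep only the statements that were repeated often enough.
        counts.values().removeIf(count -> count < threshold);
        return counts;
    }

    public static void main(String[] args) {
        List<String> log = List.of(
                "select o.id, o.status, o.customer_id from orders o where o.status=?",
                "select c.id, c.name from customer c where c.id=?",
                "select c.id,  c.name from customer c where c.id=?",
                "select c.id, c.name from customer c where c.id=?");
        // One entry: the customer lookup, seen 3 times.
        System.out.println(findRepeats(log, 2));
    }
}
```

A per-row lookup that shows up with a count close to the number of root rows is the N+1 signature.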

The Open Projection Trap

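Interface projections support SpEL expressions via Spring's @Value annotation (org.springframework.beans.factory.annotation.Value, not the Lombok annotation of the same name used later in this guide), which lets you compute derived fields from the entity: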

public interface OrderStatusSummary {
    @Value("#{target.customer.firstName + ' ' + target.customer.lastName}")
    String getCustomerFullName();
}

This looks convenient but largely defeats the purpose of using a projection. Spring cannot tell which properties a SpEL expression will touch, so it cannot narrow the SELECT: it loads the full root entity as a managed object and evaluates the expression against it. Any lazy association the expression traverses (here, customer) is then initialised on access, row by row. You get none of the column scoping or lifecycle overhead savings that make projections valuable.

If you need a computed field, derive it in SQL instead using a dedicated DTO or Record with an explicit constructor expression:

// A dedicated projection for this specific read shape
public record OrderCustomerSummary(Long id, String customerFullName) {}

@Query("""
    SELECT new com.yourapp.dto.OrderCustomerSummary(
        o.id,
        CONCAT(c.firstName, ' ', c.lastName)
    )
    FROM Order o
    JOIN o.customer c
    WHERE o.status = :status
""")
List<OrderCustomerSummary> findWithCustomerName(@Param("status") String status);

The database computes the concatenation, only the two resulting values cross the wire, and there's no entity graph loaded anywhere.

There's also a subtler issue: interface projections are backed by a dynamic proxy, which means every field access goes through a method dispatch rather than a direct field read. For most use cases this cost is negligible. For a batch job processing millions of rows in a tight loop, it's worth being aware of.

2. Class-Based DTO Projections

Before Java Records existed, the standard approach was a plain class with a constructor matching the fields you wanted to project. This is the right choice for applications still on Java 8 or 11, which includes many Spring Boot 2.7.x codebases, and it remains fully supported across all modern Spring Boot releases if Records aren't an option.

The pattern relies on JPQL constructor expressions. You write a regular class with a matching constructor, and JPQL maps the query result directly into it:

public class OrderStatusSummary {

    private final Long id;
    private final String status;

    // Parameter order and types must match the arguments of the JPQL constructor expression exactly
    public OrderStatusSummary(Long id, String status) {
        this.id = id;
        this.status = status;
    }

    public Long getId() { return id; }
    public String getStatus() { return status; }
}

The repository uses a @Query with a constructor expression:

public interface OrderRepository extends JpaRepository<Order, Long> {

    @Query("SELECT new com.yourapp.dto.OrderStatusSummary(o.id, o.status) FROM Order o WHERE o.status = :status")
    List<OrderStatusSummary> findByStatus(@Param("status") String status);
}

The generated SQL is scoped to the columns you declare, with no entity lifecycle overhead:

SELECT o.id, o.status
FROM orders o
WHERE o.status = ?

The result is a plain Java object with no Hibernate proxy, no dirty tracking, and no connection to the Session. It behaves identically to a Record projection in terms of what Hibernate does — the only difference is the boilerplate you write to define it.

The Boilerplate Problem at Scale

The weakness of class-based DTOs becomes apparent when your domain has many different projection shapes. Each one requires a separate class with a constructor, getters, and — if you need equality or debugging — equals(), hashCode(), and toString(). On Java 11 and older, that's a meaningful amount of code to maintain.
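For a sense of scale, this is roughly what one two-field projection looks like once equality and debugging support are written by hand. It is a Java 8 compatible sketch; in a real project the class would be public and live in its own file.

```java
import java.util.Objects;

final class OrderStatusSummary {

    private final Long id;
    private final String status;

    public OrderStatusSummary(Long id, String status) {
        this.id = id;
        this.status = status;
    }

    public Long getId() { return id; }
    public String getStatus() { return status; }

    // Value-based equality: two summaries are equal when all fields match.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof OrderStatusSummary)) return false;
        OrderStatusSummary that = (OrderStatusSummary) other;
        return Objects.equals(id, that.id) && Objects.equals(status, that.status);
    }

    // Must stay consistent with equals(): same fields, same order.
    @Override
    public int hashCode() {
        return Objects.hash(id, status);
    }

    @Override
    public String toString() {
        return "OrderStatusSummary[id=" + id + ", status=" + status + "]";
    }
}
```

Every field added to the projection has to be threaded through the constructor, a getter, equals(), hashCode(), and toString(), and forgetting one of them compiles without complaint.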

The common mitigation before Records was Lombok:

@Value  // generates constructor, getters, equals, hashCode, toString — immutable by default
public class OrderStatusSummary {
    Long id;
    String status;
}

@Value gives you a functionally immutable class with zero hand-written boilerplate, and it works on Java 8+. If your team is already using Lombok and you're not yet on Java 16, this is the practical equivalent of a Record projection.

One important caveat: class-based DTOs require a fully-qualified class name in the JPQL constructor expression. If you rename or move the class, the @Query annotation won't fail at compile time. It fails at runtime when the JPQL is parsed, which for a declared @Query usually means application startup. This is a known fragility of the constructor expression approach, and it applies equally to Records.
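One way to catch a stale class name earlier is a small guard that a unit test can run against each query string. The following is a minimal sketch, not a Spring feature: ConstructorExpressionGuard is a hypothetical helper, it only checks that the class named after `new` can be loaded, and it does not validate constructor arguments.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-in DTO; in a real project this lives in its own package.
class OrderStatusSummary {
    final Long id;
    final String status;

    OrderStatusSummary(Long id, String status) {
        this.id = id;
        this.status = status;
    }
}

class ConstructorExpressionGuard {

    // Captures the type name in "... new some.pkg.Type(" (keyword is case-insensitive).
    private static final Pattern NEW_TYPE =
            Pattern.compile("\\bnew\\s+([\\w.$]+)\\s*\\(", Pattern.CASE_INSENSITIVE);

    static boolean targetClassExists(String jpql) {
        Matcher matcher = NEW_TYPE.matcher(jpql);
        if (!matcher.find()) {
            return false; // no constructor expression in this query
        }
        try {
            Class.forName(matcher.group(1));
            return true;
        } catch (ClassNotFoundException e) {
            return false; // the class was renamed, moved, or mistyped
        }
    }

    public static void main(String[] args) {
        String current = "SELECT new " + OrderStatusSummary.class.getName()
                + "(o.id, o.status) FROM Order o";
        // com.yourapp.olddto is a made-up package that does not exist.
        String stale = "SELECT new com.yourapp.olddto.OrderStatusSummary(o.id, o.status) FROM Order o";

        System.out.println(targetClassExists(current)); // true
        System.out.println(targetClassExists(stale));   // false
    }
}
```

In a real test you would probably read the JPQL from the repository method's annotation via reflection rather than hard-coding it; that wiring is left out here.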

3. Java Record Projections

If you're on Java 11 or below, skip this section — the class-based DTO approach above is the direct equivalent.

Record projections are the modern replacement for class-based DTOs. The projection shape is declared as a Record, which gives you an immutable data carrier with a canonical constructor, equals(), hashCode(), and toString() generated by the compiler — no Lombok required:

public record OrderStatusSummary(Long id, String status) {}

The repository usage is identical to the class-based approach:

public interface OrderRepository extends JpaRepository<Order, Long> {

    @Query("SELECT new com.yourapp.dto.OrderStatusSummary(o.id, o.status) FROM Order o WHERE o.status = :status")
    List<OrderStatusSummary> findByStatus(@Param("status") String status);
}

The generated SQL and Hibernate behaviour are the same. The advantage is purely in the declaration: a one-line Record replaces a full DTO class, and the compiler enforces immutability rather than relying on convention.

Records also compose cleanly with Java Streams. Because a Record is a transparent data carrier with value-based equality, you can group, deduplicate, and compare projection results without implementing equals() yourself — something class-based DTOs require explicit attention to get right.
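As a small illustration of that point, the sketch below deduplicates and groups projection results using nothing but the Record's generated equals() and hashCode(). The rows are built by hand here; in an application they would come from the repository method. RecordStreamDemo is a hypothetical name.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

record OrderStatusSummary(Long id, String status) {}

class RecordStreamDemo {

    // distinct() relies on the Record's generated equals() and hashCode().
    static List<OrderStatusSummary> deduplicate(List<OrderStatusSummary> rows) {
        return rows.stream().distinct().collect(Collectors.toList());
    }

    // Counts rows per status using the Record's accessor as the grouping key.
    static Map<String, Long> countByStatus(List<OrderStatusSummary> rows) {
        return rows.stream()
                .collect(Collectors.groupingBy(OrderStatusSummary::status, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<OrderStatusSummary> rows = List.of(
                new OrderStatusSummary(1L, "NEW"),
                new OrderStatusSummary(2L, "SHIPPED"),
                new OrderStatusSummary(1L, "NEW"), // duplicate of the first row
                new OrderStatusSummary(3L, "NEW"));

        List<OrderStatusSummary> unique = deduplicate(rows);
        System.out.println(unique.size());         // 3
        System.out.println(countByStatus(unique)); // NEW -> 2, SHIPPED -> 1 (map order may vary)
    }
}
```

With a hand-written DTO, the same distinct() call would silently keep the duplicate unless equals() and hashCode() had been implemented.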

4. Dynamic Projections

If you have a single entity accessed by many different callers, each needing a different slice of data, you end up with either a proliferation of repository methods or the temptation to return the full entity everywhere and let each caller ignore what it doesn't need. Dynamic Projections offer a third option: one repository method that accepts the desired return type as a parameter.

public interface OrderRepository extends JpaRepository<Order, Long> {

    <T> Optional<T> findById(Long id, Class<T> type);
}

Each caller passes the projection type it needs:

// Invoice process: needs the full managed entity
Order fullOrder = orderRepository.findById(orderId, Order.class).orElseThrow();

// Status monitor: needs only the lightweight summary
OrderStatusSummary summary = orderRepository.findById(orderId, OrderStatusSummary.class).orElseThrow();

Spring inspects the Class<T> argument at runtime and generates the appropriate query — full entity fetch for Order.class, scoped column fetch for a projection interface or Record.

Where Dynamic Projections Break Down

The tradeoff is type safety. Because the return type is generic, the compiler cannot verify at build time that a given Class<T> argument is a valid projection for this entity. Passing an incompatible type compiles fine and fails at runtime. In a large codebase with many callers, that's a meaningful operational risk.
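The failure mode is easy to reproduce in miniature. The toy repository below is not how Spring Data resolves projections; it is a hypothetical in-memory stand-in with the same method shape, included only to show that a call with an unrelated type passes the compiler and fails when it runs. OrderRow, UnrelatedReport, and ToyDynamicRepository are made-up names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

record OrderRow(Long id, String status, String notes) {}
record OrderStatusSummary(Long id, String status) {}
record UnrelatedReport(String title) {}

// Toy stand-in for a repository with a dynamic projection method.
class ToyDynamicRepository {

    private final Map<Long, OrderRow> table = Map.of(1L, new OrderRow(1L, "NEW", "gift wrap"));

    // Known projection shapes and how to build each one from a row.
    private final Map<Class<?>, Function<OrderRow, Object>> mappers = new HashMap<>();

    ToyDynamicRepository() {
        mappers.put(OrderRow.class, row -> row);
        mappers.put(OrderStatusSummary.class,
                row -> new OrderStatusSummary(row.id(), row.status()));
    }

    <T> Optional<T> findById(Long id, Class<T> type) {
        Function<OrderRow, Object> mapper = mappers.get(type);
        if (mapper == null) {
            // The compiler accepted the call; the mismatch only shows up here.
            throw new IllegalArgumentException("No projection mapping for " + type.getName());
        }
        return Optional.ofNullable(table.get(id)).map(mapper).map(type::cast);
    }
}

class DynamicProjectionDemo {
    public static void main(String[] args) {
        ToyDynamicRepository repository = new ToyDynamicRepository();

        OrderStatusSummary summary =
                repository.findById(1L, OrderStatusSummary.class).orElseThrow();
        System.out.println(summary.status()); // NEW

        // Compiles without complaint, throws IllegalArgumentException at runtime.
        repository.findById(1L, UnrelatedReport.class);
    }
}
```

The generic signature accepts any Class<T>, so nothing short of a test that exercises each caller will surface the bad argument before production does.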

Dynamic Projections are a reasonable fit when you have a small, stable set of well-known projection types and the convenience of a single method is genuinely valuable. When the set of projection types is large or evolving, explicit repository methods with named return types are safer — the compiler enforces correctness, and the method signatures serve as documentation.

What Projections Are Not For

Projections are a read-only tool. They give you a scoped view of data for retrieval; they have no path back to the persistence context for writes.

If your caller needs to load an entity, modify it, and save it, use a standard entity fetch — that's exactly what Hibernate's dirty checking and transaction management are built for. The overhead that projections eliminate is only overhead when you're not using it. For writes, you need the full entity lifecycle.

A useful mental model: use projections for the same category of operations where you'd otherwise reach for a native SQL query returning a DTO. When you only need data, with no state changes and no lifecycle, projections let you stay in the JPA abstraction while still being precise about what you ask the database for.

Summary

Projections don't replace entities — they complement them by giving you a precise, read-only view of your data without loading what you don't need.

Use interface projections when the fields you need map directly to the root entity and you want minimal boilerplate. Always back nested interface projections with @EntityGraph or an explicit @Query join — derived queries will silently generate N+1.
Never use SpEL @Value expressions in interface projections. They force a full entity load and eliminate every performance benefit projections provide. Push computed fields into SQL instead.
Use class-based DTO projections (with Lombok's @Value if available) on Java 11 or below. This is the workhorse for Spring Boot 2.7.x applications that have not moved to Java 17.
Use Record projections on Java 16+. Same SQL, same behaviour — less boilerplate, compiler-enforced immutability.
Use dynamic projections when consolidating multiple fetch patterns behind a single repository method, with the understanding that type safety is enforced at runtime, not compile time.
Don't use projections for writes. Any operation that modifies state and persists it should use the full managed entity.

Apply This Today

Open your APM tool — Datadog, New Relic, or whatever you're running in production — and filter for your highest-frequency read queries. Alternatively, turn on spring.jpa.show-sql=true locally and hit your most heavily used GET endpoints.

For each one, ask: is the repository method returning a full entity, and is the caller actually using all of it? If the answer is no, you have a projection candidate. Pick the heaviest offender, replace the entity return type with a scoped interface or DTO projection, and measure the query execution time and memory allocation before and after. The delta is usually immediate and significant.
