Thellu

Posted on Feb 2

`getReferenceById()` Isn’t “No-DB”: What It Really Does in JPA (and When to Use It)

#webdev #beginners #programming #java

If you’ve used Spring Data JPA long enough, you’ve probably seen this advice:

“Use getReferenceById() to avoid a database query.”

It’s not wrong — but it’s also not the full story.

getReferenceById() (backed by EntityManager.getReference()) usually returns a lazy proxy. That proxy can avoid an immediate SELECT, but it can also surprise you later with extra queries, unexpected exceptions, or weird JSON output.

This post explains what it actually does, when it hits the DB, and the patterns for using it safely.

What `getReferenceById()` actually returns

When you call:

User user = userRepository.getReferenceById(id);

you usually get a proxy object (Hibernate proxy) that:

knows the entity type (User)
remembers the id
does not necessarily load the row immediately
will load the entity later, if/when you touch a non-id property

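Think of it as a placeholder: “I’ll assume there’s a User with this id, and I’ll fetch it only if you ask for details.” The “usually” matters: if that User is already loaded in the current persistence context, you get the managed instance back instead of a proxy.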
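To make the proxy idea concrete, here is a minimal plain-Java model of a lazy reference. It is a sketch, not Hibernate's actual proxy implementation: `LazyUserRef`, `nameLoader`, and `loadCount()` are hypothetical names invented for illustration, and the loader function stands in for the SELECT.

```java
import java.util.function.LongFunction;

// Plain-Java model of a lazy reference. NOT Hibernate's real proxy class;
// all names here are hypothetical and exist only for illustration.
final class LazyUserRef {
    private final long id;
    private final LongFunction<String> nameLoader; // stands in for the SELECT
    private String name;         // cached after the first load
    private boolean initialized; // false until a non-id property is read
    private int loadCount;       // how many times the "database" was hit

    LazyUserRef(long id, LongFunction<String> nameLoader) {
        this.id = id;
        this.nameLoader = nameLoader;
    }

    // The id is known up front, so reading it never triggers a load.
    long getId() {
        return id;
    }

    // The first call loads and caches the value; later calls reuse it.
    String getName() {
        if (!initialized) {
            name = nameLoader.apply(id);
            initialized = true;
            loadCount++;
        }
        return name;
    }

    int loadCount() {
        return loadCount;
    }
}
```

Reading `getId()` leaves `loadCount()` at 0. The first `getName()` raises it to 1, and further calls reuse the cached value.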

When it doesn’t query the database

In many write-only use cases, you only need a reference to set a relationship.

Example: create a Payment pointing to an existing User, without loading the User.

@Transactional
public void createPayment(Long userId, BigDecimal amount) {
  User userRef = userRepository.getReferenceById(userId);

  Payment p = new Payment();
  p.setUser(userRef);
  p.setAmount(amount);

  paymentRepository.save(p); // often no SELECT for User
}

If you never access userRef.getName() / userRef.getEmail() etc., Hibernate can persist Payment using the FK (user_id) without loading the User.

✅ This is the best use case for getReferenceById().

When it will query the database

1) Accessing any non-id field

User userRef = repo.getReferenceById(id);

// Triggers SELECT (lazy initialization)
String name = userRef.getName();

Even calling toString() can trigger a load if your toString() prints fields.

2) Serializing entity proxies (e.g., returning entities from controllers)

If you return JPA entities directly from a REST controller, your JSON serializer (Jackson) may try to read fields → which triggers lazy loading → which triggers queries.

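That can lead to:

N+1 queries
LazyInitializationException (if the session is already closed, e.g. with spring.jpa.open-in-view=false)
Hibernate proxy internals such as hibernateLazyInitializer leaking into the JSON, or a serialization error

Rule: Do not return entities directly. Return DTOs.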

The “surprise exception” people don’t expect

Entity not found… but only later

Unlike findById(), which tells you immediately whether the row exists, a reference can fail only when it is first initialized.

User userRef = repo.getReferenceById(999L); // no immediate SELECT

// Later...
userRef.getName(); // may throw EntityNotFoundException at this moment

So your code can look fine during the write, then fail much later, far from the call that created the reference, when something finally touches the proxy.
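The difference in timing can be sketched without JPA. In this plain-Java model, a `Map` plays the role of the table, `find` behaves like findById(), and `reference` behaves like getReferenceById(). All names are hypothetical, and `NoSuchElementException` merely stands in for EntityNotFoundException.

```java
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Optional;

// Plain-Java model of "check now" vs "check on first use".
// Hypothetical names; the Map stands in for the users table.
final class DeferredLookup {
    private final Map<Long, String> table;

    DeferredLookup(Map<Long, String> table) {
        this.table = table;
    }

    // Like findById(): looks the row up immediately.
    Optional<String> find(long id) {
        return Optional.ofNullable(table.get(id));
    }

    // Like getReferenceById(): hands back a handle without looking anything up.
    Ref reference(long id) {
        return new Ref(id);
    }

    final class Ref {
        private final long id;

        Ref(long id) {
            this.id = id;
        }

        long getId() {
            return id; // never fails, even for a missing row
        }

        // The existence check happens here, on first real use.
        String getName() {
            String name = table.get(id);
            if (name == null) {
                throw new NoSuchElementException("No row with id " + id);
            }
            return name;
        }
    }
}
```

With a table that only contains id 1, `find(999L)` returns an empty Optional right away, while `reference(999L)` succeeds and only `getName()` on the result throws.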

`getReferenceById()` vs `findById()` (practical difference)

`findById(id)`

Loads the entity immediately (a SELECT, unless it is already in the persistence context)
Returns Optional<T>
You know right away whether it exists
Best for read paths, validation, business decisions

`getReferenceById(id)`

Returns a proxy reference (usually no immediate SELECT)
Existence is not verified immediately
Best for write paths when you only need an FK reference

Safe patterns you can copy

✅ Pattern A: “FK-only write path”

Use a reference when you only need the FK and don’t need to read anything from the parent.

@Transactional
public void addItem(Long orderId, String sku) {
  Order orderRef = orderRepository.getReferenceById(orderId);

  OrderItem item = new OrderItem();
  item.setOrder(orderRef);
  item.setSku(sku);

  orderItemRepository.save(item);
}

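If the orderId doesn’t exist, nothing fails at this point. Provided the column has an FK constraint, the database rejects the INSERT at flush/commit, which Spring typically surfaces as a DataIntegrityViolationException.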

✅ Pattern B: “Validate first, then reference”

If you need a friendly error message, validate existence first.

@Transactional
public void addItem(Long orderId, String sku) {
  if (!orderRepository.existsById(orderId)) {
    throw new IllegalArgumentException("Order not found: " + orderId);
  }
  Order orderRef = orderRepository.getReferenceById(orderId);

  OrderItem item = new OrderItem();
  item.setOrder(orderRef);
  item.setSku(sku);

  orderItemRepository.save(item);
}

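(Yes, that’s an extra query, but it’s explicit and predictable. The row could still be deleted between the check and the insert, so keep the FK constraint as the final safeguard.)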

✅ Pattern C: map to DTOs (avoid proxy serialization traps)

Never return entities directly:

@GetMapping("/users/{id}")
public UserDto getUser(@PathVariable Long id) {
  User user = userRepository.findById(id).orElseThrow();
  return new UserDto(user.getId(), user.getName());
}
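The post doesn't define UserDto. One possible shape, assuming Java 16 or later, is a record. The field names here are an assumption based on the constructor call in Pattern C.

```java
// Hypothetical DTO for Pattern C: a plain value carrier with no JPA
// annotations and no lazy associations, so serializing it cannot
// trigger a query or expose proxy internals.
record UserDto(Long id, String name) {}
```

Because a record holds only the values copied into it, everything Jackson reads is already in memory by the time the controller returns.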

Common foot-guns (learn once, avoid forever)

❌ 1) Calling `getReferenceById()` and then logging fields

log.info("User name={}", userRef.getName()); // triggers SELECT

❌ 2) Using Lombok `@Data` on entities with lazy relations

Generated toString(), equals(), hashCode() can touch lazy fields and cause loading.

Tip: For entities, prefer:

@Getter/@Setter
carefully controlled toString()
equals()/hashCode() based on an immutable business key, or on the id with care (the id is null until the entity is persisted)
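As a sketch of what hand-written methods could look like, here is an entity-shaped class with id-based equals(), a constant hashCode(), and a toString() that prints only the id. JPA annotations and associations are left out so the example compiles on its own. Treat it as one common approach under those assumptions, not the only valid one.

```java
// Entity-shaped class; @Entity/@Id and lazy associations are omitted so the
// sketch compiles without dependencies. Field names are illustrative.
class Order {
    private Long id;          // null until the row is inserted
    private String reference;

    Long getId() {
        return id;
    }

    void setId(Long id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Order)) return false;
        Order other = (Order) o;
        // Use the getter on the other side: on a proxy, the raw field may be unset.
        // Two instances without an id are never considered equal.
        return id != null && id.equals(other.getId());
    }

    @Override
    public int hashCode() {
        // Constant, so the hash does not change when the id is assigned on persist.
        return Order.class.hashCode();
    }

    @Override
    public String toString() {
        // Only the id: no other fields, no associations.
        return "Order{id=" + id + "}";
    }
}
```

The constant hash sends every instance to the same bucket, which is the price of keeping an entity findable in a HashSet before and after it receives its id.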

❌ 3) Using reference outside a transaction and accidentally initializing it

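Initializing a proxy needs an open session (persistence context). If the first field access happens after the transaction that created the reference has ended and the session is closed, Hibernate throws LazyInitializationException.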

How to prove what’s happening (simple debugging trick)

Turn on SQL logging in dev:

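# Either setting is enough on its own; enabling both prints each statement twice.
# Prints SQL straight to stdout:
spring.jpa.show-sql=true
# Routes SQL through the logging framework instead:
logging.level.org.hibernate.SQL=DEBUG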

Then compare:

findById() → you’ll see immediate SELECT
getReferenceById() → you’ll often see no SQL until a getter is called

Rule of thumb

If you remember only one thing, remember this:

Read path: use findById() (or a query/DTO projection)
Write path (FK-only): use getReferenceById()
Never: return JPA entities directly from REST endpoints
