If you’ve used Spring Data JPA long enough, you’ve probably seen this advice:
“Use getReferenceById() to avoid a database query.”
It’s not wrong — but it’s also not the full story.
getReferenceById() (backed by EntityManager.getReference()) usually returns a lazy proxy. That proxy can avoid an immediate SELECT, but it can also surprise you later with extra queries, unexpected exceptions, or weird JSON output.
This post explains what it actually does, when it hits the database, and which patterns make it safe to use.
What getReferenceById() actually returns
When you call:
User user = userRepository.getReferenceById(id);
you usually get a proxy object (Hibernate proxy) that:
- knows the entity type (User)
- remembers the id
- does not necessarily load the row immediately
- will load the entity later, if/when you touch a non-id property
Think of it like: “I promise there’s a User with this id — I’ll fetch it only if you ask for details.”
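To make that promise concrete, here is a minimal plain-Java sketch of the idea. It is a hand-rolled simulation, not Hibernate's actual proxy implementation; UserRef, UserRow, and the loader function are hypothetical names invented for illustration, and the loader stands in for the SELECT.

```java
import java.util.function.LongFunction;

final class LazyReferenceSketch {

    /** Plain data holder standing in for a loaded row (hypothetical shape). */
    record UserRow(long id, String name) {}

    /** Hand-rolled stand-in for a lazy proxy: knows the id, loads the row on demand. */
    static final class UserRef {
        private final long id;
        private final LongFunction<UserRow> loader; // simulates the SELECT
        private UserRow loaded;                     // null until first non-id access

        UserRef(long id, LongFunction<UserRow> loader) {
            this.id = id;
            this.loader = loader;
        }

        long getId() {
            return id; // answered from the reference itself, no load
        }

        String getName() {
            if (loaded == null) {
                loaded = loader.apply(id); // first non-id access triggers the load
            }
            return loaded.name();
        }

        boolean isInitialized() {
            return loaded != null;
        }
    }

    public static void main(String[] args) {
        int[] selects = {0}; // counts simulated SELECTs
        UserRef ref = new UserRef(42L, id -> {
            selects[0]++;
            return new UserRow(id, "Ada");
        });

        System.out.println(ref.getId() + " initialized=" + ref.isInitialized());
        System.out.println(ref.getName() + " initialized=" + ref.isInitialized());
        System.out.println("simulated SELECTs: " + selects[0]);
    }
}
```

Reading the id never calls the loader, while the first call to getName() calls it exactly once. A real Hibernate proxy behaves along the same lines, though the details depend on the mapping and the Hibernate version.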
When it doesn’t query the database
In many write-only use cases, you only need a reference to set a relationship.
Example: create a Payment pointing to an existing User, without loading the User.
@Transactional
public void createPayment(Long userId, BigDecimal amount) {
    User userRef = userRepository.getReferenceById(userId);
    Payment p = new Payment();
    p.setUser(userRef);
    p.setAmount(amount);
    paymentRepository.save(p); // often no SELECT for User
}
If you never access userRef.getName() / userRef.getEmail() etc., Hibernate can persist Payment using the FK (user_id) without loading the User.
✅ This is the best use case for getReferenceById().
When it will query the database
1) Accessing any non-id field
User userRef = repo.getReferenceById(id);
// Triggers SELECT (lazy initialization)
String name = userRef.getName();
Even calling toString() can trigger a load if your toString() prints fields.
2) Serializing entity proxies (e.g., returning entities from controllers)
If you return JPA entities directly from a REST controller, your JSON serializer (Jackson) may try to read fields → which triggers lazy loading → which triggers queries.
Sometimes it becomes:
- N+1 queries
- LazyInitializationException (if outside a transaction)
- weird proxy metadata in JSON
Rule: Do not return entities directly. Return DTOs.
The “surprise exception” people don’t expect
Entity not found… but only later
Unlike findById(), which checks for the row immediately, a reference is only checked when the proxy is initialized, so a missing row surfaces late.
User userRef = repo.getReferenceById(999L); // no immediate SELECT
// Later...
userRef.getName(); // may throw EntityNotFoundException at this moment
So your code might look fine during the write, then explode at a random later point when something touches the proxy.
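The timing difference can be reproduced without a database. The sketch below is a plain-Java simulation under stated assumptions: the in-memory USERS map stands in for the table, NameRef is a hypothetical stand-in for the proxy, and NoSuchElementException stands in for whatever exception the real proxy raises on initialization.

```java
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Optional;

final class DeferredFailureSketch {

    /** In-memory stand-in for the users table: id -> name. */
    static final Map<Long, String> USERS = Map.of(1L, "Ada");

    /** Eager lookup, similar in spirit to findById(): absence is visible right away. */
    static Optional<String> findName(long id) {
        return Optional.ofNullable(USERS.get(id));
    }

    /** Lazy handle, similar in spirit to a reference: nothing is checked at creation. */
    static final class NameRef {
        private final long id;

        NameRef(long id) {
            this.id = id; // no lookup here, so a bad id goes unnoticed
        }

        long getId() {
            return id;
        }

        String getName() {
            String name = USERS.get(id); // the lookup happens only now
            if (name == null) {
                // Stand-in for the exception a real proxy raises on initialization.
                throw new NoSuchElementException("No user with id " + id);
            }
            return name;
        }
    }

    public static void main(String[] args) {
        System.out.println("found eagerly: " + findName(999L).isPresent());
        NameRef ref = new NameRef(999L); // succeeds despite the bad id
        try {
            ref.getName();
        } catch (NoSuchElementException e) {
            System.out.println("failed late: " + e.getMessage());
        }
    }
}
```

The eager lookup reports the missing row at the call site, whereas the lazy handle is created without complaint and fails only when a non-id value is requested, possibly far from the code that created it.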
getReferenceById() vs findById() (practical difference)
findById(id)
- Executes a SELECT immediately (unless the entity is already in the persistence context)
- Returns Optional<T>
- You know right away whether it exists
- Best for read paths, validation, business decisions
getReferenceById(id)
- Returns a proxy reference (usually no immediate SELECT)
- Existence is not verified immediately
- Best for write paths when you only need an FK reference
Safe patterns you can copy
✅ Pattern A: “FK-only write path”
Use a reference when you only need the FK and don’t care about reading the parent.
@Transactional
public void addItem(Long orderId, String sku) {
    Order orderRef = orderRepository.getReferenceById(orderId);
    OrderItem item = new OrderItem();
    item.setOrder(orderRef);
    item.setSku(sku);
    orderItemRepository.save(item);
}
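If the orderId is invalid, the database rejects the insert through the FK constraint (assuming one is defined) at flush or commit, which Spring typically surfaces as a DataIntegrityViolationException.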
✅ Pattern B: “Validate first, then reference”
If you need a friendly error message, validate existence first.
@Transactional
public void addItem(Long orderId, String sku) {
    if (!orderRepository.existsById(orderId)) {
        throw new IllegalArgumentException("Order not found: " + orderId);
    }
    Order orderRef = orderRepository.getReferenceById(orderId);
    OrderItem item = new OrderItem();
    item.setOrder(orderRef);
    item.setSku(sku);
    orderItemRepository.save(item);
}
(Yes, that’s an extra query — but it’s explicit and predictable.)
✅ Pattern C: map to DTOs (avoid proxy serialization traps)
Never return entities directly:
@GetMapping("/users/{id}")
public UserDto getUser(@PathVariable Long id) {
    User user = userRepository.findById(id).orElseThrow();
    return new UserDto(user.getId(), user.getName());
}
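The controller above assumes a UserDto type exists. One possible shape is a Java record with a small mapping method, sketched below. The User class here is a simplified stand-in for the entity (JPA annotations omitted), and the passwordHash field and the from() helper are assumptions added to show what the DTO leaves out.

```java
final class UserDtoSketch {

    /** Simplified stand-in for the JPA entity; the fields are assumptions. */
    static final class User {
        private final Long id;
        private final String name;
        private final String passwordHash; // should never reach the API response

        User(Long id, String name, String passwordHash) {
            this.id = id;
            this.name = name;
            this.passwordHash = passwordHash;
        }

        Long getId() { return id; }
        String getName() { return name; }
        String getPasswordHash() { return passwordHash; }
    }

    /** Immutable response shape: only the values the endpoint exposes. */
    record UserDto(Long id, String name) {
        static UserDto from(User user) {
            // Copies plain values, so no proxy or lazy field reaches the serializer.
            return new UserDto(user.getId(), user.getName());
        }
    }

    public static void main(String[] args) {
        User user = new User(5L, "Ada", "not-for-clients");
        System.out.println(UserDto.from(user));
    }
}
```

Because the record holds only copied values, the JSON serializer never touches an entity or a proxy, and any lazy loading happens inside the mapping step where it is easy to see.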
Common foot-guns (learn once, avoid forever)
❌ 1) Calling getReferenceById() and then logging fields
log.info("User name={}", userRef.getName()); // triggers SELECT
❌ 2) Using Lombok @Data on entities with lazy relations
Generated toString(), equals(), hashCode() can touch lazy fields and cause loading.
Tip: For entities, prefer:
- @Getter/@Setter
- carefully controlled toString()
- equals/hashCode based on immutable identifiers (or the id once assigned), with caution
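One common way to follow that tip by hand is sketched below, assuming an id that is generated on insert. This is one approach among several, not the only correct one. Order is a simplified stand-in for an entity (annotations omitted), and the customerName supplier is a hypothetical stand-in for a lazy association.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Supplier;

final class EntityIdentitySketch {

    /** Simplified stand-in for an entity with a lazy association. */
    static class Order {
        private Long id;                             // null until the row is inserted
        private final Supplier<String> customerName; // calling get() simulates a lazy load

        Order(Long id, Supplier<String> customerName) {
            this.id = id;
            this.customerName = customerName;
        }

        Long getId() { return id; }
        void setId(Long id) { this.id = id; }
        String getCustomerName() { return customerName.get(); }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Order)) return false; // instanceof also accepts subclasses
            Order other = (Order) o;
            // Unsaved instances (id == null) are equal only to themselves.
            return id != null && id.equals(other.getId());
        }

        @Override
        public int hashCode() {
            // Constant per class, so the hash stays stable when the id is assigned.
            return Order.class.hashCode();
        }

        @Override
        public String toString() {
            return "Order(id=" + id + ")"; // id only, never the lazy association
        }
    }

    public static void main(String[] args) {
        Order order = new Order(null, () -> "Ada");
        Set<Order> orders = new HashSet<>();
        orders.add(order);
        order.setId(10L); // simulate id assignment on insert
        System.out.println(orders.contains(order));
        System.out.println(order);
    }
}
```

The constant hashCode() trades hash distribution for stability: every instance lands in the same bucket, which is usually acceptable for small collections but worth measuring for large ones. None of the three methods reads the lazy association, so logging or adding the entity to a set does not trigger a load.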
❌ 3) Using a reference outside a transaction and accidentally initializing it
This is where LazyInitializationException shows up.
How to prove what’s happening (simple debugging trick)
Turn on SQL logging in dev:
spring.jpa.show-sql=true
logging.level.org.hibernate.SQL=DEBUG
Then compare:
- findById() → you’ll see an immediate SELECT
- getReferenceById() → you’ll often see no SQL until a non-id getter is called
Rule of thumb
If you remember only one thing, remember this:
- Read path: use findById() (or a query/DTO projection)
- Write path (FK-only): use getReferenceById()
- Never: return JPA entities directly from REST endpoints