DEV Community

Taras Antoniuk
Taras Antoniuk

Posted on

How JOIN FETCH Reduced Database Load by 94%: A Real-World Case Study

🎯 Introduction

The N+1 problem is one of the most common causes of high database load in Spring Boot applications. In this article, I'll show you how to systematically solve this problem using a real-world financial system project.

πŸ” What is the N+1 Problem?

The N+1 problem occurs when an ORM generates additional SELECT queries to load related entities.

Example of the Problem

@Entity
public class ExternalExchangeRate {
    @ManyToOne(fetch = FetchType.LAZY)
    private Currency currencyFrom;

    @ManyToOne(fetch = FetchType.LAZY)
    private Currency currencyTo;
}
Enter fullscreen mode Exit fullscreen mode

πŸ’‘ View ExternalExchangeRate code β†’

When executing:

List<ExternalExchangeRate> rates = repository.findByExchangeDateAndCurrencyFromId(date, currencyFromId);
for (ExternalExchangeRate rate : rates) {
    System.out.println(rate.getCurrencyFrom().getCode()); // N queries!
    System.out.println(rate.getCurrencyTo().getCode());   // N more queries!
}
Enter fullscreen mode Exit fullscreen mode

Result: 1 + N*2 queries to the database

For example, if we have 15 currency rates for a day:

  • 1 SELECT for rates
  • 15 SELECT for currencyFrom
  • 15 SELECT for currencyTo
  • Total: 31 queries! 😱

πŸ”¬ Detecting the N+1 Problem

Configuration for Development Mode

# application-dev.properties
spring.jpa.show-sql=true
spring.jpa.properties.hibernate.format_sql=true
spring.jpa.properties.hibernate.generate_statistics=true
spring.jpa.properties.hibernate.use_sql_comments=true
logging.level.org.hibernate.SQL=DEBUG
Enter fullscreen mode Exit fullscreen mode

πŸ’‘ View application-dev.properties β†’

Debug Endpoint for Monitoring

@RestController
@RequestMapping("/api/debug")
public class DebugController {

    @Autowired
    private EntityManagerFactory emf;

    @GetMapping("/hibernate-stats")
    public Map<String, Object> getHibernateStats() {
        Statistics stats = emf.unwrap(SessionFactory.class)
                              .getStatistics();

        return Map.of(
            "queriesExecuted", stats.getQueryExecutionCount(),
            "prepareStatementCount", stats.getPrepareStatementCount(),
            "entitiesLoaded", stats.getEntityLoadCount(),
            "entitiesFetched", stats.getEntityFetchCount()
        );
    }
}
Enter fullscreen mode Exit fullscreen mode

πŸ’‘ View DebugController code β†’

Measurements BEFORE Optimization

GET /api/external-exchange-rates/latest?date=2024-11-24&currencyFromId=2

Result:
- Time: 48ms
- prepareStatementCount: 33  ← Real SQL statements executed!
- queriesExecuted: 2  ← HQL/JPQL query types
- entitiesLoaded: 131
- entitiesFetched: 31  ← Additional lazy fetches!
Enter fullscreen mode Exit fullscreen mode

πŸ’‘ Understanding the metrics:

queriesExecuted = HQL/JPQL query types (2 types of queries)

prepareStatementCount = Actual JDBC statement executions (33 SQL queries!)

entitiesFetched = Lazy entity fetches (31 Currency entities loaded separately)

SQL log showed:

-- 1 query for exchange rates
SELECT * FROM external_exchange_rates 
WHERE exchange_date = '2024-11-24' 
AND currency_from_id = 2;

-- 31 additional queries for currencies!
SELECT * FROM currencies WHERE id = 1;
SELECT * FROM currencies WHERE id = 3;
SELECT * FROM currencies WHERE id = 4;
-- ... 28 more queries
Enter fullscreen mode Exit fullscreen mode

βœ… When JOIN FETCH Works Perfectly

Flat Data Structure (@ManyToOne)

JOIN FETCH is ideal for flat data structures where an entity has variables that reference only one entity from another table.

Perfect use case example:

@Entity
public class ExternalExchangeRate {
    @Id
    private Long id;

    // Each rate references ONE currency FROM
    @ManyToOne(fetch = FetchType.LAZY)
    private Currency currencyFrom;

    // Each rate references ONE currency TO
    @ManyToOne(fetch = FetchType.LAZY)
    private Currency currencyTo;

    private BigDecimal rate;
}
Enter fullscreen mode Exit fullscreen mode

Why this works perfectly:

  • 1 ExternalExchangeRate β†’ 1 Currency (from)
  • 1 ExternalExchangeRate β†’ 1 Currency (to)
  • No multiple relationships (@OneToMany)
  • Pagination works correctly βœ…
  • COUNT query is accurate βœ…

⚠️ When JOIN FETCH Is NOT Suitable

Table Parts and Collections (@OneToMany)

JOIN FETCH is NOT an optimal solution when an entity has a variable that references a collection of data (e.g., document line items).

Problematic example:

@Entity
public class Invoice {
    @Id
    private Long id;

    @ManyToOne
    private Customer customer;

    // ⚠️ PROBLEM: Collection of items!
    @OneToMany(mappedBy = "invoice")
    private List<InvoiceItem> items;  // Could be 1, 10, 100+ items!
}
Enter fullscreen mode Exit fullscreen mode

The Problem with Pagination

❌ Wrong approach:

@Query("SELECT DISTINCT i FROM Invoice i " +
       "LEFT JOIN FETCH i.items")  // ⚠️ PROBLEM!
Page<Invoice> findAll(Pageable pageable);
Enter fullscreen mode Exit fullscreen mode

Why this doesn't work:

  1. Cartesian product - 1 Invoice with 10 items β†’ 10 rows in result
  2. Pagination breaks - Page size 20 might return only 2 invoices!
  3. COUNT incorrect - Counts rows after JOIN, not invoices

⚠️ Hibernate Warning

HHH000104: firstResult/maxResults specified with collection fetch; 
applying in memory!
Enter fullscreen mode Exit fullscreen mode

This means Hibernate will load all data and apply pagination in memory - losing all benefits!


πŸ”„ Alternatives for Collections

For collections with @OneToMany:

  1. @EntityGraph - Better Hibernate understanding
  2. Two separate queries - Full control
  3. DTO Projection - When details not needed
  4. Batch Fetching - Optimizes multiple queries

πŸ“ Note: I'll cover @EntityGraph in detail in my next article with a real document + line items example.


πŸ“Š Comparison Table

Feature JOIN FETCH (@ManyToOne) JOIN FETCH (@OneToMany)
Pagination βœ… Works perfectly ❌ Breaks
COUNT accuracy βœ… Correct ❌ Counts rows
Number of queries βœ… 1-2 queries ⚠️ All in memory
Predictability βœ… Stable ❌ Data dependent
Recommendation βœ… Use it ❌ Avoid it

πŸ’‘ Solution: JOIN FETCH for Flat Structures

Step 1: Creating Optimized Repository Methods

@Repository
public interface ExternalExchangeRateRepository 
        extends JpaRepository<ExternalExchangeRate, Long> {

    /**
     * Find latest rates by date and currency from with currencies loaded.
     * Uses JOIN FETCH to prevent N+1 queries.
     */
    @Query("SELECT e FROM ExternalExchangeRate e " +
           "LEFT JOIN FETCH e.currencyFrom " +
           "LEFT JOIN FETCH e.currencyTo " +
           "WHERE e.exchangeDate = :date " +
           "AND e.currencyFrom.id = :currencyFromId")
    List<ExternalExchangeRate> findLatestRatesByCurrencyFromWithCurrencies(
            @Param("date") LocalDate date,
            @Param("currencyFromId") Long currencyFromId);

    /**
     * Find all exchange rates with currencies loaded in a single query.
     */
    @Query(value = "SELECT DISTINCT e FROM ExternalExchangeRate e " +
           "LEFT JOIN FETCH e.currencyFrom " +
           "LEFT JOIN FETCH e.currencyTo " +
           "ORDER BY e.exchangeDate DESC, e.id DESC",
           countQuery = "SELECT COUNT(e) FROM ExternalExchangeRate e")
    Page<ExternalExchangeRate> findAllWithCurrencies(Pageable pageable);
}
Enter fullscreen mode Exit fullscreen mode

πŸ’‘ View Repository code β†’

πŸ”‘ Key Points:

  1. DISTINCT - avoids duplicates with JOIN
  2. LEFT JOIN FETCH - loads related entities in one query
  3. countQuery - separate COUNT for pagination (without JOIN)
  4. WithCurrencies suffix - clear method naming

Step 2: Updating the Service

@Service
@Transactional(readOnly = true)
public class ExternalExchangeRateService {

    public List<ExternalExchangeRateResponseDTO> getLatestRatesByDateAndCurrencyFrom(
            LocalDate date, Long currencyFromId) {

        // Using optimized method
        List<ExternalExchangeRate> rates = repository
                .findLatestRatesByCurrencyFromWithCurrencies(date, currencyFromId);

        // No N+1 here, all data is loaded! βœ…
        return exchangeRateMapper.toResponseDTOList(rates);
    }
}
Enter fullscreen mode Exit fullscreen mode

πŸ’‘ View Service β†’

Step 3: Updating Tests

@Test
void getLatestRatesByDateAndCurrencyFrom_WhenExists_ShouldReturnRates() {
    // Given
    LocalDate date = LocalDate.of(2024, 11, 24);
    Long currencyFromId = 2L;

    // Updated to new method!
    when(exchangeRateRepository.findLatestRatesByCurrencyFromWithCurrencies(date, currencyFromId))
            .thenReturn(List.of(rate1, rate2));

    // When
    List<ExternalExchangeRateResponseDTO> result =
            exchangeRateService.getLatestRatesByDateAndCurrencyFrom(date, currencyFromId);

    // Then
    assertNotNull(result);
    assertEquals(2, result.size());
}
Enter fullscreen mode Exit fullscreen mode

πŸ’‘ View Tests β†’


πŸš€ Optimization Results

Performance Metrics:

Metric Before After Improvement
Response Time 48ms 37ms 23% faster ⚑
SQL Queries 33 2 94% reduction 🎯
Prepare Time ~413μs ~61μs 85% faster ⚑
Execution Time ~15ms ~14ms Similar
Scalability O(n*2) O(1) Linear β†’ Constant πŸš€

πŸŽ‰ Key Achievement:

  • βœ… N+1 Problem: SOLVED
  • SQL queries don't depend on record count
  • For 1000 records: 2001 queries β†’ 2 queries

πŸ€” Why Only 23% Response Time Improvement?

You might ask: "94% fewer queries but only 23% faster response time?"

The answer:

  1. Small dataset - Only 15 records in our test case
  2. Fast local database - PostgreSQL on localhost responds in ~0.5-1ms per query
  3. Connection pooling - Spring Boot's default HikariCP reuses connections efficiently
  4. Simple queries - SELECT by ID is very fast with proper indexes

Note: While the application has JDBC batching configured (batch_size=1000), it only affects INSERT/UPDATE operations, not SELECT queries.

Real impact shows at scale:

10 records:    11 queries β†’  2 queries  (similar time)
100 records:  201 queries β†’  2 queries  (23% faster)
1000 records: 2001 queries β†’ 2 queries  (90%+ faster!) πŸš€
Enter fullscreen mode Exit fullscreen mode

πŸ’‘ Key Insight: The real win is predictable performance regardless of data size! Performance doesn't degrade as data grows.

Measurements AFTER Optimization

GET /api/external-exchange-rates/latest?date=2024-11-24&currencyFromId=2

Result:
- Time: 37ms (23% faster! ⚑)
- prepareStatementCount: 2 (94% less! 🎯)
- queriesExecuted: 1 (only 1 query type now!)
- entitiesFetched: 0 (no additional fetches! βœ…)
Enter fullscreen mode Exit fullscreen mode

πŸ’‘ Key metric: prepareStatementCount dropped from 33 β†’ 2 (94% reduction!)

SQL log:

-- Only 1 query with JOIN! πŸŽ‰
SELECT DISTINCT 
    eer.id, eer.exchange_date, eer.rate, eer.source,
    cf.id, cf.code, cf.name, cf.symbol,
    ct.id, ct.code, ct.name, ct.symbol
FROM external_exchange_rates eer
LEFT JOIN currencies cf ON cf.id = eer.currency_from_id
LEFT JOIN currencies ct ON ct.id = eer.currency_to_id
WHERE eer.exchange_date = '2024-11-24'
AND eer.currency_from_id = 2;
Enter fullscreen mode Exit fullscreen mode

πŸ”§ Systematic Optimization of the Entire Project

I optimized 7 core modules with similar approach:

1. ExternalExchangeRate (2 relations)

@Query("SELECT e FROM ExternalExchangeRate e " +
       "LEFT JOIN FETCH e.currencyFrom " +
       "LEFT JOIN FETCH e.currencyTo")
Enter fullscreen mode Exit fullscreen mode

2. Bank (2 relations)

@Query("SELECT b FROM Bank b " +
       "LEFT JOIN FETCH b.country " +
       "LEFT JOIN FETCH b.counterparty")
Enter fullscreen mode Exit fullscreen mode

πŸ’‘ View Bank module β†’

3. BankAccount (2 relations)

@Query("SELECT ba FROM BankAccount ba " +
       "LEFT JOIN FETCH ba.bank " +
       "LEFT JOIN FETCH ba.currency")
Enter fullscreen mode Exit fullscreen mode

4. Country (1 relation)

@Query("SELECT c FROM Country c " +
       "LEFT JOIN FETCH c.currency")
Enter fullscreen mode Exit fullscreen mode

πŸ’‘ View Country entity β†’

Plus: Counterparty, Organization, and AccountingPolicy βœ…


🎯 Best Practices

1. Naming Convention

// Clearly indicate what the method does
findAllWithCurrencies()
findByIdWithRelations()
findLatestRatesByCurrencyFromWithCurrencies()
Enter fullscreen mode Exit fullscreen mode

2. Separate Methods for Different Scenarios

// For reading - with JOIN FETCH βœ…
@Query("SELECT b FROM Bank b LEFT JOIN FETCH b.country")
List<Bank> findAllWithCountry();

// For duplicate checks - WITHOUT JOIN (faster!) ⚑
boolean existsBySwiftCode(String swiftCode);
Enter fullscreen mode Exit fullscreen mode

3. DISTINCT for Multiple JOINs

@Query("SELECT DISTINCT e FROM Entity e " +
       "LEFT JOIN FETCH e.relation1 " +
       "LEFT JOIN FETCH e.relation2")
Enter fullscreen mode Exit fullscreen mode

4. Separate countQuery for Pagination

@Query(
    value = "SELECT e FROM Entity e LEFT JOIN FETCH e.relation",
    countQuery = "SELECT COUNT(e) FROM Entity e"  // Without JOIN!
)
Page<Entity> findAllWithRelation(Pageable pageable);
Enter fullscreen mode Exit fullscreen mode

❌ Common Mistakes

Mistake 1: Forgetting DISTINCT

// WITHOUT DISTINCT - may have duplicates! ⚠️
@Query("SELECT e FROM Entity e LEFT JOIN FETCH e.collection")
Enter fullscreen mode Exit fullscreen mode

Mistake 2: JOIN in countQuery

// Slow COUNT due to JOIN! 🐌
countQuery = "SELECT COUNT(e) FROM Entity e LEFT JOIN e.relation"
Enter fullscreen mode Exit fullscreen mode

Mistake 3: Not Updating Tests

// Old method no longer exists! ❌
when(repository.findAll()).thenReturn(list);
Enter fullscreen mode Exit fullscreen mode

Mistake 4: JOIN FETCH with @OneToMany and Pagination

// Breaks pagination! πŸ’₯
@Query("SELECT i FROM Invoice i LEFT JOIN FETCH i.items")
Page<Invoice> findAll(Pageable pageable);
Enter fullscreen mode Exit fullscreen mode

🚫 When NOT to Use JOIN FETCH

  1. ❌ Collections with pagination
  2. ❌ Document table parts
  3. ❌ @OneToMany with many records
  4. ❌ Duplicate checks
  5. ❌ Count queries
  6. ❌ Delete operations
  7. ❌ Exists checks

πŸ“ Conclusions

Results:

βœ… 7 modules optimized

βœ… 48 methods with JOIN FETCH

βœ… 94% reduction in SQL queries (33 β†’ 2)

βœ… 23% improvement in response time (48ms β†’ 37ms)

βœ… Scalability: O(n*2) β†’ O(1)

When to Use JOIN FETCH:

βœ… Perfect for:

  • Flat structures (@ManyToOne)
  • Fixed number of relationships
  • Pagination without collections
  • Predictable data sizes

❌ Avoid for:

  • Collections (@OneToMany) with pagination
  • Document table parts
  • Unpredictable data sizes

Next Steps:

  1. πŸ“Š Monitor production metrics
  2. πŸ’Ύ Consider caching for frequent queries
  3. πŸ” Explore @EntityGraph for collections (next article!)
  4. ⚑ Optimize batch operations

πŸ”— Useful Resources


πŸš€ About the Project

Financial Accounting System

πŸ“– View Project on GitHub β†’


Coming Next: "Optimizing N+1 for Collections with @EntityGraph" πŸ”œ


This article was created based on real production experience with 19+ years in enterprise development. πŸ‘¨β€πŸ’»

πŸ’¬ Questions? Drop them in the comments!

⭐ Found it helpful? Give the repo a star!

πŸ”” Want more? Follow for the next article on @EntityGraph!


πŸ‘¨β€πŸ’» About the Author

Taras Antoniuk

Java Backend Developer | 19+ years IT experience

πŸ“§ bronya2004@gmail.com

πŸ”— LinkedIn

πŸ’» HackerRank (5⭐ SQL, Java, Collections)

🌐 Production API

Top comments (0)