When our team imported about 90,000 historical records into our production database, we discovered that our Entity Framework Core queries, which had performed well with smaller datasets, suddenly began timing out. This case study examines the performance bottlenecks we encountered and the optimization techniques that resolved them.
The Context
Our application had been running smoothly with a modest dataset of about 1,000 records. The dashboard loaded consistently in under 2 seconds, monitoring showed healthy performance metrics, and users were satisfied with the response times.
The data migration was intended to be routine—importing historical records from a legacy system to provide users with years of valuable information.
However, when users began accessing the system, we encountered significant performance issues.
What Went Wrong
The data import exposed fundamental scaling issues in our EF Core queries. What worked with 1,000 records failed completely with 90,000 records.
The Problem Manifests
The frontend team reported that API endpoints were returning 500 errors. Our investigation revealed:
User Impact:
- Frontend applications receiving 500 errors from API endpoints
- Dashboard pages failing to load with error messages
- API requests timing out after 30 seconds
- Previously fast-loading data lists now completely failing
Technical Symptoms:
- Application logs filled with database timeout exceptions
- Database connection pool reaching capacity
- Server CPU utilization spiking during peak usage
- Previously reliable endpoints failing consistently
Initial Troubleshooting
Our first step was examining the application logs, which revealed a clear pattern:
Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired.
The timeout period elapsed prior to completion of the operation or the server is not responding.
---> System.ComponentModel.Win32Exception (258): The wait operation timed out.
...
These timeout errors appeared across multiple endpoints, but all shared a common characteristic: they involved querying our primary data table, which had grown from 1,000 to about 90,000 records.
The correlation was unmistakable—every failing operation was related to queries against the expanded dataset.
The Scaling Challenge
This scenario illustrates a common scaling challenge in application development. Code that performs adequately with small datasets can become unusable when data volume increases significantly. This wasn't a defect in our implementation—it was a fundamental scalability limitation.
Why the increase in data volume caused complete failure:
- Query Complexity Scaling: Operations that appear linear often carry hidden superlinear costs as joins and scans grow with the data
- Database Execution Plan Changes: Query optimizers choose different strategies for larger datasets
- Memory Allocation Issues: Loading complete result sets becomes problematic at scale
- Index Efficiency: Indexing strategies effective for small tables may not scale appropriately
The application was functioning exactly as designed—the issue was that our design assumptions no longer matched our operational reality.
Identifying the EF Core Bottlenecks
With the correlation established between our data volume increase and the timeout errors, we needed to understand what was happening at the database level. The next step was enabling Entity Framework Core's query logging to see exactly what SQL was being generated.
Enabling EF Core Query Logging
First, we added detailed logging to our application configuration:
// In Program.cs or Startup.cs
builder.Services.AddDbContext<ApplicationDbContext>(options =>
{
    options.UseSqlServer(connectionString);
    options.EnableSensitiveDataLogging(); // For development only
    options.LogTo(Console.WriteLine, LogLevel.Information);
});
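Because EnableSensitiveDataLogging() writes parameter values (which can include user data) into the logs, one way to keep it out of production is to guard it behind an environment check. This is a minimal sketch, assuming the standard WebApplicationBuilder from an ASP.NET Core Program.cs:

```csharp
// Sketch: scope sensitive logging to development only.
// Assumes "builder" is the usual WebApplicationBuilder from Program.cs.
builder.Services.AddDbContext<ApplicationDbContext>(options =>
{
    options.UseSqlServer(connectionString);

    if (builder.Environment.IsDevelopment())
    {
        // Parameter values appear in logs only during development
        options.EnableSensitiveDataLogging();
        options.LogTo(Console.WriteLine, LogLevel.Information);
    }
});
```

With this guard in place, production logs contain the generated SQL shape but not the parameter values.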
The Include() Performance Trap
We discovered that our dashboard queries were using Entity Framework's .Include() method to load related data. This seemed reasonable for our original small dataset:
// The problematic code that worked fine with 1,000 records
public async Task<List<InfrastructureItem>> GetDashboardDataAsync(string status)
{
    return await _context.Infrastructures
        .Include(i => i.LocalGovernmentArea)
        .Include(i => i.Region)
        .Include(i => i.SenatorialDistrict)
        .Include(i => i.AssessedBy)
        .Include(i => i.CreatedBy)
        .Where(i => i.Status == status)
        .ToListAsync();
}
What this generated in SQL:
SELECT [i].[Id], [i].[Name], [i].[Status], [i].[CreatedAt],
[l].[Id], [l].[Name], [l].[Code],
[r].[Id], [r].[Name], [r].[Code],
[s].[Id], [s].[Name], [s].[Code],
[u].[Id], [u].[UserName], [u].[Email],
[u2].[Id], [u2].[UserName], [u2].[Email]
FROM [Infrastructures] AS [i]
LEFT JOIN [LocalGovernmentAreas] AS [l] ON [i].[LocalGovernmentAreaId] = [l].[Id]
LEFT JOIN [Regions] AS [r] ON [i].[RegionId] = [r].[Id]
LEFT JOIN [SenatorialDistricts] AS [s] ON [i].[SenatorialDistrictId] = [s].[Id]
LEFT JOIN [AspNetUsers] AS [u] ON [i].[AssessedById] = [u].[Id]
LEFT JOIN [AspNetUsers] AS [u2] ON [i].[CreatedById] = [u2].[Id]
WHERE [i].[Status] = @status
The Hidden Cost
With 90,000 records, this single query was:
- Performing five LEFT JOINs for every row returned
- Loading unnecessary data — we only needed names, not full user profiles
- Transferring massive amounts of data from database to application
- Consuming excessive memory to hold all the joined data
The database was doing far more work per request, and the network was transferring far more data than needed.
The Optimization Solution
We replaced the expensive joins with a more efficient approach:
// Optimized version - get only what we need
public async Task<List<InfrastructureItem>> GetDashboardDataAsync(string status)
{
    // Step 1: Get core infrastructure data only
    var infrastructureData = await _context.Infrastructures
        .AsNoTracking() // Important: no change tracking needed
        .Where(i => i.Status == status)
        .Select(i => new
        {
            i.Id,
            i.Name,
            i.Status,
            i.CreatedAt,
            i.LocalGovernmentAreaId,
            i.AssessedById,
            i.CreatedById
        })
        .ToListAsync();

    // Step 2: Get lookup data efficiently
    var lgaIds = infrastructureData
        .Where(i => i.LocalGovernmentAreaId.HasValue)
        .Select(i => i.LocalGovernmentAreaId.Value)
        .Distinct()
        .ToList();

    var lgaLookup = await _context.LocalGovernmentAreas
        .Where(lga => lgaIds.Contains(lga.Id))
        .ToDictionaryAsync(lga => lga.Id, lga => lga.Name);

    // Step 3: Map to final result
    return infrastructureData.Select(i => new InfrastructureItem
    {
        Id = i.Id,
        Name = i.Name,
        Status = i.Status,
        CreatedAt = i.CreatedAt,
        LocalGovernmentArea = i.LocalGovernmentAreaId.HasValue
            ? lgaLookup.GetValueOrDefault(i.LocalGovernmentAreaId.Value, "Unknown")
            : "Unknown"
    }).ToList();
}
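An alternative worth noting (a sketch, not the approach we shipped): EF Core can also translate navigation-property access inside .Select() into a single SQL query that joins only the columns actually projected, avoiding both the full-entity loading of .Include() and the extra round trip of the two-step lookup. The method name here is hypothetical:

```csharp
// Hypothetical single-query alternative: project navigation properties directly.
// EF Core translates the lga.Name access into a LEFT JOIN that selects only
// the Name column, not the full related entities that .Include() would load.
public async Task<List<InfrastructureItem>> GetDashboardDataProjectedAsync(string status)
{
    return await _context.Infrastructures
        .AsNoTracking()
        .Where(i => i.Status == status)
        .Select(i => new InfrastructureItem
        {
            Id = i.Id,
            Name = i.Name,
            Status = i.Status,
            CreatedAt = i.CreatedAt,
            // Null-check the optional navigation so the translation stays a LEFT JOIN
            LocalGovernmentArea = i.LocalGovernmentArea != null
                ? i.LocalGovernmentArea.Name
                : "Unknown"
        })
        .ToListAsync();
}
```

Whether this beats the two-query approach depends on the dataset and indexes; measuring both against your own data is the only reliable way to choose.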
Immediate Results
The optimization had an immediate and dramatic impact on our most problematic endpoints. Most importantly, our users could access their dashboards again without encountering timeout errors.
Key Lessons Learned
- .Include() doesn't scale linearly - each join multiplies the complexity
- .AsNoTracking() is crucial for read-only operations
- Projection with .Select() dramatically reduces data transfer
- Separate queries can be faster than complex joins for large datasets
Conclusion
Our EF Core optimizations transformed query performance from 30+ second timeouts to 2-3 second responses. The key lesson: .Include() doesn't scale linearly with data volume.
Key takeaways:
- Use .Select() for projection instead of .Include() for large datasets
- Add .AsNoTracking() for read-only operations
- Separate queries often outperform complex joins
- Enable EF Core logging to identify bottlenecks
While this solved our immediate crisis, some queries still struggled with sorting and pagination on large datasets. The real breakthrough came with database indexing—the subject of Part 2.
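For context on why those remaining queries struggled, the offset-based pagination pattern involved looks roughly like the sketch below (the method name and parameters are illustrative, not our production code). Without a supporting index, SQL Server may still scan large portions of the table to satisfy the ORDER BY and OFFSET:

```csharp
// Illustrative offset pagination. Skip/Take translate to
// OFFSET ... ROWS FETCH NEXT ... ROWS ONLY in SQL Server.
public async Task<List<InfrastructureItem>> GetPageAsync(string status, int page, int pageSize)
{
    return await _context.Infrastructures
        .AsNoTracking()
        .Where(i => i.Status == status)
        .OrderByDescending(i => i.CreatedAt) // a stable ordering is required for paging
        .Skip((page - 1) * pageSize)
        .Take(pageSize)
        .Select(i => new InfrastructureItem
        {
            Id = i.Id,
            Name = i.Name,
            Status = i.Status,
            CreatedAt = i.CreatedAt
        })
        .ToListAsync();
}
```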