DEV Community

Cover image for Why Entity Framework Core Adds an Extra 'ORDER BY' When Including a Collection Navigation Property
Ali Shamekhi
Ali Shamekhi

Posted on

Why Entity Framework Core Adds an Extra 'ORDER BY' When Including a Collection Navigation Property

When working with Entity Framework Core (EF Core), developers often notice that queries involving collection navigation properties (especially when using .Include() — generate an additional ORDER BY clause in the SQL output. This behaviour can be puzzling, particularly since the relationship keys are already available in the result set. This article explains why EF Core does this, demonstrates the behaviour in practice, and provides references for further reading.

Example Model

Consider the following simple domain model:

public class Blog
{
public int Id { get; set; }
public string Title { get; set; }
public List<Post> Posts { get; set; } = new();
}

public class Post
{
public int Id { get; set; }
public string Content { get; set; }
public int BlogId { get; set; }
public Blog Blog { get; set; }
}

The Query

A straightforward EF Core query to eagerly load related posts might look like this:
var blogs = context.Blogs
.Include(b => b.Posts)
.ToList();

The Generated SQL

When logging is enabled (via ToQueryString() or a database profiler), EF Core might generate a query similar to the following:
SELECT [b].[Id], [b].[Title],
[p].[Id], [p].[Content], [p].[BlogId]
FROM [Blogs] AS [b]
LEFT JOIN [Posts] AS [p] ON [b].[Id] = [p].[BlogId]
ORDER BY [b].[Id]

Why EF Core Adds ORDER BY

At first glance, this extra ordering may appear unnecessary — after all, the BlogId field in the Posts table already indicates the relationship between posts and blogs. However, EF Core's query pipeline is not simply returning raw rows; it is materialising an object graph.

When a LEFT JOIN is performed for eager loading:

  1. The result set is flattened — each blog is repeated for each of its posts.
  2. EF Core must group dependent rows (Posts) under their principal (Blog) in memory.
  3. Without a guaranteed order, rows could arrive interleaved, requiring EF Core to maintain a more complex in-memory lookup (such as a dictionary) to correctly assign posts to blogs.

By ordering the results by the primary key(s) of the principal entity (Blog.Id in this case), EF Core ensures:

  • Rows belonging to the same blog are contiguous in the result set.
  • Materialisation is faster and requires less memory overhead.
  • The grouping operation becomes a simple sequential process.

Without Ordering – Potential Issue

Without ordering, the query result from the database could look like this:

Without ordering

In this case, EF Core would have to detect and match entities in a non-contiguous manner, increasing processing complexity.

With Ordering – Efficient Grouping

With ordering by Blog.Id:

With ordering

EF Core can now process the result set sequentially, attaching posts to blogs without additional lookups.

Official Notes and References

The EF Core team has acknowledged this behaviour in discussions and issue trackers:
"The ORDER BY clause is added when materializing 1-to-many relations, so that the principal is grouped together while loading the dependent. Without this, rows can arrive in an arbitrary order..."
— EF Core GitHub Issue #19571

Further Reading

For deeper exploration of EF Core query behaviour and performance:

  1. "Entity Framework Core in Action" – Jon P Smith
    Detailed coverage of loading strategies and query optimisation.

  2. "Pro Entity Framework Core 7" – Adam Freeman
    Includes explanations of query translation, eager loading, and materialisation performance considerations.

Conclusion

The ORDER BY clause added by EF Core during eager loading of collection navigation properties is not an arbitrary decision. It is a deliberate design choice aimed at optimising the materialisation process by ensuring contiguous grouping of related data. While it may seem redundant when relationship keys are present, this approach simplifies EF Core's internal processing and can improve performance for large datasets.

Top comments (1)

Collapse
 
auyeungdavid_2847435260 profile image
David Au Yeung

I like this kind of academic article👍