• If you have ever used Postman to debug and write tests for your REST APIs, guess what, those are the same concepts you need to know for writing tests for your gRPC requests in Postman . For more info about gRPC, they created a great beginner article here.
The Background
How to optimize the EF Core queries?
Write them properly the first time.
We often hear about optimizing a piece of code, or a query, and it mostly refers to "refactoring code that wasn't written well from the beginning".
He did not optimize anything, he changed the code to be as it should have been from the beginning.
Regarding Entity Framework Core, there are many rules, today I will present you 5 basic ones that should be followed when writing queries.
Let's dive in...
1# Get fields you need only
Let's see the example:
var employeeList = context.Employees
.Select(e => new EmployeeDto
{
Name = e.Name,
Email = e.Email
})
.ToList();
In the database, it is very likely that there are more than 2 fields. But we on the presentation layer don't need anything more than Name and Email, so why would we extract all the data? There is no need.
Why?
1. Reduced Data Transfer:
2. Lower Memory Usage:
Loading fewer fields means using less memory in your application, which is particularly important for large datasets or when dealing with resource-constrained environments.
Queries can execute faster because the database engine has less data to retrieve and transmit.
4. Decreased Entity Tracking Overhead:
In EF Core, when entities are tracked, there is overhead associated with change tracking. By retrieving only needed fields, you reduce this overhead, especially when tracking changes is not necessary (e.g., in read-only scenarios).
2# Avoid N+1 Queries
Let' take a look on the example:
// Retrieve all blogs - 1 query
var blogs = context.Blogs.ToList();
foreach (var blog in blogs)
{
// For each blog, this will trigger an additional query // to fetch its posts - N queries
var posts = blog.Posts;
foreach (var post in posts)
{
Console.WriteLine(post.Title);
}
}
Let' see how to fix this:
// Retrieve all blogs and their posts in a single query using eager loading
var blogs = context.Blogs.Include(b => b.Posts).ToList();
foreach (var blog in blogs)
{
foreach (var post in blog.Posts)
{
Console.WriteLine(post.Title);
}
}
Eager Loading with Include:
The Include(b => b.Posts) method tells EF Core to load the Posts related to each Blog as part of the initial query. This way, when you access, the data is already loaded, and no additional queries are needed.
Single Query Execution:
The context.Blogs.Include(b => b.Posts).ToList() line executes a single query that retrieves all the Blogs along with their associated Posts. This effectively eliminates the N+1 queries issue.
Iterating Over Results:
The nested foreach loops then iterate over the blogs and their posts. Since the posts are already loaded into memory, no additional queries are executed during iteration.
Why?
By reducing the number of queries, you significantly decrease the load on the database and the network overhead.
2. Simplified Code Logic:
The code is more straightforward and easier to maintain as it explicitly states the intention to load related data.
3# Using .AsNoTracking()
Let's see the example:
var products = context.Products.AsNoTracking().ToList();
// Use products for read-only purposes
Suppose you have a Product entity and you want to display a list of products on a web page.
In this example, the products list is retrieved without the overhead of change tracking.
Why?
Particularly noticeable in large-scale applications or when dealing with large datasets. Queries execute faster, and less memory is consumed.
2. Reduced Overhead:
Since EF Core does not need to track changes or maintain state information for these entities, there's a reduction in overhead.
Considerations
- Not Suitable for CUD Operations:
AsNoTracking should not be used when you plan to update, delete, or otherwise modify the entities. In such cases, EF Core's change tracking is necessary to persist these changes back to the database.
- Best Used for Stateless Operations:
Ideal for scenarios like API requests where each request is independent, and you don't need to track the entity's state across multiple operations.
4# Avoiding Cartesian Explosion
Suppose we want to list all books along with their authors, but we mistakenly create a query that leads to a cartesian explosion:
// Incorrect query leading to cartesian explosion
var query = from a in context.Authors
from b in context.Books
select new { a.Name, b.Title };
var results = query.ToList(); // This will produce a cartesian product
In this query, we're incorrectly combining every author with every book, regardless of whether the book was written by that author.
The corrected version involves properly joining the Authors and Books tables based on the AuthorId:
// Correct query using a proper join
var query = from a in context.Authors
join b in context.Books on a.AuthorId equals b.AuthorId
select new { a.Name, b.Title };
var results = query.ToList(); // This produces the correct result
In this corrected query, each book is matched with its corresponding author using a proper join condition. This eliminates the cartesian explosion, resulting in a more efficient query and accurate results.
The Impact
The query becomes slower due to the large volume of data being processed.
2. Resource Intensive:
3.Inaccurate Results:
The results are often bloated with duplicate or irrelevant data, making them practically useless.
5# Use AsSplitQuery()
Using AsSplitQuery() is particularly useful in scenarios where a single query could result in an inefficient execution plan or a cartesian explosion.
Consider an example where you have Author and Book entities, and you want to retrieve authors with their books.
A typical query without using AsSplitQuery():
var authors = context.Authors
.Include(a => a.Books)
.ToList();
This query retrieves all authors and their books in a single SQL query.
Using AsSplitQuery() to optimize the above query:
var authors = context.Authors
.Include(a => a.Books)
.AsSplitQuery()
.ToList();
Explanation
- Without AsSplitQuery():
EF Core generates a single SQL query with joins, which could be less efficient if the Books collection is large.
- With AsSplitQuery():
EF Core generates separate SQL queries: one for authors and one for books. This can be more efficient in cases of large datasets or complex relationships.
Benefits
2. Reduced Memory Overhead:
Helps to avoid cartesian explosions that consume large amounts of memory.
3. Flexibility:
Offers an alternative to the default join-based query approach, allowing for optimization based on specific query and data characteristics.
Considerations
1. More Database Round-Trips:
While AsSplitQuery() can improve efficiency, it results in more database round-trips. This should be considered, especially in environments where database latency is a concern.
2. Use Case Specific:
Its benefits are more pronounced in certain scenarios, particularly with large datasets. It may not always be the best choice for every query.
Conslusion
Don't wait until you run into a problem with your query due to carelessness and poorly written code. From the very beginning, do your best to meet the basics, and considering how far the Entity Framework Core has progressed, it is very likely that this will be enough for you.
In addition to these, there are many other practices that should be followed.
That's all from me today.