Channel: Entity Framework
Viewing all articles
Browse latest Browse all 10318

Updated Wiki: Task-based Asynchronous Pattern support in EF.


EF6 will support the new simplified approach to asynchronous programming introduced in Visual Studio 2012. You can read more about it here.

Goals

Calling long-running methods asynchronously can have a positive effect on these aspects of a data driven solution:

  • Server scalability
  • Client responsiveness

Server scalability

In typical server-side scenarios, multiple threads are allocated to process multiple overlapping client requests.

Whenever one of these threads performs a blocking operation (i.e. an operation that accesses the network or some other form of I/O), the thread remains idle while it waits for the operation to complete. If new requests arrive in the meantime, each of them is assigned its own thread.

Depending mainly on the rate at which new client requests arrive, and the time it takes for each request to be served, the number of threads executing (or blocked) on the server might grow rapidly.

Threads have a significant memory footprint (about 1 MB of virtual memory space each); therefore too many simultaneous threads on a server can easily exhaust the available memory long before processor utilization comes near 100%, and memory becomes the main bottleneck for the server’s throughput.

In cases like these it is possible to improve scalability without changing hardware by using non-blocking calls when communicating with external resources: threads don’t need to waste time waiting for those calls to complete, and can instead be returned to the thread pool and reused to service other incoming requests.

Non-blocking I/O calls help in keeping the thread count low, removing the memory bottleneck and making it possible for the application to scale better on the same hardware.
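As a rough illustration of the point above, the following sketch starts many overlapping "requests" whose only work is a non-blocking wait. Task.Delay stands in for a database round trip, and the names ScalabilityDemo, HandleRequestAsync and HandleManyAsync, as well as the 200 ms delay, are invented for the example; none of this is EF API.

```csharp
using System.Linq;
using System.Threading.Tasks;

public static class ScalabilityDemo
{
    // Hypothetical request handler. Task.Delay simulates a non-blocking
    // I/O call: no thread is held while the delay is pending.
    public static async Task<int> HandleRequestAsync(int requestId)
    {
        await Task.Delay(200);
        return requestId;
    }

    // Starts all requests up front, then waits for them together.
    // The pending requests do not each occupy a thread while they wait.
    public static async Task<int[]> HandleManyAsync(int count)
    {
        var pending = Enumerable.Range(0, count)
                                .Select(id => HandleRequestAsync(id))
                                .ToArray();
        return await Task.WhenAll(pending);
    }
}
```

If Task.Delay were replaced with Thread.Sleep, each simulated request would hold a thread for the full 200 ms, which is the situation the preceding paragraphs describe.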

For an illustrative description of this impact see F# async on the server side.

Client responsiveness

On the other hand, even with the progress and broad availability of high-speed network connectivity, latency is still one of the dominant factors in the usability of distributed applications. Data intensive applications in the cloud can rapidly degrade if their user interfaces spend most of the time blocked waiting for responses from the server.

The new Task-based Asynchronous Pattern (TAP) gives users a simple way to call asynchronously the long-running methods that currently make the UI unresponsive.

Non-Goals

  • Thread safety
  • Lazy loading

Thread safety

While thread safety would make async more useful, it is an orthogonal feature. Until it’s implemented, the async methods will contain safeguards that detect concurrent invocations.
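One way such a safeguard could work is sketched below. This is an assumption for illustration, not EF's actual implementation: ConcurrencyGuard and RunAsync are hypothetical names, and the check is a simple atomic flag.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical guard: rejects an async operation that starts while an
// earlier one on the same guard is still in flight.
public sealed class ConcurrencyGuard
{
    private int _busy; // 0 = idle, 1 = an operation is running

    public async Task<T> RunAsync<T>(Func<Task<T>> operation)
    {
        // Atomically flip 0 -> 1. Any other prior value means that
        // another operation has not completed yet.
        if (Interlocked.CompareExchange(ref _busy, 1, 0) != 0)
        {
            throw new InvalidOperationException(
                "A second operation started before the previous one completed.");
        }

        try
        {
            return await operation().ConfigureAwait(false);
        }
        finally
        {
            // Release the guard whether the operation succeeded or failed.
            Volatile.Write(ref _busy, 0);
        }
    }
}
```

Because RunAsync is itself an async method, the InvalidOperationException is stored in the returned task and surfaces when that task is awaited.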

Lazy loading

Lazy loading was one of the most requested features we added in .NET 4. It is quite a powerful feature because it allows “virtualizing” the navigation over a graph of objects that is actually stored in the database, providing the illusion that it is completely loaded into memory. This allows for better separation of concerns and simpler code, but it has a number of disadvantages.

One of the main criticisms of lazy loading is that the cost of reading a property becomes nondeterministic. There seems to be no place for this kind of nondeterminism in the scenarios in which we expect Task-based async to be critical.

In other words, someone who is optimizing for server throughput should arguably avoid lazy loading and use eager or explicit loading instead.

However, there is a hybrid approach we could consider supporting in the future:

public class Order
{
    public virtual Task<Customer> CustomerAsync { get; set; }
    …
}

var order = await context.Orders.FindAsync(1);
var customer = await order.CustomerAsync;

In this case, EF would need to recognize that a property returning a Task<T>, where T is an entity type, is actually an “async navigation property”, and create the appropriate Object-Conceptual mapping for the underlying property. Since the property is virtual, EF can generate a dynamic proxy that implements the lazy loading.

We should keep in mind that using the Task-based async pattern with properties dilutes some of the transparency that lazy loading provides. That might be OK: thanks to TAP support in the language, the code doesn’t look too different from the sync version; the navigation just becomes explicitly asynchronous.

Another important challenge would be how to refer to an async navigation property in a LINQ expression, given that the .NET languages currently don't support construction of lambda expressions containing await.

Dependencies

We are able to provide TAP support in EF by basing our implementation on the new async API in the ADO.NET provider model and on the async and await keywords introduced in Visual Studio 2012.

However, if a specific provider doesn’t implement the asynchronous methods, they will fall back to synchronous execution without any warning.
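The fallback can be pictured with the sketch below. CommandBase, SyncOnlyCommand and AsyncCommand are hypothetical stand-ins, not ADO.NET types; the sketch only assumes that the base class's default async method wraps the synchronous one, which is the pattern the paragraph above describes.

```csharp
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a provider base class.
public abstract class CommandBase
{
    public abstract int Execute();

    // Default "async" implementation: runs the synchronous method on the
    // calling thread and returns an already-completed task.
    public virtual Task<int> ExecuteAsync()
    {
        return Task.FromResult(Execute());
    }
}

// A provider that does not override ExecuteAsync blocks the caller,
// even though the caller uses the async API.
public sealed class SyncOnlyCommand : CommandBase
{
    public override int Execute()
    {
        Thread.Sleep(50); // simulated blocking I/O
        return 1;
    }
}

// A provider that overrides ExecuteAsync can release the thread
// while the simulated I/O is pending.
public sealed class AsyncCommand : CommandBase
{
    public override int Execute()
    {
        Thread.Sleep(50); // simulated blocking I/O
        return 1;
    }

    public override async Task<int> ExecuteAsync()
    {
        await Task.Delay(50).ConfigureAwait(false); // simulated non-blocking I/O
        return 1;
    }
}
```

With SyncOnlyCommand, the task returned by ExecuteAsync is already complete when the call returns, so the caller gains nothing from awaiting it and receives no indication that the work ran synchronously.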

Design

We are aiming to introduce async versions of the methods that perform network I/O and could become the bottleneck on either the client or the server. A typical example of this is iterating over the results of a query:

var query = from e in context.Employees
            where e.Name.StartsWith("a")
            select e;

foreach (var employee in query)
{
    Console.WriteLine(employee);
}

However, there’s currently no async equivalent of a foreach statement, so we will add an extension method that offers the same functionality:

await query.ForEachAsync(employee =>
{
    Console.WriteLine(employee);
});
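To show the general shape of such an extension method, here is a minimal sketch over a hypothetical async cursor. IAsyncCursor<T> and ListCursor<T> are invented for the example and are not EF's interfaces; the real implementation may differ in names, cancellation support and error handling.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical async cursor: advances asynchronously, exposes the current item.
public interface IAsyncCursor<T>
{
    Task<bool> MoveNextAsync();
    T Current { get; }
}

public static class AsyncCursorExtensions
{
    // Pulls one element at a time without blocking and invokes the
    // action for each element, like a foreach body.
    public static async Task ForEachAsync<T>(this IAsyncCursor<T> cursor, Action<T> action)
    {
        while (await cursor.MoveNextAsync().ConfigureAwait(false))
        {
            action(cursor.Current);
        }
    }
}

// In-memory cursor that yields before each element to simulate I/O latency.
public sealed class ListCursor<T> : IAsyncCursor<T>
{
    private readonly IReadOnlyList<T> _items;
    private int _index = -1;

    public ListCursor(IReadOnlyList<T> items) { _items = items; }

    public T Current { get { return _items[_index]; } }

    public async Task<bool> MoveNextAsync()
    {
        await Task.Yield();
        return ++_index < _items.Count;
    }
}
```

A caller would write `await new ListCursor<string>(names).ForEachAsync(n => Console.WriteLine(n));`, where names is any list of strings, which reads much like the synchronous loop.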

We will also add async counterparts of the IQueryable extension methods that don’t return a collection, e.g.:

var firstEmployee = await query.FirstAsync();

API Usage

Here are some examples of typical scenarios implemented with async methods.

Querying

var emptyCategories = from category in context.Categories
                      where category.Products.Count == 0
                      select category;

var emptyCategoriesCount = await emptyCategories.CountAsync();

Saving changes

// Modify
var product1 = await context.Products.FindAsync(1);
product1.Name = "Smarties";

// Delete
var product2 = await context.Products.FindAsync(2);
context.Products.Remove(product2);

// Add
var product3 = new Product() { Name = "Branston Pickle" };
context.Products.Add(product3);

// Save
int savedCount = await context.SaveChangesAsync();

Console.WriteLine("Affected entities: " + savedCount); // 3

Loading

await context.Categories.Include(c => c.Products).LoadAsync();

Raw SQL Queries

var categories = await context.Database.SqlQuery<Category>(
    "select * from Categories").ToListAsync();

Limitations

While TAP has some real advantages, it is not for everyone. Asynchronous invocations introduce considerable overhead and can easily degrade performance if not used correctly. As with any performance-related change, establish goals and perform measurements before making any modifications to your applications.

Challenges

Code duplication

All of the new methods have equivalent behavior and implementation to the existing synchronous ones. However, they return Task or Task<T>, which makes it difficult to unify the implementations without decreasing the performance of the synchronous methods, since Task creation usually means allocation of a new object.

When there’s only a single place in the method where a Task is created and it’s a tail call, it can be replaced by a delegate, but this is rarely the case.

We can still extract parts of the implementation as long as they don’t contain async method calls. Also, both implementations are placed consecutively in the source code, so it’s easy to change both when needed.
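The arrangement described above might look like the following sketch, which uses a stream checksum rather than EF code. Checksum, Compute, ComputeAsync and the helper names are all hypothetical; the point is only that the helpers contain no async calls, so the sync and async versions can share them.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class Checksum
{
    // Synchronous version.
    public static int Compute(Stream input)
    {
        var buffer = CreateBuffer(input);
        int sum = 0, read;
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            sum = Accumulate(sum, buffer, read);
        }
        return sum;
    }

    // Asynchronous version, placed right next to its sync counterpart.
    // Only the I/O call differs; the loop itself has to be duplicated.
    public static async Task<int> ComputeAsync(Stream input)
    {
        var buffer = CreateBuffer(input);
        int sum = 0, read;
        while ((read = await input.ReadAsync(buffer, 0, buffer.Length).ConfigureAwait(false)) > 0)
        {
            sum = Accumulate(sum, buffer, read);
        }
        return sum;
    }

    // Shared helper: argument validation and setup, no async calls.
    private static byte[] CreateBuffer(Stream input)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));
        return new byte[4096];
    }

    // Shared helper: pure computation over the bytes read so far.
    private static int Accumulate(int sum, byte[] buffer, int count)
    {
        for (int i = 0; i < count; i++)
        {
            sum = unchecked(sum + buffer[i]);
        }
        return sum;
    }
}
```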

Performance considerations

While performance optimizations should be done after establishing goals and collecting measurements, some practices are known to be beneficial enough to adopt in the initial implementation. One of them is calling ConfigureAwait(false) on tasks before awaiting them. A large portion of the async overhead comes from marshaling the continuation back to the original context. In a library this is not only unnecessary, it can also result in a deadlock when the caller blocks on the returned task from code where the context is important, like a UI thread.
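As a small illustration of the practice, the sketch below applies ConfigureAwait(false) inside a library-style method. TextLibrary and CountLinesAsync are hypothetical names; TextReader.ReadLineAsync is the standard .NET method.

```csharp
using System.IO;
using System.Threading.Tasks;

public static class TextLibrary
{
    // Library-style method: nothing after the await depends on the caller's
    // context, so the continuation is allowed to run on any thread.
    public static async Task<int> CountLinesAsync(TextReader reader)
    {
        int lines = 0;
        while (await reader.ReadLineAsync().ConfigureAwait(false) != null)
        {
            lines++;
        }
        return lines;
    }
}
```

ConfigureAwait(false) only affects the awaits inside CountLinesAsync. A UI caller that awaits the returned task without ConfigureAwait still resumes on its own context.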

If profiling shows that the async overhead is significant in the critical path of a main scenario we should consider doing the following:

Minimizing the number of async method calls on the stack. The compiler can’t inline methods in an async method, so each invocation in an async method carries overhead. Also, when an async method yields, all of its local variables are saved to the heap, so async methods should have fewer local variables.

As a last resort for improving performance, we can drop the async keyword and implement TAP manually with TaskCompletionSource<TResult>.
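For reference, a manual TAP implementation has roughly the shape sketched below. ManualTap, BeginCompute and ComputeAsync are hypothetical; BeginCompute stands in for a lower-level callback-based operation, and the sketch is not taken from the EF code base.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ManualTap
{
    // Hypothetical callback-based operation standing in for a lower-level API.
    public static void BeginCompute(int input, Action<int> onSuccess, Action<Exception> onError)
    {
        ThreadPool.QueueUserWorkItem(_ =>
        {
            int result;
            try
            {
                result = checked(input * 2); // throws OverflowException on overflow
            }
            catch (Exception ex)
            {
                onError(ex);
                return;
            }
            onSuccess(result);
        });
    }

    // TAP wrapper written without the async keyword, so the compiler does
    // not generate a state machine for this method.
    public static Task<int> ComputeAsync(int input)
    {
        var tcs = new TaskCompletionSource<int>();
        BeginCompute(
            input,
            result => tcs.TrySetResult(result),
            error => tcs.TrySetException(error));
        return tcs.Task;
    }
}
```

The cost of this approach is that completion, failure and (if needed) cancellation all have to be wired up by hand, which is why it is listed as a last resort.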

You can read more about performance challenges in async methods here.

