Task.Run using custom thread pool - c#

I have to address a temporary situation that requires me to do a non-ideal thing: I have to call an async method from inside a sync one.
Let me just say here that I know all about the problems I'm getting myself into and I understand reasons why this is not advised.
That said, I'm dealing with a large codebase that is completely sync from top to bottom, and there is no way I can rewrite everything to use async/await in a reasonable amount of time. But I do need to rewrite a number of small parts of this codebase to use the new async API that I've been slowly developing over the last year or so, because it has a lot of new features that the old codebase would benefit from but can't get for legacy reasons. And since all that code isn't going away any time soon, I'm facing a problem.
TL;DR: A large sync codebase cannot be easily rewritten to support async but now requires calls into another large codebase, which is completely async.
I'm currently doing the simplest thing that works in the sync codebase: wrapping each async call into a Task.Run and waiting for the Result in a sync way.
The problem with this approach is that it becomes slow whenever the sync codebase does this in a tight loop. I understand the reasons and I sort of know what I can do: I'd have to make sure that all async calls are started on the same thread instead of borrowing a new one from the thread pool each time (which is what Task.Run does). This borrowing and returning incurs a lot of switching, which can slow things down considerably if done a lot.
What are my options, short of writing my own scheduler that would prefer to reuse a single dedicated thread?
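One option that needs no custom scheduler, sketched here on the assumption that the loop itself can be moved into async code, is to hop to the thread pool once for the whole batch instead of once per item. LoadAsync is a hypothetical stand-in for a call into the async API; nothing in this block is taken from the real codebase.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SyncBridge
{
    // Hypothetical stand-in for one call into the async codebase.
    public static async Task<string> LoadAsync(int id)
    {
        await Task.Yield(); // simulates an asynchronous step
        return "component-" + id;
    }

    // Sync entry point: a single Task.Run wraps the whole loop, so the
    // calling thread blocks once rather than once per iteration.
    public static List<string> LoadAll(IEnumerable<int> ids)
    {
        return Task.Run(async () =>
        {
            var results = new List<string>();
            foreach (var id in ids)
            {
                results.Add(await LoadAsync(id));
            }
            return results;
        })
        // GetAwaiter().GetResult() rethrows the original exception
        // instead of wrapping it in an AggregateException like .Result.
        .GetAwaiter().GetResult();
    }
}
```

Whether this helps depends on how much of the loop body can live inside the lambda; sync code that must run between iterations would still force one blocking wait per item.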
UPDATE: To better illustrate what I'm dealing with, I offer an example of one of the simplest transformations I need to do (there are more complex ones as well).
It's basically a simple LINQ query that uses a custom LINQ provider under the hood. There's no EF or anything similar underneath.
[Old code]
var result = (from c in syncCtx.Query("Components")
where c.Property("Id") == id
select c).SingleOrDefault();
[New code]
var result = Task.Run(async () =>
{
    Dictionary<string, object> data;
    using (AuthorizationManager.Instance.AuthorizeAsInternal())
    {
        var uow = UnitOfWork.Current;
        var source = await uow.Query("Components")
            .Where("Id = @id", new { id })
            .PrepareAsync();
        var single = await source.SingleOrDefaultAsync();
        data = single.ToDictionary();
    }
    return data;
}).Result;
As mentioned, this is one of the less complicated examples and it already contains 2 async calls.
UPDATE 2: I tried removing the Task.Run and invoking .Result directly on the result of a wrapper async method, as suggested by @Evk and @Manu. Unfortunately, while testing this in my staging environment, I quickly ran into a deadlock. I'm still trying to understand what exactly transpired, but it's obvious that Task.Run cannot simply be removed in my case. There are additional complications to be resolved first...

I don't think you are on the right track. Wrapping every async call in a Task.Run seems horrible to me; it always starts an additional task that you don't need. But I understand that introducing async/await in a large codebase can be problematic.
I see a possible solution: Extract all async calls into separate, async methods. This way, your project will have a pretty nice transition from sync to async, since you can change methods one by one without affecting other parts of the code.
Something like this:
private Dictionary<string, object> GetSomeData(string id)
{
var syncCtx = GetContext();
var result = (from c in syncCtx.Query("Components")
where c.Property("Id") == id
select c).SingleOrDefault();
DoSomethingSyncWithResult(result);
return result;
}
would become something like this:
private Dictionary<string, object> GetSomeData(string id)
{
    var result = FetchComponentAsync(id).Result;
    DoSomethingSyncWithResult(result);
    return result;
}
// The parameter type matches the string id passed in by GetSomeData.
private async Task<Dictionary<string, object>> FetchComponentAsync(string id)
{
    using (AuthorizationManager.Instance.AuthorizeAsInternal())
    {
        var uow = UnitOfWork.Current;
        var source = await uow.Query("Components")
            .Where("Id = @id", new { id })
            .PrepareAsync();
        var single = await source.SingleOrDefaultAsync();
        return single.ToDictionary();
    }
}
Since you are in an ASP.NET environment, mixing sync with async is a very bad idea. I'm surprised that your Task.Run solution works for you. The more you incorporate the new async codebase into the old sync codebase, the more you will run into problems, and there is no easy fix for that except rewriting everything in an async way.
I strongly suggest that you not mix your async parts into the sync codebase. Instead, work from the bottom to the top: change everything from sync to async wherever you need to await an async call. It may seem like a lot of work, but the benefits are much greater than if you look for "hacks" now and don't fix the underlying problems.

Related

Calling an async method with `.Result` in a Linq expression causes deadlock

We ran into a bug where we had to validate a list of objects with an async method. The writer of the code wanted to just stuff it into a Linq expression like so:
var invalidObjects = list
.Where(x => _service.IsValidAsync(x).Result)
.ToList();
The validating method looked something like this:
public async Task<bool> IsValidAsync(object @object) {
    var validObjects = await _cache.GetAsync<List<object>>("ValidObjectsCacheKey");
    return validObjects.Contains(@object);
}
This little solution caused the whole application to hang on the await _cache.GetAsync line.
The cache is a distributed cache (redis).
After changing the linq to a simple foreach and properly awaiting _service.IsValidAsync, the code ran deadlock-free and basically in an instant.
I understand on a basic level how async-await works, but I can't wrap my head around why this happened especially because the list only had one object.
Any suggestion is welcome!
EDIT: The application is running on .net Core 2.2, but the library in which the problem happened is targeting .netstandard 2.0.
System.Threading.SynchronizationContext.Current returns null at the time of the deadlock
EDIT2: It turns out that changing the cache provider (while still accessing it via an async method) also resolves the issue, so the bug might actually be in the redis cache client:
https://github.com/aspnet/Caching/blob/master/src/Microsoft.Extensions.Caching.StackExchangeRedis/RedisCache.cs
Apart from the already stated issue of mixing async-await and blocking calls like .Result or .Wait()
Reference Async/Await - Best Practices in Asynchronous Programming
To summarize this second guideline, you should avoid mixing async and blocking code. Mixed async and blocking code can cause deadlocks, more-complex error handling and unexpected blocking of context threads. The exception to this guideline is the Main method for console applications, or—if you’re an advanced user—managing a partially asynchronous codebase.
Sometimes the simple approach, as you have already discovered, is to traverse the list and properly await the asynchronous function
For example
var invalidObjects = //...
foreach(var x in list){
if(!(await _service.IsValidAsync(x)))
invalidObjects.Add(x);
}
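The loop above can also be packaged as a small reusable helper. This is only a sketch: WhereNotAsync is a hypothetical name, not a framework method, and it awaits the checks one at a time rather than in parallel.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AsyncFilter
{
    // Returns the items for which the asynchronous predicate yields false,
    // awaiting each check in turn instead of blocking on .Result.
    public static async Task<List<T>> WhereNotAsync<T>(
        IEnumerable<T> source, Func<T, Task<bool>> predicate)
    {
        var rejected = new List<T>();
        foreach (var item in source)
        {
            if (!await predicate(item))
            {
                rejected.Add(item);
            }
        }
        return rejected;
    }
}
```

With the names from the question, the call site would read: var invalidObjects = await AsyncFilter.WhereNotAsync(list, x => _service.IsValidAsync(x));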

Implementing Awaitable Async method for NCache Get

We're in the process of refactoring all, or a large portion of our .NET 4.6 MVC WebAPI controller methods to async methods.
This seems like it will work well for methods that have a lower level invocation of a method that is awaitable such as SQL command execution; however we are utilizing an in-memory distributed caching framework by Alachisoft called NCache (4.6 SP2 to be exact), which does not offer any truly asynchronous methods.
Would it be worth creating an async helper method that would expose an awaitable Task<object> return type?
Traditionally using the NCache API, you Get an object from cache by a Key in the following usage;
NCacheObject.Get(string);
The proposal is to create a helper method of the following;
protected static async Task<Object> GetAsync(string key, Cache nCache)
{
Task<Object> getTask = new Task<Object>(() => nCache.Get(key));
getTask.Start();
return await getTask.ConfigureAwait(false);
}
So that it would then allow full waterfall of async methods up to the entry controller method as such;
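public static async Task<Tuple<List<SomeModel>, bool, string>> GetSomeModelList(string apiKey)
{
    // GetAsync returns Task<Object>, so the cached value is cast back to the list type.
    var models = (List<SomeModel>)await GetAsync("GetSomeModelKey", CacheInstance).ConfigureAwait(false);
    return new Tuple<List<SomeModel>, bool, string>(models, true, "Success message");
}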
And finally the controller method;
[HttpGet, Route("Route/Method")]
public async Task<ResponseOutputModel<List<SomeModel>>> GetSomeModelList()
{
ResponseOutputModel<List<SomeModel>> resp = new ResponseOutputModel<List<SomeModel>>();
try
{
Tuple<List<SomeModel>, Boolean, String> asyncResp = await CacheProcessing.GetSomeModelList(GetApiKey()).ConfigureAwait(false);
resp.Response = asyncResp.Item1;
resp.Success = asyncResp.Item2;
resp.Message = asyncResp.Item3;
}
catch (Exception ex)
{
LoggingHelper.Write(ex);
resp.StatusCode = Enumerations.ApiResponseStatusCodes.ProcessingError;
resp.Success = false;
resp.Message = ex.Message;
}
return resp;
}
This would complicate the refactoring, as the original methods actually had output parameters for bool success and string message; but it seems this can be accomplished in a decent and quick manner using a Tuple<>; otherwise we could just create a return type model.
To do this properly, there would be many hundreds of methods to refactor; quite an undertaking.
Would this work as expected, and be the best solution to accomplish the objective?
Is it likely worth all of the effort required, with the end goal of increasing scalability and subsequently "performance" of the web servers?
Would it be worth creating an async helper method that would expose a awaitable Task return type?
Nope.
The proposal is to create a helper method of the following
This just queues in-memory work to the thread pool.
(Side note: The task constructor should never be used. Ever. If you need to queue work to the thread pool, use Task.Run instead of the task constructor with Start)
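As a rough illustration of that side note, the constructor-plus-Start pair collapses into a single Task.Run call. The get delegate is a hypothetical stand-in for the NCache lookup, whose API is not reproduced here, and the result is still synchronous work queued to the thread pool.

```csharp
using System;
using System.Threading.Tasks;

public static class CacheHelpers
{
    // Queues the synchronous lookup to the thread pool in one call,
    // replacing "new Task<object>(...)" followed by Start().
    // The lookup itself still occupies a pool thread while it runs.
    public static Task<object> GetViaThreadPool(Func<string, object> get, string key)
    {
        return Task.Run(() => get(key));
    }
}
```

This only tidies the syntax; it does not turn the cache call into true asynchrony.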
Would this work as expected, and be the best solution to accomplish the objective?
Is it likely worth all of the effort required, with the end goal of increasing scalability and subsequently "performance" of the web servers?
These are the same question. The objective is scalability; asynchrony is just one means to help you accomplish it.
Asynchrony assists scalability on ASP.NET because it frees up a thread that can process other requests. However, if you make methods that are asynchronous by using another thread, then that doesn't assist you at all. I call this "fake asynchrony" - these kinds of methods look asynchronous but they're really just synchronously running on the thread pool.
In contrast to true asynchrony, fake asynchrony will actually hurt your scalability.
I'm saying this without any direct experience with NCache, only from experience with other cache systems, and on the assumption that this is an in-memory cache.
Yes, this would work of course. I'd rather opt for a response struct or a generic class that allows me to define my response type (if it's possible at all). Tuples are not bad, probably would be more performant if you opt for classes instead. But they are not easy on the eyes.
Now, when it comes to performance, there is one question. This is a cache server and you want fast reads and writes, of course. Accessing a memory cache should be cheap. As you're accessing memory directly, making the call async won't give you much from a performance point of view. Can you make sure your code is fully async and uses the thread pool properly? Yes, of course, but all it would ever do is add an extra layer of work when you access your cache.
And the moment you're going for async, please do make sure that your memory cache is thread safe. :)

will async Rest API call with threads affect performance?

So one of my colleagues was saying to me:
"Why would somebody want to call async REST APIs with a thread? It's not beneficial, and in return it will create a separate thread for every request."
My scenario is-
public async Task<ApiResponse<ProductLookupDto>> GetProductsByValueOfTheDay()
{
    string url = "http://domain.com?param=10120&isBasic=true";
    var result = Task.Run(() => JsonConvert.DeserializeObject<ProductLookupDto>(SerializedResults.GET(url)));
    return new ApiResponse<ProductLookupDto>()
    {
        Data = await result,
        Message = "success"
    };
}
So here I am using async with threading. I know it will create a separate thread for the request, which is my concern about performance.
I want to know: if several of my methods are called together and the response is async, as it is now, will the threads affect performance?
Keep in mind that the response from the REST call is very large.
Your colleague does not really know what he is talking about: a Task will NOT create a separate thread for every request. Rather, the task scheduler determines the threads, and the default scheduler uses the thread pool. This means that threads are pre-allocated and reused, especially if you schedule a LOT of tasks. Also, threads are released while the request runs, thanks to completion ports. Your colleague shows a serious lack of basics here.
I think your colleague is right to be concerned with that code, and you are right to say "it pretends to be async but not really". And Jon Skeet is right to point out that the real problem here lies with SerializedResults.GET. Whatever the implementation of that is exactly, it's clearly a blocking call; otherwise it would return a Task<string> (that can be awaited) instead of a string. Wrapping a synchronous call in Task.Run and calling it asynchronous is a common mistake. You haven't eliminated blocking at all; you've simply moved the blocking call to a different thread.
So to make this asynchronous, SerializedResults.GET needs to go. It should be replaced by a call that uses a truly asynchronous library like HttpClient. Since this is a JSON API (and I'm a little bit biased), I'll show you an example using my Flurl library, which is basically a wrapper around HttpClient and Json.NET that provides some convenience methods to cut down on the noise a bit:
public async Task<ApiResponse<ProductLookupDto>> GetProductsByValueOfTheDay()
{
string url = "http://domain.com?param=10120&isBasic=true";
var result = await url.GetJsonAsync<ProductLookupDto>();
return new ApiResponse<ProductLookupDto>()
{
Data = result,
Message = "success"
};
}
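For readers who prefer not to add a dependency, roughly the same shape can be sketched with plain HttpClient and System.Text.Json. The members of ProductLookupDto and ApiResponse<T> below are assumptions made for illustration (the question does not show them), and ProductClient and Wrap are hypothetical names.

```csharp
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

// Assumed shape; the real DTO is not shown in the question.
public class ProductLookupDto
{
    public string Name { get; set; }
}

public class ApiResponse<T>
{
    public T Data { get; set; }
    public string Message { get; set; }
}

public class ProductClient
{
    private readonly HttpClient _http;

    public ProductClient(HttpClient http)
    {
        _http = http;
    }

    // Pure helper: turns a JSON payload into the response envelope.
    public static ApiResponse<ProductLookupDto> Wrap(string json)
    {
        var dto = JsonSerializer.Deserialize<ProductLookupDto>(json);
        return new ApiResponse<ProductLookupDto> { Data = dto, Message = "success" };
    }

    public async Task<ApiResponse<ProductLookupDto>> GetProductsByValueOfTheDay(string url)
    {
        // No thread is blocked while the HTTP request is in flight.
        string json = await _http.GetStringAsync(url);
        return Wrap(json);
    }
}
```

The HttpClient is passed in so that a single instance can be shared across requests rather than created per call.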

What is causing this particular method to deadlock?

As best as I can, I opt for async all the way down. However, I am still stuck using ASP.NET Membership which isn't built for async. As a result my calls to methods like string[] GetRolesForUser() can't use async.
In order to build roles properly I depend on data from various sources so I am using multiple tasks to fetch the data in parallel:
public override string[] GetRolesForUser(string username) {
...
Task.WaitAll(taskAccounts, taskContracts, taskOtherContracts, taskMoreContracts, taskSomeProduct);
...
}
All of these tasks are simply fetching data from a SQL Server database using the Entity Framework. However, the introduction of that last task (taskSomeProduct) is causing a deadlock while none of the other methods have been.
Here is the method that causes a deadlock:
public async Task<int> SomeProduct(IEnumerable<string> ids) {
    var q = from c in this.context.Contracts
            join p in this.context.Products
            on c.ProductId equals p.Id
            where ids.Contains(c.Id)
            select p.Code;
    //Adding .ConfigureAwait(false) fixes the problem here
    var codes = await q.ToListAsync();
    var slotCount = codes.Sum(p => char.GetNumericValue(p, p.Length - 1));
    return Convert.ToInt32(slotCount);
}
However, this method (which looks very similar to all the other methods) isn't causing deadlocks:
public async Task<List<CustomAccount>> SomeAccounts(IEnumerable<string> ids) {
return await this.context.Accounts
.Where(o => ids.Contains(o.Id))
.ToListAsync()
.ToCustomAccountListAsync();
}
I'm not quite sure what it is about that one method that is causing the deadlock. Ultimately they are both doing the same task of querying the database. Adding ConfigureAwait(false) to the one method does fix the problem, but I'm not quite sure what differentiates itself from the other methods which execute fine.
Edit
Here is some additional code which I originally omitted for brevity:
public static Task<List<CustomAccount>> ToCustomAccountListAsync(this Task<List<Account>> sqlObjectsTask) {
var sqlObjects = sqlObjectsTask.Result;
var customObjects = sqlObjects.Select(o => PopulateCustomAccount(o)).ToList();
return Task.FromResult<List<CustomAccount>>(customObjects);
}
The PopulateCustomAccount method simply returns a CustomAccount object from the database Account object.
In ToCustomAccountListAsync you call Task.Result. That's a classic deadlock. Use await.
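A minimal sketch of what "use await" could look like for that extension method follows. Account, CustomAccount and the body of PopulateCustomAccount are placeholders, since the question does not show them.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Placeholder types; the real ones are not shown in the question.
public class Account { public string Id { get; set; } }
public class CustomAccount { public string Id { get; set; } }

public static class AccountExtensions
{
    // Placeholder for the question's PopulateCustomAccount mapping.
    private static CustomAccount PopulateCustomAccount(Account account)
    {
        return new CustomAccount { Id = account.Id };
    }

    public static async Task<List<CustomAccount>> ToCustomAccountListAsync(
        this Task<List<Account>> sqlObjectsTask)
    {
        // Awaiting releases the thread until the query completes
        // instead of blocking it with .Result.
        var sqlObjects = await sqlObjectsTask.ConfigureAwait(false);
        return sqlObjects.Select(o => PopulateCustomAccount(o)).ToList();
    }
}
```

Note that this only removes the blocking call inside the helper. As long as GetRolesForUser blocks with Task.WaitAll, any await further up that captures the request context can still hang, which is consistent with the question's observation that ConfigureAwait(false) makes the problem go away.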
This is not an answer, but I have a lot to say, it wouldn't fit in comments.
Some fact: EF context is not thread safe and doesn't support parallel execution:
While thread safety would make async more useful it is an orthogonal feature. It is unclear that we could ever implement support for it in the most general case, given that EF interacts with a graph composed of user code to maintain state and there aren't easy ways to ensure that this code is also thread safe.
For the moment, EF will detect if the developer attempts to execute two async operations at one time and throw.
Some prediction:
You say that:
The parallel execution of the other four tasks has been in production for months without deadlocking.
They can't be executing in parallel. One possibility is that the thread pool cannot assign more than one thread to your operations, in that case they would be executed sequentially. Or it could be the way you are initializing your tasks, I'm not sure. Assuming they are executed sequentially (otherwise you would have recognized the exception I'm talking about), there is another problem:
Task.WaitAll hanging with multiple awaitable tasks in ASP.NET
So maybe it isn't about that specific task SomeProduct but it always happens on the last task? Well, if they executed in parallel, there wouldn't be a "last task" but as I've already pointed out, they must be running sequentially considering they had been in production for quite a long time.

How to properly write a custom Task returning method

For running code asynchronously (eg with async/await) I need a proper Task. Of course there are several predefined methods in the framework which cover the most frequent operations, but sometimes I want to write my own ones. I’m new to C# so it’s quite possible that I’m doing it wrong but I’m at least not fully happy with my current practice.
See the following example for what I’m doing:
public async Task<bool> doHeavyWork()
{
var b1 = await this.Foo();
//var b2 = await Task<bool>.Factory.StartNew(Bar); //using Task.Run
var b2 = await Task.Run(()=>Bar());
return b1 & b2;
}
public Task<bool> Foo()
{
return Task.Factory.StartNew(() =>
{
//do a lot of work without awaiting
//any other Task
return false;
});
}
public bool Bar()
{
//do a lot of work without awaiting any other task
return false;
}
In general I create and consume such methods like the Foo example, but there is an ‘extra’ lambda containing the whole method logic, which doesn't look very pretty imho. Another option is to consume any method like the Bar example; however, I think that’s even worse because it isn’t clear that this method should be run async (apart from proper method names like BarAsync), and the Task.Factory.StartNew may have to be repeated several times in the program. I don’t know how to just tell the compiler ‘this method returns a Task, please wrap it as a whole into a Task when called’, which is what I want to do.
Finally my question: What’s the best way to write such a method? Can I get rid of the ‘extra’ lambda (without adding an additional named method of course)?
EDIT
As Servy stated, there is always good reason to have a synchronous version of the method. An async version should be provided only if absolutely necessary (Stephen Cleary's link).
Would the world end if someone wanted to call Bar without starting it in a new thread/Task? What if there is another method that is already in a background thread so it doesn't need to start a new task to just run that long running code? What if the method ends up being refactored into a class library that is called from both desktop app context as well as an ASP or console context? In those other contexts that method would probably just need to be run directly, not as a Task.
I would say your code should look like Bar does there, unless it's absolutely imperative that, under no circumstances whatsoever, should that code be run without starting a new Task.
Also note that with newer versions of .NET you should be switching from Task.Factory.StartNew to Task.Run, so the amount of code should go down (a tad).
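Put together, the advice above might look like the following sketch: Bar stays a plain synchronous method, and the caller decides whether to offload it with Task.Run. The Worker class and the DoHeavyWorkAsync name are illustrative only.

```csharp
using System.Threading.Tasks;

public class Worker
{
    // Plain synchronous method: callable directly from any context.
    public bool Bar()
    {
        // CPU-bound work would go here.
        return false;
    }

    // A caller that needs to stay responsive chooses to offload the work.
    public async Task<bool> DoHeavyWorkAsync()
    {
        bool b1 = await Task.Run(() => Bar());
        bool b2 = await Task.Run(() => Bar());
        return b1 & b2;
    }
}
```

Code that is already running on a background thread can call Bar() directly and skip the extra task.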
