I need to fire off some Tasks to run, but I want them to be on SPECIFIC (the same) threads, every time they run. I don't know how to get that to happen except to perhaps instantiate a SingleThreadTaskScheduler (of my own creation). I am getting frames from a capture source, and I want to split off processing work onto parallel threads, but I need the parallel threads to operate on the frames in order. And for that to happen, they have to be the same thread as I fed last time, per processing pipeline. For instance, I have parallel processing pipelines A, B, and C. I need to feed AB&C each time I get a frame. They operate in parallel.
I saw another example on StackOverflow about how to create a single thread task scheduler, but it doesn't explain how I would be allowed to await the result and keep chugging in my current thread.
Here's the function I sort of need to execute. Task.Run() needs to be replaced by firing off x.LongRunningAsync() on a specific thread, not just some random one from the thread pool! That is, one specific thread PER item in this.Steps. The same thread needs to be called per call of DoParallelStuff. DoParallelStuff is called many times. The caller of this function wants to go off and do other stuff while these things are executing in parallel.
public async Task<bool> DoParallelStuff()
{
var tasks = this.Steps.Select(x => Task.Run(() => x.LongRunningAsync()));
var results = await Task.WhenAll(tasks);
var completed = !results.Any() || results.All(x => x == true);
this.OnCompleted();
return completed;
}
This problem can be solved with a custom TaskScheduler, but can also be solved with a custom SynchronizationContext. For example you could install Stephen Cleary's Nito.AsyncEx.Context package, and do this:
var tasks = this.Steps.Select(step => Task.Factory.StartNew(() =>
{
AsyncContext.Run(async () =>
{
await step.LongRunningAsync();
});
}, default, TaskCreationOptions.LongRunning, TaskScheduler.Default));
A dedicated thread is going to be launched for each step in this.Steps. Most probably this thread is going to be blocked for most of the time, while the LongRunningAsync is doing asynchronous stuff internally, but the continuations between the await points will be invoked on this thread.
It is important that all the await points inside the LongRunningAsync method are capturing the SynchronizationContext. A single ConfigureAwait(false) will cause the single-thread policy to fail.
And here is how a SingleThreadTaskScheduler can be used instead:
var tasks = this.Steps.Select(step => Task.Factory.StartNew(async () =>
{
await step.LongRunningAsync();
}, default, TaskCreationOptions.None, new SingleThreadTaskScheduler()).Unwrap());
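Pay attention to the Unwrap call; it's very important. StartNew with an async lambda returns a Task<Task>, and without Unwrap the outer task completes as soon as the lambda reaches its first incomplete await, not when LongRunningAsync finishes.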
The previous note, regarding the required absence of ConfigureAwait(false) at the await points, applies here too.
An (untested) implementation of the SingleThreadTaskScheduler class can be found here.
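For reference, here is a minimal sketch of what such a scheduler could look like. It is not the implementation linked above; it assumes a BlockingCollection<Task> drained by one dedicated background thread, and the ThreadId property exists only for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical minimal scheduler: every task queued to it runs on one dedicated thread.
public sealed class SingleThreadTaskScheduler : TaskScheduler, IDisposable
{
    private readonly BlockingCollection<Task> _queue = new BlockingCollection<Task>();
    private readonly Thread _thread;

    public SingleThreadTaskScheduler()
    {
        _thread = new Thread(() =>
        {
            // Drain the queue until Dispose calls CompleteAdding.
            foreach (var task in _queue.GetConsumingEnumerable())
                TryExecuteTask(task);
        })
        { IsBackground = true };
        _thread.Start();
    }

    // Illustration only: lets callers check which thread the work runs on.
    public int ThreadId => _thread.ManagedThreadId;

    public override int MaximumConcurrencyLevel => 1;

    protected override void QueueTask(Task task) => _queue.Add(task);

    // Inline execution is allowed only when the caller is already on the dedicated thread.
    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
        => Thread.CurrentThread == _thread && TryExecuteTask(task);

    protected override IEnumerable<Task> GetScheduledTasks() => _queue.ToArray();

    // After Dispose, queueing further tasks throws InvalidOperationException.
    public void Dispose() => _queue.CompleteAdding();
}
```

To get the same thread on every call of DoParallelStuff, you would presumably keep one scheduler instance per step (for example in a dictionary or a field on the step) and reuse it, rather than creating a new scheduler inside the Select.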
alright, I did some work over the weekend and figured it out. I used the SingleThreadTaskScheduler mentioned before, and, per pipeline (thread), I create a SingleThreadTaskScheduler. I modified this class to take a Func in Schedule(blah), and I passed in an action (what else would I call it), and I also modified the func to pass back a Task. Now, when I call Schedule(), I can use the returned Task to wait upon it, if I want, or I can ignore the Task and simply let it complete on the background thread. Either way, now I have full control. What I DON'T and can't have, however, is the ability for ANY call to 'await' within the Func<> that I send to the SingleThreadTaskScheduler. I don't know why it goes wrong, but if at any point, I use an 'await' in the code while running the SingleThreadTaskScheduler, it hangs. Dunno why, but don't care at this point, it is all running.
Related
I have this function:
async Task RefreshProfileInfo(List<string> listOfPlayers)
// For each player in the listOfPlayers, checks an in-memory cache if we have an entry.
// If we have a cached entry, do nothing.
// If we don't have a cached entry, fetch from backend via an API call.
This function is called very frequently, like:
await RefreshProfileInfo(playerA, playerB, playerC)
or
await RefreshProfileInfo(playerB, playerC, playerD)
or
await RefreshProfileInfo(playerE, playerF)
Ideally, if the players do not overlap each other, the calls should not affect each other (requesting PlayerE and PlayerF should not block the request for PlayerA, PlayerB, PlayerC). However, if the players DO overlap each other, the second call should wait for the first (requesting PlayerB, PlayerC, PlayerD, should wait for PlayerA, PlayerB, PlayerC to finish).
However, if that isn't possible, at the very least I'd like all calls to be sequential. (I think they should still be async, so they don't block other unrelated parts of the code).
Currently, what happens is each RefreshProfileInfo runs in parallel, which results in hitting backend every time (9 times in this example).
Instead, I want to execute them sequentially, so that only the first call hits the backend, and subsequent calls just hit cache.
What data structure/approach should I use? I'm having trouble figuring out how to "connect" the separate calls to each other. I've been playing around with Task.WhenAll() as well as SemaphoreSlim, but I can't figure out how to use them properly.
Failed attempt
The idea behind my failed attempt was to have a helper class where I could call a function, SequentialRequest(Task), and it would sequentially run all tasks invoked in this manner.
List<Task> waitingTasks = new List<Task>();
object _lock = new object();
public async Task SequentialRequest(Task func)
{
var waitingTasksCopy = new List<Task>();
lock (_lock)
{
waitingTasksCopy = new List<Task>(waitingTasks);
waitingTasks.Add(func); // Add this task to the waitingTasks (for future SequentialRequests)
}
// Wait for everything before this to finish
if (waitingTasksCopy.Count > 0)
{
await Task.WhenAll(waitingTasksCopy);
}
// Run this task
await func;
}
I thought this would work, but "func" is either run instantly (instead of waiting for earlier tasks to finish), or never run at all, depending on how I call it.
If I call it using this, it runs instantly:
async Task testTask()
{
await Task.Delay(4000);
}
If I call it using this, it never runs:
Task testTask = new Task(async () =>
{
await Task.Delay(4000);
});
Here's why your current attempt doesn't work:
// Run this task
await func;
The comment above is not describing what the code is doing. In the asynchronous world, a Task represents some operation that is already in progress. Tasks are not "run" by using await; await is a way for the current code to "asynchronously wait" for a task to complete. So no function signature taking a Task is going to work; the task is already in progress before it's even passed to that function.
Your question is actually about caching asynchronous operations. One way to do this is to cache the Task<T> itself. Currently, your cache holds the results (T); you can change your cache to hold the asynchronous operations that retrieve those results (Task<T>). For example, if your current cache type is ConcurrentDictionary<PlayerId, Player>, you could change it to ConcurrentDictionary<PlayerId, Task<Player>>.
With a cache of tasks, when your code checks for a cache entry, it will find an existing entry if the player data is loaded or has started loading, because the Task<T> represents some asynchronous operation that is already in progress (or has already completed).
A couple of notes for this approach:
This only works for in-memory caches.
Think about how you want to handle errors. A naive cache of Task<T> will also cache error results, which is usually not desired.
The second point above is the trickier part. When an error happens, you'd probably want some additional logic to remove the errored task from the cache. Bonus points (and additional complexity) if the error handling code prevents an errored task from getting into the cache in the first place.
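A rough sketch of a task cache that evicts failed entries might look like the following. The AsyncCache class and its load delegate are hypothetical names, not part of any library; the Lazy wrapper is an assumption added so that the load delegate starts at most once per key even when GetOrAdd races.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper: caches the in-flight Task<TValue> per key, not just the result.
public sealed class AsyncCache<TKey, TValue>
{
    private readonly ConcurrentDictionary<TKey, Lazy<Task<TValue>>> _entries =
        new ConcurrentDictionary<TKey, Lazy<Task<TValue>>>();
    private readonly Func<TKey, Task<TValue>> _load;

    public AsyncCache(Func<TKey, Task<TValue>> load)
    {
        _load = load;
    }

    public async Task<TValue> GetAsync(TKey key)
    {
        // Concurrent callers for the same key all receive the same Lazy, so the load starts once.
        var lazy = _entries.GetOrAdd(key, k => new Lazy<Task<TValue>>(() => _load(k)));
        try
        {
            return await lazy.Value.ConfigureAwait(false);
        }
        catch
        {
            // Evict only the entry that failed, so a later call can retry.
            // Removing the exact key/value pair avoids evicting a newer entry for the same key.
            ((ICollection<KeyValuePair<TKey, Lazy<Task<TValue>>>>)_entries)
                .Remove(new KeyValuePair<TKey, Lazy<Task<TValue>>>(key, lazy));
            throw;
        }
    }
}
```

This removes the errored task after the fact; it does not prevent the errored task from being visible in the cache for a short time, which is the harder "bonus points" variant.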
at the very least I'd like all calls to be sequential
Well, that's much easier. SemaphoreSlim is the asynchronous replacement for lock, so you can use a shared SemaphoreSlim. Call await mySemaphoreSlim.WaitAsync(); at the beginning of RefreshProfileInfo, put the body in a try, and in the finally block at the end of RefreshProfileInfo, call mySemaphoreSlim.Release();. That will limit all calls to RefreshProfileInfo to running sequentially.
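As a sketch of that shape, assuming a hypothetical ProfileRefresher class with a string-to-string cache and a fetch delegate standing in for the real backend call:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical wrapper; the cache type and the fetch delegate are placeholders.
public sealed class ProfileRefresher
{
    // One permit: only one RefreshProfileInfo body runs at a time.
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);
    private readonly ConcurrentDictionary<string, string> _cache =
        new ConcurrentDictionary<string, string>();
    private readonly Func<string, Task<string>> _fetchFromBackend;

    public ProfileRefresher(Func<string, Task<string>> fetchFromBackend)
    {
        _fetchFromBackend = fetchFromBackend;
    }

    public async Task RefreshProfileInfo(List<string> listOfPlayers)
    {
        await _gate.WaitAsync();
        try
        {
            foreach (var player in listOfPlayers)
            {
                // Cached entries are skipped; everything else goes to the backend.
                if (_cache.ContainsKey(player)) continue;
                _cache[player] = await _fetchFromBackend(player);
            }
        }
        finally
        {
            // Always release, even if the backend call throws.
            _gate.Release();
        }
    }
}
```

Callers still await the method as before; they simply queue up on the semaphore without blocking a thread while they wait.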
I had the same issue in one of my projects. I had multiple threads call a single method and they all made IO calls when not found in cache. What you want to do is to add the Task to your cache and then await it. Subsequent calls will then just read the result once the task completes.
Example:
private Task RefreshProfile(Player player)
{
// cache is of type IMemoryCache
return _cache.GetOrCreate(player, entry =>
{
// expire in 30 seconds
entry.AbsoluteExpiration = DateTimeOffset.UtcNow.AddSeconds(30);
return ActualRefreshCodeThatReturnsTask(player);
});
}
Then just await in your calling code
await Task.WhenAll(RefreshProfile(a), RefreshProfile(b), RefreshProfile(c)); // a, b and c are Player instances
This might not be specific to SemaphoreSlim exclusively, but basically my question is about whether there is a difference between the below two methods of throttling a collection of long running tasks, and if so, what that difference is (and when if ever to use either).
In the example below, let's say that each tracked task involves loading data from a Url (totally made up example, but is a common one that I've found for SemaphoreSlim examples).
The main difference comes down to how the individual tasks are added to the list of tracked tasks. In the first example, we call Task.Run() with a lambda, whereas in the second, we new up a Func<Task<Result>> with a lambda and then immediately call that func and add the result to the tracked task list.
Examples:
Using Task.Run():
SemaphoreSlim ss = new SemaphoreSlim(_concurrentTasks);
List<string> urls = ImportUrlsFromSource();
List<Task<Result>> trackedTasks = new List<Task<Result>>();
foreach (var item in urls)
{
await ss.WaitAsync().ConfigureAwait(false);
trackedTasks.Add(Task.Run(async () =>
{
try
{
return await ProcessUrl(item);
}
catch (Exception e)
{
_log.Error($"logging some stuff");
throw;
}
finally
{
ss.Release();
}
}));
}
var results = await Task.WhenAll(trackedTasks);
Using a new Func:
SemaphoreSlim ss = new SemaphoreSlim(_concurrentTasks);
List<string> urls = ImportUrlsFromSource();
List<Task<Result>> trackedTasks = new List<Task<Result>>();
foreach (var item in urls)
{
trackedTasks.Add(new Func<Task<Result>>(async () =>
{
await ss.WaitAsync().ConfigureAwait(false);
try
{
return await ProcessUrl(item);
}
catch (Exception e)
{
_log.Error($"logging some stuff");
throw;
}
finally
{
ss.Release();
}
})());
}
var results = await Task.WhenAll(trackedTasks);
There are two differences:
Task.Run does error handling
First of all, when you call the lambda, it runs. On the other hand, Task.Run would call it. This is relevant because Task.Run does a bit of work behind the scenes. The main work it does is handling a faulted task...
If you call a lambda, and the lambda throws, it would throw before you add the Task to the list...
However, in your case, because your lambda is async, the compiler would create the Task for it (you are not making it by hand), and it will correctly handle the exception and make it available via the returned Task. Therefore this point is moot.
Task.Run prevents task attachment
Task.Run sets DenyChildAttach. This means that the tasks created inside the Task.Run run independently from (are not synchronized with) the returned Task.
For example, this code:
List<Task<int>> trackedTasks = new List<Task<int>>();
var numbers = new int[]{0, 1, 2, 3, 4};
foreach (var item in numbers)
{
trackedTasks.Add(Task.Run(async () =>
{
var x = 0;
(new Func<Task<int>>(async () =>{x = item; return x;}))().Wait();
Console.WriteLine(x);
return x;
}));
}
var results = await Task.WhenAll(trackedTasks);
Will output the numbers from 0 to 4, in unknown order. However the following code:
List<Task<int>> trackedTasks = new List<Task<int>>();
var numbers = new int[]{0, 1, 2, 3, 4};
foreach (var item in numbers)
{
trackedTasks.Add(new Func<Task<int>>(async () =>
{
var x = 0;
(new Func<Task<int>>(async () =>{x = item; return x;}))().Wait();
Console.WriteLine(x);
return x;
})());
}
var results = await Task.WhenAll(trackedTasks);
Will output the numbers from 0 to 4, in order, every time. This is odd, right? What happens is that an async lambda invoked directly runs synchronously on the calling thread until it reaches its first incomplete await, so each iteration finishes before the next one starts. But if you use Task.Run, each lambda is queued to the thread pool and scheduled independently.
This remains true even if you use await, as long as the task you await does not go to an external system...
What happens with an external system? Well, for example, if your task is reading from a URL, as in your example, the system would create a TaskCompletionSource, get the Task from it, set a response handler that writes the result to the TaskCompletionSource, make the request, and return the Task. This Task is not scheduled; the idea of it running on the same thread as a parent task makes no sense. And thus, it can break the order.
Since you are using await to wait on an external system, this point is moot too.
Conclusion
I must conclude that these are equivalent.
If you want to be safe, and make sure it works as expected even if, in a future version, some of the above points stop being moot, then keep Task.Run. On the other hand, if you really want to optimize, use the lambda and avoid the (very small) Task.Run overhead. However, that probably won't be a bottleneck.
Addendum
When I talk about a task that goes to an external system, I refer to something that runs outside of .NET. There is a bit of code that will run in .NET to interface with the external system, but the bulk of the code will not run in .NET, and thus will not be in a managed thread at all.
The consumer of the API specifies nothing for this to happen. The task would be a promise task, but that is not exposed; for the consumer there is nothing special about it.
In fact, a task that goes to an external system may barely run on the CPU at all. Furthermore, it might just be waiting on something exterior to the computer (it could be the network or user input).
The pattern is as follows:
The library creates a TaskCompletionSource.
The library sets up a means to receive a notification. It can be a callback, event, message loop, hook, listening to a socket, a pipeline, waiting on a global mutex... whatever is necessary.
The library sets up code to react to the notification that will call SetResult or SetException on the TaskCompletionSource, as appropriate for the notification received.
The library does the actual call to the external system.
The library returns TaskCompletionSource.Task.
Note: this requires extra care that optimizations do not reorder things where they should not, and care in handling errors during the setup phase. Also, if a CancellationToken is involved, it has to be taken into account (calling SetCanceled on the TaskCompletionSource when appropriate). There could also be teardown necessary in the reaction to the notification (or on cancellation). And do not forget to validate your parameters.
Then the external system goes and does whatever it does. When it finishes, or something goes wrong, it gives the library the notification, and your Task is suddenly completed or faulted (or, if cancellation happened, your Task is now canceled), and .NET will schedule the continuations of the task as needed.
Note: async/await uses continuations behind the scenes, that is how execution resumes.
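The steps above can be sketched with a timer standing in for the external system. CompleteAfterAsync is a hypothetical name, and this is only an illustration of the pattern under that assumption, not how Task.Delay or any particular library is implemented.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class PromiseTaskSketch
{
    // Hypothetical helper: returns a task that completes with 'value' after 'delay'.
    public static Task<T> CompleteAfterAsync<T>(T value, TimeSpan delay, CancellationToken token = default)
    {
        // 1. Create the TaskCompletionSource.
        var tcs = new TaskCompletionSource<T>(TaskCreationOptions.RunContinuationsAsynchronously);

        Timer timer = null;
        CancellationTokenRegistration registration = default;

        // 2 and 3. Set up the notification (a timer callback) and the reaction that completes the task.
        timer = new Timer(_ =>
        {
            // Teardown: release the timer and the cancellation registration.
            timer.Dispose();
            registration.Dispose();
            tcs.TrySetResult(value); // does nothing if the task was already canceled
        });

        // Cancellation completes the task early; the timer still fires once later and cleans up.
        registration = token.Register(() => tcs.TrySetCanceled(token));

        // 4. Make the actual call to the "external system": start the timer.
        timer.Change(delay, Timeout.InfiniteTimeSpan);

        // 5. Return the task; no thread is blocked while it is pending.
        return tcs.Task;
    }
}
```

The Try* variants are used because the result and the cancellation can race; whichever arrives first decides the final state of the task.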
Incidentally, if you wanted to implement SemaphoreSlim yourself, you would have to do something very similar to what I describe above. You can see it in my backport of SemaphoreSlim.
Let us see a couple of examples of promise tasks...
Task.Delay: when we are waiting with Task.Delay, the CPU is not spinning. This is not running in a thread. In this case the notification mechanism will be an OS timer. When the OS sees that the time of the timer has elapsed, it will call into the CLR, and then the CLR will mark the task as completed. What thread was waiting? None.
FileStream.ReadAsync: when we are reading from storage with FileStream.ReadAsync, the actual work is done by the device. The CLR has to declare a custom event, then pass the event, the file handle and the buffer to the OS... the OS calls the device driver, and the device driver interfaces with the device. As the storage device retrieves the information, it will write to memory (directly into the specified buffer) via DMA. When it is done, it will raise an interrupt, which is handled by the driver, which notifies the OS, which calls the custom event, which marks the task as completed. What thread read the data from storage? None.
A similar pattern will be used to download from a web page, except, this time the device goes to the network. How to make an HTTP request and how the system waits for a response is beyond the scope of this answer.
It is also possible that the external system is another program, in which case it would run on a thread. But it won't be a managed thread on your process.
Your takeaway is that these tasks do not run on any of your threads, and their timing might depend on external factors. Thus, it makes no sense to think of them as running in the same thread, or to assume that we can predict their timing (well, except of course in the case of the timer).
Both are not very good because they create the tasks immediately. The func version is a little less overhead since it saves the Task.Run route over the thread pool just to immediately end the thread pool work and suspend on the semaphore. You don't need an async Func, you could simplify this by using an async method (possibly a local function).
But you should not do this at all. Instead, use a helper method that implements a parallel async foreach.
public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
{
return Task.WhenAll(
from partition in Partitioner.Create(source).GetPartitions(dop)
select Task.Run(async delegate {
using (partition)
while (partition.MoveNext())
await body(partition.Current);
}));
}
Then you just go urls.ForEachAsync(myDop, async input => await ProcessAsync(input));
Here, the tasks are created on demand. You can even make the input stream lazy.
I am writing a game, and using OpenGL I require that some work be offloaded to the rendering thread where an OpenGL context is active, but everything else is handled by the normal thread pool.
Is there a way I can force a Task to be executed in a special thread-pool, and any new tasks created from an async also be dispatched to that thread pool?
I want a few specialized threads for rendering, and I would like to be able to use async and await for example for creating and filling a vertex buffer.
If I just use a custom task scheduler and a new TaskFactory(new MyScheduler()), it seems that any subsequent Task objects will be dispatched to the thread pool anyway, where Task.Factory.Scheduler suddenly is null.
The following code should show what I want to be able to do:
public async Task Initialize()
{
// The two following tasks should run on the rendering thread pool
// They cannot run synchronously because that will cause them to fail.
this.VertexBuffer = await CreateVertexBuffer();
this.IndexBuffer = await CreateIndexBuffer();
// This should be dispatched, or run synchronously, on the normal thread pool
Vertex[] vertices = CreateVertices();
// Issue task for filling vertex buffer on rendering thread pool
var fillVertexBufferTask = FillVertexBufffer(vertices, this.VertexBuffer);
// This should be dispatched, or run synchronously, on the normal thread pool
short[] indices = CreateIndices();
// Wait for tasks on the rendering thread pool to complete.
await FillIndexBuffer(indices, this.IndexBuffer);
await fillVertexBufferTask; // Wait for the rendering task to complete.
}
Is there any way to achieve this, or is it outside the scope of async/await?
This is possible, and it is basically the same thing Microsoft did for the Windows Forms and WPF synchronization contexts.
First Part - You are in the OpenGL thread, and want to put some work into the thread pool, and after this work is done you want back into the OpenGL thread.
I think the best way for you to go about this is to implement your own SynchronizationContext. This thing basically controls how the TaskScheduler works and how it schedules the task. The default implementation simply sends the tasks to the thread pool. What you need to do is to send the task to a dedicated thread (that holds the OpenGL context) and execute them one by one there.
The key to the implementation is to override the Post and the Send methods. Both methods are expected to execute the callback, where Send has to wait for the call to finish and Post does not. In the default implementation, which uses the thread pool, Send simply calls the callback directly and Post delegates the callback to the thread pool.
For the execution queue of your OpenGL thread, I think a Thread that reads from a BlockingCollection should do nicely. Just send the callbacks to this queue. You may also need some callback in case your Post method is called from the wrong thread and you need to wait for the task to finish.
But all in all this approach should work. async/await ensures that the SynchronizationContext is restored after an async call that is executed in the thread pool, for example. So you should be able to return to the OpenGL thread after you have put some work off onto another thread.
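A minimal sketch of such a context, assuming a dedicated thread that drains a BlockingCollection. SingleThreadSynchronizationContext is a hypothetical name, and a real rendering thread would also create the OpenGL context and handle exceptions thrown by posted callbacks, which this sketch does not.

```csharp
using System;
using System.Collections.Concurrent;
using System.Runtime.ExceptionServices;
using System.Threading;

// Hypothetical context: every posted callback runs on one dedicated thread.
public sealed class SingleThreadSynchronizationContext : SynchronizationContext, IDisposable
{
    private readonly BlockingCollection<(SendOrPostCallback Callback, object State)> _queue =
        new BlockingCollection<(SendOrPostCallback Callback, object State)>();
    private readonly Thread _thread;

    public SingleThreadSynchronizationContext()
    {
        _thread = new Thread(Run) { IsBackground = true };
        _thread.Start();
    }

    private void Run()
    {
        // Install this context so that awaits on this thread resume on this thread.
        SetSynchronizationContext(this);
        foreach (var item in _queue.GetConsumingEnumerable())
            item.Callback(item.State);
    }

    // Post: enqueue and return immediately.
    public override void Post(SendOrPostCallback d, object state) => _queue.Add((d, state));

    // Send: run the callback on the dedicated thread and wait until it has finished.
    public override void Send(SendOrPostCallback d, object state)
    {
        if (Thread.CurrentThread == _thread) { d(state); return; }

        ExceptionDispatchInfo error = null;
        using (var done = new ManualResetEventSlim())
        {
            Post(_ =>
            {
                try { d(state); }
                catch (Exception ex) { error = ExceptionDispatchInfo.Capture(ex); }
                finally { done.Set(); }
            }, null);
            done.Wait();
        }
        error?.Throw(); // rethrow on the calling thread
    }

    public void Dispose() => _queue.CompleteAdding();
}
```

An async method started through Post on this context will, after each await that captures the context, continue on the same dedicated thread.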
Second Part - You are in another thread and want to send some work into the OpenGL thread and await the completion of that work.
This is possible too. My idea in this case is that you don't use Tasks but other awaitable objects. In general, any object can be awaitable. It just has to expose a public GetAwaiter() method that returns an object implementing the INotifyCompletion interface (along with an IsCompleted property and a GetResult method). What await does is put the remainder of the method into a new Action and pass this action to the OnCompleted method of that interface. The awaiter is expected to call the scheduled actions once the operation it is awaiting is done. This awaiter also has to ensure that the SynchronizationContext is captured and that the continuations are executed on the captured SynchronizationContext. That sounds complicated, but once you get the hang of it, it is fairly easy. What helped me a lot is the reference source of the YieldAwaiter (this is basically what happens if you use await Task.Yield()). This is not what you need, but I think it is a place to start.
The method that returns the awaiter has to take care of sending the actual work to the thread that has to execute it (you maybe already have the execution queue from the first part) and the awaiter has to trigger once that work is done.
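As a simplified illustration of the awaiter mechanics, here is a sketch of an awaiter that does nothing but move the rest of the method onto a given SynchronizationContext. The type names are hypothetical, and it is a reduced variant of the idea described above: it does not carry a result and does not hop back to the original context afterwards.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical awaiter: "await someContext;" continues the method on that context.
public readonly struct SynchronizationContextAwaiter : INotifyCompletion
{
    private readonly SynchronizationContext _context;

    public SynchronizationContextAwaiter(SynchronizationContext context)
    {
        _context = context;
    }

    // If we are already on the target context there is nothing to wait for.
    public bool IsCompleted => SynchronizationContext.Current == _context;

    // await hands us the rest of the method; we forward it to the target context.
    public void OnCompleted(Action continuation)
        => _context.Post(state => ((Action)state)(), continuation);

    public void GetResult() { }
}

public static class SynchronizationContextExtensions
{
    // The compiler looks for GetAwaiter, which may be an extension method.
    public static SynchronizationContextAwaiter GetAwaiter(this SynchronizationContext context)
        => new SynchronizationContextAwaiter(context);
}
```

With a context that posts to the rendering thread, a method could then write await renderContext; and the statements after that line would run wherever that context executes its callbacks.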
Conclusion
Make no mistake: that is a lot of work. But if you do all that, you will have fewer problems down the line, because you can seamlessly use the async/await pattern as if you were working inside Windows Forms or WPF, and that is a huge plus.
First, realize that await introduces the special behavior after the method is called; that is to say, this code:
this.VertexBuffer = await CreateVertexBuffer();
is pretty much the same as this code:
var createVertexBufferTask = CreateVertexBuffer();
this.VertexBuffer = await createVertexBufferTask;
So, you'll have to explicitly schedule code to execute a method within a different context.
You mention using a MyScheduler but I don't see your code using it. Something like this should work:
this.factory = new TaskFactory(CancellationToken.None, TaskCreationOptions.DenyChildAttach, TaskContinuationOptions.None, new MyScheduler());
public async Task Initialize()
{
// Since you mention OpenGL, I'm assuming this method is called on the UI thread.
// Run these methods on the rendering thread pool.
this.VertexBuffer = await this.factory.StartNew(() => CreateVertexBuffer()).Unwrap();
this.IndexBuffer = await this.factory.StartNew(() => CreateIndexBuffer()).Unwrap();
// Run these methods on the normal thread pool.
Vertex[] vertices = await Task.Run(() => CreateVertices());
var fillVertexBufferTask = Task.Run(() => FillVertexBufffer(vertices, this.VertexBuffer));
short[] indices = await Task.Run(() => CreateIndices());
await Task.Run(() => FillIndexBuffer(indices, this.IndexBuffer));
// Wait for the rendering task to complete.
await fillVertexBufferTask;
}
I would look into combining those multiple Task.Run calls, or (if Initialize is called on a normal thread pool thread) removing them completely.
I am trying to understand concurrency by doing it in code. I have a code snippet which I thought was running asynchronously. But when I put the debug writeline statements in, I found that it is running synchronously. Can someone explain what I need to do differently to push ComputeBB() onto another thread using Task.Something?
Clarification I want this code to run ComputeBB in some other thread so that the main thread will keep on running without blocking.
Here is the code:
{
// part of the calling method
Debug.WriteLine("About to call ComputeBB");
returnDTM.myBoundingBox = await Task.Run(() => returnDTM.ComputeBB());
Debug.WriteLine("Just called await ComputBB.");
return returnDTM;
}
private ptsBoundingBox2d ComputeBB()
{
Debug.WriteLine("Starting ComputeBB.");
Stopwatch sw = new Stopwatch(); sw.Start();
var point1 = this.allPoints.FirstOrDefault().Value;
var returnBB = new ptsBoundingBox2d(
point1.x, point1.y, point1.z, point1.x, point1.y, point1.z);
Parallel.ForEach(this.allPoints,
p => returnBB.expandByPoint(p.Value.x, p.Value.y, p.Value.z)
);
sw.Stop();
Debug.WriteLine(String.Format("Compute BB took {0}", sw.Elapsed));
return returnBB;
}
Here is the output in the immediate window:
About to call ComputeBB
Starting ComputeBB.
Compute BB took 00:00:00.1790574
Just called await ComputBB.
Clarification If it were really running asynchronously it would be in this order:
About to call ComputeBB
Just called await ComputBB.
Starting ComputeBB.
Compute BB took 00:00:00.1790574
But it is not.
Elaboration
The calling code has a signature like so: private static async Task loadAsBinaryAsync(string fileName). At the next level up, though, I attempt to stop using async. So here is the call stack from top to bottom:
static void Main(string[] args)
{
aTinFile = ptsDTM.CreateFromExistingFile("TestSave.ptsTin");
// more stuff
}
public static ptsDTM CreateFromExistingFile(string fileName)
{
ptsDTM returnTin = new ptsDTM();
Task<ptsDTM> tsk = Task.Run(() => loadAsBinaryAsync(fileName));
returnTin = tsk.Result; // I suspect the problem is here.
return returnTin;
}
private static async Task<ptsDTM> loadAsBinaryAsync(string fileName)
{
// do a lot of processing
Debug.WriteLine("About to call ComputeBB");
returnDTM.myBoundingBox = await Task.Run(() => returnDTM.ComputeBB());
Debug.WriteLine("Just called await ComputBB.");
return returnDTM;
}
I have a code snippet which I thought was running asynchronously. But when I put the debug writeline statements in, I found that it is running synchronously.
await is used to asynchronously wait for an operation's completion. While doing so, it yields control back to the calling method until the operation completes.
what I need to do differently to push ComputeBB() onto another thread
It already runs on a thread pool thread. If you don't want to asynchronously wait on it, and instead want to run it in a "fire and forget" fashion, don't await the expression. Note that this will have an effect on exception handling. Any exception which occurs inside the provided delegate will be captured inside the returned Task; if you don't await it, there is a chance the exception will go unhandled.
Edit:
Let's look at this piece of code:
public static ptsDTM CreateFromExistingFile(string fileName)
{
ptsDTM returnTin = new ptsDTM();
Task<ptsDTM> tsk = Task.Run(() => loadAsBinaryAsync(fileName));
returnTin = tsk.Result; // I suspect the problem is here.
return returnTin;
}
What you're currently doing is synchronously blocking when you use tsk.Result. Also, for some reason you're calling Task.Run twice, once in each method. That is unnecessary. If you want to return your ptsDTM instance from CreateFromExistingFile, you will have to await it, there is no getting around that. "Fire and Forget" execution doesn't care about the result, at all. It simply wants to start whichever operation it needs, if it fails or succeeds is usually a non-concern. That is clearly not the case here.
You'll need to do something like this:
private PtsDtm LoadAsBinary(string fileName)
{
Debug.WriteLine("About to call ComputeBB");
returnDTM.myBoundingBox = returnDTM.ComputeBB();
Debug.WriteLine("Just called ComputeBB.");
return returnDTM;
}
And then somewhere higher up the call stack (you don't actually need CreateFromExistingFile), simply call:
Task.Run(() => LoadAsBinary(fileName));
When needed.
Also, please, read the C# naming conventions, which you're currently not following.
await's whole purpose is in adding the synchronicity back in asynchronous code. This allows you to easily partition the parts that are happening synchronously and asynchronously. Your example is absurd in that it never takes any advantage whatsoever of this - if you just called the method directly instead of wrapping it in Task.Run and awaiting that, you would have had the exact same result (with less overhead).
Consider this, though:
await
Task.WhenAll
(
loadAsBinaryAsync(fileName1),
loadAsBinaryAsync(fileName2),
loadAsBinaryAsync(fileName3)
);
Again, you have the synchronicity back (await functions as the synchronization barrier), but you've actually performed three independent operations asynchronously with respect to each other.
Now, there's no reason to do something like this in your code, since you're using Parallel.ForEach at the bottom level - you're already using the CPU to the max (with unnecessary overhead, but let's ignore that for now).
So the basic usage of await is actually to handle asynchronous I/O rather than CPU work - apart from simplifying code that relies on some parts of CPU work being synchronised and some not (e.g. you have four threads of execution that simultaneously process different parts of the problem, but at some point have to be reunited to make sense of the individual parts - look at the Barrier class, for example). This includes stuff like "making sure the UI doesn't block while some CPU intensive operation happens in the background" - this makes the CPU work asynchronous with respect to the UI. But at some point, you still want to reintroduce the synchronicity, to make sure you can display the results of the work on the UI.
Consider this winforms code snippet:
async void btnDoStuff_Click(object sender, EventArgs e)
{
lblProgress.Text = "Calculating...";
var result = await DoTheUltraHardStuff();
lblProgress.Text = "Done! The result is " + result;
}
(note that the method is async void, not async Task nor async Task<T>)
What happens is that (on the GUI thread) the label is first assigned the text Calculating..., then the asynchronous DoTheUltraHardStuff method is scheduled, and then, the method returns. Immediately. This allows the GUI thread to do whatever it needs to do. However - as soon as the asynchronous task is complete and the GUI is free to handle the callback, the execution of btnDoStuff_Click will continue with the result already given (or an exception thrown, of course), back on the GUI thread, allowing you to set the label to the new text including the result of the asynchronous operation.
Asynchronicity is not an absolute property - stuff is asynchronous to some other stuff, and synchronous to some other stuff. It only makes sense with respect to some other stuff.
Hopefully, now you can go back to your original code and understand the part you've misunderstood before. The solutions are multiple, of course, but they depend a lot on how and why you're trying to do what you're trying to do. I suspect you don't actually need to use Task.Run or await at all - the Parallel.ForEach already tries to distribute the CPU work over multiple CPU cores, and the only thing you could do is to make sure other code doesn't have to wait for that work to finish - which would make a lot of sense in a GUI application, but I don't see how it would be useful in a console application with the singular purpose of calculating that single thing.
So yes, you can actually use await for fire-and-forget code - but only as part of code that doesn't prevent the code you want to continue from executing. For example, you could have code like this:
Task<string> result = SomeHardWorkAsync();
Debug.WriteLine("After calling SomeHardWorkAsync");
DoSomeOtherWorkInTheMeantime();
Debug.WriteLine("Done other work.");
Debug.WriteLine("Got result: " + (await result));
This allows SomeHardWorkAsync to execute asynchronously with respect to DoSomeOtherWorkInTheMeantime but not with respect to await result. And of course, you can use awaits in SomeHardWorkAsync without trashing the asynchronicity between SomeHardWorkAsync and DoSomeOtherWorkInTheMeantime.
The GUI example I've shown way above just takes advantage of handling the continuation as something that happens after the task completes, while ignoring the Task created in the async method (there really isn't much of a difference between using async void and async Task when you ignore the result, apart from exceptions: one that escapes an async void method can't be observed by the caller through a Task). So for example, to fire-and-forget your method, you could use code like this:
async void Fire(string filename)
{
var result = await ProcessFileAsync(filename);
DoStuffWithResult(result);
}
Fire("MyFile");
This will cause DoStuffWithResult to execute as soon as result is ready, while the method Fire itself will return as soon as ProcessFileAsync hands back its task (that is, once ProcessFileAsync has run up to its first await on something incomplete, or to an explicit return someTask).
This pattern is usually frowned upon - there really isn't any reason to return void out of an async method (apart from event handlers); you could just as easily return Task (or even Task<T> depending on the scenario), and let the caller decide whether he wants his code to execute synchronously with respect to yours or not.
Again,
async Task FireAsync(string filename)
{
var result = await ProcessFileAsync(filename);
DoStuffWithResult(result);
}
FireAsync("MyFile");
does the same thing as using async void, except that the caller can decide what to do with the asynchronous task. Perhaps he wants to launch two of those in parallel and continue after all are done? He can just await Task.WhenAll(FireAsync("1"), FireAsync("2")). Or he just wants that stuff to happen completely asynchronously with respect to his code, so he'll just call FireAsync("1") and ignore the resulting Task (of course, ideally, you at the very least want to handle possible exceptions).
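As a rough sketch of those two caller choices with the Task-returning version: ProcessFileAsync and DoStuffWithResult below are hypothetical stand-ins (the delay only simulates I/O), and the Processed bag exists only so the effect can be observed.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class FireDemo
{
    // Collects results so the effect of DoStuffWithResult can be observed.
    public static readonly ConcurrentBag<string> Processed = new ConcurrentBag<string>();

    // Hypothetical stand-in for real file processing; Task.Delay simulates I/O.
    static async Task<string> ProcessFileAsync(string filename)
    {
        await Task.Delay(50);
        return "processed:" + filename;
    }

    static void DoStuffWithResult(string result) => Processed.Add(result);

    public static async Task FireAsync(string filename)
    {
        var result = await ProcessFileAsync(filename);
        DoStuffWithResult(result);
    }

    // Choice 1: start both operations, then continue once both are done.
    public static async Task RunBothAsync()
    {
        await Task.WhenAll(FireAsync("1"), FireAsync("2"));
    }

    // Choice 2: fire and forget, but still log a failure instead of losing it.
    public static void FireAndForget(string filename)
    {
        FireAsync(filename).ContinueWith(
            t => Console.Error.WriteLine(t.Exception),
            TaskContinuationOptions.OnlyOnFaulted);
    }
}
```

The caller picks between RunBothAsync-style waiting and FireAndForget-style ignoring; with async void neither choice would be available.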
All, I have a situation where I have been asked to multi-thread a large 'Cost-Crunching' algorithm. I am relatively experienced with Tasks and would be confident in adopting a pattern like
CancellationTokenSource cancelSource = new CancellationTokenSource();
CancellationToken token = cancelSource.Token;
TaskScheduler uiScheduler = TaskScheduler.FromCurrentSynchronizationContext();
Task<bool> asyncTask = null;
asyncTask = Task.Factory.StartNew<bool>(() =>
SomeMethodAsync(uiScheduler, token, _dynamic), token);
asyncTask.ContinueWith(task =>
{
// For call back, exception handling etc.
}, uiScheduler);
and then for any operation where I need to perform a UI operation, I would use
Task task = Task.Factory.StartNew(() =>
{
mainForm.progressLeftLabelText = _strProgressLabel;
}, CancellationToken.None,
TaskCreationOptions.None,
uiScheduler);
Where this might be wrapped up in a method.
Now, I realise that I can make all this much less complicated, and leverage the async/await keywords of .NET 4.5. However, I have some questions: if I have a long running method that I launch using
// Start processing asynchronously.
IProgress<CostEngine.ProgressInfo> progressIndicator =
new Progress<CostEngine.ProgressInfo>();
cancelSource = new CancellationTokenSource();
CancellationToken token = cancelSource.Token;
CostEngine.ScriptProcessor script = new CostEngine.ScriptProcessor(this);
await script.ProcessScriptAsync(doc, progressIndicator, token);
where CostEngine.ProgressInfo is some basic class used to return progress information and the method ProcessScriptAsync is defined as
public async Task ProcessScriptAsync(SSGForm doc, IProgress<ProgressInfo> progressInfo,
CancellationToken token, bool bShowCompleted = true)
{
...
if (!await Task<bool>.Run(() => TheLongRunningProcess(doc)))
return;
...
}
I have two questions:
To get ProcessScriptAsync to return control to the UI almost immediately I await on a new Task<bool> delegate (this seemingly avoids an endless chain of async/awaits). Is this the right way to call ProcessScriptAsync? ['Lazy Initialisation', by wrapping in an outer method?]
To access the UI from within TheLongRunningProcess, do I merely pass in the UI TaskScheduler uiScheduler; i.e. TheLongRunningProcess(doc, uiScheduler), then use:
Task task = Task.Factory.StartNew(() =>
{
mainForm.progressLeftLabelText = _strProgressLabel;
}, CancellationToken.None,
TaskCreationOptions.None,
uiScheduler);
as before?
Sorry about the length and thanks for your time.
It depends. You've shown a lot of code, and yet omitted the one bit that you're actually asking a question about. First, without knowing what that code is, we can't know whether it's actually going to take a while or not. Next, if you await a task that's already completed, await will notice this and simply carry on synchronously instead of scheduling a continuation (this is an optimization, since scheduling tasks is time consuming). If the task you await isn't completed, then the continuation will still be executed in the calling SynchronizationContext, which will again keep the UI thread busy. You can use ConfigureAwait(false) to ensure that the continuation runs in the thread pool, though. This should handle both issues. Note that by doing this you can no longer access the UI controls in the ... sections of ProcessScriptAsync (without doing anything special). Also note that since ProcessScriptAsync is now executing in a thread pool thread, you don't need to use Task.Run to move the method call to a background thread.
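To make the effect of ConfigureAwait(false) concrete, here is a minimal sketch. CountingContext is a made-up stand-in for the UI's SynchronizationContext (it only counts the continuations posted to it and then runs them on the thread pool), so every name below is an assumption rather than part of the question's code.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ConfigureAwaitDemo
{
    // Minimal context that counts how many continuations are posted to it.
    public sealed class CountingContext : SynchronizationContext
    {
        public int Posts;

        public override void Post(SendOrPostCallback d, object state)
        {
            Interlocked.Increment(ref Posts);
            ThreadPool.QueueUserWorkItem(_ => d(state));
        }
    }

    static async Task CapturedAsync()
    {
        // The continuation after this await is posted back to the captured context.
        await Task.Delay(50);
    }

    static async Task NotCapturedAsync()
    {
        // The continuation after this await ignores the captured context.
        await Task.Delay(50).ConfigureAwait(false);
    }

    // Runs one of the two methods under a CountingContext and
    // returns how many continuations were posted to that context.
    public static int CountPosts(bool capture)
    {
        var previous = SynchronizationContext.Current;
        var context = new CountingContext();
        SynchronizationContext.SetSynchronizationContext(context);
        try
        {
            var task = capture ? CapturedAsync() : NotCapturedAsync();
            task.Wait(); // safe here: this context never needs the blocked thread
            return context.Posts;
        }
        finally
        {
            SynchronizationContext.SetSynchronizationContext(previous);
        }
    }
}
```

With a real UI context, every posted continuation is work the UI thread has to do; ConfigureAwait(false) is what keeps that count down.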
That's one option, yes. Although, if you're updating the UI based on progress, that's what IProgress is for. I see you're using it already, so that is the preferable model for doing this. If this is updating a separate type of progress than the existing IProgress you are passing (i.e. the status text, rather than the percent complete as an int) then you can pass a second IProgress for it.
I think trying to switch back and forth between a background thread (for CPU intensive operations or IO operations with no async support) and the UI thread (to manipulate UI controls) is often a sign of bad design. Your calculations and your UI code should be separate.
If you're doing this just to notify the UI of some sort of progress, then use IProgress<T>. Any marshaling between threads then becomes the responsibility of the implementation of that interface and you can use Progress<T>, which does it correctly using the SynchronizationContext.
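A minimal sketch of that separation, assuming a made-up Crunch method and an int "percent complete" as the progress type (neither is from the question): the calculation only ever talks to IProgress<int> and never touches a control.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ProgressDemo
{
    // CPU-bound work that knows nothing about the UI;
    // it reports only through IProgress<int>.
    static int Crunch(IProgress<int> progress, CancellationToken token)
    {
        int sum = 0;
        for (int i = 1; i <= 100; i++)
        {
            token.ThrowIfCancellationRequested();
            sum += i;
            if (i % 25 == 0)
                progress?.Report(i); // percent complete; null means "nobody is listening"
        }
        return sum;
    }

    // Moves the calculation to the thread pool and hands back a task to await.
    public static Task<int> CrunchAsync(IProgress<int> progress, CancellationToken token)
        => Task.Run(() => Crunch(progress, token), token);
}
```

In a WinForms handler you would create something like new Progress<int>(p => lblProgress.Text = p + "%") on the UI thread and pass it to CrunchAsync; Progress<T> captures the SynchronizationContext at construction, so the callback runs on the UI thread without any explicit marshaling.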
If you can't avoid mixing background thread code and UI thread code and your UI work isn't progress reporting (so IProgress<T> won't fit), I would probably enclose each bit of background thread code into its own await Task.Run(), and leave the UI code top level.
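Roughly, that shape looks like the sketch below. LoadData and Crunch are hypothetical background steps, and setStatus stands in for UI code such as assigning a label's text; in a real form the method would be called from the UI thread, so everything outside the Task.Run lambdas runs there.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class InterleavedDemo
{
    // Hypothetical background steps: pure computation, no UI access.
    static int[] LoadData() => Enumerable.Range(1, 100).ToArray();
    static int Crunch(int[] data) => data.Sum();

    // setStatus stands in for UI code such as lblProgress.Text = ...
    public static async Task<int> RunAsync(Action<string> setStatus)
    {
        setStatus("Loading...");                         // UI code, top level
        int[] data = await Task.Run(() => LoadData());   // background step 1
        setStatus("Crunching...");                       // UI code again
        int total = await Task.Run(() => Crunch(data));  // background step 2
        setStatus("Done: " + total);
        return total;
    }
}
```

The UI code stays readable as a straight sequence, and each background piece is visibly fenced off in its own Task.Run.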
Your solution of using a single Task.Run() to run the background thread code and then switch to the UI thread using StartNew() with uiScheduler will work too. In that case, some helper methods might be useful, especially if you wanted to use await in the UI code too. (Otherwise, you would have to remember to double await the result of StartNew())
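Such helpers might look something like this. StartOn and StartOnAsync are names I made up for illustration; the second one calls Unwrap() so that the caller gets a plain Task<T> and doesn't have to double await.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SchedulerHelpers
{
    // Runs a synchronous action on the given scheduler (e.g. uiScheduler).
    public static Task StartOn(this TaskScheduler scheduler, Action action)
        => Task.Factory.StartNew(
            action, CancellationToken.None, TaskCreationOptions.None, scheduler);

    // Runs an async delegate on the given scheduler.
    // StartNew returns Task<Task<T>> here; Unwrap() flattens it to Task<T>.
    public static Task<T> StartOnAsync<T>(this TaskScheduler scheduler, Func<Task<T>> function)
        => Task.Factory.StartNew(
            function, CancellationToken.None, TaskCreationOptions.None, scheduler).Unwrap();
}
```

From the background code you could then write await uiScheduler.StartOn(() => mainForm.progressLeftLabelText = _strProgressLabel); and a single await is enough in both cases.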
Yet another option would be to create a SwitchTo(TaskScheduler) method, which would return a custom awaiter that continues on the given scheduler. Such a method was in some of the async CTPs, but it was removed because it was deemed too dangerous when it comes to handling exceptions.
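For illustration only, a SwitchTo along those lines could be sketched as below. This is my own minimal version, not the CTP implementation, and the exception-handling caveat applies to it just the same.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// Acts as both the awaitable and the awaiter, to keep the sketch short.
public struct SchedulerAwaiter : INotifyCompletion
{
    private readonly TaskScheduler _scheduler;

    public SchedulerAwaiter(TaskScheduler scheduler) { _scheduler = scheduler; }

    public SchedulerAwaiter GetAwaiter() => this;

    // Always report "not completed" so the continuation is always rescheduled.
    public bool IsCompleted => false;

    // Queue the rest of the async method to the target scheduler.
    public void OnCompleted(Action continuation)
    {
        Task.Factory.StartNew(
            continuation, CancellationToken.None, TaskCreationOptions.None, _scheduler);
    }

    public void GetResult() { }
}

public static class SchedulerSwitchExtensions
{
    public static SchedulerAwaiter SwitchTo(this TaskScheduler scheduler)
        => new SchedulerAwaiter(scheduler);

    // Small demo: after the await, the method is running on the default
    // (thread pool) scheduler, whatever thread it started on.
    public static async Task<bool> DemoAsync()
    {
        await TaskScheduler.Default.SwitchTo();
        return Thread.CurrentThread.IsThreadPoolThread;
    }
}
```

With this in place, background code could write await uiScheduler.SwitchTo(); before touching controls and await TaskScheduler.Default.SwitchTo(); to go back, at the cost of making it much harder to see which thread any given line runs on.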