I've got a WPF application which makes use of await and async methods extensively. There are several places where I call await Task.Delay(...); to insert pauses. But the trouble I'm running into is that while many of these pauses are fine to be a little bit off, there are some places where I absolutely need the pause to be precise. So in other words if I call await Task.Delay(2000); there are some places in my application where I need to guarantee that it's only going to pause for 2 seconds (and not 2.2+ seconds).
I believe the trouble comes from the fact that I do have so many different async methods, so that when I tell one of them to delay, there aren't enough threads left in the thread pool right when it's supposed to come back alive, which results, inadvertently, in longer delays than intended.
What are the C# "best practices" for threading when you have a business need for your delays to be as accurate as possible? Clearly it doesn't seem like async methods are enough on their own (even though they are nice to read). Should I manually create a thread with a higher priority and use Thread.Sleep? Do I up the number of threads in the thread pool? Do I use a BackgroundWorker?
You can't really make Task.Delay itself more accurate: it's based on an internal System.Threading.Timer, which has a resolution of about 15 ms, and scheduling the callback onto the thread pool takes additional time on top of that.
If you really need to be accurate you need a dedicated thread. You can have it sleep for 2 seconds with Thread.Sleep and when it wakes up do what you need to do.
Since Thread.Sleep causes a context switch (the thread gives up the CPU), an even more accurate option would be a "busy wait" (e.g. a while loop spinning until the time is up). That avoids the cost of the context switch back onto the CPU, which takes some time.
You should realize, though, that these options consume a lot of resources, so consider whether that level of accuracy is really necessary.
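As a sketch of that idea (the method name and the 20 ms hand-off point are my own choices, not anything prescribed): sleep through most of the interval, then busy-wait on a Stopwatch for the final stretch, which gives much tighter timing than Task.Delay at the cost of burning one core while spinning:

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Sketch: wait out most of the interval with Thread.Sleep, then burn the
// last ~20 ms in a busy loop for tighter accuracy.
static void PreciseDelay(int milliseconds)
{
    var sw = Stopwatch.StartNew();
    int coarse = milliseconds - 20;  // leave ~20 ms for the busy wait
    if (coarse > 0)
        Thread.Sleep(coarse);
    // Busy-wait the remainder (costs a full core while spinning).
    while (sw.ElapsedMilliseconds < milliseconds) { /* spin */ }
}

var timer = Stopwatch.StartNew();
PreciseDelay(50);
timer.Stop();
Console.WriteLine($"Waited {timer.ElapsedMilliseconds} ms");
```

The busy wait guarantees the delay is never shorter than requested, and the overshoot is typically far smaller than the ~15 ms timer granularity.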
public static async void ExecuteWithDelay(this Action action, int delay)
{
    //await Task.Delay(delay);
    Action a = () => new System.Threading.ManualResetEventSlim(false).Wait(delay);
    await Task.Factory.StartNew(a);
    action?.Invoke();
}
Related
Let's say I properly use async-await, like
await client.GetStringAsync("http://stackoverflow.com");
I understand that the thread that invokes the await becomes "free", that is, something further up the call chain isn't stuck executing some loop equivalent to
bool done = false;
string html = null;
for(; !done; done = GetStringIfAvailable(ref html));
which is what it would be doing if I called the synchronous version of GetStringAsync (probably called GetString by convention).
However, here's where I get confused. Even if the calling thread, or any other thread in the application's pool of available threads, isn't blocked with such a loop, then something is, because, as I understand it, at a low level there is always polling going on. So instead of lowering the total amount of work, I'm simply pushing work to something "beneath" my application's threads ... or something like that.
Can someone clear this up for me?
No.
The compiler will convert methods that use async / await into state machines that can be broken up into multiple steps. Once an await is hit, the state of the method is stored and execution is "offloaded" back to the thread that called it. If the task is waiting on things like disk IO, the OS kernel will end up relying on physical CPU interrupts to let the kernel know when to signal the application to resume processing. The state of the pending method is loaded and queued up on an available thread (the same thread that hit the await if ConfigureAwait is true, or any free thread if false) (this last part isn't exactly right; please see Scott Chamberlain's comments below). Think of it like an event, where the application asks the hardware to "ping" it once the work is done, while the application gets back to doing whatever it was doing before.
There are some cases where a new thread is spun up to do the work, such as Task.Run which does the work on a ThreadPool thread, but no thread is blocking while awaiting it to complete.
It is important to keep in mind that asynchronous operations using async / await are all about pausing, storing, retrieving, and resuming that state machine. The mechanism doesn't really care about what happens inside the Task; what happens there, and how it happens, isn't directly related to async / await.
I was very confused by async / await too, until I really understood how the method is converted to a state-machine. Reading up on exactly what your async methods get converted to by the compiler might help.
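To make the transformation concrete, here is a rough hand-written approximation (the real compiler output uses IAsyncStateMachine and builder structs, so this is only a conceptual sketch with made-up method names): the code after an await becomes a continuation callback, and no thread loops or blocks in between:

```csharp
using System;
using System.Threading.Tasks;

// The async version...
static async Task<int> GetLengthAsync(Task<string> download)
{
    string html = await download;  // method "pauses" here; no thread blocks
    return html.Length;            // ...and resumes here when the task completes
}

// ...is conceptually similar to chaining a continuation by hand.
// The code after the await becomes a callback; nothing spins or polls.
static Task<int> GetLengthByHand(Task<string> download)
{
    return download.ContinueWith(t => t.Result.Length);
}

int a = await GetLengthAsync(Task.FromResult("hello"));
int b = await GetLengthByHand(Task.FromResult("hello"));
Console.WriteLine($"{a} {b}");
```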
You're pushing it off onto the operating system--which will run some other thread if it can rather than simply wait. It only ends up in a busy-wait when it can't find any thread that wants to run.
I need to constantly perform 20 repetitive, CPU-intensive calculations as fast as possible. So there are 20 tasks, each containing a method looped in:
while(!token.IsCancellationRequested)
to repeat them as fast as possible. All calculations are performed at the same time. Unfortunately this makes the program unresponsive, so I added:
await Task.Delay(15);
At this point the program doesn't hang, but adding the Delay is not the correct approach and it unnecessarily slows down the calculations. It is a WPF program without MVVM. What approach would you suggest to keep all 20 tasks working at the same time? Each of them should be repeated as soon as it finishes. I would like to keep CPU utilisation (all cores) at or near maximum to ensure the best efficiency.
EDIT:
There are 20 controls in which the user adjusts some parameters. Calculations are done in:
private async Task Calculate()
{
    Task task001 = null;
    task001 = Task.Run(async () =>
    {
        while (!CTSFor_task001.IsCancellationRequested)
        {
            await Task.Delay(15);
            await CPUIntensiveMethod();
        }
    }, CTSFor_task001.Token);
}
Each control is independent. Calculations are 100% CPU-bound; there is no I/O activity (all values come from variables). During calculations the values of some UI items are changed:
this.Dispatcher.BeginInvoke(new Action(() =>
{
    this.lbl_001.Content = "someString";
}));
Let me just write the whole thing as an answer. You're confusing two related, but ultimately separate concepts (thankfully - that's why you can benefit from the distinction). Note that those are my definitions of the concepts - you'll hear tons of different names for the same things and vice versa.
Asynchronicity is about breaking the imposed synchronicity of operations (ie. op 1 waits for op 2, which waits for op 3, which waits for op 4...). For me, this is the more general concept, but nowadays it's more commonly used to mean what I'd call "inherent asynchronicity" - ie. the algorithm itself is asynchronous, and we're only using synchronous programming because we have to (and thanks to await and async, we don't have to anymore, yay!).
The key thought here is waiting. I can't do anything on the CPU, because I'm waiting for the result of an I/O operation. This kind of asynchronous programming is based on the thought that asynchronous operations are almost CPU free - they are I/O bound, not CPU-bound.
Parallelism is a special kind of the general asynchronicity, in which the operations don't primarily wait for one another. In other words, I'm not waiting, I'm working. If I have four CPU cores, I can ideally use four computing threads for this kind of processing - in an ideal world, my algorithm will scale linearly with the number of available cores.
With asynchronicity (waiting), using more threads will improve the apparent speed regardless of the number of the available logical cores. This is because 99% of the time, the code doesn't actually do any work, it's simply waiting.
With parallelism (working), using more threads is directly tied to the number of available work cores.
The lines blur a lot. That's because of things you may not even know are happening; for example, the CPU (and the computer as a whole) is incredibly asynchronous on its own - the apparent synchronicity it shows is only there to allow you to write code synchronously. All the optimizations and asynchronicity are limited by the fact that on output, everything is synchronous again. If the CPU had to wait for data from memory every time you do i++, it wouldn't matter whether your CPU was operating at 3 GHz or 100 MHz. Your awesome 3 GHz CPU would sit there idle 99% of the time.
With that said, your calculation tasks are CPU-bound. They should be executed using parallelism, because they are doing work. On the other hand, the UI is I/O bound, and it should be using asynchronous code.
In reality, all your async Calculate method does is that it masks the fact that it's not actually inherently asynchronous. Instead, you want to run it asynchronously to the I/O.
In other words, it's not the Calculate method that's asynchronous. It's the UI that wants this to run asynchronously to itself. Remove all that Task.Run clutter from there, it doesn't belong.
What to do next? That depends on your use case. Basically, there's two scenarios:
You want the tasks to always run, always in the background, from start to end. In that case, simply create a thread for each of them, and don't use Task at all. You might also want to explore some options like a producer-consumer queue etc., to optimize the actual run-time of the different possible calculation tasks. The actual implementation is quite tightly bound to what you're actually processing.
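A minimal sketch of that first scenario (the names, the shared cancellation source, and the use of 2 threads instead of 20 are all my own illustrative choices): one dedicated background thread per calculation, cancelled via a CancellationTokenSource:

```csharp
using System;
using System.Threading;

var cts = new CancellationTokenSource();
long iterations = 0;

// One dedicated thread per always-running calculation (sketch uses 2, not 20).
var threads = new Thread[2];
for (int i = 0; i < threads.Length; i++)
{
    threads[i] = new Thread(() =>
    {
        while (!cts.Token.IsCancellationRequested)
        {
            // Stand-in for the CPU-intensive calculation.
            Interlocked.Increment(ref iterations);
        }
    }) { IsBackground = true };  // background threads won't keep the process alive
    threads[i].Start();
}

Thread.Sleep(100);  // let the calculations run for a bit
cts.Cancel();
foreach (var t in threads) t.Join();
Console.WriteLine($"Completed {Interlocked.Read(ref iterations)} iterations");
```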
Or, you want to start the task on an UI action, and then work with the resulting values back in the UI method that started them when the results are ready. In that case, await finally comes to play:
private async void btn_Click(object sender, EventArgs e)
{
    var result = await Task.Run(Calculate);
    // Do some (little) work with the result once we get it
    tbxResult.Text = result;
}
The async keyword actually has no place in your code at all.
Hope this is more clear now, feel free to ask more questions.
So what you actually seek is a clarification of a good practice to maximize performance while keeping the UI responsive. As Luaan clarified, the async and await sections in your proposal will not benefit your problem, and Task.Run is not suited for your work; using threads is a better approach.
Define an array of Threads to run one on each logical processor. Distribute your task data between them and control your 20 repetitive calculations via BufferBlock provided in TPL DataFlow library.
To keep UI responsive, I suggest two approaches:
Your calculations demand many frequent UI updates: Put their required update information in a queue and update them in Timer event.
Your calculations demand scarce UI updates: Update UI with an invocation method like Control.BeginInvoke
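A sketch of the first (queue plus timer) approach, under the assumption that the batching logic is what matters: in WPF you would use a DispatcherTimer on the UI thread so the drain step can touch controls; this self-contained version uses System.Threading.Timer and a counter as a stand-in for the UI update:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

var pendingUpdates = new ConcurrentQueue<string>();
int applied = 0;

// Calculations just enqueue their results; they never touch the UI directly.
for (int i = 0; i < 5; i++)
    pendingUpdates.Enqueue($"result {i}");

// The timer drains the queue in one batch per tick. In WPF this would be a
// DispatcherTimer, so the "apply" step runs on the UI thread.
using var uiTimer = new Timer(_ =>
{
    while (pendingUpdates.TryDequeue(out var update))
    {
        // Stand-in for: lbl.Content = update;
        Interlocked.Increment(ref applied);
    }
}, null, dueTime: 0, period: 50);

Thread.Sleep(200);  // give the timer a few ticks
Console.WriteLine($"Applied {applied} updates");
```

Batching this way means 20 fast calculation loops generate at most one UI invalidation per timer tick instead of one per iteration.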
As #Luaan says, I would strongly recommend reading up on async/await, the key point being it doesn't introduce any parallelism.
I think what you're trying to do is something like the simple example below, where you kick off CPUIntensiveMethod on the thread pool and await its completion. await returns control from the Calculate method (allowing the UI thread to continue working) until the task completes, at which point it continues with the while loop.
private async Task Calculate()
{
    while (!CTSFor_task001.IsCancellationRequested)
    {
        await Task.Run(CPUIntensiveMethod);
    }
}
I'm trying to run a piece of code periodically with time intervals in between. There might be multiple number of such code pieces running simultaneously so I turned to Task.Run to utilize asynchronous method calls and parallelism. Now I wonder how should I implement the time intervals!
The straightforward way would be to use Task.Delay like this:
var t = Task.Run(async delegate
{
    await Task.Delay(1000);
    return 42;
});
But I wonder if this is the right way to do it, since I believe all Task.Delay does is sleep the thread and resume it once the period is over (though I'm not sure). If that's the case, then the system has to pay for the task's thread resources even while the task isn't running.
If this is the case, is there any way to run a task after a period of time without wasting any system resources?
Task.Delay does not cause the thread to sleep. It uses a timer.
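You can verify this yourself: thousands of concurrent Task.Delay calls complete in roughly one delay period, which would be impossible if each delay parked a thread (the thread pool would need thousands of threads, or the delays would serialize):

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

var sw = Stopwatch.StartNew();

// 1,000 concurrent 100 ms delays. No thread is blocked while each delay is
// pending; a timer completes the task when the interval elapses.
var delays = Enumerable.Range(0, 1000).Select(_ => Task.Delay(100)).ToArray();
await Task.WhenAll(delays);

sw.Stop();
Console.WriteLine($"1000 delays finished in {sw.ElapsedMilliseconds} ms");
```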
Consider a queue holding a lot of jobs that need processing. Limitation of queue is can only get 1 job at a time and no way of knowing how many jobs there are. The jobs take 10s to complete and involve a lot of waiting for responses from web services so is not CPU bound.
If I use something like this
while (true)
{
    var job = Queue.PopJob();
    if (job == null)
        break;
    Task.Factory.StartNew(job.Execute);
}
Then it will furiously pop jobs from the queue much faster than it can complete them, run out of memory and fall on its ass. >.<
I can't use (I don't think) ParallelOptions.MaxDegreeOfParallelism because I can't use Parallel.Invoke or Parallel.ForEach
3 alternatives I've found
Replace Task.Factory.StartNew with:
Task task = new Task(job.Execute, TaskCreationOptions.LongRunning);
task.Start();
which seems to somewhat solve the problem, but I am not clear on exactly what this is doing or whether it is the best method.
Create a custom task scheduler that limits the degree of concurrency
Use something like BlockingCollection to add jobs to collection when started and remove when finished to limit number that can be running.
With #1 I've got to trust that the right decision is automatically made, #2/#3 I've got to work out the max number of tasks that can be running myself.
Have I understood this correctly - which is the better way, or is there another way?
EDIT - This is what I've come up with from the answers below, producer-consumer pattern.
As well as overall throughput, the aim was not to dequeue jobs faster than they could be processed, and not to have multiple threads polling the queue (not shown here, but that's a non-blocking op and will lead to huge transaction costs if polled at high frequency from multiple places).
// BlockingCollection<>(1) will block if we try to add more than 1 job to the
// queue (no point in being greedy!), or if it is empty on Take.
BlockingCollection<Job> jobs = new BlockingCollection<Job>(1);

// Set up a number of consumer threads.
// Determine MAX_CONSUMER_THREADS empirically; with a 4-core CPU and 50% of the
// time in a job blocked waiting on IO, it will likely be 8.
for (int numConsumers = 0; numConsumers < MAX_CONSUMER_THREADS; numConsumers++)
{
    Thread consumer = new Thread(() =>
    {
        while (!jobs.IsCompleted)
        {
            var job = jobs.Take();
            job.Execute();
        }
    });
    consumer.Start();
}

// Producer: take items off the queue and put them in the blocking collection,
// ready for processing.
while (true)
{
    var job = Queue.PopJob();
    if (job != null)
        jobs.Add(job);
    else
    {
        jobs.CompleteAdding();
        // May need to wait for running jobs to finish
        break;
    }
}
I just gave an answer which is very applicable to this question.
Basically, the TPL Task class is made to schedule CPU-bound work. It is not made for blocking work.
You are working with a resource that is not CPU: waiting for service replies. This means the TPL will mismanage your resource because it assumes CPU-boundedness to a certain degree.
Manage the resources yourself: Start a fixed number of threads or LongRunning tasks (which is basically the same). Decide on the number of threads empirically.
You can't put unreliable systems into production. For that reason, I recommend #1 but throttled. Don't create as many threads as there are work items. Create as many threads which are needed to saturate the remote service. Write yourself a helper function which spawns N threads and uses them to process M work items. You get totally predictable and reliable results that way.
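Such a helper could be sketched like this (the names and the use of a ConcurrentQueue to hold the M work items are my own choices): spawn exactly N dedicated threads and have them drain the work until it runs out:

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;

// Process all work items on exactly `threadCount` dedicated threads.
static void RunOnThreads<T>(ConcurrentQueue<T> work, int threadCount, Action<T> process)
{
    var threads = new Thread[threadCount];
    for (int i = 0; i < threadCount; i++)
    {
        threads[i] = new Thread(() =>
        {
            while (work.TryDequeue(out var item))
                process(item);  // blocking work is fine: these are our own threads
        });
        threads[i].Start();
    }
    foreach (var t in threads) t.Join();  // wait for everything to finish
}

var items = new ConcurrentQueue<int>(Enumerable.Range(0, 20));
int processed = 0;
RunOnThreads(items, threadCount: 4, _ => Interlocked.Increment(ref processed));
Console.WriteLine($"Processed {processed} items");
```

The thread count is the single knob: set it to whatever saturates the remote service, and the behavior is completely predictable regardless of what the TPL's heuristics would have done.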
Potential flow splits and continuations caused by await, later on in your code or in a 3rd party library, won't play nicely with long running tasks (or threads), so don't bother using long running tasks. In the async/await world, they're useless. More details here.
You can call ThreadPool.SetMaxThreads but before you make this call, make sure you set the minimum number of threads with ThreadPool.SetMinThreads, using values below or equal to the max ones. And by the way, the MSDN documentation is wrong. You CAN go below the number of cores on your machine with those method calls, at least in .NET 4.5 and 4.6 where I used this technique to reduce the processing power of a memory limited 32 bit service.
If however you don't wish to restrict the whole app but just the processing part of it, a custom task scheduler will do the job. A long time ago, MS released samples with several custom task schedulers, including a LimitedConcurrencyLevelTaskScheduler. Spawn the main processing task manually with Task.Factory.StartNew, providing the custom task scheduler, and every other task spawned by it will use it, including async/await and even Task.Yield, used for achieving asynchrony early on in an async method.
But for your particular case, both solutions won't stop exhausting your queue of jobs before completing them. That might not be desirable, depending on the implementation and purpose of that queue of yours. They are more like "fire a bunch of tasks and let the scheduler find the time to execute them" type of solutions. So perhaps something a bit more appropriate here could be a stricter method of control over the execution of the jobs via semaphores. The code would look like this:
semaphore = new SemaphoreSlim(max_concurrent_jobs);

while (...)
{
    job = Queue.PopJob();
    semaphore.Wait();
    ProcessJobAsync(job);
}

async Task ProcessJobAsync(Job job)
{
    await Task.Yield();
    // ... Process the job here ...
    semaphore.Release();
}
There's more than one way to skin a cat. Use what you believe is appropriate.
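A fully asynchronous variant of the same idea (a self-contained sketch with made-up job bodies; Task.Delay stands in for awaiting the web service) uses SemaphoreSlim.WaitAsync so the pumping loop never blocks a thread either, and releases the permit in a finally block so a failed job can't leak it:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

var semaphore = new SemaphoreSlim(3);  // at most 3 jobs in flight
int maxObserved = 0, inFlight = 0;

async Task ProcessJobAsync(int job)
{
    try
    {
        int now = Interlocked.Increment(ref inFlight);
        InterlockedMax(ref maxObserved, now);
        await Task.Delay(20);  // stand-in for awaiting a web service
        Interlocked.Decrement(ref inFlight);
    }
    finally
    {
        semaphore.Release();  // always hand the permit back, even on failure
    }
}

// Lock-free "store the maximum value seen so far".
static void InterlockedMax(ref int target, int value)
{
    int current;
    while (value > (current = Volatile.Read(ref target)) &&
           Interlocked.CompareExchange(ref target, value, current) != current) { }
}

var tasks = new List<Task>();
for (int job = 0; job < 10; job++)
{
    await semaphore.WaitAsync();  // throttles the producer without blocking a thread
    tasks.Add(ProcessJobAsync(job));
}
await Task.WhenAll(tasks);
Console.WriteLine($"Max concurrency observed: {maxObserved}");
```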
Microsoft has a very cool library called DataFlow which does exactly what you want (and much more). Details here.
You should use the ActionBlock class and set the MaxDegreeOfParallelism of the ExecutionDataflowBlockOptions object. ActionBlock plays nicely with async/await, so even when your external calls are awaited, no new jobs will begin processing.
ExecutionDataflowBlockOptions actionBlockOptions = new ExecutionDataflowBlockOptions
{
    MaxDegreeOfParallelism = 10
};

this.sendToAzureActionBlock = new ActionBlock<List<Item>>(
    async items => await ProcessItems(items),
    actionBlockOptions);
...
this.sendToAzureActionBlock.Post(itemsToProcess);
The problem here doesn't seem to be too many running Tasks, it's too many scheduled Tasks. Your code will try to schedule as many Tasks as it can, no matter how fast they are executed. And if you have too many jobs, this means you will get OOM.
Because of this, none of your proposed solutions will actually solve your problem. If it seems that simply specifying LongRunning solves your problem, then that's most likely because creating a new Thread (which is what LongRunning does) takes some time, which effectively throttles getting new jobs. So, this solution only works by accident, and will most likely lead to other problems later on.
Regarding the solution, I mostly agree with usr: the simplest solution that works reasonably well is to create a fixed number of LongRunning tasks and have one loop that calls Queue.PopJob() (protected by a lock if that method is not thread-safe) and Execute()s the job.
UPDATE: After some more thinking, I realized the following attempt will most likely behave terribly. Use it only if you're really sure it will work well for you.
But the TPL tries to figure out the best degree of parallelism, even for IO-bound Tasks. So, you might try to use that to your advantage. Long Tasks won't work here, because from the point of view of TPL, it seems like no work is done and it will start new Tasks over and over. What you can do instead is to start a new Task at the end of each Task. This way, TPL will know what's going on and its algorithm may work well. Also, to let the TPL decide the degree of parallelism, at the start of a Task that is first in its line, start another line of Tasks.
This algorithm may work well. But it's also possible that the TPL will make a bad decision regarding the degree of parallelism, I haven't actually tried anything like this.
In code, it would look like this:
void ProcessJobs(bool isFirst)
{
    var job = Queue.PopJob(); // assumes PopJob() is thread-safe
    if (job == null)
        return;

    if (isFirst)
        Task.Factory.StartNew(() => ProcessJobs(true));

    job.Execute();

    Task.Factory.StartNew(() => ProcessJobs(false));
}
And start it with
Task.Factory.StartNew(() => ProcessJobs(true));
TaskCreationOptions.LongRunning is useful for blocking tasks, and using it here is legitimate. What it does is suggest to the scheduler that it dedicate a thread to the task. The scheduler itself tries to keep the number of threads at the same level as the number of CPU cores to avoid excessive context switching.
It is well described in Threading in C# by Joseph Albahari
I use a message queue/mailbox mechanism to achieve this. It's akin to the actor model. I have a class that has a MailBox. I call this class my "worker." It can receive messages. Those messages are queued and they, essentially, define tasks that I want the worker to run. The worker will use Task.Wait() for its Task to finish before dequeueing the next message and starting the next task.
By limiting the number of workers I have, I am able to limit the number of concurrent threads/tasks that are being run.
This is outlined, with source code, in my blog post on a distributed compute engine. If you look at the code for IActor and the WorkerNode, I hope it makes sense.
https://long2know.com/2016/08/creating-a-distributed-computing-engine-with-the-actor-model-and-net-core/
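Stripped of the distributed parts, the mailbox idea can be sketched as follows (the Worker class and its members are my own illustrative names, not the blog post's API): a dedicated thread drains a BlockingCollection, so each worker processes its messages strictly one at a time, in order:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

// A worker owns a mailbox; messages are processed one at a time, in order.
sealed class Worker : IDisposable
{
    private readonly BlockingCollection<Action> mailbox = new();
    private readonly Thread thread;

    public Worker()
    {
        thread = new Thread(() =>
        {
            foreach (var message in mailbox.GetConsumingEnumerable())
                message();  // the next message only starts after this one finishes
        });
        thread.Start();
    }

    public void Post(Action message) => mailbox.Add(message);

    public void Dispose()
    {
        mailbox.CompleteAdding();  // no more messages; let the thread drain and exit
        thread.Join();
    }
}

var log = new List<int>();
using (var worker = new Worker())
{
    for (int i = 0; i < 5; i++)
    {
        int n = i;
        worker.Post(() => log.Add(n));  // single consumer thread => no lock needed
    }
}
Console.WriteLine(string.Join(",", log));
```

Concurrency is then bounded simply by how many Worker instances you create.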
I have a Winform which needs to wait for about 3 - 4 hours. I can't close and somehow reopen the App, as it does few things in background, while it waits.
To achieve the wait (without blocking the UI thread, among other reasons), I have a BackgroundWorker to which I pass how many milliseconds to wait, and I call Thread.Sleep(waitTime); in its DoWork event. In the RunWorkerCompleted event, I do what the program is supposed to do after the wait.
This works fine on the development machine, i.e. the wait ends when it should. But on the test machine, it keeps waiting longer. It has happened twice: the first time it waited exactly 1 hour more than the specified time, and the second time about 2 hours and 40 minutes more.
Could there be any obvious reason for this to happen or am I missing something?
The dev machine is Win XP and Test machine is Win 7.
I propose to use ManualResetEvent instead:
http://msdn.microsoft.com/en-us/library/system.threading.manualresetevent.aspx
ManualResetEvent mre = new ManualResetEvent(false);
mre.WaitOne(waitTime);
...
//your background worker process
mre.Set();
As a bonus, you will have the ability to interrupt this wait early by calling Set from another thread.
Have a look at this article which explains the reason:
Thread.Sleep(n) means block the current thread for at least the number
of timeslices (or thread quantums) that can occur within n
milliseconds. The length of a timeslice is different on different
versions/types of Windows and different processors and generally
ranges from 15 to 30 milliseconds. This means the thread is almost
guaranteed to block for more than n milliseconds. The likelihood that
your thread will re-awaken exactly after n milliseconds is about as
impossible as impossible can be. So, Thread.Sleep is pointless for
timing.
By the way it also explains why not to use Thread.Sleep ;)
I agree with the other recommendations to use a Timer instead of Thread.Sleep.
In my humble opinion, the difference in wait time cannot solely be explained by the information you have given us. I would really think that the cause of the difference lies in the moment the sleep starts, i.e. the actual Thread.Sleep(waitTime); call. Are you sure the sleep is called at the moment you think it is?
And, as suggested by the comment, if you really need to wait this long, consider using a Timer to start the events needed, or even scheduling of some sort within your application. Of course, this depends on your actual implementation and thus can be easier said than done. But it 'feels' silly to let a BackgroundWorker sleep for so long.
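The Timer approach can be sketched like this (a self-contained example; the 100 ms interval is shortened for illustration, where the real app would use the 3-4 hour interval in milliseconds): a one-shot System.Timers.Timer with AutoReset = false fires once, and the follow-up work goes in its Elapsed handler:

```csharp
using System;
using System.Threading;

var done = new ManualResetEventSlim(false);

// One-shot timer: fires once after the interval; no thread sleeps meanwhile.
var timer = new System.Timers.Timer(100);  // real app: 3-4 hours in milliseconds
timer.AutoReset = false;                   // fire once, then stop
timer.Elapsed += (s, e) =>
{
    // The work that used to live in RunWorkerCompleted goes here.
    done.Set();
};
timer.Start();

bool fired = done.Wait(TimeSpan.FromSeconds(5));
Console.WriteLine(fired ? "Timer fired" : "Timer did not fire");
```

Unlike a sleeping BackgroundWorker, no thread is tied up for the hours-long wait, and the timer can be stopped or restarted if the user cancels.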
PREFIX: This requires .NET 4.5 or newer
Consider making your function async and simply doing:
await Task.Delay(waitTime);
Alternately, if you can't make your function async (or don't want to) you could also do:
Task.Delay(waitTime).Wait();
This is a one-line solution and anyone with a copy of Reflector can verify that Task.Delay uses a timer internally.