Sorry if this is a dumb question. I'm confused about Wait() and its variants in the Task Parallel Library.
Every single example I've seen waits on tasks to complete - is this considered good practice?
My scenario is this: I'm developing a Windows service that will run continuously. I would like to start a number of tasks, but I don't care whether they run to completion. I will set a cancellation token with a timeout, which will throw an error if something goes awry. So I don't see the need to wait for completion, but every darn example does it...
It really depends on what your situation needs. If, for instance, you want to launch a sub-process to carry out a procedure, say, firing off an email in parallel, you can do without waiting.
However, if you need to act on a result, or on some structure that the task modifies, you will need to wait.
If your tasks are self-contained and do not interact with or depend on each other, then I do not see why you would need to wait.
You only need to wait on a task if the code that is waiting requires the output of the task before it can proceed. If you don't need that output, don't wait.
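For the Windows service scenario in the question, a rough sketch of "start it and don't wait" might look like the following. BackgroundStarter, Start and onError are names made up for this illustration, and the timeout is implemented with a CancellationTokenSource that cancels itself after a delay. The one thing it does beyond not waiting is attach a continuation, so a failure still ends up somewhere (a log, for example) instead of vanishing.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class BackgroundStarter
{
    // Starts 'work' without waiting for it. The token passed to 'work' is
    // cancelled after 'timeout'; 'onError' is called only if the task faults.
    public static Task Start(Func<CancellationToken, Task> work, TimeSpan timeout, Action<Exception> onError)
    {
        var cts = new CancellationTokenSource(timeout);
        CancellationToken token = cts.Token;
        Task task = Task.Run(() => work(token), token);

        // Report a failure without anyone having to wait on the task.
        task.ContinueWith(t => onError(t.Exception.GetBaseException()),
            CancellationToken.None, TaskContinuationOptions.OnlyOnFaulted, TaskScheduler.Default);

        // Release the timer behind the token source once the task has finished.
        task.ContinueWith(_ => cts.Dispose(), TaskScheduler.Default);
        return task;
    }
}
```

The caller can simply ignore the returned task; it is only returned so that code which does care (a test, a shutdown routine) still has something to wait on.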
I have an async call (DoWorkAsync()) that I would like to start in a fire-and-forget way, i.e. I'm not interested in its result and would like the calling thread to continue even before the async method has finished.
What is the proper way to do this? I need this in both .NET Framework 4.6 and .NET Core 2, in case there are differences.
public async Task<MyResult> DoWorkAsync() {...}
public void StarterA() {
    Task.Run(() => DoWorkAsync());
}
public void StarterB() {
    Task.Run(async () => await DoWorkAsync());
}
Is it one of those two or something different/better?
//edit: Ideally without any extra libraries.
What is the proper way to do this?
First, you need to decide whether you really want fire-and-forget. In my experience, about 90% of people who ask for this actually don't want fire-and-forget; they want a background processing service.
Specifically, fire-and-forget means:
You don't care when the action completes.
You don't care if there are any exceptions when executing the action.
You don't care if the action completes at all.
So the real-world use cases for fire-and-forget are astoundingly small. An action like updating a server-side cache would be OK. Sending emails, generating documents, or anything business related is not OK, because you would (1) want the action to be completed, and (2) get notified if the action had an error.
The vast majority of the time, people don't want fire-and-forget at all; they want a background processing service. The proper way to build one of those is to add a reliable queue (e.g., Azure Queue / Amazon SQS, or even a database), and have an independent background process (e.g., Azure Function / Amazon Lambda / .NET Core BackgroundService / Win32 service) processing that queue. This is essentially what Hangfire provides (using a database for a queue, and running the background process in-proc in the ASP.NET process).
Is it one of those two or something different/better?
In the general case, there are a number of small behavior differences when eliding async and await. It's not something you would want to do "by default".
However, in this specific case - where the async lambda is only calling a single method - eliding async and await is fine.
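To make one of those "small behavior differences" concrete, here is a minimal sketch (ElisionDemo and its methods are hypothetical names, not from the question). Both methods validate an argument and then hand off to another task-returning method. The elided version throws at the call site; the async version captures the exception in the returned task.

```csharp
using System;
using System.Threading.Tasks;

public static class ElisionDemo
{
    private static Task<int> LengthCoreAsync(string text)
    {
        return Task.FromResult(text.Length);
    }

    // async/await elided: the argument check throws synchronously, at the call site.
    public static Task<int> Elided(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        return LengthCoreAsync(text);
    }

    // async/await kept: the same exception is stored in the returned task instead.
    public static async Task<int> Awaited(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        return await LengthCoreAsync(text);
    }
}
```

In the question's StarterA and StarterB the lambda does nothing but call DoWorkAsync(), so there is no code around the call for such a difference to show up in.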
It depends on what you mean by proper :)
For instance: are you interested in the exceptions being thrown in your "fire and forget" calls? If not, then this is sort of fine. What you might need to think about, though, is the environment the task lives in.
For instance, suppose this is an ASP.NET application and you do this while handling a call to an .aspx or .svc. The task runs on a thread-pool (background) thread that ASP.NET knows nothing about, so the application pool may be recycled before your "fire and forget" task has completed.
So also think about which thread your tasks live on.
I think this article gives you some useful information on that:
https://www.hanselman.com/blog/HowToRunBackgroundTasksInASPNET.aspx
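Also note that an exception thrown inside a task is not rethrown on the calling thread by itself: it is stored on the Task and only surfaces when you await the task, call Wait() or Result, or read its Exception property. If you fire and forget, nobody sees it. The reference book for Microsoft exam 70-483 covers this.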
There is probably a free version of that online somewhere ;P https://www.amazon.com/Exam-Ref-70-483-Programming-C/dp/0735676828
It may also be useful to know that if you have an async method being called from a non-async method and you need its result, you can use .GetAwaiter().GetResult(). This blocks the calling thread until the task has finished.
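A small sketch of that call, with made-up names (SyncOverAsync, ComputeAsync, ComputeBlocking). One practical detail: GetAwaiter().GetResult() rethrows the original exception, whereas Result and Wait() wrap it in an AggregateException.

```csharp
using System;
using System.Threading.Tasks;

public static class SyncOverAsync
{
    // Stand-in for some real asynchronous operation.
    public static async Task<int> ComputeAsync(bool fail)
    {
        await Task.Delay(10).ConfigureAwait(false);
        if (fail) throw new InvalidOperationException("failed");
        return 42;
    }

    // Non-async caller: blocks the current thread until the task has finished.
    public static int ComputeBlocking(bool fail)
    {
        return ComputeAsync(fail).GetAwaiter().GetResult();
    }
}
```

Blocking like this on a UI thread or in classic ASP.NET can deadlock if the async method tries to resume on the captured context, which is why the sketch uses ConfigureAwait(false).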
Also I think it is important to note the difference between async and multi-threading.
Async is mainly useful when an operation spends its time in parts of the computer other than the CPU, such as networking or other I/O. Using async then tells the system to go ahead and use the CPU somewhere else instead of blocking a thread that is just waiting for a response.
Multi-threading is the allocation of operations to different threads (for instance, starting a task, which runs on a thread-pool thread). Thread-pool threads are background threads: foreground threads keep the process alive, whereas background threads are stopped once every foreground thread has finished.
This allows the CPU to work on different tasks at the same time.
Combining the two means a 4-thread CPU is not tied up by 4 threads that are merely waiting: while asynchronous operations wait on I/O, those threads are free to pick up other work.
I hope this gives you the information you need to do whatever it is you are doing :)
I have a C# program, which has an "Agent" class. The program creates several Agents, and each Agent has a "run()" method, which executes a Task (i.e.: Task.Factory.StartNew()...).
Each Agent performs some calculations, and then needs to wait for all the other Agents to finish their calculations before proceeding to the next stage (its actions will depend on the calculations of the others).
In order to make an Agent wait, I have created a CancellationTokenSource (named "tokenSource"), and in order to alert the program that this Agent is going to sleep, I raise an event. Thus, the 2 consecutive commands are:
(1) OnWaitingForAgents(new EventArgs());
(2) tokenSource.Token.WaitHandle.WaitOne();
(The event is caught by an "AgentManager" class, which runs on a thread of its own, and the 2nd command makes the Agent's Task thread sleep until the Cancellation Token is signalled.)
Each time the above event is fired, the AgentManager class catches it, and adds +1 to a counter. If the number of the counter equals the number of Agents used in the program, the AgentManager (which holds a reference to all Agents) wakes each one up as follows:
agent.TokenSource.Cancel();
Now we reach my problem: the 1st command is executed asynchronously by an Agent, then, due to a context switch between threads, the AgentManager seems to catch the event and goes on to wake up all the Agents. BUT the current Agent has not even reached the 2nd command yet!
Thus, the Agent receives a "wake up" signal and only then goes to sleep, which means it gets stuck sleeping with no one to wake it up!
Is there a way to make the 2 consecutive calls atomic, so that no context switch happens between them, thus forcing the Agent to go to sleep before the AgentManager has the chance to wake it up?
The low-level technique that you are asking about is thread synchronisation. What you have there is a critical section (or part of one), and you need to protect access to it. I'm surprised that you've learned about multithreaded programming without having learned about thread synchronisation and critical sections yet! It's essential to know about these things for any kind of "low-level" multithreaded programming.
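As a rough sketch of what "protecting the critical section" could look like here: the class below (Rendezvous is a made-up name) keeps the arrival counter and the decision to sleep under one lock, so "announce that I'm waiting" and "go to sleep" can no longer be separated by a wake-up. It uses Monitor.Wait and Monitor.PulseAll rather than the event plus CancellationTokenSource from the question, so treat it as a different technique, not a patch to the original code.

```csharp
using System.Threading;

public sealed class Rendezvous
{
    private readonly object _gate = new object();
    private readonly int _participants;
    private int _arrived;
    private int _generation;

    public Rendezvous(int participants)
    {
        _participants = participants;
    }

    // Blocks until 'participants' threads have called this method, then releases them all.
    public void SignalAndWait()
    {
        lock (_gate)
        {
            int generation = _generation;
            _arrived++;
            if (_arrived == _participants)
            {
                // Last one in: reset for the next stage and wake everybody.
                _arrived = 0;
                _generation++;
                Monitor.PulseAll(_gate);
                return;
            }
            // Monitor.Wait releases the lock while sleeping, and the loop
            // re-checks the condition after every wake-up.
            while (generation == _generation)
            {
                Monitor.Wait(_gate);
            }
        }
    }
}
```

Each Agent would call SignalAndWait() at the end of a stage. .NET 4 also ships System.Threading.Barrier, which offers the same kind of SignalAndWait() out of the box.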
Maybe look into Parallel.Invoke or Parallel.For in .NET 4, which allow you to execute methods in parallel and wait until all of the parallel methods have completed.
http://msdn.microsoft.com/en-us/library/dd992634.aspx
Seems like that would help you out a lot, and take care of all the queuing for you.
Hmm... I don't think it's a good idea (or even possible) to develop software in .NET while worrying about context switches, since neither Windows nor .NET is real-time. You probably have another kind of problem in that code.
As I understand it, you simply run all your agents in parallel, and you want to wait until all of them have finished before going on to the next stage. You can use several techniques to accomplish that; the easiest one would be Monitor.Wait(Object monitor) and Monitor.PulseAll(Object monitor).
The task library offers several ways to do it as well. As @jishi has pointed out, you can use the Parallel flavours, or spawn a lot of Tasks and then wait for them all with the Task.WaitAll(Task[] tasks) method.
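A minimal sketch of the Task.WaitAll route, assuming the stages can be restructured so that each stage is a batch of tasks (StagedAgents and the toy arithmetic are placeholders for the real calculations):

```csharp
using System.Linq;
using System.Threading.Tasks;

public static class StagedAgents
{
    // Stage one runs for every agent in parallel; stage two only starts once
    // all stage-one results exist, so no agent has to sleep and be woken up.
    public static int[] Run(int agentCount)
    {
        var stageOne = new Task<int>[agentCount];
        for (int i = 0; i < agentCount; i++)
        {
            int id = i; // copy the loop variable for the closure
            stageOne[i] = Task.Run(() => id * id);
        }
        Task.WaitAll(stageOne);

        int total = stageOne.Sum(t => t.Result);
        var stageTwo = new Task<int>[agentCount];
        for (int i = 0; i < agentCount; i++)
        {
            int id = i;
            stageTwo[i] = Task.Run(() => total - stageOne[id].Result);
        }
        Task.WaitAll(stageTwo);

        return stageTwo.Select(t => t.Result).ToArray();
    }
}
```

With this shape the manager, not the agents, owns the synchronisation point, which sidesteps the wake-up-before-sleep race entirely.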
Each time the above event is fired, the AgentManager class catches it, and adds +1 to a counter.
How are you adding 1 to that counter, and how are you reading it? You should use Interlocked.Increment to ensure an atomic operation, and read it with a volatile read (Thread.VolatileRead, for example), or simply put both inside a lock statement.
I am using HttpWebRequest.BeginGetResponse() to make 500 asynchronous HTTP requests from a single method. I would like that method to wait until I get a response from all the requests or they time out.
What is the best way to do this?
I'm currently wrapping the asynchronous calls within a List of Task objects to use Task.WaitAll(), but I don't want to go too far down the rabbit hole before I know that this is a good solution.
Any ideas?
EDIT
I implemented counters, and they work, but I'm curious about using delegates as shown on this page.
Multi-threading and Async Examples
Has anybody done something like this before? Is it overkill?
I'm currently wrapping the asynchronous calls within a List of Task objects to use Task.WaitAll()
This is a fairly clean solution if you truly want to force these "tasks" to synchronize and block at this point. This is the main rationale behind Task.WaitAll(), and is nice since it (optionally) allows you to cancel the blocking operation after a timeout, if you so choose.
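A hedged sketch of the blocking-with-timeout variant. BatchWaiter and CountFinished are invented names, and the tasks are whatever you built by wrapping the Begin/End calls. The method waits at most the given time and then reports how many tasks ran to completion, so the caller can tell the "everything answered" case from the "some timed out or failed" case.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class BatchWaiter
{
    // Blocks until every task has finished or 'timeout' has elapsed,
    // then returns how many tasks completed successfully.
    public static int CountFinished(Task[] tasks, TimeSpan timeout)
    {
        try
        {
            Task.WaitAll(tasks, timeout);
        }
        catch (AggregateException)
        {
            // At least one task faulted or was cancelled; the status check below covers it.
        }
        return tasks.Count(t => t.Status == TaskStatus.RanToCompletion);
    }
}
```

For example, `BatchWaiter.CountFinished(requestTasks, TimeSpan.FromSeconds(30))` would return 500 only if every wrapped request finished successfully within thirty seconds (requestTasks being your own array of tasks).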
Personally I wouldn't block the thread; it defeats the purpose of the async model.
If I absolutely had to wait for these web requests to finish before continuing I would instead keep a counter that is incremented each time you get called back on a successful or failed request.
Check the counter on each callback and if it has hit your desired count then let the thread continue...
This way you can also keep your UI nice and responsive and perhaps update a counter/progress bar. Even if you're not kicking these off on the UI thread, it's nice to provide some visual feedback to the user about what is going on.
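The counter idea could be sketched roughly like this (CompletionCounter is a made-up name). Every callback calls Signal(), and whichever callback brings the count up to the expected total runs the continuation. Interlocked.Increment makes the count safe when callbacks arrive on several threads at once.

```csharp
using System;
using System.Threading;

public sealed class CompletionCounter
{
    private readonly int _expected;
    private readonly Action _onAllDone;
    private int _seen;

    public CompletionCounter(int expected, Action onAllDone)
    {
        _expected = expected;
        _onAllDone = onAllDone;
    }

    // Call from every request callback, whether the request succeeded or failed.
    public void Signal()
    {
        // Interlocked.Increment returns the new value, so exactly one caller sees '_expected'.
        if (Interlocked.Increment(ref _seen) == _expected)
        {
            _onAllDone();
        }
    }
}
```

Note that onAllDone runs on whatever thread delivered the last callback, so anything that touches the UI would still have to be marshalled back to the UI thread.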
Is it possible to purge a ThreadPool?
Remove items from the ThreadPool?
Anything like that?
ThreadPool.QueueUserWorkItem(GetDataThread);
RegisteredWaitHandle Handle = ThreadPool.RegisterWaitForSingleObject(CompletedEvent, WaitProc, null, 10000, true);
Any thoughts?
I recommend using the Task class (added in .NET 4.0) if you need this kind of behaviour. It supports cancellation, and you can have any number of tasks listening to the same cancellation token, which enables you to cancel them all with a single method call.
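A minimal sketch of several tasks listening to one token (SharedCancellation, StartWorkers and the polling loop are illustrative stand-ins for real work):

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class SharedCancellation
{
    // Starts 'count' workers that all observe the same token.
    public static Task[] StartWorkers(int count, CancellationToken token)
    {
        var tasks = new Task[count];
        for (int i = 0; i < count; i++)
        {
            tasks[i] = Task.Run(() => WorkAsync(token), token);
        }
        return tasks;
    }

    // Placeholder work loop: checks the token between small units of work.
    private static async Task WorkAsync(CancellationToken token)
    {
        while (true)
        {
            token.ThrowIfCancellationRequested();
            await Task.Delay(10, token).ConfigureAwait(false);
        }
    }
}
```

A single `cts.Cancel()` on the CancellationTokenSource that produced the token then ends every worker, including any that have not started running yet.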
Updated (non-4.0 solution):
You really only have two choices. One: implement your own event demultiplexer (this is far more complex than it appears, due to the 64-handle wait limitation); I can't recommend this - I had to do it once (in unmanaged code), and it was hideous.
That leaves the second choice: have a signal to cancel the tasks. Naturally, RegisteredWaitHandle.Unregister can cancel the RegisterWaitForSingleObject part. The QueueUserWorkItem part is more complex, but can be done by making the action aware of a "token" value. When the action executes, it first checks the current token value against its stored token value; if they are different, then it shouldn't do anything.
One major thing to consider is race conditions. Just keep in mind that there is a race condition between cancelling an action and the ThreadPool executing it, so it is possible to see actions running after cancellation.
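The token idea might be sketched as follows. CancellableQueue, Wrap and Invalidate are names invented for this example; this is not the CallbackContext type from the blog post. Each queued action remembers the token value from the moment it was queued and does nothing if the value has changed by the time the ThreadPool gets around to it.

```csharp
using System;
using System.Threading;

public sealed class CancellableQueue
{
    private int _currentToken;

    // Wraps 'action' so that it only runs if Invalidate() has not been called since.
    public WaitCallback Wrap(Action action)
    {
        int tokenAtQueueTime = Volatile.Read(ref _currentToken);
        return _ =>
        {
            if (Volatile.Read(ref _currentToken) == tokenAtQueueTime)
            {
                action();
            }
        };
    }

    public void Queue(Action action)
    {
        ThreadPool.QueueUserWorkItem(Wrap(action));
    }

    // "Cancels" everything queued so far by changing the token value.
    public void Invalidate()
    {
        Interlocked.Increment(ref _currentToken);
    }
}
```

The race described above still applies: an action that has already passed the check when Invalidate() is called will run to the end.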
I have a blog post on this concept, which I call "asynchronous callback contexts". The CallbackContext type mentioned in the blog post is available in the Nito.Async library.
There's no interface for removing a queued item. However, nothing stops you from "poisoning" the delegate so that it returns immediately.
edit
Based on what Paul said, I'm thinking you might also want to consider a pipelined architecture, where you have a fixed number of threads reading from a blocking queue (like .NET 4.0's BlockingCollection on a ConcurrentQueue). This way, if you want to cancel items, you can just access the queue yourself.
Having said that, Stephen's advice about Task is likely better, in that it gives you all the control you would realistically want, without all the hard work that rolling your own pipelines involves. I mention this only for completeness.
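For what it's worth, a bare-bones version of that pipeline could look like the sketch below. WorkPipeline and Purge are hypothetical names, and error handling is left out: an exception thrown by a work item would take down its worker thread.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public sealed class WorkPipeline : IDisposable
{
    private readonly BlockingCollection<Action> _queue =
        new BlockingCollection<Action>(new ConcurrentQueue<Action>());
    private readonly Thread[] _workers;

    public WorkPipeline(int workerCount)
    {
        _workers = new Thread[workerCount];
        for (int i = 0; i < workerCount; i++)
        {
            _workers[i] = new Thread(Consume) { IsBackground = true };
            _workers[i].Start();
        }
    }

    public void Enqueue(Action work)
    {
        _queue.Add(work);
    }

    // Removes everything that has not been picked up yet; returns how many items were dropped.
    public int Purge()
    {
        int removed = 0;
        Action ignored;
        while (_queue.TryTake(out ignored))
        {
            removed++;
        }
        return removed;
    }

    private void Consume()
    {
        // Blocks while the queue is empty; ends once CompleteAdding() was called and the queue is drained.
        foreach (Action work in _queue.GetConsumingEnumerable())
        {
            work();
        }
    }

    // Stops accepting work, lets the workers finish what is queued, then joins them.
    public void Dispose()
    {
        _queue.CompleteAdding();
        foreach (Thread worker in _workers)
        {
            worker.Join();
        }
        _queue.Dispose();
    }
}
```

Because you own the queue, "purging" is just a matter of taking items out before a worker does, which is exactly what the ThreadPool does not let you do.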
The ThreadPool exists to help you manage your threads. You should not have to worry about purging it at all since it will make the best performance decisions on your behalf.
If you think you need tighter control over your threads then you could consider creating your own thread management class (similar to ThreadPool) but it would take a lot of work to match and exceed the functionality that ThreadPool has built in.
Take a look here at some of the ThreadPool optimizations and the ideas behind it.
For my second point, I found an article on Code Project that implements a "Cancelable Threadpool", probably for some of your own similar reasons. It would be a good place to start looking if you're going to write your own.
I've got a program I'm creating (in C#) and I see two approaches:
1) A job manager that waits for any of X threads to finish; when one has finished, it gets the next chunk of work, creates a new thread and gives it that chunk
or
2) We create X threads to start with, give each of them a chunk of work, and when a thread finishes a chunk it asks the job manager for more work. If there isn't any more work, it sleeps and then asks again, with the sleep becoming progressively longer.
This program will be a run-and-done, though I could see it turning into a service that continually looks for more jobs.
Each chunk will consist of a number of data ids, a call to the database to get some info or perform an operation on the data id, and then writing info on the data id to the database.
Assuming you are aware of the additional precautions that need to be taken when dealing with multithreaded database operations, it sounds like you're describing two different scenarios. In the first, you have several threads running, and once ALL of them finish it will look for new work. In the second, you have several threads running and their operations are completely parallel. Your environment is going to determine the proper approach to take: if something ties the work of the several threads together, so that additional work cannot continue until all of them are finished, then go with the former. If they don't have much effect on each other, go with the latter.
The second option isn't really right, as making the sleep time progressively longer means that you will unnecessarily keep those threads blocked.
Rather, you should have a pooled set of threads like the second option, but they use WaitHandles to wait for work and use a producer/consumer pattern. Basically, when the producer indicates that there is work, it sends a signal to a consumer (there will be a manager which will determine which thread will get the work, and then signal that thread) which will wake up and start working.
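A stripped-down sketch of the wait-for-a-signal idea, with a single consumer thread to keep it short (SignalledWorker and Post are invented names; a real pool would hold several of these and let the manager pick one). The worker sleeps on an AutoResetEvent instead of polling with ever longer sleeps.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public sealed class SignalledWorker : IDisposable
{
    private readonly Queue<Action> _jobs = new Queue<Action>();
    private readonly AutoResetEvent _workAvailable = new AutoResetEvent(false);
    private readonly Thread _thread;
    private volatile bool _stopping;

    public SignalledWorker()
    {
        _thread = new Thread(Loop) { IsBackground = true };
        _thread.Start();
    }

    // Producer side: hand over a job and wake the worker.
    public void Post(Action job)
    {
        lock (_jobs)
        {
            _jobs.Enqueue(job);
        }
        _workAvailable.Set();
    }

    private void Loop()
    {
        while (true)
        {
            Action job = null;
            lock (_jobs)
            {
                if (_jobs.Count > 0) job = _jobs.Dequeue();
            }
            if (job != null)
            {
                job();
                continue; // keep draining before going back to sleep
            }
            if (_stopping) return;
            // The event stays set if Post() ran after the check above, so no wake-up is lost.
            _workAvailable.WaitOne();
        }
    }

    // Lets the worker finish the queued jobs, then stops it.
    public void Dispose()
    {
        _stopping = true;
        _workAvailable.Set();
        _thread.Join();
        _workAvailable.Dispose();
    }
}
```

The thread costs nothing while it waits and reacts as soon as work is posted, which is the advantage over a sleep that keeps getting longer.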
You might want to look into the Task Parallel Library. It's in beta now, but if you can use it and are comfortable with it, I would recommend it, as it will manage a great deal of this for you (and much better, taking into account the number of cores on a machine, the optimal number of threads, etc.).
The former solution (spawn a thread for each new piece of work) is easier to code, and not too bad if the units of work are large enough.
The second solution (a thread pool with a queue of work) is more complicated to code, but supports smaller units of work.
Instead of rolling your own solution, you should look at the ThreadPool class in the .NET framework. You could use the QueueUserWorkItem method. It should do exactly what you want to accomplish.
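Roughly, and with the caveat that ChunkRunner, ProcessAll and the int[] chunk type are assumptions made for this sketch, queuing the chunks on the ThreadPool and waiting for the batch could look like this. A CountdownEvent is added so that the run-and-done program knows when everything has been processed.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class ChunkRunner
{
    // Queues one work item per chunk and blocks until all of them have finished.
    public static void ProcessAll(IReadOnlyList<int[]> chunks, Action<int[]> processChunk)
    {
        if (chunks.Count == 0) return;

        using (var remaining = new CountdownEvent(chunks.Count))
        {
            foreach (int[] chunk in chunks)
            {
                ThreadPool.QueueUserWorkItem(state =>
                {
                    // An exception escaping a ThreadPool work item ends the process,
                    // so processChunk should handle its own errors.
                    try { processChunk((int[])state); }
                    finally { remaining.Signal(); }
                }, chunk);
            }
            remaining.Wait();
        }
    }
}
```

The pool decides how many threads actually run at once, so there is no X to tune and no sleeping worker asking for more work.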