Task parallel library replacement for BackgroundWorker? - c#

Does the task parallel library have anything that would be considered a replacement or improvement over the BackgroundWorker class?
I have a WinForms application with a wizard-style UI, and it does some long-running tasks. I want to be able to have a responsive UI with the standard progress bar and ability to cancel the operation. I've done this before with BackgroundWorker, but I'm wondering if there are some TPL patterns that can be used instead?

The Task class is an improvement over the BackgroundWorker; it naturally supports nesting (parent/child tasks), uses the new cancellation API, task continuations, etc.
I have an example on my blog, showing the old BackgroundWorker way of doing things and the new Task way of doing things. I do have a small helper class for tasks that need to report progress, because I find the syntax rather awkward. The example covers result values, error conditions, cancellation, and progress reporting.
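As a rough illustration of the Task-based approach (this is not the blog example itself), here is a minimal sketch that covers a result value, cooperative cancellation, and progress reporting. It assumes .NET 4.5 or later for Task.Run and IProgress<T>; the class name TaskWorkerSketch and the fake "work" are made up for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TaskWorkerSketch
{
    // Hypothetical long-running job: performs `steps` units of work,
    // reporting percent complete and honouring cancellation requests.
    public static Task<int> RunAsync(int steps, IProgress<int> progress, CancellationToken token)
    {
        return Task.Run(() =>
        {
            int done = 0;
            for (int i = 1; i <= steps; i++)
            {
                token.ThrowIfCancellationRequested(); // cooperative cancellation
                Thread.Sleep(10);                     // stand-in for real work
                done = i;
                if (progress != null) progress.Report(i * 100 / steps);
            }
            return done; // plays the role of e.Result in BackgroundWorker
        }, token);
    }
}
```

In a WinForms handler you would typically create a Progress<int> on the UI thread (it captures the current SynchronizationContext, so the callback can touch the progress bar), keep a CancellationTokenSource for the Cancel button, and await the returned task inside a try/catch for OperationCanceledException.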

BackgroundWorker is still a valid way of achieving this. If you are running multiple large operations concurrently, the Parallel Extensions are worth considering; if it's just the one, I would stick with BackgroundWorker.

Related

Why Use Async/Await Over Normal Threading or Tasks?

I've been reading a lot about async and await, and at first I didn't get it because I didn't properly understand threading or tasks. But after getting to grips with both I wonder: why use async/await if you're comfortable with threads?
The asynchrony of async/await can be achieved with thread signaling, Thread.Join(), etc. Is it merely there to save coding time and hassle?
Yes, it is syntactic sugar that makes dealing with threads much easier. It also makes the code easier to maintain, because the thread management is done by the runtime. await releases the thread immediately and allows that thread or another one to pick up where it left off, even when it is used on the main thread.
Like other abstractions, if you want complete control over the mechanisms under the covers, then you are still free to implement similar logic using thread signaling, etc.
If you are interested in seeing what async/await produces then you can use Reflector or ILSpy to decompile the generated code.
Read What does async & await generate? for a description of what C# 5.0 is doing on your behalf.
If await was just calling Task.Wait we wouldn't need special syntax and new APIs for that. The major difference is that async/await releases the current thread completely while waiting for completion. During an async IO there is no thread involved at all. The IO is just a small data structure inside of the kernel.
async/await uses callback-based waiting under the hood and makes all its nastiness (think of JavaScript callbacks...) go away.
Note that async does not just move the work to a background thread (in general). It releases all threads involved.
Comparing async and await with threads is like comparing apples and pipe wrenches. From 10,000 feet they may look similar, but they are very different solutions to very different problems.
async and await are all about asynchronous programming; specifically, allowing a method to pause itself while it's waiting for some operation. When the method pauses, it returns to its caller (usually returning a task, which is completed when the method completes).
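To make the "pause and return to the caller" behaviour concrete, here is a small sketch (assuming C# 5 / .NET 4.5; the names PauseSketch and WorkAsync are invented for illustration). It records the order in which things happen in a console-style program with no UI context.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class PauseSketch
{
    private static async Task WorkAsync(ConcurrentQueue<string> log)
    {
        log.Enqueue("work: start");
        await Task.Delay(200);        // the method pauses here and returns to its caller
        log.Enqueue("work: resumed"); // runs later, when the delay has completed
    }

    public static string[] Run()
    {
        var log = new ConcurrentQueue<string>();
        Task pending = WorkAsync(log);   // runs synchronously up to the first await
        log.Enqueue("caller: got task"); // the caller continues while the method is paused
        pending.Wait();                  // blocking is acceptable here only because there is no UI context
        return log.ToArray();
    }
}
```

The caller's entry lands between the two entries written by WorkAsync, which is the point: the method handed back an incomplete task instead of holding on to the calling thread.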
I assume you're familiar with threading, which is about managing threads. The closest parallel to a thread in the async world is Task.Run, which starts executing some code on a background thread and returns a task which is completed when that code completes.
async and await were carefully designed to be thread-agnostic. So they work quite well in the UI thread of WPF/Win8/WinForms/Silverlight/WP apps, keeping the UI thread responsive without tying up thread pool resources. They also work quite well in multithreaded scenarios such as ASP.NET.
If you're looking for a good intro to async/await, I wrote up one on my blog which has links to additional recommended reading.
There is a difference between threads and the async/await feature.
Think about a situation where you are calling out to the network to get some data. Here the thread that is calling the network driver (probably running in some svchost process) keeps itself blocked and consumes resources.
With async/await, if the call is network bound, it wraps the entire call into a callback using SynchronizationContext, which is capable of receiving the callback from an external process. This frees the thread, which becomes available for other work.
Asynchrony and concurrency are two different things: the former is just calling something in async mode, while the latter is really CPU bound. Threads are generally better when you need concurrency.
I wrote a blog post long ago describing these features:
C# 5.0 vNext - New Asynchronous Pattern
async/await does not use threads; that's one of the big advantages. It keeps your application responsive without the added complexity and overhead inherent in threads.
The purpose is to make it easy to keep an application responsive when dealing with long-running, I/O intensive operations. For example, it's great if you have to download a bunch of data from a web site, or read files from disk. Spinning up a new thread (or threads) is overkill in those cases.
The general rule is to use threads via Task.Run when dealing with CPU-bound operations, and async/await when dealing with I/O bound operations.
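A minimal sketch of that rule of thumb, assuming .NET 4.5 or later. Task.Delay stands in for a real I/O call (a download or a file read), and the class and method names are hypothetical.

```csharp
using System.Threading.Tasks;

public static class AsyncVsCpuSketch
{
    // I/O-bound stand-in: nothing is computed while waiting, so the method simply awaits.
    public static async Task<string> FetchAsync(string name)
    {
        await Task.Delay(100); // placeholder for a real network or disk operation
        return "data for " + name;
    }

    // CPU-bound work: pushed onto a thread-pool thread with Task.Run.
    public static Task<long> SumSquaresAsync(int n)
    {
        return Task.Run(() =>
        {
            long sum = 0;
            for (int i = 1; i <= n; i++) sum += (long)i * i;
            return sum;
        });
    }
}
```

From a UI event handler both can be consumed the same way (await FetchAsync(...), await SumSquaresAsync(...)); the difference is only in whether a thread is kept busy while the caller waits.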
Stephen Toub has a great blog post on async/await that I recommend you read.

Why to choose System.Threading over BackgroundWorker?

Why would I choose to work directly with System.Threading over BackgroundWorker, if the latter abstracts the threading management for me?
I can't see cases where I couldn't use BackgroundWorker to replace System.Threading.
BackgroundWorker has been around since .NET 2.0 and was intended to aid in writing code that will run in a background thread and not bog down the UI thread. It originally appeared with Windows Forms, but also works with WPF or any future UI framework that registers a synchronization context. It allows you to report progress and results back to the UI thread without having to deal with InvokeRequired/BeginInvoke, and it also supports cancellation.
The Task Parallel Library (TPL) was introduced in .NET 4 and is intended to model asynchronous tasks. These tasks are asynchronous and may or may not be run on another thread. Examples of things that don't run on another thread are asynchronous IO and tasks that need to run on the UI (while still being asynchronous). This task metaphor also supports futures (or continuations), so that you can chain tasks together with ContinueWith, sometimes using specific synchronization contexts so that you can do things like run a task on a UI thread (to update the UI, for example).
Tasks also support cancellation and multiple tasks can share a cancellation token so a requested cancellation cancels multiple tasks.
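A small sketch of several tasks sharing one cancellation token, so that a single Cancel() call stops all of them. It assumes .NET 4.5 for Task.Run; the worker body is a made-up placeholder.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class SharedCancellationSketch
{
    // Starts `count` hypothetical workers that all observe the same token.
    public static Task[] StartWorkers(int count, CancellationToken token)
    {
        var tasks = new Task[count];
        for (int i = 0; i < count; i++)
        {
            tasks[i] = Task.Run(() =>
            {
                for (int step = 0; step < 1000; step++)
                {
                    token.ThrowIfCancellationRequested(); // every worker checks the shared token
                    Thread.Sleep(10);                     // stand-in for a slice of real work
                }
            }, token);
        }
        return tasks;
    }
}
```

Calling Cancel() on the CancellationTokenSource that produced the token moves every worker to the Canceled state; waiting on them then surfaces an AggregateException wrapping the cancellation.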
One of the differences is that a Task doesn't have an inherent method of reporting progress back to the UI. Of course it's possible, but it's not built into the interfaces.
If you only have one thing you want to do in the background and you specifically want to communicate back to a UI like report progress, I would recommend BackgroundWorker. Otherwise I generally recommend using Task<T> (or Task if no result is necessary). Task is inherently used in the C# 5 async/await syntax...
I hope you try to think about the intention behind each approach.
BackgroundWorker was designed mainly for Windows Forms at the very beginning (though it can be used in WPF as well), and it only offers some of the functionality of asynchronous operations. Compare it to the classes under System.Threading and you can see that BackgroundWorker is obviously built upon them.
With the classes under System.Threading, you can build your own BackgroundWorker and enjoy more functionality and control over your code. The difficulty here is a steep learning curve and harder debugging.
So if you think BackgroundWorker is enough, keep using it. If you find something missing, building blocks in System.Threading can be your helpers.
In .NET Framework 4, Microsoft designed another set of classes on top of System.Threading, named the Task-based Asynchronous Pattern:
http://www.microsoft.com/en-us/download/details.aspx?id=19957
Using it, you can almost forget about BackgroundWorker, as it offers much more functionality and gives you enough control, while not requiring you to dive into the complexity of working with System.Threading directly.
I have a blog post on the subject.
In short, you should use async Tasks if you possibly can. Thread does provide some additional "knobs" - such as Priority - but usually those knobs are not necessary, and programmers often turn them the wrong way anyway.
For one, you cannot set scheduling priority on a BackgroundWorker, but you can on a Thread.
Thread.Priority Property
Comments that question my answer continue to refer to Task and ThreadPool. The stated question is not about Task nor ThreadPool and neither is my answer.
Please refer to the code sample from the link above. It clearly demonstrates assigning priority prior to starting the thread and control over starting the thread.
Complete code sample:
PriorityTest priorityTest = new PriorityTest();
ThreadStart startDelegate = new ThreadStart(priorityTest.ThreadMethod);
Thread threadOne = new Thread(startDelegate);
threadOne.Name = "ThreadOne";
Thread threadTwo = new Thread(startDelegate);
threadTwo.Name = "ThreadTwo";
threadTwo.Priority = ThreadPriority.BelowNormal;
threadOne.Start();
threadTwo.Start();
// Allow counting for 10 seconds.
Thread.Sleep(10000);
priorityTest.LoopSwitch = false;
I tested this, and ThreadTwo starts and finishes on ThreadPriority.BelowNormal. In my test threadOne processes about 10 times as much as threadTwo.
BackgroundWorker has no Priority property. A BackgroundWorker starts with the default priority of Normal. BackgroundWorker thread priority can be changed in DoWork, but changing the priority of a thread once the work has started is clearly not the same.

Which threads work method I need to use?

I have audio player application (c# .NET 4.0 WPF) that gets an audio-stream from the web and plays it. The app also displays waveforms and spectrums and saves the audio to local disk. It also does a few more things.
My question is: when I receive a new byte packet from the web and I need to play it (and maybe write it to local disk etc.), do I need to use threads? I tried doing everything on the main thread and it seems to work well.
I can work with threadpool for every bytes packet that I received in my connection. Would this be a reasonable approach?
For this you can use the Task Parallel Library (TPL). The Task Parallel Library (TPL) is a set of public types and APIs in the System.Threading and System.Threading.Tasks namespaces in the .NET Framework version 4. The purpose of the TPL is to make developers more productive by simplifying the process of adding parallelism and concurrency to applications. The TPL scales the degree of concurrency dynamically to most efficiently use all the processors that are available. In addition, the TPL handles the partitioning of the work, the scheduling of threads on the ThreadPool, cancellation support, state management, and other low-level details.
Another option (if the operations you were performing were sufficiently long running) is the BackgroundWorker class. The BackgroundWorker component gives you the ability to execute time-consuming operations asynchronously ("in the background"), on a thread different from your application's main UI thread. To use a BackgroundWorker, you simply tell it what time-consuming worker method to execute in the background, and then you call the RunWorkerAsync method. Your calling thread continues to run normally while the worker method runs asynchronously. When the method is finished, the BackgroundWorker alerts the calling thread by firing the RunWorkerCompleted event, which optionally contains the results of the operation. This may not be the best option for you if you have many operations to undertake sequentially.
The next alternative, which has been largely replaced by the TPL, is the Thread class. It is not as easy to use as the TPL, and you can do (almost) everything with the TPL that you can with the Thread class, while the TPL is much more user friendly.
I hope this helps.
I suggest using two threads: in one you download packets from the web and put them in a queue (it can be the UI thread if you are using an async download operation), and in the other you take packets from the queue and process them.
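The two-thread arrangement could be sketched with BlockingCollection<T> (available since .NET 4) as the queue. This is only an illustration under assumptions: the "packets" are dummy byte arrays produced by the calling thread, and "processing" just records their sizes.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

public static class PacketQueueSketch
{
    public static List<int> Run(int packetCount)
    {
        var queue = new BlockingCollection<byte[]>();
        var processedSizes = new List<int>();

        // Consumer thread: handles packets in arrival order until the queue is marked complete.
        var consumer = new Thread(() =>
        {
            foreach (byte[] packet in queue.GetConsumingEnumerable())
                processedSizes.Add(packet.Length); // stand-in for play / save / analyze
        });
        consumer.Start();

        // Producer (here the calling thread): stand-in for packets arriving from the network.
        for (int i = 1; i <= packetCount; i++)
            queue.Add(new byte[i]);
        queue.CompleteAdding(); // lets the consumer's foreach loop finish

        consumer.Join();
        return processedSizes;
    }
}
```

Because a single consumer drains the queue, packets are processed in the order they were received, which matters for audio playback.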

Processing multiple inputs in the background in C# .NET4 application

I am looking for an appropriate pattern and best modern way to solve the following problem:
My application is expecting inputs from multiple sources, for example: GUI, monitoring file-system, voice command, web request, etc. When an input is received I need to send it to some ProcessInput(InputData arg) method that would start processing the data in the background, without blocking the application to receive and process more data, and in some way return some results whenever the processing is complete. Depending on the input, the processing can take significantly different amounts of time. For starters I don't need the ability to check the progress or cancel the processing.
After reading a dozen articles on MSDN and blog posts by some rock-star programmers, I am really confused about what pattern should be used here, and more importantly which features of .NET
My findings are:
ThreadPool.QueueUserWorkItem - easiest to understand, not very convenient for returning the results
BackgroundWorker - seems to be used only for rather simple tasks, all workers run on a single thread?
Event-based Asynchronous Pattern
Tasks in Task Parallel Library
C# 5 async/await - these seem to be shortcuts for Tasks from the Task Parallel Library
Notes:
Performance is important, so taking advantage of multi-core system when possible would be really nice.
This is not a web application.
My problem reminds me of a TCP server (really any sort of server), where the application is constantly listening for new connections/data on multiple sockets. I found the article Asynchronous Server Socket and I am curious whether that pattern could be a possible solution for me.
My application is expecting inputs from multiple sources, for example: GUI, monitoring file-system, voice command, web request, etc.
I've done a whole lot of asynchronous programming in my time. I find it useful to distinguish between background operations and asynchronous events. A "background operation" is something that you initiate, and some time later it completes. An "asynchronous event" is something that's always going on independent of your program; you can subscribe, receive the events for a time, and then unsubscribe.
So, GUI inputs and file-system monitoring would be examples of asynchronous events; whereas web requests are background operations. Background operations can also be split into CPU-bound (e.g., processing some input in a pipeline) and I/O-bound (e.g., web request).
I make this distinction especially in .NET because different approaches have different strengths and weaknesses. When doing your evaluations, you also need to take into consideration how errors are propagated.
First, the options you've already found:
ThreadPool.QueueUserWorkItem - almost the worst option around. It can only handle background operations (no events), and doesn't handle I/O-bound operations well. Returning results and errors are both manual.
BackgroundWorker (BGW) - not the worst, but definitely not the best. It also only handles background operations (no events), and doesn't handle I/O-bound operations well. Each BGW runs in its own thread - which is bad, because you can't take advantage of the work-stealing self-balancing nature of the thread pool. Furthermore, the completion notifications are (usually) all queued to a single thread, which can cause a bottleneck in very busy systems.
Event-Based Asynchronous Pattern (EAP) - This is the first option from your list that would support asynchronous events as well as background operations, and it also can efficiently handle I/O-bound operations. However, it's somewhat difficult to program correctly, and it has the same problem as BGW where completion notifications are (usually) all queued to a single thread. (Note that BGW is the EAP applied to CPU-bound background operations). I wrote a library to help in writing EAP components, along with some EAP-based sockets. But I do not recommend this approach; there are better options available these days.
Tasks in Task Parallel Library - Task is the best option for background operations, both CPU-bound and I/O-bound. I review several background operation options on my blog - but that blog post does not address asynchronous events at all.
C# 5 async/await - These allow a more natural expression of Task-based background operations. They also offer an easy way to synchronize back to the caller's context if you want to (useful for UI-initiated operations).
Of these options, async/await are the easiest to use, with Task a close second. The problem with those is that they were designed for background operations and not asynchronous events.
Any asynchronous event source may be consumed using asynchronous operations (e.g., Task) as long as you have a sufficient buffer for those events. When you have a buffer, you can just restart the asynchronous operation each time it completes. Some buffers are provided by the OS (e.g., sockets have read buffers, UI windows have message queues, etc), but you may have to provide other buffers yourself.
Having said that, here are my recommendations:
Task-based Asynchronous Pattern (TAP) - using either await/async or Task directly, use TAP to model at least your background operations.
TPL Dataflow (part of VS Async) - allows you to set up "pipelines" for data to travel through. Dataflow is based on Tasks. The disadvantage to Dataflow is that it's still developing and (IMO) not as stable as the rest of the Async support.
Reactive Extensions (Rx) - this is the only option that is specifically designed for asynchronous events, not just background operations. It's officially released (unlike VS Async and Dataflow), but the learning curve is steeper.
All three of these options are efficient (using the thread pool for any actual processing), and they all have well-defined semantics for error handling and results. I do recommend using TAP as much as possible; those parts can then easily be integrated into Dataflow or Rx.
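As a rough sketch of what modelling the question's ProcessInput as a TAP-style background operation might look like on .NET 4: Task.Factory.StartNew is used because Task.Run only arrived in .NET 4.5. The InputData fields and the "processing" are invented for illustration.

```csharp
using System.Threading.Tasks;

// Hypothetical shape of the question's InputData; the fields are assumptions.
public sealed class InputData
{
    public string Source;
    public string Payload;
}

public static class InputProcessorSketch
{
    // Returns immediately; the work runs on a thread-pool thread and the
    // result (or any exception) is carried by the returned task.
    public static Task<string> ProcessInputAsync(InputData arg)
    {
        return Task.Factory.StartNew(() =>
        {
            // Stand-in for real processing of the input.
            return arg.Source + ":" + arg.Payload.ToUpperInvariant();
        });
    }
}
```

Each input source can call ProcessInputAsync without blocking and attach a continuation (ContinueWith, or await under C# 5) to pick up the result. An exception thrown inside the delegate faults the task and is rethrown when the result is observed, rather than being lost.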
You mentioned "voice commands" as one possible input source. You may be interested in a BuildWindows video where Stephen Toub sings -- and uses Dataflow to harmonize his voice in near-realtime. (Stephen Toub is one of the geniuses behind TPL, Dataflow, and Async).
IMO using a thread pool is the way to go WRT processing the input. Take a look at http://smartthreadpool.codeplex.com. It provides a very nice API (using generics) for waiting on results. You could use this in conjunction with Asynchronous Server Socket implementation. It may also be worth your while to take a look at Jeff Richter's Power Threading Lib: http://www.wintellect.com/Resources/Downloads
I am by no means an expert in these matters, but I did some research on the subject recently and I'm very pleased with the results achieved with the MS TPL library. Tasks give you a nice wrapper around ThreadPool threads and are optimized for parallel processing, so they tend to perform better. If you are able to use .NET 4.0 for your project, you should probably explore using tasks. They represent a more advanced way of dealing with async operations and provide a nice way to cancel operations in progress using CancellationToken objects.
Here is a short example of accessing the UI thread from a different thread using tasks:
private void TaskUse()
{
    // The task body runs on a thread-pool thread once the task is started.
    var task = new Task<string>(() =>
    {
        Thread.Sleep(5000);
        return "5 seconds passed!";
    });
    // The continuation runs on the UI thread, because the scheduler is taken from
    // the current SynchronizationContext (TaskUse must be called from the UI thread).
    task.ContinueWith((tResult) =>
    {
        TestTextBox.Text = tResult.Result;
    }, TaskScheduler.FromCurrentSynchronizationContext());
    task.Start();
}
From the previous example you can see how easy it is to synchronize with the UI thread using TaskScheduler.FromCurrentSynchronizationContext(), assuming you call this method from the UI thread. Tasks also provide optimizations for blocking operations, such as scenarios where you need to wait for a service response, through the TaskCreationOptions.LongRunning enum value passed to the Task constructor. This ensures that the specified operation doesn't block a processor core, since the maximum number of active tasks is determined by the number of processor cores present.
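A minimal sketch of the LongRunning option, using Task.Factory.StartNew so it also works on .NET 4. The option is a hint to the scheduler; with the default scheduler it generally results in a dedicated thread rather than a thread-pool thread. The blocking wait is simulated and the names are hypothetical.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class LongRunningSketch
{
    public static Task<string> WaitForServiceAsync()
    {
        return Task.Factory.StartNew(() =>
        {
            Thread.Sleep(200); // stand-in for a blocking wait on a service response
            return "response";
        }, TaskCreationOptions.LongRunning); // hint: this work will block for a while
    }
}
```

The returned task can be combined with a continuation on TaskScheduler.FromCurrentSynchronizationContext() in the same way as in the example above.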

Synchronous work in a BackgroundWorker

My application performs time-consuming work independently on several files. I created a BackgroundWorker to pass the work off to for each file, but it appears the BackgroundWorker is only capable of performing asynchronous work. Is it possible to do several asynchronous tasks in unison with it, or is there a similar object for performing synchronous operations?
The background worker is usually used to update the UI and/or to pass off work so you don't freeze the UI when a long-running process takes place. This means that you "pass" the background worker process the "file work" and then use a callback to update the UI (usually), all while your app remains responsive.
If the items are independent then you might want to spool up a few threads to split the work. Again, if I am understanding you correctly. If I am then you might want to look at Jon Skeet's threading article.
While you can use the BackgroundWorker, I think you should simply spin off a few threads to do the work. One thread (probably the main thread) will create and start these worker threads and then perform a Join on all the workers in order to wait for processing to complete.
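The start-then-Join arrangement might look like the sketch below. The per-file "work" here is just a placeholder that records the length of the file name; the class name is invented for illustration.

```csharp
using System.Threading;

public static class JoinWorkersSketch
{
    public static int[] ProcessAll(string[] files)
    {
        var results = new int[files.Length];
        var workers = new Thread[files.Length];

        for (int i = 0; i < files.Length; i++)
        {
            int index = i; // per-iteration copy so each closure sees its own index
            workers[i] = new Thread(() =>
            {
                results[index] = files[index].Length; // stand-in for real per-file work
            });
            workers[i].Start();
        }

        // The coordinating thread blocks here until every worker has finished.
        foreach (Thread worker in workers)
            worker.Join();

        return results;
    }
}
```

Each worker writes only to its own slot of the results array, so no locking is needed; note that calling Join on a UI thread would freeze the UI while it waits.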
Alternatively, have a look at the Parallel Extensions for .NET if you are using .NET 3.5. The Task object from that library is probably perfect for your situation.
You can do multiple asynchronous tasks by creating more than one BackgroundWorker object in your code. We created a JobPool class that created a number of BackgroundWorker objects to run, so that we could control the total number running at any one time. But if there are just a few files you will be processing, this would be overkill.
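The JobPool class itself is not shown, so as a swapped-in alternative for capping how many jobs run at once, here is a sketch using Parallel.ForEach with ParallelOptions.MaxDegreeOfParallelism (both in .NET 4) instead of a pool of BackgroundWorker objects. The per-file work is simulated and the names are hypothetical.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledJobsSketch
{
    // Processes every file, running at most `maxConcurrent` of them at the same time.
    // Returns the number of files processed.
    public static int RunAll(string[] files, int maxConcurrent)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxConcurrent };
        int processed = 0;

        Parallel.ForEach(files, options, file =>
        {
            Thread.Sleep(20);                     // stand-in for real per-file work
            Interlocked.Increment(ref processed); // thread-safe counter update
        });

        return processed; // Parallel.ForEach returns only after all items are done
    }
}
```

Parallel.ForEach blocks the calling thread until everything has finished, so in a UI application it would itself need to be started from a background task.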
