I'm currently writing my own socket server (as the console app), and as I am new to multithreading in C# I started wondering about threading and backgrounds tasks. I found some possible alternatives to Threads like (BackgroundWorker, but for UI) or Task...
I have currenctly written a process, which periodically runs in endless while loop, where its checking clients if they are still connected.
As I cannot get opinion from searching on google, so I'm asking, is running processes in background, like my client check, through the Thread and endless loop a proper way, or there are some better ways how to do it?
If you are doing things periodically then a Timer would be suitable. Note that there are several alternatives with different behaviors. The motivation is that timers are made to run things periodically, so the intent becomes obvious when reading the code.
If you are processing items from other threads a Thread looping over a blocking collection is suitable. This can me useful if the processing needs to be in-order or singlethreaded. This will lockup some resources for the thread, so should be used somewhat sparingly.
If you want to run some one-off process-intensive task(s) in the background then a Task is most appropriate. This will use a thread from the threadpool, and return the thread when done. This is also useful if you want to run different things concurrently.
If you need to wait for some IO operation, like disk or network, then tasks + async/await is appropriate. This will not use any thread at all when waiting.
If running some process-intensive work in parallel, then Parallel.For/Foreach is suitable.
I would probably not use BackgroundWorker. This was made during the windows forms era, and is mostly superseded by tasks.
Related
Scenario
I have a Windows Forms Application. Inside the main form there is a loop that iterates around 3000 times, Creating a new instance of a class on a new thread to perform some calculations. Bearing in mind that this setup uses a Thread Pool, the UI does stay responsive when there are only around 100 iterations of this loop (100 Assets to process). But as soon as this number begins to increase heavily, the UI locks up into eggtimer mode and the thus the log that is writing out to the listbox on the form becomes unreadable.
Question
Am I right in thinking that the best way around this is to use a Background Worker?
And is the UI locking up because even though I'm using lots of different threads (for speed), the UI itself is not on its own separate thread?
Suggested Implementations greatly appreciated.
EDIT!!
So lets say that instead of just firing off and queuing up 3000 assets to process, I decide to do them in batches of 100. How would I go about doing this efficiently? I made an attempt earlier at adding "Thread.Sleep(5000);" after every batch of 100 were fired off, but the whole thing seemed to crap out....
If you are creating 3000 separate threads, you are pushing a documented limitation of the ThreadPool class:
If an application is subject to bursts
of activity in which large numbers of
thread pool tasks are queued, use the
SetMinThreads method to increase the
minimum number of idle threads.
Otherwise, the built-in delay in
creating new idle threads could cause
a bottleneck.
See that MSDN topic for suggestions to configure the thread pool for your situation.
If your work is CPU intensive, having that many separate threads will cause more overhead than it's worth. However, if it's very IO intensive, having a large number of threads may help things somewhat.
.NET 4 introduces outstanding support for parallel programming. If that is an option for you, I suggest you have a look at that.
More threads does not equal top speed. In fact too many threads equals less speed. If your task is simply CPU related you should only be using as many threads as you have cores otherwise you're wasting resources.
With 3,000 iterations and your form thread attempting to create a thread each time what's probably happening is you are maxing out the thread pool and the form is hanging because it needs to wait for a prior thread to complete before it can allocate a new one.
Apparently ThreadPool doesn't work this way. I have never checked it with threads before so I am not sure. Another possibility is that the tasks begin flooding the UI thread with invocations at which point it will give up on the GUI.
It's difficult to tell without seeing code - but, based on what you're describing, there is one suspect.
You mentioned that you have this running on the ThreadPool now. Switching to a BackgroundWorker won't change anything, dramatically, since it also uses the ThreadPool to execute. (BackgroundWorker just simplifies the invoke calls...)
That being said, I suspect the problem is your notifications back to the UI thread for your ListBox. If you're invoking too frequently, your UI may become unresponsive while it tries to "catch up". This can happen if you're feeding too much status info back to the UI thread via Control.Invoke.
Otherwise, make sure that ALL of your work is being done on the ThreadPool, and you're not blocking on the UI thread, and it should work.
If every thread logs something to your ui, every written log line must invoke the main thread. Better to cache the log-output and update the gui only every 100 iterations or something like that.
Since I haven't seen your code so this is just a lot of conjecture with some highly hopefully educated guessing.
All a threadpool does is queue up your requests and then fire new threads off as others complete their work. Now 3000 threads doesn't sounds like a lot but if there's a ton of processing going on you could be destroying your CPU.
I'm not convinced a background worker would help out since you will end up re-creating a manager to handle all the pooling the threadpool gives you. I think more you issue is you've got too much data chunking going on. I think a good place to start would be to throttle the amount of threads you start and maintain. The threadpool manager easily allows you to do this. Find a balance that allows you to process data while still keeping the UI responsive.
Help with ideas for redesign of the below C# program would be greatly appreciated. I am trying to pick between implementing multithreading using 1) TAP, 2) course-grained threads that contain spinners that terminate when their bools are set to false, or 3) the same threads using signalling instead of these bools. I will explain the program below, to make the case clear.
The Program
The program is a game automation application in C# that I am developing as a fun way to learn the language and C# (5.0) features better. It has a UI that needs to remain responsive while the app runs.
When a particular tab in the UI is opened, the program fires up a new thread called "Scan" that, in a new method in another class, scans various memory locations and updates labels in the UI with these quickly changing values using the UI's SynchronizationContext. This goes on in a while(scanning) loop, for as long as scanning bool is true (usually the full life-duration of the program).
When the user clicks the Start button on this tab, the program fires up two new threads that does the following: Thread "Run" moves the character around following a particular path. Thread "Action" hits particular buttons and performs actions at the same time as the player runs the path. If a certain scenario occurs, the program should stop the running thread and the action thread temporarily, run a method, and when it finishes, go back to the running and action'ing.
When the user clicks the Stop button on this tab, the automation should halt and threads terminate.
The Challenge
I have already created a working version using continuous spinner loops in each thread that takes care of the various work. The spinners run using a while(myBool). For the three threads the bools are: scanning, running and actioning.
When I want to stop a thread I set the bool to false, and use a Thread.Join to wait for the thread to terminate gracefully before proceeding. The threads can, as mentioned, be stopped by the user clicking the Stop button, or automatically by the program as part of its functionality. In the latter case a thread is stopped, Joined, and then at a later stage restarted.
After having done a lot of reading and research on threading and the new async programming tools in C# 5.0, I have realized that the way I am currently doing it might be very clumsy and unprofessional. It creates lots of synchronization/thread-safety issues, and as the goal of all of this is to learn more about C# I wanted to get your take on whether I should change it to a fine-grained asynchrounous programming approach instead, using TAP with async and await as appropriate.
Does this sound like a case where Tasks with cancellation tokens could be useful? The threads are after all long-running operations, so I was concerned that using the thread pool (Task.Run) would cause bad hygiene in the thread pool (over-subscription). If async programming seems like a bad match here, what about using threads as I have done, but instead use signalling to start and stop the threads?
Any thoughts greatly appreciated.
No. TPL was designed to run shorter tasks where the allocation of new threads all time would hurt perfomance. It got quite nice features like job queues and work stealing (a TPL thread can take jobs from another thread). It can of course have longer running task, but you wont get so many benefits from that. On the contrarary, you force TPL to allocate new threads.
However, the question is a bit general in the sense that we need more information about your actual implementation to know what you should use. For the Scan thread it's quite obvious that it should run in a single thread.
But for the others it's hard to know. Do they do work all the time or periodically? If they do work all the time you should keep them in seperate threads.
As for the thread syncronization there is another alternative. You could use a ConcurrentQueue to queue up everything that has to be drawn. In that way you do not need any synchronization. Just let the UI thread check the queue and draw anything in it, while the producers can continue to do their work.
In fact, in that way you can move anything not related to UI drawing to other threads. That should also improve the responsiveness in your application.
public void ActionRunnerThreadFunc()
{
_drawQueue.Enqueue(new SpaceShipRenderer(x, y));
}
public void UIThreadFunc()
{
IItemRender item;
if (_drawQueue.TryDequeue(out item))
item.Draw(drawContext);
}
The app I'm developing is composed this way:
A producer task scan the file system for text files and put a reference to them in a bag.
Many consumer tasks take file refs from the bag concurrently and read the files (and do some short work with their content)
I must be able to pause and resume the whole process.
I've tried using TPL, creating a task for every file ref as they are put in the bag (in this case the bag is just a concept, the producer directly create the consumers task as it find files) but this way I don't have control over the task I create, I can't (or I don't know how to) pause them. I can write some code to suspend the thread currently executing the task but that will ruin the point of working with logical tasks instead of manully creating threads wouldn't it? I would want something like "task already assigned to phisical thread can complete but waiting logical tasks should not start until resume command"
How can I achive this? Can it be done with TPL or should I use something else?
EDIT:
Your answers are all valid but my main doubt remains unanswered. We are talking about tasks, if I use TPL my producer and my many consumer will be tasks (right?) not threads (well, ok at the moment of the execution tasks will be mapped on threads). Every synchronization mechanism i've found (like the one proposed in the comment "ManualResetEventSlim") work at thread level.
E.g. the description of the Wait() method of "ManualResetEventSlim" is "Blocks the current thread until the current ManualResetEventSlim is set."
My knowledge of task is purely academic, I don't know how things works in the "real world" but it seem logical to me that I need a way to coordinate (wait/signal/...) tasks at task level or things could get weird... like... two task may be mapped on the same thread but one was supposed to signal the other that was waiting then deadlock. I'm a bit confused. This is why I asked if my app could use TPL instead of old style simple threads.
Yes, you can do that. First, you have a main thread, your application. There you have two workers, represented by threads. The first worker would be a producer and the second worker would be a consumer.
When your application starts, you start the workers. Both of them operates on the concurrency collection, the bag. Producer searches for files and puts references to the bag and consumer takes references from the bag and starts a task per reference.
When you want to signal pause, simply pause the producer. If you do that, consumer also stops working if there is nothing in the bag. If this is not a desired behaviour, you can simply define that pausing of the producer also clears the bag - backup your bag first and than clear it. This way all running tasks will finish their job and consumer will not start new tasks, but it can still run and wait for the results.
EDIT:
Based on your edit. I don't know how to achieve it the way you want, but although it is nice try to use new technologies, don't let your mind be clouded. Using a ThreadPool is also nice thing. It will take more time to start the application, but once it is running, consuming will be faster, because you already have workers ready.
It is not a bad idea, you can specify a maximum number of workers. If you create a task for every item in the bag, it will be more memory-consuming because you will still allocate and release memory. This will not happen with ThreadPool.
Sure you can use TPL for this. And may be also reactive extensions and LINQ to simplify grouping and pausing/resuming the thread works.
If you have just a short job on each file, it is pretty good idea to not to disturb the handler function with cancellations. You can just suspend queueing the workers instead.
I imagine something like this:
You directory scanner thread puts the found files into an observable collection.
The consumer thread subscribes the collection changes and gets/removes the files and assigns them to workers.
I'm working on a network-bound application, which is supposed to have a lot (hundreds, may be thousands) of parallel processes.
I'm looking for the best way to implement it.
When I tried setting
ThreadPool.SetMaxThreads(int.MaxValue, int.MaxValue);
and than creating 1000 threads and making those do stuff in parallel, application's execution became really jumpy.
I've heard somewhere that delegate.BeginInvoke is somehow better that new Thread(...), so I've tried it, and than opened the app in debugger, and what I've seen are parallel threads.
If I have to create lots and lots of threads, what is the best way to ensure that the application is going to run smoothly?
Have you tried the new await / async pattern in C# 5 / .NET 4.5?
I haven't got sources to hand about how this operates under the hood, but one of the most common use-cases of this new feature is waiting for IO bound stuff.
Threads are not lightweight objects. They are expensive to create and context switch to/from; hence the reason for the Thread Pool (pre-created and recycled). Most common solutions that involve networking or other IO ports utilise lower-level IO Completion Ports (there is a managed library here) to "wait" on a port, but where the thread can continue executing as normal.
BeginInvoke will utilise a Thread Pool thread, so it will be better than creating your own only if a thread is available. This approach, if used too heavily, can immediately result in thread starvation.
Setting such a high thread pool count is not going to work in the long run as threads are too heavy for what it appears you want to do.
Axum, a former Microsoft Research language, used to achieve massive parallelism that would have been suitable for this task. It operated similarly to Stackless Python or Erlang. Lots of concepts from Axum made their way into the parallelism drive into C# 5 and .NET 4.5.
Setting the ThreadPool.SetMaxThreads will only affect how many threads the thread pool has, and it won't make a difference regarding threads you create yourself with new Thread().
Go async (model, not keyword) as suggested by many.
You should follow the advice mentioned in the other answers and comments. As fsimonazzi says, creating new threads directly has nothing to do with the ThreadPool. For a quick test lower the max worker and completionPort threads and use the ThreadPool.QueueUserWorkItem method. The ThreadPool will decide what your system can handle, queue your tasks and resuse threads whenever it can.
If your tasks are not compute-bound then you should also utilize asynchronous I/O. You do not your worker threads to wait for I/O completion. You need those worker threads to return to the pool as quickly as possible and not block on I/O requests.
So far during my experience in Windows Phone 7 application development I notices there are different ways to runs an action in an asynchronous thread.
System.Threading.Thread
System.ComponentModel.BackgroundWorker
System.Threading.ThreadPool.QueueUserWorkItem()
I couldn't see any tangible difference between these methods (other than that the first two are more traceable).
Is there any thing you guys consider before using any of these methods? Which one would you prefer and why?
The question is kinda answered but the answers are a little short on detail (IMO).
Lets take each in turn.
System.Threading.Thread
All the threads (in the CLR anyway) are ultimately represented by this class. However you probably included this to query when we might want to create an instance ourselves.
The answer is rarely. Ordinarily the day-to-day workhorse for dispatching background tasks is the Threadpool. However there are some circumstances where we would want to create our own thread. Typically such a thread would live for most of the app runtime. It would spend most of its life in blocked on some wait handle. Occasionally we signal this handle and it comes alive to do something important but then it goes back to sleep. We don't use a Threadpool work item for this because we do not countenance the idea that it may queue up behind a large set of outstanding tasks some of which may themselves (perhaps inadverently) be blocked on some other wait.
System.ComponentModel.BackgroundWorker
This is friendly class wrapper around the a ThreadPool work item. This class only to the UI oriented developer who occasionally needs to use a background thread. Its events being dispatched on the UI thread makes it easy to consume.
System.Threading.ThreadPool.QueueUserWorkItem
This the day-to-day workhorse when you have some work you want doing on a background thread. This eliminates the expense of allocating and deallocating individual threads to perform some task. It limits the number of thread instances to prevent too much of the available resources being gobbled up by too many operations try to run in parallel.
The QueueUserWorkItem is my prefered option for invoking background operations.
It arguably depends on what you are trying to do, you have listed 3 very different threading models.
Basic threading
Designed for applications with a seperate UI thread.
Managed thread pool
Have you read MSDN etc...
http://www.albahari.com/threadin
Http://msdn.microsoft.com/en-us/library/aa645740(v=vs.71).aspx
You don't state "what for", but
Basic Thread - quite expensive, not for small jobs
Backgroundworker - mostly for UI + Progressbar work
ThreadPool - for small independent jobs
I think the TPL is not supported in SL, which is a pity.
The background worker tends to be better to use when your UI needs to be update as your thread progresses because it handles invoking the call back functions (such as the OnProgress callback) on the UI thread rather than the background thread. The other two don't do this work. It is up to you to do it.