Multi-threading - C#

The application I'm currently working on performs some I/O or CPU intensive actions (file compression, file transfers, communicating with third party APIs, etc.) that occur when a user presses a 'Send' button.
I'm currently trying to persuade my employers that we should push these actions out to separate threads inside the main application (we'd need a maximum of two worker threads active at any given time), but my colleague has claimed that:
Any extra processing executed on a low priority thread could affect the usability of the GUI.
My view was that pushing I/O or CPU intensive activity to worker threads, and updating the UI with Invoke calls during progress reporting, is pretty standard practice for handling intensive activity.
Am I incorrect? If so, could someone provide an explanation?
EDIT:
Thank you for the answers so far.
I should clarify: the colleague's solution to non-blocking is to spawn a child process containing a timer loop that scans a folder and processes the file compression/transfer activities. (Note that this doesn't cover the calls to third party APIs - I have no idea what his solution there would be.)
The main issue with this approach is that the main application loses all visibility into the state of whatever activity is running, leading to, IMHO, further complexity (his solution to progress reporting is to expose the Windows message pump in both processes and send custom messages between the two).

You are correct. Background threads are the very essence of keeping the UI active, precisely as you have described, via Invoke operations. Keeping everything on the GUI thread will eventually clog up the plumbing and make the GUI unresponsive.
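For illustration, here is a minimal WinForms-style sketch of that pattern; CompressAndSendChunk and progressLabel are hypothetical names, not anything from the question:

private void sendButton_Click(object sender, EventArgs e)
{
    // Run the heavy work on a worker thread so the UI thread stays free.
    var worker = new System.Threading.Thread(() =>
    {
        for (int i = 0; i <= 100; i += 10)
        {
            CompressAndSendChunk(i);                   // hypothetical long-running step
            this.Invoke((Action)(() =>                 // marshal the update back to the UI thread
                progressLabel.Text = i + "% complete"));
        }
    });
    worker.IsBackground = true;
    worker.Start();
}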

The definitive answer would be to implement it as a proof-of-concept and then profile the app to see what sort of performance hit, if any, actually exists.
Having said that - it sounds like rubbish to me. In fact, it's quite the opposite - using additional threads is often the best way to keep the UI responsive.
Especially with things like the Task Parallel Library, basic multi-threading just isn't very difficult.
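For example, a rough sketch of the TPL flavour of the same idea; CompressAndTransferFiles and statusLabel are made-up names, and this assumes it is called from the UI thread so a SynchronizationContext is available:

// Queue the work on the thread pool, then marshal the continuation back to the UI thread.
Task.Factory.StartNew(() => CompressAndTransferFiles())
    .ContinueWith(t => statusLabel.Text = "Send complete",
                  TaskScheduler.FromCurrentSynchronizationContext());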

Yes, you are correct. The general principle is that the thread that's responsible for responding to the user and keeping the user interface up to date (usually referred to as the UI thread) should never be used to perform any lengthy operation.
As a rule of thumb, anything that could take longer than about 30ms is a candidate for removal from the UI thread. This is a little aggressive — 30ms is about the shortest interval that most people will perceive as being anything other than instantaneous and it's actually slightly less than the interval between successive frames shown on a movie screen.

Related

What is a multithreading program and how does it work?

What is a multithreading program and how does it work exactly? I read some documents but I'm confused. I know that code is executed line by line, but I can't understand how the program manages this.
A simple answer would be appreciated; a C# example, please.
What is a multi-threading program and how does it work exactly?
The interesting part about this question is that complete books have been written on the topic, yet it still remains elusive to a lot of people. I will try to explain it in the order detailed below.
Please note this only provides a gist; an answer like this can never do justice to the depth and detail required. Regarding videos, the best I have come across are part of paid subscriptions (Wintellect and Pluralsight); check whether you can watch them on a trial basis if you don't already have a subscription:
Wintellect by Jeffrey Richter (from his book CLR via C#, which has the same chapter on Thread Fundamentals)
CLR Threading by Mike Woodring
Explanation Order
What is a thread?
Why were threads introduced, main purpose?
Pitfalls and how to avoid them, using synchronization constructs?
Thread vs ThreadPool?
Evolution of multi-threaded programming APIs, like the Parallel API and Task API
Concurrent collections, usage?
Async-Await, thread but no thread, why they are best for IO
What is a thread?
It is a software construct: the bare minimum unit of work that the OS schedules (Windows is built around a multi-threaded architecture). Every process on Windows has at least one thread, and every method call runs on some thread. Each process can have multiple threads so it can do multiple things in parallel (provided the hardware supports it).
Other, Unix-based OSes have traditionally favoured a multi-process architecture; on Windows, even a complex piece of software like Oracle.exe runs as a single process with multiple threads for its various critical background operations.
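Since the question asked for a C# example, here is a minimal sketch of creating and starting a thread:

using System;
using System.Threading;

class Program
{
    static void Main()
    {
        // Start a second thread; Main keeps running on the original thread.
        var worker = new Thread(() =>
        {
            for (int i = 0; i < 3; i++)
                Console.WriteLine("Worker iteration " + i);
        });
        worker.Start();

        Console.WriteLine("Main thread continues while the worker runs.");
        worker.Join();   // wait for the worker to finish before exiting
    }
}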
Why were threads introduced, main purpose?
Contrary to the perception that concurrency was the main purpose, it was robustness that led to the introduction of threads. Imagine every process on Windows running on the same thread (as in the initial 16-bit versions): if one of those processes crashed, recovering in most cases meant restarting the system. Using threads for concurrent operations, since multiple threads can be created in each process, came along later. Threads are also important for utilising a processor with multiple cores to its full ability.
Pitfalls and how to avoid them, using synchronization constructs?
More threads mean more work completed concurrently, but issues arise when the same memory is accessed, especially for writes, because that can lead to:
Memory corruption
Race condition
Another issue is that a thread is a very costly resource: each thread has a thread environment block and kernel memory allocations, and scheduling each thread onto a processor core costs time in context switching. Misuse can quite possibly cause a huge performance penalty instead of an improvement.
To avoid thread-related corruption issues, it's important to use synchronization constructs such as lock, mutex, or semaphore, depending on the requirement. Reads on their own are thread safe, but writes (and reads that race with writes) need appropriate synchronization.
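As a small illustrative sketch, a shared counter protected by lock (just one possible construct among those mentioned):

class Counter
{
    private readonly object _sync = new object();
    private int _count;

    public void Increment()
    {
        lock (_sync)        // without the lock, concurrent writes could lose updates
        {
            _count++;
        }
    }

    public int Count
    {
        get { lock (_sync) return _count; }
    }
}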
Thread vs ThreadPool?
The threads we use in C#/.NET are not raw threads; they are managed wrappers that invoke Win32 threads. The challenge is that users can grossly misuse them, for example by creating far more threads than required or assigning processor affinity. So isn't it better to hand a work item to a standard pool and let Windows decide when a new thread is required and when an existing thread can run the work item? A thread is a costly resource whose usage needs to be optimised, otherwise it can be a bane rather than a boon.
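A quick sketch of handing a work item to the thread pool instead of creating a thread yourself:

using System;
using System.Threading;

class Program
{
    static void Main()
    {
        // The pool decides whether to reuse an idle thread or create a new one.
        ThreadPool.QueueUserWorkItem(state => Console.WriteLine("Running on a pool thread"));

        Console.ReadLine();   // keep the process alive long enough for the work item to run
    }
}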
Evolution of multi-threaded programming APIs, like the Parallel API and Task API
From .NET 4.0 onward, a variety of new APIs, such as Parallel.For and Parallel.ForEach for data parallelization and the Task API for task parallelization, have made it very simple to introduce concurrency into a system. These APIs again use a thread pool internally; a Task is more like scheduling a piece of work for some time in the future. Introducing concurrency is now a breeze, though synchronization constructs (or thread-safe collections) are still required to avoid memory corruption and race conditions.
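For instance, a rough sketch combining Parallel.For (data parallelism) with a Task (task parallelism):

using System;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        int[] squares = new int[10];

        // Data parallelism: iterations are spread across thread-pool threads.
        Parallel.For(0, squares.Length, i => squares[i] = i * i);

        // Task parallelism: schedule a piece of work and wait for its result.
        Task<int> sum = Task.Factory.StartNew(() =>
        {
            int s = 0;
            foreach (var v in squares) s += v;
            return s;
        });

        Console.WriteLine("Sum of squares: " + sum.Result);
    }
}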
Concurrent collections, usage?
Implementations like ConcurrentBag, ConcurrentQueue, and ConcurrentDictionary, part of System.Collections.Concurrent, are inherently thread safe (using spin-waits internally) and are much easier and quicker to use than explicit synchronization, as well as easier to manage and work with. There is another set of APIs, such as ImmutableList in System.Collections.Immutable (available via NuGet), which are thread safe by virtue of creating a new copy of the data structure internally.
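A brief sketch using ConcurrentDictionary, just to show the shape of the API:

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        var counts = new ConcurrentDictionary<string, int>();

        // Many threads can update the dictionary safely without explicit locks.
        Parallel.ForEach(new[] { "a", "b", "a", "c", "a" },
            word => counts.AddOrUpdate(word, 1, (key, current) => current + 1));

        foreach (var pair in counts)
            Console.WriteLine(pair.Key + ": " + pair.Value);
    }
}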
Async-Await, thread but no thread, why they are best for IO
This is an important aspect of concurrency, meant for IO calls (disk, network). The other APIs discussed so far are meant for compute-bound concurrency, where threads matter and make the work faster; for an IO call a thread has no use except waiting for the call to return, because IO calls are serviced via hardware-backed queues (IO completion ports).
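A small async/await sketch for an IO-bound call (the file name is just a placeholder):

using System;
using System.IO;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        ReadFileAsync("data.txt").Wait();   // placeholder file name
    }

    // No thread sits blocked while the disk read is in flight; the method
    // resumes on a continuation when the IO completes.
    static async Task ReadFileAsync(string path)
    {
        using (var reader = new StreamReader(path))
        {
            string contents = await reader.ReadToEndAsync();
            Console.WriteLine("Read " + contents.Length + " characters");
        }
    }
}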
A simple analogy might be found in the kitchen.
You've probably cooked using a recipe before -- start with the specified ingredients, follow the steps indicated in the recipe, and at the end you (hopefully) have a delicious dish ready to eat. If you do that, then you have executed a traditional (non-multithreaded) program.
But what if you have to cook a full meal, which includes a number of different dishes? The simple way to do it would be to start with the first recipe, do everything the recipe says, and when it's done, put the finished dish (and the first recipe) aside, then start on the second recipe, do everything it says, put the second dish (and second recipe) aside, and so on until you've gone through all of the recipes one after another. That will work, but you might end up spending 10 hours in the kitchen, and of course by the time the last dish is ready to eat, the first dish might be cold and unappetizing.
So instead you'd probably do what most chefs do, which is to start working on several recipes at the same time. For example, you might put the roast in the oven for 45 minutes, but instead of sitting in front of the oven waiting 45 minutes for the roast to cook, you'd spend the 45 minutes chopping the vegetables. When the oven timer rings, you put down your vegetable knife, pull the cooked roast out of the oven and let it cool, then go back to chopping vegetables, and so on. If you can do that, then you are successfully multitasking several recipes/programs. That is, you aren't literally working on multiple recipes at once (you still have only two hands!), but you are jumping back and forth from following one recipe to following another whenever necessary, and thereby making progress on several tasks rather than twiddling your thumbs a lot. Do this well and you can have the whole meal ready to eat in a much shorter amount of time, and everything will be hot and fresh at about the same time too. If you do this, you are executing a simple multithreaded program.
Then if you wanted to get really fancy, you might hire a few other chefs to work in the kitchen at the same time as you, so that you can get even more food prepared in a given amount of time. If you do this, your team is doing multiprocessing, with each chef taking one part of the total work and all of them working simultaneously. Note that each chef may well be working on multiple recipes (i.e. multitasking) as described in the previous paragraph.
As for how a computer does this sort of thing (no more analogies about chefs), it usually implements it using a list of ready-to-run threads and a timer. When the timer goes off (or when the currently executing thread has nothing to do for a while, because e.g. it is waiting to load data from a slow hard drive), the operating system does a context switch, in which it pauses the current thread (putting it into a list somewhere and no longer executing instructions from that thread's code), then pulls another ready-to-run thread from the list and starts executing instructions from that thread's code instead. This repeats for as long as necessary, often with context switches happening every few milliseconds, giving the illusion that multiple programs are running "at the same time" even on a single-core CPU. (On a multi-core CPU it does the same thing on each core, and in that case it's no longer just an illusion; multiple programs really are running at the same time.)
Why don't you refer to Microsoft's own documentation for the .NET class System.Threading.Thread?
It has a handful of simple example programs written in C# (at the bottom of the page), just as you asked for:
Thread Examples
Multithreading actually means doing multiple pieces of work at the same time, so processing can complete in parallel: you can take a task off your main thread, have it executed some other way, and be done.

ThreadPool.QueueUserWorkItem causing massive delay to UI thread due to lack of resources - better method to use? [duplicate]

Scenario
I have a Windows Forms Application. Inside the main form there is a loop that iterates around 3000 times, creating a new instance of a class on a new thread to perform some calculations. Bearing in mind that this setup uses a thread pool, the UI does stay responsive when there are only around 100 iterations of this loop (100 assets to process). But as soon as this number increases heavily, the UI locks up into eggtimer mode and thus the log being written to the listbox on the form becomes unreadable.
Question
Am I right in thinking that the best way around this is to use a Background Worker?
And is the UI locking up because even though I'm using lots of different threads (for speed), the UI itself is not on its own separate thread?
Suggested Implementations greatly appreciated.
EDIT!!
So let's say that instead of just firing off and queuing up 3000 assets to process, I decide to do them in batches of 100. How would I go about doing this efficiently? I made an attempt earlier at adding "Thread.Sleep(5000);" after every batch of 100 was fired off, but the whole thing seemed to crap out....
If you are creating 3000 separate threads, you are pushing a documented limitation of the ThreadPool class:
If an application is subject to bursts of activity in which large numbers of thread pool tasks are queued, use the SetMinThreads method to increase the minimum number of idle threads. Otherwise, the built-in delay in creating new idle threads could cause a bottleneck.
See that MSDN topic for suggestions to configure the thread pool for your situation.
If your work is CPU intensive, having that many separate threads will cause more overhead than it's worth. However, if it's very IO intensive, having a large number of threads may help things somewhat.
.NET 4 introduces outstanding support for parallel programming. If that is an option for you, I suggest you have a look at that.
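For instance, a hedged sketch of processing the assets with Parallel.ForEach and a bounded degree of parallelism; assets and ProcessAsset are hypothetical names, and this would itself need to run off the UI thread:

// Let the TPL partition the 3000 assets across a limited number of threads
// instead of queuing one work item per asset.
var options = new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount };
Parallel.ForEach(assets, options, asset => ProcessAsset(asset));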
More threads do not equal top speed. In fact, too many threads equal less speed. If your task is purely CPU-bound, you should only use as many threads as you have cores; otherwise you're wasting resources.
With 3,000 iterations and your form thread attempting to create a thread each time, what's probably happening is that you are maxing out the thread pool, and the form is hanging because it needs to wait for a prior thread to complete before it can allocate a new one.
Apparently ThreadPool doesn't work this way. I have never checked it with that many threads before, so I am not sure. Another possibility is that the tasks begin flooding the UI thread with invocations, at which point it will give up on the GUI.
It's difficult to tell without seeing code - but, based on what you're describing, there is one suspect.
You mentioned that you have this running on the ThreadPool now. Switching to a BackgroundWorker won't change anything dramatically, since it also uses the ThreadPool to execute. (BackgroundWorker just simplifies the invoke calls...)
That being said, I suspect the problem is your notifications back to the UI thread for your ListBox. If you're invoking too frequently, your UI may become unresponsive while it tries to "catch up". This can happen if you're feeding too much status info back to the UI thread via Control.Invoke.
Otherwise, make sure that ALL of your work is being done on the ThreadPool, and you're not blocking on the UI thread, and it should work.
If every thread logs something to your UI, every written log line must invoke the main thread. It is better to cache the log output and update the GUI only every 100 iterations or so.
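A possible sketch of that batching idea (the field and control names are made up):

// Collect log lines on the worker side and flush them to the ListBox in one
// Invoke per batch instead of one per line.
private readonly List<string> _pendingLog = new List<string>();

private void Log(string line)
{
    lock (_pendingLog)
    {
        _pendingLog.Add(line);
        if (_pendingLog.Count < 100) return;   // wait until a full batch accumulates
    }
    FlushLog();
}

private void FlushLog()
{
    string[] batch;
    lock (_pendingLog)
    {
        batch = _pendingLog.ToArray();
        _pendingLog.Clear();
    }
    logListBox.BeginInvoke((Action)(() => logListBox.Items.AddRange(batch)));
}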
I haven't seen your code, so this is a lot of conjecture with some hopefully well-educated guessing.
All a thread pool does is queue up your requests and then fire off new threads as others complete their work. Now, 3000 threads doesn't sound like a lot, but if there's a ton of processing going on you could be hammering your CPU.
I'm not convinced a BackgroundWorker would help, since you would end up re-creating a manager to handle all the pooling the thread pool already gives you. I think your issue is more that you have too much data churning through at once. A good place to start would be to throttle the number of threads you start and maintain; the thread pool manager easily lets you do this. Find a balance that allows you to process data while still keeping the UI responsive.
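One hedged way to throttle the in-flight work, rather than sleeping between batches, is a semaphore; assets and ProcessAsset are hypothetical, and this loop should run on a background thread because it blocks:

// Allow at most 100 work items to be queued or running at any one time.
var throttle = new SemaphoreSlim(100);
var done = new CountdownEvent(assets.Count);

foreach (var asset in assets)
{
    throttle.Wait();                      // blocks once 100 items are in flight
    var current = asset;
    ThreadPool.QueueUserWorkItem(_ =>
    {
        try { ProcessAsset(current); }
        finally { throttle.Release(); done.Signal(); }
    });
}

done.Wait();                              // wait for all items to finish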

Design: Task-Asynchronous Pattern (TAP with await / async), vs threads with signalling vs other thread structures

Help with ideas for redesigning the C# program below would be greatly appreciated. I am trying to choose between implementing multithreading using 1) TAP, 2) coarse-grained threads that contain spinners that terminate when their bools are set to false, or 3) the same threads using signalling instead of these bools. I will explain the program below to make the case clear.
The Program
The program is a game automation application in C# that I am developing as a fun way to learn the language and C# (5.0) features better. It has a UI that needs to remain responsive while the app runs.
When a particular tab in the UI is opened, the program fires up a new thread called "Scan" that, in a method in another class, scans various memory locations and updates labels in the UI with these quickly changing values using the UI's SynchronizationContext. This goes on in a while(scanning) loop, for as long as the scanning bool is true (usually the full lifetime of the program).
When the user clicks the Start button on this tab, the program fires up two new threads that do the following: thread "Run" moves the character around following a particular path, and thread "Action" hits particular buttons and performs actions while the player runs the path. If a certain scenario occurs, the program should stop the Run and Action threads temporarily, run a method, and when it finishes, go back to the running and action'ing.
When the user clicks the Stop button on this tab, the automation should halt and threads terminate.
The Challenge
I have already created a working version using continuous spinner loops in each thread that takes care of the various work. The spinners run using a while(myBool). For the three threads the bools are: scanning, running and actioning.
When I want to stop a thread I set the bool to false, and use a Thread.Join to wait for the thread to terminate gracefully before proceeding. The threads can, as mentioned, be stopped by the user clicking the Stop button, or automatically by the program as part of its functionality. In the latter case a thread is stopped, Joined, and then at a later stage restarted.
After having done a lot of reading and research on threading and the new async programming tools in C# 5.0, I have realized that the way I am currently doing it might be very clumsy and unprofessional. It creates lots of synchronization/thread-safety issues, and since the goal of all of this is to learn more about C#, I wanted to get your take on whether I should change it to a fine-grained asynchronous programming approach instead, using TAP with async and await as appropriate.
Does this sound like a case where Tasks with cancellation tokens could be useful? The threads are after all long-running operations, so I was concerned that using the thread pool (Task.Run) would cause bad hygiene in the thread pool (over-subscription). If async programming seems like a bad match here, what about using threads as I have done, but instead use signalling to start and stop the threads?
Any thoughts greatly appreciated.
No. The TPL was designed to run shorter tasks, where allocating new threads all the time would hurt performance. It has quite nice features like job queues and work stealing (a TPL thread can take jobs from another thread). It can of course run longer tasks, but you won't get as many benefits from that. On the contrary, you force the TPL to allocate new threads.
However, the question is a bit general in the sense that we need more information about your actual implementation to know what you should use. For the Scan thread it's quite obvious that it should run in a single thread.
But for the others it's hard to know. Do they do work all the time or only periodically? If they do work all the time, you should keep them in separate threads.
As for the thread synchronization, there is another alternative. You could use a ConcurrentQueue to queue up everything that has to be drawn. That way you do not need any synchronization: just let the UI thread check the queue and draw whatever is in it, while the producers continue to do their work.
In fact, in that way you can move anything not related to UI drawing to other threads. That should also improve the responsiveness in your application.
private readonly ConcurrentQueue<IItemRender> _drawQueue = new ConcurrentQueue<IItemRender>();

public void ActionRunnerThreadFunc()
{
    // producer: a worker thread enqueues what should be drawn
    _drawQueue.Enqueue(new SpaceShipRenderer(x, y));
}

public void UIThreadFunc()
{
    // consumer: the UI thread dequeues and does the actual drawing
    IItemRender item;
    if (_drawQueue.TryDequeue(out item))
        item.Draw(drawContext);
}
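If you do stay with dedicated threads, one hedged alternative to the volatile bools is a CancellationToken; ScanMemoryLocations is a hypothetical stand-in for the real scan work:

// Stop a long-running loop with a CancellationTokenSource instead of a shared bool flag.
private CancellationTokenSource _scanCts;

private void StartScanning()
{
    _scanCts = new CancellationTokenSource();
    var token = _scanCts.Token;
    new Thread(() =>
    {
        while (!token.IsCancellationRequested)
        {
            ScanMemoryLocations();          // hypothetical work
        }
    }) { IsBackground = true }.Start();
}

private void StopScanning()
{
    _scanCts.Cancel();                      // the loop exits at its next check
}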

How to see how much processing time a C# Windows Forms application needs?

I have a C# Windows Forms application which does some camera control and computer vision. For all the parts that take longer to calculate, I used separate threads. But there are still some parts that run in the callback functions of the GUI. As I understand it, all these callback functions are executed on the same thread. Is there a way to see how much time this thread spends working versus idle? What percentage of idle time is needed so that the GUI remains responsive?
It's recommended that you shouldn't block the UI thread for more than 50ms, otherwise it will affect the UI responsiveness. I.e., two UI callbacks queued with Form.BeginInvoke, each taking ~50ms to complete, may give the user an unpleasant UI experience.
It doesn't make sense to update the UI more often than the user can perceive (i.e., ~24 frames per second). So, you should throttle the UI thread callbacks and give user input events priority.
I recently posted an example of how it can possibly be done:
https://stackoverflow.com/a/21654436/1768303
For simple tasks you could use a Stopwatch and measure the time manually. However, for anything more involved you'll want to look into a performance profiler.
Also, there are few situations in which your GUI genuinely needs that heavy processing. In most cases the problem comes from putting too many calculations in event handlers instead of implementing them somewhere outside and then updating the form when they finish. It's less a single/multi-threading problem and more one of using the available events properly.
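A simple Stopwatch sketch for timing a single handler; DoVisionStep stands in for whatever work currently lives in the callback:

private void processButton_Click(object sender, EventArgs e)
{
    var sw = System.Diagnostics.Stopwatch.StartNew();
    DoVisionStep();   // hypothetical work done in the GUI callback
    sw.Stop();
    System.Diagnostics.Debug.WriteLine("Handler took " + sw.ElapsedMilliseconds + " ms");
}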

Thread.Sleep() usage to Prevent Server Overload

I wrote some code that mass imports a high volume of users into AD. To refrain from overloading the server, I put a Thread.Sleep() in the code, executed at every iteration.
Is this a good use of the method, or is there a better alternative (.NET 4.0 applies here)?
Does Thread.Sleep() even aid in performance? What is the cost and performance impact of sleeping a thread?
The Thread.Sleep() method will just put the thread into a paused state for the specified amount of time. There are actually three different ways to achieve the same sleep by calling the method from three different types, each with different features. Most importantly, though, if you use Sleep() on the main UI thread, it will stop processing messages during that pause and the GUI will look locked up. You need to use something like a BackgroundWorker to run the job you need to sleep.
My opinion is to use the Thread.Sleep() method and just follow the advice above. In your specific case I guess you'll have no issues. If you put some effort into searching for this exact topic on SO, I'm sure you'll find much better explanations than what I've summarized here.
If you have no way to receive feedback from the called service, as you would in a typical event-driven system (speaking in the abstract, we could also say a callback or any information that tells you how the service is affected by your call), Sleep may be the way to go.
I think that Thread.Sleep is one way to handle this; @cHao is correct that using a timer would let you do it another way. Essentially, you're trying to cut down the number of commands sent to the AD server over a period of time.
In using timers, you're going to need to devise a way to detect trouble (one that's more intuitive than a try/catch). For instance, if your server starts stalling and responding more slowly, you will keep stacking up commands that the server can't handle (which may cascade into other errors).
When working with AD I've seen the Domain Controller freak out when too many commands come in (similar to a DoS attack) and bring the server to a crawl or crash it. I think that by using the sleep method you're creating a manageable and measurable flow.
In this instance, using a thread with a low priority may slow it down, but not to any controllable level. The thread priority will only be a factor on the machine sending the commands, not to the server having to process them.
Hope this helps; cheers!
If what you want is to not overload the server, you can just reduce the priority of the thread.
Thread.Sleep() does not consume any resources. However, the correct way to do this is to set the thread's priority to a value below Normal: Thread.CurrentThread.Priority = ThreadPriority.Lowest, for example.
Thread.Sleep is not so "evil, never ever do it", but maybe (just maybe) the fact that you need to use it reflects some flaw in the solution design. That is not a hard rule at all, though.
Personally, I have never found a situation where I had to use Thread.Sleep.
Right now I'm working on an ASP.NET MVC application that uses a background thread to load a lot of data from the database into a memory cache and, after that, write some data back to the database.
The only thing I used to prevent this thread from eating all of my web server and database processors was to reduce the thread's priority to the Lowest level. That thread takes about 35 minutes to complete all the operations instead of 7 minutes with a Normal-priority thread. By the end of the process the thread will have issued about 230k selects against the database server, but this has not affected my database or web server performance in any way perceptible to the user.
tip: remember to set the priority back to Normal if you are using a thread from ThreadPool.
Here you can read about Thread.Priority:
http://msdn.microsoft.com/en-us/library/system.threading.thread.priority.aspx
Here is a good article on why not to use Thread.Sleep in a production environment:
http://msmvps.com/blogs/peterritchie/archive/2007/04/26/thread-sleep-is-a-sign-of-a-poorly-designed-program.aspx
EDIT: As others have said here, maybe just reducing your thread priority will not prevent the thread from sending a large number of commands/data to AD. Maybe you'll get better results if you rethink the whole thing and use timers or something like that. I personally think that reducing the priority could resolve your problem, although I think you need to run some tests using your data to see what happens to your server and the other servers involved in the process.
You could schedule the thread at BelowNormal priority instead. That said, that could potentially lead to your task never running if something else overloads the server. (Assuming Windows scheduling works the way the documentation on scheduling threads mentions for "some operating systems".)
That said, you said you're moving data into AD. If it's over the network, it's entirely possible the CPU impact of your code will be negligible compared to the I/O and the processing on the AD side.
I don't see any issue with it, except that while the thread is sleeping it will not be responsive. If that is your main thread, your GUI will become unresponsive. If it is a background thread, you won't be able to communicate with it (e.g. to cancel it). If the time you sleep for is short, it shouldn't matter.
I don't think reducing the priority of the thread will help as 1) your code might not even be running on the server and 2) most of the work being done by the server is probably not going to be on your thread anyway.
Thread.Sleep does not aid performance (unless your thread has to wait for some resource). It incurs at least some overhead, and the amount of time that you sleep for is not guaranteed; the OS can decide to have your thread sleep longer than the amount of time you specify.
As such, it would make more sense to do a significant batch of work between calls to Thread.Sleep().
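For instance, a hedged sketch of batching the imports; users and ImportUser are hypothetical names:

// Import a batch of users, then pause, rather than sleeping after every single user.
const int batchSize = 50;
for (int i = 0; i < users.Count; i += batchSize)
{
    int end = Math.Min(i + batchSize, users.Count);
    for (int j = i; j < end; j++)
        ImportUser(users[j]);               // hypothetical AD import call

    Thread.Sleep(1000);                     // give the server a breather between batches
}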
Thread.Sleep() is a CPU-less wait state. Its overhead should be pretty minimal. If you execute Thread.Sleep(0), you don't [necessarily] sleep, but you voluntarily surrender your time slice so the scheduler can let a lower-priority thread run.
You can also lower your thread's priority by setting Thread.Priority.
Another way of throttling your task is to use a Timer:
// instantiate a timer that 'ticks' 10 times per second (your ideal rate might be different)
Timer timer = new Timer(ImportUserIntoActiveDirectory, null, 0, 100);
where ImportUserIntoActiveDirectory is a timer callback that imports just one user into AD:
private void ImportUserIntoActiveDirectory(object state)
{
    // import just one user into AD
    return;
}
This lets you dial things in. The callback is invoked on thread pool worker threads, so you don't tie up your primary thread. Let the OS do the work for you: all you do is decide on your target transaction rate.
