In C# I'm creating my threads like this:
void LaunchThread(string url, string search, string regexstring)
{
    new Thread(delegate()
    {
        Scrape(url, search, regexstring, false);
    }).Start();
}
and I use an int variable to track how many threads are currently running, but I have a feeling it can be a little wonky at times and not accurate (depending on when you check how many exist).
I have 2 questions:
Is there a variable that could tell me how many threads are currently running?
Is there a way to close/exit all threads midway, without waiting for them to complete?
Thanks a lot, SO. I'm new to C# and multitasking as a whole.
If you're interested in only the threads you've started, keeping a counter like you describe should work fine. Just make sure that you use proper locking when you get/change it.
If the work you're doing in the worker threads involves a loop, you can add logic to the loop that checks a variable to see whether it should keep running. You could set this variable from another thread (again, assuming proper locking).
Keep a collection of the threads you spawn like List<Thread>. Then you can get the Count property to check the number of threads. You can even get live threads only using threadCollection.Count(t => t.IsAlive).
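The suggestions above (a tracked collection of threads, a live count via IsAlive, and a flag that workers check in their loop) can be combined in one small helper. This is only a sketch: ThreadTracker, Launch, StopRequested and StopAll are hypothetical names, not part of .NET, and it assumes the worker's loop is able to poll the flag between units of work.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

// Hypothetical helper: remembers the threads it starts and can ask them to stop.
public sealed class ThreadTracker
{
    private readonly List<Thread> threads = new List<Thread>();
    private readonly object gate = new object();
    private volatile bool stopRequested;

    // Workers poll this between units of work and return once it is true.
    public bool StopRequested { get { return stopRequested; } }

    public void Launch(Action<ThreadTracker> work)
    {
        var thread = new Thread(() => work(this)) { IsBackground = true };
        lock (gate)
        {
            // Add and start under the lock so StopAll never joins an unstarted thread.
            threads.Add(thread);
            thread.Start();
        }
    }

    // Number of tracked threads that have started and not yet finished.
    public int AliveCount
    {
        get { lock (gate) { return threads.Count(t => t.IsAlive); } }
    }

    // Raises the stop flag, then waits for every tracked thread to exit.
    public void StopAll()
    {
        stopRequested = true;
        Thread[] snapshot;
        lock (gate) { snapshot = threads.ToArray(); }
        foreach (Thread t in snapshot) t.Join();
    }
}
```

A worker would be started with something like tracker.Launch(t => { while (!t.StopRequested) { /* one unit of work */ } }). This is cooperative: a thread blocked inside a long call only notices the flag once that call returns.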
I assume you're using HttpWebRequest. Check out these two threads:
Terminate loopless thread instantly without Abort or Suspend
Killing HttpWebRequest object using Thread.Abort
Related
I have a task with a huge amount of input data (video). I need to process its frames in the background without freezing the UI and I don't need to process every frame.
So I want to create a background thread and skip frames while the background thread is busy. Then I take the next frame from the video input, and so on.
I have this simple code now. It works. But can it cause trouble, and is there a better approach?
public class VideoProcessor{
    bool busy=false;
    void VideoStreamingEvent(Frame data){
        if(!busy){
            busy=true;
            InvokeInBackground(()=>{
                DataProcessing(data);
                busy=false;
            });
        }
    }
}
But can it cause troubles and may be there is a better approach?
If the VideoStreamingEvent method never executes concurrently on multiple threads, then this will work fine if you simply add volatile to the busy field declaration. It may, in practice, appear to work well enough without it, but that behavior is not guaranteed.
If it is possible for VideoStreamingEvent to be invoked on multiple threads, then you will need some synchronization around where you read and write the busy field.
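For the multi-threaded case, one possible form of that synchronization is an atomic test-and-set on the flag with Interlocked.CompareExchange. The sketch below is an assumption-laden illustration: FrameGate and TryProcess are made-up names, and Task.Run stands in for the question's InvokeInBackground.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical gate: accepts one unit of work at a time and skips the rest.
public sealed class FrameGate
{
    private int busy; // 0 = idle, 1 = processing

    // Returns true if the work was accepted, false if it was skipped
    // because an earlier unit is still being processed.
    public bool TryProcess(Action work)
    {
        // Atomically flip 0 -> 1; any other caller sees 1 and skips.
        if (Interlocked.CompareExchange(ref busy, 1, 0) != 0)
            return false;

        Task.Run(() =>
        {
            try { work(); }
            finally { Interlocked.Exchange(ref busy, 0); } // always reopen the gate
        });
        return true;
    }
}
```

Inside VideoStreamingEvent this would read gate.TryProcess(() => DataProcessing(data)). The finally block matters: without it, an exception in the processing code would leave the flag set and every later frame would be skipped.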
The way I was told to make Windows services is as follows:
Thread serviceThread = new Thread(new ThreadStart(runProc));
bool _isRunning = true;
if (_isRunning)
{
    serviceThread.Start();
}
else
{
    // close and log service
}
void runProc()
{
    while (_isRunning)
    {
        // Service tasks
    }
    _isRunning = false;
}
This has worked fine for me so far but now I need to make a service that has big breaks in it, up to 2 hours at a time. Also I have started using timers so nothing is being done in the infinite loop other than stopping runProc() running over and over again which I can imagine is bad because threads are being made and remade a lot.
My question is, I have read that it is bad practice to put Thread.Sleep(big number) in that while(_isRunning) infinite loop, is this true? If this is the case, how do I get around the loop running constantly and using loads of resource? There is literally nothing being done in the loop right now, it is all handled in the tickevent of my timer, the only reason I have a loop is to stop runProc ending.
Thanks a lot, and sorry if I explain myself badly.
Thread.Sleep is bad because it cannot be (easily) interrupted [1].
I generally prefer to use a ManualResetEvent or similar:
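class abc {
    // Milliseconds between passes; this value is only a placeholder.
    int delay = 5 * 60 * 1000;
    ManualResetEvent abort = new ManualResetEvent(false);
    Thread serviceThread;
    public abc(){
        // A field initializer cannot reference an instance method,
        // so the thread is created in the constructor.
        serviceThread = new Thread(runProc);
    }
    void Start(){
        serviceThread.Start();
    }
    void Stop(){
        abort.Set();            // wakes the loop immediately
        serviceThread.Join();   // waits for runProc to return
    }
    void runProc()
    {
        // WaitOne returns false on timeout (do another pass) and true once Stop was called.
        while(!abort.WaitOne(delay))
        {
            //Service tasks
        }
    }
}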
Hopefully you get the gist, not a great code sample.
The delay can be as large or small as you want (and can be arbitrarily recomputed during each loop). The WaitOne call will either delay the progress of this thread for delay milliseconds or, if Stop is called, will cause the loop to exit immediately.
[1] To summarize my position from the comments below: it can only be interrupted by blunt tools like Thread.Abort or Thread.Interrupt, which both share the failing (to a greater or lesser extent) that they can also introduce their associated exceptions at various other places in your code. If you can guarantee that the thread is actually inside the Thread.Sleep call then the latter may be okay, but if you can make such a guarantee, you can also usually arrange to use a less blunt inter-thread communication mechanism, such as the one I've suggested in this answer.
I've always written services with a main infinite loop, not timers. Inside the loop, I check to see if there's any work to do, if so I do the work, if not I call Thread.Sleep(). That means that as long as there's work to be done, the loop will keep iterating, running as fast as it can. When the queue of work "dries up", it sleeps a little (a few seconds or minutes) while more work becomes available.
That's always worked really well for back-end jobs on a server where there's a constant stream of new work to be done throughout the day (and night). If you have big periods with no work the service will wake many times to check and then go back to sleep. You might like that or not. As long as the check is quick, it shouldn't be an issue. An alternative is to use a scheduled task (or database job) so that you know that work will be completed at specific times throughout the day. That's a better approach in some cases.
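The "do work while there is some, otherwise sleep a little" loop described above could look roughly like the following. This is a sketch under assumptions: PollingWorker is a hypothetical class, the work items are plain Action delegates held in a ConcurrentQueue, and the idle Thread.Sleep is swapped for a timed wait on a ManualResetEvent so the loop can be stopped during a long idle period.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical service loop: drains queued work, then idles until more may be available.
public sealed class PollingWorker
{
    private readonly ConcurrentQueue<Action> queue = new ConcurrentQueue<Action>();
    private readonly ManualResetEvent stop = new ManualResetEvent(false);
    private readonly TimeSpan idleDelay;
    private readonly Thread thread;

    public PollingWorker(TimeSpan idleDelay)
    {
        this.idleDelay = idleDelay;
        thread = new Thread(Run);
    }

    public void Enqueue(Action job) { queue.Enqueue(job); }
    public void Start() { thread.Start(); }

    // Signals the loop and waits for it to exit.
    public void Stop() { stop.Set(); thread.Join(); }

    private void Run()
    {
        while (true)
        {
            Action job;
            // Run as fast as possible while there is work.
            while (queue.TryDequeue(out job)) job();

            // Queue dried up: idle for idleDelay, or return at once if Stop was called.
            if (stop.WaitOne(idleDelay)) return;
        }
    }
}
```

Note that in this sketch the stop signal is only observed between drains, so Stop waits for work that is already queued to finish first.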
I have asked a similar question before here, but after much thought, and implementations from those that answered me, I found that my approach might have been incorrect.
When I implement the solution given to me on this previous question the following test result appeared:
When I 'simulate' multiple tasks running concurrently on multiple threads from the threadpool (by making the threads sleep at random times from 1 to 20 seconds for instance), then the model seems to work fine. I set the system to poll every 1 second to see if it can spawn another thread and all seems fine. Longer running (sleeping) threads would complete later on and threads would start and die all over the place. If I happen to run out of threads (I set it to spawn no more than 10) it would sit and wait for one to become available.
However, when I make the system do actual processing in each thread (which takes anything from 3 seconds upwards), involving reading data, generating XML, saving data, sending emails and the like, the system would spawn 1, 2 or 3 threads, do the processing and then just close the threads (3...2...1...) and then say 0 threads running (I added Console.WriteLine calls everywhere to document the process). It would then hang around at 0 threads, never picking up any more work!
So I decided to state my issue again in the hope that someone has a solution. I have tried various solutions so far:
ThreadPool: There's always the mention that you shouldn't over-work the ThreadPool and that jobs have to be 'quick', but what is the definition of 'quick'? How do I know how big/busy the ThreadPool is?
Threads: It's always stated that Threads are expensive and you have to handle them starting up and ending, but how do I limit them? I have tried Semaphores, 'lock' objects and public variables, but to no avail.
So here is what I would like to accomplish:
1. I have the same job that needs to run at regular intervals, e.g. like Gmail would check its server for new email for you every 5 seconds.
2. If there is work to be done (e.g. you have new emails to be sent to your inbox), then spawn an async thread and make it start the work. This work will typically take longer than the interval stated in (1), hence the async thread. If an interval passes and the system checks again and sees there is more work, it will spawn another thread and make it start the work.
3. As in my example, all the jobs are the same kind of job (check for new mail) and are totally independent of each other. If one of them fails, the rest can continue working with no issue.
4. I need there to be a limit on how many concurrent threads I can have. If I pick '10', then the system should start checking for jobs as in (1), and keep on spawning threads as in (2), until it reaches 10 threads. All new attempts on an interval to spawn a new thread should just fail (do nothing) until a thread is released again. Here I suppose the choice will be: (a) when it's released there will already be some work queued waiting to be given to the newly free thread, or (b) on the next interval check whether there's new work and assign it to the newly free thread.
5. If there is no work, then typically the system should sit and wait, having no threads; in essence the only thing that should be running is some sort of timer.
I currently use the sample in the previous question to do the following:
I start a timer, that ticks every 1 sec
On every tick I call ThreadPool.QueueUserWorkItem(new WaitCallback(DoWork))
In DoWork I instantiate a class and call various methods that do some work
...but this leads to what I mentioned before, only 3 threads that die off and then nothing.
I was thinking of doing the following:
Set the ThreadPool to 10 threads
Start a timer and in each tick call ThreadPool.QueueUserWorkItem, and just keep on doing this, hoping that the ThreadPool will handle everything else. Isn't this what the ThreadPool is supposed to do?
Any help will be fantastic! (Sorry for the involved explanation!)
Try to have a look at the Semaphore class. You can use that to set a limit to how many threads can concurrently access a particular resource (and when I say resource, it can be anything).
Ok, edited for details:
In your class managing the threads, you create:
Semaphore concurrentThreadsEnforcer = new Semaphore(value1, value2);
Then, each thread you start will call:
concurrentThreadsEnforcer.WaitOne();
That will either take one slot from the semaphore and give it to the new thread, or block the new thread until a slot becomes available.
Whenever your new thread finishes its work, he (I like personalizing) MUST call, for obvious reasons:
concurrentThreadsEnforcer.Release();
Now, regarding the constructor, the second parameter (maximumCount) is fairly simple: it states how many concurrent threads can access the resource at any given time.
The first one (initialCount) is a bit trickier. The difference between the second parameter and the first one states how many semaphore slots are reserved for the calling thread. That is, all your newly spawned threads will have access to the number of slots stated by the first parameter, and the rest of them, up to the second parameter's value, will be reserved for the original thread that created the semaphore (the calling thread).
In your case, for 10 max threads, you would use:
... = new Semaphore(10, 10);
Since I posted a story anyway, let me give more details.
The way I would do it in the new threads is like this:
bool acquired = false;
try
{
    acquired = concurrentThreadsEnforcer.WaitOne();
    // Do some work here
} // Optional catch statements
finally
{
    // Only give the slot back if it was actually taken.
    if (acquired)
        concurrentThreadsEnforcer.Release();
}
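The question asks that a tick should "just fail (do nothing)" when the limit is reached, whereas a plain WaitOne() blocks. One way to get the non-blocking behaviour, sketched here under assumptions, is to take the slot in the timer tick with WaitOne(0) and release it when the work item ends. BoundedLauncher and TryLaunch are hypothetical names.

```csharp
using System;
using System.Threading;

// Hypothetical launcher: runs work on the thread pool, never more than maxConcurrent at once.
public sealed class BoundedLauncher
{
    private readonly Semaphore slots;

    public BoundedLauncher(int maxConcurrent)
    {
        // All slots start out available to the worker threads.
        slots = new Semaphore(maxConcurrent, maxConcurrent);
    }

    // Intended to be called from the timer tick.
    // Returns false without queuing anything when every slot is in use.
    public bool TryLaunch(Action work)
    {
        // A timeout of 0 tests the semaphore without blocking.
        if (!slots.WaitOne(0)) return false;

        try
        {
            ThreadPool.QueueUserWorkItem(_ =>
            {
                try { work(); }
                finally { slots.Release(); } // free the slot even if work throws
            });
        }
        catch
        {
            slots.Release(); // queuing failed, so the slot was never used
            throw;
        }
        return true;
    }
}
```

Because the slot is taken before the item is queued, the count also covers items that are waiting for a pool thread, which corresponds to option (b) in the question: work that does not fit is picked up on a later tick.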
I would use a combination of BlockingCollection and Parallel.ForEach
Something like this:
private BlockingCollection<Job> jobs = new BlockingCollection<Job>();
private Task jobprocessor;

public void StartWork() {
    timer.Start();
    jobprocessor = Task.Factory.StartNew(RunJobs);
}

public void EndWork() {
    timer.Stop();
    jobs.CompleteAdding();   // lets the consuming loop finish once the queue is empty
    jobprocessor.Wait();
}

public void TimerTick() {
    var job = new Job();
    if (job.NeedsMoreWork())
        jobs.Add(job);
}

public void RunJobs() {
    var options = new ParallelOptions { MaxDegreeOfParallelism = 10 };
    // GetConsumingPartitioner is an extension method from the ParallelExtensionsExtras
    // samples, not part of the base class library.
    Parallel.ForEach(jobs.GetConsumingPartitioner(), options,
        job => job.DoSomething());
}
Greetings
I have a program that creates multiples instances of a class, runs the same long-running Update method on all instances and waits for completion. I'm following Kev's approach from this question of adding the Update to ThreadPool.QueueUserWorkItem.
In the main prog., I'm sleeping for a few minutes and checking a Boolean in the last child to see if done
while(!child[child.Length-1].isFinished) {
    Thread.Sleep(...);
}
This solution is working the way I want, but is there a better way to do this? Both for the independent instances and checking if all work is done.
Thanks
UPDATE:
There doesn't need to be locking. The different instances each have a different web service url they request from, and do similar work on the response. They're all doing their own thing.
If you know the number of operations that will be performed, use a countdown and an event:
Activity[] activities = GetActivities();
int remaining = activities.Length;
using (ManualResetEvent finishedEvent = new ManualResetEvent(false))
{
    foreach (Activity activity in activities)
    {
        ThreadPool.QueueUserWorkItem(s =>
        {
            activity.Run();
            if (Interlocked.Decrement(ref remaining) == 0)
                finishedEvent.Set();
        });
    }
    finishedEvent.WaitOne();
}
Don't poll for completion. The .NET Framework (and the Windows OS in general) has a number of threading primitives specifically designed to prevent the need for spinlocks, and a polling loop with Sleep is really just a slow spinlock.
You can try Semaphore.
A blocking way of waiting is a bit more elegant than polling. See the Monitor.Wait/Monitor.Pulse (Semaphore works ok too) for a simple way to block and signal. C# has some syntactic sugar around the Monitor class in the form of the lock keyword.
This doesn't look good. There is almost never a valid reason to assume that when the last thread is completed that the other ones are done as well. Unless you somehow interlock the worker threads, which you should never do. It also makes little sense to Sleep(), waiting for a thread to complete. You might as well do the work that thread is doing.
If you've got multiple threads going, give them each a ManualResetEvent. You can wait on completion with WaitHandle.WaitAll(). Counting down a thread counter with the Interlocked class can work too. Or use a CountdownLatch.
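On .NET 4 and later, the built-in CountdownEvent plays the role of a countdown latch, and a single event avoids the 64-handle limit of WaitHandle.WaitAll. A minimal sketch, assuming the work items are plain Action delegates (Completion and RunAll are hypothetical names):

```csharp
using System;
using System.Threading;

public static class Completion
{
    // Runs every action on the thread pool and blocks until all of them have finished.
    public static void RunAll(Action[] actions)
    {
        if (actions.Length == 0) return; // nothing to wait for

        using (var countdown = new CountdownEvent(actions.Length))
        {
            foreach (Action action in actions)
            {
                Action current = action; // per-iteration copy for the closure
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    try { current(); }
                    finally { countdown.Signal(); } // count down even if the action throws
                });
            }
            countdown.Wait(); // returns once the count reaches zero
        }
    }
}
```

In the question's terms this could be called as Completion.RunAll(children.Select(c => (Action)c.Update).ToArray()), where children is assumed to be the collection of instances; it replaces both the Sleep loop and the check on the last child only.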
I have a GUI C# application that has a single button Start/Stop.
Originally this GUI was creating a single instance of a class that queries a database and performs some actions if there are results and gets a single "task" at a time from the database.
I was then asked to try to utilize all the computing power on some of the 8 core systems. Using the number of processors I figure I can create that number of instances of my class and run them all and come pretty close to using a fair amount of the computing power.
Environment.ProcessorCount;
Using this value, in the GUI form, I have been trying to go through a loop ProcessorCount number of times and start a new thread that calls a "doWork" type method in the class. Then Sleep for 1 second (to ensure the initial query gets through) and then proceed to the next part of the loop.
I kept on having issues with this however because it seemed to wait until the loop was completed to start the queries leading to a collision of some sort (getting the same value from the MySQL database).
In the main form, once it starts the "workers" it then changes the button text to STOP and if the button is hit again, it should execute on each "worker" a "stopWork" method.
Does what I am trying to accomplish make sense? Is there a better way to do this (that doesn't involve restructuring the worker class)?
Restructure your design so you have one thread running in the background checking your database for work to do.
When it finds work to do, spawn a new thread for each work item.
Don't forget to use synchronization tools, such as semaphores and mutexes, for the key limited resources. Fine tuning the synchronization is worth your time.
You could also experiment with the maximum number of worker threads - my guess is that it would be a few over your current number of processors.
While an exhaustive answer on the best practices of multithreaded development is a little beyond what I can write here, a couple of things:
Don't use Sleep() to wait for something to continue unless ABSOLUTELY necessary. If you have another code process that you need to wait for completion, you can either Join() that thread or use either a ManualResetEvent or AutoResetEvent. There is a lot of information on MSDN about their usage. Take some time to read over it.
You can't really guarantee that your threads will each run on their own core. While it's entirely likely that the OS thread scheduler will do this, just be aware that it isn't guaranteed.
I would assume that the easiest way to increase your use of the processors would be to simply spawn the worker methods on threads from the ThreadPool (by calling ThreadPool.QueueUserWorkItem). If you do this in a loop, the runtime will pick up threads from the thread pool and run the worker threads in parallel.
ThreadPool.QueueUserWorkItem(state => DoWork());
Never use Sleep for thread synchronization.
Your question doesn't supply enough detail, but you might want to use a ManualResetEvent to make the workers wait for the initial query.
Yes, it makes sense what you are trying to do.
It would make sense to make 8 workers, each consuming tasks from a queue. You should take care to synchronize threads properly, if they need to access shared state. From your description of your problem, it sounds like you are having a thread synchronization problem.
You should remember, that you can only update the GUI from the GUI thread. That might also be the source of your problems.
There is really no way to tell, what exactly the problem is, without more information or a code example.
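As a rough illustration of "N workers, each consuming tasks from a queue", the sketch below uses BlockingCollection, which handles the synchronization of the shared queue itself. WorkerPool is a hypothetical class, and the tasks are assumed to be independent Action delegates; fetching them from the database is left out.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical fixed-size pool: workerCount threads pull tasks from one shared queue.
public sealed class WorkerPool
{
    private readonly BlockingCollection<Action> tasks = new BlockingCollection<Action>();
    private readonly Thread[] workers;

    public WorkerPool(int workerCount)
    {
        workers = new Thread[workerCount];
        for (int i = 0; i < workerCount; i++)
        {
            workers[i] = new Thread(Consume);
            workers[i].Start();
        }
    }

    public void Add(Action task) { tasks.Add(task); }

    // Stops accepting work, lets queued tasks finish, then waits for the workers to exit.
    public void Stop()
    {
        tasks.CompleteAdding();
        foreach (Thread worker in workers) worker.Join();
    }

    private void Consume()
    {
        // Blocks while the queue is empty; ends after CompleteAdding once it is drained.
        foreach (Action task in tasks.GetConsumingEnumerable())
            task();
    }
}
```

With new WorkerPool(Environment.ProcessorCount), each task is handed to exactly one worker, so two workers cannot pick up the same item from the in-memory queue. Any GUI update from inside a task still has to be marshalled to the GUI thread.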
I suspect you have a problem like this: you need to make a copy of the loop variable (task) into currenttask, otherwise the threads all actually share the same variable.
// main thread
var tasks = db.GetTasks();
foreach(var task in tasks) {
    var currenttask = task;
    ThreadPool.QueueUserWorkItem(state => DoTask(currenttask));
    // or: new Thread(() => DoTask(currenttask)).Start();
    // ThreadPool.QueueUserWorkItem(state => DoTask(task)); // this doesn't work!
}
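The shared-variable effect can be seen without any threads. The sketch below uses a for loop, where the loop variable is shared by every lambda in all C# versions; since C# 5 the foreach variable is fresh on each iteration, so the copy in the code above is only strictly needed with older compilers or with for loops. CaptureDemo and its methods are illustrative names.

```csharp
using System;
using System.Collections.Generic;

public static class CaptureDemo
{
    // Every lambda captures the same variable i, which is 3 once the loop has ended.
    public static List<Func<int>> Shared()
    {
        var result = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
            result.Add(() => i);
        return result;
    }

    // Each lambda captures its own copy, so the values 0, 1 and 2 are preserved.
    public static List<Func<int>> Copied()
    {
        var result = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            int copy = i;
            result.Add(() => copy);
        }
        return result;
    }
}
```

Invoking the delegates from Shared() yields 3, 3, 3, while those from Copied() yield 0, 1, 2. With threads the symptom is the same: several workers end up processing the same item.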
Note that you shouldn't Thread.Sleep() on the main thread to wait for the worker threads to finish. If you're using the thread pool, you can continue to queue work items; if you want to wait for the executing tasks to finish, use something like an AutoResetEvent.
You seem to be encountering a common issue with multithreaded programming. It's called a Race Condition, and you'd do well to do some research on this and other multithreading issues before proceeding too far. It's very easy to quickly mess up all your data.
The short of it is that you must ensure all your commands to your database (eg: Get an available task) are performed within the scope of a single transaction.
I don't know MySQL well enough to give a complete answer; however, a very basic example for T-SQL might look like this:
BEGIN TRAN
DECLARE @taskid int
SELECT @taskid = taskid FROM tasks WHERE assigned = 0
UPDATE tasks SET assigned = 1 WHERE taskid = @taskid
SELECT * FROM tasks WHERE taskid = @taskid
COMMIT TRAN
MySQL 5 and above has support for transactions too.
You could also put a lock around the "fetch task from DB" code, so that only one thread queries the database at a time, but obviously this decreases the performance gain somewhat.
Some code of what you're doing (and maybe some SQL, this really depends) would be a huge help.
However assuming you're fetching a task from DB, and these tasks require some time in C#, you likely want something like this:
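object myLock;
void StartWorking()
{
    myLock = new object(); // only new it once, could be done in your constructor too.
    for (int i = 0; i < Environment.ProcessorCount; i++)
    {
        // DoWork matches the WaitCallback signature, so it can be passed directly.
        ThreadPool.QueueUserWorkItem(DoWork);
    }
}
void DoWork(object state)
{
    object task;
    lock(myLock)
    {
        // Only one thread at a time fetches a task from the database.
        task = GetTaskFromDB();
    }
    // The actual work runs outside the lock, in parallel.
    PerformTask(task);
}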
There are some good ideas posted above. One of the things that we ran into is that we wanted not only a multi-processor capable application but a multi-server capable application as well. In our application we use a queue that is wrapped in a lock on a common web server (causing others to be blocked) while we get the next thing to be processed.
In our case, we are processing lots of data. To keep things simple, we lock an object, get the id of the next unprocessed item, flag it as being processed, unlock the object, and hand the record id back to the main thread on the calling server, where it then gets processed. This seems to work well for us since the time it takes to lock, get, update, and release is very small, and while blocking does occur, we never run into a deadlock while waiting for resources (because we are using lock(object) { } with a nice tight try/catch inside to ensure we handle errors gracefully).
As mentioned elsewhere, all of this is handled in the primary thread. Given the information to be processed, we push it to a new thread (which for us goes and retrieves 100 MB of data and processes it per call). This approach has allowed us to scale beyond a single server. In the past we had to throw high-end hardware at the problem; now we can throw several cheaper, but still very capable, servers at it. We can also spread this across our virtualization farm in low-utilization periods.
One other thing I failed to mention: we also use locking mutexes inside our stored proc, so if two apps on two servers call it at the same time, it's handled gracefully. So the concept above applies to our app and to the database. Our client's backend is the MySQL 5.1 series and it is done with just a few lines.
One of the things that I think people forget when they are developing is that you want to get in and out of the lock relatively quickly. If you want to return large chunks of data, I personally wouldn't do it in the lock itself unless you really had to. Otherwise, you can't really do much multithreading if everyone is waiting to get data.
Okay, found my MySql code for doing just what you will need.
DELIMITER //
CREATE PROCEDURE getnextid(
    I_service_entity_id INT(11)
    , OUT O_tag VARCHAR(36)
)
BEGIN
    DECLARE L_tag VARCHAR(36) DEFAULT '00000000-0000-0000-0000-000000000000';
    DECLARE L_locked INT DEFAULT 0;
    DECLARE C_next CURSOR FOR
        SELECT tag FROM workitems
        WHERE status in (0)
        AND processable_date <= DATE_ADD(NOW(), INTERVAL 5 MINUTE)
    ;
    -- No work item available: return the empty GUID and release the lock.
    DECLARE EXIT HANDLER FOR NOT FOUND
    BEGIN
        SET O_tag := '00000000-0000-0000-0000-000000000000';
        DO RELEASE_LOCK('myuniquelockis');
    END;
    -- Wait up to 20 seconds for the named lock shared by all callers.
    SELECT COALESCE(GET_LOCK('myuniquelockis',20), 0) INTO L_locked;
    IF L_locked > 0 THEN
        OPEN C_next;
        FETCH C_next INTO L_tag;
        IF L_tag <> '00000000-0000-0000-0000-000000000000' THEN
            -- Flag the item as taken by this caller.
            UPDATE workitems SET
                status = 1
                , service_entity_id = I_service_entity_id
                , date_locked = NOW()
            WHERE tag = L_tag;
        END IF;
        CLOSE C_next;
        DO RELEASE_LOCK('myuniquelockis');
    END IF;
    -- Empty GUID if the lock could not be taken, otherwise the claimed tag.
    SET O_tag := L_tag;
END
//
DELIMITER ;
In our case, we return a GUID to C# as an out parameter. You could replace the SET at the end with SELECT L_tag; and be done with it and lose the OUT parameter, but we call this from another wrapper...
Hope this helps.