C#: Query status of a thread

C#: Query status of a thread - c#

I am running a background worker thread that takes a long time to run.
The actual function stores the folder structure of a location, but we can take an example of the following pseudo code running in a different thread -
private int currentResult=0;
private void worker() {
for(int n = 0; n<1000; ++n)
{
int result;
// Do some time consuming computation and update the result
currentResult = result;
}
}
This is running in a BackgroundWorker thread. Can I read the currentResult from another thread safely?
Edit:
Keyword volatile seems like a magic solution (thanks Jon)! I am planning to pass a string with a message in this way from the worker class to the UI.
You might be wondering why I don't use ReportProgress. The reason is that the BackgroundWorker.DoWork creates an object of a different class, and calls a method there which does bulk of the work. This method is time consuming. The class is one to get the directory structure, and many related methods in it which the main computing method depends on. So this class does not even know existance of the background worker, and hence cannot report progress to it. Moving functionality of the class to BackgroundWorker seems messy. If this strikes as a bad design I am open to suggestions!

If you make it volatile you can... or if you always use the Interlocked class to both read and write it. (Or always read/write it within a lock on the same monitor.) Without these precautions, you could end up reading a stale value.
However, that's not necessarily the best way to do things. Generally, a background worker should use ReportProgress to indicate its progress... is there any reason you don't want to do that in this case?

Related

Is it possible to modify an object passed like a parameter to other thread with C#? [duplicate]

is there any way in c# to put objects in another thread? All I found is how to actually execute some methods in another thread. What I actually want to do is to instanciate an object in a new thread for later use of the methods it provides.
Hope you can help me,
Russo

Objects do not really belong to a thread. If you have a reference to an object, you can access it from many threads.
This can give problems with object that are not designed to be accessed from many threads, like (almost all) System.Windows.Forms classes, and access to COM objects.
If you only want to access an object from the same thread, store a reference to the thread in the object (or a wrapping object), and execute the methods via that thread.

There seems to be some confusion about how threads work here, so this is a primer (very short too, so you should find more material before venturing further into multi-threaded programming.)
Objects and memory are inherently multi-thread in the sense that all threads in a process can access them as they choose.
So objects do not have anything to do with threads.
However, code executes in a thread, and it is the thread the code executes in that you're probably after.
Unfortunately there is no way to just "put an object into a different thread" as you put it, you need to specifically start a thread and specify what code to execute in that thread. Objects used by that code can thus be "said" to belong to that thread, though that is an artificial limit you impose yourself.
So there is no way to do this:
SomeObject obj = new SomeObject();
obj.PutInThread(thatOtherThread);
obj.Method(); // this now executes in that other thread
In fact, a common trap many new multi-thread programmers fall into is that if they create an object in one thread, and call methods on it from another thread, all those methods execute in the thread that created the object. This is incorrect, methods always executes in the thread that called them.
So the following is also incorrect:
Thread 1:
SomeObject obj = new SomeObject();
Thread 2:
obj.Method(); // executes in Thread 1
The method here will execute in Thread 2. The only way to get the method to execute in the original thread is to cooperate with the original thread and "ask it" to execute that method. How you do that depends on the situation and there's many many ways to do this.
So to summarize what you want: You want to create a new thread, and execute code in that thread.
To do that, look at the Thread class of .NET.
But be warned: Multi-threaded applications are exceedingly hard to get correct, I would not add multi-threaded capabilities to a program unless:
That is the only way to get more performance out of it
And, you know what you're doing

All threads of a process share the same data (ignoring thread local storage) so there is no need to explicitly migrate objects between threads.
internal sealed class Foo
{
private Object bar = null;
private void CreateBarOnNewThread()
{
var thread = new Thread(this.CreateBar);
thread.Start();
// Do other stuff while the new thread
// creates our bar.
Console.WriteLine("Doing crazy stuff.");
// Wait for the other thread to finish.
thread.Join();
// Use this.bar here...
}
private void CreateBar()
{
// Creating a bar takes a long time.
Thread.Sleep(1000);
this.bar = new Object();
}
}

All threads can see the stack heap, so if the thread has a reference to the objects you need (passed in through a method, for example) then the thread can use those objects. This is why you have to be very careful accessing objects when multi-threading, as two threads might try and change the object at the same time.
There is a ThreadLocal<T> class in .NET that you can use to restrict variables to a specific thread: see http://msdn.microsoft.com/en-us/library/dd642243.aspx and http://www.c-sharpcorner.com/UploadFile/ddoedens/UseThreadLocals11212005053901AM/UseThreadLocals.aspx

Use ParameterizedThreadStart to pass an object to your thread.

"for later use of the methods it provides."
Using a class that contains method to execute on new thread and other data and methods, you can gain access from your thread to Data and methods from the new thread.
But ... if your execute a method from the class, you are executing on current thread.
To execute the method on the new thread needs some Thread syncronization.
System.Windows.Forms.Control.BeginInvoke do it, the Control thread is waiting until a request arrives.
WaitHandle class can help you.

There's a lot of jargon around threading, But it boils down something pretty simple.
For a simple program, you have one point of execution flowing from point a to b, one line at a time. Programming 101, right?
Ok, for multithreading, You now have more then one point of execution in your program. So, point 1 can be in one part of your program, and point 2 can be someplace else.
It's all the same memory, data and code, but you have more then one thing happening at a time. So, you can think, what happens of both points enter a loop at the same time, what do you think would happen? So techniques were created to keep that kind of issue either from happening, or to speed up some kind of process. (counting a value vs. say, networking.)
That's all it really is. It can be tricky to manage, and and it's easy to get lost in the jargon and theory, but keep this in mind and it will be much simpler.
There are other exceptions to the rule as always, but this is the basics of it.

If the method that you run in a thread resides in a custom class you can have members of this class to hold the parameters.
public class Foo
{
object parameter1;
object parameter2;
public void ThreadMethod()
{
...
}
}

Sorry to duplicate some previous work, but the OP said
What I actually want to do is to instanciate an object in a new thread for later use of the methods it provides.
Let me interpret that as:
What I actually want to do is have a new thread instantiate an object so that later I can use that object's methods.
Pray correct me if I've missed the mark. Here's the example:
namespace silly
{
public static class Program
{
//declared volatile to make sure the object is in a consistent state
//between thread usages -- For thread safety.
public static volatile Object_w_Methods _method_provider = null;
static void Main(string[] args)
{
//right now, _method_provider is null.
System.Threading.Thread _creator_thread = new System.Threading.Thread(
new System.Threading.ThreadStart(Create_Object));
_creator_thread.Name = "Thread for creation of object";
_creator_thread.Start();
//here I can do other work while _method_provider is created.
System.Threading.Thread.Sleep(256);
_creator_thread.Join();
//by now, the other thread has created the _method_provider
//so we can use his methods in this thread, and any other thread!
System.Console.WriteLine("I got the name!! It is: `" +
_method_provider.Get_Name(1) + "'");
System.Console.WriteLine("Press any key to exit...");
System.Console.ReadKey(true);
}
static void Create_Object()
{
System.Threading.Thread.Sleep(512);
_method_provider = new Object_w_Methods();
}
}
public class Object_w_Methods
{
//Synchronize because it will probably be used by multiple threads,
//even though the current implementation is thread safe.
[System.Runtime.CompilerServices.MethodImpl(
System.Runtime.CompilerServices.MethodImplOptions.Synchronized)]
public string Get_Name(int id)
{
switch (id)
{
case 1:
return "one is the name";
case 2:
return "two is the one you want";
default:
return "supply the correct ID.";
}}}}

Just like to elaborate on a previous answer. To get back to the problem, objects and memory space are shared by all threads. So they are always shared, but I am assuming you want to do so safely and work with results created by another thread.
Firstly try one of the trusted C# patterns. Async Patterns
There are set patterns to work with, that do transmit basic messages and data between threads.
Usually the one threat completes after it computes the results!
Life threats: Nothing is fool proof when going asynchronous and sharing data on life threats.
So basically keep it as simple as possible if you do need to go this route and try follow known patterns.
So now I just like to elaborate why some of the known patters have a certain structure:
Eventargs: where you create a deepcopy of the objects before passing it. (It is not foolproof because certain references might still be shared . )
Passing results with basic types like int floats, etc, These can be created on a constructor and made immutable.
Atomic key words one these types, or create monitors etc.. Stick to one thread reads the other writes.
Assuming you have complex data you like to work with on two threads simultaneously a completely different ways to solve this , which I have not yet tested:
You could store results in database and let the other executable read it. ( There locks occur on a row level but you can try again or change the SQL code and at least you will get reported deadlocks that can be solved with good design, not just hanging software!!) I would only do this if it actually makes sense to store the data in a database for other reasons.
Another way that helps is to program F# . There objects and all types are immutable by default/ So your objects you want to share should have a constructor and no methods allow the object to get changed or basic types to get incremented.
So you create them and then they don't change! So they are non mutable after that.
Makes locking them and working with them in parallel so much easier. Don't go crazy with this in C# classes because others might follow this "convention' and most things like Lists were just not designed to be immutable in C# ( readonly is not the same as immutable, const is but it is very limiting). Immutable versus readonly

Best way to delay execution

Let's say I have a method that I run in a separate thread via Task.Factory.StartNew().
This method reports so many progress (IProgress) that it freezes my GUI.
I know that simply reducing the number of reports would be a solution, like reporting only 1 out of 10 but in my case, I really want to get all reports and display them in my GUI.
My first idea was to queue all reports and treat them one by one, pausing a little bit between each of them.
Firstly: Is it a good option?
Secondly: How to implement that? Using a timer or using some kind of Task.Delay()?
UPDATE:
I'll try to explain better. The progress sent to the GUI consists of geocoordinates that I display on a map. Displaying each progress one after another provide some kind of animation on the map. That's why I don't want to skip any of them.
In fact, I don't mind if the method that I execute in another thread finishes way before the animation. All I want, is to be sure that all points have been displayed for at least a certain amount of time (let's say 200 ms).

Sounds like the whole point of having the process run in a separate thread is wasted if this is the result. As such, my first recommendation would be to reduce the number of updates if possible.
If that is out of the question, perhaps you could revise the data you are sending as part of each update. How large, and how complex is the object or data-structure used for reporting? Can performance be improved by reducing it's complexity?
Finally, you might try another approach: What if you create a third thread that just handles the reporting, and delivers it to your GUI in larger chunks? If you let your worker-thread report it's status to this reporter-thread, then let the reporter thread report back to your main GUI-thread only occasionally (e.g. every 1 in 10, as you suggest yourself above, bur then reporting 10 chunks of data at once), then you won't call on your GUI that often, yet you'll still be able to keep all the status data from the processing, and make it available in the GUI.
I don't know how viable this will be for your particular situation, but it might be worth an experiment or two?

I have many concerns regarding your solution, but I can't say for sure which one can be a problem without code samples.
First of all, Stephen Cleary in his StartNew is Dangerous article points out the real problem with this method with using it with default parameters:
Easy enough for the simple case, but let’s consider a more realistic example:
private void Form1_Load(object sender, EventArgs e)
{
Compute(3);
}
private void Compute(int counter)
{
// If we're done computing, just return.
if (counter == 0)
return;
var ui = TaskScheduler.FromCurrentSynchronizationContext();
Task.Factory.StartNew(() => A(counter))
.ContinueWith(t =>
{
Text = t.Result.ToString(); // Update UI with results.
// Continue working.
Compute(counter - 1);
}, ui);
}
private int A(int value)
{
return value; // CPU-intensive work.
}
...
Now, the question returns: what thread does A run on? Go ahead and walk through it; you should have enough knowledge at this point to figure out the answer.
Ready? The method A runs on a thread pool thread the first time, and then it runs on the UI thread the last two times.
I strongly recommend you to read whole article for better understanding the StartNew method usage, but want to point out the last advice from there:
Unfortunately, the only overloads for StartNew that take a
TaskScheduler also require you to specify the CancellationToken and
TaskCreationOptions. This means that in order to use
Task.Factory.StartNew to reliably, predictably queue work to the
thread pool, you have to use an overload like this:
Task.Factory.StartNew(A, CancellationToken.None,
TaskCreationOptions.DenyChildAttach, TaskScheduler.Default);
And really, that’s kind of ridiculous. Just use Task.Run(() => A());.
So, may be your code can be easily improved simply by switching the method you are creating news tasks. But there is some other suggestions regarding your question:
Use BlockingCollection for the storing the reports, and write a simple consumer from this queue to UI, so you'll always have a limited number of reports to represent, but at the end all of them will be handled.
Use a ConcurrentExclusiveSchedulerPair class for your logic: for generating the reports use the ConcurrentScheduler Property and for displaying them use ExclusiveScheduler Property.

Pause loop until inner function returns (C#)

I have a loop that I don't want to continue until LoadAmazonDataByBatch() has returned. I know there must be a straight forward way of doing it, and I'm almost certain I'm approaching the problem wrong.
const int batchSize = 500;
for (int i = 0; i < total; i = i + batchSize)
{
LoadAmazonDataByBatch(i, batchSize, fileList, total, amazonLogHandler, stopWatch);
}
LoadAmazonDataByBatch() does a bunch of things on worker threads including creating a temporary DataSet that would get very large without the batching. I don't want to create a new DataSet until the old one is processed and disposed (by LoadAmazonDataByBatch).
Obviously the way this is written now everything happens almost all at once.
How can I approach this better?

You need to do some sort of thread synchronization.
Not clear where you got the LoadAmazonByBatch() , but I'd suggest
checking the doc for that function to see if there is a synchronous version of the operation.
if no doc is available, then you will need to roll up yr sleeves. It may require viewing or modifying the source of LoadAmazonByBatch(). Look for a ManualResetEvent that is set by the workers when they are finished. Or, maybe there is a regular .NET event that is emitted by that method when it completes. If those things don't exist you'll need to add something like that.

It's very likely that LoadAmazonDataByBatch creates a bunch of threads. You have to call Join on all created threads to wait till they complete.

Surely the only way this wouldn't wait for the function to return is if it's written asynchronously?
The relevant code isn't the loop you posted, it's the definition of LoadAmazonDataByBatch() that we need to see.

If that function has a callback (stopWatch?), perhaps you could call the function (LoadAmazonDataByBatch) within the callback.

If LoadAmazonDataByBatch() generates child threads, and it runs until each of those threads is finished, you can use the Thread.Join() method to make it wait for the child threads to finish. I am not sure how that would work for multiple children but I think it should be OK.
Reference: Threading in C#

How to force multiple commands to execute in same threading timeslice?

I have a C# app that needs to do a hot swap of a data input stream to a new handler class without breaking the data stream.
To do this, I have to perform multiple steps in a single thread without any other threads (most of all the data recieving thread) to run in between them due to CPU switching.
This is a simplified version of the situation but it should illustrate the problem.
void SwapInputHandler(Foo oldHandler, Foo newHandler)
{
UnhookProtocol(oldHandler);
HookProtocol(newHandler);
}
These two lines (unhook and hook) must execute in the same cpu slice to prevent any packets from getting through in case another thread executes in between them.
How can I make sure that these two commands run squentially using C# threading methods?
edit
There seems to be some confusion so I will try to be more specific. I didn't mean concurrently as in executing at the same time, just in the same cpu time slice so that no thread executes before these two complete. A lock is not what I'm looking for because that will only prevent THIS CODE from being executed again before the two commands run. I need to prevent ANY THREAD from running before these commands are done. Also, again I say this is a simplified version of my problem so don't try to solve my example, please answer the question.

Performing the operation in a single time slice will not help at all - the operation could just execute on another core or processor in parallel and access the stream while you perform the swap. You will have to use locking to prevent everybody from accessing the stream while it is in an inconsistent state.

Your data receiving thread needs to lock around accessing the handler pointer and you need to lock around changing the handler pointer.
Alternatively if your handler is a single variable you could use Interlocked.Exchange() to swap the value atomically.

Why not go at this from another direction, and let the thread in question handle the swap. Presumably, something wakes up when there's data to be handled, and passes it off to the current Foo. Could you post a notification to that thread that it needs to swap in a new handler the next time it wakes up? That would be much less fraught, I'd think.

Okay - to answer your specific question.
You can enumerate through all the threads in your process and call Thread.Suspend() on each one (except the active one), make the change and then call Thread.Resume().

Assuming your handlers are thread safe, my recommendation is to write a public wrapper over your handlers that does all the locking it needs using a private lock so you can safely change the handlers behind the scenes.
If you do this you can also use a ReaderWriterLockSlim, for accessing the wrapped handlers which allows concurrent read access.
Or you could architect your wrapper class and handler clases in such a way that no locking is required and the handler swamping can be done using a simple interlocked write or compare exchange.
Here's and example:
public interface IHandler
{
void Foo();
void Bar();
}
public class ThreadSafeHandler : IHandler
{
ReaderWriterLockSlim rwLock = new ReaderWriterLockSlim();
IHandler wrappedHandler;
public ThreadSafeHandler(IHandler handler)
{
wrappedHandler = handler;
}
public void Foo()
{
try
{
rwLock.EnterReadLock();
wrappedHandler.Foo();
}
finally
{
rwLock.ExitReadLock();
}
}
public void Bar()
{
try
{
rwLock.EnterReadLock();
wrappedHandler.Foo();
}
finally
{
rwLock.ExitReadLock();
}
}
public void SwapHandler(IHandler newHandler)
{
try
{
rwLock.EnterWriteLock();
UnhookProtocol(wrappedHandler);
HookProtocol(newHandler);
}
finally
{
rwLock.ExitWriteLock();
}
}
}
Take note that this is still not thread safe if atomic operations are required on the handler's methods, then you would need to use higher order locking between treads or add methods on your wrapper class to support thread safe atomic operations (something like, BeginTreadSafeBlock() folowed by EndTreadSafeBlock() that lock the wrapped handler for writing for a series of operations.

You can't and it's logical that you can't. The best you can do is avoid any other thread from disrupting the state between those two actions (as have already been said).
Here is why you can't:
Imagine there was an block that told the operating system to never thread switch while you're on that block. That would be technically possible but will lead to starvation everywhere.
You might thing your threads are the only one being used but that's an unwise assumption. There's the garbage collector, there are the async operations that works with threadpool threads, an external reference, such as a COM object could span its own thread (in your memory space) so that noone could progress while you're at it.
Imagine you make a very long operation in your HookOperation method. It involves a lot of non leaky operations but, as the Garbage Collector can't take over to free your resources, you end up without any memory left. Or imagine you call a COM object that uses multithreading to handle your request... but it can't start the new threads (well it can start them but they never get to run) and then joins them waiting for them to finish before coming back... and therefore you join on yourself, never returning!!.

As other posters have already said, you can't enforce system-wide critical section from user-mode code. However, you don't need it to implement the hot swapping.
Here is how.
Implement a proxy with the same interface as your hot-swappable Foo object. The proxy shall call HookProtocol and never unhook (until your app is stopped). It shall contain a reference to the current Foo handler, which you can replace with a new instance when needed. The proxy shall direct the data it receives from hooked functions to the current handler. Also, it shall provide a method for atomic replacement of the current Foo handler instance (there is a number of ways to implement it, from simple mutex to lock-free).

How can a child thread notify a parent thread of its status/progress?

I have a service responsible for many tasks, one of which is to launch jobs (one at a time) on a separate thread (threadJob child), these jobs can take a fair amount of time and
have various phases to them which I need to report back.
Ever so often a calling application requests the status from the service (GetStatus), this means that somehow the service needs to know at what point the job (child thread) is
at, my hope was that at some milestones the child thread could somehow inform (SetStatus) the parent thread (service) of its status and the service could return that information
to the calling application.
For example - I was looking to do something like this:
class Service
{
private Thread threadJob;
private int JOB_STATUS;
public Service()
{
JOB_STATUS = "IDLE";
}
public void RunTask()
{
threadJob = new Thread(new ThreadStart(PerformWork));
threadJob.IsBackground = true;
threadJob.Start();
}
public void PerformWork()
{
SetStatus("STARTING");
// do some work //
SetStatus("PHASE I");
// do some work //
SetStatus("PHASE II");
// do some work //
SetStatus("PHASE III");
// do some work //
SetStatus("FINISHED");
}
private void SetStatus(int status)
{
JOB_STATUS = status;
}
public string GetStatus()
{
return JOB_STATUS;
}
};
So, when a job needs to be performed RunTask() is called and this launches the thread (threadJob). This will run and perform some steps (using SetStatus to set the new status at
various points) and finally finish. Now, there is also function GetStatus() which should return the STATUS whenever requested (from a calling application using IPC) - this status
should reflect the current status of the job running by threadJob.
So, my problem is simple enough...
How can threadJob (or more specifically PerformWork()) return to Service the change in status in a thread-safe manner (I assume my example above of SetStatus/GetStatus is
unsafe)? Do I need to use events? I assume I cannot simply change JOB_STATUS directly ... Should I use a LOCK (if so on what?)...

You may have already looked into this, but the BackgroundWorker class gives you a nice interface for running tasks on background threads, and provides events to hook into for notifications that progress has changed.

You could create an event in the Service class and then invoke it in a thread-safe manner. Pay very close attention to how I have implemented the SetStatus method.
class Service
{
public delegate void JobStatusChangeHandler(string status);
// Event add/remove auto implemented code is already thread-safe.
public event JobStatusChangeHandler JobStatusChange;
public void PerformWork()
{
SetStatus("STARTING");
// stuff
SetStatus("FINISHED");
}
private void SetStatus(string status)
{
JobStatusChangeHandler snapshot;
lock (this)
{
// Get a snapshot of the invocation list for the event handler.
snapshot = JobStatusChange;
}
// This is threadsafe because multicast delegates are immutable.
// If you did not extract the invocation list into a local variable then
// the event may have all delegates removed after the check for null which
// which would result in a NullReferenceException when you attempt to invoke
// it.
if (snapshot != null)
{
snapshot(status);
}
}
}

I'd have the child thread raise a 'statusupdate' event, passing a struct with the information necessary for the parent and have the parent subscribe to it when launching it.

You can use the Event-Based Async Pattern.

I would go with delegate/event from the thread to the caller. If caller was UI or somewhere on that lines, I would be nice to the message pump and use appropriate Invoke()s to serialize notifications with the UIs thread when required.

I once wrote an app that needed a marker showing the progress a thread was making. I just used a shared global variable between them. The parent would just read the value, and the thread would just update it. No need to synchronize as only the parent read it, and only the child wrote it atomically. As it happened the parent was redrawing things frequently enough anyhow that it didn't even need to be poked by the child when the child updated the variable. Sometimes the simplest possible way works well.

Your current code mixes strings and ints for JOB_STATUS, which can't work. I'm assuming strings here, but it doesn't really matter, as I'll explain.
You current implementation is thread safe in the sense that no memory corruption will occur, since all assignments to reference type fields are guaranteed to be atomic. The CLR demands this, otherwise you could potentially access unmanaged memory if you could somehow access partially updated references. Your processor gives you that atomicity for free, however.
So as long as you're using reference types like strings, you won't get any memory corruption. The same is true for primitives like ints (and smaller) and enums based on them. (Just avoid longs and bigger, and non-primitive value types such as nullable integers.)
But, that is not the end of the story: this implementation is not guaranteed to always represent the current state. The reason for this is that the thread that calls GetStatus might be looking at a stale copy of the JOB_STATUS field, because the assignment in SetState contains no so-called memory barrier. That is: the new value for JOB_STATUS need not be sent to your main RAM right away. There are several reasons why this can be delayed:
Writing to main RAM is inherently slow (relatively speaking), which is the reason your processor has all kinds of buffers and L-something caches in the first place, so the processor usually delays memory synchronization. Not for very long, but it will probably delay. This can be quite noticeable on multicore processors, as these usually have separate caches per core.
The JIT might have stored the value of JOB_STATUS in a register earlier on, as part of some optimization strategy. Again, registers are far more efficient to use than your main RAM. However, this does mean that it might not see changes early enough, as it's still looking at the old copy in the register. (We're not talking minutes here, but still.)
So, if you want to be 100% certain that each thread & processor core is immediately aware of the changed status, declare your field as volatile:
private volatile int JOB_STATUS;
Now, GetStatus/SetStatus, without any locking constructs, is truly thread safe, as volatile demands that the value is read from and written to main RAM immediately (or something 100% equivalent, if the processor can do that more efficiently).
Note that if you don't declare your field as volatile you must use synchronization primitives, such as lock, but generally speaking you need to use the synchronization primitives both Get and Set, otherwise you won't solve the problem that volatile fixes.
Mind you, as you're doing IPC calls to get the status, I'd wager that you won't ever actually be able to observe any difference between non-volatile and volatile, given the overhead of the IPC calls and the thread synchronizations undoubtedly performed behind the scenes.
For more information on volatile, see volatile (C#) on MSDN.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.