How to deal with race conditions in multi-threading? (C#)

Here's an example:
if (control.InvokeRequired)
{
    control.BeginInvoke(action, control);
}
else
{
    action(control);
}
What if between the condition and the BeginInvoke call the control is disposed, for example?
Another example having to do with events:
var handler = MyEvent;
if (handler != null)
{
    handler.BeginInvoke(null, EventArgs.Empty, null, null);
}
If every handler is unsubscribed from MyEvent between the first line and the if statement, the captured delegate is still non-null, so the handlers will be invoked anyway. But is that proper design? What if the unsubscription also destroys state necessary for the proper invocation of the event? Not only does this solution have more lines of code (boilerplate), but it's not as intuitive and can lead to unexpected results on the client's end.
What say you, SO?

In my opinion, if any of this is an issue, both your thread management and object lifecycle management are reckless and need to be reexamined.
In the first example, the code is not symmetric: BeginInvoke will not wait for action to complete, but the direct call will; this is probably a bug already.
If you expect yet another thread to potentially dispose the control you're working with, you've no choice but to catch the ObjectDisposedException -- and it may not be thrown until you're already inside action, and possibly not on the current thread thanks to BeginInvoke.
It is improper to assume that once you have unsubscribed from an event you will no longer receive notifications on it. It doesn't even require multiple threads for this to happen -- only multiple subscribers. If the first subscriber does something while processing a notification that causes the second subscriber to unsubscribe, the notification currently "in flight" will still go to the second subscriber. You may be able to mitigate this with a guard clause at the top of your event handler routine, but you can't stop it from happening.
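To illustrate the guard-clause mitigation, here is a minimal sketch (the Publisher type and member names are my assumptions): the handler checks its own state before acting, so a notification that was already "in flight" when you unsubscribed gets dropped instead of touching destroyed state.

private bool _subscribed = true;

private void OnMyEvent(object sender, EventArgs e)
{
    if (!_subscribed) return; // late, in-flight notification: ignore it
    // ... handle the event normally ...
}

private void Unsubscribe(Publisher publisher)
{
    publisher.MyEvent -= OnMyEvent;
    _subscribed = false; // from here on, stray notifications are dropped
}

As noted above, this mitigates the effect but cannot prevent the stray delivery itself.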

There are a few techniques for resolving a race condition:
Wrap the whole thing with a mutex. Make sure that there's a lock that each thread must first acquire before it can even start running in the race. That way, as soon as you get the lock, you know that no other thread is using that resource and you can complete safely.
Find a way to detect races and recover from them. This can be very tricky, but it works well for some kinds of application. A typical approach is to keep a counter of the number of times a resource has changed; if you finish a task and find that the version number differs from when you started, read the new version and start the task over from the beginning (or just fail). See the sketch after this list.
Redesign the application to use only atomic actions. Basically this means using operations that can be completed in one step; this often involves "compare-and-swap" operations, or fitting all of a transaction's data into a single disk block that can be written atomically.
Redesign the application to use lock-free techniques. This option only makes sense when satisfying hard real-time constraints is more important than servicing every request, because lock-free designs inherently lose data (usually of some low-priority nature).
Which option is "right" depends on the application. Each option has performance trade-offs that may make the benefit of concurrency less appealing.
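Here is a hedged sketch of the second and third options combined: an optimistic update that treats the current value as its own version stamp and publishes with a compare-and-swap (the VersionedValue name and shape are my assumptions):

using System;
using System.Threading;

class VersionedValue
{
    private int _value;

    // Returns false if another thread changed the value while we worked,
    // so the caller can re-read and retry the whole task, or just fail.
    public bool TryUpdate(Func<int, int> transform)
    {
        int seen = Volatile.Read(ref _value); // the snapshot we started from
        int computed = transform(seen);       // do the work optimistically
        return Interlocked.CompareExchange(ref _value, computed, seen) == seen;
    }
}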

If this pattern is spread across multiple places in your application, it might be worth redesigning the API so call sites look like:
if (!control.InvokeIfRequired(action))
{
    action(control);
}
It's the same idea as ConcurrentHashMap.putIfAbsent(...) in the standard JDK library. Of course, you need to deal with synchronization inside this new control.InvokeIfRequired() method.
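For concreteness, a hedged sketch of what such a method might look like as an extension method (InvokeIfRequired is my assumption, not an existing framework member, and the check-then-act race from the question still has to be handled inside it):

using System;
using System.Windows.Forms;

static class ControlExtensions
{
    // Returns true if the action was marshalled to the UI thread via
    // BeginInvoke; false if the caller should just run it directly.
    public static bool InvokeIfRequired(this Control control, Action<Control> action)
    {
        if (!control.InvokeRequired)
        {
            return false;
        }
        control.BeginInvoke(action, control);
        return true;
    }
}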


More C# Events and Thread Safety

I read the C# Events and Thread Safety question
and the section Thread-safe delegate invocation on MSDN.
Before asking my question, let me define what I mean by thread safety. There are three aspects:
(item 1) no bad data reads/writes (no torn/intermediate data observed).
(item 2) no effects from instruction reordering.
(item 3) no effects from cache inconsistency.
Let's look at this example again:
PropertyChanged?.Invoke(…)
which the compiler expands to the equivalent of:
var handler = this.PropertyChanged;
if (handler != null)
{
    handler(…);
}
OK, in C# reads and writes of reference-type variables are atomic, so there is no bad-data problem here: once handler has been read, invoking it can never hit a null.
But I still have questions:
Is there a mechanism underneath C# that ensures an Interlocked-style operation is actually applied to PropertyChanged, so that there are no problems with instruction reordering or cache consistency?
If there is indeed such an interlocked mechanism, is it only for delegate and event types? Is there such a guarantee for other variable types that can use the ?. operator?
[Additional]
Yes, I cannot rigorously define the meaning of thread safety; I only want to give a name to items 1-3. My other doubt comes from the following passage, which says field-like events are implemented using Interlocked.CompareExchange:
What is this thing you call thread-safe?
The code we've got so far is "thread-safe" in that it doesn't matter what other threads do – you won't get a NullReferenceException from the above code. However, if other threads are subscribing to the event or unsubscribing from it, you might not see the most recent changes for the normal reasons of memory models being complicated.
As of C# 4, field-like events are implemented using Interlocked.CompareExchange, so we can just use a corresponding Interlocked.CompareExchange call to make sure we get the most recent value. There's nothing new about being able to do that, admittedly, but it does mean we can just write:
(from "Clean event handler invocation with C# 6")
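A minimal sketch of what that might look like in practice (the class and method names are my assumptions, not code from the cited post): CompareExchange with identical value and comparand is a no-op write that acts as a full-fence read of the backing delegate field.

using System.ComponentModel;
using System.Threading;

public class Model : INotifyPropertyChanged
{
    public event PropertyChangedEventHandler PropertyChanged;

    protected void OnPropertyChanged(string propertyName)
    {
        // Reads the most recent value of the backing field, then invokes it.
        var handler = Interlocked.CompareExchange(ref PropertyChanged, null, null);
        handler?.Invoke(this, new PropertyChangedEventArgs(propertyName));
    }
}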
You need to understand a couple of things about thread safety:
It is only given when documented. The default is that NO API is thread safe.
It comes with a significant cost. Any locking has a cost, so it should be avoided when unnecessary.
Finally, especially when you talk about UI elements, there are very specific threading rules in the framework, going down to rules in Windows: STA (single-threaded apartment) means one thread only, the UI thread.
So, no, there is no magic mechanism that guarantees something that IS NOT GUARANTEED PER DOCUMENTATION, because that would mean the cost has to be paid every time, mostly also when not needed.
Event mechanisms in .NET are single threaded. Period. Live with it. They go back to a notification mechanism in the UI, and there, for rules likely older than you (they go back to the times of ActiveX UI elements, which incidentally still exist in, e.g., the standard file dialog), this is the domain of "anything in the UI is only ever changed by the ONE UI thread".

Adding robust time-outs to a C# asynchronous callback scenario?

I came across this C# code sample on MSDN that shows how to use a delegate to wrap a callback method for an asynchronous DNS lookup operation:
http://msdn.microsoft.com/en-us/library/ms228972.aspx
As you can see from the code, a counter is incremented by the initiating method for each request and decremented once each time the callback method is executed. The initiating method sits in a loop until the counter reaches zero, keeping the UI updated as it waits.
What I don't see in the sample is a robust method for timing out the initiating method if the process takes too long. My questions are:
What is a good way to institute a robust time-out mechanism in this example? Is it necessary to make any calls to clean up any pending DNS lookups if the decision is made to abort the entire operation? If anyone knows of a good resource or example that demonstrates robust time-out handling in this call/callback pair scenario, I'd like to know about it.
Is this scenario better served by the async-await pattern added since VS2012?
Are there any tips or domain specific concerns related to executing in a Windows Phone context?
The Dns class specifically does not have a facility for cancelling requests. Some APIs do (generated WCF proxies), others don't (Dns and many others).
As for your questions (answering in the context of Timeout):
There are multiple ways of making the uber-request (the async request that wraps the multiple DNS requests) time out. One can simply check the time elapsed where the code calls UpdateUserInterface. That would be the simplest way in the context of this sample (though it will probably fall short for most real-world scenarios, unless you are willing to take up a thread to do it).
As for clean-up: if you mean memory clean-up after the operation completes, that will happen (unless there's a library bug). If you instead mean clean-up to conserve resources, the truth is that most people don't bother in most cases. The cost and added complexity (coupled with the fact that it's a "less travelled path", meaning less testing) mean that calls that should be aborted are often just left alone to complete in their own sweet time. One just needs to make sure the continuation code does not do anything bad.
Doing a non-blocking timeout with Tasks (await/async) is actually very compelling, if you have access to it. All you need to do is the following (pseudo code):
async Task<IPHostEntry[]> GetAddresses(IEnumerable<string> hosts, int timeoutms)
{
    List<Task<IPHostEntry>> tasks = new List<Task<IPHostEntry>>();
    foreach (var host in hosts)
    {
        tasks.Add(Dns.GetHostEntryAsync(host));
    }
    var wait = Task.Delay(timeoutms); // create a task that will fire when the time passes
    var any = await Task.WhenAny(
        wait,
        Task.WhenAll(tasks)); // wait for EITHER the timeout or ALL the DNS requests
    if (any == wait)
    {
        throw new MyTimeoutException();
    }
    return tasks.Select(t => t.Result).ToArray(); // should also check for exceptions here
}
No tips really. A good portion of async operations are not even cancellable and, at least in my opinion, are not worth rewriting just to get cancellation semantics.

How to Purge a ThreadPool? [Microsoft System.Threading.ThreadPool]

Is it possible to purge a ThreadPool?
Remove items from the ThreadPool?
Anything like that?
ThreadPool.QueueUserWorkItem(GetDataThread);
RegisteredWaitHandle Handle = ThreadPool.RegisterWaitForSingleObject(CompletedEvent, WaitProc, null, 10000, true);
Any thoughts?
I recommend using the Task class (added in .NET 4.0) if you need this kind of behaviour. It supports cancellation, and you can have any number of tasks listening to the same cancellation token, which enables you to cancel them all with a single method call.
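As a hedged sketch of that approach (GetData here is a stand-in for your own work item): any number of tasks can observe the same CancellationToken, and one Cancel() call "purges" them all.

using System;
using System.Threading;
using System.Threading.Tasks;

var cts = new CancellationTokenSource();
var tasks = new[]
{
    Task.Factory.StartNew(() => GetData(cts.Token), cts.Token),
    Task.Factory.StartNew(() => GetData(cts.Token), cts.Token),
};

// Later, when the queued work is no longer needed:
cts.Cancel();

try
{
    Task.WaitAll(tasks);
}
catch (AggregateException)
{
    // cancelled tasks surface here as OperationCanceledException
}

static void GetData(CancellationToken token)
{
    while (true)
    {
        token.ThrowIfCancellationRequested(); // cooperative cancellation point
        // ... do one unit of work ...
    }
}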
Updated (non-4.0 solution):
You really only have two choices. One: implement your own event demultiplexer (this is far more complex than it appears, due to the 64-handle wait limitation); I can't recommend this - I had to do it once (in unmanaged code), and it was hideous.
That leaves the second choice: Have a signal to cancel the tasks. Naturally, RegisteredWaitHandle.Unregister can cancel the RWFSO part. The QUWI is more complex, but can be done by making the action aware of a "token" value. When the action executes, it first checks the token value against its stored token value; if they are different, then it shouldn't do anything.
One major thing to consider is race conditions. Just keep in mind that there is a race condition between cancelling an action and the ThreadPool executing it, so it is possible to see actions running after cancellation.
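Here is a hedged sketch of the token-value idea, including that caveat (all names are my assumptions): each queued action captures the current token, and Cancel() bumps it so stale actions return without doing anything.

using System;
using System.Threading;

class CancellableQueue
{
    private int _currentToken;

    public void Queue(Action work)
    {
        int token = Volatile.Read(ref _currentToken);
        ThreadPool.QueueUserWorkItem(_ =>
        {
            // If Cancel() ran while we sat in the queue, the tokens differ.
            if (Volatile.Read(ref _currentToken) != token) return;
            work(); // note the race: an action past this check still runs
        });
    }

    public void Cancel()
    {
        Interlocked.Increment(ref _currentToken); // invalidate all queued items
    }
}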
I have a blog post on this concept, which I call "asynchronous callback contexts". The CallbackContext type mentioned in the blog post is available in the Nito.Async library.
There's no interface for removing a queued item. However, nothing stops you from "poisoning" the delegate so that it returns immediately.
Edit:
Based on what Paul said, I'm thinking you might also want to consider a pipelined architecture, where you have a fixed number of threads reading from a blocking queue (like .NET 4.0's BlockingCollection on a ConcurrentQueue). This way, if you want to cancel items, you can just access the queue yourself.
Having said that, Stephen's advice about Task is likely better, in that it gives you all the control you would realistically want, without all the hard work that rolling your own pipelines involves. I mention this only for completeness.
The ThreadPool exists to help you manage your threads. You should not have to worry about purging it at all since it will make the best performance decisions on your behalf.
If you think you need tighter control over your threads then you could consider creating your own thread management class (similar to ThreadPool) but it would take a lot of work to match and exceed the functionality that ThreadPool has built in.
Take a look here at some of the ThreadPool optimizations and the ideas behind it.
For my second point, I found an article on Code Project that implements a "Cancelable Threadpool", probably for some of your own similar reasons. It would be a good place to start looking if you're going to write your own.

Events vs. Yield

I have a multithreaded application that spawns threads for several hardware instruments. Each thread is basically an infinite loop (for the lifetime of the application) that polls the hardware for new data, and activates an event (which passes the data) each time it collects something new. There is a single listener class that consolidates all these instruments, performs some calculations, and fires a new event with this calculation.
However, I'm wondering if, since there is a single listener, it would be better to expose an IEnumerable<> method off these instruments, and use a yield return to return the data, instead of firing events.
I'd like to see if anybody knows of differences between these two methods. In particular, I'm looking for the best reliability, the best ability to pause/cancel operation, the best fit for threading purposes, general safety, etc.
Also, with the second method is it possible to still run the IEnumerable loop on a separate thread? Many of these instruments are somewhat CPU-bound, so ensuring each one is on a different thread is vital.
This sounds like a very good use case for the Reactive Extensions. There's a little bit of a learning curve to it but in a nutshell, IObservable is the dual of IEnumerable. Where IEnumerable requires you to pull from it, IObservable pushes its values to the observer. Pretty much any time you need to block in your enumerator, it's a good sign you should reverse the pattern and use a push model. Events are one way to go but IObservable has much more flexibility since it's composable and thread-aware.
instrument.DataEvents
    .Where(x => x.SomeProperty == something)
    .BufferWithTime(TimeSpan.FromSeconds(1))
    .Subscribe(x => DoSomethingWith(x));
In the above example, DoSomethingWith(x) will be called whenever the subject (instrument) produces a DataEvent that has a matching SomeProperty and it buffers the events into batches of 1 second duration.
There's plenty more you could do such as merging in the events produced by other subjects or directing the notifications onto the UI thread, etc. Unfortunately documentation is currently pretty weak but there's some good information on Matthew Podwysocki's blog. (Although his posts almost exclusively mention Reactive Extensions for JavaScript, it's pretty much all applicable to Reactive Extensions for .NET as well.)
It's a close call, but I think I'd stick to the event model in this case, with the main decider being that future maintenance programmers are less likely to understand the yield concept. Also, yield means the code processing each hardware request is on the same thread as the code generating the requests for processing. That's bad, because it could mean your hardware has to wait on the consumer code.
And speaking of consumers, another option is a producer/consumer queue. Your instruments can all push into the same queue, and your single listener can then pop from it and do whatever it needs from there.
There's a pretty fundamental difference: push vs. pull. The pull model (yield) is the harder one to implement from the instrument-interface point of view, because you'll have to store data until the client code is ready to pull it. When you push, the client may or may not store, as it deems necessary.
But most practical implementations in multi-threading scenarios need to deal with the overhead of the inevitable thread context switch that's required to present data. And that's often done with pull, using a thread-safe bounded queue.
Stephen Toub blogs about a blocking queue which implements IEnumerable as an infinite loop using the yield keyword. Your worker threads could enqueue new data points as they appear and the calculation thread could dequeue them using a foreach loop with blocking semantics.
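A sketch along those lines using .NET 4.0's BlockingCollection (the DataPoint type and Calculate method are my assumptions):

using System.Collections.Concurrent;

var queue = new BlockingCollection<DataPoint>();

// Each instrument thread pushes as data arrives:
//     queue.Add(point);
// and on shutdown the producers call queue.CompleteAdding().

// The calculation thread consumes with blocking semantics:
foreach (var point in queue.GetConsumingEnumerable())
{
    // blocks until an item is available; the loop ends after CompleteAdding
    Calculate(point);
}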
I don't think there's much difference performance-wise between the event and yield approach. Yield is lazy evaluated, so it leaves an opportunity to signal the producing threads to stop. If your code is thoughtfully documented then maintenance ought to be a wash, too.
My preference is a third option, to use a callback method instead of an event (even though both involve delegates). Your producers invoke the callback each time they have data. Callbacks can return values, so your consumer can signal producers to stop or continue each time they check in with data.
This approach can give you places to optimize performance if you have a high volume of data. In your callback you lock on a neutral object and append incoming data to a collection. The runtime internally uses a ready queue on the lock object, so this can serve as your queuing point.
This lets you choose a collection, such as a List<T> with predefined capacity, that is O(1) for appending. You can also double-buffer your consumer, with your callback appending to the "left" buffer while you consolidate from the "right" one, and so forth. This minimizes the amount of producer blocking and associated missed data, which is handy for bursty data. You can also readily measure high-water marks and processing rates as you vary the number of threads.
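A hedged sketch of that callback-plus-double-buffer scheme (the Sample type, Consolidator class, and method names are all my assumptions):

using System.Collections.Generic;

class Consolidator
{
    private readonly object _gate = new object();
    private List<Sample> _front = new List<Sample>(4096); // producers append here
    private List<Sample> _back = new List<Sample>(4096);  // consumer drains here
    private volatile bool _stop;

    // Producers call this on their own threads; the return value lets the
    // consumer signal them to stop or continue, as described above.
    public bool OnData(Sample sample)
    {
        lock (_gate)
        {
            _front.Add(sample); // O(1) append thanks to preallocated capacity
        }
        return !_stop;
    }

    // Consumer: swap the buffers under the lock, then process outside it so
    // producers are blocked only for the duration of the swap.
    public void Consolidate()
    {
        List<Sample> batch;
        lock (_gate)
        {
            batch = _front;
            _front = _back;
            _back = batch;
        }
        foreach (var s in batch)
        {
            // ... perform calculations on s ...
        }
        batch.Clear();
    }

    public void Stop() { _stop = true; }
}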

Blocking a function call in C#

How do I block a function call in C#?
This function gets repeatedly called by different classes. I want to lock it up so that no one else can use it until I perform my current operation. Then I want to release it again.
Please help.
Thanks
Edit: I'm not actually using threads, but I am using timers that call the function repeatedly, and it's also called from different classes.
Use a lock:

private static object lockObject = new object();

lock (lockObject)
{
    // ... code goes here ...
}
If you are accessing this function from multiple threads you need some sort of synchronization.
If we are talking about single-threaded code, why not just use a boolean that stores if your function is currently "usable"?
Update:
As you added that you are not using threads, IMO a simple bool would suffice.
If you need more states, you could create an enum and set that accordingly.
Depending on the timers you use, you may be using threads. UI timers (System.Windows.Forms.Timer) only run on the UI thread, so they will not be called reentrantly from another thread. However, System.Threading timers use thread-pool threads, so your code can be called reentrantly on different threads - in which case you need some thread synchronisation (locks).
If you are not using threads, then your code can only be called reentrantly if it does something that causes itself to be called. The obvious case is when it calls itself, but less obvious are cases where applying some action causes a chain of events that results in your method being called again - a "feedback loop".
So, the first step is to ask "is my code being called reentrantly?"
The second step is to ask "why?"
Then, if your code is causing the reentrancy, ask "how can I change the operation of my code so it doesn't call itself?"
Or, if it's threaded, "where should I put locks to ensure that only one thread can execute this code at a time?"
You state that you are not using threads. If this is really the case, then it should not be possible for the function to be called multiple times unless your block of code does something which can cause reentrancy.
I notice one of the other answers here suggests using the lock statement. Note, this will only work if the real issue is that you are calling the function from multiple threads. If the issue is actually that you have a single thread but are doing something which results in reentering the function (e.g. pumping the UI message queue) then the lock statement will not prevent the code from executing multiple times - this is because the lock statement allows recursive locks on the same thread - i.e. it only disallows concurrent access from different threads.
So, noticing that you mention the use of timers, your particular situation is going to depend on which type of timers you are using - if you are using the timer from the System.Threading namespace, then the lock is probably what you will want (because you actually are executing the code on different threads since this timer uses the thread pool to execute your callbacks). If however, you are using the timers from the System.Windows.Forms namespace, then you are using a timer that always executes using the UI thread (via processing the UI message queue), and in this case it's most likely that the issue you have is some kind of reentrancy issue.
If the issue is reentrancy, and you are not calling the code from different threads then instead of a lock which won't work, you should be able to just use a boolean variable to record when you are already in the function, and disallow reentrancy.
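A minimal sketch of that boolean guard (the handler and DoWork names are my assumptions):

private bool _inHandler;

private void OnTimerTick(object sender, EventArgs e)
{
    if (_inHandler)
    {
        return; // re-entered via the message pump: skip this call
    }
    _inHandler = true;
    try
    {
        DoWork(); // may pump messages and trigger the timer again
    }
    finally
    {
        _inHandler = false;
    }
}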
Having said that, if you do have this type of reentrancy you may have a problem with your design.
Note, one thing with timers people sometimes fail to take into account is whether it is ok for the timer to be triggered again whilst you are still performing operations resulting from a previous triggering of the timer. In order to avoid this, I personally rarely set a timer to trigger repetitively - instead I set it to trigger once only, and then reenable the timer at the end of processing the operations performed in response to the trigger. This means that the time duration (without adjustment) becomes the interval between the end of processing the last triggered operation and the start of the next triggered operation.
Usually though this is actually what you want since you don't want the same operation to be triggered while the previous operation has not yet completed.
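A sketch of that one-shot pattern with a System.Windows.Forms.Timer (the field and method names are my assumptions); the interval then measures the gap between the end of one operation and the start of the next.

private void OnTick(object sender, EventArgs e)
{
    _timer.Stop(); // make sure we cannot be triggered again mid-operation
    try
    {
        DoWork();
    }
    finally
    {
        _timer.Start(); // re-arm only once the work is finished
    }
}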
The standard way to do this is to throw an InvalidOperationException, signalling to the client of your code that your class object is not yet in a state to be usable.
