C# .NET 4.5 multithreading: sharing variables

I am new to multithreading and have a question on sharing objects. I am doing this in C# .NET 4.5.
I have a List that contains objects of a class called Price. The Price class contains 12 properties: one of type DateTime and the others of type double.
I then run 4 tasks which all reference this List object. None of the tasks will change the List; they only read from it.
So given that the tasks are all referencing the same object but only reading from it, am I right to think that I will not need any locking?

Yes. A read does not modify anything for those types (and indeed most types), so it's safe.

As long as you do not have updates or adds going on from any other thread, you do not need locking. If an update or edit is happening on another thread, then do consider using locking.
ReaderWriterLockSlim provides an easy and efficient way to take separate reader and writer locks.
Moreover, as mentioned in the Thread Safety section of the documentation:
It is safe to perform multiple read operations on a List, but issues can occur if the collection is modified while it's being read.
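A minimal sketch of the scenario, assuming a hypothetical Price class with Bid/Ask property names (the question only says one DateTime and eleven doubles): four tasks read the same shared list with no locking, which is safe because nothing mutates the list while they run.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

class Price
{
    public DateTime Timestamp { get; set; }
    public double Bid { get; set; }   // hypothetical names; the question
    public double Ask { get; set; }   // only says "12 properties"
}

static class PriceReader
{
    // Four tasks summing bids from the same shared list: safe without
    // locks, because nothing mutates the list while the tasks run.
    public static double[] SumBidsInParallel(List<Price> prices)
    {
        var tasks = new Task<double>[4];
        for (int i = 0; i < tasks.Length; i++)
        {
            tasks[i] = Task.Run(() =>
            {
                double sum = 0;
                foreach (var p in prices) sum += p.Bid;
                return sum;
            });
        }
        Task.WaitAll(tasks);
        return Array.ConvertAll(tasks, t => t.Result);
    }
}
```

If a writer thread is ever introduced, wrap each read in ReaderWriterLockSlim.EnterReadLock()/ExitReadLock() and each write in EnterWriteLock()/ExitWriteLock().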

Related

Dictionary vs ConcurrentDictionary

I am trying to understand when to use Dictionary vs ConcurrentDictionary because of an issue I had with one of the changes I made to a Dictionary.
I had this Dictionary
private static Dictionary<string, Side> _strategySides = null;
In the constructor, I am adding some keys and values to the Dictionary I created like this
_strategySides.Add("Combination", Side.Combo);
_strategySides.Add("Collar", Side.Collar);
This code was fine and had been running in all environments for a while now. But when I added
_strategySides.Add("Diagonal", Side.Diagonal);
the code started to break with the exception “Index was outside the bounds of the array” on the dictionary. That led me to the concept of ConcurrentDictionary and its uses, and to the conclusion that I needed to choose ConcurrentDictionary over Dictionary in my case, since it's a multithreaded application.
So my question to all you gurus is: why didn't it throw an exception all this time, and why did it start when I added something to the dictionary? Any knowledge on this will be appreciated.
As you mentioned, you have a multithreaded application. Dictionary is not thread-safe, and somewhere in your code you are reading the dictionary at the same time as an item is being added to it, hence the “Index was outside the bounds of the array” exception.
This is mentioned in the documentation:
A Dictionary can support multiple readers concurrently, as long as the
collection is not modified. Even so, enumerating through a collection
is intrinsically not a thread-safe procedure. In the rare case where
an enumeration contends with write accesses, the collection must be
locked during the entire enumeration. To allow the collection to be
accessed by multiple threads for reading and writing, you must
implement your own synchronization. For a thread-safe alternative, see
ConcurrentDictionary.
Check out the answer to this question: c# Dictionary lookup throws "Index was outside the bounds of the array"
It seems that receiving this error on a dictionary is specific to a thread-safety violation. The linked answer provides two ways to deal with the issue; one is ConcurrentDictionary.
If I had to guess why it didn't happen before: you are adding the entries in the constructor of a static object, which means only one writer and no readers yet.
Your new entry is probably being added outside the constructor? Another thread could be reading while this write is being attempted, which is not allowed.
Dictionary is not thread-safe, and if you modify it while being accessed from multiple threads, all kinds of weird stuff can happen, including appearing to "work"... until it doesn't. Either protect it with a lock, or use the data structure that was specifically designed for multi-threaded use (i.e. ConcurrentDictionary).
So why did it "work" - that's very difficult to know definitively, but my bet would be on either simply not seeing the problem (i.e. the internal dictionary state was corrupted but you didn't notice it due to your usage patterns), or simply being "lucky" on execution timings (e.g. you could have inadvertently "synchronized" the threads through the debugger).
The point is: whatever it was, you cannot rely on it! You have to do the "right thing" even if the "wrong thing" appears to "work". That is the nature of multi-threaded programming.
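A sketch of the thread-safe alternative, assuming a hypothetical Side enum matching the question's snippets: ConcurrentDictionary supports concurrent readers and writers, so an add that would corrupt a plain Dictionary is safe here.

```csharp
using System.Collections.Concurrent;

enum Side { Combo, Collar, Diagonal }   // hypothetical, inferred from the question

static class StrategySides
{
    // Thread-safe: concurrent readers and writers are both fine.
    static readonly ConcurrentDictionary<string, Side> _strategySides =
        new ConcurrentDictionary<string, Side>();

    static StrategySides()
    {
        _strategySides.TryAdd("Combination", Side.Combo);
        _strategySides.TryAdd("Collar", Side.Collar);
        _strategySides.TryAdd("Diagonal", Side.Diagonal);
    }

    public static bool TryGetSide(string strategy, out Side side) =>
        _strategySides.TryGetValue(strategy, out side);
}
```

TryAdd returns false instead of throwing when the key already exists, which is usually what you want when several threads may race to insert the same key.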

Is there an equivalent for Guava Striped-Class in C#?

There are some cases where I really like using Guava's Striped class.
Is there an equivalent in C#?
It doesn't look like there is a direct equivalent, but there are some lockless thread-safe collection options (I'm not sure what you're trying to achieve, so I can't say if they will work for your scenario). Have a look at the System.Collections.Concurrent Namespace.
In particular, ConcurrentBag, ConcurrentQueue, ConcurrentStack, and ConcurrentDictionary all have different locking/lockless thread-safe strategies. Some are explained in this blog post.
You might be able to get what you want via the Partitioner class, although I am unsure of the implementation.
@Behrooz is incorrect in saying that all .NET Framework types use only a single lock for the entire list. Take a look at the source for ConcurrentDictionary. Line 71 suggests that this class is implemented using multiple locks.
If you really want to, you could write your own version. The source for the Guava Striped is: https://github.com/google/guava/blob/master/guava/src/com/google/common/util/concurrent/Striped.java
I think the best you can do is implement your own, because all .NET Framework types offer only one lock for the entire list.
To do that, you can use the GetHashCode() function, take it modulo (%) the number of stripes you want, and use the result as an index into a Tuple<TLock, List<T>>[], where TLock can be any kind of lock defined in the System.Threading namespace and T is the type you want to store/access.
With this you can decide how you want your stripes to be stored. There are choices like HashSet (inefficient in your case, since you already use some of the bits to calculate the stripe index), SortedSet, List, and Array.
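A minimal sketch of that idea, using plain monitor locks rather than a Tuple<TLock, List<T>>[] for brevity; the key-to-stripe mapping follows the GetHashCode-modulo scheme described above.

```csharp
using System;

// Guava-style Striped<Lock> sketch: N lock objects, and a key is
// mapped to a stripe by hash modulo the stripe count.
sealed class StripedLock
{
    private readonly object[] _stripes;

    public StripedLock(int stripeCount)
    {
        _stripes = new object[stripeCount];
        for (int i = 0; i < stripeCount; i++) _stripes[i] = new object();
    }

    // The same key always maps to the same lock; distinct keys may share one.
    public object GetLock(object key)
    {
        int index = (key.GetHashCode() & int.MaxValue) % _stripes.Length;
        return _stripes[index];
    }
}
```

Usage: lock only the stripe for the key you are touching, not the whole structure, e.g. `lock (striped.GetLock("some-key")) { /* mutate state for that key */ }`.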
By the way, thank you for the question; it's going to help me solve a problem I'm having.
Have you tried Tamarind from NuGet?
It's a C# port of Google's Guava library.
I think ConcurrentDictionary can achieve a similar result.
From its documentation:
All these operations are atomic and are thread-safe with regards to all other operations on the ConcurrentDictionary class. The only exceptions are the methods that accept a delegate, that is, AddOrUpdate and GetOrAdd. For modifications and write operations to the dictionary, ConcurrentDictionary uses fine-grained locking to ensure thread safety. (Read operations on the dictionary are performed in a lock-free manner.) However, delegates for these methods are called outside the locks to avoid the problems that can arise from executing unknown code under a lock. Therefore, the code executed by these delegates is not subject to the atomicity of the operation.
As you can see, read operations are lock-free. That allows threads to keep reading without being blocked while others are inserting, for example.

How should I share a large read-only List<T> with each Task.Factory.StartNew() method

Consider that I have a custom class called Terms, and that class contains a number of string properties. Then I create a fairly large (say 50,000-element) List<Terms> object. This List<Terms> only needs to be read, but it needs to be read by multiple instances of Task.Factory.StartNew (the number of instances could vary from 1 to hundreds).
How would I best pass that list into the long running task? Memory isn't too much of a concern as this is a custom application for a specific use on a specific server with plenty of memory. Should I reference it or should I just pass it off as a normal argument into the method doing the work?
Since you're passing a reference, it doesn't really matter how you pass it; the list itself won't be copied. As Ket Smith said, I would pass it as a parameter to the method you are executing.
The issue is List<T> is not entirely thread-safe. Reads by multiple threads are safe but a write can cause some issues:
It is safe to perform multiple read operations on a List, but issues can occur if the collection is modified while it’s being read. To ensure thread safety, lock the collection during a read or write operation.
From List<T>
You say your list is read-only, so that may be a non-issue, but a single unpredictable change could lead to unexpected behavior, and so it's bug-prone.
I recommend using ImmutableList<T>, which is inherently thread-safe since it's immutable.
So long as you don't try to copy it into each separate task, it shouldn't make much difference: more a matter of coding style than anything else. Each task will still be working with the same list in memory: just a different reference to the same underlying list.
That said, purely as a matter of coding style and maintainability, I'd probably try to pass it in as a parameter to whatever method you're executing in your Task.Factory.StartNew() (or better yet, Task.Run() - see here). That way, you've clearly called out your task's dependencies, and if you decide that you need to get the list from some other place, it's clearer what you've got to change. (But you could probably find 20 places in my own code where I haven't followed that rule: sometimes I go with what's easier for me now rather than what's likely to be easier for the me of six months from now.)
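A sketch of that style, assuming a hypothetical Terms class with a Name property and using ImmutableList<T> (from the System.Collections.Immutable package) as recommended above; the list is passed as an explicit parameter so the task's dependency is visible.

```csharp
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Linq;
using System.Threading.Tasks;

class Terms
{
    public string Name { get; set; }   // hypothetical; the question only
                                       // says "a number of string properties"
}

static class TermsProcessor
{
    // The list is an explicit parameter: the task's dependency is visible,
    // and only a reference is passed, never a copy of the 50,000 elements.
    public static Task<int> CountMatchesAsync(ImmutableList<Terms> terms, string needle) =>
        Task.Run(() => terms.Count(t => t.Name.Contains(needle)));
}
```

Any number of these tasks can run concurrently over the same ImmutableList<Terms>, since no code path can mutate it.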

Sending messages between threads in C#

How can I send and receive messages between threads?
One solution would be to share a concurrent queue, for example ConcurrentQueue<T>. This will allow you to enqueue an object from one thread and have the other thread (or other threads) dequeue from the queue. As it is a generic solution, you may pass strongly typed items: anything from string to Action will do, or your own custom message class, of course.
There is just one limitation with this approach: the ConcurrentQueue class is only available from .NET 4.0 onwards. If you need it for a previous version of .NET, you will have to look for a third-party library. For example, you can take the source for ConcurrentQueue from Mono.
In general, these queues work over a linked list and resort to optimistic concurrency control, using spinning for synchronization. As far as I know, this is the state of the art for concurrent queues of variable size. If you know the message load beforehand, you can try a fixed-size approach or a solution that favors enqueue and dequeue over growing (that would be an array-based queue).
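A minimal producer/consumer sketch with ConcurrentQueue<T>; note that a real application would more likely wrap the queue in a BlockingCollection<T> so the consumer blocks instead of spin-waiting.

```csharp
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

static class MessagePassing
{
    public static int Run(int messageCount)
    {
        var queue = new ConcurrentQueue<string>();

        var producer = Task.Run(() =>
        {
            for (int i = 0; i < messageCount; i++)
                queue.Enqueue($"message {i}");
        });

        int received = 0;
        var consumer = Task.Run(() =>
        {
            while (received < messageCount)
            {
                if (queue.TryDequeue(out _)) received++;
                else Thread.Yield();   // nothing yet; give up the time slice
            }
        });

        Task.WaitAll(producer, consumer);
        return received;
    }
}
```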
Full disclosure (per the FAQ): I'm the author of one of those third-party libraries. My libraries (available on NuGet) include a backport of ConcurrentQueue for old versions of .NET, based on a custom implementation. You can find the underlying structure under Theraot.Collections.ThreadSafe.SafeQueue; it is a linked list of arrays (which are kept in an object pool). By doing it this way, we do not need to copy the arrays to grow (we just add another node to the list), and we do not need to rely on synchronization mechanisms as often (because adding or removing an item does not modify the list often).
Note: this question used to link to HashBucket, which is hosted on another repository, and was my old solution for the problem. That project is discontinued, please use the version I mention above.
This is an old question, but still a relevant topic...
A producer/consumer approach may be used as a possible solution for a problem like this. Starting with version 3.0, .NET Core has a namespace with tools to deal with this in a simple way.
Take a look at System.Threading.Channels:
https://learn.microsoft.com/en-us/dotnet/api/system.threading.channels
https://devblogs.microsoft.com/dotnet/an-introduction-to-system-threading-channels/
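A short sketch with System.Threading.Channels (available from .NET Core 3.0, or via the NuGet package of the same name): one task writes messages, and the reader consumes them until the writer completes.

```csharp
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

static class ChannelExample
{
    public static async Task<List<string>> RunAsync()
    {
        var channel = Channel.CreateUnbounded<string>();

        var producer = Task.Run(async () =>
        {
            await channel.Writer.WriteAsync("hello");
            await channel.Writer.WriteAsync("world");
            channel.Writer.Complete();   // signals "no more messages"
        });

        var received = new List<string>();
        // ReadAllAsync yields items until the writer completes.
        await foreach (var msg in channel.Reader.ReadAllAsync())
            received.Add(msg);

        await producer;
        return received;
    }
}
```

Bounded channels (Channel.CreateBounded) additionally let you apply back-pressure when the producer outpaces the consumer.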
I think you are talking about joining between threads?
One way to do it is to create a class that has the method you will run on the thread.
That class can have more than just the method; it can have members that the parent thread has access to.
Given that, the parent can read from and write to those members, providing a means of communication between the two threads throughout the thread's lifetime.
There are many thread synchronization primitives you can use in .NET, such as EventWaitHandle, Mutex, Semaphore, etc. Here is a useful link on MSDN that covers them: https://learn.microsoft.com/en-us/dotnet/standard/threading/overview-of-synchronization-primitives
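A minimal signaling sketch with AutoResetEvent (one of the EventWaitHandle types mentioned above): the worker thread blocks until the parent signals that data is ready.

```csharp
using System.Threading;

static class Signaling
{
    public static int Run()
    {
        var dataReady = new AutoResetEvent(false);
        int payload = 0;
        int observed = -1;

        var worker = new Thread(() =>
        {
            dataReady.WaitOne();   // block until the other thread signals
            observed = payload;    // safe to read after the signal
        });
        worker.Start();

        payload = 42;
        dataReady.Set();           // wake the waiting thread
        worker.Join();
        return observed;
    }
}
```

WaitOne/Set also act as memory barriers, so the worker is guaranteed to see the payload written before the signal.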

Why use Hashtable.Synchronized?

From the MSDN documentation:
"Synchronized supports multiple writing threads, provided that no threads are reading the Hashtable. The synchronized wrapper does not provide thread-safe access in the case of one or more readers and one or more writers."
Source:
http://msdn.microsoft.com/en-us/library/system.collections.hashtable.synchronized.aspx
It sounds like I still have to use locks anyways, so my question is why would we use Hashtable.Synchronized at all?
For the same reason there are different levels of DB transaction. You may care that writes are guaranteed, but not mind reading stale/possibly bad data.
EDIT: I note that their specific example is an enumerator. They can't handle this case in their wrapper, because if you break out of the enumeration early, the wrapper class would have no way to know that it can release its lock.
Think instead of the case of a counter. Multiple threads can increase a value in the table, and you want to display the value of the count. It doesn't matter if you display 1,200,453 when the count is actually 1,200,454; you just need it to be close. However, you don't want the data to be corrupt. This is a case where thread-safety is important for writes, but not for reads.
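A sketch of that counter, with a hypothetical "hits" key: the synchronized wrapper serializes individual writes, but a compound read-modify-write such as an increment still needs a lock (which is exactly the question's point); a display thread could read the value without one and tolerate being a few increments behind.

```csharp
using System.Collections;
using System.Threading.Tasks;

static class CounterTable
{
    public static int Run(int iterations)
    {
        Hashtable counts = Hashtable.Synchronized(new Hashtable());

        Parallel.For(0, iterations, i =>
        {
            // Increment is read-modify-write: two "safe" operations that
            // still race without an explicit lock around the pair.
            lock (counts.SyncRoot)
            {
                int current = counts.ContainsKey("hits") ? (int)counts["hits"] : 0;
                counts["hits"] = current + 1;
            }
        });

        // A display thread could read counts["hits"] here without a lock
        // and accept a slightly stale value.
        return (int)counts["hits"];
    }
}
```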
For the case where you can guarantee that no reader will access the data structure while you are writing to it (or where you don't care about reading wrong data). For example, where the structure is not continually being modified, but is a one-time calculation that you'll later have to access, yet large enough to warrant many threads writing to it.
You would need it when you are foreach-ing over a Hashtable on one thread (reads) while there exist other threads that may add/remove items to/from it (writes).
