Handling concurrency at group level rather than application level - c#

I would like to handle a concurrency issue in our API. The situation is that we can get requests from multiple users for the same group, and there can be multiple groups as well. I think the solution below should work, but please correct me if I'm wrong:
// This will be a singleton across the API
ConcurrentDictionary<string, string> dict = new ConcurrentDictionary<string, string>();

if (dict.ContainsKey(groupId))
{
    throw new Exception("request already accepted");
}
else
{
    // Thinking TryAdd is an atomic (thread-locked) operation, or I could use a lock statement here
    if (dict.TryAdd(groupId, "Added") == false)
    {
        throw new Exception("request already accepted");
    }
    // continue the original logic
}
Every 10 minutes we will clean the older keys out of the dictionary (this cleanup should not block anyone, since it only touches old, already-used keys). Does ConcurrentDictionary lock at the key level rather than at the dictionary level, so that we only block requests for a particular group instead of blocking all requests?
One quick solution is a lock wrapper around the dictionary's get and add operations, but that would stop all requests from proceeding; we want to block at the group level only. Any help is greatly appreciated.

Adding an item to a ConcurrentDictionary is a very fast operation, and you are not making threads wait for the first one to finish: you throw right away if they cannot claim the entry.
That makes me think that double-checked locking is probably not needed in your case.
So, I would simply do your inner check without the outer one:
if (dict.TryAdd(groupId, "Added") == false)
{
    throw new Exception("request already accepted");
}
If you get a flood of requests after the first one, then I would keep the pre-check as you have it, since ContainsKey does not lock.
Another interesting topic is how you are going to clean this up.
Maybe you could do all this locking in an IDisposable object that removes itself from the dictionary at dispose time. For example:
// NOTE: THIS IS JUST PSEUDOCODE
// In your controller action, you can simply do this...
//
public void SomeControllerAction(int groupId)
{
    using (var operation = new GroupOperation(groupId))
    {
        // In here I am sure I am the only operation for this group
    }
    // In here I am sure that the operation got removed from the dictionary
}

// This class hides all the complexity of the concurrent dictionary
//
public class GroupOperation : IDisposable
{
    private static readonly ConcurrentDictionary<int, int> SingletonDictionary = new ConcurrentDictionary<int, int>();

    private readonly int groupId;

    public GroupOperation(int groupId)
    {
        this.groupId = groupId;
        if (!SingletonDictionary.TryAdd(groupId, 1))
        {
            throw new Exception("Sorry, operation in progress for your group");
        }
    }

    public void Dispose()
    {
        SingletonDictionary.TryRemove(groupId, out _);
    }
}
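As for the 10-minute cleanup mentioned in the question: one option is to store the time each key was accepted and sweep stale entries with a timer. A sketch under assumed intervals (TryRemove only contends on the affected key, so the sweep does not block unrelated groups):

using System;
using System.Collections.Concurrent;
using System.Threading;

public static class GroupGate
{
    // Value is the time the group was accepted, so stale entries can be swept.
    private static readonly ConcurrentDictionary<string, DateTime> Accepted =
        new ConcurrentDictionary<string, DateTime>();

    // Sweep once a minute, dropping keys older than 10 minutes.
    private static readonly Timer Sweeper =
        new Timer(_ => Sweep(), null, TimeSpan.FromMinutes(1), TimeSpan.FromMinutes(1));

    public static bool TryAccept(string groupId) =>
        Accepted.TryAdd(groupId, DateTime.UtcNow);

    private static void Sweep()
    {
        var cutoff = DateTime.UtcNow.AddMinutes(-10);
        foreach (var entry in Accepted)
        {
            // Enumerating a ConcurrentDictionary while removing entries is allowed.
            if (entry.Value < cutoff)
            {
                Accepted.TryRemove(entry.Key, out _);
            }
        }
    }
}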


Batch process all items in ConcurrentBag

I have the following use case. Multiple threads are creating data points which are collected in a ConcurrentBag. Every x ms a single consumer thread looks at the data points that came in since the last time and processes them (e.g. count them + calculate average).
The following code more or less represents the solution that I came up with:
private static ConcurrentBag<long> _bag = new ConcurrentBag<long>();

static void Main()
{
    Task.Run(() => Consume());
    var producerTasks = Enumerable.Range(0, 8).Select(i => Task.Run(() => Produce()));
    Task.WaitAll(producerTasks.ToArray());
}

private static void Produce()
{
    for (int i = 0; i < 100000000; i++)
    {
        _bag.Add(i);
    }
}

private static void Consume()
{
    while (true)
    {
        var oldBag = _bag;
        _bag = new ConcurrentBag<long>();
        var average = oldBag.DefaultIfEmpty().Average();
        var count = oldBag.Count;
        Console.WriteLine($"Avg = {average}, Count = {count}");
        // Wait x ms
    }
}
Is a ConcurrentBag the right tool for the job here?
Is switching the bags the right way to achieve clearing the list for new data points and then processing the old ones?
Is it safe to operate on oldBag or could I run into trouble when I iterate over oldBag and a thread is still adding an item?
Should I use Interlocked.Exchange() for switching the variables?
EDIT
I guess the above code was not really a good representation of what I'm trying to achieve. So here is some more code to show the problem:
public class LogCollectorTarget : TargetWithLayout, ILogCollector
{
    private readonly List<string> _logMessageBuffer;

    public LogCollectorTarget()
    {
        _logMessageBuffer = new List<string>();
    }

    protected override void Write(LogEventInfo logEvent)
    {
        var logMessage = Layout.Render(logEvent);
        lock (_logMessageBuffer)
        {
            _logMessageBuffer.Add(logMessage);
        }
    }

    public string GetBuffer()
    {
        lock (_logMessageBuffer)
        {
            var messages = string.Join(Environment.NewLine, _logMessageBuffer);
            _logMessageBuffer.Clear();
            return messages;
        }
    }
}
The class's purpose is to collect logs so they can be sent to a server in batches. Every x seconds GetBuffer is called; it should return the current log messages and clear the buffer for new ones. It works with locks, but as they are quite expensive I don't want to lock on every logging operation in my program. That's why I wanted to use a ConcurrentBag as the buffer. But then I still need to switch or clear it when I call GetBuffer, without losing any log messages that arrive during the switch.
Since you have a single consumer, you can get away with a simple ConcurrentQueue, without swapping collections:
public class LogCollectorTarget : TargetWithLayout, ILogCollector
{
    private readonly ConcurrentQueue<string> _logMessageBuffer;

    public LogCollectorTarget()
    {
        _logMessageBuffer = new ConcurrentQueue<string>();
    }

    protected override void Write(LogEventInfo logEvent)
    {
        var logMessage = Layout.Render(logEvent);
        _logMessageBuffer.Enqueue(logMessage);
    }

    public string GetBuffer()
    {
        // How many messages should we dequeue?
        var count = _logMessageBuffer.Count;
        var messages = new StringBuilder();
        while (count > 0 && _logMessageBuffer.TryDequeue(out var message))
        {
            messages.AppendLine(message);
            count--;
        }
        return messages.ToString();
    }
}
If memory allocations become an issue, you can instead dequeue them to a fixed-size array and call string.Join on it. This way, you're guaranteed to do only two allocations (whereas the StringBuilder could do many more if the initial buffer isn't properly sized):
public string GetBuffer()
{
    // How many messages should we dequeue?
    var count = _logMessageBuffer.Count;
    var buffer = new string[count];
    for (int i = 0; i < count; i++)
    {
        _logMessageBuffer.TryDequeue(out var message);
        buffer[i] = message;
    }
    return string.Join(Environment.NewLine, buffer);
}
Is a ConcurrentBag the right tool for the job here?
It's the right tool for a job; whether it's the right tool for this job depends on what you are trying to do, and why. The example you have given is very simplistic and without any context, so it's hard to tell.
Is switching the bags the right way to achieve clearing the list for new data points and then processing the old ones?
The answer is no, for probably many reasons. What happens if a thread writes to the bag while you are switching it?
Is it safe to operate on oldBag or could I run into trouble when I iterate over oldBag and a thread is still adding an item?
No. You have just copied the reference; by itself this achieves nothing.
Should I use Interlocked.Exchange() for switching the variables?
Interlocked methods are great things; however, they will not help you with your current problem on their own, as they provide atomic operations on individual variables (numeric types and references). There are several concepts being mixed up here, and it would be worth looking at more thread-safety examples.
However, let's point you in the right direction. Forget about ConcurrentBag and those fancy classes. My advice is to start simple and use locking, so you understand the nature of the problem.
If you want multiple tasks/threads to access a list, you can easily use the lock statement and guard access to the list/array so other nasty threads aren't modifying it.
Obviously the code you have written is a contrived example: you are just adding consecutive numbers to a collection and getting another thread to average them. This hardly needs to be producer/consumer at all, and would make more sense as synchronous code.
At this point I would point you to architectures better suited to implementing this pattern, e.g. TPL Dataflow, but I suspect this is a learning exercise, and you really need to do more reading on multithreading and try more examples before we can truly help you with a concrete problem.
It works with locks, but as they are quite expensive I don't want to lock on every logging operation in my program.
Acquiring an uncontended lock is actually quite cheap. Quoting from Joseph Albahari's book:
You can expect to acquire and release a lock in as little as 20 nanoseconds on a 2010-era computer if the lock is uncontended.
Locking becomes expensive when it is contended. You can minimize the contention by reducing the work inside the critical region to the absolute minimum. In other words don't do anything inside the lock that can be done outside the lock. In your second example the method GetBuffer does a String.Join inside the lock, delaying the release of the lock and increasing the chances of blocking other threads. You can improve it like this:
public string GetBuffer()
{
    string[] messages;
    lock (_logMessageBuffer)
    {
        messages = _logMessageBuffer.ToArray();
        _logMessageBuffer.Clear();
    }
    return String.Join(Environment.NewLine, messages);
}
But it can be optimized even further. You could use the technique of your first example and, instead of clearing the existing List<string>, just swap it with a new list (note that the _logMessageBuffer field must no longer be declared readonly for this to compile):
// Note: because the field is now reassigned, locking on _logMessageBuffer itself is
// no longer safe; Write and GetBuffer should lock on a dedicated object instead, e.g.
// private readonly object _lock = new();
public string GetBuffer()
{
    List<string> oldList;
    lock (_lock)
    {
        oldList = _logMessageBuffer;
        _logMessageBuffer = new();
    }
    return String.Join(Environment.NewLine, oldList);
}
Starting from .NET Core 3.0, the Monitor class has the property Monitor.LockContentionCount, that returns the number of times there was contention at the entry point of a lock. You could watch the delta of this property every second, and see if the number is concerning. If you get single-digit numbers, there is nothing to worry about.
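For example, a rough sampling loop (the one-second interval and console output are arbitrary choices for illustration):

using System;
using System.Threading;
using System.Threading.Tasks;

// Watch how many lock contentions occur per second, process-wide.
_ = Task.Run(async () =>
{
    long previous = Monitor.LockContentionCount;
    while (true)
    {
        await Task.Delay(TimeSpan.FromSeconds(1));
        long current = Monitor.LockContentionCount;
        Console.WriteLine($"Lock contentions in the last second: {current - previous}");
        previous = current;
    }
});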
Touching some of your questions:
Is a ConcurrentBag the right tool for the job here?
No. The ConcurrentBag<T> is a very specialized collection intended for mixed producer-consumer scenarios (where the same thread both adds and takes items), mainly object pools. You don't have such a scenario here. A ConcurrentQueue<T> is preferable to a ConcurrentBag<T> in almost all scenarios.
Should I use Interlocked.Exchange() for switching the variables?
Only if the collection was immutable. If the _logMessageBuffer was an ImmutableQueue<T>, then it would be excellent to swap it with Interlocked.Exchange. With mutable types you have no idea if the old collection is still in use by another thread, and for how long. The operating system can suspend any thread at any time for a duration of 10-30 milliseconds or even more (demo). So it's not safe to use lock-free techniques. You have to lock.
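For completeness, here is a sketch of what that immutable swap could look like (assuming a dependency on the System.Collections.Immutable package, which the original answer does not spell out):

using System;
using System.Collections.Immutable;
using System.Threading;

public class ImmutableLogBuffer
{
    private ImmutableQueue<string> _messages = ImmutableQueue<string>.Empty;

    public void Add(string message)
    {
        // CAS loop: retries until our enqueued snapshot wins the race.
        ImmutableInterlocked.Update(ref _messages, (q, m) => q.Enqueue(m), message);
    }

    public string GetBuffer()
    {
        // Atomically grab everything collected so far and reset the buffer.
        ImmutableQueue<string> drained =
            Interlocked.Exchange(ref _messages, ImmutableQueue<string>.Empty);
        return string.Join(Environment.NewLine, drained);
    }
}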

How to ensure data synchronization across threads within a "safe" area (e.g. not in a critical section) without locking everything

We are using a proprietary API that requires synchronization of data at some point.
I've thought about some ways of ensuring data consistency but am eager to get more input on better solutions.
Here is a long-running Task outlining the API syncing:
new Task(() =>
{
    while (true)
    {
        // Here other threads can access any API object (that's fine)

        API.CriticalOperationStart(); // Between start and end no API object may be used
        API.CriticalOperationEnd();

        // Here other threads can access any API object (that's fine too)
    }
}, TaskCreationOptions.LongRunning).Start();
This is a separate task that actually does some data syncing.
The area between Start and End is critical. No other API call may be done while the API is in this critical step.
Here are some unguarded threads using distinct API objects:

// multiple calls to different API objects should not be exclusive

// OtherThread1:
APIObject1.SetData(42);

// OtherThread2:
APIObject2.SetData(43);
Constraints:
No APIObject method may be called while the API is in the critical section.
Both SetData calls are allowed to happen simultaneously; they do not interfere with each other, only with the critical section.
Generally speaking, accessing one APIObject from multiple threads is not thread-safe, but accessing multiple APIObjects does not interfere with the API except during the critical section.
The critical section must never be executed while any APIObject method is in use.
Guarding access to a single APIObject from multiple threads is not required.
The trivial approach
Use a lock object, and lock both the critical section and every call to an API object.
This would work, but it creates a lot of unnecessary contention, because only one APIObject could then be accessed at a time as well.
Concurrent container of Actions
Use a single instance of a concurrent container where each modification of an APIObject is placed into a thread-safe container, and execute the queued actions in the task above by draining the container outside the critical section (sketched below). This is not a consumer pattern: waiting for new entries of the container must not block the task, since the critical section must be executed periodically.
This imposes some drawbacks. Closure issues when capturing contexts could be one. Another is that reading from an APIObject returns stale data for as long as the actions in the container have not been executed. Even worse, if the creation of an APIObject is put in the container, subsequent code may assume it has already been created.
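Concretely, this approach could look something like the sketch below (the API class here is only a stand-in for the proprietary one):

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Stand-in for the proprietary API from the question.
public static class API
{
    public static void CriticalOperationStart() { }
    public static void CriticalOperationEnd() { }
}

public class ApiSyncLoop
{
    // Producers enqueue their API mutations here instead of calling API objects directly.
    private readonly ConcurrentQueue<Action> _pendingApiWork = new ConcurrentQueue<Action>();

    // Producer side, e.g.: syncLoop.Post(() => APIObject1.SetData(42));
    public void Post(Action apiWork) => _pendingApiWork.Enqueue(apiWork);

    public void Run()
    {
        Task.Factory.StartNew(() =>
        {
            while (true)
            {
                // Drain all queued API calls while we are outside the critical section.
                while (_pendingApiWork.TryDequeue(out Action work))
                {
                    work();
                }

                API.CriticalOperationStart(); // no API object may be used in here
                API.CriticalOperationEnd();
            }
        }, TaskCreationOptions.LongRunning);
    }
}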
Make something up with Wait Handles and atomic increments
Every APIObject access could be guarded with a ManualResetEvent. The critical section would wait for the signal to be set by the APIObjects; the signal would only be set once all calls to APIObjects have finished (some sort of atomic increment/decrement around accessing APIObjects).
This sounds like a great recipe for deadlocks. It may also lock out the critical section for long periods when continuous APIObject calls prevent the signal from ever being set.
It does not solve the problem that APIObjects may not be accessed during the critical section, since this construct only guards in the other direction.
It requires additional locking (e.g. Monitor.IsEntered on the critical section, so as not to lock out simultaneous calls to distinct APIObjects).
=> An awful approach that makes a complex situation even more complex.
If copying an APIObject is relatively inexpensive (or moderately expensive and you don't sync very often), then you can put the objects in a wrapper that contains a singleton global_timestamp and a local_timestamp. When you update an object, first check the global timestamp: if global_timestamp == long.MaxValue, destructively update the object in place; likewise if global_timestamp != long.MaxValue but global_timestamp == local_timestamp. However, if global_timestamp != long.MaxValue and global_timestamp != local_timestamp, return an updated copy of the object and set local_timestamp = global_timestamp. When you perform a sync, use an interlocked update to set global_timestamp = DateTime.UtcNow.ToBinary(), and when the sync is complete set it back to long.MaxValue. This way the rest of the program doesn't have to pause while a sync is performed, and the sync sees consistent data.
// APIObject provided to you
public class APIObject
{
    private string foo;

    public void SetFoo(string foo)
    {
        this.foo = foo;
    }

    // Assumed: some way to copy the object exists or can be written.
    public APIObject Copy()
    {
        return new APIObject { foo = this.foo };
    }
}

// Global timestamp: read-only version for wrappers, read-write version for the sync task
public class GlobalTimestamp
{
    protected long timestamp = long.MaxValue;

    public long GetTimestamp()
    {
        return Interlocked.Read(ref timestamp);
    }
}

public class GlobalTimestampRW : GlobalTimestamp
{
    public void StartSync(long newTimestamp)
    {
        long value = Interlocked.CompareExchange(ref timestamp, newTimestamp, long.MaxValue);
        if (value != long.MaxValue) throw new InvalidOperationException(); // somebody else called this method already
    }

    public void EndSync(long oldTimestamp)
    {
        long value = Interlocked.CompareExchange(ref timestamp, long.MaxValue, oldTimestamp);
        if (value != oldTimestamp) throw new InvalidOperationException(); // somebody else called this method already
    }
}

// Wrapper
public class APIWrapper
{
    private APIObject apiObject;
    private GlobalTimestamp globalTimestamp;
    private long localTimestamp = long.MinValue;

    public APIObject SetFoo(string foo)
    {
        long tempGlobalTimestamp = globalTimestamp.GetTimestamp();
        if (tempGlobalTimestamp == long.MaxValue || tempGlobalTimestamp == localTimestamp)
        {
            apiObject.SetFoo(foo);
            return apiObject;
        }
        else
        {
            apiObject = apiObject.Copy();
            apiObject.SetFoo(foo);
            localTimestamp = tempGlobalTimestamp;
            return apiObject;
        }
    }
}

GlobalTimestampRW globalTimestamp;

new Task(() =>
{
    while (true)
    {
        long timestamp = DateTime.UtcNow.ToBinary();
        globalTimestamp.StartSync(timestamp);

        API.CriticalOperationStart(); // Between start and end no API object may be used
        API.CriticalOperationEnd();

        globalTimestamp.EndSync(timestamp);
    }
}, TaskCreationOptions.LongRunning).Start();
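Worth noting, beyond what the answer above covers: the constraints form the classic many-readers/one-writer shape, which ReaderWriterLockSlim models directly. A minimal sketch under that assumption, treating every APIObject call as a "reader" and the critical section as the "writer":

using System;
using System.Threading;

public static class ApiGuard
{
    // Many APIObject calls may run concurrently; the critical section is exclusive.
    private static readonly ReaderWriterLockSlim Gate = new ReaderWriterLockSlim();

    public static void UseApiObject(Action apiCall)
    {
        Gate.EnterReadLock(); // concurrent with other API calls, excluded during sync
        try { apiCall(); }
        finally { Gate.ExitReadLock(); }
    }

    public static void RunCriticalSection(Action criticalSection)
    {
        Gate.EnterWriteLock(); // waits until no API call is in flight
        try { criticalSection(); }
        finally { Gate.ExitWriteLock(); }
    }
}

Producers would wrap each APIObject call in UseApiObject, and the sync task would wrap the start/end pair in RunCriticalSection.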

Get the next values I'm going to need on background threads before I need them

I'm hoping to find some advice on the best way to fetch a batch of id values (like database identity values) before I need them. I have a number of classes that require a unique id (int), and what I'd like to do is fetch the next available id (per class, per server) and have it cached locally, ready to use. When an id is taken, I want to get the next one ready, and so on.
I've produced some code to demonstrate what I am trying to do. The code is terrible (it should contain locks, etc.) but I think it gets the point across. Losing the odd id is not a problem; a duplicate id is. I'm happy with the guts of GetNextIdAsync - it calls a proc
this.Database.SqlQuery<int>("EXEC EntityNextIdentityValue @Key",
    new SqlParameter("Key", key)).First();
on SQL Server that uses sp_getapplock to ensure each returned value is unique (and incremental).
static class ClassId
{
    static private Dictionary<string, int> _ids = new Dictionary<string, int>();
    static private Dictionary<string, Thread> _threads = new Dictionary<string, Thread>();

    static ClassId()
    {
        // get the first NextId for all known classes
        StartGetNextId("Class1");
        StartGetNextId("Class2");
        StartGetNextId("Class3");
    }

    static public int NextId(string key)
    {
        // wait for a current call for nextId to finish
        while (_threads.ContainsKey(key)) { }
        // get the current nextId
        int nextId = _ids[key];
        // start the call for the next nextId
        StartGetNextId(key);
        // return the current nextId
        return nextId;
    }

    static private void StartGetNextId(string key)
    {
        _threads.Add(key, new Thread(() => GetNextIdAsync(key)));
        _threads[key].Start();
    }

    static private void GetNextIdAsync(string key)
    {
        // call the long running task to get the next available value
        Thread.Sleep(1000);
        if (_ids.ContainsKey(key)) _ids[key] += 1;
        else _ids.Add(key, 1);
        _threads.Remove(key);
    }
}
My question is: what is the best way to always have the next value I'm going to need before I need it? How should the class be arranged, and where should the locks be? E.g. lock inside GetNextIdAsync(), add the new thread but don't start it, and change StartGetNextId() to call .Start()?
You should have your database generate the identity values, by marking that column appropriately. You can retrieve that value with SCOPE_IDENTITY or similar.
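A hedged sketch of that approach (the table and column names are made up for illustration; the INSERT and the SCOPE_IDENTITY() read must run in the same batch):

using System.Data.SqlClient;

public static int InsertAndGetId(string connectionString, string name)
{
    const string sql =
        "INSERT INTO dbo.Entities (Name) VALUES (@Name); " +
        "SELECT CAST(SCOPE_IDENTITY() AS int);";

    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand(sql, connection))
    {
        command.Parameters.AddWithValue("@Name", name);
        connection.Open();
        // ExecuteScalar returns the first column of the first row: the new identity.
        return (int)command.ExecuteScalar();
    }
}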
The main failings of your implementation are the busy wait in NextId and accessing the Dictionary simultaneously from multiple threads. The simplest solution would be to use a BlockingCollection, as ohadsc suggests below. You'll need to anticipate the case where your database goes down and you can't get more ids - you don't want to deadlock your application. So you would want to use the Take() overload that accepts a CancellationToken, which you would cancel in the event that accessing the database fails.
This seems like a good application for a producer-consumer pattern.
I'm thinking something like:
private ConcurrentDictionary<string, int> _ids;
private ConcurrentDictionary<string, Thread> _threads;
private Task _producer;
private Task _consumer;
private CancellationTokenSource _cancellation;

private void StartProducer()
{
    _producer = Task.Factory.StartNew(() =>
    {
        while (_cancellation.Token.IsCancellationRequested == false)
        {
            // GetNextKeyValuePair() is a placeholder for your database call
            var next = GetNextKeyValuePair();
            _ids.TryAdd(next.Key, next.Value);
        }
    });
}

private void StartConsumer()
{
    _consumer = Task.Factory.StartNew(() =>
    {
        while (_cancellation.Token.IsCancellationRequested == false)
        {
            // UseNextId() is a placeholder for whatever consumes the ids
            foreach (var pair in _ids)
            {
                UseNextId(pair.Value);
                _ids.TryRemove(pair.Key, out _);
            }
        }
    });
}
A few things to point out...
Firstly, and you probably know this already, it's very important to use thread-safe collections like ConcurrentDictionary or BlockingCollection instead of a plain Dictionary or List. If you don't do this, bad things will happen, people will die and babies will cry.
Second, you might need something a little less ham-fisted than the basic CancellationTokenSource; that's just what I'm used to from my service programming. The point is to have some way to cancel these things so you can shut them down gracefully.
Thirdly, consider throwing sleeps in there to keep the loops from pounding the processor too hard.
The particulars will vary based on how fast you can generate these ids as opposed to how fast you consume them. My code gives absolutely no guarantee that you will have the ID you want before the consumer asks for it, if the consumer is running at a much higher speed than the producer. However, this is a decent, albeit basic, way to organize preparing this sort of data concurrently.
You could use a BlockingCollection for this. Basically you'll have a thread pumping new IDs into a buffer:
BlockingCollection<int> _queue = new BlockingCollection<int>(BufferSize);

void Init()
{
    Task.Factory.StartNew(PopulateIdBuffer, TaskCreationOptions.LongRunning);
}

void PopulateIdBuffer()
{
    int id = 0;
    while (true)
    {
        Thread.Sleep(1000); // Simulate long retrieval
        _queue.Add(id++);
    }
}

void SomeMethodThatNeedsId()
{
    var nextId = _queue.Take();
    // ...
}
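To tie this back to the earlier point about the database going down: a sketch of the same consumer using the Take overload that accepts a CancellationToken (the token source and error handling are illustrative assumptions):

CancellationTokenSource _dbFailure = new CancellationTokenSource();

int? TryGetNextId()
{
    try
    {
        // Blocks until an id is available; throws if _dbFailure is cancelled,
        // e.g. by the producer when it can no longer reach the database.
        return _queue.Take(_dbFailure.Token);
    }
    catch (OperationCanceledException)
    {
        return null; // no more ids are coming; fail gracefully instead of deadlocking
    }
}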

Efficient approach to multithreaded set difference

I have a finite set of consumer threads, each consuming a job. Once they process a job, they have a list of subjobs that were listed in the consumed job. I need to add the subjobs from that list that aren't already in the database. There are 3 million jobs in the database, so finding which ones aren't already there is slow. I don't mind each thread blocking on that call, but since I have a race condition (see code) I have to wrap the whole slow call in a lock, so the threads can only enter that section one at a time, and my program crawls. What can I do so the threads don't serialize on that call? I tried a queue, but since the threads push out lists of jobs faster than the computer can determine which ones should be added to the database, the queue keeps growing and never empties.
My code:
IEnumerable<string> getUniqueJobNames(IEnumerable<job> subJobs, int setID)
{
    return subJobs.Select(el => el.name)
                  .Except(db.jobs.Where(el => el.set_ID == setID).Select(el => el.name));
}

//...consumer thread i
lock (lockObj)
{
    var uniqueJobNames = getUniqueJobNames(consumedJob.subJobs, consumerSetID);
    // if there was a context switch here to some thread i+1,
    // and that thread found uniqueJobs that were also found in thread i,
    // then there would be multiple copies of the same job added to the database.
    // So I put this section in a lock to prevent that.
    saveJobsToDatabase(uniqueJobNames, consumerSetID);
}
//continue consumer thread i...
//continue consumer thread i...
Rather than going back to the database to check job names for uniqueness, you could load the relevant info into an in-memory lookup structure, which lets you check existence much faster:
Dictionary<int, HashSet<string>> jobLookup = db.jobs
    .GroupBy(j => j.set_ID)
    .ToDictionary(g => g.Key, g => new HashSet<string>(g.Select(j => j.name)));
This you only do once. Afterwards, every time you need to check for uniqueness, you use the lookup:
IEnumerable<string> getUniqueJobNames(IEnumerable<job> subJobs, int setID)
{
    var existingJobs = jobLookup.ContainsKey(setID) ? jobLookup[setID] : new HashSet<string>();
    return subJobs.Select(el => el.name)
                  .Except(existingJobs);
}
If you enter a new sub job, also add it to the lookup:
lock (lockObj)
{
    // The lock is still needed so two threads cannot save the same job.
    // Materialize the names once, since they are enumerated twice below.
    var uniqueJobNames = getUniqueJobNames(consumedJob.subJobs, consumerSetID).ToList();
    saveJobsToDatabase(uniqueJobNames, consumerSetID);
    if (!jobLookup.ContainsKey(consumerSetID))
    {
        jobLookup.Add(consumerSetID, new HashSet<string>(uniqueJobNames));
    }
    else
    {
        jobLookup[consumerSetID] = new HashSet<string>(jobLookup[consumerSetID].Concat(uniqueJobNames));
    }
}

How to access the underlying default concurrent queue of a blocking collection?

I have multiple producers and a single consumer. However, if something is in the queue and not yet consumed, a producer should not queue it again. (A unique, no-duplicates blocking collection that uses the default concurrent queue.)
if (!myBlockingColl.Contains(item))
    myBlockingColl.Add(item);
However the blocking collection has neither a Contains method nor any kind of TryPeek()-like method. How can I access the underlying concurrent queue so I can do something like
if (!myBlockingColl.myConcurQ.TryPeek(item))
    myBlockingColl.Add(item);
I'm going around in circles here.
This is an interesting question. This is the first time I have seen someone ask for a blocking queue that ignores duplicates. Oddly enough, I could find nothing like what you want in the BCL. I say this is odd because BlockingCollection can accept an IProducerConsumerCollection as the underlying collection, which has a TryAdd method that is advertised as being able to fail when duplicates are detected. The problem is that I see no concrete implementation of IProducerConsumerCollection that prevents duplicates. At least we can write our own.
public class NoDuplicatesConcurrentQueue<T> : IProducerConsumerCollection<T>
{
    // TODO: You will need to fully implement IProducerConsumerCollection<T>.
    private readonly Queue<T> queue = new Queue<T>();

    public bool TryAdd(T item)
    {
        lock (queue)
        {
            if (!queue.Contains(item))
            {
                queue.Enqueue(item);
                return true;
            }
            return false;
        }
    }

    public bool TryTake(out T item)
    {
        lock (queue)
        {
            if (queue.Count > 0)
            {
                item = queue.Dequeue();
                return true;
            }
            item = default(T);
            return false;
        }
    }
}
Now that we have our IProducerConsumerCollection that does not accept duplicates we can use it like this:
public class Example
{
    private BlockingCollection<object> queue =
        new BlockingCollection<object>(new NoDuplicatesConcurrentQueue<object>());

    public Example()
    {
        new Thread(Consume).Start();
    }

    public void Produce(object item)
    {
        bool unique = queue.TryAdd(item);
    }

    private void Consume()
    {
        while (true)
        {
            object item = queue.Take();
        }
    }
}
You may not like my implementation of NoDuplicatesConcurrentQueue. You are certainly free to implement your own using ConcurrentQueue or whatever if you think you need the low-lock performance that the TPL collections provide.
Update:
I was able to test the code this morning. There is some good news and bad news. The good news is that this will technically work. The bad news is that you probably will not want to do this, because BlockingCollection.TryAdd intercepts the return value from the underlying IProducerConsumerCollection.TryAdd and throws an exception when false is detected. Yep, that is right: it does not return false as you would expect, and instead throws an exception. I have to be honest, this is both surprising and ridiculous. The whole point of the TryXXX methods is that they should not throw exceptions. I am deeply disappointed.
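So if you do go this route, the producer has to treat the exception as the duplicate signal. A small sketch of that workaround, using the queue field from the Example class above (InvalidOperationException is what BlockingCollection throws for a rejected add):

public bool TryProduce(object item)
{
    try
    {
        return queue.TryAdd(item); // true: the underlying queue accepted the item
    }
    catch (InvalidOperationException)
    {
        return false; // the underlying NoDuplicatesConcurrentQueue rejected a duplicate
    }
}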
In addition to the caveat Brian Gideon mentioned after Update, his solution suffers from these performance issues:
O(n) operations on the queue (queue.Contains(item)) have a severe impact on performance as the queue grows
locks limit concurrency (which he does mention)
The following code improves on Brian's solution by
using a hash set to do O(1) lookups
combining 2 data structures from the System.Collections.Concurrent namespace
N.B. As there is no ConcurrentHashSet, I'm using a ConcurrentDictionary, ignoring the values.
In this rare case it is luckily possible to compose a more complex concurrent data structure out of multiple simpler ones, without adding locks. The order of operations on the 2 concurrent data structures is important here: an item is added to the set before it is enqueued, and removed from the set only after it has been dequeued, so at any moment the set is a superset of the queue's contents and a duplicate can never slip in.
public class NoDuplicatesConcurrentQueue<T> : IProducerConsumerCollection<T>
{
    private readonly ConcurrentDictionary<T, bool> existingElements = new ConcurrentDictionary<T, bool>();
    private readonly ConcurrentQueue<T> queue = new ConcurrentQueue<T>();

    public bool TryAdd(T item)
    {
        if (existingElements.TryAdd(item, false))
        {
            queue.Enqueue(item);
            return true;
        }
        return false;
    }

    public bool TryTake(out T item)
    {
        if (queue.TryDequeue(out item))
        {
            bool _;
            existingElements.TryRemove(item, out _);
            return true;
        }
        return false;
    }

    // ...
}
N.B. Another way of looking at this problem: you want a set that preserves insertion order.
I would suggest guarding your operations with a lock so that you don't read and write the collection in a way that corrupts it, making the check-then-add sequence atomic. For example, with any IEnumerable:
object bcLocker = new object();

// ...

lock (bcLocker)
{
    bool foundTheItem = false;
    foreach (someClass nextItem in myBlockingColl)
    {
        if (nextItem.Equals(item))
        {
            foundTheItem = true;
            break;
        }
    }
    if (foundTheItem == false)
    {
        // Add here
    }
}
How to access the underlying default concurrent queue of a blocking collection?
The BlockingCollection<T> is backed by a ConcurrentQueue<T> by default. In other words, if you don't specify its backing storage explicitly, it will create a ConcurrentQueue<T> behind the scenes. Since you want direct access to the underlying storage, you can create a ConcurrentQueue<T> manually and pass it to the constructor of the BlockingCollection<T>:
ConcurrentQueue<Item> queue = new();
BlockingCollection<Item> collection = new(queue);
Unfortunately the ConcurrentQueue<T> collection doesn't have a TryPeek method with an input parameter, so what you intend to do is not possible:
if (!queue.TryPeek(item)) // Compile error, missing out keyword
collection.Add(item);
Also be aware that the queue is now owned by the collection. If you attempt to mutate it directly (by issuing Enqueue or TryDequeue commands), the collection will throw exceptions.
