In my project I'm using some static variables which I use for storing values during the running lifetime of the application. Now, 99% of the time I'm only reading these values but from time to time I also need to update them and this will happen from different threads.
When thinking about what might happen with two different threads trying to access the same property, e.g. a concurrent read/write, I concluded that some form of synchronization would be needed to avoid unexpected values being returned to different threads, and to reduce the risk of race conditions.
In essence I needed a single source of truth. I realize that reads and writes of some types, like booleans, are atomic, but my approach is mostly aimed at strings.
One of the challenges is that these static variables are referenced in many places and across different classes, so I also had to figure out an efficient way to solve this without a lot of rewriting.
I've decided to use concurrent dictionaries:
public static readonly ConcurrentDictionary<string, string> AppRunTimeStringDictionary = new();
public static readonly ConcurrentDictionary<string, int> AppRunTimeIntegerDictionary = new();
public static readonly ConcurrentDictionary<string, bool> AppRunTimeBooleanDictionary = new();
In my program.cs file, during the earliest stages of startup I simply add all of the properties needed for the running app:
DeviceProvisioning.AppRunTimeBooleanDictionary.TryAdd("UseGpsReceiver", false);
DeviceProvisioning.AppRunTimeStringDictionary.TryAdd("Latitude", String.Empty);
DeviceProvisioning.AppRunTimeStringDictionary.TryAdd("Longitude", String.Empty);
Then in one of my classes I hard code these properties:
public static bool? UseGpsReceiver
{
get
{
if (AppRunTimeBooleanDictionary.TryGetValue("UseGpsReceiver", out var returnedValue))
return returnedValue;
return null;
}
}
public static string? Latitude
{
get
{
if (AppRunTimeStringDictionary.TryGetValue("Latitude", out var returnedValue))
return returnedValue;
return null;
}
}
public static string? Longitude
{
get
{
if (AppRunTimeStringDictionary.TryGetValue("Longitude", out var returnedValue))
return returnedValue;
return null;
}
}
Now for updating these properties, which happens rarely but will be done every now and then, I'm updating these in just one location, i.e. using a single method. This way I can use this common method and simply add more properties to the switch case over time.
public static void SetRunTimeSettings(string property, object value)
{
switch (property)
{
case "UseGpsReceiver":
// code block
if (AppRunTimeBooleanDictionary.TryGetValue("UseGpsReceiver", out var useGpsReceiver))
{ AppRunTimeBooleanDictionary.TryUpdate("UseGpsReceiver", (bool)value, useGpsReceiver); }
break;
case "Latitude":
// code block
if (AppRunTimeStringDictionary.TryGetValue("Latitude", out var latitude))
{ AppRunTimeStringDictionary.TryUpdate("Latitude", (string)value, latitude); }
break;
case "Longitude":
// code block
if (AppRunTimeStringDictionary.TryGetValue("Latitude", out var longitude))
{ AppRunTimeStringDictionary.TryUpdate("Latitude", (string)value, longitude); }
break;
}
}
If I want to update a property then I simply invoke the method as such:
MyClassName.SetRunTimeSettings("UseGpsReceiver", true);
MyClassName.SetRunTimeSettings("Latitude", "51.1234");
MyClassName.SetRunTimeSettings("Longitude", "51.5678");
Because the properties themselves are public static, I can use the getters from anywhere in the app.
From my initial testing, everything seems to work.
Perceived advantages in this approach:
Using a separate dictionary for each type of value (strings, integers, etc.) means I can add more properties to a dictionary at any time in the future without needing to reference a model class in the dictionary, as opposed to the dictionary below:
public static readonly ConcurrentDictionary<string, myModelClass> AppRunTimeStringDictionary = new();
My understanding of the concurrent dictionary is that any thread reading a property value from it will always get the latest value; if a property is being updated, there is less risk of reading a stale value. That's not such an issue for structured logging, but if I were storing keys/secrets/connection strings or anything else, reading a stale value might stop some process from functioning correctly.
Using the concurrent dictionary means I don't have to hand craft my own locking mechanisms, which many people seem not to like doing.
The dictionary applies its own internal locks to the individual entries, so any property not being updated can still be read by other threads without much delay.
If the public static getter ever returned a null value, my thought is it would be better to return null than to return the wrong value. I could always implement some kind of Polly or retry mechanism in the calling process: a short delay before trying to retrieve the property value again (by which time it should have been updated by the other thread that was updating it).
Appreciate there will be other ways to approach this, so really what I'm asking here is whether anyone sees any issue in my approach?
I'm not planning to add that many properties to each dictionary, I just want a way to ensure that reads and writes are happening with some form of synchronization and order.
Your SetRunTimeSettings is awful. It relies on methods that follow the Try* pattern, but it itself does not. Also, doing a TryGetValue just so you can then call TryUpdate throws away all of the value of the Try* methods anyway. It's a hack.
And you have a clear bug in the code for the "Longitude" case - you're updating "Latitude" inside.
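If you do keep the dictionaries, note that ConcurrentDictionary's indexer setter is already an atomic add-or-overwrite, so the whole method collapses to something like this (a sketch, keeping your switch for the type dispatch):
public static void SetRunTimeSettings(string property, object value)
{
    switch (property)
    {
        case "UseGpsReceiver":
            // The indexer adds or overwrites the entry atomically.
            AppRunTimeBooleanDictionary["UseGpsReceiver"] = (bool)value;
            break;
        case "Latitude":
            AppRunTimeStringDictionary["Latitude"] = (string)value;
            break;
        case "Longitude":
            AppRunTimeStringDictionary["Longitude"] = (string)value;
            break;
    }
}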
That said, I'd suggest going old school and just doing this:
private static bool? _UseGpsReceiver;
private readonly static object _UseGpsReceiverLock = new();
public static bool? UseGpsReceiver
{
get { lock (_UseGpsReceiverLock) return _UseGpsReceiver; }
set { lock (_UseGpsReceiverLock) _UseGpsReceiver = value; }
}
private static string? _Latitude;
private readonly static object _LatitudeLock = new();
public static string? Latitude
{
get { lock (_LatitudeLock) return _Latitude; }
set { lock (_LatitudeLock) _Latitude = value; }
}
private static string? _Longitude;
private readonly static object _LongitudeLock = new();
public static string? Longitude
{
get { lock (_LongitudeLock) return _Longitude; }
set { lock (_LongitudeLock) _Longitude = value; }
}
If you don't want to repeat all of the locks then maybe a Locked<T> class might be of use:
public sealed class Locked<T>
{
public Locked(T value)
{
_value = value;
}
private T _value;
private readonly object _gate = new();
public T Value
{
get { lock (_gate) return _value; }
set { lock (_gate) _value = value; }
}
}
Then you can write this:
private static Locked<bool?> _UseGpsReceiver = new(null);
public static bool? UseGpsReceiver
{
get { return _UseGpsReceiver.Value; }
set { _UseGpsReceiver.Value = value; }
}
private static Locked<string?> _Latitude = new(null);
public static string? Latitude
{
get { return _Latitude.Value; }
set { _Latitude.Value = value; }
}
private static Locked<string?> _Longitude = new(null);
public static string? Longitude
{
get { return _Longitude.Value; }
set { _Longitude.Value = value; }
}
If you are only setting a single string / int / bool at a time, then you don't need any thread safety. If you are assigning a single value no larger than a machine word (which includes object references such as strings), any reading thread will either see the before value or the after value.
However, it looks like you intend to set three values at the same time:
MyClassName.SetRunTimeSettings("UseGpsReceiver", true);
MyClassName.SetRunTimeSettings("Latitude", "51.1234");
MyClassName.SetRunTimeSettings("Longitude", "51.5678");
And I assume you want any reader to see either the old values or the new values. In this case you would need some thread synchronisation around every read/write, which your current code doesn't have.
You could instead store the three values in a class, then update the reference to that instance in one write operation.
public class GpsSettings{
public bool UseGpsReceiver { get; init; }
public double Latitude { get; init; }
public double Longitude { get; init; }
public static GpsSettings Current;
}
...
// write
GpsSettings.Current = new GpsSettings {
UseGpsReceiver = true,
Latitude = 51.1234,
Longitude = 51.5678
};
// read
var gps = GpsSettings.Current;
var location = $"{gps.Latitude}, {gps.Longitude}";
// but never do this;
var location = $"{GpsSettings.Current.Latitude}, {GpsSettings.Current.Longitude}";
Not everyone would agree with me on this one but my personal approach would be to have a single dictionary of the following type:
Dictionary<string, object>
Wrap it in a separate class with methods such as AddValue, GetValue, HasKey, HasValue, and UpdateValue, each using lock statements. Also notice that you'll need somewhat generic methods in order to retrieve a value with its actual type and a default value. For example:
public static T GetValue<T>(string key, T defaultValue)
Also, I don't see a problem with your approach, but if you want to synchronize things then you'll need n dedicated locks for n dictionaries, which I don't think is clean (unless I'm missing something), and of course registering multiple dictionaries at design time can be a headache.
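A minimal sketch of the wrapper described above (the class and member names are mine, purely illustrative):
using System.Collections.Generic;

public static class RunTimeSettings
{
    private static readonly object _lock = new();
    private static readonly Dictionary<string, object> _values = new();

    public static void AddValue(string key, object value)
    {
        lock (_lock) { _values.Add(key, value); }
    }

    public static void UpdateValue(string key, object value)
    {
        lock (_lock) { _values[key] = value; }
    }

    public static bool HasKey(string key)
    {
        lock (_lock) { return _values.ContainsKey(key); }
    }

    // Returns the stored value as T, or the supplied default when the key
    // is missing or holds a value of a different type.
    public static T GetValue<T>(string key, T defaultValue)
    {
        lock (_lock)
        {
            return _values.TryGetValue(key, out var value) && value is T typed
                ? typed
                : defaultValue;
        }
    }
}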
Alternatively to using multiple ConcurrentDictionary<string, T> collections, or a single ConcurrentDictionary<string, object>, or the Locked<T> struct shown in Enigmativity's answer, you could just store the values in immutable and recyclable Tuple<T> instances, and store these in private volatile fields:
private static volatile Tuple<bool?> _UseGpsReceiver;
public static bool? UseGpsReceiver
{
get { return _UseGpsReceiver?.Item1; }
set { _UseGpsReceiver = new(value); }
}
private static volatile Tuple<string> _Latitude;
public static string Latitude
{
get { return _Latitude?.Item1; }
set { _Latitude = new(value); }
}
private static volatile Tuple<string> _Longitude;
public static string Longitude
{
get { return _Longitude?.Item1; }
set { _Longitude = new(value); }
}
Pros: Both the reading and the writing are lock-free. An unlimited number of readers and writers can read and update the values at the same time, without contention.
Cons: Every time a value is updated, a new Tuple<T> is instantiated, adding pressure on the .NET garbage collector. This reduces the appeal of this approach in case the values are updated too frequently. Also if you have dozens of properties like these, it might be easy to introduce subtle bugs by omitting the important volatile keyword by mistake.
Edited the code to make it thread-safe following the comments.
Please see the updated question at the end.
Can you please help me understand if this code is thread-safe or how it can be made thread safe?
Setup
My system has a very simple class called WorkItem.
public class WorkItem
{
public int Id {get;set;}
public string Name {get;set;}
public DateTime DateCreated {get;set;}
public IList<object> CalculatedValues {get;set;}
}
There is an interface ICalculator which has a method that takes a work item, performs a calculation and returns true.
public interface ICalculator
{
bool Calculate(WorkItem workItem);
}
Let's say we have two implementations of ICalculator.
public class BasicCalculator: ICalculator
{
public bool Calculate(WorkItem workItem)
{
//calculate some value on the WorkItem and populate CalculatedValues property
return true;
}
}
Another calculator:
public class AnotherCalculator: ICalculator
{
public bool Calculate(WorkItem workItem)
{
//calculate some value on the WorkItem and populate CalculatedValues property
//some complex calculation on work item
if (somevalue==0) return false;
return true;
}
}
There is a calculator handler class. Its responsibility is to execute calculators sequentially.
public class CalculatorHandler
{
public bool ExecuteAllCalculators(WorkItem workItem, ICalculator[] calculators)
{
bool final = true;
//call all calculators in a loop
foreach(var calculator in calculators)
{
var calculatedValue = calculator.Calculate(workItem);
final = final && calculatedValue;
}
return final;
}
}
Finally, in my client class, I inject the ICalculator[] instances relevant to the run. I then invoke the ExecuteCalculators() method.
Now I have a large number of work items and I want to perform calculations on them, so I create a list of Tasks, where each task is responsible for instantiating a CalculatorHandler instance, taking a work item and performing the calculations; I then do a WaitAll() on all of the tasks, e.g.
public class Client
{
private ICalculator[] _myCalculators;
public Client(ICalculator[] calculators)
{
_myCalculators = calculators;
}
public void ExecuteCalculators()
{
var list = new List<Task>();
for (int i = 0; i < 10; i++)
{
int id = i; // capture a copy so each task sees its own value
Task task = Task.Run(() =>
{
var handler = new CalculatorHandler();
var workItem = new WorkItem()
{
Id = id,
Name = "TestTask",
DateCreated = DateTime.Now
};
var result = handler.ExecuteAllCalculators(workItem, _myCalculators);
});
list.Add(task);
}
Task.WaitAll(list.ToArray());
}
}
This is a simplified version of the system. The actual system has a range of calculators, and the calculators and CalculatorHandler are injected via IoC etc.
My questions are - help me understand these points:
1) Each task creates a new instance of CalculatorHandler. Does this mean anything that happens in CalculatorHandler is thread safe, as it does not have any public properties and simply loops over calculators?
2) Calculators are shared amongst all tasks because they are a member variable of the Client class, but they are passed into CalculatorHandler, which is instantiated for each task. Does this mean that when all the tasks run, because a new instance of CalculatorHandler is created each time, the calculators are automatically thread safe and we will not experience any threading issues, e.g. deadlocks etc.?
3) Can you please suggest how I can make the code thread-safe? Is it best to pass a Func<ICalculator>[] to the Client class and then, within each task, execute each Func<ICalculator>() and pass the resulting instances to CalculatorHandler? Func<ICalculator> will return an instance of ICalculator.
4) Is it true that because the calculators are passed in as a private method variable, other instances of CalculatorHandler cannot run the same instance of a calculator? Or, because calculators are reference types, are we bound to get multi-threading issues?
Update
Can you please help me understand if this updated code is thread-safe or how it can be made thread safe?
Setup
My system has a very simple class called WorkItem. It has get-only public properties, except for one property, CalculatedValues.
public class WorkItem
{
public int Id {get;}
public string Name {get;}
public DateTime DateCreated {get;}
public IList<object> CalculatedValues {get;set;}
public WorkItem(int id, string name, DateTime dateCreated)
{
Id = id;
Name = name;
DateCreated = dateCreated;
}
}
There is an interface ICalculator which has a method that takes a work item, performs a calculation and returns an IList<object>. It does not change the state of the work item.
public interface ICalculator
{
IList<object> Calculate(WorkItem workItem);
}
Let's say we have two implementations of ICalculator.
public class BasicCalculator: ICalculator
{
public IList<object> Calculate(WorkItem workItem)
{
//calculate some value and return a List<object>
return new List<object> { "A", 1 };
}
}
Another calculator:
public class AnotherCalculator: ICalculator
{
public IList<object> Calculate(WorkItem workItem)
{
//calculate some value and return a List<object>
return new List<object> { "A", 1, workItem.Name };
}
}
There is a calculator handler class. Its responsibility is to execute the calculators sequentially. Note, it takes the ICalculator[] in its constructor when it is instantiated. It also has a private static lock object which it uses when it updates the work item instance.
public class CalculatorHandler
{
private ICalculator[] _calculators;
public CalculatorHandler(ICalculator[] calculators)
{
_calculators = calculators;
}
//static lock
private static object _lock = new object();
public bool ExecuteAllCalculators(WorkItem workItem)
{
bool final = true;
//call all calculators in a loop
foreach (var calculator in _calculators)
{
var calculatedValues = calculator.Calculate(workItem);
//within a lock, work item is updated
lock(_lock)
{
workItem.CalculatedValues = calculatedValues;
}
}
return final;
}
}
Finally, in my client class, I execute CalculatorHandler.
Now I have a large number of work items and I want to perform calculations on them, so I create a list of Tasks, where each task is responsible for instantiating a CalculatorHandler instance, taking a work item and performing the calculations; I then do a WaitAll() on all of the tasks, e.g.
public class Client
{
public void ExecuteCalculators()
{
var list = new List<Task>();
for (int i = 0; i < 10; i++)
{
int id = i; // capture a copy so each task sees its own value
Task task = Task.Run(() =>
{
//new handler instance and new calculator instances
var handler = new CalculatorHandler(new ICalculator[] {
new BasicCalculator(), new AnotherCalculator()
});
var workItem = new WorkItem(
id,
"TestTask",
DateTime.Now
);
var result = handler.ExecuteAllCalculators(workItem);
});
list.Add(task);
}
Task.WaitAll(list.ToArray());
}
}
This is a simplified version of the system. The actual system has a range of calculators, and the calculators and CalculatorHandler are injected via IoC etc.
My questions are - help me understand these points:
Each task creates a new instance of CalculatorHandler and new instances of the ICalculator implementations. The calculators do not perform any I/O operations and only create a new private IList. Are the calculator handler and calculator instances now thread safe?
CalculatorHandler updates the work item, but within a lock. The lock is a private static object. Does this mean all instances of CalculatorHandler share one single lock, and therefore at any point only one thread can update the work item?
The work item has get-only public properties, except for its CalculatedValues property, which is only set within the static lock. Is this code now thread-safe?
1) Creating a new instance of a class, even one without public properties, does not provide any guarantee of thread safety. The problem is that ExecuteAllCalculators takes two object parameters. The WorkItem object contains mutable properties, and the same WorkItem object is used for all ICalculator calls. Suppose one of the calculators decides to call Clear() on WorkItem.CalculatedValues. Or suppose one calculator sets WorkItem.Name to null and the next decides to do a WorkItem.Name.Length. This isn't technically a "threading" issue, because those problems can occur without multiple threads involved.
2) Calculator objects shared across threads are definitely not thread safe. Suppose one of the calculator instances uses a class-level variable. Unless that variable is somehow thread protected (example: lock {...}), it would be possible to produce inconsistent results. Depending on how "creative" the implementers of the calculator instances were, a deadlock could be possible.
3) Any time your code accepts interfaces you are inviting people to "play in your sandbox". It allows code that you have little control of to be executed. One of the best ways to handle this is to use immutable objects. Unfortunately, you can't change the WorkItem definition without breaking your interface contract.
4) Calculators are passed by reference. The code shows that _myCalculators is shared across all tasks created. This doesn't guarantee that you will have problems, it only makes it possible that you might have problems.
No, it is not thread-safe. If there is any shared state in any calculation then it is possible to have threading issues. The only way to avoid threading issues is to ensure you are not updating any shared state. That means read-only objects and/or using "pure" functions.
You've used the word "shared" - that means not thread-safe by virtue of sharing state. Unless you mean "distributed" rather than "shared".
Exclusively use read-only objects.
They are reference types so they may be shared amongst separate threads - hence not thread-safe - unless they are read-only.
Here's an example of a read-only object:
public sealed class WorkItem : IEquatable<WorkItem>
{
private readonly int _id;
private readonly string _name;
private readonly DateTime _dateCreated;
public int Id { get { return _id; } }
public string Name { get { return _name; } }
public DateTime DateCreated { get { return _dateCreated; } }
public WorkItem(int id, string name, DateTime dateCreated)
{
_id = id;
_name = name;
_dateCreated = dateCreated;
}
public override bool Equals(object obj)
{
if (obj is WorkItem)
return Equals((WorkItem)obj);
return false;
}
public bool Equals(WorkItem obj)
{
if (obj == null) return false;
if (!EqualityComparer<int>.Default.Equals(_id, obj._id)) return false;
if (!EqualityComparer<string>.Default.Equals(_name, obj._name)) return false;
if (!EqualityComparer<DateTime>.Default.Equals(_dateCreated, obj._dateCreated)) return false;
return true;
}
public override int GetHashCode()
{
int hash = 0;
hash ^= EqualityComparer<int>.Default.GetHashCode(_id);
hash ^= EqualityComparer<string>.Default.GetHashCode(_name);
hash ^= EqualityComparer<DateTime>.Default.GetHashCode(_dateCreated);
return hash;
}
public override string ToString()
{
return String.Format("{{ Id = {0}, Name = {1}, DateCreated = {2} }}", _id, _name, _dateCreated);
}
public static bool operator ==(WorkItem left, WorkItem right)
{
if (object.ReferenceEquals(left, null))
{
return object.ReferenceEquals(right, null);
}
return left.Equals(right);
}
public static bool operator !=(WorkItem left, WorkItem right)
{
return !(left == right);
}
}
Once created it can't be modified so thread-safety is no longer an issue.
Now, if I can assume that each ICalculator is also implemented without state, and thus is a pure function, then the calculation is thread-safe. However, there is nothing in your question that lets me know that I can make this assumption. Because of that, there is no way anyone can tell you that your code is thread-safe.
So, given the read-only WorkItem and a pure ICalculator function, the rest of your code looks like it would be perfectly fine.
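For illustration, a stateless (and therefore pure) calculator might look like this; this is my sketch, not code from the question:
public sealed class NameIsSetCalculator : ICalculator
{
    // No fields and no shared state: the result depends only on the input,
    // so concurrent calls from many threads are safe.
    public bool Calculate(WorkItem workItem)
    {
        return !string.IsNullOrEmpty(workItem.Name);
    }
}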
Given a Queue<MyMessage>, where MyMessage is the base class for several message types: all the message types have different fields, so they use different numbers of bytes. It would therefore make sense to measure the fill level of this queue in terms of bytes rather than the number of elements present in the queue.
In fact, since this queue is associated with a connection, I could better control the message flow, reducing the traffic if the queue is nearly full.
To achieve this, I thought I would wrap a simple Queue in a custom class, MyQueue.
public class MyQueue
{
private Queue<MyMessage> _outputQueue;
private Int32 _byteCapacity;
private Int32 _currentSize; // number of used bytes
public MyQueue(int byteCapacity)
{
this._outputQueue = new Queue<MyMessage>();
this._byteCapacity = byteCapacity;
this._currentSize = 0;
}
public void Enqueue(MyMessage msg)
{
this._outputQueue.Enqueue(msg);
this._currentSize += Marshal.SizeOf(msg.GetType());
}
public MyMessage Dequeue()
{
MyMessage result = this._outputQueue.Dequeue();
this._currentSize -= Marshal.SizeOf(result.GetType());
return result;
}
}
The problem is that this does not work for classes, because Marshal.SizeOf throws an ArgumentException.
Is it possible to calculate in some way the size of an object (instance of a class)?
Are there some alternatives to monitor the fill level of a queue in terms of bytes?
Are there any queues that can be managed in this way?
UPDATE: As an alternative solution I could add an int SizeBytes() method to each message type, but this solution seems a little ugly, although it would perhaps be the most efficient approach, since you cannot easily measure a reference type.
public interface MyMessage
{
Guid Identifier
{
get;
set;
}
int SizeBytes();
}
The classes that implement this interface must, in addition to implementing the SizeBytes() method, also implement an Identifier property.
public class ExampleMessage : MyMessage
{
public Guid Identifier { get; set; } // so I have a field and its Identifier property
public String Request { get; set; }
public int SizeBytes()
{
return (Marshal.SizeOf(Identifier)); // return 16
}
}
The sizeof operator cannot be used with Guid because it does not have a predefined size, so I use Marshal.SizeOf(). But at this point perhaps I should use experimentally determined values: for example, since Marshal.SizeOf() returns 16 for a Guid, and since a string consists of N chars, the SizeBytes() method could be as follows:
public int SizeBytes()
{
return (16 + Request.Length * sizeof(char));
}
If you could edit the MyMessage base class with a virtual SizeOf() method, then you could have the message classes use the C# sizeof operator on their primitive types. If you can do that, the rest of your code is gold.
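A sketch of that idea (assuming you are free to edit the MyMessage type; here I model it as an abstract base class rather than the interface shown earlier):
using System;

public abstract class MyMessage
{
    public Guid Identifier { get; set; }

    // Fixed overhead: a Guid occupies 16 bytes. Derived types add their own fields.
    public virtual int SizeOf() => 16;
}

public class ExampleMessage : MyMessage
{
    public string Request { get; set; }

    public override int SizeOf() =>
        base.SizeOf() + (Request?.Length ?? 0) * sizeof(char);
}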
You can get an indication of the size of your objects by measuring the length of their binary serialization. Note that this figure will typically be higher than you expect, since .NET may also include metadata in the serialized representation. This approach would also require all your classes to be marked with the [Serializable] attribute.
public static long GetSerializedSize(object root)
{
using (var memoryStream = new MemoryStream())
{
var binaryFormatter = new BinaryFormatter();
binaryFormatter.Serialize(memoryStream, root);
return memoryStream.Length;
}
}
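If you went the serialization route, MyQueue could then use this helper when tracking its fill level, e.g. (a sketch combining the two snippets; remember the [Serializable] requirement):
public void Enqueue(MyMessage msg)
{
    _outputQueue.Enqueue(msg);
    _currentSize += (int)GetSerializedSize(msg);
}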
I need to keep track of a sequential Id; this is being returned to me via an SP which is doing a max(id) across 4 tables. There is no identifier/sequence in the DB which manages the sequence. This will obviously have concurrency issues, so I created a helper class to ensure unique Ids are always generated.
The helper is initialised via its repository, which initially calls the DB to find the current Id; all subsequent requests for an Id are serviced in memory via the helper. There will only ever be 1 app using the DB (mine) so I don't need to worry about someone else coming along, creating transactions and throwing the Id out of sync. I think I've got the basics of thread-safety, but I'm worried about a race condition when the helper is initialised; can someone please advise :)
private class TransactionIdProvider
{
private static readonly object Accesslock = new object();
private int _transactionId;
public int NextId
{
get
{
lock (Accesslock)
{
if(!Initialised) throw new Exception("Must Initialise with id first!!");
return _transactionId++;
}
}
}
public bool Initialised { get; private set; }
public void SetId(int id)
{
lock (Accesslock)
{
if (Initialised) return;
_transactionId = id;
Initialised = true;
}
}
public TransactionIdProvider()
{
Initialised = false;
}
}
The helper class is initialised in a repository:
private static readonly TransactionIdProvider IdProvider = new TransactionIdProvider();
public int GetNextTransactionId()
{
if(!IdProvider.Initialised)
{
// Ask the DB
int? id = _context.GetNextTransactionId().First();
if (!id.HasValue)
throw new Exception("No transaction Id returned");
IdProvider.SetId(id.Value);
}
return IdProvider.NextId;
}
It is thread-safe, but it's unnecessarily slow.
You don't need a lock to just increment a number; instead, you can use atomic math.
Also, you're sharing the lock across all instances (it's static), which is unnecessary. (There's nothing wrong with having two different instances run at once)
Finally, (IMHO) there is no point in having a separate uninitialized state.
I would write it like this:
class TransactionIdProvider {
private int nextId;
public TransactionIdProvider(int id) {
nextId = id;
}
public int GetId() {
return Interlocked.Increment(ref nextId);
}
}
Yes it is thread-safe; however, IMO the lock is too global - a static lock to protect instance data smacks of overkill.
Also, NextId as a property is bad - it changes state, so should be a method.
You might also prefer Interlocked.Increment over a lock, although that changes most of the class.
Finally, in SetId - if it is already initialised I would throw an exception (InvalidOperationException) rather than blindly ignore the call - that sounds like an error. Of course, that then introduces a tricky interval between checking Initialised and calling SetId - you could just have SetId return true if it made the change, and false if it turned out to be initialised at the point of the set, but SLaks' approach is nicer.
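To illustrate that last suggestion, a sketch against the original TransactionIdProvider class (the method name is mine):
// Returns true if this call performed the initialisation,
// false if another thread got there first.
public bool TrySetId(int id)
{
    lock (Accesslock)
    {
        if (Initialised) return false;
        _transactionId = id;
        Initialised = true;
        return true;
    }
}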
I don't think this is a good idea; you should find another way to deal with this.
Usually, when really unique ids are required and there is no computationally valid way to check whether an id is already used, I would use a GUID.
However, you can use interlocked operations instead of locking; you can do it without any locking at all.
Look at Interlocked.Increment, Interlocked.Exchange and Interlocked.CompareExchange.
private class TransactionIdProvider
{
private volatile int _initialized;
private int _transactionId;
public int NextId
{
get
{
for (;;)
{
switch (_initialized)
{
case 0: throw new Exception("Not initialized");
case 1: return Interlocked.Increment(ref _transactionId);
default: Thread.Yield();
}
}
}
}
public void SetId(int id)
{
if (Interlocked.CompareExchange(ref _initialized, -1, 0) == 0)
{
Interlocked.Exchange(ref _transactionId, id);
Interlocked.Exchange(ref _initialized, 1);
}
}
}
This will give you a warning, but it is normal and is also reported in C# documentation as legal. So ignore that warning with a nice pragma:
// Disable warning "A reference to a volatile field will not be treated as volatile"
#pragma warning disable 0420
If you don't need to check for IsInitialized you can do it in the simplest possible way:
public int NextId()
{
return Interlocked.Increment(ref _transactionId);
}
public void Set(int value)
{
Interlocked.Exchange(ref _transactionId, value);
}
Does anyone have a good resource on implementing a shared object pool strategy for a limited resource, in the vein of SQL connection pooling? (i.e. it would be implemented fully so that it is thread safe.)
To follow up on Aaronaught's request for clarification: the pool usage would be for load balancing requests to an external service. To put it in a scenario that would probably be easier to understand immediately, as opposed to my direct situation: I have a session object that functions similarly to the ISession object from NHibernate. Each unique session manages its connection to the database. Currently I have 1 long-running session object and am encountering issues where my service provider is rate limiting my usage of this individual session.
Due to their lack of expectation that a single session would be treated as a long-running service account, they apparently treat it as a client that is hammering their service. Which brings me to my question here: instead of having 1 individual session, I would create a pool of different sessions and split the requests to the service across those multiple sessions, instead of creating a single focal point as I was previously doing.
Hopefully that background offers some value but to directly answer some of your questions:
Q: Are the objects expensive to create?
A: No, the objects are a pool of limited resources.
Q: Will they be acquired/released very frequently?
A: Yes; once again they can be thought of as NHibernate ISessions, where one is usually acquired and released for the duration of every single page request.
Q: Will a simple first-come-first-serve suffice or do you need something more intelligent, i.e. that would prevent starvation?
A: A simple round-robin type distribution would suffice. By starvation I assume you mean that if there are no available sessions, callers become blocked waiting for releases. This isn't really applicable, since the sessions can be shared by different callers. My goal is to distribute the usage across multiple sessions as opposed to one single session.
I believe this is probably a divergence from a normal usage of an object pool which is why I originally left this part out and planned just to adapt the pattern to allow sharing of objects as opposed to allowing a starvation situation to ever occur.
Q: What about things like priorities, lazy vs. eager loading, etc.?
A: There is no prioritization involved, for simplicity's sake just assume that I would create the pool of available objects at the creation of the pool itself.
This question is a little trickier than one might expect due to several unknowns: The behaviour of the resource being pooled, the expected/required lifetime of objects, the real reason that the pool is required, etc. Typically pools are special-purpose - thread pools, connection pools, etc. - because it is easier to optimize one when you know exactly what the resource does and more importantly have control over how that resource is implemented.
Since it's not that simple, what I've tried to do is offer up a fairly flexible approach that you can experiment with and see what works best. Apologies in advance for the long post, but there is a lot of ground to cover when it comes to implementing a decent general-purpose resource pool, and I'm really only scratching the surface.
A general-purpose pool would have to have a few main "settings", including:
Resource loading strategy - eager or lazy;
Resource loading mechanism - how to actually construct one;
Access strategy - you mention "round robin" which is not as straightforward as it sounds; this implementation can use a circular buffer which is similar, but not perfect, because the pool has no control over when resources are actually reclaimed. Other options are FIFO and LIFO; FIFO will have more of a random-access pattern, but LIFO makes it significantly easier to implement a Least-Recently-Used freeing strategy (which you said was out of scope, but it's still worth mentioning).
For the resource loading mechanism, .NET already gives us a clean abstraction - delegates.
private Func<Pool<T>, T> factory;
Pass this through the pool's constructor and we're about done with that. Using a generic type with a new() constraint works too, but this is more flexible.
Of the other two parameters, the access strategy is the more complicated beast, so my approach was to use an inheritance (interface) based approach:
public class Pool<T> : IDisposable
{
// Other code - we'll come back to this
interface IItemStore
{
T Fetch();
void Store(T item);
int Count { get; }
}
}
The concept here is simple - we'll let the public Pool class handle the common issues like thread-safety, but use a different "item store" for each access pattern. LIFO is easily represented by a stack, FIFO is a queue, and I've used a not-very-optimized-but-probably-adequate circular buffer implementation using a List<T> and index pointer to approximate a round-robin access pattern.
All of the classes below are inner classes of the Pool<T> - this was a style choice, but since these really aren't meant to be used outside the Pool, it makes the most sense.
class QueueStore : Queue<T>, IItemStore
{
public QueueStore(int capacity) : base(capacity)
{
}
public T Fetch()
{
return Dequeue();
}
public void Store(T item)
{
Enqueue(item);
}
}
class StackStore : Stack<T>, IItemStore
{
public StackStore(int capacity) : base(capacity)
{
}
public T Fetch()
{
return Pop();
}
public void Store(T item)
{
Push(item);
}
}
These are the obvious ones - stack and queue. I don't think they really warrant much explanation. The circular buffer is a little more complicated:
class CircularStore : IItemStore
{
private List<Slot> slots;
private int freeSlotCount;
private int position = -1;
public CircularStore(int capacity)
{
slots = new List<Slot>(capacity);
}
public T Fetch()
{
if (Count == 0)
throw new InvalidOperationException("The buffer is empty.");
int startPosition = position;
do
{
Advance();
Slot slot = slots[position];
if (!slot.IsInUse)
{
slot.IsInUse = true;
--freeSlotCount;
return slot.Item;
}
} while (startPosition != position);
throw new InvalidOperationException("No free slots.");
}
public void Store(T item)
{
Slot slot = slots.Find(s => object.Equals(s.Item, item));
if (slot == null)
{
slot = new Slot(item);
slots.Add(slot);
}
slot.IsInUse = false;
++freeSlotCount;
}
public int Count
{
get { return freeSlotCount; }
}
private void Advance()
{
position = (position + 1) % slots.Count;
}
class Slot
{
public Slot(T item)
{
this.Item = item;
}
public T Item { get; private set; }
public bool IsInUse { get; set; }
}
}
I could have picked a number of different approaches, but the bottom line is that resources should be accessed in the same order that they were created, which means that we have to maintain references to them but mark them as "in use" (or not). In the worst-case scenario, only one slot is ever available, and it takes a full iteration of the buffer for every fetch. This is bad if you have hundreds of resources pooled and are acquiring and releasing them several times per second; not really an issue for a pool of 5-10 items, and in the typical case, where resources are lightly used, it only has to advance one or two slots.
Remember, these classes are private inner classes - that is why they don't need a whole lot of error-checking, the pool itself restricts access to them.
Throw in an enumeration and a factory method and we're done with this part:
// Outside the pool
public enum AccessMode { FIFO, LIFO, Circular };
private IItemStore itemStore;
// Inside the Pool
private IItemStore CreateItemStore(AccessMode mode, int capacity)
{
switch (mode)
{
case AccessMode.FIFO:
return new QueueStore(capacity);
case AccessMode.LIFO:
return new StackStore(capacity);
default:
Debug.Assert(mode == AccessMode.Circular,
"Invalid AccessMode in CreateItemStore");
return new CircularStore(capacity);
}
}
The next problem to solve is loading strategy. I've defined three types:
public enum LoadingMode { Eager, Lazy, LazyExpanding };
The first two should be self-explanatory; the third is sort of a hybrid, it lazy-loads resources but doesn't actually start re-using any resources until the pool is full. This would be a good trade-off if you want the pool to be full (which it sounds like you do) but want to defer the expense of actually creating them until first access (i.e. to improve startup times).
The loading methods really aren't too complicated, now that we have the item-store abstraction:
private int size;
private int count;
private T AcquireEager()
{
lock (itemStore)
{
return itemStore.Fetch();
}
}
private T AcquireLazy()
{
lock (itemStore)
{
if (itemStore.Count > 0)
{
return itemStore.Fetch();
}
}
Interlocked.Increment(ref count);
return factory(this);
}
private T AcquireLazyExpanding()
{
bool shouldExpand = false;
if (count < size)
{
int newCount = Interlocked.Increment(ref count);
if (newCount <= size)
{
shouldExpand = true;
}
else
{
// Another thread took the last spot - use the store instead
Interlocked.Decrement(ref count);
}
}
if (shouldExpand)
{
return factory(this);
}
else
{
lock (itemStore)
{
return itemStore.Fetch();
}
}
}
private void PreloadItems()
{
for (int i = 0; i < size; i++)
{
T item = factory(this);
itemStore.Store(item);
}
count = size;
}
The size and count fields above refer to the maximum size of the pool and the total number of resources owned by the pool (but not necessarily available), respectively. AcquireEager is the simplest, it assumes that an item is already in the store - these items would be preloaded at construction, i.e. in the PreloadItems method shown last.
AcquireLazy checks to see if there are free items in the pool, and if not, it creates a new one. AcquireLazyExpanding will create a new resource as long as the pool hasn't reached its target size yet. I've tried to optimize this to minimize locking, and I hope I haven't made any mistakes (I have tested this under multi-threaded conditions, but obviously not exhaustively).
You might be wondering why none of these methods bother checking to see whether or not the store has reached the maximum size. I'll get to that in a moment.
Now for the pool itself. Here is the full set of private data, some of which has already been shown:
private bool isDisposed;
private Func<Pool<T>, T> factory;
private LoadingMode loadingMode;
private IItemStore itemStore;
private int size;
private int count;
private Semaphore sync;
Answering the question I glossed over in the last paragraph - how to ensure we limit the total number of resources created - it turns out that .NET already has a perfectly good tool for that: it's called Semaphore, and it's designed specifically to allow a fixed number of threads access to a resource (in this case the "resource" is the inner item store). Since we're not implementing a full-on producer/consumer queue, this is perfectly adequate for our needs.
The constructor looks like this:
public Pool(int size, Func<Pool<T>, T> factory,
LoadingMode loadingMode, AccessMode accessMode)
{
if (size <= 0)
throw new ArgumentOutOfRangeException("size", size,
"Argument 'size' must be greater than zero.");
if (factory == null)
throw new ArgumentNullException("factory");
this.size = size;
this.factory = factory;
sync = new Semaphore(size, size);
this.loadingMode = loadingMode;
this.itemStore = CreateItemStore(accessMode, size);
if (loadingMode == LoadingMode.Eager)
{
PreloadItems();
}
}
Should be no surprises here. Only thing to note is the special-casing for eager loading, using the PreloadItems method already shown earlier.
Since almost everything's been cleanly abstracted away by now, the actual Acquire and Release methods are really very straightforward:
public T Acquire()
{
sync.WaitOne();
switch (loadingMode)
{
case LoadingMode.Eager:
return AcquireEager();
case LoadingMode.Lazy:
return AcquireLazy();
default:
Debug.Assert(loadingMode == LoadingMode.LazyExpanding,
"Unknown LoadingMode encountered in Acquire method.");
return AcquireLazyExpanding();
}
}
public void Release(T item)
{
lock (itemStore)
{
itemStore.Store(item);
}
sync.Release();
}
As explained earlier, we're using the Semaphore to control concurrency instead of religiously checking the status of the item store. As long as acquired items are correctly released, there's nothing to worry about.
Last but not least, there's cleanup:
public void Dispose()
{
if (isDisposed)
{
return;
}
isDisposed = true;
if (typeof(IDisposable).IsAssignableFrom(typeof(T)))
{
lock (itemStore)
{
while (itemStore.Count > 0)
{
IDisposable disposable = (IDisposable)itemStore.Fetch();
disposable.Dispose();
}
}
}
sync.Close();
}
public bool IsDisposed
{
get { return isDisposed; }
}
The purpose of that IsDisposed property will become clear in a moment. All the main Dispose method really does is dispose the actual pooled items if they implement IDisposable.
Now you can basically use this as-is, with a try-finally block, but I'm not fond of that syntax, because if you start passing around pooled resources between classes and methods then it's going to get very confusing. It's possible that the main class that uses a resource doesn't even have a reference to the pool. It really becomes quite messy, so a better approach is to create a "smart" pooled object.
Let's say we start with the following simple interface/class:
public interface IFoo : IDisposable
{
void Test();
}
public class Foo : IFoo
{
private static int count = 0;
private int num;
public Foo()
{
num = Interlocked.Increment(ref count);
}
public void Dispose()
{
Console.WriteLine("Goodbye from Foo #{0}", num);
}
public void Test()
{
Console.WriteLine("Hello from Foo #{0}", num);
}
}
Here's our pretend disposable Foo resource which implements IFoo and has some boilerplate code for generating unique identities. What we do is to create another special, pooled object:
public class PooledFoo : IFoo
{
private Foo internalFoo;
private Pool<IFoo> pool;
public PooledFoo(Pool<IFoo> pool)
{
if (pool == null)
throw new ArgumentNullException("pool");
this.pool = pool;
this.internalFoo = new Foo();
}
public void Dispose()
{
if (pool.IsDisposed)
{
internalFoo.Dispose();
}
else
{
pool.Release(this);
}
}
public void Test()
{
internalFoo.Test();
}
}
This just proxies all of the "real" methods to its inner IFoo (we could do this with a Dynamic Proxy library like Castle, but I won't get into that). It also maintains a reference to the Pool that creates it, so that when we Dispose this object, it automatically releases itself back to the pool. Except when the pool has already been disposed - this means we are in "cleanup" mode and in this case it actually cleans up the internal resource instead.
Using the approach above, we get to write code like this:
// Create the pool early
Pool<IFoo> pool = new Pool<IFoo>(PoolSize, p => new PooledFoo(p),
LoadingMode.Lazy, AccessMode.Circular);
// Sometime later on...
using (IFoo foo = pool.Acquire())
{
foo.Test();
}
This is a very good thing to be able to do. It means that the code which uses the IFoo (as opposed to the code which creates it) does not actually need to be aware of the pool. You can even inject IFoo objects using your favourite DI library and the Pool<T> as the provider/factory.
I've put the complete code on PasteBin for your copy-and-pasting enjoyment. There's also a short test program you can use to play around with different loading/access modes and multithreaded conditions, to satisfy yourself that it's thread-safe and not buggy.
Let me know if you have any questions or concerns about any of this.
Object Pooling in .NET Core
.NET Core has an implementation of object pooling added to the base class library (BCL). You can read the original GitHub issue here and view the code for System.Buffers. Currently ArrayPool is the only type available, and it is used to pool arrays. There is a nice blog post here.
namespace System.Buffers
{
public abstract class ArrayPool<T>
{
public static ArrayPool<T> Shared { get; internal set; }
public static ArrayPool<T> Create(int maxBufferSize = <number>, int numberOfBuffers = <number>);
public T[] Rent(int size);
public T[] Enlarge(T[] buffer, int newSize, bool clearBuffer = false);
public void Return(T[] buffer, bool clearBuffer = false);
}
}
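For reference, a minimal usage sketch of the API as it shipped (the shipped ArrayPool<T> centres on Rent/Return; the Enlarge method above was only part of the original proposal):
using System;
using System.Buffers;

class ArrayPoolExample
{
    static void Main()
    {
        // Rent returns an array of at least the requested length;
        // it may be longer, and it may contain stale data.
        byte[] buffer = ArrayPool<byte>.Shared.Rent(4096);
        try
        {
            // ... use the first 4096 bytes of buffer ...
        }
        finally
        {
            // Return the array so later Rent calls can reuse it.
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }
}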
An example of its usage can be seen in ASP.NET Core. Because it is in the .NET Core BCL, ASP.NET Core can share its object pool with other libraries such as Newtonsoft.Json's JSON serializer. You can read this blog post for more information on how Newtonsoft.Json is doing this.
Object Pooling in Microsoft Roslyn C# Compiler
The new Microsoft Roslyn C# compiler contains the ObjectPool type, which is used to pool frequently used objects which would normally get new'ed up and garbage collected very often. This reduces the amount and size of garbage collection operations which have to happen. There are a few different sub-implementations all using ObjectPool (See: Why are there so many implementations of Object Pooling in Roslyn?).
1 - SharedPools - Stores a pool of 20 objects or 100 if the BigDefault is used.
// Example 1 - In a using statement, so the object gets freed at the end.
using (PooledObject<Foo> pooledObject = SharedPools.Default<List<Foo>>().GetPooledObject())
{
// Do something with pooledObject.Object
}
// Example 2 - No using statement, so you need to be sure no exceptions are thrown.
List<Foo> list = SharedPools.Default<List<Foo>>().AllocateAndClear();
// Do something with list
SharedPools.Default<List<Foo>>().Free(list);
// Example 3 - I have also seen this variation of the above pattern, which ends up the same as Example 1, except that Example 1 seems to create a new instance of the IDisposable PooledObject<T> object. This is probably the preferred option if you want fewer GCs.
List<Foo> list = SharedPools.Default<List<Foo>>().AllocateAndClear();
try
{
// Do something with list
}
finally
{
SharedPools.Default<List<Foo>>().Free(list);
}
2 - ListPool and StringBuilderPool - Not strictly separate implementations, but wrappers around the SharedPools implementation shown above, specifically for List and StringBuilder. So this reuses the pool of objects stored in SharedPools.
// Example 1 - No using statement so you need to be sure no exceptions are thrown.
StringBuilder stringBuilder = StringBuilderPool.Allocate();
// Do something with stringBuilder
StringBuilderPool.Free(stringBuilder);
// Example 2 - Safer version of Example 1.
StringBuilder stringBuilder = StringBuilderPool.Allocate();
try
{
// Do something with stringBuilder
}
finally
{
StringBuilderPool.Free(stringBuilder);
}
3 - PooledDictionary and PooledHashSet - These use ObjectPool directly and have a totally separate pool of objects. Stores a pool of 128 objects.
// Example 1
PooledHashSet<Foo> hashSet = PooledHashSet<Foo>.GetInstance();
// Do something with hashSet.
hashSet.Free();
// Example 2 - Safer version of Example 1.
PooledHashSet<Foo> hashSet = PooledHashSet<Foo>.GetInstance();
try
{
// Do something with hashSet.
}
finally
{
hashSet.Free();
}
Microsoft.IO.RecyclableMemoryStream
This library provides pooling for MemoryStream objects. It's a drop-in replacement for System.IO.MemoryStream. It has exactly the same semantics. It was designed by Bing engineers. Read the blog post here or see the code on GitHub.
var sourceBuffer = new byte[]{0,1,2,3,4,5,6,7};
var manager = new RecyclableMemoryStreamManager();
using (var stream = manager.GetStream())
{
stream.Write(sourceBuffer, 0, sourceBuffer.Length);
}
Note that RecyclableMemoryStreamManager should be declared once and will live for the entire process; this is the pool. It is perfectly fine to use multiple pools if you desire.
Something like this might suit your needs.
/// <summary>
/// Represents a pool of objects with a size limit.
/// </summary>
/// <typeparam name="T">The type of object in the pool.</typeparam>
public sealed class ObjectPool<T> : IDisposable
where T : new()
{
private readonly int size;
private readonly object locker;
private readonly Queue<T> queue;
private int count;
/// <summary>
/// Initializes a new instance of the ObjectPool class.
/// </summary>
/// <param name="size">The size of the object pool.</param>
public ObjectPool(int size)
{
if (size <= 0)
{
const string message = "The size of the pool must be greater than zero.";
throw new ArgumentOutOfRangeException("size", size, message);
}
this.size = size;
locker = new object();
queue = new Queue<T>();
}
/// <summary>
/// Retrieves an item from the pool.
/// </summary>
/// <returns>The item retrieved from the pool.</returns>
public T Get()
{
lock (locker)
{
if (queue.Count > 0)
{
return queue.Dequeue();
}
count++;
return new T();
}
}
/// <summary>
/// Places an item in the pool.
/// </summary>
/// <param name="item">The item to place to the pool.</param>
public void Put(T item)
{
lock (locker)
{
if (count < size)
{
queue.Enqueue(item);
}
else
{
using (item as IDisposable)
{
count--;
}
}
}
}
/// <summary>
/// Disposes of items in the pool that implement IDisposable.
/// </summary>
public void Dispose()
{
lock (locker)
{
count = 0;
while (queue.Count > 0)
{
using (queue.Dequeue() as IDisposable)
{
}
}
}
}
}
Example Usage
public class ThisObject
{
private readonly ObjectPool<That> pool = new ObjectPool<That>(100);
public void ThisMethod()
{
var that = pool.Get();
try
{
// Use that ....
}
finally
{
pool.Put(that);
}
}
}
Sample from MSDN: How to: Create an Object Pool by Using a ConcurrentBag
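The gist of that sample, as a paraphrased sketch (not the MSDN code verbatim):
using System;
using System.Collections.Concurrent;

public class BagObjectPool<T>
{
    // ConcurrentBag is thread-safe, so Get/Return need no explicit locks.
    private readonly ConcurrentBag<T> _items = new();
    private readonly Func<T> _generator;

    public BagObjectPool(Func<T> generator)
    {
        _generator = generator ?? throw new ArgumentNullException(nameof(generator));
    }

    // Take an existing item if one is available; otherwise create a new one.
    public T Get() => _items.TryTake(out var item) ? item : _generator();

    // Put the item back into the bag for later reuse.
    public void Return(T item) => _items.Add(item);
}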
Back in the day Microsoft provided a framework through Microsoft Transaction Server (MTS) and later COM+ to do object pooling for COM objects. That functionality was carried forward to System.EnterpriseServices in the .NET Framework and now in Windows Communication Foundation.
Object Pooling in WCF
This article is from .NET 1.1 but should still apply in the current versions of the Framework (even though WCF is the preferred method).
Object Pooling .NET
I really like Aaronaught's implementation - especially since he handles waiting on a resource becoming available through the use of a semaphore. There are several additions I would like to make:
Change sync.WaitOne() to sync.WaitOne(timeout) and expose the timeout as a parameter on an Acquire(int timeout) method. This would also necessitate handling the condition where the thread times out waiting for an object to become available; see the sketch after this list.
Add Recycle(T item) method to handle situations when an object needs to be recycled when a failure occurs, for example.
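To illustrate the first suggestion, the timeout overload might look like this (my sketch against the internals of Aaronaught's Pool<T>):
public T Acquire(int timeoutMilliseconds)
{
    // WaitOne returns false when no slot freed up within the timeout.
    if (!sync.WaitOne(timeoutMilliseconds))
        throw new TimeoutException("Timed out waiting for a pooled item.");
    switch (loadingMode)
    {
        case LoadingMode.Eager:
            return AcquireEager();
        case LoadingMode.Lazy:
            return AcquireLazy();
        default:
            return AcquireLazyExpanding();
    }
}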
This is another implementation, with a limited number of objects in the pool.
public class ObjectPool<T>
where T : class
{
private readonly int maxSize;
private Func<T> constructor;
private int currentSize;
private Queue<T> pool;
private AutoResetEvent poolReleasedEvent;
public ObjectPool(int maxSize, Func<T> constructor)
{
this.maxSize = maxSize;
this.constructor = constructor;
this.currentSize = 0;
this.pool = new Queue<T>();
this.poolReleasedEvent = new AutoResetEvent(false);
}
public T GetFromPool()
{
T item = null;
do
{
lock (this)
{
if (this.pool.Count == 0)
{
if (this.currentSize < this.maxSize)
{
item = this.constructor();
this.currentSize++;
}
}
else
{
item = this.pool.Dequeue();
}
}
if (null == item)
{
this.poolReleasedEvent.WaitOne();
}
}
while (null == item);
return item;
}
public void ReturnToPool(T item)
{
lock (this)
{
this.pool.Enqueue(item);
this.poolReleasedEvent.Set();
}
}
}
Java-oriented, this article exposes the connectionImpl pool pattern and the abstracted object pool pattern, and could be a good first approach:
http://www.developer.com/design/article.php/626171/Pattern-Summaries-Object-Pool.htm
Object Pool pattern (diagram).
You may use the NuGet package Microsoft.Extensions.ObjectPool
Documentations here:
https://learn.microsoft.com/en-us/aspnet/core/performance/objectpool?view=aspnetcore-3.1
https://learn.microsoft.com/en-us/dotnet/api/microsoft.extensions.objectpool
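A minimal usage sketch of that package (this uses the built-in StringBuilder policy; for your own types you would supply an IPooledObjectPolicy<T> or rely on the default policy):
using System.Text;
using Microsoft.Extensions.ObjectPool;

class ObjectPoolExample
{
    static void Main()
    {
        // The provider creates pools; policies control how items are
        // created and reset between uses.
        ObjectPoolProvider provider = new DefaultObjectPoolProvider();
        ObjectPool<StringBuilder> pool = provider.CreateStringBuilderPool();

        StringBuilder sb = pool.Get();
        try
        {
            sb.Append("pooled work");
        }
        finally
        {
            // Return resets the builder (per the policy) and recycles it.
            pool.Return(sb);
        }
    }
}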