In the code below I have a ConcurrentDictionary that I'm using for storing a single key/value pair, where the value is an object containing several string values.
I will be reading and updating the strings in this single key/value pair from different threads.
I'm aware that concurrent dictionaries are not entirely thread safe if one thread changes a value before another thread has finished reading it. But equally, I'm not sure whether string values really come into this topic or not; could someone please advise?
It's also worth mentioning that although I put this "GetRunTimeVariables" method into an interface for dependency injection, I can't actually use DI all the time for accessing this method. During the stages of app startup and in the OIDC sign-in/sign-out events I need to access the dictionary values from classes that can't use dependency injection, so in essence I could be accessing this dictionary by any means necessary throughout the lifetime of the application.
Lastly, I'm not really sure whether there is any benefit in pushing this method into an interface; the other option is to simply new up a reference to this class each time I need it. Some thoughts on this would be appreciated.
public class RunTimeSettings : IRunTimeSettings
{
    // Create a new static instance of the RunTimeSettingsDictionary that is used for storing settings that are used for quick
    // access during the lifetime of the application. Various other classes/threads will read/update the parameters.
    public static readonly ConcurrentDictionary<int, RunTimeVariables> RunTimeSettingsDictionary = new ConcurrentDictionary<int, RunTimeVariables>();

    public object GetRunTimeVariables()
    {
        dynamic settings = new RunTimeVariables();
        if (RunTimeSettingsDictionary.TryGetValue(1, out RunTimeVariables runTimeVariables))
        {
            settings.SiteName = runTimeVariables.SiteName;
            settings.Street = runTimeVariables.Street;
            settings.City = runTimeVariables.City;
            settings.Country = runTimeVariables.Country;
            settings.PostCode = runTimeVariables.PostCode;
            settings.Latitude = runTimeVariables.Latitude;
            settings.Longitude = runTimeVariables.Longitude;
        }
        return settings;
    }
}
Class for string values:
public class RunTimeVariables
{
    public bool InstalledLocationConfigured { get; set; }
    public string SiteName { get; set; }
    public string Street { get; set; }
    public string City { get; set; }
    public string Country { get; set; }
    public string PostCode { get; set; }
    public string Latitude { get; set; }
    public string Longitude { get; set; }
}
The System.String type (the classical strings of C#) is immutable. No one can modify the "content" of a String. Anyone can make a property reference a different String.
But in truth the only problem you have here is that the various values could become de-synced. You have several properties that are correlated. If one thread is modifying the object while another thread is reading it, the reading thread could see some properties of the "old" version and some properties of the "new" version. This isn't a big problem if the object, once written to the ConcurrentDictionary, is not changed (it is "immutable", at least as a business rule). Clearly a correct solution is to make RunTimeVariables an immutable object (composed only of read-only members that are initialized at construction, for example).
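For illustration, a minimal sketch of what an immutable RunTimeVariables could look like, using get-only properties set in the constructor (the member names come from the class above; the replacement line at the end is just one way to publish a new snapshot):
public class RunTimeVariables
{
    public bool InstalledLocationConfigured { get; }
    public string SiteName { get; }
    public string Street { get; }
    public string City { get; }
    public string Country { get; }
    public string PostCode { get; }
    public string Latitude { get; }
    public string Longitude { get; }

    public RunTimeVariables(bool installedLocationConfigured, string siteName, string street,
        string city, string country, string postCode, string latitude, string longitude)
    {
        InstalledLocationConfigured = installedLocationConfigured;
        SiteName = siteName;
        Street = street;
        City = city;
        Country = country;
        PostCode = postCode;
        Latitude = latitude;
        Longitude = longitude;
    }
}

// To "update" the settings, build a fresh instance and replace the stored value in one step:
// RunTimeSettingsDictionary[1] = new RunTimeVariables(true, "Site", "Street", "City", "Country", "PostCode", "0", "0");
Readers then always see a consistent snapshot, because no instance is ever mutated after it has been published.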
Related
I have created a Photo class which contains a lot of properties such as size, filename, latitude/longitude, location etc.
public class Photo {
    public string FileName { get; set; }
    public string Title { get; set; }
    public double Size { get; set; }
    public DateTime CapturedDate { get; set; }
    public string Latitude { get; set; }
    public string Longitude { get; set; }
    public string LocationName { get; set; }
    public byte[] Data { get; set; }
}
I want to create an object of this class and populate its properties in a step by step approach, where each step needs to perform some kind of operation before setting some of the values on the object.
In a quick and dirty solution to make this work, I simply created some void methods to fill the object's properties, but I think this is an anti-pattern.
var photo = new Photo();
photo.FileName = "Photo 1";
SetExifData(photo, photoStream);
SetLocationName(photo);
DoSomethingElse(photo);

void SetExifData(Photo photo, Stream photoStream) {
    // Read EXIF data from the photo stream and update the object
    photo.Latitude = "10,0";
    photo.Longitude = "10,0";
    photo.CapturedDate = ....
}

void SetLocationName(Photo photo) {
    // Call external API to get location name from lat/lng
    photo.LocationName = "London";
}

void DoSomethingElse(Photo photo) {
}
It would be optimal to pass this object to a kind of pipeline or builder. What pattern would be suitable for this scenario?
Going only from what you've told me, a builder pattern might be the way to go. This can be created to support a fluent interface (if that is sufficient for your Pipeline needs), or as an intermediate step for use when setting up a more specialized Pipeline pattern.
class PhotoBuilder
{
    // All properties go here
    private string Latitude { get; set; }
    private string Longitude { get; set; }
    ...

    public PhotoBuilder WithExifDataFromStream(Stream photoStream)
    {
        Latitude = ...;
        Longitude = ...;
        ...
        return this;
    }

    public Photo Build() => new Photo(...);
}
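Consumed fluently, usage could then look something like this (a sketch only; it assumes the elided builder members above are filled in):
var photo = new PhotoBuilder()
    .WithExifDataFromStream(photoStream)
    .Build();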
I would advise hiding all state in the builder and only creating the object at the end as a read-only object. It makes for much simpler reasoning.
The slightly less attractive alternative would be to hide your setters as internal and use them from the builder. I tend to stay away from such internals as far as possible for purity reasons, but YMMV. It certainly helps for very large structures. Even with this approach I would urge you to only expose the object at the end, to keep the separation of the builder and the object itself. Also make sure you pass out an "immutable" object, i.e. clear up any references to that photo as soon as Build is called, to avoid inadvertently changing the object post-build.
Idle musings:
I have found it good to be fairly restrictive about the kind of logic you put in such a builder to start with. Of course this will depend on your use case, but in larger systems certain logic will often be used in many places, and I have more than once had headaches over refactoring multi-purpose functionality out of an interface which was not as clean as it could be.
In this case, this could translate to SetLocation(double longitude, double latitude, string locationName) and WithCaptureDate(...). This would enable the builder to be used in more scenarios, and enable you to create special functions on top of it later if need be.
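As a rough illustration of that idea inside the PhotoBuilder above, adapted to the string-typed Latitude/Longitude of the Photo class (the method names, and the assumption that the builder also holds LocationName and CapturedDate among its elided private properties, are mine rather than a prescribed API):
public PhotoBuilder WithLocation(string latitude, string longitude, string locationName)
{
    // Plain value assignments only; no API calls or stream parsing inside the builder.
    Latitude = latitude;
    Longitude = longitude;
    LocationName = locationName;
    return this;
}

public PhotoBuilder WithCaptureDate(DateTime capturedDate)
{
    CapturedDate = capturedDate;
    return this;
}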
I've found it also helps in unit testing and composition. For example, you do not bind your builder to a Location API. This can also be solved with Dependency Injection, of course (ILocationResolver or similar), but then you might be looking at an object which violates the Single Responsibility Principle.
In the end, it is very dependent on the system you are building and the requirements for this particular class. I would err on the side of creating another wrapper which does all the location and stream parsing, but that is coming from someone who mostly deals with large interconnected systems where functionality reuse and composability requirements are high. If the system is very small, this may only introduce unwanted complexity.
I store this class
public class Customer
{
    public string Firstname { get; set; }
    public string Lastname { get; set; }
    public string CustID { get; set; }
}
In this dictionary:
public static ConcurrentDictionary<string, Customer> Customers = new ConcurrentDictionary<string, Customer>();
The key is a unique string for each customer.
I am trying to find the cleanest thread-safe way to update the properties of the customers stored in the dictionary.
Sorry if the code above has any syntax issues, typed it in from a smartphone.
Here is what I’m currently doing:
Customer oCustomer = new Customer();
Customers.TryGetValue(ID, out oCustomer);
Customer nCustomer = new Customer();
nCustomer = oCustomer;
nCustomer.Firstname = NewValue;
Customers.TryUpdate(ID, nCustomer, oCustomer);
This works but seems so hacked to me, any suggestions would be great.
This was closed as a duplicate of a question that asks how to modify the ConcurrentDictionary in a thread-safe way. I'm asking how to modify individual customers, not the dictionary.
I have not found an answer on Stack Overflow and have searched for some time. Will someone please re-open this question so I can get the help I came here for.
ConcurrentDictionary is a dictionary that is thread safe by default, and its individual operations on the dictionary itself are atomic by design.
ConcurrentDictionary is thread-safe. It just depends on what you expect from thread safety.
From MSDN
ConcurrentDictionary<TKey, TValue> is designed for multithreaded scenarios. You do not have to use locks in your code to add or remove items from the collection. However, it is always possible for one thread to retrieve a value, and another thread to immediately update the collection by giving the same key a new value.
The ConcurrentDictionary is thread-safe. That doesn't mean that its contents are.
The easiest way to ensure thread-safety is to make the objects immutable and replace them with a new one when you want to change the values. That's how immutable types in functional languages like F# work.
Your code, though, doesn't do that. It's still modifying the stored object. When you write nCustomer = oCustomer you don't clone the object; you change the variable to point to the original oCustomer.
You can make the class immutable by using only read-only properties which are initialized by the constructor:
public class Customer
{
    public string FirstName { get; }
    public string LastName { get; }
    public string CustID { get; }

    public Customer(string firstName, string lastName, string custID)
    {
        CustID = custID;
        FirstName = firstName;
        LastName = lastName;
    }
}
To update a customer, pull it from the dictionary, create a copy and call TryUpdate. Make sure to check for success. If TryUpdate fails it means some other thread modified the customer and you probably need to retry. Eg:
Customer old;
if (customers.TryGetValue(ID, out old))
{
    var newCustomer = new Customer(newName, old.LastName, old.CustID);
    if (!customers.TryUpdate(ID, newCustomer, old))
    {
        // Who moved my cheese?
    }
}
else
{
    // No customer!
}
You'll have to decide what to do if the customer value has changed:
You can retry the update, thus overwriting any other updates (a sketch of such a retry loop follows this list).
You can try reloading the customer and making your update, assuming that whoever changed the customer changed one of the other fields.
You can stop trying and warn the user that the record has already changed.
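A minimal sketch of the retry approach, built on the immutable Customer above (the loop structure and variable names are my own, not part of the original answer):
// Keep retrying until our change is applied on top of whatever the latest value is.
while (true)
{
    Customer old;
    if (!customers.TryGetValue(ID, out old))
    {
        // No customer with this ID; decide whether to add one or give up.
        break;
    }

    var updated = new Customer(newName, old.LastName, old.CustID);
    if (customers.TryUpdate(ID, updated, old))
    {
        break; // Success: nobody replaced the entry while we were working.
    }
    // Another thread replaced the entry first; loop and retry against the new value.
}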
In WPF, I want to store millions of objects efficiently, with low memory usage, and retrieve them very fast. Below is my sample class.
public class CellInfo
{
    public int A { get; set; }
    public int B { get; set; }
    public string C { get; set; }
    public object D { get; set; }
    public bool E { get; set; }
    public double F { get; set; }
    public ClassA G { get; set; }
}
I want to store millions of CellInfo objects, and each object has its own identity. I want to retrieve an object back using that identity. If a property of a CellInfo instance is not defined, then it should return a default value, which would be stored in a static field.
So I only want to store the properties of a CellInfo object that are defined; I don't want to keep the others in memory, and I can retrieve those from the static field.
Can anyone please suggest the fastest way to store and retrieve millions of objects with low memory usage?
Note: I don't want any additional software installation, database, or external file to store this.
You haven't indicated which field is the 'own identity', so I've assumed a Guid identity. A Dictionary keyed on this identity should offer the fastest retrieval.
Dictionary<Guid, CellInfo> cells = new Dictionary<Guid, CellInfo>();
If you have the data already, you can use .ToDictionary() to project the key / value mappings from an enumerable.
If you need to simultaneously mutate and access the collection from multiple threads (or if you intend making the collection static), you can swap out with a ConcurrentDictionary to address thread safety issues:
Before accessing an element, you'll need to determine whether the item exists in the dictionary via ContainsKey (or TryGetValue, as per Amleth). If not, use your default element. So I would suggest you hide the underlying dictionary implementation and force consumers through an encapsulation helper which does this check for you.
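A minimal sketch of such an encapsulation helper (the class name, the static default instance, and the use of a ConcurrentDictionary for the static case are illustrative assumptions; it needs System and System.Collections.Concurrent):
public static class CellInfoStore
{
    private static readonly CellInfo DefaultCellInfo = new CellInfo();
    private static readonly ConcurrentDictionary<Guid, CellInfo> cells =
        new ConcurrentDictionary<Guid, CellInfo>();

    public static void Set(Guid id, CellInfo info)
    {
        cells[id] = info; // Only explicitly defined objects take up memory here.
    }

    public static CellInfo Get(Guid id)
    {
        // Fall back to the shared default when no explicit entry was stored.
        CellInfo info;
        return cells.TryGetValue(id, out info) ? info : DefaultCellInfo;
    }
}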
I've been working on a project for a while to parse a list of entries from a csv file and use that data to update a database.
For each entry I create a new user instance that I put in a collection. Now I want to iterate that collection and compare the user entry to the user from the database (if it exists). My question is, how can I compare that user (entry) object to the user (db) object, while returning a list with differences?
For example following classes generated from database:
public class User
{
    public int ID { get; set; }
    public string EmployeeNumber { get; set; }
    public string UserName { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public Nullable<int> OfficeID { get; set; }
    public virtual Office Office { get; set; }
}

public class Office
{
    public int ID { get; set; }
    public string Code { get; set; }
    public virtual ICollection<User> Users { get; set; }
}
To save some queries to the database, I only fill the properties that I can retrieve from the csv file, so the IDs (for example) are not available for the equality check.
Is there any way to compare these objects without defining a rule for each property and returning a list of properties that are modified? I know this question seems similar to some earlier posts. I've read a lot of them but as I'm rather inexperienced at programming, I'd appreciate some advice.
From what I've gathered from what I've read, should I be combining 'comparing properties generically' with 'ignoring properties using data annotations' and 'returning a list of CompareResults'?
There are several approaches you can take to solve this:
Approach #1 is to create separate DTO-style classes for the contents of the CSV files. Though this involves creating new classes with a lot of similar fields, it decouples the CSV file format from your database and gives you the ability to change them later without influencing the other part. In order to implement the comparison, you could create a Comparer class. As long as the classes are almost identical, the comparison can get all the properties from the DTO class and implement the comparison dynamically (e.g. by creating and evaluating a Lambda expression that contains a BinaryExpression of type Equal).
Approach #2 avoids the DTOs, but uses attributes to mark the properties that are part of the comparison. You'd need to create a custom attribute that you assign to the properties in question. In the comparison, you analyze all the properties of the class and filter down to the ones that are marked with the attribute (a short sketch of this follows below). For the comparison of the properties you can use the same approach as in #1. The downside of this approach is that you couple the comparison logic tightly to the data classes. If you needed to implement several different comparisons, you'd clutter the data classes with attributes.
Of course, #1 results in higher effort than #2. I understand that it is not what you are looking for, but maybe having a separate, strongly-typed comparer class is also an approach one can think about.
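For approach #2, the marker attribute and the property filtering could look roughly like this (the attribute name and the trimmed-down User are illustrative assumptions; it needs System.Linq and System.Reflection):
[AttributeUsage(AttributeTargets.Property)]
public class ComparableFieldAttribute : Attribute
{
}

public class User
{
    public int ID { get; set; } // not compared

    [ComparableField]
    public string EmployeeNumber { get; set; }

    [ComparableField]
    public string UserName { get; set; }
    // ... remaining properties marked the same way
}

// Only the marked properties take part in the comparison.
var propsToCompare = typeof(User).GetProperties()
    .Where(p => p.IsDefined(typeof(ComparableFieldAttribute), false));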
Some more details on a dynamic comparison algorithm: it is based on reflection to get the properties that need to be compared (depending on the approach, you get the properties of the DTO or the relevant ones of the data class). Once you have the properties (in the case of DTOs, the properties should have the same name and data type), you can create a LambdaExpression and compile and evaluate it dynamically. The following lines show an excerpt of a code sample:
public static bool AreEqual<TDTO, TDATA>(TDTO dto, TDATA data)
{
    foreach (var prop in typeof(TDTO).GetProperties())
    {
        var dataProp = typeof(TDATA).GetProperty(prop.Name);
        if (dataProp == null)
            throw new InvalidOperationException(string.Format("Property {0} is missing in data class.", prop.Name));

        var compExpr = GetComparisonExpression(prop, dataProp);
        var del = compExpr.Compile();
        if (!(bool)del.DynamicInvoke(dto, data))
            return false;
    }
    return true;
}

private static LambdaExpression GetComparisonExpression(PropertyInfo dtoProp, PropertyInfo dataProp)
{
    var dtoParam = Expression.Parameter(dtoProp.DeclaringType, "dto");
    var dataParam = Expression.Parameter(dataProp.DeclaringType, "data");
    return Expression.Lambda(
        Expression.MakeBinary(ExpressionType.Equal,
            Expression.MakeMemberAccess(dtoParam, dtoProp),
            Expression.MakeMemberAccess(dataParam, dataProp)),
        dtoParam, dataParam);
}
For the full sample, see this link. Please note that this dynamic approach is just an easy implementation that leaves room for improvement (e.g. there is no check on the data type of the properties). It also only checks for equality and does not collect the properties that are not equal, but that should be easy to transfer.
While the dynamic approach is easy to implement, the risk for runtime errors is bigger than in a strongly-typed approach.
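As a sketch of that "easy to transfer" step, the same loop can collect the names of the differing properties instead of returning early, reusing GetComparisonExpression from the sample above (this variant is my own adaptation, not part of the linked code, and needs System.Collections.Generic):
public static List<string> GetDifferences<TDTO, TDATA>(TDTO dto, TDATA data)
{
    var differences = new List<string>();
    foreach (var prop in typeof(TDTO).GetProperties())
    {
        var dataProp = typeof(TDATA).GetProperty(prop.Name);
        if (dataProp == null)
            throw new InvalidOperationException(string.Format("Property {0} is missing in data class.", prop.Name));

        var compExpr = GetComparisonExpression(prop, dataProp);
        var del = compExpr.Compile();
        if (!(bool)del.DynamicInvoke(dto, data))
            differences.Add(prop.Name); // Record the mismatch and keep checking the rest.
    }
    return differences;
}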
I need to update an InstrumentInfo class frequently. I update this class from one thread and access (read) it from another.
I have an Instrument class. For each Instrument I need to maintain an InstrumentInfo:
// class Instrument omitted as not important
public class InstrumentInfo
{
    public string Name { get; set; }
    public TradingStatus Status { get; set; }
    public decimal MinStep;
    public double ValToday;
    public decimal BestBuy;
    public decimal BestSell;
}

public class DerivativeInfo : InstrumentInfo
{
    public DateTime LastTradeDate { get; set; }
    public DateTime ExpirationDate { get; set; }
    public string UnderlyingTicker { get; set; }
}

// I have several more subclasses
I do have two options:
Create only one InstrumentInfo per Instrument. When some field updates, for example BestBuy, just update the value of that field. Clients obtain the InstrumentInfo only once and use it during the entire application lifetime.
On each update, create a new instance of InstrumentInfo. Clients obtain the most recent copy of the InstrumentInfo every time.
With 1 I do need to lock, because updates to the decimal, DateTime and string members are not guaranteed to be atomic. But I don't need to re-instantiate the object.
With 2 I don't need to lock at all, as a reference update is atomic. But I will likely use more memory and probably create more work for the GC, because every time I need to instantiate a new object (and initialize all of its fields).
Implementation 1:
private InstrumentInfo[] instrumentInfos = new InstrumentInfo[Constants.MAX_INSTRUMENTS_NUMBER_IN_SYSTEM];

// invoked from different threads
public InstrumentInfo GetInstrumentInfo(Instrument instrument)
{
    lock (instrumentInfos) {
        var result = instrumentInfos[instrument.Id];
        if (result == null) {
            result = new InstrumentInfo();
            instrumentInfos[instrument.Id] = result;
        }
        return result;
    }
}
...........
InstrumentInfo ii = GetInstrumentInfo(instrument);
lock (ii) {
    ii.BestSell = BestSell;
}
Implementation 2:
private InstrumentInfo[] instrumentInfos = new InstrumentInfo[Constants.MAX_INSTRUMENTS_NUMBER_IN_SYSTEM];

// get and set are invoked from different threads,
// but I don't need to lock at all!!! as the reference update is atomic
public void SetInstrumentInfo(Instrument instrument, InstrumentInfo info)
{
    if (instrument == null || info == null)
    {
        return;
    }
    instrumentInfos[instrument.Id] = info;
}

// get and set are invoked from different threads
public InstrumentInfo GetInstrumentInfo(Instrument instrument)
{
    return instrumentInfos[instrument.Id];
}
....
InstrumentInfo ii = new InstrumentInfo {
    Name = ...,
    TradingStatus = ...,
    ...
    BestSell = ...
};
SetInstrumentInfo(instrument, ii); // replace the InstrumentInfo
So what do you think? I want to use approach 2 because I like code without locks! Am I correct that I do not need a lock at all, as I just replace the reference? Do you agree that 2 is preferred? Any suggestions are welcome.
With 2 I don't need to lock at all, as a reference update is atomic. But I will likely use more memory and probably create more work for the GC because
No, your option 1 is just as likely to cause more load on the GC (by promoting more objects to the next generation).
Use the most sensible, maintainable form. In this case, create new objects.
Do not optimize based on what you 'think' might be slower. Use a profiler.
You should consider several unrelated points.
When you can go without locks, you should go without them, of course. And when you go for multithreading, prefer immutable objects.
On the other side, immutable objects
strain memory
are considered "anti-OOP"
may be incorrectly consumed by client code (because people are not used to working with them)
Your second approach still requires some concurrency handling strategy, because several threads may set infos with different starting assumptions.
One caveat: a plain reference assignment is atomic, but atomicity alone does not guarantee that other threads will promptly see the new value; that is part of why the CLR has Interlocked.Exchange<T> and volatile. Thanks to Henk Holterman for pointing this out.
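If you want the publication in approach 2 to be explicit about this, a minimal sketch could look like the following (using Volatile.Read/Volatile.Write from System.Threading on the same array as in the question; this is one possible way to wire it up, not the only valid one):
public void SetInstrumentInfo(Instrument instrument, InstrumentInfo info)
{
    if (instrument == null || info == null)
    {
        return;
    }
    // Publish the fully initialized object with release semantics.
    Volatile.Write(ref instrumentInfos[instrument.Id], info);
}

public InstrumentInfo GetInstrumentInfo(Instrument instrument)
{
    // Acquire semantics: a reader that sees the new reference also sees the fields it was built with.
    return Volatile.Read(ref instrumentInfos[instrument.Id]);
}
Interlocked.Exchange(ref instrumentInfos[instrument.Id], info) would work as well, and additionally returns the previous value, which is handy if you need to know what you replaced.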