I'm trying to save the streaming data of a pressure map.
Basically I have a pressure matrix defined as:
double[,] pressureMatrix = new double[e.Data.GetLength(0), e.Data.GetLength(1)];
I receive one of these pressure matrices every 10 milliseconds, and I want to save all the information in a JSON file so that I can reproduce it later.
What I do is, first of all, write what I call the header with all the settings used to do the recording like this:
recordedData.softwareVersion = Assembly.GetExecutingAssembly().GetName().Version.Major.ToString() + "." + Assembly.GetExecutingAssembly().GetName().Version.Minor.ToString();
recordedData.calibrationConfiguration = calibrationConfiguration;
recordedData.representationConfiguration = representationSettings;
recordedData.pressureData = new List<PressureMap>();
var json = JsonConvert.SerializeObject(recordedData, Formatting.None);
File.WriteAllText(this.filePath, json);
Then, every time I get a new pressure map I create a new Thread to add the new PressureMatrix and re-write the file:
var newPressureMatrix = new PressureMap(datos, DateTime.Now);
recordedData.pressureData.Add(newPressureMatrix);
var json = JsonConvert.SerializeObject(recordedData, Formatting.None);
File.WriteAllText(this.filePath, json);
After about 20-30 minutes I get an OutOfMemoryException because the system can no longer hold the recordedData variable: the List<PressureMap> inside it has grown too big.
How can I handle this so that all the data gets saved? I would like to record 24-48 hours of information.
Your basic problem is that you are holding all of your pressure map samples in memory rather than writing each one individually and then allowing it to be garbage collected. What's worse, you are doing this in two different places:
1. You serialize your entire list of samples to a JSON string json before writing the string to a file.
Instead, as explained in Performance Tips: Optimize Memory Usage, you should serialize and deserialize directly to and from your file in such situations. For instructions on how to do this see this answer to Can Json.NET serialize / deserialize to / from a stream? and also Serialize JSON to a file.
2. The recordedData.pressureData = new List<PressureMap>(); accumulates all pressure map samples, then writes all of them every time a sample is made.
A better solution would be to write each sample once and forget it, but the requirement for each sample to be nested inside some container objects in the JSON makes it nonobvious how to do that.
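As a minimal sketch of the fix for the first issue (serializing straight to the file instead of building an intermediate string), something like the following should work. The file name and the payload are hypothetical stand-ins, and the snippet is written as C# 9 top-level statements for brevity; it assumes the Newtonsoft.Json package is referenced.

```csharp
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

// Hypothetical file name and payload, for illustration only.
var path = Path.Combine(Path.GetTempPath(), "recording-sketch.json");
var payload = new Dictionary<string, double[]> { ["row0"] = new[] { 0.1, 0.2 } };

// The serializer writes tokens to the file as it walks the object graph,
// so the complete JSON text is never held in memory as one string.
using (var stream = new FileStream(path, FileMode.Create))
using (var textWriter = new StreamWriter(stream))
using (var jsonWriter = new JsonTextWriter(textWriter))
{
    JsonSerializer.CreateDefault().Serialize(jsonWriter, payload);
}
```

This alone does not solve the second issue: the list of samples would still grow without bound.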
So, how to attack issue #2?
First, let's modify your data model as follows, partitioning the header data into a separate class:
public class PressureMap
{
public double[,] PressureMatrix { get; set; }
}
public class CalibrationConfiguration
{
// Data model not included in question
}
public class RepresentationConfiguration
{
// Data model not included in question
}
public class RecordedDataHeader
{
public string SoftwareVersion { get; set; }
public CalibrationConfiguration CalibrationConfiguration { get; set; }
public RepresentationConfiguration RepresentationConfiguration { get; set; }
}
public class RecordedData
{
// Ensure the header is serialized first.
[JsonProperty(Order = 1)]
public RecordedDataHeader RecordedDataHeader { get; set; }
// Ensure the pressure data is serialized last.
[JsonProperty(Order = 2)]
public IEnumerable<PressureMap> PressureData { get; set; }
}
Option #1 is a version of the producer-consumer pattern. It involves spinning up two threads: one to generate PressureData samples, and one to serialize the RecordedData. The first thread will generate samples and add them to a BlockingCollection<PressureMap> collection that is passed to the second thread. The second thread will then serialize BlockingCollection<PressureMap>.GetConsumingEnumerable() as the value of RecordedData.PressureData.
The following code gives a skeleton for how to do this:
var sampleCount = 400; // Or whatever stopping criterion you prefer
var sampleInterval = 10; // in ms
using (var pressureData = new BlockingCollection<PressureMap>())
{
// Adapted from
// https://learn.microsoft.com/en-us/dotnet/standard/collections/thread-safe/blockingcollection-overview
// https://learn.microsoft.com/en-us/dotnet/api/system.collections.concurrent.blockingcollection-1?view=netframework-4.7.2
// Spin up a Task to sample the pressure maps
using (Task t1 = Task.Factory.StartNew(() =>
{
for (int i = 0; i < sampleCount; i++)
{
var data = GetPressureMap(i);
Console.WriteLine("Generated sample {0}", i);
pressureData.Add(data);
System.Threading.Thread.Sleep(sampleInterval);
}
pressureData.CompleteAdding();
}))
{
// Spin up a Task to consume the BlockingCollection
using (Task t2 = Task.Factory.StartNew(() =>
{
var recordedDataHeader = new RecordedDataHeader
{
SoftwareVersion = softwareVersion,
CalibrationConfiguration = calibrationConfiguration,
RepresentationConfiguration = representationConfiguration,
};
var settings = new JsonSerializerSettings
{
ContractResolver = new CamelCasePropertyNamesContractResolver(),
};
using (var stream = new FileStream(this.filePath, FileMode.Create))
using (var textWriter = new StreamWriter(stream))
using (var jsonWriter = new JsonTextWriter(textWriter))
{
int j = 0;
var query = pressureData
.GetConsumingEnumerable()
.Select(p =>
{
// Flush the writer periodically in case the process terminates abnormally
jsonWriter.Flush();
Console.WriteLine("Serializing item {0}", j++);
return p;
});
var recordedData = new RecordedData
{
RecordedDataHeader = recordedDataHeader,
// Since PressureData is declared as IEnumerable<PressureMap>, evaluation will be lazy.
PressureData = query,
};
Console.WriteLine("Beginning serialization of {0} to {1}:", recordedData, this.filePath);
// Serialize through the same jsonWriter that the query above flushes periodically.
JsonSerializer.CreateDefault(settings).Serialize(jsonWriter, recordedData);
Console.WriteLine("Finished serialization of {0} to {1}.", recordedData, this.filePath);
}
}))
{
Task.WaitAll(t1, t2);
}
}
}
Notes:
This solution uses the fact that, when serializing an IEnumerable<T>, Json.NET will not materialize the enumerable as a list. Instead it will take full advantage of lazy evaluation and simply enumerate through it, writing then forgetting each individual item encountered.
The first thread samples PressureData and adds them to the blocking collection.
The second thread wraps the blocking collection in an IEnumerable<PressureMap> then serializes that as RecordedData.PressureData.
During serialization, the serializer will enumerate through the IEnumerable<PressureMap> enumerable, streaming each item to the JSON file then proceeding to the next -- effectively blocking until one becomes available.
You will need to do some experimentation to make sure that the serialization thread can "keep up" with the sampling thread, possibly by setting a BoundedCapacity during construction. If not, you may need to adopt a different strategy.
PressureMap GetPressureMap(int count) should be some method of yours (not shown in the question) that returns the current pressure map sample.
In this technique the JSON file remains open for the duration of the sampling session. If sampling terminates abnormally the file may be truncated. I make some attempt to ameliorate the problem by flushing the writer periodically.
While data serialization will no longer require unbounded amounts of memory, deserializing a RecordedData later will deserialize the PressureData array into a concrete List<PressureMap>. This may possibly cause memory issues during downstream processing.
Demo fiddle #1 here.
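The note about BoundedCapacity can be sketched with plain BCL types. The capacity, the sample count and the Thread.Sleep standing in for slow serialization are all made-up values, and the snippet is written as C# 9 top-level statements for brevity. With a bounded collection, Add blocks the producer once the queue is full, so memory stays bounded at the cost of delaying the sampling thread.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical numbers chosen for illustration only.
const int capacity = 5;      // at most 5 unsaved samples are held in memory
const int sampleCount = 50;

int maxQueued = 0;
int consumed = 0;

using (var queue = new BlockingCollection<int>(boundedCapacity: capacity))
{
    var producer = Task.Run(() =>
    {
        for (int i = 0; i < sampleCount; i++)
        {
            queue.Add(i); // blocks while the queue already holds `capacity` items
        }
        queue.CompleteAdding();
    });

    var consumer = Task.Run(() =>
    {
        foreach (var item in queue.GetConsumingEnumerable())
        {
            // Record the largest backlog seen so far.
            maxQueued = Math.Max(maxQueued, queue.Count);
            Thread.Sleep(1); // stand-in for slow serialization
            consumed++;
        }
    });

    Task.WaitAll(producer, consumer);
}

Console.WriteLine("Consumed {0} samples; backlog never exceeded {1}.", consumed, maxQueued);
```

If blocking the sampler is unacceptable, TryAdd could be used instead of Add to detect (and count) samples that would have to be dropped.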
Option #2 would be to switch from a JSON file to a Newline Delimited JSON file. Such a file consists of sequences of JSON objects separated by newline characters. In your case, you would make the first object contain the RecordedDataHeader information, and the subsequent objects be of type PressureMap:
var sampleCount = 100; // Or whatever
var sampleInterval = 10;
var recordedDataHeader = new RecordedDataHeader
{
SoftwareVersion = softwareVersion,
CalibrationConfiguration = calibrationConfiguration,
RepresentationConfiguration = representationConfiguration,
};
var settings = new JsonSerializerSettings
{
ContractResolver = new CamelCasePropertyNamesContractResolver(),
};
// Write the header
Console.WriteLine("Beginning serialization of sample data to {0}.", this.filePath);
using (var stream = new FileStream(this.filePath, FileMode.Create))
{
JsonExtensions.ToNewlineDelimitedJson(stream, new[] { recordedDataHeader });
}
// Write each sample incrementally
for (int i = 0; i < sampleCount; i++)
{
Thread.Sleep(sampleInterval);
Console.WriteLine("Performing sample {0} of {1}", i, sampleCount);
var map = GetPressureMap(i);
using (var stream = new FileStream(this.filePath, FileMode.Append))
{
JsonExtensions.ToNewlineDelimitedJson(stream, new[] { map });
}
}
Console.WriteLine("Finished serialization of sample data to {0}.", this.filePath);
Using the extension methods:
public static partial class JsonExtensions
{
// Adapted from the answer to
// https://stackoverflow.com/questions/44787652/serialize-as-ndjson-using-json-net
// by dbc https://stackoverflow.com/users/3744182/dbc
public static void ToNewlineDelimitedJson<T>(Stream stream, IEnumerable<T> items)
{
// Let caller dispose the underlying stream
using (var textWriter = new StreamWriter(stream, new UTF8Encoding(false, true), 1024, true))
{
ToNewlineDelimitedJson(textWriter, items);
}
}
public static void ToNewlineDelimitedJson<T>(TextWriter textWriter, IEnumerable<T> items)
{
var serializer = JsonSerializer.CreateDefault();
foreach (var item in items)
{
// Formatting.None is the default; I set it here for clarity.
using (var writer = new JsonTextWriter(textWriter) { Formatting = Formatting.None, CloseOutput = false })
{
serializer.Serialize(writer, item);
}
// http://specs.okfnlabs.org/ndjson/
// Each JSON text MUST conform to the [RFC7159] standard and MUST be written to the stream followed by the newline character \n (0x0A).
// The newline character MAY be preceded by a carriage return \r (0x0D). The JSON texts MUST NOT contain newlines or carriage returns.
textWriter.Write("\n");
}
}
// Adapted from the answer to
// https://stackoverflow.com/questions/29729063/line-delimited-json-serializing-and-de-serializing
// by Yuval Itzchakov https://stackoverflow.com/users/1870803/yuval-itzchakov
public static IEnumerable<TBase> FromNewlineDelimitedJson<TBase, THeader, TRow>(TextReader reader)
where THeader : TBase
where TRow : TBase
{
bool first = true;
using (var jsonReader = new JsonTextReader(reader) { CloseInput = false, SupportMultipleContent = true })
{
var serializer = JsonSerializer.CreateDefault();
while (jsonReader.Read())
{
if (jsonReader.TokenType == JsonToken.Comment)
continue;
if (first)
{
yield return serializer.Deserialize<THeader>(jsonReader);
first = false;
}
else
{
yield return serializer.Deserialize<TRow>(jsonReader);
}
}
}
}
}
Later, you can process the newline delimited JSON file as follows:
using (var stream = File.OpenRead(filePath))
using (var textReader = new StreamReader(stream))
{
foreach (var obj in JsonExtensions.FromNewlineDelimitedJson<object, RecordedDataHeader, PressureMap>(textReader))
{
if (obj is RecordedDataHeader)
{
var header = (RecordedDataHeader)obj;
// Process the header
Console.WriteLine(JsonConvert.SerializeObject(header));
}
else
{
var row = (PressureMap)obj;
// Process the row.
Console.WriteLine(JsonConvert.SerializeObject(row));
}
}
}
Notes:
This approach looks simpler because the samples are added incrementally to the end of the file, rather than inserted inside some overall JSON container.
With this approach both serialization and downstream processing can be done with bounded memory use.
The sample file does not remain open for the duration of sampling, so is less likely to be truncated.
Downstream applications may not have built-in tools for processing newline delimited JSON.
This strategy may integrate more simply with your current threading code.
Demo fiddle #2 here.
I'm trying to implement FileCache (https://github.com/acarteas/FileCache) which is based on ObjectCache.
I'm trying to check a cache object for its existence, or add it if required and return. When the object does not exist however, the delegate is not executed and an error is thrown: Type 'myNamespace.Controllers.ListController+<>c__DisplayClass0_0' in [...] is not marked as serializable.
What I've tried (simplified):
private string generateString(int? Id)
{
return "String";
}
public ActionResult GetSomething(int? Id)
{
var cacheFilePath = $"{AppDomain.CurrentDomain.BaseDirectory}{"\\cache"}";
var cache = new FileCache(cacheFilePath, false, TimeSpan.FromSeconds(30));
if (purgeCache)
cache.Remove($"CacheKey{Id}");
Func<string> createString = () => generateString(Id);
var myString = (string) cache.AddOrGetExisting($"CacheKey{Id}", createString, new CacheItemPolicy() { AbsoluteExpiration = DateTimeOffset.Now.AddSeconds(30) });
return new JsonStringResult(myString);
}
Ok, now I've tried to specify the delegate createString with Serializable, but that doesn't work.
Could someone point me in the right direction?
Essentially I want:
- run a statement that returns a previous output of generateString(123); if it doesn't exist or is expired, it should re-generate it.
Thanks for any tips!
Due to the nature of FileCache, I think the only reasonable way to do that is to fall back to the usual way: check whether the item exists in the cache and, if not, add it:
private static readonly object _cacheLock = new object();
private static readonly FileCache _cache = new FileCache($"{AppDomain.CurrentDomain.BaseDirectory}{"\\cache"}", false, TimeSpan.FromSeconds(30));
/// ...
lock (_cacheLock) {
var myString = _cache.Get($"CacheKey{Id}");
if (myString == null) {
myString = generateString(Id);
_cache.Set($"CacheKey{Id}", myString, new CacheItemPolicy() { AbsoluteExpiration = DateTimeOffset.Now.AddSeconds(30) });
}
}
Lock is necessary because FileCache both writes and reads from the same file, and this is never safe to do from multiple threads.
The signature for AddOrGetExisting says the second parameter is object value and not a callback delegate:
https://github.com/acarteas/FileCache/blob/master/src/FileCache/FileCache.cs
public override object AddOrGetExisting(string key, object value, CacheItemPolicy policy, string regionName = null)
I think you just want this (I've corrected other potential issues in your code too):
public ActionResult GetSomething(int? id)
{
String cacheFilePath = Path.Combine( AppDomain.CurrentDomain.BaseDirectory, "Cache" );
FileCache cache = new FileCache( cacheFilePath, false, TimeSpan.FromSeconds(30) );
String cacheKey = String.Format( CultureInfo.InvariantCulture, "CacheKey{0}", id );
if( purgeCache ) cache.Remove( cacheKey );
String valueString = this.GenerateString( id );
String myString = (String)cache.AddOrGetExisting( cacheKey, valueString, new CacheItemPolicy() { AbsoluteExpiration = DateTimeOffset.Now.AddSeconds(30) });
return new JsonStringResult( myString );
}
C# interpolated strings $"like {this}" are not well suited to non-UI code such as cache keys because they format with CultureInfo.CurrentCulture, so the output can vary with the current thread's culture, which in ASP.NET may be set from the visitor's browser's Accept-Language header. It's best to use an explicit String.Format( CultureInfo.InvariantCulture, format, args ) instead.
Your code had redundant steps for generating the cache key, I moved it to a single variable cacheKey instead.
C# naming conventions use camelCase for parameters and locals, and PascalCase for methods.
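To make the culture point concrete, here is a small sketch. It uses a double rather than the int? id from the question, because that makes the difference visible, and it forces a German culture to simulate such a request; both choices are assumptions for illustration. FormattableString.Invariant is a BCL helper that gives the invariant result while keeping the interpolated syntax. Written as C# 9 top-level statements for brevity.

```csharp
using System;
using System.Globalization;

double value = 1234.5;

// Simulate a request whose thread culture is German.
CultureInfo.CurrentCulture = CultureInfo.GetCultureInfo("de-DE");

// Formatted with CultureInfo.CurrentCulture, so the decimal separator depends on the culture.
string interpolated = $"CacheKey{value}";

// Both of these always format with the invariant culture.
string invariant = string.Format(CultureInfo.InvariantCulture, "CacheKey{0}", value);
string viaHelper = FormattableString.Invariant($"CacheKey{value}");

Console.WriteLine(interpolated);
Console.WriteLine(invariant);
Console.WriteLine(viaHelper);
```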
I want to serialize a C# object as JSON into a stream, but to avoid the serialization if the object is not valid according to a schema. How should I proceed with this task using JSON.NET and Json.NET Schema? From what I see there is no method in the JSON.NET library which allows the validation of a C# object against a JSON schema. It seems somewhat weird that there is no direct method to just validate the C# object without encoding it. Do you have any idea why this method is not available?
It seems this API is not currently available. At a guess, this might be because recursively generating the JSON values to validate involves most of the work of serializing the object. Or it could just be because no one at Newtonsoft ever designed, specified, implemented, tested, documented and shipped that feature.
If you want, you could file an enhancement request requesting this API, probably as a part of the SchemaExtensions class.
In the meantime, if you do need to test-validate a POCO without generating a complete serialization of it (because e.g. the result would be very large), you could grab NullJsonWriter from Reference to automatically created objects, wrap it in a JSchemaValidatingWriter and test-serialize your object as shown in Validate JSON with JSchemaValidatingWriter. NullJsonWriter doesn't actually write anything, and so using it eliminates the performance and memory overhead of generating a complete serialization (either as a string or as a JToken).
First, add the following static method:
public static class JsonExtensions
{
public static bool TestValidate<T>(T obj, JSchema schema, SchemaValidationEventHandler handler = null, JsonSerializerSettings settings = null)
{
using (var writer = new NullJsonWriter())
using (var validatingWriter = new JSchemaValidatingWriter(writer) { Schema = schema })
{
int count = 0;
if (handler != null)
validatingWriter.ValidationEventHandler += handler;
validatingWriter.ValidationEventHandler += (o, a) => count++;
JsonSerializer.CreateDefault(settings).Serialize(validatingWriter, obj);
return count == 0;
}
}
}
// Used to enable Json.NET to traverse an object hierarchy without actually writing any data.
class NullJsonWriter : JsonWriter
{
public NullJsonWriter()
: base()
{
}
public override void Flush()
{
// Do nothing.
}
}
Then use it like:
// Example adapted from
// https://www.newtonsoft.com/jsonschema/help/html/JsonValidatingWriterAndSerializer.htm
// by James Newton-King
string schemaJson = @"{
'description': 'A person',
'type': 'object',
'properties': {
'name': {'type':'string'},
'hobbies': {
'type': 'array',
'maxItems': 3,
'items': {'type':'string'}
}
}
}";
var schema = JSchema.Parse(schemaJson);
var person = new
{
Name = "James",
Hobbies = new [] { ".Net", "Blogging", "Reading", "XBox", "LOLCATS" },
};
var settings = new JsonSerializerSettings { ContractResolver = new CamelCasePropertyNamesContractResolver() };
var isValid = JsonExtensions.TestValidate(person, schema, (o, a) => Console.WriteLine(a.Message), settings);
// Prints Array item count 5 exceeds maximum count of 3. Path 'hobbies'.
Console.WriteLine("isValid = {0}", isValid);
// Prints isValid = False
Watch out for letter case, by the way. Json.NET Schema is case-sensitive, so you will need to use an appropriate contract resolver when test-validating.
Sample fiddle.
You cannot do that from the JSON string; you first need an object and a schema to compare it with.
public void Validate()
{
//...
JsonSchema schema = JsonSchema.Parse("{'pattern':'lol'}");
JToken stringToken = JToken.FromObject("pie");
stringToken.Validate(schema);
}
I assume this code has concurrency issues:
const string CacheKey = "CacheKey";
static string GetCachedData()
{
string expensiveString =null;
if (MemoryCache.Default.Contains(CacheKey))
{
expensiveString = MemoryCache.Default[CacheKey] as string;
}
else
{
CacheItemPolicy cip = new CacheItemPolicy()
{
AbsoluteExpiration = new DateTimeOffset(DateTime.Now.AddMinutes(20))
};
expensiveString = SomeHeavyAndExpensiveCalculation();
MemoryCache.Default.Set(CacheKey, expensiveString, cip);
}
return expensiveString;
}
The reason for the concurrency issue is that multiple threads can get a null key and then attempt to insert data into cache.
What would be the shortest and cleanest way to make this code concurrency proof? I like to follow a good pattern across my cache related code. A link to an online article would be a great help.
UPDATE:
I came up with this code based on @Scott Chamberlain's answer. Can anyone find any performance or concurrency issue with this?
If this works, it would save many lines of code and many errors.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Runtime.Caching;
namespace CachePoc
{
class Program
{
static object everoneUseThisLockObject4CacheXYZ = new object();
const string CacheXYZ = "CacheXYZ";
static object everoneUseThisLockObject4CacheABC = new object();
const string CacheABC = "CacheABC";
static void Main(string[] args)
{
string xyzData = MemoryCacheHelper.GetCachedData<string>(CacheXYZ, everoneUseThisLockObject4CacheXYZ, 20, SomeHeavyAndExpensiveXYZCalculation);
string abcData = MemoryCacheHelper.GetCachedData<string>(CacheABC, everoneUseThisLockObject4CacheABC, 20, SomeHeavyAndExpensiveABCCalculation);
}
private static string SomeHeavyAndExpensiveXYZCalculation() {return "Expensive";}
private static string SomeHeavyAndExpensiveABCCalculation() {return "Expensive";}
public static class MemoryCacheHelper
{
public static T GetCachedData<T>(string cacheKey, object cacheLock, int cacheTimePolicyMinutes, Func<T> GetData)
where T : class
{
//Returns null if the string does not exist, prevents a race condition where the cache invalidates between the contains check and the retrieval.
T cachedData = MemoryCache.Default.Get(cacheKey, null) as T;
if (cachedData != null)
{
return cachedData;
}
lock (cacheLock)
{
//Check to see if anyone wrote to the cache while we were waiting our turn to write the new value.
cachedData = MemoryCache.Default.Get(cacheKey, null) as T;
if (cachedData != null)
{
return cachedData;
}
//The value still did not exist so we now write it in to the cache.
CacheItemPolicy cip = new CacheItemPolicy()
{
AbsoluteExpiration = new DateTimeOffset(DateTime.Now.AddMinutes(cacheTimePolicyMinutes))
};
cachedData = GetData();
MemoryCache.Default.Set(cacheKey, cachedData, cip);
return cachedData;
}
}
}
}
}
This is my 2nd iteration of the code. Because MemoryCache is thread safe you don't need to lock on the initial read, you can just read and if the cache returns null then do the lock check to see if you need to create the string. It greatly simplifies the code.
const string CacheKey = "CacheKey";
static readonly object cacheLock = new object();
private static string GetCachedData()
{
//Returns null if the string does not exist, prevents a race condition where the cache invalidates between the contains check and the retrieval.
var cachedString = MemoryCache.Default.Get(CacheKey, null) as string;
if (cachedString != null)
{
return cachedString;
}
lock (cacheLock)
{
//Check to see if anyone wrote to the cache while we were waiting our turn to write the new value.
cachedString = MemoryCache.Default.Get(CacheKey, null) as string;
if (cachedString != null)
{
return cachedString;
}
//The value still did not exist so we now write it in to the cache.
var expensiveString = SomeHeavyAndExpensiveCalculation();
CacheItemPolicy cip = new CacheItemPolicy()
{
AbsoluteExpiration = new DateTimeOffset(DateTime.Now.AddMinutes(20))
};
MemoryCache.Default.Set(CacheKey, expensiveString, cip);
return expensiveString;
}
}
EDIT: The code below is unnecessary, but I wanted to leave it to show the original method. It may be useful to future visitors who are using a different collection that has thread-safe reads but non-thread-safe writes (almost all of the classes under the System.Collections namespace are like that).
Here is how I would do it using ReaderWriterLockSlim to protect access. You need to do a kind of "Double Checked Locking" to see if anyone else created the cached item while we were waiting to take the lock.
const string CacheKey = "CacheKey";
static readonly ReaderWriterLockSlim cacheLock = new ReaderWriterLockSlim();
static string GetCachedData()
{
//First we do a read lock to see if it already exists, this allows multiple readers at the same time.
cacheLock.EnterReadLock();
try
{
//Returns null if the string does not exist, prevents a race condition where the cache invalidates between the contains check and the retrieval.
var cachedString = MemoryCache.Default.Get(CacheKey, null) as string;
if (cachedString != null)
{
return cachedString;
}
}
finally
{
cacheLock.ExitReadLock();
}
//Only one UpgradeableReadLock can exist at one time, but it can co-exist with many ReadLocks
cacheLock.EnterUpgradeableReadLock();
try
{
//We need to check again to see if the string was created while we were waiting to enter the EnterUpgradeableReadLock
var cachedString = MemoryCache.Default.Get(CacheKey, null) as string;
if (cachedString != null)
{
return cachedString;
}
//The entry still does not exist so we need to create it and enter the write lock
var expensiveString = SomeHeavyAndExpensiveCalculation();
cacheLock.EnterWriteLock(); //This will block till all the Readers flush.
try
{
CacheItemPolicy cip = new CacheItemPolicy()
{
AbsoluteExpiration = new DateTimeOffset(DateTime.Now.AddMinutes(20))
};
MemoryCache.Default.Set(CacheKey, expensiveString, cip);
return expensiveString;
}
finally
{
cacheLock.ExitWriteLock();
}
}
finally
{
cacheLock.ExitUpgradeableReadLock();
}
}
There is an open source library [disclaimer: that I wrote]: LazyCache that IMO covers your requirement with two lines of code:
IAppCache cache = new CachingService();
var cachedResults = cache.GetOrAdd("CacheKey",
() => SomeHeavyAndExpensiveCalculation());
It has built in locking by default so the cacheable method will only execute once per cache miss, and it uses a lambda so you can do "get or add" in one go. It defaults to 20 minutes sliding expiration.
There's even a NuGet package ;)
I've solved this issue by making use of the AddOrGetExisting method on the MemoryCache and the use of Lazy initialization.
Essentially, my code looks something like this:
static string GetCachedData(string key, DateTimeOffset offset)
{
Lazy<String> lazyObject = new Lazy<String>(() => SomeHeavyAndExpensiveCalculationThatReturnsAString());
var returnedLazyObject = MemoryCache.Default.AddOrGetExisting(key, lazyObject, offset);
if (returnedLazyObject == null)
return lazyObject.Value;
return ((Lazy<String>) returnedLazyObject).Value;
}
Worst case scenario here is that you create the same Lazy object twice. But that is pretty trivial. The use of AddOrGetExisting guarantees that you'll only ever get one instance of the Lazy object, and so you're also guaranteed to only call the expensive initialization method once.
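The "only once" guarantee can be checked with a small sketch. To keep it free of extra assembly references, a ConcurrentDictionary stands in for MemoryCache here; that substitution, the call counter and the 16 parallel callers are assumptions for illustration, not part of the answer's code. Written as C# 9 top-level statements for brevity.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Stand-in for MemoryCache so the sketch needs no extra assembly references.
var cache = new ConcurrentDictionary<string, Lazy<string>>();
int factoryCalls = 0;

string GetCachedData(string key)
{
    // Several Lazy wrappers may be constructed under contention, but only the
    // one that wins GetOrAdd is stored, and only its factory ever runs
    // (Lazy<T> defaults to LazyThreadSafetyMode.ExecutionAndPublication).
    var lazy = cache.GetOrAdd(key, _ => new Lazy<string>(() =>
    {
        Interlocked.Increment(ref factoryCalls);
        Thread.Sleep(50); // pretend this is expensive
        return "Expensive";
    }));
    return lazy.Value;
}

// Hammer the same key from many threads at once.
Parallel.For(0, 16, _ => GetCachedData("CacheKey"));
Console.WriteLine("Factory ran {0} time(s).", factoryCalls);
```

Callers asking for different keys do not block each other in this arrangement; only callers waiting on the same Lazy instance do.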
I assume this code has concurrency issues:
Actually, it's quite possibly fine, though with a possible improvement.
Now, in general, when multiple threads may set a shared value on first use, not locking while the value is obtained and set can be:
Disastrous - other code will assume only one instance exists.
Disastrous - the code that obtains the instance can only tolerate one (or perhaps a certain small number of) concurrent operations.
Disastrous - the means of storage is not thread-safe (e.g. have two threads adding to a dictionary and you can get all sorts of nasty errors).
Sub-optimal - the overall performance is worse than if locking had ensured only one thread did the work of obtaining the value.
Optimal - the cost of having multiple threads do redundant work is less than the cost of preventing it, especially since that can only happen during a relatively brief period.
However, considering here that MemoryCache may evict entries then:
If it's disastrous to have more than one instance then MemoryCache is the wrong approach.
If you must prevent simultaneous creation, you should do so at the point of creation.
MemoryCache is thread-safe in terms of access to that object, so that is not a concern here.
Both of these possibilities have to be thought about of course, though the only time having two instances of the same string existing can be a problem is if you're doing very particular optimisations that don't apply here*.
So, we're left with the possibilities:
It is cheaper to avoid the cost of duplicate calls to SomeHeavyAndExpensiveCalculation().
It is cheaper not to avoid the cost of duplicate calls to SomeHeavyAndExpensiveCalculation().
And working that out can be difficult (indeed, the sort of thing where it's worth profiling rather than assuming you can work it out). It's worth considering here though that most obvious ways of locking on insert will prevent all additions to the cache, including those that are unrelated.
This means that if we had 50 threads trying to set 50 different values, then we'll have to make all 50 threads wait on each other, even though they weren't even going to do the same calculation.
As such, you're probably better off with the code you have, than with code that avoids the race-condition, and if the race-condition is a problem, you quite likely either need to handle that somewhere else, or need a different caching strategy than one that expels old entries†.
The one thing I would change is I'd replace the call to Set() with one to AddOrGetExisting(). From the above it should be clear that it probably isn't necessary, but it would allow the newly obtained item to be collected, reducing overall memory use and allowing a higher ratio of low generation to high generation collections.
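A minimal sketch of that change, applied to the code from the question, might look as follows. The body of SomeHeavyAndExpensiveCalculation is a placeholder, and the snippet assumes a reference to System.Runtime.Caching; it is written as C# 9 top-level statements for brevity.

```csharp
using System;
using System.Runtime.Caching;

const string CacheKey = "CacheKey";

// Placeholder for the real work.
string SomeHeavyAndExpensiveCalculation() => "Expensive";

string GetCachedData()
{
    var cached = MemoryCache.Default.Get(CacheKey) as string;
    if (cached != null)
        return cached;

    // No lock: two threads may both get here and both do the calculation.
    var fresh = SomeHeavyAndExpensiveCalculation();
    var policy = new CacheItemPolicy
    {
        AbsoluteExpiration = DateTimeOffset.Now.AddMinutes(20)
    };

    // AddOrGetExisting returns null when our value was inserted, otherwise the
    // entry another thread stored first; in that case `fresh` can be collected.
    var existing = MemoryCache.Default.AddOrGetExisting(CacheKey, fresh, policy) as string;
    return existing ?? fresh;
}

Console.WriteLine(GetCachedData());
```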
So yeah, you could use double-locking to prevent concurrency, but either the concurrency isn't actually a problem, or you're storing the values in the wrong way, or double-locking on the store would not be the best way to solve it.
*If you know only one each of a set of strings exists, you can optimise equality comparisons, which is about the only time having two copies of a string can be incorrect rather than just sub-optimal, but you'd want to be doing very different types of caching for that to make sense. E.g. the sort XmlReader does internally.
†Quite likely either one that stores indefinitely, or one that makes use of weak references so it will only expel entries if there are no existing uses.
Somewhat dated question, but maybe still useful: you may take a look at FusionCache ⚡🦥, which I recently released.
The feature you are looking for is described here, and you can use it like this:
const string CacheKey = "CacheKey";
static string GetCachedData()
{
return fusionCache.GetOrSet(
CacheKey,
_ => SomeHeavyAndExpensiveCalculation(),
TimeSpan.FromMinutes(20)
);
}
You may also find some of the other features interesting like fail-safe, advanced timeouts with background factory completion and support for an optional, distributed 2nd level cache.
If you will give it a chance please let me know what you think.
/shameless-plug
It is difficult to choose which one is better; lock or ReaderWriterLockSlim. You need real world statistics of read and write numbers and ratios etc.
But if you believe using "lock" is the correct way, then here is a different solution for different needs. I also include Allan Xu's solution in the code, because both can be useful in different situations.
Here are the requirements that drove me to this solution:
You don't want to or cannot supply the 'GetData' function for some reason. Perhaps the 'GetData' function is located in some other class with a heavy constructor and you do not want to even create an instance until you are sure it is unavoidable.
You need to access the same cached data from different locations/tiers of the application, and those different locations don't have access to the same locker object.
You don't have a constant cache key. For example, you need to cache some data using the sessionId as the cache key.
Code:
using System;
using System.Runtime.Caching;
using System.Collections.Concurrent;
using System.Collections.Generic;
namespace CachePoc
{
class Program
{
static object everoneUseThisLockObject4CacheXYZ = new object();
const string CacheXYZ = "CacheXYZ";
static object everoneUseThisLockObject4CacheABC = new object();
const string CacheABC = "CacheABC";
static void Main(string[] args)
{
//Allan Xu's usage
string xyzData = MemoryCacheHelper.GetCachedDataOrAdd<string>(CacheXYZ, everoneUseThisLockObject4CacheXYZ, 20, SomeHeavyAndExpensiveXYZCalculation);
string abcData = MemoryCacheHelper.GetCachedDataOrAdd<string>(CacheABC, everoneUseThisLockObject4CacheABC, 20, SomeHeavyAndExpensiveABCCalculation);
//My usage
string sessionId = System.Web.HttpContext.Current.Session["CurrentUser.SessionId"].ToString();
string yvz = MemoryCacheHelper.GetCachedData<string>(sessionId);
if (string.IsNullOrWhiteSpace(yvz))
{
object locker = MemoryCacheHelper.GetLocker(sessionId);
lock (locker)
{
yvz = MemoryCacheHelper.GetCachedData<string>(sessionId);
if (string.IsNullOrWhiteSpace(yvz))
{
DatabaseRepositoryWithHeavyConstructorOverHead dbRepo = new DatabaseRepositoryWithHeavyConstructorOverHead();
yvz = dbRepo.GetDataExpensiveDataForSession(sessionId);
MemoryCacheHelper.AddDataToCache(sessionId, yvz, 5);
}
}
}
}
private static string SomeHeavyAndExpensiveXYZCalculation() { return "Expensive"; }
private static string SomeHeavyAndExpensiveABCCalculation() { return "Expensive"; }
public static class MemoryCacheHelper
{
//Allan Xu's solution
public static T GetCachedDataOrAdd<T>(string cacheKey, object cacheLock, int minutesToExpire, Func<T> GetData) where T : class
{
//Returns null if the string does not exist; prevents a race condition where the cache invalidates between the contains check and the retrieval.
T cachedData = MemoryCache.Default.Get(cacheKey, null) as T;
if (cachedData != null)
return cachedData;
lock (cacheLock)
{
//Check to see if anyone wrote to the cache while we were waiting our turn to write the new value.
cachedData = MemoryCache.Default.Get(cacheKey, null) as T;
if (cachedData != null)
return cachedData;
cachedData = GetData();
MemoryCache.Default.Set(cacheKey, cachedData, DateTime.Now.AddMinutes(minutesToExpire));
return cachedData;
}
}
#region "My Solution"
readonly static ConcurrentDictionary<string, object> Lockers = new ConcurrentDictionary<string, object>();
public static object GetLocker(string cacheKey)
{
CleanupLockers();
return Lockers.GetOrAdd(cacheKey, _ => new object());
}
public static T GetCachedData<T>(string cacheKey) where T : class
{
CleanupLockers();
T cachedData = MemoryCache.Default.Get(cacheKey) as T;
return cachedData;
}
public static void AddDataToCache(string cacheKey, object value, int cacheTimePolicyMinutes)
{
CleanupLockers();
MemoryCache.Default.Add(cacheKey, value, DateTimeOffset.Now.AddMinutes(cacheTimePolicyMinutes));
}
static DateTimeOffset lastCleanUpTime = DateTimeOffset.MinValue;
static void CleanupLockers()
{
if (DateTimeOffset.Now.Subtract(lastCleanUpTime).TotalMinutes > 1)
{
lock (Lockers)//maybe a better locker is needed?
{
try//bypass exceptions
{
List<string> lockersToRemove = new List<string>();
foreach (var locker in Lockers)
{
if (!MemoryCache.Default.Contains(locker.Key))
lockersToRemove.Add(locker.Key);
}
object dummy;
foreach (string lockerKey in lockersToRemove)
Lockers.TryRemove(lockerKey, out dummy);
lastCleanUpTime = DateTimeOffset.Now;
}
catch (Exception)
{ }
}
}
}
#endregion
}
}
class DatabaseRepositoryWithHeavyConstructorOverHead
{
internal string GetDataExpensiveDataForSession(string sessionId)
{
return "Expensive data from database";
}
}
}
To avoid the global lock, you can use SingletonCache to implement one lock per key, without exploding memory usage (the lock objects are removed when no longer referenced, and acquire/release is thread safe guaranteeing that only 1 instance is ever in use via compare and swap).
Using it looks like this:
static SingletonCache<string, object> keyLocks = new SingletonCache<string, object>();
const string CacheKey = "CacheKey";
static string GetCachedData()
{
string expensiveString = null;
if (MemoryCache.Default.Contains(CacheKey))
{
return MemoryCache.Default[CacheKey] as string;
}
// double checked lock
using (var lifetime = keyLocks.Acquire(CacheKey))
{
lock (lifetime.Value)
{
if (MemoryCache.Default.Contains(CacheKey))
{
return MemoryCache.Default[CacheKey] as string;
}
CacheItemPolicy cip = new CacheItemPolicy()
{
AbsoluteExpiration = new DateTimeOffset(DateTime.Now.AddMinutes(20))
};
expensiveString = SomeHeavyAndExpensiveCalculation();
MemoryCache.Default.Set(CacheKey, expensiveString, cip);
return expensiveString;
}
}
}
Code is here on GitHub: https://github.com/bitfaster/BitFaster.Caching
Install-Package BitFaster.Caching
There is also an LRU implementation that is lighter weight than MemoryCache, and has several advantages - faster concurrent reads and writes, bounded size, no background thread, internal perf counters etc. (disclaimer, I wrote it).
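The one-lock-per-key idea can also be sketched without any library. This is only an illustration under stated assumptions: PerKeyLockSketch, KeyLocks, Values and GetOrCompute are hypothetical names, the Values dictionary merely stands in for the real cache, and unlike SingletonCache this simplified version never removes its lock objects, so it only suits a bounded set of keys.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper for illustration; not part of BitFaster.Caching or MemoryCache.
public static class PerKeyLockSketch
{
    // One lock object per cache key. Never cleaned up in this sketch.
    private static readonly ConcurrentDictionary<string, object> KeyLocks =
        new ConcurrentDictionary<string, object>();

    // Stand-in for the real cache (assumption: values never expire here).
    private static readonly ConcurrentDictionary<string, string> Values =
        new ConcurrentDictionary<string, string>();

    public static string GetOrCompute(string key, Func<string> compute)
    {
        // Fast path: no lock is taken when the value is already cached.
        if (Values.TryGetValue(key, out var cached))
            return cached;

        // GetOrAdd hands every caller the same lock object for a given key,
        // so callers asking for different keys do not block each other.
        lock (KeyLocks.GetOrAdd(key, _ => new object()))
        {
            // Double check: another thread may have filled the cache while we waited.
            if (Values.TryGetValue(key, out cached))
                return cached;

            var value = compute();
            Values[key] = value;
            return value;
        }
    }
}
```

A call such as PerKeyLockSketch.GetOrCompute("CacheKey", SomeHeavyAndExpensiveCalculation) would then run the expensive calculation at most once per key, even when many threads ask at the same time.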
Console example of MemoryCache: "How to save/get simple class objects"
Output after launching and pressing any key except Esc:
Saving to cache!
Getting from cache!
some1
some2
class Some
{
public String text { get; set; }
public Some(String text)
{
this.text = text;
}
public override string ToString()
{
return text;
}
}
public static MemoryCache cache = new MemoryCache("cache");
public static string cache_name = "mycache";
static void Main(string[] args)
{
Some some1 = new Some("some1");
Some some2 = new Some("some2");
List<Some> list = new List<Some>();
list.Add(some1);
list.Add(some2);
do {
if (cache.Contains(cache_name))
{
Console.WriteLine("Getting from cache!");
List<Some> list_c = cache.Get(cache_name) as List<Some>;
foreach (Some s in list_c) Console.WriteLine(s);
}
else
{
Console.WriteLine("Saving to cache!");
cache.Set(cache_name, list, DateTime.Now.AddMinutes(10));
}
} while (Console.ReadKey(true).Key != ConsoleKey.Escape);
}
public interface ILazyCacheProvider : IAppCache
{
/// <summary>
/// Loads the data once, then always returns the cached result (even when the data is older than needed), which is very fast.
/// </summary>
/// <param name="key"></param>
/// <param name="getData"></param>
/// <param name="slidingExpiration"></param>
/// <typeparam name="T"></typeparam>
/// <returns></returns>
T GetOrAddPermanent<T>(string key, Func<T> getData, TimeSpan slidingExpiration);
}
/// <summary>
/// Initialize LazyCache in runtime
/// </summary>
public class LazzyCacheProvider: CachingService, ILazyCacheProvider
{
private readonly Logger _logger = LogManager.GetLogger("MemCashe");
private readonly Hashtable _hash = new Hashtable();
private readonly List<string> _reloader = new List<string>();
private readonly ConcurrentDictionary<string, DateTime> _lastLoad = new ConcurrentDictionary<string, DateTime>();
T ILazyCacheProvider.GetOrAddPermanent<T>(string dataKey, Func<T> getData, TimeSpan slidingExpiration)
{
var currentPrincipal = Thread.CurrentPrincipal;
if (!ObjectCache.Contains(dataKey) && !_hash.Contains(dataKey))
{
_hash[dataKey] = null;
_logger.Debug($"{dataKey} - first start");
_lastLoad[dataKey] = DateTime.Now;
_hash[dataKey] = ((object)GetOrAdd(dataKey, getData, slidingExpiration)).CloneObject();
_lastLoad[dataKey] = DateTime.Now;
_logger.Debug($"{dataKey} - first");
}
else
{
if ((!ObjectCache.Contains(dataKey) || _lastLoad[dataKey].Add(slidingExpiration) < DateTime.Now) && _hash[dataKey] != null)
Task.Run(() =>
{
if (_reloader.Contains(dataKey)) return;
lock (_reloader)
{
if (ObjectCache.Contains(dataKey))
{
if (_lastLoad[dataKey].Add(slidingExpiration) > DateTime.Now)
return;
_lastLoad[dataKey] = DateTime.Now;
Remove(dataKey);
}
_reloader.Add(dataKey);
Thread.CurrentPrincipal = currentPrincipal;
_logger.Debug($"{dataKey} - reload start");
_hash[dataKey] = ((object)GetOrAdd(dataKey, getData, slidingExpiration)).CloneObject();
_logger.Debug($"{dataKey} - reload");
_reloader.Remove(dataKey);
}
});
}
if (_hash[dataKey] != null) return (T) (_hash[dataKey]);
_logger.Debug($"{dataKey} - dummy start");
var data = GetOrAdd(dataKey, getData, slidingExpiration);
_logger.Debug($"{dataKey} - dummy");
return (T)((object)data).CloneObject();
}
}
It's a bit late, however...
Full implementation:
[HttpGet]
public async Task<HttpResponseMessage> GetPageFromUriOrBody(RequestQuery requestQuery)
{
log(nameof(GetPageFromUriOrBody), nameof(requestQuery));
var responseResult = await _requestQueryCache.GetOrCreate(
nameof(GetPageFromUriOrBody)
, requestQuery
, (x) => getPageContent(x).Result);
return Request.CreateResponse(System.Net.HttpStatusCode.Accepted, responseResult);
}
static MemoryCacheWithPolicy<RequestQuery, string> _requestQueryCache = new MemoryCacheWithPolicy<RequestQuery, string>();
Here is getPageContent signature:
async Task<string> getPageContent(RequestQuery requestQuery);
And here is the MemoryCacheWithPolicy implementation:
public class MemoryCacheWithPolicy<TParameter, TResult>
{
static ILogger _nlogger = new AppLogger().Logger;
private MemoryCache _cache = new MemoryCache(new MemoryCacheOptions()
{
// Size limit: a unitless budget, not bytes. Each entry is stored with SetSize(1), so this acts as a maximum entry count.
SizeLimit = 1024
});
/// <summary>
/// Gets or creates a new memory cache record for the main data,
/// along with the parameter data that is associated with the main data.
/// </summary>
/// <param name="key">Main data cache memory key.</param>
/// <param name="param">Parameter model that is associated with the main model (request result).</param>
/// <param name="createCacheData">A delegate to create a new main data to cache.</param>
/// <returns></returns>
public async Task<TResult> GetOrCreate(object key, TParameter param, Func<TParameter, TResult> createCacheData)
{
// this key is used for param cache memory.
var paramKey = key + nameof(param);
if (!_cache.TryGetValue(key, out TResult cacheEntry))
{
// key is not in the cache, create data through the delegate.
cacheEntry = createCacheData(param);
createMemoryCache(key, cacheEntry, paramKey, param);
_nlogger.Warn(" cache is created.");
}
else
{
// Data is already cached; check whether the param model is the same or has changed.
// A missing param entry should not happen, but is treated as "changed" to avoid a null dereference.
if (!_cache.TryGetValue(paramKey, out TParameter cacheParam)
|| !Equals(cacheParam, param))
{
// request param is changed, create data through the delegate.
cacheEntry = createCacheData(param);
createMemoryCache(key, cacheEntry, paramKey, param);
_nlogger.Warn(" cache is re-created (param model has been changed).");
}
else
{
_nlogger.Trace(" cache is used.");
}
}
return await Task.FromResult<TResult>(cacheEntry);
}
MemoryCacheEntryOptions createMemoryCacheEntryOptions(TimeSpan slidingOffset, TimeSpan relativeOffset)
{
// Cache data within [slidingOffset] seconds,
// request new result after [relativeOffset] seconds.
return new MemoryCacheEntryOptions()
// Size: each entry counts as 1 unit against SizeLimit,
// so this is an entry count, not an actual memory size value.
.SetSize(1)
// Priority on removing when reaching size limit (memory pressure)
.SetPriority(CacheItemPriority.High)
// Keep in cache for this amount of time, reset it if accessed.
.SetSlidingExpiration(slidingOffset)
// Remove from cache after this time, regardless of sliding expiration
.SetAbsoluteExpiration(relativeOffset);
//
}
void createMemoryCache(object key, TResult cacheEntry, object paramKey, TParameter param)
{
// Cache data within 2 seconds,
// request new result after 5 seconds.
var cacheEntryOptions = createMemoryCacheEntryOptions(
TimeSpan.FromSeconds(2)
, TimeSpan.FromSeconds(5));
// Save data in cache.
_cache.Set(key, cacheEntry, cacheEntryOptions);
// Save param in cache.
_cache.Set(paramKey, param, cacheEntryOptions);
}
void checkCacheEntry<T>(object key, string name)
{
_cache.TryGetValue(key, out T value);
_nlogger.Fatal("Key: {0}, Name: {1}, Value: {2}", key, name, value);
}
}
_nlogger is just an NLog object used to trace the MemoryCacheWithPolicy behavior.
The cached entry is re-created through the delegate (Func<TParameter, TResult> createCacheData) when the request object (RequestQuery requestQuery) changes, or when the sliding or absolute expiration reaches its limit. Note that everything is async too ;)
I'm building an XNA game and I'm trying to save game/map etc. state completely, and then be able to load and resume from exactly the same state.
My game logic consists of elements that are fairly complex to serialize, such as references, delegates etc. I've done hours of research and decided that it's best to use a DataContractSerializer that preserves the object references. (I also worked around the delegates problem, but that's another topic.) I have no problem serializing and deserializing the state, re-creating the objects, the fields, lists, and even object references correctly and completely. But I've got a problem with cyclic references. Consider this scenario:
class X{
public X another;
}
//from code:
X first = new X();
X second = new X();
first.another = second;
second.another = first;
Trying to serialize X will result in an exception complaining about cyclic references. If I comment out the last line it works fine. Well, I can imagine WHY it is happening, but I have no idea HOW to solve it. I've read somewhere that I can use the DataContract attribute with IsReference set to true, but it didn't change anything for me -- still got the error. (I want to avoid it anyway since the code I'm working on is portable code and may someday run on Xbox too, and portable library for Xbox doesn't support the assembly that DataContract is in.)
Here is the code to serialize:
class DataContractContentWriterBase<T> where T : GameObject
{
internal void Write(Stream output, T objectToWrite, Type[] extraTypes = null)
{
if (extraTypes == null) { extraTypes = new Type[0]; }
DataContractSerializer serializer = new DataContractSerializer(typeof(T), extraTypes, int.MaxValue, false, true, null);
serializer.WriteObject(output, objectToWrite);
}
}
and I'm calling this code from this class:
[ContentTypeWriter]
public class PlatformObjectTemplateWriter : ContentTypeWriter<TWrite>
(... lots of code ...)
DataContractContentWriterBase<TWrite> writer = new DataContractContentWriterBase<TWrite>();
protected override void Write(ContentWriter output, TWrite value)
{
writer.Write(output.BaseStream, value, GetExtraTypes());
}
and for deserialization:
class DataContractContentReaderBase<T> where T: GameObject
{
internal T Read(Stream input, Type[] extraTypes = null)
{
if (extraTypes == null) { extraTypes = new Type[0]; }
DataContractSerializer serializer = new DataContractSerializer(typeof(T), extraTypes, int.MaxValue, false, true, null);
T obj = serializer.ReadObject(input) as T;
//return obj.Clone() as T; // cloning or similar; take a look at this.
return obj;
}
}
and it's being called by:
public class PlatformObjectTemplateReader : ContentTypeReader<TRead>
(lots of code...)
DataContractContentReaderBase<TRead> reader = new DataContractContentReaderBase<TRead>();
protected override TRead Read(ContentReader input, TRead existingInstance)
{
return reader.Read(input.BaseStream, GetExtraTypes());
}
where:
PlatformObjectTemplate was my type to write.
Any suggestions?
SOLUTION: A few minutes ago I realized that I wasn't marking the fields with the DataMember attribute, and before I added the DataContract attribute, the XNA serializer was somehow acting as the "default" serializer. I've now marked all the objects, and things are working perfectly. I have cyclic references in my model with no problem.
If you don't want to use [DataContract(IsReference=true)] then DataContractSerializer won't help you, because this attribute is the thing that does the trick with references.
So, you should either look for alternative serializers, or write some serialization code that transforms your graphs into some conventional representation (like a list of nodes + a list of links between them) and back, and then serialize that simple structure.
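The second option, a list of nodes plus a list of links, could look roughly like the sketch below for the X class from the question. It is only an illustration under stated assumptions: FlatGraph, GraphFlattener, Flatten and Rebuild are hypothetical names, and the walk relies on each node having a single outgoing reference, as X does.

```csharp
using System;
using System.Collections.Generic;

// Same shape as the X class in the question.
public class X
{
    public X another;
}

// Hypothetical flat form: node i points to node Links[i], or -1 for null.
public class FlatGraph
{
    public int NodeCount;
    public List<int> Links = new List<int>();
}

public static class GraphFlattener
{
    public static FlatGraph Flatten(X root)
    {
        // X does not override Equals/GetHashCode, so the dictionary
        // compares keys by reference identity.
        var ids = new Dictionary<X, int>();
        var order = new List<X>();

        // Follow the chain until it ends or loops back to a visited node.
        for (var node = root; node != null && !ids.ContainsKey(node); node = node.another)
        {
            ids[node] = order.Count;
            order.Add(node);
        }

        var flat = new FlatGraph { NodeCount = order.Count };
        foreach (var node in order)
            flat.Links.Add(node.another == null ? -1 : ids[node.another]);
        return flat;
    }

    public static X Rebuild(FlatGraph flat)
    {
        if (flat.NodeCount == 0) return null;

        // First create every node, then wire up the links by index.
        var nodes = new X[flat.NodeCount];
        for (int i = 0; i < nodes.Length; i++) nodes[i] = new X();
        for (int i = 0; i < nodes.Length; i++)
            if (flat.Links[i] >= 0) nodes[i].another = nodes[flat.Links[i]];
        return nodes[0];
    }
}
```

Because FlatGraph holds only integers, it contains no cycles, so in principle any serializer can write it; Rebuild then restores the cyclic references after loading.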
In case you decide to use DataContract(IsReference=true), here's a sample that serializes your graph:
[DataContract(IsReference = true)]
class X{
[DataMember]
public X another;
}
static void Main()
{
//from code:
var first = new X();
var second = new X();
first.another = second;
second.another = first;
byte[] data;
using (var stream = new MemoryStream())
{
var serializer = new DataContractSerializer(typeof(X));
serializer.WriteObject(stream, first);
data = stream.ToArray();
}
var str = Encoding.UTF8.GetString(data);
}
The str will contain the following XML:
<X z:Id="i1" xmlns="http://schemas.datacontract.org/2004/07/GraphXmlSerialization"
xmlns:i="http://www.w3.org/2001/XMLSchema-instance"
xmlns:z="http://schemas.microsoft.com/2003/10/Serialization/">
<another z:Id="i2">
<another z:Ref="i1"/>
</another>
</X>
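To check that the cycle survives, the stream can be read back with the same serializer. This is a minimal sketch, assuming the same attributed X type as in the sample; RoundTripSketch and RoundTrip are hypothetical names introduced only for illustration.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

// Same attributed type as in the sample above.
[DataContract(IsReference = true)]
public class X
{
    [DataMember]
    public X another;
}

// Hypothetical helper for illustration.
public static class RoundTripSketch
{
    public static X RoundTrip(X root)
    {
        var serializer = new DataContractSerializer(typeof(X));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, root);

            // Rewind before reading what was just written.
            stream.Position = 0;
            return (X)serializer.ReadObject(stream);
        }
    }
}
```

After var copy = RoundTripSketch.RoundTrip(first), the expression ReferenceEquals(copy.another.another, copy) should be true, because the z:Ref entry is resolved back to the object that carries the matching z:Id.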