Validate object against a schema before serialization - c#

I want to serialize a C# object as JSON into a stream, but to avoid the serialization if the object is not valid according to a schema. How should I proceed with this task using JSON.NET and Json.NET Schema? From what I see there is no method in the JSON.NET library which allows the validation of a C# object against a JSON schema. It seems somewhat weird that there is no direct method to just validate the C# object without encoding it. Do you have any idea why this method is not available?

It seems this API is not currently available. At a guess, this might be because recursively generating the JSON values to validate involves most of the work of serializing the object. Or it could just be because no one at Newtonsoft ever designed, specified, implemented, tested, documented and shipped that feature.
If you want, you could file an enhancement request requesting this API, probably as a part of the SchemaExtensions class.
In the meantime, if you do need to test-validate a POCO without generating a complete serialization of it (because, e.g., the result would be very large), you could grab NullJsonWriter from "Reference to automatically created objects", wrap it in a JSchemaValidatingWriter, and test-serialize your object as shown in "Validate JSON with JSchemaValidatingWriter". NullJsonWriter doesn't actually write anything, so using it eliminates the performance and memory overhead of generating a complete serialization (either as a string or as a JToken).
First, add the following static method:
public static class JsonExtensions
{
    public static bool TestValidate<T>(T obj, JSchema schema, SchemaValidationEventHandler handler = null, JsonSerializerSettings settings = null)
    {
        using (var writer = new NullJsonWriter())
        using (var validatingWriter = new JSchemaValidatingWriter(writer) { Schema = schema })
        {
            int count = 0;
            if (handler != null)
                validatingWriter.ValidationEventHandler += handler;
            validatingWriter.ValidationEventHandler += (o, a) => count++;
            JsonSerializer.CreateDefault(settings).Serialize(validatingWriter, obj);
            return count == 0;
        }
    }
}
// Used to enable Json.NET to traverse an object hierarchy without actually writing any data.
class NullJsonWriter : JsonWriter
{
    public NullJsonWriter()
        : base()
    {
    }
    public override void Flush()
    {
        // Do nothing.
    }
}
Then use it like:
// Example adapted from
// https://www.newtonsoft.com/jsonschema/help/html/JsonValidatingWriterAndSerializer.htm
// by James Newton-King
string schemaJson = @"{
'description': 'A person',
'type': 'object',
'properties': {
'name': {'type':'string'},
'hobbies': {
'type': 'array',
'maxItems': 3,
'items': {'type':'string'}
}
}
}";
var schema = JSchema.Parse(schemaJson);
var person = new
{
Name = "James",
Hobbies = new [] { ".Net", "Blogging", "Reading", "XBox", "LOLCATS" },
};
var settings = new JsonSerializerSettings { ContractResolver = new CamelCasePropertyNamesContractResolver() };
var isValid = JsonExtensions.TestValidate(person, schema, (o, a) => Console.WriteLine(a.Message), settings);
// Prints Array item count 5 exceeds maximum count of 3. Path 'hobbies'.
Console.WriteLine("isValid = {0}", isValid);
// Prints isValid = False
Watch out for casing, by the way. Json.NET Schema is case sensitive, so you will need to use an appropriate contract resolver when test-validating.
Sample fiddle.

You cannot do that from the JSON string; you need an object and a schema to compare it with first.
public void Validate()
{
    //...
    JsonSchema schema = JsonSchema.Parse("{'pattern':'lol'}");
    JToken stringToken = JToken.FromObject("pie");
    stringToken.Validate(schema);
}

Related

Newtonsoft serialization of object with few not serializable properties

I use the following code to serialize JSON
JsonSerializerSettings settings = new JsonSerializerSettings()
{
Culture = CultureInfo.InvariantCulture,
ReferenceLoopHandling = ReferenceLoopHandling.Ignore,
MaxDepth = 10
};
string serializationResult = JsonConvert.SerializeObject(currentObject, settings);
But I have found that if currentObject contains non-serializable items, it throws an exception (for example, when it holds a reference to a dirty DataContext).
I have no control over currentObject because the actual code is part of a logging utility, so currentObject could be anything the utility's user submits.
I would like to serialize currentObject as far as I can, skipping problematic properties if there are any.
So I modified the code in the following way:
try
{
JsonSerializerSettings settings = new JsonSerializerSettings()
{
Culture = CultureInfo.InvariantCulture,
ReferenceLoopHandling = ReferenceLoopHandling.Ignore,
MaxDepth = 10,
Error = delegate (object sender, ErrorEventArgs args)
{
args.ErrorContext.Handled = true;
}
};
string serializationResult = JsonConvert.SerializeObject(currentObject, settings);
return serializationResult;
}
catch ( Exception err)
{
return string.Empty;
}
I tested it again, passing a currentObject in which one property is a dirty DataContext, and now the serialization process never ends. It seems to keep looping in the error handler.
After a long time (a few minutes) I finally get a serialization error:
System.NotSupportedException: 'Specified method is not supported.'
at System.Web.HttpResponseStream.get_Length()
at GetLength(Object )
at Newtonsoft.Json.Serialization.DynamicValueProvider.GetValue(Object target)
It is not clear to me what I am missing, or how to handle a situation in which a property of the object to serialize may not be serializable. Is my logic correct, and the problem is that the DataContext is simply too large to serialize (in which case I would have to filter out such cases)?
Or did I implement the error handling incorrectly?
Even if it is not clear exactly where the issue with DataContext lies, it seems reasonable to me that trying to serialize such a complex object is not a good approach (unmanaged resources may be involved under the hood).
As suggested by @Eidat, I tried to create a test method that checks whether the serialization of an entry/property completes within a given amount of time or fails.
If the serialization fails, the entity/property is skipped.
This approach may be rather time consuming, but if performance is not a top priority, or the volume of data is low, it may be viable.
To do this in a reusable way, I wrote a simple extension method with which I test each "critical" object before serialization (by critical I mean objects whose origin I don't know, and which for that reason may not be serializable).
/// <summary>
/// Check if an object is serializable through Newtonsoft serialization in a given amount of time
/// </summary>
/// <param name="this">The object to test</param>
/// <param name="settings">The settings to use for serialization (optional)</param>
/// <param name="timeoutMs">The timeout in milliseconds in which the test has to complete (optional : default 97ms)</param>
/// <returns>true if serialization completed in time and produced output; otherwise false</returns>
public static bool IsNSSerializable(this object @this, JsonSerializerSettings settings = null, int timeoutMs = 97)
{
    try
    {
        TimeSpan testTimeout = TimeSpan.FromMilliseconds(timeoutMs);
        var currentSettings = settings ?? new JsonSerializerSettings()
        {
            Culture = CultureInfo.InvariantCulture,
            ReferenceLoopHandling = ReferenceLoopHandling.Ignore,
            MaxDepth = 10
        };
        using (var tokenSource = new CancellationTokenSource(testTimeout))
        {
            string result = string.Empty;
            var token = tokenSource.Token;
            Thread workerThread = null;
            Task testActivity = Task.Run(() =>
            {
                try
                {
                    workerThread = new Thread(() =>
                    {
                        try
                        {
                            // Use currentSettings so the defaults apply when no settings are passed.
                            result = JsonConvert.SerializeObject(@this, currentSettings);
                        }
                        catch (Exception)
                        { }
                    });
                    // The thread must be started, otherwise nothing is ever serialized.
                    workerThread.Start();
                    workerThread.Join(testTimeout);
                }
                catch (Exception)
                { }
            }, token);
            testActivity.Wait(testTimeout);
            // Abort a worker that is still running after the timeout.
            // Thread.Abort() is only supported on .NET Framework; on .NET Core and .NET 5+
            // it throws PlatformNotSupportedException, which the outer catch turns into false.
            if (workerThread != null && workerThread.IsAlive)
            {
                workerThread.Abort();
                return false;
            }
            return !string.IsNullOrEmpty(result);
        }
    }
    catch (Exception)
    {
        return false;
    }
}

Set RootAttribute for System.Text.JsonSerializer?

Is there something like XmlRootAttribute that can be used with System.Text.JsonSerializer?
I need to be able to download data from this vendor using both XML and JSON. See sample data here:
{
"categories": [
{
"id": 125,
"name": "Trade Balance",
"parent_id": 13
}
]
}
Note the data elements are wrapped by an array named categories. When downloading using XML I can set the root element to categories and the correct object is returned (see XMLClient below). When using JSONClient, however, I cannot (or do not know how to) set the root element. The best workaround I could find is to use JsonDocument, which creates a string allocation. I could also create some wrapper classes for the JSON implementation, but that is a lot of work that involves not only creating additional DTOs but also overriding many methods on BaseClient. I also don't want to write converters - it pretty much defeats the purpose of using well-known serialization protocols. (Different responses will have a different root wrapper property name. I know the name at runtime, but not necessarily at compile time.)
public class JSONClient : BaseClient
{
protected override async Task<T> Parse<T>(string uri, string root)
{
uri = uri + (uri.Contains("?") ? "&" : "?") + "file_type=json";
var document = JsonDocument.Parse((await Download(uri)), new JsonDocumentOptions { AllowTrailingCommas = true });
string json = document.RootElement.GetProperty(root).GetRawText(); // string allocation
return JsonSerializer.Deserialize<T>(json);
}
}
public class XMLClient : BaseClient
{
protected override async Task<T> Parse<T>(string uri, string root)
{
return (T)new XmlSerializer(typeof(T), new XmlRootAttribute(root)).Deserialize(await Download(uri)); // want to do this - using JsonSerializer
}
}
public abstract class BaseClient
{
protected virtual async Task<Stream> Download(string uri)
{
uri = uri + (uri.Contains("?") ? "&" : "?") + "api_key=" + "xxxxx";
var response = await new HttpClient() { BaseAddress = new Uri(uri) }.GetAsync(uri);
return await response.Content.ReadAsStreamAsync();
}
protected abstract Task<T> Parse<T>(string uri, string root) where T : class, new();
public async Task<Category> GetCategory(string categoryID)
{
string uri = "https://api.stlouisfed.org/fred/category?category_id=" + categoryID;
return (await Parse<List<Category>>(uri, "categories"))?.FirstOrDefault();
}
}
JSON has no concept of a root element (or element names in general), so there's no equivalent to XmlRootAttribute in System.Text.Json (or Json.NET for that matter). Rather, it has the following two types of container, along with several atomic value types:
Objects, which are unordered sets of name/value pairs. An object begins with { (left brace) and ends with } (right brace).
Arrays, which are ordered collections of values. An array begins with [ (left bracket) and ends with ] (right bracket).
As System.Text.Json.JsonSerializer is designed to map C# objects to JSON objects and C# collections to JSON arrays in a 1-1 manner, there's no built-in attribute or declarative option to tell the serializer to automatically descend the JSON hierarchy until a property with a specific name is encountered, then deserialize its value to a required type.
If you need to access some JSON data that is consistently embedded in some wrapper object containing a single property whose name is known at runtime but not compile time, i.e.:
{
"someRuntimeKnownWrapperPropertyName" : // The value you actually want
}
Then the easiest way to do that would be to deserialize to a Dictionary<string, T> where the type T corresponds to the type of the expected value, e.g.:
protected override async Task<T> Parse<T>(string uri, string root)
{
    uri = uri + (uri.Contains("?") ? "&" : "?") + "file_type=json";
    using var stream = await Download(uri); // Dispose here or not? What about disposing of the containing HttpResponseMessage?
    var options = new JsonSerializerOptions
    {
        AllowTrailingCommas = true,
        // PropertyNameCaseInsensitive = true, Uncomment if you need case insensitivity.
    };
    var dictionary = await JsonSerializer.DeserializeAsync<Dictionary<string, T>>(stream, options);
    // Throw an exception if the dictionary does not have exactly one entry, with the required name
    var pair = dictionary?.Single();
    if (pair == null || !pair.Value.Key.Equals(root, StringComparison.Ordinal)) //StringComparison.OrdinalIgnoreCase if required
        throw new JsonException();
    // And now return the value
    return pair.Value.Value;
}
Notes:
By deserializing to a typed dictionary rather than a JsonDocument you avoid the intermediate allocations required for the JsonDocument itself as well as the string returned by GetRawText(). The only memory overhead is for the dictionary itself.
Note also that JsonDocument is disposable. Failure to dispose it will result in the memory not being returned to the pool, which will increase GC impact across various parts of the framework.
Alternatively, you could create a custom JsonConverter to read through the incoming JSON until a property of the required name is encountered, then deserialize the value to the expected type.
For some examples, see Is there a simple way to manually serialize/deserialize child objects in a custom converter in System.Text.Json? or https://stackoverflow.com/a/62155881/3744182.
Rather than allocating an HttpClient for every call, you should allocate only one and reuse it. See Do HttpClient and HttpClientHandler have to be disposed between requests?.
You are not disposing of the HttpResponseMessage or response content stream. You may want to refactor your code to do that; see When or if to Dispose HttpResponseMessage when calling ReadAsStreamAsync?. You may also want to check response.IsSuccessStatusCode before deserializing.
In .NET 5, HttpClientJsonExtensions can make deserializing JSON returned by HttpClient simpler.
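The notes about reusing HttpClient, disposing the response, and checking the status code can be combined with the dictionary approach in one sketch. This is only an illustration under stated assumptions: the class name WrappedJson and its two methods are hypothetical and not part of the question's BaseClient, and the single static HttpClient is just one way to share a client.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical helper, not part of the question's code.
public static class WrappedJson
{
    // One HttpClient shared by all requests instead of one per call.
    static readonly HttpClient Client = new HttpClient();

    // Unwraps { "<root>": value } from a stream without building a JsonDocument.
    public static async Task<T> ReadRootAsync<T>(Stream stream, string root)
    {
        var options = new JsonSerializerOptions { AllowTrailingCommas = true };
        var dictionary = await JsonSerializer.DeserializeAsync<Dictionary<string, T>>(stream, options);
        // The default dictionary comparer is ordinal, so the name must match exactly.
        if (dictionary == null || dictionary.Count != 1 || !dictionary.TryGetValue(root, out var value))
            throw new JsonException($"Expected a single wrapper property named '{root}'.");
        return value;
    }

    public static async Task<T> DownloadAsync<T>(string uri, string root)
    {
        // Disposing the response also releases its content stream.
        using (var response = await Client.GetAsync(uri, HttpCompletionOption.ResponseHeadersRead))
        {
            // Throws HttpRequestException for non-success status codes.
            response.EnsureSuccessStatusCode();
            using (var stream = await response.Content.ReadAsStreamAsync())
            {
                return await ReadRootAsync<T>(stream, root);
            }
        }
    }
}
```

ReadRootAsync is kept separate from the download so that it can be exercised against any stream, for example a MemoryStream holding the sample categories JSON from the question.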

Create invalid Json with Newtonsoft - Allow invalid objects?

I'm deliberately trying to create invalid JSON with Newtonsoft Json, in order to place an ESI include tag, which will fetch two more json nodes.
This is my JsonConverter's WriteJson method:
public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
{
mApiResponseClass objectFromApi = (mApiResponseClass)value;
foreach (var obj in objectFromApi.GetType().GetProperties())
{
if (obj.Name == "EsiObj")
{
writer.WriteRawValue(objectFromApi.EsiObj);
}
else
{
writer.WritePropertyName(obj.Name);
serializer.Serialize(writer, obj.GetValue(value, null));
}
}
}
The EsiObj in mApiResponseClass is just a string, but it needs to be written into the JSON response to be interpreted without any property name, so that the ESI can work.
This of course results in an exception with the Json Writer, with value:
Newtonsoft.Json.JsonWriterException: 'Token Undefined in state Object
would result in an invalid JSON object. Path ''.'
Is there any way around this?
An ideal output from this would be JSON formatted, technically not valid, and would look like this:
{
value:7,
string1:"woohoo",
<esi:include src="/something" />
Song:["I am a small API","all i do is run","but from who?","nobody knows"]
}
Edit:
Using ESI allows us to have varying cache lengths of a single response - i.e. we can place data that can be cached for a very long time in some parts of the JSON, and only fetch updated parts, such as those that rely on client-specific data.
ESI is not HTML specific. (As some state below) It's being run via Varnish, which supports these tags.
Unfortunately, it's required that we do only put out 1 file as a response, and require no further request from the Client.
We cannot alter our response either, so I can't just add a JSON node specifically to contain the other nodes.
Edit 2: The "more json nodes" part is solved by ESI making a further request to our backend for user/client specific data, i.e. to another endpoint. The expected result is that we then merge the original JSON document and the later requested one together seamlessly. (This way, the original document can be old, and client-specific can be new)
Edit 3:
The endpoint /something would output JSON-like fragments like:
teapots:[ {Id: 1, WaterLevel: 100, Temperature: 74, ShortAndStout: true}, {Id: 2, WaterLevel: 47, Temperature: 32, ShortAndStout: true} ],
For a total response of:
{
value:7,
string1:"woohoo",
teapots:[ {Id: 1, WaterLevel: 100, Temperature: 74, ShortAndStout: true}, {Id: 2, WaterLevel: 47, Temperature: 32, ShortAndStout: true} ],
Song:["I am a small API","all i do is run","but from who?","nobody knows"]
}
Your basic problem is that a JsonWriter is a state machine, tracking the current JSON state and validating transitions from state to state, thereby ensuring that badly structured JSON is not written. This is tripping you up in two separate ways.
Firstly, your WriteJson() method is not calling WriteStartObject() and WriteEndObject(). These are the methods that write the { and } around a JSON object. Since your "ideal output" shows these braces, you should add calls to these methods at the beginning and end of your WriteJson().
Secondly, you are calling WriteRawValue() at a point where well-formed JSON would not allow a value to occur, specifically where a property name is expected instead. It is expected that this would cause an exception, since the documentation states:
Writes raw JSON where a value is expected and updates the writer's state.
What you can instead use is WriteRaw() which is documented as follows:
Writes raw JSON without changing the writer's state.
However, WriteRaw() won't do you any favors. Specifically, you will need to take care of writing any delimiters and indentation yourself.
The fix would be to modify your converter to look something like:
public class EsiObjConverter<T> : JsonConverter
{
const string EsiObjName = "EsiObj";
public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
{
var contract = serializer.ContractResolver.ResolveContract(value.GetType()) as JsonObjectContract;
if (contract == null)
throw new JsonSerializationException(string.Format("Non-object type {0}", value));
writer.WriteStartObject();
int propertyCount = 0;
bool lastWasEsiProperty = false;
foreach (var property in contract.Properties.Where(p => p.Readable && !p.Ignored))
{
if (property.UnderlyingName == EsiObjName && property.PropertyType == typeof(string))
{
var esiValue = (string)property.ValueProvider.GetValue(value);
if (!string.IsNullOrEmpty(esiValue))
{
if (propertyCount > 0)
{
WriteValueDelimiter(writer);
}
writer.WriteWhitespace("\n");
writer.WriteRaw(esiValue);
// If it makes replacement easier, you could force the ESI string to be on its own line by calling
// writer.WriteWhitespace("\n");
propertyCount++;
lastWasEsiProperty = true;
}
}
else
{
var propertyValue = property.ValueProvider.GetValue(value);
// Here you might check NullValueHandling, ShouldSerialize(), ...
if (propertyCount == 1 && lastWasEsiProperty)
{
WriteValueDelimiter(writer);
}
writer.WritePropertyName(property.PropertyName);
serializer.Serialize(writer, propertyValue);
propertyCount++;
lastWasEsiProperty = false;
}
}
writer.WriteEndObject();
}
static void WriteValueDelimiter(JsonWriter writer)
{
var args = new object[0];
// protected virtual void WriteValueDelimiter()
// https://www.newtonsoft.com/json/help/html/M_Newtonsoft_Json_JsonWriter_WriteValueDelimiter.htm
// Since this is overridable by client code it is unlikely to be removed.
writer.GetType().GetMethod("WriteValueDelimiter", BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic).Invoke(writer, args);
}
public override bool CanConvert(Type objectType)
{
return typeof(T).IsAssignableFrom(objectType);
}
public override bool CanRead { get { return false; } }
public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
{
throw new NotImplementedException();
}
}
And the serialized output would be:
{
"value": 7,
"string1": "woohoo",
<esi:include src="/something" />,
"Song": [
"I am a small API",
"all i do is run",
"but from who?",
"nobody knows"
]
}
Now, in your question, your desired JSON output shows JSON property names that are not properly quoted. If you really need this and it is not just a typo in the question, you can accomplish this by setting JsonTextWriter.QuoteName to false as shown in this answer to Json.Net - Serialize property name without quotes by Christophe Geers:
var settings = new JsonSerializerSettings
{
Converters = { new EsiObjConverter<mApiResponseClass>() },
};
var stringWriter = new StringWriter();
using (var writer = new JsonTextWriter(stringWriter))
{
writer.QuoteName = false;
writer.Formatting = Formatting.Indented;
writer.Indentation = 0;
JsonSerializer.CreateDefault(settings).Serialize(writer, obj);
}
Which results in:
{
value: 7,
string1: "woohoo",
<esi:include src="/something" />,
Song: [
"I am a small API",
"all i do is run",
"but from who?",
"nobody knows"
]
}
This is almost what is shown in your question, but not quite. It includes a comma delimiter between the ESI string and the next property, but in your question there is no delimiter:
<esi:include src="/something" /> Song: [ ... ]
Getting rid of the delimiter turns out to be problematic to implement because JsonTextWriter.WritePropertyName() automatically writes a delimiter when not at the beginning of an object. I think, however, that this should be acceptable. ESI itself will not know whether it is replacing the first, last or middle property of an object, so it seems best to not include the delimiter in the replacement string at all.
Working sample .Net fiddle here.

C# - OutOfMemoryException saving a List on a JSON file

I'm trying to save the streaming data of a pressure map.
Basically I have a pressure matrix defined as:
double[,] pressureMatrix = new double[e.Data.GetLength(0), e.Data.GetLength(1)];
Basically, I'm getting one of these pressure matrices every 10 milliseconds and I want to save all the information in a JSON file to be able to reproduce it later.
What I do is, first of all, write what I call the header with all the settings used to do the recording like this:
recordedData.softwareVersion = Assembly.GetExecutingAssembly().GetName().Version.Major.ToString() + "." + Assembly.GetExecutingAssembly().GetName().Version.Minor.ToString();
recordedData.calibrationConfiguration = calibrationConfiguration;
recordedData.representationConfiguration = representationSettings;
recordedData.pressureData = new List<PressureMap>();
var json = JsonConvert.SerializeObject(csvRecordedData, Formatting.None);
File.WriteAllText(this.filePath, json);
Then, every time I get a new pressure map I create a new Thread to add the new PressureMatrix and re-write the file:
var newPressureMatrix = new PressureMap(datos, DateTime.Now);
recordedData.pressureData.Add(newPressureMatrix);
var json = JsonConvert.SerializeObject(recordedData, Formatting.None);
File.WriteAllText(this.filePath, json);
After about 20-30 min I get an OutOfMemoryException because the system cannot hold the recordedData variable: the List<PressureMap> in it is too big.
How can I handle this to save the data? I would like to save the information of 24-48 hours.
Your basic problem is that you are holding all of your pressure map samples in memory rather than writing each one individually and then allowing it to be garbage collected. What's worse, you are doing this in two different places:
1. You serialize your entire list of samples to a JSON string json before writing the string to a file.
   Instead, as explained in Performance Tips: Optimize Memory Usage, you should serialize and deserialize directly to and from your file in such situations. For instructions on how to do this, see this answer to Can Json.NET serialize / deserialize to / from a stream? and also Serialize JSON to a file.
2. The recordedData.pressureData = new List<PressureMap>(); accumulates all pressure map samples, then writes all of them every time a sample is made.
   A better solution would be to write each sample once and forget it, but the requirement for each sample to be nested inside some container objects in the JSON makes it nonobvious how to do that.
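For the first of the two problems above, serializing straight to the file can be sketched as follows. This is a minimal sketch assuming the Newtonsoft.Json package is referenced; the class name JsonFileHelper and the method SerializeToFile are hypothetical names used only for illustration.

```csharp
using System.IO;
using Newtonsoft.Json;

// Hypothetical helper, named here only for illustration.
public static class JsonFileHelper
{
    // Streams the object graph to disk instead of building one large string first.
    public static void SerializeToFile<T>(T value, string path, JsonSerializerSettings settings = null)
    {
        using (var stream = new FileStream(path, FileMode.Create))
        using (var textWriter = new StreamWriter(stream))
        using (var jsonWriter = new JsonTextWriter(textWriter))
        {
            // CreateDefault accepts null settings and falls back to the defaults.
            JsonSerializer.CreateDefault(settings).Serialize(jsonWriter, value);
        }
    }
}
```

This removes the intermediate JSON string, but on its own it still rewrites every accumulated sample on each call, so the second problem remains.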
So, how to attack issue #2?
First, let's modify your data model as follows, partitioning the header data into a separate class:
public class PressureMap
{
public double[,] PressureMatrix { get; set; }
}
public class CalibrationConfiguration
{
// Data model not included in question
}
public class RepresentationConfiguration
{
// Data model not included in question
}
public class RecordedDataHeader
{
public string SoftwareVersion { get; set; }
public CalibrationConfiguration CalibrationConfiguration { get; set; }
public RepresentationConfiguration RepresentationConfiguration { get; set; }
}
public class RecordedData
{
// Ensure the header is serialized first.
[JsonProperty(Order = 1)]
public RecordedDataHeader RecordedDataHeader { get; set; }
// Ensure the pressure data is serialized last.
[JsonProperty(Order = 2)]
public IEnumerable<PressureMap> PressureData { get; set; }
}
Option #1 is a version of the producer-consumer pattern. It involves spinning up two threads: one to generate PressureData samples, and one to serialize the RecordedData. The first thread will generate samples and add them to a BlockingCollection<PressureMap> collection that is passed to the second thread. The second thread will then serialize BlockingCollection<PressureMap>.GetConsumingEnumerable() as the value of RecordedData.PressureData.
The following code gives a skeleton for how to do this:
var sampleCount = 400; // Or whatever stopping criterion you prefer
var sampleInterval = 10; // in ms
using (var pressureData = new BlockingCollection<PressureMap>())
{
    // Adapted from
    // https://learn.microsoft.com/en-us/dotnet/standard/collections/thread-safe/blockingcollection-overview
    // https://learn.microsoft.com/en-us/dotnet/api/system.collections.concurrent.blockingcollection-1?view=netframework-4.7.2
    // Spin up a Task to sample the pressure maps
    using (Task t1 = Task.Factory.StartNew(() =>
    {
        for (int i = 0; i < sampleCount; i++)
        {
            var data = GetPressureMap(i);
            Console.WriteLine("Generated sample {0}", i);
            pressureData.Add(data);
            System.Threading.Thread.Sleep(sampleInterval);
        }
        pressureData.CompleteAdding();
    }))
    {
        // Spin up a Task to consume the BlockingCollection
        using (Task t2 = Task.Factory.StartNew(() =>
        {
            var recordedDataHeader = new RecordedDataHeader
            {
                SoftwareVersion = softwareVersion,
                CalibrationConfiguration = calibrationConfiguration,
                RepresentationConfiguration = representationConfiguration,
            };
            var settings = new JsonSerializerSettings
            {
                ContractResolver = new CamelCasePropertyNamesContractResolver(),
            };
            using (var stream = new FileStream(this.filePath, FileMode.Create))
            using (var textWriter = new StreamWriter(stream))
            using (var jsonWriter = new JsonTextWriter(textWriter))
            {
                int j = 0;
                var query = pressureData
                    .GetConsumingEnumerable()
                    .Select(p =>
                    {
                        // Flush the writer periodically in case the process terminates abnormally
                        jsonWriter.Flush();
                        Console.WriteLine("Serializing item {0}", j++);
                        return p;
                    });
                var recordedData = new RecordedData
                {
                    RecordedDataHeader = recordedDataHeader,
                    // Since PressureData is declared as IEnumerable<PressureMap>, evaluation will be lazy.
                    PressureData = query,
                };
                Console.WriteLine("Beginning serialization of {0} to {1}:", recordedData, this.filePath);
                // Serialize through jsonWriter so the periodic Flush() above acts on the writer actually in use.
                JsonSerializer.CreateDefault(settings).Serialize(jsonWriter, recordedData);
                Console.WriteLine("Finished serialization of {0} to {1}.", recordedData, this.filePath);
            }
        }))
        {
            Task.WaitAll(t1, t2);
        }
    }
}
Notes:
This solution uses the fact that, when serializing an IEnumerable<T>, Json.NET will not materialize the enumerable as a list. Instead it will take full advantage of lazy evaluation and simply enumerate through it, writing then forgetting each individual item encountered.
The first thread samples PressureData and adds them to the blocking collection.
The second thread wraps the blocking collection in an IEnumerable<PressureMap> then serializes that as RecordedData.PressureData.
During serialization, the serializer will enumerate through the IEnumerable<PressureMap> enumerable, streaming each item to the JSON file then proceeding to the next -- effectively blocking until one becomes available.
You will need to do some experimentation to make sure that the serialization thread can "keep up" with the sampling thread, possibly by setting a BoundedCapacity during construction. If not, you may need to adopt a different strategy.
PressureMap GetPressureMap(int count) should be some method of yours (not shown in the question) that returns the current pressure map sample.
In this technique the JSON file remains open for the duration of the sampling session. If sampling terminates abnormally the file may be truncated. I make some attempt to ameliorate the problem by flushing the writer periodically.
While data serialization will no longer require unbounded amounts of memory, deserializing a RecordedData later will deserialize the PressureData array into a concrete List<PressureMap>. This may possibly cause memory issues during downstream processing.
Demo fiddle #1 here.
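The BoundedCapacity idea mentioned in the notes can be sketched in isolation. This is a minimal illustration under stated assumptions: plain integers stand in for PressureMap samples, and the class BoundedQueueDemo and its Run method are hypothetical names, not part of the answer's code.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical demo class; integers stand in for PressureMap samples.
public static class BoundedQueueDemo
{
    // Pushes `count` integers through a queue that holds at most `capacity` items
    // and returns them in the order the consumer received them.
    public static List<int> Run(int count, int capacity)
    {
        var consumed = new List<int>();
        using (var queue = new BlockingCollection<int>(capacity))
        {
            var producer = Task.Run(() =>
            {
                for (int i = 0; i < count; i++)
                    queue.Add(i);       // Blocks while the queue is full.
                queue.CompleteAdding(); // Lets the consumer's loop end.
            });
            var consumer = Task.Run(() =>
            {
                // Blocks until an item is available; ends after CompleteAdding().
                foreach (var item in queue.GetConsumingEnumerable())
                    consumed.Add(item);
            });
            Task.WaitAll(producer, consumer);
        }
        return consumed;
    }
}
```

With a bound in place, a slow consumer stalls the producer instead of letting the queue grow without limit. For a fixed 10 ms sampling interval that stall may itself be unacceptable, in which case TryAdd with a timeout (and a policy for dropped samples) might be a better fit.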
Option #2 would be to switch from a JSON file to a Newline Delimited JSON file. Such a file consists of sequences of JSON objects separated by newline characters. In your case, you would make the first object contain the RecordedDataHeader information, and the subsequent objects be of type PressureMap:
var sampleCount = 100; // Or whatever
var sampleInterval = 10;
var recordedDataHeader = new RecordedDataHeader
{
SoftwareVersion = softwareVersion,
CalibrationConfiguration = calibrationConfiguration,
RepresentationConfiguration = representationConfiguration,
};
var settings = new JsonSerializerSettings
{
ContractResolver = new CamelCasePropertyNamesContractResolver(),
};
// Write the header
Console.WriteLine("Beginning serialization of sample data to {0}.", this.filePath);
using (var stream = new FileStream(this.filePath, FileMode.Create))
{
JsonExtensions.ToNewlineDelimitedJson(stream, new[] { recordedDataHeader });
}
// Write each sample incrementally
for (int i = 0; i < sampleCount; i++)
{
Thread.Sleep(sampleInterval);
Console.WriteLine("Performing sample {0} of {1}", i, sampleCount);
var map = GetPressureMap(i);
using (var stream = new FileStream(this.filePath, FileMode.Append))
{
JsonExtensions.ToNewlineDelimitedJson(stream, new[] { map });
}
}
Console.WriteLine("Finished serialization of sample data to {0}.", this.filePath);
Using the extension methods:
public static partial class JsonExtensions
{
// Adapted from the answer to
// https://stackoverflow.com/questions/44787652/serialize-as-ndjson-using-json-net
// by dbc https://stackoverflow.com/users/3744182/dbc
public static void ToNewlineDelimitedJson<T>(Stream stream, IEnumerable<T> items)
{
// Let caller dispose the underlying stream
using (var textWriter = new StreamWriter(stream, new UTF8Encoding(false, true), 1024, true))
{
ToNewlineDelimitedJson(textWriter, items);
}
}
public static void ToNewlineDelimitedJson<T>(TextWriter textWriter, IEnumerable<T> items)
{
var serializer = JsonSerializer.CreateDefault();
foreach (var item in items)
{
// Formatting.None is the default; I set it here for clarity.
using (var writer = new JsonTextWriter(textWriter) { Formatting = Formatting.None, CloseOutput = false })
{
serializer.Serialize(writer, item);
}
// http://specs.okfnlabs.org/ndjson/
// Each JSON text MUST conform to the [RFC7159] standard and MUST be written to the stream followed by the newline character \n (0x0A).
// The newline character MAY be preceded by a carriage return \r (0x0D). The JSON texts MUST NOT contain newlines or carriage returns.
textWriter.Write("\n");
}
}
// Adapted from the answer to
// https://stackoverflow.com/questions/29729063/line-delimited-json-serializing-and-de-serializing
// by Yuval Itzchakov https://stackoverflow.com/users/1870803/yuval-itzchakov
public static IEnumerable<TBase> FromNewlineDelimitedJson<TBase, THeader, TRow>(TextReader reader)
where THeader : TBase
where TRow : TBase
{
bool first = true;
using (var jsonReader = new JsonTextReader(reader) { CloseInput = false, SupportMultipleContent = true })
{
var serializer = JsonSerializer.CreateDefault();
while (jsonReader.Read())
{
if (jsonReader.TokenType == JsonToken.Comment)
continue;
if (first)
{
yield return serializer.Deserialize<THeader>(jsonReader);
first = false;
}
else
{
yield return serializer.Deserialize<TRow>(jsonReader);
}
}
}
}
}
Later, you can process the newline delimited JSON file as follows:
using (var stream = File.OpenRead(filePath))
using (var textReader = new StreamReader(stream))
{
foreach (var obj in JsonExtensions.FromNewlineDelimitedJson<object, RecordedDataHeader, PressureMap>(textReader))
{
if (obj is RecordedDataHeader)
{
var header = (RecordedDataHeader)obj;
// Process the header
Console.WriteLine(JsonConvert.SerializeObject(header));
}
else
{
var row = (PressureMap)obj;
// Process the row.
Console.WriteLine(JsonConvert.SerializeObject(row));
}
}
}
Notes:
This approach looks simpler because the samples are added incrementally to the end of the file, rather than inserted inside some overall JSON container.
With this approach both serialization and downstream processing can be done with bounded memory use.
The sample file does not remain open for the duration of sampling, so is less likely to be truncated.
Downstream applications may not have built-in tools for processing newline delimited JSON.
This strategy may integrate more simply with your current threading code.
Demo fiddle #2 here.

Serializing cyclic object references using DataContractSerializer not working

I'm building an XNA game and I'm trying to save game/map etc. state completely, and then be able to load and resume from exactly the same state.
My game logic consists of fairly complex elements (for serializing) such as references, delegates etc. I've done hours of research and decided that it's best to use a DataContractSerializer that preserves object references. (I also found a workaround for delegates, but that's another topic.) I have no problem serializing and deserializing the state, re-creating the objects, the fields, lists, and even object references correctly and completely. But I've got a problem with cyclic references. Consider this scenario:
class X{
public X another;
}
//from code:
X first = new X();
X second = new X();
first.another = second;
second.another = first;
Trying to serialize X will result in an exception complaining about cyclic references. If I comment out the last line it works fine. Well, I can imagine WHY it is happening, but I have no idea HOW to solve it. I've read somewhere that I can use the DataContract attribute with IsReference set to true, but it didn't change anything for me -- still got the error. (I want to avoid it anyway since the code I'm working on is portable code and may someday run on Xbox too, and portable library for Xbox doesn't support the assembly that DataContract is in.)
Here is the code to serialize:
class DataContractContentWriterBase<T> where T : GameObject
{
internal void Write(Stream output, T objectToWrite, Type[] extraTypes = null)
{
if (extraTypes == null) { extraTypes = new Type[0]; }
DataContractSerializer serializer = new DataContractSerializer(typeof(T), extraTypes, int.MaxValue, false, true, null);
serializer.WriteObject(output, objectToWrite);
}
}
and I'm calling this code from this class:
[ContentTypeWriter]
public class PlatformObjectTemplateWriter : ContentTypeWriter<TWrite>
(... lots of code ...)
DataContractContentWriterBase<TWrite> writer = new DataContractContentWriterBase<TWrite>();
protected override void Write(ContentWriter output, TWrite value)
{
writer.Write(output.BaseStream, value, GetExtraTypes());
}
and for deserialization:
class DataContractContentReaderBase<T> where T: GameObject
{
internal T Read(Stream input, Type[] extraTypes = null)
{
if (extraTypes == null) { extraTypes = new Type[0]; }
DataContractSerializer serializer = new DataContractSerializer(typeof(T), extraTypes, int.MaxValue, false, true, null);
T obj = serializer.ReadObject(input) as T;
//return obj.Clone() as T; // clone or something.. take a look at this later.
return obj;
}
}
and it's being called by:
public class PlatformObjectTemplateReader : ContentTypeReader<TRead>
(lots of code...)
DataContractContentReaderBase<TRead> reader = new DataContractContentReaderBase<TRead>();
protected override TRead Read(ContentReader input, TRead existingInstance)
{
return reader.Read(input.BaseStream, GetExtraTypes());
}
where PlatformObjectTemplate is the type being written.
Any suggestions?
SOLUTION: Just a few minutes ago, I realized that I wasn't marking the fields with the DataMember attribute, and before I added the DataContract attribute, the XNA serializer was somehow acting as the "default" serializer. I've now marked all the objects, and things are working perfectly. I now have cyclic references in my model with no problems.
If you don't want to use [DataContract(IsReference=true)] then DataContractSerializer won't help you, because this attribute is the thing that does the trick with references.
So, you should either look for alternative serializers, or write some serialization code that transforms your graphs into some conventional representation (like a list of nodes + a list of links between them) and back, and then serialize that simple structure.
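For the second option, here is a minimal sketch of the "list of nodes + list of links" idea, assuming the single-link X class from the question. FlatGraph, GraphFlattener, Flatten and Rebuild are hypothetical names made up for illustration, and a real game model with several reference fields per object would need one link list per field (or a list of link records):

```csharp
using System;
using System.Collections.Generic;

// Mirrors the X class from the question (no serialization attributes needed).
public class X
{
    public X another;
}

// Hypothetical flat form: node i points at node Links[i], or -1 for null.
// This contains no object references, so any serializer can handle it.
public class FlatGraph
{
    public int NodeCount;
    public List<int> Links = new List<int>();
}

public static class GraphFlattener
{
    public static FlatGraph Flatten(X root)
    {
        // X does not override Equals, so the dictionary compares by reference.
        var ids = new Dictionary<X, int>();
        var order = new List<X>();

        // Follow the chain until it ends or loops back to a visited node.
        for (var current = root; current != null && !ids.ContainsKey(current); current = current.another)
        {
            ids[current] = order.Count;
            order.Add(current);
        }

        var flat = new FlatGraph { NodeCount = order.Count };
        foreach (var node in order)
            flat.Links.Add(node.another == null ? -1 : ids[node.another]);
        return flat;
    }

    public static X Rebuild(FlatGraph flat)
    {
        if (flat.NodeCount == 0)
            return null;

        // First create all nodes, then wire up the links by index.
        var nodes = new X[flat.NodeCount];
        for (int i = 0; i < nodes.Length; i++)
            nodes[i] = new X();
        for (int i = 0; i < nodes.Length; i++)
            if (flat.Links[i] >= 0)
                nodes[i].another = nodes[flat.Links[i]];
        return nodes[0];
    }
}

public static class Program
{
    public static void Main()
    {
        var first = new X();
        var second = new X();
        first.another = second;
        second.another = first;

        var flat = GraphFlattener.Flatten(first);   // 2 nodes, links [1, 0]
        var copy = GraphFlattener.Rebuild(flat);
        Console.WriteLine(ReferenceEquals(copy.another.another, copy)); // True
    }
}
```

The FlatGraph instance is what you would hand to the serializer of your choice, since it is just an int and a list of ints.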
In case you decide to use DataContract(IsReference=true), here's a sample that serializes your graph:
[DataContract(IsReference = true)]
class X{
[DataMember]
public X another;
}
static void Main()
{
//from code:
var first = new X();
var second = new X();
first.another = second;
second.another = first;
byte[] data;
using (var stream = new MemoryStream())
{
var serializer = new DataContractSerializer(typeof(X));
serializer.WriteObject(stream, first);
data = stream.ToArray();
}
var str = Encoding.UTF8.GetString(data);
}
The str will contain the following XML:
<X z:Id="i1" xmlns="http://schemas.datacontract.org/2004/07/GraphXmlSerialization"
xmlns:i="http://www.w3.org/2001/XMLSchema-instance"
xmlns:z="http://schemas.microsoft.com/2003/10/Serialization/">
<another z:Id="i2">
<another z:Ref="i1"/>
</another>
</X>
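To check that the cycle also survives reading, you can deserialize the data and compare references. This is a rough sketch assuming the same attributed X class as above; RoundTripDemo and RoundTrip are hypothetical names used only for this example:

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

[DataContract(IsReference = true)]
public class X
{
    [DataMember]
    public X another;
}

public static class RoundTripDemo
{
    // Serializes the graph into memory and immediately reads it back.
    public static X RoundTrip(X root)
    {
        var serializer = new DataContractSerializer(typeof(X));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, root);
            stream.Position = 0; // rewind before reading
            return (X)serializer.ReadObject(stream);
        }
    }

    public static void Main()
    {
        var first = new X();
        var second = new X();
        first.another = second;
        second.another = first;

        var copy = RoundTrip(first);

        // The copy is a new object graph, not the original instances...
        Console.WriteLine(ReferenceEquals(copy, first)); // False
        // ...but the z:Ref entry is resolved, so the cycle is intact.
        Console.WriteLine(ReferenceEquals(copy.another.another, copy)); // True
    }
}
```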
