Json.Net - performant deserialization onto interface based data structure? - c#

As soon as the data structures I deserialize a HUGE JSON tree into contain anything other than concrete types, Json.NET starts using enormous amounts of memory, whereas the memory footprint stays relatively slim when deserializing into entirely concrete types. Is there an elegant workaround for this?
The JSON I get is generated elsewhere, so I have no influence over its format (it's a tree structure, similar to what the code example below would produce if it were serialized to JSON directly), and in the worst case there is about 250-300 MB of it.
My data structure for mapping it used to look somewhat like the following example (with structs in some places, though):
public class Node : INode
{
[JsonConverter(typeof(NodeTypeConverter<IInnerNode, InnerNodeType1>))]
public List<INodeInner> InnerNodes { get; set; }
}
public class InnerNodeNodeType1 : INode
{
[JsonConverter(typeof(NodeTypeConverter<IInnerNode, InnerNodeType2>))]
public List<INodeInner> InnerNodes { get; set; }
// some other properties
}
public class InnerNodeNodeType2 : INode
{
[JsonConverter(typeof(NodeTypeConverter<IInnerNode, InnerNodeType3>))]
public List<INodeInner> InnerNodes { get; set; }
// some even different properties
}
…
However, I did not find a way to map this without bringing the PC it runs on to its knees, especially memory-wise (apart from that, in some places with List<interface> I didn't even get Json.NET to use the converter; it threw the error "Could not create an instance of type {type}. Type is an interface or abstract class and cannot be instantiated." before even checking the converter class…).
So now, I changed it to all-concrete types/Lists of concrete type instances instead of the interfaces plus a converter, and it runs with MUCH less of a memory footprint (orders of magnitude!). But it's inelegant, because this way, I can't reuse most of the classes for different kinds of trees I'll have to use in other places of the program, which are similar, but subtly different.
Is there an elegant solution for this?
PS: Thanks for reading this far! This question might not be perfectly posed and/or contain all and any type of info you might need to suggest a solution. I've found, however, that anytime I tried to cover all bases and anticipate all further questions, I got no responses at all, so that's my attempt to ask differently this time… :P

You don't provide a concrete example of your problem, but you did write: "I changed it to all-concrete types/Lists of concrete type instances instead of the interfaces plus a converter, and it runs with MUCH less of a memory footprint (orders of magnitude!)." It sounds as though you must be loading large chunks of JSON into memory in some intermediate representation, such as a string for the entire JSON or a JArray for the complete contents of your public List<INodeInner> InnerNodes arrays. After that you are converting the intermediate representation(s) into your final objects.
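To make that pattern concrete, here is a minimal sketch of the kind of list-level converter that would behave this way. It is only a guess at how something like NodeTypeConverter could have been written, since the question doesn't show it; IItem, Item and BufferingListConverter are made-up names used purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public interface IItem
{
    List<IItem> Children { get; set; }
}

public class Item : IItem
{
    public int Value { get; set; }

    // The converter is applied to the whole list, not to the individual items.
    [JsonConverter(typeof(BufferingListConverter<IItem, Item>))]
    public List<IItem> Children { get; set; }
}

// Hypothetical converter, NOT taken from the question: it buffers the complete array first.
public class BufferingListConverter<TInterface, TConcrete> : JsonConverter
    where TConcrete : TInterface
{
    public override bool CanConvert(Type objectType)
    {
        return objectType == typeof(List<TInterface>);
    }

    public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
    {
        if (reader.TokenType == JsonToken.Null)
            return null;

        // The whole array, including every nested level below it, is materialized as JTokens here.
        JArray buffered = JArray.Load(reader);

        var list = new List<TInterface>(buffered.Count);
        foreach (JToken item in buffered)
        {
            // Nested lists run through this converter again and buffer their own subtree once more.
            list.Add(item.ToObject<TConcrete>(serializer));
        }
        return list;
    }

    public override bool CanWrite { get { return false; } }

    public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
    {
        throw new NotImplementedException();
    }
}
```

With a converter shaped like this, the array under the root holds nearly the whole file as a JArray before a single concrete object exists, and every nesting level repeats the buffering for its own subtree.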
What you need to do is to avoid loading any intermediate representations, or if you must do so, load only the smallest possible JSON chunk at once. The following is an example implementation:
public interface INode
{
List<INodeInner> InnerNodes { get; set; }
}
public interface INodeInner : INode
{
}
public class Node : INode
{
[JsonProperty(ItemConverterType = typeof(InterfaceToConcreteConverter<INodeInner, InnerNodeNodeType1>))]
public List<INodeInner> InnerNodes { get; set; }
}
public class InnerNodeNodeType1 : INodeInner
{
[JsonProperty(ItemConverterType = typeof(InterfaceToConcreteConverter<INodeInner, InnerNodeNodeType2>))]
public List<INodeInner> InnerNodes { get; set; }
// some other properties
public int Type1Property { get; set; }
}
public class InnerNodeNodeType2 : INodeInner
{
[JsonProperty(ItemConverterType = typeof(InterfaceToConcreteConverter<INodeInner, InnerNodeNodeType3>))]
public List<INodeInner> InnerNodes { get; set; }
// some even different properties
public int Type2Property { get; set; }
}
public class InnerNodeNodeType3 : INodeInner
{
[JsonProperty(ItemConverterType = typeof(InterfaceToConcreteConverter<INodeInner, InnerNodeNodeType3>))]
public List<INodeInner> InnerNodes { get; set; }
// some even different properties
public int Type3Property { get; set; }
}
public class InterfaceToConcreteConverter<TInterface, TConcrete> : JsonConverter where TConcrete : TInterface
{
public InterfaceToConcreteConverter()
{
// TConcrete should be a subtype of an abstract type, or an implementation of an interface. If they
// are identical an infinite recursion could result, so throw an exception.
if (typeof(TInterface) == typeof(TConcrete))
throw new InvalidOperationException(string.Format("typeof({0}) == typeof({1})", typeof(TInterface), typeof(TConcrete)));
}
public override bool CanConvert(Type objectType)
{
return objectType == typeof(TInterface);
}
public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
{
return serializer.Deserialize(reader, typeof(TConcrete));
}
public override bool CanWrite { get { return false; } }
public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
{
throw new NotImplementedException();
}
}
And then, to load:
Node root;
var settings = new JsonSerializerSettings
{
// Whatever settings you need.
};
using (var stream = File.OpenRead(fileName))
using (var textReader = new StreamReader(stream))
using (var reader = new JsonTextReader(textReader))
{
root = JsonSerializer.CreateDefault(settings).Deserialize<Node>(reader);
}
Notes:
Rather than writing a converter for the entire List<INodeInner> InnerNodes and applying it with [JsonConverter(typeof(NodeTypeConverter<IInnerNode, InnerNodeType1>))], I created a converter for each list item and apply it by setting JsonPropertyAttribute.ItemConverterType:
[JsonProperty(ItemConverterType = typeof(InterfaceToConcreteConverter<INodeInner, InnerNodeNodeType1>))]
public List<INodeInner> InnerNodes { get; set; }
This simplifies the converter and guarantees that, if the converter needs to preload the JSON into an intermediate JToken, only one list item is preloaded and converted at a time.
Since, in your example, the type of INodeInner is fixed for each type of INode, it isn't even necessary to preload individual list items. Instead, in JsonConverter.ReadJson(), deserialize directly from the incoming JsonReader using the correct concrete type:
public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
{
return serializer.Deserialize(reader, typeof(TConcrete));
}
As explained in Newtonsoft Performance Tips: Optimize Memory Usage, when deserializing a large JSON file, deserialize directly from a stream:
using (var stream = File.OpenRead(fileName))
using (var textReader = new StreamReader(stream))
using (var reader = new JsonTextReader(textReader))
{
root = JsonSerializer.CreateDefault(settings).Deserialize<Node>(reader);
}
Sample fiddle showing this working.

Related

How to use JSON.NET to serialize a list of POCO's as a list of strings, to reduce size?

I've got a simple POCO like this:
public class Edit { ... }
public class CharInsert : Edit
{
public int ParagraphIndex { get; set; }
public int CharacterIndex { get; set; }
public char Character { get; set; }
}
which serializes in JSON like this (note that I'm recording the object type, because of the inheritance):
{
"$type": "MyNamespace.CharInsert, MyAssembly",
"paragraphIndex": 7,
"characterIndex": 15,
"character": "e"
}
But this takes up a HUGE amount of space for a fairly little amount of data. And I have a LOT of them, so I need to be more compact about it.
I made a custom JsonConverter so that it will instead serialize as this:
"CharInsert|7|15|e"
and when I persist a list of these, I get:
[
"CharInsert|7|12|Z",
"CharInsert|7|13|w",
"CharInsert|7|14|i",
"CharInsert|7|15|e"
]
But when I try to deserialize this list, I get the error:
Error converting value "CharInsert|7|12|Z" to type 'MyNamespace.Edit'
I suppose this is because the actual object is a subclass of the Edit type and it doesn't know which one because it doesn't know how to parse the string. How can I implement this so it can parse the string, resolve the typename contained therein, and then create the needed object type?
As an alternative to custom converters, consider using [JsonProperty(PropertyName = "")] to shorten the JSON property names. This should decrease the size, and you don't have to worry about custom converters.
public class CharInsert : Edit
{
[JsonProperty(PropertyName = "p")]
public int ParagraphIndex { get; set; }
[JsonProperty(PropertyName = "i")]
public int CharacterIndex { get; set; }
[JsonProperty(PropertyName = "c")]
public char Character { get; set; }
}
I figured it out. The issue is that, without type information in the serialized string ("CharInsert|7|15|e"), the deserializer doesn't know what derived class to call.
So I made a JsonConverter for the base Edit type that knows how to parse the string and create and return an object from that string:
public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
{
//get the persisted value
var s = reader.Value?.ToString();
var fields = s.Split('|');
var typeName = ...get the type name from the field
var type = Type.GetType(typeName);
//create an object from the remaining fields using the ctor(string value) that
//each subclass must have
return System.Activator.CreateInstance(type, new object[] { string.Join("|", fields.Skip(1)) });
}
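For reference, here is a self-contained sketch of the same idea. It swaps the Type.GetType plus ctor(string) approach for a lookup table of factory methods keyed by the short name in the first field; CompactEditConverter, Factories and ParseCharInsert are hypothetical names, and it assumes the persisted format is exactly "TypeName|field|field|...".

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using Newtonsoft.Json;

public abstract class Edit { }

public class CharInsert : Edit
{
    public int ParagraphIndex { get; set; }
    public int CharacterIndex { get; set; }
    public char Character { get; set; }
}

// Hypothetical converter: reads and writes edits as "TypeName|field|field|..." strings.
public class CompactEditConverter : JsonConverter
{
    // Maps the short type name (first field) to a method that parses the remaining fields.
    private static readonly Dictionary<string, Func<string, Edit>> Factories =
        new Dictionary<string, Func<string, Edit>>
        {
            { "CharInsert", ParseCharInsert }
        };

    private static Edit ParseCharInsert(string payload)
    {
        // Limit the split to 3 parts so that a '|' character survives as the last field.
        string[] f = payload.Split(new[] { '|' }, 3);
        return new CharInsert
        {
            ParagraphIndex = int.Parse(f[0], CultureInfo.InvariantCulture),
            CharacterIndex = int.Parse(f[1], CultureInfo.InvariantCulture),
            Character = f[2][0]
        };
    }

    public override bool CanConvert(Type objectType)
    {
        return typeof(Edit).IsAssignableFrom(objectType);
    }

    public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
    {
        if (reader.TokenType == JsonToken.Null)
            return null;
        if (reader.TokenType != JsonToken.String)
            throw new JsonSerializationException("Expected a string token for an Edit.");

        // Separate the type name from the rest of the persisted value.
        string[] parts = ((string)reader.Value).Split(new[] { '|' }, 2);
        Func<string, Edit> factory;
        if (parts.Length != 2 || !Factories.TryGetValue(parts[0], out factory))
            throw new JsonSerializationException("Unknown edit: " + reader.Value);
        return factory(parts[1]);
    }

    public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
    {
        var insert = value as CharInsert;
        if (insert == null)
            throw new JsonSerializationException("Unsupported edit type: " + value.GetType());
        writer.WriteValue(string.Format(CultureInfo.InvariantCulture, "CharInsert|{0}|{1}|{2}",
            insert.ParagraphIndex, insert.CharacterIndex, insert.Character));
    }
}
```

Usage would be along the lines of JsonConvert.DeserializeObject<List<Edit>>(json, new CompactEditConverter()). A table of factories avoids reflection and keeps the persisted names independent of namespace and assembly names, at the cost of registering each subclass by hand.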

C# Newtonsoft JSON - Deserializing Object with collection of unknown objects

I'm struggling with deserialization of the json file using the newtonsoft.json. Object which I want to deserialize looks like this:
public class Device
{
public string Name { get; set; }
public int Id { get; set; }
public string Type { get; set; }
public List<Sensor> Sensors { get; }
public bool IsPaired { get; set; }
}
The Sensor class is abstract.
I have multiple classes which inherit from Sensor class (TemperatureSensor, WaterLevelSensor etc.) and add some new properties. Instances of these classes are stored in Sensors collection.
Json file looks like this:
[
{
"Name":"Device1",
"Id":0,
"Type":"TemperatureSensor",
"Sensors":[
{
"Id":0,
"Type":"TemperatureSensor",
"Data":18.136218099999997,
"ReadIntervalInMiliseconds":5000
},
{
"Id":1,
"Type":"TemperatureSensor",
"Data":18.0999819,
"ReadIntervalInMiliseconds":5000
}
],
"IsPaired":false
},
{
"Name":"Device2",
"Id":1,
"Type":"AutomaticGate",
"Sensors":[
{
"OpenPercentage":0,
"Id":0,
"Type":"AutomaticGate",
"Data":0.0,
"ReadIntervalInMiliseconds":0
}
],
"IsPaired":false
},
{
"Name":"Device3",
"Id":2,
"Type":"Other",
"Sensors":[
{
"IsActive":false,
"Id":0,
"Type":"AirConditioner",
"Data":0.0,
"ReadIntervalInMiliseconds":0
},
{
"Id":1,
"Type":"LightSensor",
"Data":4.0,
"ReadIntervalInMiliseconds":5000
}
],
"IsPaired":false
}
]
I assume that I have to read the "Type" of each Sensor from the JSON file, create the matching object on that basis, add it to a collection, and then return a Device object with this collection.
I was trying to make custom JsonConverter like in this blog post but with little effect.
You can create a custom JsonConverter to convert Sensor objects to concrete derived classes. Here's a working example of such a JsonConverter:
public class SensorConverter : JsonConverter
{
public override bool CanRead => true;
public override bool CanWrite => false;
public override bool CanConvert(Type objectType)
{
// Don't do IsAssignableFrom tricks here, because you only know how to convert the abstract class Sensor.
return objectType == typeof(Sensor);
}
public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
{
var jObject = JObject.Load(reader);
string sensorType = jObject["Type"].Value<string>();
switch (sensorType)
{
case "TemperatureSensor":
return jObject.ToObject<TemperatureSensor>(serializer);
case "AutomaticGate":
return jObject.ToObject<AutomaticGate>(serializer);
case "AirConditioner":
return jObject.ToObject<AirConditioner>(serializer);
case "LightSensor":
return jObject.ToObject<LightSensor>(serializer);
default:
throw new NotSupportedException($"Sensor type '{sensorType}' is not supported.");
}
}
public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer) => throw new NotImplementedException();
}
Then, when deserializing, you will have to add your custom converter to the settings in order for this to work.
Note that your Sensors property is get-only at the moment. You will have to provide a setter in order for Newtonsoft to populate the property.
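Putting those two points together, the registration could look roughly like the sketch below. The converter is a trimmed copy with only two of the cases, the sensor classes are reduced to the properties visible in the JSON, and DeviceLoader is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public abstract class Sensor
{
    public int Id { get; set; }
    public string Type { get; set; }
    public double Data { get; set; }
    public int ReadIntervalInMiliseconds { get; set; }
}

public class TemperatureSensor : Sensor { }
public class LightSensor : Sensor { }

public class Device
{
    public string Name { get; set; }
    public int Id { get; set; }
    public string Type { get; set; }
    // Now has a setter, so the serializer can assign the list.
    public List<Sensor> Sensors { get; set; }
    public bool IsPaired { get; set; }
}

// Trimmed copy of the converter: only two sensor types are handled here.
public class SensorConverter : JsonConverter
{
    public override bool CanWrite => false;

    public override bool CanConvert(Type objectType) => objectType == typeof(Sensor);

    public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
    {
        var jObject = JObject.Load(reader);
        string sensorType = (string)jObject["Type"];
        switch (sensorType)
        {
            case "TemperatureSensor":
                return jObject.ToObject<TemperatureSensor>(serializer);
            case "LightSensor":
                return jObject.ToObject<LightSensor>(serializer);
            default:
                throw new NotSupportedException($"Sensor type '{sensorType}' is not supported.");
        }
    }

    public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer) => throw new NotImplementedException();
}

// Hypothetical helper that shows where the converter gets registered.
public static class DeviceLoader
{
    public static List<Device> Load(string json)
    {
        var settings = new JsonSerializerSettings();
        settings.Converters.Add(new SensorConverter());
        return JsonConvert.DeserializeObject<List<Device>>(json, settings);
    }
}
```

Because CanConvert only matches the abstract Sensor type, the nested ToObject calls for the concrete classes don't re-enter the converter.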
Another solution that requires much less code is using JsonSubTypes.
Assuming an abstract Sensor class, you need to register each known subclass and its identifier via a custom attribute. So in your case, the identifier is the property named "Type" and the class mappings are in the KnownSubType attributes.
[JsonConverter(typeof(JsonSubtypes), "Type")]
[JsonSubtypes.KnownSubType(typeof(TemperatureSensor), "TemperatureSensor")]
[JsonSubtypes.KnownSubType(typeof(WaterLevelSensor), "WaterLevelSensor")]
[JsonSubtypes.KnownSubType(typeof(AirConditioner), "AirConditioner")]
[JsonSubtypes.KnownSubType(typeof(AutomaticGate), "AutomaticGate")]
[JsonSubtypes.KnownSubType(typeof(LightSensor), "LightSensor")]
public abstract class Sensor
{
}
In your Device class, the Sensors property must have a setter.
public List<Sensor> Sensors { get; set; }
Usage:
var items = JsonConvert.DeserializeObject<List<Device>>(json);

Generic Tree to JSON

I have a generic tree structure representing employees in an organization chart.
The tree consists of a graph of custom Node<Person> objects which have references to each other and other properties like Level showing their level in the tree, parent, siblings, etc.
I have to serialize a portion of this organization chart from a specific Person on down through everyone below them, and I have a property on the Node object called SelfAndDescendants that returns an IEnumerable<Node<Person>>.
So basically I locate the specific person's Node in the tree, then get them and all their descendants in an IEnumerable. This part works fine.
That's where I am stuck. I now need to get this IEnumerable set of Nodes into hierarchical JSON.
My first attempt was just to throw it straight at the JSON serializer but that does not work (nor did I really expect it to), because it's a set of generic Node objects. There is a Value property on the Node object that will return a Person object ... which is what I need to get into the JSON (just the name).
string json = JsonConvert.SerializeObject(personNode.SelfAndDescendants.ToList());
This obviously is trying to serialize a List<Node<Person>> at this point, which is not what I need. All the JSON returned needs is a hierarchical format with simply the Person object's Name. Nothing else.
Do I have to do anything manually here in a loop to build custom JSON and return that?
This is not a duplicate of this post, as I am dealing with a generic recursive tree here and not a simple generic data structure.
Do I have to implement a custom JsonConverter? How does this work with a series of Node objects in a tree?
The Node class has all sorts of properties but it basically looks like this:
public class Node<T> : IEqualityComparer, IEnumerable<T>, IEnumerable<Node<T>> {
public Node(T value) {
Value = value;
}
public Node<T> this[int index] {
get {
return _children[index];
}
}
public Node<T> Add(T value, int index = -1) {
var childNode = new Node<T>(value);
Add(childNode, index);
return childNode;
}
public IEnumerable<Node<T>> SelfAndDescendants {
get {
return this.ToIEnumarable().Concat(Children.SelectMany(c => c.SelfAndDescendants));
}
}
}
The Person class is just a POCO class representing a person. This class is already serializing out to JSON correctly for another part of the system.
[JsonObject]
public class Person {
public string Title { get; set; }
public DateTime DateOfBirth { get; set; }
[JsonConverter(typeof(StringEnumConverter))]
public Gender Gender { get; set; }
public List<StreetAddress> Addresses { get; set; }
... etc
}
The desired output is an organization chart, showing people's levels in JSON. So employee under their boss, that boss under their boss, etc etc.
The JSON is extremely simple in this regard; it's just the person's name and title. It can even just be a single string per employee.
Yes, given your constraints, I would say that creating a custom JsonConverter is an appropriate solution for this situation. It is actually pretty straightforward to write, so long as Node<T> has public properties to allow at least read access to the Value and the Children. You don't have to worry about looping; the serializer will call back into the converter for each child as it iterates over the Children collection via the JArray.FromObject call.
Here's how I would write it:
public class OrgChartConverter : JsonConverter
{
public override bool CanConvert(Type objectType)
{
return (objectType == typeof(Node<Person>));
}
public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
{
Node<Person> node = (Node<Person>)value;
JObject obj = new JObject();
obj.Add("Name", node.Value.Name);
obj.Add("Subordinates", JArray.FromObject(node.Children, serializer));
obj.WriteTo(writer);
}
public override bool CanRead
{
get { return false; }
}
public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
{
throw new NotImplementedException();
}
}
Then, use the converter like this:
var settings = new JsonSerializerSettings
{
Converters = new List<JsonConverter> { new OrgChartConverter() },
Formatting = Formatting.Indented
};
string json = JsonConvert.SerializeObject(rootNode, settings);
Here is a demo: https://dotnetfiddle.net/BfdFdW
You want to describe how this class should be serialized by adding serialization attributes on the class and on the members needed, and then serialize to a string:
string nodes = JsonConvert.SerializeObject(personNode);
[JsonObject]
public class Node<T> : IEqualityComparer, IEnumerable<T>, IEnumerable<Node<T>> {
[JsonProperty]
public IEnumerable<Node<T>> Children { get { return _children; } }
...
}
JSON.NET Serialization Attributes

Deserialize derived classes using Json.net without using JObject

I have a large json dataset that I need to deserialize. I am using Json.net's JsonTextReader to read the data.
My problem is that I need to deserialize some derived classes, so I need to be able to "look ahead" for a particular property defining my data type. In the example below, the "type" parameter is used to determine the object type to deserialize.
{
type: "groupData",
groupParam: "groupValue1",
nestedObject:
{
type: "groupData",
groupParam: "groupValue2",
nestedObject:
{
type: "bigData",
arrayData: [ ... ]
}
}
}
My derived objects can be heavily nested and very deep. Loading the entire dataset in memory is not desired since it will require much memory. Once I get down to the "bigData" object, I will be processing the data (such as the array in the example above), but it will not be stored in memory (it is too big).
All solutions to my problem that I have seen so far have utilized JObject to deserialize the partial objects. I want to avoid using JObject because it will deserialize every object down the hierarchy repeatedly.
How can I solve my deserialization issue?
Is there any way to search ahead for the "type" parameter, then backtrack to the start of the object's { character to start processing?
I'm not aware of any way to preempt the loading of the object in order to specify a lookahead (at least not in Json.NET), but you could use the other attribute-based configuration items at your disposal in order to ignore unwanted properties:
public class GroupData {
[JsonIgnore]
public string groupParam { get; set; }
[JsonIgnore]
public GroupData nestedObject { get; set; }
public string[] arrayData { get; set; }
}
Alternatively, you can give custom creation converters a try:
For example:
public class GroupData {
[JsonIgnore]
public string groupParam { get; set; }
[JsonIgnore]
public GroupData nestedObject { get; set; }
}
public class BigData : GroupData {
public string[] arrayData { get; set; }
}
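public class ObjectConverter<T> : CustomCreationConverter<T> where T : new()
{
public ObjectConverter() { }
// CustomCreationConverter<T> is abstract and requires Create to be implemented.
public override T Create(Type objectType)
{
return new T();
}
public override bool CanConvert(Type objectType)
{
return objectType.Name == "BigData";
}
public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
{
// Some additional checks/work?
T target = Create(objectType);
// Fill the new instance directly from the reader, without an intermediate JObject.
serializer.Populate(reader, target);
return target;
}
}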

Serialize list of interface types with ServiceStack.Text

I'm looking at ways to introduce something other than BinaryFormatter serialization into my app to eventually work with Redis. ServiceStack JSON is what I would like to use, but can it do what I need with interfaces?
It can serialize (by inserting custom __type attribute)
public IAsset Content;
but not
public List<IAsset> Contents;
- the list comes up empty in serialized data. Is there any way to do this - serialize a list of interface types?
The app is big and old and the shape of objects it uses is probably not going to be allowed to change.
Thanks
Quoting from http://www.servicestack.net/docs/framework/release-notes
You probably don't have to do much :)
The JSON and JSV Text serializers now support serializing and
deserializing DTOs with Interface / Abstract or object types. Amongst
other things, this allows you to have an IInterface property which
when serialized will include its concrete type information in a __type
property field (similar to other JSON serializers) which when
serialized populates an instance of that concrete type (provided it
exists).
[...]
Note: This feature is automatically added to all
Abstract/Interface/Object types, i.e. you don't need to include any
[KnownType] attributes to take advantage of it.
By not much:
public interface IAsset
{
string Bling { get; set; }
}
public class AAsset : IAsset
{
public string Bling { get; set; }
public override string ToString()
{
return "A" + Bling;
}
}
public class BAsset : IAsset
{
public string Bling { get; set; }
public override string ToString()
{
return "B" + Bling;
}
}
public class AssetBag
{
[JsonProperty(TypeNameHandling = TypeNameHandling.None)]
public List<IAsset> Assets { get; set; }
}
class Program
{
static void Main(string[] args)
{
try
{
var bag = new AssetBag
{
Assets = new List<IAsset> {new AAsset {Bling = "Oho"}, new BAsset() {Bling = "Aha"}}
};
string json = JsonConvert.SerializeObject(bag, new JsonSerializerSettings()
{
TypeNameHandling = TypeNameHandling.Auto
});
var anotherBag = JsonConvert.DeserializeObject<AssetBag>(json, new JsonSerializerSettings()
{
TypeNameHandling = TypeNameHandling.Auto
});
