I have a large json dataset that I need to deserialize. I am using Json.net's JsonTextReader to read the data.
My problem is that I need to deserialize some derived classes, so I need to be able to "look ahead" for a particular property defining my data type. In the example below, the "type" parameter is used to determine the object type to deserialize.
{
type: "groupData",
groupParam: "groupValue1",
nestedObject:
{
type: "groupData",
groupParam: "groupValue2",
nestedObject:
{
type: "bigData",
arrayData: [ ... ]
}
}
}
My derived objects can be heavily nested and very deep. Loading the entire dataset into memory is not an option because it would require too much memory. Once I get down to the "bigData" object, I will process the data (such as the array in the example above) as it is read, without storing it in memory (it is too big).
All solutions to my problem that I have seen so far have utilized JObject to deserialize the partial objects. I want to avoid using JObject because it will deserialize every object down the hierarchy repeatedly.
How can I solve my deserialization issue?
Is there any way to search ahead for the "type" parameter, then backtrack to the start of the object's { character to start processing?
I am not aware of any way to preempt the loading of the object in order to specify a lookahead (at least not in Json.NET), but you could use the other attribute-based configuration options at your disposal to ignore unwanted properties:
public class GroupData {
[JsonIgnore]
public string groupParam { get; set; }
[JsonIgnore]
public GroupData nestedObject { get; set; }
public string[] arrayData { get; set; }
}
Alternatively, you can give custom creation converters a try:
For example..
public class GroupData {
[JsonIgnore]
public string groupParam { get; set; }
[JsonIgnore]
public GroupData nestedObject { get; set; }
}
public class BigData : GroupData {
public string[] arrayData { get; set; }
}
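public class ObjectConverter<T> : CustomCreationConverter<T> where T : new()
{
public ObjectConverter() { }
// CustomCreationConverter<T> is abstract and requires Create to be implemented.
public override T Create(Type objectType)
{
return new T();
}
public override bool CanConvert(Type objectType)
{
return objectType.Name == "BigData";
}
public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
{
if (reader.TokenType == JsonToken.Null) return null;
// Some additional checks/work?
T target = Create(objectType);
serializer.Populate(reader, target);
return target;
}
}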
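If the "type" property can be relied on to appear first in every object (an assumption, since JSON does not guarantee property order), no backtracking is needed: the tokens can be walked directly with JsonTextReader and nothing has to be buffered. The following is a minimal sketch of that idea, not a general solution; StreamingWalker and Walk are hypothetical names, and the counting stands in for whatever processing the array items need.

```csharp
using System;
using System.IO;
using Newtonsoft.Json;

public static class StreamingWalker
{
    // Assumptions: the reader is positioned on a StartObject token and
    // "type" is the first property of every object.
    public static int Walk(JsonReader reader)
    {
        int itemCount = 0;
        string type = null;
        while (reader.Read() && reader.TokenType != JsonToken.EndObject)
        {
            // Inside an object, each Read() at this point lands on a property name.
            string name = (string)reader.Value;
            reader.Read(); // advance to the property's value
            if (name == "type")
            {
                type = (string)reader.Value;
            }
            else if (name == "nestedObject" && reader.TokenType == JsonToken.StartObject)
            {
                itemCount += Walk(reader); // recurse without buffering anything
            }
            else if (name == "arrayData" && type == "bigData")
            {
                // Stream the array one element at a time; nothing is retained.
                while (reader.Read() && reader.TokenType != JsonToken.EndArray)
                {
                    itemCount++;   // process the element here
                    reader.Skip(); // step over the element if it is an object or array
                }
            }
            else
            {
                reader.Skip(); // ignore the value, including nested containers
            }
        }
        return itemCount;
    }

    public static void Main()
    {
        string json = @"{ ""type"": ""groupData"", ""groupParam"": ""g1"", ""nestedObject"":
            { ""type"": ""bigData"", ""arrayData"": [1, 2, 3] } }";
        using (var reader = new JsonTextReader(new StringReader(json)))
        {
            reader.Read(); // move to the root StartObject
            Console.WriteLine(Walk(reader)); // prints 3
        }
    }
}
```

If "type" can appear after the large payload, this approach does not work as written, and some buffering of the properties seen before "type" becomes unavoidable.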
Related
I've got a simple POCO like this:
public class Edit { ... }
public class CharInsert : Edit
{
public int ParagraphIndex { get; set; }
public int CharacterIndex { get; set; }
public char Character { get; set; }
}
which serializes in JSON like this (note that I'm recording the object type, because of the inheritance):
{
"$type": "MyNamespace.CharInsert, MyAssembly",
"paragraphIndex": 7,
"characterIndex": 15,
"character": "e"
}
But this takes up a HUGE amount of space for a fairly small amount of data. And I have a LOT of them, so I need to be more compact about it.
I made a custom JsonConverter so that it will instead serialize as this:
"CharInsert|7|15|e"
and when I persist a list of these, I get:
[
"CharInsert|7|12|Z",
"CharInsert|7|13|w",
"CharInsert|7|14|i",
"CharInsert|7|15|e"
]
But when I try to deserialize this list, I get the error:
Error converting value "CharInsert|7|12|Z" to type 'MyNamespace.Edit'
I suppose this is because the actual object is a subclass of the Edit type and it doesn't know which one because it doesn't know how to parse the string. How can I implement this so it can parse the string, resolve the typename contained therein, and then create the needed object type?
As an alternative to custom converters, consider using [JsonProperty(PropertyName = "")] to shorten the JSON property names. This should reduce the size, and you don't have to worry about custom converters.
public class CharInsert : Edit
{
[JsonProperty(PropertyName = "p")]
public int ParagraphIndex { get; set; }
[JsonProperty(PropertyName = "i")]
public int CharacterIndex { get; set; }
[JsonProperty(PropertyName = "c")]
public char Character { get; set; }
}
I figured it out. The issue is that, without type information in the serialized string ("CharInsert|7|15|e"), the deserializer doesn't know what derived class to call.
So I made a JsonConverter for the base Edit type that knows how to parse the string and create and return an object from it:
public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
{
//get the persisted value
var s = reader.Value?.ToString();
var fields = s.Split('|');
var typeName = ...get the type name from the field
var type = Type.GetType(typeName);
//create an object from the remaining fields using the ctor(string value) that
//each subclass must have
return System.Activator.CreateInstance(type, new object[] { fields.Skip(1).ToJoinedString("|") });
}
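For reference, here is a self-contained sketch of a converter for this compact format. It is a variation on the approach above rather than the exact code: instead of Type.GetType plus a string constructor, it uses a hypothetical lookup table (Factories) from the leading field to a factory delegate, and EditConverter is an assumed name. It also assumes that no field contains the '|' separator.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;

public abstract class Edit { }

public class CharInsert : Edit
{
    public int ParagraphIndex { get; set; }
    public int CharacterIndex { get; set; }
    public char Character { get; set; }
}

// Hypothetical converter: the first field selects the factory for the subclass.
public class EditConverter : JsonConverter
{
    private static readonly Dictionary<string, Func<string[], Edit>> Factories =
        new Dictionary<string, Func<string[], Edit>>
        {
            ["CharInsert"] = f => new CharInsert
            {
                ParagraphIndex = int.Parse(f[1]),
                CharacterIndex = int.Parse(f[2]),
                Character = f[3][0]
            }
        };

    public override bool CanConvert(Type objectType) => typeof(Edit).IsAssignableFrom(objectType);

    public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
    {
        if (reader.TokenType == JsonToken.Null) return null;
        string[] fields = ((string)reader.Value).Split('|');
        Func<string[], Edit> create;
        if (!Factories.TryGetValue(fields[0], out create))
            throw new JsonSerializationException("Unknown edit type: " + fields[0]);
        return create(fields);
    }

    public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
    {
        var c = (CharInsert)value; // this sketch only knows one subclass
        writer.WriteValue("CharInsert|" + c.ParagraphIndex + "|" + c.CharacterIndex + "|" + c.Character);
    }
}

public static class Program
{
    public static void Main()
    {
        string json = "[\"CharInsert|7|12|Z\", \"CharInsert|7|13|w\"]";
        List<Edit> edits = JsonConvert.DeserializeObject<List<Edit>>(json, new EditConverter());
        Console.WriteLine(((CharInsert)edits[1]).Character); // prints w
    }
}
```

Because CanConvert accepts Edit and every subclass, the same converter handles a List<Edit> on both the read and the write side.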
I need to validate a JSON response dynamically. The schema should depend on the value of a specific key.
Please note that I am using JSON.NET 5.0.8 and cannot upgrade to a higher version due to compatibility with our infrastructure.
If the value of "type" is not "SumProperty" ("CountersProperty" in the example), then "rule" should be validated as a string:
"property": [
{
"type": "CountersProperty",
"rule": "MEQUAL"
}
]
But! If the value of "type" is "SumProperty", then "rule" should be validated as an array (and "rule" should be inside "config"):
"property": [
{
"type": "SumProperty",
"config": {
"rule": [
{
"type": "MEQUAL",
"value": 2
}
]
}
}
]
So I need some kind of dynamic validation that can "understand" what kind of property we have and validate it appropriately. A JSON response can contain several of these properties at the same time, so I can't choose one kind of validation or the other; it has to work dynamically.
You can do this by implementing a custom JsonConverter.
I made the following classes based on your sample input:
public class Schema
{
[JsonProperty("Property")]
public List<Property> Properties { get; set; }
}
public abstract class Property
{
public string Type { get; set; }
}
public class NotSumProperty : Property
{
public string Rule { get; set; }
}
public class SumProperty : Property
{
public Config Config { get; set; }
}
public class Config
{
[JsonProperty("Rule")]
public List<Rule> Rules { get; set; }
}
public class Rule
{
public string Type { get; set; }
public int Value { get; set; }
}
Then we define our custom JsonConverter. We override the ReadJson() method to implement our conversion clause, in this case, we evaluate the type of the Property.
public class PropertyConverter : JsonConverter
{
public override bool CanConvert(Type objectType) => typeof(Property).IsAssignableFrom(objectType);
public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer) { }
public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
{
JObject obj = JObject.Load(reader);
Property p;
switch ((string)obj["type"])
{
case "SumProperty":
p = new SumProperty();
break;
default:
p = new NotSumProperty();
break;
}
serializer.Populate(obj.CreateReader(), p);
return p;
}
}
Finally, here's the usage:
JsonSerializerSettings settings = new JsonSerializerSettings
{
TypeNameHandling = TypeNameHandling.Objects
};
settings.Converters.Add(new PropertyConverter());
Schema schema = JsonConvert.DeserializeObject<Schema>(json, settings);
Another option, if you don't want to write your own converter, is to deserialize into a dynamic object and check these values at run time, as demonstrated here.
This could potentially be more useful if you can't define a clear inheritance pattern, though it does rely on clients to implement more parsing/validation logic themselves. In other words, it's a little easier to hit unexpected exceptions: it basically moves potential issues from compile-time errors to run-time errors.
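A run-time check along those lines could look like the following sketch. It uses JObject rather than dynamic and sticks to basic LINQ-to-JSON calls (JObject.Parse, JToken.Type), which should be available in older Json.NET releases; PropertyValidator and IsValid are hypothetical names, and the rules encoded are only the two described in the question.

```csharp
using System;
using Newtonsoft.Json.Linq;

public static class PropertyValidator
{
    // Returns true when one element of "property" has the shape its "type" calls for.
    public static bool IsValid(JObject property)
    {
        JToken typeToken = property["type"];
        if (typeToken == null || typeToken.Type != JTokenType.String) return false;

        if ((string)typeToken == "SumProperty")
        {
            // "rule" must be an array nested inside "config".
            JObject config = property["config"] as JObject;
            JToken rule = config == null ? null : config["rule"];
            return rule != null && rule.Type == JTokenType.Array;
        }

        // Any other type: "rule" must be a plain string.
        JToken plainRule = property["rule"];
        return plainRule != null && plainRule.Type == JTokenType.String;
    }

    public static void Main()
    {
        JObject root = JObject.Parse(@"{ ""property"": [
            { ""type"": ""CountersProperty"", ""rule"": ""MEQUAL"" },
            { ""type"": ""SumProperty"", ""config"": { ""rule"": [ { ""type"": ""MEQUAL"", ""value"": 2 } ] } },
            { ""type"": ""SumProperty"", ""rule"": ""MEQUAL"" }
        ] }");
        foreach (JToken item in (JArray)root["property"])
        {
            Console.WriteLine(IsValid((JObject)item));
        }
    }
}
```

Each element is checked independently, so a response that mixes both kinds of property is handled without choosing one schema up front.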
I'm struggling with deserializing a JSON file using Newtonsoft.Json. The object I want to deserialize looks like this:
public class Device
{
public string Name { get; set; }
public int Id { get; set; }
public string Type { get; set; }
public List<Sensor> Sensors { get; }
public bool IsPaired { get; set; }
}
The Sensor class is abstract.
I have multiple classes which inherit from Sensor class (TemperatureSensor, WaterLevelSensor etc.) and add some new properties. Instances of these classes are stored in Sensors collection.
Json file looks like this:
[
{
"Name":"Device1",
"Id":0,
"Type":"TemperatureSensor",
"Sensors":[
{
"Id":0,
"Type":"TemperatureSensor",
"Data":18.136218099999997,
"ReadIntervalInMiliseconds":5000
},
{
"Id":1,
"Type":"TemperatureSensor",
"Data":18.0999819,
"ReadIntervalInMiliseconds":5000
}
],
"IsPaired":false
},
{
"Name":"Device2",
"Id":1,
"Type":"AutomaticGate",
"Sensors":[
{
"OpenPercentage":0,
"Id":0,
"Type":"AutomaticGate",
"Data":0.0,
"ReadIntervalInMiliseconds":0
}
],
"IsPaired":false
},
{
"Name":"Device3",
"Id":2,
"Type":"Other",
"Sensors":[
{
"IsActive":false,
"Id":0,
"Type":"AirConditioner",
"Data":0.0,
"ReadIntervalInMiliseconds":0
},
{
"Id":1,
"Type":"LightSensor",
"Data":4.0,
"ReadIntervalInMiliseconds":5000
}
],
"IsPaired":false
}
]
I assume that I have to read the "Type" of each Sensor from the JSON file and, based on it, create the matching object, add it to a collection, and then return a Device object containing that collection.
I was trying to make custom JsonConverter like in this blog post but with little effect.
You can create a custom JsonConverter to convert Sensor objects to concrete derived classes. Here's a working example of such a JsonConverter:
public class SensorConverter : JsonConverter
{
public override bool CanRead => true;
public override bool CanWrite => false;
public override bool CanConvert(Type objectType)
{
// Don't do IsAssignableFrom tricks here, because you only know how to convert the abstract class Sensor.
return objectType == typeof(Sensor);
}
public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
{
var jObject = JObject.Load(reader);
string sensorType = jObject["Type"].Value<string>();
switch (sensorType)
{
case "TemperatureSensor":
return jObject.ToObject<TemperatureSensor>(serializer);
case "AutomaticGate":
return jObject.ToObject<AutomaticGate>(serializer);
case "AirConditioner":
return jObject.ToObject<AirConditioner>(serializer);
case "LightSensor":
return jObject.ToObject<LightSensor>(serializer);
default:
throw new NotSupportedException($"Sensor type '{sensorType}' is not supported.");
}
}
public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer) => throw new NotImplementedException();
}
Then, when deserializing, you will have to add your custom converter to the settings in order for this to work.
Note that your Sensors property is get-only at the moment. You will have to provide a setter in order for Newtonsoft to populate the property.
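Putting those two points together, a usage sketch might look like this. The classes are cut-down stand-ins for the ones in the question (only two sensor types, fewer properties), and the converter is a shortened copy of the one above so that the example compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public abstract class Sensor
{
    public int Id { get; set; }
    public string Type { get; set; }
    public double Data { get; set; }
}

public class TemperatureSensor : Sensor { }

public class AutomaticGate : Sensor
{
    public int OpenPercentage { get; set; }
}

public class Device
{
    public string Name { get; set; }
    public List<Sensor> Sensors { get; set; } // setter added so the list can be assigned
}

// Shortened copy of the converter above, handling two sensor types.
public class SensorConverter : JsonConverter
{
    public override bool CanWrite => false;
    public override bool CanConvert(Type objectType) => objectType == typeof(Sensor);

    public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
    {
        var jObject = JObject.Load(reader);
        string sensorType = (string)jObject["Type"];
        switch (sensorType)
        {
            case "TemperatureSensor": return jObject.ToObject<TemperatureSensor>(serializer);
            case "AutomaticGate": return jObject.ToObject<AutomaticGate>(serializer);
            default: throw new NotSupportedException($"Sensor type '{sensorType}' is not supported.");
        }
    }

    public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer) => throw new NotImplementedException();
}

public static class Program
{
    public static void Main()
    {
        string json = @"[ { ""Name"": ""Device1"", ""Sensors"": [
            { ""Id"": 0, ""Type"": ""TemperatureSensor"", ""Data"": 18.1 },
            { ""Id"": 1, ""Type"": ""AutomaticGate"", ""OpenPercentage"": 40 } ] } ]";

        // Register the converter through the settings object.
        var settings = new JsonSerializerSettings();
        settings.Converters.Add(new SensorConverter());

        List<Device> devices = JsonConvert.DeserializeObject<List<Device>>(json, settings);
        Console.WriteLine(devices[0].Sensors[1].GetType().Name); // prints AutomaticGate
    }
}
```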
Another solution that requires much less code is using JsonSubTypes
Assuming an abstract Sensor class, you register each known subclass and its identifier via custom attributes. In your case, the identifier is the property named "Type", and the class mappings are in the KnownSubType attributes.
[JsonConverter(typeof(JsonSubtypes), "Type")]
[JsonSubtypes.KnownSubType(typeof(TemperatureSensor), "TemperatureSensor")]
[JsonSubtypes.KnownSubType(typeof(WaterLevelSensor), "WaterLevelSensor")]
[JsonSubtypes.KnownSubType(typeof(AirConditioner), "AirConditioner")]
[JsonSubtypes.KnownSubType(typeof(AutomaticGate), "AutomaticGate")]
[JsonSubtypes.KnownSubType(typeof(LightSensor), "LightSensor")]
public abstract class Sensor
{
}
In your Device class, the Sensors property must have a setter:
public List<Sensor> Sensors { get; set;}
Usage:
var items = JsonConvert.DeserializeObject<List<Device>>(json);
As soon as I don't have only concrete types in the data structures I deserialize a HUGE json tree structure into, it starts using enormous amounts of memory, but its memory footprint stays relatively slim when deserializing into entirely concrete types… is there an elegant workaround for this?
The json I get is generated elsewhere, so I have no influence as to the format I get it in (it's a tree structure, similar to the code example below if it were serialized to json directly), and in the worst case about 250-300MB of it.
My data structure for mapping it used to look somewhat like the following example (structs in some places, though)
public class Node : INode
{
[JsonConverter(typeof(NodeTypeConverter<IInnerNode, InnerNodeType1>))]
public List<INodeInner> InnerNodes { get; set; }
}
public class InnerNodeNodeType1 : INode
{
[JsonConverter(typeof(NodeTypeConverter<IInnerNode, InnerNodeType2>))]
public List<INodeInner> InnerNodes { get; set; }
// some other properties
}
public class InnerNodeNodeType2 : INode
{
[JsonConverter(typeof(NodeTypeConverter<IInnerNode, InnerNodeType3>))]
public List<INodeInner> InnerNodes { get; set; }
// some even different properties
}
…
however, I did not find a way to map this without bringing the PC it runs on to its knees, especially memory-wise (apart from that, in some places with List<interface> I didn't even get json.Net to use the converter, it threw an error Could not create an instance of type {type}. Type is an interface or abstract class and cannot be instantiated. before even checking the converter class…).
So now, I changed it to all-concrete types/Lists of concrete type instances instead of the interfaces plus a converter, and it runs with MUCH less of a memory footprint (orders of magnitude!). But it's inelegant, because this way, I can't reuse most of the classes for different kinds of trees I'll have to use in other places of the program, which are similar, but subtly different.
Is there an elegant solution for this?
PS: Thanks for reading this far! This question might not be perfectly posed and/or contain all and any type of info you might need to suggest a solution. I've found, however, that anytime I tried to cover all bases and anticipate all further questions, I got no responses at all, so that's my attempt to ask differently this time… :P
You don't provide a concrete example of your problem, but you did write "I changed it to all-concrete types/Lists of concrete type instances instead of the interfaces plus a converter, and it runs with MUCH less of a memory footprint (orders of magnitude!)." It sounds as though you must be loading large chunks of JSON into memory in some intermediate representation, such as a string for the entire JSON or a JArray for the complete contents of your public List<INodeInner> InnerNodes arrays. After that you are converting the intermediate representation(s) into your final objects.
What you need to do is to avoid loading any intermediate representations, or if you must do so, load only the smallest possible JSON chunk at once. The following is an example implementation:
public interface INode
{
List<INodeInner> InnerNodes { get; set; }
}
public interface INodeInner : INode
{
}
public class Node : INode
{
[JsonProperty(ItemConverterType = typeof(InterfaceToConcreteConverter<INodeInner, InnerNodeNodeType1>))]
public List<INodeInner> InnerNodes { get; set; }
}
public class InnerNodeNodeType1 : INodeInner
{
[JsonProperty(ItemConverterType = typeof(InterfaceToConcreteConverter<INodeInner, InnerNodeNodeType2>))]
public List<INodeInner> InnerNodes { get; set; }
// some other properties
public int Type1Property { get; set; }
}
public class InnerNodeNodeType2 : INodeInner
{
[JsonProperty(ItemConverterType = typeof(InterfaceToConcreteConverter<INodeInner, InnerNodeNodeType3>))]
public List<INodeInner> InnerNodes { get; set; }
// some even different properties
public int Type2Property { get; set; }
}
public class InnerNodeNodeType3 : INodeInner
{
[JsonProperty(ItemConverterType = typeof(InterfaceToConcreteConverter<INodeInner, InnerNodeNodeType3>))]
public List<INodeInner> InnerNodes { get; set; }
// some even different properties
public int Type3Property { get; set; }
}
public class InterfaceToConcreteConverter<TInterface, TConcrete> : JsonConverter where TConcrete : TInterface
{
public InterfaceToConcreteConverter()
{
// TConcrete should be a subtype of an abstract type, or an implementation of an interface. If they
// are identical an infinite recursion could result, so throw an exception.
if (typeof(TInterface) == typeof(TConcrete))
throw new InvalidOperationException(string.Format("typeof({0}) == typeof({1})", typeof(TInterface), typeof(TConcrete)));
}
public override bool CanConvert(Type objectType)
{
return objectType == typeof(TInterface);
}
public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
{
return serializer.Deserialize(reader, typeof(TConcrete));
}
public override bool CanWrite { get { return false; } }
public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
{
throw new NotImplementedException();
}
}
And then, to load:
Node root;
var settings = new JsonSerializerSettings
{
// Whatever settings you need.
};
using (var stream = File.OpenRead(fileName))
using (var textReader = new StreamReader(stream))
using (var reader = new JsonTextReader(textReader))
{
root = JsonSerializer.CreateDefault(settings).Deserialize<Node>(reader);
}
Notes:
Rather than writing a converter for the entire List<INodeInner> InnerNodes and applying it with [JsonConverter(typeof(NodeTypeConverter<IInnerNode, InnerNodeType1>))], I created a converter for each list item and apply it by setting JsonPropertyAttribute.ItemConverterType:
[JsonProperty(ItemConverterType = typeof(InterfaceToConcreteConverter<INodeInner, InnerNodeNodeType1>))]
public List<INodeInner> InnerNodes { get; set; }
This simplifies the converter and guarantees that, if the converter needs to preload the JSON into an intermediate JToken, only one list item is preloaded and converted at once.
Since, in your example, the type of INodeInner is fixed for each type of INode, it isn't even necessary to preload individual list items. Instead, in JsonConverter.ReadJson(), deserialize directly from the incoming JsonReader using the correct concrete type:
public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
{
return serializer.Deserialize(reader, typeof(TConcrete));
}
As explained in Newtonsoft Performance Tips: Optimize Memory Usage, when deserializing a large JSON file, deserialize directly from a stream:
using (var stream = File.OpenRead(fileName))
using (var textReader = new StreamReader(stream))
using (var reader = new JsonTextReader(textReader))
{
root = JsonSerializer.CreateDefault(settings).Deserialize<Node>(reader);
}
Sample fiddle showing this working.
I have some JSON like:
{
"companyName": "Software Inc.",
"employees": [
{
"employeeName": "Sally"
},
{
"employeeName": "Jimmy"
}
]
}
I want to deserialize it into:
public class Company
{
public string companyName { get; set; }
public IList<Employee> employees { get; set; }
}
public class Employee
{
public string employeeName { get; set; }
public Company employer { get; set; }
}
How can I have JSON.NET set the "employer" reference? I tried using a CustomCreationConverter, but the public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer) method doesn't receive any reference to the current parent object.
That's only going to cause you headaches if you're trying to do that as part of the deserialization. It'd be much easier to perform that task after deserialization. Do something like:
var company = //deserialized value
foreach (var employee in company.employees)
{
employee.employer = company;
}
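Or a one-liner, if you prefer the syntax. Note that ForEach is a method of List<T>, not IList<T>, so this compiles only if employees is declared as (or converted to) a List<Employee>: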
company.employees.ForEach(e => e.employer = company);
I have handled a similar situation by defining a "callback" in the parent class like so:
[OnDeserialized]
private void OnDeserialized(StreamingContext context)
{
// Add logic here to pass the `this` object to any child objects
}
This works with JSON.Net without any other setup. I have not actually needed the StreamingContext object.
In my case the child objects have a SetParent() method that gets called here and also when a new child object is created in other ways.
[OnDeserialized] is from System.Runtime.Serialization, and so you won't need to add a JSON library reference.
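Applied to the Company/Employee classes from the question, the callback could be sketched as follows. The [JsonIgnore] on employer is an addition made here (so that serializing the graph back does not run into the parent/child cycle), not something the approach requires for deserialization.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.Serialization;
using Newtonsoft.Json;

public class Employee
{
    public string employeeName { get; set; }

    [JsonIgnore] // not part of the JSON; filled in by the parent's callback
    public Company employer { get; set; }
}

public class Company
{
    public string companyName { get; set; }
    public IList<Employee> employees { get; set; }

    // Invoked by Json.NET once this Company instance has been populated.
    [OnDeserialized]
    private void OnDeserialized(StreamingContext context)
    {
        if (employees == null) return;
        foreach (var employee in employees)
        {
            employee.employer = this;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        string json = @"{ ""companyName"": ""Software Inc."", ""employees"": [
            { ""employeeName"": ""Sally"" }, { ""employeeName"": ""Jimmy"" } ] }";
        Company company = JsonConvert.DeserializeObject<Company>(json);
        Console.WriteLine(company.employees[1].employer.companyName); // prints Software Inc.
    }
}
```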
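Json.NET can handle this with PreserveReferencesHandling. Set PreserveReferencesHandling = PreserveReferencesHandling.Objects when both serializing and deserializing, and Newtonsoft restores the references for you. This relies on the $id/$ref metadata that Json.NET writes with that setting, so it does not help with JSON that lacks it, such as the sample in the question.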
https://www.newtonsoft.com/json/help/html/T_Newtonsoft_Json_PreserveReferencesHandling.htm
Regards,
Fabianus
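To illustrate the PreserveReferencesHandling suggestion, here is a small round-trip sketch using the Company/Employee classes from the question. It assumes you control both the serializing and the deserializing side, since the back-reference travels as "$id"/"$ref" entries in the JSON text.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;

public class Company
{
    public string companyName { get; set; }
    public IList<Employee> employees { get; set; }
}

public class Employee
{
    public string employeeName { get; set; }
    public Company employer { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var settings = new JsonSerializerSettings
        {
            PreserveReferencesHandling = PreserveReferencesHandling.Objects
        };

        var company = new Company { companyName = "Software Inc.", employees = new List<Employee>() };
        company.employees.Add(new Employee { employeeName = "Sally", employer = company });

        // The output carries "$id"/"$ref" entries instead of repeating the company.
        string json = JsonConvert.SerializeObject(company, settings);
        Console.WriteLine(json);

        // On the way back, the "$ref" is resolved to the same Company instance.
        Company copy = JsonConvert.DeserializeObject<Company>(json, settings);
        Console.WriteLine(ReferenceEquals(copy, copy.employees[0].employer));
    }
}
```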