Using JSON serialization as a persistence mechanism instead of RDB - c#

I'm thinking about throwing away my DB in my next project to simplify development/evolution.
One way to do that is to never leave the object realm at all and persist my objects through some kind of serialization. It would be nice to be able to edit the initial object state while the app is down, so a format like JSON would be great.
The problem is that JSON tools (like Jackson for Java), or rather JSON itself, cannot preserve references, so after deserializing an object graph I can end up with more instances than I had before serialization: each reference to the same object becomes a new instance.
I've noticed JSPON, but it doesn't seem to be actively maintained.
What do you think of such an approach? Isn't it too simple to work? Or should I rather use some OODB (although that would add configuration overhead, and I want to keep things simple)?

Most of the simple portable serializers (XML, JSON, protocol buffers) are tree serializers, not graph serializers, so you'll run into this problem quite a bit...
You could perhaps try using a DTO tree that doesn't need the references? i.e. instead of:
Parent --(children)--> Child
Parent <--(parent)---- Child
you have (at the DTO level):
Parent {Key="abc"} -(child keys)-> {string}
Child {Key="def"} -(parent key)-> {string}
This should be usable with any tree serializer; but it does require extra (manual) processing.
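To make that concrete, here is a minimal sketch of what such key-based DTOs might look like in C# (the class and field names are mine, not from the question):
// using System.Collections.Generic;
public class ParentDto
{
    public string Key;               // e.g. "abc"
    public List<string> ChildKeys;   // references expressed as keys
}

public class ChildDto
{
    public string Key;               // e.g. "def"
    public string ParentKey;         // back-reference expressed as a key
}
After deserialization you walk the DTOs once, look each key up in a dictionary, and re-wire the real parent/child object references.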
There are graph-based serializers like .NET's DataContractSerializer (with graph-mode enabled; it is disabled by default); but this is non-portable.
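For completeness, a hedged sketch of turning that graph mode on via the preserveObjectReferences constructor argument (a .NET Framework overload; Parent stands in for whatever your root type is):
// using System.Runtime.Serialization;
var serializer = new DataContractSerializer(
    typeof(Parent),
    null,          // knownTypes
    int.MaxValue,  // maxItemsInObjectGraph
    false,         // ignoreExtensionDataObject
    true,          // preserveObjectReferences: write id/ref attributes instead of duplicating objects
    null);         // dataContractSurrogate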

The references issue should be simple enough to solve assuming you control the serialization - you'd simply save the objects giving each an id and then save the references in terms of those ids.
However, while I think you'd get a simple version working, I reckon you'd run into problems down the line. Things that spring to mind are:
What would happen as the code evolves and the classes change?
How would you support query operations, particularly indexing to make the queries fast?
How would you manage concurrent access?
How would you manage transactions?
How would it scale?
I don't think these problems are insurmountable, but IMHO relational databases are the way they are because of years of development and use in the wild, and the OODBs that I've seen are not a realistic proposition at this time.
Also, there's a whole class of problems that the set-based logic provided by relational databases is ideal for, not to mention the power of SQL in refining the data sets you load, which just isn't as easy in the object world. With modern ORMs making life so easy these days, I certainly wouldn't want to confine myself to either realm.

The latest version of Json.NET supports serializing references.
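For context, the snippet below assumes a simple Person class and a list containing the same instance twice, roughly like this (the class shape is inferred from the JSON output; the exact dates are incidental):
public class Person
{
    public string Name { get; set; }
    public DateTime BirthDate { get; set; }
    public DateTime LastModified { get; set; }
}

Person p = new Person
{
    Name = "James",
    BirthDate = new DateTime(1980, 12, 23, 0, 0, 0, DateTimeKind.Utc),
    LastModified = new DateTime(2009, 2, 20, 12, 59, 21, DateTimeKind.Utc)
};

// The same instance added twice - this is the shared reference to preserve.
List<Person> people = new List<Person> { p, p };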
string json = JsonConvert.SerializeObject(people, Formatting.Indented,
    new JsonSerializerSettings { PreserveReferencesHandling = PreserveReferencesHandling.Objects });
// [
//   {
//     "$id": "1",
//     "Name": "James",
//     "BirthDate": "\/Date(346377600000)\/",
//     "LastModified": "\/Date(1235134761000)\/"
//   },
//   {
//     "$ref": "1"
//   }
// ]
List<Person> deserializedPeople = JsonConvert.DeserializeObject<List<Person>>(json,
    new JsonSerializerSettings { PreserveReferencesHandling = PreserveReferencesHandling.Objects });
Console.WriteLine(deserializedPeople.Count);
// 2
Person p1 = deserializedPeople[0];
Person p2 = deserializedPeople[1];
Console.WriteLine(p1.Name);
// James
Console.WriteLine(p2.Name);
// James
bool equal = Object.ReferenceEquals(p1, p2);
// true

I've found this SO question helpful. XStream seems to cope with references by using relative paths in the tree structure to point back to the first occurrence when it encounters the same object again, even for JSON (see here).
Simple can apparently deal with more complex object graphs, but XStream seems more popular, does JSON, and will probably suit my needs (I won't ever have cyclic references).

The itemscript project proposes a schema language based on JSON: itemscript Schema describes data types, and itemscript JAM is an application markup language developed in itemscript.
The reference implementation includes a GWT client (Item Lens) and a column store (Item Store) that persists JSON data.

As long as you're not leaving the .NET realm, why not use the serialization mechanisms that .NET offers? It can easily serialize your object graphs (private fields included) to a binary blob and back again. There is also a built-in mechanism for serializing to and from XML, although that has some limitations when it comes to private fields and the like (you can work around them, though). There are also attributes for marking fields as newer, so they may be absent from older serialized streams; you'll have to deal with them being null yourself, but then you'd have to do that anyway.
Added: Ah, yes, forgot to mention: I'm talking about the System.Runtime.Serialization.Formatters.Binary.BinaryFormatter class.
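As a rough sketch of that approach (AppState here is just a placeholder for your own root type):
// using System.IO;
// using System.Runtime.Serialization.Formatters.Binary;

static void Save(AppState state, string path)
{
    var formatter = new BinaryFormatter();
    using (var stream = File.Create(path))
        formatter.Serialize(stream, state);      // whole graph, shared references and cycles included
}

static AppState Load(string path)
{
    var formatter = new BinaryFormatter();
    using (var stream = File.OpenRead(path))
        return (AppState)formatter.Deserialize(stream);
}
Every type in the graph needs [Serializable]; the trade-off is that the blob is not hand-editable, which was part of the original goal.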

Related

Generic serializer with ProtoBuf.net

I'm attempting to write a generic serializer using protobuf-net v2. However, I'm running into some issues that make me wonder whether what I'm doing is even possible. The objects to be serialized are of an indeterminate type to which I don't have access, so I'm attempting to walk the object and add its properties to the type model.
var model = TypeModel.Create();
List<string> propertiesToSerialize = new List<string>();
foreach (var property in typeToSerialize.GetProperties())
{
    propertiesToSerialize.Add(property.Name);
}
model.AutoAddMissingTypes = true;
model.Add(typeToSerialize, true).Add(propertiesToSerialize.ToArray());
For simple objects which contain only primitives this seems to work just fine. However when working with an object which contains, say, a Dictionary<string,object> I encounter an error telling me that no serializer is registered for Object.
I did look at "Serializing a Dictionary&lt;string,object&gt; in ProtoBuf-net fails", but it seems the suggested solution requires some knowledge of, and access to, the object being serialized.
Any suggestions on how I might proceed?
protobuf-net does not set out to be able to serialize every scenario (especially those dominated by object), in exactly the same way that XmlSerializer and DataContractSerializer have scenarios which they can't model. In particular, the total lack of metadata in the protobuf format (part of why it is very efficient) means that it is only intended to be consumed by code that knows the structure of the data in advance - which is not possible if too much is object.
That said, there is some support via DynamicType=true, but that would not currently be enabled for the dictionary scenario you mention.
In most cases, though, it isn't really the case that the data can be anything; more typically there are a finite number of expected data types. When that is the case, the object problem can be addressed in a cleaner way using a slightly different model (specifically, a non-generic base type, a generic sub-type, and a few "include" options). As with most serialization, there are scenarios where it may be desirable to have a separate "DTO" model that looks closer to the serialization output than to your domain model.
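For illustration, a hedged sketch of that "non-generic base type plus generic sub-type" shape (all names here are made up, not from the question):
[ProtoContract]
[ProtoInclude(1, typeof(TypedValue<int>))]
[ProtoInclude(2, typeof(TypedValue<string>))]
public abstract class ValueWrapper { }

[ProtoContract]
public class TypedValue<T> : ValueWrapper
{
    [ProtoMember(1)]
    public T Value { get; set; }
}

// The dictionary then becomes Dictionary<string, ValueWrapper>
// instead of Dictionary<string, object>, which protobuf-net can model.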
A final note: the GetProperties()/Add() approach is not robust, as GetProperties() does not guarantee any particular order to the members; with protobuf-net in the way you show, order is important, as this helps determine the keys to use. Even if the order was fixed (sorting alphabetically, for example), note that adding a member could be a breaking change.
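If you do stick with the reflection route, a slightly safer sketch (the alphabetical ordering here is my assumption; whatever convention you pick must stay stable) is to pin explicit field numbers via the v2 RuntimeTypeModel API:
// using System.Linq;
var model = TypeModel.Create();
var meta = model.Add(typeToSerialize, false);   // false: don't apply default member behaviour

int fieldNumber = 1;
foreach (var property in typeToSerialize.GetProperties().OrderBy(p => p.Name))
{
    // Explicit, stable field numbers; persist this name-to-number map
    // somewhere if the serialized data outlives the process.
    meta.Add(fieldNumber++, property.Name);
}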

How to dynamically create a JSON object in c# (from an ASP.NET resource file)?

I need to serialize the strings from a resource file (.resx) into a JSON object. The resource file's keys are in flux and thus I cannot just create a C# object that accepts the appropriate values. It needs to be a dynamic solution. I am able to loop through the key-value pairs for the file, but I need an easy way to serialize them to JSON.
I know I could do:
Object thing = new { stringOne = StringResource.stringOne, ... };
But, I'd rather have something like:
Object generic = {}
foreach (DictionaryEntry entry in StringResource) {
    generic.(entry.Key) = entry.Value
}
Or should I just create a custom JSON serializer that constructs the object piecemeal (i.e. foreach loop that appends part of the JSON string with each cycle)?
EDIT
I ended up writing a quick JSON serializer that constructs the string one field at a time. I didn't want to include a whole JSON library, as this is the only use of JSON objects (for now at least). Ultimately, what I wanted is probably impractical and doesn't exist, as its function is better served by other data structures. Thanks for all the answers though!
If you're using C# 4.0, you should look at the magical System.Dynamic.ExpandoObject. It's an object that allows you to dynamically add and remove properties at runtime, using the new DLR in .NET 4.0. Here is a good example use for the ExpandoObject.
Once you have your fully populated ExpandoObject, you can probably easily serialize that with any of the JSON libraries mentioned by the other excellent answers.
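A rough sketch of that idea, assuming StringResource is the generated resource class from the question (enumerating it via ResourceManager.GetResourceSet is my assumption):
// using System.Collections; using System.Collections.Generic;
// using System.Dynamic; using System.Globalization;
// using Newtonsoft.Json;

dynamic expando = new ExpandoObject();
var map = (IDictionary<string, object>)expando;   // ExpandoObject is also a dictionary

var resources = StringResource.ResourceManager
    .GetResourceSet(CultureInfo.CurrentCulture, true, true);
foreach (DictionaryEntry entry in resources)
{
    map[(string)entry.Key] = entry.Value;
}

string json = JsonConvert.SerializeObject(expando);   // or any other JSON library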
This sounds like an accident waiting to happen (i.e. creating output prior to cementing the structure), but it happens.
The custom JSON serializer is a compelling option, as it allows you to easily move from your dictionary into a JSON format. I would look at open source libraries (JSON.NET, etc) to see if you can reduce the development time.
I also think setting up in a slightly more structured format, like XML, is a decent choice. It is quite easy to serialize from XML to JSON using existing libraries, so you avoid heavy customization.
The bigger question is what purposes the data will ultimately serve. If you solve this problem using either of these methods, are you creating bigger problems in the future?
Probably I would use JSON.NET and the ability to create JSON from XML.
Then, you could create an XML in-memory and let JSON.NET convert it to JSON for you. Maybe if you dig deeper into the API, there are other options, too.
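Something along these lines (the element names are invented here):
// using System.Xml; using Newtonsoft.Json;

var doc = new XmlDocument();
doc.LoadXml("<strings><greeting>Hello</greeting><farewell>Bye</farewell></strings>");

string json = JsonConvert.SerializeXmlNode(doc);
// roughly: {"strings":{"greeting":"Hello","farewell":"Bye"}}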
Newtonsoft Json.NET is a library that has all kinds of nifty JSON tools, among them on-the-fly one-line serializers and deserializers. Check it out; it's my favorite JSON library out there:
http://james.newtonking.com/pages/json-net.aspx
If I remember correctly, it has a class that will convert JSON to a .NET object without having to create the .NET object first. I think it is the JsonConvert class.
The way I do it is:
var serialiser = new System.Web.Script.Serialization.JavaScriptSerializer();
string json = serialiser.Serialize(data);
context.Response.Write(json);

MongoDb and self referencing objects

I am just starting to learn about MongoDB and was wondering if I am doing something wrong... I have two classes:
public class Part
{
    public Guid Id;
    public IList<Material> Materials;
}

public class Material
{
    public Guid MaterialId;
    public Material ParentMaterial;
    public IList<Material> ChildMaterials;
    public string Name;
}
When I try to save this particular object graph I receive a stack overflow error because of the circular reference. My question is: is there a way around this? In WCF I am able to set the IsReference attribute on the data contract to true and it serializes just fine.
What driver are you using?
In NoRM you can create a DbReference like so
public DbReference<Material> ParentMaterial;
Mongodb-csharp does not offer strongly typed DbReferences, but you can still use them.
public DBRef ParentMaterial;
You can follow the reference with Database.FollowReference(ParentMaterial).
Just for future reference: things like references between objects which are not embedded within a sub-document structure are handled extremely well by a NoSQL ODB, which is generally designed to deal with transparent relations in arbitrarily complex object models.
If you are familiar with Hibernate, imagine that without any mapping file AT ALL and orders of magnitude faster performance because there is no runtime JOIN behind the scenes, all relations are resolved with the speed of a b-tree lookup.
Here is a video from Versant (disclosure - I work for them), so you can see how it works.
This is a little boring in the beginning, but shows every single step to take a Java application and make it persistent in an ODB... then make it fault tolerant, distributed, do some parallel queries, optimize cache load, etc...
If you want to skip to the cool part, jump about 20 minutes in; you will avoid the building of the application and just see how easy it is to dynamically evolve the schema, and add distribution and fault tolerance to any existing application :)
If you want to store object graphs with relationships between them requiring multiple 'joins' to get to the answer you are probably better off with a SQL-style database. The document-centric approach of MongoDB and others would probably structure this rather differently.
Take a look at MongoDB nested sets which suggests some ways to represent data like this.
I was able to accomplish exactly what I needed by using a modified version of the NoRM MongoDB driver.

Slightly non-trivial data structure: is XmlSerializer right for me?

I'm currently using XmlSerializer to, surprisingly enough :), handle de/serialization of my data structures - I find it wonderfully simple to use, but at the cost of flexibility. At the moment, I'm using it for a tree-based structure; since XmlSerializer doesn't handle cyclic structures, I've added [XmlIgnore] to my Parent property, and do a post-deserialization iteration over the tree to fix up node parents.
Is there a better way to handle this using XmlSerializer, or would it be better to rewrite the code using XmlReader/XmlWriter? I suppose implementing IXmlSerializable would work, but it seems like a fair amount of work, while still retaining the cons of XmlSerializer.
The current post-deserialization step is OK, but I'm adding a data structure that has to be serialized to a separate XML file: basically a flat list of items that need a Parent property referencing a node from the previous tree structure. This would require yet another post-deserialization step, as well as keeping both a Parent property and a ParentId (or some trickery) in the new data structure.
So, any smart (and non-fragile) ideas? Or XmlReader/XmlWriter it is?
Solution
DataContractSerializer turned out to be a pretty decent solution, with pretty much the same simplicity as XmlSerializer. I opted not to use the automatic cycle handling, but instead defined an [OnDeserialized]-decorated method to handle setting the parent node; that way, the generated XML is standards-conforming.
One thing that confused me for a while was that I got crashes on some properties after deserializing, with the backing members set to null - couldn't figure out how this was possible since the backing members were definitely initialized in all possible constructors. Debugging showed constructors were never called, and after some googling I found this SO post with an explanation.
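For what it's worth, a minimal sketch of the approach described above (the class is illustrative, not the poster's actual code): the back-reference simply isn't a data member, and is re-established in the [OnDeserialized] callback.
// using System.Collections.Generic; using System.Runtime.Serialization;

[DataContract]
public class TreeNode
{
    [DataMember]
    public List<TreeNode> Children = new List<TreeNode>();

    // Not a [DataMember], so it never appears in the XML.
    public TreeNode Parent;

    [OnDeserialized]
    private void OnDeserialized(StreamingContext context)
    {
        if (Children == null) return;   // constructors are not called during deserialization
        foreach (var child in Children)
            child.Parent = this;
    }
}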
You could try BinaryFormatter or DataContractSerializer, which I think handle cyclic graphs somewhat better, at the cost (for the binary route) of not having a human-readable representation of the data. Alternatively, you can try the SoapFormatter as detailed here.
Use DataContractSerializer. Mark your classes with [DataContract(IsReference = true)].
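In other words, something like this (names are illustrative); with IsReference = true the serializer writes id/ref attributes instead of duplicating nodes or failing on cycles:
[DataContract(IsReference = true)]
public class Node
{
    [DataMember] public string Name;
    [DataMember] public Node Parent;        // cycles are now safe
    [DataMember] public List<Node> Children;
}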

How to design to prompt users for new values for properties of deserialized objects?

Right now, I'm currently serializing a class like this:
class Session
{
    String setting1;
    String setting2;
    // ...etc... (other member variables)
    List<SessionAction> actionsPerformed;
}
Where SessionAction is an interface that just has one method. All implementations of the SessionAction interface have various properties describing what that specific SessionAction does.
Currently, I serialize this to a file which can be loaded again using the default .Net binary serializer. Now, I want to serialize this to a template. This template will just be the List of SessionActions serialized to a file, but upon loading it back into memory at another time, I want some properties of these SessionActions to require input from the user (which I plan to dynamically generate GUI controls on the fly depending on the property type). Right now, I'm stuck on determining the best way to do this.
Is there some way I could flag some properties so that upon using reflection, I could determine which properties need input from user? Or what are my other options? Feel free to leave comments if anything isn't clear.
For info, I don't recommend using BinaryFormatter for anything that you are storing long-term; it is very brittle between versions. It is fine for short-lived messages where you know the same version will be used for serialization and deserialization.
I would recommend any of: XmlSerializer, DataContractSerializer (3.0), or for fast binary, protobuf-net; all of these are contract-based, so much more version tolerant.
Re the question: you could use things like Nullable&lt;T&gt; for value types, and null for strings etc., and ask for input for those that are null. There are other routes involving things like the ShouldSerialize* pattern, but this might upset the serialization APIs.
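A rough sketch of that null-driven approach (sessionAction and the GUI wiring are placeholders):
// using System.Linq; using System.Reflection;

var pending = sessionAction.GetType()
    .GetProperties()
    .Where(p => p.CanWrite && p.GetValue(sessionAction, null) == null)
    .ToList();

foreach (var property in pending)
{
    // Build an input control based on property.PropertyType,
    // then assign the user's answer:
    // property.SetValue(sessionAction, valueFromUser, null);
}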
If you know from the start which properties of the SessionAction need fresh input, you could mark those fields with the [NonSerialized] attribute and implement IDeserializationCallback; in your OnDeserialization method you would then prompt the user for the new values.
