protobuf.net & conditional serialization

protobuf.net & conditional serialization - c#

We are using protobuf.net to serialize classes between mobile devices and back end services, but we now need to adjust what is sent back to the client based upon the 'context' of the user.
We would typically do this by implementing the ISerializable interface and look at the context value to then decide what to serialize. Similarly in the constructor we would then deserialize the provided values.
But it would appear that ISerializable isn't implemented/support (i can see why) for protobuf.net, so we have got around this by taking the 'ShouldSerialize*' and 'OnSerializing' approaches. This does however mean that we end up having to store the StreamingContext in each class which doesn't feel right. We could potentially stick it in a global but this also doesn't feel right.
Is there a better way to achieve what we want, e.g. serialization only using protobuf.net format but with what is serialized being influenced by an externally provided context?

It is a good question. The patterns currently supported (ShouldSerialize* etc) are borrowed "as is" entirely from the BCL, hence no context - however there is no reason it can't support parameters in the same way that the callbacks do - indeed, for callbacks it supports pretty much any usage (with/without context etc) - so I can't think of a good reason not to support them here to.
You are right t say it isn't supported currently, but it could be - let me know of this would be useful.

Related

C# Xml Serialization big purpose?

I was just wondering about XML Serialization. If i understand correctly, the main reason for using it is that it lets you transport your object data more easily, am I right? Also, i tried serializing data using a constructor but it says that that you can only serialize data that are "parameterless". The thing is I like constructors because it allows me to have for example a Player class, and adding a new player with all properties is much more productive than having to set all properties one by one.
So the big question here is, what's the BIG purpose of XML serialization, what are the ways to use it? the way I see it is that it adds another level of complexity to my code, because i now need a class to serialize my data. Can someone shed some light?!

If you're talking about the overall purpose of serialization, strictly speaking, serialization (note that I said "serialization," not "XML Serialization" - more on that in a second) doesn't just make transporting objects easier, it's the only way you could transport an object.
As indicated in Pablo Santa Cruz's answer, XML is one of many ways you can serialize data. If you're going to save or send data somewhere, by definition you must first have some way to represent it. Serialization basically means that you represent your object state in some specified format. Deserialization is the opposite - given some representation of an object state, reconstruct what the original object state was.
In that sense, XML serialization, saving an object state to a database somehow, saving it as JSON, saving it in some binary format, and saving in some XML format are all examples of serialization (because you're representing the object state in a pre-defined format for later use).
While any defined format can technically be serialization, there are several standard ways of doing that. XML and JSON are by far the most common formats because they're standardized, easy to parse, easy to constrain (e.g. with XML Schema), are widely supported by libraries, can be relatively human-readable (which makes debugging easier), and they're widely used.
In case the last point sounds a little odd (they're widely used because they're widely used), standards by their very nature tend to have a strong network effect. In other words, the more people adapt them the more useful they are; for example, it's only useful to have email if you can actually use it to contact other people - it wouldn't be even slightly useful to have email if you were the only one using it.
A lot of standards and technologies will win out over competitors more because they have more early adapters than because they're necessarily technically superior. For example, even if someone could clearly prove that OS X is a "better" operating system than Windows, it wouldn't matter because there's vastly more software developed for Windows and it would be prohibitively expensive for people to try to switch to OS X. (You could make a similar argument for Token Ring vs. Ethernet).

Serialization is for storing object representation somehow (on a disk file, on the wire {network transportation}, on a HTTP session, on a database). XML Serialization is just one type of serialization.
The reason you need a parameter-less constructor to support serialization, is that the AUTO DESERIALIZER needs to create an EMPTY (with no o little data) class before start populating it with the corresponding data.
You don't need to use ONE WAY or THE OTHER, because you can have a class with multiple constructor (the parameter-less one will be used on deserialization, and you can use the other one wherever you need in your code).

Is there a way to emulate Jackson's Mixins in JSON.Net?

I'm currently working on a few utility libraries to aid in the integration between two existing systems. As part of the integration process, I need to be able to convert objects to JSON.
For various reasons, I need to be able to modify the serialized field names (i.e convert camel case to snake case, and in some instances change the field name altogether).
One half of the system is written (mostly) in Java, and is entirely under my control. My preferred solution for serializing / deserializing JSON is to use Jackson. For a variety of reasons, it is considered a risk for us to modify the existing entity classes in order to apply the required attributes for Jackson to produce the correct JSON. Fortunately, Jackson provides Mixins, which essentially allow me to apply annotations dynamically. This is far, far superior to writing custom serializers and deserializers to do the same job.
The other half of the system is an ASP.Net application, and again I would like to modify as little of the existing code as I can get away with. I am currently using JSON.Net for serialization / deserialization, and it seems to support everything I need, including defining attributes to override property names.
However, one thing I can't seem to work out is whether JSON.Net supports the same concept of Mixins as Jackson does. If I can get away with it, I'd like to avoid modifying the existing .NET entity classes to include new attributes, but I can't find any documentation suggesting that this feature exists within JSON.Net.
So, does anybody know if there is a (documented / undocumented) way to apply Jackson-like mixins using JSON.Net, or will I need to write customer serializers / deserializers?

Not sure if this helps, but there is sort of external implementation of Jackson's mix-in handling, as part of ClassMate project. Library does many other things too, so I don't know how easy it'd be to extract part that handles merging of regular annotations and mix-ins.

Questions on Juval Lowy's IDesign C# Coding Standard [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
We are trying to use the IDesign C# Coding standard. Unfortunately, I found no comprehensive document to explain all the rules that it gives, and also his book does not always help.
Here are the open questions that remain for me (from chapter 2, Coding Practices):
No. 26: Avoid providing explicit values for enums unless they are integer powers of 2
No. 34: Always explicitly initialize an array of reference types using a for loop
No. 50: Avoid events as interface members
No. 52: Expose interfaces on class hierarchies
No. 73: Do not define method-specific constraints in interfaces
No. 74: Do not define constraints in delegates
Here's what I think about those:
I thought that providing explicit values would be especially useful when adding new enum members at a later point in time. If these members are added between other already existing members, I would provide explicit values to make sure the integer representation of existing members does not change.
No idea why I would want to do this. I'd say this totally depends on the logic of my program.
I see that there is alternative option of providing "Sink interfaces" (simply providing already all "OnXxxHappened" methods), but what is the reason to prefer one over the other?
Unsure what he means here: Could this mean "When implementing an interface explicitly in a non-sealed class, consider providing the implementation in a protected virtual method that can be overridden"? (see Programming .NET Components 2nd Edition, end of chapter “Interfaces and Class Hierarchies”).
I suppose this is about providing a "where" clause when using generics, but why is this bad on an interface?
I suppose this is about providing a "where" clause when using generics, but why is this bad on a delegate?

Ok so I will basically chip in with the little rep I have on stackoverflow: You can all kill me for opening a religious debate.
The hint to working with Juval's code standard is really in it's preface:
"I believe that while fully understanding every insight that goes into a particular programming decision may require reading books and even years of experience, applying the standard should not."
Next hint is to read his work.
Next would be doing what he advices.
Maybe on the way understand why he is recognised as a "microsoft software legend".
The next might be that it is delivered in a read-only pdf.
That didnt satisfy you?
The key here is realizing this is not a religious narrative, it is indeed a pragmatic one.
My personal experience with his standard is pretty much:
If you follow the advices as best you can, you will get a productive outcome.
Your needs of refactoring will be less, you productivity in the later cycles of the project will be higher.
This is an experience based evaluation.
Basically what you are looking for are the answers to "why" certain portions of the coding standard is in there, specifically these: 26, 34, 50, 52, 73, 74
You are asking these questions from a pragmatic perspective: You want to incorporate the standard and you "somehow" know this will give you a benefit in the longer run.
By examining these rules, through experience, you will perhaps come to understand why they are indeed there. Experience is here doing and not doing your code in accordance with the principles.
But really that isnt the point. The point is that by working from the basis of a well considered, mature and reliable standard, your work now becomes work of quality.
On how to follow a standard
Really read the rules as advices, they are already very strictly worded: "Avoid, do not, do and so on" really means what they say. Avoid means "think really hard before breaking the principle". "Do not" really means "do not".
Juval is all about decoupling and most of his harder-to-grok advices really comes from not thinking "separation" into your code design.
Sometimes this is hard to do with a team that works in less abstract terms or in more feature oriented environments, and you can therefore sometimes find it necessary to break rules, "to get the team moving", but you really should refactor this to the standard when you can do so.
After a few years, a few projects more, you perhaps will come to understand the rationale for each and every rule in the simple guide. If you are a bright student. Me, I'm not so bright, so I base it on experience: Stuff that is in the non-standardized parts of the code often fails during integration testing. It is often harder to come back to. It often ties poorly to the environment and so on.
It really is a matter of trust. If you cant find it in yourself to trust this (not adopt it - trust it), standard, I will propose to you that you will find it hard to ever really write comprehensible c# code that is generally of a acceptable quality.
This of course not considering the few really smart and really experienced out there that managed to build up their own set of internalizable rule sets on how to write code that is:
Stable, readable, maintainable, extendable and generally mega-ble.
This not saying that this is the only standard, just that it do indeed work. Also over time.
You will be able to find other standards, most if not all, though sub-standard to this for working with code you "really want to work" in the long run, involving a team of developers and changing real time situations.
So consider your standard, it is really setting the bar for the quality you provide for the $ you earn.
Do not reconsider it, unless you have "real good reasons to".
But do learn from it.
You reached here and you are still not satisfied. You want answers!
Darn it. Ok let me give you my clouded and sub-par experience based view on the rules your team can't grok.
No. 26: Avoid providing explicit values for enums unless they are integer powers of 2
Its about simplicity and working within that simplicity.
Enum's in c# are already numbered from "i=0" and injecting "++i" for every new field you define. The simplest way to [read code with/] think about enums are therefore either you flag them or you enumerate them.
When you dont do this, you are probably doing something wrong.
If you use the enum to "map" some other domain model, this rule can be broken, but the enum should be visibily separated from normal enums through placement/namespace/etc - to indicate you are "doing something not ordinary".
Now look at the enums you have created out-of-standard. Really look at them. Probably they are mapping something that really do not belong in a hard-coded enum at all. You wanted the efficiency of a switch, but you have in actuality now begun to hardcode out-of-domain properties. Sometimes this is ok and can be refactored later, sometimes it is not.
So avoid it.
The argument of "inserting into the middle" in the development process, is not really an issue that should break the standard. Why? Well if you are using or in any way storing the "int value" of the enumeration, then you are already diverging into a usage pattern where you need to stay very focused indeed. Not using 0, 1, 2 is not the answer to problems in that particular domain, i.e you are probably "doing it wrong".
The EF code-first argument is, probably not an issue to circumvent the rule here. If you feel you have such a need, please do describe your situation in a separate thread and I will try to guide you how to both follow the standard and work with your persistance framework.
My initial take would be: If they are code-first poco entities, then let them abide by the code standard, After all your team have decided to work with a code-first view on the data model. Allow them to do just that, following the same semantics as the rest of their code base.
If you run into specific problems related to the mapping to database tables, then solve these problems while maintaining the standard. For EF 4.3 I would say use the get/set with int pattern and replace in 5.0.
The real gritty stuff here is how to maintain this entity as the database evolves. Be sure to provide clear migration paths for your production database when your enum containing entities change design. Be very sure if the entities themselves change values. This go for add/remove/insert of new definitions. Not just add.
No. 34: Always explicitly initialize an array of reference types using a for loop
My guess is this is an "experience based" rule. In such a way that he looked through a lot of code and this point seemed to fail often for those that did not do it.
No. 50: Avoid events as interface members
Basically a question of separation - your interfaces are now coupled "both" ways into the interface and out of it. Read his book for clarification on why that is not "best practice" in the c# world.
In my limited view this is probably argumentation along the lines of: "Call-backs are very different in nature from function-calls. they make assumptions on the state of the receiver at an undefined "later time" of execution. If you want to provide callbacks, by all means do provide them, but make them into separate interfaces and also define an interface to negotiate these callbacks, in effect establishing all the hard things you wanted to abstract away, but really just circumvented. Or probably just use a recognized pattern. aka read the book."
No. 52: Expose interfaces on class hierarchies
Separation. I cant really explain it any further than: The alternative is hard to factor into a larger context/solution.
No. 73: Do not define method-specific constraints in interfaces
Separation. same as #52
No. 74: Do not define constraints in delegates
Separation.
Conclusion
......
Another view on #50 is that it is one of those rules where:
If you dont get it, its probably not important for you.
Importance here is an estimate on how critical the code is - and how critically you want that code to always work as intended.
This again leads into a broader context on how to verify your code actually is quality.
Some would say this can be done with tests only.
They however fail to understand the interconnectedness between defensively writing quality software and aggressively trying to prove it wrong.
In the real world, those two should be parts of a greater totality with the actual requirements of the software being the third.
Doing a quick mock-up? who cares. Doing a little visual thingy that you understand the runtime constraints of? who cares - well probably you should, but perhaps acceptable. and so on.
In the end compliance to a coding standard is not something you take on lightly.
There are reasons for doing it that goes beyond just the code:
Collaboration and communication primarily.
Agreement on shared views.
Assertions of assumptions.
etc.

No. 26: Power of two means you want to use the enum as a bitmask (flags). Thats the only reason to specify enum values. For adding new members later on, you can still append them to the enum definition without changing existing values. No reason to put them between existing members.
No. 34: I think he wants to avoid the situation where you have an array wich contains (partially) uninitialized pointers (null references). As the consumer of an array its tempting
not to check for null entries in a valid array variable.

No. 26: He's quite wrong, at least for public enums. The problem crops up when you remove an item, not when you add one (for which adding to the end of the list would be equivalent to adding an item with the next available value).
For the others, I'm really not sure why he's making these recommendations, although I have to admit that I find a few (or maybe most) of them rather dubious...

protobuf-net -- DataContractSurrogates?

Right now, I'm using DataContractSerializer along with DataContractSurrogate to provide serialization descriptions for NHibernate proxy classes (as described in http://timvasil.com/blog14/post/2008/02/WCF-serialization-with-NHibernate.aspx).
I'm really interested in switching to protobuf-net to serialize my data using protobufs, but I can't seem to find a way to consume the DataContractSurrogate's. Without this feature, I'm dead in the water for serializing NHibernate dynamic proxy classes that derive from my model classes.

Im not a NHibernate expert, but in v2 there are a few things designed to cater for this scenario; foremost there is code built in intended to recognise NH proxies and treat appropriately (in particular, not complain about unknown types).
I will read the linked article, though; without more NH experience I can't be sure that the current approach is sufficient. I would also be more than happy to receive any test cases I could use to prove that it meets the need.

Any reason not to use XmlSerializer?

I just learned about the XmlSerializer class in .Net. Before I had always parsed and written my XML using the standard classes. Before I dive into this, I am wondering if there are any cases where it is not the right option.
EDIT: By standard classes I mean XmlDocument, XmlElement, XmlAttribute...etc.

There are many constraints when you use the XmlSerializer:
You must have a public parameterless constructor (as mentioned by idlewire in the comments, it doesn't have to be public)
Only public properties are serialized
Interface types can't be serialized
and a few others...
These constraints often force you to make certain design decisions that are not the ones you would have made in other situations... and a tool that forces you to make bad design decisions is usually not a good thing ;)
That being said, it can be very handy when you need a quick way to store simple objects in XML format. I also like that fact that you have a pretty good control over the generated schema.

Well, it doesn't give you quite as much control over the output, obviously. Personally I find LINQ to XML makes it sufficiently easy to write this by hand that I'm happy to do it that way, at least for reasonably small projects. If you're using .NET 3.5 or 4 but not using LINQ to XML, look into it straight away - it's much much nicer than the old DOM.
Sometimes it's nice to be able to take control over serialization and deserialization... especially when you change the layout of your data. If you're not in that situation and don't anticipate being in it, then the built-in XML serialization would probably be fine.
EDIT: I don't think XML serialization supports constructing genuinely immutable types, whereas this is obviously feasible from hand-built construction. As I'm a fan of immutability, that's definitely something I'd be concerned about. If you implement IXmlSerializable I believe you can make do with public immutability, but you still have to be privately mutable. Of course, I could be wrong - but it's worth checking.

The XmlSerializer can save you a lot of trouble if you are regularly serializing and deserializing the same types, and if you need the serialized representations of those types to be consumable by different platforms (i.e. Java, Javascript, etc.) I do recommend using the XmlSerializer when you can, as it can alleviate a considerable amount of hassle trying to manage conversion from object graph to XML yourself.
There are some scenarios where use of XmlSerializer is not the best approach. Here are a few cases:
When you need to quickly, forward-only process large volumes of xml data
Use an XmlReader instead
When you need to perform repeated searches within an xml document using XPath
When the xml document structure is rather arbitrary, and does not regularly conform to a known object model
When the XmlSerializer imposes requirements that do not satisfy your design mandates:
Don't use it when you can't have a default public constructor
You can't use the xml serializer attributes to define xml variants of element and attribute names to conform to the necessary Xml schema

I find the major drawbacks of the XmlSerializer are:
1) For complex object graphs involving collections, sometimes it is hard to get exactly the XML schema you want by using the serialization control attributes.
2) If you change the class definitions between one version of the app and the next, your files will become unreadable.

Yes, I personally use automatic XML serialization - although I use DataContractSerializer initially brought in because of WCF instead (ability to serialize types without attributes at all is very helpful) as it doesn't embed types in there. Of course, you therefore need to know the type of object you are deserializing when loading back in.
The big problem with that is it's difficult to serialize to attributes as well without implementing IXmlSerializable on the type whose data you might want to be written so, or exposing some other types that the serializer can handle natively.
I guess the biggest gotcha with this is that you can't serialise interfaces automatically, because the DCS wants to be able to construct instances again when it receives the XML back. Standard collection interfaces, however, are supported natively.
All in all, though, I've found the DCS route to be the fastest and most pain-free way.
As an alternative, you could also investigate using Linq to XML to read and write the XML if you want total control - but you'll still have to process types on a member by member basis with this.
I've been looking at that recently (having avoided it like the plague because I couldn't see the point) after having read about it the early access of Jon Skeet's new book. Have to say - I'm most impressed with how easy it makes it to work with XML.

I've used XmlSerializer a lot in the past and will probably continue to use it. However, the greatest pitfall is one already mentioned above:
The constraints on the serializer (such as restriction to public members) either 1) impose design constraints on the class that have nothing to do with its primary function, or 2) force an increase in complexity in working around these constraints.
Of course, other methods of Xml serialization also increase the complexity.
So I guess my answer is that there's no right or wrong answer that fits all situations; chosing a serialization method is just one design consideration among many others.

Thera re some scenarios.
You have to deal with a LOT of XML data -the serializer may overlaod your memory. I had that once for a simple schema that contained a database dump for 2000 or so tables. Only a handfull of classes, but in the end serialization did not work - I had to use a SAX streaming parser.
Besides that - I do not see any under normal circumstances. It is a much easier way to deal with the XML Serializer than to use the lower level parser, especially for more complex data.

When You want to transmit lot of data and You have very limited resources.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.