I have recently started working with protobuf in my project and I am wondering, is there some way to deserialize a proto message if I don't know exactly what entity I have? When I am working with JSON or XML I can easily do it.
I searched for a way to convert protobuf to JSON or XML, but found nothing for C#.
I have already looked at the popular libraries, but they only convert JSON to protobuf, not in both directions.
Does anyone know how to solve this problem? I would appreciate any advice or solution!
In general, you can't work with protobufs if you don't know the message format. In order to be compact the wire format doesn't include all the information necessary to reconstruct the message. JSON and XML contain a lot of extra stuff in the message that allows you to (kind of) work with them even if you have no idea what they contain, but the trade-off there is a bloated format.
By the way, do not try to "guess" what a message is by going down a list of possible message formats and trying one after the other until your message successfully deserializes. It's entirely possible to "get lucky" and have a message of one type successfully deserialize as a different type, but with bogus data. I have been bitten by that one rather badly. :(
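To make the point about the wire format concrete, here is a minimal Python sketch (standard library only, not the official protobuf API) that walks a serialized message without any schema. The names `read_varint` and `decode_fields` are invented for this illustration, and the sample bytes are hand-built for a hypothetical message with an integer in field 1 and some bytes in field 2.

```python
from __future__ import annotations


def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode one base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of this varint
            return result, pos
        shift += 7


def decode_fields(buf: bytes) -> list[tuple[int, int, object]]:
    """Return (field_number, wire_type, raw_value) for each top-level field."""
    fields = []
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:      # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:    # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:    # length-delimited
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:    # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        fields.append((field_number, wire_type, value))
    return fields


# Hand-built sample: field 1 = varint 150, field 2 = the bytes b"testing".
payload = bytes.fromhex("089601") + bytes.fromhex("120774657374696e67")
decoded = decode_fields(payload)
```

All that comes back is a field number, a wire type, and a raw value. Nothing in the bytes says whether 150 is an int32, a uint64, or an enum, or whether the second field is a string, raw bytes, or a nested message. That is also why any message type that happens to declare compatible fields 1 and 2 would "successfully" parse this payload.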
Look at union types if you want to wrap several different message types in a single message: https://developers.google.com/protocol-buffers/docs/techniques#union
There is a workaround (mentioned in the comments) of using self-describing messages, but I've never found them to be useful and evidently Google didn't either: https://developers.google.com/protocol-buffers/docs/techniques#self-description
Is there a 'correct' or preferred manner for sending data over a web socket connection?
In my case, I am sending the information from a C# application to a Python (Tornado) web server, and I am simply sending a string consisting of several elements separated by commas. In Python, I use rudimentary techniques to split the string and then structure the elements into an object.
e.g:
'foo,0,bar,1'
becomes:
object = {
    'foo': 0,
    'bar': 1
}
In the other direction, I am sending the information as a JSON string, which I then deserialise using Json.NET.
I imagine there is no strictly right or wrong way of doing this, but are there significant advantages and disadvantages that I should be thinking of? And, somewhat related, is there a consensus for using string vs. binary formats?
Writing a custom encoding (e.g., as "k,v,..") is different from 'using binary'.
It is still text, just a rigid, under-defined, one-off, hand-rolled format that must be manually replicated on both ends. (What happens if a key or value contains a comma? What happens if the data needs to contain nested objects? How can null be distinguished from '' or 'null'?)
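A small Python sketch of those failure modes, using only the standard library. `parse_pairs` is a hypothetical stand-in for the "rudimentary" splitting described in the question, and the sample record is made up for illustration.

```python
from __future__ import annotations

import json


def parse_pairs(text: str) -> dict[str, str]:
    """Naive parser for the hand-rolled 'k,v,k,v' format."""
    parts = text.split(",")
    return dict(zip(parts[0::2], parts[1::2]))


# A value containing a comma, a null, and a nested object.
record = {"name": "Doe, Jane", "nickname": None, "tags": {"a": 1}}

# Hand-rolled encoding: the comma inside "Doe, Jane" shifts every later
# field, None becomes the text "None", and the nested dict is flattened
# into its Python repr.
naive_wire = ",".join(f"{k},{v}" for k, v in record.items())
naive_result = parse_pairs(naive_wire)

# JSON: quoting, null, and nesting are all part of the format itself.
json_result = json.loads(json.dumps(record))
```

Here `json_result` equals the original record, while `naive_result` has lost the name's second half and misaligned the remaining keys and values.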
While JSON is definitely the most ubiquitous format for WebSockets one shouldn't (for interchange purposes) write JSON by hand - one uses an existing serialization library on both ends. (There are many reasons why JSON is ubiquitous which are covered in other answers - this doesn't mean it is always the 'best' format, however.)
To this end a binary serializer can also be used (BSON being a trivial example as it is effectively JSON-like in structure and operation). Just replace JSON.parse with FORMATX.parse as appropriate.
The only requirements are then:
There is a suitable serializer/deserializer for all the clients and servers. JSON works well here because it is so popular and there is no shortage of implementations.
There are various binary serialization libraries with both Python and C# libraries, but it will require finding a 'happy intersection'.
The serialization format can represent the data. JSON usually works sufficiently and it has a very nice 1-1 correspondence with basic object graphs and simple values. It is also inherently schema-less.
Some formats are better at certain tasks and have different characteristics, features, or tool-chains. However, most concepts (and arguably most DTOs) can be mapped onto JSON easily, which makes it a good 'default' choice.
The other differences between different kinds of binary and text serializations are mostly dressing - but if you'd like to start talking about schema vs. schema-less, extensibility, external tooling, metadata, non-compressed encoded sizes (or size after transport compression), compliance with a specific existing protocol, etc..
.. but the point to take away is don't create a 'new' one-off format. Unless of course, you just like making wheels or there is a very specific use-case to fit.
First advice would be to use the same format for both ways, not plain text in one direction and JSON in the other.
I personally think {'foo':0,'bar':1} is better than foo,0,bar,1 because everybody understands JSON, whereas your custom format needs explaining. The idea is that you are inventing a data interchange format when JSON already is one, and @jfriend00 is right: pretty much every language now understands JSON, Python included.
Regarding text vs binary, there isn't any consensus. As @user2864740 mentions in the comments to my answer, as long as the two sides understand each other, it doesn't really matter. This only becomes relevant if one of the sides has a preference for a format (consider for example opening the connection from the browser, using JavaScript - for that people might prefer JSON instead of binary).
My advice is to go with something as simple as JSON and design your app so that you can change the wire format by swapping in another implementation without affecting the logic of your application.
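One way to keep the wire format swappable is to hide it behind a small interface. The following is only a sketch under my own assumptions: `Codec`, `JsonCodec`, `CompressedJsonCodec` and `handle_message` are hypothetical names, and the zlib-compressed variant merely stands in for whatever binary format you might adopt later.

```python
import json
import zlib
from typing import Any, Protocol


class Codec(Protocol):
    """Anything that can turn a message into bytes and back."""

    def encode(self, message: Any) -> bytes: ...
    def decode(self, payload: bytes) -> Any: ...


class JsonCodec:
    def encode(self, message: Any) -> bytes:
        return json.dumps(message).encode("utf-8")

    def decode(self, payload: bytes) -> Any:
        return json.loads(payload.decode("utf-8"))


class CompressedJsonCodec:
    """Stand-in for a binary format: JSON bytes compressed with zlib."""

    def encode(self, message: Any) -> bytes:
        return zlib.compress(json.dumps(message).encode("utf-8"))

    def decode(self, payload: bytes) -> Any:
        return json.loads(zlib.decompress(payload).decode("utf-8"))


def handle_message(codec: Codec, payload: bytes) -> bytes:
    """Application logic sees decoded objects only, never the wire format."""
    request = codec.decode(payload)
    reply = {"echo": request, "count": len(request)}
    return codec.encode(reply)
```

Switching formats then means passing a different codec object to `handle_message`; the function body does not change.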
I am developing a WCF SOA(ish) architecture application that needs to receive and return XML.
I have scoured the net for best practices. The only thing I know is that sending raw XML as a string is going to cause problems.
I was therefore looking at an XmlTextReader type object that could perhaps be more elegantly marshalled from A to B and then back.
I get an error when I try to call my service that takes an XmlTextReader as a type, and frankly it is confusing the hell out of me.
Bottom line: it needs to accept and return large amounts of XML, and I can't/don't want to use my own defined types.
Any help?
One answer without deeper analysis: you could maybe convert the XML to a Base64-encoded string and send it like that.
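As a rough illustration of the round trip (in Python rather than C#, and with a made-up XML snippet), Base64 turns the XML bytes into plain ASCII that survives being embedded in another message:

```python
import base64

# Made-up sample document containing markup and non-ASCII characters.
xml_text = '<order id="42"><note>Crème brûlée &amp; tea</note></order>'

# Encode: text -> UTF-8 bytes -> Base64 ASCII string.
encoded = base64.b64encode(xml_text.encode("utf-8")).decode("ascii")

# Decode: Base64 ASCII string -> UTF-8 bytes -> text.
decoded = base64.b64decode(encoded).decode("utf-8")
```

Keep in mind that Base64 emits four characters for every three input bytes, so large XML payloads grow by roughly a third.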
You could use XElement - personally I have only used this on relatively small chunks of XML - YMMV.
I'm looking for a simple solution to serialize and store objects that contain configuration, application state and data. It's a simple application and it's not a lot of data. Speed is not an issue. I want it to be in-process. I want it to be easier to edit in a text editor than XML.
I can't find any document database for .NET that can handle it in-process.
I'm not sure I want to simply serialize to XML because it's... XML.
Serializing to JSON seems very JavaScript-specific, and I won't use this data in JavaScript.
I figure there are very neat ways to do this, but at the moment I'm leaning towards using JSON despite its JavaScript inclination.
The fact that "JSON" is an acronym for JavaScript Object Notation has no bearing on whether it fits your needs as a data format. JSON is lightweight, text-based, easily human-readable and editable, and language-agnostic despite the name.
I'd definitely lean toward using it, as it sounds pretty ideal for your situation.
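For a sense of how little code this takes, here is a minimal sketch in Python using only the standard library. The `save_state`/`load_state` helpers and the sample settings are hypothetical, and the temporary directory is just there so the example cleans up after itself.

```python
import json
import tempfile
from pathlib import Path


def save_state(path: Path, state: dict) -> None:
    """Write state as indented JSON so it stays easy to edit by hand."""
    path.write_text(json.dumps(state, indent=2, sort_keys=True), encoding="utf-8")


def load_state(path: Path) -> dict:
    """Read the JSON file back into plain dicts, lists, and scalars."""
    return json.loads(path.read_text(encoding="utf-8"))


# Made-up configuration and application state.
state = {
    "window": {"width": 800, "height": 600},
    "recent_files": ["notes.txt", "todo.txt"],
    "autosave": True,
}

with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / "state.json"
    save_state(state_file, state)
    restored = load_state(state_file)
```

The `indent=2` and `sort_keys=True` options are a design choice for readability and stable diffs, not a requirement of the format.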
I will give a couple of choices:
Binary serialization: this depends on the content of your objects; if you have a complicated dependency tree, it can cause problems when serializing. It is also not very flexible, as the standard binary serialization provided by Microsoft stores type information too. That means if you save a type to a binary file and a month later decide to reorganize your code and, say, move the same class to another namespace, deserialization of the previously saved file will fail, because the type is no longer the same. There are several workarounds for that, but I personally try to avoid that kind of serialization as much as I can.
ORM mapping and storing it in a small database. SQLite is an awesome choice for this kind of thing, as it is small (a single file) and a database with full ACID support. You need a mapper, or you need to implement a mapper yourself.
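A minimal sketch of the SQLite option, in Python with the built-in `sqlite3` module. This is a hand-written key/value mapper rather than a real ORM; the `settings` table and the `open_store`, `put`, and `get` helpers are assumptions made for the example, and `:memory:` would be replaced by a file path in practice.

```python
import json
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS settings ("
        "key TEXT PRIMARY KEY, value TEXT NOT NULL)"
    )
    return conn


def put(conn: sqlite3.Connection, key: str, value) -> None:
    """Store a value as JSON text, overwriting any previous entry."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO settings (key, value) VALUES (?, ?)",
            (key, json.dumps(value)),
        )


def get(conn: sqlite3.Connection, key: str, default=None):
    """Load a value by key, or return default when the key is absent."""
    row = conn.execute(
        "SELECT value FROM settings WHERE key = ?", (key,)
    ).fetchone()
    return default if row is None else json.loads(row[0])


conn = open_store()
put(conn, "window", {"width": 800, "height": 600})
put(conn, "window", {"width": 1024, "height": 768})  # replaces the first row
window = get(conn, "window")
missing = get(conn, "no-such-key", default={})
```

Each `put` runs in its own transaction, which is where the ACID guarantees mentioned above come into play.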
I'm sure that you will get some other choices from the folks here in a couple of minutes.
So the choice is up to you.
Good luck.
I have a C# TCP chat program. Currently, I format the messages I send as strings, i.e., a "login" message starts with a "3", followed by a "U:", then the username, etc.
I think this method is very crude, in that it's not really readable and not standardized. In early research, I have read that I can format my messages using XML, but I don't know exactly where to start. Do I just make a string builder and append tags to it, like .append("<Login>"+message)?
The most common approach for dealing with a problem like this is to use serialization. Serialization is the process of converting an in-memory object into a format that can be easily streamed "over the wire," and de-serialization is the reverse process of converting the serialized format back into an object. .NET has good support for XML and binary serialization out-of-the-box, but there are other ways to implement this. Here's a link to get you started:
http://msdn.microsoft.com/en-us/library/7ay27kt9(VS.71).aspx
You can send whatever you like over the connection - as long as it's just for your program, it doesn't really matter what you choose. XML might give you some benefits, as it lends itself to more structured messages, and there are many classes, tools and much knowledge around on the net regarding XML. JSON might be another option - it could make it easier to create a JavaScript client in case you want to go web-based.
Unless there is a requirement that 3rd parties be able to read these messages, I would probably favour binary serialisation, as it has a more compact format.
That said, I'd probably just use WCF rather than using TCP directly.
If you want to know more about XML serialisation then the most commonly used methods are:
Using XSD.exe to generate a strongly typed C# class decorated with attributes that control XML serialisation, and then using XmlSerializer to serialise and deserialise XML. (recommended)
Using the XmlDocument class
You can write out XML yourself as a string, but it's better to use the serialisation methods made available in the .NET framework, as it makes things considerably easier and reduces the chance that you will make a mistake and inadvertently start working with invalid XML.
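To show why string concatenation goes wrong, here is a small sketch in Python using the standard `xml.etree.ElementTree` module (not the .NET classes named above). The `Login`/`Username` element names and the `build_login`/`parse_login` helpers are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_login(username: str) -> bytes:
    """Build a <Login> message; the library escapes special characters."""
    root = ET.Element("Login")
    ET.SubElement(root, "Username").text = username
    return ET.tostring(root, encoding="utf-8")


def parse_login(payload: bytes) -> str:
    """Parse a <Login> message and return the username."""
    root = ET.fromstring(payload)
    return root.findtext("Username")


# A username containing characters that are special in XML.
tricky_name = "a<b & c>"

wire = build_login(tricky_name)
parsed = parse_login(wire)

# The string-builder approach from the question, for comparison.
naive = "<Login><Username>" + tricky_name + "</Username></Login>"
try:
    ET.fromstring(naive)
    naive_is_valid = True
except ET.ParseError:
    naive_is_valid = False
```

The serialised message round-trips the username unchanged, whereas the hand-concatenated string is not well-formed XML and fails to parse.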
Is there such a thing as a JSON file? That is, *.json?
Can JSON be used in C# code without any JavaScript stuff, sort of as a replacement for XML?
And is there any official LINQ to JSON stuff around for C#?
I did find one website for my last question, but it took me to a page to download JSON.NET, and that page doesn't seem to mention anything about LINQ.
Yes, there is such a thing as a *.json file. The MIME type is application/json (source). JSON is a text-based format though, so you could hypothetically store JSON-formatted data in a text file with whatever extension you choose.
JSON can absolutely be used independently of JavaScript. In some cases, it's probably better suited to representing your data than XML. JSON.org has a great comparison page between JSON and XML.
JSON.org lists several JSON libraries for C# (for example, JSON.NET which you have already discovered), and most (if not all) of the collections that these libraries use should support LINQ. JSON.NET definitely does offer support for it. See here or here.
Everyone tends to stick with the JavaScriptSerializer (from the System.Web.Extensions library) when working with JSON in .NET. The handy part about this is the ability to create a custom JavaScriptConverter that will take custom objects and serialize them the way you choose. Likewise, you can write a deserialization method to read custom JSON formatting.
Though this of course depends on your application. Given that it's a Windows Forms application, is there any particular reason you'd choose JSON over storing the information natively or just using the XML format? If your application communicates with webpages, the JavaScriptSerializer is probably the best bet, though if you're using it to store/retrieve settings I'd use XML. And, if it's necessary to synchronize your application with a web-based one, just serialize to JSON when the time comes.
You can deserialize your JSON file into C# objects. After that, you can query with LINQ on these objects.
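The same deserialize-then-query idea, sketched in Python as an analogy (a comprehension plays the role that a LINQ query would in C#). The sample data is made up.

```python
import json

# Made-up JSON text standing in for the contents of a *.json file.
raw = """
[
  {"name": "Cleo", "age": 41, "city": "Oslo"},
  {"name": "Ann",  "age": 31, "city": "Oslo"},
  {"name": "Bob",  "age": 25, "city": "Bergen"}
]
"""

# Step 1: deserialize into ordinary objects (here: a list of dicts).
people = json.loads(raw)

# Step 2: query the objects; this filters, projects, and sorts,
# much like a Where / Select / OrderBy chain would.
names_30_plus = sorted(p["name"] for p in people if p["age"] >= 30)
```

Once the data is in ordinary in-memory objects, the query no longer depends on JSON at all.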