Read XML into xsd.exe generated classes. Good idea? - c#

I have a fairly complex XML coming my way and I have the XSD for it. I generated classes via xsd.exe and read XML into the class structure via the XmlSerializer described here.
It works great. However, this is the first time I've done it this way, and I'll be reading in tons of XML files from various sources going forward. How reliable is this method? Could one say with certainty that if the XML file conforms to the XSD specification, the XmlSerializer will be able to read it in just fine?

Short answer: yes, it's reliable. This is exactly how Microsoft's web services work, so if what you described didn't work, every .NET consumer would fail, for example when you add a reference to a web service in .NET or Silverlight.
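As a rough illustration of the approach in the question, here is a minimal sketch. The Order class is a hypothetical stand-in for whatever xsd.exe generates from your schema, and OrderReader is an invented helper. One thing worth knowing: XmlSerializer does not validate against the XSD, and by default it skips elements it does not recognise, so the sketch hooks the UnknownElement event to make those visible.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-in for a class that xsd.exe would generate.
[XmlRoot("order")]
public class Order
{
    [XmlAttribute("id")]
    public int Id { get; set; }

    [XmlElement("customer")]
    public string Customer { get; set; }

    // Repeating <item> elements map to an array.
    [XmlElement("item")]
    public string[] Items { get; set; }
}

public static class OrderReader
{
    // Deserializes the XML and records the name of every element
    // that the Order class has no member for.
    public static Order Read(string xml, IList<string> unknownElements)
    {
        var serializer = new XmlSerializer(typeof(Order));
        serializer.UnknownElement += (sender, e) => unknownElements.Add(e.Element.Name);

        using (var reader = new StringReader(xml))
        {
            return (Order)serializer.Deserialize(reader);
        }
    }
}
```

Calling OrderReader.Read with a document that carries an extra, unmapped element still returns a populated Order; the extra element's name simply ends up in the list you passed in.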

Related

compiling proto files at runtime

I'm working on a generic protobuf decoder that works as follows:
The user can specify the .proto file at runtime and specify the data file and the program would display the data in the file based on the .proto definition.
To do the above, the most obvious approach seems to be to interpret the .proto file (or compile it) and then decode the protobuf message using it. Any ideas on how I can proceed on this? Is there a library out there that would help me with this?
As always, any feedback is much appreciated.
Thanks!
I keep meaning to write my own parser, but for now I just use "protoc" to parse the .proto to a protobuf binary. I then deserialize that using my own protobuf library, giving me a populated object model to work with.
I don't know how far along you are, but you might also be interested in some of the runtime support in protobuf-net v2, which allows on-the-fly mapping of protobuf data to types. Alternatively there's also a fairly re-usable reader implementation that might suit your needs.
If you could work from XML, I include a tool in protobuf-net, "protogen", which does code-gen; but pass in a -t:xml and it should transform a .proto into XML for you.
IIRC, "protoc" outputs a protobuf using "descriptor.proto" from the google package.

What is the best way to read and write cXML documents in C#?

I know this is a vague open ended question. I'm hoping to get some general direction.
I need to add cXML punchout to an ASP.NET C# site / application. This is replacing something that I wrote years ago in ColdFusion.
I'm a reasonably experienced C# developer but I haven't done much with XML. There seems to be lots of different options for processing XML in .NET.
Here's the open ended question: Assuming that I have an XML document in some form, e.g. a file or a string, what is the best way to read it into my code? I want to get the data and then query databases etc. The cXML document size and our traffic volumes are easily small enough that loading a cXML document into memory is not a problem.
Should I:
1) Manually build classes based on the DTD and use the XmlSerializer?
2) Use a tool to generate classes. There are sample cXML files downloadable from Ariba.com.
I tried xsd.exe to generate an xsd and then xsd.exe /c to generate classes. When I try to deserialize I get errors because there seems to be "confusion" around whether some elements should be single values or arrays.
I tried the CodeXS online tool but that gives errors in its log and errors if I try to deserialize a sample document.
3) Create a dataset and ReadXml()?
4) Create a typed dataset and ReadXml()?
5) Use LINQ to XML. I often use LINQ to Objects so I'm familiar with LINQ in general but I'm struggling to see what it gives me in this situation.
6) Some other means.
I guess I need to improve my understanding of XML in general but even so ... am I missing some obvious way of doing this? In the old ColdFusion site I found a free component ("tag") which basically ignored any schema and read the XML into a "structure" which is essentially a series of nested hash tables which was then easy to read in code. That was probably quite sloppy but it worked.
I also need to generate XML files from my C# objects. Maybe Linq to XML will be good for that. I could start with a default "template" document and manipulate it before saving.
Thanks for any pointers ...
If you need to generate arbitrary XML in an exact format, you should generate it manually using LINQ-to-XML.
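To give a feel for what that looks like, here is a small sketch using System.Xml.Linq. The element and attribute names are only illustrative and have not been checked against the cXML DTD, and PunchOutBuilder is an invented helper, so treat this as a shape to adapt rather than a working cXML message.

```csharp
using System;
using System.Xml.Linq;

public static class PunchOutBuilder
{
    // Builds a small document by nesting XElement constructors.
    // Names are illustrative; check them against the real cXML DTD.
    public static XDocument Build(string payloadId, string buyerCookie)
    {
        return new XDocument(
            new XDeclaration("1.0", "utf-8", null),
            new XElement("cXML",
                new XAttribute("payloadID", payloadId),
                new XAttribute("timestamp", DateTime.UtcNow.ToString("o")),
                new XElement("Request",
                    new XElement("PunchOutSetupRequest",
                        new XAttribute("operation", "create"),
                        new XElement("BuyerCookie", buyerCookie)))));
    }

    // Reading goes the other way: walk down the tree by element name.
    public static string ReadBuyerCookie(XDocument doc)
    {
        return (string)doc.Root
            .Element("Request")
            .Element("PunchOutSetupRequest")
            .Element("BuyerCookie");
    }
}
```

The nesting of the constructor calls mirrors the nesting of the output, which is what makes it easy to hit an exact format. XDocument.Parse(someString) or XDocument.Load(path) gives you the same object model for incoming documents.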

XSD.exe and "Circular Group references"

I am attempting to build some classes so that I can deserialise an XML file created by a third party application. Luckily the developer of the 3rd party application included a schema file with their code so that the XML file can be understood.
When I use the XSD.exe tool from Visual Studio, the process fails, reporting the following error:
"Group 'SegGroupOrSegmentGrouping' from targetNamespace='' has invalid definition: Circular group reference."
Any help in how I can generate the class files in light of this error would be appreciated.
A copy of the schema file can be found here : schema file
Try using svcutil; it can handle the circular references.
In the following example, eExact-Schema.xsd is an XSD that xsd.exe cannot handle.
Example:
C:\SRC\Exact>svcutil eExact-Schema.xsd /language:C# /dataContractOnly /importxmltypes /out:exact.cs
This is always a good place to start: you can now use this class and alter it to suit your style/needs, add comments, etc., and it will save you a lot of time/searching over doing it all from scratch.
I had this same problem recently.
I was given a Schema from a third party company who were returning an xml structure from a webservice. I then wanted to deserialise the response and store the information into a database with NHibernate.
No problem, I thought, I'll just use xsd.exe and I'll have my classes. Unfortunately this was not to be. Xsd.exe failed with exactly the same error you are getting. This is because it is unable to resolve circular references.
I spent a good few days looking at alternatives until in the end I wrote my own class structure to the schema and was able to deserialise perfectly. The answer here is to write your own C# classes and decorate them with the appropriate attributes.
Save yourself some time and heartache and stop trying to generate the classes you need automatically. Writing them by hand is time consuming, but the classes you write won't carry the compromises that most tools (which don't work perfectly) will force on you.
Took me about 3 days to write the class structure (it was large) but I ended up with a quality solution.
One thing is certain: you will not be able to use xsd.exe, and most of the other tools I tried after googling this either did not work properly or were buggy.
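For anyone wondering what hand-written classes for a self-referencing structure might look like, here is a minimal sketch. The element names (Group, Segment) and attributes are made up for illustration and are not taken from the schema in the question. The point is only that a class can hold a list of its own type, and XmlSerializer will follow the nesting.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical leaf element.
public class Segment
{
    [XmlAttribute("tag")]
    public string Tag { get; set; }
}

// Hypothetical group that may contain segments and further groups.
[XmlRoot("Group")]
public class SegGroup
{
    [XmlAttribute("name")]
    public string Name { get; set; }

    [XmlElement("Segment")]
    public List<Segment> Segments { get; set; } = new List<Segment>();

    // The "circular" part: a group holds child groups of the same type.
    [XmlElement("Group")]
    public List<SegGroup> Groups { get; set; } = new List<SegGroup>();
}

public static class SegGroupReader
{
    public static SegGroup Read(string xml)
    {
        var serializer = new XmlSerializer(typeof(SegGroup));
        using (var reader = new StringReader(xml))
        {
            return (SegGroup)serializer.Deserialize(reader);
        }
    }
}
```

A real schema will need more attributes than this (namespaces, choice elements and so on), but the recursion itself needs nothing special once you are writing the classes yourself.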
After trying several third party tools, I found that Liquid Technologies has a very robust generator called Liquid XML Data Binder 2012. It was able to handle the circular group reference problem I faced. It can generate code for just about any version of .NET from 2.0 on. The classes it generates do depend on a redistributable DLL that they provide. I'm using the trial version and I wouldn't be surprised if a purchase of the full version will be necessary before I go to release. However, having saved me probably a hundred hours or more of error prone hand coding, I can't complain.
The easiest method for me is to create the XSD file from the actual XML file with XSD.EXE. Then create a class from the new XSD file. You may be required to modify the class periodically if nodes or types are introduced that did not exist in the original XML but you will save yourself HOURS of coding time!!!!

What would be the best way to validate XML?

I've been looking at XML serialization for C# and it looks interesting. I was reading this tutorial:
http://www.switchonthecode.com/tutorials/csharp-tutorial-xml-serialization
and of course you can deserialize it back to a list of objects. So I am wondering: would it be better to deserialize it back to a list of objects and then go through each object and validate it, or to validate it using a schema, then deserialize it and do stuff with it?
http://support.microsoft.com/kb/307379
Thanks
I guess it would depend a bit on what you want to validate, and for what purpose. If it is intended for interop with other systems, then validating via xsd is a reasonable idea, not least because you can use xsd.exe to write your classes for you from the xsd (you can also generate xsd from xml or dll, but it isn't as accurate). Likewise you can use XmlReader (appropriately configured) to check against xsd.
If you just want valid .NET objects, I'd be tempted to leave the serialized form as an implementation detail, and write some C# validation code - perhaps implementing IDataErrorInfo, or using data-annotations.
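As a sketch of the data-annotations route, assuming a made-up Customer class and an invented ObjectValidator helper: the rules live on the object as attributes, and Validator.TryValidateObject checks them after deserialization, whatever the serialized form was.

```csharp
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;

// Hypothetical business object; the rules are attached as attributes.
public class Customer
{
    [Required]
    public string Name { get; set; }

    [Range(0, 150)]
    public int Age { get; set; }
}

public static class ObjectValidator
{
    // Returns one ValidationResult per broken rule; an empty list means valid.
    public static IList<ValidationResult> Validate(object instance)
    {
        var results = new List<ValidationResult>();
        var context = new ValidationContext(instance, null, null);

        // 'true' asks for all property attributes to be checked, not only [Required].
        Validator.TryValidateObject(instance, context, results, true);
        return results;
    }
}
```

Running ObjectValidator.Validate on each deserialized object gives you error messages per rule, which is usually more useful to report than a schema violation.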
You can create an XmlValidatingReader and pass that into your serializer. That way you can read the file in one pass and validate it at the same time.
I believe the same technique will work even if you are using hand rolled XML classes (for extremely large XML files) so you might find it worth a look.
Edit:
Sorry just reread some of my code, XmlValidatingReader is obsolete, you can do what you need with the XmlReader.
See XmlReaderSettings.
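Roughly, validating and deserializing in one pass could look like the sketch below. The Note class and the ValidatingLoader helper are invented for illustration; the XmlReaderSettings, XmlSchemaSet and XmlSerializer calls are the standard ones. Validation problems are collected in a list instead of being thrown, so you can decide afterwards what to do with them.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

// Hypothetical target type for the example.
[XmlRoot("note")]
public class Note
{
    [XmlElement("body")]
    public string Body { get; set; }
}

public static class ValidatingLoader
{
    // Reads the XML once: the reader validates against the schema while
    // the serializer consumes it. Problems are appended to 'problems'.
    public static T Load<T>(TextReader xml, TextReader xsd, IList<string> problems)
    {
        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
        settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
        settings.ValidationEventHandler +=
            (sender, e) => problems.Add(e.Severity + ": " + e.Message);

        using (var schemaReader = XmlReader.Create(xsd))
        {
            // null means "use the targetNamespace declared in the schema itself".
            settings.Schemas.Add(null, schemaReader);
        }

        using (var reader = XmlReader.Create(xml, settings))
        {
            return (T)new XmlSerializer(typeof(T)).Deserialize(reader);
        }
    }
}
```

Because the handler only records the problem, deserialization carries on after a schema violation. If you would rather stop at the first error, throw from the handler instead.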
For speed I would do it in C#; for completeness, however, you might want to do it using an XSD. The issue with that is you have to learn the verbose and cumbersome XSD syntax, which from experience takes a lot of trial and error, is time consuming and offers little reward for serialization. This is particularly true of constants, which you have to map both in C# and in the XSD.
You'll always be writing the XML as C#. Anything not known when read back in is simply ignored. If you aren't editing the XML with a text editor you can guarantee that it will come back in the right way, in which case XSD is definitely not needed.
If you validate the XML, you can only prove that it's structurally correct. An attempt to deserialize from the XML will tell you the same thing.
Typically business objects can implement business logic/rules/conditions that go beyond a valid schema. That type of knowledge should stay with the business objects themselves, rather than being duplicated in some sort of external validation routine (otherwise, if you change a business rule, you have to update the validator at the same time).

Generating a Protocol Buffers definition

I have a large set of XML files in a proprietary schema. The XML files define a binary communication protocol (message structure).
I'd like to leverage Google's protocol buffers technology.
I am using existing code to load the XML files into an object model (in memory).
I'd like to generate a .proto file from that object model.
So basically what I am looking for is code or a library (in C#/.NET) that represents the .proto file format as an object model and can save that object model into a .proto file.
I took a look at Jon Skeet's dotnet-protobufs, and I think I understand what it does (generate C# code based on .proto files).
However, I didn't figure out if I can use it for my project (it probably has the .proto format object model there, but probably only code that can parse this format and not write it out)
protobuf-net (my version of protocol buffers in .NET) has primitive support for generating proto files, but it wouldn't be hard to fill in the blanks. I concentrated on the core engine first, then the generation of C# from proto. Writing an xslt to generate a proto from the object model wouldn't be much different. It would take a few days though... (I have limited time at the moment).
If this would be useful, please let me know.
For info, the protobuf-net engine is compatible with most XmlSerializer classes (and DataContractSerializer, and recently BinaryFormatter) - so if your code currently works as xml, we can probably get it working in protobuf-net. No guarantees, of course...
My code can only serialize and deserialize to binary and text. However, I believe Marc Gravell's project has XML capabilities. In fact, I believe he generates C# code based on loading the binary version of a .proto file (which is itself encoded as a protobuf), writing it out as XML, and then applying XSLT to it...
