Converting XSD Schemas to Classes - c#

I am working on a project that consumes (external) services.
The vendor has provided a whole heap of XSDs (89 of them) and I want to convert them all into .NET (C#) classes / a class library.
I am using the XSD utility on these, but as there is a lot of cross-referencing and importing, they are failing with error messages saying type 'xxxxx' not declared.
Now, based on my googling, this is quite simply overcome by compiling the complete reference "tree", but:
I have 89 files to convert.
It concatenates all the schema names together for the output .cs file name (and breaks because the name is too long, > 260 characters).
I thought about creating a class library assembly, starting with the base level schemas (ones without imports) and then telling XSD to convert a schema but use any referenced types from this assembly... but I am not sure how or even if it is possible.
So, how can I best do this please... any advice is welcome..
And yes, 89 schemas are a lot and unfortunately, I have no control on this, I just have to suck it up and deal with it.

You can use the /P[arameters]:file.xml option of xsd.exe to specify the parameters (including the full list of schemas) in a separate file instead of passing them on the command line.
Sample of this xml:
<xsd xmlns='http://microsoft.com/dotnet/tools/xsd/'>
  <generateClasses language='CS' namespace='MyNamespace'>
    <schema>FirstSchema.xsd</schema>
    <schema>SecondSchema.xsd</schema>
    <schema>ThirdSchema.xsd</schema>
  </generateClasses>
</xsd>
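With 89 schemas, writing that file by hand is tedious. A minimal sketch of generating it, assuming all the .xsd files sit in one directory and that xsd.exe is later run from that directory (the class name XsdParametersWriter and the output file name xsd-parameters.xml are made up for illustration):

```csharp
using System;
using System.IO;
using System.Linq;
using System.Xml.Linq;

public static class XsdParametersWriter
{
    // Namespace of the xsd.exe parameters file, as used in the sample above.
    private static readonly XNamespace Ns = "http://microsoft.com/dotnet/tools/xsd/";

    // Builds a parameters document that lists every *.xsd file in schemaDirectory.
    // Only file names are written, so xsd.exe must be run from that directory.
    public static XDocument Build(string schemaDirectory, string clrNamespace)
    {
        var schemaElements = Directory
            .EnumerateFiles(schemaDirectory, "*.xsd")
            .Select(path => Path.GetFileName(path))
            .OrderBy(name => name, StringComparer.OrdinalIgnoreCase)
            .Select(name => new XElement(Ns + "schema", name));

        return new XDocument(
            new XElement(Ns + "xsd",
                new XElement(Ns + "generateClasses",
                    new XAttribute("language", "CS"),
                    new XAttribute("namespace", clrNamespace),
                    schemaElements)));
    }

    public static void Main(string[] args)
    {
        // Assumption: the schema directory is passed as the first argument.
        string directory = args.Length > 0 ? args[0] : ".";
        Build(directory, "MyNamespace")
            .Save(Path.Combine(directory, "xsd-parameters.xml"));
    }
}
```

The resulting file would then be passed as xsd.exe /p:xsd-parameters.xml. Whether this also avoids the over-long output file name is not something the sketch addresses.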

Related

How to handle XML deserialization with a changing schema

We have a service that's receiving data in XML, and there's an accompanying namespace/schema definition that usually changes once a year, sometimes more.
The schema describes a very large object and we only use a small portion of it that has not changed in at least 2 years since I've been handling it. However, the schema change forces us to re-generate the C# classes, re-build and re-deploy the application.
It would be good to not have to touch the application unless there's a change in the parts that we use.
For a separate throwaway application that was set up with a certain namespace, I had the code replace the incompatible namespace with the compatible one and deserialize the data that way.
Is there a solution for this problem that's more elegant?
Edit: the data we receive is only the subset of the whole schema, that's why it's not a problem to deserialize it with the namespace replacement.
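For reference, the namespace replacement described in the question can be sketched roughly as follows. This is only an illustration under assumptions: the Order class and both namespace URIs are hypothetical stand-ins for the generated classes and the real schema versions, and a plain textual replace is only safe when the URI cannot occur elsewhere in the payload.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-in for a class generated from the schema version
// the application was compiled against.
[XmlRoot("Order", Namespace = Order.CompiledNamespace)]
[XmlType(Namespace = Order.CompiledNamespace)]
public class Order
{
    public const string CompiledNamespace = "http://example.com/schema/2019";

    public string Id { get; set; }
}

public static class NamespaceReplacingDeserializer
{
    // Rewrites the incoming namespace URI to the one the classes were
    // generated for, then deserializes as usual.
    public static T Deserialize<T>(string xml, string incomingNamespace, string compiledNamespace)
    {
        string patched = xml.Replace(incomingNamespace, compiledNamespace);
        var serializer = new XmlSerializer(typeof(T));
        using (var reader = new StringReader(patched))
        {
            return (T)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        // Payload published under a newer (hypothetical) schema namespace.
        string xml = "<Order xmlns=\"http://example.com/schema/2020\"><Id>42</Id></Order>";
        Order order = Deserialize<Order>(
            xml, "http://example.com/schema/2020", Order.CompiledNamespace);
        Console.WriteLine(order.Id);
    }
}
```

This works only as long as the parts of the schema that the application actually reads stay structurally the same between versions, which matches the situation described above.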

XmlSerializer - the first deserialization is very slow

I have a solution with two projects; an asp.net MVC application, and a class library. Let's call them project MVC and project CLS.
In the project CLS, there are two different versions (V1 and V2) of an XSD file that I have used to create two serializable classes with the same name, but under different namespaces (V1 and V2) using xsd2code.
In the MVC project, when the user uploads an XML file, the CLS.dll is used to deserialize the XML into an object. When the XML file is of type V1, the deserialization is very fast, but the XSD file for the V2 version is a lot more complex, and the deserialization can take up to a couple of minutes, only the first time (it's very fast afterwards, until the application is run again).
I used the Sgen.exe tool to create a serializer assembly (CLS.XmlSerializers.dll) for the CLS.V2 type in order to eliminate the first-time creation of the assembly on the fly, and thereby improve performance.
I have successfully managed to add the Sgen task to the post-build events, and the assembly CLS.XmlSerializers.dll is created every time I build the project. Also, I have used the unit test code in this post to make sure the assembly is loaded, and it is. The test passes successfully.
However, still, the first time the XML file is deserialized, it takes a long time. So, something still should be wrong. But, I don't know what. Please help.
UPDATE:
I used Fuslogvw.exe as was suggested in the comments, and I can see that the CLS.XmlSerializers.dll is being loaded successfully. Then, how come the first time the XML file is deserialized it takes around one minute, but every time after that takes less than a second?
UPDATE 2:
One of the differences between the two XSD files is that the second one (V2) has a reference to a very big XSD file that contains definitions of some xs:enumeration types that are used in the main file. And that's the reason the deserialization took a long time. Since all I need to do is deserialize the XML files into objects, and I do not need to validate the values of the attributes and elements against those enumerations, I ended up removing the reference to that XSD file and replacing all the enumeration types with their base types (in this case, xs:string). Now V2 is deserialized as fast as V1, and I don't even need to use Sgen.exe. I guess Sgen.exe only helps in situations where you need to deserialize a very large XML file. In my case, the XML files are always very small, but the deserialization is (was) complex.
In order to increase performance of XML serialization, assemblies are dynamically generated each time XmlSerializer is instantiated for the first time for a specific type. It happens only once in the application lifetime, but that makes its first usage slow.
When you instantiate an XmlSerializer you have to pass the Type of the objects that you will attempt to serialize and deserialize with that serializer instance. The serializer examines all public fields and properties of the Type to learn about which types an instance references at runtime. It then proceeds to create C# code for a set of classes to handle serialization and deserialization using the classes in the System.CodeDom namespace. During this process, the XmlSerializer checks the reflected type for XML serialization attributes to customize the created classes to the XML format definition. These classes are then compiled into a temporary assembly and called by the Serialize() and Deserialize() methods to perform the XML to object conversions.
Full Content: Troubleshooting Common Problems with the XmlSerializer
More Info: XmlSerializer Constructor Performance Issues
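Since that cost is paid once per type, one option is to pay it at application start-up rather than on the first upload. A minimal sketch, assuming a hypothetical SampleDocument class in place of the generated CLS.V2 type; SerializerCache is an illustrative name, not part of any library:

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Xml.Serialization;

// Hypothetical stand-in for the generated type.
public class SampleDocument
{
    public string Title { get; set; }
}

public static class SerializerCache
{
    private static readonly ConcurrentDictionary<Type, XmlSerializer> Cache =
        new ConcurrentDictionary<Type, XmlSerializer>();

    // Returns one shared serializer per type; the expensive construction
    // happens the first time a type is requested.
    public static XmlSerializer For(Type type)
    {
        return Cache.GetOrAdd(type, t => new XmlSerializer(t));
    }

    public static void Main()
    {
        // Calling For(...) during start-up moves the one-time cost
        // away from the first real request.
        var watch = Stopwatch.StartNew();
        XmlSerializer first = For(typeof(SampleDocument));
        Console.WriteLine("first lookup: " + watch.ElapsedMilliseconds + " ms");

        watch.Restart();
        XmlSerializer second = For(typeof(SampleDocument));
        Console.WriteLine("second lookup: " + watch.ElapsedMilliseconds + " ms");
        Console.WriteLine("same instance: " + ReferenceEquals(first, second));
    }
}
```

This does not make the first construction any cheaper; it only controls when it happens. The XmlSerializer(Type) constructor also caches its generated assembly internally, so the dictionary mainly saves repeated constructor calls.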
This is a known issue with the x64 JIT compiler, which can be very slow in some cases. That's why you get much better performance when running the deserialization the second time, when the code is already compiled.
Try .NET 4.6 or higher, which features a new version of the x64 JIT compiler (RyuJIT). If updating the .NET version is not possible, take a look at this thread.

Reusing generated classes from included XSDs

I have 6 XSD files all of which have these lines to import/include 3 further XSD files (I did not write them, I just need to use them):-
<xsd:import namespace="http://www.w3.org/2000/09/xmldsig#" schemaLocation="xmldsig-core-schema.xsd"/>
<xsd:include schemaLocation="ORIGOMESSAGEHEADER.xsd"/>
<xsd:include schemaLocation="ORIGODATATYPELIBRARY.xsd"/>
The 6 XSD files all need to be in different (C#) namespaces (because they use the same top-level type names) but that then means I get 6 copies of the types from the ORIGO* XSDs. This is very frustrating because I would need to duplicate lots of code for each duplicate type (or add interfaces for them all).
Is there any way of generating C# classes for the 6 main XSDs which will use a single, shared copy of classes generated from the ORIGO*.XSD types?
I tried using XSD2CODE with the ExcludeImportedTypes option but got an error:
"Error: Cannot use wildcards at the top level of a schema." - no idea what that means.
(The XSD are not secret so I can upload a zip file containing all 9 if that helps)

Xsd to object class

So, I'm trying to take an .xsd file (musicxml fixed standard), create an object class, use portions of it - specifically the note object - include it in a graph object, and then save both the graph object and a musicxml validated file.
All in all, the solutions I'm using have one or two massively breaking shortcomings.
Xsd2Code - Creates the file, but for some reason it makes an Items collection (of the type I need, ObservableCollection), and then an enumerable ItemsChoiceType[0-9] ObservableCollection. The problem with the enumerable is that after it generates, I either have to switch the latter to an Array, or do mumbo-jumbo with the XmlSerialisation attrs. It generates a 2 MB .cs file, so a lot of the code is autogenerated and I would need a crapton of .extend.cs files to get it to fit. Maybe I have to change some switches for it to work? What switches fix this?
LinqToXsd / OpenLinqToXsd - Generates the file, hard codes it to reference a DLL file, then forces you to use List (no option to go to ObservableCollection), which doesn't have EditItem and can't be used for binding to WPF/XAML. Otherwise, a bunch more .extend.cs files.
Altova C# generator - Expensive, requires a bunch of their DLLs to include in the project, messy.
Long story short, has anyone used any of these systems successfully and what did you have to do to shoehorn them? What kind of pain will I have to deal with beyond the issues I'm having?
I remember now for XSD.exe: XSD notation doesn't export, individual classes (such as 'note') don't serialise out to xml. I would have to write out the entire thing from scorepartwise to every piece in between. Which means I can't serialise a graph object that has 'note's as vertices.

/sharetypes equivalent for svcutil.exe?

I am building an app that relies on a 3rd party provider who has a very verbose set of SOAP services (we're talking 50+ WSDL files). Each individual WSDL, however, has numerous shared type declarations. When generating client code with wsdl.exe, there used to be a /sharetypes flag that would merge duplicate entries if a type was found several times.
When I attempt to generate my client code, I bomb on these overlapping types that the 3rd party includes in all their WSDL files.
svcutil /t:code /importxmltypes [mypath]/*.wsdl
Results in error messages alluding to the type collisions. For example, a couple samples of the error messages below:
Error: There was an error verifying some XML Schemas generated during export:
The simpleType 'http://common.soap.3rdparty.com:CurrencyNotation' has already been
declared.
Error: There was an error verifying some XML Schemas generated during export:
The complexType 'http://common.soap.3rdparty.com:NumberFormat' has already been
declared.
I do not have control over the output of the WSDLs. I do not want to have to edit the WSDLs by hand for fear of an error that breaks in a fashion at runtime that would be highly difficult to track back to our editing of the WSDL files. Not to mention that there are 50 some WSDL files that range from 200-1200 lines of XML. (Remind me again why we thought SOAP was the great salvation to us all back in the late 90s?)
Try specifying all the WSDLs in one command:
svcutil http://example.com/service1?wsdl http://example.com/service2?wsdl ...
This should automatically take care of duplicate types. Another option is to take a look at the /reference command switch:
/reference:<file path> - Add the specified assembly to the set of
                         assemblies used for resolving type
                         references. If you are exporting or
                         validating a service that uses 3rd-party
                         extensions (Behaviors, Bindings and
                         BindingElements) registered in config use
                         this option to locate extension assemblies
                         that are not in the GAC. (Short Form: /r)
This means that if you already have some types defined in some assembly you may include this assembly and svcutil will exclude types from it to avoid duplicates:
svcutil /reference:someassembly.dll http://example.com/service?wsdl
I was having similar problems. By defining different CLR namespaces for the different XML namespaces (using the /namespace argument of svcutil) I was able to get it working.
/namespace:http://www.opengis.net/gml,OpenGIS.GML
I have been using wsdl.exe to get round this because I work with some SOAP web services which define the same data transfer objects at different endpoints. So I use wsdl.exe because it has the /sharetypes switch. I'm not a WPF developer so I don't really care that the output does not implement IWhatever for WPF, but the generated classes are all partial, so you can do some work to implement the interfaces you care about in a separate file.
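A minimal sketch of that partial-class approach, assuming two generated proxies that each declare their own Money type. All names here (ServiceA, ServiceB, Money, IMoney, MoneyFormatter) are hypothetical, and the "generated" parts are shown inline only to keep the example self-contained:

```csharp
using System;
using System.Globalization;

// Stand-ins for generated code; in a real project these would live in the
// files produced by wsdl.exe or svcutil and would not be edited by hand.
namespace ServiceA
{
    public partial class Money
    {
        public decimal Amount { get; set; }
        public string Currency { get; set; }
    }
}

namespace ServiceB
{
    public partial class Money
    {
        public decimal Amount { get; set; }
        public string Currency { get; set; }
    }
}

// Hand-written shared contract that both generated types already satisfy.
public interface IMoney
{
    decimal Amount { get; }
    string Currency { get; }
}

// Hand-written partial declarations, kept in a separate file so that
// regenerating the proxies does not overwrite them.
namespace ServiceA { public partial class Money : IMoney { } }
namespace ServiceB { public partial class Money : IMoney { } }

public static class MoneyFormatter
{
    // Code written once against the interface works with either copy.
    public static string Format(IMoney money)
    {
        return string.Format(
            CultureInfo.InvariantCulture, "{0} {1}", money.Amount, money.Currency);
    }

    public static void Main()
    {
        Console.WriteLine(Format(new ServiceA.Money { Amount = 12.5m, Currency = "EUR" }));
        Console.WriteLine(Format(new ServiceB.Money { Amount = 3m, Currency = "USD" }));
    }
}
```

This does not remove the duplicate types, it only lets shared code treat them uniformly, so it is a fallback for cases where type sharing at generation time is not available.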
