XmlSerializer - the first deserialization is very slow - c#

I have a solution with two projects; an asp.net MVC application, and a class library. Let's call them project MVC and project CLS.
In the project CLS, there are two different versions (V1 and V2) of an XSD file that I have used to create two serializable classes with the same name, but under different namespaces (V1 and V2) using xsd2code.
In the MVC project, when the user uploads an XML file, the CLS.dll is used to deserialize the XML into an object. When the XML file is of type V1, the deserialization is very fast, but the XSD file for the V2 version is a lot more complex, and the deserialization can take up to a couple of minutes, only the first time (it's very fast afterwards, until the application is run again).
I used the Sgen.exe tool to create a serializer assembly (CLS.XmlSerializers.dll) for the CLS.V2 type in order to eliminate the on-the-fly creation of the assembly on first use, and thereby improve performance.
I have successfully added the Sgen task to the post-build events, and the assembly CLS.XmlSerializers.dll is created every time I build the project. I have also used the unit test code in this post to make sure the assembly is loaded, and it is. The test passes successfully.
However, the first deserialization of the XML file still takes a long time, so something must still be wrong, but I don't know what. Please help.
UPDATE:
I used Fuslogvw.exe as was suggested in the comments, and I can see that the CLS.XmlSerializers.dll is being loaded successfully. Then, how come the first time the XML file is deserialized it takes around one minute, but every time after that takes less than a second?
UPDATE 2:
One of the differences between the two XSD files is that the second one (V2) references a very big XSD file that contains definitions of some xs:enumeration types used in the main file. That is the reason the deserialization took a long time. Since all I need to do is deserialize the XML files into objects, and I do not need to validate the values of the attributes and elements against those enumerations, I ended up removing the reference to that XSD file and replacing all the enumeration types with their base types (in this case, xs:string). Now V2 is deserialized as fast as V1, and I don't even need to use Sgen.exe. I guess Sgen.exe only helps in situations where you need to deserialize a very large XML file. In my case, the XML files are always very small, but the deserialization is (was) complex.

In order to increase performance of XML serialization, assemblies are dynamically generated each time XmlSerializer is instantiated for the first time for a specific type. It happens only once in the application lifetime, but that makes its first usage slow.
When you instantiate an XmlSerializer you have to pass the Type of the objects that you will attempt to serialize and deserialize with that serializer instance. The serializer examines all public fields and properties of the Type to learn about which types an instance references at runtime. It then proceeds to create C# code for a set of classes to handle serialization and deserialization using the classes in the System.CodeDom namespace. During this process, the XmlSerializer checks the reflected type for XML serialization attributes to customize the created classes to the XML format definition. These classes are then compiled into a temporary assembly and called by the Serialize() and Deserialize() methods to perform the XML to object conversions.
Full Content: Troubleshooting Common Problems with the XmlSerializer
More Info: XmlSerializer Constructor Performance Issues
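Since the generation cost described above is paid once per type per process, one option is to pay it at application start instead of during the first upload. The following is only a minimal sketch under assumptions: `Order` and `SerializerWarmup` are hypothetical names that do not come from the question, and the sketch relies on the framework reusing the generated assembly for later `new XmlSerializer(Type)` calls on the same type.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-in for one of the generated classes.
public class Order
{
    public int Id { get; set; }
    public string Customer { get; set; }
}

// Hypothetical helper, not part of any library.
public static class SerializerWarmup
{
    // Constructs a serializer for each type so the one-time generation cost
    // is paid here rather than during the first real deserialization.
    // Returns the time spent, which is useful for logging.
    public static TimeSpan Warm(params Type[] types)
    {
        var stopwatch = Stopwatch.StartNew();
        foreach (Type type in types)
        {
            var serializer = new XmlSerializer(type);
            GC.KeepAlive(serializer);
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    // Deserializes an XML string into T using the simple Type constructor.
    public static T Deserialize<T>(string xml)
    {
        var serializer = new XmlSerializer(typeof(T));
        using (var reader = new StringReader(xml))
        {
            return (T)serializer.Deserialize(reader);
        }
    }
}
```

Calling `SerializerWarmup.Warm(typeof(Order))` from application startup (for example on a background task) moves the delay but does not shrink it. If the type graph itself is huge, as with the enumeration-heavy schema in the question, simplifying the types is what actually reduces the cost.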

This is a known issue with the x64 JIT compiler, which can be very slow in some cases. That is why performance is much better when the deserialization runs a second time: the code has already been compiled.
Try .NET 4.6 or higher, which features a new x64 JIT compiler (RyuJIT). If updating the .NET version is not possible, take a look at this thread.

Related

Converting XSD Schemas to Classes

I am working on a project that consumes (external) services.
The vendor has provided a whole heap of XSDs (89 of them) and I want to convert them all into .NET (C#) classes / a class library.
I am using the XSD utility on these, but as there is a lot of cross-referencing and importing, they fail with error messages saying type 'xxxxx' is not declared.
Now, based on my googling, this is quite simply overcome by compiling the complete reference "tree" but ....
I have 89 files to convert
It concatenates all the schema names together for the output .cs file name, which then breaks because the name is too long (> 260 characters)
I thought about creating a class library assembly, starting with the base level schemas (ones without imports) and then telling XSD to convert a schema but use any referenced types from this assembly... but I am not sure how or even if it is possible.
So, how can I best do this please... any advice is welcome..
And yes, 89 schemas are a lot and unfortunately, I have no control on this, I just have to suck it up and deal with it.
You can use the /P[arameters]:file.xml option of xsd.exe to specify the parameters in a separate file instead of passing them on the command line.
Sample of this xml:
<xsd xmlns='http://microsoft.com/dotnet/tools/xsd/'>
  <generateClasses language='CS' namespace='MyNamespace'>
    <schema>FirstSchema.xsd</schema>
    <schema>SecondSchema.xsd</schema>
    <schema>ThirdSchema.xsd</schema>
  </generateClasses>
</xsd>

Deserialization exception: Unable to find assembly

I'm serializing some data, such as fields and a custom class, to create binary data (a byte array).
Then I want to deserialize it back from binary data to the fields and the class.
But I get an exception. It would all work fine if these two methods ran in the same assembly, but they don't.
I do the serialization in one assembly and the deserialization in another one. This is what the exception says too:
Unable to find assembly 'MyAssamblyName, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null'.
NOTE 1: I have no issues with getting the fields back, only the classes causes them.
NOTE 2: I have this same class in both assemblies.
"NOTE 2: I have this same class in both assemblies"
No you don't. At least, not as far as the runtime is concerned. You have two different types that happen to have the same name. A type is defined by its assembly. Thus "SomeType in AssemblyA" is completely different to "SomeType in AssemblyB", even if they happen to have been compiled from the same source file.
BinaryFormatter works with type information, so this won't work. One option would be to move the type to a library dll that both the other projects reference - then it is only defined once, and it will be happy.
Another option is to work with a contract-based serializer (rather than a type-based serializer). This means that "classes that look similar enough" are fine, even if they are in different assemblies (and perhaps have different source, as long as it is "similar enough"). Examples of suitable serializers for this would include (plus a few others) XmlSerializer, DataContractSerializer (but not NetDataContractSerializer), JavaScriptSerializer, or protobuf-net if you want dense raw binary.
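As a rough illustration of the contract-based option, the sketch below serializes one `Person` type and deserializes the XML into a different `Person` type with the same shape. The two namespaces stand in for the two assemblies; all names here (`AssemblyA`, `AssemblyB`, `Person`, `ContractRoundTrip`) are hypothetical and not taken from the question.

```csharp
using System.IO;
using System.Xml.Serialization;

namespace AssemblyA
{
    // Hypothetical type as declared on the serializing side.
    public class Person
    {
        public string Name { get; set; }
        public int Age { get; set; }
    }
}

namespace AssemblyB
{
    // A separate type with the same XML shape, as declared on the
    // deserializing side. The runtime treats it as unrelated to AssemblyA.Person.
    public class Person
    {
        public string Name { get; set; }
        public int Age { get; set; }
    }
}

public static class ContractRoundTrip
{
    // Serializes the sender's type to an XML string.
    public static string ToXml(AssemblyA.Person person)
    {
        var serializer = new XmlSerializer(typeof(AssemblyA.Person));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, person);
            return writer.ToString();
        }
    }

    // Reads the same XML into the receiver's type. Only the element names
    // have to match; no type or assembly name is stored in the XML.
    public static AssemblyB.Person FromXml(string xml)
    {
        var serializer = new XmlSerializer(typeof(AssemblyB.Person));
        using (var reader = new StringReader(xml))
        {
            return (AssemblyB.Person)serializer.Deserialize(reader);
        }
    }
}
```

This works because the XML carries element names rather than assembly-qualified type names, which is exactly what BinaryFormatter does not do.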
All the assemblies containing classes in the class hierarchy of the object you are deserializing must be present in the application in which you are performing this deserialization. They could be either explicitly referenced (if you need compile-time safety with those classes) or only placed in the bin folder of the application so that they could be resolved at runtime. If they are not explicitly referenced you will have to use reflection in order to read the values from the deserialized instance.

Xsd to object class

So, I'm trying to take an .xsd file (the fixed MusicXML standard), create an object class, use portions of it (specifically the note object), include it in a graph object, and then save both the graph object and a MusicXML-validated file.
All in all, the solutions I'm using have one or two massively breaking shortcomings.
Xsd2Code - Creates the file, but for some reason it makes an Items collection (of the type I need, ObservableCollection), and then an enumerable ItemsChoiceType[0-9] ObservableCollection. The problem with the enumerable is that after it generates, I have to either switch the latter to an Array or do mumbo-jumbo with the XML serialization attributes. It generates a 2 MB .cs file, so that is a lot of autogenerated code, and I would need a crapton of .extend.cs files to get it to fit. Maybe I have to change some switches for it to work? What switches fix this?
LinqToXsd / OpenLinqToXsd - Generates the file, hard codes it to reference a DLL file, then forces you to use List (no option to go to ObservableCollection), which doesn't have EditItem and can't be used for binding to WPF/XAML. Otherwise, a bunch more .extend.cs files.
Altova C# generator - Expensive, requires a bunch of their DLLs to include in the project, messy.
Long story short, has anyone used any of these systems successfully, and what did you have to do to shoehorn them? What kind of pain will I have to deal with beyond the issues I'm having?
I remember now for XSD.exe: XSD notation doesn't export, and individual classes (such as 'note') don't serialise out to XML. I would have to write out the entire thing from scorepartwise to every piece in between, which means I can't serialise a graph object that has 'note's as vertices.

Other types of .net configuration files

OK, so this is not the most useful question since I can't remember the feature in .net that does this. Basically, that's what I'm asking; what feature is this?
A year or so ago, I was working on a project and we used configuration files that mapped directly to a class using the specific attributes on the class members. This is not the standard app.config, but assemblyname.dll.xml instead.
Perhaps it's a feature within the unity framework? Just a stab in the dark.
It is not critical that I figure this out today, but it is weighing on my brain and it annoys me that I can't remember!
thanks!
It's not the standard XML config, but it is built into .NET. Basically, XML serialization allows you to project an XML document from a hydrated class instance, that will map 1:1 to the class it came from and can be used to re-hydrate a new instance of that class.
This can, in the majority of cases, be done without much effort on your part. All that's usually necessary for XML serialization to work is that the object must have a public default constructor, and that all the state information you want to serialize must be public and read-write. In a few cases, some attributes are necessary to define certain behaviors, like derived classes in arrays of their parent class, and defining non-default names for field and property element tags.
One of the major uses of this is for custom configuration files as you stated; you can load the configuration from a persistent state by simply deserializing the file into an instance of the configuration object.
Article: MSDN How To Serialize an Object
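The configuration-file use described above could look roughly like the following. This is a minimal sketch, not the asker's original setup: `AppSettings`, `SettingsFile`, and the element names are hypothetical, chosen only to show attributes that define non-default element names.

```csharp
using System.IO;
using System.Xml.Serialization;

// Hypothetical settings class. XmlSerializer needs a public parameterless
// constructor (implicit here) and public read-write members.
[XmlRoot("settings")]
public class AppSettings
{
    [XmlElement("serverUrl")]
    public string ServerUrl { get; set; }

    [XmlElement("retryCount")]
    public int RetryCount { get; set; }
}

// Hypothetical helper for loading and saving the settings file.
public static class SettingsFile
{
    // Writes the settings object to an XML file at the given path.
    public static void Save(AppSettings settings, string path)
    {
        var serializer = new XmlSerializer(typeof(AppSettings));
        using (var stream = File.Create(path))
        {
            serializer.Serialize(stream, settings);
        }
    }

    // Re-hydrates a settings object from the XML file at the given path.
    public static AppSettings Load(string path)
    {
        var serializer = new XmlSerializer(typeof(AppSettings));
        using (var stream = File.OpenRead(path))
        {
            return (AppSettings)serializer.Deserialize(stream);
        }
    }
}
```

A file named after the assembly, such as assemblyname.dll.xml, is then just a naming convention for the path passed to `Load`.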
This isn't part of the .NET BCL or Unity as far as I am aware. Perhaps it's some other third-party or open-source component? That being said, it wouldn't be too difficult to build something like this on your own using XmlSerialization.
.NET allows for multi-layered configuration.
Every machine has the machine.config file. Each application has the app.config file (which gets renamed to applicationname.exe.config upon building), but each DLL can also have its own config file. So, if I have the following binaries in my executable folder:
flexitris.exe
flexitrisHelpers.dll
thirdPartyContent.dll
each of them can have their own config file:
flexitris.exe.config
flexitrisHelpers.dll.config
thirdPartyContent.dll.config
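Only the executable's config file is loaded automatically at runtime. A .dll.config file has to be opened explicitly (for example with ConfigurationManager.OpenExeConfiguration, passing the path of the DLL), after which it is accessible through the normal System.Configuration namespace.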

Serialize in memory object with C#

I've got a program that picks up some code from script files and compiles it.
And it works fine.
The problem is that in the scripts I declare a couple of classes, and I want to serialize them.
Obviously the C# serializers (XML and binary) don't like serializing and deserializing objects defined in an in-memory assembly.
I would prefer not to leave the in-memory assembly, so I'm looking for another way of serializing. Failing that, is it possible to build the assembly in memory and then write it to a file?
You could always write your own ToXml function using reflection to write out your property data to a string. Then your object would deserialize itself.
Just a thought.
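A reflection-based ToXml along the lines suggested above might be sketched as follows. It is deliberately limited: it assumes a flat object whose type name is a valid XML element name and whose property values are simple types. `ReflectionXml` and `ScriptItem` are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Reflection;
using System.Xml.Linq;

// Hypothetical stand-in for a class declared in a compiled script.
public class ScriptItem
{
    public string Name { get; set; }
    public int Count { get; set; }
}

public static class ReflectionXml
{
    // Writes each public, readable, non-indexed instance property as a
    // child element named after the property. Nested objects and
    // collections are NOT handled; this is only a starting point.
    public static string ToXml(object instance)
    {
        if (instance == null) throw new ArgumentNullException(nameof(instance));

        Type type = instance.GetType();
        var root = new XElement(
            type.Name,
            type.GetProperties(BindingFlags.Public | BindingFlags.Instance)
                .Where(p => p.CanRead && p.GetIndexParameters().Length == 0)
                .Select(p => new XElement(p.Name, p.GetValue(instance, null))));

        return root.ToString(SaveOptions.DisableFormatting);
    }
}
```

Because it only needs the runtime `Type` of the instance, this approach does not care whether the declaring assembly lives on disk or only in memory.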
If you want to create assemblies dynamically look into IL emitting via reflection. Here is a good article to get you started.
So just to clarify, are you asking how you can serialize a type if it hasn't got the [Serializable] attribute applied?
One solution is to use the WCF Data Contract Serializer: http://msdn.microsoft.com/en-us/library/ms731923.aspx.
Obviously this will only work if you can target .NET 3.0 or higher.
Alternately you can implement an ISerializationSurrogate. Jeffrey Richter has a great introduction at http://msdn.microsoft.com/en-us/magazine/cc188950.aspx.
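For the Data Contract Serializer route, a minimal round trip could look like the sketch below. `Widget` and `DataContractRoundTrip` are hypothetical names. The type carries [DataContract] and [DataMember] rather than [Serializable], and it is assumed that these attributes can be applied in the script source.

```csharp
using System.IO;
using System.Runtime.Serialization;

// Hypothetical type: opted in through data contract attributes instead of
// [Serializable].
[DataContract(Name = "Widget", Namespace = "")]
public class Widget
{
    [DataMember]
    public string Label { get; set; }

    [DataMember]
    public int Size { get; set; }
}

public static class DataContractRoundTrip
{
    // Serializes the widget to a UTF-8 XML byte array.
    public static byte[] Serialize(Widget widget)
    {
        var serializer = new DataContractSerializer(typeof(Widget));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, widget);
            return stream.ToArray();
        }
    }

    // Reads a widget back from the bytes produced by Serialize.
    public static Widget Deserialize(byte[] data)
    {
        var serializer = new DataContractSerializer(typeof(Widget));
        using (var stream = new MemoryStream(data))
        {
            return (Widget)serializer.ReadObject(stream);
        }
    }
}
```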
I would avoid all built-in serialization whenever possible; both are badly broken. For example, XML serialization doesn't support dictionaries, and normal serialization/SOAP doesn't support generics. And both have versioning issues.
It is time-consuming, but creating ToXML and FromXML methods is probably the most effective way to go.
Have a look here for custom serialisers; it is a sample of XML-serializing a dictionary.
I'm slightly confused by the statement that the XmlSerializer can't serialize dynamically generated types. The XmlSerializer generates its own serialization code dynamically as well during construction, so there should be no issue with it serializing your type.
You may need to decorate your dynamic classes with the appropriate attributes, depending on what you are generating (like derived classes), but there shouldn't be any issue with using the XmlSerializer in the situation you described.
If you could post details about the issues the XmlSerializer is giving you I can help you work out what the problem is.
Also, I'm of the belief that auto-generating code is in general a blessing. All too often I have had to go back into a class to fix one or all of the copy/paste/save/load functions, just because someone forgot to update them when adding a new variable. Save/load code is boilerplate code. Let the computers write it.
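The attribute decoration mentioned above for derived classes can be sketched like this. The types (`Shape`, `Circle`, `Square`, `Drawing`) and the helper `DrawingXml` are hypothetical and only illustrate [XmlInclude]; they are not the asker's script classes.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// XmlInclude tells the serializer which derived types may appear where
// the base type is declared.
[XmlInclude(typeof(Circle))]
[XmlInclude(typeof(Square))]
public abstract class Shape
{
    public string Name { get; set; }
}

public class Circle : Shape
{
    public double Radius { get; set; }
}

public class Square : Shape
{
    public double Side { get; set; }
}

public class Drawing
{
    public Drawing()
    {
        Shapes = new List<Shape>();
    }

    public List<Shape> Shapes { get; set; }
}

public static class DrawingXml
{
    // Serializes a drawing, including its derived shapes, to an XML string.
    public static string Serialize(Drawing drawing)
    {
        var serializer = new XmlSerializer(typeof(Drawing));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, drawing);
            return writer.ToString();
        }
    }

    // Rebuilds the drawing; each shape comes back as its derived type.
    public static Drawing Deserialize(string xml)
    {
        var serializer = new XmlSerializer(typeof(Drawing));
        using (var reader = new StringReader(xml))
        {
            return (Drawing)serializer.Deserialize(reader);
        }
    }
}
```

Without the [XmlInclude] attributes, serializing a `Drawing` that contains a `Circle` fails at runtime because the serializer was never told about the derived type.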
