I generated C# classes based on XSD using the xsd.exe tool from the SDK. Then I can use that class to [de]serialize objects using XmlSerializer... However the serializer seems to be very forgiving.
Is it possible that I can make the serializer throw an exception in case there is missing property or a "strange" XML node?
I think one way is to modify the setter of the property and make it validate the data (or use XSD validation)... However is there any other alternative solution for this problem ?
You can implement the IXmlSerializable interface and in the ReadXml method implementation, check for the specific elements that you require, throwing exceptions when you don't find them (or setting whatever notification you need to).
If you want to use a schema for validation (to use the minOccurs and maxOccurs schema attributes, for example), then you can configure the XmlReader instance to validate against the schema by setting the Schemas property on the XmlReaderSettings class that you pass to the Create method (note there are overloads of Create which take a TextReader, etc.).
Related
I have a large complex XSD set.
I have C# classes generated from those XSDs using xsd.exe. Naturally, though the majority of properties in the generated classes are strings, many are decimals, DateTimes, enums or bools, just as they should be.
Now, I have some UNVALIDATED data that is structured in the correct XML format, but may well NOT be able to pass XSD validation, let alone be put into an instance of the relevant .Net object. For example, at this stage, for all we know the value for the element that should be a DateTime could be "ABC" - not even parseable as a DateTime - let alone other string elements respecting maxLength or regex pattern restrictions. This data is ready to be passed in to a rules engine that we already have to make everything valid, including defaulting things appropriately depending on other data items, etc.
I know how to use the various types in System.Xml to read the string value of a given element by name. Clearly I could just hand craft code to get out all the elements that exist today by name - but if the XSD changes, the code would need to be reworked. I'd like to be able to either directly read the XSD or use reflection on the generated classes (including attributes like [System.Xml.Serialization.XmlTypeAttribute(TypeName=...] where necessary) to find exactly how to recursively query the XML down to the the raw, unverified string version of any given element to pass through to the ruleset, and then after the rules have made something valid of it, either put it back into the strongly typed object or back into a copy of the XML for serialization into the object.
(It has occurred to me that an alternative approach would be to somehow automatically generate a 'stringly typed' version of the object - where there are not DateTimes etc; nothing but strings - and serialize the xml into that. I have even madly thought of taking the xsd.exe generated .cs file and search/replacing all the enums and base types that aren't strings to strings, but there has to be a better way.)
In other words, is there an existing generic way to pull the XElement or attribute value from some XML that would correspond to a given item in a .Net class if it were serialized, without actually serializing it?
Sorry to self-answer, and sorry for the lack of actual code in my answer, but I don't yet have the permission of my employer to share the actual code on this. Working on it, I'll update here when there is movement.
I was able to implement something I called a Tolerant XML Reader. Unlike most XML deserializing, it starts by using reflection to look at the structure of the required .Net type, and then attempts to find the relevant XElements and interpret them. Any extra elements are ignored (because they are never looked for), any elements not found are defaulted, and any elements found are further interpreted.
The main method signature, in C#, is as follows:
public static T TolerantDeserializeIntoType<T>(
XDocument doc,
out List<string> messagesList,
out bool isFromSuppliedData,
XmlSchemaSet schemas = null,
bool tolerant = true)
A typical call to it might look like this:
List<string> messagesList;
bool defaultOnly;
SomeType result = TolerantDeserializeIntoType<SomeType>(someXDocument, out messagesList, out defaultOnly);
(you may use var; I just explicitly put the type there for clarity in this example).
This will take any XDocument (so the only criteria of the original was that it was well-formed), and make an instance of the specified type (SomeType, in this example) from it.
Note that even if nothing at all in the XML is recognized, it will still not fail. The new instance will simply have all properties / public fields nulled or defaulted, the MessageList would list all the defaulting done, and the boolean out paramater would be FALSE.
The recursive method that does all the work has a similar signature, except it takes an XElement instead of an XDocument, and it does not take a schemaSet. (The present implementation also has an explicit bool to indicate a recursive call defaulting to false. This is a slightly dirty way to allow it to gather all failure messages up to the end before throwing an error if tolerant is false; in a future version I will refactor that to only expose publicly a version without that, if I even want to make the XElement version public at all):
public static T TolerantDeserializeXElementIntoType<T>(
ref XElement element,
ref List<string> messagesList,
out bool isFromSuppliedValue,
bool tolerant = true,
bool recursiveCall = false)
How it works, detail
Starting with the main call, the one with with an XDocument and optional SchemaSet:
If a schema Set that will compile is supplied (actually, it also looks for xsi:noNamespaceSchemaLocation as well) the initial XDocument and schemaSet call runs a standard XDocument.Validate() across the supplied XDocument, but this only collects any issued validation error callbacks. It won't throw an exception, and is done for only two reasons:
it will give some useful messages for the MessageList, and
it will populate the SchemaInfo of all XElements to
possibly use later in the XElement version.
(note, however, that the
schema is entirely optional. It is actually only used to resolve
some ambiguous situations where it can be unclear from the C#
object if a given XElement is mandatory or not.)
From there, the recursive XElement version is called on the root node and the supplied C# type.
I've made the code look for the style of C# objects generated by xsd.exe, though most basic structured objects using Properties and Fields would probably work even without the CustomAttributes that xsd.exe supplies, if the Xml elements are named the same as the properties and fields of the object.
The method looks for:
Arrays
Simple value types, explicitly:
String
Enum
Bool
then anything
else by using the relevant TryParse() method, found by reflection.
(note that nulls/xsi:nill='true' values also have to be specially
handled)
objects, recursively.
It also looks for a boolean 'xxxSpecified' in the object for each field or property 'xxx' that it finds, and sets it appropriately. This is how xsd.exe indicates an element being omitted from the XML in cases where null won't suffice.
That's the outline. As I said, I may be able to put actual code somewhere like GitHub in due course. Hope this is helpful to someone.
I know this is pretty silly but just wondered if anyone had a link or knows exactly what this code is doing on my page?
namespace com.gvinet.EblAdapter.ebl
{
[Serializable]
[DesignerCategory("code")]
[GeneratedCode("System.Xml", "4.0.30319.225")]
[DebuggerStepThrough]
[XmlType(Namespace = "http://addresshere")]
public class TSAPassenger
{
then here is all of the strings for the form like name, address and such
I am thinking it is trying to grab the XML file that was created from the Database but just want to make sure.
It is not. These are all just metadata attributes.
Serializeable - Use the standard XmlSerializer to take public properties and fields and convert to XML for transport using no customization to the format (like ISerializable would). It is usually only used when going out of process (remoting, Web Services, WCF, etc)
DesignerCategory - This can be used a number of ways. This one tends to be used by the property grid in visual studio as a way to organize sections.
GeneratedCode - The application generated it for you, utilizing the System.Xml namespace in version 4.0.
DebuggerStepThrough - If you are stepping through code (F11), by default, skip over anything here (don't step into a property getting for example).
XmlType - Part of the serializer that allows you to provide a specific namespace that is generated in the output.
The items here do not actually get anything, just describe certain aspects of how something may be loaded/handled.
Hope that makes sense.
The Serializable and XmlType attributes are instructing the XML serializer that the class can be serialized and the schema to use when doing so.
XmlType Attribute
Serializable Attribute
DesignerCategory("code") Attribute
DebuggerStepThrough Attribute
These are attributes - used for declarative programming - you can find more about declarative programming online. But here is the link to .net attribute hierarchy page to get you started: http://msdn.microsoft.com/en-us/library/aa311259(VS.71).aspx
Also, these pages may be helpful:
What are attributes: What are attributes in .NET?
Attributes in C#: http://www.codeproject.com/Articles/2933/Attributes-in-C
I need to transform quickly some object into an XML string. If my project would not been in Silverlight, I would simply use the [Serializable] attribute with [XmlElement] and [XmlAttribute]. Unfortunately, this is not available in Silverlight. I can't use DataContract because it does not give the control of if the property need to be attribute or element tag.
So, what is my other option? I can do manually the XML using Linq-To-Xml but is there anything else more fast?
Another option is to use System.Xml.Serialization.XmlSerializer .
In terms of performance, XmlWriter (fast, non-cached, forward-only) with self-implemented serialization is preferable.
I need to serialize dynamically loaded types' classes using XMLSerializer.
When using XML serializer non initialized values are not being serialized. I dont have control over the assemblies I am working with so can not use XML attributes for specifying default values on properties. So I think I need to set all properties and sub properties to their default values recursively and then serialize. ( Please let me know if there is any better way )
Followed this :
Activator.CreateInstance(propType);
but above line complains about not having a parameterless constructor for some types.
Tried this :
subObject = FormatterServices.GetUninitializedObject(propType);
but this one gives an error "value was invalid" with no inner exception.
Please let me know if you need any further information.
If the types in question don't have public parameterless constructors, you'll struggle. You can get around the attributes issue by using the constructor overload that accepts a XmlAttributeOverrides object, which you can use to fully configure the serializer including the default value (via XmlAttributes.XmlDefaultValue), but some things you can't do - and get around the constructor limitation is one of them.
What is the scenario here?
if you want xml, then I would introduce a DTO layer: some objects that look like the ones you're talking about, but are simple and under your control. Ideal for XmlSerializer. You then write code to map between the two
if you just want serialization (and xml is an implementation detail) then there are other serializers that may help. DataContractSerializer or protobuf-net, for example; either would be more versatile here.
I have a object that has a number of properties that aren't present in the xsd file. When doing XmlDocument.Validate is there a way I can tell it to ignore properties that are not present in the xsd and instead just ensure that the properties required by xsd are present in the xml document?
I can get around this by adding [XmlIgnore] attributes all over my class but I would rather accomplish this by convention rather then explicitly add attributes all throughout my object model.
I doubt there is. Personally I would create a separate DTO, as it sounds like you're trying to make one object serve two jobs. Another option would be to use the XmlSerializer ctor that allows you to specify attribs at runtime, but this is much more work than [XmlIgnore].
So if you just want it to work: [XmlIgnore]. If you want it to be "pure", create a second DTO model and translate between them.