Validating XML documents with XSD correctly - c#

Although I have a good deal of experience consuming and producing XML, I have never really worked with schemas until now.
I've run into a well-documented "feature" that I consider more of a bug.
When using XDocument.Validate(), it seems there are cases in which the document is reported as valid even though it doesn't match the schema specified. Most likely this is a flaw in my understanding of the relationship between XSDs, XML namespaces, and the expected validation process.
Therefore I submit to you my XML sample, my XSD sample, and my validation code.
XML - this is INTENTIONALLY the wrong document.
<?xml version="1.0" encoding="utf-8" ?>
<SuppliesDefinitions
    xmlns="http://lavendersoftware.org/schemas/SteamGame/Data/Xml/Supplies.xsd">
  <Supply type="Common">
    <Information/>
    <Ritual/>
    <Weapon/>
    <Tool count="1"/>
    <Tool count="2"/>
    <Tool count="3"/>
  </Supply>
  <Supply type="Uncommon">
    <Information/>
    <Weapon/>
    <Tool count="1"/>
    <Tool count="2"/>
    <Tool count="3"/>
    <Tool count="4"/>
  </Supply>
  <Supply type="Rare">
    <Information/>
    <Rune/>
    <Weapon/>
    <Tool count="2"/>
    <Tool count="3"/>
    <Tool count="4"/>
  </Supply>
</SuppliesDefinitions>
The XSD used to validate it. (Again, this is intentionally the WRONG document for the above XML)
<?xml version="1.0" encoding="utf-8"?>
<xs:schema id="Encounters"
    targetNamespace="http://lavendersoftware.org/schemas/SteamGame/Data/Xml/Encounters.xsd"
    elementFormDefault="qualified"
    xmlns="http://lavendersoftware.org/schemas/SteamGame/Data/Xml/Encounters.xsd"
    xmlns:mstns="http://lavendersoftware.org/schemas/SteamGame/Data/Xml/Encounters.xsd"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
>
  <xs:complexType name="ToolType">
    <xs:attribute name="count" use="required" type="xs:int"/>
  </xs:complexType>
  <xs:complexType name="TaskType">
    <xs:choice maxOccurs="unbounded" minOccurs="1">
      <xs:element name="Weapon"/>
      <xs:element name="Information"/>
      <xs:element name="Tool" type="ToolType"/>
      <xs:element name="Ritual"/>
    </xs:choice>
  </xs:complexType>
  <xs:complexType name="EncounterType">
    <xs:sequence maxOccurs="unbounded" minOccurs="1">
      <xs:element name="Task" type="TaskType"/>
    </xs:sequence>
    <xs:attribute name="name" use="required" type="xs:string"/>
  </xs:complexType>
  <xs:element name="EncounterDefinitions">
    <xs:complexType>
      <xs:sequence maxOccurs="unbounded" minOccurs="1">
        <xs:element name="Encounter" type="EncounterType"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
And finally the validation code.
private static void ValidateDocument(XDocument doc)
{
    XmlSchemaSet schemas = new XmlSchemaSet();
    schemas.Add(null, XmlReader.Create(new StreamReader(XmlSchemaProvider.GetSchemaStream("Encounters.xsd"))));
    doc.Validate(schemas, (o, e) =>
    {
        //This is never hit!
        Console.WriteLine("{0}", e.Message);
        Assert.False(e.Severity == XmlSeverityType.Error);
    });
}
Can someone explain what I am doing wrong? I feel I'm making some incorrect assumptions about the way this SHOULD work. It seems to me that validating a completely unrelated XML document against an XSD should report it as invalid.

There are no nodes in your XML that the schema can validate, because the namespaces are different. As a result, no errors are reported. As far as I know, the behavior for nodes that do not match any schema is to allow anything.
You could also set the validation flags in XmlReaderSettings so that warnings are reported:
ReportValidationWarnings - Indicates that events should be reported if a validation warning occurs. A warning is typically issued when there is no DTD or XML Schema to validate a particular element or attribute against. The ValidationEventHandler is used for notification.
Check out XmlSchemaSet.Add and HOW TO: Validate an XML Document by Using Multiple Schemas if you expect nodes from multiple namespaces to be present in the XML.
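To make this kind of mismatch visible, one option is to validate through an XmlReader with ReportValidationWarnings enabled, and to check up front whether the schema set contains the root element's namespace at all. The following is only a minimal sketch under those assumptions; the class name StrictValidator and the decision to collect messages in a list are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Schema;

// Hypothetical helper: the name and shape are illustrative assumptions.
public static class StrictValidator
{
    // Returns every message raised during validation, including warnings
    // such as "no schema information found" for unmatched elements.
    public static List<string> Validate(XDocument doc, XmlSchemaSet schemas)
    {
        var problems = new List<string>();

        // Cheap pre-check: is any schema loaded for the root element's namespace?
        string rootNs = doc.Root.Name.NamespaceName;
        if (!schemas.Contains(rootNs))
        {
            problems.Add($"No schema loaded for root namespace '{rootNs}'.");
        }

        var settings = new XmlReaderSettings
        {
            ValidationType = ValidationType.Schema,
            Schemas = schemas,
        };
        // Add the warning flag without dropping the default flags.
        settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
        settings.ValidationEventHandler += (sender, e) =>
            problems.Add($"{e.Severity}: {e.Message}");

        // Wrap the in-memory document in a validating reader and read it to the end.
        using (XmlReader reader = XmlReader.Create(doc.CreateReader(), settings))
        {
            while (reader.Read()) { }
        }

        return problems;
    }
}
```

A caller can then treat a non-empty list as a failure, which turns "nothing matched the schema" into something a test can assert on.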

Related

How can I get all the XSD validation errors from an XML file in dotnet?

Caveat: I'm fairly new to .NET, so I'm unfamiliar with a lot of the libraries available.
I have developed a function app in Java that I now have to port to C#. I am currently using the built-in XSD validation library for .NET and my file is being validated successfully. However, when I compare the results against the Java version, the .NET version reports significantly fewer validation errors.
Root XSD file
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns="schema_namespace" xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="schema_namespace" elementFormDefault="qualified" attributeFormDefault="unqualified">
  <xs:include schemaLocation="included_xsd_1.xsd"/>
  <xs:include schemaLocation="included_xsd_2.xsd"/>
  <xs:include schemaLocation="included_xsd_3.xsd"/>
  <xs:include schemaLocation="included_xsd_4.xsd"/>
  <xs:include schemaLocation="included_xsd_5.xsd"/>
  <xs:element name="DEMDataSet">
    <xs:annotation>
      <xs:documentation>Root Tag</xs:documentation>
    </xs:annotation>
    <xs:complexType>
      <xs:sequence>
        <xs:element name="dCustomElement" type="dCustomElement" id="dElementSection" minOccurs="0">
          <xs:annotation>
            <xs:documentation>Contains information for custom elements.</xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element name="Report" id="ReportGroup" maxOccurs="unbounded">
          <xs:annotation>
            <xs:documentation>Container Tag to hold each instance of a record</xs:documentation>
          </xs:annotation>
          <xs:complexType>
            <xs:sequence>
              <xs:element name="from_included_xsd_1" type="type" id="typeSection">
                <xs:annotation>
                  <xs:documentation>Some Information</xs:documentation>
                </xs:annotation>
              </xs:element>
              <xs:element name="from_included_2" type="type" id="typeSection" minOccurs="0">
                <xs:annotation>
                  <xs:documentation>Some Information</xs:documentation>
                </xs:annotation>
              </xs:element>
              ...
            </xs:sequence>
            <xs:attribute name="timeStamp" type="DateTimeType" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
My issue appears to occur when an element is missing in one of my included XSD files: all subsequent errors in that XSD file are then not reported (e.g. if the initial element is missing, no other validation errors are reported for that XSD file, and validation moves on to the next included XSD).
My code is pretty much lifted straight from the Microsoft documentation:
public XsdValidationResult ValidateXml(string filePath)
{
    var schemas = GetSchemas();
    var settings = new XmlReaderSettings();
    settings.Schemas.Add(schemas);
    settings.ValidationType = ValidationType.Schema;
    settings.ValidationFlags = XmlSchemaValidationFlags.ReportValidationWarnings;
    settings.ValidationEventHandler += XsdValidationEventHandler;
    using var reader = XmlReader.Create(filePath, settings);
    var doc = new XmlDocument
    {
        PreserveWhitespace = true
    };
    doc.Load(reader);
    return GetValidationResult();
}

private void XsdValidationEventHandler(object sender, ValidationEventArgs e)
{
    if (e.Severity != XmlSeverityType.Error) return;
    var sb = new StringBuilder();
    sb.Append(e.Message);
    sb.Append($" Line Number: {e.Exception.LineNumber}");
    sb.Append($" Position: {e.Exception.LinePosition}");
    _errors.Add(sb.ToString());
}
The validation error I get looks something like this (before moving on to the next schema):
The element 'subSchema.elementGroup' in namespace 'namespace' has
invalid child element 'subSchema.element02' in namespace 'namespace'.
List of possible elements expected: 'subSchema.element03' in namespace
'namespace'.
I did come across this possible explanation of what is happening in the Microsoft docs:
When the new XmlReader has been closed, the original XmlReader will be
positioned on the EndElement node of the sub-tree. Thus, if you called
the ReadSubtree method on the start tag of the book element, after the
sub-tree has been read and the new XmlReader has been closed, the
original XmlReader is positioned on the end tag of the book element.
but this behaviour is different from what I need :(
Ultimately my question is, is there a way to continue parsing the XML file after an element has been discovered as missing (either using a different library or through the included .NET library) so as to capture all XSD validation errors?

How to validate Xml in that specific way

My problem is that I don't know how to write an XSD that validates XML in the following way:
I need a few required nodes (in any order), and any other nodes must also be allowed in the root.
For example, I need to validate XML like this, with 2 required nodes:
<root>
  <necessary1/>
  <someRandomNode1/>
  <necessary2/>
  <someRandomNode2/>
  <someRandomNode3/>
</root>
The nodes can appear in any order, and <xs:any/> is probably not what I'm looking for.
Edit:
'someRandomNodeX' is not the actual name of a node; it can be anything. The number of these unspecified nodes is unknown too.

There is a solution if the required elements and the non-required elements can be placed in different namespaces. It requires XML Schema 1.1.
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema
    xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified"
    xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning" vc:minVersion="1.1"
    xmlns:namespace="http://www.example.com/"
    targetNamespace="http://www.example.com/">
  <xs:element name="root">
    <xs:complexType>
      <xs:choice maxOccurs="unbounded">
        <xs:element name="necessary1" type="xs:string"/>
        <xs:element name="necessary2" type="xs:string"/>
        <xs:any namespace="##other" processContents="lax"/>
        <xs:any namespace="##local" processContents="lax"/>
      </xs:choice>
      <xs:assert test="exactly-one(namespace:necessary1) and exactly-one(namespace:necessary2)"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
This validates:
<?xml version="1.0" encoding="UTF-8"?>
<namespace:root
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.example.com/ test.xsd"
    xmlns:namespace="http://www.example.com/">
  <namespace:necessary1/>
  <someRandomNode1/>
  <namespace:necessary2/>
  <someRandomNode2/>
  <someRandomNode3/>
</namespace:root>
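If the document has to be checked from C#, keep in mind that the built-in System.Xml.Schema validator implements XSD 1.0 and does not evaluate xs:assert. In that case the "exactly one of each required element" rule could be checked in code instead. This is a rough sketch using LINQ to XML; the class and method names are hypothetical, and it only covers the counting rule, not the rest of the schema.

```csharp
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper: mirrors the exactly-one(...) assertion in plain C#.
public static class RequiredChildren
{
    // True when each required name occurs exactly once among the direct
    // children of root; any other children are ignored.
    public static bool HasExactlyOneOfEach(XElement root, params XName[] required)
    {
        return required.All(name => root.Elements(name).Count() == 1);
    }
}
```

For the sample document above, the required names would be built from the namespace, for example `XNamespace ns = "http://www.example.com/";` followed by `ns + "necessary1"` and `ns + "necessary2"`.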

XSD2Code classes require duplicate-named element containing collection of elements

Given XSD like:
<xs:complexType name="accident">
  <xs:sequence>
    <xs:element name="NAME" type="xs:string" />
    <xs:element name="DESCRIPTION" type="xs:string" />
    <xs:element name="CREATIONDATE" type="xs:dateTime" />
  </xs:sequence>
</xs:complexType>
<xs:element name="accidents">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="accident" type="accident" maxOccurs="unbounded" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
I expect XML like:
<?xml version="1.0" encoding="UTF-8"?>
<accidents>
  <accident>
    <NAME>Accident 123</NAME>
    <DESCRIPTION>Car crash</DESCRIPTION>
    <CREATIONDATE>2016-01-20T12:08:00+00:00</CREATIONDATE>
  </accident>
</accidents>
I used XSD2Code to generate C# classes so I can easily deserialize XML from a web service. But they weren't working right: they loaded a test XML like my example without errors, but the result contained zero accident elements.
So I decided to reverse the process:
accidents aa = new accidents();
accident a = new accident();
a.NAME = "test";
aa.accident.Add(a);
aa.SaveToFile("accidents.xml");
This emitted the following XML:
<?xml version="1.0"?>
<accidents xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <accident>
    <accident>
      <NAME>test</NAME>
      <CREATIONDATE>0001-01-01T00:00:00</CREATIONDATE>
    </accident>
  </accident>
</accidents>
If I attempt to deserialize that XML, it works just fine. But note that there is a nested accident element, which is not correct, and I have no idea why it would do this or what to do to fix it!
This seems to be a similar question but since it didn't get much attention and the XSD isn't included, I'm not sure: xsd2code creates extra nested collection when serializing lists

I'm a bit late on the scene for this one but here goes anyway !!
I have been using Xsd2Code myself for a while to take advantage of some cool features, but I have found it does have some annoying quirks. I agree that this issue you describe looks like a bug. However I have found that the issue disappears if your collection is itself a child element of another complex type. If you are happy for your "accidents" to exist as a property of a "report" for example, then you would alter your schema as follows:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:complexType name="accident">
    <xs:sequence>
      <xs:element name="NAME" type="xs:string" />
      <xs:element name="DESCRIPTION" type="xs:string" />
      <xs:element name="CREATIONDATE" type="xs:dateTime" />
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="accidents">
    <xs:sequence>
      <xs:element name="accident" type="accident" maxOccurs="unbounded" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="report">
    <xs:sequence>
      <xs:element name="accidents" type="accidents"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
When you run this through the Xsd2Code tool you will find that the generated code creates the accidents property of the report type as a list of accidents, and it will serialize in the way you would expect.
Your test code should look more like this:
report r = new report();
r.accidents = new List<accident>();
accident a = new accident();
a.NAME = "test";
r.accidents.Add(a);
r.SaveToFile("accidents.xml");
The dodgy accidents class is still generated, unfortunately, which could confuse other developers, but there is a way to prevent this.
First, put the accident and accidents complexType definitions in a file, accidents.xsd. Then put the report definition in report.xsd with an include statement referencing accidents.xsd. Only pass report.xsd through the Xsd2Code tool. The malformed accidents class will not appear in the generated code. This is just an illustrative example, of course; expand as required. In the absence of a fix, this has been a very good solution for me, and hopefully it will suit your needs.

You might start by specifying a target namespace and explicit qualification flags in your XSD. That is, convert your xsd to:
<?xml version="1.0" encoding="utf-8"?>
<xs:schema attributeFormDefault="unqualified"
    elementFormDefault="qualified"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://example.com/foo"
    xmlns="http://example.com/foo">
  <xs:complexType name="accident">
    <xs:sequence>
      <xs:element name="NAME" type="xs:string" />
      <xs:element name="DESCRIPTION" type="xs:string" />
      <xs:element name="CREATIONDATE" type="xs:dateTime" />
    </xs:sequence>
  </xs:complexType>
  <xs:element name="accidents">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="accident" type="accident" maxOccurs="unbounded" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
and your XML to:
<?xml version="1.0" encoding="UTF-8"?>
<accidents xmlns="http://example.com/foo"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://example.com/foo foo.xsd">
  <accident>
    <NAME>Accident 123</NAME>
    <DESCRIPTION>Car crash</DESCRIPTION>
    <CREATIONDATE>2016-01-20T12:08:00+00:00</CREATIONDATE>
  </accident>
</accidents>
(You will need to save your XSD file as foo.xsd for the above reference to work.)
I'm not sure if this will fix your XSD2Code issue, but I've used this header format with xsd.exe for lots of equivalent (and much more complex) code. It gets you Intellisense in your XML and might also be sufficient to get XSD2Code to behave properly.
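For comparison, the nested output in the question has the same shape XmlSerializer produces by default for a list property: a wrapper element named after the property, with one child element per item. A hand-written class that marks the list with [XmlElement] produces the flat shape instead. The sketch below is written by hand, not generated by Xsd2Code; the class names Accident, Accidents and AccidentsDemo and the property name Items are assumptions for illustration, and whether Xsd2Code can be made to emit this attribute is not something this example shows.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hand-written stand-ins for the generated classes (names are assumptions).
public class Accident
{
    public string NAME { get; set; }
    public string DESCRIPTION { get; set; }
    public DateTime CREATIONDATE { get; set; }
}

[XmlRoot("accidents")]
public class Accidents
{
    // [XmlElement] writes each item directly under <accidents>.
    // Without it, XmlSerializer wraps the list in an extra container element.
    [XmlElement("accident")]
    public List<Accident> Items { get; set; } = new List<Accident>();
}

public static class AccidentsDemo
{
    // Serializes the collection to an XML string.
    public static string Serialize(Accidents value)
    {
        var serializer = new XmlSerializer(typeof(Accidents));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, value);
            return writer.ToString();
        }
    }

    // Reads the flat <accidents><accident>...</accident></accidents> shape.
    public static Accidents Deserialize(string xml)
    {
        var serializer = new XmlSerializer(typeof(Accidents));
        using (var reader = new StringReader(xml))
        {
            return (Accidents)serializer.Deserialize(reader);
        }
    }
}
```

Deserializing the expected XML from the question with this class should yield one item in Items, which is a quick way to compare against what the generated classes do.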

XSD includes a XSD and in turn includes another

OK, I don't know if the title is specific enough, but I'm having a problem here.
I have an XSD called "A":
<?xml version="1.0" encoding="ISO-8859-1"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:include schemaLocation="ATypeIn.xsd"/>
  <xs:element name="A" type="ATypeIn"/>
</xs:schema>
As you can see, it includes this next XSD:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:include schemaLocation="file:////C:/Users/aaaaa/Documents/GerarClasses/Types.xsd"/>
  <xs:complexType name="ATypeIn">
    <xs:sequence>
      <xs:element name="Apol">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="R" type="RType"/>
            <xs:element name="NAp" type="NApType"/>
            <xs:element name="Ut"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
For xsd.exe to generate the class, it needs the third XSD, called "Types", which is included in the second.
The problem is that this "Types" XSD defines a lot of types, and the class generated when I call "xsd A.xsd /classes" includes all of those extra types as well.
Am I doing something wrong, or is it supposed to be like this and there is nothing I can do about it?
Thanks,
If I didn't explain myself well enough, please ask and I will try to explain better.
PS: Obviously I changed the names in the code, so if there is any "mistake" it's for this reason.

How do I validate two dates in XSD

I have two date elements in my XSD file.
For example
<xs:element type="xs:date" name="DateFrom"/>
<xs:element type="xs:date" name="DateTo"/>
Basically, I want to check that the number of days between DateFrom and DateTo doesn't exceed 7.
I can do this check in my C# XML validation routine, but I wondered whether I can do it in XSD as well, and if so, how?

In XSD 1.1 you can use assertions to check constraints like this; in XSD 1.0, you're out of luck.
[Addendum]: Another reader asks for a working example. Here is one.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    elementFormDefault="qualified">
  <xs:element name="DateRange">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="DateFrom" type="xs:date"/>
        <xs:element name="DateTo" type="xs:date"/>
      </xs:sequence>
      <xs:assert test="DateFrom lt DateTo"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
The schema described by this schema document accepts the following document.
<DateRange>
  <DateFrom>2011-01-01</DateFrom>
  <DateTo>2012-01-01</DateTo>
</DateRange>
It rejects the following document.
<DateRange>
  <DateFrom>2011-01-01</DateFrom>
  <DateTo>2010-01-01</DateTo>
</DateRange>
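The assertion above only checks ordering, and the .NET System.Xml.Schema validator implements XSD 1.0, so it does not evaluate xs:assert. If the 7-day limit from the question has to be enforced from C#, it could be done in the validation routine after schema validation. A minimal sketch, assuming a DateRange element shaped like the one above with plain yyyy-MM-dd dates and no time zone suffix; the class and method names are hypothetical.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical helper for the cross-field check that XSD 1.0 cannot express.
public static class DateRangeCheck
{
    // True when DateTo is on or after DateFrom and at most maxDays later.
    public static bool IsWithinDays(XElement range, int maxDays)
    {
        // XElement's explicit DateTime conversion parses xs:date values.
        var from = (DateTime)range.Element("DateFrom");
        var to = (DateTime)range.Element("DateTo");

        double days = (to - from).TotalDays;
        return days >= 0 && days <= maxDays;
    }
}
```

Calling `DateRangeCheck.IsWithinDays(doc.Root, 7)` on a loaded XDocument would then cover the limit asked about; whether equal dates should count as valid is a choice this sketch makes, not something the question specifies.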
