Partial XML file validation using XSD - c#

I am trying to use the XDocument class and XmlSchemaSet class to validate an XML file.
The XML file already exists but I want to add in just a single element consisting of a couple other elements and I only want to validate this node.
Here is an example of the XML file. The piece I would like to validate is the TestConfiguration node:
<?xml version="1.0" encoding="ISO-8859-1"?>
<Root>
<AppType>Test App</AppType>
<LabelMap>
<Label0>
<Title>Tests</Title>
<Indexes>1,2,3</Indexes>
</Label0>
</LabelMap>
<TestConfiguration>
<CalculateNumbers>true</CalculateNumbers>
<RoundToDecimalPoint>3</RoundToDecimalPoint>
</TestConfiguration>
</Root>
Here is my xsd so far:
<?xml version="1.0" encoding="utf-8"?>
<xs:schema id="TestConfiguration"
targetNamespace="MyApp_ConfigurationFiles" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="TestConfiguration">
<xs:complexType>
<xs:sequence>
<xs:element name="CalculateNumbers" type="xs:boolean" minOccurs="1" maxOccurs="1"/>
<xs:element name="RoundToDecimalPoint" type="xs:int" minOccurs="1" maxOccurs="1"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
Here is the code I use to validate it:
private bool ValidateXML(string xmlFile, string xsdFile)
{
string xsdFilePath = Path.Combine(Path.GetDirectoryName(Assembly.GetEntryAssembly().Location) ?? string.Empty, xsdFile);
Logger.Info("Validating XML file against XSD schema file.");
Logger.Info("XML: " + xmlFile);
Logger.Info("XSD: " + xsdFilePath);
try
{
XDocument xsdDocument = XDocument.Load(xsdFilePath);
XmlSchemaSet schemaSet = new XmlSchemaSet();
schemaSet.Add(XmlSchema.Read(new StringReader(xsdDocument.ToString()), this.XmlValidationEventHandler));
XDocument xmlDocument = XDocument.Load(xmlFile);
xmlDocument.Validate(schemaSet, this.XmlValidationEventHandler);
}
catch (Exception e)
{
Logger.Info("Error parsing XML file: " + xmlFile);
throw new Exception(e.Message);
}
Logger.Info("XML validated against XSD.");
return true;
}
Even when I validate the full XML file, validation passes. I then run into problems when I try to load the XML file into the class generated by xsd2code, which fails with the error: <Root xmlns=''> was not expected.
How can I validate just the TestConfiguration piece?
Thanks

You have a few issues here:
1. Validating the entire document succeeds when it should fail.
This happens because the root node is unknown to the schema, and encountering an unknown node is considered a validation warning, not a validation error, even if that unknown node is the root element. To enable warnings while validating, you need to set XmlSchemaValidationFlags.ReportValidationWarnings. However, there's no way to pass this flag to XDocument.Validate(). The question "XDocument.Validate is always successful" shows one way to work around this.
Having done this, you must also throw an exception in your validation handler when ValidationEventArgs.Severity == XmlSeverityType.Warning.
(As for requiring a certain root element in your XSD, this is apparently not possible.)
2. You need a convenient way to validate elements as well as documents, so you can validate your <TestConfiguration> piece.
3. Your XSD and XML are inconsistent.
Your XSD specifies that your elements are in the XML namespace MyApp_ConfigurationFiles in the line targetNamespace="MyApp_ConfigurationFiles" elementFormDefault="qualified". In fact, the XML elements shown in your question are not in any namespace.
If the XSD is correct, your XML root node needs to look like:
<Root xmlns="MyApp_ConfigurationFiles">
If the XML is correct, your XSD needs to look like:
<xs:schema id="TestConfiguration"
elementFormDefault="unqualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
After you have resolved the XSD and XML inconsistency from #3, you can solve issues #1 and #2 by introducing the following extension methods that validate both documents and elements:
public static class XNodeExtensions
{
public static void Validate(this XContainer node, XmlReaderSettings settings)
{
if (node == null)
throw new ArgumentNullException(nameof(node));
using (var innerReader = node.CreateReader())
using (var reader = XmlReader.Create(innerReader, settings))
{
while (reader.Read())
;
}
}
public static void Validate(this XContainer node, XmlSchemaSet schemaSet, XmlSchemaValidationFlags validationFlags, ValidationEventHandler validationEventHandler)
{
var settings = new XmlReaderSettings();
settings.ValidationType = ValidationType.Schema;
settings.ValidationFlags |= validationFlags;
if (validationEventHandler != null)
settings.ValidationEventHandler += validationEventHandler;
settings.Schemas = schemaSet;
node.Validate(settings);
}
}
Then, to validate the entire document, do:
try
{
var xsdDocument = XDocument.Load(xsdFilePath);
var schemaSet = new XmlSchemaSet();
using (var xsdReader = xsdDocument.CreateReader())
schemaSet.Add(XmlSchema.Read(xsdReader, this.XmlSchemaEventHandler));
var xmlDocument = XDocument.Load(xmlFile);
xmlDocument.Validate(schemaSet, XmlSchemaValidationFlags.ReportValidationWarnings, XmlValidationEventHandler);
}
catch (Exception e)
{
Logger.Info("Error parsing XML file: " + xmlFile);
throw new Exception(e.Message);
}
And to validate a specific node, you can use the same extension methods:
XNamespace elementNamespace = "MyApp_ConfigurationFiles";
var elementName = elementNamespace + "TestConfiguration";
try
{
var xsdDocument = XDocument.Load(xsdFilePath);
var schemaSet = new XmlSchemaSet();
using (var xsdReader = xsdDocument.CreateReader())
schemaSet.Add(XmlSchema.Read(xsdReader, this.XmlSchemaEventHandler));
var xmlDocument = XDocument.Load(xmlFile);
var element = xmlDocument.Root.Element(elementName);
element.Validate(schemaSet, XmlSchemaValidationFlags.ReportValidationWarnings, this.XmlValidationEventHandler);
}
catch (Exception e)
{
Logger.Info(string.Format("Error validating element {0} of XML file: {1}", elementName, xmlFile));
throw new Exception(e.Message);
}
Now validating the entire document fails while validating the {MyApp_ConfigurationFiles}TestConfiguration node succeeds, using the following validation event handlers:
void XmlSchemaEventHandler(object sender, ValidationEventArgs e)
{
if (e.Severity == XmlSeverityType.Error)
throw new XmlException(e.Message);
else if (e.Severity == XmlSeverityType.Warning)
Logger.Info(e.Message);
}
void XmlValidationEventHandler(object sender, ValidationEventArgs e)
{
if (e.Severity == XmlSeverityType.Error)
throw new XmlException(e.Message);
else if (e.Severity == XmlSeverityType.Warning)
throw new XmlException(e.Message);
}

Related

Read an existing xml file and apply changes

I'm trying to learn how to read an XML file and make changes using C#.
Problem:
The XML file already exists. I would like to read the file and search for a specific element which has another element with an attribute:
<Element>
<Element2 Attribute = "Value" />
</Element>
The problem is if this element "Element2" does not exist in the xml file I would like to create it under the same path.
Here is my actual code.
private void UpdateConfig(string configPath, string serverName)
{
string oldServername = null;
XElement config = XElement.Load(configPath);
CreateConfigBackup(configPath);
try
{
// ServerIP is null if the element does not exist in the XML file.
XElement serverIp = config.Element("WebGUI").Element("ServerIP");
if (serverIp != null)
{
oldServername = serverIp.Attribute("Value").Value;
oldServername = oldServername.Split(':').FirstOrDefault();
// Update the config
serverIp.Attribute("Value").SetValue(serverName);
}
else
{
XElement ELM = new XElement("ServerIP");
ELM.SetAttributeValue("Value",serverName);
config.Element("WebGUI").Add(ELM);
}
}
catch (Exception ex)
{
if (oldServername == null)
{
MessageBox.Show(configPath + " does not contain the Server element in the web configuration file.");
}
}
SaveConfig(config, configPath);
}
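The "create it if it does not exist" step described above can be sketched roughly as follows. This is only a minimal sketch using LINQ to XML; the ConfigUpdater class and SetServerIp method are hypothetical names, and the WebGUI/ServerIP layout is assumed from the code in the question.

```csharp
using System;
using System.Xml.Linq;

public static class ConfigUpdater
{
    // Sets WebGUI/ServerIP/@Value, creating the ServerIP element if it is missing.
    // Returns the previous value, or null when the element had to be created.
    public static string SetServerIp(XElement config, string serverName)
    {
        XElement webGui = config.Element("WebGUI");
        if (webGui == null)
            throw new InvalidOperationException("The configuration has no WebGUI element.");

        XElement serverIp = webGui.Element("ServerIP");
        if (serverIp == null)
        {
            // Element is missing: create it under the same parent.
            webGui.Add(new XElement("ServerIP", new XAttribute("Value", serverName)));
            return null;
        }

        // The cast yields null if the attribute itself is absent.
        string oldValue = (string)serverIp.Attribute("Value");
        serverIp.SetAttributeValue("Value", serverName);
        return oldValue;
    }

    public static void Main()
    {
        XElement config = XElement.Parse("<Config><WebGUI /></Config>");
        SetServerIp(config, "server01"); // creates the ServerIP element
        SetServerIp(config, "server02"); // updates the existing element
        Console.WriteLine(config);
    }
}
```

Checking for null explicitly avoids relying on a NullReferenceException to detect the missing element.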

Different types of XML validation

I am looking for a way to validate XML we are sent against its XSD. I have come across these three approaches, but only one seems to work. I'm guessing there is a reason one flags an issue where the others don't, but I'm wondering which method is best to use and how the three differ, apart from the way each is done.
XML
<?xml version="1.0" encoding="UTF-8"?>
<Person>
<Forename>John</Forename>
</Person>
XSD
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<xs:schema
xmlns:xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified" version="0.2">
<xs:annotation>
<xs:documentation>
</xs:documentation>
</xs:annotation>
<xs:element name ="Person">
<xs:complexType>
<xs:sequence>
<xs:element name="Forename" type="xs:string"/>
<xs:element name="Surname" type="xs:string"/>
<xs:element name="Middlename" type="xs:string" minOccurs="0"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
The first one flags an error that the Surname element is expected but isn't in the XML, which is what I would expect.
class XPathValidation
{
static void Main()
{
XmlSchemaSet schemas = new XmlSchemaSet();
schemas.Add("", XmlReader.Create(@"C:\test\test.xsd"));
XDocument doc = XDocument.Load(@"C:\test\test.xml");
Console.WriteLine("Validating doc1");
bool errors = false;
doc.Validate(schemas, (o, e) =>
{
Console.WriteLine("{0}", e.Message);
errors = true;
});
Console.WriteLine("doc1 {0}", errors ? "did not validate" : "validated");
Console.ReadKey();
}
}
These two both just run and return nothing.
class XmlSchemaSetExample
{
static void Main()
{
XmlReaderSettings booksSettings = new XmlReaderSettings();
booksSettings.Schemas.Add("http://www.w3.org/2001/XMLSchema", @"C:\test\test.xsd");
booksSettings.ValidationType = ValidationType.Schema;
booksSettings.ValidationEventHandler += new ValidationEventHandler(booksSettingsValidationEventHandler);
XmlReader books = XmlReader.Create(@"C:\test\test.xml", booksSettings);
while (books.Read()) { }
Console.ReadKey();
}
static void booksSettingsValidationEventHandler(object sender, ValidationEventArgs e)
{
if (e.Severity == XmlSeverityType.Warning)
{
Console.Write("WARNING: ");
Console.WriteLine(e.Message);
}
else if (e.Severity == XmlSeverityType.Error)
{
Console.Write("ERROR: ");
Console.WriteLine(e.Message);
}
}
}
and
class XPathValidation
{
static void Main()
{
try
{
XmlReaderSettings settings = new XmlReaderSettings();
settings.Schemas.Add("http://www.w3.org/2001/XMLSchema", @"C:\test\test.xsd");
settings.ValidationType = ValidationType.Schema;
XmlReader reader = XmlReader.Create(@"C:\test\test.xml", settings);
XmlDocument document = new XmlDocument();
document.Load(reader);
ValidationEventHandler eventHandler = new ValidationEventHandler(ValidationEventHandler);
// the following call to Validate succeeds.
document.Validate(eventHandler);
// the document will now fail to successfully validate
document.Validate(eventHandler);
Console.ReadKey();
}
catch (Exception ex)
{
Console.WriteLine(ex.Message);
}
}
static void ValidationEventHandler(object sender, ValidationEventArgs e)
{
switch (e.Severity)
{
case XmlSeverityType.Error:
Console.WriteLine("Error: {0}", e.Message);
break;
case XmlSeverityType.Warning:
Console.WriteLine("Warning {0}", e.Message);
break;
}
}
}
thanks for the info, still learning all this.
I would imagine the second two do not work because you're supplying an incorrect value for targetNamespace when you add the schema to your XmlReaderSettings. It should be an empty string, as your XML has no namespace, or null, which per the docs takes the namespace from the schema itself.
As to which is better, it depends on your requirement. If you simply need to validate the XML, option 2 using the XmlReader is preferred because it doesn't go to the expense of loading the entire XML into a DOM which you'd then throw away.
If you do need to query the XML using a DOM, the XDocument / LINQ to XML API (option 1) is a much better, more modern API than the old XmlDocument API (option 3).
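To illustrate the point about targetNamespace, here is a minimal sketch of the XmlReader approach with null passed as the namespace. It uses in-memory strings instead of the C:\test files so that it is self-contained; the ReaderValidation class and its Validate method are hypothetical names, and the schema is a trimmed copy of the one in the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class ReaderValidation
{
    public const string Xsd =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' elementFormDefault='qualified'>" +
        "<xs:element name='Person'><xs:complexType><xs:sequence>" +
        "<xs:element name='Forename' type='xs:string'/>" +
        "<xs:element name='Surname' type='xs:string'/>" +
        "<xs:element name='Middlename' type='xs:string' minOccurs='0'/>" +
        "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Streams through the XML once and returns every validation message raised.
    public static List<string> Validate(string xsd, string xml)
    {
        var messages = new List<string>();
        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };

        // null: use the target namespace declared in the schema (none in this case).
        using (var xsdReader = XmlReader.Create(new StringReader(xsd)))
            settings.Schemas.Add(null, xsdReader);

        settings.ValidationEventHandler += (sender, e) => messages.Add(e.Severity + ": " + e.Message);

        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { }
        }
        return messages;
    }

    public static void Main()
    {
        // Surname is missing, so at least one error should be reported.
        foreach (string message in Validate(Xsd, "<Person><Forename>John</Forename></Person>"))
            Console.WriteLine(message);
    }
}
```

Nothing is loaded into a DOM here; the reader raises validation events as it advances through the document.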

Problems adding Any-Element to dynamically created XSD-Schema

I just want to add an Any-element node to an existing XSD schema created by this code snippet:
private void CreateSchema()
{
//This function returns the XML Schema definition of a XML Element by using the Generation function of a Dataset
XmlSchemaInference x_scheme = new XmlSchemaInference();
this.XsDSchemaSet = x_scheme.InferSchema(this.myXmlElement.CreateReader());
this.XsDSchemaSet.Compile();
}
After I have created the XSD schema set, some parts have to be modified. The following code sets the minOccurs and maxOccurs attributes of existing elements, which works fine.
After modifying the attributes, I also have to add an Any element to the Items collection of the XmlSchemaSequence, as you can see a few lines above the end of the sample code. That does not work. While debugging I can see the element within the Items collection, but after reprocessing and recompiling the schema set, the modified attributes are set correctly while the generated Any element is missing, as you can see in the final result shown in the MessageBox. Could anybody help?
private bool ModifyXsdElement(XmlSchemaElement element, XElement myXmlElement)
{
// this function modifies the min and max occurrence of the child elements
XmlSchemaSimpleType simpleType = element.ElementSchemaType as XmlSchemaSimpleType;
if (simpleType != null)
{
MessageBox.Show("Function XsdModifyElement: Error - Simple Type!");
return false;
}
else
{
XmlSchemaComplexType complexType = element.ElementSchemaType as XmlSchemaComplexType;
if (complexType != null) //This is a complexType object
{
if (complexType.AttributeUses.Count > 0)
{
//todo: anything if there are attributes
}
bool typeMatch = false;
XmlSchemaSequence sequence = complexType.ContentTypeParticle as XmlSchemaSequence;
if (sequence != null)
{
typeMatch = true;
string fixedValue = string.Empty;
XmlSchemaElement el = new XmlSchemaElement();
foreach (XmlSchemaElement childElement in sequence.Items)
{
//MessageBox.Show("Child Element: " + childElement.Name);
int iOccCtr = GetNoOfXmlChildElements(childElement.Name, myXmlElement);
childElement.MinOccurs = iOccCtr;
childElement.MaxOccurs = iOccCtr;
childElement.MinOccursString = iOccCtr.ToString();
childElement.MaxOccursString = iOccCtr.ToString();
if (FixedValues.TryGetValue(childElement.Name.ToString(), out fixedValue))
childElement.FixedValue = fixedValue;
el = childElement;
}
//Add any element
XmlSchemaAny anyElement = new XmlSchemaAny();
anyElement.MinOccurs = 0;
anyElement.MaxOccurs = 1;
anyElement.ProcessContents = XmlSchemaContentProcessing.Lax;
anyElement.Parent = sequence;
sequence.Items.Add(anyElement);
}
}
}
return true;
}
The final result of the compiled Schema looks like that:
<?xml version=\"1.0\"?>
<xs:schema attributeFormDefault=\"unqualified\" elementFormDefault=\"unqualified\" xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">
<xs:element name=\"STEP\">
<xs:complexType>
<xs:sequence>
<xs:element minOccurs=\"1\" maxOccurs=\"1\" fixed=\"0002\" name=\"LFDNR\" type=\"xs:unsignedByte\" />
<xs:element minOccurs=\"1\" maxOccurs=\"1\" name=\"FUNKTIONSNUMMER\" />
<xs:element minOccurs=\"1\" maxOccurs=\"1\" fixed=\"Firmwareinformationen lesen\" name=\"FUNKTIONSNAME\" type=\"xs:string\" />
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
Thanks for your help!
Br Matthias
Your problem is due to your use of a post-compilation property. As the documentation puts it:
The particle for the content type. The post-compilation value of the ContentType particle.
In general, one hint when using .NET's SOM API is to look for properties that also have a setter. I say "hint" because some properties are both post-compilation and user configurable.
If your complex type's definition has an explicit content model (extension or restriction), then you need to use XmlSchemaComplexType.ContentModel. If it's an XmlSchemaComplexContent, navigate its Content property (one of XmlSchemaComplexContentRestriction or XmlSchemaComplexContentExtension); each of these types has a Particle property, which is the one you can modify.
Otherwise, if there's no content model, simply access the XmlSchemaComplexType.Particle.
ContentTypeParticle is a post-compiled property. Only some attributes like min-/max-Occurs can be modified. To add new nodes, like the any node in this case, the Particle-property must be modified. After reprocessing of the schema and recompilation of the SchemaSet the new element will be added to the post-compiled ContentTypeParticle.
Thanks to @Petru-Gardea
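As a rough sketch of the fix described above, the following adds the any node through the pre-compilation Particle property and then reprocesses and recompiles. It assumes a small hand-written schema with an inline complex type instead of the inferred one from the question; the AnyElementDemo class and AddAnyToStep method are hypothetical names.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class AnyElementDemo
{
    const string Xsd =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>" +
        "<xs:element name='STEP'><xs:complexType><xs:sequence>" +
        "<xs:element name='LFDNR' type='xs:unsignedByte'/>" +
        "<xs:element name='FUNKTIONSNAME' type='xs:string'/>" +
        "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Returns the schema text after an xs:any has been appended to STEP's sequence.
    public static string AddAnyToStep()
    {
        XmlSchema schema;
        using (var reader = XmlReader.Create(new StringReader(Xsd)))
            schema = XmlSchema.Read(reader, null);

        var schemaSet = new XmlSchemaSet();
        schema = schemaSet.Add(schema);
        schemaSet.Compile();

        // Navigate the pre-compilation model: Items -> SchemaType -> Particle.
        var step = (XmlSchemaElement)schema.Items[0];
        var complexType = (XmlSchemaComplexType)step.SchemaType;
        var sequence = (XmlSchemaSequence)complexType.Particle;

        sequence.Items.Add(new XmlSchemaAny
        {
            MinOccurs = 0,
            MaxOccurs = 1,
            ProcessContents = XmlSchemaContentProcessing.Lax
        });

        // Make the set pick up the modified schema, then recompile.
        schemaSet.Reprocess(schema);
        schemaSet.Compile();

        using (var writer = new StringWriter())
        {
            schema.Write(writer);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        Console.WriteLine(AddAnyToStep());
    }
}
```

For a schema produced by XmlSchemaInference the navigation may differ, so it is worth checking SchemaType and Particle for null before casting.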

C# read XML with DTD verification

I'm trying to read an XML file with DTD verification, but no matter how I do it, it seems like the program doesn't read my DTD file. I have reduced the problem to a small XML file and a small DTD file:
test.xml - Located at c:\test.xml
<?xml version="1.0"?>
<!DOCTYPE Product SYSTEM "test.dtd">
<Product ProductID="123">
<ProductName>Rugby jersey</ProductName>
</Product>
test.dtd - located at c:\test.dtd
<!ELEMENT Product (ProductName)>
<!ATTLIST Product ProductID CDATA #REQUIRED>
<!ELEMENT ProductName (#PCDATA)>
My C# program looks like this
namespace XML_to_csv_converter
{
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
}
private void Form1_Load(object sender, EventArgs e)
{
ReadXMLwithDTD();
}
public void ReadXMLwithDTD()
{
// Set the validation settings.
XmlReaderSettings settings = new XmlReaderSettings();
settings.ValidationType = ValidationType.DTD;
settings.DtdProcessing = DtdProcessing.Parse;
settings.ValidationEventHandler += new ValidationEventHandler(ValidationCallBack);
settings.IgnoreWhitespace = true;
// Create the XmlReader object.
XmlReader reader = XmlReader.Create("c:/test.xml", settings);
// Parse the file.
while (reader.Read())
{
System.Console.WriteLine("{0}, {1}: {2} ", reader.NodeType, reader.Name, reader.Value);
}
}
private static void ValidationCallBack(object sender, ValidationEventArgs e)
{
if (e.Severity == XmlSeverityType.Warning)
Console.WriteLine("Warning: Matching schema not found. No validation occurred." + e.Message);
else // Error
Console.WriteLine("Validation error: " + e.Message);
}
}
}
This results in the output:
XmlDeclaration, xml: version="1.0"
DocumentType, Product:
Validation error: The 'Product' element is not declared.
Element, Product:
Validation error: The 'ProductName' element is not declared.
Element, ProductName:
Text, : Rugby jersey
EndElement, ProductName:
EndElement, Product:
I have tried putting the files in different locations, and I have tried both relative and absolute paths. I also tried copying an example from the Microsoft website, and it resulted in the same problem. Does anyone have an idea of what the problem could be? Is there any way to see whether the program was able to load the DTD file?
I cannot comment, so I am adding this as an answer to supplement the correct answer by Jim:
// SET THE RESOLVER
settings.XmlResolver = new XmlUrlResolver();
This is a breaking change between .NET 4.5.1 and .NET 4.5.2 / .NET 4.6. Before, the resolver was set to XmlUrlResolver by default. I got stung by this.
You need to add the resolver.
XmlReaderSettings settings = new XmlReaderSettings();
// SET THE RESOLVER
settings.XmlResolver = new XmlUrlResolver();
settings.ValidationType = ValidationType.DTD;
settings.DtdProcessing = DtdProcessing.Parse;
settings.ValidationEventHandler += new ValidationEventHandler(ValidationCallBack);
settings.IgnoreWhitespace = true;
As long as the two files are in the same directory, this will work.
Alternatively, you need to provide a URL to the DTD.
XmlUrlResolver can also be overridden to provide additional semantics to the resolution process.
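As one possible illustration of overriding the resolver, the sketch below records every URI the reader asks for, which is one way to see whether the DTD was requested at all, and serves test.dtd from memory. LoggingDtdResolver and DtdValidationDemo are hypothetical names, and the file:///c:/test.xml base URI is only an assumption so that the relative test.dtd reference can be resolved without touching the disk.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Schema;

// Hypothetical resolver: logs each requested URI and answers test.dtd from a string.
public class LoggingDtdResolver : XmlUrlResolver
{
    private readonly string dtdText;
    public List<Uri> Requested { get; } = new List<Uri>();

    public LoggingDtdResolver(string dtdText) { this.dtdText = dtdText; }

    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        Requested.Add(absoluteUri);
        if (absoluteUri.AbsolutePath.EndsWith("test.dtd", StringComparison.OrdinalIgnoreCase))
            return new MemoryStream(Encoding.UTF8.GetBytes(dtdText));
        return base.GetEntity(absoluteUri, role, ofObjectToReturn);
    }
}

public static class DtdValidationDemo
{
    public const string Dtd =
        "<!ELEMENT Product (ProductName)>\n" +
        "<!ATTLIST Product ProductID CDATA #REQUIRED>\n" +
        "<!ELEMENT ProductName (#PCDATA)>";

    public const string Xml =
        "<?xml version='1.0'?><!DOCTYPE Product SYSTEM 'test.dtd'>" +
        "<Product ProductID='123'><ProductName>Rugby jersey</ProductName></Product>";

    // Validates the XML against its DTD and returns the validation messages.
    public static List<string> Validate(string xml, XmlResolver resolver)
    {
        var errors = new List<string>();
        var settings = new XmlReaderSettings
        {
            XmlResolver = resolver,
            ValidationType = ValidationType.DTD,
            DtdProcessing = DtdProcessing.Parse
        };
        settings.ValidationEventHandler += (sender, e) => errors.Add(e.Message);

        // The base URI (an assumption) lets the relative "test.dtd" be resolved.
        using (var reader = XmlReader.Create(new StringReader(xml), settings, "file:///c:/test.xml"))
        {
            while (reader.Read()) { }
        }
        return errors;
    }

    public static void Main()
    {
        var resolver = new LoggingDtdResolver(Dtd);
        List<string> errors = Validate(Xml, resolver);
        foreach (Uri uri in resolver.Requested) Console.WriteLine("Requested: " + uri);
        Console.WriteLine("Validation messages: " + errors.Count);
    }
}
```

If the Requested list stays empty, the reader never tried to load the DTD, which points at the resolver setting rather than the file location.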

Reading XML File with multiple NS

I am trying to read an XML feed to get the last post date. My xml looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
>
<channel>
<title>mysite</title>
<atom:link href="http://www.mysite.com/news/feed/" rel="self" type="application/rss+xml" />
<link>http://www.mysite.com/news</link>
<description>mysite</description>
<lastBuildDate>Tue, 22 Nov 2011 16:10:27 +0000</lastBuildDate>
<language>en</language>
<sy:updatePeriod>hourly</sy:updatePeriod>
<sy:updateFrequency>1</sy:updateFrequency>
<generator>http://wordpress.org/?v=3.0.4</generator>
<item>
<title>My first post!</title>
<link>http://www.mysite.com/news/2011/11/22/docstore-v2-released/</link>
<comments>http://www.mysite.com/news/2011/11/22/docstore-v2-released/#comments</comments>
<pubDate>Tue, 22 Nov 2011 16:10:27 +0000</pubDate>
<dc:creator>mysite</dc:creator>
<category><![CDATA[News]]></category>
<category><![CDATA[Promotions]]></category>
<category><![CDATA[docstore]]></category>
I didn't show all of the xml since it is rather long.
My method, so far, looks like this:
private void button1_Click(object sender, EventArgs e)
{
var XmlDoc = new XmlDocument();
// setup the XML namespace manager
var mgr = new XmlNamespaceManager(XmlDoc.NameTable);
// add the relevant namespaces to the XML namespace manager
mgr.AddNamespace("ns", "http://purl.org/rss/1.0/modules/content/");
var webClient = new WebClient();
var stream = new MemoryStream(webClient.DownloadData("http://www.mysite.com/news/feed/"));
XmlDoc.Load(stream);
// **USE** the XML namespace in your XPath !!
XmlElement NodePath = (XmlElement)XmlDoc.SelectSingleNode("/ns:Response");
while (NodePath != null)
{
foreach (XmlNode Xml_Node in NodePath)
{
Console.WriteLine(Xml_Node.Name + ": " + Xml_Node.InnerText);
}
}
}
I'm having a problem with it telling me:
Namespace Manager or XsltContext needed. This query has a prefix,
variable, or user-defined function.
All I want to pull out of this xml code is the 'lastBuildDate'. I'm going in circles trying to get this code right.
Can someone tell me what I am doing wrong here?
Thank you!
You're not using the namespace manager.
// **USE** the XML namespace in your XPath !!
XmlElement NodePath = (XmlElement)XmlDoc.SelectSingleNode("/ns:Response", mgr);
There is only one instance of the element you are after, so you can go directly to it using XPath. That element is also not in any namespace (the feed declares only prefixed namespaces), so you do not need to do anything special to get to it. What about:
private const string XPATH_BUILD_DATE = "/rss/channel/lastBuildDate";
private void button1_Click(object sender, EventArgs e){
var xmlDoc = new XmlDocument();
var webClient = new WebClient();
var stream = new MemoryStream(webClient.DownloadData("http://www.mysite.com/news/feed/"));
xmlDoc.Load(stream);
XmlElement xmlNode = (XmlElement)xmlDoc.SelectSingleNode(XPATH_BUILD_DATE);
Console.WriteLine(xmlNode.Name + ": " + xmlNode.InnerText);
}
If you did, however, need to dig into elements in a different namespace, you can also do that with XPath (for example, getting dc:creator):
/rss/channel/item[1]/*[local-name() = 'creator']
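As an alternative to local-name(), the namespace manager from the question can be used with a prefix mapped to the Dublin Core namespace. This is a minimal sketch on a trimmed, in-memory copy of the feed; FeedReader and Select are hypothetical names.

```csharp
using System;
using System.Xml;

public static class FeedReader
{
    public const string Feed =
        "<rss version='2.0' xmlns:dc='http://purl.org/dc/elements/1.1/'>" +
        "<channel>" +
        "<lastBuildDate>Tue, 22 Nov 2011 16:10:27 +0000</lastBuildDate>" +
        "<item><dc:creator>mysite</dc:creator></item>" +
        "</channel></rss>";

    // Returns the text of the first node matching the XPath, or null if none matches.
    public static string Select(string xml, string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // Map the prefix used in the XPath to the namespace URI used in the feed.
        var mgr = new XmlNamespaceManager(doc.NameTable);
        mgr.AddNamespace("dc", "http://purl.org/dc/elements/1.1/");

        XmlNode node = doc.SelectSingleNode(xpath, mgr);
        return node == null ? null : node.InnerText;
    }

    public static void Main()
    {
        Console.WriteLine(Select(Feed, "/rss/channel/lastBuildDate"));
        Console.WriteLine(Select(Feed, "/rss/channel/item[1]/dc:creator"));
    }
}
```

The prefix in the XPath only has to match the one registered with the manager; it does not need to be the same prefix the feed uses.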
