Validate xml against dtd from string - c#

I have an XML file that refers to a local DTD file. The problem is that my files are compressed into a single file (I am using Unity3D, which packs all my text files into one binary). This question is not Unity3D-specific; it is useful for anyone who tries to load a DTD from a string.
As a workaround, I thought of loading the XML and the DTD separately and then adding the DTD to the XmlSchemas of my document, like so:
private void ReadConfig(string filePath)
{
// load the xml file
TextAsset text = (TextAsset)Resources.Load(filePath);
StringReader sr = new StringReader(text.text);
sr.Read(); // skip BOM, Unity3D catch!
// load the dtd file
TextAsset dtdAsset = (TextAsset)Resources.Load("Configs/configDtd");
XmlSchemaSet schemaSet = new XmlSchemaSet();
schemaSet.Add(...); // my dtd should be added into this schemaset somehow, but it's only a string and not a filepath.
XmlReaderSettings settings = new XmlReaderSettings() { ValidationType = ValidationType.DTD, ProhibitDtd = false, Schemas = schemaSet};
XmlReader r = XmlReader.Create(sr, settings);
XmlDocument doc = new XmlDocument();
doc.Load(r);
}
The XML starts like this, but the DTD cannot be found. That is not surprising, because the XML was loaded from a string rather than from a file.
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE Scene SYSTEM "configDtd.dtd">

XmlSchema has a Read method that takes a Stream and a ValidationEventHandler. If the DTD is a string, you could convert it to a stream:
System.Text.Encoding encode = System.Text.Encoding.UTF8;
MemoryStream ms = new MemoryStream(encode.GetBytes(myDTD));
Create the XmlSchema:
XmlSchema mySchema = XmlSchema.Read(ms, DTDValidation);
Add this schema to the XmlDocument containing the XML you are validating:
myXMLDocument.Schemas.Add(mySchema);
myXMLDocument.Schemas.Compile();
myXMLDocument.Validate(DTDValidation);
The DTDValidation() handler would contain the code that decides what to do if the XML is invalid.
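One caveat: XmlSchema.Read parses XML Schema (XSD) documents, so it may reject raw DTD text. A different approach, sketched here under assumptions, is to keep the DOCTYPE in the XML and give the reader a custom XmlResolver that returns the DTD string whenever the external subset is requested. StringDtdResolver and DtdValidator are hypothetical names, not part of .NET, and DtdProcessing assumes .NET 4.0 or later (older profiles use ProhibitDtd = false instead).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text;
using System.Xml;
using System.Xml.Schema;

// Hypothetical resolver: answers every external-entity request with one in-memory DTD.
public class StringDtdResolver : XmlResolver
{
    private readonly byte[] _dtdBytes;

    public StringDtdResolver(string dtdText)
    {
        _dtdBytes = Encoding.UTF8.GetBytes(dtdText);
    }

    // Nothing is fetched over the network, so credentials are ignored.
    public override ICredentials Credentials { set { } }

    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        // The URI taken from the DOCTYPE is ignored; each request gets a fresh stream.
        return new MemoryStream(_dtdBytes);
    }
}

public static class DtdValidator
{
    // Returns the validation messages; an empty list means the XML is valid.
    public static List<string> Validate(string xmlText, string dtdText)
    {
        var errors = new List<string>();
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Parse,
            ValidationType = ValidationType.DTD,
            XmlResolver = new StringDtdResolver(dtdText)
        };
        settings.ValidationEventHandler += (sender, e) => errors.Add(e.Message);

        // Reading to the end triggers DTD validation of the whole document.
        using (XmlReader reader = XmlReader.Create(new StringReader(xmlText), settings))
        {
            while (reader.Read()) { }
        }
        return errors;
    }
}
```

In the Unity3D case from the question, xmlText and dtdText would be the text of the two TextAsset objects. The resolver returns the same DTD for every request, so it only suits documents that reference a single external DTD.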

Related

Alternatives to XDocument and XmlDocument for loading xml files in C#?

I want to change an attribute inside an xml file using C#.
Here is a sample XML file
<?xml version="1.0" encoding="us-ascii"?>
<Client>
<Age>25</Age>
<Weight>50</Weight>
</Client>
I tried loading the XML file using both XmlDocument and XDocument. Both take a very long time (more than 5 minutes) to load.
Here is the code I am using to load the file:
string filePath = @"myFile.xml";
XmlDocument xmlData = new XmlDocument();
xmlData.Load(filePath);
As far as I can tell from searching, the problem is that XDocument and XmlDocument load all the DTDs referenced by the XML file, which is why it takes so long. Is there a workaround for this, or an alternative that lets me change an attribute without loading all the DTDs?
You can control how DTDs are cached, parsed or used for validation with XmlReaderSettings and still use XDocument.
If you can take the time to cache the DTDs and changing them isn't part of your test, you could take the hit once and cache them.
If that's too much time or they aren't available and they aren't needed for your tests, you could skip DTD processing.
using (var reader = XmlReader.Create(_,
new XmlReaderSettings
{
DtdProcessing = DtdProcessing.Ignore,
ValidationType = ValidationType.None,
//DtdProcessing = DtdProcessing.Parse,
//ValidationType = ValidationType.DTD,
XmlResolver = new XmlUrlResolver
{
CachePolicy = new RequestCachePolicy(RequestCacheLevel.CacheIfAvailable),
//CachePolicy = new RequestCachePolicy(RequestCacheLevel.NoCacheNoStore),
}
}))
{
var doc = XDocument.Load(reader);
//…
}
XmlReaderSettings has many other properties that sometimes come in handy.

C# Validate XML file using a DTD file without the DOCTYPE string

I am trying to write a C# class that validates an XML file using a DTD file located in another folder, one that is not at the relative location given in the DOCTYPE string. So far, my code looks like this:
var settings = new XmlReaderSettings();
settings.DtdProcessing = DtdProcessing.Parse;
settings.ValidationType = ValidationType.DTD;
settings.XmlResolver = new XmlUrlResolver();
settings.ValidationEventHandler += new ValidationEventHandler(IsLoaded);
using (var reader = XmlReader.Create(new StringReader(xmlString), settings))
{
while (reader.Read()) { }
reader.Close();
}
This works fine when the DTD file is loaded via the DOCTYPE string included in the XML file, but the DTD file itself must be kept in a folder relative to where the program is being executed. Is there a way to customize the XmlResolver class so that it fetches the DTD file from another location on my hard drive, for example from an absolute path that I pass in, instead of using the DOCTYPE string?
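One possible direction, sketched under assumptions, is to subclass XmlUrlResolver and override ResolveUri so that the file name from the DOCTYPE is looked up in a directory of your choosing. FixedDirectoryDtdResolver is a hypothetical name, and the sketch assumes the DOCTYPE is still present in the XML; only its path portion is ignored.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical resolver: looks up every external DTD by file name in one fixed directory.
public class FixedDirectoryDtdResolver : XmlUrlResolver
{
    private readonly string _dtdDirectory;

    public FixedDirectoryDtdResolver(string dtdDirectory)
    {
        _dtdDirectory = Path.GetFullPath(dtdDirectory);
    }

    public override Uri ResolveUri(Uri baseUri, string relativeUri)
    {
        // Keep only the file name from the DOCTYPE system identifier...
        string fileName = Path.GetFileName(relativeUri);
        // ...and resolve it against the configured directory instead of the base URI.
        return new Uri(Path.Combine(_dtdDirectory, fileName));
    }
}
```

With the code from the question, this would replace the plain resolver, for example settings.XmlResolver = new FixedDirectoryDtdResolver(dtdFolder); where dtdFolder is whatever absolute path holds the DTD files. Every external reference is redirected, so DTDs that themselves pull in other files from elsewhere would need extra handling.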

Encoding attribute from XML file Declaration turns to lower case when the file is created by a MemoryStream

I have to generate an XML file using "ISO-8859-1" encoding from my ASP.NET Web API application, but when the file is written through a MemoryStream the encoding attribute in the generated XML declaration comes out in lower case as "iso-8859-1".
This method generates an XML file from an object whose class was created from an XSD.
public static MemoryStream GenerateXml<T>(T entity) where T : class
{
XmlSerializerNamespaces ns = new XmlSerializerNamespaces();
//Add an empty namespace and empty value
ns.Add("", "");
var memoryStream = new MemoryStream();
var streamWriter = new StreamWriter(memoryStream, Encoding.GetEncoding("ISO-8859-1"));
var serializer = new XmlSerializer(typeof(T));
serializer.Serialize(streamWriter, entity, ns);
return memoryStream;
}
Then I need to use XDocument to remove the namespace prefixes from the XML elements (it is a requirement that all elements are named with their own tags only). So I had to do this:
public MemoryStream GenerateXmlOpening<T>(T entity) where T : class
{
var xmlMemStream = XmlHelper.GenerateXml(entity);
xmlMemStream.Position = 0;
XDocument doc = XDocument.Load(xmlMemStream, LoadOptions.PreserveWhitespace);
//Removes the namespace declaration as prefix on elements
doc.Descendants().Attributes().Where(a => a.IsNamespaceDeclaration).Remove();
//the memory stream retrieved from 'xmlMemStream' already has "iso-8859-1" in lowercase, so I'm trying to override it
doc.Declaration.Encoding = "ISO-8859-1";
MemoryStream stream = new MemoryStream();
// when I save the XDocument to the new MemoryStream, the encoding goes from "ISO-8859-1" to "iso-8859-1" again.
doc.Save(stream);
stream.Position = 0;
return stream;
}
This is the beginning of the returned generated XML file:
<?xml version="1.0" encoding="iso-8859-1"?>
... content
How it's supposed to be:
<?xml version="1.0" encoding="ISO-8859-1"?>
... content
P.S. I'm writing the XML using a MemoryStream because I have to write a .zip file and return a response containing all generated XML files within this zip. The zip generator receives a list of MemoryStreams.
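The lowercase name most likely comes from the XmlWriter behind XDocument.Save, which, as far as I can tell, writes the encoding's WebName rather than the string stored in doc.Declaration. XML parsers are expected to match encoding names case-insensitively, so the lowercase form is normally harmless. If the consumer insists on the uppercase label, one possible workaround is to suppress the writer's declaration and write it by hand. This is a sketch with a hypothetical helper name; it assumes an encoding without a byte order mark, such as ISO-8859-1.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

public static class XmlDeclarationHelper
{
    // Hypothetical helper: saves doc with a hand-written declaration so the
    // encoding label keeps exactly the casing passed in.
    public static MemoryStream SaveWithEncodingLabel(XDocument doc, string encodingLabel)
    {
        Encoding encoding = Encoding.GetEncoding(encodingLabel);
        var stream = new MemoryStream();

        // Write the declaration ourselves, using the label as given.
        byte[] declaration = encoding.GetBytes(
            "<?xml version=\"1.0\" encoding=\"" + encodingLabel + "\"?>" + Environment.NewLine);
        stream.Write(declaration, 0, declaration.Length);

        // Let the writer encode the content but skip its own declaration.
        var settings = new XmlWriterSettings
        {
            Encoding = encoding,
            OmitXmlDeclaration = true
        };
        using (XmlWriter writer = XmlWriter.Create(stream, settings))
        {
            doc.Save(writer);
        }

        stream.Position = 0;
        return stream;
    }
}
```

In GenerateXmlOpening this would stand in for the doc.Save(stream) call, for example return XmlDeclarationHelper.SaveWithEncodingLabel(doc, "ISO-8859-1");. The writer settings here do not indent, so the output layout follows whatever whitespace the document already contains.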

Trying to Load an XML File Upon Form Load, Getting Error

In C# I am trying to check whether an XML file exists and, if not, create the file and then add the XML declaration, a comment and a parent node.
When I try to load it, it gives me this error:
"The process cannot access the file 'C:\FileMoveResults\Applications.xml' because it is being used by another process."
I checked the task manager to ensure it wasn't open and sure enough there were no open applications of it. Any ideas of what's going on?
Here is the code I am using:
//check for the xml file
if (!File.Exists(GlobalVars.strXMLPath))
{
//create the xml file
File.Create(GlobalVars.strXMLPath);
//create the structure
XmlDocument doc = new XmlDocument();
doc.Load(GlobalVars.strXMLPath);
//create the xml declaration
XmlDeclaration xdec = doc.CreateXmlDeclaration("1.0", null, null);
//create the comment
XmlComment xcom = doc.CreateComment("This file contains all the apps, versions, source and destination paths.");
//create the application parent node
XmlNode newApp = doc.CreateElement("applications");
//save
doc.Save(GlobalVars.strXMLPath);
}
Here is the code I ended up using to fix this issue:
//check for the xml file
if (!File.Exists(GlobalVars.strXMLPath))
{
using (XmlWriter xWriter = XmlWriter.Create(GlobalVars.strXMLPath))
{
xWriter.WriteStartDocument();
xWriter.WriteComment("This file contains all the apps, versions, source and destination paths.");
xWriter.WriteStartElement("application");
xWriter.WriteFullEndElement();
xWriter.WriteEndDocument();
}
}
File.Create() returns a FileStream that locks the file until it's closed.
You don't need to call File.Create() at all; doc.Save() will create or overwrite the file.
I would suggest something like this:
string filePath = "C:/myFilePath";
XmlDocument doc = new XmlDocument();
if (System.IO.File.Exists(filePath))
{
doc.Load(filePath);
}
else
{
// Option 1: write a minimal document with XmlWriter
using (XmlWriter xWriter = XmlWriter.Create(filePath))
{
xWriter.WriteStartDocument();
xWriter.WriteStartElement("ElementName"); // element names cannot contain spaces
xWriter.WriteEndElement();
xWriter.WriteEndDocument();
}
// OR
// Option 2: build the document in memory and save it (use one option, not both)
XmlDeclaration xdec = doc.CreateXmlDeclaration("1.0", null, null);
doc.AppendChild(xdec);
XmlComment xcom = doc.CreateComment("This file contains all the apps, versions, source and destination paths.");
doc.AppendChild(xcom);
XmlNode newApp = doc.CreateElement("applications");
doc.AppendChild(newApp); // a document can have only one root element
newApp.AppendChild(doc.CreateElement("applications1"));
newApp.AppendChild(doc.CreateElement("applications2"));
doc.Save(filePath); // creates the file if it does not exist
}
The reason your code is currently failing is that File.Create creates the file and opens a stream to it, and you never use (or close) that stream on this line:
//create the xml file
File.Create(GlobalVars.strXMLPath);
If you did something like
//create the xml file
using(Stream fStream = File.Create(GlobalVars.strXMLPath)) { }
then you would not get the "file in use" exception.
As a side note, XmlDocument.Load will not create a file; it only works with an existing one.
You could create a stream, setting the FileMode to FileMode.Create and then use the stream to save the Xml to the path specified.
using (System.IO.Stream stream = new System.IO.FileStream(GlobalVars.strXMLPath, FileMode.Create))
{
XmlDocument doc = new XmlDocument();
...
doc.Save(stream);
}

Adding (Embedded Resource) Schema To XmlReaderSettings Instead Of Filename?

I am writing an application that parses an Xml file. I have the schema (.xsd) file which I use to validate the Xml before trying to deserialize it:
XmlReaderSettings settings = new XmlReaderSettings();
settings.Schemas.Add(null, "./xml/schemas/myschema.xsd");
settings.ValidationType = ValidationType.Schema;
XmlReader reader = XmlReader.Create(xmlFile, settings);
XmlDocument document = new XmlDocument();
document.Load(reader);
ValidationEventHandler eventHandler = new ValidationEventHandler(settings_ValidationEventHandler);
document.Validate(eventHandler);
Note that the parameter "./xml/schemas/myschema.xsd" is the path to the .xsd relative to program execution.
I don't want to use filenames/paths, instead I would rather compile the .xsd file as an embedded resource in my project (I have already added the .xsd file and set the Build Action to Embedded Resource).
My question is.... how do I add the Embedded Resource schema to the XmlReaderSettings schema list? There are 4 overloaded methods for settings.Schemas.Add but none of them take an embedded resource as an argument. They all take the path to the schema file.
I have used embedded resources in the past for dynamically setting label images so I am somewhat familiar with using embedded resources. Looking at my other code it looks like what I eventually end up with is a Stream that contains the content:
System.Reflection.Assembly myAssembly = System.Reflection.Assembly.GetExecutingAssembly();
Stream myStream = myAssembly.GetManifestResourceStream(resourceName);
I am assuming that the embedded .xsd will also be read in as a stream so this narrows down my question a bit. How do I add the schema to XmlReaderSettings when I have a reference to the stream containing the schema and not the filename?
You can use the Add() overload that takes an XmlReader as its second parameter:
Assembly myAssembly = Assembly.GetExecutingAssembly();
using (Stream schemaStream = myAssembly.GetManifestResourceStream(resourceName)) {
using (XmlReader schemaReader = XmlReader.Create(schemaStream)) {
settings.Schemas.Add(null, schemaReader);
}
}
Or you can load the schema first and then add it:
Assembly myAssembly = Assembly.GetExecutingAssembly();
using (Stream schemaStream = myAssembly.GetManifestResourceStream(resourceName)) {
XmlSchema schema = XmlSchema.Read(schemaStream, null);
settings.Schemas.Add(schema);
}
