XML Deserialization Error - Processing instructions and DTDs are not supported - c#

I have been given an XML file and an XSD file. I am trying to validate the XML against the XSD and then, using serialization, load the XML into an object.
I have the validation working as expected but when I try to DeserializeDocToObj I get the following error.
There was an error deserializing the object of type
Aaa.Bbb.Common.DataTypes.SurveyGroup. Processing instructions
(other than the XML declaration) and DTDs are not supported.
Line 1, position 2.
I have no idea what this means and all I have read is not really helping.
The header in the XSD:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns="http://www.mydomain.co.uk/srm/mscc"
targetNamespace="http://www.mydomain.co.uk/srm/mscc"
elementFormDefault="qualified"
attributeFormDefault="unqualified">
<xs:element name="SurveyGroup">
The header in the XML
<?xml version="1.0" encoding="utf-8" ?>
<?xml-stylesheet type="text/xsl" href="mscc4_cctv.xsl"?>
<SurveyGroup xmlns="http://www.mydomain.co.uk/srm/mscc"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="
http://www.mydomain.co.uk/srm/mscc
http://www.mydomain.co.uk/srm/schemas/mscc4_cctv.xsd">
<Survey>
Deserialization Code:
public T DeserializeDocToObj(string fileLocation)
{
T returnObj;
using (FileStream reader = new FileStream(fileLocation, FileMode.Open, FileAccess.Read))
{
DataContractSerializer ser = new DataContractSerializer(typeof(T));
returnObj = (T)ser.ReadObject(reader);
}
return returnObj;
}
Any help greatly appreciated

Create an XmlReader with the correct XmlReaderSettings and call DataContractSerializer.ReadObject(XmlReader) instead of DataContractSerializer.ReadObject(Stream):
using (var reader = XmlReader.Create(fileName, new XmlReaderSettings { IgnoreProcessingInstructions = true }))
{
var serializer = new DataContractSerializer(typeof(T));
return (T)serializer.ReadObject(reader);
}
The XmlReader used by DataContractSerializer.ReadObject(Stream) does not ignore processing instructions. DataContractSerializer.ReadObject(Stream) calls XmlDictionaryReader.CreateTextReader (see the source), which creates an XmlUTF8TextReader (see the source), which does not accept XmlReaderSettings.
Apparently the default behaviour is to reject any processing instruction other than the XML declaration. And the string <?xml-stylesheet type="text/xsl" href="mscc4_cctv.xsl"?> is a processing instruction, as C.M. Sperberg-McQueen states.
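Since the question validates against the XSD and then deserializes, both steps can probably be folded into the same XmlReader. The following is only a sketch under assumptions: the helper name ValidateAndDeserialize and its parameters are hypothetical, and no ValidationEventHandler is attached, so the reader throws on the first schema error (the exception may surface wrapped in a SerializationException).

```csharp
using System.Runtime.Serialization;
using System.Xml;

public static class SurveyLoader
{
    // Hypothetical helper: validates against the XSD while deserializing.
    // targetNamespace would be "http://www.mydomain.co.uk/srm/mscc" for the
    // schema shown in the question.
    public static T ValidateAndDeserialize<T>(string xmlPath, string xsdPath, string targetNamespace)
    {
        var settings = new XmlReaderSettings
        {
            // Skip <?xml-stylesheet ...?> instead of passing it to the serializer.
            IgnoreProcessingInstructions = true,
            // Validate against the schemas added to settings.Schemas below.
            ValidationType = ValidationType.Schema
        };
        settings.Schemas.Add(targetNamespace, xsdPath);

        // With no ValidationEventHandler attached, the first validation
        // error is thrown as an exception while the document is being read.
        using (var reader = XmlReader.Create(xmlPath, settings))
        {
            var serializer = new DataContractSerializer(typeof(T));
            return (T)serializer.ReadObject(reader);
        }
    }
}
```

If validation is already done in a separate pass, the shorter snippet above is enough; this variant just avoids reading the file twice.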

The string <?xml-stylesheet type="text/xsl" href="mscc4_cctv.xsl"?> is a processing instruction. Your software is telling you it cannot handle processing instructions in its input. This means that your software appears not to be an XML parser; you need either to restrict your input to the subset of XML it can handle, or get a real parser.

Related

Parsing wrongly formatted XML-SOAP with C#

I have a malformed XML (SOAP) file that I need to parse. The issue is that the XML doesn't have proper header tags.
I've tried to parse the file with XDocument and XmlDocument, but neither worked. The XML starts at line 30, so maybe there is some way to skip those lines before the file is read by the XML parser?
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:eb="http://www.oasis-open.org/committees/ebxml-msg/schema/msg-header-2_0.xsd">
<SOAP-ENV:Header>
</SOAP-ENV:Header>
<SOAP-ENV:Body>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="Finvoice.xsl"?>
<GGVersion="2.01" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="a.xsd">
XmlReaderSettings settings = new XmlReaderSettings();
settings.ConformanceLevel = ConformanceLevel.Fragment;
XmlReader r = XmlReader.Create(file.FullName, settings);
XmlDocument xDoc = new XmlDocument();
xDoc.PreserveWhitespace = true;
xDoc.LoadXml("<xml/>");
xDoc.DocumentElement.CreateNavigator().AppendChild(r);
XmlNamespaceManager manager = new XmlNamespaceManager(xDoc.NameTable);
When I try to parse it I get: Unexpected xml declaration. The xml declaration must be the first node in the document ....
If I understand you correctly, the data you are looking for starts after the SOAP envelope. There is no garbage or unnecessary content after the data you are looking for.
The SOAP header does not start with the XML declaration (<?xml version=, etc).
Looking for the start of the document
A simple solution is to find the start of the XML document (the data you are looking for), and chop away everything before that.
var startOfRealDocumentMarker = "<?xml version=\"1.0\"";
var startIndex = dirtyXmlString.IndexOf(startOfRealDocumentMarker);
if(startIndex == -1) {
throw new Exception("Start of XML not found. Now what?");
}
var cleanXmlString = dirtyXmlString.Substring(startIndex);
If the SOAP header also has an XML declaration, you could look for the end-tag of the SOAP envelope instead. Or you could start looking for the declaration at the 2nd character, so you would skip over the first one.
This is obviously not a fool-proof solution that will work in every case. But maybe it will work in all of your cases?
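The "start looking at the 2nd character" variant mentioned above could look roughly like this. It is a sketch, not a tested solution for the asker's files: the class and method names are hypothetical, and it assumes the real document's declaration begins with the literal text `<?xml version=`.

```csharp
using System;

public static class XmlChopper
{
    // Hypothetical helper: returns the input starting at the first XML
    // declaration found after index 0, so a declaration at the very start
    // of the input (e.g. one belonging to the SOAP part) is skipped.
    public static string SkipToLaterDeclaration(string dirtyXmlString)
    {
        const string marker = "<?xml version=";

        if (dirtyXmlString == null || dirtyXmlString.Length < 2)
        {
            throw new FormatException("Input is too short to contain a second declaration.");
        }

        // Searching from index 1 means a match at index 0 is never returned.
        int startIndex = dirtyXmlString.IndexOf(marker, 1, StringComparison.Ordinal);
        if (startIndex == -1)
        {
            throw new FormatException("No XML declaration found after the first character.");
        }

        return dirtyXmlString.Substring(startIndex);
    }
}
```

The marker includes the space after `<?xml`, so the `<?xml-stylesheet ...?>` processing instruction is not mistaken for a declaration.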
Skipping lines
If you're sure it will work to always start reading from line 30 of the input file, you can use this method instead.
XmlDocument xDoc = new XmlDocument();
using (var rdr = new StreamReader(pathToXmlFile))
{
// Skip until reader is positioned at start of line 30
for (var i = 0; i < 29; ++i)
{
rdr.ReadLine();
}
// Load document from current position of reader
xDoc.Load(rdr);
}

How to add xsi:noNamespaceSchemaLocation to Serializer

I build an XML document which needs to be validated against an XSD file. Thus I need a reference to the XSD file in the root element of the XML. So far I use this C# code:
var ser = new XmlSerializer(typeof(myspecialtype));
XmlSerializerNamespaces MainNamespace = new XmlSerializerNamespaces();
MainNamespace.Add("xlink", "http://www.w3.org/1999/xlink");
MainNamespace.Add("xsi", "http://www.w3.org/2001/XMLSchema-instance");
using (XmlWriter w = XmlWriter.Create(@"C:\myxmlfile.xml"))
{
w.WriteProcessingInstruction("xml-stylesheet", "type=\"text/xsl\" href=\"utils/somexsl.xsl\"");
ser.Serialize(w, LeBigObject, MainNamespace);
}
The resulting Xml begins like this:
<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="utils/somexsl.xsl"?>
<caddy-xml xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink" xmlVersion="03.07.00">
but I need this:
<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="utils/somexsl.xsl"?>
<caddy-xml xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink" xmlVersion="03.07.00" xsi:noNamespaceSchemaLocation="utils/theveryimportant.xsd">
I came across "CreateAttribute" here: Add Namespace to an xml root node c# but I can't put it together with the Serializer. Thank you!
I was pointed to the solution here:
https://social.msdn.microsoft.com/Forums/en-US/e43585c6-181b-4449-8806-b07f82681a2a/how-to-include-xsinonamespaceschemalocation-in-the-xml?forum=asmxandxml
I added this to my class:
[XmlAttribute("noNamespaceSchemaLocation", Namespace = XmlSchema.InstanceNamespace)]
public string attr = "utils/theveryimportant.xsd";
and it works.
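For context, here is a minimal self-contained sketch of how that attribute fits into a serializable class. The class name CaddyXml and its XmlVersion field are hypothetical stand-ins for the asker's type (only the root element name, the xmlVersion attribute and the schema path are taken from the question).

```csharp
using System;
using System.IO;
using System.Xml.Schema;
using System.Xml.Serialization;

// Hypothetical stand-in for the asker's root type.
[XmlRoot("caddy-xml")]
public class CaddyXml
{
    [XmlAttribute("xmlVersion")]
    public string XmlVersion = "03.07.00";

    // Serialized as an attribute in the XMLSchema-instance namespace,
    // i.e. xsi:noNamespaceSchemaLocation once the xsi prefix is registered.
    [XmlAttribute("noNamespaceSchemaLocation", Namespace = XmlSchema.InstanceNamespace)]
    public string SchemaLocation = "utils/theveryimportant.xsd";
}

public static class Program
{
    public static void Main()
    {
        // Register the xsi prefix so the attribute is written with it.
        var namespaces = new XmlSerializerNamespaces();
        namespaces.Add("xsi", XmlSchema.InstanceNamespace);

        var serializer = new XmlSerializer(typeof(CaddyXml));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, new CaddyXml(), namespaces);
            Console.WriteLine(writer.ToString());
        }
    }
}
```

Writing to a StringWriter here is only for inspection; the XmlWriter from the question can be passed to Serialize in the same way.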

Exception in Deserialize xml to c# object

In my application, I have serialized a C# object to XML and passed the XML to an API to generate data. I got the expected response XML in return, shown below:
<?xml version="1.0" encoding="utf-8"?>
<SaveLockResponse xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.easy2access.no/webservice/types">
<Data xmlns:d2p1="http://schemas.microsoft.com/2003/10/Serialization/Arrays" i:nil="true"/>
<Header>New lock was created</Header>
<Message>A lock with serialnumber [23-215-038-028476] was successfully created for customer [28242].</Message>
<Status>Success</Status>
<Lock xmlns:d2p1="http://schemas.datacontract.org/2004/07/Easy2Access.Engine.Engine.Types">
<d2p1:CustomerNumber>28242</d2p1:CustomerNumber>
<d2p1:Description>String</d2p1:Description>
<d2p1:G3LockId>0</d2p1:G3LockId>
<d2p1:LockId>28158</d2p1:LockId>
<d2p1:LockType>G2</d2p1:LockType>
<d2p1:MultiCode>String</d2p1:MultiCode>
<d2p1:OnetimeCode>String</d2p1:OnetimeCode>
<d2p1:SerialNumber>23-33-44-02846</d2p1:SerialNumber>
</Lock>
</SaveLockResponse>
Now I want to convert this back to a C# object, and I use the code below:
public static T DeserializeFromXml<T>(string xml)
{
T result;
XmlSerializer ser = new XmlSerializer(typeof(T));
using (TextReader tr = new StringReader(xml))
{
result = (T)ser.Deserialize(tr);
}
return result;
}
When I call this method I get the following error:
There is an error in XML document (1, 40).
The inner exception is:
{"http://www.easy2access.no/webservice/types'> was not expected."}
Any suggestions are most welcome!
Regards
Sangeetha

XmlReader : data invalid exception

I'm trying to read an XML file and I get an XmlException : "data at the root level is invalid. Line 1, position 1".
Here is the content of the XML file :
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<root>
<Materials override="TRUE">
<Material name="" diffuse="" />
</Materials>
</root>
And here is my code :
using (FileStream fstr = File.OpenRead(sFullPath))
{
XmlReaderSettings settings = new XmlReaderSettings();
settings.ConformanceLevel = ConformanceLevel.Document;
fstr.Position = 0;
using (XmlReader xmlReader = XmlReader.Create(fstr, settings))
{
while (xmlReader.Read())
{
}
}
}
The exception is raised by the call to Read().
I've been searching for an answer on different sites, had a look at the MSDN too, but can't solve my problem.
My code is taken from http://www.codeproject.com/Articles/318876/Using-the-XmlReader-class-with-Csharp but I tried different snippets too.
I also checked the encoding of my file in Notepad++ and tried both UTF-8 and UTF-8 without BOM; it made no difference.
I've been stuck on this for a couple of days and I'm running out of ideas.
Thanks for your help!
Edit : removed the "..." in the snippet to avoid confusing people. I also did a try with :
using (XmlTextReader xmlReader = new XmlTextReader(fstr))
and it appears that xmlReader.Encoding returns null, even though my file is encoded as UTF-8.

C# check if XML file is not corrupted before deserialization

I couldn't find anything about my problem on the internet.
I deserialize data for playlists.
Here is my code:
using (var fs = new FileStream("playlist.xml", FileMode.OpenOrCreate))
{
XmlSerializer xml = new XmlSerializer(typeof(ObservableCollection<Playlist>));
if (fs.Length > 0)
pl = (ObservableCollection<Playlist>)xml.Deserialize(fs);
else
pl = new ObservableCollection<Playlist>();
}
Here is the result XML :
<?xml version="1.0"?>
<ArrayOfPlaylist xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Playlist>
<Name>Playlist1</Name>
<List>
<Media>
<path>C:\Users\Tchiko\Videos\Suit Tie (Official Lyric Video).mp4</path>
<name>Suit Tie (Official Lyric Video).mp4</name>
<type>Video</type>
</Media>
</List>
</Playlist>
<Playlist>
<Name>Hip hop</Name>
<List>
<Media>
<path>C:\Users\Tchiko\Videos\Suit Tie (Official Lyric Video).mp4</path>
<name>Suit Tie (Official Lyric Video).mp4</name>
<type>Video</type>
</Media>
</List>
</Playlist>
</ArrayOfPlaylist>
Before loading my playlist, I want to check whether a user has corrupted the file by hand.
I need to check that the XML is well-formed, in order to avoid problems after deserialization.
EDIT:
Version that avoids an error when the file is not well-formed:
using (var fs = new FileStream("playlist.xml", FileMode.OpenOrCreate))
{
try
{
XmlSerializer xml = new XmlSerializer(typeof(ObservableCollection<Playlist>));
if (fs.Length > 0)
pl = (ObservableCollection<Playlist>)xml.Deserialize(fs);
else
pl = new ObservableCollection<Playlist>();
}
catch (Exception ex)
{
pl = new ObservableCollection<Playlist>();
}
}
Thanks for your help
To ensure XML validity you'll need to define an XML Schema. An XML Schema declares what tags, in what order and with what type of values are allowed in your XML.
Here is an article about how to validate XML against a Schema.
If your XML is not well-formed (as in, the user didn't close a tag or something of the sort), deserialization will fail and you'll get an InvalidOperationException with more details in the InnerException. See XmlSerializer.Deserialize() on MSDN.
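If only well-formedness needs to be checked before deserializing, one option is to read the file through once with a plain XmlReader. This is a sketch with a hypothetical helper name (IsWellFormed); it says nothing about whether the content matches the Playlist type, which is what a schema would check.

```csharp
using System.Xml;

public static class XmlChecks
{
    // Hypothetical helper: returns true if the file parses as well-formed XML.
    // I/O failures such as a missing file are not caught and propagate.
    public static bool IsWellFormed(string path)
    {
        try
        {
            using (var reader = XmlReader.Create(path))
            {
                // Reading every node forces the parser over the whole file.
                while (reader.Read())
                {
                }
            }
            return true;
        }
        catch (XmlException)
        {
            // Thrown for unclosed tags, mismatched end tags, and similar errors.
            return false;
        }
    }
}
```

A call such as XmlChecks.IsWellFormed("playlist.xml") could then guard the Deserialize call instead of catching every Exception.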
