XmlReader : data invalid exception - c#

I'm trying to read an XML file and I get an XmlException : "data at the root level is invalid. Line 1, position 1".
Here is the content of the XML file :
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<root>
<Materials override="TRUE">
<Material name="" diffuse="" />
</Materials>
</root>
And here is my code :
using (FileStream fstr = File.OpenRead(sFullPath))
{
    XmlReaderSettings settings = new XmlReaderSettings();
    settings.ConformanceLevel = ConformanceLevel.Document;
    fstr.Position = 0;
    using (XmlReader xmlReader = XmlReader.Create(fstr, settings))
    {
        while (xmlReader.Read())
        {
        }
    }
}
The exception is raised by the call to Read().
I've been searching for an answer on different sites, had a look at the MSDN too, but can't solve my problem.
My code is taken from http://www.codeproject.com/Articles/318876/Using-the-XmlReader-class-with-Csharp but I tried different snippets too.
I also checked the encoding of my file in Notepad++ and tried both UTF-8 and UTF-8 without BOM, but it made no difference.
I've been stuck on this for a couple of days and I'm running out of ideas.
Thanks for your help!
Edit: removed the "..." in the snippet to avoid confusing people. I also tried:
using (XmlTextReader xmlReader = new XmlTextReader(fstr))
and it appears that xmlReader.Encoding returns null, even though my file is encoded as UTF-8.

Related

Deleting element from Xml breaks format on reload

I am creating a system that stores vehicle data. When I serialize the data using Xml serialization, I get the correct format as shown in the example below:
<?xml version="1.0"?>
<ArrayOfVehicle xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Vehicle>
<Registration>fake1</Registration>
<Model>123</Model>
<Make>test</Make>
<Year>1999</Year>
<Cost>100</Cost>
</Vehicle>
<Vehicle>
<Registration>fake2</Registration>
<Model>321</Model>
<Make>123</Make>
<Year>2000</Year>
<Cost>321</Cost>
</Vehicle>
</ArrayOfVehicle>
The serialization uses a list of vehicles that have the attributes seen in the Xml file. I am trying to figure out how I can delete a vehicle from the list and serialize it back to the Xml file without breaking the format shown above.
I have tried deleting the record from the list and then serializing and deserializing the data, but when I remove an item, it breaks the format. This is what the Xml file looks like when I remove an item from the list and serialize it:
fake1 123 test 1999 100
Here is my code for removing an item:
for (int i = Business.VehicleList.Count - 1; i >= 0; i--)
{ //Where Business.VehicleList is my list
    if (Business.VehicleList[i].Registration == registration)
    {
        Business.VehicleList.RemoveAt(i);
        Business.Save(); //Method for serialization
    }
}
Here is the error it throws when I try to deserialize the data again:
System.InvalidOperationException: 'There is an error in XML document (10, 19). XmlException: There are multiple root elements. Line 10, position 19.'
These are my serialization and deserialization methods:
public static void Retrieve()
{
    using (FileStream fileStream = new FileStream("C:\\temp\\data.xml", FileMode.OpenOrCreate))
    {
        using (var reader = new StreamReader(fileStream))
        {
            if (fileStream.Length <= 0)
            {
                return;
            }
            else
            {
                XmlSerializer deserializer = new XmlSerializer(typeof(List<Vehicle>),
                    new XmlRootAttribute("ArrayOfVehicle"));
                _vehicleList = (List<Vehicle>)deserializer.Deserialize(reader); //This is where the error is thrown
            }
        }
    }
}
public static void Save()
{
    XmlSerializer serializer = new XmlSerializer(typeof(List<Vehicle>));
    using (FileStream fileStream = new FileStream("C:\\temp\\data.xml", FileMode.Open))
    {
        serializer.Serialize(fileStream, VehicleList);
        fileStream.Close();
    }
}
Any suggestions on how to remove a vehicle from my list without it breaking the Xml file?
Here is the source of the file after I tried deleting an item from the vehicle list:
<?xml version="1.0"?>
<ArrayOfVehicle xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Vehicle>
<Registration>123</Registration>
<Model>123</Model>
<Make>23</Make>
<Year>2000</Year>
<Cost>123</Cost>
</Vehicle>
</ArrayOfVehicle><Registration>1321</Registration>
<Model>123123</Model>
<Make>312312</Make>
<Year>2000</Year>
<Cost>321</Cost>
</Vehicle>
</ArrayOfVehicle>
In the Save method, new FileStream("C:\\temp\\data.xml", FileMode.Open) will open the existing file without truncating it. So after you write the new XML data to the file, there will be remnants of the old content if the new content is shorter than the old one.
Changing this to new FileStream("C:\\temp\\data.xml", FileMode.Create) will fix the issue.
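A minimal sketch of that fix, assuming a Vehicle class shaped like the XML in the question. The VehicleStore class, the Load helper, and the path parameter are hypothetical names added here only so the sketch is self-contained; they are not from the original code.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Assumed shape of the question's Vehicle class, inferred from the XML shown above.
public class Vehicle
{
    public string Registration { get; set; }
    public string Model { get; set; }
    public string Make { get; set; }
    public int Year { get; set; }
    public decimal Cost { get; set; }
}

// Hypothetical wrapper; the question keeps these methods on its Business class.
public static class VehicleStore
{
    public static void Save(List<Vehicle> vehicles, string path)
    {
        var serializer = new XmlSerializer(typeof(List<Vehicle>));
        // FileMode.Create truncates an existing file before writing.
        // FileMode.Open would keep the old bytes past the end of the new content.
        using (var fileStream = new FileStream(path, FileMode.Create))
        {
            serializer.Serialize(fileStream, vehicles);
        }
    }

    public static List<Vehicle> Load(string path)
    {
        var serializer = new XmlSerializer(typeof(List<Vehicle>));
        using (var fileStream = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            return (List<Vehicle>)serializer.Deserialize(fileStream);
        }
    }
}
```

Saving a two-vehicle list, removing one entry, and saving again should then leave a file that deserializes cleanly, because the second write no longer sits on top of the longer first one.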
I think it's because you are trying to deserialize malformed XML. First, make sure that your serialization method produces correct XML. The cause may be closing the stream inside the using statement, and also serializing the list before the for loop finishes.
Try removing fileStream.Close(); and moving Business.Save(); outside the for loop.
Here, I made a fiddle with the same conditions and it works.

XmlWriter.WriteNode(XmlReader) writes invalid XML using memory streams

I need to normalize XML streams to UTF-16. I use the EncodeJobTicket method shown below.
All streams passed are byte streams: MemoryStream or FileStream. My problem is when I pass in a filestream containing the following (correctly encoded) XML as jobTicket:
<?xml version="1.0" encoding="utf-8"?>
<workflow>
<file>
<request name="create-temp-file" tag="очень">
</request>
<request name="create-temp-folder" tag="非常に">
</request>
</file>
</workflow>
The resulting ticketStreamU16 is encoded as UTF-16, but its XML declaration still declares the encoding as UTF-8. This is not well-formed XML.
public void EncodeJobTicket(Stream jobTicket, Stream ticketStreamU16)
{
    XmlWriterSettings settings = new XmlWriterSettings();
    settings.Encoding = Encoding.Unicode;
    if (jobTicket.CanSeek)
    {
        jobTicket.Position = 0;
    }
    using (XmlReader xmlRdr = XmlReader.Create(jobTicket))
    using (XmlWriter xmlWtr = XmlWriter.Create(ticketStreamU16, settings))
    {
        xmlWtr.WriteNode(xmlRdr, false);
    }
}
What am I missing? Shouldn't xmlWtr write the correct XML declaration? Do I have to look for a declaration and replace it?
WriteNode copies the XML declaration from the reader unchanged, so skip it by calling MoveToContent() on the reader before copying. The XmlWriter will then write the correct declaration for you:
using (XmlWriter xmlWtr = XmlWriter.Create(ticketStreamU16, settings))
{
    xmlRdr.MoveToContent();
    xmlWtr.WriteNode(xmlRdr, false);
}
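Combined with the reader setup from the question, the fix might look like the sketch below. The JobTicketEncoder class name is a placeholder introduced here, and the method is made static only to keep the example self-contained.

```csharp
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical container class; the question does not name the enclosing type.
public static class JobTicketEncoder
{
    public static void EncodeJobTicket(Stream jobTicket, Stream ticketStreamU16)
    {
        var settings = new XmlWriterSettings { Encoding = Encoding.Unicode };
        if (jobTicket.CanSeek)
        {
            jobTicket.Position = 0;
        }
        using (XmlReader xmlRdr = XmlReader.Create(jobTicket))
        using (XmlWriter xmlWtr = XmlWriter.Create(ticketStreamU16, settings))
        {
            // Advance past the source declaration so it is not copied verbatim.
            xmlRdr.MoveToContent();
            // Copy the root element; the writer emits its own declaration.
            xmlWtr.WriteNode(xmlRdr, false);
        }
    }
}
```

Note that with the reader positioned on the root element, WriteNode copies that element only, so comments or processing instructions that precede the root are skipped along with the declaration.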

XML Deserialization Error - Processing instructions and DTDs are not supported

I have been given an XML file and an XSD file. I am trying to validate the XML against the XSD and then, using serialization, load the XML into an object.
I have the validation working as expected but when I try to DeserializeDocToObj I get the following error.
There was an error deserializing the object of type
Aaa.Bbb.Common.DataTypes.SurveyGroup. Processing instructions
(other than the XML declaration) and DTDs are not supported.
Line 1, position 2.
I have no idea what this means and all I have read is not really helping.
The header in the XSD:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns="http://www.mydomain.co.uk/srm/mscc"
targetNamespace="http://www.mydomain.co.uk/srm/mscc"
elementFormDefault="qualified"
attributeFormDefault="unqualified">
<xs:element name="SurveyGroup">
The header in the XML
<?xml version="1.0" encoding="utf-8" ?>
<?xml-stylesheet type="text/xsl" href="mscc4_cctv.xsl"?>
<SurveyGroup xmlns="http://www.mydomain.co.uk/srm/mscc"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="
http://www.mydomain.co.uk/srm/mscc
http://www.mydomain.co.uk/srm/schemas/mscc4_cctv.xsd">
<Survey>
Deserialization Code:
public T DeserializeDocToObj(string fileLocation)
{
    T returnObj;
    using (FileStream reader = new FileStream(fileLocation, FileMode.Open, FileAccess.Read))
    {
        DataContractSerializer ser = new DataContractSerializer(typeof(T));
        returnObj = (T)ser.ReadObject(reader);
    }
    return returnObj;
}
Any help greatly appreciated
Create an XmlReader with the correct XmlReaderSettings and call DataContractSerializer.ReadObject(XmlReader) instead of DataContractSerializer.ReadObject(Stream):
using (var reader = XmlReader.Create(fileName, new XmlReaderSettings { IgnoreProcessingInstructions = true }))
{
    var serializer = new DataContractSerializer(typeof(T));
    return (T)serializer.ReadObject(reader);
}
The XmlReader that DataContractSerializer.ReadObject(Stream) creates internally does not ignore processing instructions. ReadObject(Stream) calls XmlDictionaryReader.CreateTextReader, which creates an XmlUTF8TextReader, and that reader does not accept XmlReaderSettings.
Apparently its default behaviour is to throw on processing instructions other than the XML declaration. And the string <?xml-stylesheet type="text/xsl" href="mscc4_cctv.xsl"?> is a processing instruction, as C.M. Sperberg-McQueen states.
The string <?xml-stylesheet type="text/xsl" href="mscc4_cctv.xsl"?> is a processing instruction. Your software is telling you it cannot handle processing instructions in its input. This means that your software appears not to be an XML parser; you need either to restrict your input to the subset of XML it can handle, or get a real parser.

Remove comments from an XML file with double dashes --

How can I remove invalid xml comments that contain double dashes(--) from an xml file?
I'm trying to load the xml file, but it is failing. These comments make the xml invalid. The xml comes from a vendor.
I tried removing these based on approaches from other posts, but I was not successful. Here is an example of the xml:
<?xml version="1.0" encoding="ISO-8859-1"?>
<!--MAIN VARIABLES-->
<content type="screwed">
<!--KEEP 19-39 -- SEE HELP.TXT AND THE VIDEO TUTORIALS FOR MORE INFO -->
<!--REGULAR/NON-Regular EXAMPLE --><SomeTag somefile="test.txt3" Name="test"/>
<!-- -->
</content>
I have tried the following without success:
string xmlDocFile = @"c:\server\test.xml"; // verbatim string, so the backslashes are not treated as escapes
XmlReaderSettings readerSettings = new XmlReaderSettings();
readerSettings.IgnoreComments = true;
readerSettings.ProhibitDtd = false;
readerSettings.ValidationType = ValidationType.DTD;
XmlReader reader = XmlReader.Create(xmlDocFile, readerSettings);
XmlDocument myXmlDoc = new XmlDocument();
myXmlDoc.Load(reader);
myXmlDoc.Save(xmlDocFile);
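Before handing the file to XmlReader, read it as text and strip the comments with a regular expression.
// using System.IO; using System.Text; using System.Text.RegularExpressions; using System.Xml;
// Read with the encoding named in the XML declaration of the sample file.
string xml = File.ReadAllText(xmlDocFile, Encoding.GetEncoding("ISO-8859-1"));
// RegexOptions.Singleline lets '.' match newlines, so multi-line comments are removed too.
string validXml = Regex.Replace(xml, "<!--.*?-->", "", RegexOptions.Singleline);
// XmlReader.Create(string) treats its argument as a URI, so wrap the XML text in a StringReader.
XmlReader reader = XmlReader.Create(new StringReader(validXml));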

Confused about XmlSerializer + schemaLocation

I am having trouble validating serialized data.
Ok, so I started with an XSD file which I got from some third party. Generated C# classes using xsd tool. Then I added
[XmlAttribute("noNamespaceSchemaLocation", Namespace = System.Xml.Schema.XmlSchema.InstanceNamespace)]
public string SchemaLocation = "http://localhost/schemas/AP_Transactions_10052011.xsd";
to the top level object. The URL in question is obviously accessible from my machine where I am running the code. Then I am serializing it using XmlSerializer, which correctly produces
<?xml version="1.0" encoding="utf-8"?>
<BU_AP_Vendor_Invoices xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xsi:noNamespaceSchemaLocation="http://local.com/schemas/AP_Transactions_10052011.xsd">
...
</BU_AP_Vendor_Invoices>
So far so good.
Now I am trying to validate the file like so:
public static void Validate(TextReader xmlData)
{
    XmlReaderSettings settings = new XmlReaderSettings();
    settings.ValidationType = ValidationType.Schema;
    settings.ValidationFlags = XmlSchemaValidationFlags.ProcessIdentityConstraints | XmlSchemaValidationFlags.ReportValidationWarnings;
    settings.ValidationEventHandler += delegate(object sender, ValidationEventArgs args)
    {
        Console.WriteLine(args.Message);
    };
    using (XmlReader xmlReader = XmlReader.Create(xmlData, settings))
        while (xmlReader.Read()) ;
}
This results in "Could not find schema information for the element 'element name'" warnings for every element in the XML file. I assume that means the XSD is simply not being loaded.
I was looking at the XmlReaderSettings.Schemas, but how would the reader know what to add there? I assumed that if I don't add schemas explicitly then magic will simply happen, but that doesn't seem to work.
Question is how to do this properly?
Please take a look at this post; the gist is to use XmlSchemaValidationFlags.ProcessSchemaLocation.
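A sketch of what that could look like, adapted from the Validate method in the question. The SchemaLocationValidator class name and the returned message list are additions made here for illustration, and setting XmlResolver explicitly is an assumption: some runtimes may fetch the schema without it.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical wrapper class, not part of the original question.
public static class SchemaLocationValidator
{
    // Collects validation messages instead of printing them.
    public static List<string> Validate(TextReader xmlData)
    {
        var messages = new List<string>();
        var settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        // ProcessSchemaLocation makes the reader honour xsi:schemaLocation and
        // xsi:noNamespaceSchemaLocation hints found in the instance document.
        settings.ValidationFlags = XmlSchemaValidationFlags.ProcessIdentityConstraints
            | XmlSchemaValidationFlags.ProcessSchemaLocation
            | XmlSchemaValidationFlags.ReportValidationWarnings;
        // Assumption: an explicit resolver is set so the schema URL can be loaded.
        settings.XmlResolver = new XmlUrlResolver();
        settings.ValidationEventHandler += (sender, args) =>
            messages.Add(args.Severity + ": " + args.Message);
        using (XmlReader xmlReader = XmlReader.Create(xmlData, settings))
        {
            while (xmlReader.Read()) { }
        }
        return messages;
    }
}
```

The alternative is to load the schema yourself and add it to settings.Schemas, which avoids trusting a location embedded in the document being validated.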
