I have an XML file to load. The problem is that it is physically saved as a fixed-width file, meaning the whole document is written in lines of a fixed width, with exceptions, i.e. some lines may contain fewer characters.
I get errors when using XmlDocument.Load().
How do I correctly load such an XML file?
This is how the file looks:
Use the FileHelpers library for C#.
It's free and should solve your problem: https://www.filehelpers.net/
Edit: If you're doing that already, it would be good to know the error message you are getting.
My guess is that you have to fix the file into the proper format. To be honest, I've seen these particular XMLs and I've never seen such an occurrence. Is the XML already in this format when you download it?
I guess you're creating the XML from the given XML schema (XSD); at least that's what I was doing with "JPK". What error are you receiving from XmlDocument.Load()?
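If the file really is ordinary XML that was simply wrapped to a fixed column width, one hedged sketch is to re-join the lines before parsing (the file name is a placeholder, and joining without a separator assumes the line breaks were inserted purely for the wrapping; it would also discard any line breaks that genuinely belonged to text content):

using System.IO;
using System.Xml;

// Re-join the fixed-width lines into one string; the breaks may fall in the
// middle of tags, so no separator is added between them.
string joined = string.Concat(File.ReadAllLines("input.xml"));

var doc = new XmlDocument();
doc.LoadXml(joined);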
This is partly a question for the Microsoft forums too, but I think there might be some coding involved.
We have a system built in C# .NET that generates CSV files. However, we have problems with the special characters "æÆøØåÅ". The thing is, when I open the file in Notepad, everything is correct. But when I open the file in Excel, these characters are wrong. If I open it in Notepad and save without actually making any changes, it works in Excel. But I don't understand why. Is there some hidden information added to the file that we can adjust in our C# code to make it correct in the first place?
There are other questions like this, but all the answers I could find are workarounds for when you already have a wrong CSV file. In our case, we create this file, and the people we send the files to are usually not computer people capable of changing the encoding, etc.
Edit:
Here is the code we tried at the end, after generating our resulting CSV string:
string result = "some;æøå;string";
byte[] bytes = System.Text.Encoding.GetEncoding(65001).GetBytes(result.ToString());
return System.Text.Encoding.GetEncoding(65001).GetString(bytes);
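The snippet above only round-trips the string through code page 65001 (UTF-8), so it does not change what ends up on disk. Excel typically only recognises a CSV as UTF-8 when the file starts with a byte-order mark, which is likely what Notepad added when you re-saved the file. A hedged sketch of writing the file that way (the path and the use of File.WriteAllText are assumptions about how the file is produced):

using System.IO;
using System.Text;

string result = "some;æøå;string";

// UTF8Encoding(true) emits the UTF-8 BOM (EF BB BF) at the start of the file,
// which is the hint Excel uses to pick UTF-8 instead of the ANSI code page.
File.WriteAllText("export.csv", result, new UTF8Encoding(encoderShouldEmitUTF8Identifier: true));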
We are generating an XML file in C# using XmlSerializer and UTF-8 encoding. We check the output, and the XML is well formed and passes XSD validation.
We send this XML to a customer who loads it in a UNIX environment. They keep telling us that the XML is not valid and has invalid characters. We don't have a UNIX environment to test with.
The question is: is there any difference when loading XML files in UNIX?
What can we ask the customer to provide to better understand this situation?
You might have a UTF-8 BOM as the first three bytes of your file, before the declaration:
<?xml version="1.0" encoding="utf-8"?>
It is not part of the XML document, so a file reader should not pass it on to be interpreted by the XML parser. If you have it, you could try to remove it and see if your users have the same complaint. Most editors will not show it to you, so you might have to use a hex editor. (Hex: EF BB BF.)
If the problem remains, you'd need to know at what byte offset the purported invalid characters are and which section of the XML specification they violate. Which program and version they use, and what feedback it gives, might be helpful too.
You might also consider that the file is getting damaged in delivery. A round trip transmission might help detect that.
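If the BOM does turn out to be the problem, a hedged sketch of serializing without it (the Invoice type and file name are placeholders, not taken from the question):

using System.IO;
using System.Text;
using System.Xml.Serialization;

var serializer = new XmlSerializer(typeof(Invoice));

// UTF8Encoding(false) writes UTF-8 *without* the EF BB BF byte-order mark,
// so the very first bytes of the file are the XML declaration itself.
using (var writer = new StreamWriter("invoice.xml", false, new UTF8Encoding(false)))
{
    serializer.Serialize(writer, new Invoice { Number = "2024-001" });
}

public class Invoice { public string Number { get; set; } }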
I'm making a world editor for a game in C# XNA.
The file contains a large amount of data, so I feel XmlWriter is necessary.
The application runs perfectly fine. Files are saved in a directory where they're immediately accessible; however, for the file to load directly into the pipeline it's necessary to include the line
<Asset Type = ObjectID.objectID[]>
Unfortunately this includes hexadecimal characters not supported by XmlWriter, XDocument or XmlDocument, so I'm wondering if there's a way around it, or perhaps there's an XML type I've not tried that allows odd hexadecimal characters.
If there isn't, that's quite alright as I've a back-up plan, but I'm just wondering.
Thank you kindly for the read and I hope my question is well written. :)
I found that I was able to use WriteRaw to write the line as a raw string, though this breaks the file format :(
writer.WriteRaw("<Asset Type = \"objectID.objectID[]>\"");
Sorry to be the one to answer my own question but thanks for the support all the same.
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<XnaContent><Asset Type = "objectID.objectID[]>"<Item><ID>2</ID><xPos>640</xPos><yPos>280</yPos> <xPath>0</xPath><yPath>0</yPath></Item></Asset></XnaContent>
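For completeness, a hedged sketch of how WriteRaw can sit inside an otherwise normal XmlWriter document (the file name is a placeholder; unlike the snippet above, this version places the closing quote before the >, since WriteRaw emits whatever string it is given verbatim):

using System.Xml;

var settings = new XmlWriterSettings { Indent = true };
using (XmlWriter writer = XmlWriter.Create("level.xml", settings))
{
    writer.WriteStartDocument(false);   // emits standalone="no"
    writer.WriteStartElement("XnaContent");
    // WriteRaw bypasses XmlWriter's escaping and well-formedness checks,
    // so the Asset tag comes out exactly as the string specifies.
    writer.WriteRaw("<Asset Type=\"objectID.objectID[]\">");
    writer.WriteRaw("<Item><ID>2</ID><xPos>640</xPos><yPos>280</yPos><xPath>0</xPath><yPath>0</yPath></Item>");
    writer.WriteRaw("</Asset>");
    writer.WriteEndElement();           // </XnaContent>
}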
I was trying to parse an XML file created using Visual Studio with a tool that uses the Xerces parser, and I got a "content not allowed in prolog" error.
Now, when I create an XML file using some other editor like Notepad++ with the exact same content as the one created above, I don't get this error.
What do you think might be the problem? Please understand that this is not a duplicate question.
EDIT
So I found out the problem. It's because the tool I use could not handle the BOM at the beginning of the file.
The file starts with a UTF-8 byte-order mark. The XML specifications say that documents may start with a BOM, so it should be fine. Is it possible that the tool uses an old version of Xerces which didn't cope with a BOM? Other than that, the file looks fine to me.
Is this a tool you have the source code to? Are you able to create a short but complete program which demonstrates the problem, failing to parse it? Can you try a later version of Xerces?
Check the encoding of the file created using Visual Studio and compare it with the encoding of the Notepad++ file; that must be the issue.
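If re-saving from Visual Studio without a BOM isn't an option, a hedged sketch of stripping it programmatically before handing the file to the BOM-intolerant tool (the method name and usage are placeholders):

using System;
using System.IO;

StripUtf8Bom("file.xml");

// Removes a leading UTF-8 BOM (EF BB BF) so parsers that treat it as content
// before the prolog stop reporting "content not allowed in prolog".
static void StripUtf8Bom(string path)
{
    byte[] bytes = File.ReadAllBytes(path);
    if (bytes.Length >= 3 && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF)
    {
        byte[] trimmed = new byte[bytes.Length - 3];
        Array.Copy(bytes, 3, trimmed, 0, trimmed.Length);
        File.WriteAllBytes(path, trimmed);
    }
}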
I sometimes get the error "Text node cannot appear in this state" in my application after editing an XML file in MonoDevelop and loading it with .NET.
This error is really annoying, because I have to copy the XML file to Windows and try to fix it there with VS.
The XML file is absolutely correct; it must be something with the encoding.
Is there any quick way to fix this in MonoDevelop?
And of course, it would be interesting to know why this error appears.
Edit (short XML example on request)
<?xml version="1.0" encoding="UTF-8"?>
<Data>
</Data>
I was trying to reproduce this problem and I found (given my contrived reproduction) that all I had to do was edit the first line of the XML:
<?xml version="1.0" encoding="UTF-8"?>
It appears as though, when the encoding changed, a single space appeared before the <?xml declaration in the file. I used TextWrangler to open the file and saw the space. Simply editing the file in Xamarin Studio resolved the issue. On further investigation it looked as though there were two BOMs at the start of the file:
fe ff fe ff
I'd love to hear back if anyone can pinpoint how the encoding changed though.
When using MonoDevelop alone, I found only one workaround for this error (there are other ways to solve it by using another editor): saving the file with another encoding (UTF-16). This is not going to solve it permanently; if you edit the file again, it may occur again.
I think the problem is that the Byte Order Mark appears as the first 2 bytes, and a parser that doesn't expect a byte order mark will interpret it as a short text node. Re-encoding without the BOM should fix it.
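If the stray byte-order mark (rather than a leading space) is what trips the parser, a hedged sketch on the loading side is to let a StreamReader consume the BOM and pick the encoding before XmlDocument sees the content (the file name is a placeholder):

using System.IO;
using System.Text;
using System.Xml;

var doc = new XmlDocument();

// The reader detects and swallows a leading BOM, so the parser starts
// directly at the <?xml declaration instead of seeing it as a text node.
using (var reader = new StreamReader("Data.xml", Encoding.UTF8, detectEncodingFromByteOrderMarks: true))
{
    doc.Load(reader);
}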