Malformed xml not parsing in XDocument.Parse - c#

Please help, I have an xml document that I want to parse with XDocument. But the xml string that I have has multiple xmlns attributes that is empty. xmlns="" and the moment I remove it it parse, But I receive this from a webservice. I tried replacing it but every which way I try it only replaces one " and I am left with an invalid xmlstring <test xmlns="> I tried regex, I tried the Replace function I tried every know way, and I am now stuck,
Any Suggestions?
string xmlString = #"
<UserFile xmlns=""http://temuri.org"">
<user>
<UserName>Daniel</UserName>
<UserSurname>Vrey</UserSurname>
<Toys xmlns="">
<TToy>Toyota</TToy>
<TToy>Ford</TToy>
</Toys>
</user>
</UserFile>";
XDocument d = XDocument.Parse(xmlString);

Related

How to parse a XML with nested XML text

Trying to read XML file with nested XML object with own XML declaration. As expected got exception:
Unexpected XML declaration. The XML declaration must be the first node in the document, and no white space characters are allowed to appear before it.
How can i read that specific element as text and parse it as separate XML document for later deserialization?
<?xml version="1.0" encoding="UTF-8"?>
<Data>
<Items>
<Item>
<Target type="System.String">Some target</Target>
<Content type="System.String"><?xml version="1.0" encoding="utf-8"?><Data><Items><Item><surname type="System.String">Some Surname</surname><name type="System.String">Some Name</name></Item></Items></Data></Content>
</Item>
</Items>
</Data>
Every approach i'm trying fail due to declaration exception.
var xml = System.IO.File.ReadAllText("Info.xml");
var xDoc = XDocument.Parse(xml); // Exception
var xmlDoc = new XmlDocument();
xmlDoc.LoadXml(xml); // Exception
var xmlReader = XmlReader.Create(new StringReader(xml));
xmlReader.ReadToFollowing("Content"); // Exception
I have no control over XML creation.
The only way I would know is by getting rid of the illegal second <?xml> declaration. I wrote a sample that will simply look for and discard the second <?xml>. After that the string has become valid XML and can be parsed. You may need to tweak it a bit to make it work for your exact scenario.
Code:
using System;
using System.Xml;
public class Program
{
public static void Main()
{
var badXML = #"<?xml version=""1.0"" encoding=""UTF-8""?>
<Data>
<Items>
<Item>
<Target type=""System.String"">Some target</Target>
<Content type=""System.String""><?xml version=""1.0"" encoding=""utf-8""?><Data><Items><Item><surname type=""System.String"">Some Surname</surname><name type=""System.String"">Some Name</name></Item></Items></Data></Content>
</Item>
</Items>
</Data>";
var goodXML = badXML.Replace(#"<Content type=""System.String""><?xml version=""1.0"" encoding=""utf-8""?>"
, #"<Content type=""System.String"">");
var xmlDoc = new XmlDocument();
xmlDoc.LoadXml(goodXML);
XmlNodeList itemRefList = xmlDoc.GetElementsByTagName("Content");
foreach (XmlNode xn in itemRefList)
{
Console.WriteLine(xn.InnerXml);
}
}
}
Output:
<Data><Items><Item><surname type="System.String">Some Surname</surname><name type="System.String">Some Name</name></Item></Items></Data>
Working DotNetFiddle: https://dotnetfiddle.net/ShmZCy
Perhaps needless to say: all of this would not have been needed if the thing that created this invalid XML would have applied the common rule to wrap the nested XML in a <![CDATA[ .... ]]> block.
The <?xml ...?> processing declaration is only valid on the first line of an XML document, and so the XML that you've been given isn't well-formed XML. This will make it quite difficult to parse as is without either changing the source document (and you've indicated that's not possible) or preprocessing the source.
You could try:
Stripping out the <?xml ?> instruction with regex or string manipulation, but the cure there may be worse than the disease.
The HTMLAgilityPack, which implements a more forgiving parser, may work with an XML document
Other than that, the producer of the document should look to produce well-formed XML:
CDATA sections can help this, but be aware that CDATA can't contain the ]]> end tag.
XML escaping the XML text can work fine; that is, use the standard routines to turn < into < and so forth.
XML namespaces can also help here, but they can be daunting in the beginning.

C# Skip anything to next tag

I have a log file in xml format like
<log> // skip this node
<?xml version="1.0" encoding="UTF-8"?>
<qbean logger="main-logger">
</qbean>
</log>
<log> // go to this node
</log>
Now ReadToNextSibling("log") throw an exception an I need to skip content of first "log" tag and move to next "log" tag without throwing exception.
Is there a way?
Hint:
Your XML is invalid since the <?xml version="1.0" encoding="UTF-8"?> has to be before the root element. You can search for it and remove it if that fixes your problem. You can use yourXml.Repalce("<?xml version=\"1.0\" encoding=\"UTF-8\"?>", "")
You have to create a root element for your XML to be valid for parsing.
Then, you can use the XmlDocument class to parse the XML data that you have and skip anything you want. You would need something like this:
var document = new XmlDocument();
document.LoadXml(yourXml);
document.DocumentElement.ChildNodes[1]

Get single value from XML File in C#

I am trying to pull a single value from XML stored in a variable in a C# console application.
Here is my XML:
string myxml = #"<?xml version='1.0' encoding='utf-8'?>
<params>
<rowsEffected>1</rowsEffected>
</params>
<data>
<rowData>
<row>
<answer>1234</answer>
</row>
</rowData>
</data>";
var doc = XDocument.Parse(myxml); //This is as far as I can get
I have read thru many tutorials but can't get this simple task.
I want to extract the value from the "answer" tag, so my result should be 1234
The XML will always have one record.
Any help would be greatly appreciated.
Your XML is invalid. There can only be one root element. In your XML params and data are both top level elements which is not allowed. Try it out for yourself at: http://www.xmlvalidation.com/

How to flat xml to one line in c# code?

How to flat xml to one line in c# code ?
Before:
<CATALOG>
<CD>
<TITLE>Empire Burlesque</TITLE>
<ARTIST>Bob Dylan</ARTIST>
<COUNTRY>USA</COUNTRY>
<COMPANY>Columbia</COMPANY>
<PRICE>10.90</PRICE>
<YEAR>1985</YEAR>
</CD>
</CATALOG>
After:
<CATALOG><CD><TITLE>Empire Burlesque</TITLE><ARTIST>Bob Dylan</ARTIST>COUNTRY>USA</COUNTRY>....
Assuming you're able to use LINQ to XML, and the XML is currently in a file:
XDocument document = XDocument.Load("test.xml");
document.Save("test2.xml", SaveOptions.DisableFormatting);
If you cant use LINQ to XML, you can:
XmlDocument xmlDoc = new XmlDocument()
xmlDoc.LoadXml("Xml as string"); or xmlDoc.Load(filepath)
xmlDoc.InnerXml -- this should return one liner
If you have the XML in a string:
xml.Replace("\n", "").Replace("\r", "")
I know, that this is old question, but it helped me to find XDocument.ToString()
XDocument doc = XDocument.Load("file.xml");
// Flat one line XML
string s = doc.ToString(SaveOptions.DisableFormatting);
Check SaveOptions documentaion
Depends what you are working with and what output you need...
John Skeet's answer works if reading and writing to files.
Aleksej Vasinov's answer works if you want xml without the xml declaration.
If you simply want the xml in a string, but want the entire structure of the xml, including the declaration, ie..
<?xml version "1.0" encoding="utf-16"?> <-- This is the declaration ->
<TheRestOfTheXml />
.. use a StringWriter...
XDocument doc = GetTheXml(); // op's xml
var wr = new StringWriter();
doc.Save(wr, SaveOptions.DisableFormatting);
var s = wr.GetStringBuilder().ToString();

XML Illegal Characters in path

I am querying a soap based service and wish to analyze the XML returned however when I try to load the XML into an XDoc in order to query the data. am getting an 'illegal characters in path' error message? This (below) is the XML returned from the service. I simply want to get the list of competitions and put them into a List I have setup. The XML does load into an XML Document though so must be correctly formatted?.
Any advice on the best way to do this and get round the error would be greatly appreciated.
<?xml version="1.0" ?>
- <gsmrs version="2.0" sport="soccer" lang="en" last_generated="2010-08-27 20:40:05">
- <method method_id="3" name="get_competitions">
<parameter name="area_id" value="1" />
<parameter name="authorized" value="yes" />
<parameter name="lang" value="en" />
</method>
<competition competition_id="11" name="2. Bundesliga" soccertype="default" teamtype="default" display_order="20" type="club" area_id="80" last_updated="2010-08-27 19:53:14" area_name="Germany" countrycode="DEU" />
</gsmrs>
Here is my code, I need to be able to query the data in an XDoc:
string theXml = myGSM.get_competitions("", "", 1, "en", "yes");
XmlDocument myDoc = new XmlDocument();
MyDoc.LoadXml(theXml);
XDocument xDoc = XDocument.Load(myDoc.InnerXml);
You don't show your source code, however I guess what you are doing is this:
string xml = ... retrieve ...;
XmlDocument doc = new XmlDocument();
doc.Load(xml); // error thrown here
The Load method expects a file name not an XML itself. To load an actual XML, just use the LoadXml method:
... same code ...
doc.LoadXml(xml);
Similarly, using XDocument the Load(string) method expects a filename, not an actual XML. However, there's no LoadXml method, so the correct way of loading the XML from a string is like this:
string xml = ... retrieve ...;
XDocument doc;
using (StringReader s = new StringReader(xml))
{
doc = XDocument.Load(s);
}
As a matter of fact when developing anything, it's a very good idea to pay attention to the semantics (meaning) of parameters not just their types. When the type of a parameter is a string it doesn't mean one can feed in just anything that is a string.
Also in respect to your updated question, it makes no sense to use XmlDocument and XDocument at the same time. Choose one or the another.
Following up on Ondrej Tucny's answer :
If you would like to use an xml string instead, you can use an XElement, and call the "parse" method. (Since for your needs, XElement and XDocument would meet your needs)
For example ;
string theXML = '... get something xml-ish...';
XElement xEle = XElement.Parse(theXML);
// do something with your XElement
The XElement's Parse method lets you pass in an XML string, while the Load method needs a file name.
Why not
XDocument.Parse(theXml);
I assume this will be the right solution
If this is really your output it is illegal XML because of the minus characters ('-'). I suspect that you have cut and pasted this from a browser such as IE. You must show the exact XML from a text editor, not a browser.

Categories