Problem on converting xml string structure to XDocument object - c#

I have an XML string that I want to convert to an XDocument object. I've been following this example from Microsoft: https://learn.microsoft.com/en-us/dotnet/api/system.xml.linq.xdocument.parse?view=netframework-4.7.2.
The problem is that instead of getting the result below, as in the example,
<!-- comment at the root level -->
<Root>
<Child>Content</Child>
</Root>
I get this result:
{<!-- comment at the root level -->
<Root>
<Child>Content</Child>
</Root>}
BaseUri: ""
Declaration: {<?xml version="1.0"?>}
Document: {<!-- comment at the root level -->
<Root>
<Child>Content</Child>
</Root>}
DocumentType: null
FirstNode: {<!-- comment at the root level -->}
LastNode: {<Root>
<Child>Content</Child>
</Root>}
NextNode: null
NodeType: Document
Parent: null
PreviousNode: null
Root: {<Root>
<Child>Content</Child>
</Root>}
I want to get the clean XML result, without other metadata such as node information, as shown below:
<!-- comment at the root level -->
<Root>
<Child>Content</Child>
</Root>
I'm using the XDocument.Parse() method. Here is the code I'm using.
The xmlString declaration:
var xmlString = @"<?xml version=""1.0""?><!-- comment at the root level --><Root><Child>Content</Child></Root>";
And this is how I create the XDocument object:
XDocument xDoc = XDocument.Parse(xmlString);

The given example from MSDN delivers the expected output of
<!-- comment at the root level -->
<Root>
<Child>Content</Child>
</Root>
The output you posted looks like a dump of all the properties of the XDocument (for example, as a debugger would display them). The XDocument object contains more information than just the plain XML you parsed.
In the example, the output produced by the line Console.WriteLine(doc); is the XML you passed in, because it calls doc.ToString(), which produces the "raw" XML output.
So I think you may have confused the XDocument object, which exposes more information (its properties) than your raw XML, with the XML text itself. You can still query your XML data perfectly well using LINQ to XML (https://learn.microsoft.com/de-de/dotnet/csharp/programming-guide/concepts/linq/linq-to-xml-overview).
It looks like the parsing works exactly as it should: it turns raw XML into an object of type XDocument.
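To make the distinction concrete, here is a minimal sketch using the XML string from the question. The class name ParseDemo and the helper ToRawXml are illustrative names, not part of the original code. It shows that ToString() gives back XML text, while the individual properties stay available for querying.

```csharp
using System;
using System.Xml.Linq;

public static class ParseDemo
{
    // Hypothetical helper: parses the string and returns the document as XML text.
    public static string ToRawXml(string xmlString)
    {
        XDocument doc = XDocument.Parse(xmlString);
        // ToString() serializes the nodes of the document. The XML declaration
        // is not part of that text; it remains available via doc.Declaration.
        return doc.ToString();
    }

    public static void Main()
    {
        var xmlString = @"<?xml version=""1.0""?><!-- comment at the root level --><Root><Child>Content</Child></Root>";

        // Prints the comment and the Root element as indented XML text.
        Console.WriteLine(ToRawXml(xmlString));

        // The same object can be queried with LINQ to XML.
        XDocument doc = XDocument.Parse(xmlString);
        Console.WriteLine(doc.Root.Element("Child").Value); // prints "Content"
    }
}
```

If the indentation that ToString() adds is unwanted, doc.ToString(SaveOptions.DisableFormatting) returns the XML without the extra whitespace.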

Related

How to parse a XML with nested XML text

I'm trying to read an XML file that contains a nested XML document with its own XML declaration. As expected, I get an exception:
Unexpected XML declaration. The XML declaration must be the first node in the document, and no white space characters are allowed to appear before it.
How can I read that specific element as text and parse it as a separate XML document for later deserialization?
<?xml version="1.0" encoding="UTF-8"?>
<Data>
<Items>
<Item>
<Target type="System.String">Some target</Target>
<Content type="System.String"><?xml version="1.0" encoding="utf-8"?><Data><Items><Item><surname type="System.String">Some Surname</surname><name type="System.String">Some Name</name></Item></Items></Data></Content>
</Item>
</Items>
</Data>
Every approach I've tried fails with the declaration exception.
var xml = System.IO.File.ReadAllText("Info.xml");
var xDoc = XDocument.Parse(xml); // Exception
var xmlDoc = new XmlDocument();
xmlDoc.LoadXml(xml); // Exception
var xmlReader = XmlReader.Create(new StringReader(xml));
xmlReader.ReadToFollowing("Content"); // Exception
I have no control over XML creation.
The only way I would know is by getting rid of the illegal second <?xml> declaration. I wrote a sample that will simply look for and discard the second <?xml>. After that the string has become valid XML and can be parsed. You may need to tweak it a bit to make it work for your exact scenario.
Code:
using System;
using System.Xml;
public class Program
{
public static void Main()
{
var badXML = @"<?xml version=""1.0"" encoding=""UTF-8""?>
<Data>
<Items>
<Item>
<Target type=""System.String"">Some target</Target>
<Content type=""System.String""><?xml version=""1.0"" encoding=""utf-8""?><Data><Items><Item><surname type=""System.String"">Some Surname</surname><name type=""System.String"">Some Name</name></Item></Items></Data></Content>
</Item>
</Items>
</Data>";
var goodXML = badXML.Replace(@"<Content type=""System.String""><?xml version=""1.0"" encoding=""utf-8""?>"
, @"<Content type=""System.String"">");
var xmlDoc = new XmlDocument();
xmlDoc.LoadXml(goodXML);
XmlNodeList itemRefList = xmlDoc.GetElementsByTagName("Content");
foreach (XmlNode xn in itemRefList)
{
Console.WriteLine(xn.InnerXml);
}
}
}
Output:
<Data><Items><Item><surname type="System.String">Some Surname</surname><name type="System.String">Some Name</name></Item></Items></Data>
Working DotNetFiddle: https://dotnetfiddle.net/ShmZCy
Perhaps needless to say: none of this would have been needed if whatever created this invalid XML had followed the common practice of wrapping the nested XML in a <![CDATA[ .... ]]> block.
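For illustration only, since the asker cannot change the producer: the sketch below shows what the CDATA approach would look like on both sides using LINQ to XML. The class and method names (CDataDemo, WrapInContent, ReadNested) are hypothetical, and the inner document is a shortened version of the one in the question.

```csharp
using System;
using System.Xml.Linq;

public static class CDataDemo
{
    // Producer side: embed a complete XML document as text inside a CDATA section.
    public static string WrapInContent(string innerXml)
    {
        return new XElement("Content", new XCData(innerXml)).ToString();
    }

    // Consumer side: read the CDATA text back and parse it as its own document.
    public static XDocument ReadNested(string contentXml)
    {
        string innerXml = XElement.Parse(contentXml).Value;
        return XDocument.Parse(innerXml);
    }

    public static void Main()
    {
        var inner = @"<?xml version=""1.0"" encoding=""utf-8""?><Data><Items><Item><name>Some Name</name></Item></Items></Data>";

        string wrapped = WrapInContent(inner);
        Console.WriteLine(wrapped);

        XDocument nested = ReadNested(wrapped);
        Console.WriteLine(nested.Root.Element("Items").Element("Item").Element("name").Value); // prints "Some Name"
    }
}
```

Because the nested declaration sits inside CDATA, the outer parser treats it as plain character data and never sees a second declaration.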
The <?xml ...?> declaration is only valid at the very start of an XML document, so the XML that you've been given isn't well-formed. This makes it quite difficult to parse as is without either changing the source document (and you've indicated that's not possible) or preprocessing the source.
You could try:
Stripping out the <?xml ?> instruction with regex or string manipulation, but the cure there may be worse than the disease.
The HTMLAgilityPack, which implements a more forgiving parser, may work with an XML document
Other than that, the producer of the document should look to produce well-formed XML:
CDATA sections can help this, but be aware that CDATA can't contain the ]]> end tag.
XML escaping the XML text can work fine; that is, use the standard routines to turn < into &lt; and so forth.
XML namespaces can also help here, but they can be daunting in the beginning.
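As a rough sketch of the preprocessing option: the code below removes every XML declaration except one at the very start of the string, parses the result, and then copies the nested element into its own XDocument. The names NestedDeclarationDemo, StripNestedDeclarations and ExtractNested are made up for this example, and the regular expression is an assumption about what the declarations look like. It would also remove declaration-like text inside comments or CDATA sections, so treat it as a starting point.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class NestedDeclarationDemo
{
    // Matches <?xml ... ?> but not other processing instructions such as <?xml-stylesheet ... ?>.
    private static readonly Regex XmlDeclaration = new Regex(@"<\?xml\s[^?]*\?>");

    // Keeps a declaration only when it sits at index 0; removes all others.
    public static string StripNestedDeclarations(string xml)
    {
        return XmlDeclaration.Replace(xml, m => m.Index == 0 ? m.Value : string.Empty);
    }

    // Parses the cleaned XML and returns the first child element of the
    // named container as a separate document.
    public static XDocument ExtractNested(string xml, string containerName)
    {
        XDocument outer = XDocument.Parse(StripNestedDeclarations(xml));
        XElement container = outer.Descendants(containerName).First();
        // Adding an element that already has a parent makes LINQ to XML clone it.
        return new XDocument(container.Elements().First());
    }

    public static void Main()
    {
        var badXml = @"<?xml version=""1.0"" encoding=""UTF-8""?>
<Data><Items><Item>
<Target type=""System.String"">Some target</Target>
<Content type=""System.String""><?xml version=""1.0"" encoding=""utf-8""?><Data><Items><Item><surname type=""System.String"">Some Surname</surname></Item></Items></Data></Content>
</Item></Items></Data>";

        XDocument nested = ExtractNested(badXml, "Content");
        Console.WriteLine(nested.Descendants("surname").First().Value); // prints "Some Surname"
    }
}
```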

C# Skip anything to next tag

I have a log file in xml format like
<log> // skip this node
<?xml version="1.0" encoding="UTF-8"?>
<qbean logger="main-logger">
</qbean>
</log>
<log> // go to this node
</log>
Now ReadToNextSibling("log") throws an exception, and I need to skip the content of the first "log" tag and move to the next "log" tag without an exception being thrown.
Is there a way?
Hint:
Your XML is invalid, since the <?xml version="1.0" encoding="UTF-8"?> declaration has to come at the very start of the document, before the root element. You can search for it and remove it if that fixes your problem, for example with yourXml.Replace("<?xml version=\"1.0\" encoding=\"UTF-8\"?>", "").
You also have to wrap the content in a single root element for the XML to be valid for parsing.
Then you can use the XmlDocument class to parse the XML data and skip anything you want. You would need something like this:
var document = new XmlDocument();
document.LoadXml(yourXml);
XmlNode secondLog = document.DocumentElement.ChildNodes[1]; // the second <log> element
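Putting the two hints together, a minimal sketch might look like the following. It uses XDocument rather than XmlDocument purely for brevity. The wrapper element name "logs", the class LogDemo and the <entry> element in the sample are assumptions made for illustration, since the question does not show what the second log contains.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class LogDemo
{
    // Removes every XML declaration and wraps the log entries in one root
    // element so that the text becomes well-formed XML.
    public static XDocument LoadLogs(string rawLog)
    {
        string withoutDeclarations = Regex.Replace(rawLog, @"<\?xml\s[^?]*\?>", string.Empty);
        return XDocument.Parse("<logs>" + withoutDeclarations + "</logs>");
    }

    public static void Main()
    {
        var raw = @"<log>
<?xml version=""1.0"" encoding=""UTF-8""?>
<qbean logger=""main-logger"">
</qbean>
</log>
<log>
<entry>second</entry>
</log>";

        // Skip the first <log> element and take the next one.
        XElement second = LoadLogs(raw).Root.Elements("log").Skip(1).First();
        Console.WriteLine(second.Element("entry").Value); // prints "second"
    }
}
```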

XPATHSelectElements returning null (no namespaces in XML file)

string xmlFile = GetCountriesFile();
XDocument xd = XDocument.Load(xmlFile);
XElement city = xd.XPathSelectElement("/WorldCities/City");
Console.Clear();
// Print the name of the first city
Console.WriteLine(city.Element("Name").Value);
// Get all the cities in the document
// Works in http://xpath.online-toolz.com/tools/xpath-editor.php
// but returns null in .NET
var cities = xd.XPathSelectElements("/WorldCities/City");
// cities is set to null
Console.ReadKey();
The XML file I am using contains no namespaces:
<?xml version="1.0" encoding="utf-8" ?> <!-- XML declaration, there can only be one XML declaration in an XML document -->
<WorldCities> <!-- Root node, there can only be one root node in an XML document -->
<City> <!-- Parent node -->
<Name>Vancouver</Name> <!-- Child node -->
<Country>Canada</Country> <!-- Sibling node of location -->
<Continent>North America</Continent> <!-- Sibling node of location -->
</City>
<City>
<Name>Buenos Aires</Name>
<Country>Argentina</Country>
<Continent>South America</Continent>
</City>
<City>
<Name>Berlin</Name>
<Country>Germany</Country>
<Continent>Europe</Continent>
</City>
<City>
<Name>Nairobi</Name>
<Country>Kenya</Country>
<Continent>Africa</Continent>
</City>
<City>
<Name>Tokyo</Name>
<Country>Japan</Country>
<Continent>Asia</Continent>
</City>
<City>
<Name>Sydney</Name>
<Country>Australia</Country>
<Continent>Australia</Continent>
</City>
</WorldCities>
The XPath tester at http://xpath.online-toolz.com/tools/xpath-editor.php returned all of the City elements when I used the XPath expression "/WorldCities/City" against the same XML. Why then is the XPathSelectElements method returning null? There are no namespaces in the XML file to cause problems.
That method never returns null. It returns an IEnumerable<XElement>, which you have to enumerate in your code to access the selected elements, if there are any. When nothing matches, the sequence is simply empty.
Your XPath is correct and your code works fine with the XML you provided: it returns the first City element for the first query and a collection of 6 cities for the second query.
My first thought was that you are loading some other file. But since your first query returns the first City element, your second query should return at least one city. It looks like you are not using the same XPath for the second query. Make sure your real code is exactly the same as the code you provided.
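A small sketch of how the result can be consumed, using a shortened version of the XML from the question. The class XPathDemo and the method CityNames are illustrative names. Note that XPathSelectElements is an extension method, so the System.Xml.XPath namespace has to be imported.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath; // brings the XPathSelectElements extension method into scope

public static class XPathDemo
{
    // Enumerates the selected elements and collects the city names.
    public static List<string> CityNames(XDocument doc)
    {
        IEnumerable<XElement> cities = doc.XPathSelectElements("/WorldCities/City");
        return cities.Select(c => (string)c.Element("Name")).ToList();
    }

    public static void Main()
    {
        XDocument doc = XDocument.Parse(@"<WorldCities>
  <City><Name>Vancouver</Name><Country>Canada</Country></City>
  <City><Name>Berlin</Name><Country>Germany</Country></City>
</WorldCities>");

        List<string> names = CityNames(doc);
        Console.WriteLine(names.Count);              // prints 2
        Console.WriteLine(string.Join(", ", names)); // prints "Vancouver, Berlin"

        // A path that matches nothing yields an empty sequence, not null.
        Console.WriteLine(doc.XPathSelectElements("/WorldCities/Town").Any()); // prints "False"
    }
}
```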

Get single value from XML File in C#

I am trying to pull a single value from XML stored in a variable in a C# console application.
Here is my XML:
string myxml = @"<?xml version='1.0' encoding='utf-8'?>
<params>
<rowsEffected>1</rowsEffected>
</params>
<data>
<rowData>
<row>
<answer>1234</answer>
</row>
</rowData>
</data>";
var doc = XDocument.Parse(myxml); //This is as far as I can get
I have read through many tutorials but can't get this simple task to work.
I want to extract the value from the "answer" tag, so my result should be 1234
The XML will always have one record.
Any help would be greatly appreciated.
Your XML is invalid: a document can have only one root element. In your XML, params and data are both top-level elements, which is not allowed. Try it out for yourself at: http://www.xmlvalidation.com/
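If the input cannot be corrected at the source, one possible workaround is to drop the leading declaration and wrap the rest in a single root element before parsing. The sketch below assumes that approach; the wrapper name "root", the class SingleValueDemo and the method ReadAnswer are invented for the example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class SingleValueDemo
{
    // Returns the text of the first <answer> element, or null if there is none.
    public static string ReadAnswer(string xml)
    {
        // Remove the declaration at the start, because it may not appear
        // after the wrapper element that is added next.
        string body = Regex.Replace(xml, @"^\s*<\?xml\s[^?]*\?>", string.Empty);
        XDocument doc = XDocument.Parse("<root>" + body + "</root>");
        return (string)doc.Descendants("answer").FirstOrDefault();
    }

    public static void Main()
    {
        string myxml = @"<?xml version='1.0' encoding='utf-8'?>
<params>
<rowsEffected>1</rowsEffected>
</params>
<data>
<rowData>
<row>
<answer>1234</answer>
</row>
</rowData>
</data>";

        Console.WriteLine(ReadAnswer(myxml)); // prints "1234"
    }
}
```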

C# XmlDocument Nodes

I'm trying to access UPS tracking info and, as per their example, I need to build a request like so:
<?xml version="1.0" ?>
<AccessRequest xml:lang='en-US'>
<AccessLicenseNumber>YOURACCESSLICENSENUMBER</AccessLicenseNumber>
<UserId>YOURUSERID</UserId>
<Password>YOURPASSWORD</Password>
</AccessRequest>
<?xml version="1.0" ?>
<TrackRequest>
<Request>
<TransactionReference>
<CustomerContext>guidlikesubstance</CustomerContext>
</TransactionReference>
<RequestAction>Track</RequestAction>
</Request>
<TrackingNumber>1Z9999999999999999</TrackingNumber>
</TrackRequest>
I'm having a problem creating this with 1 XmlDocument in C#. When I try to add the second:
<?xml version="1.0" ?> or the <TrackRequest>
it throws an error:
System.InvalidOperationException: This
document already has a
'DocumentElement' node.
I'm guessing this is because a standard XmlDocument would only have 1 root node. Any ideas?
Here's my code so far:
XmlDocument xmlDoc = new XmlDocument();
XmlDeclaration xmlDeclaration = xmlDoc.CreateXmlDeclaration("1.0", "utf-8", null);
XmlElement rootNode = xmlDoc.CreateElement("AccessRequest");
rootNode.SetAttribute("xml:lang", "en-US");
xmlDoc.InsertBefore(xmlDeclaration, xmlDoc.DocumentElement);
xmlDoc.AppendChild(rootNode);
XmlElement licenseNode = xmlDoc.CreateElement("AccessLicenseNumber");
XmlElement userIDNode = xmlDoc.CreateElement("UserId");
XmlElement passwordNode = xmlDoc.CreateElement("Password");
XmlText licenseText = xmlDoc.CreateTextNode("mylicense");
XmlText userIDText = xmlDoc.CreateTextNode("myusername");
XmlText passwordText = xmlDoc.CreateTextNode("mypassword");
rootNode.AppendChild(licenseNode);
rootNode.AppendChild(userIDNode);
rootNode.AppendChild(passwordNode);
licenseNode.AppendChild(licenseText);
userIDNode.AppendChild(userIDText);
passwordNode.AppendChild(passwordText);
XmlElement rootNode2 = xmlDoc.CreateElement("TrackRequest");
xmlDoc.AppendChild(rootNode2);
An XML document can only ever have one root node. Otherwise it's not well-formed. You will need to create two XML documents and join them together if you need to send both at once.
It's throwing an exception because you are trying to create invalid XML. XmlDocument will only generate well-formed XML.
You could do it using an XmlWriter with XmlWriterSettings.ConformanceLevel set to Fragment, or you could create two XmlDocuments and write them out into the same stream.
Build two separate XML documents and concatenate their string representations.
It looks like your node structure will always be the same. (I don't see any conditional logic.) If the structure is constant, you could define an XML template string. Load that string into an XmlDocument and use SelectSingleNode to populate the individual nodes.
That may be simpler and cleaner than programmatically creating the root, elements and nodes.
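A minimal sketch of the template idea combined with the "two documents, concatenated" suggestion. The class UpsRequestDemo, the method FillTemplate and the template strings are assumptions written for this example; the element names come from the request shown in the question, and the credential values are placeholders.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class UpsRequestDemo
{
    // Loads a template, sets the text of each node addressed by an XPath
    // expression, and returns the document as a string.
    public static string FillTemplate(string template, IDictionary<string, string> valuesByXPath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(template);
        foreach (KeyValuePair<string, string> pair in valuesByXPath)
        {
            XmlNode node = doc.SelectSingleNode(pair.Key);
            if (node == null)
            {
                throw new ArgumentException("No node matches " + pair.Key);
            }
            node.InnerText = pair.Value;
        }
        // OuterXml of the document includes the XML declaration node.
        return doc.OuterXml;
    }

    public static void Main()
    {
        var accessTemplate = @"<?xml version=""1.0""?><AccessRequest xml:lang=""en-US""><AccessLicenseNumber/><UserId/><Password/></AccessRequest>";
        var trackTemplate = @"<?xml version=""1.0""?><TrackRequest><Request><TransactionReference><CustomerContext/></TransactionReference><RequestAction>Track</RequestAction></Request><TrackingNumber/></TrackRequest>";

        string access = FillTemplate(accessTemplate, new Dictionary<string, string>
        {
            { "/AccessRequest/AccessLicenseNumber", "mylicense" },
            { "/AccessRequest/UserId", "myusername" },
            { "/AccessRequest/Password", "mypassword" }
        });
        string track = FillTemplate(trackTemplate, new Dictionary<string, string>
        {
            { "/TrackRequest/Request/TransactionReference/CustomerContext", "guidlikesubstance" },
            { "/TrackRequest/TrackingNumber", "1Z9999999999999999" }
        });

        // Two well-formed documents, sent one after the other as a single payload.
        Console.WriteLine(access + track);
    }
}
```

Each document stays well-formed on its own, so XmlDocument never has to hold two root elements; only the final payload is a concatenation.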
