I have an XML document and would like to get all of its elements. I tried Descendants() and DescendantNodes(), but both returned repeated nodes.
For example, here is my XML:
<Root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<FirstElement xsi:type="myType">
<SecondElement>A</SecondElement>
</FirstElement>
</Root>
and when I use this snippet:
XElement Elements = XElement.Parse(XML);
IEnumerable<XElement> xElement = Elements.Descendants();
IEnumerable<XNode> xNodes = Elements.DescendantNodes();
foreach (XNode node in xNodes )
{
stringBuilder.Append(node);
}
it gives me two nodes, but <SecondElement> is repeated. I know Descendants() returns the children, and the children of each child, all the way down, but is there a way to avoid the repetition?
This is the content of my stringBuilder:
<FirstElement xsi:type="myType" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<SecondElement>A</SecondElement>
</FirstElement>
<SecondElement>A</SecondElement>
Well, do you actually want all the descendants, or just the top-level elements? If you only want the top-level ones, use the Elements() method, which returns all the elements directly under the current node.
The problem isn't that nodes are being repeated - it's that the higher-level nodes include the lower level nodes. So the higher-level node is being returned, then the lower-level one, and you're writing out the whole of both of those nodes, which means you're writing out the lower-level node twice.
If you just write out, say, the name of the node you're looking at, you won't see a problem. But you haven't said what you're really trying to do, so I don't know if that helps...
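A minimal sketch of that suggestion, assuming you only need each element's name: printing the name rather than the whole node shows that Descendants() yields each element exactly once. The class and method names here are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class DescendantNamesDemo
{
    // Returns the local name of every descendant element, in document order.
    public static string[] DescendantNames(string xml)
    {
        XElement root = XElement.Parse(xml);
        return root.Descendants().Select(e => e.Name.LocalName).ToArray();
    }

    public static void Main()
    {
        string xml =
            "<Root xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">" +
            "<FirstElement xsi:type=\"myType\">" +
            "<SecondElement>A</SecondElement>" +
            "</FirstElement>" +
            "</Root>";

        // Prints FirstElement, then SecondElement: each element appears once.
        foreach (string name in DescendantNames(xml))
        {
            Console.WriteLine(name);
        }
    }
}
```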
XmlDocument doc = new XmlDocument();
doc.LoadXml(XML);
XmlNodeList allElements = doc.SelectNodes("//*");
foreach(XmlElement element in allElements)
{
// your code here
}
Take the following XML as example:
<root>
<lines>
<line>
<number>1</number>
</line>
<line>
<number>2</number>
</line>
</lines>
</root>
XmlNodeList nodeList = doc.SelectNodes("//lines/line");
foreach(XmlNode node in nodeList)
{
int index = int.Parse(node.SelectSingleNode("//number").InnerText);
}
The above code will result in index = 1 for both iterations.
foreach(XmlNode node in nodeList)
{
int index = int.Parse(node.SelectSingleNode("number").InnerText);
}
The above code will result in 1 and 2 respectively. I know that // finds the first occurrence of the XPath, but I feel like the first occurrence should be relative to the node itself. The behavior appears to find the first occurrence from the root, even when selecting nodes from a child node. Is this the way Microsoft intended this to work, or is this a bug?
Yeah, thanks, but just removing the slashes worked as well, as in my second example.
Removing the slashes only works because number is an immediate child element of line. If it were further down in the hierarchy:
<root>
<lines>
<line>
<other>
<number>1</number>
</other>
</line>
</lines>
</root>
you would still need to use .//number.
I just think it is confusing that if you are searching for node within a node that // would go back to the whole document.
That's just how XPath syntax is designed. // at the beginning of an XPath expression means that the evaluation context is the document node - the outermost node of an XML document. .// means that the context of the path expression is the current context node.
If you think about it, it is actually useful to have a way to select from the whole document in any context.
Is this the way microsoft intended this to work or is this a bug.
Microsoft is implementing the XPath standard, and yes, this is how the W3C intended an XPath library to work and it's not a bug.
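To make the difference concrete, here is a small sketch that runs both expressions against each line node. It uses the nested layout from above with a second line element added; the class and method names are hypothetical.

```csharp
using System;
using System.Xml;

public static class RelativeXPathDemo
{
    // Evaluates the given XPath against every <line> element and
    // returns the text of the first node it selects for each one.
    public static string[] Values(string xml, string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlNodeList lines = doc.SelectNodes("//lines/line");
        var result = new string[lines.Count];
        for (int i = 0; i < lines.Count; i++)
        {
            result[i] = lines[i].SelectSingleNode(xpath).InnerText;
        }
        return result;
    }

    public static void Main()
    {
        string xml =
            "<root><lines>" +
            "<line><other><number>1</number></other></line>" +
            "<line><other><number>2</number></other></line>" +
            "</lines></root>";

        // "//number" starts at the document node, so both lines see 1.
        Console.WriteLine("//number:  " + string.Join(", ", Values(xml, "//number")));
        // ".//number" starts at the current line, so the lines see 1 and 2.
        Console.WriteLine(".//number: " + string.Join(", ", Values(xml, ".//number")));
    }
}
```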
I have a UI that uses the DataGridView to display the content of XML files.
If an XmlNode contains only InnerText, it's quite simple; however, I'm having a problem with nodes that contain child nodes (and not only a string).
Simple
<node>value</node>
Displayed as "value" in DataGridViewCell.
Complex
<node>
<foo>bar</foo>
<foo2>bar</foo2>
</node>
The problem is that the InnerXml string is not indented, and it's very hard to modify in the UI.
I've tried to use XmlTextWriter to "beautify" the string - it works quite well, but it requires an XmlNode (which includes the node itself, not only its child nodes), so I cannot assign the result back to InnerXml.
I would like to either see following in the UI:
<foo>bar</foo>
<foo2>bar</foo2>
(this can be assigned to InnerXml afterwards)
Or
<node>
<foo>bar</foo>
<foo2>bar</foo2>
</node>
(and find a way how to replace OuterXml with this string).
Thanks for any ideas,
Martin
You can load the OuterXml into an XElement, then use String.Join() to join all child elements of the root node (in other words, the InnerXml), separated by line breaks. For example:
XElement e = XElement.Parse(something.OuterXml);
var result = string.Join(
Environment.NewLine,
e.Elements().Select(o => o.ToString())
);
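The round trip can be sketched as follows, assuming the node has only element children (text placed directly under the node would be dropped by Elements()). The helper name IndentedInner is made up for this example.

```csharp
using System;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class InnerXmlDemo
{
    // Builds a one-element-per-line version of the node's inner XML.
    // Only child elements are kept; direct text children are ignored.
    public static string IndentedInner(XmlNode node)
    {
        XElement e = XElement.Parse(node.OuterXml);
        return string.Join(
            Environment.NewLine,
            e.Elements().Select(o => o.ToString()));
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<node><foo>bar</foo><foo2>bar</foo2></node>");

        string text = IndentedInner(doc.DocumentElement);
        Console.WriteLine(text);

        // The edited string can be assigned back to InnerXml afterwards.
        doc.DocumentElement.InnerXml = text;
    }
}
```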
Assume this simple XML fragment, which may or may not include the XML declaration. It has exactly one NodeElement as the root node, containing exactly one other NodeElement, which may in turn contain any number of different kinds of elements.
<?xml version="1.0"?>
<NodeElement xmlns="xyz">
<NodeElement xmlns="">
<SomeElement></SomeElement>
</NodeElement>
</NodeElement>
How could I go about selecting the inner NodeElement and its contents without the namespace? For instance, "//*[local-name()='NodeElement/NodeElement[1]']" (and other variations I've tried) doesn't seem to yield results.
More generally, what I'm really trying to accomplish is to deserialize a fragment of a larger XML document contained in an XmlDocument. Something like the following:
var doc = new XmlDocument();
doc.LoadXml(File.ReadAllText(@"trickynodefile.xml")); //ReadAllText to avoid Unicode trouble.
var n = doc.SelectSingleNode("//*[local-name()='NodeElement/NodeElement[1]']");
using(var reader = XmlReader.Create(new StringReader(n.OuterXml)))
{
var obj = new XmlSerializer(typeof(NodeElementNodeElement)).Deserialize(reader);
}
I believe I'm just missing the right XPath expression, which seems to be rather elusive. Any help is much appreciated!
Try this:
/*/*
It selects children of the root node.
Or
/*/*[local-name() = 'NodeElement']
It selects children with local-name() = 'NodeElement' of the root node.
Anyway in your case both expressions select <NodeElement xmlns="">.
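A short sketch of the second expression run through XmlDocument, using the fragment from the question without the declaration. The class and method names are illustrative only.

```csharp
using System;
using System.Xml;

public static class InnerNodeDemo
{
    // Selects the NodeElement child of the root regardless of namespace.
    public static XmlNode SelectInner(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.SelectSingleNode("/*/*[local-name() = 'NodeElement']");
    }

    public static void Main()
    {
        string xml =
            "<NodeElement xmlns=\"xyz\">" +
            "<NodeElement xmlns=\"\">" +
            "<SomeElement></SomeElement>" +
            "</NodeElement>" +
            "</NodeElement>";

        XmlNode inner = SelectInner(xml);

        // The inner element is in no namespace, so NamespaceURI is empty.
        Console.WriteLine(inner.LocalName);
        Console.WriteLine(inner.NamespaceURI.Length);
        Console.WriteLine(inner.OuterXml);
    }
}
```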
Walk the tree:
foreach(XmlNode node in doc.DocumentElement.ChildNodes[0].ChildNodes)
{
// do something with node
}
This is hideously fragile, of course; you might want to check for nulls here and there.
Apparently the XmlNode.ChildNodes list (in C#/.NET) contains not only real child nodes, but also special whitespace nodes. So even in the simplest case, with one tag inside another, you can get parentNode.ChildNodes.Count == 3. How can I get around this?
Already tried:
xmlDocument.PreserveWhitespace = false;
Also:
foreach(XmlNode node in xmlDocument.SelectNodes("//*"))
if (node is XmlWhitespace)
node.ParentNode.RemoveChild(node);
Text nodes are first class children. I guess you want Element nodes only. Can't you do
node.SelectNodes("*")
Or are you saying that <root><child/></root> results in root having three child nodes?
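As a rough illustration, assuming the document was loaded with PreserveWhitespace set to true (which is when whitespace nodes show up in ChildNodes), SelectNodes("*") returns only the element children. The class and method names are hypothetical.

```csharp
using System;
using System.Xml;

public static class ElementChildrenDemo
{
    // Returns { count of all child nodes, count of element children }
    // for the root element, with whitespace nodes preserved on load.
    public static int[] CountChildren(string xml)
    {
        var doc = new XmlDocument();
        doc.PreserveWhitespace = true; // must be set before loading
        doc.LoadXml(xml);
        XmlElement root = doc.DocumentElement;
        return new[] { root.ChildNodes.Count, root.SelectNodes("*").Count };
    }

    public static void Main()
    {
        int[] counts = CountChildren("<root>\n  <child />\n</root>");

        // All children: whitespace, <child />, whitespace.
        Console.WriteLine(counts[0]); // 3
        // Element children only.
        Console.WriteLine(counts[1]); // 1
    }
}
```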
Why not just use the following? You won't be able to remove the node from the parent, because then you're modifying the collection while you're enumerating which isn't allowed.
foreach(XmlNode node in xmlDocument.SelectNodes("//*"))
{
if (node is XmlWhitespace)
continue;
else
{
// A real node
}
}
You can do something simple like this.
xmlDocument.SelectNodes("//*").OfType<XmlElement>();
This will filter for only nodes of type XmlElement (meaning "real" nodes). It will exclude CData, whitespace, text, etc.
Make sure to add Linq namespace:
using System.Linq;
Is it possible to get the opening tag from an XmlNode, with all attributes, namespaces, etc.?
E.g.
<root xmlns="urn:..." rattr="a">
<child attr="1">test</child>
</root>
I would like to retrieve the entire opening tag, exactly as retrieved from the original XML document if possible, from the XmlNode and later the closing tag. Both as strings.
Basically XmlNode.OuterXml without the child nodes.
EDIT
To elaborate, XmlNode.OuterXml on a node that was created with the XML above would return the entire XML fragment, including child nodes as a single string.
XmlNode.InnerXml on that same fragment would return the child nodes but not the parent node, again as a single string.
But I need the opening tag for the XML fragment without the children nodes. And without building it using the XmlAttribute array, LocalName, Namespace, etc.
This is C# 3.5
Thanks
Is there some reason you can't simply say:
string s = n.OuterXml.Substring(0, n.OuterXml.IndexOf(">") + 1);
I think the simplest way would be to call XmlNode.CloneNode(false) which (according to the docs) will clone all the attributes but not child nodes. You can then use OuterXml - although that will give you the closing tag as well.
For example:
using System;
using System.Xml;
public class Test
{
static void Main()
{
XmlDocument doc = new XmlDocument();
doc.LoadXml(@"<root xmlns='urn:xyz' rattr='a'>
<child attr='1'>test</child></root>");
XmlElement root = doc.DocumentElement;
XmlNode clone = root.CloneNode(false);
Console.WriteLine(clone.OuterXml);
}
}
Output:
<root xmlns="urn:xyz" rattr="a"></root>
Note that this may not be exactly as per the original XML document, in terms of the ordering of attributes etc. However, it will at least be equivalent.
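If you need the opening and closing tags as separate strings, one possible follow-up is to split the shallow clone's OuterXml at the last "</". This is only a sketch: SplitTags is a made-up helper, and it relies on OuterXml escaping "<" inside attribute values, so "</" can only mark the closing tag.

```csharp
using System;
using System.Xml;

public static class OpenTagDemo
{
    // Returns { opening tag, closing tag } for the element, without children.
    // An element serialized in self-closing form yields an empty closing tag.
    public static string[] SplitTags(XmlElement element)
    {
        string shell = element.CloneNode(false).OuterXml;
        int close = shell.LastIndexOf("</", StringComparison.Ordinal);
        if (close < 0)
        {
            return new[] { shell, string.Empty };
        }
        return new[] { shell.Substring(0, close), shell.Substring(close) };
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<root xmlns='urn:xyz' rattr='a'><child attr='1'>test</child></root>");

        string[] tags = SplitTags(doc.DocumentElement);
        Console.WriteLine(tags[0]); // the opening tag with its attributes
        Console.WriteLine(tags[1]); // the closing tag
    }
}
```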
How about:
xmlNode.OuterXml.Replace(xmlNode.InnerXml, String.Empty);
Poor man's solution :)