XDocument: Difference between .Root.Value and .Root.ToString() (C#)

Does anybody know the difference between the two statements below?
xdoc.Root.Value;
and
xdoc.Root.ToString();
From my own research, I can see that the first line removes the root node and replaces '\r\n' with '\n', whereas the second one keeps the content as in the original. Am I correct? Is there any documentation to back that up?
I want to use the first line but keep the original Windows newlines. Is there a way to do that?

Did you read the documentation?
Value:
A String that contains all of the text content of this element. If there are multiple text nodes, they will be concatenated.
ToString():
Returns the indented XML for this node.

The primary difference is:
ToString() includes the root element's tags and the indentation.
For example:
<Root>
  <Child1>1</Child1>
</Root>
Value, on the other hand, includes no tags at all, neither the root's nor the children's, and no indentation; it returns only the concatenated text content of the element and its descendants.
For example:
1
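To make the difference concrete, here is a minimal sketch that uses the same small document as the example above. It is written with top-level statements (C# 9 or later); the variable names are only illustrative.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical sample document matching the example above.
XDocument xdoc = XDocument.Parse("<Root><Child1>1</Child1></Root>");

// Value: concatenated text of the element and its descendants, with no tags.
string text = xdoc.Root.Value;

// ToString(): the serialized element, root tags included, indented by default.
string indented = xdoc.Root.ToString();

// SaveOptions.DisableFormatting serializes without adding indentation.
string compact = xdoc.Root.ToString(SaveOptions.DisableFormatting);

Console.WriteLine(text);
Console.WriteLine(indented);
Console.WriteLine(compact);
```

Here Value yields just 1, while both ToString() variants include the <Root> and <Child1> tags.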

Related

XDocument.Parse preserving unwanted whitespace

XDocument.Parse is retaining unwanted white space when parsing my XML. It appears that my XML is "not indented," which means that white space is retained regardless of whether or not I send in the LoadOptions.PreserveWhitespace flag (http://msdn.microsoft.com/en-us/library/bb551294(v=vs.110).aspx).
This means that when I have XML like the following:
<?xml version="1.0" encoding="UTF-8"?>
<blah:Root xmlns:blah="example.blah.com">
<blah:Element>
value
</blah:Element>
</blah:Root>
and then look at
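XDocument xDoc = XDocument.Parse(blahXml);
// The element is in the "example.blah.com" namespace, so the name must be qualified.
XNamespace blah = "example.blah.com";
XElement xEl = xDoc.Root.Element(blah + "Element");
string value = xEl.Value;
Console.WriteLine(value);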
it will print "\n value\n" instead of "value".
How do I make XDocument.Parse always ignore white space regardless of whether or not I give it indented or not-indented XML?
White space between elements can be ignored. For example,
<root>
<foo>foo 1</foo>
<foo>foo 2</foo>
</root>
can be parsed into a root element node with two foo child element nodes if white space is ignored, or into a root element node with five child nodes (text node, foo element node, text node, foo element node, text node) if it is preserved. But if an element contains text data that includes white space, that white space is considered significant. So your only option is indeed to use a method like Trim in the .NET Framework, or a function like normalize-space in XPath, to remove the white space when you process the element.
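Both suggestions can be sketched as follows. The XML is shaped like the question's, but the exact indentation inside blah:Element is an assumption, since the original whitespace is not shown.

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath;

// Sample XML shaped like the question's; the indentation is an assumption.
string blahXml =
    "<blah:Root xmlns:blah=\"example.blah.com\">\n" +
    "  <blah:Element>\n" +
    "    value\n" +
    "  </blah:Element>\n" +
    "</blah:Root>";

XDocument xDoc = XDocument.Parse(blahXml);
XNamespace blah = "example.blah.com";
XElement xEl = xDoc.Root.Element(blah + "Element");

// The raw value keeps the surrounding line feeds and spaces.
string raw = xEl.Value;

// Option 1: trim in .NET.
string trimmed = xEl.Value.Trim();

// Option 2: let XPath normalize the white space of the context node.
string normalized = (string)xEl.XPathEvaluate("normalize-space(.)");

Console.WriteLine(trimmed);
Console.WriteLine(normalized);
```

Trim only strips leading and trailing white space, whereas normalize-space also collapses internal runs to a single space, so the two can differ for multi-word content.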

How to figure out the Nth node of something that I currently am in using XPATh

Ok. I have an attribute in an xml document that I know will occur more than once. Using C# I loop through all the nodes that have this attribute. I know how to count the occurrence of an element using xpath...
count(//x/y[@b])
and so on.
But is there a way that I can get the n-th value of a node that I am on... for example
<?xml version="1.0"?>
<x>
<y/>
<y/>
<y/>
</x>
Let's say I was looping through that programmatically using C#, and let's say I was on the second element. Is there any way using XPath that I could figure out that I am on the 2nd node? I guess I am just trying to find my position in the iteration. Any ideas? Currently scouring the internet. If I find it out I will be sure to let you know.
Thanks.
UPDATE: CAN'T SEEM to get my stuff to work
Ok. I thought I would update my question. I can't seem to get any of your suggestions working...
<Template>
<TemplateData>
<ACOData>
<POPULATION_PATIENT_ID>6161</POPULATION_PATIENT_ID>
<PATIENT_ID>4329</PATIENT_ID>
</ACOData>
<ACOData>
<POPULATION_PATIENT_ID>5561</POPULATION_PATIENT_ID>
<PATIENT_ID>4327</PATIENT_ID>
</ACOData>
<ACOData>
<POPULATION_PATIENT_ID>6160</POPULATION_PATIENT_ID>
<PATIENT_ID>4321</PATIENT_ID>
</ACOData>
<ACOData>
<POPULATION_PATIENT_ID>5561</POPULATION_PATIENT_ID>
<PATIENT_ID>4320</PATIENT_ID>
</ACOData>
That is the XML that I am using. But I can't seem to get the correct count. I am always coming up with zero?
encounter = Int32.Parse((patElm.CreateNavigator().Evaluate("count(/Template/TemplateData/ACOData/POPULATION_PATIENT_ID[.='" + populationPatID + "']/preceding-sibling::ACOData/POPULATION_PATIENT_ID[.='"+populationPatID+"'])")).ToString());
The above is the code that I am attempting to use to get the correct value... Note my count function
count(/Template/TemplateData/ACOData/POPULATION_PATIENT_ID[.='" + populationPatID + "']/preceding-sibling::ACOData/POPULATION_PATIENT_ID[.='"+populationPatID+"'])"
To get the second such element in the document use:
(//x/y[@b])[2]
Suppose you want to go the other way. That is, you have one of these nodes and you want to know its overall position. In general, for any expression <expr> the following is true:
$n = count((<expr>)[$n]/preceding::*[count(.|<expr>)=count(<expr>)]) + 1
That is, the position of the Nth element selected by <expr> can be found by counting all the preceding elements also selected by that expression and adding one. Using similar techniques, we can find the position of some node that would be selected by a more general expression, within the set of all nodes selected by that expression.
For example, suppose we have the following document:
<x>
<y b="true"/>
<y b="true"/>
<y/>
<y/>
<x><y b="true"/><y/><y b="true">77</y></x>
<y/>
<y/>
</x>
And we want to know the position in the document of the node at /*/*/y[.='77'] among all nodes selected by //x/y[@b]. Then use the following expression:
count(/*/*/y[.='77']/preceding::*[count(.|//x/y[@b])=count(//x/y[@b])]) + 1
A more specific one-off solution looks like this:
count(/*/*/y[.='77']/preceding::y[parent::x and @b]) + 1
Result (in both cases):
4
Note: It's assumed that /*/*/y[.='77'] and (<expr>)[$n] above actually select some node in the document. If not, the result will be an erroneous 1 due to adding 1 to the result of the count. For this reason, this method is probably most useful when working on a context node or when it is guaranteed that your initial expression selects a node. (Of course, initial error checking can be employed, as well.)
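As a rough check from C#, both expressions can be evaluated with the XPathEvaluate extension method from System.Xml.XPath. This is only a sketch: the document is the sample above written on fewer lines, and the variable names are illustrative.

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath;

// The sample document from above.
XDocument doc = XDocument.Parse(
    "<x>" +
    "<y b=\"true\"/><y b=\"true\"/><y/><y/>" +
    "<x><y b=\"true\"/><y/><y b=\"true\">77</y></x>" +
    "<y/><y/>" +
    "</x>");

string general =
    "count(/*/*/y[.='77']/preceding::*[count(.|//x/y[@b])=count(//x/y[@b])]) + 1";
string specific =
    "count(/*/*/y[.='77']/preceding::y[parent::x and @b]) + 1";

// XPath numbers are returned as double.
double generalPosition = (double)doc.XPathEvaluate(general);
double specificPosition = (double)doc.XPathEvaluate(specific);

Console.WriteLine(generalPosition);
Console.WriteLine(specificPosition);
```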
Let's say I was looping through that programmatically using C#, and
let's say I was on the second element. Is there any way using XPath
that I could figure out that I am on the 2nd node?
Suppose, as you say, that the current (initial context) node is /x/y[2] and you want to see what is its "position".
Evaluate this XPath expression (off the current node):
count(preceding-sibling::y) + 1
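Inside a C# loop, that expression can be evaluated relative to the element currently being visited. The sketch below assumes LINQ to XML and the three-element document from the question; the LINQ count is shown only as a possible alternative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

XDocument doc = XDocument.Parse("<x><y/><y/><y/></x>");

var xpathPositions = new List<int>();
var linqPositions = new List<int>();

foreach (XElement y in doc.Root.Elements("y"))
{
    // XPath, evaluated with the current element as the context node.
    xpathPositions.Add((int)(double)y.XPathEvaluate("count(preceding-sibling::y) + 1"));

    // The same idea in LINQ to XML, without XPath.
    linqPositions.Add(y.ElementsBeforeSelf("y").Count() + 1);
}

Console.WriteLine(string.Join(",", xpathPositions));
Console.WriteLine(string.Join(",", linqPositions));
```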
You can use the position() function to select the node at a given position, for example the third y:
x/y[position() = 3]
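For the XML in the update, one possible reading of the zero result is that the preceding-sibling axis in that expression starts from POPULATION_PATIENT_ID, which has no ACOData siblings, so nothing can match. A sketch that starts from the ACOData element instead is shown below. The closing tags are added as an assumption, and treating the fourth ACOData as the current node is purely illustrative.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

// The XML from the update; the closing tags are an assumption.
XDocument doc = XDocument.Parse(
    "<Template><TemplateData>" +
    "<ACOData><POPULATION_PATIENT_ID>6161</POPULATION_PATIENT_ID><PATIENT_ID>4329</PATIENT_ID></ACOData>" +
    "<ACOData><POPULATION_PATIENT_ID>5561</POPULATION_PATIENT_ID><PATIENT_ID>4327</PATIENT_ID></ACOData>" +
    "<ACOData><POPULATION_PATIENT_ID>6160</POPULATION_PATIENT_ID><PATIENT_ID>4321</PATIENT_ID></ACOData>" +
    "<ACOData><POPULATION_PATIENT_ID>5561</POPULATION_PATIENT_ID><PATIENT_ID>4320</PATIENT_ID></ACOData>" +
    "</TemplateData></Template>");

// Hypothetical current node: the fourth ACOData (the second one with ID 5561).
XElement current = doc.Root.Element("TemplateData").Elements("ACOData").ElementAt(3);
string populationPatID = current.Element("POPULATION_PATIENT_ID").Value;

// Count earlier ACOData siblings carrying the same ID, then add one.
string xpath =
    "count(preceding-sibling::ACOData[POPULATION_PATIENT_ID='" + populationPatID + "']) + 1";
int encounter = (int)(double)current.XPathEvaluate(xpath);

Console.WriteLine(encounter);
```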

How to get XElement value with spaces?

I have the following XElement:
<title>
  <bold>Foo</bold>
  <italic>Bar</italic>
</title>
When I read the Value property, it returns FooBar without a space. How can I fix it?
By definition, the Value of the <title> element is the concatenation of all text in this element. By default whitespace between elements and their contents is ignored, so it gives "FooBar". You can specify that you want to preserve whitespace:
var element = XElement.Parse(xml, LoadOptions.PreserveWhitespace);
However it will preserve all whitespace, including the line feeds and indentation. In your XML, there is a line feed and two spaces between "Foo" and "Bar"; how is it supposed to guess that you only want to keep one space?
From the documentation for the Value property of the XElement class:
Gets or sets the concatenated text contents of this element.
Given your example, this behavior is expected. If you want spaces, you will have to provide the logic to do it.
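What "providing the logic" could look like is sketched below with two options. Neither is the one right answer; which one fits depends on how the text should be separated. The two-space indentation in the sample string is an assumption.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

string xml = "<title>\n  <bold>Foo</bold>\n  <italic>Bar</italic>\n</title>";

// Default parsing drops whitespace-only text nodes, so the texts run together.
XElement title = XElement.Parse(xml);
string joined = title.Value;

// Option 1: join the child elements' values with the separator you want.
string spaced = string.Join(" ", title.Elements().Select(e => e.Value));

// Option 2: preserve whitespace, then collapse every run to a single space.
XElement preserved = XElement.Parse(xml, LoadOptions.PreserveWhitespace);
string collapsed = Regex.Replace(preserved.Value.Trim(), @"\s+", " ");

Console.WriteLine(joined);
Console.WriteLine(spaced);
Console.WriteLine(collapsed);
```

Option 1 ignores any text that sits directly inside <title> rather than in a child element; option 2 keeps it.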

How to read from an XML file without \n\t\t myValue \n\t\t

I'm reading from an XML file and my values seem to come out all right, except that they are wrapped in \n\t\t... presumably something to do with spacing. How do I tell C# to ignore this?
I'm assuming that your XML looks something like this:
<foo>
<bar>
baz
</bar>
</foo>
While that may look pretty for a person, the XML parser is required to preserve all whitespace. Nor is the parser permitted to combine whitespace that appears on either side of an element (so in this case the DOM representation of foo has three children: a Text node containing only whitespace, an Element node for bar, and another Text node containing only whitespace).
Bottom line is that Fredrik's answer is the correct one, but I figured that the rationale behind the behavior was important.
Did you try to call the Trim method on the returned string? That should strip off line feeds, tabs and spaces from the start and end of the string.
I faced this issue in my Java program rather than in C#, but this suggestion may help someone.
These are the steps I followed:
1. Before parsing, read the complete XML as a String.
2. Trim the string.
3. Replace all "\n" with "".
4. Replace all "\t" with "".
After that, it parsed perfectly.

Using XPath in SelectSingleNode: Retrieving individual element from XML if it's present

My XML looks like this:
<?xml version="1.0"?>
<itemSet>
<Item>one</Item>
<Item>two</Item>
<Item>three</Item>
.....maybe more Items here.
</itemSet>
Any individual Item may or may not be present. Say I want to retrieve the element <Item>two</Item> if it's present. I've tried the following XPaths (in C#).
XmlNode node = myXMLdoc.SelectSingleNode("/itemSet[Item='two']"); --- If Item two is present, this returns only the first element, one. Maybe this query just points to the first element in itemSet if it has an Item of value two somewhere as a child. Is this interpretation correct?
So I tried:
XmlNode node = myXMLdoc.SelectSingleNode("/itemSet[Item='two']/Item[1]"); --- I read this query as: return the first <Item> element within itemSet that has value = 'two'. Am I correct?
This still returns only the first element one. What am I doing wrong?
In both cases I can traverse the child nodes using the siblings and get to two, but that's not what I am after. Also, if two is absent, SelectSingleNode returns null. So the very fact that I get a successful return node does indicate the presence of element two; had I wanted a boolean test to check for the presence of two, either of the above XPaths would suffice. But I actually need the full element <Item>two</Item> as my return node.
[My first question here, and my first time working with web programming, so I just learned the above XPaths and related xml stuff on the fly right now from past questions in SO. So be gentle, and let me know if I am a doofus or flouting any community rules. Thanks.]
I think you want:
myXMLdoc.SelectSingleNode("/itemSet/Item[text()='two']")
In other words, you want the Item which has text of two, not the itemSet containing it.
You can also use a single dot to indicate the context node, in your case:
myXMLdoc.SelectSingleNode("/itemSet/Item[.='two']")
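A small sketch of what each query returns, assuming an XmlDocument loaded with the three items from the question (the variable names other than myXMLdoc are illustrative):

```csharp
using System;
using System.Xml;

XmlDocument myXMLdoc = new XmlDocument();
myXMLdoc.LoadXml("<itemSet><Item>one</Item><Item>two</Item><Item>three</Item></itemSet>");

// The predicate filters itemSet, so the itemSet element itself is returned.
XmlNode container = myXMLdoc.SelectSingleNode("/itemSet[Item='two']");

// With the predicate on Item, the matching Item element is returned.
XmlNode item = myXMLdoc.SelectSingleNode("/itemSet/Item[.='two']");

// No match yields null, which also serves as the presence test.
XmlNode missing = myXMLdoc.SelectSingleNode("/itemSet/Item[.='four']");

Console.WriteLine(container.Name);
Console.WriteLine(item.OuterXml);
Console.WriteLine(missing == null);
```

The first query selects itemSet itself, not its first child; appending /Item[1] to it then picks the first Item regardless of value, which is why the second attempt also lands on one.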
EDIT: The difference between . and text() is that . means "this node" effectively, and text() means "all the text node children of this node". In both cases the comparison will be against the "string-value" of the LHS. For an element node, the string-value is "the concatenation of the string-values of all text node descendants of the element node in document order" and for a collection of text nodes, the comparison will check whether any text node is equal to the one you're testing against.
So it doesn't matter when the element content only has a single text node, but suppose we had:
<root>
<item name="first">x<foo/>y</item>
<item name="second">xy<foo/>ab</item>
</root>
Then an XPath expression of "root/item[.='xy']" will match the first item, but "root/item[text()='xy']" will match the second.
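The difference can be checked with a short sketch like the following, which loads the sample above into an XmlDocument and reads back the name attribute of whichever item each expression selects:

```csharp
using System;
using System.Xml;

XmlDocument doc = new XmlDocument();
doc.LoadXml(
    "<root>" +
    "<item name=\"first\">x<foo/>y</item>" +
    "<item name=\"second\">xy<foo/>ab</item>" +
    "</root>");

// "." compares the element's whole string-value ("xy" for the first item).
XmlNode byStringValue = doc.SelectSingleNode("root/item[.='xy']");

// text() compares each text node child on its own ("xy" and "ab" for the second item).
XmlNode byTextNode = doc.SelectSingleNode("root/item[text()='xy']");

string dotMatch = byStringValue.Attributes["name"].Value;
string textMatch = byTextNode.Attributes["name"].Value;

Console.WriteLine(dotMatch);
Console.WriteLine(textMatch);
```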
