Read XML using LINQ using formatting

Read XML using LINQ using formatting - c#

I am trying to write a Linq query which will parse my XML tree(which is actually a SQL deconstructed into an XML tree).
My XML looks like this
<SqlRoot>
<SqlStatement>
<Clause>
<OtherKeyword>select</OtherKeyword>
<WhiteSpace></WhiteSpace>
<Other>t1</Other>
<Period>.</Period>
<Other>empid</Other>
<WhiteSpace></WhiteSpace>
</Clause>
<Clause>
<OtherKeyword>from</OtherKeyword>
<SelectionTarget>
<WhiteSpace></WhiteSpace>
<Other>bd_orm</Other>
<Period>.</Period>
<Other>dbo</Other>
<Period>.</Period>
<Other>bal_impacts_t</Other>
<WhiteSpace></WhiteSpace>
<Other>t1</Other>
</SelectionTarget>
</Clause>
</SqlStatement>
</SqlRoot>
I am trying to pick out the table name (SelectionTarget node). The WhiteSpace node is a blank/white space in between the values.
So the output which I expect is something like this bd_orm.dbo.bal_impacts_t t1 but I am unable to figure out how to do this by including a whitespace in between.
I tried this
var xxx = (from res in xDoc.Descendants("Clause").Descendants("SelectionTarget") select res);
Console.WriteLine(xxx.DescendantNodesAndSelf().OfType<XElement>().First().Value);
but it obviously fail cause I do not know how to take into account the whitespace node and convert that into an actual whitespace. Any suggestions?

Simply select a space for the WhiteSpace nodes and the string value for all other nodes, then concatenate the results:
var parts = doc.Descendants("SelectionTarget")
.Elements()
.Select(e => e.Name == "WhiteSpace" ? " " : (string)e);
var text = string.Concat(parts);

var nodes = (from res in xDoc.Descendants("Clause")
.Descendants("SelectionTarget")
.Descendants()
select res);
string name = String.Join("", nodes.Select(n=>n.Name == "WhiteSpace"?" ":n.Value));
name: bd_orm.dbo.bal_impacts_t t1
demo
nodes:
<WhiteSpace></WhiteSpace>
<Other>bd_orm</Other>
<Period>.</Period>
<Other>dbo</Other>
<Period>.</Period>
<Other>bal_impacts_t</Other>
<WhiteSpace></WhiteSpace>
<Other>t1</Other>

You could add the whitespace before constructing the query:
foreach (var ws in xDoc.Descendants("WhiteSpace"))
{
ws.Value = " ";
}

Related

Concatenating inner text in an XPath array

I have some XML that looks like this:
<root>
<instructions type="array">
<item type="string">Cannot do business directly with consumers</item>
<item type="string">Cannot marry a martian</item>
</instructions>
</root>
and an instructions variable that looks like this:
var instructions = myXDocument.XPathSelectElements("/root/instructions").Nodes();
I am attempting to concatenate all of the item strings, thusly:
"Cannot do business directly with consumers, Cannot marry a martian"
and my attempt so far is
instructions.Select(x => x.[[What do I put here?]]).Aggregate((i,j) => i + ", " + j)
but having trouble figuring out how to get the inner text from each node in my lambda expression. x.ToString() yields "<item type="string">Cannot do business directly with consumers</item>"

Using the same approach, you can simply replace Nodes() with Elements(), and then access the Value property of the returned XElements to get the inner texts :
var instructions = myXDocument.XPathSelectElements("/root/instructions").Elements();
var result = instructions.Select(x => x.Value).Aggregate((i,j) => i + ", " + j);

I know this is not the lambda expression which you are looking for but this is way around to get the resultant output.
string result;
XElement root = XElement.Load(SomeXML);
root.Element("root").Elements("instructions").Elements("item").All<XElement>(xe =>
{
result = result + xe.Attribute("type").Value),
return true;
});

Try this
var xml = "<root><instructions type=\"array\"><item type=\"string\">Cannot do business directly with consumers</item><item type=\"string\">Cannot marry a martian</item></instructions></root>";
var document = XDocument.Parse(xml);
var result = string.Join(", ", document.Descendants("instructions").Elements("item").Select(x=>x.Value));
Output:
Cannot do business directly with consumers, Cannot marry a martian

Selecting a child node having specific value

I want to check that if the "< city>" node 'having a specific value (say Pathankot )' already exist in the xml file under the a particular "< user Id="any"> having a specific Id", before inserting a new city node into the xml.
< users>
< user Id="4/28/2015 11:29:44 PM">
<city>Fazilka</city>
<city>Pathankot </city>
<city>Jalandher</city>
<city>Amritsar</city>
</user>
</users>
In order to insert I am using the Following c# code
XDocument xmlDocument = XDocument.Load(#"C:\Users\Ajax\Documents\UserSelectedCity.xml");
string usrCookieId = Request.Cookies["CookieId"].Value;
xmlDocument.Element("Users")
.Elements("user")
.Single(x => (string)x.Attribute("Id") == usrCookieId)
//Incomplete because same named cities can be entered more that once
//need to make delete function
.Add(
new XElement("city", drpWhereToGo.SelectedValue));
My Questions:
How Can i check weather the < city> node having specific value say
Pathankot already exist in the xml file Before Inserting a new city
node.
I am Using absolute Path in" XDocument xmlDocument =
XDocument.Load(#"C:\Users\Ajax\Documents\Visual Studio
2012\WebSites\punjabtourism.com\UserSelectedCity.xml");" This
does not allow me to move the files to new folder without changing
the path which is not desirable. But if i use the relative path The
Error Occures "Access Denied";

I would use this simple approach:
var query =
xmlDocument
.Root
.Elements("user")
.Where(x => x.Attribute("Id").Value == usrCookieId)
.Where(x => !x.Elements("city").Any(y => y.Value == "Pathankot"));
foreach (var xe in query)
{
xe.Add(new XElement("city", drpWhereToGo.SelectedValue));
}
It's best to avoid using .Single(...) or .First(...) if possible. The description of your problem doesn't sound like you need to use these though.

Try this:-
First load the XML file into XDocument object by specifying the physical path where your XML file is present. Once you have the object just take the First node with matching condition (please note I am using First instead of Single cz you may have multiple nodes with same matching condition, Please see the Difference between Single & First)
XDocument xmlDocument = XDocument.Load(#"YourPhysicalPath");
xmlDocument.Descendants("user").First(x => (string)x.Attribute("Id") == "1"
&& x.Elements("city").Any(z => z.Value.Trim() == "Pathankot"))
.Add(new XElement("city", drpWhereToGo.SelectedValue));
xmlDocument.Save("YourPhysicalPath");
Finally add the required city to the node retrieved from the query and save the XDocument object.
Update:
If you want to check first if all the criteria fulfills then simply use Any like this:-
bool isCityPresent = xdoc.Descendants("user").Any(x => (string)x.Attribute("Id") == "1"
&& x.Elements("city").Any(z => z.Value.Trim() == "Pathankot"));

I'd create an extension to clean it up a bit, and use XPath for the search.
public static class MyXDocumentExtensions
{
public static bool CityExists(this XDocument doc, string cityName)
{
//Contains
//var matchingElements = doc.XPathSelectElements(string.Format("//city[contains(text(), '{0}')]", cityName));
//Equals
var matchingElements = doc.XPathSelectElements(string.Format("//city[text() = '{0}']", cityName));
return matchingElements.Count() > 0;
}
}
And call it like that:
XDocument xmlDocument = XDocument.Load("xml.txt");
var exists = xmlDocument.CityExists("Amritsar");
Expanding on your question in the comment, you can then use it as:
if(!xmlDocument.CityExists("Amritsar"))
{
//insert city
}
If you would like to match regardless of the trailing whitespace in the XML, you can wrap the text() call in XPath with a normalize-space:
var matchingElements = doc.XPathSelectElements(string.Format("//city[normalize-space(text()) = '{0}']", cityName.Trim()));

Poorly defined XML, get node and contents of all child nodes as string concat with spaces?

Here's some fantastic example XML:
<root>
<section>Here is some text<mightbe>a tag</mightbe>might <not attribute="be" />. Things are just<label>a mess</label>but I have to parse it because that's what needs to be done and I can't <font stupid="true">control</font> the source. <p>Why are there p tags here?</p>Who knows, but there may or may not be spaces around them so that's awesome. The point here is, there's node soup inside the section node and no definition for the document.</section>
</root>
I'd like to just grab the text from the section node and all sub nodes as strings. BUT, note that there may or may not be spaces around the sub-nodes, so I want to pad the sub notes and append a space.
Here's a more precise example of what input might look like, and what I'd like output to be:
<root>
<sample>A good story is the<book>Hitchhikers Guide to the Galaxy</book>. It was published<date>a long time ago</date>. I usually read at<time>9pm</time>.</sample>
</root>
I'd like the output to be:
A good story is the Hitchhikers Guide to the Galaxy. It was published a long time ago. I usually read at 9pm.
Note that the child nodes don't have spaces around them, so I need to pad them otherwise the words run together.
I was attempting to use this sample code:
XDocument doc = XDocument.Parse(xml);
foreach(var node in doc.Root.Elements("section"))
{
output += String.Join(" ", node.Nodes().Select(x => x.ToString()).ToArray()) + " ";
}
But the output includes the child tags, and is not going to work out.
Any suggestions here?
TL;DR: Was given node soup xml and want to stringify it with padding around child nodes.

Incase you have nested tags to an unknown level (e.g <date>a <i>long</i> time ago</date>), you might also want to recurse so that the formatting is applied consistently throughout. For example..
private static string Parse(XElement root)
{
return root
.Nodes()
.Select(a => a.NodeType == XmlNodeType.Text ? ((XText)a).Value : Parse((XElement)a))
.Aggregate((a, b) => String.Concat(a.Trim(), b.StartsWith(".") ? String.Empty : " ", b.Trim()));
}

You could try using xpath to extract what you need
var docNav = new XPathDocument(xml);
// Create a navigator to query with XPath.
var nav = docNav.CreateNavigator();
// Find the text of every element under the root node
var expression = "/root//*/text()";
// Execute the XPath expression
var resultString = nav.evaluate(expression);
// Do some stuff with resultString
....
References:
Querying XML, XPath syntax

Here is a possible solution following your initial code:
private string extractSectionContents(XElement section)
{
string output = "";
foreach(var node in section.Nodes())
{
if(node.NodeType == System.Xml.XmlNodeType.Text)
{
output += string.Format("{0}", node);
}
else if(node.NodeType == System.Xml.XmlNodeType.Element)
{
output += string.Format(" {0} ", ((XElement)node).Value);
}
}
return output;
}
A problem with your logic is that periods will be preceded by a space when placed right after an element.

You are looking at "mixed content" nodes. There is nothing particularly special about them - just get all child nodes (text nodes are nodes too) and join they values with space.
Something like
var result = String.Join("",
root.Nodes().Select(x => x is XText ? ((XText)x).Value : ((XElement)x).Value));

How to get XML Attribute and Element with One Linq Query?

I have an XML document that looks like this:
<?xml version="1.0" encoding="utf-8" ?>
<configuration>
<searchlayers>
<searchlayer whereClause="ProjectNumber=a">Herbicide</searchlayer>
<searchlayer whereClause="ProjectNumber=b">Herbicide - Point</searchlayer>
<searchlayer whereClause="ProjectNumber=c">miscellaneous</searchlayer>
<searchlayer whereClause="ProjectNumber=d">miscellaneous - Point</searchlayer>
<searchlayer whereClause="ProjectNumber=e">Regrowth Control</searchlayer>
<searchlayer whereClause="ProjectNumber=f">Regrowth Control - Point</searchlayer>
<searchlayer whereClause="ProjectNumber=g">Tree Removal</searchlayer>
<searchlayer whereClause="ProjectNumber=h">Tree Removal - Point</searchlayer>
<searchlayer whereClause="ProjectNumber=i">Trimming</searchlayer>
<searchlayer whereClause="ProjectNumber=j">Trimming - Point</searchlayer>
</searchlayers>
</configuration>
Is it possible to write one single Linq statement to get each of the element (e.g. Herbicide, miscellaneous, Regrowth Control... etc) with its matching whereClause (e.g. for Herbicide, the where clause would be "ProjectNumber=a")?
I can write two statements separately, one to get the elements, one to get the attributes, but it would be nice to write just one Linq statement that gets both at the same time.
Thanks.

Yes it is possible. But there are many possible data structure can be used to store list of 2 values pair, here is one example using Tuple :
XDocument doc = XDocument.Load("path_to_xml_file.xml");
List<Tuple<string, string>> result =
doc.Root
.Descendants("searchlayer")
.Select(o => Tuple.Create((string) o, (string) o.Attribute("whereClause")))
.ToList();

You can create a set of anonymous objects as follows:
var result = root.Element("searchlayers")
.Elements("searchlayer")
.Select(i =>
new {attribute = i.Attribute("whereClause").Value,
value = i.Value});
This will give a set of records, where the attributes are paired with the element values.
If you want this in query syntax, it looks like this:
var result = from el in root.Elements("searchlayers").Elements("searchlayer")
select new {attribute = el.Attribute("whereClause").Value,
value = el.Value};

You can use this to to select all elements matching Herbicide whose whereClause matches ProjectNumber=a
IEnumerable<XElement> result =
from el in doc.Elements("Herbicide")
where (string)el.Attribute("whereClause") == "ProjectNumber=a"
select el;
Another alternative would be:
var result = doc.Descendants()
.Where(e => e.Attribute("ProjectNumber=a") != null)
.ToList();
Which should provide you every element whose whereClause equals "ProjectNumber=a".

You can use the Attributes property to get the attribute from the XML node, along with the InnerText, like so:
XmlDocument doc = new XmlDocument();
doc.LoadXml(yourxml);
XmlNodeList xlist = doc.GetElementsByTagName("searchlayer");
for(int i=0;i<xlist.Count;i++)
{
Console.WriteLine(xlist[i].InnerText + " " + xlist[i].Attributes["whereClause"].Value);
}
If you must use LINQ, you could use XDocument and then return anonymous class objects consisting of the attribute and text, like so:
XDocument xdoc = XDocument.Load(yourxmlfile);
var result = xdoc.Root.Elements("searchlayers").Elements("searchlayers").Select(x => new {attr = x.Attribute("whereClause").Value, txt = x.Value});
foreach (var r in result)
{
Console.WriteLine(r.attr + " " + r.txt);
}

XML Lambda query in C#

I have an XmlDocument object and xml in the format:
<?xml version="1.0" encoding="utf-8"?>
<S xmlns="http://server.com/DAAPI">
<TIMESTAMP>2010-08-16 17:25:45.633</TIMESTAMP>
<MY_GROUP>
<GROUP>1 </GROUP>
<NAME>Amsterdam</NAME>
....
</MY_GROUP>
<MY_GROUP>
<GROUP>2 </GROUP>
<NAME>Ireland</NAME>
....
</MY_GROUP>
<MY_GROUP>
<GROUP>3 </GROUP>
<NAME>UK</NAME>
....
</MY_GROUP>
Using a Lambda expression (or Linq To XML if it's more appropriate) on the XmlDocument object how can i do the following:
get the text of a specific element, say the text of NAME where GROUP = 1
the value of the first occurance of the element "NAME"
Thanks a lot

Assuming you mean XDocument rather than XmlDcoument:
First question:
XNamespace ns = "http://server.com/DAAPI";
string text = (from my_group in doc.Elements(ns + "MY_GROUP")
where (int) my_group.Element(ns + "GROUP") == 1
select (string) my_group.Element(ns + "NAME")).First();
I didn't really understand the second question... what do yuo mean by "contains an element of that name"? Which name? And if you're checking for NAME being equal to a give name, wouldn't you already know that name? Did you perhaps mean the value of GROUP for a specific name? If so, it's easy:
XNamespace ns = "http://server.com/DAAPI";
int group = (from my_group in doc.Elements(ns + "MY_GROUP")
where (string) my_group.Element(ns + "NAME")
select (int) my_group.Element(ns + "GROUP")).First();
Both of these queries assume that the values do exist, and that each MY_GROUP element has a GROUP and NAME subelement. Please let us know if that's not the case.

I have used Linq to XML.
string input = "<?xml version=\"1.0\" encoding=\"utf-8\"?><S xmlns=\"http://server.com/DAAPI\"><TIMESTAMP>2010-08-16 17:25:45.633</TIMESTAMP><MY_GROUP><GROUP>1 </GROUP><NAME>Amsterdam</NAME>....</MY_GROUP><MY_GROUP><GROUP>2 </GROUP><NAME>Ireland</NAME>....</MY_GROUP><MY_GROUP><GROUP>3 </GROUP><NAME>UK</NAME>....</MY_GROUP></S>";
var doc = XDocument.Parse(input);
XNamespace ns = "http://server.com/DAAPI";
//The first question
var name = (from elem in doc.Root.Elements(ns + "MY_GROUP")
where elem.Element(ns + "GROUP") != null //Checks whether the element actually exists - if you KNOW it does then it can be removed
&& (int)elem.Element(ns + "GROUP") == 1 //This could fail if not an integer - insure it is if nessasary
select (string)elem.Element(ns + "NAME")).SingleOrDefault();

I understood only your first question. Here you are for the first:
var xmlSource = myGroup.Load(#"../../MyGroup.xml");
var q = from c in xmlSource.myGroup
where c.group = 1
select c.name;

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Read XML using LINQ using formatting - c#

Simply select a space for the WhiteSpace nodes and the string value for all other nodes, then concatenate the results: var parts = doc.Descendants("SelectionTarget") .Elements() .Select(e => e.Name == "WhiteSpace" ? " " : (string)e); var text = string.Concat(parts);

You could add the whitespace before constructing the query: foreach (var ws in xDoc.Descendants("WhiteSpace")) { ws.Value = " "; }

Related

Concatenating inner text in an XPath array

Selecting a child node having specific value

Poorly defined XML, get node and contents of all child nodes as string concat with spaces?

How to get XML Attribute and Element with One Linq Query?

XML Lambda query in C#

Categories

Resources