I have just written some code which, as I was writing it, I thought was going to be a nice generic method for searching for a particular node. When I finished, I realised it was a mess :D
public String sqlReading(String fileName, String path, String nodeId )
{
XmlDocument doc = new XmlDocument();
doc.Load(fileName);
XmlNodeList names = doc.SelectNodes(path);
foreach (XmlNode xmlDocSearchTerm in names)
{
//if the attribute of the node i start at is the same as where i am now
if (xmlDocSearchTerm.Attributes.Item(0).Value.ToString().Equals(nodeId))
{
//get a list of all of its child nodes
XmlNodeList childNodes = xmlDocSearchTerm.ChildNodes;
foreach (XmlNode node in childNodes)
{
//if there is a node in here called gui display, go inside
if (node.Name.Equals("GUIDisplay"))
{
XmlNodeList list = node.ChildNodes;
//find the sqlsearchstring tag inside of here
foreach (XmlNode finalNode in list)
{
if (finalNode.Name.Equals("sqlSearchString"))
{
//return the text of the sqlSearchString node itself, not of its GUIDisplay parent
return finalNode.InnerText;
}
}
}
}
}
}
return "";
}
What I intended was to start from a path and check whether the element had the id I was looking for. If it did, I wanted to go inside and not stop until I reached the sqlSearchString tag, which is buried two levels deeper. I have managed that, but the issue is that I now seem to have almost hardcoded a path to the tag as opposed to looping there. How could I change my code to avoid this?
I think it goes wrong from the second foreach onwards.
Thanks
I haven't tested it, but I believe something like this would work by using an XPath expression. However, I'm not sure of the name of the attribute (id is assumed here), or is it always the first attribute?
public String sqlReading(String fileName, String path, String nodeId)
{
XmlDocument doc = new XmlDocument();
doc.Load(fileName);
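// Append the attribute test and the fixed sub-path to the caller's path; SelectSingleNode returns null when nothing matches.
XmlNode foundNode = doc.SelectSingleNode(path + "[@id='" + nodeId + "']/GUIDisplay/sqlSearchString");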
if (foundNode != null)
return foundNode.InnerText;
return string.Empty;
}
I'm not sure if this is exactly right (I don't have an XML document to try it with), but something similar should work:
var innerTexts = XDocument.Load(fileName)
.Elements(path)
.Where(n => n.Attributes().ElementAt(0).Value == nodeId)
.SelectMany(n => n.Elements())
.Where(n => n.Name == "GUIDisplay")
.SelectMany(n => n.Elements())
.Where(n => n.Name == "sqlSearchString")
.Select(n => n.Value); // Value is the element's text; ToString() would return the element's markup
I would say recursion is a safe bet for iterating through nested child nodes. Though, from what I gather, the structure remains the same. With that in mind, why not use [XmlDocumentObj].SelectSingleNode("//*[@nodeId='" + nodeId + "']") (or some facsimile) instead? This has the ability to search by attribute name, unless the XML structure always changes and you never have a constant tag (in which case XPath is probably a good idea).
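The XPath suggestions above can be sketched as a small self-contained program. This is only a sketch under assumptions: the identifying attribute is assumed to be called id (the question only says it is the first attribute), and the class name SqlLookup and the sample XML are made up for illustration.

```csharp
using System;
using System.Xml;

public static class SqlLookup
{
    // Hypothetical helper: returns the text of GUIDisplay/sqlSearchString under the node
    // selected by path whose id attribute equals nodeId, or "" when nothing matches.
    public static string SqlReading(XmlDocument doc, string path, string nodeId)
    {
        // Assumption: nodeId contains no single quote, otherwise the XPath literal breaks.
        string xpath = path + "[@id='" + nodeId + "']/GUIDisplay/sqlSearchString";
        XmlNode found = doc.SelectSingleNode(xpath);
        return found != null ? found.InnerText : string.Empty;
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml(
            "<searches>" +
            "<search id='a'><GUIDisplay><sqlSearchString>SELECT 1</sqlSearchString></GUIDisplay></search>" +
            "<search id='b'><GUIDisplay><sqlSearchString>SELECT 2</sqlSearchString></GUIDisplay></search>" +
            "</searches>");

        Console.WriteLine(SqlReading(doc, "/searches/search", "b")); // prints "SELECT 2"
    }
}
```

If the tag can sit at varying depths, ending the expression with //sqlSearchString instead of /GUIDisplay/sqlSearchString searches all descendants of the matched node rather than one fixed path.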
I closed my last question as it was commented that not enough research had been done. The more I research, the more confused I am getting. What I think should work, based on my understanding and on posts here and elsewhere, is not working.
XML sample
<?xml version="1.0" encoding="UTF-8" ?>
<multistatus xmlns="DAV:">
<schemata>
<classschema name="Space">
<base_class>Space</base_class>
</classschema>
<classschema name="Tapestry">
<base_class>File</base_class>
</classschema>
<classschema name="Document">
<base_class>File</base_class>
</classschema>
</schemata>
</multistatus>
I am trying to get the name attribute of the classschema nodes that have base_class children with the value 'File', so the result should be 'Tapestry' and 'Document'.
I can easily return all classschema nodes with
foreach (XmlNode node in xmlDoc.SelectNodes("//DAV:schemata/DAV:classschema", nsmgr))
{
strNode = node.Attributes["name"].Value;
responseString += strNode +" ";
}
return responseString;
And I can get the base_class value = to 'File' by looping through all the base_class nodes like this.
foreach (XmlNode node in xmlDoc.SelectNodes("//DAV:schemata/DAV:classschema/DAV:base_class", nsmgr))
{
if (node.InnerText == "File")
{
strNode = node.InnerText;
responseString += strNode +" ";
}
}
return responseString;
But if I try to filter, or use an axis to reference the parent node from the child, I fail.
Examples of my filtering attempts, applied in the SelectNodes method:
foreach (XmlNode node in xmlDoc.SelectNodes("//DAV:schemata/DAV:classschema[/DAV:base_class(contains,'File')]", nsmgr)) or
foreach (XmlNode node in xmlDoc.SelectNodes("//DAV:schemata/DAV:classschema[/DAV:base_class=='File']", nsmgr))
along with many, many other variations. In the examples I have seen it is hard to tell whether they are LINQ2XML or XDocument, and with PHP or other languages mixed in where the language isn't always specified, I am now jumbled.
My next attempt will be SelectNodes("//DAV:schemata/DAV:classschemata[/DAV:baseclass(contains,'File')]"@name,nsmgr);
and variations on that.
I have thought from other examples that they had exactly what I wanted but when implemented did not work, for reasons I cannot explain.
This should give you the result you are after. Basically it looks for all classschema elements that have a first element with Value == "File" and then selects their name attributes Value. Please note I also used the string.Join method (pretty handy for stuff like this) to turn the result into a space delimited string which is what you need.
var xmlDoc = XDocument.Load("YourFile.xml");
var result = xmlDoc.Descendants("{DAV:}classschema")
.Where(x => x.Elements().First().Value == "File")
.Attributes("name")
.Select(x => x.Value);
string spaceDelimited = string.Join(" ", result);
This is what I am using that now works. I would swear on a stack of Bibles I had done this in earlier testing and it failed. But it does work. In posting the code I see that the inner loop I was using is now commented out, and that was the problem. I was looping over something that did not need to be looped, so it was returning empty.
foreach (XmlNode node in xmlDoc.SelectNodes("//DAV:schemata/DAV:classschema[DAV:base_class='File']", nsmgr))
{
//strNode = node.Attributes["name"].Value;
//if (node.InnerText == "File")
// {
strNode = node.Attributes["name"].Value;
//strNode = node.InnerText;
responseString += strNode +" ";
// }
}
return responseString;
Thank you for the help.
My next step will be to take each of the returned nodes, get all of its base_class elements, and filter so that only some of them are returned. I am not sure what I am looking for yet; I need to evaluate the XML for something unique to capture what I want.
In other words, now that I have only the classschema nodes whose child elements contain 'File', I want to get some of the child element's siblings, but not all. A challenge for another day.
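For anyone comparing the failed attempts with the working query: inside a predicate, a leading slash as in [/DAV:base_class...] makes the path absolute (evaluated from the document root), whereas [DAV:base_class='File'] is evaluated relative to each classschema. The sketch below shows the working form end to end, including the namespace manager setup that the snippets above leave out. The class name SchemaFilter is hypothetical, and the prefix DAV is an arbitrary choice mapped to the document's default namespace.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class SchemaFilter
{
    // Returns the name attribute of every classschema whose base_class child equals baseClass.
    public static List<string> NamesWithBaseClass(string xml, string baseClass)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // The prefix only has to map to the namespace URI declared in the document (xmlns="DAV:").
        var nsmgr = new XmlNamespaceManager(doc.NameTable);
        nsmgr.AddNamespace("DAV", "DAV:");

        var names = new List<string>();
        // Assumption: baseClass contains no single quote.
        string xpath = "//DAV:schemata/DAV:classschema[DAV:base_class='" + baseClass + "']";
        foreach (XmlNode node in doc.SelectNodes(xpath, nsmgr))
        {
            names.Add(node.Attributes["name"].Value);
        }
        return names;
    }

    public static void Main()
    {
        string xml =
            "<multistatus xmlns='DAV:'><schemata>" +
            "<classschema name='Space'><base_class>Space</base_class></classschema>" +
            "<classschema name='Tapestry'><base_class>File</base_class></classschema>" +
            "<classschema name='Document'><base_class>File</base_class></classschema>" +
            "</schemata></multistatus>";

        Console.WriteLine(string.Join(" ", NamesWithBaseClass(xml, "File"))); // prints "Tapestry Document"
    }
}
```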
In C# I'm using the following to get some elements from an XML file:
var TestCaseDescriptions = doc.SelectNodes("//testcase/htmlcomment");
This works fine and gets the correct information, but when a testcase has no htmlcomment, no entry is added to the XmlNodeList TestCaseDescriptions.
When there is no htmlcomment, I would like to have the string "null" in TestCaseDescriptions instead. So in the end I would have an XmlNodeList like
htmlcomment
htmlcomment
null
htmlcomment
htmlcomment
Can anyone describe or create a sample how to make this happen?
var TestCaseDescriptions = doc.SelectNodes("//testcase/htmlcomment");
When there is no htmlcomment, I would like to have the string "null" in TestCaseDescriptions.
Your problem comes from the fact that if there is no htmlcomment, the number of selected nodes will be one less. The current answer shows what to do when the htmlcomment element is present but empty, but I think you need this instead, if indeed the whole htmlcomment element is absent:
var testCases = doc.SelectNodes("//testcase");
foreach (XmlElement element in testCases)
{
var description = element.SelectSingleNode("child::htmlcomment");
// InnerText holds the element's text; Value is always null for an element node.
string results = description == null ? "null" : description.InnerText;
}
In the code above, you go over each test case and select its htmlcomment child node. If it is not found, SelectSingleNode returns null, so the last line checks the result and yields "null" in that case, or the node's text otherwise.
To change this result into a node, you will have to create the node as a child to the current node. You said you want an XmlNodeList, so perhaps this works for you:
var testCaseDescriptions = doc.SelectNodes("//testcase");
foreach (XmlElement element in testCaseDescriptions)
{
var comment = element.SelectSingleNode("child::htmlcomment");
if (comment == null)
{
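// Build the placeholder first: AppendChild returns the appended child,
// so chaining the two calls would attach only the text node to the test case.
var placeholder = doc.CreateElement("htmlcomment");
placeholder.AppendChild(doc.CreateTextNode("null"));
element.AppendChild(placeholder);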
}
}
After this, every testcase has an htmlcomment child, so selecting //testcase/htmlcomment again returns one node per test case.
Note: apparently, the OP mentions that element.SelectSingleNode("child::htmlcomment"); does not work, but element.SelectSingleNode("./htmlcomment"); does, even though technically, these are equal expressions from the point of XPath, and should work according to Microsoft's documentation.
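If a list of strings is enough and the document should stay unmodified, the first approach can be sketched as follows. The class name TestCaseComments and the sample XML are assumptions for illustration; only the //testcase and htmlcomment names come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class TestCaseComments
{
    // One entry per testcase: the htmlcomment text, or the string "null" when the element is absent.
    public static List<string> Describe(XmlDocument doc)
    {
        var descriptions = new List<string>();
        foreach (XmlNode testCase in doc.SelectNodes("//testcase"))
        {
            XmlNode comment = testCase.SelectSingleNode("htmlcomment");
            descriptions.Add(comment == null ? "null" : comment.InnerText);
        }
        return descriptions;
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml(
            "<suite>" +
            "<testcase><htmlcomment>first</htmlcomment></testcase>" +
            "<testcase/>" +
            "<testcase><htmlcomment>third</htmlcomment></testcase>" +
            "</suite>");

        Console.WriteLine(string.Join(",", Describe(doc))); // prints "first,null,third"
    }
}
```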
Try this
XmlDocument doc = new XmlDocument();
// Load the XML into doc here; a freshly created document contains no nodes to select.
var TestCaseDescriptions = doc.SelectNodes("//testcase/htmlcomment");
foreach (XmlElement element in TestCaseDescriptions)
{
string results = element.InnerText; // Value is always null for an element; InnerText is "" when the element is empty
}
I'm creating XML from JSON retrieved from an HttpWebRequest call, using JsonConvert. The JSON I'm getting back sometimes has duplicate nodes, creating duplicate nodes in the XML after conversion, which I then have to remove.
The processing of the JSON to XML conversion is being done in a generic service call wrapper that has no knowledge of the underlying data structure and so can't do any XPath queries based on a named node. The duplicates could be at any level within the XML.
I've got to the stage where I have a list of the names of duplicate nodes at each level but am not sure of the Linq query to use this to remove all but the first node with that name.
My code:
protected virtual void RemoveDuplicateChildren(XmlNode node)
{
if (node.NodeType != XmlNodeType.Element || !node.HasChildNodes)
{
return;
}
var xNode = XElement.Load(node.CreateNavigator().ReadSubtree());
var duplicateNames = new List<string>();
foreach (XmlNode child in node.ChildNodes)
{
var isBottom = this.IsBottomElement(child); // Has no XmlNodeType.Element type children
if (!isBottom)
{
this.RemoveDuplicateChildren(child);
}
else
{
var count = xNode.Elements(child.Name).Count();
if (count > 1 && !duplicateNames.Contains(child.Name))
{
duplicateNames.Add(child.Name);
}
}
}
if (duplicateNames.Count > 0)
{
foreach (var duplicate in duplicateNames)
{
xNode.Elements(duplicate).SelectMany(d => d.Skip(1)).Remove();
}
}
}
The final line of code obviously isn't correct but I can't find an example of how to rework it to retrieve and then remove all but the first matching element.
UPDATE:
I have found two ways of doing this now, one using the XElement and one the XmlNode, but neither actually removes the nodes.
Method 1:-
foreach (var duplicate in duplicateNames)
{
xNode.Elements(duplicate).Skip(1).Remove();
}
Method 2:-
foreach (var duplicate in duplicateNames)
{
var nodeList = node.SelectNodes(duplicate);
if (nodeList.Count > 1)
{
for (int i=1; i<nodeList.Count; i++)
{
node.RemoveChild(nodeList[i]);
}
}
}
What am I missing?
If you don't want any duplicate names: (assuming no namespaces)
XElement root = XElement.Load(file); // .Parse(string)
List<string> names = root.Descendants().Select(x => x.Name.LocalName).Distinct().ToList();
names.ForEach(name => root.Descendants(name).Skip(1).Remove());
root.Save(file); // or root.ToString()
You might be trying to solve the problem at the wrong level. In XML it is perfectly valid to have multiple nodes with the same name, whereas JSON structures with duplicate property names should be invalid. You should try to do this sanitation at the JSON level and not after the data has already been transformed to XML.
For the xml cleanup this might be a starting point:
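// Requires using System.Linq; grouping by name stands in for a custom comparer on node names.
foreach (XmlNode child
in node.ChildNodes.Cast<XmlNode>().GroupBy(n => n.Name).Select(g => g.First()))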
{
.....
}
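One likely reason Method 1 in the question appears to do nothing: XElement.Load(node.CreateNavigator().ReadSubtree()) builds a separate LINQ to XML copy of the subtree, so removing elements from xNode leaves the original XmlNode tree untouched. Below is a minimal sketch that works on the XmlNode tree directly. The class name DuplicateRemover is hypothetical, and, unlike the question's code, it deduplicates every set of same-named element siblings rather than only bottom-level ones, which is an assumption about what is wanted.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml;

public static class DuplicateRemover
{
    // Keeps the first child element of each name under node, removes later siblings
    // with the same name, and recurses into the children that are kept.
    public static void RemoveDuplicateChildren(XmlNode node)
    {
        var seen = new HashSet<string>();
        // Snapshot the children first so that removals do not disturb the enumeration.
        foreach (XmlNode child in node.ChildNodes.Cast<XmlNode>().ToList())
        {
            if (child.NodeType != XmlNodeType.Element)
            {
                continue;
            }
            if (!seen.Add(child.Name))
            {
                node.RemoveChild(child); // name already seen at this level
                continue;
            }
            RemoveDuplicateChildren(child);
        }
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<root><a>1</a><a>2</a><b><c>x</c><c>y</c></b></root>");
        RemoveDuplicateChildren(doc.DocumentElement);
        Console.WriteLine(doc.OuterXml); // prints <root><a>1</a><b><c>x</c></b></root>
    }
}
```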
I can remove an xml node using
XmlNode node = newsItem.SelectSingleNode("XYZ");
node.ParentNode.RemoveChild(node);
But what if I want to remove multiple nodes at once, for example XYZ,ABC,PQR?
Is there any way to remove all of these nodes at once or do I have to remove them one by one?
NOTE: XYZ,ABC,PQR being at the same level(i.e they all have same parent)
Nothing is inbuilt when using the XmlDocument API, but you could write a utility extension method, for example:
public static void Remove(this XmlNode node, string xpath)
{
var nodes = node.SelectNodes(xpath);
foreach (XmlNode match in nodes)
{
match.ParentNode.RemoveChild(match);
}
}
then call:
newsItem.Remove("XYZ|ABC|PQR");
If you can change to the XDocument API, then things may be different.
I think you could do something like this using LINQ to XML.
var listOfNodesToRemove = new[]{"XYZ", "ABC", "PQR"};
var document = XDocument.Load(<pathtoyourfile>);
// Remove the matching elements themselves; Nodes() would only remove their children.
document.Descendants()
.Where(m => listOfNodesToRemove.Contains(m.Name.LocalName))
.Remove();
That would depend very much on the structure (nesting) etc.
But basically yes, for a handful of unrelated elements, select and remove them one at a time.
You could combine them to some extent:
List<string> RemoveNames = ...
var toBeRemoved = doc.Descendants().Where(d => RemoveNames.Contains(d.Name.LocalName));
foreach (var element in toBeRemoved.ToList()) ...
I have a simple XML
<AllBands>
<Band>
<Beatles ID="1234" started="1962">greatest Band<![CDATA[lalala]]></Beatles>
<Last>1</Last>
<Salary>2</Salary>
</Band>
<Band>
<Doors ID="222" started="1968">regular Band<![CDATA[lalala]]></Doors>
<Last>1</Last>
<Salary>2</Salary>
</Band>
</AllBands>
However, when I want to reach the "Doors band" and change its ID:
using (var stream = new StringReader(result))
{
XDocument xmlFile = XDocument.Load(stream);
var query = from c in xmlFile.Elements("Band")
select c;
...
query has no results.
But if I write xmlFile.Elements().Elements("Band"), it does find it.
What is the problem?
Is the full path from the root needed?
And if so, why did it work without specifying AllBands?
Does XDocument navigation require me to know the full structure down to the required element?
Elements() will only check direct children - which in the first case is the root element, in the second case children of the root element, hence you get a match in the second case. If you just want any matching descendant use Descendants() instead:
var query = from c in xmlFile.Descendants("Band") select c;
Also, I would suggest you restructure your XML: the band name should be an attribute or element value, not the element name itself. Using it as the element name makes querying (and schema validation, for that matter) much harder. Something like this would be easier to work with:
<Band>
<BandProperties Name ="Doors" ID="222" started="1968" />
<Description>regular Band<![CDATA[lalala]]></Description>
<Last>1</Last>
<Salary>2</Salary>
</Band>
You can do it this way:
xml.Descendants().SingleOrDefault(p => p.Name.LocalName == "Name of the node to find")
where xml is a XDocument.
Be aware that the property Name returns an object that has a LocalName and a Namespace. That's why you have to use Name.LocalName if you want to compare by name.
You should use Root to refer to the root element:
xmlFile.Root.Elements("Band")
If you want to find elements anywhere in the document use Descendants instead:
xmlFile.Descendants("Band")
The problem is that Elements only takes the direct child elements of whatever you call it on. If you want all descendants, use the Descendants method:
var query = from c in xmlFile.Descendants("Band") select c;
My experience when working with large & complicated XML files is that sometimes neither Elements nor Descendants seem to work in retrieving a specific Element (and I still do not know why).
In such cases, I found that a much safer option is to manually search for the Element, as described by the following MSDN post:
https://social.msdn.microsoft.com/Forums/vstudio/en-US/3d457c3b-292c-49e1-9fd4-9b6a950f9010/how-to-get-tag-name-of-xml-by-using-xdocument?forum=csharpgeneral
In short, you can create a GetElement function:
private XElement GetElement(XDocument doc,string elementName)
{
foreach (XNode node in doc.DescendantNodes())
{
if (node is XElement)
{
XElement element = (XElement)node;
if (element.Name.LocalName.Equals(elementName))
return element;
}
}
return null;
}
Which you can then call like this:
XElement element = GetElement(doc,"Band");
Note that this will return null if no matching element is found.
The Elements() method returns an IEnumerable<XElement> containing all child elements of the current node. For an XDocument, that collection only contains the Root element. Therefore the following is required:
var query = from c in xmlFile.Root.Elements("Band")
select c;
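The difference between the three calls discussed in these answers can be checked with a short program. This is a sketch using a trimmed version of the question's XML; the class name BandDemo, the helper ChangeId, and the new ID value 999 are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class BandDemo
{
    public const string Xml =
        "<AllBands>" +
        "<Band><Beatles ID='1234' started='1962'>greatest Band</Beatles><Last>1</Last></Band>" +
        "<Band><Doors ID='222' started='1968'>regular Band</Doors><Last>1</Last></Band>" +
        "</AllBands>";

    // Hypothetical helper: sets the ID attribute of the first element named bandName, at any depth.
    public static void ChangeId(XDocument doc, string bandName, string newId)
    {
        XElement band = doc.Descendants(bandName).First();
        band.SetAttributeValue("ID", newId);
    }

    public static void Main()
    {
        XDocument doc = XDocument.Parse(Xml);

        Console.WriteLine(doc.Elements("Band").Count());      // 0: the document's only child element is AllBands
        Console.WriteLine(doc.Root.Elements("Band").Count()); // 2: direct children of the root
        Console.WriteLine(doc.Descendants("Band").Count());   // 2: matches at any depth

        ChangeId(doc, "Doors", "999");
        Console.WriteLine(doc.Descendants("Doors").First().Attribute("ID").Value); // 999
    }
}
```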
Sebastian's answer was the only answer that worked for me while examining a xaml document. If, like me, you'd like a list of all the elements then the method would look a lot like Sebastian's answer above but just returning a list...
private static List<XElement> GetElements(XDocument doc, string elementName)
{
List<XElement> elements = new List<XElement>();
foreach (XNode node in doc.DescendantNodes())
{
if (node is XElement)
{
XElement element = (XElement)node;
if (element.Name.LocalName.Equals(elementName))
elements.Add(element);
}
}
return elements;
}
Call it thus:
var elements = GetElements(xamlFile, "Band");
or in the case of my xaml doc where I wanted all the TextBlocks, call it thus:
var elements = GetElements(xamlFile, "TextBlock");