Retrieving XML value in C# from arbitrary key path

I've got a project where I'm currently implementing support for reading values from an XML file via an arbitrary/user-defined path within the document's keys.
For example, if the document looks like this:
<information>
<machine>
<foo></foo>
<name>
test machine
</name>
<bar>spam</bar>
</machine>
</information>
then the user might want to retrieve the value from the name key in information/machine.
Is there a way using XDocument/XPath that I can look up the values the user wants without knowing/coding in the schema for the document?
My initial thought was working through the document with a form of recursive function utilizing XElement items, but I feel like there ought to be a simpler/cleaner solution that doesn't require me rolling my own lookup code.
I also tried something along these lines
var doc = XDocument.Load(@"C:\Path\to\XML\file.xml");
// Split the parent keys string
XElement elem = doc.Root.XPathSelectElement("path/to/key");
if (elem != null && elem.Attribute("wantedKeyName") != null)
replace = elem.Attribute("wantedKeyName").Value;
but elem is always null. I'm assuming there's a problem with the way I'm defining my path or utilizing XPathSelectElement, but I haven't worked it out yet.

static XmlNode SearchNode(XmlNodeList nodeList, string nodeName)
{
for (int i = 0; i < nodeList.Count; i++)
{
if (nodeList[i].Name == nodeName)
{
return nodeList[i];
}
if (nodeList[i].HasChildNodes)
{
XmlNode node = SearchNode(nodeList[i].ChildNodes, nodeName);
if (node != null)
{
return node;
}
}
}
return null;
}
static XmlNodeList SearchNodeByPath(XmlNodeList nodeList, string xPath)
{
for (int i = 0; i < nodeList.Count; i++)
{
var nodes = nodeList[i].SelectNodes(xPath);
if (nodes != null && nodes.Count > 0)
{
return nodes;
}
if (nodeList[i].HasChildNodes)
{
XmlNodeList innerNodes = SearchNodeByPath(nodeList[i].ChildNodes, xPath);
if (innerNodes != null && innerNodes.Count > 0)
{
return innerNodes;
}
}
}
return null;
}
These methods are used as follows:
var node = SearchNode(doc.ChildNodes, "compiler");
var node1 = SearchNodeByPath(doc.ChildNodes, "compilers/compiler");

It turns out my solution using XPathSelectElement was the correct approach; I just had to prefix the path/to/key string with //.
The code I ended up using does that and strips any whitespace around the value (in case the value is on a separate line from the opening tag).
// xml is a struct with the path to the parent node (path/to/key)
// and the key name to look up
// Split the parent keys string
XElement elem = doc.Root.XPathSelectElement("//" + xml.KeyPath);
if (elem != null && elem.Element(xml.Key) != null)
replace = elem.Element(xml.Key).Value.Trim();
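
As a self-contained illustration of the approach above, here is a minimal sketch that can be run against the sample document from the question. The `XmlLookup` class and its `GetValue` method are hypothetical names introduced for this example, and it assumes the key path consists of plain element names without namespaces.

```csharp
using System.Xml.Linq;
using System.Xml.XPath; // provides the XPathSelectElement extension method

public static class XmlLookup
{
    // Returns the trimmed text of the first element matching keyPath
    // anywhere in the document, or null when nothing matches.
    // (Hypothetical helper, not part of any library.)
    public static string GetValue(XDocument doc, string keyPath)
    {
        // The leading "//" lets the path match at any depth instead of
        // only relative to the context node.
        XElement elem = doc.XPathSelectElement("//" + keyPath);
        return elem?.Value.Trim();
    }
}
```

With the question's document loaded into `doc`, `XmlLookup.GetValue(doc, "information/machine/name")` should give the machine name without the surrounding line breaks. The original attempt most likely failed because a relative path is resolved against the children of the context node, so a path starting with the root element's own name finds nothing when evaluated from `doc.Root`.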

Related

How to save attribute value on xml file?

I'm trying to save a value in my xml file. In the code below, the line "s.Attribute("Value").Value = value; break;" executes and the file is saved but it doesn't change the value of the attribute
public void CustomSettingXML_WriteValue(string key, string value)
{
XDocument doc = XDocument.Load(xmlFile);
var elements = from x in XElement.Load(xmlFile).Elements("Item") select x;
foreach (var s in elements)
{
if (s.Attribute("Text").Value == key)
{
s.Attribute("Value").Value = value;
doc.Save(@xmlFile);
break;
}
}
}

Two things need to change.
a) You load the file twice, once with XDocument.Load and once with XElement.Load. The loop modifies elements from the XElement copy, but you save the XDocument copy, which was never changed.
b) Since the hierarchy in the XML is Items.Item, it is better to use Descendants to find the elements.
Full code:
public void CustomSettingXML_WriteValue(string key, string value)
{
XDocument doc = XDocument.Load(xmlFile);
var elements = from x in doc.Descendants("Item") select x;
foreach (var s in elements)
{
if (s.Attribute("Text").Value == key)
{
s.Attribute("Value").Value = value;
doc.Save(@xmlFile);
break;
}
}
}
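
The same fix can be sketched against an in-memory document, which makes it easy to try without a file on disk. `SettingsXml` and `SetValue` are hypothetical names for this illustration, and the `Items`/`Item` layout with `Text` and `Value` attributes is assumed from the code above.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class SettingsXml
{
    // Sets Value on the first <Item> whose Text attribute equals key.
    // Returns true when a matching item was found and updated.
    // (Hypothetical helper, not part of any library.)
    public static bool SetValue(XDocument doc, string key, string value)
    {
        // The cast to string yields null when the attribute is missing,
        // so items without a Text attribute are skipped, not dereferenced.
        XElement item = doc.Descendants("Item")
            .FirstOrDefault(e => (string)e.Attribute("Text") == key);
        if (item == null)
            return false;

        // SetAttributeValue updates the attribute, or creates it if absent.
        item.SetAttributeValue("Value", value);
        return true;
    }
}
```

Because the element comes from `doc` itself, a later `doc.Save(xmlFile)` writes the modified attribute. That is the point the answer makes about not mixing two separately loaded copies.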

Remove all parent node's content if any child node has no value

I want to remove the parent node if any of its children have no value.
I've tried several methods and nothing works. I want to remove the "S_Industry_Code" parent and all of its child nodes if the child node "E_Code_List_Qualifier_Code" or "E_Industry_Code" is blank. I want a generic solution so the code can also be used for other empty parent or child nodes. Please help! Any suggestions?
namespace TEST
{
class Remove_Empty_Tags
{
public static void Main()
{
XmlDocument doc = new XmlDocument();
doc.Load(@"C:\Users\TestDLMS\out\I_636806391809983753.xml");
new RemoveNulls().RemoveEmptyNodes(doc);
doc.Save(@"C:\users\Desktop\AllNullsRemoved.xml");
}
}
}
class RemoveNulls
{
public void RemoveEmptyNodes(XmlDocument doc)
{
XmlNodeList nodes = doc.SelectNodes("//node()");
foreach (XmlNode node in nodes)
if ((node.Attributes != null && node.Attributes.Count == 0) && (node.ChildNodes != null && node.ChildNodes.Count == 0))
{
// node.ParentNode.RemoveChild(node); //removes only nodes that are blank
node.ParentNode.RemoveAll(); //removes child nodes leaves empty parent
// node.ParentNode.ParentNode.RemoveAll(); //removes everything
}
}
}
// Content of XML file
<File>
<T_Requisition_511R Standard="X12">
<S_Transaction_Set_Header>
<E_Transaction_Set_Code>511</E_Transaction_Set_Code>
<E_Transaction_Set_Number>0001</E_Transaction_Set_Number>
</S_Transaction_Set_Header>
<S_Beginning_Segment_for_Material_Management>
<E_Transaction_Set_Purpose_Code>00</E_Transaction_Set_Purpose_Code>
<E_Transaction_Type_Code>A0</E_Transaction_Type_Code>
<E_Date>20181217</E_Date>
<E_Time>152620</E_Time>
</S_Beginning_Segment_for_Material_Management>
<L_Assigned_Number>
<L_Code_Source_Information>
<S_Industry_Code>
<E_Code_List_Qualifier_Code>A9</E_Code_List_Qualifier_Code>
<E_Industry_Code> </E_Industry_Code>
</S_Industry_Code>
<S_Industry_Code>
<E_Code_List_Qualifier_Code>79</E_Code_List_Qualifier_Code>
<E_Industry_Code>03</E_Industry_Code>
</S_Industry_Code>
<S_Industry_Code>
<E_Code_List_Qualifier_Code>80</E_Code_List_Qualifier_Code>
<E_Industry_Code> </E_Industry_Code>
</S_Industry_Code>
</L_Code_Source_Information>
</L_Assigned_Number>
</T_Requisition_511R>
</File>

string xml_string = @"<File>...</File>";
var xml = XElement.Parse(xml_string);
var empty = xml.Descendants()
.Where(d => d.Elements().Count() == 0 && d.Value.Trim() == "")
.Select(d => d.Parent)
.Distinct(); // a parent with several blank children must only be removed once
empty.Remove();

You are not checking for the case when the attributes are null.
This is easier IMO using XDocument vs XmlDocument. Here is an example:
void Main()
{
var doc = XDocument.Load(@"D:\file.xml");
doc = RemoveNulls.RemoveEmptyNodes(doc);
doc.Dump();
}
// Define other methods and classes here
class RemoveNulls
{
public static XDocument RemoveEmptyNodes(XDocument doc)
{
foreach (XElement element in doc.Descendants().ToList()){
if(!element.HasAttributes && String.IsNullOrWhiteSpace(element.Value) && !element.HasElements)
element.Remove();
}
return doc;
}
}
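
The question asks for the parent to be removed when any of its children is blank, whereas the example above removes only the empty elements themselves. A minimal sketch of the parent-removing variant follows. `EmptyChildPruner` and `RemoveParentsOfBlankLeaves` are names invented for this illustration, and "blank" is assumed to mean a leaf element with no attributes and no text other than whitespace.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class EmptyChildPruner
{
    // Removes every element that has at least one blank leaf child.
    // (Hypothetical helper, not part of any library.)
    public static void RemoveParentsOfBlankLeaves(XDocument doc)
    {
        var parents = doc.Descendants()
            .Where(e => !e.HasElements
                        && !e.HasAttributes
                        && string.IsNullOrWhiteSpace(e.Value))
            .Select(e => e.Parent)
            // Keep the root element so the document stays well-formed.
            .Where(p => p != null && p.Parent != null)
            // A parent with several blank children appears once only.
            .Distinct()
            // Snapshot before mutating the tree being queried.
            .ToList();

        foreach (XElement parent in parents)
            parent.Remove();
    }
}
```

Applied to the sample file from the question, this should leave only the `S_Industry_Code` element whose `E_Industry_Code` is `03`. Taking the snapshot with `ToList()` before removing anything avoids modifying the tree while the query is still enumerating it.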

How to iterate XML by using XDocument in .Net

I have a big XML file from which I take a small snippet using ReadFrom(). The resulting xmlsnippet contains leaf, sas and kir tags at different positions (sometimes leaf comes before kir, sometimes the other way round).
Currently I use three foreach loops to get these values, which is poor logic and will be slow when the snippet is also big.
Is there any way to use a single foreach loop with three if statements inside it to get the values?
arr is a custom array list.
var xdoc = new XDocument(xmlsnippet);
string xml = RemoveAllNamespaces(xdoc.ToString());
foreach (XElement element in XDocument.Parse(xml).Descendants("leaf"))
{
arr.Add(new Test("leaf", element.Value, 2));
break;
}
foreach (XElement element in XDocument.Parse(xml).Descendants("sas"))
{
arr.Add(new Test("sas", element.Value, 2));
break;
}
foreach (XElement element in XDocument.Parse(xml).Descendants("kir"))
{
if (element.Value == "0")
arr.Add(new Test("kir", "90", 2));
break;
}

You only need to parse xmlsnippet once (assuming it fits in memory) and then use XNamespace to qualify the right XElement. There is no need to call RemoveAllNamespaces, which I guess does what its name implies and probably does so in an awful way.
I used the following XML snippet as example input, notice the namespaces a, b and c:
var xmlsnippet = @"<root xmlns:a=""https://a.example.com""
xmlns:b=""https://b.example.com""
xmlns:c=""https://c.example.com"">
<child>
<a:leaf>42</a:leaf>
<a:leaf>43</a:leaf>
<a:leaf>44</a:leaf>
<somenode>
<b:sas>4242</b:sas>
<b:sas>4343</b:sas>
</somenode>
<other>
<c:kir>80292</c:kir>
<c:kir>0</c:kir>
</other>
</child>
</root>";
And then use LINQ to return either an instance of your Test class or null if no element can be found. That Test class instance is then added to the array list.
var arr = new ArrayList();
var xdoc = XDocument.Parse(xmlsnippet);
// add namespaces
var nsa = (XNamespace) "https://a.example.com";
var nsb = (XNamespace) "https://b.example.com";
var nsc = (XNamespace) "https://c.example.com";
var leaf = xdoc.Descendants(nsa + "leaf").
Select(elem => new Test("leaf", elem.Value, 2)).FirstOrDefault();
if (leaf != null) {
arr.Add(leaf);
}
var sas = xdoc.Descendants(nsb + "sas").
Select(elem => new Test("sas", elem.Value, 2)).FirstOrDefault();
if (sas != null) {
arr.Add(sas);
}
var kir = xdoc.
Descendants(nsc + "kir").
Where(ele => ele.Value == "0").
Select(elem => new Test("kir", "90", 2)).
FirstOrDefault();
if (kir != null) {
arr.Add(kir);
}
I expect this to be the most efficient way to find those nodes if you want to stick with XDocument. If the XML is really huge you might consider using an XmlReader, but that probably only helps if memory is a problem.
If you want to do it in one LINQ query you can do this:
var q = xdoc
.Descendants()
.Where(elem => elem.Name.LocalName == "leaf" ||
elem.Name.LocalName == "sas" ||
(elem.Name.LocalName == "kir" && elem.Value == "0"))
.GroupBy(k=> k.Name.LocalName)
.Select(k=>
new Test(
k.Key,
k.Key != "kir"? k.FirstOrDefault().Value: "90",
2)
);
arr.AddRange(q.ToList());
That query looks for all elements named leaf, sas or kir, groups them by element name and then takes the first element in each group. Notice the extra handling when the element name is kir: both the Where clause and the projection in Select need to deal with it. You might want to performance test this, as I'm not sure how efficient it will be.
For completeness here is an XmlReader version:
var state = FoundElement.NONE;
using(var xe = XmlReader.Create(new StringReader(xmlsnippet)))
while (xe.Read())
{
// if we have not yet found a specific element
if (((state & FoundElement.Leaf) != FoundElement.Leaf) &&
xe.LocalName == "leaf")
{
// add it ... do not change the order of those arguments
arr.Add(new Test(xe.LocalName, xe.ReadElementContentAsString(), 2));
// keep track what we already handled.
state = state | FoundElement.Leaf;
}
if (((state & FoundElement.Sas) != FoundElement.Sas) &&
xe.LocalName == "sas")
{
arr.Add(new Test(xe.LocalName, xe.ReadElementContentAsString(), 2));
state = state | FoundElement.Sas;
}
if (((state & FoundElement.Kir) != FoundElement.Kir) &&
xe.LocalName == "kir")
{
var localName = xe.LocalName; // we need this ...
var cnt = xe.ReadElementContentAsString(); // ... because this moves the reader
if (cnt == "0") {
arr.Add(new Test(localName, "90", 2));
state = state | FoundElement.Kir;
}
}
}
And here is the enum with the different states.
[Flags]
enum FoundElement
{
NONE = 0,
Leaf = 1,
Sas = 2,
Kir = 4
}

How to check if a node has a single child element which is empty?

I have the following code,
XDocument doc = XDocument.Parse(input);
var nodes = doc.Element(rootNode)
.Descendants()
.Where(n =>
(n.Value != "0"
&& n.Value != ".00"
&& n.Value != "false"
&& n.Value != "")
|| n.HasElements)
.Select(n => new
{
n.Name,
n.Value,
Level = n.Ancestors().Count() - 1,
n.HasElements
});
var output = new StringBuilder();
foreach (var node in nodes)
{
if (node.HasElements)
{
output.AppendLine(new string(' ', node.Level) + node.Name.ToString() + ":");
}
else
{
}
}
My problem is that when a parent node has only one child node and that child is empty, I need to insert one extra blank line. I could not figure out how to check whether the only child is empty.
I can get the number of descendants using Descendants = n.Descendants().Count(), but I do not see how I can test whether that only child is empty.

My understanding is that you need all of the parent nodes that have exactly one child node, where that child node is empty.
Here's a simple test that accomplishes this. It doesn't use your example specifically, but it accomplishes the task. If you provide what your XML looks like, I can try to modify my example to fit your post if the code below is not easily transplanted into your project :)
(Taken from a console app, but the query that actually gets the nodes should work.)
static void Main(string[] args)
{
var xml = @"<root><child><thenode>hello</thenode></child><child><thenode></thenode></child></root>";
XDocument doc = XDocument.Parse(xml);
var parentsWithEmptyChild = doc.Element("root")
.Descendants() // gets all descendants regardless of level
.Where(d => string.IsNullOrEmpty(d.Value)) // find only ones with an empty value
.Select(d => d.Parent) // Go one level up to parents of elements that have empty value
.Where(d => d.Elements().Count() == 1); // Of those that are parents take only the ones that just have one element
// IEnumerable<T> has no ForEach method, so print with a plain loop
foreach (var parent in parentsWithEmptyChild)
Console.WriteLine(parent);
Console.ReadKey();
}
This returns only the 2nd node, which is the one containing only one empty node, where empty is assumed to be a value of string.Empty.
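
If the check is needed per node rather than as a query over the whole document, it can be written as a small predicate. The sketch below is only an illustration: `SingleChildCheck` and `HasSingleEmptyChild` are hypothetical names, and unlike the query above it also treats a whitespace-only child as empty.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class SingleChildCheck
{
    // True when the element has exactly one child element and that child
    // has no child elements of its own and no text other than whitespace.
    // (Hypothetical helper, not part of any library.)
    public static bool HasSingleEmptyChild(XElement element)
    {
        // Take(2) is enough to tell "exactly one" from "more than one"
        // without enumerating every child.
        var children = element.Elements().Take(2).ToList();
        return children.Count == 1
            && !children[0].HasElements
            && string.IsNullOrWhiteSpace(children[0].Value);
    }
}
```

In the question's projection this could be called from inside the Select, for example as an extra `SingleEmptyChild = SingleChildCheck.HasSingleEmptyChild(n)` member of the anonymous type, and then tested in the loop before appending the blank line.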

I was trying to solve this problem myself and this is what I came up with:
XDocument doc = XDocument.Parse(input);
var nodes = doc.Element(rootNode).Descendants()
.Where(n => (n.Value != "0" && n.Value != ".00" && n.Value != "false" && n.Value != "") || n.HasElements)
.Select(n => new { n.Name, n.Value, Level = n.Ancestors().Count() - 1,
n.HasElements, Descendants = n.Descendants().Count(),
FirstChildValue = n.HasElements?n.Descendants().FirstOrDefault().Value:"" });
var output = new StringBuilder();
foreach (var node in nodes)
{
if (node.HasElements)
{
output.AppendLine(new string(' ', node.Level) + node.Name.ToString() + ":");
if (0 == node.Level && 1 == node.Descendants && String.IsNullOrWhiteSpace(node.FirstChildValue))
output.AppendLine("");
}
}

Changing a node type to #text whilst keeping the inner nodes with the HtmlAgilityPack

I'm using the HtmlAgilityPack to parse an XML file that I'm converting to HTML. Some of the nodes will be converted to an HTML equivalent. The others that are unnecessary I need to remove while maintaining the contents. I tried converting it to a #text node with no luck. Here's my code:
private HtmlNode ConvertElementsPerDatabase(HtmlNode parentNode, bool transformChildNodes)
{
var listTagsToReplace = XmlTagMapping.SelectAll(string.Empty); // Custom Dataobject
var node = parentNode;
if (node != null)
{
var bNodeFound = false;
if (node.Name.Equals("xref"))
{
bNodeFound = true;
node = NodeXref(node);
}
if (node.Name.Equals("graphic"))
{
bNodeFound = true;
node = NodeGraphic(node);
}
if (node.Name.Equals("ext-link"))
{
bNodeFound = true;
node = NodeExtLink(node);
}
foreach (var infoTagToReplace in listTagsToReplace)
{
if (node.Name.Equals(infoTagToReplace.XmlTag))
{
bNodeFound = true;
node.Name = infoTagToReplace.HtmlTag;
if (!string.IsNullOrEmpty(infoTagToReplace.CssClass))
node.Attributes.Add("class", infoTagToReplace.CssClass);
if (node.HasAttributes)
{
var listTagAttributeToReplace = XmlTagAttributeMapping.SelectAll_TagId(infoTagToReplace.Id); // Custom Dataobject
// Iterate backwards so that removing an attribute does not shift the ones still to be visited
for (int i = node.Attributes.Count - 1; i >= 0; i--)
{
var bDeleteAttribute = true;
foreach (var infoTagAttributeToReplace in listTagAttributeToReplace)
{
if (infoTagAttributeToReplace.XmlName.Equals(node.Attributes[i].Name))
{
node.Attributes[i].Name = infoTagAttributeToReplace.HtmlName;
bDeleteAttribute = false;
}
}
if (bDeleteAttribute)
node.Attributes.Remove(node.Attributes[i].Name);
}
}
}
}
if (transformChildNodes)
for (int i = 0; i < parentNode.ChildNodes.Count; i++)
parentNode.ChildNodes[i] = ConvertElementsPerDatabase(parentNode.ChildNodes[i], true);
if (!bNodeFound)
{
// Replace with #text
}
}
return parentNode;
}
At the end I need to do the node replacement (where you see the "Replace with #text" comment) if the node is not found. I've been ripping my hair (what's left of it) out all day and it's probably something silly. I'm unable to get the help to compile and there is no online version. Help Stackoverflow! You're my only hope. ;-)

I would think you could just do this:
return new HtmlNode(HtmlNodeType.Text, parentNode.OwnerDocument, 0);
This of course adds the node to the head of the document, but I assume you have some sort of code in place to handle where in the document the node should be added.
Regarding the documentation comment, the current (as of this writing) download of the Html Agility Pack documentation contains a CHM file which doesn't require compilation in order to view.
