Remove an XElement from another XDocument


I need to remove an XElement from an XDocument.
The problem is that I can't just call .Remove() on it, because the XElement I have belongs to a different XDocument.
Performance is very important here.
Scenario: I have an XDocument docSource and I copy this to XDocument doc. I select a Node of docSource and want to delete this Node in my doc.
So far I'm using this workaround (which may also delete some wrong nodes if they have the same name and the same parent name, but that doesn't matter so far):
private static XNode actualNode;
private static void RemoveNode(XDocument doc)
{
    doc.Root.Descendants(((XElement)actualNode).Name.LocalName)
        .Where(e => actualNode.Parent.Name.LocalName.Equals(e.Parent.Name.LocalName))
        .Remove();
}
Is there a better way to do this? And especially a faster way?
My XDocument has like 1000 lines.

Well a better way of doing the existing name-based approach would be:
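// actualNode is declared as XNode, which has no Name property, so cast it first.
var selected = (XElement)actualNode;
doc.Root.Descendants(selected.Parent.Name)
    .Elements(selected.Name)
    .Remove();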
Aside from anything else, that's simpler - and doesn't use just the local name. (If the elements are actually in different namespaces, you should take account of that separately IMO.)
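A minimal sketch of the name-based approach above, run against a small made-up document. The class name, the method name, and the sample XML are assumptions for illustration only, and the selected element is passed in as an XElement rather than read from a static XNode field.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class NameBasedRemoval
{
    // Removes every element in doc that has the same qualified name as
    // 'selected' and whose parent has the same qualified name as
    // selected.Parent. Assumes 'selected' is not the root (Parent != null).
    public static void RemoveByParentAndName(XDocument doc, XElement selected)
    {
        doc.Root.Descendants(selected.Parent.Name)
            .Elements(selected.Name)
            .Remove();
    }

    public static void Main()
    {
        // Hypothetical sample data.
        XDocument docSource = XDocument.Parse(
            "<root><group><item>a</item><item>b</item></group>" +
            "<other><item>c</item></other></root>");
        XDocument doc = new XDocument(docSource); // deep copy of the source

        XElement selected = docSource.Root.Element("group").Elements("item").First();
        RemoveByParentAndName(doc, selected);

        Console.WriteLine(doc);
    }
}
```

Note that both item elements under group are removed, not only the selected one, which is the limitation mentioned in the question. Also, Descendants() does not include Root itself, so elements sitting directly under the root element are not matched by this query.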
But this is still just using "element name and parent name" as a way of identifying an element. Do you have anything else which will identify the element more reliably? Some kind of attribute? I'm assuming you actually have some idea of what kind of element you'll be finding.
"My XDocument has like 1000 lines."
Then it should be blink-of-an-eye quick anyway. Do you actually have any indication that this is causing a performance problem?
Another thing to consider:
"Scenario: I have an XDocument docSource and I copy this to XDocument doc. I select a Node of docSource and want to delete this Node in my doc."
Is there any reason you don't just avoid copying the node to start with?

As you rightly said, if you rely only on Parent.Name.LocalName you may end up deleting the wrong child nodes when several parents have the same name.
If you check for repeated parent nodes before deleting the child nodes, you can overcome this issue.
You should be able to get an exact match by loading the nodes into an array or list, so that you can find the position of the exact parent node. I am afraid it will not improve performance, though.
For example, suppose you have 3 parent nodes named 'XZY'.
The user selects the second parent node, so the parent index is 1 (assuming indexes start at 0).
You should then delete only the children under parent index 1.
Hope this helps.
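The index idea above can be sketched as follows, assuming doc is still an unmodified structural copy of docSource. The helper names GetIndexPath and FindByIndexPath and the sample XML are hypothetical; this is one possible way to record positions, not the only one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class PositionalRemoval
{
    // For each level from the root down to 'element', records the index of
    // the element among its parent's child elements.
    public static List<int> GetIndexPath(XElement element)
    {
        var path = new List<int>();
        for (XElement e = element; e.Parent != null; e = e.Parent)
        {
            path.Add(e.ElementsBeforeSelf().Count());
        }
        path.Reverse();
        return path;
    }

    // Follows the same index path in another document. Returns null when the
    // structures differ and the path cannot be followed.
    public static XElement FindByIndexPath(XDocument doc, IEnumerable<int> path)
    {
        XElement current = doc.Root;
        foreach (int index in path)
        {
            if (current == null) return null;
            current = current.Elements().ElementAtOrDefault(index);
        }
        return current;
    }

    public static void Main()
    {
        // Hypothetical sample: three parents with the same name.
        XDocument docSource = XDocument.Parse(
            "<root><XZY><child>1</child></XZY><XZY><child>2</child></XZY>" +
            "<XZY><child>3</child></XZY></root>");
        XDocument doc = new XDocument(docSource);

        // The user selects the child under the second XZY parent (index 1).
        XElement selected = docSource.Root.Elements("XZY").ElementAt(1).Element("child");

        XElement match = FindByIndexPath(doc, GetIndexPath(selected));
        if (match != null) match.Remove();

        Console.WriteLine(doc);
    }
}
```

Only the one element at the same position is removed, so parents that share a name are no longer a problem. The cost is a sibling count at each level, which should be negligible for a document of about 1000 lines.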

Related

How to get a second or third XML node when using an anonymous type?

I'm using an anonymous type to grab some XML data. All was going well until I ran across a section of XML where there can be 2 or 3 similar nodes. In the XML sample below there are 3 separate "Phones". My code worked fine when there was only ONE element to grab after following the "element path" I led it to. How can I grab a specific one? Or all 3, for that matter? Handling XML is still new to me, and there seem to be so many ways of handling it. Searching the web for my exact need didn't prove successful. Thanks.
var nodes = from node in doc.Elements("ClaimsSvcRs").Elements("ClaimDownloadRs")
            select new
            {
                Phone1 = (string)node.Elements("Communications").Elements("PhoneInfo").Elements("PhoneNumber").FirstOrDefault(),
                Phone2 = (string)node.Elements("Communications").Elements("PhoneInfo").Elements("PhoneNumber").FirstOrDefault(),
            };
The XML is:
<?xml version="1.0" encoding="UTF-8"?>
<TEST>
    <ClaimsSvcRs>
        <ClaimDownloadRs>
            <Communications>
                <PhoneInfo>
                    <PhoneTypeCd>Phone</PhoneTypeCd>
                    <CommunicationUseCd>Home</CommunicationUseCd>
                    <PhoneNumber>+1-715-5553944</PhoneNumber>
                </PhoneInfo>
                <PhoneInfo>
                    <PhoneTypeCd>Phone</PhoneTypeCd>
                    <CommunicationUseCd>Business</CommunicationUseCd>
                    <PhoneNumber>+1-715-5552519</PhoneNumber>
                </PhoneInfo>
                <PhoneInfo>
                    <PhoneTypeCd>Phone</PhoneTypeCd>
                    <CommunicationUseCd>Cell</CommunicationUseCd>
                    <PhoneNumber>+1-715-5551212</PhoneNumber>
                </PhoneInfo>
            </Communications>
        </ClaimDownloadRs>
    </ClaimsSvcRs>
</TEST>

I haven't used XPath in a while, so I'll let someone else cover that, but there is a way to select a particular PhoneInfo element based on its subelements. So if you know whether you want Home, Business, Cell, or whatever, you can select that particular PhoneInfo element. Otherwise, if you want simple Phone1, Phone2, Phone3 and nulls are OK, use the LINQ Skip method: Phone2 = query.Skip(1).FirstOrDefault()
XPath can be mixed in here, was my thought, and might be more elegant if your CommunicationUseCd values are deterministic. Then you could have Home = ... and Work = ..., etc., instead of Phone1 and Phone2.
The same could be accomplished by adding a where clause to each of your query lines.
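Both suggestions above (a where clause on CommunicationUseCd, and Skip for a purely positional pick) can be sketched like this. The class name PhoneLookup, the helper PhoneFor, and the trimmed sample XML are assumptions made for the example.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class PhoneLookup
{
    // Returns the PhoneNumber of the first PhoneInfo whose CommunicationUseCd
    // equals 'use', or null when no such PhoneInfo exists.
    public static string PhoneFor(XElement claim, string use)
    {
        return (string)claim.Elements("Communications")
            .Elements("PhoneInfo")
            .Where(p => (string)p.Element("CommunicationUseCd") == use)
            .Elements("PhoneNumber")
            .FirstOrDefault();
    }

    public static void Main()
    {
        // Trimmed-down version of the XML from the question.
        XElement claim = XElement.Parse(@"
<ClaimDownloadRs>
  <Communications>
    <PhoneInfo><CommunicationUseCd>Home</CommunicationUseCd><PhoneNumber>+1-715-5553944</PhoneNumber></PhoneInfo>
    <PhoneInfo><CommunicationUseCd>Business</CommunicationUseCd><PhoneNumber>+1-715-5552519</PhoneNumber></PhoneInfo>
    <PhoneInfo><CommunicationUseCd>Cell</CommunicationUseCd><PhoneNumber>+1-715-5551212</PhoneNumber></PhoneInfo>
  </Communications>
</ClaimDownloadRs>");

        var numbers = claim.Elements("Communications")
            .Elements("PhoneInfo")
            .Elements("PhoneNumber");

        var result = new
        {
            Home = PhoneFor(claim, "Home"),
            Cell = PhoneFor(claim, "Cell"),
            // Positional alternative: second number, or null if there is none.
            Phone2 = (string)numbers.Skip(1).FirstOrDefault()
        };

        Console.WriteLine(result.Home);
        Console.WriteLine(result.Cell);
        Console.WriteLine(result.Phone2);
    }
}
```

Casting a null XElement to string yields null, so a missing phone type or a missing second number does not throw.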

If you're up for LINQ you can get all your elements in one go:
foreach (XElement phone in XDocument.Parse(xmlString).Descendants("PhoneInfo"))
{
    Console.WriteLine(phone.Element("PhoneNumber").Value);
    //etc
}
I find XDocument & LINQ a lot easier than XmlDocument & XPath, if you're okay with the alternative. There's more info on them here

Order HtmlNodes Based on their position on the HTML Page (C# / XPath)

Context:
I am parsing the result of a Query on this service, but the HTML with the result is a mess.
My goal is to build a key-value pair for each attribute and value shown in the result of this query.
At the moment, only one way of solving it has come to mind.
Logic for Parsing:
1. Select all the attribute nodes.
2. Select all the value nodes.
3. Match the indexes of the two collections to build the key-value pairs.
E.g.: Attribute[0] with Value[0] (in this service, that would be "CNPJ" and "12.272.084/0001-00").
Problem:
Even though I managed to find an XPath expression to fetch all the attribute nodes:
attrNodes = htmlDoc.DocumentNode.SelectNodes("//td[@bgcolor='#f1f1b1']/*/font[@face='Verdana']");
I could not find one for the value nodes as well, since there are different types of nodes that look the same when rendered as HTML ("b" and "strong", for example).
There are even nodes with different hierarchies (a single tag versus two nested tags, for example), which prevented me from using wildcards ("*") in XPath to solve it.
My Goal:
1. Write XPath expressions to reach each different subset of value nodes.
2. Put all the nodes in a single collection.
3. Order the nodes of this collection by the position of each node in the HTML (nodes that appear first in the HTML go at the beginning of the list).
Any idea how I can achieve this?
HTML Sample:
You can either give it a check here
or Query yourself the service by typing : 12272084000100 on the CNPJ textbox
and clicking on "Pesquisar". After that, you just have to click on the text "Companhia Eletrica de Alagoas"
Thanks in Advance

I just found a property on the HtmlNode class of the HtmlAgilityPack library that solved my problem.
According to this documentation about the HtmlNode Class:
StreamPosition
Gets the stream position of this node in the document, relative to the start of the document.
Here is the output of my tests using a list of tables found in this very same Html Page (tables used for testing purposes)
// HtmlNodeCollection of Tables
tableNodes[0].StreamPosition
925
tableNodes[1].StreamPosition
1651
tableNodes[2].StreamPosition
2387
Ordering my list by StreamPosition solved my problem.
List<HtmlNode> OrderedList = valueNodes.OrderBy(node => node.StreamPosition).ToList();
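Putting the pieces together, here is a rough sketch of the "several XPath expressions, one ordered list" goal from the question. It needs the third-party HtmlAgilityPack package; the helper name SelectInDocumentOrder, the XPath expressions, and the sample HTML are assumptions for illustration and are not taken from the original page.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // third-party NuGet package, not part of .NET itself

public static class DocumentOrder
{
    // Runs each XPath expression, merges the matches into one list, and
    // orders the result by the position of each node in the source HTML.
    public static List<HtmlNode> SelectInDocumentOrder(HtmlDocument doc, params string[] xpaths)
    {
        var nodes = new List<HtmlNode>();
        foreach (string xpath in xpaths)
        {
            // SelectNodes returns null (not an empty collection) when nothing matches.
            HtmlNodeCollection found = doc.DocumentNode.SelectNodes(xpath);
            if (found != null) nodes.AddRange(found);
        }
        return nodes.Distinct()                       // a node may match more than one XPath
                    .OrderBy(n => n.StreamPosition)
                    .ToList();
    }

    public static void Main()
    {
        var doc = new HtmlDocument();
        // Hypothetical markup mixing "b" and "strong" value nodes.
        doc.LoadHtml("<table><tr><td><b>one</b></td><td><strong>two</strong></td>" +
                     "<td><b>three</b></td></tr></table>");

        foreach (HtmlNode node in SelectInDocumentOrder(doc, "//strong", "//b"))
        {
            Console.WriteLine(node.InnerText);
        }
    }
}
```

The XPath expressions can be passed in any order, since the final sort depends only on StreamPosition.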

Check all the children for XElement

I have an XElement object, which is my XML tree read from an XML file. Now I want to check all the nodes in this tree to get the first attribute name and value of each. Is there a simple way to go through all of the nodes (from the root to the leaves)? My XML file has many different and strange nodes, which makes this harder to solve. I thought about writing some recursion, but I hope there is an easier way.

Maybe take a look at XPath. An XPath expression like //*[@id=42] could do the job.
It means: get all elements that have an "id" attribute with the value 42.
You can also use just //*, which returns all elements in the tree.
Xpath :
http://msdn.microsoft.com/en-gb/library/ms950786.aspx
Syntax :
http://msdn.microsoft.com/en-us/library/ms256471.aspx

You can get all children elements using XElement.Elements().
Here's some code using recursion to get all elements of each level:
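void GetElements(XElement element)
{
    // Elements() returns only the direct child elements.
    foreach (XElement e in element.Elements())
    {
        // some stuff here
        // Elements() never returns null, so no null check is needed;
        // for a leaf element the loop in the recursive call simply does not run.
        GetElements(e);
    }
}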
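If explicit recursion is not wanted, LINQ to XML can also flatten the tree with DescendantsAndSelf(). The sketch below reads the first attribute of every element that has one; the class name FirstAttributes, the method Describe, and the sample XML are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class FirstAttributes
{
    // Returns "element: attribute=value" for every element (root included)
    // that has at least one attribute, in document order.
    public static List<string> Describe(XElement root)
    {
        return root.DescendantsAndSelf()
            .Where(e => e.FirstAttribute != null) // skip elements without attributes
            .Select(e => e.Name + ": " + e.FirstAttribute.Name + "=" + e.FirstAttribute.Value)
            .ToList();
    }

    public static void Main()
    {
        // Hypothetical sample tree.
        XElement root = XElement.Parse("<a id='1'><b><c kind='x' size='2'/></b><d/></a>");
        foreach (string line in Describe(root))
        {
            Console.WriteLine(line);
        }
    }
}
```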

Select child nodes, but ignore non-elements with XPath?

Given the following XML document for example:
<?xml version="1.0"?>
<UrdaObject>
<Date>
<Year>2011</Year>
<Month>5</Month>
<Day>18</Day>
<Hours>8</Hours>
<Minutes>47</Minutes>
<Seconds>36</Seconds>
</Date>
<random_value>24</random_value>
</UrdaObject>
And given the understanding that child::node() selects all child nodes of the current node, how would I create an XPath expression (starting from the root) that selects all child nodes EXCEPT text, comments, and other things that are NOT elements? For example, when using this code to create a tree view in WPF:
// x is some XmlDocument, xmlTree is my WPF TreeView
XmlDataProvider provider = new XmlDataProvider();
provider.Document = x;
Binding binding = new Binding();
binding.Source = provider;
binding.XPath = "child::node()";
xmlTree.SetBinding(TreeView.ItemsSourceProperty, binding);
How would I go about creating my XPath statement so I build a treeview with nodes going all the way down and stopping before the raw text? For example it would generate a view of:
UrdaObject
    Date
        Year
        ...
Instead of...
UrdaObject
    Date
        Year
            2011 (Don't want this!)
        ...
The sample XML file is just for me to explain my situation. The expression should be able to navigate any valid XML file and pull the elements, but not the individual text.
How did we fix this? I had switched all references of child::node() to child::*. However, I had NOT corrected one line in my XAML, which was pulling child::node(). Correcting this line made the application behave correctly... and made me feel silly.

child::node() finds all child nodes. child::* finds all element nodes.

it's as simple as *.
(that gets immediate children, however; if you want all descendant elements, it would be descendant::*)

child::* will exclude text nodes and leave only element nodes
child::text() will include only text nodes
child::node() will include both element and text nodes
http://www.w3.org/TR/xpath/#location-paths
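The difference between the three node tests can be checked with a small XmlDocument experiment. The class name ChildAxisDemo, the Count helper, and the sample XML (with a comment added on purpose) are assumptions for the demonstration.

```csharp
using System;
using System.Xml;

public static class ChildAxisDemo
{
    // Counts the nodes selected by 'xpath' relative to 'context'.
    public static int Count(XmlNode context, string xpath)
    {
        return context.SelectNodes(xpath).Count;
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<Date><Year>2011</Year><!-- note --><Month>5</Month></Date>");

        XmlNode date = doc.DocumentElement; // children: Year, a comment, Month
        XmlNode year = date.FirstChild;     // only child: the text node "2011"

        Console.WriteLine("Date child::node() " + Count(date, "child::node()"));
        Console.WriteLine("Date child::*      " + Count(date, "child::*"));
        Console.WriteLine("Year child::node() " + Count(year, "child::node()"));
        Console.WriteLine("Year child::*      " + Count(year, "child::*"));
        Console.WriteLine("Year child::text() " + Count(year, "child::text()"));
    }
}
```

Under Date, child::node() also picks up the comment while child::* does not; under Year, child::* selects nothing because the only child is a text node.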

Not sure if this is what you want but could it be done this way?
var doc = XDocument.Parse(@"
<UrdaObject>
    <Date>
        <Year>2011</Year>
        <Month>5</Month>
        <Day>18</Day>
        <Hours>8</Hours>
        <Minutes>47</Minutes>
        <Seconds>36</Seconds>
    </Date>
    <random_value>24</random_value>
</UrdaObject>
");
var query = from s in doc.Descendants()
            select s.Name;
foreach (var name in query)
{
    Console.WriteLine(name);
}

XML ancestors wanted

In C#, I need to get
currentnode.parentnode.parentnode.parentnode.firstchild.lastchild.lastchild
I am using this to generate an MLM tree. Some of the labels that represent individual nodes overload at the fourth level, so I am trying to get those nodes and separate them.
I am new to XML; I hope my question is clear.

If you have the current node you want to work on, then XmlNode defines ParentNode, FirstChild, and LastChild properties that you can use to do this.
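As a sketch of that property chain, assuming an XmlDocument and a made-up tree (the class name, the Navigate helper, and the element names are all hypothetical): the null-conditional operator, which needs C# 6 or later, makes the result null instead of throwing if any step of the path is missing.

```csharp
using System;
using System.Xml;

public static class AncestorNavigation
{
    // Follows parent, parent, parent, first child, last child, last child.
    // Returns null if any node along the way does not exist.
    public static XmlNode Navigate(XmlNode current)
    {
        return current?.ParentNode?.ParentNode?.ParentNode
                      ?.FirstChild?.LastChild?.LastChild;
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        // Hypothetical tree; element names carry no meaning.
        doc.LoadXml("<root><a><x/><y><p/><q/></y></a><b><c><d/></c></b></root>");

        XmlNode current = doc.SelectSingleNode("//d");
        XmlNode target = Navigate(current);

        Console.WriteLine(target == null ? "(no such node)" : target.Name);
    }
}
```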

Consider using XQuery or XPath in order to perform queries on your XML tree.
There's a nice tutorial here, showing all the common options.

