Cannot find specific XML elements in XML Document - c#

I just ran into a head scratcher, I'm not quite sure why this does not work. I want to find all the elements with the attribute "video".
My XML document looks like this:
<MainMenu>
<div id="BroughtInMenu">
<div class="menuItem0">
Menu Item
<div class="subMenu0">
<div class="menuItem1">
Dictation
<div class="subMenu1">
<div class="menuItem2" video="1">Fee Earner</div>
<div class="menuItem2" video="1">Secretary</div>
<div class="menuItem2" video="1">View File History</div>
</div>
</div>
<div class="menuItem1">
PM Advanced Agenda
<div class="subMenu1">
<div class="menuItem2">
Help
<div class="subMenu2">
<div class="menuItem3" video="1">Release Notes</div>
</div>
</div>
<div class="menuItem2">
System Maintenance
<div class="subMenu2">
<div class="menuItem3" video="1">Additional Field Setup</div>
<div class="menuItem3" video="1">Role Permission Maintenance</div>
<div class="menuItem3" video="1">Shared Diary Permissions</div>
</div>
</div>
<div class="menuItem2">
Utilities
<div class="subMenu2">
<div class="menuItem3" video="1">Change Entity Subtype</div>
<div class="menuItem3" video="1">Field Maintenance</div>
<div class="menuItem3" video="1">Move Client and Files to Fee Earner</div>
<div class="menuItem3" video="1">Reallocate Files</div>
</div>
</div>
</div>
</div> . . . . . . . . . . . . . . . . ..
This is very much the same as HTML. It is for a website, so in the end I want to get all the elements with the attribute "video".
If I can do this, then I will only grab the div elements with the attribute "video", and then I will be able to use that for something else, like in a search, where I actually search the xml document and return the div, etc etc... hope you see my drift here...
Because the video attribute is going to point to a location, it will be very useful for html purposes to just jump to the video when the div is clicked.
So far I have tried this, but I am not getting any elements at all:
XElement xDoc = XElement.Load(Server.MapPath("automation/xml/mainMenu.xml"));
IEnumerable<XElement> list = from el in xDoc.Elements("div") where el.Attribute("video") != null select el;
foreach (XElement element in list)
{
//Nothing found?
}
I also thought about regex... maybe a regex would be able to pull out the divs I want, already in text format, so that I can just push them into an HTML element on the website?
Any help will be greatly appreciated!

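Use Descendants instead of Elements. Elements returns only the immediate children of the element you call it on (here the root <MainMenu>), while Descendants searches every level below it.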
var xDoc = XElement.Load(Server.MapPath("automation/xml/mainMenu.xml"));
var list = from el in xDoc.Descendants("div")
where el.Attribute("video") != null
select el;
foreach (XElement element in list)
{
// each element here is a div with a video attribute
}
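The difference between the two methods can be seen in a minimal sketch. The inline XML is a hypothetical, shortened stand-in for mainMenu.xml, and the variable names are only illustrative:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical, shortened stand-in for mainMenu.xml.
var root = XElement.Parse(@"
<MainMenu>
  <div id='BroughtInMenu'>
    <div class='menuItem1'>Dictation
      <div class='subMenu1'>
        <div class='menuItem2' video='1'>Fee Earner</div>
        <div class='menuItem2' video='1'>Secretary</div>
      </div>
    </div>
  </div>
</MainMenu>");

// Elements() only looks at the direct children of <MainMenu>,
// and the single direct child has no video attribute.
var direct = root.Elements("div")
                 .Where(el => el.Attribute("video") != null)
                 .ToList();

// Descendants() walks the whole subtree below <MainMenu>.
var nested = root.Descendants("div")
                 .Where(el => el.Attribute("video") != null)
                 .ToList();

Console.WriteLine(direct.Count); // 0
Console.WriteLine(nested.Count); // 2
foreach (var el in nested)
    Console.WriteLine(el.Value.Trim());
```

This is why the original loop never ran: the only div directly under the root is the "BroughtInMenu" wrapper.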

You can select elements where a particular attribute is present with XPath. To use the XPath extension methods, you need to include the namespace.
using System.Xml.XPath;
An XPath such as "//div[@video]" selects "div" elements at any level, but keeps only those with a "video" attribute, so you're not looping through lots of elements yourself to check for the presence of an attribute.
var xDoc = XElement.Load(Server.MapPath("automation/xml/mainMenu.xml"));
foreach (var divWithVideo in xDoc.XPathSelectElements("//div[@video]")) {
    Console.WriteLine(divWithVideo);
}
Here you are only iterating on the elements with a "video" attribute.
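A self-contained sketch of the XPath approach, for trying it without the web server. The inline XML (including the div without a video attribute) is a hypothetical sample, and it is loaded with XDocument.Parse rather than from mainMenu.xml:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath; // brings in the XPathSelectElements extension method

// Hypothetical sample; one div deliberately has no video attribute.
var doc = XDocument.Parse(@"
<MainMenu>
  <div id='BroughtInMenu'>
    <div class='menuItem2'>Help
      <div class='subMenu2'>
        <div class='menuItem3' video='1'>Release Notes</div>
      </div>
    </div>
    <div class='menuItem2'>Utilities
      <div class='subMenu2'>
        <div class='menuItem3' video='1'>Field Maintenance</div>
        <div class='menuItem3'>No video here</div>
      </div>
    </div>
  </div>
</MainMenu>");

// [@video] is an attribute-existence test: it keeps divs that carry
// a video attribute, whatever its value.
var names = doc.XPathSelectElements("//div[@video]")
               .Select(el => el.Value.Trim())
               .ToList();

foreach (var name in names)
    Console.WriteLine(name);
```

To filter on the attribute's value as well, a predicate such as [@video='1'] can be used in place of [@video].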

Related

document.GetElementsByTagName("div") does not return children

So I have this html code:
<div class="gid-day-container">
<div id="activityCol">
<div id="act-01"></div>
<div id="act-02"></div>
<div id="act-03"></div>
</div>
</div>
And I'm using this code to look for HTML elements with the tag div:
var infoBlocks = doc.GetElementsByTagName("div");
It catches the div with the class gid-day-container and it catches the div with the id activityCol, but it doesn't catch the three divs with the ids act-01 to act-03.
I'm using C# code inside Visual Studio.
I can't seem to figure out why so help would be appreciated. Thanks in advance!
In JavaScript the method name is camelCase, so there you should have this:
document.getElementsByTagName("div");
Not:
document.GetElementsByTagName("div");

How to count nested div using selenium c#?

<div class="bodyCells">
<div style="position:absolute;left:0;">
<div style="overflow:hidden;">
<div title="AAA" class="pivotTableCellWrap">AAA</div>
<div title="BBB" class="pivotTableCellWrap">BBB</div>
</div>
<div>
<div title="AAA-123" class="pivotTableCellWrap">AAA-123</div>
<div title="BBB-123" class="pivotTableCellWrap">BBB-123</div>
</div>
</div>
</div>
I have two bodyCells divs on my page and I want to count the nested divs inside the second one.
Required output :- I want the count=2
Tried Approach :-
int rowCount = driver.FindElements(By.XPath("//div[@class='bodyCells[2]']//div").Count());
Console.WriteLine(rowCount);
You can use the modified XPath below in order to get the count of the divs nested in the second inner div.
XPath: //div[@class='bodyCells']/div/div[2]/div
Code:
var rowCount = _driver.FindElements(By.XPath("//div[@class='bodyCells']/div/div[2]/div")).Count;
Console.WriteLine(rowCount);
As per the HTML you have provided, to count the nested child <div>s inside the second (parent) <div> you can use either of the following solutions:
CssSelector:
var elements = driver.FindElements(By.CssSelector("div.bodyCells div.pivotTableCellWrap[title*='-']"));
Console.WriteLine(elements.Count);
XPath:
var elements = driver.FindElements(By.XPath("//div[@class='bodyCells']//div[@class='pivotTableCellWrap' and contains(@title,'-')]"));
Console.WriteLine(elements.Count);

Get all elements in a NodeCollections

I have an html file :
<div class="form-wrapper">
<div></div>
<div class="Clearfix">
<div></div>
<div></div>
<span></span><span class="time">Time</span>
</div>
<div></div>
<div class="Clearfix">
<div></div>
<div></div>
<span></span><span class="time">Time1</span>
</div>
<div></div>
<div class="Clearfix">
<div></div>
<div></div>
<span></span><span class="time">Time2</span>
</div><div></div>
<div class="Clearfix">
<div></div>
<div></div>
<span></span><span class="time">Time3</span>
</div>
I'm using the c# code below to get all the times items :
var node_1 = htmlDocument.DocumentNode.SelectNodes("//div[@class='form-wrapper']").First();
var ITEM = node_1.SelectNodes("//div[@class='clearfix']");
for (int Node = 0; Node < ITEM.Count; Node++)
{
    Console.WriteLine(ITEM[Node].SelectNodes("//span[@class='time']")[1].InnerText.Trim());
}
Console.ReadKey();
I'm taking the First() "form-wrapper" since there are many.
I tried to use this too:
foreach (var Node in node_1.SelectNodes("//div[@class='clearfix']"))
{
//
}
The issue is: as you can see, I have 4 Clearfix classes, so I need to get the result:
Time
Time1
Time2
Time3
but for some reason I only get:
Time
Time
Time
Time
When you query from a specific node, you don't need // at the beginning; if you add it, the query is executed over the whole document. Use a relative path such as .//span[@class='time'] instead.
After selecting, you need to take the first node, so use index 0, not 1.
These two points will solve your problem, but there are some improvements you can make:
Instead of SelectNodes().First() you can use SelectSingleNode().
If you don't need any information about the parent nodes, you can query the child nodes directly: htmlDocument.DocumentNode.SelectNodes("//span[@class='time']") will do all the work.
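The points above can be put together in a rough sketch. It assumes the HtmlAgilityPack NuGet package is referenced, and the inline HTML is a shortened, hypothetical version of the page from the question:

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumption: HtmlAgilityPack NuGet package is installed

var htmlDocument = new HtmlDocument();
htmlDocument.LoadHtml(@"
<div class='form-wrapper'>
  <div class='Clearfix'><div></div><span></span><span class='time'>Time</span></div>
  <div></div>
  <div class='Clearfix'><div></div><span></span><span class='time'>Time1</span></div>
</div>");

// SelectSingleNode replaces SelectNodes(...).First().
var wrapper = htmlDocument.DocumentNode.SelectSingleNode("//div[@class='form-wrapper']");

var times = new List<string>();

// "./div" is relative to wrapper. Attribute values are compared
// case-sensitively, so the class must be written as 'Clearfix'.
// Note that SelectNodes returns null when nothing matches.
foreach (var item in wrapper.SelectNodes("./div[@class='Clearfix']"))
{
    // ".//" searches below the current item only; a bare "//" would
    // restart from the document root and return the first span every time.
    var time = item.SelectSingleNode(".//span[@class='time']");
    times.Add(time.InnerText.Trim());
}

foreach (var t in times)
    Console.WriteLine(t);
```

The leading dot in the inner query is what makes each iteration return its own span instead of the first one in the document.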

c# How to grab string from inside <b> that's inside a div class

So I have been using things like this:
webBrowser1.Document.GetElementById("month").SetAttribute("value", exp1);
Which allows me to set values, but now I want to grab a value instead of replacing it.
<div class="contents">
<div class="background">stuff</div>
<div class="content">
<h2>title</h2>
<p>
Blah blah number is <b>0100000</b>
<p>
</div>
</div>
How can I grab the number inside the <b> tag that's inside the content class? Kind of stuck!
Thanks!
This is the code you need:
string theText = null;
foreach (HtmlElement item in webBrowser1.Document.GetElementsByTagName("div"))
{
    // The class attribute is read as "className" here.
    if (item.GetAttribute("className") == "content")
        theText = item.GetElementsByTagName("b")[0].InnerText;
}
You can always get InnerText or InnerHtml of a chosen tag.
Try this answer: https://stackoverflow.com/a/2958449/1786034
As explained in that Stack Overflow question, in JavaScript:
document.getElementById('id').getElementsByTagName('b')[0].firstChild.nodeValue

how to get html div element innertext by id using regular expression in C#

I'm getting the full HTML using WebClient, but I need to get a specified div from the full HTML using a regular expression.
For example:
<body>
<div id="main">
<div id="left" style="float:left">this is a <b>left</b> side:<div style='color:red'> 1 </div>
</div>
<div id="right" style="float:left"> main side</div>
<div>
</body>
If I need the div named 'main', the function returns
<div id="left" style="float:left">this is a <b>left</b> side:<div style='color:red'> 1 </div>
</div>
<div id="right" style="float:left"> main side</div>
If I need the div named 'left', the function returns
this is a <b>left</b> side:<div style='color:red'> 1 </div>
If I need the div named 'right', the function returns
main side
How can I do this?
Why do people insist on trying to use regex to parse html? You can probably do it if you exclude a whole host of edge-cases... but just use HTML Agility Pack and you're done:
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(...); // or Load
string main = doc.DocumentNode.SelectSingleNode("//div[@id='main']").InnerHtml;
(note I'm assuming it is not xhtml; if it is xhtml, use XmlDocument or XDocument, and very similar code to the above)
string divname = "somename";
Match m = Regex.Match(htmlContent, "<div[^>]*id=\"" + divname + "\"[^>]*>(.*?)</div>", RegexOptions.Singleline);
string content = m.Groups[1].ToString();
This won't work if you have nested divs inside the desired div.
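The nested-div limitation is easy to reproduce. The sketch below runs a regex of this kind against the 'left' div from the question; the variable names and the exact pattern are illustrative assumptions, not a recommended solution:

```csharp
using System;
using System.Text.RegularExpressions;

// The 'left' div from the question, with one div nested inside it.
string html = "<div id=\"left\" style=\"float:left\">this is a <b>left</b> side:" +
              "<div style='color:red'> 1 </div></div>";

string divName = "left";
string pattern = "<div[^>]*id=\"" + Regex.Escape(divName) + "\"[^>]*>(.*?)</div>";

Match m = Regex.Match(html, pattern, RegexOptions.Singleline);
string captured = m.Groups[1].Value;

// The lazy (.*?) stops at the FIRST </div>, which closes the inner div,
// so the capture ends too early and leaves the nested <div> unclosed.
Console.WriteLine(captured);
```

A regular expression cannot count opening and closing tags in general, which is why an HTML parser is the more reliable choice here.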
