Get data from HTML child class - c#

I’m attempting to create a tool, in C#, which gathers and analyses data from a web page/form. There are basically two types of data: data entered by a user and data created by the system (which I don’t have access to).
The data entered by the user is kept in fields and the form uses IDs, so GetElementById is used.
The problem I’m running into is obtaining the data created by the system. It shows on the form, but isn’t associated with an ID. I may be reading/interpreting the HTML incorrectly, but it appears to be a child class (I don’t have much HTML experience). I’m attempting to get the “Date Submitted” data (near the bottom of the code). Sample of the HTML code:
<div class="bottomSpace">
<div class="importfromanotherorder">
<div class="level2Panel" >
<div class="left">
<span id="if error" class="error"></span>
</div>
<div class="right">
Enter Submission ID
<input name="Submission$ID" type="text" id="Submission_ID" class="textbox" />
<input type="submit" name="SumbitButton" value="Import" id="SubmitButton" />
</div>
</div>
</div>
</div>
<div class="bottomSpace">
<div class="detailsinfo">
<div class="level2Panel" >
<div class="left">
<h5>Product ID</h5>
1234567
<h5>Sub ID</h5>
Not available
<h5>Product Type</h5>
Type 1
</div>
<div class="right">
<h5>Order Number</h5>
0987654
<h5>Status</h5>
Ordered
<h5>Date Submitted</h5>
7 17 2012 5 45 09 AM
</div>
</div>
</div>
</div>
Using GetElementsByTagName (searching for “div”) and then GetAttribute(“className”) (searching for “right”) generates some results, but as there are 2 divs with class “right”, it’s not working as intended.
I’ve tried searching by className = “detailsinfo”, which I can find, but I’m not sure how to get from there down to the “right” div. I tried sibling and children, but the results don't appear to be correct. The next possible problem is that the date appears to be text belonging to the “right” div itself and not to the “Date Submitted” element.
So basically, I'm curious what the best approach would be to get the data I'm looking for. Would I need to get all of the text in the “right” div and then try to extract the date string?
Apologies if there is too much info or not enough of the required info :) Thanks in advance!
EDIT: Added how GetElementsByTagName is called using C# - per Icarus's comment.
HtmlDocument doc = webBrowser1.Document;
HtmlElementCollection elemColl = doc.GetElementsByTagName("div");

This will do it if the 'right' instance you want is the 2nd. Two approaches are given:
The commented-out approach indexes a node list, which is zero-based, so it uses instance 1.
The second approach uses an XPath positional predicate, which is one-based, so it uses instance 2.
private string ReadHTML(string html)
{
    // LoadXml only accepts well-formed XML with a single root element,
    // so an HTML fragment may need to be wrapped in one root tag first.
    System.Xml.XmlDocument doc = new System.Xml.XmlDocument();
    doc.LoadXml(html);
    System.Xml.XmlElement element = doc.DocumentElement;
    // This commented-out approach works and might be preferred if you want to iterate
    // over a node set instead of choosing just one node:
    //string key = "//div[@class='right']";
    //System.Xml.XmlNodeList setting = element.SelectNodes(key);
    //return setting[1].LastChild.InnerText;
    // This XPath approach will let you select exactly one node:
    string key = "((//div[@class='right'])[2])/child::text()[last()]";
    System.Xml.XmlNode setting = element.SelectSingleNode(key);
    return setting.InnerText;
}
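As a variation on the answer above, the value can also be located by its heading instead of by position. This is a minimal sketch, assuming the markup has already been made well-formed XML with a single root element; the class name `DateSubmittedExample`, the method `ReadValueAfterHeading`, and the `<root>` wrapper are hypothetical names introduced for illustration.

```csharp
using System;
using System.Xml;

public static class DateSubmittedExample
{
    // Returns the trimmed text that directly follows the <h5> whose text is headingText,
    // or null when no such heading (or no following text node) exists.
    public static string ReadValueAfterHeading(string xml, string headingText)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml); // throws XmlException if the markup is not well-formed XML

        // Assumption: headingText contains no single quote, because it is pasted
        // into the XPath string literal as-is.
        string xpath = "//h5[normalize-space(.)='" + headingText + "']"
                     + "/following-sibling::text()[1]";
        XmlNode node = doc.SelectSingleNode(xpath);
        return node == null ? null : node.Value.Trim();
    }

    public static void Main()
    {
        // Hypothetical <root> wrapper around a cut-down copy of the question's markup.
        string xml =
            "<root>" +
            "<div class=\"right\">Enter Submission ID</div>" +
            "<div class=\"right\">" +
            "<h5>Order Number</h5>\n0987654\n" +
            "<h5>Status</h5>\nOrdered\n" +
            "<h5>Date Submitted</h5>\n7 17 2012 5 45 09 AM\n" +
            "</div>" +
            "</root>";

        Console.WriteLine(ReadValueAfterHeading(xml, "Date Submitted"));
        // prints: 7 17 2012 5 45 09 AM
    }
}
```

Selecting by heading text keeps working if another div with class "right" is added to the page, which the positional `[2]` predicate would not.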

Related

Why is my xpath able to find the element at all?

I want to print text of each div with class="Name". The code below prints Name1 three times instead of Name1, Name2 and Name3.
Why does my code print Name1 three times?
Why is dateInput.FindElement even able to find the Root div at all? Root div is located in completely different level than the date element. And since I'm doing //div..., which means find the div in the current node (right?), on dateInput.FindElement it should NOT even find the Root div, right?
CODE
var dateInput = driver.FindElement(By.Id("date"));
var rootElement = dateInput.FindElement(By.XPath("//div[contains(@class,'Root')]"));
var boxes = rootElement.FindElements(By.XPath("//div[contains(@class,'Box')]"));
foreach (var box in boxes)
{
    var nameElement = box.FindElement(By.XPath("//div[contains(@class,'Name')]"));
Console.WriteLine(nameElement.Text);
}
HTML
<div>
<div>
<input id="date"></div>
</div>
<div class="__Root">
<div>
<div class="__Box">
<div class="__Name">Name1</div>
</div>
<div class="__Box">
<div class="__Name">Name2</div>
</div>
<div class="__Box">
<div class="__Name">Name3</div>
</div>
</div>
</div>
</div>
You are evaluating an XPath expression starting with // relative to a particular context node, but the meaning of // is to search the document from the document's root, ignoring the context node altogether (except of course that the context does provide the document which is being searched). So you execute the same query three times. Each time, your query expression matches all 3 div elements in the document, but because the findElement method is defined to return a single element, it is returning the first one each time.
To search within a subtree rooted at the context node, your expression should start with .//.
Secondly, you could just search directly for the "Name" div elements with a single XPath expression (broken onto multiple lines for readability), and simplify your c# code drastically:
//div[contains(@class,'Root')]
//div[contains(@class,'Box')]
//div[contains(@class,'Name')]
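The difference between `//` and `.//` can be reproduced without a browser. The sketch below uses `System.Xml` instead of Selenium, on the assumption that the XPath semantics are the same in both; `RelativeXPathExample` and `NamesPerBox` are hypothetical names, and the markup is a trimmed copy of the question's HTML.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class RelativeXPathExample
{
    // Runs nameXPath once per "Box" div, with that div as the context node,
    // and collects the text of the first match from each run.
    public static List<string> NamesPerBox(string xml, string nameXPath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var names = new List<string>();
        foreach (XmlNode box in doc.SelectNodes("//div[contains(@class,'Box')]"))
        {
            XmlNode name = box.SelectSingleNode(nameXPath);
            if (name != null) names.Add(name.InnerText);
        }
        return names;
    }

    public static void Main()
    {
        string xml =
            "<div class=\"__Root\">" +
            "<div class=\"__Box\"><div class=\"__Name\">Name1</div></div>" +
            "<div class=\"__Box\"><div class=\"__Name\">Name2</div></div>" +
            "<div class=\"__Box\"><div class=\"__Name\">Name3</div></div>" +
            "</div>";

        // "//" starts again at the document root for every box.
        Console.WriteLine(string.Join(", ", NamesPerBox(xml, "//div[contains(@class,'Name')]")));
        // prints: Name1, Name1, Name1

        // ".//" searches only below the current box.
        Console.WriteLine(string.Join(", ", NamesPerBox(xml, ".//div[contains(@class,'Name')]")));
        // prints: Name1, Name2, Name3
    }
}
```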

Selenium : xpath following-sibling where siblings have more children

I hope I describe my problem/question in a comprehensible way.
I have HTML that looks like this:
<div class="class-div">
<label class="class-label">
<span class="class-span">AAAA</span>
</label>
<div class="class-div-a">
<textarea class="class-textarea">
</textarea>
</div>
</div>
<div class="class-div">
<label class="class-label">
<span class="class-span">BBBB</span>
</label>
<div class="class-div-a">
<textarea class="class-textarea">
</textarea>
</div>
</div>
I want the Xpath for the TextArea where the value of the Label is AAAA to populate it with a value in Selenium.
So something like this...
wait.Until(ExpectedConditions.ElementIsVisible(
By.XPath("//div[@class='class-div']/label[@class='class-label'][span[@class='class-span' and text()='AAAA']]/following-sibling::div[@class='class-div-a']/textarea[@class='class-textarea']"))).SendKeys(valueTextArea);
The problem could be in the wait condition, ExpectedConditions.ElementIsVisible.
Your <textarea> may not be 'visible' in the Selenium sense: visibility means that the element is present in the DOM (which is true) and that its size is greater than 0px, which could be false for your <textarea> element. In Java you would use ExpectedConditions.presenceOfElementLocated() instead of ExpectedConditions.visibilityOfElementLocated(); I'm not sure how it goes in C#, but you get the picture.
Try it and see if it solves your problem.
Let me quickly rephrase the question to make sure I understand: you need an XPath to find the textarea associated with the label whose text is AAAA.
You'll have to go back up the tree in this case. Here are a couple of ways I might do that, although your XPath looks correct:
Using ancestor to be clear about which element you're moving up to (better IMO)
By.XPath("//label/span[text()='AAAA']/ancestor::div[@class='class-div']//textarea");
Or just moving back up the tree with ..
By.XPath("//label/span[text()='AAAA']/../../..//textarea");
If your XPath does match an element, use asikojevic's answer. The C# method is ExpectedConditions.ElementExists(By)
****UPDATE****
Based on your comment about a trailing space after the text value, here is another XPath that should find the textarea in that case, using contains instead of text()=.
By.XPath("//label/span[contains(text(),'AAAA')]/ancestor::div[@class='class-div']//textarea");
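One caveat with contains is that it would also match a label such as "AAAAB". As an alternative that the answers above do not use, the XPath 1.0 function normalize-space() strips leading and trailing whitespace before an exact comparison. The sketch below checks the expression with `System.Xml` outside Selenium; `LabelledTextAreaExample`, `FindTextArea`, the `<form>` wrapper, and the `name` attributes are hypothetical additions for illustration.

```csharp
using System;
using System.Xml;

public static class LabelledTextAreaExample
{
    // Returns the <textarea> that shares a div.class-div with a label span whose
    // whitespace-normalised text equals labelText, or null if there is none.
    public static XmlNode FindTextArea(XmlDocument doc, string labelText)
    {
        // Assumption: labelText contains no single quote.
        string xpath = "//label/span[normalize-space(.)='" + labelText + "']"
                     + "/ancestor::div[@class='class-div']//textarea";
        return doc.SelectSingleNode(xpath);
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml(
            "<form>" +
            "<div class=\"class-div\">" +
            "<label class=\"class-label\"><span class=\"class-span\">AAAA </span></label>" +
            "<div class=\"class-div-a\"><textarea class=\"class-textarea\" name=\"first\"></textarea></div>" +
            "</div>" +
            "<div class=\"class-div\">" +
            "<label class=\"class-label\"><span class=\"class-span\">BBBB</span></label>" +
            "<div class=\"class-div-a\"><textarea class=\"class-textarea\" name=\"second\"></textarea></div>" +
            "</div>" +
            "</form>");

        // The span text is "AAAA " with a trailing space, yet the exact match succeeds.
        XmlNode area = FindTextArea(doc, "AAAA");
        Console.WriteLine(area.Attributes["name"].Value);
        // prints: first
    }
}
```

The same expression string should be usable with By.XPath, since it relies only on XPath 1.0 features.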

c# selenium finding element using xpath

I am trying to find an element which is a div inside a div...
here is example of the code:
<div class="col-md-4">
<div style="display: none;" id="multiplier-win" class="label label-success multiplier">2X</div>
<div style="display: block;" id="multiplier-lose" class="label label-danger multiplier">0X</div>
<div style="display: none;" id="multiplier-tie" class="label label-warning multiplier">1X</div>
</div>
I want to find the element with class="label label-success multiplier" and check whether its style is "display: none".
How do I write this in c#?
Please help me
thank you!
In your case, the elements have a unique ID. So instead of finding them by class name (which could lead to multiple/inaccurate results), you should use By.Id(...). It is also easier to write by hand than XPath.
Let's say your IWebDriver instance is called driver. The code looks like this:
IWebElement element = driver.FindElement(By.Id("multiplier-win"));
String style = element.GetAttribute("style");
...
I don't want to offend you, but you should probably use google before you post here. This is very basic code you will find in multiple tutorials about selenium.
Edit: In case you are looking for multiple elements of a class:
ReadOnlyCollection<IWebElement> elements = driver.FindElements(By.ClassName("..."));
foreach (IWebElement el in elements)
{
...
}
To Find the element:
IWebElement element = driver.FindElement(By.XPath("//div[@class='label label-success multiplier']"));
To check if an element is displayed, this returns a bool (true if displayed, false if not displayed). If you go with philn's element list code, you can throw this line into his foreach statement and it will tell you which ones are displayed.
el.Displayed;
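If the raw style string returned by GetAttribute("style") is used instead of Displayed, it still has to be checked for display: none. The following is a minimal sketch of such a check in plain C#; `InlineStyleExample` and `IsDisplayNone` are hypothetical names, and the parser assumes a simple inline style without semicolons or colons inside values.

```csharp
using System;

public static class InlineStyleExample
{
    // Returns true when the inline style string sets "display" to "none".
    // If "display" is declared more than once, the last declaration wins.
    public static bool IsDisplayNone(string style)
    {
        if (string.IsNullOrWhiteSpace(style)) return false;

        bool hidden = false;
        foreach (string declaration in style.Split(';'))
        {
            // Split "property: value" at the first colon only.
            string[] parts = declaration.Split(new[] { ':' }, 2);
            if (parts.Length != 2) continue;

            string property = parts[0].Trim();
            string value = parts[1].Trim();
            if (property.Equals("display", StringComparison.OrdinalIgnoreCase))
            {
                hidden = value.Equals("none", StringComparison.OrdinalIgnoreCase);
            }
        }
        return hidden;
    }

    public static void Main()
    {
        Console.WriteLine(IsDisplayNone("display: none;"));  // prints: True
        Console.WriteLine(IsDisplayNone("display: block;")); // prints: False
    }
}
```

Displayed is usually the simpler choice, because it also accounts for elements hidden by a stylesheet or by a hidden parent, which an inline style check cannot see.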

Scanning html DIVs from C#

I need to scan through a set of DIV collection and get the DIV IDs accordingly. Here's a piece of DIV collection in HTML.
<div id="rack12" rel="12" class="">
<span class="empty"></span>
<input type="hidden" id="GUID" value="">
</div>
<div id="rack13" rel="13" class="">
<span class="full"></span>
<div id="d92eec4f-2674-e311-9422-00155d04941f" rel="430.00 12.00 5 d92eec4f-2674-e311-9422-00155d04941f" class="selectedEquipment" style="height:105px;">
<p>IBM SPARC 5000u | 430.00W | 12.00Kg | 5RU </p>
</div>
<input type="hidden" id="GUID" value="d92eec4f-2674-e311-9422-00155d04941f">
</div>
So, for example, I need to find out which DIV ID (rack12 or rack13) contains this GUID d92eec4f-2674-e311-9422-00155d04941f. After that I need to do some logic and update the properties of that GUID in C# codebehind. By the way, these DIVs are generated from C# dynamically.
I have some difficulties using JavaScript from C#. Can anyone advise me if there's an easy way to implement this?
Add runat="server" attribute to the DIVs ("rack12" and "rack13"). Then you could manipulate them in code behind using ID or ClientID as a server control.
I think you can use jQuery. I haven't understood exactly what you need, but here is an example:
$('div').each(function(e){
var obj = $(this);
obj.find("input").each(function(e){
var inputobj = $(this);
alert(inputobj.val()); //Here You will have input obj , U can use your code part here
});
});
Here is a Fiddle
http://jsfiddle.net/AmarnathRShenoy/ykPg4/1/
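Since the DIVs are generated from C#, the lookup could also be done on the server. The sketch below assumes the generated markup is available as a well-formed XML string, so the `<input>` tags are self-closed and a hypothetical `<racks>` root element wraps the fragment; `RackLookupExample` and `FindRackId` are likewise hypothetical names.

```csharp
using System;
using System.Xml;

public static class RackLookupExample
{
    // Returns the id of the rack div that directly contains a hidden input
    // with id "GUID" and the given value, or null when no rack contains it.
    public static string FindRackId(string xml, string guid)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // Assumption: guid holds only hex digits and dashes, so it is safe
        // to paste into the XPath string literal.
        string xpath = "//div[starts-with(@id,'rack')]"
                     + "[input[@id='GUID' and @value='" + guid + "']]";
        XmlNode rack = doc.SelectSingleNode(xpath);
        return rack == null ? null : rack.Attributes["id"].Value;
    }

    public static void Main()
    {
        string xml =
            "<racks>" +
            "<div id=\"rack12\" rel=\"12\"><span class=\"empty\"></span>" +
            "<input type=\"hidden\" id=\"GUID\" value=\"\" /></div>" +
            "<div id=\"rack13\" rel=\"13\"><span class=\"full\"></span>" +
            "<input type=\"hidden\" id=\"GUID\" value=\"d92eec4f-2674-e311-9422-00155d04941f\" /></div>" +
            "</racks>";

        Console.WriteLine(FindRackId(xml, "d92eec4f-2674-e311-9422-00155d04941f"));
        // prints: rack13
    }
}
```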

Turning HTML Div code into a Control

I have bunch of HTML code I am using to make rounded edge boxes on my controls. Is there a way to take this code and turn it into some kind of control or something so I do not have to keep pasting 10 lines of HTML code around everything I do?
<div id="BottomBody">
<div class="box1024" >
<div class="content1024">
<div class="top1024"></div>
<h1>My Header Information</h1>
<hr />
<p>Some text for everyone!</p>
</div>
<div class="bottom1024">
<div></div>
</div>
</div>
</div>
One additional thing to note, the number HTML tags used inside the inner most DIV will change depending on where I use it in my site. So in some cases I will only have 1 tag and 1 tag but in other cases I could have 1 tag, 1 tag, 3 tags, and a HTML table. How can I make that work?
Yes, you're thinking of a UserControl. Extract the relevant HTML out, paste it into a UserControl .ascx template.
Now in your case, you'll probably want the text to be customizable, am I right? So you'll need to replace the <h1> through </p> bit with an ASP.NET label. The resulting .ascx HTML (not counting the @Control directive) will look something like:
<div id="BottomBody">
<div class="box1024" >
<div class="content1024">
<div class="top1024"></div>
<asp:Label runat="Server" id="label1" />
</div>
<div class="bottom1024">
<div></div>
</div>
</div>
</div>
Alternatively, you could do two labels -- one for the header, one for the main text. Or even just have the header be runat="Server" itself.
Next, you'll write a little bit of code in the .ascx code-behind file to expose the label's (or labels', as the case may be) Text property. This would probably look something like:
public string Text
{
get { return label1.Text; }
set { label1.Text = value; }
}
If you're in an ASP.NET MVC world, use your Model data instead of a label, and pass in the desired display text string as the model data.
Edit
Addressing the update to the question:
One additional thing to note, the number HTML tags used inside the inner most DIV will change depending on where I use it in my site. So in some cases I will only have 1 tag and 1 tag but in other cases I could have 1 tag, 1 tag, 3 tags, and a HTML table. How can I make that work?
The exact same technique, assuming that the content you're referring to is what's within <div class="content1024">. The Text property of a label can contain any desired arbitrary HTML, and of course you can pass any arbitrary amount of HTML as a string to the Model if you're using MVC.
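The idea that the inner content is just an arbitrary HTML string can be illustrated without ASP.NET. This is only a sketch, not the UserControl itself: `RoundedBoxExample` and `WrapInBox` are hypothetical names, and the wrapper markup is copied from the question starting at the box1024 div.

```csharp
using System;
using System.Text;

public static class RoundedBoxExample
{
    // Wraps any HTML fragment in the rounded-box markup from the question.
    // The fragment is inserted as-is, so it may hold one tag or many.
    public static string WrapInBox(string innerHtml)
    {
        var sb = new StringBuilder();
        sb.AppendLine("<div class=\"box1024\">");
        sb.AppendLine("<div class=\"content1024\">");
        sb.AppendLine("<div class=\"top1024\"></div>");
        sb.AppendLine(innerHtml);
        sb.AppendLine("</div>");
        sb.AppendLine("<div class=\"bottom1024\"><div></div></div>");
        sb.Append("</div>");
        return sb.ToString();
    }

    public static void Main()
    {
        // The same wrapper works for a short or a long fragment.
        Console.WriteLine(WrapInBox("<h1>My Header Information</h1><hr /><p>Some text for everyone!</p>"));
    }
}
```

Because the fragment is not encoded, it should only ever come from trusted content, never directly from user input.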
Another left field approach - create a user control but use jQuery to round the corners for you:
<div class="roundcorner">
<h1>My Header Information</h1>
<hr />
<asp:Label runat="Server" id="label1" />
</div>
Issue the following in javascript:
$(document).ready(function(){
$('div.roundcorner').corner();
});
This eliminates all your extra div's and you can still have a user control that you use at will on the server.
(I know I'm going to get in trouble for this from someone)
If it's just static content, you can put it in a separate file and INCLUDE it:
<!--#INCLUDE VIRTUAL="/_includes/i_yourfile.htm" -->
(File name, extension and location are arbitrary)
