Force HtmlAgilityPack to use chrome - c#

How can I force HtmlAgilityPack to use Chrome's interpretation of an XPath expression?
For example, these two expressions point to exactly the same element on the web page, yet the XPath is completely different.
For Chrome:
/html/body[@class=' hasGoogleVoiceExt']/div[@class='fjfe-bodywrapper']/div[@id='fjfe-real-body']/div[@id='fjfe-click-wrapper']/div[@id='appbar']/div[@class='elastic']/div[@class='appbar-center']/div[@class='appbar-snippet-primary']/span
For Firefox:
//*[@id='appbar']/div/div[2]/div[1]/span
I would like to use the Chrome version, but I get null for both queries.

The Html Agility Pack has no dependency on any browser whatsoever. It uses the .NET XPath implementation, and you can't change that unless you rewrite it completely.
The HTML you see in a browser can be very different from the HTML you download from a URL, because the former may have been modified by dynamic code (JavaScript, DHTML).
If you can share the HTML or the URL, we could help you more.
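
To make the point about downloaded versus rendered HTML concrete, here is a minimal sketch, assuming the HtmlAgilityPack NuGet package is referenced. The inline markup is a made-up stand-in for the page, not its real source. The query only sees the HTML that was loaded, and SelectSingleNode returns null when nothing in it matches.

```csharp
using System;
using HtmlAgilityPack;

class AppbarQuery
{
    static void Main()
    {
        // Hypothetical markup standing in for the downloaded page source.
        const string html =
            "<html><body><div id='appbar'><div class='elastic'>" +
            "<div class='appbar-center'><div class='appbar-snippet-primary'>" +
            "<span>Example text</span></div></div></div></div></body></html>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Anchor on the id instead of copying every step from a browser's DOM.
        HtmlNode span = doc.DocumentNode.SelectSingleNode("//div[@id='appbar']//span");

        // A null result means the loaded HTML contains no match.
        Console.WriteLine(span != null ? span.InnerText : "not found in the loaded HTML");
    }
}
```

If the same query returns null against the real page, saving the downloaded HTML to a file and inspecting it is usually the quickest way to see whether the element is in the source at all.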

Here is what I found using an XPath copied from Chrome:
- I had to remove all of the tbody elements and double up the forward slashes; after that, the following code returned the proper element.
doc.DocumentNode.SelectSingleNode(
"//html//body//center//table[3]//tr//td//table//tr//td//table//tr//td//table[3]//tr[3]//td[3]//table//tr//td//table");

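The tbody problem can be reproduced in isolation. Browsers add a tbody element to the DOM for tables whose source omits it, so an XPath copied from Chrome can contain steps that do not exist in the raw HTML. A small sketch, assuming the HtmlAgilityPack package and a made-up table; the first lookup is expected to come back null and the second to find the cell:

```csharp
using System;
using HtmlAgilityPack;

class TbodyDemo
{
    static void Main()
    {
        // Hypothetical source HTML: there is no <tbody> in the markup.
        var doc = new HtmlDocument();
        doc.LoadHtml("<table><tr><td>cell</td></tr></table>");

        // Path as a browser would report it, including the tbody it inserted.
        HtmlNode fromBrowser = doc.DocumentNode.SelectSingleNode("//table/tbody/tr/td");

        // Same path with the tbody step removed.
        HtmlNode withoutTbody = doc.DocumentNode.SelectSingleNode("//table/tr/td");

        Console.WriteLine("with tbody matched: " + (fromBrowser != null));
        Console.WriteLine("without tbody matched: " + (withoutTbody != null));
    }
}
```
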
Related

XPath works in Chrome but not in Selenium WebDriver

I am working on a website where no other locator works except calling FindElements and taking the third a element, so I was curious to try XPath for the first time.
I can get the XPath in Chrome, but when I use it in WebDriver, it says the element was not found.
I searched a lot and still couldn't find out what was wrong. So I tried the Facebook page, using the login field as a test. The XPath is //*[@id="email"]; it works perfectly in Chrome, but I get the same "not found" result in WebDriver.
C# code: driver.FindElement(By.XPath("//*[@id='email']"));
Any advice?
I can give a complete solution in Python that takes into account the peculiarities of React (used by Facebook).
But you are using C#, so you can use the counterpart of driver.execute_script (which executes JavaScript through Selenium); in the C# bindings that is IJavaScriptExecutor.ExecuteScript.
driver.get("https://www.facebook.com/")
driver.execute_script("""
document.getElementById("email").value = "lg@hosct.com";
document.getElementById("u_0_2").click();
""")
I tried again with cleaner code:
driver.Url = "";
driver.FindElement(By.XPath("//*[@id='email']"));
It works now. The only difference from my earlier code is that I was visiting some other pages before the Facebook page, which seems to make a difference. Anyway, the code above works. If I run into the issue again, I will post more detailed code.
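
Since the lookup seemed to depend on what the driver had loaded beforehand, one common precaution is to navigate explicitly and wait for the element before using it. This is only a sketch and not a confirmed fix for the problem above. It assumes the Selenium.WebDriver and Selenium.Support NuGet packages and a local Chrome installation, and the 10 second timeout is an arbitrary choice.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Support.UI;

class EmailFieldLookup
{
    static void Main()
    {
        using (IWebDriver driver = new ChromeDriver())
        {
            // Load the target page explicitly so the lookup runs against it.
            driver.Navigate().GoToUrl("https://www.facebook.com/");

            // Poll until the element exists instead of failing on the first attempt.
            var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(10));
            IWebElement email = wait.Until(d => d.FindElement(By.XPath("//*[@id='email']")));

            Console.WriteLine("Found element with tag: " + email.TagName);
        }
    }
}
```

If the wait times out, printing driver.Url at that point shows which page the driver was actually on when the lookup failed.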

Generating CSS Selector in Firefox

I used to use HtmlAgilityPack with XPath to scrape information from websites, but I have read that CSS selectors are much faster, so I searched for a good CSS engine and found CsQuery. However, I am still confused because I don't know how to get the CSS path of an element.
For XPath I used a Firefox plugin called XPath Checker, which returned clean XPaths like this:
id('yt-masthead-signin')/button
But I can't find an equivalent tool for CSS. I would really appreciate any help, because I couldn't find an answer to this specific question on Google.
Install Firebug + FirePath.
Click the select button to pick something on the page, and it can generate either an XPath or a CSS selector. The generated selectors usually need some manual trimming to be efficient.
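
As a rough illustration of what the CSS equivalent looks like, the XPath id('yt-masthead-signin')/button corresponds to the selector #yt-masthead-signin > button. The sketch below assumes the CsQuery NuGet package and uses made-up markup rather than the real page.

```csharp
using System;
using CsQuery;

class CssSelectorDemo
{
    static void Main()
    {
        // Hypothetical markup standing in for the real page.
        CQ dom = CQ.Create(
            "<div id='yt-masthead-signin'><button>Sign in</button></div>");

        // "#id > button" selects button elements that are direct children of the
        // element with that id, like id('...')/button in XPath.
        CQ buttons = dom["#yt-masthead-signin > button"];

        Console.WriteLine("matches: " + buttons.Length);
        Console.WriteLine("text: " + buttons.Text());
    }
}
```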

Locating HTML Tags

I am trying to automate the testing of web forms. To that end I need to know how to use C# to dynamically locate input tags within the HTML page then assign values to them. I don't want to use XPath, because each time I will be using a different web form. I want to pass the web form's URL to Selenium and then automatically populate the fields. I've heard of HTMLAgilityPack. Would that help me? If so, how can I use it?
I appreciate your help.
I may have missed a crucial part of your question, however, have you looked at Selenium WebDriver?
If you write a test that handles a generic web form you can back your test by data that is dynamic. Therefore you can cater for changes in the page by using Data Driven Tests. I've written tests for many pages and there are always common actions, but I cater for each page differently though as there are different things on that page!
[EDIT]
Following on from your comments, I think looking into Selenium would be a good idea. The way to handle different pages is to have these element definitions ready in a 'definitions' class for each page. That way once you know what the page is, you just use the correct class for your definitions. It is best to know what elements you are going to be interacting with in your tests before the tests run. The point of automated UI testing is for a known set of actions to be performed and a correct result achieved.
I would suggest you look up some tutorials such as this one. You can also see my blog, though I wrote that post when I was first learning WatiN and later replaced it with Selenium (I like it better :P).
Html Agility Pack
This is an agile HTML parser that builds a read/write DOM and supports
plain XPATH or XSLT (you actually don't HAVE to understand XPATH nor
XSLT to use it, don't worry...). It is a .NET code library that allows
you to parse "out of the web" HTML files. The parser is very tolerant
with "real world" malformed HTML. The object model is very similar to
what proposes System.Xml, but for HTML documents (or streams).
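HtmlDocument doc = new HtmlDocument();
doc.Load(path);
// SelectNodes returns null (not an empty collection) when nothing matches.
var inputs = doc.DocumentNode.SelectNodes("//input") ?? new HtmlNodeCollection(null);
foreach (HtmlNode input in inputs)
{
    // Your code...
}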
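
The Html Agility Pack only reads the markup; to actually type values into a live form you still need a browser driver. A minimal sketch of the Selenium side, assuming the Selenium.WebDriver package and Chrome; the form URL is taken from the command line, and the filter on input types and the placeholder value are illustrative choices only:

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

class FillInputs
{
    static void Main(string[] args)
    {
        // args[0] is the URL of the web form to populate (supplied by the caller).
        using (IWebDriver driver = new ChromeDriver())
        {
            driver.Navigate().GoToUrl(args[0]);

            // Locate every input element without a page-specific XPath.
            foreach (IWebElement input in driver.FindElements(By.TagName("input")))
            {
                string type = input.GetAttribute("type") ?? "text";

                // Only type into visible, enabled text-like fields.
                if ((type == "text" || type == "email") && input.Displayed && input.Enabled)
                {
                    input.Clear();
                    input.SendKeys("test value"); // placeholder data
                }
            }
        }
    }
}
```

In a data-driven test, the placeholder would be replaced by a lookup keyed on the field's name or id attribute.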

WebBrowser modified HTML code - C#

I have a WebBrowser component and I would like to save the modified HTML code to a file.
To be clear: the browser navigates to a page and receives HTML + JS, then the JS modifies the HTML, and I need to save that modified HTML.
I have tried DocumentText, but the result is the original HTML, not the HTML as modified by the JS.
Does anyone know how to solve this problem?
A lot of developer plug-ins (Firebug for Firefox, or the developer tools for IE or Chrome) will let you see the updated HTML.
In code, you can use the OuterHtml of the element you are interested in (e.g. BODY).
Look at the methods of HtmlDocument, such as http://msdn.microsoft.com/en-us/library/system.windows.forms.htmldocument.getelementsbytagname.aspx, and at HtmlElement: http://msdn.microsoft.com/en-us/library/system.windows.forms.htmlelement.outerhtml.aspx
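
Putting those two members together, a minimal Windows Forms sketch might look like the following. The class name and the command-line arguments are assumptions made for illustration. DocumentCompleted can fire more than once on pages with frames (the file is simply overwritten each time), and scripts that keep changing the page after that event are not captured.

```csharp
using System;
using System.IO;
using System.Windows.Forms;

class SnapshotForm : Form
{
    private readonly WebBrowser browser = new WebBrowser { Dock = DockStyle.Fill };

    public SnapshotForm(string url, string outputPath)
    {
        Controls.Add(browser);

        browser.DocumentCompleted += (sender, e) =>
        {
            // Read the live DOM rather than DocumentText, which holds the original source.
            HtmlElementCollection roots = browser.Document.GetElementsByTagName("html");
            if (roots.Count > 0)
            {
                File.WriteAllText(outputPath, roots[0].OuterHtml);
            }
        };

        browser.Navigate(url);
    }

    [STAThread]
    static void Main(string[] args)
    {
        // args[0]: page URL, args[1]: output file path (both supplied by the caller).
        Application.Run(new SnapshotForm(args[0], args[1]));
    }
}
```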

HTML to DOM Library

I am looking for a C# library that would translate the HTML code (and the css specified in the code) into a DOM tree for simpler parsing. I am looking for something similar to this one (which is in PHP):
http://simplehtmldom.sourceforge.net/
Of course I know I could embed a browser control, but I am looking for something more efficient.
Check out the HTML Agility Pack. It hasn't been updated in a while, but it still works very well.
I second Mr. Dorman on the HtmlAgilityPack. I did a brief blog post on web scraping some time ago; it mentions the 'pack, but mostly discusses other details. Depending on your application, it might be of some use.
We have used HTMLAgility here in our project to extract specific html tags with a given set of attributes using XPath and it has never failed us.
There is no way to get a DOM with computed styles that way. The only option is the Selenium framework, which works with a real browser.
