Combining Regex with Selenium in C#

I have an automation suite that currently tests against WordPress (a test site to practice against). I am attempting to verify that when a user edits an existing Page they are taken to the correct screen. Previously the following code snippet worked fine, but the ID mentioned below is no longer present (it was an image).
public static bool IsInEditMode()
{
    return Driver.Instance.FindElement(By.Id("icon-edit-pages")) != null;
}
Assert.IsTrue(NewPostPage.IsInEditMode(), "You are not in edit mode");
The HTML I am targeting is...
<h2>
Edit Page
<a href="...">Add New</a>
</h2>
I would like to extract only the h2's own text, 'Edit Page'. Currently I also get the text of the anchor, 'Add New', which I need to ignore.
Using a CssSelector with "h2:first-child" returns both values.
I think I need to use a regular expression; any suggestions would be great.
I attempted something similar in JSFiddle but need the C# equivalent:
var myString = document.getElementsByTagName('h2')[0].innerHTML;
var newString = myString.replace(/<([^>]+?)([^>]*?)>(.*?)<\/\1>/ig, "");
console.log(newString);

You can also get the parent element's text and remove the child element's text from it:
var parent = Driver.FindElement(By.TagName("h2"));
var child = parent.FindElement(By.TagName("a"));
var text = parent.Text.Replace(child.Text, "").Trim();
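If you do want the regex route from the question, here is a rough C# sketch of the JSFiddle snippet. It assumes you have already read the h2's inner HTML into a string (for example via GetAttribute("innerHTML") on the element); the helper name StripChildElements and the sample markup are made up for illustration. Regexes over HTML are fragile, so treat this as a sketch rather than a general solution.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: removes child elements (opening tag through the matching
// closing tag) from an HTML fragment and returns the remaining text, trimmed.
static string StripChildElements(string innerHtml)
{
    // Same pattern as the JSFiddle snippet; \1 makes the closing tag repeat group 1.
    string withoutChildren = Regex.Replace(
        innerHtml,
        @"<([^>]+?)([^>]*?)>(.*?)</\1>",
        string.Empty,
        RegexOptions.IgnoreCase);
    return withoutChildren.Trim();
}

// Sample markup is an assumption standing in for the h2's innerHTML.
string sample = "Edit Page <a href=\"#\">Add New</a>";
Console.WriteLine(StripChildElements(sample)); // prints "Edit Page"
```

The pattern does not span line breaks, so an anchor split across several lines would need RegexOptions.Singleline as well.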

You can use StringAssert to verify that the string under test contains the expected string. I think this is better because you do not need a regex.
Example:
StringAssert.Contains(message, expectedmessage);

Related

Is it possible to display the text associated with a specific element?

I have been working with selenium and C#, and I am wondering if I can return the text associated with a specific element.
For example, here is the line of HTML that contains what I want to display:
<span class="field">1149156-1</span>
Would it be possible to save the "1149156-1" in a string to use for later?
I have tried the following code, but it returns a strange value, definitely not the value I want it to return:
string testvariable = driver.FindElement(By.XPath("/html/body/div/div[4]/div[2]/form/div[2]/div/table/tbody/tr[2]/td[1]")).ToString();
Hope I provided enough information!

You can just use webElement.Text.
string testvariable = driver.FindElement(By.XPath("/html/body/div/div[4]/div[2]/form/div[2]/div/table/tbody/tr[2]/td[1]")).Text;
On another note -- I recommend using relative XPath syntax instead of absolute, and querying on WebElement attributes such as id, name, and class to get more accurate locators.
For example, the XPath in the example you provided can be re-written more robustly as such:
//table[@id='someId']/tbody/tr[2]/td[1]
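To try the relative XPath without a browser, something like the following works against a small in-memory document using System.Xml. The markup is made up for illustration ("someId" is just the placeholder id from above), and XmlNode.InnerText stands in for Selenium's Text property here.

```csharp
using System;
using System.Xml;

// Made-up markup; "someId" is the placeholder id from the XPath above.
var doc = new XmlDocument();
doc.LoadXml(
    "<html><body><div><table id='someId'><tbody>" +
    "<tr><td>header</td></tr>" +
    "<tr><td>1149156-1</td><td>other</td></tr>" +
    "</tbody></table></div></body></html>");

// Relative XPath: anchored on the table's id instead of the full path from /html.
var cell = doc.SelectSingleNode("//table[@id='someId']/tbody/tr[2]/td[1]");
string cellText = cell == null ? "" : cell.InnerText;
Console.WriteLine(cellText); // prints "1149156-1"
```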

How do I get a Title of the link in AngleSharp item object?

Here is a link:
<a title = "mylink" href="mysite">content</a>
In AngleSharp object I can easily get content with this code:
string innerContent = item.TextContent;
But I need to get a title of the link and also a href. How do I do that?

Note that AngleSharp uses the standard DOM as defined by the W3C - thus you can just search for, e.g., "how to get href from anchor element in DOM" to retrieve an answer. For completeness, the example search query leads to (first hit on Google) Get local href value from anchor (a) tag, which answers your question.
Just translated to C# that means
var anchor = item as IHtmlAnchorElement; // Assumption: You have obtained it "only" as an IHtmlElement
string title = anchor.Title;
string href = anchor.Href;
Remark: There is a difference between .GetAttribute("href") and .Href. The former is always available (even on non-IHtmlAnchorElement) and gives you the real value. The latter is a special computed version available on some elements (e.g., IHtmlAnchorElement) and will get you a normalized version, already considering the base URL of the current document.
TL;DR: .Href will give you an absolute URL while .GetAttribute("href") may give you a relative URL.
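To make the difference concrete, here is a small sketch that mimics the normalization with System.Uri. This is not AngleSharp code; the base URL is a made-up placeholder and "mysite" is the href value from the question.

```csharp
using System;

// Both URLs are placeholders; "mysite" is the href value from the question.
var baseUrl = new Uri("http://example.com/docs/page.html");
string rawHref = "mysite";

// Resolving against the document URL is roughly what a computed Href reports.
string resolved = new Uri(baseUrl, rawHref).AbsoluteUri;

Console.WriteLine(rawHref);  // the attribute value as written: "mysite"
Console.WriteLine(resolved); // prints "http://example.com/docs/mysite"
```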
HTH!

Using Selenium to select an element with javascript in the href

I have a .NET WebForm and I need to click a link using Selenium and can't use the text content (because of translation issues). How can I identify this element?
<a href="javascript:__doPostBack('ctl01','')">registration form</a>
I have tried the following, which does not work:
var element = Driver.FindElementsByXPath($"//*[@href='ctl01']");

The problem is that you are trying to look for an id within the xpath, while the element does not contain an id.
In this case, this should work:
var element = Driver.FindElementsByXPath($"//a[contains(text(), 'registration form')]");
This will only work if all the elements which you are trying to find are links with the text registration form in it.
If you want to find elements on the href, use:
var element = Driver.FindElementsByXPath(
    "//a[contains(@href, \"javascript:__doPostBack('ctl01','')\")]");

Ultimately decided to identify within the href attribute by partial string:
.FindElementsByXPath($"//*[contains(@href, '{id}')]")
This is because putting the whole value of the javascript text into the Selenium call caused it to fail parsing.
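The parse failure most likely comes from the single quotes inside the href value ending the XPath string literal early. A possible workaround, sketched below, is to quote the value with whichever quote character it does not contain, falling back to concat() when it contains both. ToXPathLiteral is a hypothetical helper name, and the check runs against a small in-memory XML document rather than a live page.

```csharp
using System;
using System.Linq;
using System.Xml;

// Hypothetical helper: wraps a value as an XPath 1.0 string literal.
static string ToXPathLiteral(string value)
{
    if (value.IndexOf('\'') < 0) return "'" + value + "'";
    if (value.IndexOf('"') < 0) return "\"" + value + "\"";
    // Both quote kinds present: split on single quotes and rejoin with concat().
    var parts = value.Split('\'').Select(p => "'" + p + "'");
    return "concat(" + string.Join(", \"'\", ", parts) + ")";
}

string href = "javascript:__doPostBack('ctl01','')";
string xpath = "//a[@href=" + ToXPathLiteral(href) + "]";
Console.WriteLine(xpath);

// Check the expression against made-up markup using the built-in XPath engine.
var doc = new XmlDocument();
doc.LoadXml("<root><a href=\"javascript:__doPostBack('ctl01','')\">registration form</a></root>");
int found = doc.SelectNodes(xpath)?.Count ?? 0;
Console.WriteLine(found); // prints 1
```

The resulting string can be passed to FindElementsByXPath in the same way as the partial match above.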

Try searching on the a element's href instead of the id, like this:
//a[@href="javascript:__doPostBack('ctl01','')"]
with FindElementsByXPath.
Then, on an element from the result, try using SendKeys like so:
element.SendKeys(Keys.Enter);

Finding an element by partial id with Selenium in C#

I am trying to locate an element with a dynamically generated id. The last part of the string is constant ("ReportViewer_fixedTable"), so I can use that to locate the element. I have tried to use regex in XPath:
targetElement = driver.FindElement(
    By.XPath("//table[regx:match(@id, 'ReportViewer_fixedTable')]"));
And locating by CssSelector:
targetElement = driver.FindElement(
By.CssSelector("table[id$='ReportViewer_fixedTable']"));
Neither works. Any suggestions would be appreciated.

That is because the CSS selector needs a small change. You were almost there:
driver.FindElement(By.CssSelector("table[id*='ReportViewer_fixedTable']"))
From https://saucelabs.com/blog/selenium-tips-css-selectors-in-selenium-demystified:
css=a[id^='id_prefix_']
A link with an id that starts with the text id_prefix_.
css=a[id$='_id_sufix']
A link with an id that ends with the text _id_sufix.
css=a[id*='id_pattern']
A link with an id that contains the text id_pattern.
You were using the suffix operator ($=), which I assume did not match the end of the actual id (I can't tell without seeing your HTML, so please include it next time). *= matches the text anywhere in the id, so it is the more forgiving choice.
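As a rough illustration of how the three operators differ, the sketch below mirrors them with plain string checks. The Matches helper and both ids are made up; real generated ids may look different.

```csharp
using System;

// Mirrors the three CSS attribute operators with ordinal string checks.
static bool Matches(string id, string op, string value)
{
    switch (op)
    {
        case "^=": return id.StartsWith(value, StringComparison.Ordinal);
        case "$=": return id.EndsWith(value, StringComparison.Ordinal);
        case "*=": return id.IndexOf(value, StringComparison.Ordinal) >= 0;
        default: throw new ArgumentException("Unsupported operator: " + op);
    }
}

// Both ids are made up for illustration.
string[] ids = { "ctl32_ReportViewer_fixedTable", "ReportViewer_fixedTable_ctl32" };
foreach (string id in ids)
{
    bool suffixMatch = Matches(id, "$=", "ReportViewer_fixedTable");
    bool containsMatch = Matches(id, "*=", "ReportViewer_fixedTable");
    Console.WriteLine(id + ": $= " + suffixMatch + ", *= " + containsMatch);
}
```

Only the first id satisfies $=, while both satisfy *=.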

Try using:
targetElement = driver.FindElement(By.XPath("//table[contains(@id, 'ReportViewer_fixedTable')]"));
Note that this matches every table whose id contains 'ReportViewer_fixedTable', not only those whose id ends with it. I will try to find a regex option that would be a more accurate answer to your question.

This solution works irrespective of the XPath version. First, create a method in a common helper class.
public static string GetXpathStringForIdEndsWith(string endStringOfControlId)
{
    return "//*[substring(@id, string-length(@id) - string-length(\"" + endStringOfControlId + "\") + 1) = \"" + endStringOfControlId + "\"]";
}
In my case, the control ID differs between versions of my product:
v1.0 :: ContentPlaceHolderDefault_MasterPlaceholder_HomeLoggedOut_7_hylHomeLoginCreateUser
v2.0 :: ContentPlaceHolderDefault_MasterPlaceholder_HomeLoggedOut_8_hylHomeLoginCreateUser
Then you can call the method above to find the control whose ID has a constant ending:
By.XPath(Common.GetXpathStringForIdEndsWith("<End String of the Control Id>"))
For the control IDs I mentioned for v1 and v2, I use it like this:
By.XPath(Common.GetXpathStringForIdEndsWith("hylHomeLoginCreateUser"))
The overall logic is that you can use the XPath expression below to find a control whose ID ends with a particular string:
//*[substring(@id, string-length(@id) - string-length("<EndString>") + 1) = "<EndString>"]
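The ends-with expression can be checked without a browser by running it through the XPath 1.0 engine in System.Xml. In this sketch the builder is repeated so the example is self-contained, and the XML and the second id are made up for illustration.

```csharp
using System;
using System.Xml;

// Same expression builder as above, repeated so the sketch is self-contained.
static string GetXpathStringForIdEndsWith(string endStringOfControlId)
{
    return "//*[substring(@id, string-length(@id) - string-length(\"" + endStringOfControlId
        + "\") + 1) = \"" + endStringOfControlId + "\"]";
}

// Made-up document: only the first id ends with the constant part.
var doc = new XmlDocument();
doc.LoadXml(
    "<root>" +
    "<a id='HomeLoggedOut_7_hylHomeLoginCreateUser' />" +
    "<a id='hylHomeLoginCreateUser_other' />" +
    "</root>");

string endsWithXpath = GetXpathStringForIdEndsWith("hylHomeLoginCreateUser");
int matchCount = doc.SelectNodes(endsWithXpath)?.Count ?? 0;
Console.WriteLine(matchCount); // prints 1
```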

Parsing HTML "Visually"

Okay, I am at a loss how to name this question. I have some HTML files, probably written by lord Lucifer himself, that I need to parse. They consist of many segments like this, among other HTML tags:
<p>HeadingNumber</p>
<p style="text-indent:number;margin-top:neg_num ">Heading Text</p>
<p>Body</p>
Notice that the heading number and text are in separate p tags, aligned on one horizontal line by CSS. The CSS may be whatever Lucifer fancies: a mixture of indents, paddings, margins and positions.
However, that line is a single object in my business model and should be kept as such. So how do I detect whether two p elements are visually on a single line and process them accordingly? I believe the HTML files are well formed, if that helps.

You didn't specify how you were parsing, but this is possible in jQuery since you can determine the offset position of any element from the window origin. Check out the example here.
The code:
$(function() {
    // True when the two elements' top offsets differ by at most `tolerance` pixels.
    function sameHorizon(obj1, obj2, tolerance) {
        tolerance = tolerance || 0;
        var obj1top = obj1.offset().top;
        var obj2top = obj2.offset().top;
        return Math.abs(obj1top - obj2top) <= tolerance;
    }
    $('p').each(function(i, obj) {
        // A negative margin-top suggests the <p> is pulled up beside the previous one.
        if (parseFloat($(obj).css('margin-top')) < 0) {
            var p1 = $(obj).prev('p');
            var p2 = $(obj);
            var pTol = 4; // pixel tolerance within which elements are considered aligned
            if (sameHorizon(p1, p2, pTol)) {
                // put what you want to do with these objects here
                // I just highlighted them for example
                p1.css('background', '#cc0');
                p2.css('background', '#c0c');
                // but you can manipulate their contents
                console.log(p1.html(), p2.html());
            }
        }
    });
});
This code is based on the assumption that if a <p> has a negative margin-top then it is attempting to be aligned with the previous <p>, but if you know jQuery it should be apparent how to alter it to meet different criteria.
If you can't use jQuery for your problem, you may still be able to set something up in jQuery that parses the files and outputs new markup. Failing that, hopefully this is useful to someone else who can.
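If you end up driving a browser from C#, the same tolerance check is easy to port. The sketch below covers only the comparison; the offsets are made up, and with Selenium they could come from each element's Location.Y (the rendered top-left corner), which is an assumption about your setup.

```csharp
using System;

// C# counterpart of sameHorizon: true when two top offsets differ by at most
// `tolerance` pixels.
static bool SameHorizon(int top1, int top2, int tolerance = 0)
{
    return Math.Abs(top1 - top2) <= tolerance;
}

// Made-up offsets; with Selenium they could come from IWebElement.Location.Y.
Console.WriteLine(SameHorizon(120, 123, 4)); // prints True
Console.WriteLine(SameHorizon(120, 160, 4)); // prints False
```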

You may run irobotsoft web scraper and have a test:
Open the page in its browser window
Select and mark the line
Use menu: Design -> Practice HTQL and see if it can extract the line.

I don't have a ton of experience with it, but if the HTML is well formed, and depending on the format you need the parsed data in, you may be able to treat it as an XML document and use XQuery to pull out your data.
Also, open the HTML in Firefox and see if you can figure out which CSS styles are being applied using Firebug. It may give you a better clue as to how the HTML is being lined up, although it looks like it's being done with 'margin-top:negative_number'. If that's the case, I think XQuery should be able to find the elements with that particular style applied.
