Web automation using .NET


I am a very new programmer. Does anyone know how to do web automation with C#?
Basically, I just want to automate some simple actions on a web page.
After I have opened the web link, I want to perform the actions below automatically:
Automatically input some value and click the "Run" button.
Check an option in the ComboBox and click the "Download" button.
How can I do this with C#? A friend suggested PowerShell, but I guess .NET provides this kind of library too. Any suggestions or links I can refer to?

You can use the System.Windows.Forms.WebBrowser control (MSDN Documentation). For testing, it allows you to do the things that could be done in a browser. It executes JavaScript without any additional effort. If something goes wrong, you can visually see the state the site is in.
Example:
private void buttonStart_Click(object sender, EventArgs e)
{
webBrowser1.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(webBrowser1_DocumentCompleted);
webBrowser1.Navigate("http://www.wikipedia.org/");
}
void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
HtmlElement search = webBrowser1.Document.GetElementById("searchInput");
if(search != null)
{
search.SetAttribute("value", "Superman");
foreach(HtmlElement ele in search.Parent.Children)
{
if (ele.TagName.ToLower() == "input" && ele.Name.ToLower() == "go")
{
ele.InvokeMember("click");
break;
}
}
}
}
To answer your question about how to check a checkbox:
For the HTML:
<input type="checkbox" id="testCheck"></input>
The code:
search = webBrowser1.Document.GetElementById("testCheck");
if (search != null)
search.SetAttribute("checked", "true");
Actually, the specific "how to" depends greatly on the actual HTML.
For handling your multi-threaded problem:
private delegate void StartTestHandler(string url);
private void StartTest(string url)
{
if (InvokeRequired)
Invoke(new StartTestHandler(StartTest), url);
else
{
webBrowser1.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(webBrowser1_DocumentCompleted);
webBrowser1.Navigate(url);
}
}
InvokeRequired reports whether the current thread differs from the UI thread (strictly, the thread the form was created on). If it does, Invoke runs StartTest on that thread instead.

Check out SimpleBrowser, which is a fairly mature, lightweight browser automation library.
https://github.com/axefrog/SimpleBrowser
From the page:
SimpleBrowser is a lightweight, yet
highly capable browser automation
engine designed for automation and
testing scenarios. It provides an
intuitive API that makes it simple to
quickly extract specific elements of a
page using a variety of matching
techniques, and then interact with
those elements with methods such as
Click(), SubmitForm() and many more.
SimpleBrowser does not support
JavaScript, but allows for manual
manipulation of the user agent,
referrer, request headers, form values
and other values before submission or
navigation.

If you want to simulate a real browser then WatiN will be a good fit for you. (Selenium is another alternative, but I do not recommend it for you).
If you want to work on the HTTP level, then use WebRequest and related classes.
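As a rough illustration of the HTTP-level route, the sketch below posts form fields with HttpWebRequest instead of driving a browser. The FormPoster class and its method names are assumptions made for this example, not part of the answer above, and the real field names would have to be read from the target page's HTML.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net;
using System.Text;

// Hypothetical helper for illustration only.
public static class FormPoster
{
    // Percent-encodes name/value pairs into a form body.
    // Spaces become %20, which servers accept in form data.
    public static string BuildFormBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join("&", fields.Select(f =>
            Uri.EscapeDataString(f.Key) + "=" + Uri.EscapeDataString(f.Value)));
    }

    // Sends the fields as a form POST and returns the response body as text.
    public static string Post(string url, IEnumerable<KeyValuePair<string, string>> fields)
    {
        byte[] body = Encoding.UTF8.GetBytes(BuildFormBody(fields));

        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        request.ContentLength = body.Length;

        using (Stream stream = request.GetRequestStream())
        {
            stream.Write(body, 0, body.Length);
        }

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Calling FormPoster.Post with the page's form URL and its input names would stand in for typing a value and clicking "Run". This only works when the site does not depend on JavaScript to build the request. On newer .NET versions, HttpClient is the recommended replacement for WebRequest.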

You could use Selenium WebDriver.
A quick code sample below:
using System; // needed for Console, TimeSpan and StringComparison
using OpenQA.Selenium;
using OpenQA.Selenium.Firefox;
// Requires reference to WebDriver.Support.dll
using OpenQA.Selenium.Support.UI;
class GoogleSuggest
{
static void Main(string[] args)
{
// Create a new instance of the Firefox driver.
// Note that it is wrapped in a using clause so that the browser is closed
// and the webdriver is disposed (even in the face of exceptions).
// Also note that the remainder of the code relies on the interface,
// not the implementation.
// Further note that other drivers (InternetExplorerDriver,
// ChromeDriver, etc.) will require further configuration
// before this example will work. See the wiki pages for the
// individual drivers at http://code.google.com/p/selenium/wiki
// for further information.
using (IWebDriver driver = new FirefoxDriver())
{
//Notice navigation is slightly different than the Java version
//This is because 'get' is a keyword in C#
driver.Navigate().GoToUrl("http://www.google.com/");
// Find the text input element by its name
IWebElement query = driver.FindElement(By.Name("q"));
// Enter something to search for
query.SendKeys("Cheese");
// Now submit the form. WebDriver will find the form for us from the element
query.Submit();
// Google's search is rendered dynamically with JavaScript.
// Wait for the page to load, timeout after 10 seconds
var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(10));
wait.Until(d => d.Title.StartsWith("cheese", StringComparison.OrdinalIgnoreCase));
// Should see: "Cheese - Google Search" (for an English locale)
Console.WriteLine("Page title is: " + driver.Title);
}
}
}
One great thing about this approach is that you can easily switch the underlying browser implementation just by specifying a different IWebDriver, such as FirefoxDriver, InternetExplorerDriver, or ChromeDriver. This also means you can write one test and run it against multiple IWebDriver implementations, thus testing how the page works when viewed in Firefox, Chrome, IE, etc. People working in QA often use Selenium to write automated web page tests.

I use ObjectForScripting to automate the WebBrowser control: JavaScript in the page calls back into a C# function, and that function then extracts data or automates further actions.
I have explained this in detail at the following link:
Web Automation using Web Browser and C#

.NET does not have any built-in functionality for this. It does have the WebClient and HttpRequest/HttpResponse classes, but they are only building blocks.

You cannot easily automate client-side activity, like filling out forms or clicking on buttons, from C#. However, if you look into JavaScript, you may be able to automate some of those things more easily. To really automate it, you would need to reverse engineer the call made by clicking the button and connect to the URL directly, using the classes @John mentions.

Related

WebBrowser Scraping - Return Control to Calling Function or Another Function C#

I am using a WebBrowser control for web scraping pages on Yahoo news. I need to use a WebBrowser rather than HtmlAgilityPack to accommodate for JavaScript and the like.
Application Type: WinForm
.NET Framework: 4.5.1
VS: 2013 Ultimate
OS: Windows 7 Professional 64-bit
I am able to scrape the required text, but I am unable to return control of the application to the calling function or any other function when scraping is complete. I also cannot verify that scraping is complete.
I need to
1. Verify that all page loads and scraping have completed.
2. Perform actions on a list of the results, as by alphabetizing them.
3. Do something with the data, such as displaying text contents in a Text box or writing them to SQL.
I declare new class variables for the WebBrowser, a list of URLs, and an object with a property that contains a list of news articles.
public partial class Form1 : Form
{
public WebBrowser w = new WebBrowser(); //WebBrowser
public List<String> lststrURLs = new List<string>(); //URLs
public ProcessYahooNews pyn = new ProcessYahooNews(); //Contains articles
...
lststrURLs.Add("http://news.yahoo.com/sample01");
lststrURLs.Add("http://news.yahoo.com/sample02");
lststrURLs.Add("http://news.yahoo.com/sample03");
Pressing a button, whose handler is the calling function, runs this code.
w.Navigate(strBaseURL + lststrTickers[0]); //invokes w_Loaded
foreach (YahooNewArticle article in pyn.articles)
{
textBox1.Text += article.strHeadline + "\r\n";
textBox1.Text += article.strByline + "\r\n";
textBox1.Text += article.strContent + "\r\n";
textBox1.Text += article.dtDate.ToString("yyyyMMdd") + "\r\n\r\n"; // "MM" is month; "mm" would be minutes
}
The first problem I have is that program control appears to skip over w.Navigate and pass directly to the foreach block, which does nothing since articles has not been populated yet. Only then is w.Navigate executed.
If I could get the foreach block to wait until after w.Navigate did its work, then many of my problems would be solved. Absent that, w.Navigate will do its work, but then I need control passed back to the calling function.
I have worked on a partial work-around.
w.Navigate loads a page into the WebBrowser. When it is done loading, the event w.DocumentCompleted fires. I am handling the event with w_Loaded, which uses a class with logic to perform the web scraping.
// Sets up the class
pyn.ProcessYahooNews_Setup(w, e);
// Perform the scraping
pyn.ProcessLoad();
The result of the scraping is that pyn.articles is populated. The next page is loaded only when certain criteria are met, such as pyn.articles.Count > 0.
if (pyn.articles.Count > 0)
{
//Navigate to the next page
i++;
w.Navigate(lststrURLs[i]);
}
More pages are scraped, and articles.Count grows. However, I cannot determine that scraping is done - that there will not be more page loads resulting in more articles.
Supposing I am confident that the scraping is done, I then need to make articles available for further handling, such as sorting it as a list, removing certain elements, and displaying its textual content in a TextBox.
That takes me back to the foreach block that was called too early. Now I need it, but I have no way to get articles into the foreach. I don't think I can call some other function from w_Loaded to do the handling for me, because it would be called for each page load, and I need to call the function once after all page loads.
It occurs to me that some threaded architecture might help, but I could use some help on figuring out what the architecture would look like.
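One possible shape for this, without extra threads, is to keep the pending URLs in a queue and let each DocumentCompleted event pull the next one, running the final handling only when the queue is empty. The sketch below is an assumption-laden illustration: PageQueue and its members are hypothetical names, and the navigate and allDone callbacks stand in for w.Navigate and for whatever code displays or stores the articles.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: walks a fixed list of URLs one at a time and
// invokes the completion callback once, after the last page is handled.
public sealed class PageQueue
{
    private readonly Queue<string> _pending;
    private readonly Action<string> _navigate;
    private readonly Action _allDone;
    private bool _finished;

    public PageQueue(IEnumerable<string> urls, Action<string> navigate, Action allDone)
    {
        _pending = new Queue<string>(urls);
        _navigate = navigate;
        _allDone = allDone;
    }

    // Call once to start, then once from the DocumentCompleted handler
    // after each page has been scraped.
    public void Next()
    {
        if (_pending.Count > 0)
        {
            _navigate(_pending.Dequeue());
            return;
        }

        if (_finished)
        {
            return; // completion already reported
        }

        _finished = true;
        _allDone();
    }
}
```

In the form, the button handler would create the queue from lststrURLs with url => w.Navigate(url) as the navigate callback and call Next() once; w_Loaded would call Next() after pyn.ProcessLoad(). The foreach block that fills textBox1 would move into the allDone callback, so it runs only after every page has been scraped. Note that DocumentCompleted can fire more than once for pages that contain frames, so the handler may need to compare e.Url with w.Url before advancing.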

How do you update ASP.NET web forms from different threads?

In the last few days I've been trying to learn how to use ASP.NET Web Forms together with multithreading the hard way by building a simple applet using both and I've been struggling with aspects of interactions between different threads and the UI.
I've resolved some multithreading issues in some other questions (and also learned after waaaaaay too long that web forms and WPF are not the same thing) but now I'm running into trouble finding the best way to update UI elements based on data acquired in multiple threads.
Here's my code:
Default.aspx
public partial class _Default : System.Web.UI.Page
{
private NlSearch _search;
private static int _counter = 0;
private static SortedList<long, SearchResult> resultsList = new SortedList<long, SearchResult>();
protected void Page_Load(object sender, EventArgs e)
{
_search = new NlSearch();
}
protected void AddSearchMethod(object sender, EventArgs e)
{
var text = SearchForm.Text;
new Task(() => MakeRequest(text));
}
protected void UpdateMethod(object sender, EventArgs e)
{
resultsLabel.Text = "";
foreach (var v in resultsList.Values)
{
resultsLabel.Text += v.SearchTerm + ": " + v.Count + " occurances<br/>";
}
}
protected void ClearSearchMethod(object sender, EventArgs e)
{
resultsLabel.Text = "";
resultsList.Clear();
}
protected void MakeRequest(string text)
{
_counter++;
SearchResult s = new SearchResult
{
SearchTerm = text,
Count = _search.MakeRequests(text)
};
resultsList.Add(_counter, s);
}
}
I've tried quite a few versions of the same basic thing. NlSearch.MakeRequest (called by MakeRequests) sends an HTTP POST request to an outside web site imitating a search bar input, and then extracts an integer from the markup indicating how many results came back.
The current simple UI revolves around a SearchForm text field, an "Add Search" button, an "Update Label" button, a "Clear Search" button, and a ResultsLabel that displays results. The "Add Search" button creates a new task that calls MakeRequest, which calls the method to send the HTTP request and then stores the results in the order they were sent in a static sorted list.
So now ideally in a good UI I would like to just update the label every time a thread returns, however I've tried using ContinueWhenAll and a few other task functions and the problem seems to be that other threads do not have the ability to change the UI.
I have also tried running a new thread on page load that updates the label every few seconds, but this likewise failed.
Because I haven't been able to implement this correctly, I've had to use the "Update Label" button, which literally just tells the label to display what's currently in the static list. I would really like to get rid of this button, but I can't figure out how to get my threads to make UI changes.

In general, trying to do threading in a web app is a bad idea. Web servers already manage threads themselves to serve concurrent requests, so spinning off new threads or processes should be avoided if at all possible. While there used to be a mechanism (and maybe there still is) to "push" results to a client, there are better solutions available today.
What you're describing is exactly the problem that AJAX is intended to solve.

You mentioned WPF in your question -- are you perhaps instead looking for a Windows application, like WinForms? I think that perhaps the term "web forms" has confused the situation. Web forms are just webpages with some (okay, a lot) of added in Microsoft functionality.
It sounds like you're trying to send updates to a webpage from a thread in code. The web doesn't work that way. I'd suggest reading through the ASP.NET Page Life Cycle Overview if you're actually trying to design webpages. Other answers have suggested AJAX functionality (which is where the web page executes some JavaScript that goes out and talks to a web server).

Have you ever heard of AJAX? I think you're thinking like an application developer instead of a web developer.

If you want to run your code asynchronously, you may want to use the async and await keywords instead of managing threads yourself. See the information about Asynchronous Programming with Async and Await.
Do not let your threads get tangled up ;)
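To make the async/await suggestion concrete, here is a minimal sketch that starts several searches concurrently and awaits them all with Task.WhenAll. SearchRunner and both of its methods are hypothetical; CountResultsAsync only simulates the HTTP call with Task.Delay and returns a fake count, so it is a placeholder for something like the question's NlSearch.MakeRequests rather than a real implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical example class, not part of the question's code.
public static class SearchRunner
{
    // Stand-in for the real HTTP request: waits briefly and returns
    // the term's length as a fake result count.
    public static async Task<int> CountResultsAsync(string term)
    {
        await Task.Delay(50);
        return term.Length;
    }

    // Starts every search at once and completes when all have finished.
    public static async Task<Dictionary<string, int>> RunAllAsync(IEnumerable<string> terms)
    {
        List<string> list = terms.ToList();

        // Task.WhenAll returns the results in the same order as the input tasks.
        int[] counts = await Task.WhenAll(list.Select(t => CountResultsAsync(t)));

        var results = new Dictionary<string, int>();
        for (int i = 0; i < list.Count; i++)
        {
            results[list[i]] = counts[i];
        }
        return results;
    }
}
```

With this shape, no background thread ever touches the label: the code that awaits RunAllAsync receives the finished dictionary and can render it in a single response.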

C# SendKeys.SendWait() doesn't always work

I am trying to make an application that sends keys to an external application, in this case aerofly FS. I have previously used the SendKeys.SendWait() method with success, but this time it doesn't quite work the way I want it to. I want to send a "G" keystroke to the application, and when testing with Notepad I do get G's. But in aerofly FS nothing is received at all. Pressing G on the keyboard does work, though.
This is my code for handling input data (from an Arduino) and sending the keystrokes:
private void handleData(string curData)
{
if (curData == "1")
SendKeys.SendWait("G");
else
{ }
}

I too have run into external applications where SendKeys didn't work for me.
As best I can tell, some applications, like applets inside a browser, expect to receive the key down, followed by a pause, followed by a key up, which I don't think can be done with SendKeys.
I have been using a C# wrapper to the AutoIt Library, and have found it quite easy to use.
Here's a link to a quick guide I wrote for integrating AutoIt into a C# project.
Once you have the wrapper and references, you can send "G" with the following:
private void pressG()
{
AutoItX3Declarations.AU3_Send("{g}");
}
or with a pause,
private void pressG()
{
AutoItX3Declarations.AU3_Send("{g down}", 0);
AutoItX3Declarations.AU3_Sleep( 50 ); //wait 50 milliseconds
AutoItX3Declarations.AU3_Send("{g up}", 0);
}
AutoIt also allows you to control the mouse programmatically.

Is there a publicly accessible event to mark a ModalDialog close?

I recently made a custom ribbon in Sitecore. The two buttons in it fire off a command which activates a Xaml application using SheerResponse.ShowModalDialog. These applications affect the state of a database being read by another component on the ribbon.
I either need to be able to fire a custom event or function from the Xaml application to make the other ribbon component refresh, or I need to be able to make the component on the ribbon aware that it needs to re-render when the ModalDialogs close. I don't see any obvious events which would do this. I've gone about as far as I can looking through the raw code with DotPeek, and I haven't seen anything which even looks promising.

Apparently, the answer was there the whole time and I had missed it.
SheerResponse has a five parameter version of ShowModalDialog which accepts a boolean as a final parameter. This means I can couple it with ClientPage.Start:
Context.ClientPage.Start(this, "Run", kv);
}
private void Run(ClientPipelineArgs args)
{
var id = args.Parameters["id"];
if(!args.IsPostBack)
{
string controlUrl = string.Format("{0}&id={1}", UIUtil.GetUri("control:AltDelete"), id);
SheerResponse.ShowModalDialog(controlUrl,"","","",true);
args.WaitForPostBack();
}
else
{
Logger.LogDebug("post back");
}
Logger.LogDebug("out of if");
}

How do I manage threads in a C# web app?

I built a little web application that displays charts. I was thinking that it might be useful for the superuser of the app to do a complete data refresh, however this process takes around 10 minutes to complete. I was thinking perhaps the user could click a button that would start off a new thread to do a data refresh and subsequent clicks would kill the thread and restart the data population process. The user would then be free to browse about the site and view the charts as their data is populated.
Is there a simple method of accomplishing something like this?

You can twist ASP.NET to do this sort of thing, but it violates a few good general rules for ASP.NET development -- and could really cause problems in a server farm.
So, the most obvious route is to do this work in a web service. You can have the method return a chunk of HTML if you want. You could also add status methods to see how the thread is progressing.
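As a sketch of what a status-reporting refresh could look like, the class below allows at most one refresh to run at a time and exposes whether it is still running. RefreshJob and its members are hypothetical names invented for this example, and the work delegate stands in for the real ten-minute data population. It is only an illustration under those assumptions: in ASP.NET, background work like this can be cut short when the application pool recycles.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch: runs at most one refresh at a time and reports its state.
public sealed class RefreshJob
{
    private readonly Action _work;
    private int _running; // 0 = idle, 1 = running

    public RefreshJob(Action work)
    {
        _work = work;
    }

    // Status check that a page or service method could expose.
    public bool IsRunning
    {
        get { return Volatile.Read(ref _running) == 1; }
    }

    // Returns the task doing the work, or null if a refresh is already running.
    public Task TryStart()
    {
        // Atomically flip 0 -> 1; any other caller sees 1 and backs off.
        if (Interlocked.CompareExchange(ref _running, 1, 0) != 0)
        {
            return null;
        }

        return Task.Run(() =>
        {
            try
            {
                _work();
            }
            finally
            {
                // Mark the job idle again even if the work throws.
                Volatile.Write(ref _running, 0);
            }
        });
    }
}
```

A button handler could call TryStart() and ignore a null result, while a status method returns IsRunning so the page can show whether the data is still being populated.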

Other options include: Handing the intense processing off to a database server (sounds like this might be a good use of OLAP) or, another cheap trick might be to set up the click to fire off a scheduled task that runs on the server. Can you provide some additional detail about the environment? Single server? Data storage platform, version of .net?

OK, I didn't use either answer, so here is what I did. I decided it would be better for subsequent clicks to terminate rather than the currently executing refresh. Thanks for your answers, guys.
//code behind
protected void butRefreshData_Click(object sender, EventArgs e)
{
Thread t = new Thread(new ThreadStart(DataRepopulater.DataRepopulater.RepopulateDatabase));
t.Start();
}
//DataRepopulater.cs
namespace DataRepopulater
{
public static class DataRepopulater
{
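// Lock on a private object; a string literal is interned and could be shared with unrelated code.
private static readonly object myLock = new object();
public static void RepopulateDatabase()
{
// If another refresh already holds the lock, return immediately.
if (Monitor.TryEnter(myLock))
{
try { DoWork(); }
finally { Monitor.Exit(myLock); } // release the lock even if DoWork throws
}
}
}
}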
