C# WebBrowser stuck on navigating when used in for loop

I have a for loop that changes the URL on each iteration:
for (int i = 1; i < max; i += 50)
{
completed = false;
string currkey = country;
crawler.Navigate(new Uri("http://www.example.net/func.php?dom=" + currkey + "&key=&start=" + i));
Console.WriteLine("Navigating to " + "http://www.example.net/func.php?dom=" + currkey + "&key=&start=" + i);
while (!completed)
{
Application.DoEvents();
Thread.Sleep(500);
}
}
This is my DocumentCompleted handler:
crawler.Refresh();
Console.WriteLine("Getting universities");
getUniversities();
Console.WriteLine("Finished getting universities");
completed = true;
When I get rid of the for loop and use a single link, it navigates to the website correctly, but when I use the for loop to load the pages in order, the web browser gets stuck on the second iteration.
Example:
currkey = United States
In the first iteration, the website link will be http://www.example.net/func.php?dom="United States"&key=&start=1, and on the next one it will be http://www.example.net/func.php?dom="United States"&key=&start=51. The navigation gets stuck when trying to load the second link.
I have used the boolean completed to note that the current iteration is finished, but it is still stuck.
Any kind of help is appreciated

Your Thread.Sleep call blocks the UI thread, which prevents the WebBrowser from continuing to load. What you should be doing is attaching to the DocumentCompleted event and loading the next page from there. Please don't use this while/sleep combination in WinForms - you should use the events that the controls expose.
Attach the event:
crawler.DocumentCompleted += CrawlerDocumentCompleted;
Event handler:
private void CrawlerDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
//The document has loaded - now do something
}
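One way to apply this to the loop in the question is to put the URLs in a queue and start the next navigation from the event handler. The following is only a sketch under assumptions: the SequentialCrawler class, its BuildUrls helper, and the onPageLoaded callback are hypothetical names that do not appear in the question, and the ReadyState check is one common way to ignore the extra DocumentCompleted events raised for frames. It also does not call Refresh() in the handler, since reloading the page would likely start another load.

```csharp
using System;
using System.Collections.Generic;
using System.Windows.Forms;

// Hypothetical helper: walks a list of URLs one at a time and starts the next
// navigation only after DocumentCompleted has fired for the current one.
public sealed class SequentialCrawler
{
    private readonly WebBrowser _browser;
    private readonly Queue<Uri> _pending;
    private readonly Action<WebBrowser> _onPageLoaded;

    public SequentialCrawler(WebBrowser browser, IEnumerable<Uri> urls,
                             Action<WebBrowser> onPageLoaded)
    {
        _browser = browser;
        _pending = new Queue<Uri>(urls);
        _onPageLoaded = onPageLoaded;
    }

    // Builds the paged URLs from the question: start=1, 51, 101, ... below max.
    // The country is escaped so that "United States" becomes United%20States.
    public static List<Uri> BuildUrls(string country, int max)
    {
        var urls = new List<Uri>();
        for (int i = 1; i < max; i += 50)
        {
            urls.Add(new Uri("http://www.example.net/func.php?dom="
                + Uri.EscapeDataString(country) + "&key=&start=" + i));
        }
        return urls;
    }

    // Subscribes to the event and kicks off the first navigation.
    public void Start()
    {
        _browser.DocumentCompleted += OnDocumentCompleted;
        NavigateNext();
    }

    private void NavigateNext()
    {
        if (_pending.Count == 0)
        {
            // Nothing left to load: stop listening.
            _browser.DocumentCompleted -= OnDocumentCompleted;
            return;
        }
        _browser.Navigate(_pending.Dequeue());
    }

    private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // The event can fire once per frame; wait for the whole document.
        if (_browser.ReadyState != WebBrowserReadyState.Complete) return;

        _onPageLoaded(_browser); // e.g. call getUniversities() here
        NavigateNext();          // no Sleep, no DoEvents: just load the next page
    }
}
```

With the names from the question, this could be started with new SequentialCrawler(crawler, SequentialCrawler.BuildUrls(country, max), b => getUniversities()).Start(); and the UI thread stays free to pump messages between pages.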
A final thought
As it looks like you are implementing a crawler, why are you using the WebBrowser control in WinForms to navigate? Surely all you are interested in is the HTML that the server serves up? Or is the page using JavaScript to load additional elements into the DOM, requiring you to use the WebBrowser?
You could use the WebClient class and the DownloadString or DownloadStringAsync methods. See https://msdn.microsoft.com/en-us/library/fhd1f0sw(v=vs.110).aspx
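If the raw HTML is enough, a minimal sketch with WebClient.DownloadString might look like this. The HtmlFetcher class and DownloadAll method are hypothetical names used only for illustration, and the call is synchronous, so it is assumed to run somewhere other than the UI thread (or to be replaced with DownloadStringAsync).

```csharp
using System;
using System.Collections.Generic;
using System.Net;

// Hypothetical helper: downloads each URL in order and returns the HTML.
public static class HtmlFetcher
{
    public static List<string> DownloadAll(IEnumerable<string> urls)
    {
        var pages = new List<string>();
        using (var client = new WebClient())
        {
            foreach (string url in urls)
            {
                // Blocks until the server has returned the whole response body.
                pages.Add(client.DownloadString(url));
            }
        }
        return pages;
    }
}
```

Because each DownloadString call returns only when the response is complete, a plain loop over the URLs works here without any completed flag.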

Related

How to wait until selenium completes its navigation?

OK, here is my code, but the next line executes immediately:
private static ChromeDriver mainDriver;
mainDriver.Navigate().GoToUrl(srFetchUrl);
string srPageSource = mainDriver.PageSource;
I have to get the source code only after the browser has actually navigated to the new page and the page has loaded.
You can try this method (written in Java). It polls until the page has loaded completely, and you can adjust the maximum wait time.
public void E_WaitForPageLoad() throws Exception
{
JavascriptExecutor js = (JavascriptExecutor)driver;
//This loop runs up to 60 times (waittime), checking once per second whether the page is ready.
//Change waittime if you want to increase or decrease the maximum wait.
int waittime;
waittime = 60;
for (int i=0; i<waittime; i++)
{
try
{
Thread.sleep(1000);
}catch (InterruptedException e) {}
//To check page ready state.
if (js.executeScript("return document.readyState").toString().equals("complete"))
{
//System.out.println("Wait for Page Load : "+js.executeScript("return document.readyState").toString());
break;
}
}
System.out.println("\nWeb-Page Loaded.");
}
Thank You,
Ed D, India.
Specify an implicit or explicit wait until the element in the page is loaded.
Refer to this link for the C# wait syntax.
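For the C# bindings used in the question, an explicit wait on document.readyState could be sketched as follows. This assumes the Selenium.Support package (which provides WebDriverWait) is referenced; the PageLoadWait class and its method name are hypothetical.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Support.UI;

// Hypothetical helper: navigates, waits for the document to finish loading,
// then returns the page source.
public static class PageLoadWait
{
    public static string NavigateAndGetSource(IWebDriver driver, string url, TimeSpan timeout)
    {
        driver.Navigate().GoToUrl(url);

        // Polls until the script reports "complete"; throws
        // WebDriverTimeoutException if that does not happen within the timeout.
        var wait = new WebDriverWait(driver, timeout);
        wait.Until(d => "complete".Equals(
            ((IJavaScriptExecutor)d).ExecuteScript("return document.readyState")));

        return driver.PageSource;
    }
}
```

With the question's variables this would be called as PageLoadWait.NavigateAndGetSource(mainDriver, srFetchUrl, TimeSpan.FromSeconds(60)). Note that readyState only covers the initial document load; content added later by scripts would still need a wait on a specific element.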

WPF WebBrowser scroll on reload

I have a System.Windows.Controls.WebBrowser. It has some HTML that is coming from another document that the user is editing. When the HTML changes, what I want to do is update the WebBrowser's HTML, then scroll the WebBrowser back to wherever it was. I am successfully caching the scroll offset (see How to retrieve the scrollbar position of the webbrowser control in .NET). But I can't get a callback when the load is complete. Here is what I have tried:
// constructor
public HTMLReferenceEditor()
{
InitializeComponent();
WebBrowser browser = this.EditorBrowser;
browser.LoadCompleted += Browser_LoadCompleted;
//browser.Loaded += Browser_Loaded; // commented out as it doesn't fire when the html changes . . .
}
private void Browser_LoadCompleted(object sender, System.Windows.Navigation.NavigationEventArgs e)
{
CommonDebug.LogLine("LoadCompleted");
this.ScrollWebBrowser();
}
private void ScrollWebBrowser()
{
WebBrowser browser = this.EditorBrowser;
ReferenceHierarchicalViewModel rhvm = this.GetReferenceHierarchichalViewModel();
int? y = rhvm.LastKnownScrollTop; // this is the cached offset.
browser?.ScrollToY(y);
}
The LoadCompleted callbacks are firing all right. But the scrolling is not happening. I suspect the callbacks are coming too soon. But it is also possible that my scroll method is wrong:
public static void ScrollToY(this WebBrowser browser, int? yQ)
{
if (yQ.HasValue)
{
object doc = browser?.Document;
HTMLDocument castDoc = doc as HTMLDocument;
IHTMLWindow2 window = castDoc?.parentWindow;
int y = yQ.Value;
window?.scrollTo(0, y);
CommonDebug.LogLine("scrolling", window, y);
// above is custom log method; prints out something like "scrolling HTMLWindow2Class3 54", which
// at least proves that nothing is null.
}
}
How can I get the browser to scroll? Incidentally, I don't see some of the callback methods others have mentioned; e.g. DocumentCompleted, mentioned in "Detect WebBrowser complete page loading", does not exist for me. In other words, for some reason I don't understand, my WebBrowser is different from theirs. For me, the methods don't exist.

Firing WebBrowser.DocumentCompleted event whilst in a loop

I have a simple app I am developing that needs to iterate through a list of URLs, which are passed to a WebBrowser's Navigate function in a foreach loop. I was hoping to see the DocumentCompleted event firing after each call of the Navigate function, but it only seems to be fired after the whole form has completed loading, and thus after the loop has completed.
I guess I am missing something fundamental here but some help and advice would be great!
Thanks!
Here is a sample of code that I am trying...
This foreach loop runs in the Form Load event of the WinForms page I am using...
int id = 0;
foreach (DataRow row in quals.Rows)
{
URN = row["LAIM_REF"].ToString();
string URN_formated = URN.Replace("/", "_");
string URL = "http://URL_I_AM_GOING_TOO/";
string FullURL = URL + URN_formated;
wbrBrowser.ScriptErrorsSuppressed = true;
wbrBrowser.Refresh();
wbrBrowser.Navigate(FullURL);
id += 1;
label1.Text = id.ToString();
}
At the point the loop gets to the line:
wbrBrowser.Navigate(FullURL);
I was hoping that the event:
private void wbrBrowser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
...
}
would fire therefore being able to run processes against each of the URLs returned in the loop.
Thanks!
I used:
while (wbrBackground.ReadyState != WebBrowserReadyState.Complete) { Application.DoEvents(); }
after the Navigate function and it now works as expected.
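An alternative to polling with Application.DoEvents() is to wrap the event in a Task and await it. This is only a sketch, assuming .NET 4.5 or later for async/await; the NavigateAsync extension method is a hypothetical helper, not part of the WebBrowser API.

```csharp
using System;
using System.Threading.Tasks;
using System.Windows.Forms;

public static class WebBrowserExtensions
{
    // Hypothetical helper: starts a navigation and returns a task that
    // completes once DocumentCompleted fires with the document fully loaded.
    public static Task NavigateAsync(this WebBrowser browser, string url)
    {
        var tcs = new TaskCompletionSource<bool>();
        WebBrowserDocumentCompletedEventHandler handler = null;
        handler = (sender, e) =>
        {
            // Ignore the per-frame events raised before the page is complete.
            if (browser.ReadyState != WebBrowserReadyState.Complete) return;

            browser.DocumentCompleted -= handler; // one-shot subscription
            tcs.TrySetResult(true);
        };
        browser.DocumentCompleted += handler;
        browser.Navigate(url);
        return tcs.Task;
    }
}
```

If the Form Load handler is declared async, the loop body can then use await wbrBrowser.NavigateAsync(FullURL); in place of the Navigate call. Each iteration resumes on the UI thread after the page has loaded, and the message loop keeps running in between.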

How should i properly invoke a WebBrowser using multiplethreads?

Problem Scope:
I'm writing an application to save the HTML retrieved from Bing and Google searches. I know there are classes that execute web requests using streams, such as this example, but since Google and Bing both use JavaScript and Ajax to render the results into the HTML, there's no way I can simply read the stream to get the result I need.
The solution to this, is to use the WebBrowser class and navigate to the url i want, so that the Browser itself will handle all the Javascript and Ajax scripting executions.
MultiThreading:
In order to make it more efficient, I have the same Form application firing a thread for each service (one for Bing and one for Google).
Problem:
Since I need the WebBrowser, I have instantiated one for each thread (two at the moment). According to Microsoft, there is a known bug that prevents the DocumentCompleted event from firing if the WebBrowser is not visible and is not added to a visible form as well (for more information, follow this link).
Real Problem:
The main issue is that, the DocumentCompleted event of the browser, never fires. Never.
I have written a proper handler for the DocumentCompleted event, but it never gets the callback. To handle the wait for the browser event to fire, I have implemented an AutoResetEvent with a high timeout (5 minutes) that will dispose of the WebBrowser thread if the event I need does not fire within 5 minutes.
At the moment, i have the Browser created and added into a WindowsForm, both are visible, and the event is still not firing.
Some Code:
// Creating Browser Instance
browser = new WebBrowser ();
// Setting up Custom Handler to "Document Completed" Event
browser.DocumentCompleted += DocumentCompletedEvent;
// Setting Up Random Form
genericForm = new Form();
genericForm.Width = 200;
genericForm.Height = 200;
genericForm.Controls.Add (browser);
browser.Visible = true;
As for the Navigation i have the Following (method for the browser) :
public void NavigateTo (string url)
{
CompletedNavigation = false;
if (browser.ReadyState == WebBrowserReadyState.Loading) return;
genericForm.Show (); // Shows the form so that it is visible at the time the browser navigates
browser.Navigate (url);
}
And, for the call of the Navigation i have this :
// Loading URL
browser.NavigateTo(URL);
// Waiting for Our Event To Fire
if (_event.WaitOne (_timeout))
{
// Success
}
else
{
// Error / Timeout From the AutoResetEvent
}
TL;DR:
My WebBrowser is instantiated on another STA thread and added to a form. Both are visible and shown when the browser navigation starts, but the DocumentCompleted event from the browser is never fired, so the AutoResetEvent always times out and I get no response from the browser.
Thanks in Advance and sorry for the long post
Although this seems a strange way, here is my attempt.
var tasks = new Task<string>[]
{
new MyDownloader().Download("http://www.stackoverflow.com"),
new MyDownloader().Download("http://www.google.com")
};
Task.WaitAll(tasks);
Console.WriteLine(tasks[0].Result);
Console.WriteLine(tasks[1].Result);
public class MyDownloader
{
WebBrowser _wb;
TaskCompletionSource<string> _tcs;
ApplicationContext _ctx;
public Task<string> Download(string url)
{
_tcs = new TaskCompletionSource<string>();
var t = new Thread(()=>
{
_wb = new WebBrowser();
_wb.ScriptErrorsSuppressed = true;
_wb.DocumentCompleted += _wb_DocumentCompleted;
_wb.Navigate(url);
_ctx = new ApplicationContext();
Application.Run(_ctx);
});
t.SetApartmentState(ApartmentState.STA);
t.Start();
return _tcs.Task;
}
void _wb_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
//_tcs.TrySetResult(_wb.DocumentText);
_tcs.TrySetResult(_wb.DocumentTitle);
_ctx.ExitThread();
}
}

How can i change UI of my page after async call?

I am having async call on my page,
This takes about 1 minute.
I need to change UI after call completes.
Sample code is given below.
protected void Unnamed1_Click(object sender, EventArgs e)
{
apicasystemWPMCheckStatsService.CheckStatsServiceClient obj = new apicasystemWPMCheckStatsService.CheckStatsServiceClient();
string xmlOptionForGetCheckStats = "<options><mostrecent count='1'/><dataformat>xml</dataformat><timeformat>tz</timeformat></options>";
string checkId = "";
TextBox1.Text = TextBox1.Text + "test" + "\r\n";
obj.BeginGetCheckStats("admin#azuremonitoring", "Cu4snfPSGr8=", "PD6B685A0-006A-4405-951E-B24BB51E7966",
checkId, xmlOptionForGetCheckStats, new AsyncCallback(ONEndGetCheckStats), null);
TextBox1.Text = TextBox1.Text + "testdone" + "\r\n";
}
public void ONEndGetCheckStats(IAsyncResult asyncResult)
{
System.Threading.Thread.Sleep(3000);
TextBox1.Text = TextBox1.Text + "testcomplete" + "\r\n";
}
The question is: how can I get "testcomplete" into my textbox, given that my page is not posted back after this async call?
My current output:
test
testdone
Expected:
test
testdone
testcomplete
Simple answer: you cannot do it like that, because once the ASPX page has been sent to the client there is no simple way for the server to communicate with that page.
You can do this however with AJAX. In your Unnamed1_Click set up a "flag" in Session that signals that an async operation is pending. In your ONEndGetCheckStats set that "flag" to signal that the operation has completed.
Add an ASP.NET page method (Quick Tutorial) to your code-behind that:
Checks whether the operation is pending and returns null when it is
When operation is finished removes everything from Session and returns the text that you need
On your ASPX page, wire up a client-side event on the Unnamed1 button (a poor name for a button, btw) that starts a client-side loop checking the status using that PageMethod. When the status is no longer null, use JavaScript to change the TextBox1 text.
