WebBrowser Control Loading Twice - c#

Alrighty, guys. If you'd like to pull your hair out, then I've got a great problem for you. This problem seems very rare, but it affects my program on a few different sites whose pages load content twice.
For instance: http://www.yelp.com/search?find_desc=donuts&find_loc=78664&ns=1#start=20
If you visit this site, you'll notice that it loads, then reloads different data. That's because there is a parameter in the URL that says start=20, so the results should start at #20 instead of #10. No matter what that is set to, Yelp loads the first 10 results. Not sure why they do this, but this is a prime example of what absolutely breaks my program. :(
Basically, whenever a page loads in my program, it copies the source code to a string so it can display it somewhere else. The details aren't really important. What is important is that the string needs to hold the last thing that is loaded in the page. When a page loads and then loads again, I am not sure how to catch the second load: the program exits the while loop after the first one and copies that source code into the string called Source.
Here is a snippet of code that reproduces the problem. When I use this in a new program, it copies the source code for the first page's results instead of what they are changed to.
GetSite = "http://www.yelp.com/search?find_desc=donuts&find_loc=78664&ns=1#start=20";
webBrowser9.Navigate(GetSite);
while (webBrowser9.ReadyState != WebBrowserReadyState.Complete)
{
    p++;
    if (p == 1000000)
    {
        MessageBox.Show("Timeout error. Click OK to skip." + Environment.NewLine + "This could crash the program, but maybe not.");
        label15.Text = "Error Code: Timeout";
        break;
    }
    Application.DoEvents();
}
mshtml.HTMLDocument objHtmlDoc = (mshtml.HTMLDocument)webBrowser9.Document.DomDocument;
Source = objHtmlDoc.documentElement.innerHTML;

Why do you wait in a while loop for the browser to finish loading data?
Use the DocumentCompleted event instead; you can record the document's URL from there.

Related

Selenium: Wait function and if condition to repeat

In my automation project.
Browser: Firefox
I would like to add a wait function without any specific time.
driver.Manage().Timeouts().ImplicitlyWait(TimeSpan.FromSeconds(7));
IWebElement query1 = driver.FindElement(By.("continue"));
How can I do that?
I would also like to verify that the next page loaded and, if it did not, repeat the previous function. The reason is that sometimes the browser does not change the page; it stays on the same page.
Besides this, are the following possible in Selenium?
Clear cache and cookies for the last hour
Open a URL in a new tab (in the already opened browser, rather than opening a new window)
One thing that has worked consistently for me (regarding waits) is used in the Conductor framework.
Here's some pseudo-code you can attempt to recreate in C#:
int size = 0;
int attempts = 0;
while (size == 0) {
    size = driver.findElements(by).size();
    if (attempts == MAX_ATTEMPTS) fail(String.format("Could not find %s after %d seconds",
            by.toString(),
            MAX_ATTEMPTS));
    attempts++;
    try {
        Thread.sleep(1000); // sleep for 1 second.
    } catch (Exception x) {
        x.printStackTrace(); // log first, since fail() ends the test
        fail("Failed due to an exception during Thread.sleep!");
    }
}
Basically, this polls once per second until the selector matches at least one element, and fails once MAX_ATTEMPTS is reached. Another way to do it is to wait on a condition.
Some more pseudo-code:
function waitForElement(element) {
    Wait.Until(ExpectedConditions.elementIsClickable(element), 10.Seconds)
}
And to your questions: can Selenium...
Clear Cache and Cookie for last hour
If you write your test cases correctly, making them independent of one another and not reusing the same browser over and over, this happens automatically. When Selenium opens a new window, it starts with an entirely fresh profile, meaning it has nothing in the cache from the start.
Opening URL in new tab (In already opened browser rather than opening new window)
Selenium does not (and will never) know the difference between a tab and a window. To Selenium, it's just a handle.
Source:

WebBrowser Scraping - Return Control to Calling Function or Another Function C#

I am using a WebBrowser control for web scraping pages on Yahoo news. I need to use a WebBrowser rather than HtmlAgilityPack to accommodate for JavaScript and the like.
Application Type: WinForm
.NET Framework: 4.5.1
VS: 2013 Ultimate
OS: Windows 7 Professional 64-bit
I am able to scrape the required text, but I am unable to return control of the application to the calling function or any other function when scraping is complete. I also cannot verify that scraping is complete.
I need to
1. Verify that all page loads and scraping have completed.
2. Perform actions on a list of the results, such as alphabetizing them.
3. Do something with the data, such as displaying text contents in a Text box or writing them to SQL.
I declare new class variables for the WebBrowser, a list of URLs, and an object with a property that contains a list of news articles.
public partial class Form1 : Form
{
    public WebBrowser w = new WebBrowser(); //WebBrowser
    public List<String> lststrURLs = new List<string>(); //URLs
    public ProcessYahooNews pyn = new ProcessYahooNews(); //Contains articles
    ...
    lststrURLs.Add("http://news.yahoo.com/sample01");
    lststrURLs.Add("http://news.yahoo.com/sample02");
    lststrURLs.Add("http://news.yahoo.com/sample03");
Pressing a button calls this code; the button's click handler is the calling function.
w.Navigate(strBaseURL + lststrTickers[0]); //invokes w_Loaded
foreach (YahooNewArticle article in pyn.articles)
{
    textBox1.Text += article.strHeadline + "\r\n";
    textBox1.Text += article.strByline + "\r\n";
    textBox1.Text += article.strContent + "\r\n";
    textBox1.Text += article.dtDate.ToString("yyyyMMdd") + "\r\n\r\n"; // MM is the month; lowercase mm would be minutes
}
The first problem I have is that program control appears to skip over w.Navigate and pass directly to the foreach block, which does nothing since articles has not been populated yet. Only then is w.Navigate executed.
If I could get the foreach block to wait until after w.Navigate did its work, then many of my problems would be solved. Absent that, w.Navigate will do its work, but then I need control passed back to the calling function.
I have worked on a partial work-around.
w.Navigate loads a page into the WebBrowser. When it is done loading, the event w.DocumentCompleted fires. I am handling the event with w_Loaded, which uses a class with logic to perform the web scraping.
// Sets up the class
pyn.ProcessYahooNews_Setup(w, e);
// Perform the scraping
pyn.ProcessLoad();
The result of the scraping is that pyn.articles is populated. The next page is loaded only when a criterion is met, such as pyn.articles.Count > 0.
if (pyn.articles.Count > 0)
{
    //Navigate to the next page
    i++;
    w.Navigate(lststrURLs[i]);
}
More pages are scraped, and articles.Count grows. However, I cannot determine that scraping is done - that there will not be more page loads resulting in more articles.
Supposing I am confident that the scraping is done, I need to make articles available for further handling, such as sorting it as a list, removing certain elements, and displaying its textual content in a TextBox.
That takes me back to the foreach block that was called too early. Now I need it, but I have no way to get articles into the foreach. I don't think I can call some other function from w_Loaded to do the handling for me, because it would be called on every page load, and I need to call the function once, after all page loads.
It occurs to me that some threaded architecture might help, but I could use some help on figuring out what the architecture would look like.

JavaScript window.open returns null sometimes

I am attempting maintenance on a system I did not write (and aren't we all?). It is written in C Sharp and JavaScript, with Telerik reports.
It has the following code included in JavaScript that runs when the user clicks a button to display a report in a separate window:
var oIframe = $("<iframe id='idReportFrame' style='display:none' name='idReportFrame' src=''>");
oIframe.load(function() { parent.ViewReports(); });
oIframe.appendTo('body');
try
{
    $('#idReportForm').attr('target', 'idReportFrame');
    $('#idReportForm').submit();
}
catch (err) { // I did NOT write this
}
Then the load function:
function ViewReports()
{
    var rptName = $("#ReportNameField").val();
    if (rptName == '') { return false; }
    var winOption = "fullscreen=no,height=" + $(window).height() + ",left=0,directories=yes,titlebar=yes,toolbar=yes,location=yes,status=no,menubar=yes,scrollbars=no,resizable=no, top=0, width=" + $(window).width();
    var win = window.open('@Url.Action("ReportView", "MyController")?pReportName=' + rptNameCode, 'Report', winOption);
    win.focus();
    return false;
}
When I execute this (in Chrome, at least), it does pop up the window and put the report in it. However, breakpoints in the c# code indicate that it is getting called 2 or 3 times. Breakpoints in the JavaScript and examination of the little log in the JavaScript debugging environment in Chrome show that the call to win.focus() fails once or twice before succeeding. It returns an undefined value, and then it appears that the first routine above is executed again.
I am inclined to think it is some kind of timing issue, except that the window.open() call is supposed to be synchronous as far as I can tell, and I don't know why it would succeed sometimes and not others. There is a routine that gets executed on load of the window; perhaps that's somehow interfering with the value returned from open().
I am not a JavaScript person much, as those of you that are can likely tell by this time. If there is something with the code I've put here that you can tell me is incorrect, that's great; what I'm more hopeful for is someone who can explain how the popup-report-in-frame is supposed to work. Hopefully I can do it without having to replace too much of the code I've got, as it is brittle and was not, shall we say, written with refactoring in mind.
From what I could find, window.open returns null when it fails to open the window. Something may be preventing the browser from opening additional windows some of the time; maybe it is a popup blocker.
The actual loading of the URL and creation of the window are done asynchronously.
https://developer.mozilla.org/en-US/docs/Web/API/Window.open
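As a minimal sketch of guarding against that null return, the helper below checks the result of open() before calling focus() on it. The name openAndFocus and the opener parameter are made up for illustration and are not part of the original code; opener would normally be the browser's window object, and it is passed in only so the logic can be tried without a real browser.

```javascript
// Hypothetical helper (not from the original code).
// `opener` is any object with an open() method, normally `window`.
function openAndFocus(opener, url, name, features) {
  const win = opener.open(url, name, features);
  if (!win) {
    // A null result means the browser did not open the window
    // (for example, because a popup blocker stepped in).
    return { opened: false, win: null };
  }
  win.focus();
  return { opened: true, win: win };
}
```

In the ViewReports function from the question, this would replace the unguarded window.open / win.focus() pair, so a blocked popup results in opened being false rather than an error on win.focus().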
Popup blocking
In the past, evil sites abused popups a lot. A bad page could open
tons of popup windows with ads. So now most browsers try to block
popups and protect the user.
Most browsers block popups if they are called outside of
user-triggered event handlers like onclick.
For example:
// popup blocked
window.open('https://javascript.info');
// popup allowed
button.onclick = () => {
    window.open('https://javascript.info');
};
Source: https://javascript.info/popup-windows
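One common way to work within that rule, sketched here under assumptions: open the window synchronously inside the click handler, then point it at the real URL once any asynchronous work has finished. The names openThenNavigate, opener and fetchUrl are hypothetical; opener stands in for window, and fetchUrl for whatever function returns a Promise of the URL to show.

```javascript
// Hypothetical helper (names are assumptions for illustration).
// `opener` is normally `window`; `fetchUrl` returns a Promise for a URL.
function openThenNavigate(opener, fetchUrl) {
  // Open right away, while still inside the user-triggered handler.
  const win = opener.open('about:blank', '_blank');
  if (!win) {
    // The popup was blocked; skip the asynchronous work.
    return Promise.resolve(false);
  }
  return fetchUrl().then(function (url) {
    // Navigate the already-open window once the URL is known.
    win.location.href = url;
    return true;
  });
}
```

Typical use would be button.onclick = () => openThenNavigate(window, loadReportUrl), where loadReportUrl is whatever function produces the URL.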
I just ran into this, and it seems to be because I had a breakpoint on the line that calls window.open and was stepping through the code in Chrome dev tools. This was extremely hit-and-miss and seemed to fail (return null and not open a window, whether one already existed or not) more times than it succeeded.
I read @Joshua's comment that the creation is done asynchronously, so I figured that forcing the code to 'stop' each time I step might be screwing things up somehow (though on a single line like var w = window.open(...) it doesn't seem like this could happen).
So, I took out my breakpoint... and everything started working perfectly!
I also took note of https://developer.mozilla.org/en-US/docs/Web/API/Window/open where they specify that if you are re-using a window variable and name (the second argument to window.open) then a certain pattern of code is recommended. In my case, I want to write HTML content to it, rather than give it a URL and let it load the content asynchronously over the network, and I may call the whole function repeatedly without regard for the user closing the window that pops up. So now I have something like this:
var win; // initialises to undefined
function openWindow() {
    var head = '<html><head>...blahblah..</head><body>';
    var content = '<h1>Amazing content</h1><p>Isn\'t it, though?</p>';
    var footer = '</body></html>';
    if (!win || win.closed) {
        // window either never opened, or was open and has been closed.
        win = window.open('about:blank', 'MyWindowName', 'width=100,height=100');
        win.document.write(head + content + footer);
    } else {
        // window still exists from last time and has not been closed.
        win.document.body.innerHTML = content;
    }
}
I'm not convinced the write call should be given the full <html> header, but this seems to work 100% for me.
[edit] I found that a Code Snippet on Stack Overflow has some kind of security feature that prevents window.open, but this jsfiddle shows the code above working, with a tweak to show an incrementing counter to prove the content update is working as intended. https://jsfiddle.net/neekfenwick/h8em5kn6/3/
A bit late, but I think it's due to the window not actually being closed in JS, or maybe the memory pointer not being dereferenced.
I was having the same problem and I solved it by enclosing the call in a try/finally block.
try {
    if (!winRef || winRef.closed) {
        winRef = window.open('', '', 'left=0,top=0,width=300,height=400,toolbar=0,scrollbars=0,status=0,dir=ltr');
    } else {
        winRef.focus();
    }
    winRef.document.open();
    winRef.document.write(`
        <html>
        <head>
        <link rel="stylesheet" href="/lib/bootstrap/dist/css/bootstrap.min.css">
        </head>
        <body>
        ${$(id).remove('.print-exclude').html()}
        </body>
        </html>
    `);
    winRef.document.close();
    winRef.focus();
    winRef.print();
} catch { }
finally {
    if (winRef && !winRef.closed) winRef.close();
}

How to save my state in Windows Store App?

I have a little problem with saving my state to LocalSettings. Everything is OK except when someone closes my application using Alt+F4 and opens it again before 10 seconds have elapsed (after 10 seconds the application enters the suspending state and the data is saved). (Technology: XAML/C#)
I save my data in the OnSuspending event.
I load my data in the OnLaunched event like this:
if (args.PreviousExecutionState == ApplicationExecutionState.Terminated ||
    args.PreviousExecutionState == ApplicationExecutionState.ClosedByUser)
{
    // load data
}
How can I handle this situation? I know I can save my state every time it changes, but I don't think that is a good idea in my application.
Thanks for the help!
When you close and relaunch your application before 10 seconds have elapsed, another instance of it is created and the previous one does not run the OnSuspending event (this is strange, because it means that asynchronous operations like this event can end, or never start, without warning us). I find this annoying, but why would your user do something like that? Most of the time the user restarts your application because it crashed, or because they are stuck and can't go back to the main page. You should try to prevent those scenarios, and then such things will rarely happen.
However, this can also happen because the user forgot to do something and wants to start the app again. To prevent losing user data, I save the most important data whenever I get the chance and save the rest only in the OnSuspending method. You need to think about which data will upset your users when lost.
I think Microsoft should provide a better way of saving application state. I searched a lot about this problem and didn't find an explanation, so for now I will continue to do what I described above. I hope this question helps clarify this, in my opinion, strange case.
In OnLaunched:
CoreWindow.GetForCurrentThread().Activated += App_Activated;
and the event handler:
void App_Activated(CoreWindow sender, WindowActivatedEventArgs args)
{
    if (args.WindowActivationState == CoreWindowActivationState.Deactivated)
    {
        //save Data
    }
}
When you load data, remove the check:
if (args.PreviousExecutionState == ApplicationExecutionState.Terminated || args.PreviousExecutionState == ApplicationExecutionState.ClosedByUser)
It works!

Ria Framework DomainDataSource MoveToNextPage, MoveToPage, MoveToFirstPage doesn't move pages

I'm attempting to write a save results extension to the DomainDataSourceView.
I can successfully write the contents of the current page of results, but when I attempt to call MoveToNextPage(), the PageIndex stays the same. The MSDN docs don't provide any details other than that MoveToNextPage returns a bool indicating whether it successfully moved to the next page.
The following sample code results in an infinite loop, and the current page never changes.
private void WriteResults(DomainDataSourceView resultsview)
{
    StringBuilder csvdata = new StringBuilder();
    ... Do Work on current page ...
    if(resultsview.CanChangePage && resultsview.MoveToNextPage())
    {
        csvdata.Append(WriteResults(resultsview));
    }
}
Do I need to listen for the PageChanged Event to continue Saving Results?
Do I need to call Load on the DomainDataSource for each page?
The MSDN docs on DomainDataSourceView don't go into much detail on this subject.
[Edit]
After playing around some more, I was able to determine that the Move...Page commands do call the DomainDataSource Load operation. However, it's another async call, so any subsequent work that needs to be done on the loaded pages has to wait until that load has completed.
