WebBrowser Control Retrieving jQuery Text - c#

I am trying to detect when the website displays the following message, which is added by a jQuery event. Initially this markup isn't present in the page's HTML.
<div id="toast-container" class="toast-top-right"><div class="toast toast-error" aria-live="assertive" style="display: block;"><div class="toast-message">Check email & password.</div></div></div>
My assumption is that webBrowser1.DocumentText.Contains only looks at the content from the initial load.
So I thought maybe some sort of timer would work every 5 seconds, looking to see if the code has changed - but I don't even think this is right as it's checking the code that's already loaded repeatedly?
private void timer2_Tick(object sender, EventArgs e)
{
// Checks for any errors on sign in page
if (webBrowser1.DocumentText.Contains("toast toast-error"))
{
// Toast Notifications
var signinErrorNotification = new Notification("Error", "Please check your email and password are correct.", 50, FormAnimator.AnimationMethod.Fade, FormAnimator.AnimationDirection.Left);
signinErrorNotification.Show();
}
}
How do I go about getting the latest markup after jQuery has modified the page?
P.S. My c# level is beginner.

The Document property should give you what you need.
Notice that the docs for DocumentText say
Gets or sets the HTML contents of the page displayed in the WebBrowser
control.
For Document they say
Gets an HtmlDocument representing the Web page currently displayed in the WebBrowser control.
To me that's saying that DocumentText is like the starting document and Document is the current DOM. Also see https://learn.microsoft.com/en-us/dotnet/framework/winforms/controls/how-to-access-the-managed-html-document-object-model
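A minimal sketch of that idea, reusing the timer2 and webBrowser1 names from the question and assuming the toast markup is injected under the element with id toast-container. It queries the live DOM through Document instead of searching DocumentText. Stopping the timer after the first hit is an assumption added for illustration, not something the question asked for.

```csharp
private void timer2_Tick(object sender, EventArgs e)
{
    // Document is null until the control has loaded a page.
    HtmlDocument doc = webBrowser1.Document;
    if (doc == null) return;

    // GetElementById queries the current DOM, so elements added later by script are visible.
    HtmlElement container = doc.GetElementById("toast-container");
    if (container == null || container.InnerHtml == null) return;

    if (container.InnerHtml.Contains("toast-error"))
    {
        // Assumption: report the error once, then stop polling.
        timer2.Stop();
        // Show the sign-in error notification here.
    }
}
```

The timer interval is a trade-off: a shorter interval catches the toast sooner, but the toast may also disappear between two ticks if the page removes it quickly.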

Related

How to properly display the current and real url loaded in a WebBrowser control?

In my Form, I added a WebBrowser and a TextBox on which I would like to display the current loaded Url.
Note what the Microsoft Docs say:
WebBrowser.Navigating event:
Occurs before the WebBrowser control navigates to a new document
WebBrowser.Navigated event:
Occurs when the WebBrowser control has navigated to a new document and has begun loading it.
...
Handle the DocumentCompleted event to receive notification when the WebBrowser control finishes loading the new document.
WebBrowser.DocumentCompleted event:
Occurs when the WebBrowser control finishes loading a document
...
Handle the DocumentCompleted event to receive notification when the new document finishes loading. When the DocumentCompleted event occurs, the new document is fully loaded
The order in which the events are fired is Navigating, Navigated and DocumentCompleted, so I'm handling those events to try to properly update the current url:
private void WebBrowser1_Navigating(object sender, WebBrowserNavigatingEventArgs e) {
this.TextBox1.Text = e.Url.ToString();
}
private void WebBrowser1_Navigated(object sender, WebBrowserNavigatedEventArgs e) {
this.TextBox1.Text = e.Url.ToString();
}
private void WebBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e) {
this.TextBox1.Text = e.Url.ToString();
}
The problem is that for some reason the url does not seem to properly update for some websites...
For example when navigating through Google's search engine, if I do click on the Google Images button, the url updates to "http://www.google.com/blank.html". Also, the urls that I get to display in my TextBox are not the same exact urls as I can see in Firefox or Chrome's address bar; for some reason my obtained urls have additional parameters in the query.
See for yourself:
https://i.imgur.com/PQlSu47.gif
Is there any workaround to improve this annoying behavior so I can display the current url as reliably as Firefox or Chrome does? I mean, for example, Firefox and Chrome will not show "http://www.google.com/blank.html" in the address bar, nor will they show the url queries with the additional parameters that I get (which you can see in the GIF image above).
Please note that the problem with the Google website is just an example. I'm asking for a universal solution because this issue occurs with many more websites.
Also note that if, instead of the WebBrowser component, I use CefSharp's Chromium-based web browser, adapting my code to display/update the current url in the same way, then the problem is partially gone...
Using CefSharp does not show "http://www.google.com/blank.html" when navigating through Google Images; however, the query of the urls still contains additional parameters and many differences compared to the urls displayed in the Firefox or Chrome browsers.
And apart from that, I would like to avoid using CefSharp just for solving this kind of issue...
TL;DR
Instead of URL in the event args, use URL from the web browser control:
string url = webBrowser1.Url.ToString();
The long answer
What you are missing here is that the HTML page can contain iframe elements. An iframe encapsulates an HTML Window and the contained HTML Document, and it performs its own navigation. The navigation events of the WebBrowser control fire both for the top window (for which you want to display the URL) and for the iframes. You will have to distinguish between the two.
Specifically, in your case http://www.google.com/blank.html comes from an iframe:
<html> <!-- top window -->
<body> <!-- top window's document -->
<iframe src="http://www.google.com/blank.html">
<!--
here the browser will load and "insert" HTML of blank.html
lines below don't exist in the original HTML
they are loaded and "inserted" here by the browser
-->
<html> <!-- iframe's window -->
<body> <!-- iframe's window's document -->
<!-- the body can contain additional iframes... -->
</body>
</html>
</iframe>
</body>
</html>
In general, the DOM of an HTML page is a tree of HTML Window objects, with the root window returned by the window.top property. Depending on how the page is designed, iframes can be visible or hidden; they can be rendered in HTML by the server, and also be manipulated, created, or deleted dynamically in the browser through JavaScript:
when a new iframe is created, it performs navigation to the URL specified in its src attribute. If neither src nor embedded contents are specified, the navigation URL will be about:blank.
when the src attribute of an existing iframe is modified, the iframe will perform navigation to the new src.
when window.location of either the top or an iframe HTML Window is modified, that window will perform navigation to the new location.
However, determining which HTML Window (top or iframe) performs the navigation doesn't seem to be a trivial task, so a simpler approach would be just getting URL of the top window:
string url = webBrowser1.Url.ToString();
or after the DocumentCompleted event:
HtmlWindow topWindow = webBrowser1.Document.Window;
string url = topWindow.Url.ToString();
The DocumentCompleted can be fired multiple times for a given URL, because a page can contain iframes that also trigger the event. So I suggest that you update the textbox at Navigating and Navigated events only.
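As a rough sketch of the suggestion above, using the handler and control names from the question (WebBrowser1_Navigated, TextBox1) and assuming the control field is named webBrowser1 as in the answer's snippets. The comparison in DocumentCompleted is a commonly used heuristic for skipping iframe completions rather than a documented guarantee, and it is only needed if that event is handled at all.

```csharp
private void WebBrowser1_Navigated(object sender, WebBrowserNavigatedEventArgs e)
{
    // e.Url may belong to an iframe; the control's Url property refers to the top-level document.
    Uri topUrl = webBrowser1.Url;
    if (topUrl != null)
    {
        this.TextBox1.Text = topUrl.ToString();
    }
}

private void WebBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    // Heuristic: ignore completions whose URL differs from the top-level URL (likely iframes).
    if (webBrowser1.Url == null || !e.Url.Equals(webBrowser1.Url)) return;

    // The top-level document has finished loading at this point.
    this.TextBox1.Text = webBrowser1.Url.ToString();
}
```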

C# WebBrowser different html document after navigate

I have a really strange problem in C#:
First I use the WebBrowser control and the navigate method to browse.
wb_email.Navigate("https://registrierung.web.de");
Now I can change the innerText of htmlelements without any problems.
wb_email.Document.GetElementById("id4").InnerText = "Alexander";
But when I reload the page by simply using the navigate method with the same url again,
I get a null reference exception. It seems as if it can't find the element.
So I used an inspector for Firefox to see if the htmlelement really changed, after reloading.
But only the url is changing, htmlelements are all the same.
What am I doing wrong?
You're just changing the DOM in the displayed page. When you reload the page, the WebBrowser instance will just refresh the DOM from the server again and lose your changes.
The WebBrowser class isn't designed for editing rendered pages inside itself, as it's basically just a wrapper to an embedded Internet Explorer instance.
Make sure the website has finished loading before accessing any element. Like:
webBrowser.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(webBrowser_DocumentCompleted);
void webBrowser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
// Access elements here
}

Get "real" HTML source from website

So, I've come across an issue where my favorite radio station plays a song I don't know while I'm driving. They don't have one of those pages that shows a list of songs that they've played; however, they do have a "Now Playing" section on their site that shows what's currently playing and by whom. So, I am trying to write a small program that will poll the site every 2 minutes to retrieve the name of the song and the artist. Using Chrome dev tools, I can see the song title and artist in the source. But when I view the page source, it doesn't show up. They are using JavaScript to display that info. I've tried the following:
private void button1_Click(object sender, EventArgs e)
{
webBrowser1.Navigate(@"http://www.thebuzz.com/main.html");
webBrowser1.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(webBrowser1_DocumentCompleted);
}
private void webBrowser1_DocumentCompleted(object sender,
WebBrowserDocumentCompletedEventArgs e)
{
do
{
// Do nothing while we wait for the page to load
}
while (webBrowser1.ReadyState == WebBrowserReadyState.Loading);
var test = webBrowser1.DocumentText;
textBox1.Text = test.ToString();
}
Essentially, I'm loading it into a WebBrowser and trying to get the source this way. But I'm still not getting the part after the javascript is run. Is there a way to actually retrieve the rendered HTML after the fact?
EDIT
Also, is there a way in the WebBrowser to allow scripts to run? I get popups asking me if I want to allow them to run. I don't want to suppress them, I need them to run.
As Jay Tomten said in the comments, you're trying to fix the result of your problem, not the cause. The cause of the problem is that they're using Javascript to update that part of the page. Instead of working around that by letting the Javascript do its update and then reading what it wrote, ask yourself where the Javascript is getting the info from and whether you can go to the same place. Open up something that lets you see web traffic - Fiddler, or Chrome's dev console, for example. Watch for POST calls. One of them will likely be an AJAX request in which the Javascript on the page is getting the current song. Note the URL, inspect the call to see what parameters it sends and what data it gets back. You can use Postman or something like it to assemble a POST request and work out how the Javascript on that site is getting its data, and then write a little code to make your own call to that URL and parse what comes back.
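To illustrate the last step, here is a minimal sketch of making such a call directly with HttpClient. The endpoint URL, the form parameter, and the class name are all hypothetical placeholders; the real values have to come from whatever the network trace shows, and the request may turn out to be a GET rather than a POST.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;

static class NowPlayingPoller
{
    // Reuse a single HttpClient for all requests.
    private static readonly HttpClient Client = new HttpClient();

    // Hypothetical endpoint; replace with the URL observed in Fiddler or the dev console.
    private const string EndpointUrl = "http://www.example.com/nowplaying";

    public static async Task<string> FetchNowPlayingAsync()
    {
        // Hypothetical parameter; mirror whatever the page's own request sends.
        var form = new FormUrlEncodedContent(new Dictionary<string, string>
        {
            { "station", "example" }
        });

        using (HttpResponseMessage response = await Client.PostAsync(EndpointUrl, form))
        {
            response.EnsureSuccessStatusCode();
            // The raw body (often JSON or XML) still has to be parsed by the caller.
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

Calling FetchNowPlayingAsync from a timer every two minutes would then replace the WebBrowser control entirely, which also sidesteps the script prompts mentioned in the question.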

WebBroswer_DocumentCompleted event is not working

I have a form with a browser control. (This control uses IE9 because I set values in the registry editor.)
This web browser navigates to a specific URL, fills in all fields on the HTML page and submits them; then the result page is displayed.
My problem is that I just want to know when this result page is fully loaded so that I can fetch some information.
I use the WebBroswer_DocumentCompleted event, which works fine for the first page but not for the result page, as it triggers before the result page is loaded.
I tried another solution, which is to check for the div tag inside the result page (this tag only appears when the result page is loaded completely), and it works but not always.
My code:
private void WebBroswer_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
HtmlElementCollection elc3 = this.BotBrowser.Document.GetElementsByTagName("div");
foreach (HtmlElement el in elc3)
{
if (el.GetAttribute("id").Equals("Summary_Views")) //this determine i am at the result page
{
// fetch the result
}
}}
That div id is "Summary_Views".
I can provide you the link of that website on demand which is just for BLAST tools and database website for research purpose.
Frames and IFrames will cause this event to fire multiple times. Check out this answer:
HTML - How do I know when all frames are loaded?
Or this answer:
How to use WebBrowser control DocumentCompleted event in C#?
Or Microsoft's KB article:
http://support.microsoft.com/kb/180366
Do you know if there are frames? If so then please say so, so people can help with that. If not then say so, so people can offer alternatives.
My guess is that the content is being generated by JavaScript. If it is then the document is complete before the JavaScript executes and you need to somehow wait until the JavaScript is done. The solution depends upon the web page. So you might need to process multiple document completes for diagnostic purposes and attempt to determine if there is a way to know which one you need.
At last I have solved my problem. I added a timer control from the toolbox and set its interval to 200 ms and its AutoReset property to false. Its tick event handler checks every 200 ms whether this div has been loaded; after that, the AutoReset property is set to true. This solution is working perfectly :)
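A minimal sketch of this polling approach, under some assumptions: it uses a System.Windows.Forms.Timer (which has Start/Stop rather than an AutoReset property, so it differs from the exact setup described above), the hypothetical field name pollTimer, and the BotBrowser control and Summary_Views id from the question.

```csharp
// Assumption: pollTimer is a System.Windows.Forms.Timer with Interval = 200,
// started right after the form on the first page is submitted.
private void pollTimer_Tick(object sender, EventArgs e)
{
    HtmlDocument doc = this.BotBrowser.Document;
    if (doc == null) return;

    // The div only exists once the result page has been rendered.
    HtmlElement summary = doc.GetElementById("Summary_Views");
    if (summary == null) return; // not there yet, try again on the next tick

    // Stop polling so the result is processed only once.
    pollTimer.Stop();
    // Fetch the result from 'summary' here.
}
```

Because a WinForms timer raises Tick on the UI thread, the handler can touch the WebBrowser control directly without marshalling.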

Postback causes whitescreen when posting data to url containing parameters

I have encountered an unexpected behaviour and/or bug in the .net postback system.
I have a page that uses a master page to provide common elements, with form inputs split between the child and master pages. The form submit button is located on the master page, and I am attempting to process the postback on the master page.
Any time I attempt to submit data where the form contains any non empty values and the url contains parameters, the page fails to process correctly. This does not occur if the page is submitted under either condition by itself.
The form postback method is post.
The page fails to load, and Firefox reports the "no element found" error.
I have checked for correct class names etc., and I do have empty attributes in non-form elements, but as the page loads correctly at first I don't think that is relevant. I have also checked for infinitely looping code.
This is the current postback handling code:
protected void Page_Load(object sender, EventArgs e)
{
if (IsPostBack)
{
save_page();
}
page_render();
}
//save
private void save_page()
{
dev_text.Text = "save in progress";
}
Setting text in an HTML element on the server will only be seen on the browser when the HTML is sent to the browser. Normally this happens once the entire processing of the page has completed... so normally quite some time after the user initiated the post-back.
Instead of setting the text on the server, consider setting the text directly on the browser at the moment of submission. Something like...
function setSavingText(){
// Vanilla javascript...
document.getElementById("<%=dev_text.ClientID%>").innerHTML = "save in progress";
// JQuery...
$("#<%=dev_text.ClientID%>").text("save in progress");
}
<asp:Button runat="server" ... OnClientClick="setSavingText();" />
The above function contains both a line for vanilla (normal) javascript, and one for the jQuery library. You only need one of them.
