Get the final HTML content after JavaScript has finished, using OpenWebKitSharp - C#

I'm writing software that gets the content from a URL. While working on it, I ran into a problem: I cannot get the exact HTML content after the JavaScript has finished running.
Some websites render their HTML with JavaScript, and some do not support browsers that do not run JavaScript.
I tried using System.Windows.Controls.WebBrowser and reading WebBrowser.Document in LoadCompleted, but with no luck.
After that, I tried the OpenWebkitSharp library. In the UI, it shows the website's content correctly, but reading Document in DocumentCompleted still returns the content as it was before the JavaScript rendered it.
Here is my code:
...
using WebKit;
using WebKit.Interop;
public MainWindow()
{
windowFormHost = new System.Windows.Forms.Integration.WindowsFormsHost();
webBrowser = new WebKit.WebKitBrowser();
webBrowser.AllowDownloads = false;
windowFormHost.Child = webBrowser;
grdBrowserHost.Children.Add(windowFormHost);
webBrowser.Load += WebBrowser_Load;
}
private void WebBrowser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
var contentHtml = ((WebKitBrowser)sender).DocumentAsHTMLDocument;
}
The value of contentHtml is the HTML as it was before the JavaScript finished rendering.

To solve this problem, I added a trick to my code to get the full HTML content after the JavaScript has finished.
using WebKit;
using WebKit.Interop;
using WebKit.JSCore; // Add a reference to JSCore, which ships with the WebKit package.
public MainWindow()
{
InitializeComponent();
InitBrowser();
}
private void InitBrowser()
{
windowFormHost = new System.Windows.Forms.Integration.WindowsFormsHost();
webBrowser = new WebKit.WebKitBrowser();
webBrowser.AllowDownloads = false;
windowFormHost.Child = webBrowser;
grdBrowserHost.Children.Add(windowFormHost);
webBrowser.Load += WebBrowser_Load;
}
private void WebBrowser_Load(object sender, EventArgs e)
{
// ResourceIntercepter throws an exception if webBrowser has not finished loading its components.
// We cannot use DocumentCompleted to read the HTML content, because that event fires before the JavaScript has finished.
webBrowser.ResourceIntercepter.ResourceFinishedLoadingEvent += new ResourceFinishedLoadingHandler(ResourceIntercepter_ResourceFinishedLoadingEvent);
}
private void ResourceIntercepter_ResourceFinishedLoadingEvent(object sender, WebKitResourcesEventArgs e)
{
// WebBrowser.Document still shows the HTML as it was before the JavaScript ran.
// The trick is to call JavaScript (I used jQuery) to get the HTML content.
JSValue documentContent = null;
var readyState = webBrowser.GetScriptManager.EvaluateScript("document.readyState");
if (readyState != null && readyState.ToString().Equals("complete"))
{
documentContent = webBrowser.GetScriptManager.EvaluateScript("$('html').html();");
var contentHtml = documentContent.ToString();
}
}
Hope this one can help you.
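The `$('html').html()` call above only works when the page itself loads jQuery. A minimal sketch of a jQuery-free variant follows; it reads `document.documentElement.outerHTML` instead. The `RenderedHtml` class and the `evaluate` delegate are hypothetical names introduced here so the logic can run without the browser; they are not part of OpenWebKitSharp.

```csharp
using System;

public static class RenderedHtml
{
    // Plain DOM property; does not require jQuery to be loaded by the page.
    public const string OuterHtmlScript = "document.documentElement.outerHTML";

    // 'evaluate' is a hypothetical delegate that runs a script in the browser
    // and returns its result (null when the script yields nothing).
    public static string TryCapture(Func<string, object> evaluate)
    {
        if (evaluate == null) throw new ArgumentNullException(nameof(evaluate));

        object readyState = evaluate("document.readyState");
        if (readyState == null || readyState.ToString() != "complete")
        {
            return null; // document still loading; try again on the next event
        }

        object html = evaluate(OuterHtmlScript);
        return html == null ? null : html.ToString();
    }
}
```

Inside the ResourceFinishedLoadingEvent handler it could be called as `RenderedHtml.TryCapture(script => webBrowser.GetScriptManager.EvaluateScript(script))`. Note that a readyState of "complete" does not guarantee that asynchronous scripts (for example, AJAX calls) have finished, so the captured HTML may still change afterwards.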

Related

Replacement for the WinForms WebBrowser in WPF

I'm trying to use a function from a WinForms project in my WPF project, but the code does not seem to work with WPF or its structure. After many hours of research, I figured out that this code only works in WinForms. I really need it to print to a thermal printer, and I print HTML because in this case my printer has to print Arabic characters.
Here is the code:
private void button2_Click(object sender, EventArgs e)
{
StartBrowser(xx);
}
public static void StartBrowser(string source)
{
var th = new Thread(() =>
{
var webBrowser = new WebBrowser();
webBrowser.ScrollBarsEnabled = false;
webBrowser.IsWebBrowserContextMenuEnabled = true;
webBrowser.AllowNavigation = true;
webBrowser.DocumentCompleted += webBrowser_DocumentCompleted1;
webBrowser.DocumentText = source;
Application.Run();
});
th.SetApartmentState(ApartmentState.STA);
th.Start();
}
static void webBrowser_DocumentCompleted1(object sender, WebBrowserDocumentCompletedEventArgs e)
{
var webBrowser = (WebBrowser)sender;
webBrowser.SetBounds(0, 0, 0, 0);
webBrowser.Print();
}
Is there any other way to print HTML like this using WPF?
You could do something similar using WebView2. However, this method doesn't replicate the current behaviour exactly. The WebView2 control needs to be added to the UI before it will be fully initialized.
Follow the steps in Get started with WebView2 in WPF apps:
Make sure you have the WebView2 runtime installed
Add the WebView2 control to your window/page/control:
<DockPanel>
<DockPanel DockPanel.Dock="Top">
<Button x:Name="PopulateWebView2"
Click="PopulateWebView2_Click"
Content="Print"/>
</DockPanel>
<wv2:WebView2 Name="webView2" />
</DockPanel>
Initialize WebView2, and call this in the window/page constructor:
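async void InitializeAsync()
{
// Await initialization so that a failure surfaces here instead of being silently dropped.
await webView2.EnsureCoreWebView2Async(null);
}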
Handle the button click and print the page.
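private void PopulateWebView2_Click(object sender, RoutedEventArgs e)
{
// Subscribe before navigating so the completion event cannot be missed.
webView2.NavigationCompleted += OnNavigationCompleted;
webView2.NavigateToString(xx); // your HTML string to populate the browser
}
// CoreWebView2NavigationCompletedEventArgs lives in Microsoft.Web.WebView2.Core.
private async void OnNavigationCompleted(object sender, CoreWebView2NavigationCompletedEventArgs args)
{
// Unsubscribe so that repeated clicks do not stack handlers.
webView2.NavigationCompleted -= OnNavigationCompleted;
await webView2.CoreWebView2.PrintToPdfAsync(@"Path/To/file.pdf");
}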
As mentioned, the WebView2 control needs to be rendered on the UI before it can be fully initialized. This will print the page as a PDF file to the path specified in PrintToPdfAsync(path).
Alternatively, you could show the built-in print dialog by executing a print command through the DOM. However, that requires input from the user, for example to select the file location:
await webView2.CoreWebView2.ExecuteScriptAsync("window.print();");
It will work as written in WPF, as well as it does in WinForms anyway...
First, add a reference to the System.Windows.Forms assembly.
Then qualify everything...
using WinForms = System.Windows.Forms;
public static void StartBrowser(string source)
{
var th = new Thread(() =>
{
var webBrowser = new WinForms.WebBrowser();
webBrowser.ScrollBarsEnabled = false;
webBrowser.IsWebBrowserContextMenuEnabled = true;
webBrowser.AllowNavigation = true;
webBrowser.DocumentCompleted += webBrowser_DocumentCompleted1;
webBrowser.DocumentText = source;
WinForms.Application.Run();
});
th.SetApartmentState(ApartmentState.STA);
th.Start();
}
static void webBrowser_DocumentCompleted1(object sender, WinForms.WebBrowserDocumentCompletedEventArgs e)
{
var webBrowser = (WinForms.WebBrowser)sender;
webBrowser.SetBounds(0, 0, 0, 0);
webBrowser.Print();
}

c# Get URL of the new page after invoking click in Web Browser

I have to navigate through some pages from a C# console application. I have a known URL as the starting page, but the other URLs are not known; I reach them by clicking buttons on the starting page.
This is the code of the button I need to click on the page:
<input name="ctl00$ContentPlaceHolder1$btn_invia" type="button" id="myId" onclick="ReDirect();" class="myClass" value="BUTTON TEXT" />
I tried this code to navigate to the start page, and it works:
class Program
{
private static WebBrowser wb1 = new WebBrowser();
[STAThread]
static void Main(string[] args)
{
runBrowserThread(new Uri("http://www.myUrl.com"));
}
private static void runBrowserThread(Uri url)
{
var th = new Thread(() => {
var br = new WebBrowser();
br.DocumentCompleted += Br_DocumentCompleted;
br.Navigate(url);
Application.Run();
});
th.SetApartmentState(ApartmentState.STA);
th.Start();
}
private static void Br_DocumentCompleted(object sender,
WebBrowserDocumentCompletedEventArgs e)
{
//Retrieve string content of document
var document = ((WebBrowser)sender).Document;
var documentAsIHtmlDocument3 =
(mshtml.IHTMLDocument3)document.DomDocument;
var content = documentAsIHtmlDocument3.documentElement.innerHTML;
//Parse content with html agility pack or whatever
//Click on button
wb1.Document.GetElementById("myId").InvokeMember("click");
Application.ExitThread();
}
}
When I click on the button, it loads a new page.
But when I call .InvokeMember("click"), it freezes. How can I make the WebBrowser wb1 go to the new page?
I figured out where I went wrong: I called wb1.Document.GetElementById("myId").InvokeMember("click"); but wb1 wasn't initialized, so I wasn't calling .InvokeMember("click") on an existing element. I changed it to:
((WebBrowser)sender).Document.GetElementById("myId").InvokeMember("Click");
in the Br_DocumentCompleted method, and now it works. But now I have another problem: I modified the Br_DocumentCompleted method in this way, because I have to click on a button in the new page, to open a third page:
private static void Br_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
//Retrieve string content of document
var document = ((WebBrowser)sender).Document;
var documentAsIHtmlDocument3 = (mshtml.IHTMLDocument3)document.DomDocument;
var content = documentAsIHtmlDocument3.documentElement.innerHTML;
if (((WebBrowser)sender).Url.AbsoluteUri.Contains("startpage.aspx"))
{
//Click on button
((WebBrowser)sender).ScriptErrorsSuppressed = true;
((WebBrowser)sender).Document.GetElementById("myId").InvokeMember("Click");
}
else if (((WebBrowser)sender).Url.AbsoluteUri.Contains("page1.aspx"))
{
((WebBrowser)sender).Document.GetElementById("btn_send").InvokeMember("Click");
}
else if (((WebBrowser)sender).Url.AbsoluteUri.Contains("page2.aspx"))
{
//Some code
Application.ExitThread();
}
}
And the "btn_send" on the page1 has this code:
<input onclick="openEVERFANCY('#CChildren','N'); return false;" id=btn_send class=buttonVerifyDisp type=button value=GO>
When I call ((WebBrowser)sender).Document.GetElementById("btn_send").InvokeMember("Click"); on this button, it doesn't go to the third page, even though I have the DocumentCompleted event handler.

Entering text in website textbox using c#

I am trying to automatically fill in a text box on a website in C#, and I used:
private void button1_Click(object sender, EventArgs e)
{
System.Windows.Forms.WebBrowser webBrowser = new WebBrowser();
HtmlDocument document = null;
document=webBrowser.Document;
System.Diagnostics.Process.Start("http://www.google.co.in");
document.GetElementById("lst-ib").SetAttribute("value", "ss");
}
The webpage is opening but the text box is not filled with the specified value. I have also tried innertext instead of setAttribute. I am using windows forms.
You are expecting your webBrowser to load the page at the specified address, but your code actually starts the default browser (pointing at "http://www.google.co.in"), while webBrowser.Document remains null.
Try replacing the Process.Start call with:
webBrowser.Navigate(yourUrl);
Eliminate the Process.Start() statement (as suggested by Gian Paolo), because it starts the default browser as an external process.
The other problem with your code is that you try to manipulate the value of your element too early. Wait for the website to load completely:
private void button1_Click(object sender, EventArgs e)
{
System.Windows.Forms.WebBrowser webBrowser = new WebBrowser();
webBrowser.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(webBrowser_DocumentCompleted);
webBrowser.Navigate("http://www.google.co.in");
}
private void webBrowser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
// The browser is a local variable in button1_Click, so reach it through sender here.
((WebBrowser)sender).Document.GetElementById("lst-ib").SetAttribute("value", "ss");
}
Please note that using an instance of a WebBrowser is often not the best solution to a problem. It uses a lot of RAM and has overhead you could avoid.

phone:WebBrowser handle mail link

I have a WebBrowser control:
<phone:WebBrowser Name="ArticleContent" Navigating="ArticleContent_Navigating" Navigated="ArticleContent_Navigated" />
And I get the article from the server as an HTML string:
string Article = "<p>Sometext</p><span style=\"font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;mso-fareast-font-family:&quot;Arial Unicode MS&quot;; mso-fareast-language:LV\">artjomgsd@inbox.lv</span>";
I do this:
ArticleContent.NavigateToString(Article);
And I have this function to hide the loading icon:
private void ArticleContent_Navigated(object sender, NavigationEventArgs e)
{
HideLoading();
}
And this function handles links (it opens them in an external browser):
private void ArticleContent_Navigating(object sender, NavigatingEventArgs e)
{
e.Cancel = true;
WebBrowserTask webBrowserTask = new WebBrowserTask();
webBrowserTask.Uri = new Uri(e.Uri.ToString(), UriKind.Absolute);
webBrowserTask.Show();
}
My question is: why does nothing happen when I tap the e-mail hyperlink? It doesn't even enter the ArticleContent_Navigating() function.
P.S. I want to open a mail task when the mail hyperlink is tapped.
Unfortunately, mailto: is not supported by the WebBrowser control on Windows Phone.
What you can do is inject JavaScript into the HTML that enumerates all a tags and wires up an onclick event. That event calls window.external.Notify, which in turn raises the ScriptNotify event of the WebBrowser, with the URL as a parameter.
It is a little complicated but I think it's the only option for dealing with the mailto protocol on Windows Phone.
Here is some sample code:
// Page Constructor
public MainPage()
{
InitializeComponent();
browser.IsScriptEnabled = true;
browser.ScriptNotify += browser_ScriptNotify;
browser.Loaded += browser_Loaded;
}
void browser_Loaded(object sender, RoutedEventArgs e)
{
// Sample HTML code
string html = @"<html><head></head><body><a href='mailto:test@test.com'>Send an email</a></body></html>";
// Script that raises the ScriptNotify event via window.external.Notify
string notifyJS = @"<script type='text/javascript' language='javascript'>
window.onload = function() {
var links = document.getElementsByTagName('a');
for(var i=0;i<links.length;i++) {
links[i].onclick = function() {
window.external.Notify(this.href);
}
}
}
</script>";
// Inject the Javascript into the head section of the HTML document
html = html.Replace("<head>", string.Format("<head>{0}{1}", Environment.NewLine, notifyJS));
browser.NavigateToString(html);
}
void browser_ScriptNotify(object sender, NotifyEventArgs e)
{
if (!string.IsNullOrEmpty(e.Value))
{
string href = e.Value.ToLower();
if (href.StartsWith("mailto:"))
{
EmailComposeTask email = new EmailComposeTask();
email.To = href.Replace("mailto:", string.Empty);
email.Show();
}
}
}
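The handler above lowercases the whole href and strips "mailto:" with Replace, which also changes the case of the address and keeps any query part such as ?subject=. A minimal sketch of a more careful extraction follows; `MailtoLink` and `GetRecipient` are hypothetical names introduced here, not part of the Windows Phone API.

```csharp
using System;

public static class MailtoLink
{
    private const string Scheme = "mailto:";

    // Returns the recipient part of a mailto: link, or null if href is not one.
    public static string GetRecipient(string href)
    {
        if (string.IsNullOrEmpty(href) ||
            !href.StartsWith(Scheme, StringComparison.OrdinalIgnoreCase))
        {
            return null;
        }

        string rest = href.Substring(Scheme.Length);

        // Drop an optional query part such as ?subject=Hello.
        int query = rest.IndexOf('?');
        if (query >= 0)
        {
            rest = rest.Substring(0, query);
        }

        // Decode percent-escapes such as %40 for '@'.
        return Uri.UnescapeDataString(rest);
    }
}
```

In browser_ScriptNotify, the result could be assigned as `email.To = MailtoLink.GetRecipient(e.Value);` after checking that it is not null.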
The problem is with your HTML tag. Just write http:// before mailto:artjomgsd@inbox.lv in your link and it will work fine,
like below:
"<p>Sometext</p><span style=\"font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;mso-fareast-font-family:&quot;Arial Unicode MS&quot;; mso-fareast-language:LV\">artjomgsd@inbox.lv</span>";
I tested this code in my application and it works now.

Why doesn't the webpage appear? What is missing in my code?

I'm trying to make my own web browser with C#.
My WPF application seems to be correct, but something is still missing:
the web page doesn't appear.
Does anyone have an idea?
Here's my C# code:
public partial class Window1 : Window
{
public Window1()
{
InitializeComponent();
}
private void textBox1_TextChanged(object sender, TextChangedEventArgs e)
{
}
private void button1_Click(object sender, RoutedEventArgs e)
{
WebBrowser web = new WebBrowser();
web.NavigateToString (textBox1.Text);
}
}
Thanks for your help.
As I understand, you are instantiating a new WebBrowser control in code and you aren't adding it as a control to the actual form. You'd better add the control in design view and just do the method call in the code.
When you create the WebBrowser, try adding a third line:
WebBrowser web = new WebBrowser();
Content = web; // extra line
web.NavigateToString (textBox1.Text);
If the textbox is your address bar, it won't work. NavigateToString will interpret what's in your textbox as literal HTML.
web.NavigateToString (textBox1.Text);
should be
web.Source = new Uri(textBox1.Text, UriKind.Absolute);
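new Uri(text, UriKind.Absolute) throws when the user types an address without a scheme, such as www.example.com. A minimal sketch of a more forgiving conversion follows; `AddressBar` and `ToUri` are hypothetical names, and prefixing http:// when no scheme is typed is an assumption about the desired behaviour.

```csharp
using System;

public static class AddressBar
{
    // Returns an absolute Uri for the typed text, or null if none can be built.
    public static Uri ToUri(string text)
    {
        if (string.IsNullOrWhiteSpace(text))
        {
            return null;
        }

        text = text.Trim();

        // No scheme typed (e.g. "www.example.com"): assume http://.
        if (!text.Contains("://"))
        {
            text = "http://" + text;
        }

        Uri uri;
        return Uri.TryCreate(text, UriKind.Absolute, out uri) ? uri : null;
    }
}
```

In the click handler this could be used as `var uri = AddressBar.ToUri(textBox1.Text); if (uri != null) web.Navigate(uri);`.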
