First I tried to run from a WebBrowser Control
WebBrowser webBrowser1 = new WebBrowser();
webBrowser1.Visible = false;
webBrowser1.Navigate("about:blank");
webBrowser1.Document.Write("<html><head></head><body></body></html>");
HtmlElement head = webBrowser1.Document.GetElementsByTagName("head")[0];
dynamic scriptEl = webBrowser1.Document.CreateElement("script");
scriptEl.DomElement.text = "function test(fn) { try{ window[fn](); } catch(ex) { return 'abc '.trim(); } }"
+ "function sayHello() { alert('ha'); throw 'error with spaces '; }";
head.AppendChild(scriptEl);
var result = webBrowser1.Document.InvokeScript("test", new object[] { "sayHello" });
It works almost perfectly. It knows what window and alert are. The only problem is that it apparently runs ECMAScript 3, so when I tested "abc ".trim() it failed to execute.
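One possible workaround, assuming the control really is running scripts in an old document mode without String.prototype.trim, is to prepend a small polyfill to the injected script text. This is only a sketch: es3Trim is a hypothetical helper name, and the regular expression is the commonly used ES3-compatible one. (The WebBrowser control usually defaults to an old Internet Explorer compatibility mode, so raising the document mode may also help, but that is not shown here.)

```javascript
// Hypothetical helper: trims whitespace using only ES3 features.
function es3Trim(value) {
  return String(value).replace(/^[\s\uFEFF\xA0]+|[\s\uFEFF\xA0]+$/g, '');
}

// Install the polyfill only when the engine lacks a native trim.
if (typeof String.prototype.trim !== 'function') {
  String.prototype.trim = function () {
    return es3Trim(this);
  };
}
```

With this text placed ahead of the test and sayHello functions, 'abc '.trim() should resolve even on an engine that predates ES5.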
My second attempt was Javascript .NET.
using (JavascriptContext context = new JavascriptContext())
{
// Setting external parameters for the context
//context.SetParameter("console", new SystemConsole());
context.SetParameter("message", "Hello World ! ");
// Script
string script = @"
alert(message.trim());
";
// Running the script
context.Run(script);
}
Unfortunately, it doesn't know what alert, window, document, or console are unless I supply them as context parameters.
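To illustrate why a bare engine behaves this way, here is a small sketch using Node's built-in vm module as a stand-in for an embedded engine (it is not Javascript .NET). The sandbox object plays the role of the context parameters, and the alert shim and the shown array are hypothetical names chosen for this example.

```javascript
var vm = require('vm');

// Everything the script can see must be supplied by the host.
var shown = [];
var sandbox = {
  message: 'Hello World ! ',
  // Hypothetical shim: record the text instead of opening a dialog.
  alert: function (text) {
    shown.push(String(text));
  }
};
vm.createContext(sandbox);

// The language itself (including trim) is available...
vm.runInContext('alert(message.trim());', sandbox);

// ...but browser objects such as window are not, unless the host adds them.
var windowType = vm.runInContext('typeof window', sandbox);
```

The same idea applies to any standalone engine: alert, console, and similar objects are host features, so each one has to be provided explicitly.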
What else is there? Maybe I should try a headless browser and invoke it using Process?
If you want to run JavaScript server side, I would recommend using PhantomJS. It allows you to run a full WebKit browser from the command line using JavaScript and command line arguments.
JavaScript is definitely not just for client-side scripting any more. As Cameron said PhantomJS is excellent if you need the DOM. If you don't, NodeJS is the clear choice with a wealth of libraries.
Related
If I retrieve a web page with this function, I get the whole page, but without the AJAX-loaded values.
htmlDoc.LoadHtml(new WebClient().DownloadString(url));
Is it possible to load the web page as Google Chrome does, with all values?
You can use a WebBrowser control to get and render the page. Unfortunately, the control uses Internet Explorer and you have to change a registry value in order to force it to use the latest version and even then the implementation is very brittle.
Another option is to take a standalone browser engine like WebKit and make it work in .NET. I found a page explaining how to do this, but it's pretty dated: http://webkitdotnet.sourceforge.net/basics.php
I worked on a little demo app to get the content and this is what I came up with:
class Program
{
static void Main(string[] args)
{
GetRenderedWebPage("https://siderite.dev", TimeSpan.FromSeconds(5), output =>
{
Console.Write(output);
File.WriteAllText("output.txt", output);
});
Console.ReadKey();
}
private static void GetRenderedWebPage(string url, TimeSpan waitAfterPageLoad, Action<string> callBack)
{
const string cEndLine= "All output received";
var sb = new StringBuilder();
var p = new PhantomJS();
p.OutputReceived += (sender, e) =>
{
if (e.Data==cEndLine)
{
callBack(sb.ToString());
} else
{
sb.AppendLine(e.Data);
}
};
p.RunScript(@"
var page = require('webpage').create();
page.viewportSize = { width: 1920, height: 1080 };
page.onLoadFinished = function(status) {
if (status=='success') {
setTimeout(function() {
console.log(page.content);
console.log('" + cEndLine + @"');
phantom.exit();
}," + waitAfterPageLoad.TotalMilliseconds + @");
}
};
var url = '" + url + @"';
page.open(url);", new string[0]);
}
}
This uses the PhantomJS "headless" browser by way of the wrapper NReco.PhantomJS which you can get through "reference NuGet package" directly from Visual Studio. I am sure it can be done better, but this is what I did today. You might want to take a look at the PhantomJS callbacks so you can properly debug what is going on. My example will wait forever if the URL doesn't work, for example. Here is a useful link: https://newspaint.wordpress.com/2013/04/25/getting-to-the-bottom-of-why-a-phantomjs-page-load-fails/
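As a rough sketch of how the "waits forever" case might be avoided, the PhantomJS script could report failures and still print the end marker, so the C# side always gets its callback. The helper name wirePage and the 'LOAD FAILED' and 'RESOURCE ERROR' prefixes are assumptions made for this example. The guard at the bottom only runs inside PhantomJS, and the 5000 ms wait stands in for waitAfterPageLoad.

```javascript
// Mirrors cEndLine in the C# code.
var END_MARKER = 'All output received';

// Hypothetical helper: attaches handlers so that the script always terminates.
function wirePage(page, exit, log, waitMs) {
  // Report individual resource failures to help with debugging.
  page.onResourceError = function (resourceError) {
    log('RESOURCE ERROR: ' + resourceError.url + ' ' + resourceError.errorString);
  };
  page.onLoadFinished = function (status) {
    if (status !== 'success') {
      // Emit the end marker on failure too, so the caller is not left waiting.
      log('LOAD FAILED: ' + status);
      log(END_MARKER);
      exit(1);
      return;
    }
    setTimeout(function () {
      log(page.content);
      log(END_MARKER);
      exit(0);
    }, waitMs);
  };
}

// Only executed when the file runs inside PhantomJS.
if (typeof phantom !== 'undefined') {
  var page = require('webpage').create();
  wirePage(page,
    function (code) { phantom.exit(code); },
    function (text) { console.log(text); },
    5000);
  page.open('https://siderite.dev');
}
```

With this shape the callback receives the error lines instead of page content when the load fails, so the caller would need to check for the 'LOAD FAILED' prefix.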
No, it's not possible in your example, since it loads the content as a plain string. You need to render that string in a browser engine, or find a component that does that for you.
I would suggest looking into AbotX. They just announced this feature, so it may be of interest to you, but it's not free.
I am trying to use a WebBrowser control in my application, in which I want to block scripts and frames.
I used the extended web browser control in this answer to have access to download control flags.
So, I used it as follows in the form constructor:
webBrowser1.DownloadControlFlags = (int)WebBrowserDownloadControlFlags.DLIMAGES
+ (int)WebBrowserDownloadControlFlags.NOFRAMES
+ (int)WebBrowserDownloadControlFlags.NO_SCRIPTS
+ (int)WebBrowserDownloadControlFlags.NO_FRAMEDOWNLOAD
+ (int)WebBrowserDownloadControlFlags.NO_JAVA
+ (int)WebBrowserDownloadControlFlags.NO_DLACTIVEXCTLS
+ (int)WebBrowserDownloadControlFlags.NO_BEHAVIORS
+ (int)WebBrowserDownloadControlFlags.NO_RUNACTIVEXCTLS
+(int)WebBrowserDownloadControlFlags.SILENT;
It seems to work, but I have a certain injected script that I still want to run. I inject it after the document has loaded (in the DocumentCompleted event):
IHTMLDocument2 doc2 = webBrowser1.Document.DomDocument as IHTMLDocument2;
IHTMLScriptElement script = (IHTMLScriptElement)doc2.createElement("SCRIPT");
script.type = "text/javascript";
script.text = @"// Highlight Words Script ....";
IHTMLElementCollection nodes = doc.getElementsByTagName("head");
foreach (IHTMLElement elem in nodes)
{
//Append script
HTMLHeadElement head = (HTMLHeadElement)elem;
head.appendChild((IHTMLDOMNode)script);
}
But it doesn't run when I call it:
wb.Document.InvokeScript("findString", new string[] { toWord });
How can I run my script while I have suppressed running the document scripts?
Can I let scripts run but block script errors and undesired behaviours using other flags?
The question says it all. I have everything wired up and know how to send messages from the browser HTML to C#, but not the other way.
I should be able to do something like:
browserControl.JSCall("myFunction('Dave','Smith');");
...and in the web code:
function myFunction(firstName, lastName) {
$("#mydiv").text(firstName + ' ' + lastName);
}
Thanks - Dave
You can do this using Navigate:
browserControl.Navigate("javascript:void(myFunction('Dave','Smith'))");
Note, I find that the code isn't actually run until the application event loop executes. If that's a problem for you, you might be able to follow the Navigate call with
Application.DoEvents();
Make sure you consider the dangers of calling DoEvents explicitly.
You can use the AutoJSContext class, so there is no need to pass JavaScript to Navigate().
string outString = "";
using (Gecko.AutoJSContext java = new Gecko.AutoJSContext(geckoWebBrowser1.JSContext))
{
java.EvaluateScript(@"window.alert('alert')", out outString );
}
@SturmCoder and @DavidCornelson are right,
but it seems that in version 60.0.24.0
geckoWebBrowser1.JSCall()
and
Gecko.AutoJSContext(), when passed geckoWebBrowser1.JSContext,
are obsolete; instead of geckoWebBrowser1.JSContext you should pass geckoWebBrowser1.Window.
This code works for me:
string result = "";
using (Gecko.AutoJSContext js= new Gecko.AutoJSContext(geckoWebBrowser1.Window))
{
js.EvaluateScript("myFunction('Dave','Smith');", out result);
}
Or, if the website loads jQuery, you can call it like this:
string result = "";
using (Gecko.AutoJSContext js= new Gecko.AutoJSContext(geckoWebBrowser1.Window))
{
js.EvaluateScript(@"alert($('#txt_username').val())", out result);
}
Besides using the Navigate method, there is another workaround:
var script = geckofx.Document.CreateElement("script");
script.TextContent = js;
geckofx.Document.GetElementsByTagName("head").First().AppendChild(script);
I have an aspx page which has some javascript code like
<script>
setTimeout("document.write('" + place.address + "');",1);
</script>
As the code shows, it writes something to the page after a very short delay of 1 ms. I have created another page that requests this page with a query string and reads its output. The problem is:
I cannot avoid the delay. Simply calling document.write(place.address); prints nothing, because the values take a little time to arrive; if I wrap the call in setTimeout with a 1 ms delay, it always returns a value.
If I request the output from another page using
System.Net.WebClient wc = new System.Net.WebClient();
System.IO.StreamReader sr = new System.IO.StreamReader(wc.OpenRead("http://localhost:4859/Default.aspx?lat=" + lat + "&lng=" + lng));
string strData = sr.ReadToEnd();
I get the source code of the document instead of the desired output.
I would like to either avoid that delay or delay the output of the client request, so that I get the desired value rather than the source code.
The JS on default.aspx is
<script type="text/javascript">
var geocoder;
var address;
function initialize() {
geocoder = new GClientGeocoder();
var qs=new Querystring();
if(qs.get("lat") && qs.get("lng"))
{
geocoder.getLocations(new GLatLng(qs.get("lat"),qs.get("lng")),showAddress);
}
else
{
document.write("Invalid Access Or Not valid lat long is provided.");
}
}
function getAddress(overlay, latlng) {
if (latlng != null) {
address = latlng;
geocoder.getLocations(latlng, showAddress);
}
}
function showAddress(r) {
place = r.Placemark[0];
setTimeout("document.write('" + place.address + "');",1);
//document.write(place.address);
}
</script>
and the code on requestClient.aspx is
System.Net.WebClient wc = new System.Net.WebClient();
System.IO.StreamReader sr = new System.IO.StreamReader(wc.OpenRead("http://localhost:4859/Default.aspx?lat=" + lat + "&lng=" + lng));
string strData = sr.ReadToEnd();
I'm not a JavaScript expert, but I believe using document.write after the page has finished loading is a bad thing. You should be creating an html element that your JavaScript can manipulate, once the calculation is complete.
Elaboration
In your page markup, create a placeholder for where you want the address to appear:
<p id="address">Placeholder For Address</p>
In your JavaScript function, update that placeholder:
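function showAddress(r) {
var place = r.Placemark[0];
// Pass a function rather than a code string, so quotes in the address cannot break the script
setTimeout(function () { document.getElementById('address').innerHTML = place.address; }, 1);
}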
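A note on the string form of setTimeout used in the question's showAddress: because the address is spliced into source code, an address containing an apostrophe produces text that no longer parses. The sketch below illustrates this outside the browser. buildTimeoutCode, parses, and scheduleAddress are hypothetical helpers, and the sample addresses are made up.

```javascript
// Hypothetical helper mirroring the string concatenation style.
function buildTimeoutCode(address) {
  return "document.getElementById('address').innerHTML = '" + address + "';";
}

// Returns true when the text parses as JavaScript (it is compiled, not executed).
function parses(code) {
  try {
    new Function(code);
    return true;
  } catch (e) {
    return false;
  }
}

var plainOk = parses(buildTimeoutCode('12 Main St'));
var quoteOk = parses(buildTimeoutCode("5 O'Connell St"));

// Passing a function instead keeps the address as data, never as code.
function scheduleAddress(address, setText, schedule) {
  schedule(function () {
    setText(address);
  }, 1);
}
```

In a page, scheduleAddress would be called with setTimeout as schedule and a function that writes to the placeholder element as setText.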
string strData = sr.ReadToEnd();
I get the source code of the document instead of the desired output
(Could you give a sample of the output? I don't think I've seen a web scraper work that way, so that would help me be sure. If not, this is a good example of a web scraper.)
What exactly are you doing with the string strData? If you are just writing it out, I recommend putting it in a server-side control (such as a Literal). If at all possible, I'd recommend doing this server-side in .NET rather than waiting 1 ms in JavaScript, since 1 ms may or may not be enough time on a particular user's machine (hence "client side"). If I had to do it client-side in a case like this, I would use the element.onload event to determine whether the page has finished loading.
I'm doing some web automation via C# and a WebBrowser. There's a link which I need to 'click', but since it fires a Javascript function, apparently the code needs to be executed rather than just having the element clicked (i.e. element.InvokeMember("click")). Here's the href for the element, which opens an Ajax form:
javascript:__doPostBack("ctl00$cphMain$lnkNameserverUpdate", "")
I've tried:
webBrowser1.Document.InvokeScript("javascript:__doPostBack", new object[] { "ctl00$cphMain$lnkNameserverUpdate", "" });
and:
webBrowser1.Document.InvokeScript("__doPostBack", new object[] { "ctl00$cphMain$lnkNameserverUpdate", "" });
and a few other things. The code gets hit, but the script doesn't get fired. Any ideas would be most appreciated.
Gregg
BTW Here's the full element in case it's useful:
NS51.DOMAINCONTROL.COM<br/>NS52.DOMAINCONTROL.COM<br/>
Have a look at this link:
http://msdn.microsoft.com/en-us/library/system.windows.forms.webbrowser.objectforscripting.aspx
I've actually used this in the past, and it works perfectly.
HtmlDocument doc = browser.Document;
HtmlElement head = doc.GetElementsByTagName("head")[0];
HtmlElement s = doc.CreateElement("script");
s.SetAttribute("text","function sayHello() { alert('hello'); }");
head.AppendChild(s);
browser.Document.InvokeScript("sayHello");