C# parsing web site with ajax loaded content - c#

If I retrieve a web site with this call, I get the whole page, but without the AJAX-loaded values.
htmlDoc.LoadHtml(new WebClient().DownloadString(url));
Is it possible to load the web site as Google Chrome does, with all values present?

You can use a WebBrowser control to get and render the page. Unfortunately, the control uses Internet Explorer and you have to change a registry value in order to force it to use the latest version and even then the implementation is very brittle.
Another option is to take a standalone browser engine like WebKit and make it work in .NET. I found a page explaining how to do this, but it's pretty dated: http://webkitdotnet.sourceforge.net/basics.php
I worked on a little demo app to get the content and this is what I came up with:
using System;
using System.IO;
using System.Text;
using NReco.PhantomJS;

class Program
{
    static void Main(string[] args)
    {
        GetRenderedWebPage("https://siderite.dev", TimeSpan.FromSeconds(5), output =>
        {
            Console.Write(output);
            File.WriteAllText("output.txt", output);
        });
        Console.ReadKey();
    }

    // Renders the page in PhantomJS and passes the resulting HTML to callBack.
    private static void GetRenderedWebPage(string url, TimeSpan waitAfterPageLoad, Action<string> callBack)
    {
        const string cEndLine = "All output received";
        var sb = new StringBuilder();
        var p = new PhantomJS();
        // PhantomJS output arrives line by line; the marker line signals the end.
        p.OutputReceived += (sender, e) =>
        {
            if (e.Data == cEndLine)
            {
                callBack(sb.ToString());
            }
            else
            {
                sb.AppendLine(e.Data);
            }
        };
        p.RunScript(@"
var page = require('webpage').create();
page.viewportSize = { width: 1920, height: 1080 };
page.onLoadFinished = function(status) {
    if (status=='success') {
        setTimeout(function() {
            console.log(page.content);
            console.log('" + cEndLine + @"');
            phantom.exit();
        }," + waitAfterPageLoad.TotalMilliseconds + @");
    }
};
var url = '" + url + @"';
page.open(url);", new string[0]);
    }
}
This uses the PhantomJS "headless" browser by way of the wrapper NReco.PhantomJS which you can get through "reference NuGet package" directly from Visual Studio. I am sure it can be done better, but this is what I did today. You might want to take a look at the PhantomJS callbacks so you can properly debug what is going on. My example will wait forever if the URL doesn't work, for example. Here is a useful link: https://newspaint.wordpress.com/2013/04/25/getting-to-the-bottom-of-why-a-phantomjs-page-load-fails/
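One way to avoid the indefinite wait mentioned above is to have the script itself report a failed load. The following is only a sketch: PhantomScript and Build are hypothetical names (they are not part of NReco.PhantomJS), and it assumes the same end-marker convention as cEndLine in the example and a marker that contains no quotes. When onLoadFinished reports anything other than 'success', the script prints the marker and exits, so the C# callback fires with empty output instead of never firing.

```csharp
using System;

public static class PhantomScript
{
    // Builds a PhantomJS script that prints the rendered page followed by a marker line.
    // If the page fails to load, it prints only the marker and exits with code 1.
    public static string Build(string url, TimeSpan wait, string endMarker)
    {
        // Escape backslashes and single quotes so the URL cannot break the JS string literal.
        string safeUrl = url.Replace("\\", "\\\\").Replace("'", "\\'");
        int waitMs = (int)wait.TotalMilliseconds;
        return @"
var page = require('webpage').create();
page.onLoadFinished = function(status) {
    if (status !== 'success') {
        console.log('" + endMarker + @"');
        phantom.exit(1);
        return;
    }
    setTimeout(function() {
        console.log(page.content);
        console.log('" + endMarker + @"');
        phantom.exit();
    }, " + waitMs + @");
};
page.open('" + safeUrl + @"');";
    }
}
```

The returned string could be passed to p.RunScript in place of the inline script.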

No, it's not possible with your example, since it only downloads the content as a string. You would need to render that string in a browser engine, or find a component that does that for you.
I would suggest looking into AbotX; they just announced this feature, so it may be interesting for you, but it is not free.

Related

Searching using the Google custom search API and displaying links

I am working on a personal assistant for home automation. So far it has basic features such as searching Wolfram Alpha and pulling weather conditions/forecasts, but I want to enable it to search for things on Google and display the results on screen.
After searching around the community, it seems the recommended way is to use the Google Search API (which has been replaced with the Google Custom Search API). I have looked at some examples and am able to get the data out into a data grid on the Windows form; however, I want to show clickable links. How can I do this? I already have an API key and CX to use with the code but cannot get the proper output.
GoogleSearch search = new GoogleSearch()
{
Key = "KEY HERE",
CX = "CX HERE"
};
search.SearchCompleted += (a, b) =>
{
this.DataGridResults.ItemsSource = b.Response.Items;
};
search.Search(search_query.Text);
So I solved this problem after working on it for a long time. Turns out I was just using the list the method returned wrong. I attached a link to the original post that gave me the method and my completed solution which just outputs the titles and HTML links in a text box. You can do whatever you like with them from there.
private void Button_Click_1(object sender, RoutedEventArgs e)
{
    GoogleSearch search = new GoogleSearch()
    {
        Key = "API KEY HERE",
        CX = "CX GOES HERE"
    };
    search.SearchCompleted += (a, b) =>
    {
        // Append each result's title and link to the text box.
        foreach (Item i in b.Response.Items)
        {
            results_box.Text = results_box.Text + Environment.NewLine + "Page Title: " + i.Title;
            results_box.Text = results_box.Text + Environment.NewLine + "Link to Page " + i.Link;
        }
    };
    search.Search(search_query.Text);
}
The method and original post can be found at http://kiwigis.blogspot.com/2011/03/google-custom-search-in-c.html

How to work with the BING REST Api

How exactly do you use the Bing REST API (specifically the Routes part) to get a driving distance in ASP.NET?
I have searched high and low on Google for this answer and none is forthcoming.
I have found url strings such as:
http://dev.virtualearth.net/REST/v1/Routes/Driving?waypoint.0=redmond&heading=90&waypoint.1=seattle&du=mi&key=BingMapsKey
That's great! But how to call it from ASP?
I have also found this code:
private void GetResponse(Uri uri, Action<HttpResponse> callback)
{
WebClient wc = new WebClient();
wc.OpenReadCompleted += (o, a) =>
{
if (callback != null)
{
DataContractJsonSerializer ser = new DataContractJsonSerializer(typeof(HttpResponse));
callback(ser.ReadObject(a.Result) as HttpResponse);
}
};
wc.OpenReadAsync(uri);
}
Which is a "generic method to make web requests". But, again, how do you call it? I find it confusing that it doesn't require a return type.
In order to call it, I have found code like this:
string key = "YOUR_BING_MAPS_KEY or SESSION_KEY";
string query = "1 Microsoft Way, Redmond, WA";
Uri geocodeRequest = new Uri(string.Format("http://dev.virtualearth.net/REST/v1/Locations?q={0}&key={1}", query, key));
GetResponse(geocodeRequest, (x) =>
{
Console.WriteLine(x.ResourceSets[0].Resources.Length + " result(s) found.");
Console.ReadLine();
});
But when I add this to the project, I get every error under the sun coming up. So, I am stuck.
I am a total ASP beginner and haven't found any of the online documentation to be of any help at all.
p.s. I do have a BING api key and do use it in the code above.
I am not an expert in this, but the code below compiles for me. Also make sure to add the data contracts as mentioned in the Bing documentation:
protected void Page_Load(object sender, EventArgs e)
{
string key = "YOUR KEY";
string query = "ADDRESS";
Uri geocodeRequest = new Uri(string.Format("http://dev.virtualearth.net/REST/v1/Locations?q={0}&key={1}", query, key));
GetResponse(geocodeRequest, (x) =>
{
Console.WriteLine(x.ResourceSets[0].Resources.Length + " result(s) found.");
Console.ReadLine();
});
}
Quoting from another stackoverflow question:
The bottom of the documentation you are using points to the Data contracts you need for the REST services which are available here: http://msdn.microsoft.com/en-us/library/jj870778.aspx
Simply create an empty C# file and copy and paste in the C# Data Contracts. Then add the namespace to this class:
using BingMapsRESTService.Common.JSON;
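Since the question asks about the Routes part specifically, here is a minimal sketch of building the request URI for a driving route. It reuses only the URL shape and query parameters from the example URL in the question (waypoint.0, waypoint.1, du, key); BingRoutes and BuildDrivingRouteUri are hypothetical names used for illustration. Escaping each value keeps characters such as spaces, commas, or & in an address from corrupting the query string.

```csharp
using System;

public static class BingRoutes
{
    // Builds a Routes/Driving request URI, with distances reported in miles (du=mi).
    public static Uri BuildDrivingRouteUri(string from, string to, string key)
    {
        string url = string.Format(
            "http://dev.virtualearth.net/REST/v1/Routes/Driving?waypoint.0={0}&waypoint.1={1}&du=mi&key={2}",
            Uri.EscapeDataString(from),
            Uri.EscapeDataString(to),
            Uri.EscapeDataString(key));
        return new Uri(url);
    }
}
```

The resulting Uri could be passed to GetResponse in the same way as geocodeRequest; the distance itself would then be read from the deserialized response using the data contracts mentioned above.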

Invoke JavaScript from C# code behind [duplicate]

This question already has answers here:
Calling JavaScript Function From CodeBehind
(21 answers)
Closed 9 years ago.
I am trying to learn asp.net. Assuming that I have this code:
if (command.ExecuteNonQuery() == 0)
{
// JavaScript like alert("true");
}
else
{
// JavaScript like alert("false");
}
How can I invoke JavaScript from the C# code-behind? And how can I do that when the JavaScript is placed in the Scripts directory that Visual Studio creates by default?
Here is a method I use from time to time to send a pop-up message from the code-behind. I try to avoid having to do this, but sometimes I need to.
private void LoadClientScriptMessage(string message)
{
    StringBuilder script = new StringBuilder();
    script.Append(@"<script language='javascript'>");
    script.Append(@"alert('" + message + "');");
    script.Append(@"</script>");
    Page.ClientScript.RegisterStartupScript(this.GetType(), "messageScript", script.ToString());
}
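One caveat with the method above: if message contains a single quote or a line break, the generated alert(...) call is no longer valid JavaScript. Below is a minimal sketch of one way to guard against that, assuming the project references System.Web; ScriptText and BuildAlert are hypothetical names used only for illustration.

```csharp
using System.Web;

public static class ScriptText
{
    // Returns an alert(...) statement with the message encoded for use inside a JS string literal.
    public static string BuildAlert(string message)
    {
        return "alert('" + HttpUtility.JavaScriptStringEncode(message) + "');";
    }
}
```

The result could be handed to Page.ClientScript.RegisterStartupScript with the addScriptTags argument set to true, so the script tags do not need to be written by hand.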
You can use RegisterStartupScript to load a javascript function from CodeBehind.
Please note that the JavaScript will only run on the client side, once the page has been rendered in the client's browser.
Regular Page
Page.ClientScript.RegisterStartupScript(this.GetType(), "myfunc" + UniqueID,
"myJavascriptFunction();", true);
Ajax Page
You need to use ScriptManager if you use ajax.
ScriptManager.RegisterStartupScript(Page, Page.GetType(), "myfunc" + UniqueID,
"myJavascriptFunction();", true);
Usually these "startupscripts" are handy for translations or passing settings to javascript.
Although the solution Mike provided is correct on the .Net side I doubt in a clean (read: no spaghetti code) production environment this is a good practice. It would be better to add .Net variables to a javascript object like so:
// GA example
public static string GetAnalyticsSettingsScript()
{
var settings = new StringBuilder();
var logged = ProjectContext.CurrentUser != null ? "Logged" : "Not Logged";
var account = Configuration.Configuration.GoogleAnalyticsAccount;
// check the required objects since it might not yet exist
settings.AppendLine("Project = window.Project || {};");
settings.AppendLine("Project.analytics = Project.analytics || {};");
settings.AppendLine("Project.analytics.settings = Project.analytics.settings || {};");
settings.AppendFormat("Project.analytics.settings.account = '{0}';", account);
settings.AppendLine();
settings.AppendFormat("Project.analytics.settings.logged = '{0}';", logged);
settings.AppendLine();
return settings.ToString();
}
And then use the common Page.ClientScript.RegisterStartupScript to add it to the HTML.
private void RegisterAnalyticsSettingsScript()
{
string script = GoogleAnalyticsConfiguration.GetAnalyticsSettingsScript();
if (!string.IsNullOrEmpty(script))
{
Page.ClientScript.RegisterStartupScript(GetType(), "AnalyticsSettings", script, true);
}
}
On the JavaScript side it might look like this:
// IIFE
(function($){
// 1. CONFIGURATION
var cfg = {
trackingSetup: {
account: "UA-xxx-1",
allowLinker: true,
domainName: "auto",
siteSpeedSampleRate: 100,
pluginUrl: "//www.google-analytics.com/plugins/ga/inpage_linkid.js"
},
customVariablesSetup: {
usertype: {
slot: 1,
property: "User_type",
value: "Not Logged",
scope: 1
}
}
};
// 2. DOM PROJECT OBJECT
window.Project = window.Project || {};
window.Project.analytics = {
init: function(){
// loading ga.js here with ajax
},
activate: function(){
var proj = this,
account = proj.settings.account || cfg.trackingSetup.account,
logged = proj.settings.logged || cfg.customVariablesSetup.usertype.value;
// override the cfg with settings from .net
cfg.trackingSetup.account = account;
cfg.customVariablesSetup.usertype.value = logged;
// binding events, and more ...
}
};
// 3. INITIALIZE ON LOAD
Project.analytics.init();
// 4. ACTIVATE ONCE THE DOM IS READY
$(function () {
Project.analytics.activate();
});
}(jQuery));
The advantage with this setup is you can load an asynchronous object and override the settings of this object by .Net. Using a configuration object you directly inject javascript into the object and override it when found.
This approach allows me to easily get translation strings, settings, and so on ...
It requires a little knowledge of both .NET and JavaScript.
Please note that the real power of this approach lies in the "direct initialization" and "delayed activation". This is necessary because you might not know when (during loading of the page) these objects are live. The delay helps to override the proper objects.
This might be a long shot, but sometimes I need a c# property/value from the server side displaying or manipulated on the client side.
c# code behind page
public string Name {get; set;}
JavaScript on Aspx page
var name = '<%=Name%>';
Populating to client side is generally easier, depending on your issue. Just a thought!

Most reliable way to run Javascript in C#?

First I tried to run from a WebBrowser Control
WebBrowser webBrowser1 = new WebBrowser();
webBrowser1.Visible = false;
webBrowser1.Navigate("about:blank");
webBrowser1.Document.Write("<html><head></head><body></body></html>");
HtmlElement head = webBrowser1.Document.GetElementsByTagName("head")[0];
dynamic scriptEl = webBrowser1.Document.CreateElement("script");
scriptEl.DomElement.text = "function test(fn) { try{ window[fn](); } catch(ex) { return 'abc '.trim(); } }"
+ "function sayHello() { alert('ha'); throw 'error with spaces '; }";
head.AppendChild(scriptEl);
var result = webBrowser1.Document.InvokeScript("test", new object[] { "sayHello" });
It works almost perfectly. It knows what a window, alert is... The only problem is that it apparently runs on ECMA3, so when I tested "abc ".trim() it couldn't execute.
My second attempt was Javascript .NET.
using (JavascriptContext context = new JavascriptContext())
{
// Setting external parameters for the context
//context.SetParameter("console", new SystemConsole());
context.SetParameter("message", "Hello World ! ");
// Script
string script = @"
alert(message.trim());
";
// Running the script
context.Run(script);
}
Unfortunately it doesn't know what alert, window, document, console... are, unless I define them by setting context parameters.
What else is there? Maybe I should try some headless browsers and invoke them using Process?
If you want to run JavaScript server side, I would recommend using PhantomJS. It allows you to run a full WebKit browser from the command line using JavaScript and command line arguments.
JavaScript is definitely not just for client-side scripting any more. As Cameron said PhantomJS is excellent if you need the DOM. If you don't, NodeJS is the clear choice with a wealth of libraries.

C# desktop application doesn't share my physical location

I am trying to get my current location (latitude and longitude) in a web application. It works fine with the following HTML5 code.
<!DOCTYPE html>
<html>
<body>
<p id="demo">Click the button to get your coordinates:</p>
<button onclick="getLocation()">Try It</button>
<script>
var x = document.getElementById("demo");
function getLocation()
{
if (navigator.geolocation)
{
navigator.geolocation.getCurrentPosition(showPosition);
}
else
{
x.innerHTML = "Geolocation is not supported by this browser.";
}
}
function showPosition(position)
{
x.innerHTML="Latitude: " + position.coords.latitude +
"<br>Longitude: " + position.coords.longitude;
}
</script>
</body>
</html>
But I want to get the user's latitude and longitude in a desktop app. There is no option to use JavaScript in a desktop app, so I am trying to access it using the WebBrowser control.
When I try to access the web page above from the desktop application using the WebBrowser control (IE10), it doesn't share the physical location, and nothing happens when I call the script by clicking the button.
Can anyone help me get my location (latitude and longitude) in a desktop app (C#)?
I'm posting another answer to the question as this involves a completely different approach.
The Context
When JavaScript tries to access the location object in IE 10, you are presented with the security bar asking you to allow access to your location. The difference for a file that is on the local drive or a network share is that you are not offered the option to always allow access, only to allow it once (Allow once).
For whatever reason, this security bar doesn't show up in the WebBrowser control (I tried setting the Information Bar Handling for the application's .exe, but it seems to have no effect).
This is why every time when the script executes nothing happens in the web browser control. It is actually blocked by the information bar.
The Solution
What needs to be done:
Emulate a web server inside the application. I've used a Simple C# Web Server class to serve the content. This way, even if there is no web server on the local machine, we can intercept requests to a specific URL and port and serve the content we want.
Add the test1.html document to the project and use its content in the server response. Just add the file to your project, next to the "Program.cs" file, and set its Copy to Output Directory property to Copy always.
How It Works
First, we need to instantiate a web browser control. Then, we navigate to the test1.html file. When the document is loaded, we first check whether the web server has been instantiated. If it has not, we read the web browser's HTML source into the response variable, pass it to the WebServer constructor, and start the server.
Passing http://localhost:9999/ registers that prefix with the HttpListener, so every request to this address will be served by our simple web server.
Next, we navigate to that address. When the web server receives the request, it delivers the content of the _staticContent variable, which had its value assigned in the web server's constructor.
After the server delivers the document to the web browser, the webBrowser1_DocumentCompleted handler is triggered. But this time, we already have the web server's instance, so execution goes through the else branch. The important thing to notice is that we will asynchronously wait for the JavaScript to execute, get the location and save it to the hidden input elements in the HTML.
One important remark: the first time you launch the application you will not get any location. You first have to leave the application open, so that you have the custom HTTP listener available, and then, perform the steps I described in my other answer, the browsing location being http://localhost:9999. Once you do that, close and reopen the application.
That's it. Every time you run the application, you will get the location coordinates in a message box.
The Form1 class file (Form1.cs):
public partial class Form1 : Form
{
WebServer _ws;
WebBrowser _webBrowser1;
public Form1()
{
InitializeComponent();
_webBrowser1 = new WebBrowser();
_webBrowser1.Visible = false;
var location = Assembly.GetExecutingAssembly().Location;
_webBrowser1.Navigate(System.IO.Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location) + @"\test1.html");
_webBrowser1.DocumentCompleted += webBrowser1_DocumentCompleted;
}
private void Form1_Load(object sender, EventArgs e)
{
}
async void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
if (_ws == null)
{
var html = _webBrowser1.Document.GetElementsByTagName("html");
var response = html[0].OuterHtml;
_ws = new WebServer(response, "http://localhost:9999/");
_ws.Run();
_webBrowser1.Navigate("http://localhost:9999/");
}
else
{
string latitude = "";
string longitude = "";
await Task.Factory.StartNew(() =>
{
while (string.IsNullOrEmpty(latitude))
{
System.Threading.Thread.Sleep(1000);
if (this.InvokeRequired)
{
this.Invoke((MethodInvoker)delegate
{
var latitudeEl = _webBrowser1.Document.GetElementById("latitude");
var longitudeEl = _webBrowser1.Document.GetElementById("longitude");
latitude = latitudeEl.GetAttribute("value");
longitude = longitudeEl.GetAttribute("value");
});
}
}
});
MessageBox.Show(String.Format("Latitude: {0} Longitude: {1}", latitude, longitude));
}
}
// credits for this class go to David
// http://www.codehosting.net/blog/BlogEngine/post/Simple-C-Web-Server.aspx
public class WebServer
{
private readonly HttpListener _listener = new HttpListener();
static string _staticContent;
public WebServer(string[] prefixes, string content)
{
_staticContent = content;
foreach (string s in prefixes)
_listener.Prefixes.Add(s);
_listener.Start();
}
public WebServer(string content, params string[] prefixes)
: this(prefixes, content) { }
public void Run()
{
ThreadPool.QueueUserWorkItem((o) =>
{
try
{
while (_listener.IsListening)
{
ThreadPool.QueueUserWorkItem((c) =>
{
var ctx = c as HttpListenerContext;
try
{
byte[] buf = Encoding.UTF8.GetBytes(_staticContent);
ctx.Response.ContentLength64 = buf.Length;
ctx.Response.OutputStream.Write(buf, 0, buf.Length);
}
catch { } // suppress any exceptions
finally
{
// always close the stream
ctx.Response.OutputStream.Close();
}
}, _listener.GetContext());
}
}
catch { } // suppress any exceptions
});
}
public void Stop()
{
_listener.Stop();
_listener.Close();
}
}
}
The HTML source (test1.html)
<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
<title></title>
<meta http-equiv="X-UA-Compatible" content="IE=10" />
<script type="text/javascript">
window.onload = function () {
var latitude = document.getElementById("latitude");
var longitude = document.getElementById("longitude");
function getLocation() {
if (navigator.geolocation) {
navigator.geolocation.getCurrentPosition(showPosition);
}
else { }
}
function showPosition(position) {
latitude.value = position.coords.latitude;
longitude.value = position.coords.longitude;
}
getLocation();
}
</script>
</head>
<body>
<input type="hidden" id="latitude" />
<input type="hidden" id="longitude" />
</body>
</html>
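The polling loop in webBrowser1_DocumentCompleted above keeps running for as long as the latitude field stays empty, for example if location access is denied. A bounded variant could look like the sketch below. Poller and PollAsync are hypothetical names, and the attempt limit is an assumption that is not part of the original answer.

```csharp
using System;
using System.Threading.Tasks;

public static class Poller
{
    // Calls read() up to maxAttempts times, waiting interval between attempts.
    // Returns the first non-empty value, or null if every attempt came back empty.
    public static async Task<string> PollAsync(Func<string> read, TimeSpan interval, int maxAttempts)
    {
        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            string value = read();
            if (!string.IsNullOrEmpty(value))
            {
                return value;
            }
            await Task.Delay(interval);
        }
        return null;
    }
}
```

When awaited from a WinForms event handler, the continuation normally resumes on the UI thread, so read could query _webBrowser1.Document directly, for example reading the "value" attribute of the latitude element.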
This could happen because the WebBrowser control uses the compatibility mode for a previous version of Internet Explorer.
You have the possibility to set the default emulation mode for Internet Explorer per application by using the FEATURE_BROWSER_EMULATION feature. This is how you actually set the compatibility mode for the WebBrowser control in your own application.
You may follow the indications from the link below in order to configure it:
Internet Feature Controls (B..C)
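As a rough illustration of the FEATURE_BROWSER_EMULATION setting described above, the sketch below writes the per-application value under the current user's registry hive. BrowserEmulation, ModeFor, and Apply are hypothetical names. The sketch assumes a Windows desktop application and that the standards-mode value is the IE major version times 1000 (for example 10000 for IE10), and it would need to run before the first WebBrowser control is created.

```csharp
using System.Diagnostics;
using System.IO;
using Microsoft.Win32;

public static class BrowserEmulation
{
    const string KeyPath =
        @"Software\Microsoft\Internet Explorer\Main\FeatureControl\FEATURE_BROWSER_EMULATION";

    // Standards-mode emulation value for a given IE major version, e.g. 10 -> 10000.
    public static int ModeFor(int ieMajorVersion)
    {
        return ieMajorVersion * 1000;
    }

    // Registers the emulation mode for the current executable (Windows only).
    public static void Apply(int ieMajorVersion)
    {
        string exeName = Path.GetFileName(Process.GetCurrentProcess().MainModule.FileName);
        using (RegistryKey key = Registry.CurrentUser.CreateSubKey(KeyPath))
        {
            // The value name is the executable's file name; the data is a DWORD.
            key.SetValue(exeName, ModeFor(ieMajorVersion), RegistryValueKind.DWord);
        }
    }
}
```

Calling BrowserEmulation.Apply(10) at the start of Main would request IE10 standards mode for this application only, without changing the setting for other programs.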
[UPDATE]
Go to Internet Options -> Privacy
Under the Location section, make sure that Never allow websites to request your physical location is unchecked
Click on Clear Sites
Open Internet Explorer (not your application) and browse to the URL of the file containing the geolocation script
Trigger the getLocation() function (in your case, click on the Try It button)
When the browser shows the security bar in the lower part of the window, containing the message "yourSite wants to know your physical location", click on Options for this site and choose Always allow.
That would be it.
