PhantomJS: save a webpage containing dynamic data as PDF - C#

I have used a jQuery plugin to generate charts on screen, along with dynamic data and notes.
I need to generate a PDF of this webpage on a button click.
I have searched for a good option but haven't been able to find a proper one.
Right now I am trying to use PhantomJS for this. It works for internet sites but not for intranet sites.
I keep getting "Unable to load the address".
Can anyone suggest a way to achieve this, or an alternative way to generate a PDF from HTML content? (My HTML contains SVG and JavaScript-generated dynamic data too.)
Code:
string serverPath = "C:\\Phantomjs\\bin\\phantomjs";
var phantomJS = new PhantomJS();

// The output PDF goes next to the PhantomJS binaries.
var outFile = Path.Combine(serverPath, "google2.pdf");
if (File.Exists(outFile))
    File.Delete(outFile);

try {
    // rasterize.js takes the source URL and the output file as arguments.
    phantomJS.Run(Path.Combine(serverPath, "rasterize.js"),
        new[] { "http://localhost:61362/HT.aspx?FNo=D1&PrNo=Dummy1&HtId=1033", outFile });
} finally {
    phantomJS.Abort();
}
The URL does not load in this case, but any internet-hosted site works.
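One way to narrow this down is to run phantomjs.exe directly so its console output is visible, since "Unable to load the address" usually points at proxy or SSL settings rather than at the wrapper. A minimal diagnostic sketch, assuming the install path from the question (the flag values would need adjusting for the actual environment):

// Diagnostic sketch: invoke phantomjs.exe directly to surface network errors.
// --proxy-type=none bypasses a system proxy that may not resolve intranet hosts;
// --ignore-ssl-errors=true helps with self-signed intranet certificates.
var psi = new System.Diagnostics.ProcessStartInfo {
    FileName = @"C:\Phantomjs\bin\phantomjs\phantomjs.exe",
    Arguments = "--proxy-type=none --ignore-ssl-errors=true rasterize.js " +
                "\"http://localhost:61362/HT.aspx?FNo=D1&PrNo=Dummy1&HtId=1033\" out.pdf",
    WorkingDirectory = @"C:\Phantomjs\bin\phantomjs",
    UseShellExecute = false,
    RedirectStandardError = true
};
using (var p = System.Diagnostics.Process.Start(psi)) {
    Console.WriteLine(p.StandardError.ReadToEnd()); // PhantomJS reports load failures here
    p.WaitForExit();
}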

Related

How to make a complex file download functionality with Blazor server?

I have a Blazor server application. I want to allow the user to download files but the content of the files needs to be built dynamically.
Basically the application shows reports to the user based on filters etc., and I want the user to have the option to download whatever he is currently seeing.
I know I can make a "link button" that points to a Razor page that returns some sort of FileContentResult in its OnGet method but I have no idea how to pass any data to that so that the correct report file can be built.
I know there is an alternative that uses JavaScript but, as far as I know, it's more cumbersome and I'm not sure if it is any better.
I thought about making a request to some sort of REST/WebAPI (which would allow me to pass arguments and such), but I cannot seem to get a WebAPI project and a Blazor Server project to run at the same time. The only partial success I've had is adding a WebAPI project to my Blazor Server solution and starting both. But then, while debugging, for some reason both processes stop when I download the file.
Also, the application must be hosted on an Azure Web App, and I am not sure how feasible it would be to run both projects at the same time.
So, how can I make my Blazor Server allow the user to download a file but generate the file dynamically based on what the user is seeing on his browser?
The JavaScript alternative is very straightforward.
export function saveAsFile(filename, bytesBase64) {
    if (navigator.msSaveBlob) {
        // Download document in Edge browser
        var data = window.atob(bytesBase64);
        var bytes = new Uint8Array(data.length);
        for (var i = 0; i < data.length; i++) {
            bytes[i] = data.charCodeAt(i);
        }
        var blob = new Blob([bytes.buffer], { type: "application/octet-stream" });
        navigator.msSaveBlob(blob, filename);
    }
    else {
        var link = document.createElement('a');
        link.download = filename;
        link.href = "data:application/octet-stream;base64," + bytesBase64;
        document.body.appendChild(link); // Needed for Firefox
        link.click();
        document.body.removeChild(link);
    }
}
Create a byte stream of whatever content you want to produce and call this function through IJSInterop. I have used it in Blazor Server and it works well.
Please note: I found this piece of code online but I don't remember where from. I will be happy to give credit to the original author if someone knows the source.
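For completeness, a minimal sketch of the C# side, assuming the function above is exported from a JS module at wwwroot/js/saveFile.js (that path and the report-building helper are placeholders, not part of the original answer):

// Razor component code-behind sketch; JS is an injected IJSRuntime.
[Inject] private IJSRuntime JS { get; set; }

private async Task DownloadReportAsync()
{
    byte[] reportBytes = BuildReportForCurrentFilters(); // hypothetical report builder

    // Import the module that exports saveAsFile, then hand it the file name
    // and the content encoded as base64.
    await using var module = await JS.InvokeAsync<IJSObjectReference>(
        "import", "./js/saveFile.js");
    await module.InvokeVoidAsync("saveAsFile", "report.xlsx",
        Convert.ToBase64String(reportBytes));
}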

C# HtmlAgilityPack: class nested inside many classes, need to check if a class exists

I have been working on this for days without a solution.
For example, I have this link: https://www.aliexpress.com/item/32682673712.html
I am trying to check whether the Buy Now button is disabled,
i.e. whether the DOM contains this line: Buy Now
The problem is that this line is inside a class that is inside a class, and so on...
I know there is an option to get a specific node with HtmlAgilityPack, but I didn't succeed:
var nodes = doc.DocumentNode.SelectNodes("//div[@class='next-btn next-large next-btn-primary buynow disable']/p");
but I don't get anything back.
I also tried to get the entire DOM and then search within it, but didn't succeed:
var getHtmlWeb = new HtmlWeb();
var document = getHtmlWeb.Load(url);
I only got the raw HTML, not the rendered DOM.
Another thing I tried:
var Driver = new FirefoxDriver();
Driver.Navigate().GoToUrl(url);
string pagesource = Driver.PageSource;
and it did work! But this solution opens a browser window, and I don't want that (I am running over many links).
Please help a frustrated guy :)
Thanks.
This is happening because the Buy Now button is loaded via JavaScript.
If you open the network tab in Chrome dev tools, you will notice that the page makes a call to an API to load the product information.
The url with json data for the product looks like this:
https://www.aliexpress.com/aeglodetailweb/api/store/header?itemId=32682673712&categoryId=100007324&sellerAdminSeq=202460364&storeNum=720855&minPrice=4.99&maxPrice=4.99&priceCurrency=USD
You will most probably have to send the same headers Chrome sends for that request in order to load the API endpoint in your app.
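A minimal sketch of that approach with HttpClient is below. The header set is an assumption modeled on a typical Chrome request; the endpoint may additionally require cookies from a prior page load:

// Sketch: call the product API directly instead of parsing the rendered page.
using System;
using System.Net.Http;
using System.Threading.Tasks;

class AliExpressApiSketch
{
    static async Task Main()
    {
        var url = "https://www.aliexpress.com/aeglodetailweb/api/store/header" +
                  "?itemId=32682673712&categoryId=100007324&sellerAdminSeq=202460364" +
                  "&storeNum=720855&minPrice=4.99&maxPrice=4.99&priceCurrency=USD";

        using var client = new HttpClient();
        // Mimic the browser request; copy the exact headers (and cookies, if
        // required) from the Network tab if the endpoint rejects this set.
        client.DefaultRequestHeaders.Add("User-Agent",
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36");
        client.DefaultRequestHeaders.Add("Accept", "application/json");
        client.DefaultRequestHeaders.Add("Referer",
            "https://www.aliexpress.com/item/32682673712.html");

        string json = await client.GetStringAsync(url);
        // Inspect the JSON for the Buy Now button state (e.g. a "disable" flag).
        Console.WriteLine(json.Substring(0, Math.Min(500, json.Length)));
    }
}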

Xamarin: how to properly display a webpage reading its source code in an Android app

I am working on an Android application using Xamarin but hopefully the question below is more general and can be answered also by people not familiar with Xamarin.
The app is very simple and has one main page with a list where the user can click on any of the items.
These items represent products, and all the related information is stored in a dictionary (the key is the product name, the values are some information about the product).
Amongst the available fields, I also added a URL of the website where the user can access more information about the product.
The app has a WebView to display the page inside the app.
I can display the webpage by using:
web = FindViewById<WebView>(Resource.Id.webView);
web.SetWebViewClient(new myWebViewClient());
web.Settings.JavaScriptEnabled = true;
web.LoadUrl(item.Link);
where the item/product the user has selected is the item object having the URL saved in item.Link.
The problem is that the user needs a web connection in order to load the page. So I thought about saving the whole HTML code into a string so that the user can view the content even when no connection is available. I am doing this with:
WebClient wc = new WebClient();
using (Stream st = wc.OpenRead(web.Url))
{
    using (StreamReader sr = new StreamReader(st, Encoding.UTF8))
    {
        html = sr.ReadToEnd();
    }
}
Then each item in the dictionary will have a field item.Html with the source code of the page.
I can visualize the page with the following code:
web = FindViewById<WebView>(Resource.Id.webView);
web.SetWebViewClient(new myWebViewClient());
web.Settings.JavaScriptEnabled = true;
//web.LoadUrl(item.Link);
web.LoadData(item.Html, "text/html", "UTF-8");
The problem is that the viewer does not display the page the same way web.LoadUrl(item.Link) does (i.e. the page as you would see it in any web browser); it is much messier and harder to navigate.
What am I doing wrong? Any better way to achieve my goal?
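A likely culprit (an assumption, since the page in question isn't shown) is that LoadData has no base URL, so every relative stylesheet, script, and image reference in the saved HTML fails to resolve. A sketch using LoadDataWithBaseURL, which keeps the original link as the base so relative resources still resolve while online:

web = FindViewById<WebView>(Resource.Id.webView);
web.SetWebViewClient(new myWebViewClient());
web.Settings.JavaScriptEnabled = true;

// Passing the original URL as baseUrl lets relative CSS/JS/image paths in the
// saved HTML resolve against the site instead of silently failing.
web.LoadDataWithBaseURL(item.Link, item.Html, "text/html", "UTF-8", null);

Note this still does not make the page fully available offline, since the referenced assets themselves are not cached.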

Retrieving webpage data after some delay (web scraping)

The aim is to retrieve data from a website after it has finished its Ajax calls.
Currently the data is retrieved when the page first loads, but the required data is inside a div which is loaded after an Ajax call.
To summarize, the scenario is as follows:
A webpage is called with some parameters passed inside C# code (currently using CsQuery for C#). When the request is sent, the page opens, a "Loading" picture shows, and after a few seconds the required data is retrieved. The CsQuery code, however, retrieves the first page contents with the "Loading" picture.
the code is as follows
UrlBuilder ub = new UrlBuilder("<url>")
    .AddQuery("departure", "KHI")
    .AddQuery("arrival", "DXB")
    .AddQuery("queryDate", "2013-03-28")
    .AddQuery("queryType", "D");

CQ dom = CQ.CreateFromUrl(ub.ToString());
CQ availableFlights = dom.Select("div#availFlightsDiv");
string RenderedDiv = availableFlights["#availFlightsDiv"].RenderSelection();
When you "scrape" a site you are making a call to the web server and you get what it serves up. If the DOM of the target site is modified by javascript (ajax or otherwise) you are never going to get that content unless you load it into some kind of browser engine on the machine that is doing the scraping, that is capable of executing the javascript calls.
This is almost a year-old question, so you might have your answer already, but I would like to mention this awesome project here: SimpleBrowser.
https://github.com/axefrog/SimpleBrowser
It keeps your DOM updated.

C# Open web page in default browser with post data

I am sure this must have been answered before but I cannot find a solution, so I figure I am likely misunderstanding other people's solutions or trying to do something daft, but here we go.
I am writing an add-in for Outlook 2010 in C# where a user can click a button in the ribbon and submit the email contents to a web site. When they click the button the website should open in the default browser, thus allowing them to review what has just been submitted and interact with it on the website. I am able to do this using query strings in the URL using:
System.Diagnostics.Process.Start("http://www.test.com?something=value");
but the limit on the amount of data that can be submitted and the messy URLs are preventing me from following through with this approach. I would like to use an HTTP POST for this as it is obviously more suitable. However, the methods I have found for doing this do not seem to open the page up in the browser after submitting the post data:
http://msdn.microsoft.com/en-us/library/debx8sh9.aspx
to summarise; the user needs to be able to click the button in the Outlook ribbon, have the web browser open and display the contents of the email which have been submitted via POST.
EDIT:
Right, I found a way to do it; it's pretty fugly, but it works! Simply create a temporary .html file (which is then launched as above) containing a form with hidden fields for all the data, and have it submitted on page load with JavaScript.
I don't really like this solution as it relies on JavaScript (I have a <noscript> submit button just in case) and seems like a bit of a bodge, so I am still really hoping someone on here will come up with something better.
This is eight years late, but here's some code that illustrates the process pretty well:
string tempHTMLLocation = "some_arbitrary_location" + "/temp.html";
string url = "https://your_desired_url.com";

// create the temporary html file
using (FileStream fs = new FileStream(tempHTMLLocation, FileMode.Create)) {
    using (StreamWriter w = new StreamWriter(fs, Encoding.UTF8)) {
        // post_data1 / post_data2 hold the values to be POSTed
        w.WriteLine("<body onload=\"goToLink()\">");
        w.WriteLine("<form id=\"form\" method=\"POST\" action=\"" + url + "\">");
        w.WriteLine("<input type=\"hidden\" name=\"post1\" value=\"" + post_data1 + "\">");
        w.WriteLine("<input type=\"hidden\" name=\"post2\" value=\"" + post_data2 + "\">");
        w.WriteLine("</form>");
        w.WriteLine("<script> function goToLink() { document.getElementById(\"form\").submit(); } </script>");
        w.WriteLine("</body>");
    }
}

// launch the temp html file in the default browser
var launchProcess = new ProcessStartInfo {
    FileName = tempHTMLLocation,
    UseShellExecute = true
};
Process.Start(launchProcess);

// delete the temp file, with a delay so the Process has time to open it
Task.Delay(1500).ContinueWith(t => File.Delete(tempHTMLLocation));
Upon opening the page, the onload() JS script immediately submits the form, which posts the data to the url and opens it in the default browser.
The Dropbox client does it the same way you mentioned in your EDIT, but it also does some obfuscation, i.e. it XORs the data with the hash submitted via the URL.
Here are the steps Dropbox takes:
1. In-app: Create a token that can be used to authorize at dropbox.com.
2. In-app: Convert the token to a hex string (A).
3. In-app: Create a secure random hex string (B) of the same length.
4. In-app: Calculate C = A XOR B.
5. In-app: Create a temporary HTML file with the following functionality:
   5.1. A hidden input field which contains value B.
   5.2. A submit form with hidden input fields necessary for login to dropbox.com.
   5.3. A JS function that reads the hash from the URI, XORs it with B and writes the result to the submit form's hidden fields.
   5.4. Delete the hash from the URI.
   5.5. Submit the form.
6. In-app: Open the temporary HTML file with the standard browser and add C as the hash at the end of the URI.
Now when your browser opens the HTML file, it calculates the auth token from the hidden input field and the hash in the URI, and opens dropbox.com. And because of point 5.4 you are not able to hit the back button in your browser to log in again, because the hash is gone.
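A rough C# sketch of the in-app side of steps 2-4 (assuming .NET 6+; the names are placeholders and the token source is outside the snippet):

using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

static class XorHashSketch
{
    // Returns (B, C) where B is embedded in the temp HTML's hidden field
    // and C travels as the URI hash; the page's JS recovers A = B XOR C.
    public static (string B, string C) Obfuscate(string token)
    {
        byte[] a = Encoding.UTF8.GetBytes(token);               // step 2: token bytes
        byte[] b = RandomNumberGenerator.GetBytes(a.Length);    // step 3: secure random B
        byte[] c = a.Zip(b, (x, y) => (byte)(x ^ y)).ToArray(); // step 4: C = A XOR B

        return (Convert.ToHexString(b), Convert.ToHexString(c));
    }
}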
I'm not sure I would have constructed the solution that way. Instead, I would post all the data to a web service (using HttpWebRequest, as @Loci described, or just importing the service using Visual Studio), which would store the data in a database (perhaps with a pending status). Then direct the user (using your Process.Start approach) to a page that would display the pending help ticket, which would allow them to either approve or discard it.
It sounds like a bit more work, but it should clean up the architecture of what you are trying to do. Plus you have the added benefit of not worrying about how to trigger a form post from the client side.
Edit:
A plain ASMX web service should at least get you started. You can right-click on your project and select Add Service Reference to generate the proxy code for calling the service.
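For a concrete starting point, a bare-bones ASMX service of the kind described might look like this (the method shape and the TicketRepository data layer are hypothetical):

using System.Web.Services;

[WebService(Namespace = "http://tempuri.org/")]
public class TicketService : WebService
{
    // Stores the submitted email contents as a pending ticket and returns
    // its id so the client can open the review page for that ticket.
    [WebMethod]
    public int SubmitTicket(string subject, string body)
    {
        // Hypothetical data layer persisting the ticket with a "pending" status.
        return TicketRepository.InsertPending(subject, body);
    }
}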
