Getting the HTML content of the current page with databound controls

Getting the HTML content of the current page with databound controls - c#

I'm trying to create a control that converts the page it's on into a PDF.
protected void ConvertPageToPDF_click(object sender, EventArgs e)
{
string pageHtml;
byte[] pdfBytes;
string url = HttpContext.Current.Request.Url.AbsoluteUri;
// get the HTML for the entire page into pageHtml
HtmlTextWriter hw = new HtmlTextWriter(new StringWriter());
this.Page.RenderControl(hw);
pageHtml = hw.InnerWriter.ToString();
// send pageHtml to a library for conversion
// send the PDF to the user
}
This partially works. On my pages I have several repeaters; their content does not show up in pageHtml. Any thoughts on why that is? How should I fix this?

It turns out that I had to render the control after the data had been bound to it. My click method just adds an event handler:
protected void ConvertRepeaterToPDF_click(object sender, EventArgs e)
{
// handle the PreRender complete method
Page.PreRenderComplete += new EventHandler(Page_OnPreRenderComplete);
}
Then in Page_OnPreRenderComplete I can use RenderControl as above.

Related

Changing label text on button click that also executes other methods

I am trying to update a label on my windows form with statistics from another methods execution that scrapes a webpage and creates a zip file of the links and creates a sitemap page. Preferably this button would run the scraping operations and report the statistics properly. Currently the scraping process is working fine but the label I want to update with statistics data is not changing on the button click. Here is what my code looks like now:
protected void btn_click(object sender, EventArgs e)
{
//Run scrape work
scrape_work(sender, e);
//Run statistics work
statistics(sender, e);
}
protected void scrape_work(object sender, EventArgs e)
{
//Scraping work (works fine)
}
protected void statistics(object sender, EventArgs e)
{
int count = 0;
if (scriptBox.Text != null)
{
count += 1;
}
var extra = eventsBox.Text;
var extraArray = extra.Split('\n');
foreach (var item in extraArray)
{
count += 1;
}
//scrapeNumLbl is label I want to display text on
scrapeNumLbl.Text = count.ToString();
}
Would I have to use threading for this process or is there some other way I can get this process to work? I have already tried this solution but was having the same issue where the code runs but the label does not update. Any help would be greatly appreciated, this minor thing has been bugging me for hours now.

I eventually solved this by writing the path to the zip file to a label button on the form rather than sending it right to download on the client's browser on button click. The issue was that the request was ending after the zip file was sent for download. To ensure that both methods ran at the proper time I moved the call to the scrape_work method to inside of of statistics
In order for the path to be clickable and the file to download properly I had to make the "label" in the form a LinkButton in the .aspx page
<asp:LinkButton ID="lblDownload" runat="server" CssClass="xclass" OnClick="lblDownload_Click"></asp:LinkButton>
And made the lblDownload_Click run like the following:
Response.Clear();
Response.BufferOutput = false;
Response.ContentType = "application/zip";
Response.AddHeader("content-disposition", "attachment; filename=zipFiles.zip");
string folder = Server.MapPath("~/zip");
string endPath = folder + "/zipFiles.zip";
Response.TransmitFile(endPath);
Response.End();
Running it this way the page reloads with the new labels properly written and the zip file available to download as well.

ASSUMING THIS CODE IS RUNNING SYNCHRONOUSLY (you aren't threading the scraping in a call I don't see), Are you sure you are reaching the code that sets the label text? I simplified your code as below (removing the iteration over the array and just setting the label text to a stringified integer) and am not having any trouble changing the text of a label.
namespace SimpleTest
{
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
}
private void button1_Click(object sender, EventArgs e)
{
scrape_work(sender, e);
statistics(sender, e);
}
protected void scrape_work(object sender, EventArgs e)
{
//Scraping work (works fine)
}
protected void statistics(object sender, EventArgs e)
{
int count = 666;
scrapeNumLbl.Text = count.ToString();
}
}
}
Result:

Get final HTML content after javascript finished by Open Webkit Sharp

I'm writing a software that gets the content from URL. When working on that, I run into to problem that I can not get exactly the HTML content after the java script finished.
There are some websites that renders HTML by java-script, some do not support browsers which does not run js.
I tried using System.Windows.Controls.WebBrowser with WebBrowser.Document in LoadCompleted but no luck.
After that, I tried the OpenWebkitSharp library. On the UI, it showes the content of website correctly, but with code Document in DocumentCompleted, it still returns the content which does not rendered by java-script.
Here is my code:
...
using WebKit;
using WebKit.Interop;
public MainWindow()
{
windowFormHost = new System.Windows.Forms.Integration.WindowsFormsHost();
webBrowser = new WebKit.WebKitBrowser();
webBrowser.AllowDownloads = false;
windowFormHost.Child = webBrowser;
grdBrowserHost.Children.Add(windowFormHost);
webBrowser.Load += WebBrowser_Load;
}
private void WebBrowser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
var contentHtml = ((WebKitBrowser)sender).DocumentAsHTMLDocument;
}
The contentHtml has value which is not rendered after java-script finished.

Do solve this problem, I have added some trick into my code to get the full Html content after java-script finished.
using WebKit;
using WebKit.Interop;
using WebKit.JSCore; //We need add refrence JSCore which following with Webkit package.
public MainWindow()
{
InitializeComponent();
InitBrowser();
}
private void InitBrowser()
{
windowFormHost = new System.Windows.Forms.Integration.WindowsFormsHost();
webBrowser = new WebKit.WebKitBrowser();
webBrowser.AllowDownloads = false;
windowFormHost.Child = webBrowser;
grdBrowserHost.Children.Add(windowFormHost);
webBrowser.Load += WebBrowser_Load;
}
private void WebBrowser_Load(object sender, EventArgs e)
{
//The ResourceIntercepter will throws exception if webBrowser have not finished loading its components
//We can not use DocumentCompleted to load the Htmlcontent. Because that event will be fired before Java-script is finised
webBrowser.ResourceIntercepter.ResourceFinishedLoadingEvent += new ResourceFinishedLoadingHandler(ResourceIntercepter_ResourceFinishedLoadingEvent);
}
private void ResourceIntercepter_ResourceFinishedLoadingEvent(object sender, WebKitResourcesEventArgs e)
{
//The WebBrowser.Document still show the html without java-script.
//The trict is call Javascript (I used Jquery) to get the content of HTML
JSValue documentContent = null;
var readyState = webBrowser.GetScriptManager.EvaluateScript("document.readyState");
if (readyState != null && readyState.ToString().Equals("complete"))
{
documentContent = webBrowser.GetScriptManager.EvaluateScript("$('html').html();");
var contentHtml = documentContent.ToString();
}
}
Hope this one can help you.

Entering text in website textbox using c#

I am trying to automate fill the textbox of a website in c# and i used:
private void button1_Click(object sender, EventArgs e)
{
System.Windows.Forms.WebBrowser webBrowser = new WebBrowser();
HtmlDocument document = null;
document=webBrowser.Document;
System.Diagnostics.Process.Start("http://www.google.co.in");
document.GetElementById("lst-ib").SetAttribute("value", "ss");
}
The webpage is opening but the text box is not filled with the specified value. I have also tried innertext instead of setAttribute. I am using windows forms.

You are expecting that your webBrowser will load the page at specified address, but actually your code will start default browser (pointing at "http://www.google.co.in"), while webBrowser.Document will remain null.
try to replace the Process.Start with
webBrowser.Navigate(yourUrl);

Eliminate the Process.Start() statement (as suggested by Gian Paolo) because it starts a WebBrowser as an external process.
The problem with your code is that you want to manipulate the value of your element too fast. Wait for the website to be loaded completely:
private void button1_Click(object sender, EventArgs e)
{
System.Windows.Forms.WebBrowser webBrowser = new WebBrowser();
webBrowser.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(webBrowser_DocumentCompleted);
webBrowser.Navigate("http://www.google.co.in");
}
private void webBrowser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
webBrowser.document.GetElementById("lst-ib").SetAttribute("value", "ss");
}
Please note that using a instance of a WebBrowser is not often the best solution for a problem. It uses a lot of RAM and has some overhead you could avoid.

Exporting Data from Telerik RadGrid to Ms-Word

I'm trying to export a Telerik RadGrid to Word:
protected void DownloadWord_Click(object sender, EventArgs e)
{
aGrid.ExportSettings.Word.Format = GridWordExportFormat.Docx;
aGrid.ExportSettings.ExportOnlyData = true;
aGrid.ExportSettings.FileName = "test.docx";
aGrid.MasterTableView.ExportToWord();
Page.Response.ClearContent();
Page.Response.ClearHeaders();
}
When I click the button, the above code fires and I get a page refresh. Looking at the POST headers and response it is the same as for loading the page, so it seems to only refresh the page.
What's wrong?

The RadGrid writes the exported file into the response of the page. And since you are clearing the response and its headers, you get no file.
For your case, Only this is required:
protected void ImageButton3_Click(object sender, System.Web.UI.ImageClickEventArgs e)
{
aGrid.ExportSettings.Word.Format = Web.UI.GridWordExportFormat.Docx;
aGrid.ExportSettings.FileName = "test.docx";
aGrid.ExportSettings.ExportOnlyData = true;
aGrid.MasterTableView.ExportToWord();
}
I cant understand the use of:
Page.Response.ClearContent();
Page.Response.ClearHeaders();
Unless you have added it for some specific purpose.
See here for demo from telerik on using Export functionalities.

Converting HtmlDocument.DomDocument to string

How to convert HtmlDocument.DomDocument to string?

This example is a bit convoluted, but, assuming you have a form called Form1, with a WebBrowser control called webBrowser1, the variable content will contain the markup that forms the document:
private void Form1_Load(object sender, EventArgs e)
{
webBrowser1.Url = new Uri(#"http://www.robertwray.co.uk/");
}
private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
var document = webBrowser1.Document;
var documentAsIHtmlDocument3 = (mshtml.IHTMLDocument3)document.DomDocument;
var content = documentAsIHtmlDocument3.documentElement.innerHTML;
}
The essential "guts" of extracting it from the HtmlDocument.DomDocument is in the webBrowser1_DocumentCompleted event handler.
Note: mshtml is obtained by adding a COM reference to 'Microsoft HTML Object Library` (aka: mshtml.dll)

It would be easier to use the HtmlDocument itself, rather than its DomDocument property:
string html = htmlDoc.Body.InnerHtml;
Or even simpler, if you have access to the WebBrowser containing the document:
string html = webBrowser.DocumentText;

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Getting the HTML content of the current page with databound controls - c#

Related

Changing label text on button click that also executes other methods

Get final HTML content after javascript finished by Open Webkit Sharp

Entering text in website textbox using c#

Exporting Data from Telerik RadGrid to Ms-Word

Converting HtmlDocument.DomDocument to string

Categories

Resources