I have a list that contains paths to HTML files on my PC. I would like to loop through this list and print all of them, in the same order as they appear in the list.
I tried to loop over the code that I found on msdn.microsoft.com for printing an HTML file.
List<string> AllHTMLsToPrint = new List<string>();
// things added to AllHTMLsToPrint list
foreach (string strHTMLToPrint in AllHTMLsToPrint)
{
    PrintHelpPage(strHTMLToPrint);
}

private void PrintHelpPage(string strHTMLToPrint)
{
    // Create a WebBrowser instance.
    WebBrowser webBrowserForPrinting = new WebBrowser();
    // Add an event handler that prints the document after it loads.
    webBrowserForPrinting.DocumentCompleted +=
        new WebBrowserDocumentCompletedEventHandler(PrintDocument);
    // Set the Url property to load the document.
    webBrowserForPrinting.Url = new Uri(strHTMLToPrint);
    Thread.Sleep(100);
}

private void PrintDocument(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    // Print the document now that it is fully loaded.
    ((WebBrowser)sender).Print();
    // Dispose the WebBrowser now that the task is complete.
    ((WebBrowser)sender).Dispose();
}
You have a design problem here. You walk your list of HTML pages, open each page in a browser, and print it once it has loaded.
BUT...
Loading a page may take longer than 100 ms, and that is all the time you wait before the loop starts loading the next page. Because each page is printed only when its own load completes, the print order is not guaranteed to match the list order. You should change your code so that the next page starts loading only after the current one has been printed. Instead of a loop, keep an index and increment it after each print.
It should look similar to this (not tested):
List<string> AllHTMLsToPrint = new List<string>();
private int index = 0;

// Start the chain with the first page (call this from a method, e.g. a button handler).
PrintHelpPage(AllHTMLsToPrint[index]);

private void PrintHelpPage(string strHTMLToPrint)
{
    // Create a WebBrowser instance.
    WebBrowser webBrowserForPrinting = new WebBrowser();
    // Add an event handler that prints the document after it loads.
    webBrowserForPrinting.DocumentCompleted +=
        new WebBrowserDocumentCompletedEventHandler(PrintDocument);
    // Set the Url property to load the document.
    webBrowserForPrinting.Url = new Uri(strHTMLToPrint);
}

private void PrintDocument(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    // Print the document now that it is fully loaded.
    ((WebBrowser)sender).Print();
    // Only now move on to the next page, if there is one.
    if (index < AllHTMLsToPrint.Count - 1)
        PrintHelpPage(AllHTMLsToPrint[++index]);
}
You've stated that you have a bunch of local HTML files.
Loading local HTML files may not work by setting the Url property.
You could try setting the DocumentStream property instead. strHTMLToPrint must then contain the full or relative path to your local HTML file.
webBrowserForPrinting.DocumentStream = File.OpenRead(strHTMLToPrint);
I'm not sure what the exact issue is, but I would put this into a background worker so that you don't hold up the main thread. I'd also move the loop into the document-completed handler, so that as soon as one page has loaded and printed, it moves on to the next.
That said, you haven't told us what your code is failing to do.
public partial class Form1 : Form
{
    internal List<string> AllHTMLsToPrint = new List<string>();

    public Form1()
    {
        InitializeComponent();
    }

    public void StartPrinting()
    {
        // things added to AllHTMLsToPrint list; please note you may need to add
        // file:/// to the URI list if it is a local file, unless it is compact framework

        // start printing the first item
        // Caveat: WebBrowser needs an STA thread with a message loop, and DoWork runs on a
        // thread-pool (MTA) thread, so creating the control there may throw ThreadStateException.
        BackgroundWorker bgw = new BackgroundWorker();
        bgw.DoWork += bgw_DoWork;
        bgw.RunWorkerAsync();
        /*foreach (string strHTMLToPrint in AllHTMLsToPrint)
        {
            PrintHelpPage(strHTMLToPrint);
        }*/
    }

    void bgw_DoWork(object sender, DoWorkEventArgs e)
    {
        PrintHelpPage(AllHTMLsToPrint[0], (BackgroundWorker)sender);
    }

    private void PrintHelpPage(string strHTMLToPrint, BackgroundWorker bgw)
    {
        // Create a WebBrowser instance.
        WebBrowser webBrowserForPrinting = new WebBrowser();
        // Add an event handler that prints the document after it loads.
        webBrowserForPrinting.DocumentCompleted += (s, ev) => {
            webBrowserForPrinting.Print();
            webBrowserForPrinting.Dispose();
            // you can add progress reporting here
            // remove the first element and see if we have to do it all again
            AllHTMLsToPrint.RemoveAt(0);
            if (AllHTMLsToPrint.Count > 0)
                PrintHelpPage(AllHTMLsToPrint[0], bgw);
        };
        // Set the Url property to load the document.
        webBrowserForPrinting.Url = new Uri(strHTMLToPrint);
    }
}
I am trying to use the CefSharp browser in C# WinForms. I need to know how to tell when a page has completely loaded, and how to get the browser document and its HTML elements.
I just initialize the browser and don't know what I should do next:
public Form1()
{
    InitializeComponent();
    Cef.Initialize(new CefSettings());
    browser = new ChromiumWebBrowser("http://google.com");
    BrowserContainer.Controls.Add(browser);
    browser.Dock = DockStyle.Fill;
}
CefSharp has a LoadingStateChanged event with LoadingStateChangedEventArgs.
LoadingStateChangedEventArgs has a property called IsLoading, which indicates whether the page is still loading.
You should be able to subscribe to it like this:
browser.LoadingStateChanged += OnLoadingStateChanged;
The method would look like this:
private void OnLoadingStateChanged(object sender, LoadingStateChangedEventArgs args)
{
    if (!args.IsLoading)
    {
        // Page has finished loading, do whatever you want here
    }
}
I believe you can get the page source like this:
string HTML = await browser.GetSourceAsync();
You'd probably need to get to grips with something like HtmlAgilityPack to parse it. I'm not going to cover that, as it's off topic.
I ended up using:
using CefSharp;
wbAuthorization.AddressChanged += OnAddressChanged;
and
private void OnAddressChanged(
    object s,
    AddressChangedEventArgs e)
{
    if (e.Address.StartsWith(EndUri))
    {
        ResultUri = new Uri(e.Address);
        this.DialogResult = DialogResult.OK;
    }
}
EndUri is the final page I want to examine and ResultUri contains a string I want to extract later. Just some example code from a larger class.
I want to get the HTML code of a website. In a browser I can usually just click "View Page Source" in the context menu or something similar. But how can I automate it? I've tried it with the WebBrowser class, but sometimes it doesn't work. I am not a web developer, so I don't really know whether my approach even makes sense. I think the main problem is that I sometimes get the HTML before all of the page's scripts have run, so it is incomplete. I have this problem with, for example, this site: http://www.sreality.cz/en/search/for-sale/praha
My code (I've tried to keep it small but runnable on its own):
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;
using System.Windows.Forms;

namespace WebBrowserForm
{
    internal static class Program
    {
        [STAThread]
        private static void Main()
        {
            Application.EnableVisualStyles();
            Application.SetCompatibleTextRenderingDefault(false);
            for (int i = 0; i < 10; i++)
            {
                Form1 f = new Form1();
                f.ShowDialog();
            }
            // Now I can check Form1.List and see that some html is final and some is not
        }
    }

    public class Form1 : Form
    {
        public static List<string> List = new List<string>();
        private const string Url = "http://www.sreality.cz/en/search/for-sale/praha";
        private System.Windows.Forms.WebBrowser webBrowser1;

        public Form1()
        {
            this.webBrowser1 = new System.Windows.Forms.WebBrowser();
            this.SuspendLayout();
            this.webBrowser1.Dock = System.Windows.Forms.DockStyle.Fill;
            this.webBrowser1.Name = "webBrowser1";
            this.webBrowser1.TabIndex = 0;
            this.ResumeLayout(false);
            Load += new EventHandler(Form1_Load);
            this.webBrowser1.ObjectForScripting = new MyScript();
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            webBrowser1.Navigate(Url);
            webBrowser1.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(webBrowser1_DocumentCompleted);
        }

        private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            if (webBrowser1.ReadyState == WebBrowserReadyState.Complete)
            {
                // Final html for 99% of web pages, but unfortunately not for all
                string tst = webBrowser1.Document.GetElementsByTagName("HTML")[0].OuterHtml;
                webBrowser1.DocumentCompleted -= new WebBrowserDocumentCompletedEventHandler(webBrowser1_DocumentCompleted);
                Application.DoEvents();
                webBrowser1.Navigate("javascript: window.external.CallServerSideCode();");
                Application.DoEvents();
            }
        }

        [ComVisible(true)]
        public class MyScript
        {
            public void CallServerSideCode()
            {
                HtmlDocument doc = ((Form1)Application.OpenForms[0]).webBrowser1.Document;
                string renderedHtml = doc.GetElementsByTagName("HTML")[0].OuterHtml;
                // here I sometimes get full html but sometimes the same as in webBrowser1_DocumentCompleted method
                List.Add(renderedHtml);
                ((Form1)Application.OpenForms[0]).Close();
            }
        }
    }
}
I would expect to get the final HTML in the webBrowser1_DocumentCompleted method. It usually works, but not with this site. So I tried to get the HTML from my own code, called from a script executed inside the page: the CallServerSideCode method. What is strange is that sometimes I get the final HTML (basically the same as if I did it manually in a browser) and sometimes I don't. I think the cause is that my script starts before the whole page has been rendered rather than after. But I am not really sure, since this kind of thing is far from my comfort zone and I don't really understand what I am doing. I'm just trying to apply something I found on the internet.
So, does anyone know what is wrong with the code? Or, more importantly, how to easily get the final HTML from the site?
Any help is appreciated.
You should use the WebClient class to download the HTML page. No display control is necessary.
The method you want is DownloadString.
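A minimal sketch of that suggestion, using the URL from the question. The UTF-8 encoding is an assumption; check what the real site serves. Keep in mind that DownloadString returns the markup as the server sends it, so anything the page builds with JavaScript after loading will not be in the result.

```csharp
using System;
using System.Net;
using System.Text;

internal static class DownloadHtmlSketch
{
    private static void Main()
    {
        using (var client = new WebClient())
        {
            // Assumption: the page is UTF-8 encoded.
            client.Encoding = Encoding.UTF8;

            // Blocks until the whole response body has been downloaded.
            string html = client.DownloadString("http://www.sreality.cz/en/search/for-sale/praha");

            Console.WriteLine("Downloaded {0} characters.", html.Length);
        }
    }
}
```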
Maybe it will help if you append the call to your external function to the end of the body and wrap it in jQuery's DOM-ready function. I mean something like this:
private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    if (webBrowser1.ReadyState == WebBrowserReadyState.Complete)
    {
        // Final html for 99% of web pages, but unfortunately not for all
        string tst = webBrowser1.Document.GetElementsByTagName("HTML")[0].OuterHtml;
        webBrowser1.DocumentCompleted -= new WebBrowserDocumentCompletedEventHandler(webBrowser1_DocumentCompleted);
        HtmlElement body = webBrowser1.Document.GetElementsByTagName("body")[0];
        HtmlElement scriptEl = webBrowser1.Document.CreateElement("script");
        // IHTMLScriptElement comes from the mshtml interop assembly (using mshtml;).
        IHTMLScriptElement element = (IHTMLScriptElement)scriptEl.DomElement;
        // $(...) only works if the page itself loads jQuery.
        element.text = "$(function() { window.external.CallServerSideCode(); });";
        body.AppendChild(scriptEl);
    }
}

[ComVisible(true)]
public class MyScript
{
    public void CallServerSideCode()
    {
        HtmlDocument doc = ((Form1)Application.OpenForms[0]).webBrowser1.Document;
        string renderedHtml = doc.GetElementsByTagName("HTML")[0].OuterHtml;
        // here I sometimes get full html but sometimes the same as in webBrowser1_DocumentCompleted method
        List.Add(renderedHtml);
        ((Form1)Application.OpenForms[0]).Close();
    }
}
I am developing a WinForms (C#) application that reads HTML from a website.
When I click the button, textBox1 does not get new text every second. It waits until the end of the foreach loop.
What I want is for the text box to be updated once per second.
How can I do that?
This is the code that runs when button1 is clicked:
private void button1_Click(object sender, EventArgs e)
{
    string url = "http://truyentranh8.com/danh_sach_truyen/";
    var web = new HtmlWeb();
    var doc = web.Load(url);
    foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//tbody/tr/td[@class='tit']/a[@class='tipsy']"))
    {
        textBox1.Text += node.InnerText + "\n";
        Thread.Sleep(1000);
    }
}
Thread.Sleep in your case puts the main (UI) thread to sleep. The UI can't be updated until that thread is released, which only happens when the button1_Click method returns. So you don't see the text change every second; all you see is the text being updated all at once.
So make it asynchronous. If you're using .NET 4.5, you can use async/await and make life simple.
private async void button1_Click(object sender, EventArgs e)
{
    string url = "http://truyentranh8.com/danh_sach_truyen/";
    var web = new HtmlWeb();
    var doc = web.Load(url);
    foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//tbody/tr/td[@class='tit']/a[@class='tipsy']"))
    {
        textBox1.Text += node.InnerText + "\n";
        // Yields the UI thread for one second instead of blocking it.
        await Task.Delay(1000);
    }
}
If you are interested, I have written an article on this subject.
Do not use Thread.Sleep on an event thread for this task.
The problem is that the UI is not getting a chance to update as it redraws on the thread that is blocked. As such the UI update only appears after all the thread-blocking code ends and the Click handler is exited.
Use an appropriate Timer instead, or if feeling hackish, read up about DoEvents. Alternatively, consider doing the long running task in a BackgroundWorker - the UserState of the Progress event can be used to report partial updates, already marshaled back to the appropriate thread.
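The Timer suggestion could be sketched roughly as follows. This is only an illustration under assumptions: the form, the ShowOnePerSecond method and the pending queue are hypothetical names, and the scraped titles are assumed to have been collected into a sequence of strings beforehand. System.Windows.Forms.Timer raises Tick on the UI thread, so the text box can be updated directly.

```csharp
using System;
using System.Collections.Generic;
using System.Windows.Forms;

public class TimerForm : Form
{
    private readonly TextBox textBox1 = new TextBox { Multiline = true, Dock = DockStyle.Fill };
    // Fires once per second on the UI thread.
    private readonly System.Windows.Forms.Timer timer = new System.Windows.Forms.Timer { Interval = 1000 };
    // Lines that still have to be shown (hypothetical helper state).
    private readonly Queue<string> pending = new Queue<string>();

    public TimerForm()
    {
        Controls.Add(textBox1);
        timer.Tick += OnTick;
    }

    // Hypothetical entry point: pass in the node texts collected from the page.
    public void ShowOnePerSecond(IEnumerable<string> lines)
    {
        foreach (string line in lines)
            pending.Enqueue(line);
        timer.Start();
    }

    private void OnTick(object sender, EventArgs e)
    {
        if (pending.Count == 0)
        {
            // Nothing left to show, so stop ticking.
            timer.Stop();
            return;
        }
        textBox1.AppendText(pending.Dequeue() + Environment.NewLine);
    }
}
```

Between ticks the UI thread is free, so the form keeps repainting and responding to input.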
Use Application.DoEvents() to let the form refresh each time you change something on it:
private void button1_Click(object sender, EventArgs e)
{
    string url = "http://truyentranh8.com/danh_sach_truyen/";
    var web = new HtmlWeb();
    var doc = web.Load(url);
    foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//tbody/tr/td[@class='tit']/a[@class='tipsy']"))
    {
        textBox1.Text += node.InnerText + "\n";
        Application.DoEvents();
    }
}
private void PrintHelpPage()
{
    // Create a WebBrowser instance.
    WebBrowser webBrowserForPrinting = new WebBrowser();
    WebBrowser webBrowserForPrinting1 = new WebBrowser();
    // Add an event handler that prints the document after it loads.
    webBrowserForPrinting.DocumentCompleted +=
        new WebBrowserDocumentCompletedEventHandler(PrintDocument);
    webBrowserForPrinting1.DocumentCompleted +=
        new WebBrowserDocumentCompletedEventHandler(PrintDocument);
    // Set the Url property to load the document.
    webBrowserForPrinting.Url = new Uri(@"F:\fichinha.html");
    webBrowserForPrinting1.Url = new Uri(@"F:\fichinha2.html");
}

private void PrintDocument(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    // Print the document now that it is fully loaded.
    ((WebBrowser)sender).Print();
    // Dispose the WebBrowser now that the task is complete.
    ((WebBrowser)sender).Dispose();
}
}
I have this code for printing an HTML file. What happens is:
Some of the letters do not appear, both special and non-special characters.
Example: on one of the pages, "Agora pode consultar" appears as "Agora pode cons tar".
Have you set the encoding, e.g. UTF-8? Note that HtmlDocument.Encoding takes the encoding name as a string:
webBrowserForPrinting.Document.Encoding = "utf-8";
and the same for webBrowserForPrinting1.
I have a list view with a list of sites to check. The checked sites should be loaded one by one, and each should be processed only after its document has fully loaded. This is what I do:
private void submitBtn_Click(object sender, EventArgs e)
{
    int i = 0;
    foreach (ListViewItem item in sitesList.Items)
    {
        // Record the item's own position; advance i for every item, checked or not.
        if (item.Checked) indices.Add(i);
        i++;
    }
    Thread thread = new Thread(new ThreadStart(submit));
    thread.Start();
}

private void submit()
{
    foreach (int i in indices)
    {
        SiteInfo currentSite = sites[i];
        if (currentSite.AuthOn)
        {
            inLoadingState = true;
            webBrowser.Navigate(currentSite.LoginPage);
            loginToSite(currentSite);
        }
    }
}
Then I handle the DocumentCompleted event of the WebBrowser control. Currently, the program attempts to log in before the document has loaded. Please advise on the best way to make the thread wait until the document is loaded.
Thanks in advance!
It looks like you would want to do this in the DocumentCompleted event handler.
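One way this could look, as a rough sketch rather than a tested solution: queue the pages and start the next navigation only from the DocumentCompleted handler, so no extra thread has to wait. SiteRunner, Start and the onLoaded callback are hypothetical names introduced here; onLoaded is where a call such as loginToSite would go. Start is assumed to be called on the UI thread that owns the WebBrowser.

```csharp
using System;
using System.Collections.Generic;
using System.Windows.Forms;

public class SiteRunner
{
    private readonly WebBrowser webBrowser;
    // Hypothetical callback, invoked once for each fully loaded page.
    private readonly Action<WebBrowser> onLoaded;
    private readonly Queue<string> pendingUrls = new Queue<string>();

    public SiteRunner(WebBrowser webBrowser, Action<WebBrowser> onLoaded)
    {
        this.webBrowser = webBrowser;
        this.onLoaded = onLoaded;
        this.webBrowser.DocumentCompleted += OnDocumentCompleted;
    }

    // Call from the UI thread with the login pages of the checked sites.
    public void Start(IEnumerable<string> urls)
    {
        foreach (string url in urls)
            pendingUrls.Enqueue(url);
        NavigateNext();
    }

    private void NavigateNext()
    {
        if (pendingUrls.Count > 0)
            webBrowser.Navigate(pendingUrls.Dequeue());
    }

    private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // DocumentCompleted can fire for individual frames; wait for the whole page.
        if (webBrowser.ReadyState != WebBrowserReadyState.Complete)
            return;

        onLoaded(webBrowser);   // the document is loaded at this point
        NavigateNext();         // only now move on to the next site
    }
}
```

If the login step itself triggers another navigation (for example a form submit), the sketch would need an extra state flag so that the next site is not requested until that page has completed as well.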