Aspx to PDF iTextSharp usage - c#

I have this code, which I merged and modified for my needs, but I still can't make it work the way I need. The first part generates a PDF from the options chosen on the aspx page. For the second part, I need a background image on the page, so I added more code, but now only the output of the second part is produced and not the PDF. I'm not able to merge the two pieces of code together.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
public partial class CreatePDFFromScratch : System.Web.UI.Page
{
protected void btnCreatePDF_Click(object sender, EventArgs e)
{
// Create a Document object
var document = new Document(iTextSharp.text.PageSize.LETTER.Rotate(), 0f, 0f, 0f, 0f);
// Create a new PdfWrite object, writing the output to a MemoryStream
var output = new MemoryStream();
var writer = PdfWriter.GetInstance(document, output);
// Open the Document for writing
document.Open();
// First, create our fonts..
var titleFont = FontFactory.GetFont("Arial", 18, Font.BOLD);
var subTitleFont = FontFactory.GetFont("Arial", 14, Font.BOLD);
var boldTableFont = FontFactory.GetFont("Arial", 12, Font.BOLD);
var endingMessageFont = FontFactory.GetFont("Arial", 10, Font.ITALIC);
var bodyFont = FontFactory.GetFont("Arial", 12, Font.NORMAL);
// Add the "Northwind Traders Receipt" title
document.Add(new Paragraph("Northwind Traders Receipt", titleFont));
// Now add the "Thank you for shopping at Northwind Traders. Your order details are below." message
document.Add(new Paragraph("Thank you for shopping at Northwind Traders. Your order details are below.", bodyFont));
document.Add(Chunk.NEWLINE);
// Add the "Order Information" subtitle
document.Add(new Paragraph("Order Information", subTitleFont));
// Create the Order Information table
var orderInfoTable = new PdfPTable(2);
orderInfoTable.HorizontalAlignment = 0;
orderInfoTable.SpacingBefore = 10;
orderInfoTable.SpacingAfter = 10;
orderInfoTable.DefaultCell.Border = 0;
orderInfoTable.SetWidths(new int[] { 1, 4 });
orderInfoTable.AddCell(new Phrase("Order:", boldTableFont));
orderInfoTable.AddCell(txtOrderID.Text);
orderInfoTable.AddCell(new Phrase("Price:", boldTableFont));
orderInfoTable.AddCell(Convert.ToDecimal(txtTotalPrice.Text).ToString("c"));
document.Add(orderInfoTable);
// Add the "Items In Your Order" subtitle
document.Add(new Paragraph("Items In Your Order", subTitleFont));
// Create the Order Details table
var orderDetailsTable = new PdfPTable(3);
orderDetailsTable.HorizontalAlignment = 0;
orderDetailsTable.SpacingBefore = 10;
orderDetailsTable.SpacingAfter = 35;
orderDetailsTable.DefaultCell.Border = 0;
orderDetailsTable.AddCell(new Phrase("Item #:", boldTableFont));
orderDetailsTable.AddCell(new Phrase("Item Name:", boldTableFont));
orderDetailsTable.AddCell(new Phrase("Qty:", boldTableFont));
foreach (System.Web.UI.WebControls.ListItem item in cblItemsPurchased.Items)
if (item.Selected)
{
// Each CheckBoxList item has a value of ITEMNAME|ITEM#|QTY, so we split on | and pull these values out...
var pieces = item.Value.Split("|".ToCharArray());
orderDetailsTable.AddCell(pieces[1]);
orderDetailsTable.AddCell(pieces[0]);
orderDetailsTable.AddCell(pieces[2]);
}
document.Add(orderDetailsTable);
// Add ending message
var endingMessage = new Paragraph("Thank you for your business! If you have any questions about your order, please contact us at 800-555-NORTH.", endingMessageFont);
endingMessage.SetAlignment("Center");
document.Add(endingMessage);
document.Close();
Response.ContentType = "application/pdf";
Response.AddHeader("Content-Disposition", string.Format("inline;filename=Receipt-{0}.pdf", txtOrderID.Text));
///create background
Response.BinaryWrite(output.ToArray());
Response.Cache.SetCacheability(HttpCacheability.NoCache);
string imageFilePath = Server.MapPath(".") + "/images/1.jpg";
iTextSharp.text.Image jpg = iTextSharp.text.Image.GetInstance(imageFilePath);
Document pdfDoc = new Document(iTextSharp.text.PageSize.LETTER.Rotate(), 0, 0, 0, 0);
jpg.ScaleToFit(790, 777);
jpg.Alignment = iTextSharp.text.Image.UNDERLYING;
pdfDoc.Open();
pdfDoc.NewPage();
pdfDoc.Add(jpg);
pdfDoc.Close();
Response.Write(pdfDoc);
Response.End();
}
}
Thanks

I almost missed this question because it wasn't tagged as an itext question.
First let me copy/paste/adapt @mkl's comment:
The first part of your code in which you create a document document makes sense.
The second part in which you create a document pdfDoc does not.
First of all, at the end of the first part you write the pdf to
the response. That PDF is complete. It's finished. It's done.
It's ready to send to the browser.
Why do you think anything additional written to the
response thereafter might have a chance of combining with the original
written data to a properly generated PDF?
Also: the second part of your code is
written as if you want to create a new PDF from scratch; but didn't you
want to manipulate the PDF created in the first part?
All of this is true, but it doesn't solve your problem. It only reveals your deep lack of understanding of PDF.
There are different ways to achieve what you want. I see that you want to use an image as a background of all the pages of a newly created PDF. In that case, you should create a page event, and add that image underneath all the existing content in the OnEndPage() method. This is explained in the answer to How can I add an image to all pages of my PDF?
Create a PDF as is done in the first part of your code, but introduce a page event:
// step 1
Document document = new Document();
// step 2
PdfWriter writer = PdfWriter.GetInstance(document, stream);
// "event" is a reserved word in C#, so the variable needs another name
MyEvent pageEvent = new MyEvent();
writer.PageEvent = pageEvent;
// step 3
document.Open();
// step 4
// Add whatever content you want to add
// step 5
document.Close();
What is the MyEvent class, you might ask? Well, that's a class you create yourself like this:
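protected class MyEvent : PdfPageEventHelper {
// Fully qualified to avoid a clash with System.Web.UI.WebControls.Image
iTextSharp.text.Image image;
public override void OnOpenDocument(PdfWriter writer, Document document) {
// A nested class cannot use the page's Server property, so go through HttpContext
image = iTextSharp.text.Image.GetInstance(HttpContext.Current.Server.MapPath("~/images/background.png"));
image.SetAbsolutePosition(0, 0);
}
public override void OnEndPage(PdfWriter writer, Document document) {
// DirectContentUnder draws beneath the page content; DirectContent would draw on top of it
writer.DirectContentUnder.AddImage(image);
}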
}
Suppose that your requirement isn't as easy as adding an image in the background, then you could use the bytes created as output to create a PdfReader instance. You could then use the PdfReader to create a PdfStamper and you can use the PdfStamper to watermark the original document. If the simple solution doesn't meet your needs, create a new question that involves PdfReader/PdfStamper and don't forget to tag that question as an iText question. (And also: please read the documentation. A lot of time was spent on the iText web site. That time was wasted if you don't consult it.)
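For reference, the PdfReader/PdfStamper route described above could look roughly like the sketch below. It assumes iTextSharp 5; the class name BackgroundStamper and the method name AddBackground are made up for illustration. The caller is assumed to pass the finished PDF bytes (for example output.ToArray() from the question) and an already loaded iTextSharp Image. Stretching the image to the full page size is also an assumption and may not suit every background.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

// Hypothetical helper, not part of iTextSharp.
public static class BackgroundStamper
{
    // Returns a copy of pdfBytes with the image drawn beneath the content of every page.
    public static byte[] AddBackground(byte[] pdfBytes, Image background)
    {
        var reader = new PdfReader(pdfBytes);
        using (var result = new MemoryStream())
        {
            using (var stamper = new PdfStamper(reader, result))
            {
                for (int page = 1; page <= reader.NumberOfPages; page++)
                {
                    // Assumption: the background should cover the whole page.
                    Rectangle size = reader.GetPageSizeWithRotation(page);
                    background.ScaleAbsolute(size.Width, size.Height);
                    background.SetAbsolutePosition(0, 0);
                    // GetUnderContent targets the layer below the existing page content.
                    stamper.GetUnderContent(page).AddImage(background);
                }
            }
            reader.Close();
            // ToArray still works after the stamper has closed the stream.
            return result.ToArray();
        }
    }
}
```

In the question's click handler, the stamped bytes would then be written with Response.BinaryWrite instead of output.ToArray(). If the page content itself paints an opaque background, an image added underneath will stay hidden.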

Related

How to create a PDF containing text?

I want to create a PDF document containing some text that I have in the form of a string. This is what I have so far:
iTextSharp.text.Document d = new iTextSharp.text.Document();
string dosya = (@"C:\Deneme.pdf");
PdfWriter.GetInstance(d, new System.IO.FileStream(dosya, System.IO.FileMode.Create));
d.AddSubject(text);
Your question is unclear because you don't mention if you want to create a PDF from scratch (which may be what you want to do based on your code sample) or if you want to add text to an existing PDF (which is what the subject of your question suggests).
In both cases, you should take a look at the official documentation.
If you want to create a PDF from scratch, take a look at the Hello World example:
public void CreatePdf(Stream stream) {
// step 1
using (Document document = new Document()) {
// step 2
PdfWriter.GetInstance(document, stream);
// step 3
document.Open();
// step 4
document.Add(new Paragraph("Hello World!"));
}
}
The value of stream can be any output stream (one that writes to memory, one that writes to a file,...).
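To illustrate the point about streams, here is a small sketch that feeds the same CreatePdf method once with a MemoryStream and once with a FileStream. It assumes iTextSharp 5; the class name HelloWorldPdf and the two wrapper methods are invented for this example.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

// Hypothetical wrapper class around the Hello World example.
public static class HelloWorldPdf
{
    public static void CreatePdf(Stream stream)
    {
        using (Document document = new Document())
        {
            PdfWriter.GetInstance(document, stream);
            document.Open();
            document.Add(new Paragraph("Hello World!"));
        }
    }

    // Writes the PDF to memory, e.g. to send it in an HTTP response.
    public static byte[] CreateInMemory()
    {
        using (var ms = new MemoryStream())
        {
            CreatePdf(ms);
            // ToArray is still allowed after the document has closed the stream.
            return ms.ToArray();
        }
    }

    // Writes the PDF to a file; the path is supplied by the caller.
    public static void CreateOnDisk(string path)
    {
        using (var fs = new FileStream(path, FileMode.Create))
        {
            CreatePdf(fs);
        }
    }
}
```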
If you want to add a string to an existing PDF, take a look at a PdfStamper example.
public static byte[] Stamp(byte[] resource) {
PdfReader reader = new PdfReader(resource);
using (var ms = new MemoryStream()) {
using (PdfStamper stamper = new PdfStamper(reader, ms)) {
PdfContentByte canvas = stamper.GetOverContent(1);
ColumnText.ShowTextAligned(
canvas,
Element.ALIGN_LEFT,
new Phrase("Hello people!"),
36, 540, 0
);
}
return ms.ToArray();
}
}
These examples were taken from a book I once wrote. You will find the examples through this link: http://developers.itextpdf.com/examples/itext-action-second-edition
This answer assumes that you are using iText 5 (an assumption that is based on your code snippet). The most recent version is iText 7. That requires code that is totally different.

PDF generated with iTextSharp always prompts to save changes when closing. And has missing pages when viewed with non-Acrobat PDF readers

I've recently used iTextSharp to create a PDF by importing the 20 pages from an existing PDF and then adding a dynamically generated link to the bottom of the last page. It works fine... kind of. Viewing the generated PDF in Acrobat Reader on a Windows PC displays everything as expected, although when closing the document it always asks "Do you want to save changes?". Viewing the generated PDF on a Surface Pro with PDF Reader displays the document without the first and last pages. Apparently, on a mobile device using Polaris Office the first and last pages are also missing.
I'm wondering if when the new PDF is generated it's not getting closed off quite properly and that's why it asks "Do you want to save changes?" when closing it. And maybe that's also why it doesn't display correctly in some PDF reader apps.
Here's the code:
using (var reader = new PdfReader(HostingEnvironment.MapPath("~/app/pdf/OriginalDoc.pdf")))
{
using (
var fileStream =
new FileStream(
HostingEnvironment.MapPath("~/documents/attachments/DocWithLink_" + id + ".pdf"),
FileMode.Create, FileAccess.Write))
{
var document = new Document(reader.GetPageSizeWithRotation(1));
var writer = PdfWriter.GetInstance(document, fileStream);
using (PdfStamper stamper = new PdfStamper(reader, fileStream))
{
var baseFont = BaseFont.CreateFont(BaseFont.HELVETICA_BOLD, BaseFont.CP1252,
BaseFont.NOT_EMBEDDED);
Font linkFont = FontFactory.GetFont("Arial", 12, Font.UNDERLINE, BaseColor.BLUE);
document.Open();
for (var i = 1; i <= reader.NumberOfPages; i++)
{
document.NewPage();
var importedPage = writer.GetImportedPage(reader, i);
// Copy page of original document to new document.
var contentByte = writer.DirectContent;
contentByte.AddTemplate(importedPage, 0, 0);
if (i == reader.NumberOfPages) // It's the last page so add link.
{
PdfContentByte cb = stamper.GetOverContent(i);
//Create a ColumnText object
var ct = new ColumnText(cb);
//Set the rectangle to write to
ct.SetSimpleColumn(100, 30, 500, 90, 0, PdfContentByte.ALIGN_LEFT);
//Add some text and make it blue so that it looks like a hyperlink
var c = new Chunk("Click here!", linkFont);
var congrats = new Paragraph("Congratulations on reading the eBook! ");
congrats.Alignment = PdfContentByte.ALIGN_LEFT;
c.SetAnchor("http://www.domain.com/pdf/response/" + encryptedId);
//Add the chunk to the ColumnText
congrats.Add(c);
ct.AddElement(congrats);
//Tell the system to process the above commands
ct.Go();
}
}
}
}
}
I've looked at these posts with similar issues but none seem to quite provide the answer I need:
iTextSharp-generated PDFs cause save dialog when closing
Using iTextSharp to write data to PDF works great, but Acrobat Reader asks 'Do you want to save changes' when closing file
(Or they refer to memory streams instead of writing to disk etc)
My question is, how do I modify the above so that when closing the generated PDF in Acrobat Reader there's no "Do you want to save changes?" prompt. The answer to that may solve the problems with missing pages on Surface Pro etc but if you know anything else about what might be causing that I'd like to hear about it.
Any suggestions would be very welcome! Thanks!
At first glance (and without much coffee yet) it appears that you're using a PdfReader in three different contexts: as a source for a PdfStamper, as a source for a Document, and as a source for importing. So you are essentially importing a document into itself while you're also writing to it.
To give you a quick overview, the following code will essentially clone the contents of source.pdf into dest.pdf:
using (var reader = new PdfReader("source.pdf")){
using (var fileStream = new FileStream("dest.pdf", FileMode.Create, FileAccess.Write)){
using (PdfStamper stamper = new PdfStamper(reader, fileStream)){
}
}
}
Since that does all of the cloning for you, you don't need to import pages or anything.
Then, if the only thing that you want to do is add some text to the last page, you can just use the above and ask the PdfStamper for a PdfContentByte using GetOverContent(), telling it which page number you're interested in. Then you can just use the rest of your ColumnText logic.
using (var reader = new PdfReader("Source.Pdf")) {
using (var fileStream = new FileStream("Dest.Pdf", FileMode.Create, FileAccess.Write)) {
using (PdfStamper stamper = new PdfStamper(reader, fileStream)) {
//Get a PdfContentByte object
var cb = stamper.GetOverContent(reader.NumberOfPages);
//Create a ColumnText object
var ct = new ColumnText(cb);
//Set the rectangle to write to
ct.SetSimpleColumn(100, 30, 500, 90, 0, PdfContentByte.ALIGN_LEFT);
//Add some text and make it blue so that it looks like a hyperlink
var c = new Chunk("Click here!", linkFont);
var congrats = new Paragraph("Congratulations on reading the eBook! ");
congrats.Alignment = PdfContentByte.ALIGN_LEFT;
c.SetAnchor("http://www.domain.com/pdf/response/" + encryptedId);
//Add the chunk to the ColumnText
congrats.Add(c);
ct.AddElement(congrats);
//Tell the system to process the above commands
ct.Go();
}
}
}

MVC - Generating multiple PDFs

I am using the following code to generate a PDF file.
It works well, but now I want to generate 4 PDFs at the same time.
I tried instantiating Document again and repeating the whole code to generate a second PDF report, but it generates only 1 PDF.
var document = new Document(PageSize.A4, 50, 50, 25, 25);
// Create a new PdfWrite object, writing the output to a MemoryStream
var output = new MemoryStream();
var writer = PdfWriter.GetInstance(document, output);
// Open the Document for writing
document.Open();
string contents = System.IO.File.ReadAllText(Server.MapPath("~/Reports/Original.html"));
var parsedHtmlElements = HTMLWorker.ParseToList(new StringReader(contents), null);
foreach (var htmlElement in parsedHtmlElements)
document.Add(htmlElement as IElement);
document.Close();
Response.ContentType = "application/pdf";
Response.AddHeader("Content-Disposition", string.Format("attachment;filename=Receipt-{0}.pdf", "Report"));
Response.BinaryWrite(output.ToArray());
return View();
How to generate multiple PDF's?
You are outputting the bytes as the response, so you will never be able to generate 2 different files in one response. There is only one response per request.
If you want the user to download 2 different PDFs at the same time, you could call the controller using JavaScript from the view, once per PDF.
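If a single download is acceptable, another option (not part of the answer above, so treat it as a suggestion) is to bundle the generated PDFs into one ZIP archive and return that as the one response. A minimal sketch, assuming .NET 4.5 or later for System.IO.Compression; the class name PdfBundle and the method name Zip are made up for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

// Hypothetical helper: packs several in-memory files into one ZIP archive.
public static class PdfBundle
{
    // files maps an entry name (e.g. "Report1.pdf") to the bytes of that file.
    public static byte[] Zip(IDictionary<string, byte[]> files)
    {
        using (var output = new MemoryStream())
        {
            // leaveOpen keeps the MemoryStream usable after the archive is disposed.
            using (var archive = new ZipArchive(output, ZipArchiveMode.Create, true))
            {
                foreach (var file in files)
                {
                    ZipArchiveEntry entry = archive.CreateEntry(file.Key);
                    using (Stream entryStream = entry.Open())
                    {
                        entryStream.Write(file.Value, 0, file.Value.Length);
                    }
                }
            }
            // The archive must be disposed before reading the bytes, so the ZIP directory is written.
            return output.ToArray();
        }
    }
}
```

In an MVC action the result could be returned with File(PdfBundle.Zip(files), "application/zip", "Reports.zip"), where each value in files is the output.ToArray() of one generated document.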

Make a pdf conforming PDF/A with only images using iTextSharp

I'm using iTextSharp to generate pdf-a documents from images. So far I've not been successful.
Edit: I'm using iTextSharp to generate the PDF
All I'm trying to do is make a PDF/A document (1a or 1b, whichever suits) containing some images. This is the code I've come up with so far, but I keep getting errors when I try to validate the output with pdf-tools or validatepdfa.
These are the errors I get from pdf-tools (using PDF/A-1b validation):
Edit: MarkInfo and Color Space aren't working yet. The rest is okay.
Validating file "0.pdf" for conformance level pdfa-1a
The key MarkInfo is required but missing.
A device-specific color space (DeviceRGB) without an appropriate output intent is used.
The document does not conform to the requested standard.
The document contains device-specific color spaces.
The document doesn't provide appropriate logical structure information.
Done.
Main flow
var output = new MemoryStream();
using (var iccProfileStream = new FileStream("ToPdfConverter/ColorProfiles/sRGB_v4_ICC_preference_displayclass.icc", FileMode.Open))
{
var document = new Document(new Rectangle(PageSize.A4.Width, PageSize.A4.Height), 0f, 0f, 0f, 0f);
var pdfWriter = PdfWriter.GetInstance(document, output);
pdfWriter.PDFXConformance = PdfWriter.PDFA1A;
document.Open();
var pdfDictionary = new PdfDictionary(PdfName.OUTPUTINTENT);
pdfDictionary.Put(PdfName.OUTPUTCONDITION, new PdfString("sRGB IEC61966-2.1"));
pdfDictionary.Put(PdfName.INFO, new PdfString("sRGB IEC61966-2.1"));
pdfDictionary.Put(PdfName.S, PdfName.GTS_PDFA1);
var iccProfile = ICC_Profile.GetInstance(iccProfileStream);
var pdfIccBased = new PdfICCBased(iccProfile);
pdfIccBased.Remove(PdfName.ALTERNATE);
pdfDictionary.Put(PdfName.DESTOUTPUTPROFILE, pdfWriter.AddToBody(pdfIccBased).IndirectReference);
pdfWriter.ExtraCatalog.Put(PdfName.OUTPUTINTENT, new PdfArray(pdfDictionary));
var image = PrepareImage(imageBytes);
document.Open();
document.Add(image);
pdfWriter.CreateXmpMetadata();
pdfWriter.CloseStream = false;
document.Close();
}
return output.GetBuffer();
This is prepareImage()
It's used to flatten the image to bmp, so I don't need to bother about alpha channels.
private Image PrepareImage(Stream stream)
{
Bitmap bmp = new Bitmap(System.Drawing.Image.FromStream(stream));
var file = new MemoryStream();
bmp.Save(file, ImageFormat.Bmp);
var image = Image.GetInstance(file.GetBuffer());
if (image.Height > PageSize.A4.Height || image.Width > PageSize.A4.Width)
{
image.ScaleToFit(PageSize.A4.Width, PageSize.A4.Height);
}
return image;
}
Can anyone help me into a direction to fix the errors?
Specifically the device-specific color spaces
Edit: More explanation: What I'm trying to achieve is, converting scanned images to PDF/A for long-term data storage
Edit: added some files I'm using to test with
PDFs and Pictures.rar (3.9 MB)
https://mega.co.nz/#!n8pClYgL!NJOJqSO3EuVrqLVyh3c43yW-u_U35NqeB0svc6giaSQ
OK, I checked one of your files in callas pdfToolbox and it says: "Device color space used but no PDF/A output intent". Which I took as a sign that you do something wrong while writing an output intent to the document. I then converted that document to PDF/A-1b with the same tool and the difference is obvious.
Perhaps there are other errors you need to fix, but the first error here is that you put a key in the catalog dict for the PDF file that is named "OutputIntent". That's wrong: page 75 of the PDF Specification states that the key should be named "OutputIntents".
Like I said, perhaps there are other problems with your file beyond this, but the wrong name for the key causes PDF/A validators not to find the Output Intent you try to put in the file...
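In terms of the question's code, the fix described above amounts to changing the key used in the ExtraCatalog.Put call. A rough sketch, assuming iTextSharp 5; the class name OutputIntentFix and the method name Register are hypothetical, and the dictionary is assumed to be built exactly as in the question. The key is spelled out as a string here so the plural form is visible.

```csharp
using iTextSharp.text.pdf;

// Hypothetical helper, not part of iTextSharp.
public static class OutputIntentFix
{
    // Registers an already built output intent dictionary in the catalog
    // under the key "OutputIntents" (plural), as an array with one entry.
    public static void Register(PdfWriter pdfWriter, PdfDictionary outputIntent)
    {
        pdfWriter.ExtraCatalog.Put(new PdfName("OutputIntents"), new PdfArray(outputIntent));
    }
}
```

This only addresses the key name. Whether the rest of the dictionary satisfies a PDF/A validator is a separate matter.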
First of all, PDF/X IS NOT PDF/A.
Second, you're using the wrong writer: PdfWriter. It should be PdfAWriter.
Unfortunately I do not have a solution for the image problem, but I do have one for points 1 and 2.
Regards
using System;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using System.Text;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.text.html.simpleparser;
using iTextSharp.tool.xml;
using System.Drawing;
using System.Drawing.Imaging;
namespace Tests
{
/*
* References:
* UTF-8 encoding http://stackoverflow.com/questions/4902033/itextsharp-5-polish-character
* PDFA http://www.codeproject.com/Questions/661704/Create-pdf-A-using-itextsharp
* Images http://stackoverflow.com/questions/15896581/make-a-pdf-conforming-pdf-a-with-only-images-using-itextsharp
*/
[TestClass]
public class UnitTest1
{
/*
* IMPORTANT: Restrictions with html usage of tags and attributes
* 1. Dont use * <head> <title>Sklep</title> </head>, because title is rendered to the page
*/
// Test cases
static string contents = "<html><body style=\"font-family:arial unicode ms;font-size: 8px;\"><p style=\"text-align: center;\"> Davčna številka dolžnika: 74605968<br /> </p><table> <tr> <td><b>\u0160t. sklepa: 88711501</b></td> <td style=\"text-align: right;\">Davčna številka dolžnika: 74605968</td> </tr> </table> <br/><img src=\"http://img.rtvslo.si/_static/images/rtvslo_mmc_logo.png\" /></body></html>";
//static string contents = "<html><body style=\"font-family:arial unicode ms;font-size: 8px;\"><p style=\"text-align: center;\"> Davčna številka dolžnika: 74605968<br /> </p><table> <tr> <td><b>\u0160t. sklepa: 88711501</b></td> <td style=\"text-align: right;\">Davčna številka dolžnika: 74605968</td> </tr> </table> <br/></body></html>";
//[TestMethod]
public void CreatePdfHtml()
{
createPDF(contents, true);
}
private void createPDF(string html, bool isPdfa)
{
TextReader reader = new StringReader(html);
Document document = new Document(PageSize.A4, 30, 30, 30, 30);
HTMLWorker worker = new HTMLWorker(document);
PdfWriter writer;
if (isPdfa)
{
//set conformity level
writer = PdfAWriter.GetInstance(document, new FileStream(@"c:\temp\testA.pdf", FileMode.Create), PdfAConformanceLevel.PDF_A_1B);
//set pdf version
writer.SetPdfVersion(PdfAWriter.PDF_VERSION_1_4);
// Create XMP metadata. It's a PDF/A requirement.
writer.CreateXmpMetadata();
}
else
{
writer = PdfWriter.GetInstance(document, new FileStream(@"c:\temp\test.pdf", FileMode.Create));
}
document.Open();
if (isPdfa) // the document must already be open here, or this will fail
{
// Set output intent for uncalibrated color space. PDF/A requirement.
ICC_Profile icc = ICC_Profile.GetInstance(Environment.GetEnvironmentVariable("SystemRoot") + @"\System32\spool\drivers\color\sRGB Color Space Profile.icm");
writer.SetOutputIntents("Custom", "", "http://www.color.org", "sRGB IEC61966-2.1", icc);
}
//register font used in html
FontFactory.Register(Environment.GetEnvironmentVariable("SystemRoot") + "\\Fonts\\ARIALUNI.TTF", "arial unicode ms");
//adding custom style attributes to html specific tasks. Can be used instead of css
//this one is a must for displaying language-specific UTF-8 characters (čćžđpš)
iTextSharp.text.html.simpleparser.StyleSheet ST = new iTextSharp.text.html.simpleparser.StyleSheet();
ST.LoadTagStyle("body", "encoding", "Identity-H");
worker.SetStyleSheet(ST);
worker.StartDocument();
worker.Parse(reader);
worker.EndDocument();
worker.Close();
document.Close();
}
}
}

Need help with creating PDF from HTML using itextsharp

I'm trying to create a PDF out of an HTML page. The CMS I'm using is EPiServer.
This is my code so far:
protected void Button1_Click(object sender, EventArgs e)
{
naaflib.pdfDocument(CurrentPage);
}
public static void pdfDocument(PageData pd)
{
//Extract data from Page (pd).
string intro = pd["MainIntro"].ToString(); // Attribute
string mainBody = pd["MainBody"].ToString(); // Attribute
// Prepare the HttpContext response
HttpContext.Current.Response.Clear();
HttpContext.Current.Response.ContentType = "application/pdf";
// Create PDF document
Document pdfDocument = new Document(PageSize.A4, 80, 50, 30, 65);
//PdfWriter pw = PdfWriter.GetInstance(pdfDocument, HttpContext.Current.Response.OutputStream);
PdfWriter.GetInstance(pdfDocument, HttpContext.Current.Response.OutputStream);
pdfDocument.Open();
pdfDocument.Add(new Paragraph(pd.PageName));
pdfDocument.Add(new Paragraph(intro));
pdfDocument.Add(new Paragraph(mainBody));
pdfDocument.Close();
HttpContext.Current.Response.End();
}
This outputs the article name, the intro text and the main body.
But it does not parse the HTML in the article text, and there is no layout.
I've tried having a look at http://itextsharp.sourceforge.net/tutorial/index.html without becoming any wiser.
Any pointers in the right direction are greatly appreciated :)
For later versions of iTextSharp:
Using iTextSharp you can use the iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList() method to create a PDF from HTML.
ParseToList() takes a TextReader (an abstract class) for its HTML source, which means you can use a StringReader or StreamReader (both of which use TextReader as a base type). I used a StringReader and was able to generate PDFs from simple markup. I tried to use the HTML returned from a webpage and got errors on all but the simplest pages. Even the simplest webpage I retrieved (http://black.ea.com/) was rendering the content of the page's 'head' tag onto the PDF, so I think the HTMLWorker.ParseToList() method is picky about the formatting of the HTML it parses.
Anyway, if you want to try here's the test code I used:
// Download content from a very, very simple "Hello World" web page.
string download = new WebClient().DownloadString("http://black.ea.com/");
Document document = new Document(PageSize.A4, 80, 50, 30, 65);
try {
using (FileStream fs = new FileStream("TestOutput.pdf", FileMode.Create)) {
PdfWriter.GetInstance(document, fs);
using (StringReader stringReader = new StringReader(download)) {
ArrayList parsedList = HTMLWorker.ParseToList(stringReader, null);
document.Open();
foreach (object item in parsedList) {
document.Add((IElement)item);
}
document.Close();
}
}
} catch (Exception exc) {
Console.Error.WriteLine(exc.Message);
}
I couldn't find any documentation on which HTML constructs HTMLWorker.ParseToList() supports; if you do please post it here. I'm sure a lot of people would be interested.
For older versions of iTextSharp:
You can use the iTextSharp.text.html.HtmlParser.Parse method to create a PDF based on html.
Here's a snippet demonstrating this:
Document document = new Document(PageSize.A4, 80, 50, 30, 65);
try {
using (FileStream fs = new FileStream("TestOutput.pdf", FileMode.Create)) {
PdfWriter.GetInstance(document, fs);
HtmlParser.Parse(document, "YourHtmlDocument.html");
}
} catch(Exception exc) {
Console.Error.WriteLine(exc.Message);
}
The one (major for me) problem is the HTML must be strictly XHTML compliant.
Good luck!
