converting HTML to a multi-column PDF

converting HTML to a multi-column PDF - c#

I am trying to generate a multi-column PDF from HTML using iText for .NET.
I am using CSS3 syntax to generate two columns.
And below code is not working for me.
CSS
column-count:2;
C# Code
StringReader html = new StringReader(#"
<div style='column-count:2;'>Sample Text. Sample Text. Sample Text. Sample Text.
Sample Text. Sample Text. Sample Text. Sample Text. Sample Text. Sample Text.
Sample Text. Sample Text. Sample Text. Sample Text. Sample Text. Sample Text.
Sample Text. Sample Text. </div>
");
Document document = new Document();
PdfWriter writer = PdfWriter.GetInstance(document, new FileStream(#"d:\temp\xyz.pdf", FileMode.Create));
document.Open();
XMLWorkerHelper.GetInstance().ParseXHtml(
writer, document, html
);
document.Close();
Please suggest what is issue in this code. Or is there any other HTML to PDF library available to fix this issue.

The CSS property column-count is not supported in XML Worker, and it probably never will.
However, this doesn't mean that you can't display HTML in columns.
If you go to the official XML Worker documentation, you'll find the ParseHtmlObjects where we parse a large HTML file and render it to a PDF with two columns: walden5.pdf
This is done by parsing the HTML into an ElementList first:
// CSS
CSSResolver cssResolver =
XMLWorkerHelper.getInstance().getDefaultCssResolver(true);
// HTML
HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
htmlContext.autoBookmark(false);
// Pipelines
ElementList elements = new ElementList();
ElementHandlerPipeline end = new ElementHandlerPipeline(elements, null);
HtmlPipeline html = new HtmlPipeline(htmlContext, end);
CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);
// XML Worker
XMLWorker worker = new XMLWorker(css, true);
XMLParser p = new XMLParser(worker);
Once we have the list of Element objects, we can add them to a ColumnText object:
// step 1
Document document = new Document(PageSize.LEGAL.rotate());
// step 2
PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(file));
// step 3
document.open();
// step 4
Rectangle left = new Rectangle(36, 36, 486, 586);
Rectangle right = new Rectangle(522, 36, 972, 586);
ColumnText column = new ColumnText(writer.getDirectContent());
column.setSimpleColumn(left);
boolean leftside = true;
int status = ColumnText.START_COLUMN;
for (Element e : elements) {
if (ColumnText.isAllowedElement(e)) {
column.addElement(e);
status = column.go();
while (ColumnText.hasMoreText(status)) {
if (leftside) {
leftside = false;
column.setSimpleColumn(right);
}
else {
document.newPage();
leftside = true;
column.setSimpleColumn(left);
}
status = column.go();
}
}
}
// step 5
document.close();
As you can see, you need to make some decisions here: you need to define the rectangles on the pages. You need to introduce new pages, etc...
Note: there is currently no C# port of this documentation. Please think of the Java code as if it were pseudo code.

Related

Add multiple Checkboxes in ITextsharp html to pdf

I'm using the iTextSharp library to convert my html to pdf. The issue is I'm trying to add checkbox appearance using the below code:
string HTML,public static String FONT = "c:/windows/fonts/WINGDING.TTF";
public static String TEXT = "o";
public void HTMLToPdf( string FileName)
{
string HTML="<!DOCTYPE html>
<html>
<head><title></title><meta charset='UTF-8'></head>
<body><div class='mystyle'>Here i want to print many checkbox lik appearances</div></body>
<html>";
Document pdfDoc = new Document(PageSize.A4, 30f, 30f, 10f, 10f);
pdfDoc.Add(p);
BaseFont bf = BaseFont.CreateFont(FONT, BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
Font f = new Font(bf, 12);
Paragraph p = new Paragraph(TEXT, f);
pdfDoc.Add(p);
}
The problem is this method adds the checkbox at the begining of pdf, please help me to attach the paragraph containing the checkbox value to my html.
Simply put, I'm getting the value at pdfDoc.Add(p), but I want it in a variable to print it many times in html.

In fact, it is not very clear what was meant in the question:
please help me to attach the paragraph containing the checkbox value to my html
I can assume that you wanted to convert HTML to PDF, and add paragraphs on the next line.
In general, it's a bad idea to use iTextSharp for this, since this library is outdated and no longer supported. I can suggest my own way of solving your problem in pdfHTML, this is an iText7 add-on. My code is in Java, but it's not much different from sharp.The main idea is not to close the document after html conversion. Because if you try to write a paragraph in a closed document, it will be at the very beginning, as in your example.
String FONT = "c:/windows/fonts/WINGDING.TTF";
String TEXT = "o";
File htmlSource = new File("checkBoxHtml.html");
File pdfDest = new File("output.pdf");
ConverterProperties converterProperties = new ConverterProperties();
Document document = HtmlConverter.convertToDocument(new FileInputStream(htmlSource),
new PdfDocument(new PdfWriter(pdfDest)), converterProperties);
PdfFont font = PdfFontFactory.createFont(FONT);
Text text = new Text(TEXT);
text.setFont(font);
Paragraph paragraph = new Paragraph();
// Adding text to the paragraph
paragraph.add(text);
// Adding paragraph to the document
document.add(paragraph);
document.close();

ITextSharp pdf resize and data alignment

I am using ITextSharp to convert HTML to PDF but i want the PDF to be generated of size 5cm width. I used the following code
var pgSize = new iTextSharp.text.Rectangle(2.05f, 2.05f);
Document doc = new Document(pgSize);
but it is just resizing the pdf and my data disappeared in the pdf or get hide.
How can i align the data in the center in PDF or resize the pdf? Here is my code
public void ConvertHTMLToPDF(string HTMLCode)
{
try
{
System.IO.StringWriter stringWrite = new StringWriter();
System.Web.UI.HtmlTextWriter htmlWrite = new HtmlTextWriter(stringWrite);
StringReader reader = new StringReader(HTMLCode);
var pgSize = new iTextSharp.text.Rectangle(2.05f, 2.05f);
Document doc = new Document(pgSize);
HTMLWorker parser = new HTMLWorker(doc);
PdfWriter.GetInstance(doc, new FileStream(Server.MapPath("~") + "/App_Data/HTMLToPDF.pdf",
FileMode.Create));
doc.Open();
foreach (IElement element in HTMLWorker.ParseToList(
new StringReader(HTMLCode), null))
{
doc.Add(element);
}
doc.Close();
Response.End();
}
catch (Exception ex)
{
}
}

You are creating a PDF that measures 0.0723 cm by 0.0723 cm. That is much too small to add any content. If you want to create a PDF of 5 cm by 5 cm, you need to create your document like this:
var pgSize = new iTextSharp.text.Rectangle(141.732f, 141.732f);
Document doc = new Document(pgSize);
As for the alignment, that should be defined in the HTML, but you are using an old version of iText and you are using the deprecated HTMLWorker.
You should upgrade to iText 7 and pdfHTML as described here: Converting HTML to PDF using iText
Also: the size of the page can be defined in the #page-rule of the CSS. See Huge white space after header in PDF using Flying Saucer
Why would you make it difficult for yourself by using an old iText version, when the new version allows you to do this:
#page {
size: 5cm 5cm;
}

Aspx to PDF iTextSharp usage

I have this code which I merged and modified for my needs. But I still can't make it work as I need. The first part that I made, it generates PDF with an option from aspx page chosen. Second, I need to have the background over the page, so I added next code, but now it generates just the second code and not the PDF. And im not able to merge those codes together.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
public partial class CreatePDFFromScratch : System.Web.UI.Page
{
protected void btnCreatePDF_Click(object sender, EventArgs e)
{
// Create a Document object
var document = new Document(iTextSharp.text.PageSize.LETTER.Rotate(), 0f, 0f, 0f, 0f);
// Create a new PdfWrite object, writing the output to a MemoryStream
var output = new MemoryStream();
var writer = PdfWriter.GetInstance(document, output);
// Open the Document for writing
document.Open();
// First, create our fonts..
var titleFont = FontFactory.GetFont("Arial", 18, Font.BOLD);
var subTitleFont = FontFactory.GetFont("Arial", 14, Font.BOLD);
var boldTableFont = FontFactory.GetFont("Arial", 12, Font.BOLD);
var endingMessageFont = FontFactory.GetFont("Arial", 10, Font.ITALIC);
var bodyFont = FontFactory.GetFont("Arial", 12, Font.NORMAL);
// Add the "Northwind Traders Receipt" title
document.Add(new Paragraph("Northwind Traders Receipt", titleFont));
// Now add the "Thank you for shopping at Northwind Traders. Your order details are below." message
document.Add(new Paragraph("Thank you for shopping at Northwind Traders. Your order details are below.", bodyFont));
document.Add(Chunk.NEWLINE);
// Add the "Order Information" subtitle
document.Add(new Paragraph("Order Information", subTitleFont));
// Create the Order Information table
var orderInfoTable = new PdfPTable(2);
orderInfoTable.HorizontalAlignment = 0;
orderInfoTable.SpacingBefore = 10;
orderInfoTable.SpacingAfter = 10;
orderInfoTable.DefaultCell.Border = 0;
orderInfoTable.SetWidths(new int[] { 1, 4 });
orderInfoTable.AddCell(new Phrase("Order:", boldTableFont));
orderInfoTable.AddCell(txtOrderID.Text);
orderInfoTable.AddCell(new Phrase("Price:", boldTableFont));
orderInfoTable.AddCell(Convert.ToDecimal(txtTotalPrice.Text).ToString("c"));
document.Add(orderInfoTable);
// Add the "Items In Your Order" subtitle
document.Add(new Paragraph("Items In Your Order", subTitleFont));
// Create the Order Details table
var orderDetailsTable = new PdfPTable(3);
orderDetailsTable.HorizontalAlignment = 0;
orderDetailsTable.SpacingBefore = 10;
orderDetailsTable.SpacingAfter = 35;
orderDetailsTable.DefaultCell.Border = 0;
orderDetailsTable.AddCell(new Phrase("Item #:", boldTableFont));
orderDetailsTable.AddCell(new Phrase("Item Name:", boldTableFont));
orderDetailsTable.AddCell(new Phrase("Qty:", boldTableFont));
foreach (System.Web.UI.WebControls.ListItem item in cblItemsPurchased.Items)
if (item.Selected)
{
// Each CheckBoxList item has a value of ITEMNAME|ITEM#|QTY, so we split on | and pull these values out...
var pieces = item.Value.Split("|".ToCharArray());
orderDetailsTable.AddCell(pieces[1]);
orderDetailsTable.AddCell(pieces[0]);
orderDetailsTable.AddCell(pieces[2]);
}
document.Add(orderDetailsTable);
// Add ending message
var endingMessage = new Paragraph("Thank you for your business! If you have any questions about your order, please contact us at 800-555-NORTH.", endingMessageFont);
endingMessage.SetAlignment("Center");
document.Add(endingMessage);
document.Close();
Response.ContentType = "application/pdf";
Response.AddHeader("Content-Disposition", string.Format("inline;filename=Receipt-{0}.pdf", txtOrderID.Text));
///create background
Response.BinaryWrite(output.ToArray());
Response.Cache.SetCacheability(HttpCacheability.NoCache);
string imageFilePath = Server.MapPath(".") + "/images/1.jpg";
iTextSharp.text.Image jpg = iTextSharp.text.Image.GetInstance(imageFilePath);
Document pdfDoc = new Document(iTextSharp.text.PageSize.LETTER.Rotate(), 0, 0, 0, 0);
jpg.ScaleToFit(790, 777);
jpg.Alignment = iTextSharp.text.Image.UNDERLYING;
pdfDoc.Open();
pdfDoc.NewPage();
pdfDoc.Add(jpg);
pdfDoc.Close();
Response.Write(pdfDoc);
Response.End();
}
}
Thanks

I almost missed this question because it wasn't tagged as an itext question.
First let me copy/paste/adapt #mkl's comment:
The first part of your code in which you create a document document makes sense.
The second part in which you create a document pdfDoc does not.
First of all, at the end of the first part you write the pdf to
the response. That PDF is complete. It's finished. It's done.
It's ready to send to the browser.
Why do you think anything additional written to the
response thereafter might have a chance of combining with the original
written data to a properly generated PDF?
Also: the second part of your code is
written as if you want to create a new PDF from scratch; but didn't you
want to manipulate the PDF created in the first part?
All of this is true, but it doesn't solve your problem. It only reveals your deep lack of understanding in PDF.
There are different ways to achieve what you want. I see that you want to use an image as a background of all the pages of a newly created PDF. In that case, you should create a page event, and add that image underneath all the existing content in the OnEndPage() method. This is explained in the answer to How can I add an image to all pages of my PDF?
Create a PDF as is done in the first part of your code, but introduce a page event:
// step 1
Document document = new Document();
// step 2
PdfWriter writer = PdfWriter.GetInstance(document, stream);
MyEvent event = new MyEvent();
writer.PageEvent = event;
// step 3
document.Open();
// step 4
// Add whatever content you want to add
// step 5
document.Close();
What is the MyEvent class, you might ask? Well, that's a class you create yourself like this:
protected class MyEvent : PdfPageEventHelper {
Image image;
public override void OnOpenDocument(PdfWriter writer, Document document) {
image = Image.GetInstance(Server.MapPath("~/images/background.png"));
image.SetAbsolutePosition(0, 0);
}
public override void OnEndPage(PdfWriter writer, Document document) {
writer.DirectContent.AddImage(image);
}
}
Suppose that your requirement isn't as easy as adding an image in the background, then you could use the bytes created as output to create a PdfReader instance. You could then use the PdfReader to create a PdfStamper and you can use the PdfStamper to watermark the original document. If the simple solution doesn't meet your needs, create a new question that involves PdfReader/PdfStamper and don't forget to tag that question as an iText question. (And also: please read the documentation. A lot of time was spent on the iText web site. That time was wasted if you don't consult it.)

XMLWorkerHelper convert html page into pdf produces only first 2 pages

I'm getting only the first two page. I have a generated list of elements in the third page. When there are too many elements in my collection, all pages from there become blank in my pdf output
using (FileStream fs = new FileStream(filePath, FileMode.Create))
{
Document document = new Document(PageSize.A4, 25, 25, 30, 30);
WebClient wc = new WebClient();
string htmlText = wc.DownloadString(textUrl);
PdfWriter pdfWriter = PdfWriter.GetInstance(document, fs);
document.Open();
// register all fonts in current computer
FontFactory.RegisterDirectories();
XMLWorkerFontProvider fontProvider = new XMLWorkerFontProvider();
using (var msHtml = new MemoryStream(System.Text.Encoding.Default.GetBytes(htmlText)))
{
//Set factories
var cssAppliers = new CssAppliersImpl(fontProvider);
var htmlContext = new HtmlPipelineContext(cssAppliers);
//HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
htmlContext.SetTagFactory(Tags.GetHtmlTagProcessorFactory());
//FontFactory.Register(arialuniTff);
string gishaTff = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Fonts), "GISHA.TTF");
FontFactory.Register(gishaTff);
var worker = XMLWorkerHelper.GetInstance();
var cssStream = new FileStream(FolderMapPath("/css/style.css"), FileMode.Open);
worker.ParseXHtml(pdfWriter, document, msHtml, cssStream, new UnicodeFontFactory());
}
// Close the document
document.Close();
// Close the writer instance
pdfWriter.Close();
}
Here is my cshtml code

I only have experience working with iText in Java, but is it possible that the MemoryStream object you are using has a byte limit that gets filled up when the table on page 3 has too many elements for it to store? If that's the case, then the closing tags on that long table may not be written to the MemoryStream, thus that table and everything after doesn't get rendered; i.e. get's truncated by the PDF converter engine.
Can you try using a different Stream object?

Here is how i solved my problem. The problem wasn't with the C# backend code. It's seems like the XMLWorkerHelper at the moment doesn't deal well with loop in view. I had to display list of items in my PDF file. The result was well if the collection contains no so many items, But the page break at the level when the the collection contains more than 50 items because this could not be display in a single page. What i did is that i started to count the number of items and at some number like 40, i just include a break element <li style="list-style:none; list-style-type:none; page-break-before:always">#item</li>. And reset the counter and continue display my items. And it was great and my problem was resolved.
May be this can be helpful for someone.

.NET C# - MigraDoc - How to change document charset?

I've searched for solution to this problem, but still cannot find the answer. Any help would be appreciated.
Document document = new Document();
Section section = document.AddSection();
Paragraph paragraph = section.AddParagraph();
paragraph.Format.Font.Color = Color.FromCmyk(100, 30, 20, 50);
paragraph.AddText("ąčęėįųųūū");
paragraph.Format.Font.Size = 9;
paragraph.Format.Alignment = ParagraphAlignment.Center;
</b>
<...>
In example above characters "ąčęėįųųūū" are not displayed in exported pdf.
How can I set 'MigraDoc' character set ?

Just tell the Renderer to create an Unicode document:
PdfDocumentRenderer renderer = new PdfDocumentRenderer(true, PdfSharp.Pdf.PdfFontEmbedding.Always);
renderer.Document = document;
The first parameter of PdfDocumentRenderer must be true to get Unicode.
Please note that not all True Type fonts include all Unicode characters (but it should work with Arial, Verdana, etc.).
See here for a complete sample:
http://www.pdfsharp.net/wiki/HelloMigraDoc-sample.ashx

If you are mixing PDFSharp and MigraDoc, as I do ( it means that you have a PdfSharp object PdfDocument document and a MigraDoc object Document doc, which is rendered as a part of document), everything is not that simple. The example, that PDFSharp Team has given works only when you are using MigraDoc separately.
So you should use it this way:
Make sure you are rendering your MigraDoc doc earlier than rendering the MigraDoc object to the PDF sharp XGraphics gfx.
Use the hack to set encoding for the gfx object.
XGraphics gfx = XGraphics.FromPdfPage(page);
// HACK²
gfx.MUH = PdfFontEncoding.Unicode;
gfx.MFEH = PdfFontEmbedding.Always;
// HACK²
Document doc = new Document();
PdfDocumentRenderer pdfRenderer = new PdfDocumentRenderer(true, PdfFontEmbedding.Always);
pdfRenderer.Document = doc;
pdfRenderer.RenderDocument();
MigraDoc.Rendering.DocumentRenderer docRenderer = new DocumentRenderer(doc);
docRenderer.PrepareDocument();
docRenderer.RenderObject(gfx, XUnit.FromCentimeter(5), XUnit.FromCentimeter(10), "12cm", para);
For 1.5.x-betax
let gfx = XGraphics.FromPdfPage(page)
gfx.MUH <- PdfFontEncoding.Unicode
let doc = new Document()
let pdfRenderer = new PdfDocumentRenderer(true, PdfFontEmbedding.Always)
pdfRenderer.Document <- doc
pdfRenderer.RenderDocument()
let docRenderer = new DocumentRenderer(doc)
docRenderer.PrepareDocument()
docRenderer.RenderObject(gfx, XUnit.FromCentimeter 5, XUnit.FromCentimeter 10, "12cm", para)

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

converting HTML to a multi-column PDF - c#

Related

Add multiple Checkboxes in ITextsharp html to pdf

ITextSharp pdf resize and data alignment

Aspx to PDF iTextSharp usage

XMLWorkerHelper convert html page into pdf produces only first 2 pages

.NET C# - MigraDoc - How to change document charset?

Categories

Resources