iTextSharp PDF resize and data alignment

I am using iTextSharp to convert HTML to PDF, but I want the generated PDF to be 5 cm wide. I used the following code:
var pgSize = new iTextSharp.text.Rectangle(2.05f, 2.05f);
Document doc = new Document(pgSize);
but it only shrinks the page, and my data disappears from the PDF or is hidden.
How can I center the data in the PDF or resize the PDF correctly? Here is my code:
public void ConvertHTMLToPDF(string HTMLCode)
{
try
{
System.IO.StringWriter stringWrite = new StringWriter();
System.Web.UI.HtmlTextWriter htmlWrite = new HtmlTextWriter(stringWrite);
StringReader reader = new StringReader(HTMLCode);
var pgSize = new iTextSharp.text.Rectangle(2.05f, 2.05f);
Document doc = new Document(pgSize);
HTMLWorker parser = new HTMLWorker(doc);
PdfWriter.GetInstance(doc, new FileStream(Server.MapPath("~") + "/App_Data/HTMLToPDF.pdf",
FileMode.Create));
doc.Open();
foreach (IElement element in HTMLWorker.ParseToList(
new StringReader(HTMLCode), null))
{
doc.Add(element);
}
doc.Close();
Response.End();
}
catch (Exception ex)
{
}
}

The Rectangle constructor takes its dimensions in user units (points, 1/72 inch by default), so you are creating a PDF that measures 0.0723 cm by 0.0723 cm. That is much too small to hold any content. If you want to create a PDF of 5 cm by 5 cm, you need to create your document like this:
var pgSize = new iTextSharp.text.Rectangle(141.732f, 141.732f);
Document doc = new Document(pgSize);
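The numbers above follow from the default PDF user unit of 72 points per inch and 2.54 cm per inch. A minimal sketch of that arithmetic is shown here; the class and method names (PdfUnits, CmToPoints, PointsToCm) are made up for illustration and are not part of iTextSharp:

```csharp
using System;

public static class PdfUnits
{
    // Default PDF user space: 72 units (points) per inch; 1 inch = 2.54 cm.
    public const float PointsPerInch = 72f;
    public const float CmPerInch = 2.54f;

    // Centimetres to points, e.g. for the arguments of iTextSharp.text.Rectangle.
    public static float CmToPoints(float cm)
    {
        return cm / CmPerInch * PointsPerInch;
    }

    // Points back to centimetres, useful for checking what a page size means.
    public static float PointsToCm(float points)
    {
        return points / PointsPerInch * CmPerInch;
    }
}
```

With this helper, PdfUnits.CmToPoints(5f) gives roughly 141.73, and PdfUnits.PointsToCm(2.05f) gives roughly 0.0723, which is the page size the original code produced.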
As for the alignment, that should be defined in the HTML, but you are using an old version of iText and you are using the deprecated HTMLWorker.
You should upgrade to iText 7 and pdfHTML as described here: Converting HTML to PDF using iText
Also: the size of the page can be defined in the @page rule of the CSS. See Huge white space after header in PDF using Flying Saucer
Why would you make it difficult for yourself by using an old iText version, when the new version allows you to do this:
@page {
size: 5cm 5cm;
}
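For illustration, this is roughly how the @page rule could be combined with pdfHTML from C#. It is a sketch that assumes the itext7 and itext7.pdfhtml packages are referenced. The class name SmallPagePdf, the margin, and the body styles are assumptions chosen for the example, not values from the question:

```csharp
using System.IO;
using iText.Html2pdf;

public static class SmallPagePdf
{
    // Wraps an HTML fragment in a document whose @page rule asks for a 5cm x 5cm page.
    // The margin and the centered body text are illustrative choices.
    public static string WrapWithPageSize(string bodyHtml)
    {
        return "<html><head><style>"
             + "@page { size: 5cm 5cm; margin: 0.3cm; }"
             + "body { text-align: center; font-size: 8pt; }"
             + "</style></head><body>" + bodyHtml + "</body></html>";
    }

    // Converts the wrapped HTML to a PDF file at destPath using pdfHTML.
    public static void Convert(string bodyHtml, string destPath)
    {
        using (var pdfStream = new FileStream(destPath, FileMode.Create))
        {
            HtmlConverter.ConvertToPdf(WrapWithPageSize(bodyHtml), pdfStream);
        }
    }
}
```

The page size and the alignment both live in the CSS here, so the C# code does not need to build a Rectangle at all.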


How to use itext 7 to generate a PDF from an HTML div and save it to a folder on the server in .net

I'm trying to create a CV builder that saves the CV edited by the user to a folder in my project, so that it can later be sent by email. I have got as far as using iText to create a PDF of an HTML div, but the result has no CSS and none of the text values I retrieved from my database. From some research I found that my problem could be solved by using iText 7 and the pdfHTML add-on, but I cannot find any proper examples of how to use it with my ASP.NET code. I would really appreciate any help.
Below is the code for the button click event I use to generate the PDF:
protected void ButtonDownload_Click(object sender, EventArgs e)
{
Response.ContentType = "application/pdf";
//Response.AddHeader("content-disposition", "attachment;filename=Panel.pdf");
Response.Cache.SetCacheability(HttpCacheability.NoCache);
StringWriter sw = new StringWriter();
HtmlTextWriter hw = new HtmlTextWriter(sw);
contentdiv.RenderControl(hw); //convert the div to PDF
StringReader sr = new StringReader(sw.ToString());
Document pdfDoc = new Document(PageSize.A4, 10f, 10f, 10f, 0f);
HTMLWorker htmlparser = new HTMLWorker(pdfDoc);
PdfWriter.GetInstance(pdfDoc, Response.OutputStream);
pdfDoc.Open();
htmlparser.Parse(sr);
pdfDoc.Close();
string filename = base.Server.MapPath("~/PDF/" + "UserCV.pdf");
HttpContext.Current.Request.SaveAs(filename, false);
Response.End();
}
This picture shows the PDF result I get when I click the download button.
And this is the HTML page it is trying to convert.
The text below the headings on the HTML page consists of Labels whose values are set by retrieving values from a database.
This is an example of how to use pdfHTML.
This example is quite extensive, as it also sets document properties and registers a custom font.
public void createPdf(String src, String dest, String resources) throws IOException {
try {
FileOutputStream outputStream = new FileOutputStream(dest);
WriterProperties writerProperties = new WriterProperties();
//Add metadata
writerProperties.addXmpMetadata();
PdfWriter pdfWriter = new PdfWriter(outputStream, writerProperties);
PdfDocument pdfDoc = new PdfDocument(pdfWriter);
pdfDoc.getCatalog().setLang(new PdfString("en-US"));
//Set the document to be tagged
pdfDoc.setTagged();
pdfDoc.getCatalog().setViewerPreferences(new PdfViewerPreferences().setDisplayDocTitle(true));
//Set meta tags
PdfDocumentInfo pdfMetaData = pdfDoc.getDocumentInfo();
pdfMetaData.setAuthor("Joris Schellekens");
pdfMetaData.addCreationDate();
pdfMetaData.getProducer();
pdfMetaData.setCreator("JS");
pdfMetaData.setKeywords("example, accessibility");
pdfMetaData.setSubject("PDF accessibility");
//Title is derived from html
// pdf conversion
ConverterProperties props = new ConverterProperties();
FontProvider fp = new FontProvider();
fp.addStandardPdfFonts();
fp.addDirectory(resources);//The noto-nashk font file (.ttf extension) is placed in the resources
props.setFontProvider(fp);
props.setBaseUri(resources);
//Setup custom tagworker factory for better tagging of headers
DefaultTagWorkerFactory tagWorkerFactory = new AccessibilityTagWorkerFactory();
props.setTagWorkerFactory(tagWorkerFactory);
HtmlConverter.convertToPdf(new FileInputStream(src), pdfDoc, props);
pdfDoc.close();
} catch (Exception e) {
e.printStackTrace();
}
}
The most relevant line here is
HtmlConverter.convertToPdf(new FileInputStream(src), pdfDoc, props);
This tells pdfHTML to convert the input stream (specified by src), put the content in pdfDoc, and use the given ConverterProperties (specified by props).
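Since the question is about ASP.NET, here is a hedged C# sketch of the same conversion step using the .NET version of pdfHTML (assuming the itext7.pdfhtml package is referenced). The class name CvPdfWriter and its parameters are hypothetical; the base URI is what lets pdfHTML resolve relative CSS and image links found in the HTML:

```csharp
using System.IO;
using iText.Html2pdf;

public static class CvPdfWriter
{
    // Converts an HTML string to a PDF file on disk.
    // baseUri: location against which relative CSS/image URLs in the HTML are resolved.
    // destPath: full path of the PDF to write, e.g. the result of Server.MapPath.
    public static void SaveHtmlAsPdf(string html, string baseUri, string destPath)
    {
        var props = new ConverterProperties();
        props.SetBaseUri(baseUri);

        // Make sure the target folder exists before opening the file.
        string folder = Path.GetDirectoryName(Path.GetFullPath(destPath));
        Directory.CreateDirectory(folder);

        using (var dest = new FileStream(destPath, FileMode.Create, FileAccess.Write))
        {
            HtmlConverter.ConvertToPdf(html, dest, props);
        }
    }
}
```

In the click handler from the question, this could presumably be called with the rendered div (sw.ToString()) as html and Server.MapPath("~/PDF/UserCV.pdf") as destPath. The labels have to be data-bound before the control is rendered, otherwise the database values will still be missing from the HTML.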

Setting my paragraphs to begin at the top margin with iTextSharp

I'm trying to build pdfs to digitize our reporting system at my company. I've used iTextSharp and so far it looks great but my margins don't seem to be working properly. I've set the margins to a config file and the left and right margins are working great, but my paragraph seems to start about 30% down the page regardless of the top and bottom margin. Here's the code I'm using:
public int PrintPdf()
{
//Getting the path
//Path.GetFileNameWithoutExtension("Test_Doc_Print") + ".pdf");
object OutputFileName = this._path;
//Making the PDF Doc
iTextSharp.text.Document PDFReport = new iTextSharp.text.Document
(
PageSize.A4.Rotate(),
/*this._left,
this._right,
this._top,
this._bottom*/
10,
10,
10,
10
);
//Setting The Font
string fontpath = @"C:\Windows\Fonts\";
BaseFont monoFont = BaseFont.CreateFont(fontpath + "Consola.ttf", BaseFont.CP1252, BaseFont.EMBEDDED);
Font fontPDF = new Font(monoFont, this._fontSize);
// create file stream for writing the PDF
FileStream fs = new FileStream(this._path, FileMode.Create, FileAccess.ReadWrite);
//FileStream fs = new FileStream(@"c:\\Reportlocation", FileMode.Create);
// Create an FCFC scan object to convert TextToPrint page at a time
FCFCScanner page = new FCFCScanner(this._text);
// Create PDF writer and associate with file stream
//iTextSharp.text.pdf.PdfWriter writer = new iTextSharp.text.pdf.PdfWriter.GetInstance(PDFReport, fs);
PdfWriter writer = PdfWriter.GetInstance(PDFReport, fs);
//Opening the PDF Doc
PDFReport.Open();
//Load each page from the string into the PDF
page.NextPage();
do
{
if (page.PageLength > 0)
{
Paragraph prg = new Paragraph(page.Page, fontPDF);
PDFReport.Add(prg);
PDFReport.NewPage();
page.NextPage();
}
} while (page.MorePages);
PDFReport.Close();
return (0);
}
I've set the margins to 10 (hardcoded for now) to show what I'm working with. This program should read a string that I send it from my StringBuilder class.
It's designed to receive one page of text at a time and to convert it into a PDF document.
Ok, so the problem: when the page is built, the first paragraph doesn't begin at the top margin. If I reduce the margin, it doesn't shift the paragraph up to the new margin. It's causing my PDFs to be much longer as the text that fits easily on a printer page takes two PDF pages. Any help with getting my paragraph to simply begin at the top margin would be really appreciated.
I'm relatively new to programming and this is my first post, so if you need more information, let me know and I'll add more info.

How to create a PDF containing text?

I want to create a PDF document containing some text that I have in the form of a string. This is what I have so far:
iTextSharp.text.Document d = new iTextSharp.text.Document();
string dosya = (@"C:\Deneme.pdf");
PdfWriter.GetInstance(d, new System.IO.FileStream(dosya, System.IO.FileMode.Create));
d.AddSubject(text);
Your question is unclear because you don't mention if you want to create a PDF from scratch (which may be what you want to do based on your code sample) or if you want to add text to an existing PDF (which is what the subject of your question suggests).
In both cases, you should take a look at the official documentation.
If you want to create a PDF from scratch, take a look at the Hello World example:
public void CreatePdf(Stream stream) {
// step 1
using (Document document = new Document()) {
// step 2
PdfWriter.GetInstance(document, stream);
// step 3
document.Open();
// step 4
document.Add(new Paragraph("Hello World!"));
}
}
The value of stream can be any output stream (one that writes to memory, one that writes to a file,...).
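To illustrate that point, the sketch below calls the same CreatePdf method once with a MemoryStream and once with a FileStream. It assumes iTextSharp 5; the wrapper class HelloWorldDemo and the two helper methods are hypothetical names added for the example:

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class HelloWorldDemo
{
    // Same four steps as the Hello World example above.
    public static void CreatePdf(Stream stream)
    {
        using (Document document = new Document())
        {
            PdfWriter.GetInstance(document, stream);
            document.Open();
            document.Add(new Paragraph("Hello World!"));
        }
    }

    // Writes the PDF to memory and returns the bytes (e.g. to send in a response).
    public static byte[] CreateInMemory()
    {
        using (var ms = new MemoryStream())
        {
            CreatePdf(ms);
            // ToArray still works after the writer has closed the stream.
            return ms.ToArray();
        }
    }

    // Writes the PDF to a file on disk.
    public static void CreateOnDisk(string path)
    {
        using (var fs = new FileStream(path, FileMode.Create))
        {
            CreatePdf(fs);
        }
    }
}
```

The PDF-building code stays the same in both cases; only the stream passed in decides where the bytes end up.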
If you want to add a string to an existing PDF, take a look at a PdfStamper example.
public static byte[] Stamp(byte[] resource) {
PdfReader reader = new PdfReader(resource);
using (var ms = new MemoryStream()) {
using (PdfStamper stamper = new PdfStamper(reader, ms)) {
PdfContentByte canvas = stamper.GetOverContent(1);
ColumnText.ShowTextAligned(
canvas,
Element.ALIGN_LEFT,
new Phrase("Hello people!"),
36, 540, 0
);
}
return ms.ToArray();
}
}
These examples were taken from a book I once wrote. You will find the examples through this link: http://developers.itextpdf.com/examples/itext-action-second-edition
This answer assumes that you are using iText 5 (an assumption that is based on your code snippet). The most recent version is iText 7. That requires code that is totally different.

converting HTML to a multi-column PDF

I am trying to generate a multi-column PDF from HTML using iText for .NET.
I am using CSS3 syntax to generate two columns.
The code below is not working for me.
CSS
column-count:2;
C# Code
StringReader html = new StringReader(@"
<div style='column-count:2;'>Sample Text. Sample Text. Sample Text. Sample Text.
Sample Text. Sample Text. Sample Text. Sample Text. Sample Text. Sample Text.
Sample Text. Sample Text. Sample Text. Sample Text. Sample Text. Sample Text.
Sample Text. Sample Text. </div>
");
Document document = new Document();
PdfWriter writer = PdfWriter.GetInstance(document, new FileStream(@"d:\temp\xyz.pdf", FileMode.Create));
document.Open();
XMLWorkerHelper.GetInstance().ParseXHtml(
writer, document, html
);
document.Close();
Please suggest what the issue is in this code, or whether another HTML to PDF library is available that fixes this issue.
The CSS property column-count is not supported in XML Worker, and it probably never will be.
However, this doesn't mean that you can't display HTML in columns.
If you go to the official XML Worker documentation, you'll find the ParseHtmlObjects example, where we parse a large HTML file and render it to a PDF with two columns: walden5.pdf
This is done by parsing the HTML into an ElementList first:
// CSS
CSSResolver cssResolver =
XMLWorkerHelper.getInstance().getDefaultCssResolver(true);
// HTML
HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
htmlContext.autoBookmark(false);
// Pipelines
ElementList elements = new ElementList();
ElementHandlerPipeline end = new ElementHandlerPipeline(elements, null);
HtmlPipeline html = new HtmlPipeline(htmlContext, end);
CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);
// XML Worker
XMLWorker worker = new XMLWorker(css, true);
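XMLParser p = new XMLParser(worker);
// Run the parser so that the elements list gets filled
// (HTML is assumed to hold the path to the input file)
p.parse(new FileInputStream(HTML));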
Once we have the list of Element objects, we can add them to a ColumnText object:
// step 1
Document document = new Document(PageSize.LEGAL.rotate());
// step 2
PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(file));
// step 3
document.open();
// step 4
Rectangle left = new Rectangle(36, 36, 486, 586);
Rectangle right = new Rectangle(522, 36, 972, 586);
ColumnText column = new ColumnText(writer.getDirectContent());
column.setSimpleColumn(left);
boolean leftside = true;
int status = ColumnText.START_COLUMN;
for (Element e : elements) {
if (ColumnText.isAllowedElement(e)) {
column.addElement(e);
status = column.go();
while (ColumnText.hasMoreText(status)) {
if (leftside) {
leftside = false;
column.setSimpleColumn(right);
}
else {
document.newPage();
leftside = true;
column.setSimpleColumn(left);
}
status = column.go();
}
}
}
// step 5
document.close();
As you can see, you need to make some decisions here: you need to define the rectangles on the pages. You need to introduce new pages, etc...
Note: there is currently no C# port of this documentation. Please think of the Java code as if it were pseudo code.
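As an unofficial illustration only, the Java above might translate to iTextSharp 5 roughly as follows. This sketch assumes an XML Worker version that offers the static XMLWorkerHelper.ParseToElementList helper, which is used here in place of the explicit pipeline setup. The class name TwoColumnHtml and its parameters are made up for the example:

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;

public static class TwoColumnHtml
{
    // Parses XHTML into elements and flows them through two columns per page.
    // The HTML must yield at least one element, or the document will have no pages.
    public static void Render(string xhtml, string css, string destPath)
    {
        // Assumption: ParseToElementList is available in the XML Worker version in use.
        ElementList elements = XMLWorkerHelper.ParseToElementList(xhtml, css);

        using (var fs = new FileStream(destPath, FileMode.Create))
        using (var document = new Document(PageSize.LEGAL.Rotate()))
        {
            PdfWriter writer = PdfWriter.GetInstance(document, fs);
            document.Open();

            // Same column rectangles as in the Java example.
            var left = new Rectangle(36, 36, 486, 586);
            var right = new Rectangle(522, 36, 972, 586);
            var column = new ColumnText(writer.DirectContent);
            column.SetSimpleColumn(left);
            bool leftSide = true;

            foreach (IElement e in elements)
            {
                if (!ColumnText.IsAllowedElement(e)) continue;
                column.AddElement(e);
                int status = column.Go();
                // Content left over: move to the right column, then to a new page.
                while (ColumnText.HasMoreText(status))
                {
                    if (leftSide)
                    {
                        leftSide = false;
                        column.SetSimpleColumn(right);
                    }
                    else
                    {
                        document.NewPage();
                        leftSide = true;
                        column.SetSimpleColumn(left);
                    }
                    status = column.Go();
                }
            }
        }
    }
}
```

The column-count style on the div is still ignored; the two-column layout comes entirely from the two rectangles, so their coordinates have to match the chosen page size.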

Make a pdf conforming PDF/A with only images using iTextSharp

I'm using iTextSharp to generate PDF/A documents from images. So far I've not been successful.
Edit: I'm using iTextSharp to generate the PDF
All I'm trying to do is make a PDF/A document (1a or 1b, whichever suits) with some images. This is the code I've come up with so far, but I keep getting errors when I try to validate the files with pdf-tools or validatepdfa.
These are the errors I get from pdf-tools (using PDF/A-1b validation):
Edit: MarkInfo and Color Space aren't working yet. The rest is okay
Validating file "0.pdf" for conformance level pdfa-1a
The key MarkInfo is required but missing.
A device-specific color space (DeviceRGB) without an appropriate output intent is used.
The document does not conform to the requested standard.
The document contains device-specific color spaces.
The document doesn't provide appropriate logical structure information.
Done.
Main flow
var output = new MemoryStream();
using (var iccProfileStream = new FileStream("ToPdfConverter/ColorProfiles/sRGB_v4_ICC_preference_displayclass.icc", FileMode.Open))
{
var document = new Document(new Rectangle(PageSize.A4.Width, PageSize.A4.Height), 0f, 0f, 0f, 0f);
var pdfWriter = PdfWriter.GetInstance(document, output);
pdfWriter.PDFXConformance = PdfWriter.PDFA1A;
document.Open();
var pdfDictionary = new PdfDictionary(PdfName.OUTPUTINTENT);
pdfDictionary.Put(PdfName.OUTPUTCONDITION, new PdfString("sRGB IEC61966-2.1"));
pdfDictionary.Put(PdfName.INFO, new PdfString("sRGB IEC61966-2.1"));
pdfDictionary.Put(PdfName.S, PdfName.GTS_PDFA1);
var iccProfile = ICC_Profile.GetInstance(iccProfileStream);
var pdfIccBased = new PdfICCBased(iccProfile);
pdfIccBased.Remove(PdfName.ALTERNATE);
pdfDictionary.Put(PdfName.DESTOUTPUTPROFILE, pdfWriter.AddToBody(pdfIccBased).IndirectReference);
pdfWriter.ExtraCatalog.Put(PdfName.OUTPUTINTENT, new PdfArray(pdfDictionary));
var image = PrepareImage(imageBytes);
document.Open();
document.Add(image);
pdfWriter.CreateXmpMetadata();
pdfWriter.CloseStream = false;
document.Close();
}
return output.GetBuffer();
This is PrepareImage().
It's used to flatten the image to BMP, so I don't need to bother with alpha channels.
private Image PrepareImage(Stream stream)
{
Bitmap bmp = new Bitmap(System.Drawing.Image.FromStream(stream));
var file = new MemoryStream();
bmp.Save(file, ImageFormat.Bmp);
var image = Image.GetInstance(file.GetBuffer());
if (image.Height > PageSize.A4.Height || image.Width > PageSize.A4.Width)
{
image.ScaleToFit(PageSize.A4.Width, PageSize.A4.Height);
}
return image;
}
Can anyone help me into a direction to fix the errors?
Specifically the device-specific color spaces
Edit: More explanation: What I'm trying to achieve is, converting scanned images to PDF/A for long-term data storage
Edit: added some files I'm using to test with
PDFs and Pictures.rar (3.9 MB)
https://mega.co.nz/#!n8pClYgL!NJOJqSO3EuVrqLVyh3c43yW-u_U35NqeB0svc6giaSQ
OK, I checked one of your files in callas pdfToolbox and it says: "Device color space used but no PDF/A output intent". Which I took as a sign that you do something wrong while writing an output intent to the document. I then converted that document to PDF/A-1b with the same tool and the difference is obvious.
Perhaps there are other errors you need to fix, but the first error here is that you put a key in the catalog dict for the PDF file that is named "OutputIntent". That's wrong: page 75 of the PDF Specification states that the key should be named "OutputIntents".
Like I said, perhaps there are other problems with your file beyond this, but the wrong name for the key causes PDF/A validators not to find the Output Intent you try to put in the file...
First of all, PDF/X IS NOT PDF/A.
Second, you're using the wrong PdfWriter. It should be PdfAWriter.
Unfortunately I do not have a solution for the image problem, but I do have one for points 1 and 2.
Regards
using System;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using System.Text;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.text.html.simpleparser;
using iTextSharp.tool.xml;
using System.Drawing;
using System.Drawing.Imaging;
namespace Tests
{
/*
* References:
* UTF-8 encoding http://stackoverflow.com/questions/4902033/itextsharp-5-polish-character
* PDFA http://www.codeproject.com/Questions/661704/Create-pdf-A-using-itextsharp
* Images http://stackoverflow.com/questions/15896581/make-a-pdf-conforming-pdf-a-with-only-images-using-itextsharp
*/
[TestClass]
public class UnitTest1
{
/*
* IMPORTANT: Restrictions with html usage of tags and attributes
* 1. Dont use * <head> <title>Sklep</title> </head>, because title is rendered to the page
*/
// Test cases
static string contents = "<html><body style=\"font-family:arial unicode ms;font-size: 8px;\"><p style=\"text-align: center;\"> Davčna številka dolžnika: 74605968<br /> </p><table> <tr> <td><b>\u0160t. sklepa: 88711501</b></td> <td style=\"text-align: right;\">Davčna številka dolžnika: 74605968</td> </tr> </table> <br/><img src=\"http://img.rtvslo.si/_static/images/rtvslo_mmc_logo.png\" /></body></html>";
//static string contents = "<html><body style=\"font-family:arial unicode ms;font-size: 8px;\"><p style=\"text-align: center;\"> Davčna številka dolžnika: 74605968<br /> </p><table> <tr> <td><b>\u0160t. sklepa: 88711501</b></td> <td style=\"text-align: right;\">Davčna številka dolžnika: 74605968</td> </tr> </table> <br/></body></html>";
//[TestMethod]
public void CreatePdfHtml()
{
createPDF(contents, true);
}
private void createPDF(string html, bool isPdfa)
{
TextReader reader = new StringReader(html);
Document document = new Document(PageSize.A4, 30, 30, 30, 30);
HTMLWorker worker = new HTMLWorker(document);
PdfWriter writer;
if (isPdfa)
{
//set conformity level
writer = PdfAWriter.GetInstance(document, new FileStream(@"c:\temp\testA.pdf", FileMode.Create), PdfAConformanceLevel.PDF_A_1B);
//set pdf version
writer.SetPdfVersion(PdfAWriter.PDF_VERSION_1_4);
// Create XMP metadata. It's a PDF/A requirement.
writer.CreateXmpMetadata();
}
else
{
writer = PdfWriter.GetInstance(document, new FileStream(@"c:\temp\test.pdf", FileMode.Create));
}
document.Open();
if (isPdfa) // the document must be opened first, or this will fail
{
// Set output intent for uncalibrated color space. PDF/A requirement.
ICC_Profile icc = ICC_Profile.GetInstance(Environment.GetEnvironmentVariable("SystemRoot") + @"\System32\spool\drivers\color\sRGB Color Space Profile.icm");
writer.SetOutputIntents("Custom", "", "http://www.color.org", "sRGB IEC61966-2.1", icc);
}
//register font used in html
FontFactory.Register(Environment.GetEnvironmentVariable("SystemRoot") + "\\Fonts\\ARIALUNI.TTF", "arial unicode ms");
//adding custom style attributes to html specific tasks. Can be used instead of css
//this one is a must for displaying UTF-8 language-specific characters (čćžđpš)
iTextSharp.text.html.simpleparser.StyleSheet ST = new iTextSharp.text.html.simpleparser.StyleSheet();
ST.LoadTagStyle("body", "encoding", "Identity-H");
worker.SetStyleSheet(ST);
worker.StartDocument();
worker.Parse(reader);
worker.EndDocument();
worker.Close();
document.Close();
}
}
}
