I am creating an application that has to generate thousands of PDF files at a time. I am using iTextSharp to do this, and it seems that the PdfReader is slowing down the process. Below is my code.
using (MemoryStream foutput = new MemoryStream())
{
    using (PdfReader pdf = new PdfReader(templateByteArray)) // slow
    {
        using (PdfStamper stamper = new PdfStamper(pdf, foutput))
        {
            AcroFields form = stamper.AcroFields;
            form.SetField(_dic[@"1,1"], "some string1");
            form.SetField(_dic[@"1,2"], "some string2");
            stamper.FormFlattening = true;
        }
        pdf.RemoveUsageRights();
    }
    EnqueueFile(foutput.ToArray());
}
I have a separate consumer thread that takes every byte array from a queue and writes the PDF documents to the HDD. After experimenting with the code, it seems that the bottleneck is in the PdfReader class. Is there an alternative way of doing what I am trying to do, or do you have any suggestions?
You are reading the properties of the fields _dic[@"1,1"] and _dic[@"1,2"] over and over again. You should cache those properties. In Java, that's done like this:
HashMap<String,TextField> fieldCache = new HashMap<String,TextField>();
This cache stores information about each TextField that is encountered. You introduce it with the setFieldCache() method:
public void manipulatePdf(String src, String dest,
        HashMap<String,TextField> cache, String name, String login)
        throws IOException, DocumentException {
    PdfReader reader = new PdfReader(src);
    PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
    AcroFields form = stamper.getAcroFields();
    form.setFieldCache(cache);
    form.setField("test", "test");
    stamper.close();
    reader.close();
}
The first time the field with name "test" is encountered, the information will be read from the existing file. The next time it is encountered, the TextField info will be retrieved from the cache instead of from the existing file.
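The effect of such a cache can be illustrated without iText at all. The following is only a sketch of the general memoization pattern, not the library's implementation: the class FieldCacheSketch, the method readFieldInfoFromTemplate() and the "info-for-" strings are hypothetical stand-ins for the expensive step of reading a field's properties from the existing file.

```java
import java.util.HashMap;
import java.util.Map;

public class FieldCacheSketch {
    // Counts how often the (hypothetical) expensive lookup actually runs.
    static int expensiveLookups = 0;

    // Hypothetical stand-in for parsing a field's properties out of the template PDF.
    static String readFieldInfoFromTemplate(String fieldName) {
        expensiveLookups++;
        return "info-for-" + fieldName;
    }

    // Returns the cached info when present; otherwise reads it once and stores it.
    static String fieldInfo(Map<String, String> cache, String fieldName) {
        return cache.computeIfAbsent(fieldName, FieldCacheSketch::readFieldInfoFromTemplate);
    }

    public static void main(String[] args) {
        // One cache shared by every generated document.
        Map<String, String> cache = new HashMap<>();
        for (int doc = 0; doc < 1000; doc++) {
            fieldInfo(cache, "test");
        }
        // The expensive read ran only for the first document.
        System.out.println("expensive lookups: " + expensiveLookups);
    }
}
```

The important design point is that the map outlives a single document: if a fresh cache were created for every file, nothing would be saved.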
I want to create a PDF document containing some text that I have in the form of a string. This is what I have so far:
iTextSharp.text.Document d = new iTextSharp.text.Document();
string dosya = @"C:\Deneme.pdf";
PdfWriter.GetInstance(d, new System.IO.FileStream(dosya, System.IO.FileMode.Create));
d.AddSubject(text);
Your question is unclear because you don't mention if you want to create a PDF from scratch (which may be what you want to do based on your code sample) or if you want to add text to an existing PDF (which is what the subject of your question suggests).
In both cases, you should take a look at the official documentation.
If you want to create a PDF from scratch, take a look at the Hello World example:
public void CreatePdf(Stream stream) {
    // step 1
    using (Document document = new Document()) {
        // step 2
        PdfWriter.GetInstance(document, stream);
        // step 3
        document.Open();
        // step 4
        document.Add(new Paragraph("Hello World!"));
    }
}
The value of stream can be any output stream (one that writes to memory, one that writes to a file,...).
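The same idea, writing to an abstract stream so that the caller picks the destination, can be sketched in plain Java. This is an analogy only and does not use iText: writeGreeting() is a hypothetical stand-in for CreatePdf, and it writes a short text instead of a PDF.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class AnyStreamSketch {
    // Hypothetical stand-in for CreatePdf: writes to whatever stream it is given.
    static void writeGreeting(OutputStream out) {
        try {
            out.write("Hello World!".getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Destination 1: memory.
        ByteArrayOutputStream memory = new ByteArrayOutputStream();
        writeGreeting(memory);

        // Destination 2: a temporary file on disk.
        Path file = Files.createTempFile("greeting", ".txt");
        try (OutputStream fileOut = Files.newOutputStream(file)) {
            writeGreeting(fileOut);
        }

        System.out.println(memory.size() + " bytes in memory, "
                + Files.size(file) + " bytes on disk");
        Files.delete(file);
    }
}
```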
If you want to add a string to an existing PDF, take a look at a PdfStamper example.
public static byte[] Stamp(byte[] resource) {
    PdfReader reader = new PdfReader(resource);
    using (var ms = new MemoryStream()) {
        using (PdfStamper stamper = new PdfStamper(reader, ms)) {
            PdfContentByte canvas = stamper.GetOverContent(1);
            ColumnText.ShowTextAligned(
                canvas,
                Element.ALIGN_LEFT,
                new Phrase("Hello people!"),
                36, 540, 0
            );
        }
        return ms.ToArray();
    }
}
These examples were taken from a book I once wrote. You will find the examples through this link: http://developers.itextpdf.com/examples/itext-action-second-edition
This answer assumes that you are using iText 5 (an assumption that is based on your code snippet). The most recent version is iText 7. That requires code that is totally different.
I have an XFA PDF file (which I did not author). It's a third-party form which I'm trying to fill out. I filled out the form manually, then I used iTextSharp to save the full XML DomDocument from it. Now I'm trying to apply that same XML file programmatically. However, the resulting PDF doesn't have any of the fields filled in. This is the code I'm using to apply the XML file:
PdfReader pdfReader = new PdfReader(inputPdf);
using (MemoryStream ms = new MemoryStream())
{
    using (PdfStamper stamper = new PdfStamper(pdfReader, ms, '\0', true))
    {
        XfaForm xfaForm = new XfaForm(pdfReader);
        XmlDocument doc = new XmlDocument();
        doc.Load(inputXml);
        xfaForm.DomDocument = doc;
        xfaForm.Changed = true;
        XfaForm.SetXfa(xfaForm, stamper.Reader, stamper.Writer);
    }
    var bytes = ms.ToArray();
    System.IO.File.WriteAllBytes(outputPdf, bytes);
}
inputPdf is the path to the original empty PDF file.
inputXml is the path to the XML file extracted from the filled out PDF file. This is the entire XML file, and not just the datasets section.
What's interesting is that if I create the PdfStamper object like this instead:
new PdfStamper(pdfReader, ms);
then I see the data in the fields, but of course then I have the associated issues with not appending.
Any suggestions on what I might be doing wrong? I just can't seem to get any of the changes to the DomDocument to save.
I've recently used iTextSharp to create a PDF by importing the 20 pages from an existing PDF and then adding a dynamically generated link to the bottom of the last page. It works fine... kind of. Viewing the generated PDF in Acrobat Reader on a Windows PC displays everything as expected, although when closing the document it always asks "Do you want to save changes?". Viewing the generated PDF on a Surface Pro with PDF Reader displays the document without the first and last pages. Apparently, on a mobile device using Polaris Office, the first and last pages are also missing.
I'm wondering if when the new PDF is generated it's not getting closed off quite properly and that's why it asks "Do you want to save changes?" when closing it. And maybe that's also why it doesn't display correctly in some PDF reader apps.
Here's the code:
using (var reader = new PdfReader(HostingEnvironment.MapPath("~/app/pdf/OriginalDoc.pdf")))
{
    using (
        var fileStream =
            new FileStream(
                HostingEnvironment.MapPath("~/documents/attachments/DocWithLink_" + id + ".pdf"),
                FileMode.Create, FileAccess.Write))
    {
        var document = new Document(reader.GetPageSizeWithRotation(1));
        var writer = PdfWriter.GetInstance(document, fileStream);
        using (PdfStamper stamper = new PdfStamper(reader, fileStream))
        {
            var baseFont = BaseFont.CreateFont(BaseFont.HELVETICA_BOLD, BaseFont.CP1252,
                BaseFont.NOT_EMBEDDED);
            Font linkFont = FontFactory.GetFont("Arial", 12, Font.UNDERLINE, BaseColor.BLUE);
            document.Open();
            for (var i = 1; i <= reader.NumberOfPages; i++)
            {
                document.NewPage();
                var importedPage = writer.GetImportedPage(reader, i);
                // Copy page of original document to new document.
                var contentByte = writer.DirectContent;
                contentByte.AddTemplate(importedPage, 0, 0);
                if (i == reader.NumberOfPages) // It's the last page so add link.
                {
                    PdfContentByte cb = stamper.GetOverContent(i);
                    //Create a ColumnText object
                    var ct = new ColumnText(cb);
                    //Set the rectangle to write to
                    ct.SetSimpleColumn(100, 30, 500, 90, 0, PdfContentByte.ALIGN_LEFT);
                    //Add some text and make it blue so that it looks like a hyperlink
                    var c = new Chunk("Click here!", linkFont);
                    var congrats = new Paragraph("Congratulations on reading the eBook! ");
                    congrats.Alignment = PdfContentByte.ALIGN_LEFT;
                    c.SetAnchor("http://www.domain.com/pdf/response/" + encryptedId);
                    //Add the chunk to the ColumnText
                    congrats.Add(c);
                    ct.AddElement(congrats);
                    //Tell the system to process the above commands
                    ct.Go();
                }
            }
        }
    }
}
I've looked at these posts with similar issues but none seem to quite provide the answer I need:
iTextSharp-generated PDFs cause save dialog when closing
Using iTextSharp to write data to PDF works great, but Acrobat Reader asks 'Do you want to save changes' when closing file
(Or they refer to memory streams instead of writing to disk etc)
My question is, how do I modify the above so that when closing the generated PDF in Acrobat Reader there's no "Do you want to save changes?" prompt. The answer to that may solve the problems with missing pages on Surface Pro etc but if you know anything else about what might be causing that I'd like to hear about it.
Any suggestions would be very welcome! Thanks!
At first glance (and without much coffee yet) it appears that you're using a PdfReader in three different contexts: as a source for a PdfStamper, as a source for a Document, and as a source for importing. So you are essentially importing a document into itself while also writing to it.
To give you a quick overview, the following code will essentially clone the contents of source.pdf into dest.pdf:
using (var reader = new PdfReader("source.pdf")) {
    using (var fileStream = new FileStream("dest.pdf", FileMode.Create, FileAccess.Write)) {
        using (PdfStamper stamper = new PdfStamper(reader, fileStream)) {
        }
    }
}
Since that does all of the cloning for you, you don't need to import pages or anything.
Then, if the only thing that you want to do is add some text to the last page, you can just use the above and ask the PdfStamper for a PdfContentByte using GetOverContent(), telling it which page number you're interested in. Then you can just use the rest of your ColumnText logic.
using (var reader = new PdfReader("Source.Pdf")) {
    using (var fileStream = new FileStream("Dest.Pdf", FileMode.Create, FileAccess.Write)) {
        using (PdfStamper stamper = new PdfStamper(reader, fileStream)) {
            //Get a PdfContentByte object
            var cb = stamper.GetOverContent(reader.NumberOfPages);
            //Create a ColumnText object
            var ct = new ColumnText(cb);
            //Set the rectangle to write to
            ct.SetSimpleColumn(100, 30, 500, 90, 0, PdfContentByte.ALIGN_LEFT);
            //Add some text and make it blue so that it looks like a hyperlink
            //(linkFont and encryptedId are the variables from the question's code)
            var c = new Chunk("Click here!", linkFont);
            var congrats = new Paragraph("Congratulations on reading the eBook! ");
            congrats.Alignment = PdfContentByte.ALIGN_LEFT;
            c.SetAnchor("http://www.domain.com/pdf/response/" + encryptedId);
            //Add the chunk to the ColumnText
            congrats.Add(c);
            ct.AddElement(congrats);
            //Tell the system to process the above commands
            ct.Go();
        }
    }
}
I want to take 2 pdf files and merge them together.
Each file is one page long. The reason to merge them is that one file is simply a footer. The footer needs to be attached to the existing file.
I'm using a stamper to try and merge the 2 files.
I successfully create the output file, but it doesn't have the footer. It's just a copy of the original input file. Any idea why they aren't merging?
using (Stream inputPdfStream = new FileStream(inputFile, FileMode.Open, FileAccess.Read, FileShare.Read))
using (Stream inputPdfFooterStream = new FileStream(footerPdf, FileMode.Open, FileAccess.Read, FileShare.Read))
using (Stream outputPdfStream = new FileStream(outputFile, FileMode.Create, FileAccess.Write, FileShare.None))
{
    var reader = new PdfReader(inputPdfStream);
    var stamper = new PdfStamper(reader, outputPdfStream);
    var pdfContentByte = stamper.GetOverContent(1);
    stamper.FormFlattening = true;
    stamper.Close();
}
There are different problems with your question.
Problem #1: why did you add the line stamper.FormFlattening = true;? Are you working with a form? I don't see you do anything with forms, so why would you flatten the document?
Problem #2: You say you want to merge two documents with PdfStamper. That is misleading. Merging documents is done with PdfCopy. From your explanation, I gather that you want to superimpose two documents. You are right that you need PdfStamper to do so.
Problem #3: You want to use a specific document containing a footer as company stationery. In that case, you want to add the content of the stationery under the actual content. Then why are you using stamper.GetOverContent(1);? Use stamper.GetUnderContent(1); instead.
Problem #4: You are creating an inputPdfFooterStream to read the document with the footer, but I don't see you using that stream anywhere. What do you expect?
Problem #5: You didn't read the documentation. This is your main problem. Download chapter 6 of my book (it's available for free, and I've been referring to it in dozens of answers on StackOverflow). Go to page 176 where it says "Adding company stationery to an existing document". That example meets your requirement completely!
// Create readers
PdfReader reader = new PdfReader(src);
PdfReader s_reader = new PdfReader(stationery);
using (MemoryStream ms = new MemoryStream()) {
    // Create the stamper
    using (PdfStamper stamper = new PdfStamper(reader, ms)) {
        // Add the stationery to each page
        PdfImportedPage page = stamper.GetImportedPage(s_reader, 1);
        int n = reader.NumberOfPages;
        PdfContentByte background;
        for (int i = 1; i <= n; i++) {
            background = stamper.GetUnderContent(i);
            background.AddTemplate(page, 0, 0);
        }
    }
    return ms.ToArray();
}
In your code, you only have one reader. In my code, I also have an object called s_reader that takes the footerPdf document and allows you to create a PdfImportedPage:
PdfImportedPage page = stamper.GetImportedPage(s_reader, 1);
This page is then added under the existing content of the actual document:
background = stamper.GetUnderContent(i);
background.AddTemplate(page, 0, 0);
Note that this example assumes that both documents have the same page size and that the origin of the coordinate system of the document with the actual content coincides with the lower-left corner. If that isn't the case with your PDFs, you can have a situation where the footer isn't visible or is only partly visible. Also: if the document with the actual content is opaque, it will also make the footer invisible.
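When the two lower-left corners do not coincide, one possible workaround is to pass a non-zero offset as the last two arguments of AddTemplate. The arithmetic can be sketched as follows. This is a plain-Java illustration, not iText code: the class and method names are hypothetical, the sample coordinates are made up, and it assumes the imported page is drawn in the stationery's own coordinate system without rotation or scaling.

```java
public class StationeryOffsetSketch {
    // Returns {tx, ty}: the translation that moves the stationery's lower-left
    // corner onto the content page's lower-left corner.
    static float[] alignLowerLeft(float contentLlx, float contentLly,
                                  float stationeryLlx, float stationeryLly) {
        return new float[] { contentLlx - stationeryLlx, contentLly - stationeryLly };
    }

    public static void main(String[] args) {
        // Assumed example values: the content page box starts at (0, 0),
        // the stationery page box starts at (-20, 36).
        float[] offset = alignLowerLeft(0f, 0f, -20f, 36f);
        System.out.println("tx=" + offset[0] + ", ty=" + offset[1]);
    }
}
```

The corner coordinates would come from the page rectangles of the two readers. Differing page sizes or rotated pages need more than a translation and are not covered by this sketch.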
I'm trying to add a new page to a PdfStamper, but this code doesn't add the template PDF's fields to the stamper.
private void InsertNewPage(PdfStamper stamper, int pageNumber)
{
    var pdfReader = new PdfReader(UrlTemplateBlankPage);
    pdfReader.SelectPages("1");
    stamper.InsertPage(pageNumber, pdfReader.GetPageSize(1));
    stamper.GetOverContent(pageNumber).AddTemplate(stamper.GetImportedPage(pdfReader, 1), 0, 0);
    //This code doesn't work because the code before is not adding the form
    var pdfFormFields = stamper.AcroFields;
    var fieldKeys = pdfReader.AcroFields.Fields.Keys;
    foreach (var k in fieldKeys.ToList())
    {
        pdfFormFields.RenameField(k, k + string.Format("_{0:000}", pageNumber));
    }
}
I searched online but I can't find an answer to my problem.
The PDF template I'm adding has some fields added with Acrobat. I can't attach the template, but I can give you all the information you need.
I cannot see how you instantiate the stamper. This is an example of how to read a PDF template and assign it to the stamper:
var reader = new PdfReader(TEMPLATE_PATH);
var pdfOutput = new FileStream(PDF_OUTPUT_PATH, FileMode.Create);
var stamper = new PdfStamper(reader, pdfOutput);
After that, you can set the fields using the SetField function:
stamper.AcroFields.SetField("FIELD1", "VALUE");
There is an option to make your fillable PDFs non-editable using:
stamper.FormFlattening = true;
Otherwise, your PDF remains fillable.
Once you have finished working with your files, close them:
stamper.Close();
reader.Close();
There is an example of how to use iTextSharp's PdfStamper at the following link: http://www.codeproject.com/Tips/679606/Filling-PDF-Form-using-iText-PDF-Library