How to save PDF using ITextSharp? - c#

I am working with PDF annotations using iTextSharp. I was able to add annotations pretty smoothly.
But now I'm trying to edit them. It looks like my PdfReader object is actually updated, but for some reason I can't save it. As shown in the snippet below, I try to get the byte array using a stamper. The byte array is only 1 byte longer than the previous version, no matter how long the annotation is. And when I open the PDF saved on the file system, I still have the old annotation...
private void UpdatePDFAnnotation(string title, string body)
{
    byte[] newBuffer;
    using (PdfReader pdfReader = new PdfReader(dataBuffer))
    {
        int pageIndex = 1;
        int annotIndex = 0;
        PdfDictionary pageDict = pdfReader.GetPageN(pageIndex);
        var annots = pageDict.GetAsArray(PdfName.ANNOTS);
        if (annots != null)
        {
            PdfDictionary annot = annots.GetAsDict(annotIndex);
            annot.Put(PdfName.T, new PdfString(title));
            annot.Put(PdfName.CONTENTS, new PdfString(body));
        }
        // ********************************
        // this line shows the new annotation is in here. Just have to save it somehow !!
        var updatedBody = pdfReader.GetPageN(pageIndex).GetAsArray(PdfName.ANNOTS).GetAsDict(0).GetAsString(PdfName.CONTENTS);
        Debug.Assert(body == updatedBody.ToString(), "Annotation body should be equal");
        using (MemoryStream outStream = new MemoryStream())
        {
            using (PdfStamper stamp = new PdfStamper(pdfReader, outStream, '\0', true))
            {
                newBuffer = outStream.ToArray();
            }
        }
    }
    File.WriteAllBytes(@"Assets\Documents\AnnotedPdf.pdf", newBuffer);
}
Any idea what's wrong with my code?

PdfStamper does much of the writing at the time it is being closed. This implicitly happens at the end of its using block. But you retrieve the MemoryStream contents already in that block. Thus, the PDF is not yet written to the retrieved byte[].
Instead either explicitly close the PdfStamper instance before retrieving the byte[]:
using (PdfStamper stamp = new PdfStamper(pdfReader, outStream, '\0', true))
{
    stamp.Close();
    newBuffer = outStream.ToArray();
}
or retrieve the byte[] after that using block:
using (PdfStamper stamp = new PdfStamper(pdfReader, outStream, '\0', true))
{
}
newBuffer = outStream.ToArray();
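The same flush-on-close behaviour can be reproduced without iTextSharp. The sketch below uses StreamWriter purely as a stand-in for PdfStamper (an assumption made for illustration: both hold output back and finish writing when they are disposed). The class and method names are hypothetical.

```csharp
using System;
using System.IO;

public static class FlushOnCloseDemo
{
    // Returns how many bytes are visible in the MemoryStream
    // before and after the buffering writer is disposed.
    public static (int Before, int After) Measure(string text)
    {
        int before;
        var outStream = new MemoryStream();
        using (var writer = new StreamWriter(outStream))
        {
            writer.Write(text);
            // The writer still holds the text in its own buffer here,
            // so the stream does not contain the full output yet.
            before = outStream.ToArray().Length;
        }
        // Disposing the writer flushed it and closed outStream;
        // ToArray() is still allowed on a closed MemoryStream.
        int after = outStream.ToArray().Length;
        return (before, after);
    }

    public static void Main()
    {
        var (before, after) = Measure("annotation body");
        Console.WriteLine($"inside using: {before} bytes, after using: {after} bytes");
    }
}
```

Reading the bytes inside the using block corresponds to the question's code; reading them after the block corresponds to the second fix above.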

All right, I finally got it to work. The trick was the last two parameters in the PdfStamper instantiation. I tried it before with only 2 parameters and ended up with a corrupted file. Then I tried again and now it works. Here's the snippet:
private void UpdatePDFAnnotation(string title, string body)
{
    using (PdfReader pdfReader = new PdfReader(dataBuffer))
    {
        PdfDictionary pageDict = pdfReader.GetPageN(pageIndex);
        var annots = pageDict.GetAsArray(PdfName.ANNOTS);
        PdfDictionary annot = annots.GetAsDict(annotIndex);
        annot.Put(PdfName.T, new PdfString(title));
        annot.Put(PdfName.CONTENTS, new PdfString(body));
        using (MemoryStream ms = new MemoryStream())
        {
            PdfStamper stamp = new PdfStamper(pdfReader, ms);
            stamp.Dispose();
            dataBuffer = ms.ToArray();
        }
    }
}

Related

Cannot access a closed Stream. When using PDFReader

I have this file, which is a Stream:
var streamFile = await graphClient.Me.Drive.Items["id"].Content.Request().GetAsync();
Now I am trying to use PdfReader and PdfStamper to set Fields like so:
MemoryStream outFile = new MemoryStream();
PdfReader pdfReader = new PdfReader(streamFile);
PdfStamper pdfStamper = new PdfStamper(pdfReader, outFile);
AcroFields fields = pdfStamper.AcroFields;
fields.SetField("Full_Names", "JIMMMMMMAYYYYY");
pdfStamper.Close();
pdfReader.Close();
But when I try to do this, I get this error:
Cannot access a closed Stream.
On this line:
pdfReader.Close();
What am I doing wrong?
UPDATE
I tried this, still getting the same error:
using (MemoryStream outFile = new MemoryStream())
{
    var streamFile = await graphClient.Me.Drive.Items["item-id"].Content.Request().GetAsync();
    using (PdfReader pdfReader = new PdfReader(streamFile))
    {
        using (PdfStamper pdfStamper = new PdfStamper(pdfReader, outFile))
        {
            AcroFields fields = pdfStamper.AcroFields;
            fields.SetField("Full_Names", "JIMMMMMMAYYYYY");
        }
    }
    outFile.Position = 0;
    await graphClient.Me.Drive.Items["item-id"].ItemWithPath("NewDocument-2.pdf").Content.Request().PutAsync<DriveItem>(outFile);
}
UPDATE
I have tried converting the Stream to bytes like so:
var streamFile = await graphClient.Me.Drive.Items["item-id"].Content.Request().GetAsync();
byte[] buffer = new byte[16 * 1024];
using (MemoryStream ms = new MemoryStream())
{
    int read;
    while ((read = streamFile.Read(buffer, 0, buffer.Length)) > 0)
    {
        ms.Write(buffer, 0, read);
    }
    using (PdfReader pdfReader = new PdfReader(ms.ToArray()))
    {
        using (PdfStamper pdfStamper = new PdfStamper(pdfReader, ms))
        {
            AcroFields fields = pdfStamper.AcroFields;
            fields.SetField("Full_Names", "JIMMMMMMAYYYYY");
        }
    }
    await graphClient.Me.Drive.Items["item-id"].ItemWithPath("NewDocument-2.pdf").Content.Request().PutAsync<DriveItem>(ms);
}
Same result...Cannot access a closed Stream on this line:
await graphClient.Me.Drive.Items["item-id"].ItemWithPath("NewDocument-2.pdf").Content.Request().PutAsync<DriveItem>(ms);
The PutAsync is expecting a Stream as well
So when I do this:
var streamFile = await graphClient.Me.Drive.Items["item-id"].Content.Request().GetAsync();
await graphClient.Me.Drive.Items["item-id"].ItemWithPath("NewDocument-2.pdf").Content.Request().PutAsync<DriveItem>(streamFile);
It uploads the file no problem. So I do believe the problem is trying to edit the PDF with iTextSharp.
In my case I wanted to create the document in memory and add WaterMark after creation of document, but without saving a physical file as an intermediate step (which works as well but not nearly as neat).
public byte[] AsArray(List<DocumentData> list)
{
    MemoryStream streamIn = new MemoryStream(); // Set the initial stream for the document
    MemoryStream streamOut = new MemoryStream(); // Set the result output stream
    PdfWriter writer = new PdfWriter(streamIn); // Create the writer for the document
    CreateDocument(writer, list); // Method where the document actually gets made
    // Now the tricky bit
    // Translate the `streamIn` (that now contains the document stream) into a PdfReader
    // use the byte[] from streamIn to create a new MemoryStream()
    PdfReader reader = new PdfReader(new MemoryStream(streamIn.ToArray()));
    writer = new PdfWriter(streamOut); // Set the writer stream to be the streamOut
    SetWaterMark(reader, writer); // Method to read through the document and add watermarks
    return streamOut.ToArray();
}
And hey presto, success!
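You can try the following:
var streamFile = await graphClient.Me.Drive.Items["item-id"].Content.Request().GetAsync();
// Read the downloaded stream completely into a byte array.
byte[] input;
byte[] buffer = new byte[16 * 1024];
using (MemoryStream ms = new MemoryStream())
{
    int read;
    while ((read = streamFile.Read(buffer, 0, buffer.Length)) > 0)
    {
        ms.Write(buffer, 0, read);
    }
    input = ms.ToArray();
}
// Stamp into a separate output stream; the PDF is only complete once the stamper is disposed.
byte[] output;
using (MemoryStream outFile = new MemoryStream())
{
    using (PdfReader pdfReader = new PdfReader(input))
    using (PdfStamper pdfStamper = new PdfStamper(pdfReader, outFile))
    {
        AcroFields fields = pdfStamper.AcroFields;
        fields.SetField("Full_Names", "JIMMMMMMAYYYYY");
    }
    // Disposing the stamper closed outFile, but ToArray() still works on a closed MemoryStream.
    output = outFile.ToArray();
}
// Upload from a fresh, open stream.
using (MemoryStream upload = new MemoryStream(output))
{
    await graphClient.Me.Drive.Items["item-id"].ItemWithPath("NewDocument-2.pdf").Content.Request().PutAsync<DriveItem>(upload);
}
Dispose the pdfStamper and pdfReader before the upload so that the PDF is written completely, and hand PutAsync a fresh MemoryStream, because disposing the stamper also closes the stream it wrote to.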
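The "Cannot access a closed Stream" error itself can be reproduced with the base class library alone. In this sketch StreamWriter again stands in for PdfStamper (an assumption for illustration: by default both close the stream they write to when they are disposed), and all class and method names are hypothetical.

```csharp
using System;
using System.IO;

public static class ClosedStreamDemo
{
    // Writes text through a StreamWriter (a stand-in for PdfStamper) and
    // returns the MemoryStream, which the writer closes on disposal.
    public static MemoryStream WriteAndClose(string text)
    {
        var outFile = new MemoryStream();
        using (var writer = new StreamWriter(outFile))
        {
            writer.Write(text);
        }
        return outFile;
    }

    // True if rewinding the stream fails because it is already closed.
    public static bool RewindFails(MemoryStream stream)
    {
        try
        {
            stream.Position = 0;
            return false;
        }
        catch (ObjectDisposedException)
        {
            return true;
        }
    }

    // Copies the finished bytes into a new, open stream that can be uploaded.
    public static MemoryStream Reopen(MemoryStream closed)
    {
        return new MemoryStream(closed.ToArray());
    }

    public static void Main()
    {
        MemoryStream closed = WriteAndClose("stamped pdf bytes");
        Console.WriteLine("rewind fails: " + RewindFails(closed));
        using (MemoryStream upload = Reopen(closed))
        {
            Console.WriteLine("readable copy: " + upload.CanRead + ", length " + upload.Length);
        }
    }
}
```

The closed stream can no longer be rewound or passed to an API that reads from it, but its bytes remain available through ToArray(), which is why wrapping them in a new MemoryStream is enough.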

iTextSharp - Merging a Batch of PDF Byte Arrays

I am using iTextSharp to fill a few fields on a pdf. I need to be able to combine a series of these pdfs into one single batch pdf file. Below I am looping through a SQL result set, filling the fields of the pdf with values corresponding to the current record, storing that as a byte array, and consolidating all of those into a list of byte arrays. I am then attempting to merge each byte array in that list into a single byte array, and serve that as a pdf to the user.
It seems to work, generating a single file containing presumably as many individual pages as were in my result set, but with all the fields blank on each page. It works as expected when using FillForm() to serve a single pdf. What am I doing wrong?
byte[] pdfByteArray = new byte[0];
List<byte[]> pdfByteArrayList = new List<byte[]>();
byte[] pdfByteArrayItem = new byte[0];
foreach (DataRow row in results.Rows)
{
    certNum = row[1].ToString();
    certName = row[2].ToString();
    certDate = row[3].ToString();
    pdfByteArrayItem = FillForm(certType, certName, certNum, certDate);
    pdfByteArrayList.Add(pdfByteArrayItem);
}
using (var ms = new MemoryStream()) {
    using (var doc = new Document()) {
        using (var copy = new PdfSmartCopy(doc, ms)) {
            doc.Open();
            //Loop through each byte array
            foreach (var p in pdfByteArrayList) {
                //Create a PdfReader bound to that byte array
                using (var reader = new PdfReader(p)) {
                    //Add the entire document instead of page-by-page
                    copy.AddDocument(reader);
                }
            }
            doc.Close();
        }
    }
    pdfByteArray = ms.ToArray();
}
context.Response.ContentType = "application/pdf";
context.Response.BinaryWrite(pdfByteArray);
context.Response.Flush();
context.Response.End();
private byte[] FillForm(string certType, string certName, string certNum, string certDate)
{
    string pdfTemplate = string.Format(@"\\filePath\{0}.pdf", certType);
    PdfReader pdfReader = new PdfReader(pdfTemplate);
    MemoryStream stream = new MemoryStream();
    PdfStamper pdfStamper = new PdfStamper(pdfReader, stream);
    AcroFields pdfFormFields = pdfStamper.AcroFields;
    // set form pdfFormFields
    pdfFormFields.SetField("CertName", certName);
    pdfFormFields.SetField("CertNum", certNum);
    pdfFormFields.SetField("CertDate", certDate);
    // flatten the form to remove editing options, set it to false
    // to leave the form open to subsequent manual edits
    pdfStamper.FormFlattening = false;
    // close the pdf
    pdfStamper.Close();
    stream.Flush();
    stream.Close();
    byte[] pdfByte = stream.ToArray();
    return pdfByte;
}
Adding the line below after setting the value for the fields seems to have fixed it:
pdfFormFields.GenerateAppearances = true;

how to return binary stream from iText pdf converter

Is it possible to return a binary stream (byte[]) from PdfStamper?
Basically, the objective is to edit a PDF document and replace particular text.
The input is already a binary stream (byte[]).
I am working in C# and using iText as the PDF editing library.
Here's my piece of code:
PdfReader reader = new PdfReader(Mydoc.FileStream);
PdfDictionary dict = reader.GetPageN(1);
PdfObject pdfObject = dict.GetDirectObject(PdfName.CONTENTS);
if (pdfObject.IsStream())
{
    PRStream stream = (PRStream)pdfObject;
    byte[] data = PdfReader.GetStreamBytes(stream);
    stream.SetData(System.Text.Encoding.ASCII.GetBytes(System.Text.Encoding.ASCII.GetString(data).Replace("[TextReplacement]", "Hello world")));
}
FileStream outStream = new FileStream(dest, FileMode.Create);
PdfStamper stamper = new PdfStamper(reader, outStream);
reader.Close();
return newPDFinStream // this result should be in stream byte[]
I understand that FileStream needs an output file path like C:\location\new.pdf.
Is it possible to skip the temporary file and return the binary directly?
Sure, just save it to a MemoryStream instead:
using (MemoryStream ms = new MemoryStream())
{
    // The stamper writes the PDF when it is closed, so dispose it
    // before reading the bytes back.
    using (var stamper = new PdfStamper(reader, ms))
    {
    }
    return ms.ToArray();
}

itext sharp merge pdfs with acrofields - fields go missing when merging

I have tried this now and it's not working: form.GenerateAppearances = true; I merge my 2 documents and then save the result. Then I open it again to populate all the fields. It says all the AcroFields keys are gone, but when I open it in Nitro Pro they are there. Why can't I see them in code? Do I have to add something before I save?
private static void CombineAndSavePdf1(string savePath, List<string> lstPdfFiles)
{
    using (Stream outputPdfStream = new FileStream(savePath, FileMode.Create, FileAccess.Write, FileShare.None))
    {
        Document document = new Document();
        PdfSmartCopy copy = new PdfSmartCopy(document, outputPdfStream);
        document.Open();
        PdfReader reader;
        int totalPageCnt;
        PdfStamper stamper;
        string[] fieldNames;
        foreach (string file in lstPdfFiles)
        {
            reader = new PdfReader(file);
            totalPageCnt = reader.NumberOfPages;
            for (int pageCnt = 0; pageCnt < totalPageCnt; )
            {
                //have to create new reader for each page or PdfStamper will throw error
                reader = new PdfReader(file);
                stamper = new PdfStamper(reader, outputPdfStream);
                fieldNames = new string[stamper.AcroFields.Fields.Keys.Count];
                stamper.AcroFields.Fields.Keys.CopyTo(fieldNames, 0);
                foreach (string name in fieldNames)
                {
                    stamper.AcroFields.RenameField(name, name);
                }
                copy.AddPage(copy.GetImportedPage(reader, ++pageCnt));
            }
            copy.FreeReader(reader);
        }
    }
}
You are merging the documents the wrong way. See MergeForms to find out how to do it correctly. The key line that is missing in your code, is:
copy.setMergeFields();
Without it, the fields disappear (as you have noticed).
There's also a MergeForms2 example that explains how to merge two identical forms. In this case, you need to rename the fields, because each field needs to have a unique name. I'm adding a reference to this second example, because I see that you also try renaming the fields. There is, however, a serious flaw in your code: you create a stamper object, but you never do stamper.close(). Your use of the reader object is also problematic. All in all, it would be best to throw away your code, and to start anew using the two examples from the official iText web site.
Update: I've added the tags itext and itextsharp to your question. Only then did I notice that you're using iTextSharp instead of iText. Porting the Java code to C# should be easy for a C# developer, but I've never written a C# program, so please use the Java examples as if you were using pseudo-code. The code in C# won't be that different.
public void manipulatePdf(String src, String dest) throws IOException, DocumentException {
    Document document = new Document();
    PdfCopy copy = new PdfSmartCopy(document, new FileOutputStream(dest));
    copy.setMergeFields();
    document.open();
    List<PdfReader> readers = new ArrayList<PdfReader>();
    for (int i = 0; i < 3; ) {
        PdfReader reader = new PdfReader(renameFields(src, ++i));
        readers.add(reader);
        copy.addDocument(reader);
    }
    document.close();
    for (PdfReader reader : readers) {
        reader.close();
    }
}
public byte[] renameFields(String src, int i) throws IOException, DocumentException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    PdfReader reader = new PdfReader(src);
    PdfStamper stamper = new PdfStamper(reader, baos);
    AcroFields form = stamper.getAcroFields();
    Set<String> keys = new HashSet<String>(form.getFields().keySet());
    for (String key : keys) {
        form.renameField(key, String.format("%s_%d", key, i));
    }
    stamper.close();
    reader.close();
    return baos.toByteArray();
}
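For readers on iTextSharp, a C# port of the Java example might look roughly like the sketch below. It is an untested translation, not code from the official examples, and it assumes an iTextSharp 5.x version in which PdfCopy exposes SetMergeFields(); the class name MergeFormsSketch is hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class MergeFormsSketch
{
    // Merges three renamed copies of the form at src into dest.
    public static void ManipulatePdf(string src, string dest)
    {
        var readers = new List<PdfReader>();
        using (var output = new FileStream(dest, FileMode.Create))
        {
            var document = new Document();
            var copy = new PdfSmartCopy(document, output);
            copy.SetMergeFields(); // without this the fields are dropped
            document.Open();
            for (int i = 1; i <= 3; i++)
            {
                var reader = new PdfReader(RenameFields(src, i));
                readers.Add(reader);
                copy.AddDocument(reader);
            }
            document.Close();
        }
        foreach (var reader in readers)
        {
            reader.Close();
        }
    }

    // Gives every field a unique name by appending _i, and returns the PDF bytes.
    public static byte[] RenameFields(string src, int i)
    {
        using (var ms = new MemoryStream())
        {
            var reader = new PdfReader(src);
            var stamper = new PdfStamper(reader, ms);
            AcroFields form = stamper.AcroFields;
            // Copy the keys first, because renaming modifies the field collection.
            var keys = new List<string>(form.Fields.Keys);
            foreach (string key in keys)
            {
                form.RenameField(key, string.Format("{0}_{1}", key, i));
            }
            stamper.Close(); // the PDF is only complete after this call
            reader.Close();
            return ms.ToArray();
        }
    }
}
```

As in the Java version, the readers are kept open until the merged document has been closed.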

iTextSharp problem concatenating PDF documents

I am trying to build up a single PDF from a bunch of other PDFs that I am filling out some form values in. Essentially I am doing a PDF mail merge. My code is below:
byte[] completedDocument = null;
using (MemoryStream streamCompleted = new MemoryStream())
{
    using (Document document = new Document())
    {
        document.Open();
        PdfCopy copy = new PdfCopy(document, streamCompleted);
        copy.Open();
        foreach (var item in eventItems)
        {
            byte[] mergedDocument = null;
            PdfReader reader = new PdfReader(pdfTemplates[item.DataTokens[NotifyTokenType.OrganisationID]]);
            using (MemoryStream streamTemplate = new MemoryStream())
            {
                using (PdfStamper stamper = new PdfStamper(reader, streamTemplate))
                {
                    foreach (var token in item.DataTokens)
                    {
                        if (stamper.AcroFields.Fields.Any(fld => fld.Key == token.Key.ToString()))
                        {
                            stamper.AcroFields.SetField(token.Key.ToString(), token.Value);
                        }
                    }
                    stamper.FormFlattening = true;
                    stamper.Writer.CloseStream = false;
                }
                mergedDocument = new byte[streamTemplate.Length];
                streamTemplate.Position = 0;
                streamTemplate.Read(mergedDocument, 0, (int)streamTemplate.Length);
            }
            reader = new PdfReader(mergedDocument);
            for (int i = 1; i <= reader.NumberOfPages; i++)
            {
                document.SetPageSize(PageSize.A4);
                copy.AddPage(copy.GetImportedPage(reader, i));
            }
        }
    }
    completedDocument = new byte[streamCompleted.Length];
    streamCompleted.Position = 0;
    streamCompleted.Read(completedDocument, 0, (int)streamCompleted.Length);
}
The problem I am having is that it throws a null reference exception when it exits the using (Document document = new Document()) block.
From debugging the iTextSharp source, the problem is the method below in PdfAnnotationsImp:
public bool HasUnusedAnnotations() {
    return annotations.Count > 0;
}
annotations is null so this throws the null ref exception. Is there something I should be doing to instantiate this?
I changed:
document.Open();
PdfCopy copy = new PdfCopy(document, streamCompleted);
to
PdfCopy copy = new PdfCopy(document, streamCompleted);
document.Open();
And it fixed the problem. This library needs better exception handling. When you do something slightly wrong it falls over horribly and gives you no clue about what you did wrong. I have no idea how I could possibly have worked this out if I didn't have the source code.
What version of iTextSharp are you using? The Document class doesn't implement IDisposable so you can't wrap it in a using block.
