Why is my PDF not readable after being edited with iText? - c#

My PDF is not readable after I tried to edit its text.
How can I make this work?
My error message:
Adobe Reader could not open '495049.pdf' because it is either not a supported file type or because the file has been damaged (for example, it was sent as email attachment and wasn't correctly decoded)
Basically, the objective is to edit a PDF document and replace a particular piece of text.
The input is already a binary stream (byte[]).
I am working in C# and using iText as the PDF editing library.
Here's my code:
using (PdfReader reader = new PdfReader(doc.FileStream))
{
    PdfDictionary dict = reader.GetPageN(1);
    PdfObject pdfObject = dict.GetDirectObject(PdfName.CONTENTS);
    if (pdfObject.IsStream())
    {
        PRStream stream = (PRStream)pdfObject;
        byte[] data = PdfReader.GetStreamBytes(stream);
        stream.SetData(System.Text.Encoding.ASCII.GetBytes(System.Text.Encoding.ASCII.GetString(data).Replace("[ReplacmentText]", "Hello World")));
    }
    using (MemoryStream ms = new MemoryStream())
    {
        var ignored = new PdfStamper(reader, ms);
        reader.Close();
        return ms.ToArray();
    }
}

Your main mistake is that you retrieve the contents of the memory stream before closing the stamper; in fact, you never close it at all!
The final part of the PDF is written only when the stamper is closed. Thus:
using (MemoryStream ms = new MemoryStream())
{
    var ignored = new PdfStamper(reader, ms);
    ignored.Close();
    reader.Close();
    return ms.ToArray();
}
Your other problem (probably not relevant for your current test documents, but relevant in general):
stream.SetData(System.Text.Encoding.ASCII.GetBytes(System.Text.Encoding.ASCII.GetString(data).Replace("[ReplacmentText]", "Hello World")));
This assumes a great deal: that the stream content contains only ASCII bytes, that the placeholder "[ReplacementText]" (I assume this is the correct spelling) occurs in one piece and directly in the page's own content stream, that the font used to draw the placeholder and its replacement uses an ASCII-like encoding, and that this font has glyphs for all characters in "Hello World". None of these assumptions is automatically true.
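The first of these assumptions can be checked in isolation. The following is a minimal sketch that uses only System.Text (no iText); the sample bytes and the class name are made up for illustration. It shows that an ASCII decode/encode round trip replaces every byte above 0x7F with '?' (0x3F), whereas ISO-8859-1 preserves all byte values. Switching the encoding addresses only the byte-preservation assumption, not the ones about fonts or about the placeholder being stored in one piece.

```csharp
using System;
using System.Text;

public static class AsciiRoundTripDemo
{
    // Decodes and re-encodes the bytes with the given encoding,
    // as the question's code does with Encoding.ASCII.
    public static byte[] RoundTrip(Encoding encoding, byte[] data)
    {
        return encoding.GetBytes(encoding.GetString(data));
    }

    public static void Main()
    {
        // Hypothetical content-stream bytes: two ASCII bytes, then two bytes above 0x7F.
        byte[] data = { 0x42, 0x54, 0xE9, 0x80 };

        byte[] viaAscii = RoundTrip(Encoding.ASCII, data);
        byte[] viaLatin1 = RoundTrip(Encoding.GetEncoding("ISO-8859-1"), data);

        Console.WriteLine(BitConverter.ToString(viaAscii));  // 42-54-3F-3F
        Console.WriteLine(BitConverter.ToString(viaLatin1)); // 42-54-E9-80
    }
}
```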

Related

Creating a new PDF from an existing template - odd behavior in Acrobat XI

I'm generating a new PDF from an existing template that was created in LibreOffice. It contains one text box.
The code compiles and successfully saves the PDF to a new file. If I open the newly created document in Acrobat Reader XI, it renders correctly, but even if I don't modify the document, Reader asks on closing: "Do you want to save changes to "filename.pdf" before closing?"
I've read other posts on StackOverflow and on the official iTextSharp site and found a solution, but maybe I'm implementing it incorrectly.
public string spdftemplate = @"C:\test\input.pdf";
public string newFile = @"C:\test\output.pdf";
private void FillFormsProperly()
{
    PdfReader reader = new PdfReader(spdftemplate);
    byte[] bytes;
    using (MemoryStream ms = new MemoryStream())
    {
        PdfStamper stamper = new PdfStamper(reader, ms);
        #region ForTesting
        //PdfContentByte cb = stamper.GetOverContent(1);
        //ColumnText ct = new ColumnText(cb);
        //ct.SetSimpleColumn(100, 100, 500, 200);
        //ct.AddElement(new Paragraph("This was added using ColumnText"));
        //ct.Go();
        #endregion ForTesting
        AcroFields pdfFormFields = stamper.AcroFields;
        foreach (DictionaryEntry de in reader.AcroFields.Fields)
        {
            pdfFormFields.SetField(de.Key.ToString(), "test"); //"Text Box 1"
        }
        //string sTmp = "W-4 Completed for " + pdfFormFields.GetField("Text Box 1");
        //MessageBox.Show(sTmp, "Finished");
        //Flush the PdfStamper's buffer
        stamper.FormFlattening = true;
        stamper.Close();
        //Get the raw bytes of the PDF
        bytes = ms.ToArray();
    }
    //Do whatever you want with the bytes
    //Below I'm writing them to disk
    using (FileStream fs = new FileStream(newFile, FileMode.Create, FileAccess.Write, FileShare.None))
    {
        fs.Write(bytes, 0, bytes.Length);
    }
}
The best answer I found was this: "creating a pdf from a template in itextsharp and outputting as content disposition".
The above code is my (more or less copy-paste) implementation.
The file is evidently corrupted (though still readable). How can I fix this?
Your input.pdf contains a form field and the flag /NeedAppearances true. Your output.pdf does not contain a field anymore (obviously... you flattened the form after all) but it still contains that flag /NeedAppearances true.
This flag tells the PDF viewer (Acrobat Reader) to generate appearance streams for some form fields. Thus, the Reader inspects all fields to create the appearances where necessary. Afterwards, it removes the flag. As a result, the document counts as changed; even if there are no fields, the removal of the flag is itself a change.
This reminds me of an iText issue which was fixed in February of last year:
In some cases, Adobe Reader X asks if you want to "save changes" after closing a flattened PDF form. This was due to the presence of some unnecessary entries in the /AcroForm dictionary (for instance added when the form was created with OOo).
(iText revision 5089, February 29th, 2012, blowagie)
This change has been ported to iTextSharp in iTextSharp revision 323, March 3rd, 2012, psoares33.
Thus, you might want to update the iTextSharp version you use.

Is it possible to write a packaging.package to a stream without having to save it to a file first?

I have a System.IO.Packaging.Package in memory (it is a WordprocessingDocument) and want to stream it down to the browser so the user can save it. The Word document has been modified by the MVC-based application, and the resulting file is specific to the current request.
I understand the package represents a 'zip' file containing a number of parts. These parts include the headers, footers and main body document. I've modified each individually and now want to stream the package back to the user.
I can get the individual part streams... package.GetPart(new Uri("/word/document.xml", UriKind.Relative)).GetStream()
However, I'm missing how to get an output stream for the entire document (package) without writing to the file system.
Thanks in advance
No, what I think I need is something like this... I've already read in the template document and made modifications in memory. Now I want to stream the modified document (leaving the template untouched) back to the user.
MemoryStream stream = new MemoryStream();
WordprocessingDocument docOut =
    WordprocessingDocument.Create(stream, WordprocessingDocumentType.Document);
foreach (var part in package.GetParts())
{
    using (StreamReader streamReader = new StreamReader(part.GetStream()))
    {
        PackagePart newPart = docOut.Package.CreatePart(
            part.Uri, part.ContentType);
        using (StreamWriter streamWriter = new StreamWriter(newPart.GetStream(FileMode.Create)))
        {
            streamWriter.Write(streamReader.ReadToEnd());
        }
    }
}
Unfortunately, this produces a 'corrupt' Word document...
The OpenXmlPackage.Close method saves all changes in all parts to the underlying store. If you opened the package from a stream, just use that stream:
public Stream packageStream() {
    var ms = new MemoryStream();
    var wrdPk = WordprocessingDocument.Create(ms, WordprocessingDocumentType.Document);
    // Build the package ...
    var docPart = wrdPk.AddMainDocumentPart();
    docPart.Document = new Document(
        new Body(new Paragraph(new Run(new Text("Hello world.")))));
    // Flush all changes
    wrdPk.Close();
    // Rewind so that the caller reads from the start of the stream
    ms.Position = 0;
    return ms;
}
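The same "close first, then read the stream" pattern applies when working with System.IO.Packaging directly. Below is a minimal sketch, assuming a package built from scratch with a single XML part; the class name, the part content and the "application/xml" content type are placeholders, and a real Word document would also need its relationships and remaining parts:

```csharp
using System;
using System.IO;
using System.IO.Packaging;
using System.Text;

public static class PackageToStreamDemo
{
    // Builds a tiny package entirely in memory and returns its bytes.
    public static byte[] BuildPackageBytes()
    {
        var ms = new MemoryStream();
        Package package = Package.Open(ms, FileMode.Create, FileAccess.ReadWrite);

        Uri partUri = PackUriHelper.CreatePartUri(new Uri("/word/document.xml", UriKind.Relative));
        PackagePart part = package.CreatePart(partUri, "application/xml");
        using (Stream partStream = part.GetStream(FileMode.Create, FileAccess.Write))
        {
            byte[] xml = Encoding.UTF8.GetBytes("<placeholder/>"); // placeholder content
            partStream.Write(xml, 0, xml.Length);
        }

        // Closing the package flushes all parts to the MemoryStream.
        package.Close();

        // ToArray works whether or not the MemoryStream is still open.
        return ms.ToArray();
    }

    public static void Main()
    {
        byte[] bytes = BuildPackageBytes();

        // Reopen the bytes to check that the result is a readable package.
        using (Package reopened = Package.Open(new MemoryStream(bytes), FileMode.Open, FileAccess.Read))
        {
            Uri partUri = new Uri("/word/document.xml", UriKind.Relative);
            Console.WriteLine(reopened.PartExists(partUri));
        }
    }
}
```

The returned byte array can then be written to the response without ever touching the file system.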

Send a generated PDF file to the user without saving it on the server

I have code (in a .ashx-file) that generates a PDF file from a PDF template. The generated pdf gets personalized with a name and a code. I use iTextSharp to do so.
This is the code:
using (var existingFileStream = new FileStream(fileNameExisting, FileMode.Open))
using (var newFileStream = new FileStream(fileNameNew, FileMode.Create))
{
    var pdfReader = new PdfReader(existingFileStream);
    var stamper = new PdfStamper(pdfReader, newFileStream);
    var form = stamper.AcroFields;
    var fieldKeys = form.Fields.Keys;
    form.SetField("Name", name);
    form.SetField("Code", code);
    stamper.FormFlattening = true;
    stamper.Close();
    pdfReader.Close();
}
context.Response.AppendHeader("content-disposition", "inline; filename=zenith_coupon.pdf");
context.Response.TransmitFile(fileNameNew);
context.Response.ContentType = "application/pdf";
This works, but it saves the file on the server. I don't want to do that because a lot of people are going to download the PDF file and the server will be full in no time.
So my question is: how can I generate a PDF with iTextSharp without saving it, and send it to the user?
Instead of using a FileStream, you could use a MemoryStream and then use Response.BinaryWrite() to output the stream contents.
You can use any Stream (for example a MemoryStream) for the intermediate PDF (in your code currently named newFileStream) if you don't want to save it as a file. For sample code, see http://www.developerfusion.com/code/6623/dynamically-generating-pdfs-in-net/ and http://forums.asp.net/t/1093198.aspx/1.
Just remember to rewind the MemoryStream (i.e. set Position = 0) before transmitting it to the client, for example by copying it with CopyTo(Response.OutputStream)...
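Why the rewind matters can be seen without iTextSharp or ASP.NET. The following is a minimal sketch using only System.IO; the class name, the helper and the four-byte payload are invented for illustration, and a second MemoryStream stands in for Response.OutputStream:

```csharp
using System;
using System.IO;

public static class RewindDemo
{
    // Copies 'source' into a fresh stream and returns how many bytes arrived.
    public static long CopyAndCount(MemoryStream source, bool rewind)
    {
        if (rewind)
        {
            source.Position = 0; // CopyTo starts at the current position
        }
        using (var target = new MemoryStream()) // stand-in for Response.OutputStream
        {
            source.CopyTo(target);
            return target.Length;
        }
    }

    public static void Main()
    {
        var pdf = new MemoryStream();
        byte[] payload = { 1, 2, 3, 4 }; // stand-in for generated PDF bytes
        pdf.Write(payload, 0, payload.Length);

        Console.WriteLine(CopyAndCount(pdf, rewind: false)); // 0
        Console.WriteLine(CopyAndCount(pdf, rewind: true));  // 4
    }
}
```

If the bytes are fetched with ToArray() instead, no rewind is needed, because ToArray() ignores the current position.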

ItextSharp Expected a Dict Object When trying to print

I have a web page that allows a user to view a PDF and print a PDF. The print PDF is a copy of the display PDF, and I am using iTextSharp to inject the JavaScript that enables auto-printing. I have a method that allows a user to upload a PDF; it calls the method below to copy the display copy into a print PDF. Both PDFs are then saved in the database. However, when a user clicks the print button on my web page, they receive the following error: "expected a dict object". Below is my code that adds the auto-print script, which works fine for me but not on my client's site.
Am I doing anything wrong that could be corrupting the file? The original PDF content is passed in as a binary object.
Any help on this is much appreciated, as I am highly confused by this one. Also, I am using ASP.NET MVC2.
MemoryStream originalPdf = new MemoryStream(Content.BinaryData);
MemoryStream updatedPdf = new MemoryStream();
updatedPdf.Write(Content.BinaryData,0, Content.BinaryData.Length);
PdfReader pdfReader = new PdfReader(originalPdf);
PdfStamper pdfStamper = new PdfStamper(pdfReader, updatedPdf);
if (autoPrinting)
{
    pdfStamper.JavaScript = "this.print(true);\r";
}
else
{
    pdfStamper.JavaScript = null;
}
pdfStamper.Close();
pdfReader.Close();
Content.BinaryData = updatedPdf.ToArray();
Don't write the original PDF to your output. pdfStamper.Close() will do all the writing for you, even in append mode (which you're not using).
Your code should read:
MemoryStream originalPdf = new MemoryStream(Content.BinaryData);
MemoryStream updatedPdf = new MemoryStream();
// Don't do that.
//updatedPdf.Write(Content.BinaryData,0, Content.BinaryData.Length);
PdfReader pdfReader = new PdfReader(originalPdf);
PdfStamper pdfStamper = new PdfStamper(pdfReader, updatedPdf);
if (autoPrinting) {
    pdfStamper.JavaScript = "this.print(true);\r";
} else {
    pdfStamper.JavaScript = null;
}
pdfStamper.Close(); // this does it for you.
pdfReader.Close();
Content.BinaryData = updatedPdf.ToArray();
I'm surprised that this "works for you". If nothing else, I'd expect the JS to fail because the byte offsets would be all wrong... in fact, all your offsets would be all wrong. I think my ignorance of C# is showing.
But Write() behaves the way I thought it would, so I'm back to being surprised.
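The offset problem described above can be illustrated without iTextSharp. In this minimal sketch, two invented byte arrays stand in for the original PDF and for what PdfStamper.Close() writes; the class and method names are hypothetical. It shows that after the premature Write() the new PDF no longer starts at byte 0, so any offsets computed relative to its own start point to the wrong place in the combined output:

```csharp
using System;
using System.IO;
using System.Text;

public static class PrefixedOutputDemo
{
    // Returns the position at which 'pdf' begins when the output stream
    // already contains 'prefix' before the PDF bytes are written.
    public static long OffsetOfWrittenPdf(byte[] prefix, byte[] pdf)
    {
        using (var output = new MemoryStream())
        {
            output.Write(prefix, 0, prefix.Length); // the mistake: original copied in first
            long start = output.Position;
            output.Write(pdf, 0, pdf.Length);       // stand-in for the stamper's output
            return start;
        }
    }

    public static void Main()
    {
        byte[] original = Encoding.ASCII.GetBytes("%PDF-1.4 original");
        byte[] stamped = Encoding.ASCII.GetBytes("%PDF-1.4 stamped");

        // With the premature write, the stamped PDF is shifted by original.Length bytes.
        Console.WriteLine(OffsetOfWrittenPdf(original, stamped) == original.Length);
        // Without it, the stamped PDF starts at offset 0 as its internal offsets expect.
        Console.WriteLine(OffsetOfWrittenPdf(new byte[0], stamped) == 0);
    }
}
```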

iTextSharp + FileStream = Corrupt PDF file

I am trying to create a PDF file with iTextSharp. My attempt writes the content of the PDF to a MemoryStream so that I can write the result both to a file and to a database BLOB. The file gets created, has a size of about 21 kB, and looks like a PDF when opened with Notepad++. But my PDF viewer says it's corrupted.
Here is a little code snippet (only tries to write to a file, not to a database):
Document myDocument = new Document();
MemoryStream myMemoryStream = new MemoryStream();
PdfWriter myPDFWriter = PdfWriter.GetInstance(myDocument, myMemoryStream);
myDocument.Open();
// Content of the pdf gets inserted here
using (FileStream fs = File.Create("D:\\...\\aTestFile.pdf"))
{
    myMemoryStream.WriteTo(fs);
}
myMemoryStream.Close();
Where is my mistake?
Thank you,
Norbert
I think your problem is that you aren't properly adding content to your PDF. This is done through the Document.Add() method, and you finish up by calling Document.Close().
When you call Document.Close(), however, your MemoryStream is closed as well, so you won't be able to write it to your FileStream the way you do now. You can get around this by storing the content of your MemoryStream in a byte array.
The following code snippet works for me:
using (MemoryStream myMemoryStream = new MemoryStream()) {
    Document myDocument = new Document();
    PdfWriter myPDFWriter = PdfWriter.GetInstance(myDocument, myMemoryStream);
    myDocument.Open();
    // Add content to your PDF here...
    myDocument.Add(new Paragraph("I hope this works for you."));
    // We're done adding stuff to our PDF.
    myDocument.Close();
    byte[] content = myMemoryStream.ToArray();
    // Write out PDF from memory stream.
    using (FileStream fs = File.Create("aTestFile.pdf")) {
        fs.Write(content, 0, (int)content.Length);
    }
}
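The reason the byte array works where WriteTo() does not can be reproduced with System.IO alone. In this minimal sketch, an explicit Close() call stands in for what Document.Close() does to the stream; the class and method names are made up for illustration. ToArray() is documented to work on a closed MemoryStream, whereas WriteTo() throws:

```csharp
using System;
using System.IO;

public static class ClosedMemoryStreamDemo
{
    // Returns a MemoryStream that holds three bytes and has already been closed.
    public static MemoryStream CreateClosedStream()
    {
        var ms = new MemoryStream();
        ms.Write(new byte[] { 1, 2, 3 }, 0, 3);
        ms.Close(); // stand-in for Document.Close() closing the stream
        return ms;
    }

    // Returns true if WriteTo fails on the closed stream.
    public static bool WriteToThrows(MemoryStream closed)
    {
        try
        {
            closed.WriteTo(Stream.Null);
            return false;
        }
        catch (ObjectDisposedException)
        {
            return true;
        }
    }

    public static void Main()
    {
        MemoryStream ms = CreateClosedStream();
        Console.WriteLine(ms.ToArray().Length); // 3
        Console.WriteLine(WriteToThrows(ms));   // True
    }
}
```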
I had a similar issue. My file was downloaded, but its size was only 13 bytes. I resolved the issue by using a BinaryWriter to write my file:
byte[] bytes = new byte[0];
// Put your API response into the bytes array initialized above
using (StreamWriter streamWriter = new StreamWriter(FilePath, true))
{
    BinaryWriter binaryWriter = new BinaryWriter(streamWriter.BaseStream);
    binaryWriter.Write(bytes);
}
Just some thoughts - what happens if you replace the memory stream with a file stream? Does this give you the result you need? This will at least tell you where the problem could be.
If this does work, how do the files differ (in size and binary representation)?
Just a guess, but have you tried seeking to the beginning of the memory stream before writing?
myMemoryStream.Seek(0, SeekOrigin.Begin);
Try double checking your code that manipulates the PDF with iText. Make sure you're calling the appropriate EndText method of any PdfContentByte objects, and make sure you call myDocument.Close() before writing the file to disk. Those are things I've had problems with in the past when generating PDFs with iTextSharp.
documentobject.Close();
using (FileStream fs = System.IO.File.Create(path)){
    Memorystreamobject.WriteTo(fs);
}
