iTextSharp + FileStream = Corrupt PDF file

iTextSharp + FileStream = Corrupt PDF file - c#

I am trying to create a pdf file with iTextSharp. My attempt writes the content of the pdf to a MemoryStream so I can write the result both into file and a database BLOB. The file gets created, has a size of about 21kB and it looks like a pdf when opend with Notepad++. But my PDF viewer says it's currupted.
Here is a little code snippet (only tries to write to a file, not to a database):
Document myDocument = new Document();
MemoryStream myMemoryStream = new MemoryStream();
PdfWriter myPDFWriter = PdfWriter.GetInstance(myDocument, myMemoryStream);
myDocument.Open();
// Content of the pdf gets inserted here
using (FileStream fs = File.Create("D:\\...\\aTestFile.pdf"))
{
myMemoryStream.WriteTo(fs);
}
myMemoryStream.Close();
Where is the mistake I make?
Thank you,
Norbert

I think your problem was that you weren't properly adding content to your PDF. This is done through the Document.Add() method and you finish up by calling Document.Close().
When you call Document.Close() however, your MemoryStream also closes so you won't be able to write it to your FileStream as you have. You can get around this by storing the content of your MemoryStream to a byte array.
The following code snippet works for me:
using (MemoryStream myMemoryStream = new MemoryStream()) {
Document myDocument = new Document();
PdfWriter myPDFWriter = PdfWriter.GetInstance(myDocument, myMemoryStream);
myDocument.Open();
// Add to content to your PDF here...
myDocument.Add(new Paragraph("I hope this works for you."));
// We're done adding stuff to our PDF.
myDocument.Close();
byte[] content = myMemoryStream.ToArray();
// Write out PDF from memory stream.
using (FileStream fs = File.Create("aTestFile.pdf")) {
fs.Write(content, 0, (int)content.Length);
}
}

I had similar issue. My file gets downloaded but the file size will be 13Bytes. I resolved the issue when I used binary writer to write my file
byte[] bytes = new byte[0];
//pass in your API response into the bytes initialized
using (StreamWriter streamWriter = new StreamWriter(FilePath, true))
{
BinaryWriter binaryWriter = new BinaryWriter(streamWriter.BaseStream);
binaryWriter.Write(bytes);
}

Just some thoughts - what happens if you replace the memory stream with a file stream? Does this give you the result you need? This will at least tell you where the problem could be.
If this does work, how do the files differ (in size and binary representation)?
Just a guess, but have you tried seeking to the beginning of the memory stream before writing?
myMemoryStream.Seek(0, SeekOrigin.Begin);

Try double checking your code that manipulates the PDF with iText. Make sure you're calling the appropriate EndText method of any PdfContentByte objects, and make sure you call myDocument.Close() before writing the file to disk. Those are things I've had problems with in the past when generating PDFs with iTextSharp.

documentobject.Close();
using (FileStream fs = System.IO.File.Create(path)){
Memorystreamobject.WriteTo(fs);
}

Related

Why my PDF is not readable after edited by iText?

My PDF is not readable after tried to edit the text.
How to make it works ?
my error message :
Adobe Reader could not open '495049.pdf' because it is either not a supported file type or because the file has been damaged (for example, it was sent as email attachment and wasn't correctly decoded)
Basically the objective is to edit PDF doc and replace particular text.
Input already in binary stream (byte[ ])
I worked on C# environment & iText for the PDF editing lib.
Here's my piece of code :
using (PdfReader reader = new PdfReader(doc.FileStream))
{
PdfDictionary dict = reader.GetPageN(1);
PdfObject pdfObject = dict.GetDirectObject(PdfName.CONTENTS);
if (pdfObject.IsStream())
{
PRStream stream = (PRStream)pdfObject;
byte[] data = PdfReader.GetStreamBytes(stream);
stream.SetData(System.Text.Encoding.ASCII.GetBytes(System.Text.Encoding.ASCII.GetString(data).Replace("[ReplacmentText]", "Hello World")));
}
using (MemoryStream ms = new MemoryStream())
{
var ignored = new PdfStamper(reader, ms);
reader.Close();
return ms.ToArray();
}
}

Your main mistake is that you retrieve the contents of the memory stream before closing the stamper; actually you don't close it at all!
Only when closing the stamper, the final part of the PDF is written. Thus:
using (MemoryStream ms = new MemoryStream())
{
var ignored = new PdfStamper(reader, ms);
ignored.Close();
reader.Close();
return ms.ToArray();
}
Your other problem (probably not relevant for your current test documents but in general):
stream.SetData(System.Text.Encoding.ASCII.GetBytes(System.Text.Encoding.ASCII.GetString(data).Replace("[ReplacmentText]", "Hello World")));
This assumes very much, especially that the stream content only contains ASCII bytes, that the place holder "[ReplacementText]" (I assume this is the correct spelling) occurs in one piece and in the immediate content streams, that the font used to draw the place holder and its replacement uses an ASCII'ish encoding, and that this font has glyphs for all characters in "Hello World". Neither of these assumptions are automatically true.

Is it possible to write a packaging.package to a stream without having to save it to a file first?

I have a System.IO.Packaging.Package in memory (it is a WordprocessingDocument) and want to stream it down to browser to save it. The word document has been modified by the MVC-based application and the resulting file has been modified for the current request.
I understand the package represents a 'zip' file containing a number of parts. These parts include headers, footers and main body document. I've modified each individually and now want to stream the package back to the user.
I can get the individual part streams... package.GetPart(new Uri("/word/document.xml", UriKind.Relative)).GetStream()
However I'm missing how to get an output stream on the entire document (package)- without writing to the file system.
Thanks in advance
No- what I think I need is something like this... I've already read in the template document and made modifications in memory. Now I want to stream a modified document (leaving the template un-touched) back to the user.
MemoryStream stream = new MemoryStream();
WordprocessingDocument docOut =
WordprocessingDocument.Create( stream, WordprocessingDocumentType.Document);
foreach (var part in package.GetParts())
{
using (StreamReader streamReader = new StreamReader(part.GetStream()))
{
PackagePart newPart = docOut.Package.CreatePart(
part.Uri, part.ContentType );
using (StreamWriter streamWriter = new StreamWriter(newPart.GetStream(FileMode.Create)))
{
streamWriter.Write(streamReader.ReadToEnd());
}
}
}
Unfortunately- this produces a 'corrupt' word document...

OpenXmlPackage.Close Method saves all changes in all parts to the underlying store. If you opened the package from a stream, just use that stream:
public Stream packageStream() {
var ms = new MemoryStream();
var wrdPk = WordprocessingDocument.Create(ms, WordprocessingDocumentType.Document);
// Build the package ...
var docPart = wrdPk.AddMainDocumentPart();
docPart.Document = new Document(
new Body(new Paragraph(new Run(new Text("Hello world.")))));
// Flush all changes
wrdPk.Close();
return ms;
}

put generated pdf file without saving it on the server

I have code (in a .ashx-file) that generates a PDF file from a PDF template. The generated pdf gets personalized with a name and a code. I use iTextSharp to do so.
This is the code:
using (var existingFileStream = new FileStream(fileNameExisting, FileMode.Open))
using (var newFileStream = new FileStream(fileNameNew, FileMode.Create))
{
var pdfReader = new PdfReader(existingFileStream);
var stamper = new PdfStamper(pdfReader, newFileStream);
var form = stamper.AcroFields;
var fieldKeys = form.Fields.Keys;
form.SetField("Name", name);
form.SetField("Code", code);
stamper.FormFlattening = true;
stamper.Close();
pdfReader.Close();
}
context.Response.AppendHeader("content-disposition", "inline; filename=zenith_coupon.pdf");
context.Response.TransmitFile(fileNameNew);
context.Response.ContentType = "application/pdf";
This works, but it saves the file on the server. I don't want to do that because there're going to be a lot of people downloading the PDF file and the server will be full in no time.
So my question is, how can I generate a PDF with iTextSharp without saving it and put it to the user?

Instead of using a FileStream you could use a MemoryStream and then use Response.Write() to output the stream contents.

You can use any Stream (for example MemoryStream) for the intermediate PDF (in your code currently named newFileStream) if you don't want to save it as a file - for sample code see http://www.developerfusion.com/code/6623/dynamically-generating-pdfs-in-net/ and http://forums.asp.net/t/1093198.aspx/1.
Just remember to rewind (i.e. set Position = 0) the MemoryStream before transmitting it to the client (for example by Response.Write or CopyTo (Response.OutputStream) )...

Loading binary file from db and process it as openxml

I have such method to load document file from db that is stored as binary and then replace customxml parts with parameters.
Somehow when i convert byte into MemoryStream then process it doesn't work, my custom xml parts are not replaced. But if i use FileStream and read same file from disk then it replaced perfectly!
What is wrong with MemoryStream? i can't also cast MemoryStream to FileStream or create instrance of Stream or etc..
Any suggestion?
private static Stream LoadContent(byte[] content, XmlDocument parameters)
{
//FileStream works perfectly
//Stream fileStream = new FileStream(#"C:\temp\test.docx", FileMode.Open);
Stream documentStream = new MemoryStream();
documentStream.Write(content, 0, content.Length);
//Processes word file, replace custom xml parts with parameters
using (WordprocessingDocument document = WordprocessingDocument.Open(documentStream, true))
{
MainDocumentPart mainPart = document.MainDocumentPart;
Stream partStream = mainPart.CustomXmlParts.First().GetStream();
using (XmlWriter xmlWriter = XmlWriter.Create(partStream, new XmlWriterSettings { CloseOutput = false }))
{
parameters.WriteTo(xmlWriter);
xmlWriter.Flush();
}
mainPart.Document.Save();
}
return documentStream;
}

You might want to try to set the 'Position' propery of the memorystream to 0 after writing data to it and before reading it again.
Alernatively you can also pass the byte array to the constructor of the memorystream instead of calling writer.
edit
I see according to MSDN that the 'Document.Save' method will flush the stream to allow propper saving. ( http://msdn.microsoft.com/en-us/library/cc840441.aspx ).
However MemoryStream wont do anything on flush ( http://msdn.microsoft.com/en-us/library/system.io.memorystream.flush.aspx ).
You could try to create a new MemoryStream and then pass that as a parameter to the 'Document.Save' method.

Delete dynamically generated PDF file immediately after it has been displayed to user

I'm creating a PDF file on the fly using ITextSharp and ASP.NET 1.1. My process is as follows -
Create file on server
Redirect browser to newly created PDF
file so it is displayed to user
What I'd like to do is delete the PDF from the server as soon it is displayed in the users browser. The PDF file is large so it is not an option to hold it in memory, an initial write to the server is required. I'm currently using a solution that periodically polls for files then deletes them, but I'd prefer a solution that deletes the file immediately after it has been downloaded to the client machine. Is there a way to do this?

Instead of redirecting the browser to the created file you could serve the file yourself using you own HttpHandler. Then you could delete the file immediately after you served it or you could even create the file in memory.
Write the PDF file directly to the Client:
public class MyHandler : IHttpHandler {
public void ProcessRequest(System.Web.HttpContext context) {
context.Response.ContentType = "application/pdf";
// ...
PdfWriter.getInstance(document, context.Response.OutputStream);
// ...
or read an already generated file 'filename', serve the file, delete it:
context.Response.Buffer = false;
context.Response.BufferOutput = false;
context.Response.ContentType = "application/pdf";
Stream outstream = context.Response.OutputStream;
FileStream instream =
new FileStream(filename, FileMode.Open, FileAccess.Read, FileShare.Read);
byte[] buffer = new byte[BUFFER_SIZE];
int len;
while ((len = instream.Read(buffer, 0, BUFFER_SIZE)) > 0) {
outstream.Write(buffer, 0, len);
}
outstream.Flush();
instream.Close();
// served the file -> now delete it
File.Delete(filename);
I didn't try this code. This is just how I think it would work ...

Inspired by f3lix's answer (thanks f3lix!) I've come up with the folowing VB.net code -
HttpContext.Current.Response.ClearContent()
HttpContext.Current.Response.ClearHeaders()
HttpContext.Current.Response.ContentType = "application/pdf"
HttpContext.Current.Response.TransmitFile(PDFFileName)
HttpContext.Current.Response.Flush()
HttpContext.Current.Response.Close()
File.Delete(PDFFileName)
This appears to work - is the 'WriteFile' method I've used any less efficent that the stream methods used by f3lix? Is there a method available that's more efficient than either of our solutions?
EDIT (19/03/2009) Based on comments below I've changed 'WriteFile' method to 'TransmitFile' as it appears it sends the file down to client in chunks rather than writing the entire file to the webserver's memory before sending. Further info can be found here.

Or you could just return it to the browser without writing to disk at all:
byte[] pdf;
using (MemoryStream ms = new MemoryStream()) {
Document doc = new Document();
PdfWriter.GetInstance(doc, ms);
doc.AddTitle("Document Title");
doc.Open();
doc.Add(new Paragraph("My paragraph."));
doc.Close();
pdf = ms.GetBuffer();
}
Response.ContentType = "application/pdf";
Response.AddHeader("Content-Disposition", "attachment;filename=MyDocument.pdf");
Response.OutputStream.Write(pdf, 0, pdf.Length);

The solution:
Response.TransmitFile(PDFFileName)
Response.Flush()
Response.Close()
File.Delete(PDFFileName)
Simply doesn't work for me (file never makes it to client). Reading in a byte array and calling Response.BinaryWrite isn't an option either since the file may be large. Is the only hack for this to start an asynchronous process that waits for the file to be released and then delete it?

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

iTextSharp + FileStream = Corrupt PDF file - c#

documentobject.Close(); using (FileStream fs = System.IO.File.Create(path)){ Memorystreamobject.WriteTo(fs); }

Related

Why my PDF is not readable after edited by iText?

Is it possible to write a packaging.package to a stream without having to save it to a file first?

put generated pdf file without saving it on the server

Loading binary file from db and process it as openxml

Delete dynamically generated PDF file immediately after it has been displayed to user

Categories

Resources