I am trying to add text to an existing PDF file using iTextSharp. I have been reading many posts, including the popular thread here.
I have some differences:
My PDFs are X pages long
I want to keep everything in memory, and never have a file stored on my filesystem
So I tried to modify the code, so it takes in a byte array and returns a byte array. I have come this far:
The code compiles and runs
My out byte array has a different length than my in byte array
My problem:
I cannot see my added text when I later store the modified byte array and open it in my PDF reader
I don't get why. As far as I can tell, I do the same as every StackOverflow post I have seen: using the DirectContent, I call BeginText and write a text. However, I cannot see it, no matter how I move the position around.
Any idea what is missing from my code?
public static byte[] WriteIdOnPdf(byte[] inPDF, string str)
{
byte[] finalBytes;
// open the reader
using (PdfReader reader = new PdfReader(inPDF))
{
Rectangle size = reader.GetPageSizeWithRotation(1);
using (Document document = new Document(size))
{
// open the writer
using (MemoryStream ms = new MemoryStream())
{
using (PdfWriter writer = PdfWriter.GetInstance(document, ms))
{
document.Open();
for (var i = 1; i <= reader.NumberOfPages; i++)
{
document.NewPage();
var baseFont = BaseFont.CreateFont(BaseFont.HELVETICA_BOLD, BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
var importedPage = writer.GetImportedPage(reader, i);
var contentByte = writer.DirectContent;
contentByte.BeginText();
contentByte.SetFontAndSize(baseFont, 18);
var multiLineString = "Hello,\r\nWorld!";
contentByte.ShowTextAligned(PdfContentByte.ALIGN_LEFT, multiLineString,100, 200, 0);
contentByte.EndText();
contentByte.AddTemplate(importedPage, 0, 0);
}
document.Close();
ms.Close();
writer.Close();
reader.Close();
}
finalBytes = ms.ToArray();
}
}
}
return finalBytes;
}
The code below shows a full working example of creating a PDF in memory and then performing a second pass, also in memory. It does what @mkl says and closes all iText parts before trying to grab the raw bytes from the stream. It also uses GetOverContent() to draw "on top" of the previous PDF. See the code comments for more details.
//Bytes will hold our final PDFs
byte[] bytes;

//Create an in-memory PDF
using (var ms = new MemoryStream()) {
    using (var doc = new Document()) {
        using (var writer = PdfWriter.GetInstance(doc, ms)) {
            doc.Open();

            //Create a bunch of pages and add text, nothing special here
            for (var i = 1; i <= 10; i++) {
                doc.NewPage();
                doc.Add(new Paragraph(String.Format("First Pass - Page {0}", i)));
            }

            doc.Close();
        }
    }

    //Right before disposing of the MemoryStream grab all of the bytes
    bytes = ms.ToArray();
}

//Another in-memory PDF
using (var ms = new MemoryStream()) {
    //Bind a reader to the bytes that we created above
    using (var reader = new PdfReader(bytes)) {
        //Store our page count
        var pageCount = reader.NumberOfPages;

        //Bind a stamper to our reader
        using (var stamper = new PdfStamper(reader, ms)) {
            //Setup a font to use
            var baseFont = BaseFont.CreateFont(BaseFont.HELVETICA_BOLD, BaseFont.CP1252, BaseFont.NOT_EMBEDDED);

            //Loop through each page
            for (var i = 1; i <= pageCount; i++) {
                //Get the raw PDF stream "on top" of the existing content
                var cb = stamper.GetOverContent(i);

                //Draw some text
                cb.BeginText();
                cb.SetFontAndSize(baseFont, 18);
                cb.ShowText(String.Format("Second Pass - Page {0}", i));
                cb.EndText();
            }
        }
    }

    //Once again, grab the bytes before closing things out
    bytes = ms.ToArray();
}

//Just to see the final results I'm writing these bytes to disk but you could do whatever
var testFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "test.pdf");
System.IO.File.WriteAllBytes(testFile, bytes);
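As a rough sketch of how the stamper approach above could be folded back into the byte-array-in, byte-array-out shape the question asks for, something like the following might work. The class name PdfTextStamper, the method name WriteTextOnPdf, the x/y parameters and the 22-point line spacing are all assumptions made for illustration; they are not from the original posts. The sketch also splits the string on line breaks itself, on the assumption that ShowTextAligned draws a single line and does not wrap on "\r\n".

```csharp
using System;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class PdfTextStamper
{
    // Hypothetical helper: draws each line of 'text' on top of every page, entirely in memory.
    public static byte[] WriteTextOnPdf(byte[] inPdf, string text, float x, float y)
    {
        using (var ms = new MemoryStream())
        {
            using (var reader = new PdfReader(inPdf))
            using (var stamper = new PdfStamper(reader, ms))
            {
                var baseFont = BaseFont.CreateFont(BaseFont.HELVETICA_BOLD, BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
                var font = new Font(baseFont, 18);
                const float leading = 22f; // assumed line spacing in points

                // Split manually; each line is drawn separately, one below the other
                var lines = text.Split(new[] { "\r\n", "\n" }, StringSplitOptions.None);

                for (var page = 1; page <= reader.NumberOfPages; page++)
                {
                    // "Over" content is drawn on top of the existing page content
                    var over = stamper.GetOverContent(page);
                    for (var i = 0; i < lines.Length; i++)
                    {
                        ColumnText.ShowTextAligned(over, Element.ALIGN_LEFT,
                            new Phrase(lines[i], font), x, y - i * leading, 0);
                    }
                }
            }

            // The stamper has been closed at this point, so the PDF is complete
            return ms.ToArray();
        }
    }
}
```

A call such as PdfTextStamper.WriteTextOnPdf(inPDF, "Hello,\r\nWorld!", 100, 200) would then return the stamped PDF as a new byte array, without touching the filesystem.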
I'm trying to create a MemoryStream that contains a PDF; for testing, I'm writing it to disk. When I write the stream, I get a 1 KB PDF file. Any ideas?
UPDATE:
It looks like I wasn't calling doc.Close(). However, when I do, it disposes of my finalReportStream. Is there a way around this?
public static Stream CombinePages(Stream firstPageStream, string nextPagesString)
{
var firstPage = new PdfReader(firstPageStream);
var nextPages = new PdfReader(nextPagesString);
Stream finalReportStream = new MemoryStream();
var doc = new Document();
var w = PdfWriter.GetInstance(doc, finalReportStream);
doc.Open();
doc.SetPageSize(firstPage.GetPageSize(1));
doc.NewPage();
//Add Page 1
w.DirectContent.AddTemplate(w.GetImportedPage(firstPage, 1), 0, 0);
//Add the rest of the pages
//copy readnextpages to doc starting page2 this cuts the first page
for (var page = 2; page <= nextPages.NumberOfPages; page++)
{
doc.SetPageSize(nextPages.GetPageSize(page));
doc.NewPage();
w.DirectContent.AddTemplate(w.GetImportedPage(nextPages, page), 0, 0);
}
return finalReportStream;
}
And then write it to disk:
var fileStream = File.Create(destfilename);
finalReportStream.CopyTo(fileStream);
fileStream.Close();
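One possible way around the disposed stream, sketched here under assumptions rather than taken from an accepted answer, is to set the writer's CloseStream property to false before closing the document and then rewind the stream before returning it. The class name PdfCombiner and the parameter name nextPagesPath are made up for this sketch; the rest mirrors the method above.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class PdfCombiner
{
    // Sketch: same idea as CombinePages above, but the MemoryStream survives doc.Close()
    public static Stream CombinePages(Stream firstPageStream, string nextPagesPath)
    {
        var firstPage = new PdfReader(firstPageStream);
        var nextPages = new PdfReader(nextPagesPath);
        var finalReportStream = new MemoryStream();

        // Pass the size of the first page up front so page 1 gets the right dimensions
        var doc = new Document(firstPage.GetPageSize(1));
        var w = PdfWriter.GetInstance(doc, finalReportStream);

        // Keep the underlying stream open when the document is closed
        w.CloseStream = false;
        doc.Open();

        // Add page 1
        w.DirectContent.AddTemplate(w.GetImportedPage(firstPage, 1), 0, 0);

        // Add the remaining pages, starting at page 2
        for (var page = 2; page <= nextPages.NumberOfPages; page++)
        {
            doc.SetPageSize(nextPages.GetPageSize(page));
            doc.NewPage();
            w.DirectContent.AddTemplate(w.GetImportedPage(nextPages, page), 0, 0);
        }

        // Closing the document finishes the PDF; without this the output is incomplete
        doc.Close();
        firstPage.Close();
        nextPages.Close();

        // Rewind so that a later CopyTo() starts from the first byte
        finalReportStream.Position = 0;
        return finalReportStream;
    }
}
```

The rewind matters because Stream.CopyTo copies from the current position, so a stream left at its end would copy nothing.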
Here is what I want: I want to add page numbers to every page of a PDF that I generate on the fly.
I used the OnEndPage method, but it did not work, even when I added a bottom margin to the document.
I decided to add the page numbers after the PDF is generated, reading it back from the file path.
Here is my code for generating the PDF:
Document doc = new Document(iTextSharp.text.PageSize.LETTER, 10, 10, 42, 35);
PdfWriter wri = PdfWriter.GetInstance(doc, new FileStream("t5.pdf", FileMode.Create));
doc.Open();//Open Document to write
iTextSharp.text.Font font8 = FontFactory.GetFont("ARIAL", 7);
Paragraph paragraph = new Paragraph("Some content");
doc.Add(paragraph);
doc.Add(paragraph);// add paragraph to the document
doc.Close();
FileStream stream = File.OpenRead("t5.pdf");
byte[] fileBytes = new byte[stream.Length];
stream.Read(fileBytes, 0, fileBytes.Length);
stream.Close();
AddPageNumbers(fileBytes);
using (Stream file = File.OpenWrite("t5.pdf"))
{
file.Write(fileBytes, 0, fileBytes.Length);
}
}
And here is my AddPageNumbers method:
MemoryStream ms = new MemoryStream();
PdfReader reader = new PdfReader(pdf);
int n = reader.NumberOfPages;
iTextSharp.text.Rectangle psize = reader.GetPageSize(1);
Document document = new Document(psize, 50, 50, 50, 50);
PdfWriter writer = PdfWriter.GetInstance(document, ms);
document.Open();
PdfContentByte cb = writer.DirectContent;
int p = 0;
for (int page = 1; page <= reader.NumberOfPages; page++)
{
document.NewPage();
p++;
PdfImportedPage importedPage = writer.GetImportedPage(reader, page);
cb.AddTemplate(importedPage, 0, 0);
BaseFont bf = BaseFont.CreateFont(BaseFont.HELVETICA, BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
cb.BeginText();
cb.SetFontAndSize(bf, 10);
cb.ShowTextAligned(PdfContentByte.ALIGN_CENTER, +p + "/" + n, 100, 450, 0);
cb.EndText();
}
document.Close();
return ms.ToArray();
However, it does not add the page numbers to the PDF document. What are the alternatives here? What can I do?
When posting a question here, please only post the smallest amount of code possible. Your "create a sample PDF with multiple pages" is 116 lines long. Inside of it you've got complicated PdfPTable and DataTable logic that is 100% unrelated to the problem. Instead, the following 13 lines are enough to make a multiple-page PDF:
//Create a sample multiple page PDF and place it on the desktop
var outputFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "t5.pdf");
using (var fs = new FileStream(outputFile, FileMode.Create, FileAccess.Write, FileShare.None)) {
    using (var doc = new Document()) {
        using (var writer = PdfWriter.GetInstance(doc, fs)) {
            doc.Open();
            for (var i = 0; i < 1000; i++) {
                doc.Add(new Paragraph(String.Format("This is paragraph #{0}", i)));
            }
            doc.Close();
        }
    }
}
Second, get rid of try/catch. Those are great for production (sometimes) but at the development level that's why we have IDEs and compilers, they'll tell us specifically what's wrong.
Now on to the bigger problem: you need to keep these two processes separate from each other. Every single brace and object from part #1 must be closed, done and accounted for. Part #2 then needs to be fed a completely valid PDF, but neither of the two parts should be "aware" of each other or depend on each other.
Since you just borrowed some code that wasn't intended for what you're trying to do, I'm going to also ignore that and use some code that I know specifically will work. Also, since you're open to using a MemoryStream in the first place, I'm just going to avoid writing to disk until I need to. Below is a full working sample that creates a multiple-page PDF and then adds page numbers in a second pass.
//Will hold our PDF as a byte array
Byte[] bytes;

//Create a sample multiple page PDF, nothing special here
using (var ms = new MemoryStream()) {
    using (var doc = new Document()) {
        using (var writer = PdfWriter.GetInstance(doc, ms)) {
            doc.Open();
            for (var i = 0; i < 1000; i++) {
                doc.Add(new Paragraph(String.Format("This is paragraph #{0}", i)));
            }
            doc.Close();
        }
    }

    //Store our bytes before the MemoryStream goes out of scope
    bytes = ms.ToArray();
}

//Read our sample PDF and apply page numbers
using (var reader = new PdfReader(bytes)) {
    using (var ms = new MemoryStream()) {
        using (var stamper = new PdfStamper(reader, ms)) {
            int PageCount = reader.NumberOfPages;
            for (int i = 1; i <= PageCount; i++) {
                //Draw "Page X of Y" on top of each page, centered on x = 100, y = 10
                ColumnText.ShowTextAligned(stamper.GetOverContent(i), Element.ALIGN_CENTER,
                    new Phrase(String.Format("Page {0} of {1}", i, PageCount)), 100, 10, 0);
            }
        }

        //The stamper is closed at this point, so the PDF is complete
        bytes = ms.ToArray();
    }
}

var outputFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "t5.pdf");
System.IO.File.WriteAllBytes(outputFile, bytes);
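The second pass above could also be wrapped in a method with the same shape as the AddPageNumbers method from the question. The following is only a sketch: the class name PageNumberer, the "i/n" label format and the 20-point offset from the bottom edge are assumptions, and the horizontal centering is derived from the page rectangle instead of a fixed x value.

```csharp
using System;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class PageNumberer
{
    // Sketch: returns a NEW byte array; the input array is never modified
    public static byte[] AddPageNumbers(byte[] pdf)
    {
        using (var ms = new MemoryStream())
        {
            using (var reader = new PdfReader(pdf))
            using (var stamper = new PdfStamper(reader, ms))
            {
                var n = reader.NumberOfPages;
                for (var i = 1; i <= n; i++)
                {
                    // Use each page's own size so mixed page sizes still work
                    var size = reader.GetPageSizeWithRotation(i);
                    var x = (size.Left + size.Right) / 2;
                    var y = size.Bottom + 20; // assumed distance from the bottom edge

                    ColumnText.ShowTextAligned(stamper.GetOverContent(i), Element.ALIGN_CENTER,
                        new Phrase(String.Format("{0}/{1}", i, n)), x, y, 0);
                }
            }
            return ms.ToArray();
        }
    }
}
```

Note that the caller has to keep the returned array, for example fileBytes = PageNumberer.AddPageNumbers(fileBytes);. The calling code in the question discards the return value and writes the original, unnumbered bytes back to disk.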
I have a method, which takes in the following:
Byte array, which is a PDF file
A "from" size
A "to" size
The idea is it transforms a PDF file with a specific size, to another size. I want to return a byte array, and want to keep the whole thing in memory.
I create the PdfWriter, passing it a MemoryStream (outPDF), and then do my conversion. Afterwards, I want to say outBytes = outPDF.ToArray();.
I tried putting this code in three places, see place A, B and C in the code. In place A, the length of the memorystream is only 255, which doesn't work. My guess is the doc.Close() has to run first. In place B and C, the stream is closed, and cannot be accessed.
My question is therefore:
How to get a byte array from PdfWriter, writing to a memorystream in iTextSharp
My code:
public static byte[] ConvertPdfSize(byte[] inPDF, LetterSize fromSize, LetterSize toSize)
{
if (fromSize != LetterSize.A4 || toSize != LetterSize.Letter)
{
throw new ArgumentException("Function only supports from size A4 to size letter");
}
MemoryStream outPDF = new MemoryStream();
byte[] outBytes;
using (PdfReader pdfr = new PdfReader(inPDF))
{
using (Document doc = new Document(PageSize.LETTER))
{
Document.Compress = true;
PdfWriter writer = PdfWriter.GetInstance(doc, outPDF);
doc.Open();
PdfContentByte cb = writer.DirectContent;
PdfImportedPage page;
for (int i = 1; i < pdfr.NumberOfPages + 1; i++)
{
page = writer.GetImportedPage(pdfr, i);
cb.AddTemplate(page, PageSize.LETTER.Width / pdfr.GetPageSize(i).Width, 0, 0, PageSize.LETTER.Height / pdfr.GetPageSize(i).Height, 0, 0);
doc.NewPage();
}
// place A
doc.Close();
// place B
}
pdfr.Close();
// place C
}
return new byte[0];
}
Just return your bytes after all of the iTextSharp stuff is done but before discarding the MemoryStream:
using (MemoryStream outPDF = new MemoryStream())
{
    using (PdfReader pdfr = new PdfReader(inPDF))
    {
        using (Document doc = new Document(PageSize.LETTER))
        {
            //...
        }
    }
    return outPDF.ToArray();
}
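As a side note on why this works even though closing the document also closes the stream: MemoryStream.ToArray() is documented to work on a closed MemoryStream, whereas members such as Length throw once the stream is disposed. The small sketch below uses only System.IO to illustrate that behaviour; the class and method names are made up for the example.

```csharp
using System;
using System.IO;

public static class MemoryStreamDemo
{
    // Writes three bytes, disposes the stream, and still reads the contents back
    public static byte[] WriteAndClose()
    {
        var ms = new MemoryStream();
        ms.Write(new byte[] { 1, 2, 3 }, 0, 3);
        ms.Dispose();

        // ToArray() copies the written bytes even after the stream is closed
        return ms.ToArray();
    }

    // Reports whether reading Length on a disposed stream throws
    public static bool LengthThrowsAfterClose()
    {
        var ms = new MemoryStream();
        ms.Dispose();
        try
        {
            var unused = ms.Length;
            return false;
        }
        catch (ObjectDisposedException)
        {
            return true;
        }
    }
}
```

So the error seen at places B and C most likely comes from touching some other member of the closed stream, not from ToArray() itself.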
I want to convert an image to PDF and add a watermark to it. I used iTextSharp to convert it. I successfully converted the image file to pdf but I'm not able to add watermark to it without creating another pdf file.
The code below creates a PDF file and also adds custom attributes.
The watermarkpdf function is used to add the watermark, and pdfname is passed as the argument.
foreach (string filenm in Images)
using (var imageStream = new FileStream(filenm, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
{
doc.NewPage();
iTextSharp.text.Image jpeg = iTextSharp.text.Image.GetInstance(filenm);
float width = doc.PageSize.Width;
float height = doc.PageSize.Height;
jpeg.ScaleToFit(width,height);
doc.Add(jpeg);
}
doc.AddHeader("name", "vijay");
watermarkpdf(pdfname);
The watermarkpdf function is given below.
PdfReader pdfReader = new PdfReader(txtpath.Text+"\\pdf\\" + pdfname);
FileStream stream = new FileStream(txtpath.Text + pdfname,FileMode.Open);
PdfStamper pdfStamper = new PdfStamper(pdfReader, stream);
for (int pageIndex = 1; pageIndex <= pdfReader.NumberOfPages; pageIndex++)
{
Rectangle pageRectangle = pdfReader.GetPageSizeWithRotation(pageIndex);
PdfContentByte pdfData = pdfStamper.GetUnderContent(pageIndex);
pdfData.SetFontAndSize(BaseFont.CreateFont(BaseFont.HELVETICA_BOLD, BaseFont.CP1252, BaseFont.NOT_EMBEDDED), 40);
PdfGState graphicsState = new PdfGState();
graphicsState.FillOpacity = 0.4F;
pdfData.SetGState(graphicsState);
pdfData.SetColorFill(BaseColor.BLUE);
pdfData.BeginText();
pdfData.ShowTextAligned(Element.ALIGN_CENTER, "SRO-Kottarakkara", pageRectangle.Width / 2, pageRectangle.Height / 2, 45);
pdfData.EndText();
}
pdfStamper.Close();
stream.Close();
iTextSharp doesn't support "in-place editing" of files, only reading existing files and creating new ones. The problem is that it would have to read from the same file that it is writing to, which could be very problematic.
However, instead of using a file you can create your image PDF in a MemoryStream, grab the bytes from that and pass them to the PdfReader, all with minimal changes to your code. All of the PDF writing functions that take files actually work with the abstract Stream class, which MemoryStream inherits from, so the two can be used interchangeably. Below is some basic code that should show you what I'm talking about. I don't have an IDE currently so there might be a typo or two, but for the most part it should work.
//Image part
//We will dump the bytes from the memory stream to the variable below later
byte[] bytes;
using (MemoryStream ms = new MemoryStream()) {
    Document doc = new Document(PageSize.LETTER);
    PdfWriter writer = PdfWriter.GetInstance(doc, ms);
    doc.Open();
    //foreach (string filenm in Images)
    //...
    doc.Close();
    //Dump the bytes, make sure to use ToArray() and not GetBuffer()
    bytes = ms.ToArray();
}

//Watermark part
//Read from our bytes
PdfReader pdfReader = new PdfReader(bytes);
//FileMode.Create because the output file does not exist yet (FileMode.Open would throw)
FileStream stream = new FileStream(txtpath.Text + pdfname, FileMode.Create);
//...
I have a stream (a PDF file with annotations) and another stream (the same PDF file without annotations). I use streams because I need to execute these operations in memory.
I need to copy the annotations from the first document to the other. The annotations can be of different kinds: comments, highlighting and others. So it would be better to copy the annotations without parsing them.
Can you recommend a helpful PDF library for .NET, and a sample for this problem?
You can use this example for iTextSharp to approach your problem (this example copies a list of pdf files with annotations into a new pdf file):
var output = new MemoryStream();
using (var document = new Document(PageSize.A4, 70f, 70f, 20f, 20f))
{
var readers = new List<PdfReader>();
var writer = PdfWriter.GetInstance(document, output);
writer.CloseStream = false;
document.Open();
const Int32 requiredWidth = 500;
const Int32 zeroBottom = 647;
const Int32 left = 50;
Action<String, Action> inlcudePdfInDocument = (filename, e) =>
{
var reader = new PdfReader(filename);
readers.Add(reader);
var pageCount = reader.NumberOfPages;
for (var i = 0; i < pageCount; i++)
{
e?.Invoke();
var imp = writer.GetImportedPage(reader, (i + 1));
var scale = requiredWidth / imp.Width;
var height = imp.Height * scale;
writer.DirectContent.AddTemplate(imp, scale, 0, 0, scale, left, zeroBottom - height);
var annots = reader.GetPageN(i + 1).GetAsArray(PdfName.ANNOTS);
if (annots != null && annots.Size != 0)
{
foreach (var a in annots)
{
var newannot = new PdfAnnotation(writer, new Rectangle(0, 0));
var annotObj = (PdfDictionary) PdfReader.GetPdfObject(a);
newannot.PutAll(annotObj);
var rect = newannot.GetAsArray(PdfName.RECT);
var yOffset = zeroBottom - height; // same vertical offset that AddTemplate applies to the page
rect[0] = new PdfNumber(((PdfNumber)rect[0]).DoubleValue * scale + left); // lower-left x
rect[1] = new PdfNumber(((PdfNumber)rect[1]).DoubleValue * scale + yOffset); // lower-left y
rect[2] = new PdfNumber(((PdfNumber)rect[2]).DoubleValue * scale + left); // upper-right x
rect[3] = new PdfNumber(((PdfNumber)rect[3]).DoubleValue * scale + yOffset); // upper-right y
writer.AddAnnotation(newannot);
}
}
document.NewPage();
}
};
foreach (var apprPdf in pdfs)
{
document.NewPage();
inlcudePdfInDocument(apprPdf.Pdf, null);
}
document.Close();
readers.ForEach(x => x.Close());
}
output.Position = 0;
return output;
PdfReader has a constructor that takes an array of bytes so you can adapt it for MemoryStream.
I'm using iTextSharp, which is a C# port of iText (a Java implementation for PDF editing).
http://sourceforge.net/projects/itextsharp/
http://itextpdf.com/
Edit - this is what you need to do (untested but should be close):
using System;
using System.Collections.Generic;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

// Returns the processed PDF as a new MemoryStream
public Stream copyAnnotations(Stream sourcePdfStream, Stream destinationPdfStream)
{
    // Create new document (iText)
    Document outdoc = new Document(PageSize.A4);

    // Seek to stream start and create a reader for the input PDF
    sourcePdfStream.Seek(0, SeekOrigin.Begin);
    PdfReader inputPdfReader = new PdfReader(sourcePdfStream);

    // Seek to stream start and create a reader for the destination PDF
    destinationPdfStream.Seek(0, SeekOrigin.Begin);
    PdfReader destinationPdfReader = new PdfReader(destinationPdfStream);

    // Create a PdfWriter for a new PDF destination stream
    // You should write into a new stream here!
    Stream processedPdf = new MemoryStream();
    PdfWriter pdfw = PdfWriter.GetInstance(outdoc, processedPdf);

    // Do not close the stream when the document is closed
    pdfw.CloseStream = false;

    // Open document
    outdoc.Open();

    // Both documents must have the same number of pages
    int numPagesIn = inputPdfReader.NumberOfPages;
    int numPagesOut = destinationPdfReader.NumberOfPages;
    if (numPagesIn != numPagesOut)
    {
        throw new Exception("Impossible - different number of pages");
    }

    // Process PDF pages (page numbers are 1-based)
    for (int i = 1; i <= numPagesIn; i++)
    {
        // Import pages from the corresponding reader
        PdfImportedPage pageIn = pdfw.GetImportedPage(inputPdfReader, i);
        PdfImportedPage pageOut = pdfw.GetImportedPage(destinationPdfReader, i);

        // Get the annotations to add (placeholder method that you have to implement)
        List<PdfAnnotation> toBeAdded = ParseInAndOutAndGetAnnotations(pageIn, pageOut);

        // Draw the destination page, then add your annotations to it
        pdfw.DirectContent.AddTemplate(pageOut, 0, 0);
        foreach (PdfAnnotation anno in toBeAdded) pdfw.AddAnnotation(anno);

        // Start the next output page
        outdoc.NewPage();
    }

    // PDF creation finished
    outdoc.Close();
    inputPdfReader.Close();
    destinationPdfReader.Close();

    // Your new destination stream is processedPdf, rewound for reading
    processedPdf.Position = 0;
    return processedPdf;
}
The implementation of ParseInAndOutAndGetAnnotations(pageIn, pageOut) needs to reflect your annotations.
Here is a good example with annotations: http://www.java2s.com/Open-Source/Java-Document/PDF/pdf-itext/com/lowagie/text/pdf/internal/PdfAnnotationsImp.java.htm