convert a pdfreader to a pdfdocument - c#

Can anyone tell me how to convert a PdfReader object into a PdfDocument ?
I have read a disk file and converted to a memorystream but I need it as a PdfDocument for other methods in my C# program.
I'm converting an application to use iTextSharp instead of PdfSharp.
MemoryStream pdfstream = new MemoryStream();
/* Convert the attachment to an byte array */
byte[] pdfarray = (byte[])dr["Data"];
/* Write the attachment into the memory */
pdfstream.Write(pdfarray, 0, pdfarray.Length);
/* Set the memorystream to the beginning */
pdfstream.Seek(0, System.IO.SeekOrigin.Begin);
/* Open the pdf document */
PdfSharp.Pdf.PdfDocument document = PdfSharp.Pdf.IO.PdfReader.Open(pdfstream, PdfDocumentOpenMode.Modify);
//iTextSharp.text.Document doc1 = iTextSharp.text.pdf.PdfReader.GetStreamBytes(
//ITS.pdf.PdfReader rdr = ITS.pdf.PdfReader(
string filename = DateTime.Now.Ticks.ToString() + "_" + dr["AttachmentName"].ToString();
string path = Path.Combine(FolderName, filename);
document.Save(path);

I think you can do something like this (note code not run or tested, might need a tweak):
using (MemoryStream ms = new MemoryStream())
{
Document doc = new Document(PageSize.A4, 50, 50, 15, 15);
PdfWriter writer = PdfWriter.GetInstance(doc, ms);
using (var rdr = new PdfReader(filePath))
{
PdfImportedPage page;
for(int i = 1; i <= rdr.PageCount; i++)
{
page = writer.GetImportedPage(templateReader, i)
writer.DirectContent.AddTemplate(page, 0, 0);
doc.NewPage();
}
}
}
This will read in the PDF page by page and output it to your document.

Related

Creating a blank pdf and uploading an image to each page

I am new to file handling in asp.net core 6.0. I want to create a blank pdf and load images from the ImagePath list into it. Using the resources on the internet, I tried to create a blank pdf and throw it into it, but in vain.
I couldn't use pdfReader inside pdfStamper. It was the only resource on the Internet that I found suitable for myself.
Link to the question; Converting Multiple Images into Multiple Pages PDF using itextsharp
How can I do that my code is below.
public static string MainStamping(string docname, List < string > imagePath, string mediaField) {
var config = new ConfigurationBuilder().AddJsonFile("appsettings.json").Build();
var webRootPath = config["AppSettings:urunResimPath"].ToString();
string filename = webRootPath + "\\menupdf\\" + docname + ".pdf";
// yeniisim = yeniisim + filelist.FileName;
// var fileName = "menupdf\\" + yeniisim;
FileStream pdfOutputFile = new FileStream(filename, FileMode.Create);
PdfConcatenate pdfConcatenate = new PdfConcatenate(pdfOutputFile);
PdfReader result = null;
for (int i = 0; i < imagePath.Count; i++) {
result = CreatePDFDocument1(imagePath[i], mediaField);
pdfConcatenate.AddPages(result);
}
pdfConcatenate.Close();
return filename;
}
public static PdfReader CreatePDFDocument1(string imagePath, string mediaField) {
PdfReader pdfReader = null;
//C:\Users\hilal\OneDrive\Belgeler\GitHub\Restaurant\Cafe.Web\wwwroot\assets\barkod-menu
var config = new ConfigurationBuilder().AddJsonFile("appsettings.json").Build();
var webRootPath = config["AppSettings:urunResimPath"].ToString();
string image = webRootPath + "\\barkod-menu\\" + imagePath;
iTextSharp.text.Image instanceImg = iTextSharp.text.Image.GetInstance(image);
MemoryStream inputStream = new MemoryStream();
inputStream.Seek(0, SeekOrigin.Begin); //I don't know what to do here do I need to use it?
pdfReader = new PdfReader(inputStream);
MemoryStream memoryStream = new MemoryStream();
PdfStamper pdfStamper = new PdfStamper(pdfReader, memoryStream);
AcroFields testForm = pdfStamper.AcroFields;
testForm.SetField("MediaField", mediaField);
PdfContentByte overContent = pdfStamper.GetOverContent(1);
IList < AcroFields.FieldPosition > fieldPositions = testForm.GetFieldPositions("ImageField");
if (fieldPositions == null || fieldPositions.Count <= 0) throw new ApplicationException("Error locating field");
AcroFields.FieldPosition fieldPosition = fieldPositions[0];
overContent.AddImage(instanceImg);
pdfStamper.FormFlattening = true;
pdfStamper.Close();
PdfReader resultReader = new PdfReader(memoryStream.ToArray());
pdfReader.Close();
return resultReader;
}
If I want to explain visually, the blank pdf I created will be uploaded in this way. Thank you
The following shows how to create a PDF using iTextSharp (v. 5.5.13.3), and add images to the PDF (one image per page). It's been tested with .NET 6.
Pre-requisite:
Download / install NuGet package iTextSharp (v. 5.5.13.3)
Add the following using statements:
using iTextSharp.text;
using iTextSharp.text.pdf;
using System.IO;
In the code below the PDF is saved to a byte array, instead of a file to allow for more options.
CreatePdfDocument:
public static byte[] CreatePdfDocument(List<string> imagePaths)
{
byte[] pdfBytes;
using (MemoryStream ms = new MemoryStream())
{
using (Document doc = new Document(PageSize.LETTER, 1.0f, 1.0f, 1.0f, 1.0f))
{
using (PdfWriter writer = PdfWriter.GetInstance(doc, ms))
{
//open
doc.Open();
//create a new page for each image
for (int i = 0; i < imagePaths.Count; i++)
{
//add new page
doc.NewPage();
//get image
iTextSharp.text.Image img = iTextSharp.text.Image.GetInstance(imagePaths[i]);
//ToDo: set desired image size
//img.ScaleAbsolute(100.0f, 100.0f);
//center image on page
img.SetAbsolutePosition((PageSize.LETTER.Width - img.ScaledWidth) / 2, (PageSize.LETTER.Height - img.ScaledHeight) / 2);
//add image to page
doc.Add(img);
}
//close
doc.Close();
//convert MemoryStream to byte[]
pdfBytes = ms.ToArray();
}
}
}
return pdfBytes;
}
I've decided that all I want to do is write the PDF to a file. The method below writes the PDF to a file.
CreatePdf:
public static void CreatePdf(string filename, List<string> imagePaths)
{
//create PDF
byte[] pdfBytes = CreatePdfDocument(imagePaths);
//save PDF to file
System.IO.File.WriteAllBytes (filename, pdfBytes);
}
Resources:
How to generate PDF one page one row from datatable using iTextSharp
iTextSharp - Working with images
Center image in pdf using itextsharp
Additional Resources:
iTextSharp: How to resize an image to fit a fix size?

MemoryStream PDF file in ASP.NET Core 2

Keep getting an error 'Cannot access a closed Stream.'
Does the file need to be physically saved on a server first, prior to being Streamed? I am doing the same with Excel file and it works just fine. Trying to use the same principle here with PDF file.
[HttpGet("ExportPdf/{StockId}")]
public IActionResult ExportPdf(int StockId)
{
string fileName = "test.pdf";
MemoryStream memoryStream = new MemoryStream();
// Create PDF document
Document document = new Document(PageSize.A4, 25, 25, 25, 25);
PdfWriter pdfWriter = PdfWriter.GetInstance(document, memoryStream);
document.Open();
document.Add(new Paragraph("Hello World"));
document.Close();
memoryStream.Position = 0;
return File(memoryStream, "application/pdf", fileName);
}
Are you using iText7? If yes, please add that tag onto your question.
using (var output = new MemoryStream())
{
using (var pdfWriter = new PdfWriter(output))
{
// You need to set this to false to prevent the stream
// from being closed.
pdfWriter.SetCloseStream(false);
using (var pdfDocument = new PdfDocument(pdfWriter))
{
...
}
var renderedBuffer = new byte[output.Position];
output.Position = 0;
output.Read(renderedBuffer, 0, renderedBuffer.Length);
...
}
}
Switched to iText& (7.1.1) version as per David Liang comment
Here is the code that worked for me, a complete method:
[HttpGet]
public IActionResult Pdf()
{
MemoryStream memoryStream = new MemoryStream();
PdfWriter pdfWriter = new PdfWriter(memoryStream);
PdfDocument pdfDocument = new PdfDocument(pdfWriter);
Document document = new Document(pdfDocument);
document.Add(new Paragraph("Welcome"));
document.Close();
byte[] file = memoryStream.ToArray();
MemoryStream ms = new MemoryStream();
ms.Write(file, 0, file.Length);
ms.Position = 0;
return File(fileStream: ms, contentType: "application/pdf", fileDownloadName: "test_file_name" + ".pdf");
}

Convert html to pdf and merge it with existing pdfs

I have a System.Net.Mail.MailMessage which shall have it's html body and pdf attachments converted into one single pdf.
Converting the html body to pdf works for me with this answer
Converting the pdf attachments into one pdf works for me with this answer
However after ~10 hours of trying I can not come up with a combined solution which does both. All I'm getting are NullReferenceExceptions somewhere in IText source, "the document is not open", etc...
For example, this will throw no error but the resulting pdf will only contain the attachments but not the html email body:
Document document = new Document();
StringReader sr = new StringReader(mail.Body);
HTMLWorker htmlparser = new HTMLWorker(document);
using (FileStream fs = new FileStream(targetPath, FileMode.Create))
{
PdfCopy writer = new PdfCopy(document, fs);
document.Open();
htmlparser.Parse(sr);
foreach (string fileName in pdfList)
{
PdfReader reader = new PdfReader(fileName);
reader.ConsolidateNamedDestinations();
for (int i = 1; i <= reader.NumberOfPages; i++)
{
PdfImportedPage page = writer.GetImportedPage(reader, i);
writer.AddPage(page);
}
PRAcroForm form = reader.AcroForm;
if (form != null)
{
writer.CopyAcroForm(reader);
}
reader.Close();
}
writer.Close();
document.Close();
}
I'm using the LGPL licensed ITextSharp 4.1.6
From v4.1.6 fanboy to v4.1.6 fanboy :D
Looks like the HTMLWorker is closing the documents stream right after parsing. So as a workaround, you could create a pdf from your mailbody in memory. And then add this one together with the attachment to your final pdf.
Here is some code, that should do the trick:
StringReader htmlStringReader = new StringReader("<html><body>Hello World!!!!!!</body></html>");
byte[] htmlResult;
using (MemoryStream htmlStream = new MemoryStream())
{
Document htmlDoc = new Document();
PdfWriter htmlWriter = PdfWriter.GetInstance(htmlDoc, htmlStream);
htmlDoc.Open();
HTMLWorker htmlWorker = new HTMLWorker(htmlDoc);
htmlWorker.Parse(htmlStringReader);
htmlDoc.Close();
htmlResult = htmlStream.ToArray();
}
byte[] pdfResult;
using (MemoryStream pdfStream = new MemoryStream())
{
Document doc = new Document();
PdfCopy copyWriter = new PdfCopy(doc, pdfStream);
doc.Open();
PdfReader htmlPdfReader = new PdfReader(htmlResult);
AppendPdf(copyWriter, htmlPdfReader); // your foreach pdf code here
htmlPdfReader.Close();
PdfReader attachmentReader = new PdfReader("C:\\temp\\test.pdf");
AppendPdf(copyWriter, attachmentReader);
attachmentReader.Close();
doc.Close();
pdfResult = pdfStream.ToArray();
}
using (FileStream fs = new FileStream("C:\\temp\\test2.pdf", FileMode.Create, FileAccess.Write))
{
fs.Write(pdfResult, 0, pdfResult.Length);
}
private void AppendPdf(PdfCopy writer, PdfReader reader)
{
for (int i = 1; i <= reader.NumberOfPages; i++)
{
PdfImportedPage page = writer.GetImportedPage(reader, i);
writer.AddPage(page);
}
}
Ofc you could directly use a FileStream for the final document instead of a MemoryStream as well.

how to add pagenumbers to every pdf page using itextsharp

here is What i want i want to add page numbers to every pdf page that i generated on the fly.
i used on end page method but it did not worked out even when i added the doc bottom margin.
I decided to add the page numbers after the pdf is generated from the file path.
here is my code for generating pdf:
Document doc = new Document(iTextSharp.text.PageSize.LETTER, 10, 10, 42, 35);
PdfWriter wri = PdfWriter.GetInstance(doc, new FileStream("t5.pdf", FileMode.Create));
doc.Open();//Open Document to write
iTextSharp.text.Font font8 = FontFactory.GetFont("ARIAL", 7);
Paragraph paragraph = new Paragraph("Some content");
doc.Add(paragraph);
doc.Add(paragraph);// add paragraph to the document
doc.Close();
FileStream stream = File.OpenRead("t5.pdf");
byte[] fileBytes = new byte[stream.Length];
stream.Read(fileBytes, 0, fileBytes.Length);
stream.Close();
AddPageNumbers(fileBytes);
using (Stream file = File.OpenWrite("t5.pdf"))
{
file.Write(fileBytes, 0, fileBytes.Length);
}
}
and her is my add pagenumbers method:
MemoryStream ms = new MemoryStream();
PdfReader reader = new PdfReader(pdf);
int n = reader.NumberOfPages;
iTextSharp.text.Rectangle psize = reader.GetPageSize(1);
Document document = new Document(psize, 50, 50, 50, 50);
PdfWriter writer = PdfWriter.GetInstance(document, ms);
document.Open();
PdfContentByte cb = writer.DirectContent;
int p = 0;
for (int page = 1; page <= reader.NumberOfPages; page++)
{
document.NewPage();
p++;
PdfImportedPage importedPage = writer.GetImportedPage(reader, page);
cb.AddTemplate(importedPage, 0, 0);
BaseFont bf = BaseFont.CreateFont(BaseFont.HELVETICA, BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
cb.BeginText();
cb.SetFontAndSize(bf, 10);
cb.ShowTextAligned(PdfContentByte.ALIGN_CENTER, +p + "/" + n, 100, 450, 0);
cb.EndText();
}
document.Close();
return ms.ToArray();
how ever it does not add the page numbers to the pdf document so what is the alternatives here? what can i do.
When posting a question here, please only post the smallest amount of code possible. Your "create a sample PDF with multiple pages" is 116 lines long. Inside of it you've got complicated PdfPTable and DataTable logic that is 100% unrelated to the problem. Instead, the following 13 lines is enough to make a multiple page PDF:
//Create a sample multiple page PDF and place it on the desktop
var outputFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "t5.pdf");
using (var fs = new FileStream(outputFile, FileMode.Create, FileAccess.Write, FileShare.None)) {
using (var doc = new Document()) {
using (var writer = PdfWriter.GetInstance(doc, fs)) {
doc.Open();
for (var i = 0; i < 1000; i++) {
doc.Add(new Paragraph(String.Format("This is paragraph #{0}", i)));
}
doc.Close();
}
}
}
Second, get rid of try/catch. Those are great for production (sometimes) but at the development level that's why we have IDEs and compilers, they'll tell us specifically what's wrong.
Now on to the bigger problem, you need to keep these two processes separate from each other. Every single brace and object from part part #1 must be closed, done and accounted for. Part #2 then needs to be fed a completely valid PDF but neither of the two parts should be "aware" of each other or depend on each other.
Since you just borrowed some code that wasn't intended for what you're trying to do I'm going to also ignore that and use some code that I know specifically will work. Also, since you're open to using a MemoryStream in the first place I'm just going to avoid writing to disk until I need to. Below is a full working sample that creates a multiple page and then adds page numbers in a second pass.
//Will hold our PDF as a byte array
Byte[] bytes;
//Create a sample multiple page PDF, nothing special here
using (var ms = new MemoryStream()) {
using (var doc = new Document()) {
using (var writer = PdfWriter.GetInstance(doc, ms)) {
doc.Open();
for (var i = 0; i < 1000; i++) {
doc.Add(new Paragraph(String.Format("This is paragraph #{0}", i)));
}
doc.Close();
}
}
//Store our bytes before
bytes = ms.ToArray();
}
//Read our sample PDF and apply page numbers
using (var reader = new PdfReader(bytes)) {
using (var ms = new MemoryStream()) {
using (var stamper = new PdfStamper(reader, ms)) {
int PageCount = reader.NumberOfPages;
for (int i = 1; i <= PageCount; i++) {
ColumnText.ShowTextAligned(stamper.GetOverContent(i), Element.ALIGN_CENTER, new Phrase(String.Format("Page {0} of {1}", i, PageCount)), 100, 10 , 0);
}
}
bytes = ms.ToArray();
}
}
var outputFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "t5.pdf");
System.IO.File.WriteAllBytes(outputFile, bytes);

Adding text to existing multipage PDF document in memorystream using iTextSharp

I am trying to add text to an existing PDF file using iTextSharp. I have been reading many posts, including the popular thread here.
I have some differences:
My PDF are X pages long
I want to keep everything in memory, and never have a file stored on my filesystem
So I tried to modify the code, so it takes in a byte array and returns a byte array. I have come this far:
The code compiles and runs
My out byte array has a different length than my in byte array
My problem:
I cannot see my added text when i later store the modified byte array and open it in my PDF reader
I don't get why. From every StackOverflow post I have seen, I do the same. using the DirectContent, I use BeginText and write a text. However, i cannot see it, no matter how I move the position around.
Any idea what is missing from my code?
public static byte[] WriteIdOnPdf(byte[] inPDF, string str)
{
byte[] finalBytes;
// open the reader
using (PdfReader reader = new PdfReader(inPDF))
{
Rectangle size = reader.GetPageSizeWithRotation(1);
using (Document document = new Document(size))
{
// open the writer
using (MemoryStream ms = new MemoryStream())
{
using (PdfWriter writer = PdfWriter.GetInstance(document, ms))
{
document.Open();
for (var i = 1; i <= reader.NumberOfPages; i++)
{
document.NewPage();
var baseFont = BaseFont.CreateFont(BaseFont.HELVETICA_BOLD, BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
var importedPage = writer.GetImportedPage(reader, i);
var contentByte = writer.DirectContent;
contentByte.BeginText();
contentByte.SetFontAndSize(baseFont, 18);
var multiLineString = "Hello,\r\nWorld!";
contentByte.ShowTextAligned(PdfContentByte.ALIGN_LEFT, multiLineString,100, 200, 0);
contentByte.EndText();
contentByte.AddTemplate(importedPage, 0, 0);
}
document.Close();
ms.Close();
writer.Close();
reader.Close();
}
finalBytes = ms.ToArray();
}
}
}
return finalBytes;
}
The code below shows off a full-working example of creating a PDF in memory and then performing a second pass, also in memory. It does what #mkl says and closes all iText parts before trying to grab the raw bytes from the stream. It also uses GetOverContent() to draw "on top" of the previous pdf. See the code comments for more details.
//Bytes will hold our final PDFs
byte[] bytes;
//Create an in-memory PDF
using (var ms = new MemoryStream()) {
using (var doc = new Document()) {
using (var writer = PdfWriter.GetInstance(doc, ms)) {
doc.Open();
//Create a bunch of pages and add text, nothing special here
for (var i = 1; i <= 10; i++) {
doc.NewPage();
doc.Add(new Paragraph(String.Format("First Pass - Page {0}", i)));
}
doc.Close();
}
}
//Right before disposing of the MemoryStream grab all of the bytes
bytes = ms.ToArray();
}
//Another in-memory PDF
using (var ms = new MemoryStream()) {
//Bind a reader to the bytes that we created above
using (var reader = new PdfReader(bytes)) {
//Store our page count
var pageCount = reader.NumberOfPages;
//Bind a stamper to our reader
using (var stamper = new PdfStamper(reader, ms)) {
//Setup a font to use
var baseFont = BaseFont.CreateFont(BaseFont.HELVETICA_BOLD, BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
//Loop through each page
for (var i = 1; i <= pageCount; i++) {
//Get the raw PDF stream "on top" of the existing content
var cb = stamper.GetOverContent(i);
//Draw some text
cb.BeginText();
cb.SetFontAndSize(baseFont, 18);
cb.ShowText(String.Format("Second Pass - Page {0}", i));
cb.EndText();
}
}
}
//Once again, grab the bytes before closing things out
bytes = ms.ToArray();
}
//Just to see the final results I'm writing these bytes to disk but you could do whatever
var testFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "test.pdf");
System.IO.File.WriteAllBytes(testFile, bytes);

Categories