Grab all of the pages of a PDF using iTextSharp

I am getting a PDF using the older version of iTextSharp with this code:
string Oldfile = @"C:/test.pdf"; // Gets the Template
(new FileInfo("C:/test.pdf")).Directory.Create(); // Create the folder if it does not exist
string NewFile = "C:/test.pdf";
PdfReader reader = new PdfReader(Oldfile);
iTextSharp.text.Rectangle Size = reader.GetPageSizeWithRotation(1);
Document document = new Document(Size);
// MemoryStream memory_stream = new MemoryStream();
FileStream fs = new FileStream(NewFile, FileMode.Create, FileAccess.Write);
PdfWriter weiter = PdfWriter.GetInstance(document, fs);
document.Open();
PdfContentByte cb = weiter.DirectContent;
PdfImportedPage page = weiter.GetImportedPage(reader, 1);
//PdfImportedPage page2 = weiter.GetImportedPage(reader, 2);
cb.AddTemplate(page, 0, 0);
The problem is that the source PDF has two pages, but this code only imports the first one: it adds my lines and saves a PDF containing just page 1. I want to be able to grab both pages. Alternatively, is there a way to merge them afterwards?

I bet you need to iterate all pages.
using System;
using System.IO;
using System.Collections.Generic;
using iTextSharp.text;
using iTextSharp.text.pdf;
namespace TestAnything
{
class Program
{
static void Main(string[] args)
{
List<string> filesToMerge = new List<string> { @"c:\temp\1.pdf", @"c:\temp\2.pdf" };
FileInfo destinationFile = new FileInfo(@"c:\temp\merge.pdf");
if (File.Exists(destinationFile.FullName))
File.Delete(destinationFile.FullName);
MergeFiles(filesToMerge, destinationFile);
}
public static void MergeFiles(List<string> sourceFiles, FileInfo destinationFile)
{
if (sourceFiles == null || sourceFiles.Count == 0)
throw new ArgumentException("At least one source file is required.", "sourceFiles");
PdfReader reader = new PdfReader(sourceFiles[0]);
Document document = new Document(reader.GetPageSizeWithRotation(1));
PdfCopy writer = new PdfCopy(document, new FileStream(destinationFile.FullName, FileMode.Create));
document.Open();
try
{
foreach (string sourceFile in sourceFiles)
{
reader = new PdfReader(sourceFile);
reader.ConsolidateNamedDestinations();
for (int x = 1; x <= reader.NumberOfPages; x++)
writer.AddPage(writer.GetImportedPage(reader, x));
PRAcroForm form = reader.AcroForm;
if (form != null)
writer.CopyAcroForm(reader);
}
}
finally
{
if (document.IsOpen())
document.Close();
}
}
}
}

Related

merging pdf and preserve SetTagged

I'm using iTextSharp 5.x. I'm trying to merge two PDFs and preserve the tagged flag (isTagged). When I remove copy.SetTagged(); the resulting PDF contains both documents, which is great. When I add copy.SetTagged() I get an exception:
Exception -->System.ObjectDisposedException: Cannot access a closed file.
at System.IO.__Error.FileNotOpen()
at System.IO.FileStream.get_Position()
Here is the code
List<string> filesToMerge = new List<string> { "C:/dev/dcs/wp-cla-dcs/Hex/Docs/metadata/coverPage.pdf", "C:/dev/dcs/wp-cla-dcs/Hex/Docs/metadata/49W7a.pdf" };
string outputFileName = "C:/dev/dcs/wp-cla-dcs/Hex/Docs/metadata/results.pdf";
using (FileStream outFS = new FileStream(outputFileName, FileMode.Create))
using (Document document = new Document())
// using (PdfCopy copy = new PdfCopy(document, outFS))
using (PdfCopy copy = new PdfSmartCopy(document, outFS))
{
{
copy.SetTagged();
// Set up the iTextSharp document
document.Open();
foreach (string pdfFile in filesToMerge)
{
using (var reader = new PdfReader(pdfFile))
{
copy.AddDocument(reader);
copy.FreeReader(reader);
}
}
}
}

Despite @bruno-lowagie's comment, I have had better results doing this with iText 5.
Using iText 7, PdfMerger left several pieces of content untagged (all were tagged in the source document). PdfCopy in iText 5, however, worked just fine; I only needed to add the XMP metadata, title, language, etc. manually:
public static void CombineMultiplePDFs(string[] fileNames, string outFile)
{
var lang = "en";
var title = "My new title";
// step 1: creation of a document-object
Document document = new Document();
// step 2: we create a writer that listens to the document
FileStream newFileStream = new FileStream(outFile, FileMode.Create);
PdfCopy writer = new PdfCopy(document, newFileStream);
writer.SetTagged();
writer.PdfVersion = PdfWriter.VERSION_1_7;
writer.AddViewerPreference(PdfName.DISPLAYDOCTITLE, new PdfBoolean(true));
writer.Info.Put(PdfName.TITLE, new PdfString(title));
writer.CreateXmpMetadata();
// step 3: we open the document
document.Open();
// set meta data
document.AddLanguage(lang);
document.AddTitle(title);
// keep an array of all open readers so they can be closed again.
var readers = new PdfReader[fileNames.Length];
for (var fi = 0; fi < fileNames.Length; fi++)
{
// we create a reader for a certain document
var fileName = fileNames[fi];
PdfReader reader = new PdfReader(fileName);
readers[fi] = reader;
reader.ConsolidateNamedDestinations();
// step 4: we add content
for (int i = 1; i <= reader.NumberOfPages; i++)
{
// IMPORTANT: the third param is "KeepTaggedPdfStructure"
PdfImportedPage page = writer.GetImportedPage(reader, i, true);
writer.AddPage(page);
}
}
// step 5: we close the document and writer
writer.Close();
document.Close();
// close readers only after the document is closed
foreach (var r in readers)
{
r.Close();
}
}

How to create a copy of a PDF file in ASP.NET MVC

I'm reading a PDF file for writing a string on it like this :
public ActionResult Index(HttpPostedFileBase file)
{
byte[] pdfbytes = null;
BinaryReader rdr = new BinaryReader(file.InputStream);
pdfbytes = rdr.ReadBytes((int)file.ContentLength);
PdfReader myReader = new PdfReader(pdfbytes);
and I'm trying to pass a new file to FileStream like this :
FileStream fs = new FileStream(newFile, FileMode.Create, FileAccess.Write);
But I don't know how to pass the copied new file to fs object. Can you help me with that? Thanks.

If you have access to updated byte array pass it to File.WriteAllBytes. Or you might have an instance of PdfDocument or PdfWriter which usually allow saving the document to file on disk too. Hope it helps!
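
As a rough illustration of the first suggestion, the sketch below reads a stream fully into a byte array and writes it to disk with File.WriteAllBytes. It uses only System.IO. The StreamCopyHelper class and its method names are made up for this example, and the bytes are saved unchanged, so any iTextSharp editing would have to happen before the save.

```csharp
using System.IO;

public static class StreamCopyHelper
{
    // Read every remaining byte of the input stream into memory.
    public static byte[] ReadAllBytes(Stream input)
    {
        using (var buffer = new MemoryStream())
        {
            input.CopyTo(buffer);
            return buffer.ToArray();
        }
    }

    // Save the bytes as a new file, creating the target folder if it is missing.
    public static void SaveCopy(byte[] pdfBytes, string newFile)
    {
        string folder = Path.GetDirectoryName(Path.GetFullPath(newFile));
        if (!string.IsNullOrEmpty(folder))
        {
            Directory.CreateDirectory(folder);
        }
        File.WriteAllBytes(newFile, pdfBytes);
    }
}
```

In the controller from the question, this might be called as StreamCopyHelper.SaveCopy(pdfbytes, newFile) once pdfbytes holds the final document.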

Here is an example that reads an existing PDF file, copies it to a new one, and adds a new line of text:
using iTextSharp.text;
using iTextSharp.text.pdf;
using System.IO;
namespace ConsoleApp1
{
class Program
{
static void Main(string[] args)
{
string originalFile = "c:\\Users\\Admin\\Desktop\\receipt mod 3.pdf";
string copyOfOriginal = "c:\\Users\\Admin\\Desktop\\newFile.pdf";
using (var reader = new PdfReader(originalFile))
{
using (var fileStream = new FileStream(copyOfOriginal, FileMode.Create, FileAccess.Write))
{
var document = new Document(reader.GetPageSizeWithRotation(1));
var writer = PdfWriter.GetInstance(document, fileStream);
document.Open();
for (var i = 1; i <= reader.NumberOfPages; i++)
{
document.NewPage();
var baseFont = BaseFont.CreateFont(BaseFont.HELVETICA_BOLD, BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
var importedPage = writer.GetImportedPage(reader, i);
var contentByte = writer.DirectContent;
contentByte.BeginText();
contentByte.SetFontAndSize(baseFont, 12);
var LineString = "Hello World!";
contentByte.ShowTextAligned(PdfContentByte.ALIGN_LEFT, LineString, 50, 50, 0);
contentByte.EndText();
contentByte.AddTemplate(importedPage, 0, 0);
}
document.Close();
writer.Close();
}
}
}
}
}

Try this.
This program copies all pdf files from one location to another.
protected void Button1_Click(object sender, EventArgs e)
{
string sourceDirectory = @"D:\project training\source";
string targetDirectory = @"D:\project training\destiny";
Copy(sourceDirectory, targetDirectory);
}
public static void Copy(string sourceDirectory, string targetDirectory)
{
DirectoryInfo diSource = new DirectoryInfo(sourceDirectory);
DirectoryInfo diTarget = new DirectoryInfo(targetDirectory);
CopyAll(diSource, diTarget);
}
public static void CopyAll(DirectoryInfo source, DirectoryInfo target)
{
Directory.CreateDirectory(target.FullName);
foreach (FileInfo fi in source.GetFiles())
{
if (fi.Extension.Equals(".pdf", StringComparison.OrdinalIgnoreCase))
{
fi.CopyTo(Path.Combine(target.FullName, fi.Name), true);
}
}
foreach (DirectoryInfo diSourceSubDir in source.GetDirectories())
{
DirectoryInfo nextTargetSubDir =
target.CreateSubdirectory(diSourceSubDir.Name);
CopyAll(diSourceSubDir, nextTargetSubDir);
}
}

How to create a PDF Portfolio using existing PDF using iTextSharp [duplicate]

How can I merge several PDF pages into one document with iTextSharp, in a way that also supports pages containing form elements such as text boxes, check boxes, etc.?
I have tried many approaches found by googling, but none has worked well.

See my answer to "Merging Memory Streams", where I give an example of how to merge PDFs with iTextSharp.
To update the form field names, add this code, which uses the stamper to rename the form fields.
/// <summary>
/// Merges pdf files from a byte list
/// </summary>
/// <param name="files">list of files to merge</param>
/// <returns>memory stream containing combined pdf</returns>
public MemoryStream MergePdfForms(List<byte[]> files)
{
if (files.Count > 1)
{
string[] names;
PdfStamper stamper;
MemoryStream msTemp = null;
PdfReader pdfTemplate = null;
PdfReader pdfFile;
Document doc;
PdfWriter pCopy;
MemoryStream msOutput = new MemoryStream();
pdfFile = new PdfReader(files[0]);
doc = new Document();
pCopy = new PdfSmartCopy(doc, msOutput);
pCopy.PdfVersion = PdfWriter.VERSION_1_7;
doc.Open();
for (int k = 0; k < files.Count; k++)
{
for (int i = 1; i < pdfFile.NumberOfPages + 1; i++)
{
msTemp = new MemoryStream();
pdfTemplate = new PdfReader(files[k]);
stamper = new PdfStamper(pdfTemplate, msTemp);
names = new string[stamper.AcroFields.Fields.Keys.Count];
stamper.AcroFields.Fields.Keys.CopyTo(names, 0);
foreach (string name in names)
{
stamper.AcroFields.RenameField(name, name + "_file" + k.ToString());
}
stamper.Close();
pdfFile = new PdfReader(msTemp.ToArray());
((PdfSmartCopy)pCopy).AddPage(pCopy.GetImportedPage(pdfFile, i));
pCopy.FreeReader(pdfFile);
}
}
pdfFile.Close();
pCopy.Close();
doc.Close();
return msOutput;
}
else if (files.Count == 1)
{
return new MemoryStream(files[0]);
}
return null;
}

Here is my simplified version of Jonathan's Merge code with namespaces added, and stamping removed.
public System.IO.MemoryStream MergePdfForms(System.Collections.Generic.List<byte[]> files)
{
if (files.Count > 1) {
byte[] merged;
using (System.IO.MemoryStream msOutput = new System.IO.MemoryStream()) {
using (iTextSharp.text.Document doc = new iTextSharp.text.Document()) {
using (iTextSharp.text.pdf.PdfSmartCopy pCopy = new iTextSharp.text.pdf.PdfSmartCopy(doc, msOutput) { PdfVersion = iTextSharp.text.pdf.PdfWriter.VERSION_1_7 }) {
doc.Open();
foreach (byte[] oFile in files) {
using (iTextSharp.text.pdf.PdfReader pdfFile = new iTextSharp.text.pdf.PdfReader(oFile)) {
for (int i = 1; i <= pdfFile.NumberOfPages; i++) {
pCopy.AddPage(pCopy.GetImportedPage(pdfFile, i));
}
// Release the reader once all of its pages have been copied
pCopy.FreeReader(pdfFile);
}
}
}
}
// ToArray still works after the writer has closed the stream
merged = msOutput.ToArray();
}
// Return a fresh, readable stream rather than the disposed one
return new System.IO.MemoryStream(merged);
} else if (files.Count == 1) {
return new System.IO.MemoryStream(files[0]);
}
return null;
}

To merge PDFs, see "Merging two pdf pages into one using itextsharp".

Below is my code for merging PDFs. Thanks to Jonathan for the suggestion about renaming fields, which resolved the issues I had when merging PDF pages with form fields.
private static void CombineAndSavePdf(string savePath, List<string> lstPdfFiles)
{
using (Stream outputPdfStream = new FileStream(savePath, FileMode.Create, FileAccess.Write, FileShare.None))
{
Document document = new Document();
PdfSmartCopy copy = new PdfSmartCopy(document, outputPdfStream);
document.Open();
PdfReader reader;
int totalPageCnt;
PdfStamper stamper;
string[] fieldNames;
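int fileIndex = 0;
foreach (string file in lstPdfFiles)
{
// Rename the fields in an in-memory copy first; the stamper must not write to the merged output stream
byte[] renamedPdf;
using (MemoryStream renamedStream = new MemoryStream())
{
reader = new PdfReader(file);
stamper = new PdfStamper(reader, renamedStream);
fieldNames = new string[stamper.AcroFields.Fields.Keys.Count];
stamper.AcroFields.Fields.Keys.CopyTo(fieldNames, 0);
foreach (string name in fieldNames)
{
// Suffix with the file index so names stay unique across source files
stamper.AcroFields.RenameField(name, name + "_file" + fileIndex.ToString());
}
stamper.Close();
reader.Close();
renamedPdf = renamedStream.ToArray();
}
// Copy every page of the renamed document into the output
reader = new PdfReader(renamedPdf);
totalPageCnt = reader.NumberOfPages;
for (int pageCnt = 1; pageCnt <= totalPageCnt; pageCnt++)
{
copy.AddPage(copy.GetImportedPage(reader, pageCnt));
}
copy.FreeReader(reader);
reader.Close();
fileIndex++;
}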
document.Close();
}
}

Crop PDF from each edge using iTextSharp

I am trying to crop a PDF by 5 mm from every edge, i.e. top, bottom, right and left. I tried the code below:
public void TrimPdf(string sourceFilePath, string outputFilePath)
{
PdfReader pdfReader = new PdfReader(sourceFilePath);
float widthTo_Trim = iTextSharp.text.Utilities.MillimetersToPoints(5);
using (FileStream output = new FileStream(outputFilePath, FileMode.Create, FileAccess.Write))
using (PdfStamper pdfStamper = new PdfStamper(pdfReader, output))
{
for (int page = 1; page <= pdfReader.NumberOfPages; page++)
{
Rectangle cropBox = pdfReader.GetCropBox(page);
cropBox.Left += widthTo_Trim;
cropBox.Right += widthTo_Trim;
cropBox.Top += widthTo_Trim;
cropBox.Bottom += widthTo_Trim;
pdfReader.GetPageN(page).Put(PdfName.CROPBOX, new PdfRectangle(cropBox));
}
}
}
With this code I am able to crop only the left and bottom parts; I am unable to crop the top and right sides.
How can I get the desired result?

I solved my problem with the code below:
public void TrimLeftandRightFoall(string sourceFilePath, string outputFilePath, float cropwidth)
{
PdfReader pdfReader = new PdfReader(sourceFilePath);
float width = (float)GetPDFwidth(sourceFilePath);
float height = (float)GetPDFHeight(sourceFilePath);
float widthTo_Trim = iTextSharp.text.Utilities.MillimetersToPoints(cropwidth);
PdfRectangle rectLeftside = new PdfRectangle(widthTo_Trim, widthTo_Trim, width-widthTo_Trim , height-widthTo_Trim);
using (var output = new FileStream(outputFilePath, FileMode.CreateNew, FileAccess.Write))
{
// Create a new document
Document doc = new Document();
// Make a copy of the document
PdfSmartCopy smartCopy = new PdfSmartCopy(doc, output);
// Open the newly created document
doc.Open();
// Loop through all pages of the source document
for (int i = 1; i <= pdfReader.NumberOfPages; i++)
{
// Get a page
var page = pdfReader.GetPageN(i);
page.Put(PdfName.MEDIABOX, rectLeftside);
var copiedPage = smartCopy.GetImportedPage(pdfReader, i);
smartCopy.AddPage(copiedPage);
}
doc.Close();
}
}
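
One likely reason the TrimPdf attempt in the question only trims the left and bottom edges is that all four coordinates are increased by the same amount, which moves the upper-right corner outward instead of inward. Here is a minimal sketch of the arithmetic in plain C#, with no iTextSharp types. The Box struct and the CropMath helpers are hypothetical names for illustration, and the conversion assumes 72 points per inch and 25.4 mm per inch.

```csharp
using System;

// Hypothetical stand-in for a page box in points:
// lower-left corner (Left, Bottom), upper-right corner (Right, Top).
public struct Box
{
    public float Left, Bottom, Right, Top;

    public Box(float left, float bottom, float right, float top)
    {
        Left = left; Bottom = bottom; Right = right; Top = top;
    }
}

public static class CropMath
{
    // 1 inch = 25.4 mm = 72 points.
    public static float MillimetersToPoints(float mm)
    {
        return mm / 25.4f * 72f;
    }

    // Shrink the box by the same margin on every side: the lower-left
    // corner moves up and right, the upper-right corner moves down and left.
    public static Box Inset(Box box, float margin)
    {
        return new Box(box.Left + margin, box.Bottom + margin,
                       box.Right - margin, box.Top - margin);
    }
}
```

Applied to the question's code, this would correspond to subtracting widthTo_Trim from the right and top values, rather than adding it, before building the PdfRectangle.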

iTextSharp problem concatenating PDF documents

I am trying to build up a single PDF from a bunch of other PDFs that I am filling out some form values in. Essentially I am doing a PDF mail merge. My code is below:
byte[] completedDocument = null;
using (MemoryStream streamCompleted = new MemoryStream())
{
using (Document document = new Document())
{
document.Open();
PdfCopy copy = new PdfCopy(document, streamCompleted);
copy.Open();
foreach (var item in eventItems)
{
byte[] mergedDocument = null;
PdfReader reader = new PdfReader(pdfTemplates[item.DataTokens[NotifyTokenType.OrganisationID]]);
using (MemoryStream streamTemplate = new MemoryStream())
{
using (PdfStamper stamper = new PdfStamper(reader, streamTemplate))
{
foreach (var token in item.DataTokens)
{
if (stamper.AcroFields.Fields.Any(fld => fld.Key == token.Key.ToString()))
{
stamper.AcroFields.SetField(token.Key.ToString(), token.Value);
}
}
stamper.FormFlattening = true;
stamper.Writer.CloseStream = false;
}
mergedDocument = new byte[streamTemplate.Length];
streamTemplate.Position = 0;
streamTemplate.Read(mergedDocument, 0, (int)streamTemplate.Length);
}
reader = new PdfReader(mergedDocument);
for (int i = 1; i <= reader.NumberOfPages; i++)
{
document.SetPageSize(PageSize.A4);
copy.AddPage(copy.GetImportedPage(reader, i));
}
}
}
completedDocument = new byte[streamCompleted.Length];
streamCompleted.Position = 0;
streamCompleted.Read(completedDocument, 0, (int)streamCompleted.Length);
}
The problem I am having is that it throws a null reference exception when it exits the using (Document document = new Document()) block.
From debugging the iTextSharp source, the problem is in the method below in PdfAnnotationsImp:
public bool HasUnusedAnnotations() {
return annotations.Count > 0;
}
annotations is null so this throws the null ref exception. Is there something I should be doing to instantiate this?

I changed:
document.Open();
PdfCopy copy = new PdfCopy(document, streamCompleted);
to
PdfCopy copy = new PdfCopy(document, streamCompleted);
document.Open();
And it fixed the problem. This library needs better exception handling. When you do something slightly wrong it falls over horribly and gives you no clue about what you did wrong. I have no idea how I could possibly have worked this out if I didn't have the source code.

What version of iTextSharp are you using? The Document class doesn't implement IDisposable so you can't wrap it in a using block.
