iTextSharp structureTreeRoot.numTree is null

I'm getting an error while closing my document. It is thrown inside the "FixTaggedStructure" function of PdfCopy, on this line:
Dictionary<int, PdfIndirectReference> numTree = structureTreeRoot.NumTree;
My debugger shows that "structureTreeRoot" is null, but I don't know why.
My code is very simple. I am trying to convert a PDF to PDF/A-1, following the answer to
"Convert PDF to PDF/A3 or PDF/A-1 to PDF/A-3":
Document doc = new Document();
FileStream fs = new FileStream(destPdfA, FileMode.Create);
PdfReader reader = new PdfReader(pdfParth);
PdfCopy copy = new PdfCopy(doc, fs);
copy.SetPdfVersion(PdfCopy.PDF_VERSION_1_4);
copy.SetTagged();
copy.CreateXmpMetadata();
doc.Open();
ICC_Profile icc = ICC_Profile.GetInstance(new FileStream(ICM, FileMode.Open));
PdfDictionary outi = new PdfDictionary(PdfName.OUTPUTINTENT);
outi.Put(PdfName.OUTPUTCONDITIONIDENTIFIER, new PdfString("sRGB IEC61966-2.1"));
outi.Put(PdfName.INFO, new PdfString("sRGB IEC61966-2.1"));
outi.Put(PdfName.S, PdfName.GTS_PDFA1);
// get this file here: http://old.nabble.com/attachment/10971467/0/srgb.profile
PdfICCBased ib = new PdfICCBased(icc);
ib.Remove(PdfName.ALTERNATE);
outi.Put(PdfName.DESTOUTPUTPROFILE, copy.AddToBody(ib).IndirectReference);
copy.ExtraCatalog.Put(PdfName.OUTPUTINTENTS, outi);
copy.AddDocument(reader);
doc.Close();

Related

iText7 PdfDocument save to two locations on disk

In the code below I want to read in a PDF file, add encryption, and then save the result to two locations. What is the best way to do that? I don't see a Save method on PdfDocument; is there another object I should use?
PdfReader pdfReader = null;
byte[] bytesPassword = System.Text.ASCIIEncoding.UTF8.GetBytes("PassWord");
WriterProperties writerProperties = new WriterProperties();
PdfDocument pdfDocument = null;
using (MemoryStream ms = new MemoryStream())
{
    pdfReader = new PdfReader(destFile);
    writerProperties.SetStandardEncryption(null, bytesPassword, EncryptionConstants.ALLOW_PRINTING, EncryptionConstants.ENCRYPTION_AES_256);
    pdfDocument = new PdfDocument(pdfReader, new PdfWriter(ms, writerProperties));
    pdfDocument.Close();
}
//pdfDocument.Save(FilePath1)
//pdfDocument.Save(FilePath2)

The document has no catalog object (meaning: it's an invalid PDF)

I am reading from and writing to the same PDF, and I am getting the error "The document has no catalog object (meaning: it's an invalid PDF)" on the line "PdfReader pdfReader = new PdfReader(inputPdf2);" in the code snippet below.
iTextSharp.text.pdf.PdfCopy pdfCopy = null;
Document finalPDF = new Document();
//pdfReader = null;
FileStream fileStream = null;
int pageCount = 1;
int TotalPages = 20;
try
{
fileStream = new FileStream(finalPDFFile, FileMode.OpenOrCreate, FileAccess.Write);
pdfCopy = new PdfCopy(finalPDF, fileStream);
finalPDF.Open();
foreach (string inputPdf1 in inputPDFFiles)
{
if (File.Exists(inputPdf1))
{
var bytes = File.ReadAllBytes(inputPdf1);
PdfReader pdfReader = new PdfReader(bytes);
fileStream = new FileStream(inputPdf1, FileMode.Open, FileAccess.Write);
var stamper = new PdfStamper(pdfReader, fileStream);
var acroFields = stamper.AcroFields;
stamper.AcroFields.SetField(acrofiled.Key, "Page " + 1+ " of " + 16);
stamper.FormFlattening = true;
stamper.Close();
stamper.Dispose();
fileStream.Close();
fileStream.Dispose();
pdfReader.Close();
pdfReader.Dispose();
}
}
foreach (string inputPdf2 in inputPDFFiles)
{
if (File.Exists(inputPdf2))
{
PdfReader pdfReader = new PdfReader(inputPdf2);
int pageNumbers = pdfReader.NumberOfPages;
for (int pages = 1; pages <= pageNumbers; pages++)
{
PdfImportedPage page = pdfCopy.GetImportedPage(pdfReader, pages);
PdfCopy.PageStamp pageStamp = pdfCopy.CreatePageStamp(page);
pdfCopy.AddPage(page);
}
pdfReader.Close();
pdfReader.Dispose();
}
}
pdfCopy.Close();
pdfCopy.Dispose();
finalPDF.Close();
finalPDF.Dispose();
fileStream.Close();
fileStream.Dispose();
Please help me fix this issue or suggest an alternative approach.

In your first loop you overwrite each of your files with a manipulated version like this:
var bytes = File.ReadAllBytes(inputPdf1);
PdfReader pdfReader = new PdfReader(bytes);
fileStream = new FileStream(inputPdf1, FileMode.Open, FileAccess.Write);
var stamper = new PdfStamper(pdfReader, fileStream);
[...]
Using FileMode.Open here is an error. You want to replace the existing file with a new one, and for such a use case you have to use FileMode.Create or FileMode.Truncate.
With FileMode.Open the original file content stays in place and you write over it from the start. Thus, if your new file content is shorter than the original (which can happen when flattening a form), the new file keeps a tail segment of the original file. A PDF stores essential lookup information (the cross-reference data and the trailer) at its end, so when this new file is read, the PdfReader finds the lookup information of the old file, which no longer matches the new content at all.
By the way, you create the PdfCopy like this:
fileStream = new FileStream(finalPDFFile, FileMode.OpenOrCreate, FileAccess.Write);
pdfCopy = new PdfCopy(finalPDF, fileStream);
This is wrong for the same reason: if a PDF already exists at that path, FileMode.OpenOrCreate works just like FileMode.Open, with the unwanted effects described above.
Thus, you should use FileMode.Create for every stream you write to.
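To see the stale-tail effect in isolation, here is a minimal sketch that uses only System.IO and no iTextSharp at all. The class name FileModeDemo, the helper Overwrite, and the sample strings are made up for illustration; they stand in for "write a shorter PDF over a longer one".

```csharp
using System;
using System.IO;
using System.Text;

public static class FileModeDemo
{
    // Hypothetical helper: writes `content` to `path` using the given
    // FileMode and returns what the file contains afterwards.
    public static string Overwrite(string path, string content, FileMode mode)
    {
        using (var fs = new FileStream(path, mode, FileAccess.Write))
        {
            byte[] bytes = Encoding.ASCII.GetBytes(content);
            fs.Write(bytes, 0, bytes.Length);
        }
        return File.ReadAllText(path, Encoding.ASCII);
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            // FileMode.Open: the old bytes beyond the new content survive.
            File.WriteAllText(path, "ORIGINAL-LONG-CONTENT", Encoding.ASCII);
            Console.WriteLine(Overwrite(path, "short", FileMode.Open));
            // prints "shortNAL-LONG-CONTENT"

            // FileMode.Create: the file is truncated before writing.
            File.WriteAllText(path, "ORIGINAL-LONG-CONTENT", Encoding.ASCII);
            Console.WriteLine(Overwrite(path, "short", FileMode.Create));
            // prints "short"
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

In the PDF case the surviving tail is the old file's trailing lookup section, which is exactly what a reader consults first.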

C#, iTextSharp 5.5.10 Shown "Cannot access a closed file." on document close command

I don't know how to solve this: the document seems to be closed before the actual Close command runs, even though I call Open on it again. Please help.
This is my code. It runs when I click the button, and the error occurs at the doc.Close() line. It shows "Cannot access a closed file.", even though I call doc.Open() just above.
private void run_Click(object sender, EventArgs e)
{
    Document doc = new Document(PageSize.A4);
    using(FileStream op = new FileStream("text.pdf", FileMode.Create))
    {
        PdfWriter wri = PdfWriter.GetInstance(doc, op);
        Paragraph p = new Paragraph("test");
        doc.Open();
        doc.Add(p);
    }
    using (FileStream op = new FileStream("text.pdf", FileMode.Append, FileAccess.Write))
    {
        PdfWriter wri = PdfWriter.GetInstance(doc, op);
        Paragraph p = new Paragraph("test2");
        doc.Open();
        doc.Add(p);
        doc.Close();
    }
}

First of all, appending one PDF file to another is not a correct way to add content to an existing PDF. If you want to add content to an existing PDF, please check "ITextSharp insert text to an existing pdf".
However, if you just want the exception to go away, you need to create a new Document instance for each writer.
private void run_Click(object sender, EventArgs e)
{
    Document doc = new Document(PageSize.A4);
    using(FileStream op = new FileStream("text.pdf", FileMode.Create))
    {
        PdfWriter wri = PdfWriter.GetInstance(doc, op);
        Paragraph p = new Paragraph("test");
        doc.Open();
        doc.Add(p);
        doc.Close(); // finish the first document before its stream is disposed
    }
    using (FileStream op = new FileStream("text.pdf", FileMode.Append, FileAccess.Write))
    {
        doc = new Document(PageSize.A4); // this is the fix
        PdfWriter wri = PdfWriter.GetInstance(doc, op);
        Paragraph p = new Paragraph("test2");
        doc.Open();
        doc.Add(p);
        doc.Close();
    }
}

Convert html to pdf and merge it with existing pdfs

I have a System.Net.Mail.MailMessage whose HTML body and PDF attachments should be converted into one single PDF.
Converting the HTML body to PDF works for me with this answer.
Merging the PDF attachments into one PDF works for me with this answer.
However, after ~10 hours of trying, I cannot come up with a combined solution that does both. All I'm getting are NullReferenceExceptions somewhere in the iText source, "the document is not open", etc.
For example, this throws no error, but the resulting PDF contains only the attachments and not the HTML email body:
Document document = new Document();
StringReader sr = new StringReader(mail.Body);
HTMLWorker htmlparser = new HTMLWorker(document);
using (FileStream fs = new FileStream(targetPath, FileMode.Create))
{
    PdfCopy writer = new PdfCopy(document, fs);
    document.Open();
    htmlparser.Parse(sr);
    foreach (string fileName in pdfList)
    {
        PdfReader reader = new PdfReader(fileName);
        reader.ConsolidateNamedDestinations();
        for (int i = 1; i <= reader.NumberOfPages; i++)
        {
            PdfImportedPage page = writer.GetImportedPage(reader, i);
            writer.AddPage(page);
        }
        PRAcroForm form = reader.AcroForm;
        if (form != null)
        {
            writer.CopyAcroForm(reader);
        }
        reader.Close();
    }
    writer.Close();
    document.Close();
}
I'm using the LGPL-licensed iTextSharp 4.1.6.

From v4.1.6 fanboy to v4.1.6 fanboy :D
It looks like the HTMLWorker closes the document's stream right after parsing. As a workaround, you could create a PDF from your mail body in memory and then add it, together with the attachments, to your final PDF.
Here is some code that should do the trick:
StringReader htmlStringReader = new StringReader("<html><body>Hello World!!!!!!</body></html>");
byte[] htmlResult;
using (MemoryStream htmlStream = new MemoryStream())
{
    Document htmlDoc = new Document();
    PdfWriter htmlWriter = PdfWriter.GetInstance(htmlDoc, htmlStream);
    htmlDoc.Open();
    HTMLWorker htmlWorker = new HTMLWorker(htmlDoc);
    htmlWorker.Parse(htmlStringReader);
    htmlDoc.Close();
    htmlResult = htmlStream.ToArray();
}
byte[] pdfResult;
using (MemoryStream pdfStream = new MemoryStream())
{
    Document doc = new Document();
    PdfCopy copyWriter = new PdfCopy(doc, pdfStream);
    doc.Open();
    PdfReader htmlPdfReader = new PdfReader(htmlResult);
    AppendPdf(copyWriter, htmlPdfReader); // your foreach pdf code here
    htmlPdfReader.Close();
    PdfReader attachmentReader = new PdfReader("C:\\temp\\test.pdf");
    AppendPdf(copyWriter, attachmentReader);
    attachmentReader.Close();
    doc.Close();
    pdfResult = pdfStream.ToArray();
}
using (FileStream fs = new FileStream("C:\\temp\\test2.pdf", FileMode.Create, FileAccess.Write))
{
    fs.Write(pdfResult, 0, pdfResult.Length);
}
private void AppendPdf(PdfCopy writer, PdfReader reader)
{
    for (int i = 1; i <= reader.NumberOfPages; i++)
    {
        PdfImportedPage page = writer.GetImportedPage(reader, i);
        writer.AddPage(page);
    }
}
Of course, you could also use a FileStream directly for the final document instead of a MemoryStream.

Merging PDF with ITextSharp takes time

I am using iTextSharp to merge PDFs.
My problem is that merging huge PDFs takes a very long time (many minutes). Nearly all of that time appears to be spent in "document.close()".
Here is my code:
iTextSharp.text.Document doc = new iTextSharp.text.Document();
PdfCopy copy = new PdfCopy(doc, msOutput);
copy.SetMergeFields();
doc.Open();
byte[] byteArray = Convert.FromBase64String("someString");
PdfReader reader = new PdfReader(byteArray);
copy.AddDocument(reader);
doc.Close(); // <== It takes time here !
byte[] form = msOutput.ToArray();
Is there anything I did wrong?
How can I improve this merging time?

You are missing some Close() calls - this may help to bring your time down:
byte[] form;
using (var msOutput = new MemoryStream())
{
    iTextSharp.text.Document doc = new iTextSharp.text.Document();
    byte[] byteArray = Convert.FromBase64String("someString");
    PdfCopy copy = new PdfCopy(doc, msOutput);
    copy.SetMergeFields();
    doc.Open();
    PdfReader reader = new PdfReader(byteArray);
    copy.AddDocument(reader);
    reader.Close();
    copy.Close();
    doc.Close();
    form = msOutput.ToArray();
}
You should also be sure you are properly disposing of your stream after use.