I have an XFA PDF file (which I did not author). It's a third-party form which I'm trying to fill out. I filled out the form manually, then used iTextSharp to save the full XML DomDocument from it. Now I'm trying to apply that same XML file programmatically. However, the resulting PDF doesn't have any of the fields filled in. This is the code I'm using to apply the XML file:
PdfReader pdfReader = new PdfReader(inputPdf);
using (MemoryStream ms = new MemoryStream())
{
    using (PdfStamper stamper = new PdfStamper(pdfReader, ms, '\0', true))
    {
        XfaForm xfaForm = new XfaForm(pdfReader);
        XmlDocument doc = new XmlDocument();
        doc.Load(inputXml);
        xfaForm.DomDocument = doc;
        xfaForm.Changed = true;
        XfaForm.SetXfa(xfaForm, stamper.Reader, stamper.Writer);
    }
    var bytes = ms.ToArray();
    System.IO.File.WriteAllBytes(outputPdf, bytes);
}
inputPdf is the path to the original empty PDF file.
inputXml is the path to the XML file extracted from the filled out PDF file. This is the entire XML file, and not just the datasets section.
What's interesting is that if I create the PdfStamper object like this instead:
new PdfStamper(pdfReader, ms);
then I see the data in the fields, but of course then I have the associated issues with not appending.
Any suggestions on what I might be doing wrong? I just can't seem to get any of the changes to the DomDocument to save.
Related
I have a PDF document (using iText 7/C# 4.01) that I am creating in a MemoryStream and at the end, I want to write it out to a file. Part of the reason I am creating it in a memory stream is that I want to stamp a header table and footers on it at the end and was hoping to avoid writing it to a file then reading the file back in, stamping, then writing out a new file (as the examples I keep finding on iText website seem to do). However, I seem to be having some sort of chicken/egg scenario in the below code. It seems that you have to Close() the document in order for iText to fully form it. However, if I Close() it, then I get an ObjectDisposedException when trying to write it (simplified example below). I have to be missing something simple here, right? Thanks
MemoryStream baos = new MemoryStream();
PdfWriter writer = new PdfWriter(baos);
PdfDocument pdfDocument = new PdfDocument(writer.SetSmartMode(true));
//writer.SetCloseStream(true);
//pdfDocument.SetCloseWriter(true);
//pdfDocument.SetCloseReader(true);
//pdfDocument.SetFlushUnusedObjects(true);
Document d = new Document(pdfDocument, iText.Kernel.Geom.PageSize.LETTER);
d.Add(new Paragraph("Hello world!"));
//d.Close();
// Verbatim string: in a regular literal, "\t" would be read as a tab character
FileStream file = new FileStream(@"C:\test.pdf",
    FileMode.Create, FileAccess.Write);
baos.WriteTo(file);
file.Close();
//baos.Close();
//d.Close();
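Try this. I don't have an IDE at hand to test it, but I think it works. Close the document first so that iText finishes writing the PDF, then take the bytes from the MemoryStream: ToArray() still works after the stream has been closed, whereas WriteTo() throws an ObjectDisposedException.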
MemoryStream baos = new MemoryStream();
PdfWriter writer = new PdfWriter(baos);
PdfDocument pdfDocument = new PdfDocument(writer.SetSmartMode(true));
Document d = new Document(pdfDocument, iText.Kernel.Geom.PageSize.LETTER);
d.Add(new Paragraph("Hello world!"));
d.Close();
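byte[] byte1 = baos.ToArray();
// Write the finished PDF to disk; File(bytes, contentType, name) only exists as a helper on ASP.NET MVC controllers
System.IO.File.WriteAllBytes("C:\\iTextTester\\test.pdf", byte1);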
So I need to generate a docx file for reporting purposes. This report contains text, tables and a lot of images.
So far, I managed to add text and a table (and populate it based on the content of my xml using an xslt transform).
However, I am stuck on adding images. I found some examples of how to add images using C# but I don't think this is what I need. I need to format the document using my xslt and add the images in the right places (for instance in a table cell). Is it somehow possible to add a container using xslt which uses the filepath to display/embed the image similar to the <img> tag in html?
I know that the docx format is basically a zip containing a file structure and to embed the image I should add it to this file structure also.
Any examples or references are appreciated.
To give you an idea of my code:
XslCompiledTransform transform = new XslCompiledTransform();
transform.Load(xsltFile);
StringWriter stringWriter = new StringWriter();
XmlWriter xmlWriter = XmlWriter.Create(stringWriter);
transform.Transform(xmlFile, xmlWriter);
XmlDocument newWordContent = new XmlDocument();
newWordContent.LoadXml(stringWriter.ToString());
File.Copy(docXtemplate, outputFilename, true);
using (WordprocessingDocument myDoc = WordprocessingDocument.Open(outputFilename, true))
{
    MainDocumentPart mainPart = myDoc.MainDocumentPart;
    Body body = new Body(newWordContent.DocumentElement.InnerXml);
    DocumentFormat.OpenXml.Wordprocessing.Document document = new DocumentFormat.OpenXml.Wordprocessing.Document(body);
    document.Save(mainPart);
}
It basically replaces the body of an existing docx file. This enables me to use all the formatting, etc.
The xslt file is generated by adjusting the document.xml file from the docx.
Update
Ok, so I figured out how to add an image to the docx file directory, see below
using (WordprocessingDocument myDoc = WordprocessingDocument.Open(outputFilename, true))
{
    MainDocumentPart mainPart = myDoc.MainDocumentPart;
    ImagePart imagePart = mainPart.AddImagePart(ImagePartType.Png);
    using (FileStream stream = new FileStream(imageFile, FileMode.Open))
    {
        imagePart.FeedData(stream);
    }
    Body body = new Body(newWordContent.DocumentElement.InnerXml);
    DocumentFormat.OpenXml.Wordprocessing.Document document = new DocumentFormat.OpenXml.Wordprocessing.Document(body);
    document.Save(mainPart);
}
This will add the image to the docx structure. I also checked the relationship, and it is present in the 'document.xml.rels' file. When I take this id and use it in my xslt to add the image to the document (for testing), I do see an area where the image should be when opening the file with Word, but it shows a red cross and says the image cannot be displayed.
One difference I do notice is that the images which were in the original docx are saved in "word\media", while the image added with the code above ends up in "media". I'm not sure whether this is a problem.
OK, so I think I figured it out.
XslCompiledTransform transform = new XslCompiledTransform();
transform.Load(xsltFile);
StringWriter stringWriter = new StringWriter();
XmlWriter xmlWriter = XmlWriter.Create(stringWriter);
transform.Transform(xmlFile, xmlWriter);
XmlDocument newWordContent = new XmlDocument();
newWordContent.LoadXml(stringWriter.ToString());
using (WordprocessingDocument myDoc = WordprocessingDocument.Open(outputFilename, true))
{
    MainDocumentPart mainPart = myDoc.MainDocumentPart;
    ImagePart imagePart = mainPart.AddImagePart(ImagePartType.Png, "imgId");
    using (FileStream stream = new FileStream(imageFile, FileMode.Open))
    {
        imagePart.FeedData(stream);
    }
    Body body = new Body(newWordContent.DocumentElement.InnerXml);
    DocumentFormat.OpenXml.Wordprocessing.Document document = new DocumentFormat.OpenXml.Wordprocessing.Document(body);
    document.Save(mainPart);
}
The above code adds an image to your docx file structure with a specific id, which you can then refer to in your XSL transform. In the code example from my question I didn't set the id but used the one that was generated. However, each time you run that code the image is added to the file with a new id, so the id used in the transform no longer matches and Word reports that it is not able to display the image. Not one of my sharpest moments ;-).
For my use case I have to add multiple images to a large document so that code will be different but I think that based on the above code this can be achieved.
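Since a docx file is a zip archive, one way to check where an added image part ended up (for example "media" versus "word/media") is to list the archive entries. The following is a minimal sketch in plain Java using only java.util.zip; the class name PackageLister and the in-memory sample archive are made up for illustration and are not part of the Open XML SDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for illustration only.
class PackageLister {
    // Returns the entry names of a zip-based package such as a .docx file.
    static List<String> listEntries(InputStream in) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    // Builds a tiny in-memory archive that mimics a docx layout.
    // The entry contents are placeholders, not valid document parts.
    static byte[] samplePackage() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            String[] names = {
                "word/document.xml",
                "word/_rels/document.xml.rels",
                "word/media/image1.png",
                "media/image.png"
            };
            for (String name : names) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write(new byte[] { 0 });
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        // Prints one entry name per line.
        for (String name : listEntries(new ByteArrayInputStream(samplePackage()))) {
            System.out.println(name);
        }
    }
}
```

To inspect a real file, pass a FileInputStream for the generated docx to listEntries() instead of the sample archive, and compare the image entry with the target recorded in 'document.xml.rels'.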
I use the iTextSharp library to remove unwanted bookmarks from PDF files.
I developed the following code:
private static void RemovePDFbookmarks(string filein, string fileout)
{
    PdfReader pdfReader = new PdfReader(filein);
    Document document = new Document();
    PdfCopy copy = new PdfCopy(document, new FileStream(fileout, FileMode.Create));
    document.Open();
    copy.AddDocument(pdfReader);
    document.Close();
    pdfReader.Close();
    copy.Close();
}
This method creates a copy of the original file. During the following process, I need to delete the original file and rename the new file back to the original file's name.
How can I remove the bookmarks in the original PDF without the copy-delete-rename detour?
I am creating an application that has to generate thousands of PDF files at a time. I am using iTextSharp to do this, and it seems that the PdfReader is slowing down the process. Below is my code.
using (MemoryStream foutput = new MemoryStream())
{
    using (PdfReader pdf = new PdfReader(templateByteArray)) // slow
    {
        using (PdfStamper stamper = new PdfStamper(pdf, foutput))
        {
            AcroFields form = stamper.AcroFields;
            form.SetField(_dic[@"1,1"], "some string1");
            form.SetField(_dic[@"1,2"], "some string2");
            stamper.FormFlattening = true;
        }
        pdf.RemoveUsageRights();
    }
    EnqueueFile(foutput.ToArray());
}
I have a separate consumer thread that takes every byte array and writes the PDF documents to the HDD from a queue. After I messed around with the code, it seems that the bottleneck is in the PdfReader class. Is there an alternate way to doing what I am trying to do or do you have any suggestions?
You are reading the properties of the fields _dic[@"1,1"] and _dic[@"1,2"] over and over again. You should cache those properties. In Java, that's done like this:
HashMap<String,TextField> fieldCache = new HashMap<String,TextField>();
This cache stores information about each TextField that is encountered. You introduce it with the setFieldCache() method:
public void manipulatePdf(String src, String dest,
        HashMap<String,TextField> cache, String name, String login)
        throws IOException, DocumentException {
    PdfReader reader = new PdfReader(src);
    PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
    AcroFields form = stamper.getAcroFields();
    form.setFieldCache(cache);
    form.setField("test", "test");
    stamper.close();
    reader.close();
}
The first time the field with name "test" is encountered, the information will be read from the existing file. The next time it is encountered, the TextField info will be retrieved from the cache instead of from the existing file.
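The caching behaviour described above can be illustrated without iText. The sketch below is a hypothetical stand-in, not the library's implementation: FieldInfoCache, its loader function and the load counter are invented names, and a String takes the place of the TextField info. It only shows the pattern of reading once and serving later requests from a HashMap.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for the field cache idea; not part of iText.
class FieldInfoCache {
    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> loader; // the expensive "read from the file" step
    private int loads = 0;

    FieldInfoCache(Function<String, String> loader) {
        this.loader = loader;
    }

    // Returns the cached info for a field, loading it only on first use.
    String get(String fieldName) {
        return cache.computeIfAbsent(fieldName, name -> {
            loads++;
            return loader.apply(name);
        });
    }

    // Number of times the expensive loader actually ran.
    int loadCount() {
        return loads;
    }

    public static void main(String[] args) {
        FieldInfoCache fieldCache = new FieldInfoCache(name -> "info for " + name);
        for (int i = 0; i < 1000; i++) {
            fieldCache.get("test");
        }
        System.out.println(fieldCache.loadCount()); // prints 1
    }
}
```

The same idea applies when thousands of documents are filled from one template: the cache object is created once and handed to every call, so the per-field lookup cost is paid only for the first document.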
I want to take two PDF files and merge them together.
Each file is one page long. The reason to merge them is that one file is simply a footer, which needs to be attached to the existing file.
I'm using a stamper to try to merge the two files.
I successfully create the output file, but it doesn't have the footer. It's just a copy of the original input file. Any idea why they aren't merging?
using (Stream inputPdfStream = new FileStream(inputFile, FileMode.Open, FileAccess.Read, FileShare.Read))
using (Stream inputPdfFooterStream = new FileStream(footerPdf, FileMode.Open, FileAccess.Read, FileShare.Read))
using (Stream outputPdfStream = new FileStream(outputFile, FileMode.Create, FileAccess.Write, FileShare.None))
{
    var reader = new PdfReader(inputPdfStream);
    var stamper = new PdfStamper(reader, outputPdfStream);
    var pdfContentByte = stamper.GetOverContent(1);
    stamper.FormFlattening = true;
    stamper.Close();
}
There are several problems with your question.
Problem #1: why did you add the line stamper.FormFlattening = true;? Are you working with a form? I don't see you do anything with forms, so why would you flatten the document?
Problem #2: You say you want to merge two documents with PdfStamper. That is misleading. Merging documents is done with PdfCopy. From your explanation, I gather that you want to superimpose two documents. You are right that you need PdfStamper to do so.
Problem #3: You want to use a specific document containing a footer as company stationery. In that case, you want to add the content of the stationery under the actual content. Then why are you using stamper.GetOverContent(1);? Use stamper.GetUnderContent(1); instead.
Problem #4: You are creating an inputPdfFooterStream to read the document with the footer, but I don't see you using that stream anywhere. What do you expect?
Problem #5: You didn't read the documentation. This is your main problem. Download chapter 6 of my book (it's available for free, and I've been referring to it in dozens of answers on StackOverflow). Go to page 176 where it says "Adding company stationery to an existing document". That example meets your requirement completely!
// Create readers
PdfReader reader = new PdfReader(src);
PdfReader s_reader = new PdfReader(stationery);
using (MemoryStream ms = new MemoryStream()) {
    // Create the stamper
    using (PdfStamper stamper = new PdfStamper(reader, ms)) {
        // Add the stationery to each page
        PdfImportedPage page = stamper.GetImportedPage(s_reader, 1);
        int n = reader.NumberOfPages;
        PdfContentByte background;
        for (int i = 1; i <= n; i++) {
            background = stamper.GetUnderContent(i);
            background.AddTemplate(page, 0, 0);
        }
    }
    return ms.ToArray();
}
In your code, you only have one reader. In my code, I also have an object called s_reader that takes the footerPdf document and allows you to create a PdfImportedPage:
PdfImportedPage page = stamper.GetImportedPage(s_reader, 1);
This page is then added under the existing content of the actual document:
background = stamper.GetUnderContent(i);
background.AddTemplate(page, 0, 0);
Note that this example assumes that both documents have the same page size and that the origin of the coordinate system of the document with the actual content coincides with the lower-left corner. If that isn't the case with your PDFs, you can have a situation where the footer isn't visible or is only partly visible. Also: if the document with the actual content is opaque, it will also make the footer invisible.
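The two geometric assumptions can be checked before stamping. This is only a sketch in plain Java: the Box class and the method names are hypothetical, and in practice the numbers would come from the page size that the reader reports for each document. It does not cover the opacity caveat.

```java
// Hypothetical helper for illustration; not part of iText.
class PageBoxCheck {
    static final double EPS = 0.01; // tolerance in points

    // A page box: lower-left corner plus width and height, in points.
    static final class Box {
        final double left, bottom, width, height;

        Box(double left, double bottom, double width, double height) {
            this.left = left;
            this.bottom = bottom;
            this.width = width;
            this.height = height;
        }
    }

    // True when the box's lower-left corner sits at the coordinate origin.
    static boolean originAtLowerLeft(Box b) {
        return Math.abs(b.left) < EPS && Math.abs(b.bottom) < EPS;
    }

    // True when both boxes have the same width and height.
    static boolean sameSize(Box a, Box b) {
        return Math.abs(a.width - b.width) < EPS && Math.abs(a.height - b.height) < EPS;
    }

    // True when adding the stationery at (0, 0) matches the assumptions stated above.
    static boolean safeToStampAtOrigin(Box content, Box stationery) {
        return sameSize(content, stationery) && originAtLowerLeft(content);
    }

    public static void main(String[] args) {
        Box letter = new Box(0, 0, 612, 792); // US Letter in points
        Box a4 = new Box(0, 0, 595, 842);     // A4 in points
        System.out.println(safeToStampAtOrigin(letter, letter)); // prints true
        System.out.println(safeToStampAtOrigin(letter, a4));     // prints false
    }
}
```

When the check fails, the footer may end up partly or fully outside the visible page, so the position passed when adding the template would need to be adjusted for those documents.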