AcroForm PDF to normal PDF in c#

I have an AcroForm PDF (a PDF which can be edited), but the API I'm using to sign the PDF requires a normal PDF and never an AcroForm one.
Is there any way to transform an AcroForm PDF into a normal one?
I tried making it read-only, but even though it can no longer be edited, it is still an AcroForm PDF.

In answer to my comment, I assume you are using iTextSharp, even though you do not specify. Using iTextSharp, I believe you need to Flatten the form when you are done. Here is a simple example:
public void GeneratePDF(string filePath, List<PDFField> modifiedFields)
{
var pdfReader = new PdfReader(filePath);
var folderStructure = filePath.Split('\\');
if (folderStructure.Length == 0) return;
var currentFileName = folderStructure.Last();
var newFilePath = string.Format("{0}{1}", Constants.SaveFormsPath,
currentFileName.Replace(".pdf", DateTime.Now.ToString("MMddyyhhmmss") + ".pdf"));
var pdfStamper = new PdfStamper(pdfReader, new FileStream(newFilePath, FileMode.Create));
foreach (var field in modifiedFields.Where(f=>f.Value != null))
{
pdfStamper.AcroFields.SetField(field.Name, field.Value);
}
pdfStamper.FormFlattening = true;
pdfStamper.Close();
}
Ignoring the parts about the filename, it boils down to passing in a key/value list of the field values to set (this could be where you do your signature piece) and then setting the FormFlattening property on the stamper to true.
Here is another SO post where they used a similar technique for a slightly different issue; it may be of help: How to flatten already filled out PDF form using iTextSharp
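If the form is already filled in and only needs to stop being an AcroForm, the field-setting loop can be left out. The following is a minimal sketch, assuming iTextSharp 5.x. The class and method names (PdfFlattener, Flatten) are hypothetical, and working on byte arrays rather than file paths is only a choice made for the example:

```csharp
using System.IO;
using iTextSharp.text.pdf;

public static class PdfFlattener
{
    // Hypothetical helper: returns a flattened copy of the given PDF bytes.
    public static byte[] Flatten(byte[] sourcePdf)
    {
        var reader = new PdfReader(sourcePdf);
        using (var output = new MemoryStream())
        {
            var stamper = new PdfStamper(reader, output);
            // Ask the stamper to flatten the form fields into page content.
            stamper.FormFlattening = true;
            // The flattened document is written when the stamper is closed.
            stamper.Close();
            reader.Close();
            return output.ToArray();
        }
    }
}
```

It could be called as File.WriteAllBytes("flat.pdf", PdfFlattener.Flatten(File.ReadAllBytes("form.pdf"))), where both file names are placeholders.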

Related

Change text of label in PDF

I have problems editing fields inside a PDF document.
I created a simple invoice with OpenOffice and added some fields via the form creation tool. After that, I exported it as a PDF with forms.
One of the fields I want to change is named "{Firma}", and I want to fill this field with a string.
Below is a short code example which doesn't seem to work: the field "{Firma}" in the output file is still empty.
public static void ReplacePdfForm()
{
string fileNameExisting = @".\template\templaterechnung.pdf";
string fileNameNew = @".\rechnung.pdf";
using (var existingFileStream = new FileStream(fileNameExisting, FileMode.Open))
using (var newFileStream = new FileStream(fileNameNew, FileMode.Create))
{
// Open existing PDF
var pdfTemplate = new PdfReader(existingFileStream);
// PdfStamper, (PDF to be changed)
var pdfInvoice = new PdfStamper(pdfTemplate, newFileStream);
AcroFields fields = pdfInvoice.AcroFields;
// set form fields
fields.SetField("{Firma}", "Test1");
pdfInvoice.FormFlattening = true;
pdfInvoice.FreeTextFlattening = true;
pdfInvoice.Close();
pdfTemplate.Close();
}
}
(I have some more fields which also don't change but I deleted them from code because they behave the same way.)
Thanks in advance.
EDIT:
Here's my PDF: http://www.file-upload.net/download-11071404/templaterechnung.pdf.html
EDIT2:
This is how I set the property in OpenOffice:

When creating a form with Open Office, Open Office adds a parameter to the PDF instructing software processing the PDF not to create any appearance streams, but to leave it up to the PDF viewer to create those appearances.
This works as long as the form remains interactive, but as soon as you flatten the form, no appearances are created at all.
You can work around this problem by adding the following line:
fields.GenerateAppearances = true;
This way, you force iTextSharp to generate the appearances.
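To show where that line fits, here is a minimal sketch of a fill-and-flatten routine, assuming iTextSharp 5.x. The names FormFiller and FillAndFlatten are hypothetical, and setting GenerateAppearances before the SetField calls is an assumption about the safest ordering rather than something stated above:

```csharp
using System.Collections.Generic;
using System.IO;
using iTextSharp.text.pdf;

public static class FormFiller
{
    // Hypothetical helper: fills the given fields, then flattens the form.
    public static byte[] FillAndFlatten(byte[] templatePdf, IDictionary<string, string> values)
    {
        var reader = new PdfReader(templatePdf);
        using (var output = new MemoryStream())
        {
            var stamper = new PdfStamper(reader, output);
            AcroFields fields = stamper.AcroFields;
            // Force iTextSharp to create appearances itself,
            // set here before any field value is assigned (assumed ordering).
            fields.GenerateAppearances = true;
            foreach (KeyValuePair<string, string> entry in values)
            {
                // SetField returns false when the field name is unknown; ignored here.
                fields.SetField(entry.Key, entry.Value ?? string.Empty);
            }
            stamper.FormFlattening = true;
            stamper.Close();
            reader.Close();
            return output.ToArray();
        }
    }
}
```

For the invoice in the question, the dictionary would contain an entry such as "{Firma}" mapped to "Test1".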

How can I remove image properties such as local path that Adobe Illustrator has been embedded to PDF file?

I'm trying to replace an image in a PDF file using iTextSharp (not the Java version). It works, except for one problem: when I open the PDF in Adobe Illustrator, it always opens with the old hard link, meaning Adobe Illustrator still shows the old image from before the replacement. Oddly, the file displays fine in Adobe Reader (the replaced image is visible).
This is the code snippet that I've tried:
public static void ReplaceImage(string pdfIn, string imagePath, string pdfOut)
{
PdfReader reader = new PdfReader(pdfIn);
PdfStamper stamper = new PdfStamper(reader, new FileStream(pdfOut, FileMode.Create));
PdfWriter writer = stamper.Writer;
Image img = Image.GetInstance(SysDrawing.Image.FromFile(imagePath), ImageFormat.Tiff);
PdfDictionary page = reader.GetPageN(1);
PdfDictionary resources = page.GetAsDict(PdfName.RESOURCES);
PdfDictionary xobject = resources.GetAsDict(PdfName.XOBJECT);
PdfDictionary properties = resources.GetAsDict(PdfName.PROPERTIES);
PdfDictionary procset = resources.GetAsDict(PdfName.PROCSET);
if (xobject != null)
{
List<PdfName> imgs = new List<PdfName>();
foreach (var ele in xobject.Keys)
{
PdfIndirectReference iref = xobject.GetAsIndirectObject(ele);
imgs.Add(ele);
if (iref.IsIndirect())
{
try
{
PdfDictionary pg = (PdfDictionary)PdfReader.GetPdfObject(iref);
if (pg != null)
{
PdfReader.KillIndirect(iref);
if (PdfName.IMAGE.Equals(pg.GetAsName(PdfName.SUBTYPE)))
{
if (img.ImageMask != null)
writer.AddDirectImageSimple(img.ImageMask);
writer.AddDirectImageSimple(img, iref);
}
}
else
{
PdfReader.KillIndirect(iref);
writer.AddDirectImageSimple(img, iref);
}
}
catch {
continue;
}
}
}
}
//stamper.SetFullCompression();
stamper.Close();
stamper.Dispose();
reader.RemoveUnusedObjects();
reader.RemoveAnnotations();
reader.RemoveFields();
reader.Close();
reader.Dispose();
}
Any answer would be appreciated!

Your PDF contains two different documents: one described using PDF syntax and one described using Adobe Illustrator syntax. These two different documents should look identical, but as you have changed the PDF version of the document, they no longer do.
You perceive the document as only one document because the AI document is stored inside the PDF document. In another question on SO, mkl explains the mechanism: Insert hidden digest in pdf using iText library
In his answer, mkl explains how to add hidden data (in that case a hidden digest, in your case the document in AI format) to a PDF.
You can remove this second document like this:
PdfDictionary catalog = reader.Catalog;
catalog.Remove(PdfName.PIECEINFO);
Of course, this throws away the Adobe Illustrator data entirely, so you won't be able to edit the PDF in Adobe Illustrator anymore. If you want the image to change in the AI syntax, you need a library that is able to change AI syntax (and I don't know of any such library).

Create PDF by copying it from template with PdfCopy (loss of data)

I'm trying to create a new PDF file based on another one using PdfCopy.
Everything works fine during generation and the generated file opens without any problem on my desktop, but the file seems to be corrupted and isn't accepted by the service that I must use:
SignService error when calling 'sign', probably caused by a bad file format.
I noticed that the generated PDF is always lighter than the original template, so I compared the template with the generated file. There are some big parts of missing data, especially a whole bunch of xml. I guess PdfCopy does not copy everything from my original PDF, but I cannot figure out what I am missing.
Here is my method:
byte[] completedDocument = null;
string originalUri = Path.Combine(this.PdfPath, pdfName);
string generatedUri = Path.Combine(this.PdfGeneratedPath, generatedPdfName);
using(MemoryStream streamCompleted = new MemoryStream())
{
using(Document doc = new Document())
{
PdfCopy copy = new PdfCopy(doc, streamCompleted);
copy.PdfVersion = PdfWriter.VERSION_1_6;
doc.Open();
copy.Open();
byte[] mergedDocument = null;
PdfReader pdfReader = new PdfReader(originalUri);
int pdfPageNumber = pdfReader.NumberOfPages;
using(MemoryStream streamTemplate = new MemoryStream())
{
using (PdfStamper pdfStamper = new PdfStamper(pdfReader, streamTemplate))
{
AcroFields acrofields = pdfStamper.AcroFields;
foreach (KeyValuePair<string, AcroFields.Item> field in acrofields.Fields)
{
string data;
if (pdfFieldsValues.TryGetValue(field.Key, out data))
{
if (data == null)
{
data = string.Empty;
}
acrofields.SetField(field.Key, data);
}
}
pdfStamper.FormFlattening = true;
pdfStamper.Writer.CloseStream = false;
}
mergedDocument = streamTemplate.ToArray();
}
pdfReader = new PdfReader(mergedDocument);
for (int page = 1; page <= pdfPageNumber; page++)
{
if (!excludedPages.Any(s => s == page))
{
copy.AddPage(copy.GetImportedPage(pdfReader, page));
}
}
doc.Close();
copy.Close();
}
completedDocument = streamCompleted.ToArray();
}
File.WriteAllBytes(generatedUri, completedDocument);
I tried uploading "mergedDocument" rather than "completedDocument" and my service accepts it, so I'm pretty sure the problem has something to do with this part:
for (int page = 1; page <= pdfPageNumber; page++)
{
if (!excludedPages.Any(s => s == page))
{
copy.AddPage(copy.GetImportedPage(pdfReader, page));
}
}
Or with the PdfCopy initialization.

You start with a form. You fill out the form and you flatten it. By flattening it, you deliberately throw away all interactivity. I'm surprised that you're surprised that the file is getting smaller: you're throwing away the form infrastructure!
You then upload the flattened file to some service unknown to us. This service complains:
SignService error when calling 'sign', probably caused by a bad file format.
As we don't know which service you are talking about, we can only guess. An educated guess would be that the original form contains a signature field that needs to be signed by a signing service.
Obviously that field is gone: you flattened the form! I may be wrong, but I assume that the service also tries to read the fields you filled out, but that won't be possible either as you throw away all interactivity. Please remove the following line:
pdfStamper.FormFlattening = true;
Then there's Chris' comment: it seems that you're using PdfCopy. If you're using an old version of iTextSharp (before iText 5.5.1), you shouldn't expect the form to be preserved. If you are using a recent version, you should instruct PdfCopy to preserve the form (but that line is missing). You don't need to ask 'how do I preserve the form?' because you shouldn't be using PdfCopy anyway.
You only need PdfStamper. You already use PdfStamper to fill out the fields; you can also call the SelectPages() method on the PdfReader to select the pages you want to keep (or to exclude the ones you want to remove).
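A minimal sketch of that PdfStamper-only approach, assuming iTextSharp 5.x: the names TemplateFiller and FillAndSelect are hypothetical, and the parameters stand in for the pdfFieldsValues and excludedPages variables from the question. The page selection is applied to the reader before the stamper is created, and the form is not flattened:

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using iTextSharp.text.pdf;

public static class TemplateFiller
{
    // Hypothetical helper: fills the form and drops the excluded pages, without PdfCopy.
    public static byte[] FillAndSelect(byte[] templatePdf,
        IDictionary<string, string> values, ICollection<int> excludedPages)
    {
        var reader = new PdfReader(templatePdf);
        // Keep every page number (1-based) that is not in the exclusion list.
        List<int> pagesToKeep = Enumerable.Range(1, reader.NumberOfPages)
            .Where(page => !excludedPages.Contains(page))
            .ToList();
        reader.SelectPages(pagesToKeep);
        using (var output = new MemoryStream())
        {
            var stamper = new PdfStamper(reader, output);
            AcroFields fields = stamper.AcroFields;
            foreach (KeyValuePair<string, string> entry in values)
            {
                fields.SetField(entry.Key, entry.Value ?? string.Empty);
            }
            // FormFlattening is deliberately left off so the fields survive.
            stamper.Close();
            reader.Close();
            return output.ToArray();
        }
    }
}
```

Whether the signing service accepts the result still depends on what that service expects from the form, which the thread does not specify.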
Finally, it is unclear what you mean when you write:
There are some big parts of missing data, especially a whole bunch of xml.
Are you saying that the form isn't a pure AcroForm, but that it also contains an XFA stream? If so, then you most definitely can't use PdfCopy.

ITextSharp PDFTemplate FormFlattening removes filled data

I am porting an existing app from Java to C#. The original app used the iText library to fill PDF form templates and save them as new PDFs. My C# code (example) is below:
string templateFilename = @"C:\Templates\test.pdf";
string outputFilename = @"C:\Output\demo.pdf";
using (var existingFileStream = new FileStream(templateFilename, FileMode.Open))
{
using (var newFileStream = new FileStream(outputFilename, FileMode.Create))
{
var pdfReader = new PdfReader(existingFileStream);
var stamper = new PdfStamper(pdfReader, newFileStream);
var form = stamper.AcroFields;
var fieldKeys = form.Fields.Keys;
foreach (string fieldKey in fieldKeys)
{
form.SetField(fieldKey, "REPLACED!");
}
stamper.FormFlattening = true;
stamper.Close();
pdfReader.Close();
}
}
All works well only if I omit the
stamper.FormFlattening = true;
line, but then the forms are visible as... forms.
When I add this line, any values set on the form fields are lost, resulting in a blank form. I would really appreciate any advice.

Most likely you can resolve this when using iTextSharp 5.4.4 (or later) by forcing iTextSharp to generate appearances for the form fields. In your example code:
var form = stamper.AcroFields;
form.GenerateAppearances = true;

Resolved the issue by using a previous version of ITextSharp (5.4.3). Not sure what the cause is though...

I found a working solution for this for any of the newer iTextSharp versions.
The way we did it was:
1- Create a copy of the PDF template.
2- Populate the copy with data.
3- Set FormFlatten = true and setFullCompression.
4- Combine some of the PDFs into a new document.
5- Move the new combined document and then remove the temp.
Done this way, we got the issue with the removed input, and if we skipped the "FormFlatten" it looked OK.
However, when we moved the "FormFlatten = true" out of step 3 and added it as a separate step after the moving etc. was complete, it worked perfectly.
Hope I explained it somewhat OK :)

In your PDF file, change the field's property to Visible; the default value is "Visible but not printable".

How to get a list of the fields in an XFA form?

I am trying to get a simple list of all the fields in my XFA form. I am using this code:
private void ListFieldNames()
{
string pdfTemplate = @"C:\Projects\iTextSharp\SReport.pdf";
MemoryStream m = new MemoryStream();
// title the form
this.Text += " - " + pdfTemplate;
// create a new PDF reader based on the PDF template document
PdfReader pdfReader = new PdfReader(pdfTemplate);
PdfStamper pdfStamper = new PdfStamper(pdfReader, m);
AcroFields formFields = pdfStamper.AcroFields;
AcroFields form = pdfReader.AcroFields;
XfaForm xfa = form.Xfa;
StringBuilder sb = new StringBuilder();
sb.Append(xfa.XfaPresent ? "XFA form" : "AcroForm");
sb.Append(Environment.NewLine);
foreach (string key in form.Fields.Keys)
{
sb.Append(key);
sb.Append(Environment.NewLine);
txtFields.Text = sb.ToString();
}
txtFields.Text = sb.ToString();
}
But all I am getting is the XFA Form and not any fields. Any idea what I am doing wrong?
Thanks in advance

You've taken a code sample from chapter 8 of my book "iText in Action." The result of that code sample is consistent with what I wrote on page 273:
Running Listing 8.18 with this form as resource will give you the following result:
AcroForm
If your question is Any idea what I am doing wrong? then the answer is simple: you stopped reading on page 270, or you used a code sample without reading the accompanying documentation. How to fix this? Read the documentation!
If your question is Why don't I get any info about the fields? (which isn't your question, but let's assume it is), the answer is: you're using code to retrieve AcroForm fields, but your form doesn't contain any such fields. Your form is a pure XFA form, which means that all field information is stored as XML and XML only!
Suppose that you now want to know: How can I extract that XML? then you should go to the place where you found the example you copy/pasted.
That could be here:
http://itextpdf.com/examples/iia.php?id=164
Or maybe here: http://sourceforge.net/p/itextsharp/code/HEAD/tree/trunk/book/iTextExamplesWeb/iTextExamplesWeb/iTextInAction2Ed/Chapter08/XfaMovie.cs
Or even here: http://kuujinbo.info/iTextInAction2Ed/index.aspx?ch=Chapter08&ex=XfaMovie
This code snippet will return the complete XFA stream:
public string ReadXfa(PdfReader reader) {
XfaForm xfa = new XfaForm(reader);
XmlDocument doc = xfa.DomDocument;
reader.Close();
if (!string.IsNullOrEmpty(doc.DocumentElement.NamespaceURI)) {
doc.DocumentElement.SetAttribute("xmlns", "");
XmlDocument new_doc = new XmlDocument();
new_doc.LoadXml(doc.OuterXml);
doc = new_doc;
}
var sb = new StringBuilder(4000);
var Xsettings = new XmlWriterSettings() {Indent = true};
using (var writer = XmlWriter.Create(sb, Xsettings)) {
doc.WriteTo(writer);
}
return sb.ToString();
}
Now look for the <xfa:datasets> tag; it will have a subtag <xfa:data> (probably empty if the form is empty) and a subtag <dd:dataDescription>. Inside the dataDescription tag, you'll find something that resembles XSD. That's what you need to know what the fields in the form are about.
I could go on guessing questions, such as: How do I fill out such a form? By using the method fillXfaForm(); How can I flatten such a form? By using XFA Worker (which is a closed source library written on top of iTextSharp), but let's keep those questions for another thread ;-)
