Different result in different pdf viewer pdfsharp

My team and I recently switched from iTextSharp to PdfSharp, because iTextSharp had become really slow and we couldn't fix the problem.
Now we have a different problem: the PDF filled by PdfSharp is 200 KB bigger than the one from iTextSharp. The size itself isn't the issue. When we open the PDF in Firefox, the data is displayed fine in the viewer, but when we print it, all the multiline fields suddenly become single-line fields, with a different font too.
We have /NeedAppearances set on our AcroForm elements. We tried removing it to see what the document would look like in Adobe and other viewers, and it looked the same as it did in the Firefox print preview.
The document produced by iTextSharp doesn't have /NeedAppearances, and it displays fine in every viewer.
This is the code we use to set the text:
public static bool SetField(this PdfAcroForm form, string fieldName, string value)
{
    // "as" yields null for a missing or non-text field instead of throwing on the cast.
    PdfTextField field = form.Fields[fieldName] as PdfTextField;
    if (field != null)
    {
        field.Text = value;
    }
    return field != null;
}
After all the fields have been set, we call document.flatten() to make the fields read-only.
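For reference, here is a minimal sketch of how the flag can be put on the AcroForm dictionary with PdfSharp. The class and method names (AcroFormFlags, RequestAppearanceRebuild) are made up for illustration; only the Elements dictionary calls come from the library, and this is not necessarily how our own code sets it.

```csharp
using PdfSharp.Pdf;
using PdfSharp.Pdf.AcroForms;

public static class AcroFormFlags
{
    // Hypothetical helper: asks viewers to rebuild field appearances on open.
    public static void RequestAppearanceRebuild(this PdfAcroForm form)
    {
        // The PDF key is spelled /NeedAppearances (plural).
        if (form.Elements.ContainsKey("/NeedAppearances"))
        {
            form.Elements["/NeedAppearances"] = new PdfBoolean(true);
        }
        else
        {
            form.Elements.Add("/NeedAppearances", new PdfBoolean(true));
        }
    }
}
```

A viewer that honors the flag regenerates the appearances itself, which would be consistent with Adobe asking to save an unchanged document.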
A little side note
Once we have opened the PDF in Adobe and want to close it, it asks us to save it, even though we haven't changed the document. Once we have saved it, the file is 200 KB smaller and suddenly works in all viewers. This is with /NeedAppearances on.
Update 1
I've spent the whole night looking for a solution, but couldn't find one.
But this is what I have found so far:
After the Text property has been set on a PdfTextField, an /AP element appears in its Elements. It contains a reference to an object that holds what should be drawn.
I think Adobe understands the /NeedAppearances element on the AcroForm and therefore regenerates a correct /AP element for every field. The file seems to be smaller afterwards because Adobe re-encodes the streams on the elements in a way that takes up less space.
So as it stands, I think I have to create a new Flatten method that creates the /AP elements correctly. I don't know why the current Flatten method doesn't do that; it only changes the fields to read-only.

What I ended up doing was creating my own flatten method.
Summary of what the Flatten method does:
I made it an extension method on PdfAcroForm.
I loop through all the fields except PdfCheckBoxField, because those are displayed just fine.
For each field, I find the page the field is on and create an XGraphics from that page.
Then I get the position and size of the field from its /Rect element.
Then I pass my XGraphics to an XTextFormatter and set the appearance on the XTextFormatter from the elements of my field.
Then I call XTextFormatter.DrawString() and afterwards dispose of my XGraphics.
Finally, to remove the field, I delete all the elements on that field.
If any of this is unclear, feel free to comment and I'll try my best to help you.
DISCLAIMER:
The flatten method I created will delete your fields, and you CANNOT undo it. It writes the text onto the PDF page itself, at the position of each field.
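The steps above can be sketched roughly as follows. This is a simplified illustration, not the exact method: it assumes every text field sits on the single page passed in, takes the font as a parameter instead of reading it from the field, and draws in black. The names AcroFormFlattenSketch and FlattenOnto are made up for this example.

```csharp
using PdfSharp.Drawing;
using PdfSharp.Drawing.Layout;
using PdfSharp.Pdf;
using PdfSharp.Pdf.AcroForms;

public static class AcroFormFlattenSketch
{
    // Hypothetical helper: draws each text field's value onto `page`
    // and then empties the field dictionary. This cannot be undone.
    public static void FlattenOnto(this PdfAcroForm form, PdfPage page, XFont font)
    {
        foreach (string name in form.Fields.Names)
        {
            // Only text fields are handled; check boxes and others are skipped.
            PdfTextField field = form.Fields[name] as PdfTextField;
            if (field == null)
            {
                continue;
            }

            // /Rect uses PDF coordinates (origin bottom-left), while
            // XGraphics uses a top-left origin, so the y axis is flipped.
            PdfRectangle rect = field.Elements.GetRectangle("/Rect");
            XRect box = new XRect(rect.X1, page.Height.Point - rect.Y2, rect.Width, rect.Height);

            using (XGraphics gfx = XGraphics.FromPdfPage(page))
            {
                // XTextFormatter wraps the text inside the layout rectangle.
                XTextFormatter formatter = new XTextFormatter(gfx);
                formatter.DrawString(field.Text ?? string.Empty, font, XBrushes.Black, box);
            }

            // Remove the field's content so viewers no longer treat it as a form field.
            field.Elements.Clear();
        }
    }
}
```

A real version would look up the page for each field and read the font and colour from the field's own elements, as described in the summary.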

Related

Unwanted characters in Acrobat PDF conversion of auto-detected Word fillable fields. Deleting fillable field characters using iTextSharp5

Acrobat DC, Office 365, iTextSharp5, Win10 Pro 64-bit
I have a Word document containing several pages of text and one empty TextBox in between two of the lines. I am attempting to use the Acrobat "Prepare Form" feature to convert that document to PDF with the TextBox as a fillable field, and Acrobat has no problem auto detecting the TextBox and making it fillable. The problem, however, is that the converted TextBox contains text from either the line of text above it or below it.
I've read that this is caused by placing the TextBox too close to those lines in the Word document and sure enough, by leaving three or four empty lines of space above and below the text box the issue goes away. However, that's an unacceptable amount of wasted space. I tried putting continuous section breaks above and below the TextBox in Word as well as typing spaces in the TextBox but that doesn't help. I also tried it with a 1x1 table instead of a TextBox but the same problem occurred.
I then tried deleting the unwanted text from the PDF TextBox field and saving it that way, which appeared to be a reasonable solution. However, when I used an iTextSharp5 program to detect the PDF's fillable fields it could no longer detect the empty field. I wouldn't mind leaving the original unwanted text in the PDF TextBox field if there were some way to remove it with iTextSharp, but it doesn't seem to have that ability.
Because I have many Word documents to convert to fillable PDFs and might need to update them occasionally, it simply isn't practical for me to manually add the fillable fields to the converted PDFs each time an update is needed. Any suggestions are welcome :-)

Filling pdf field that takes value from other field

I know there are many similar questions on here. I've tried several of them but have not managed to solve the problem (and many of them are still unanswered after several years).
My problem is that I cannot set the value of the two fields at the top of the second page, named "form1[0].sida2[0].flt_datSidhuvud[0]" and "form1[0].sida2[0].flt_txtPersonNrBrukare[0]". Each of them has a counterpart on the first page with the same name but a different prefix. From the research I've done, this might be causing the problem, but the suggested solutions have not worked for me.
If I fill in the form manually, for example with Acrobat Reader, the values entered in the fields on page one automatically appear in the fields on page two, and the other way around.
Here is an example of the code I use to try to fill in the two fields:
MemoryStream output = new MemoryStream();
FileStream fs = File.Open(pdfTemplatePath, FileMode.Open, FileAccess.Read);
PdfReader reader = new PdfReader(fs);
fs.Close();
PdfStamper stamper = new PdfStamper(reader, output);
var formFields = stamper.AcroFields;
formFields.SetField("form1[0].sida2[0].flt_datSidhuvud[0]", "2016-12");
formFields.SetField("form1[0].sida2[0].flt_txtPersonNrBrukare[0]", data.SocialSecurityNumber); // data.SocialSecurityNumber is a string
stamper.FormFlattening = true;
stamper.Close();
reader.Close();
The result is that it fills in the value of the fields on the first page only.
Link to PDF
From my research, this question here on SO was the most promising, but the suggested solution (removing XFA) doesn't seem to work in my case.

In general
If (AcroForm) form fields have different full names, they are separate fields.
A behavior as you describe (filling one field in Adobe Reader upon losing focus automatically fills another one, too) can be achieved using JavaScript actions. But filling in these fields using iText does not trigger any JavaScript events. Thus, in general you have to fill in both (AcroForm) fields.
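As an illustration of filling every matching AcroForm field explicitly, here is a minimal sketch for iTextSharp 5.x. The helper name SetFieldsEndingWith is made up for this example, and matching on the end of the full field name is just one possible assumption about how the related fields can be identified.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using iTextSharp.text.pdf;

public static class AcroFieldsSketch
{
    // Hypothetical helper: sets every field whose full name ends with `suffix`
    // and returns how many fields were actually set.
    public static int SetFieldsEndingWith(this AcroFields fields, string suffix, string value)
    {
        // Copy the matching names first so the field map is not
        // enumerated while values are being written.
        List<string> names = fields.Fields.Keys
            .Where(n => n.EndsWith(suffix, StringComparison.Ordinal))
            .ToList();

        int count = 0;
        foreach (string name in names)
        {
            if (fields.SetField(name, value))
            {
                count++;
            }
        }
        return count;
    }
}
```

With the stamper from the question, a call such as formFields.SetFieldsEndingWith("flt_datSidhuvud[0]", "2016-12") would then set the field on each page where the full name ends that way.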
Your case is special
Your case is slightly different, though:
Your PDF contains a hybrid form, present both as an AcroForm form and as an XFA form, and at some point during the 5.x versions iText was fitted with a certain amount of XFA support. In particular, in the case of hybrid forms,
whenever the value of a field is retrieved, it first is looked up in the XFA form data elements; and
whenever a field value is set, it is set both in a single matching AcroForm field and in the XFA form data elements.
In the AcroForm representation of your form, the flt_datSidhuvud and flt_txtPersonNrBrukare fields on page 2 (which you do not explicitly fill) are empty and don't even have an appearance stream.
In the XFA representation of your form both flt_datSidhuvud form fields are backed by a single data element, and so are both flt_txtPersonNrBrukare form fields.
Furthermore, your case is special because you flatten the form. If you did not flatten it, only the AcroForm fields on the first page and the XFA data elements would have their values set, not the AcroForm fields on the second page.
Form flattening has also been substantially improved during the 5.x versions.
Why does it work
While flattening your hybrid form, the fields on page two get their values:
While flattening the still empty flt_datSidhuvud and flt_txtPersonNrBrukare AcroForm fields on page two, iText determines that they do not have any appearance streams yet to flatten into the page content and, therefore, tries to create appearance streams for them.
To create these appearances, iText first retrieves the value of each field. As mentioned above, this means that the value in the XFA form is looked up first which is the same for both the fields on page one and two. Thus, here the value you set for the field on the first page is retrieved and used for building the appearance on the second page!
Why didn't it work for you initially
In comments you said that you mostly used the older 4.1.6.0 iTextSharp version, not the current 5.5.11.0 one. As explained above, the automatic fill-in on your second page depends on the iText XFA support and form flattening improvements both of which were introduced during the 5.x versions.
Thus, your initial attempts to run the code from your question did not result in filled-in fields on page 2 because the older iTextSharp versions simply did not implement support for that.

Retrieve PDF field "description text" with Itextsharp

Is it possible to get the text shown in the red squares in my attached image? The image shows part of a PDF document with several fields and their "titles".
I don't know what they are called, so I'm having a hard time searching for a solution :(
I can get all the field names and types. But when I debug, I can't seem to find any place where the "caption", or whatever it is called, for a field is stored or accessed. If there is a link between the field and that text, it would be lovely :)
If you don't know how to access that information with code, do you know what the text might be called, so that I can try searching and debugging some more myself?
Edit - added a bigger image, sorry for the NSA blackout of text, not sure i can share the customer PDF document...
Edit2 - added some PDFReader data about the document from VS quickWatch

How do I insert an image in a word document as footer

I need to create and insert a QR code into existing word documents using .NET.
I've done the QR generation part. The 2 things I need to accomplish are:
Inserting the QR code in the footer of an existing word document (preferably using Open XML).
Each page of the Word document has a unique QR code. This means that each footer would have to be different. (I could eliminate the footer and place the QR code as part of the body, but that would make the flow of text complicated.)
Is it possible to accomplish this?

I haven't done this, but I believe that what you will need to do is:
1. Put each page in a separate Word section (which means, in effect, that you will need to decide what your page size and layout is).
2. Create a footer containing one QR code to find out what XML Word expects, and what type of image data you need to store in the .docx (assuming that you are not attempting to store your image data externally in separate files).
3. Create a footer for each section (and ensure that the footers are not "linked to previous"), replicating the format you discovered in point (2).
4. Create a part for each QR code image, and a relationship to that part.
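As a rough, untested sketch of the part-and-relationship plumbing with the Open XML SDK: the code below gives each section its own footer part and image part. The names SectionFooterSketch and AddFooterPerSection are made up, PNG images are assumed, and the footer body is only an empty paragraph. The actual Drawing markup that displays the image still has to be copied from a footer that Word itself produced.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class SectionFooterSketch
{
    // Hypothetical helper: pngPaths[i] is the QR image for the i-th section.
    public static void AddFooterPerSection(string docxPath, IList<string> pngPaths)
    {
        using (WordprocessingDocument doc = WordprocessingDocument.Open(docxPath, true))
        {
            MainDocumentPart main = doc.MainDocumentPart;
            List<SectionProperties> sections =
                main.Document.Body.Descendants<SectionProperties>().ToList();

            for (int i = 0; i < sections.Count && i < pngPaths.Count; i++)
            {
                // One footer part per section, each with its own image part.
                FooterPart footerPart = main.AddNewPart<FooterPart>();
                ImagePart imagePart = footerPart.AddImagePart(ImagePartType.Png);
                using (FileStream stream = File.OpenRead(pngPaths[i]))
                {
                    imagePart.FeedData(stream);
                }

                // Placeholder body: the Drawing element that references the
                // image part is deliberately left out of this sketch.
                footerPart.Footer = new Footer(new Paragraph());

                // A section with its own footer reference does not inherit
                // the footer of the previous section.
                sections[i].RemoveAllChildren<FooterReference>();
                sections[i].PrependChild(new FooterReference
                {
                    Type = HeaderFooterValues.Default,
                    Id = main.GetIdOfPart(footerPart)
                });
            }
        }
    }
}
```

The relationship id needed inside the Drawing markup can be obtained with footerPart.GetIdOfPart(imagePart).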
What I am even less sure about is whether Word will insist that you also store each image in another format (e.g. Windows Metafile or Extended metafile format). My guess is that Word will generate what it needs from your .jpg (or whatever). Or maybe you can use "AltChunks" in some useful way here.
The background to this is that if it were a .doc format document, you could have created a single footer containing a set of nested field codes that used the { PAGE } page number field to link to the correct image for each page - e.g.
{ INCLUDETEXT "c:\\myqrcodes\\qr{ PAGE }.jpg" }
or more likely, the slightly more complicated
{ PAGE \#"'{ INCLUDETEXT "c:\\myqrcodes\\qr{ PAGE }.jpg" }'" }
But if you try to save that as .docx format, even in compatibility mode, when you close and re-open, I think you will just see one image on all pages. Further, even though that approach works with .doc format, it only works if the external image files are actually there and located at absolute addresses in the file system. If they are located at relative addresses (there is a way to do that), you or the end user will probably have to update the footer field codes to get the correct results.

Paragraph breakage, one page to the other page, in detail section of Crystal Reports

I have made a report in Crystal Reports with a detail section, into which I have dragged a variable carrying text. If the paragraph of text is long, I want the rest of it to continue on the next page, as we see in textbooks and as is standard on A4 paper.
The problem occurs when I put a variable carrying text data in the detail section. I tried using an algorithm to divide the data into two parts held in two variables, but since the data can also be HTML, that algorithm does not work very well. I just want to use Crystal Reports functionality. Thanks in advance. I am attaching an image for further understanding.

The history field should have a property called 'keep together'. I'm guessing it's set to true and should be changed to false to allow the field to be split across a page break.
