Retrieve PDF field "description text" with Itextsharp

Is it possible to get the text shown in the red squares in my attached image? The image shows part of a PDF document with several fields and their "titles".
I don't know what they are called, so I'm having a hard time searching for a solution :(
I can get all the field names and types, but when I debug I can't find any property where the "caption" (or whatever it is called) of a field is stored or accessed. If there is a link between the field and that text, it would be lovely :)
If you don't know how to access that information with code, do you know what the text might be called, so that I can try searching and debugging some more myself?
Edit - added a bigger image. Sorry for the NSA-style blackout of the text; I'm not sure I can share the customer's PDF document...
Edit2 - added some PDFReader data about the document from the VS QuickWatch window.

Related

Different result in different pdf viewer pdfsharp

My team and I recently switched from iTextSharp to PDFsharp because iTextSharp had become really slow and we couldn't fix the problem.
Now we have a different problem: the PDF filled by PDFsharp is 200 KB bigger than the one from iTextSharp. The size itself isn't the issue. The issue is that when we open our PDF in Firefox, the data is displayed fine in the viewer, but when we print it, all the multiline fields suddenly become single-line fields, with a different font too.
We have /NeedAppearances set on our AcroForm elements. We tried removing it to see what the document would look like in Adobe and other viewers, and it looked the same as the print view in Firefox.
/NeedAppearances isn't set on the document produced by iTextSharp, and that document displays fine in every viewer.
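For reference, the flag discussed above is spelled /NeedAppearances in the PDF specification and lives on the AcroForm dictionary. A minimal sketch of how it could be switched on or removed with PDFsharp follows; the class and method names are hypothetical helpers, not part of the library, and the calls are written from memory of the PDFsharp 1.5 API, so check them against the version in use.

```csharp
using PdfSharp.Pdf;
using PdfSharp.Pdf.AcroForms;

public static class NeedAppearancesSketch
{
    // Hypothetical helper: asks viewers to regenerate field appearances.
    public static void EnableNeedAppearances(this PdfAcroForm form)
    {
        const string key = "/NeedAppearances";
        if (form.Elements.ContainsKey(key))
            form.Elements[key] = new PdfBoolean(true);
        else
            form.Elements.Add(key, new PdfBoolean(true));
    }

    // Hypothetical helper: removes the flag, so viewers rely only on the
    // /AP appearance streams already stored in the file.
    public static void RemoveNeedAppearances(this PdfAcroForm form)
    {
        form.Elements.Remove("/NeedAppearances");
    }
}
```

With the flag removed, what a viewer shows depends entirely on the stored appearance streams, which matches the test described above.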
This is the code we use to set the text:
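// Sets the text of the named text field; returns false if there is no such text field.
public static bool SetField(this PdfAcroForm form, string fieldName, string value)
{
    // "as" yields null instead of throwing when the field is not a text field.
    PdfTextField field = form.Fields[fieldName] as PdfTextField;
    if (field != null)
    {
        field.Text = value;
    }
    return field != null;
}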
After all the fields have been set, we call document.flatten() to make the fields read-only.
A little side note:
When we open the PDF in Adobe and then close it, Adobe asks us to save it even though we haven't changed the document. Once we have saved it, the file is 200 KB smaller and suddenly works in all viewers. This is with /NeedAppearances on.
Update 1
I spent the whole night looking for a solution but couldn't find one.
This is what I have found so far:
After the Text property of a PdfTextField has been set, an /AP element appears in its Elements. It contains a reference to an object that describes what should be drawn.
I think Adobe understands the /NeedAppearances entry on the AcroForm and therefore regenerates a correct /AP element for every field. The file seems to be smaller afterwards because Adobe does something with the streams on the elements, some sort of encoding that takes up less space.
So as it stands, I think I have to create a new Flatten method that creates the /AP elements correctly. I don't know why the current Flatten method doesn't do that; it only changes the fields to read-only.

What I ended up doing was creating my own flatten method.
Summary of what the Flatten method does:
I made it an extension method on PdfAcroForm.
I loop through all the fields except PdfCheckBoxField, because those are displayed just fine.
For each field, I find the page the field is on and create an XGraphics from that page.
Then I get the position and size of the field from its /Rect element.
Then I pass my XGraphics to an XTextFormatter and set the appearance on the XTextFormatter from the elements of my field.
Then I call XTextFormatter.DrawString() and, after that, dispose of my XGraphics.
Finally, to remove the field, I delete all the elements on that field.
If this is unclear, feel free to comment and I'll do my best to help.
DISCLAIMER:
The flatten method I created deletes your fields, and you CANNOT undo it. It writes the text onto the PDF page itself, at the field's position.
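The steps above could look roughly like the following. This is a sketch under stated assumptions, not the answerer's actual code: the names FlattenByDrawing and FindPage are hypothetical, the font is passed in by the caller instead of being read from the field, only top-level text fields are handled, each field is assumed to carry the optional /P page entry and its own /Rect, and pages are assumed to be unrotated. The PDFsharp calls are written from memory of the 1.5 API.

```csharp
using System.Collections.Generic;
using PdfSharp.Drawing;
using PdfSharp.Drawing.Layout;
using PdfSharp.Pdf;
using PdfSharp.Pdf.AcroForms;
using PdfSharp.Pdf.Advanced;

public static class AcroFormFlattenSketch
{
    // Hypothetical helper: draws each text field's value onto its page and
    // then strips the field. This is destructive and cannot be undone.
    public static void FlattenByDrawing(this PdfAcroForm form, PdfDocument document, XFont font)
    {
        // Collect the text fields first so that clearing one field
        // cannot disturb later lookups by name.
        var textFields = new List<PdfTextField>();
        foreach (string name in form.Fields.Names)
        {
            // Check boxes and other non-text fields are skipped.
            var textField = form.Fields[name] as PdfTextField;
            if (textField != null)
                textFields.Add(textField);
        }

        foreach (PdfTextField field in textFields)
        {
            PdfPage page = FindPage(document, field);
            if (page == null)
                continue; // assumption not met: no usable /P entry

            // /Rect is in PDF coordinates (origin bottom-left, y up);
            // XGraphics uses a top-left origin, so y is mirrored.
            PdfRectangle rect = field.Elements.GetRectangle("/Rect");
            var box = new XRect(rect.X1, page.Height.Point - rect.Y2, rect.Width, rect.Height);

            using (XGraphics gfx = XGraphics.FromPdfPage(page))
            {
                var formatter = new XTextFormatter(gfx);
                formatter.DrawString(field.Text ?? string.Empty, font, XBrushes.Black, box, XStringFormats.TopLeft);
            }

            // "Remove" the field by deleting all of its dictionary entries.
            field.Elements.Clear();
        }
    }

    // Finds the page whose object ID matches the field's /P reference.
    private static PdfPage FindPage(PdfDocument document, PdfAcroField field)
    {
        var pageRef = field.Elements["/P"] as PdfReference;
        if (pageRef == null)
            return null;

        foreach (PdfPage page in document.Pages)
        {
            if (page.Reference != null && page.Reference.ObjectID.Equals(pageRef.ObjectID))
                return page;
        }
        return null;
    }
}
```

A call might look like form.FlattenByDrawing(document, new XFont("Arial", 10)) just before saving. XTextFormatter handles the line wrapping, which is what keeps multiline values from collapsing into a single line.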

Unwanted characters in Acrobat PDF conversion of auto-detected Word fillable fields. Deleting fillable field characters using iTextSharp5

Acrobat DC, Office 365, iTextSharp5, Win10 Pro 64-bit
I have a Word document containing several pages of text and one empty TextBox in between two of the lines. I am attempting to use the Acrobat "Prepare Form" feature to convert that document to PDF with the TextBox as a fillable field, and Acrobat has no problem auto detecting the TextBox and making it fillable. The problem, however, is that the converted TextBox contains text from either the line of text above it or below it.
I've read that this is caused by placing the TextBox too close to those lines in the Word document and sure enough, by leaving three or four empty lines of space above and below the text box the issue goes away. However, that's an unacceptable amount of wasted space. I tried putting continuous section breaks above and below the TextBox in Word as well as typing spaces in the TextBox but that doesn't help. I also tried it with a 1x1 table instead of a TextBox but the same problem occurred.
I then tried deleting the unwanted text from the PDF TextBox field and saving it that way, which appeared to be a reasonable solution. However, when I used an iTextSharp5 program to detect the PDF's fillable fields it could no longer detect the empty field. I wouldn't mind leaving the original unwanted text in the PDF TextBox field if there were some way to remove it with iTextSharp, but it doesn't seem to have that ability.
Because I have many Word documents to convert to fillable PDFs and might need to update them occasionally, it simply isn't practical for me to manually add the fillable fields to the converted PDFs each time an update is needed. Any suggestions are welcome :-)

PDFClown Detect empty text location

I am able to use the PDFClown library in C# to parse and extract the text from a daily report in PDF. The issue I am having is detecting when a text value is missing. When I use the TextExtractor, the extracted text contains no placeholder for the missing value, as I had expected it would. The PDF document has a box where the missing text should be, so it seems like there should be some way to detect that the value is not there. There is no form in this document.

How do I insert an image in a word document as footer

I need to create and insert a QR code into existing Word documents using .NET.
I've done the QR generation part. The two things I need to accomplish are:
Inserting the QR code in the footer of an existing Word document (preferably using Open XML).
Each page of the Word document has a unique QR code. This means that each footer would have to be different. (I could eliminate the footer and place the QR code as part of the body, but that would make the flow of text complicated.)
Is it possible to accomplish this?

I haven't done this, but I believe that what you will need to do is:
1. Put each page in a separate Word section (which means, in effect, that you will need to decide what your page size and layout are).
2. Create a footer containing one QR code to find out what XML Word expects, and what type of image data you need to store in the .docx (assuming that you are not attempting to store your image data externally in separate files).
3. Create a footer for each section (and ensure that the footers are not "linked to previous"), replicating the format you discovered in point (2).
4. Create a part for each QR code image, and a relationship to that part.
What I am even less sure about is whether Word will insist that you also store each image in another format (e.g. Windows Metafile or Extended metafile format). My guess is that Word will generate what it needs from your .jpg (or whatever). Or maybe you can use "AltChunks" in some useful way here.
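As a rough illustration of the footer, image part and relationship steps listed above, here is a sketch using the Open XML SDK. It is untested against Word and rests on several assumptions: the class and method names are hypothetical, the image is a PNG file at a path the caller supplies, the picture is fixed at one inch square, and the code targets the 2.x SDK API as remembered. The drawing markup follows the general shape of Microsoft's "insert a picture" sample with the optional elements left out.

```csharp
using System.IO;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
using A = DocumentFormat.OpenXml.Drawing;
using DW = DocumentFormat.OpenXml.Drawing.Wordprocessing;
using PIC = DocumentFormat.OpenXml.Drawing.Pictures;

public static class QrFooterSketch
{
    // Hypothetical helper: gives one section its own footer with one image.
    public static void AddImageFooter(MainDocumentPart mainPart,
        SectionProperties sectionProps, string pngPath, uint drawingId)
    {
        // One footer part per section, one image part per QR code.
        FooterPart footerPart = mainPart.AddNewPart<FooterPart>();
        ImagePart imagePart = footerPart.AddImagePart(ImagePartType.Png);
        using (FileStream stream = File.OpenRead(pngPath))
        {
            imagePart.FeedData(stream);
        }
        string imageRelId = footerPart.GetIdOfPart(imagePart);

        footerPart.Footer = new Footer(
            new Paragraph(new Run(BuildInlineImage(imageRelId, drawingId))));
        footerPart.Footer.Save();

        // A section with its own default footer reference is not "linked to
        // previous", so drop any old default reference and add the new one.
        var oldRefs = sectionProps.Elements<FooterReference>()
            .Where(r => r.Type == null || r.Type.Value == HeaderFooterValues.Default)
            .ToList();
        foreach (FooterReference oldRef in oldRefs)
        {
            oldRef.Remove();
        }
        sectionProps.PrependChild(new FooterReference
        {
            Type = HeaderFooterValues.Default,
            Id = mainPart.GetIdOfPart(footerPart)
        });
    }

    // Builds the inline drawing that points at the image relationship.
    private static Drawing BuildInlineImage(string imageRelId, uint drawingId)
    {
        const long size = 914400L; // 1 inch in EMUs (assumed display size)
        return new Drawing(
            new DW.Inline(
                new DW.Extent { Cx = size, Cy = size },
                new DW.DocProperties { Id = drawingId, Name = "QR code " + drawingId },
                new A.Graphic(
                    new A.GraphicData(
                        new PIC.Picture(
                            new PIC.NonVisualPictureProperties(
                                new PIC.NonVisualDrawingProperties { Id = 0U, Name = "qr.png" },
                                new PIC.NonVisualPictureDrawingProperties()),
                            new PIC.BlipFill(
                                new A.Blip { Embed = imageRelId },
                                new A.Stretch(new A.FillRectangle())),
                            new PIC.ShapeProperties(
                                new A.Transform2D(
                                    new A.Offset { X = 0L, Y = 0L },
                                    new A.Extents { Cx = size, Cy = size }),
                                new A.PresetGeometry(new A.AdjustValueList())
                                { Preset = A.ShapeTypeValues.Rectangle })))
                    { Uri = "http://schemas.openxmlformats.org/drawingml/2006/picture" })));
    }
}
```

The caller would loop over the section properties of the document (for example via mainPart.Document.Body.Descendants<SectionProperties>()), call the helper once per section with that page's QR image and a unique drawingId, and then save the document. Splitting the pages into sections, step 1 of the list, is not covered here.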
The background to this is that if it were a .doc format document, you could have created a single footer containing a set of nested field codes that used the { PAGE } page number field to link to the correct image for each page - e.g.
{ INCLUDETEXT "c:\\myqrcodes\\qr{ PAGE }.jpg" }
or more likely, the slightly more complicated
{ PAGE \#"'{ INCLUDETEXT "c:\\myqrcodes\\qr{ PAGE }.jpg" }'" }
But if you try to save that in .docx format, even in compatibility mode, then when you close and re-open the document I think you will just see one image on all pages. Further, even though that approach works with the .doc format, it only works if the external image files are actually there and located at absolute addresses in the file system. If they are located at relative addresses (there is a way to do that), you or the end user will probably have to update the footer field codes to get the correct results.

How to send values from a text box through a form to the "Group Sort Expert" dialogue in Crystal Reports

Screenshot: http://img136.imageshack.us/img136/4083/15748429.jpg
Hi all,
Please look at the screenshot above. I want to set the value of N in the Group Sort Expert explicitly from a text box in a C# application.
Can anyone help me with this?

I don't think that would be very easy. My guess is you'd have to hack into ReportDocument.ReportDefinition and maybe somewhere in the Areas or Sections collections. If you did find it and changed the value, you might even have to re-save the .rpt file for it to take effect. I'm not sure if it could be changed on the fly.
