How to add an updateable header/footer with iTextSharp/C#

I want to be able to add a 'Page x of y' footer with iTextSharp that can then be updated and manipulated using the Header & Footer options in Adobe Acrobat. After the pdf is generated, users may still manually add or remove some pages, so I want them to be able to update the footer easily.
I've found quite a few resources showing how to add some text as a header or footer using PageEvent and GetOverContent(). However, once the pdf is generated, these are just plain text and aren't actually a header or footer object that can be updated in Acrobat without changing the text on each one.
Anyone know how to either:
a) access an existing pdf's header/footer objects via iTextSharp, or
b) create an actual header/footer object that Acrobat can manipulate
I'm using v5.4.3.0 - thank you

I won't downvote the question, but IMHO it's not an eligible question for StackOverflow. You may expect counter-questions such as "what have you tried."
As I'm the original developer of iText, let me explain why your question isn't a support question, but rather a request for consultancy.
PDF is defined in an ISO standard, ISO-32000-1. This standard is implemented by many companies: Adobe, the original creator of the spec, is one of them; iText Software is another.
You're asking for functionality that isn't part of the spec. You're asking about a specific implementation that is proprietary to Adobe. When using the Header & Footer functionality, Acrobat creates an Artifact (/Artifact <</Type /Pagination /Contents (Test) /Subtype /Header >>) and it stores the content stream of this artifact in a way that isn't part of the standard. (I've just read the draft for ISO-32000-2, the specification for PDF 2.0 we'll discuss at the meeting of the ISO committee in a couple of weeks, and I didn't encounter it.)
If you want to mimic this behavior using iTextSharp, you'll have to guess the procedure used by Adobe and implement it in iTextSharp (assuming that you're allowed to reverse engineer that procedure; I think it would be legal as it would improve interoperability). I'm pretty sure this is the closest answer to your question you'll get on this forum. It's now up to you to decide: will I implement this myself, or will I hire somebody to do this?
If you want to hire somebody at iText, please note that you'll need a license for your use of iText 5.4.3 before we'll consider your request.

Related

Is this TOC scenario possible using c# and iTextSharp?

I've just downloaded iTextSharp and before I put a lot of effort into this I'd like to know if this scenario is possible with it. We have a client that is insisting that their SSRS report PDFs contain a table of contents, preferably with page numbers. The various components of these reports have highly variable lengths so we can't hard code actual page numbers. As you all probably know, there is no direct way to create a Table of Contents in SSRS. (We've even had a special session with the Microsoft rep about this.)
What I would like to do is as follows:
Mark the target locations in the SSRS report by setting their DocumentMapLabel property.
Generate the pdf in the usual fashion, either from the report server or a ReportViewer control. (This will be in c#.)
Open the pdf in my hypothetical code.
Insert a blank page at or near the front.
Scan the pdf for DocumentMapLabels (and, ideally, detect which page they're on.)
Populate the blank page with links to the various sections.
Is this possible?
I wouldn't use your design. As soon as the TOC needs more than one page, you're in trouble. Maybe you're confident that this won't happen today, but what if that's needed tomorrow?
You have different options:
Create your document in one go. Add the TOC at the end. Reorder the pages before closing the document.
Create a document (e.g. in memory) using named destinations for the targets. Create a document (e.g. in memory) with the TOC referring to the named destinations. Merge the two documents into one document, consolidating the named destinations.
Create a document with bookmarks (this will result in a bookmarks panel to the left in Adobe Reader). Then read the bookmarks to create a TOC in PDF and merge the PDF with the TOC with the document that has the bookmarks.
All of this is documented.
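The first option comes down to computing a new page order once the number of TOC pages is known. A minimal sketch of that calculation, written in Python for brevity and independent of any PDF library; the function name `toc_first_order` is hypothetical, and the 1-based list it returns is only the kind of ordering a page-reordering call would consume:

```python
def toc_first_order(total_pages, toc_pages):
    """Return 1-based page numbers with the trailing TOC pages moved to the front."""
    if not 0 <= toc_pages <= total_pages:
        raise ValueError("toc_pages must be between 0 and total_pages")
    # The TOC was written last, so it occupies the final toc_pages pages.
    first_toc_page = total_pages - toc_pages + 1
    toc = list(range(first_toc_page, total_pages + 1))
    body = list(range(1, first_toc_page))
    return toc + body
```

Because the TOC is written last, it can grow to any number of pages. Note that after the move, a body page originally numbered p sits at position p + toc_pages, so any page numbers printed in the TOC need that offset.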
In The Best iText Questions on StackOverflow, you'll find the answers to these (and many other) questions:
How can I add titles of chapters in ColumnText? (this sounds exactly like what you need)
Create Index File(TOC) for merged pdf using itext library in java
PDF Page re-ordering using itext
How to reorder the pages of a PDF file?
What you want to do is possible, but not the way you describe it. Read the book, pick an option and post a new question if you have a problem with the option you picked. Just download that book; it's free of charge.
Note: iText(Sharp) is free software, NOT freeware. This means that it is only free of charge if you agree with the open source license (the AGPL). It is not free of charge in all situations as explained in this video. That's also important to know before you start an iText(Sharp) project.

Converting pdf to text

I need to create a C# or C++ (MFC) application that converts pdf files to txt. I need not only to convert, but also to remove headers, footers, some garbage characters on the left margin, etc. Thus the application should allow the user to set page margins to cut off what is not needed. I have actually already created such an application using xpdf, but it gives me some problems when I try to insert custom tags into the extracted text to preserve italics and bold. Maybe somebody could suggest something useful?
Thanks.
There are shareware and freeware utilities out there. Try fetching their source code, or perhaps use them the way they are.
A public version of the PDF specification can be found here: Adobe PDF Specification
PDF Shareware readers can be found: PDF Reader source code # SourceForge
Please look at Podofo. It's an LGPL-licensed library that has many powerful editing features. One of its examples, txt2pdf IIRC, is a good start: it shows basic text extraction. From there you can check whether pre-filtering (in the PDF engine) or post-filtering (in the text) suffices for your goals. I didn't get to use Pdf Hummus, but it's supposed to have these capabilities too, although it's less straightforward.
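The pre-filtering idea (dropping headers, footers, and left-margin debris by position before the text is assembled) can be sketched independently of the PDF library. This is a minimal sketch in Python, assuming the extraction engine can report each text fragment with its page coordinates; the `Fragment` type, its fields, and the margin values are hypothetical, and coordinates follow the PDF convention of an origin at the bottom-left of the page:

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    text: str
    x: float  # left edge of the fragment, in points
    y: float  # baseline of the fragment, in points


def inside_margins(frag, page_width, page_height,
                   left=0.0, right=0.0, top=0.0, bottom=0.0):
    """True if the fragment starts inside the area left after cutting the margins."""
    return (left <= frag.x <= page_width - right
            and bottom <= frag.y <= page_height - top)


def extract_body_text(fragments, page_width, page_height, **margins):
    """Join the fragments that fall inside the margins, in reading order."""
    kept = [f for f in fragments
            if inside_margins(f, page_width, page_height, **margins)]
    # Reading order: top of the page first (larger y), then left to right.
    kept.sort(key=lambda f: (-f.y, f.x))
    return " ".join(f.text for f in kept)
```

For example, on a 612 x 792 point page, `extract_body_text(fragments, 612, 792, left=50, top=50, bottom=50)` would drop anything whose baseline lies in the top or bottom 50 points or that starts in the leftmost 50 points.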

Generating Pdf from webpage in asp.net

Could you give me some recommendations for a free/open-source library that could be integrated into an ASP.NET application to generate PDFs from HTML fragments? I will be generating invoices that are displayed in a DataGrid and tables. Is there a readily available library that would print the whole table with the DataGrid into a PDF? iTextSharp seems nice, but I would have to do the tough work of adding tables and so on when everything is already in the web page.
This is a possible duplicate, but it generates a PDF from the full page, which is not what I want.
Possible duplicate question
iTextSharp does almost what you ask and is open source; however, the API for the conversion process has not been touched for years and is outdated. I would therefore recommend a commercial product.
Something like Winnovative HTML to PDF Converter
To be honest, I look at it like this: you can save money by buying a licence for a commercial product rather than spending days developing a solution yourself.
Edit: If it is for generating invoices alone, then I would use iTextSharp, as it does not take long to learn the basics. However, if you want to be able to convert a full, rich webpage into a PDF, then go down the commercial route.
These links may help:
Convert HTML to PDF in .NET
Generate PDF from ASP.NET from raw HTML/CSS content?
Creating PDF Invoices - Are there any templating solutions?
When doing invoices, I usually go a slightly different way, by starting with Aspose.Words and using the nested mail merge feature.
Another option could be the HTML to PDF feature of Aspose.Pdf.
Both libraries are commercial only; I don't know whether this is appropriate for your project.

How can I programmatically create PDF bookmarks from PDF file?

So, I have used Pdf995's PDF print driver from a web browser to print web pages and eventually use PdfEdit995 to join these various PDF files into one large PDF.
Now I have a lot of large PDF documents that I wish to add bookmarks to, but am hoping there is a relatively easy way of doing this programmatically (using C#, preferably) - basically, I want to find, within each PDF, text that is large enough to qualify as a header, and use that text as the bookmark.
Any tips/advice/direction? Thanks!
It's definitely possible to do this, but I would recommend finding a PDF library that does most of the leg work. Technically you could do it all yourself with the aid of the PDF specification, but that'd probably take more time than it's worth.
The library will need to be able to let you find text in a document and then return the page and size, font, etc, of the text and create bookmarks (also known as outlines) based on that information programmatically.
My company's product, Quick PDF Library, can help you do this and so can PDFKit.NET. I'm sure there are other libraries out there that support this functionality too. As far as free libraries go, from what I've seen I don't believe that PDFSharp or iText will meet all of your requirements in this case, but I'm sure someone will correct me if I'm wrong.
If you'd prefer to develop a solution for this entirely yourself, then the PDF reference is available online for free.
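The heading-detection step can be sketched separately from whichever library does the extraction. A minimal sketch in Python, assuming the library reports each text run with its page number and font size; the `TextRun` type and the `min_size` threshold are hypothetical, and a real document will probably need a threshold tuned to its own body-text size:

```python
from dataclasses import dataclass


@dataclass
class TextRun:
    page: int         # 1-based page number the run appears on
    text: str
    font_size: float  # in points


def find_bookmarks(runs, min_size=16.0):
    """Return (title, page) pairs for runs large enough to count as headings."""
    bookmarks = []
    for run in runs:
        title = run.text.strip()
        # Skip whitespace-only runs, which can carry a large font size too.
        if title and run.font_size >= min_size:
            bookmarks.append((title, run.page))
    return bookmarks
```

Each resulting pair would then be handed to the library's outline (bookmark) API, which is the library-specific part this sketch leaves out.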

Is there a way to replace a text in a PDF file with itextsharp?

I'm using itextsharp to generate the PDFs, but I need to change some text dynamically.
I know that it's possible to change text if there's an AcroField, but my PDF doesn't have any. It just has plain text, and I need to change some of it.
Does anyone know how to do it?
Actually, I have a blog post on how to do it! But like IanGilham said, it depends on whether you have control over the original PDF. The basic idea is that you set up a form on the page and replace the form fields with the text you want. (You can style the form so it doesn't look like a form.)
If you don't have control over the PDF, let me know how to do it!
Here is a link to the full post:
Using a template to programmatically create PDFs with C# and iTextSharp
I haven't used itextsharp, but I have been using PDFNet SDK to explore the content of a large pile of PDFs for localisation over the last few weeks.
I would say that what you require is absolutely achievable, but how difficult it is will depend entirely on how much control you have over the quality of the files. In my case, the files can be constructed from any combination of images, text in any random order, tables, forms, paths, single pixel graphics and scanned pages, some of which are composed from hundreds of smaller images. Let's just say we're having fun with it.
In the PDFTron way of doing things, you would have to implement a viewer (sample available), and add some code over a text selection. Given the complexities of the format, it may be necessary to implement a simple editor in a secondary dialog with the ability to expand the selection to the next line (or whatever other fundamental object is used to make up text). The string could then be edited and applied by copying the entire page of the document into a new page, replacing the selected elements with your new string. You would probably have to do some mathematics to get this to work well though, as just about everything in PDF is located on the page by means of an affine transform.
Good luck. I'm sure there are people on here with some experience of itextsharp and PDF in general.
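On the affine-transform point: a PDF transformation matrix is written as six numbers `[a b c d e f]`, and a point (x, y) maps to (a·x + c·y + e, b·x + d·y + f). A minimal sketch of that arithmetic in Python; the function names are hypothetical and no PDF library is involved:

```python
def apply_matrix(m, x, y):
    """Map the point (x, y) through the PDF matrix m = (a, b, c, d, e, f)."""
    a, b, c, d, e, f = m
    return (a * x + c * y + e, b * x + d * y + f)


def concat(m1, m2):
    """Return the single matrix equivalent to applying m1 first, then m2."""
    a1, b1, c1, d1, e1, f1 = m1
    a2, b2, c2, d2, e2, f2 = m2
    return (
        a1 * a2 + b1 * c2,
        a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2,
        c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2,
        e1 * b2 + f1 * d2 + f2,
    )
```

Placing replacement text where the original sat means recovering the combined matrix in effect at that point in the content stream, which is why the order of concatenation matters.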
This question comes up from time to time on the mailing list. The same answer is given time and time again - NO. See this thread for the official answer from the person who created iText.
This question should be a FAQ on the itextsharp tag wiki.
