I was wondering whether it would at all be possible to have our creative department design a nice-looking PDF template for our client, e.g. a fancy letterhead, then supply it to me so I could inject various types of content into the body using PDFSharp or MigraDoc.
Currently we generate the header and footer content as part of the rendering process, and it works very well, but as you can imagine, any non-trivial layout and styling is pretty complicated to pull off in what is essentially a 2D graphics environment.
So the thought arose as to whether one of these tools would be able to take a pre-existing PDF, give me access to various objects, and allow me to e.g. replace certain text placeholders or manipulate the PDF "DOM" in a more intelligent fashion.
Something similar to working with Spreadsheets (binary and XML versions) or OpenXML, etc.
What we do: take an existing PDF page, draw it at the bottom (Z axis) of a new PDF page, and then use MigraDoc to add other contents to the page.
PDFsharp can also be used to draw on top.
The template PDF pages are used like letter heads with the corporate design of a customer and the final document will have as many pages as needed.
Related
I have built several PDF documents dynamically from ASP.NET pages (HTML/CSS) using plugins like Winnovative HtmlToPDFconverter. It has always been a successful outcome using the built-in functionality for those plugins, like merging existing PDF documents with dynamic content and adding pre-defined headers and footers, adding page margins, page numbers and so forth. The HTML content has overall been rendered as expected in the final PDF document(s).
Is there any way/any advice for a similar .NET plugin that can render HTML/CSHTML to a Microsoft Word document (.docx) in the same way – or is it too difficult to render native HTML5 and CSS into a desirable layout for a Microsoft Word Document?
I have googled around and found some suggestions, but I'm looking for recommendations for maybe a specific plugin – or a warning if it is a no-go
and too difficult to get the desired layout 1:1 from HTML to a Word document because of incompatibility between markups?
Devexpress HTML editor can export its HTML content to different formats including docx and rtf. Not sure about its limitations (e.g. script and canvas export, etc.), but in the common case it works well.
I'm using PDFsharp to use one PDF as a watermark in another PDF. This is mostly working. The watermark PDF is placed "behind" the content of each page in the target PDF. However, the watermark content needs to be partially transparent (or screened) in order for the resulting PDF to be legible.
How do I go about using PDFsharp to globally adjust the transparency of a PDF?
You can check the documentation here for details on adding a watermark onto a pdf using PdfSharp. From the link:
Note: Technically the watermarks in this sample are simple graphical output. They have nothing to do with the Watermark Annotations introduced in PDF 1.5.
Here is another link which claims to have 3 different methods of applying watermarks - have you tried any of these? It looks like you may need to use MigraDocs as well as PdfSharp to achieve this.
You didn't specify what your watermark looks like - does it need to support any custom pdf you can create, or is it just some text going across the page? The latter definitely looks possible using the links I posted.
If you want to create custom objects, maybe you can check this link (Xforms), where it talks about drawing transparent custom shapes:
This sample shows how to create an XForm object from scratch. You can think of such an object as a template, that, once created, can be drawn frequently anywhere in your PDF document.
I think that perhaps instead of having 2 PDFs (1 main and 1 watermark) it is probably going to be easier to have 1 pdf and then create the watermark either with the built-in methods or by creating an XForm object and sticking it on the pdf.
I have a pdf template without AcroFields and i need to replace text in it. The text is formated like this ((aFieldToReplace)), but there are also tables that need filled up with a n-numbered rows.
Is there any good tutorial, resource or sample to find?
Is there a way to replace a text in a PDF file with itextsharp? has more or less the same question but the answer ignores the "no Acrofield" part of the question.
EDIT:
To make it even harder, i have multiple templates that i can use. The templates have all there own formatting-style (font, color,...)
EDIT 2:
The purpose is to create a report with some data in a database. The data in a database is coming from several forms in a ASP.NET MVC application.
The report could have several layouts depending on the chosen template.
Templates should be addable dynamically, so i can't create the layout from scratch. I really need to get the layout from a template.
Quoting the excellent iText in Action:
In a PDF document, every character or glyph on a PDF page has its fixed position, regardless of the application that’s used to view the document.
[…]
Suppose you want to replace the word “edit” with the word “manipulate” in a sentence, you’d have to reflow the text. You’d have to reposition all the characters that follow that word. Maybe you’d even have to move a portion of the text to the next page. That’s not trivial, if not impossible.
[…]
Don’t expect any tool to be able to edit a PDF file the same way you’d edit a Word document.
PDF is a document display format. If you want templating you'll probably have to use something else.
#Frederiek:
If you can spend a bit of money, this will do exactly what you want. Check out the demo, it's quite cool. It can reflow the text, replace images, etc. Quite nice.
http://www.iceni.com/infixServer.htm
Let me know if that works for you.
I'm creating docx file by using DocumentFormat OpenXML API, appending texts, tables, images, etc... at some point I'd like to know how much "space" (rows, pixels, whatever) I have before I reach the end of the page.
Is it possible to get this information? The page size, formatting, margins, etc, are all there - everything that's needed to calculate this is in the document.
I do understand that the document itself doesn't deal with formatting, how many pages are there etc, but if not from the OpenXML API, is there some other way to find this out? Maybe some dummy formatter that reads this docx and can be used to find out the 'formatting' data - where is currently the last character positioned on a page, etc... is there such a thing?
I'm not aware of such a "dummy formatter". Such a thing would need to replicate Word's page layout model (including line spacing, how it hyphenates words, calculates space for the header/footer etc). That might be relatively straightforward for simple text only documents, but for a full solution, you have tables, columns, images, text boxes etc etc to worry about.
Can you automate Word and ask it?
I'm using itextsharp to generate the PDFs, but I need to change some text dynamically.
I know that it's possible to change if there's any AcroField, but my PDF doen's have any of it. It just has some pure texts and I need to change some of them.
Does anyone know how to do it?
Actually, I have a blog post on how to do it! But like IanGilham said, it depends on whether you have control over the original PDF. The basic idea is you setup a form on the page and replace the form fields with the text you want. (You can style the form so it doesn't look like a form)
If you don't have control over the PDF, let me know how to do it!
Here is a link to the full post:
Using a template to programmatically create PDFs with C# and iTextSharp
I haven't used itextsharp, but I have been using PDFNet SDK to explore the content of a large pile of PDFs for localisation over the last few weeks.
I would say that what you require is absolutely achievable, but how difficult it is will depend entirely on how much control you have over the quality of the files. In my case, the files can be constructed from any combination of images, text in any random order, tables, forms, paths, single pixel graphics and scanned pages, some of which are composed from hundreds of smaller images. Let's just say we're having fun with it.
In the PDFTron way of doing things, you would have to implement a viewer (sample available), and add some code over a text selection. Given the complexities of the format, it may be necessary to implement a simple editor in a secondary dialog with the ability to expand the selection to the next line (or whatever other fundamental object is used to make up text). The string could then be edited and applied by copying the entire page of the document into a new page, replacing the selected elements with your new string. You would probably have to do some mathematics to get this to work well though, as just about everything in PDF is located on the page by means of an affine transform.
Good luck. I'm sure there are people on here with some experience of itextsharp and PDF in general.
This question comes up from time to time on the mailing list. The same answer is given time and time again - NO. See this thread for the official answer from the person who created iText.
This question should be a FAQ on the itextsharp tag wiki.