If you search for "add image to pdf" on the Internet, you will find many useful articles. However, none of them meets my requirements.
I want to add an image at a specific place inside an existing PDF file, for instance inside a text box.
I am not certain exactly how you need the image added to your PDF, but there are a number of approaches you can consider:
1- Load the PDF as a rasterized image and draw the image at your desired location.
2- Add the image as an annotation to the PDF.
3- Convert the PDF to a format that allows easy modification of text and insertion of images.
Loading the PDF as a rasterized image is the most direct approach. However, your text will no longer be searchable and any other PDF objects (Annotations, Hyperlinks) will all become part of one image (no longer objects). But using this approach you can simply draw the image at the exact place you need. If you want to restore text searchability after doing this, you can use an OCR engine to process the text in the resulting image.
The ImageMagick library uses the Ghostscript common engine for dealing with PDF, and it can convert PDF pages to images. There's a .NET wrapper for ImageMagick to use with C#. For OCR, there are free engines like MODI or Tesseract.
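If you go the rasterization route, "the exact place you need" has to be translated from PDF coordinates into pixel coordinates on the rendered page. Here is a minimal sketch of that conversion. It assumes the page is not rotated, that its origin is the bottom-left corner at (0, 0), and that the default PDF unit of 1/72 inch is in use; the names `PixelBox` and `pdf_rect_to_pixels` are made up for illustration and do not come from any library.

```python
from dataclasses import dataclass

POINTS_PER_INCH = 72.0  # default PDF user-space unit is 1/72 inch


@dataclass(frozen=True)
class PixelBox:
    left: int
    top: int
    width: int
    height: int


def pdf_rect_to_pixels(x_pt, y_pt, width_pt, height_pt, page_height_pt, dpi):
    """Map a rectangle in PDF points (origin bottom-left) to pixel
    coordinates on a page rasterized at `dpi` (origin top-left)."""
    scale = dpi / POINTS_PER_INCH
    left = round(x_pt * scale)
    # Flip the y axis: the rectangle's top edge sits y_pt + height_pt
    # above the bottom of the page.
    top = round((page_height_pt - (y_pt + height_pt)) * scale)
    return PixelBox(left, top, round(width_pt * scale), round(height_pt * scale))


# Example: a 2 x 1 inch box placed one inch from the left and bottom
# edges of a US Letter page (612 x 792 points), rendered at 144 dpi.
box = pdf_rect_to_pixels(72, 72, 144, 72, page_height_pt=792, dpi=144)
```

The resulting box is what you would hand to whatever image library draws the picture onto the rasterized page.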
Adding the image as an annotation allows you to maintain the original format and text in the PDF, though the image will be treated as a separate object from the text and will not be "in-line". Annotations can also be drawn at the exact location you need without too much difficulty.
LibreOffice Draw and Okular are options you can consider for drawing annotations.
Finally, you could simply convert the PDF to a format that is easier to process and edit, like DOC, add your image, and then convert it back to PDF.
I'm looking into this concept for creating AR in Unity using PDF content:
Have a picture of an object as AR target, then a command to search through a PDF database for the PDF file that contains the same/similar picture, and finally to overlay the text content in that PDF on my initial object.
From your experience, do you know if this is currently achievable in Unity?
My findings so far are that there are PDF reader plugins for Unity, and that you can display PDF files in a browser (Application.OpenURL) or resort to converting the PDF to images. But I haven't found much on search features inside PDFs (text or image search).
Instead of searching for a PDF file that contains the same picture used as AR target, maybe search for a unique "text" and get the PDF where that string is located?
https://askubuntu.com/questions/37408/can-the-unity-dash-search-for-content-within-files
https://askubuntu.com/questions/41829/how-can-i-search-for-files-in-unity
And would it be possible to pull out the text content of the PDF and display it as an overlay during target detection?
Can you also advise on training I could take that would be helpful for the above? Or propose alternatives?
Thank you.
I was wondering whether it would at all be possible to have our creative department design a nice-looking PDF template for our client, e.g. a fancy letterhead, then supply it to me so I could inject various types of content into the body using PDFSharp or MigraDoc.
Currently we generate the header and footer content as part of the rendering process, and it works very well, but as you can imagine, any non-trivial layout and styling is pretty complicated to pull off in what is essentially a 2D graphics environment.
So the thought arose as to whether one of these tools would be able to take a pre-existing PDF, give me access to various objects, and allow me to e.g. replace certain text placeholders or manipulate the PDF "DOM" in a more intelligent fashion.
Something similar to working with Spreadsheets (binary and XML versions) or OpenXML, etc.
What we do: take an existing PDF page, draw it at the bottom (Z axis) of a new PDF page, and then use MigraDoc to add other contents to the page.
PDFsharp can also be used to draw on top.
The template PDF pages are used like letter heads with the corporate design of a customer and the final document will have as many pages as needed.
I'm using PDFsharp to use one PDF as a watermark in another PDF. This is mostly working. The watermark PDF is placed "behind" the content of each page in the target PDF. However, the watermark content needs to be partially transparent (or screened) in order for the resulting PDF to be legible.
How do I go about using PDFsharp to globally adjust the transparency of a PDF?
You can check the documentation here for details on adding a watermark to a PDF using PDFsharp. From the link:
Note: Technically the watermarks in this sample are simple graphical output. They have nothing to do with the Watermark Annotations introduced in PDF 1.5.
Here is another link which claims to have 3 different methods of applying watermarks. Have you tried any of these? It looks like you may need to use MigraDoc as well as PDFsharp to achieve this.
You didn't specify what your watermark looks like. Does it need to support any custom PDF you can create, or is it just some text going across the page? The latter definitely looks possible using the links I posted.
If you want to create custom objects, maybe you can check this link (XForms), where it talks about drawing transparent custom shapes:
This sample shows how to create an XForm object from scratch. You can think of such an object as a template, that, once created, can be drawn frequently anywhere in your PDF document.
I think that instead of having 2 PDFs (1 main and 1 watermark), it is probably going to be easier to have 1 PDF and then create the watermark either with the built-in methods or by creating an XForm object and drawing it on the PDF.
How can I embed a Word document into another Word document via the OpenXML SDK so that it shows the content, not a Word icon? That is, the same result as doing it manually in Word: Insert object from file -> WITHOUT checking "Display as icon".
I've found this article, but it uses an icon. I've also tried the OpenXML SDK Productivity Tool, but it shows only generated binary data.
EDITED:
I use the following code:
DrawAspect = OleDrawAspectValues.Content
and then I add the image part:
var imagePart = mainDocumentPart.AddNewPart<ImagePart>("image/x-emf", imagePartId);
GenerateImagePart(imagePart);
But my image part is just a byte array containing Word's icon.
So what happens is this: when I open the generated document, it shows the embedded document as an icon, but when I double-click the embedded document, edit it and save the changes, the embedded document is shown as content. So maybe there is some way to show this content without editing the embedded document? Instead of the bytes of Word's icon, should I use the bytes of a screenshot of the document?
I'm not sure I described this clearly, so please ask.
I'm afraid what you are asking for is almost impossible.
As far as the Word file is concerned, the only difference between the icon and the embedded content is the image.
When you don't use an icon, Word pretty much just takes a screenshot of the document you are embedding and inserts that in place of the icon graphic.
I've uploaded an example I grabbed from a Word file I made. Found this little gem in the /media folder inside the .docx file.
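If you want to look at that preview image yourself: a .docx file is a ZIP package, so its parts can be listed with any ZIP tool. Below is a minimal Python sketch. It assumes the folder names Word typically uses, `word/media/` for images and `word/embeddings/` for embedded objects; the function name `list_embedded_parts` and the file name in the usage line are made up for illustration.

```python
import zipfile


def list_embedded_parts(docx_path):
    """Return the names of media and embedded-object entries in a .docx package.

    `docx_path` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(docx_path) as package:
        return sorted(
            name
            for name in package.namelist()
            # Assumed folder layout: what Word usually writes, not a guarantee.
            if name.startswith(("word/media/", "word/embeddings/"))
        )
```

Calling `list_embedded_parts("host.docx")` on a document with an embedded Word file should list both the embedded package and the image that Word displays in its place.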
So basically, if you can't live with the icon, your only choice is to somehow grab a picture of the Word file you want to embed and insert that instead of the icon image.
How you'd go about that can't be pretty. First of all, the Open XML SDK contains no such functionality. I tried playing around a bit with Office Interop as well, but no luck.
I only see two possible ways to achieve this.
The first one is via Interop. You'll need to install a "pretend printer" like the ones that print to PDF instead of sending the job to a physical printer. This one, however, needs to print to an image format. The format of the file in the media folder was .emf, but I'm not positive that's a requirement.
Anyway, should the above somehow be possible, you could embed that picture, pretty much using the example you linked from Microsoft, and just change the size of the "icon", which would now be an image of the document.
The second possibility would be to open the Word document as a process, set the zoom to 72% (or whatever makes the document the only thing visible on your desktop), then grab a screenshot, crop it down to just the document, and use that as your image for the embedding.
For the record, I don't recommend you do any of the above, but those are the only options I see.
Should someone have a better solution to this I'm all ears.
Finally, should you decide that you want to push on with this, I'll be happy to code up an example of option number 2 if you reply and tell me you'd like that.
Kaspar
There is a nice wrapper API (Document Builder 2.2) around Open XML, specially designed to merge documents, with the flexibility to choose which paragraphs to merge, etc. You can download it from here.
Using this tool you can embed a paragraph of another word document or entire word document as per your requirement.
The documentation and screen casts on how to use it are here.
Hope this helps.
I'm using iTextSharp to generate PDFs, but I need to change some text dynamically.
I know that it's possible to change the text if there is an AcroField, but my PDF doesn't have any. It just has some plain text, and I need to change some of it.
Does anyone know how to do it?
Actually, I have a blog post on how to do it! But like IanGilham said, it depends on whether you have control over the original PDF. The basic idea is that you set up a form on the page and replace the form fields with the text you want. (You can style the form so it doesn't look like a form.)
If you don't have control over the PDF, let me know how to do it!
Here is a link to the full post:
Using a template to programmatically create PDFs with C# and iTextSharp
I haven't used itextsharp, but I have been using PDFNet SDK to explore the content of a large pile of PDFs for localisation over the last few weeks.
I would say that what you require is absolutely achievable, but how difficult it is will depend entirely on how much control you have over the quality of the files. In my case, the files can be constructed from any combination of images, text in any random order, tables, forms, paths, single pixel graphics and scanned pages, some of which are composed from hundreds of smaller images. Let's just say we're having fun with it.
In the PDFTron way of doing things, you would have to implement a viewer (sample available), and add some code over a text selection. Given the complexities of the format, it may be necessary to implement a simple editor in a secondary dialog with the ability to expand the selection to the next line (or whatever other fundamental object is used to make up text). The string could then be edited and applied by copying the entire page of the document into a new page, replacing the selected elements with your new string. You would probably have to do some mathematics to get this to work well though, as just about everything in PDF is located on the page by means of an affine transform.
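To give an idea of the mathematics involved: a PDF transformation matrix is written as six numbers [a b c d e f], and a point (x, y) maps to (a*x + c*y + e, b*x + d*y + f). A minimal sketch in plain Python follows. The function names are made up for illustration and are not part of PDFNet or iTextSharp.

```python
def apply_matrix(m, x, y):
    """Apply a PDF matrix (a, b, c, d, e, f) to the point (x, y)."""
    a, b, c, d, e, f = m
    return (a * x + c * y + e, b * x + d * y + f)


def concat(m1, m2):
    """Return the single matrix equivalent to applying m1 first, then m2."""
    a1, b1, c1, d1, e1, f1 = m1
    a2, b2, c2, d2, e2, f2 = m2
    return (
        a1 * a2 + b1 * c2,
        a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2,
        c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2,
        e1 * b2 + f1 * d2 + f2,
    )


# Example: scale by 2, then move 100 units right and 50 units up.
scale = (2, 0, 0, 2, 0, 0)
translate = (1, 0, 0, 1, 100, 50)
combined = concat(scale, translate)
point = apply_matrix(combined, 10, 20)
```

Text in a page is usually positioned by several such matrices nested inside each other, so when you replace a string you generally need to reuse the combined matrix of the original text rather than work out a position from scratch.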
Good luck. I'm sure there are people on here with some experience of itextsharp and PDF in general.
This question comes up from time to time on the mailing list. The same answer is given time and time again - NO. See this thread for the official answer from the person who created iText.
This question should be a FAQ on the itextsharp tag wiki.