Background:
I have PDFs that I am generating programmatically. I need to be able to send each PDF directly to a printer from the server (not through an intermediate application). At the moment I can do all of the above (generate the PDF, send it to the printer), but because the fonts aren't embedded in the PDF, the printer is doing font substitution.
Why the fonts aren't embedded when generated:
I am creating the PDFs using SQL Reporting Services 2008. There is a known issue with SQL Reporting Services in that it will not embed fonts unless a series of requirements are met (http://technet.microsoft.com/en-us/library/ms159713%28SQL.100%29.aspx). Don't ask me why, but the PDF meets all of Microsoft's listed requirements and the fonts still show up as not embedded. There is no real control over whether the fonts are embedded, so I have accepted that this isn't working and moved on. The workaround Microsoft suggests (same link, under 'When will Reporting Services do font embedding') is to post-process the PDF and embed the fonts manually.
Goal
Take an already generated PDF document, programmatically 'open' it and embed the fonts, resave the PDF.
Approach
I was pointed towards iTextSharp, but most of the examples are for the Java version and I'm having trouble translating to the iTextSharp version (I can't find any documentation for iTextSharp).
I am working from this post for what I need to do: Itext embed font in a PDF.
However, for the life of me, I cannot seem to use the ByteArrayOutputStream object. The compiler can't find it. I've researched and researched, but nobody seems to say what class it's in or where to find it so that I can add it to the using statements. I've even cracked open Reflector and can't find it anywhere.
This is what I have so far and it compiles etc. etc.
(result is my byte[] of the generated PDF).
PdfReader pdf = new PdfReader(result);
BaseFont unicode = BaseFont.CreateFont("Georgia", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
// the next line doesn't work as I need a ByteArrayOutputStream variable to pass in
PdfStamper stamper = new PdfStamper(pdf, MISSINGBYTEARRAYOUTPUTSTREAMVARIABLE);
stamper.AcroFields.SetFieldProperty("test", "textfont", unicode, null);
stamper.Close();
pdf.Close();
So can anybody either help me with using iTextSharp to embed fonts into a PDF or point me in the right direction?
I'm more than happy to use any other solutions other than iTextSharp to complete this goal, but it needs to be free and able to be used by a business for an internal application (i.e. Affero GPL).
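On the ByteArrayOutputStream part specifically: that is a Java class (java.io.ByteArrayOutputStream), which is why it cannot be found in any iTextSharp namespace. The usual .NET counterpart is System.IO.MemoryStream, and PdfStamper accepts any Stream. Below is a minimal sketch, assuming iTextSharp 5.x; the names StamperSketch and Restamp are made up for illustration. It only addresses the missing stream and makes no claim that the SetFieldProperty call embeds fonts for ordinary page text, since that call targets a form field.

```csharp
using System.IO;
using iTextSharp.text.pdf;

public static class StamperSketch
{
    // Hypothetical helper: passes the PDF bytes through PdfStamper
    // and returns the rewritten document as a new byte array.
    public static byte[] Restamp(byte[] result)
    {
        PdfReader pdf = new PdfReader(result);
        using (MemoryStream output = new MemoryStream())
        {
            // MemoryStream stands in for Java's ByteArrayOutputStream.
            PdfStamper stamper = new PdfStamper(pdf, output);

            // ... the font / field changes from the question would go here ...

            stamper.Close(); // writes the finished PDF into 'output'
            pdf.Close();

            // ToArray() still works after the stream has been closed.
            return output.ToArray();
        }
    }
}
```

The returned bytes can then be sent to the printer in place of the original result array.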
This may not be the answer you are looking for (since you want to get your problems solved programmatically, not by an external tool).
But you can use the Ghostscript command line to retroactively embed missing fonts in PDFs that were created without them:
gs \
-sFONTPATH=/path/to/fonts:/another/dir/with/more/fonts \
-o output-pdf-with-embedded-fonts.pdf \
-sDEVICE=pdfwrite \
-dPDFSETTINGS=/prepress \
input-pdf-where-some-fonts-are-not-embedded.pdf
One important thing is that the missing fonts must all be available in one of the directories listed in the -sFONTPATH=... switch.
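Since the question asks for something that runs on the server without manual steps, the same command could be started from C# code. This is a rough sketch, not a tested integration: GhostscriptEmbed, BuildArgs and Run are hypothetical names, the executable name is an assumption ("gs" on Linux/macOS, typically gswin64c.exe on Windows), and ProcessStartInfo.ArgumentList assumes .NET Core 2.1 or later (on .NET Framework you would build a quoted Arguments string instead).

```csharp
using System.Collections.Generic;
using System.Diagnostics;

public static class GhostscriptEmbed
{
    // Builds the same switches as the command line shown above.
    public static List<string> BuildArgs(string fontPath, string input, string output)
    {
        return new List<string>
        {
            "-sFONTPATH=" + fontPath,   // directories containing the missing fonts
            "-o", output,
            "-sDEVICE=pdfwrite",
            "-dPDFSETTINGS=/prepress",
            input
        };
    }

    // Runs Ghostscript and returns its exit code (0 normally means success).
    // gsExe is an assumption: "gs" or, on Windows, something like "gswin64c.exe".
    public static int Run(string gsExe, string fontPath, string input, string output)
    {
        var psi = new ProcessStartInfo { FileName = gsExe, UseShellExecute = false };
        foreach (string arg in BuildArgs(fontPath, input, output))
        {
            psi.ArgumentList.Add(arg); // each entry is passed as one argument, no manual quoting
        }

        using (Process process = Process.Start(psi))
        {
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

A call might look like GhostscriptEmbed.Run("gs", "/path/to/fonts", "input.pdf", "output.pdf"). Several font directories go into one fontPath string, separated by ":" on Unix-like systems and ";" on Windows.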
Besides Ghostscript, it is also possible to use Poppler and Cairo. There is a command pdftocairo from Poppler that converts PDF to PDF via pdftocairo -pdf input.pdf output.pdf. It also considers font substitutions set in a Fontconfig configuration file. This is very helpful if you do not have all fonts on your system that are referenced in a PDF file, but know which other font you have installed is a good-looking replacement. After processing, the substitution font is embedded.
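To confirm that either tool actually embedded the fonts, Poppler also ships a pdffonts command that lists each font with an "emb" column of yes/no. The sketch below parses that listing once it has been captured as text. It assumes the usual column layout (name, type, encoding, emb, sub, uni, object ID) and font names without spaces; PdfFontsReport and NotEmbedded are names invented for this example.

```csharp
using System;
using System.Collections.Generic;

public static class PdfFontsReport
{
    // Returns the names of fonts whose "emb" column reads "no"
    // in the text printed by `pdffonts file.pdf`.
    public static List<string> NotEmbedded(string pdffontsOutput)
    {
        var missing = new List<string>();
        string[] lines = pdffontsOutput.Split(
            new[] { '\n' }, StringSplitOptions.RemoveEmptyEntries);

        // Lines 0 and 1 are the column header and a row of dashes.
        for (int i = 2; i < lines.Length; i++)
        {
            // A null separator splits on any whitespace.
            string[] tokens = lines[i].Split(
                (char[])null, StringSplitOptions.RemoveEmptyEntries);
            if (tokens.Length < 8) continue; // not a font row

            // The type column may contain spaces ("CID TrueType"), so count
            // from the right: generation, object number, uni, sub, emb.
            string emb = tokens[tokens.Length - 5];
            if (emb == "no") missing.Add(tokens[0]);
        }
        return missing;
    }
}
```

An empty list means every font in the listing is reported as embedded.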
I had this problem on a Mac with a PDF I was submitting to IEEE. Using Adobe Reader and Preview, I was able to get around this. I think any pdf printer might work in place of Preview if you are on a PC.
Here are the steps I took. You can individually fix each figure, or fix the whole document.
1. Open the PDF file using Adobe Reader.
2. Right-click on the image and click “Document Properties.”
3. Click “Fonts.” Check whether the font is embedded: a font that isn’t will show just “Courier” or another font name, without “(Embedded)” after it.
4. If your PDF isn’t a standard page size, click on “Description” and look at the page size. Write this down. Ex. 19.4 x 5.22 in.
5. Open the PDF in Preview. Go to File->Print. If the PDF isn’t a standard page size, click on Paper Size and choose custom. You will need to create a custom page size equal to the one you wrote down in step 4. Don’t forget to set the margins to 0 on all sides. After doing that, set the scale in the print dialog to 100%.
6. In the lower left of the print dialog (in Preview on a Mac), click “PDF” to print the PDF to a new PDF. Select the destination and print.
7. Open the new PDF in Adobe Reader and verify that the fonts are now embedded.
I hope this helps.
I had this problem today with an existing PDF I uploaded to lulu.com to make a printed copy. It was rejected for not having all fonts embedded.
I found that if I opened it in Acrobat X and saved it as a PostScript (.ps) file, then double-clicking that .ps file in File Explorer opened it in Acrobat X Distiller, which automatically created a new PDF file with all fonts embedded!
Naturally this would mean you must have all the fonts needed on your computer. Otherwise a program like InFix can make font substitutions.
Related
I need to create a form in my C# project that just allows the user to view a PDF.
I have a way to open the PDF and read it, but I need to disable features like printing, saving, highlighting and copy/pasting while keeping the ability to search in the document.
Users should really just be able to open the document, read it, search for words in it, and close it.
Any help would be great.
Thanks in advance.
You could use Ghostscript to convert PDF to images and then show the images on your form or you could rasterize your PDF directly to the screen.
To use Ghostscript from .NET you can take a look at the Ghostscript.NET library (managed wrapper around the Ghostscript library).
Ghostscript Viewer C# sample that rasterizes PDF directly to the screen can be found here: https://github.com/jhabjan/Ghostscript.NET/tree/master/Ghostscript.NET.Viewer
To search for text inside the PDF you can use iTextSharp.
(Disclaimer: I worked on this component at Software Siglo XXI.)
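As a rough illustration of the iTextSharp part, the sketch below extracts the text of each page and reports which pages contain a search term. It assumes iTextSharp 5.x (the PdfTextExtractor class in iTextSharp.text.pdf.parser); PdfSearchSketch and PagesContaining are made-up names. It only yields page numbers, so marking the hit on a rasterized page image would need additional work.

```csharp
using System;
using System.Collections.Generic;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

public static class PdfSearchSketch
{
    // Hypothetical helper: returns the 1-based numbers of pages whose
    // extracted text contains 'term' (case-insensitive).
    public static List<int> PagesContaining(string pdfPath, string term)
    {
        var hits = new List<int>();
        PdfReader reader = new PdfReader(pdfPath);
        try
        {
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                string text = PdfTextExtractor.GetTextFromPage(reader, page);
                if (text.IndexOf(term, StringComparison.OrdinalIgnoreCase) >= 0)
                {
                    hits.Add(page);
                }
            }
        }
        finally
        {
            reader.Close();
        }
        return hits;
    }
}
```

The viewer form could then jump to the first returned page when the user searches.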
If you don't want to deal with the Ghostscript API and need a quick working solution to display the documents, you could use ImageZoom Viewer .NET. It's available for both 32-bit and 64-bit, and is very cheap and effective. I'd recommend trying it since it's polished and fast. You can browse, scroll and print the pages from the viewer.
You can take a look here: http://softwaresigloxxi.com/ImageZoom.html
This is for quick browsing and reading. When text operations are needed, you could let the user open the PDF in Adobe Reader from there.
I'm building a Word document in OpenXML with C#.
One of the fonts I must use is a custom-made branded font. This font will not be available on customer machines.
Is it possible to embed a font file within the .docx file and reference that font in font styles? If yes, how can this be done with the C# SDK?
So far that does not seem to be possible, but I might have missed the documentation page somewhere.
P.S. I already have a PDF with embedded fonts. Now I need a Word document that looks the same.
Sounds like what you need is a .pdf. So unless it absolutely must be a .docx I think that's your best option.
Help on generating a .pdf in C# can be found here.
How can I embed a Word document into another Word document via the OpenXML SDK so that it shows the content, not a Word icon? That is, the same result we get manually in Word with Insert object from file, WITHOUT checking "Display as icon".
I've found this article, but it uses an icon. I've also tried the OpenXML SDK Productivity Tool, but it shows only generated binary data.
EDITED:
I use the following code:
DrawAspect = OleDrawAspectValues.Content
and then I add the image part:
var imagePart = mainDocumentPart.AddNewPart<ImagePart>("image/x-emf", imagePartId);
GenerateImagePart(imagePart);
But my image part is just the byte array of Word's icon.
So the following happens: when I open the generated document, it shows the embedded document as an icon, but when I double-click the embedded document, edit it and save the changes, the embedded document is shown as content. So maybe it's possible in some way to show this content without editing the embedded document? Should I use the bytes of a screenshot of the document instead of the bytes of Word's icon?
I'm not sure I described this clearly, so please ask.
I'm afraid what you are asking for is almost impossible.
The only difference as far as the word file is concerned between the icon and the embedded file, is the image.
When you don't use an icon, Word pretty much just takes a screenshot of the document you are embedding and inserts that in place of the icon graphic.
I've uploaded an example I grabbed from a Word file I made. Found this little gem in the /media folder inside the .docx file.
So basically, if you can't live with the icon, your only choice is to somehow grab a picture of the Word file you want to embed and insert that instead of the icon image.
How you'd go about that can't be pretty. First of all the open xml sdk contains no such functionality. I tried playing a bit around with office interop as well, but no luck.
I only see two possible ways to achieve this.
The first one is via Interop. You'll need to install a "pretend printer" like the ones that print to PDF instead of sending the job to a physical printer. This one, however, needs to print to an image format. The file in the media folder was an .emf, but I'm not positive that's a requirement.
Anyway, should the above somehow be possible, you could embed that picture, pretty much using the example you linked from Microsoft, and just change the size of the "icon", which would now be an image of the document.
The second possibility would be to open the Word document as a process, set the zoom to 72% (or whatever makes the document the only thing visible on your desktop), then grab a screenshot, crop it down to just the document, and use that as your image for the embedding.
For the record, I don't recommend you do any of the above, but those are the only options I see.
Should someone have a better solution to this I'm all ears.
Finally, should you decide that you want to push on with this, I'll be happy to code up an example of option number 2 if you reply and tell me you'd like that.
Kaspar
There is a nice wrapper API (Document Builder 2.2) around open xml specially designed to merge documents, with flexibility of choosing the paragraphs to merge etc. You can download it from here.
Using this tool you can embed a paragraph of another word document or entire word document as per your requirement.
The documentation and screen casts on how to use it are here.
Hope this helps.
I am looking for a free open source .Net based (prefer C#) pdf driver. Any idea where I can download one?
Pdf Creator
PDFCreator easily creates PDFs from any Windows program. Use it like a printer in Word, StarCalc or any other Windows application.
If you need to use created PDF file inside of your C# application, then the easiest way is to generate PDF inside this application. Then you don't need to monitor a temp folder for a new file created.
To generate PDF inside of your application you will need a PDF-generating library for C#.
For example, PDFFlow library. You can generate all the elements of PDF document (text element, paragraph, image, inline image, line, table, page number, header/footer, etc...), so you can construct "any kind of document", as you said.
Hope this idea helps.
I have a PDF file which I want to open in a Windows Forms application and perform the following tasks:
View the pdf document
Zoom +/- document
Search Text
Highlight a specific text
Show it in a listbox/dropdown
select those words and highlight in pdf
Remove selection/Highlight.
I have tried several libraries such as pdfSharp and iTextSharp, and even the Acrobat Reader OCX control.
It's really bugging me. Is there any help?
I'd suggest looking at some means of converting the PDF if you don't have a direct need to edit it. Even then, it may be easier to convert to a different format, make changes, and then convert back. PDF is derived from PostScript, which makes it powerful but also a mess to deal with, and my personal preference is to skip that headache. It's not always avoidable (I once had a lot of fun adding Thai support to PDF print-at-home ticket creation without bloating the document beyond usability), but highly recommended where possible.
Anyways, there are a variety of PDF conversion libraries out there, some of which may be available for .NET. Worst case, you may need to create a managed C++ layer to allow your C# code to access them.
Doesn't the Acrobat Reader OCX already have all those features? What exactly doesn't the OCX do that you need to do in your code?
You might try contacting Adobe and getting their full SDK for PDF. It might have controls which you can use to solve your problem.
Come to think of it, is there even an SDK for PDF from Adobe?
You have not mentioned whether you prefer a free or a commercial PDF viewer. If you are open to a commercial PDF viewer, you could evaluate the SyncFusion PDF Viewer control, Telerik PDF Viewer, Dynamic PDF Viewer or TallComponents. I have checked the feature sets and all seem to have the features you are looking for. I do not represent or promote any of these SDKs. I have used TallComponents and Dynamic PDF for PDF manipulation and both have excellent support; I would call them PDF veterans in the .NET space.