Embedding Pdf with OpenXml in PowerPoint fails for newer versions


I need to programmatically embed PDF documents in PowerPoint via OpenXml. According to this: Embedding files into Open XML documents using C#, it is possible to use the OLE32.StgCreateStorageEx method to create the necessary picture as well as the OLE object.
Unfortunately, this doesn't work with current Adobe versions. On a 64-bit OS, it seems to work only with Adobe version 9. Higher versions fail with error code 0x8000FFFF, which translates to "Catastrophic failure". I confirmed this by testing. Even version 9 does not work reliably.
As a fallback, I used Google's pdfium to create a PNG from the first page. Unfortunately, this only gets me halfway, because the embedded OLE object is very different from the original one. That does no harm until the user tries to open the embedded document by double-clicking it in PowerPoint. Then an error message appears, saying that the application for the document cannot be found.
Here are my questions:
Does anyone have information on how to improve the procedure so that it works even with newer versions?
Does anybody know what changes are necessary to embed an object similar to the one the PDF application itself creates?
Any hint is highly appreciated.

I finally got it working. Have a look here for an explanation.
Actually, there is only one difference compared to the code in Embedding files into Open XML documents using C#: when calling StgCreateStorageEx, OLE32.STGFMT.STGFMT_DOCFILE has to be used instead of STGFMT_STORAGE.
That makes it work even with newer Adobe versions.
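
For readers who do not have the linked article at hand, here is a minimal sketch of what such a call might look like. The P/Invoke declaration, the class name Ole32Sketch and the helper CreateCompoundFile are my own assumptions and are not taken from the referenced code; only the Win32 function StgCreateStorageEx and its documented constants are real. It is Windows-only.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper; the referenced article's OLE32 class may look different.
public static class Ole32Sketch
{
    // Storage format values as documented for the Win32 STGFMT constants.
    public enum STGFMT : uint
    {
        STGFMT_STORAGE = 0,
        STGFMT_DOCFILE = 5
    }

    // Subset of the Win32 STGM access/creation flags.
    [Flags]
    public enum STGM : uint
    {
        READWRITE = 0x00000002,
        SHARE_EXCLUSIVE = 0x00000010,
        CREATE = 0x00001000
    }

    [DllImport("ole32.dll", CharSet = CharSet.Unicode)]
    private static extern int StgCreateStorageEx(
        string pwcsName,
        STGM grfMode,
        STGFMT stgfmt,
        uint grfAttrs,
        IntPtr pStgOptions,
        IntPtr pSecurityDescriptor,
        ref Guid riid,
        [MarshalAs(UnmanagedType.IUnknown)] out object ppObjectOpen);

    // Creates a compound file at 'path' and returns the IStorage COM object.
    // The caller should release it with Marshal.ReleaseComObject when done.
    public static object CreateCompoundFile(string path)
    {
        Guid iidIStorage = new Guid("0000000B-0000-0000-C000-000000000046");
        object storage;
        int hr = StgCreateStorageEx(
            path,
            STGM.CREATE | STGM.READWRITE | STGM.SHARE_EXCLUSIVE,
            STGFMT.STGFMT_DOCFILE, // the one change: not STGFMT_STORAGE
            0,
            IntPtr.Zero,
            IntPtr.Zero,
            ref iidIStorage,
            out storage);
        Marshal.ThrowExceptionForHR(hr); // throws if hr indicates failure
        return storage;
    }
}
```

The rest of the embedding code (writing the OLE streams and adding the part to the presentation) would stay as in the referenced article.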

Related

C# WPF Anyway to Embed a PDF within an Exe?

Having read up on this issue for a few hours, I am coming round to the idea that it is not possible to embed a PDF file into the Executable itself - that the only way it happens is by having an "associated" folder that you have to pass along with the executable.
However, before I entirely scrap the idea, I thought I would check here to see if anyone knows of a workaround?
I am not looking for the user to open the PDF in a PDF reader, I am using WebBrowser to display the PDF - which works fine - but when compiling into an Executable using Fody - it produces a Folder called "Resources" and if I move the standalone Executable elsewhere (despite being almost exactly the size of the PDF larger than a compiled executable without the PDF) it says it cannot find the PDF.
I was wondering whether the workaround is to have a URI that points to a different location than the Resources folder, but I don't know how to do that and have not been able to find a way of doing it.
The PDF is a static document - instructions - that is unchanging.
I have read a LOT of responses on here - it is what I have been doing for the past few hours - but none of them really answer this question.
Can you embed a PDF for display in a WPF application and have a standalone executable?
Steps I have taken:
Bring the PDF into the WPF C# .NET Framework 4.7.2 application as a Resource using the Resource.resx system.
Marking it as Embedded Resource/Resource/Content (I tried all 3)
Marking it as Copy Always/Never/If Newer (I tried all 3)
Code Block
XAML
<WebBrowser x:Name="pdfWebviewer"/>
Code Behind
pdfWebviewer.Navigate("about:blank");
pdfWebviewer.Navigate("pack://application:,,,/Test.pdf");
This code is from memory so forgive me if there is an error in capitalisation or something.
This code will display the PDF - but only if you have the Resources folder in the same location as the Executable - and if I am going to do that, I might as well just give people the PDF...
If anyone knows a workaround, I would appreciate it.
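
One possible workaround, offered only as a sketch and not tested against the poster's project: mark the PDF as Embedded Resource, copy it to a temporary file at startup, and point the WebBrowser at that file. The class name EmbeddedPdf, the resource name "MyApp.Test.pdf" and the output file name are assumptions; the manifest resource name normally consists of the project's default namespace, any folder names, and the file name.

```csharp
using System;
using System.IO;
using System.Reflection;

// Hypothetical helper; names are illustrative only.
public static class EmbeddedPdf
{
    // Copies an embedded resource to the temp folder and returns the file path.
    public static string ExtractToTemp(Assembly assembly, string resourceName, string fileName)
    {
        using (Stream source = assembly.GetManifestResourceStream(resourceName))
        {
            // GetManifestResourceStream returns null when the name does not match.
            if (source == null)
                throw new FileNotFoundException("Embedded resource not found: " + resourceName);

            string path = Path.Combine(Path.GetTempPath(), fileName);
            using (FileStream target = File.Create(path))
            {
                source.CopyTo(target);
            }
            return path;
        }
    }
}

// Usage in the window's code-behind (resource name is an assumption):
// string pdfPath = EmbeddedPdf.ExtractToTemp(
//     Assembly.GetExecutingAssembly(), "MyApp.Test.pdf", "Test.pdf");
// pdfWebviewer.Navigate(new Uri(pdfPath));
```

With this approach the PDF travels inside the executable, and the Resources folder no longer has to be shipped alongside it.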

PostScript - Error when Using Ghostscript "pdfwrite"

I want to preface this with the understanding that I am working with legacy code, so I have to live with less-than-ideal situations and am doing some quirky stuff because of that. Until I can get approval to rewrite, I will have to make do.
Context
Here is my situation. The application is a "simple" one in that it reports off of a SQL database. For better or for worse, it builds its reports with PostScript. It makes use of Ghostscript DLLs, which are embedded in the application directory. Here is the kicker: it has been requested that I include SSIS reports whose output is already in PDF format. For compatibility's sake, I need to convert these PDFs into PostScript, even though in most situations they will be converted right back to PDF later on. I know this is most likely bad design, but there is certain functionality that requires this, and it just is what it is for the time being. I am using Ghostscript to handle the conversions.
Observed Behavior
The following behavior is what is observed once the PDF is converted to PS, passed through the application, and then converted back to PDF.
When using "sDevice=pswrite", everything works, except that the reports are compiled with poor resolution regardless of how I tweak the resolution option.
When leveraging "sDevice=ps2write" which I understand to be the current accepted protocol, the PDF will not render back and produces the following error.
ERROR:
undefined
OFFENDING COMMAND:
U1!‘WVt92\a
STACK:
--nostringval--
20
The above error is only produced when using a report from a report server that is accessed via web client. I can confirm that the PDF returns successfully and is not corrupt.
When running local SSIS packages on the application the produced PDF is able to be handled successfully.
When the original PDF is converted to PS using ps2write, the comments are populated as follows:
%!PS-Adobe-3.0
%%BoundingBox: 0 0 612 792
%%Creator: GPL Ghostscript 905 (ps2write)
%%LanguageLevel: 2
%%CreationDate: D:20171003154139-05'00'
%%Pages: 3
%%EndComments
pswrite produces:
%!PS-Adobe-3.0
%%Pages: (atend)
%%BoundingBox: 21 30 761 576
%%HiResBoundingBox: 21.600000 30.400000 760.566016 575.100000
%.....................................
%%Creator: GPL Ghostscript 905 (pswrite)
%%CreationDate: 2017/10/03 15:53:40
%%DocumentData: Clean7Bit
%%LanguageLevel: 2
%%EndComments
%%BeginProlog
Suspicion
I suspect that the PDF is either in a standard that cannot be converted to PostScript (for example, a newer PDF version that can't be handled) or contains something incompatible, such as a font or an image.
Is there any way to hunt this down for sure? Has anyone come across similar situations, and what was the solution? Any pointers as to what to look into or things to try?

To be honest, nobody is likely going to be able to help without seeing the original PDF file. Even a dummy file will be fine provided it exhibits the error.
However, the first thing that springs to mind is that you appear to be using Ghostscript 9.05. That is now 5 years old; the current release is (about to be) 9.22. There have been numerous fixes to ps2write in that time, at least 50, and the first thing I would suggest you do is upgrade and see if the problem goes away.
Secondly, you haven't been clear on why you need to convert the PDF files to PostScript. If all you are doing is feeding those back through Ghostscript along with some additional PostScript in order to convert the assemblage into PDF, you do not need to turn the PDF files into PostScript first. Ghostscript is entirely capable of taking a mixture of PDF and PostScript files, so you can simply inject the PDF in between the PostScript from your SQL output to produce a single combined PDF.
This has a number of advantages. First, and most obviously, you shouldn't get your conversion problem. Secondly, any construct in the PDF file which cannot be represented in PostScript (e.g. transparency) means that the content will be rendered to an image and the PostScript will simply contain a big bitmap, just like the pswrite output. Avoiding conversion means that won't happen. Thirdly, it will be quicker than first converting all the PDF files to PostScript.
If you absolutely can't do that, then I would try current code and see if it's better. If not, then you have found a bug, and I would suggest you report it at https://bugs.ghostscript.com. You will need to be able to supply an example file and command line, though.
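
To illustrate the mixed-input approach described above, here is a rough sketch that builds a Ghostscript command line with PostScript and PDF inputs in order and runs it. It assumes the command-line executable is available (for example gswin64c.exe on 64-bit Windows), whereas the application in the question calls the Ghostscript DLLs directly; the class and method names are hypothetical. The switches -dBATCH, -dNOPAUSE, -sDEVICE=pdfwrite and -sOutputFile= are standard Ghostscript options.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Hypothetical helper; not part of Ghostscript or of the original application.
public static class GhostscriptMerge
{
    // Builds the argument string: fixed switches, output file, then inputs in order.
    public static string BuildArguments(string outputPdf, IEnumerable<string> inputs)
    {
        var args = new List<string>
        {
            "-dBATCH",
            "-dNOPAUSE",
            "-sDEVICE=pdfwrite",
            Quote("-sOutputFile=" + outputPdf)
        };
        // Inputs may be a mixture of .ps and .pdf files.
        args.AddRange(inputs.Select(Quote));
        return string.Join(" ", args);
    }

    // Wraps an argument in double quotes so paths with spaces survive.
    private static string Quote(string value)
    {
        return "\"" + value + "\"";
    }

    // Runs Ghostscript and returns its exit code (0 normally means success).
    public static int Run(string gsExecutable, string outputPdf, IEnumerable<string> inputs)
    {
        var info = new ProcessStartInfo(gsExecutable, BuildArguments(outputPdf, inputs))
        {
            UseShellExecute = false,
            CreateNoWindow = true
        };
        using (Process process = Process.Start(info))
        {
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}

// Usage (file names are placeholders):
// GhostscriptMerge.Run("gswin64c.exe", "combined.pdf",
//     new[] { "header.ps", "ssis-report.pdf", "footer.ps" });
```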

.NET graphic libraries to display images (pdf, .docx and any other format of image) in the browser

I am developing an ASP.NET MVC application where users are able to upload files to a repository. Those files could be PDF, DOC, any type of image, and so on.
When the user selects a file to be imported, I would like to display this file in the browser so they can review its contents before the upload.
I know I could use some sort of IFrame to display PDF, but I am looking for some specific class or .NET libraries to implement this feature.
I just need a pointer in the right direction.

This is an extremely difficult problem. There are some libraries that can help. For instance PDF files might be rendered to images with ghostscript. Word and Excel files might be converted to PDF or image with a number of libraries. None of them, AFAIK, are very good at it so I can not recommend one.
You could automate MSO to perform the conversion to PDF, but that is decidedly not safe for server code. Another possibility is to convert source documents to SWF files (like FlexPaper) and display them in Flash. There are some great libraries out there, but it will limit your supported clients. SharePoint has support for providing some of this capability as well. Others have used OpenOffice to convert MSO documents, but also at a loss of quality.
I can't really advise any specific direction as it is highly dependent on what you/your company is willing to spend and the desired results. Good luck.

You could try to rely on Windows and the explorer thumbnails for it, like here, but then you'd have to make sure that:
You can abuse the server in the most elaborate way (install stuff, talk to the shell from ASP.NET)
You have a thumbnail provider installed on the server for every type that you want to preview. I guess from the moment you can see the thumbnail in explorer, you're set. So for pdf, you might need to install PDF Reader from Adobe.
Docx files should be saved with thumbnail checked (see link). There seems to be no other easy, free way to convert a docx to a thumbnail. The "best" solution I came across, was saving it automatically again somehow, and making sure the thumbnail option is checked.

I don't want to say that's impossible, but it can't be done with finite effort.
What you are asking for is a browser-based solution, because you want the user to be able to "review" the document before uploading.
Therefore you cannot use a server side solution, which is essentially what is being asked by referring to a ".Net library".
.Net libraries are dependent on an installed version of .Net, which does not exist in all versions for all operating systems for which graphical browsers exist.
Next, recent changes in browser security do not allow reading the full client-side file name of the selected file in the input field.
You'd have to rely on HTML5 and its FileReader to access the file's byte stream, but even then you can only retrieve an image from image files. (see sample)
Excluding browser-based solutions in Flash, ActiveX, Java, due to browser and platform support, this leaves JavaScript as the only "reasonable" solution: you'd need a library for each supported format to either convert a file into an image in an image format supported by browsers, or extract the text(+image) representation of a file.

Great answers... I just want to share the result of my research: I found a nice client-side solution supported by Mozilla Labs. It is a framework based on HTML5 and JavaScript, with no native code needed.
Here is the project website:
https://github.com/mozilla/pdf.js
This is what it is capable of:
http://mozilla.github.com/pdf.js/web/viewer.html
And lastly, a great video explaining how everything works:
http://www.youtube.com/watch?v=Iv15UY-4Fg8&noredirect=1
Regarding my question, we are going to convert every possible file to PDF on the server and then render this PDF using this framework.

C# free Doc 2 PDF solution

Could anyone suggest a free solution to programmatically convert Office documents (mostly .doc) to PDF, in the form of a .NET library or a command-line application I can call from my program? Thanks.
PS: I know I can use SaveAs PDF in newer versions of Office, but some of the clients where the program will run still have older versions of Office.

Won't Ghostscript (Ghostscript website) do that for you? Otherwise, I think, with some reservation, that PDFsharp might do it. If these won't do, I hope that this one will: PDFCreate. In fact, after a closer look, if Ghostscript won't do, I would consider trying PDFCreate, as it provides some sample code on the website I linked for it.
You might also want to consult Wikipedia on the topic: List of PDF software

You can maybe use something like PrimoPDF which basically installs a printer that when you print to it, creates a PDF document. I've never actually called it command line but since it's just another printer, any standard print code would work.
Cody

How to highlight text in Pdf Winforms C#

I have a PDF file which I want to open in a Windows Forms application and perform the following tasks:
View the pdf document
Zoom +/- document
Search Text
Highlight a specific text
Show it in a listbox/dropdown
Select those words and highlight them in the PDF
Remove selection/Highlight.
I have tried using certain libraries like PDFsharp and iTextSharp, and even the Acrobat Reader OCX control.
It's really bugging me. Is there any help?

I'd suggest looking at some means of converting the PDF if you don't have a direct need to edit it. Even then, it may be easier to convert to a different form, make changes, and then convert back. PDF is derived from PostScript, which makes it powerful but also makes it a mess to deal with, and my personal preference is to skip that headache. That is not always avoidable (I had a lot of fun creating Thai support in PDF print-at-home ticket creation once without bloating the document beyond unusable), but it is highly recommended where possible.
Anyways, there are a variety of PDF conversion libraries out there, some of which may be available for .NET. Worst case, you may need to create a managed C++ layer to allow your C# code to access them.

Doesn't the Acrobat Reader OCX already have all those features? What exactly doesn't the OCX do that you need to do in your code?
You might try contacting Adobe and getting their full SDK for PDF. It might have controls which you can use to solve your problem.
Come to think of it, is there even an SDK for PDF from Adobe?

You have not mentioned whether you prefer a free or a commercial PDF viewer. If you are open to using a commercial PDF viewer, you may evaluate the Syncfusion PDF Viewer control, Telerik PDF Viewer, DynamicPDF Viewer, or TallComponents. I have checked their feature sets, and all seem to have the features you are looking for. I do not represent or promote any of these SDKs. I have used TallComponents and DynamicPDF for PDF manipulation, and both have excellent support; I would call them PDF veterans in the .NET space.
