I have read a lot of questions about converting DOC files to PDF, but none of the answers solves my problem.
I tried ASPOSE, which is really good for what we want, but it is really expensive and my boss doesn't want to spend a lot of money.
I need to open a DOCX file, manipulate it and save it as PDF. My boss doesn't want the system to save the file as DOCX first and then convert it to PDF.
Does anyone have a simple solution for this?
Thank you in advance.
PS: We have the abcpdf and asppdf components, but I didn't find any documentation about opening a PDF file and saving it as DOC.
If your boss wants to open a .DOC and save as .PDF then maybe Word or Word automation will help.
Word 2007 (with the Save as PDF add-in or Service Pack 2) and later versions can save documents as PDF.
EDIT
Here are some links to sample code:
How do I convert Word files to PDF programmatically? (see accepted answer)
Word Doc to PDF Conversion. Command line using VBScript and automation
You can use iTextSharp to read and manipulate the content, then use the Open XML SDK to create a Word document from what you read.
Open XML SDK:
http://openxmldeveloper.org/
Related
I want to convert a PDF (with images) to DOC using C#.
I am looking for a free solution. I have tried a lot, but I haven't found one yet.
I would be happy if you could give me a simple, working and free way.
Thanks.
Same post here:
Convert a pdf file to word document
You can try the solution I provided there, but note that the free version is limited to 10 pages per PDF file, and only the first 3 pages are converted when converting PDF files to Word.
My objective is to make an automated server-side process to turn a .ppt into a .pdf. Microsoft themselves suggested that I use OpenXML, and now I'm looking at that.
My question is:
Can I actually achieve my objective using OpenXML?
I'm having a hard time finding the methods that I'd expect, such as "save as" here
Or perhaps I'm just misunderstanding how it all works?
... to turn a .ppt into a .pdf. Microsoft themselves suggested that I use OpenXML ... Can I actually achieve my objective using OpenXML?
For the conversion of a .ppt into .pdf? I'm curious to see where you have read this ;-)
No, it's simply not possible using the Open XML SDK:
The Open XML SDK lets you create and modify Open XML documents (.pptx in the case of PowerPoint), and here you are talking about .ppt (the legacy binary format, not Open XML).
There is no method for converting to PDF. The Open XML SDK lets you retrieve, create and modify the content of a document without an Office application, but it does not contain any methods to render it, nor Office application methods such as SaveAs().
A common way to convert Office documents to PDF is to use Office.Interop.
The thread "How do I convert Word files to PDF programmatically?" is about Word, but it can help you; the approach is the same with PowerPoint.
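The .ppt versus .pptx distinction above can be checked without Office by looking at the first bytes of a file. This is a minimal Python sketch (the same check ports directly to C#), assuming the file extension cannot be trusted: legacy binary Office files are OLE compound files, while Open XML files are ZIP packages. The function name `office_container` is hypothetical, and the check identifies only the container, not whether the content really is a presentation.

```python
# Signatures of the two container formats used by Office files.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # legacy .doc/.xls/.ppt
ZIP_MAGIC = b"PK\x03\x04"                       # Open XML .docx/.xlsx/.pptx


def office_container(path):
    """Return 'ole2', 'zip', or 'unknown' based on the file's leading bytes.

    Hypothetical helper: it inspects only the container signature.
    """
    with open(path, "rb") as handle:
        head = handle.read(8)
    if head.startswith(OLE2_MAGIC):
        return "ole2"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"
```

A result of "ole2" for a presentation means the Open XML SDK cannot open it as-is; it would first have to be saved as .pptx by something that understands the binary format.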
In my program I generate some reports as FlowDocument objects and display them with a DocumentViewer control.
Now I need to add more export options. I use iTextSharp to export to PDF, and I can save to XPS natively. Can I save a document directly to any Office format, such as DOC or XLS? Or maybe someone knows of a good library for converting from PDF/XPS to DOC or XLS?
I found a solution. As I can't export to DOC from WPF automatically, I reproduced my page layout with the DocX library. It is a really awesome and simple library that doesn't require MS Office to be installed to create Word 2007/2010 files.
I'm not sure if you're still looking for an answer, so I'll be brief. You can use the Microsoft Interop assemblies to create Word documents. It's not a trivial task, but in my opinion, it's easier than using iTextSharp. They come with Visual Studio.
To create XPS documents, you'll need to generate FixedDocument objects from your FlowDocument, but from there, it's only a few lines of code. Eric Sink has a nice article that you can find here. This is also mentioned in this question posted here.
I need to convert PDF files into .doc files using C#. The computer has no file system though it doesn't have Office installed. Any good ideas how I can approach this? I did some research and most people use the interop services.
You need to understand that PDF is not really implemented as a single document format.
If your PDF docs are created by rendering text to a PDF file, then direct PDF conversion is not only possible, but can be very good (reliable).
If the source of your PDF is either a scanner or fax (essentially a scanner...) then what you have is a document with a "picture" of text. This scenario is more difficult to deal with. If you open up the markup for this, there is no "text" to be converted. In this situation you have to deal with some manner of OCR (optical character recognition), which is less reliable due to a variety of issues.
If you have the option of intercepting the data before it is rendered to PDF (say like in SSRS or Crystal) then it would be better for you to bypass the PDF stage and move your data to a Word document.
If you are constrained to receiving faxes and then needing to interpret their content, prepare for OCR hell. It has been a while since I was there, so I hope that it has gotten better.
Even without Office installed on your machine, you have access (with Visual Studio) to the Office developer toolkit, which will allow you to build documents to be distributed in the Word formats (.doc/.docx).
An option/idea may be to convert the PDF to HTML, which can be opened in Word?
Use aspose pdf kit to convert the PDF to text, and then convert the text to DOC using a FileStream or aspose doc.
I want to read tables which are in a PDF document and I want to store these values in a Database.
What I have found so far through searching the web:
Read text from the PDF using abcpdf .net, which is available as freeware. But it's not the right solution because I want to read the tables.
Convert the PDF document into Excel/Word. Tables come through in the target document as they are. Word conversion is possible using EasyPDF Converter, a third-party tool that is much cheaper than the other tools that convert PDF into Excel.
But I am looking for any other solution/API classes which can convert PDF into Excel.
There are 2 possible solutions:
a) Cometdocs offers a free online conversion from PDF to XLS that is surprisingly good, and it sends the result file to your email.
b) Cognview is commercial shareware that converts PDF to XLS. There is an OCR version and a text version. I haven't used it personally, but it has good recommendations.
If you are looking to upload your data into a database, converting your PDFs to CSV is probably the safest option. The PDFTables API will allow you to do this with C#, converting as many PDFs at once as necessary. https://pdftables.com/pdf-to-excel-api#csharp
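Once the tables are in CSV, loading them into a database is mostly mechanical. This is a minimal Python sketch, assuming the CSV has a header row and using SQLite from the standard library as a stand-in for whatever database is actually in use. The function name `load_csv_into_table` and the table name are hypothetical, every column is stored as text, and rows whose length does not match the header are skipped rather than repaired.

```python
import csv
import io
import sqlite3


def quote_identifier(name):
    """Quote a table or column name so spaces and quotes are accepted."""
    return '"' + name.replace('"', '""') + '"'


def load_csv_into_table(conn, csv_text, table):
    """Create `table` from the CSV header row and insert the data rows.

    Returns the number of rows inserted. Hypothetical helper for illustration.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return 0
    header = rows[0]
    # Keep only rows with the same number of fields as the header.
    data = [row for row in rows[1:] if len(row) == len(header)]
    columns = ", ".join(quote_identifier(name) + " TEXT" for name in header)
    conn.execute(f"CREATE TABLE {quote_identifier(table)} ({columns})")
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(
        f"INSERT INTO {quote_identifier(table)} VALUES ({placeholders})", data
    )
    conn.commit()
    return len(data)
```

Called with an open connection, for example `load_csv_into_table(sqlite3.connect(":memory:"), csv_text, "parts")`, it creates the table and returns the number of rows it inserted.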
You can try Quablo, a PDF table extractor available at this web page.