My program takes a Word document and manipulates it with DocX, the open-source library hosted on CodePlex. That works great.
Now I need to print it. I've searched for a few hours and haven't found a good way to print the PDF version of the file. I even tried driving AcroRd32.exe, but it's clunky and not really usable for a serious application.
I do have it printing with Word Interop, but that ties me to a specific version of Word: the one installed on my machine. Customers on older versions of Word can't run it, and the other devs can't compile unless they are on Word 2010.
I need a way to print either a PDF or a Word document (2003 or later) seamlessly, without a prompt for each document like the one Acrobat Reader shows.
Anyone have any suggestions?
Thanks!
I've used the following library for printing PDFs in past projects:
http://www.debenu.com/products/development/debenu-pdf-library/
They have a free and professional (commercial) version. It's a great library and well worth the minor expense.
Related
I'm writing a program that modifies Word documents. Currently I use Microsoft.Office.Interop.Word to work with the documents, which requires Microsoft Office to be installed on the user's computer. Some of my clients don't have MS Office, but they do have OpenOffice.
So, which library should I use instead of Interop?
Also, how can I make my code work with different kinds of files: not only .doc and .docx, but also files from other office programs?
Currently I'm writing different code for every document type.
My program translates documents from their original language into another, so it is very important to preserve the original formatting. That's why I used Interop, but I also want my program to be useful for as many people as possible.
You don't mention it, but are you assuming all your clients use the same version of Office? To solve the Office version problem, you may want to look at this open-source project: NetOffice http://netoffice.codeplex.com/ and do all your .doc and .docx development with that library.
For OpenOffice or LibreOffice, I believe the best you can do is go to the project's website and download the SDK. For example, go here: http://api.libreoffice.org/examples/examples.html and you will find examples in Java, Python, and C++ for editing text documents, including .odt files.
LibreOffice SDK download here: http://www.libreoffice.org/download/
And finally, there is also the Open XML format (mentioned in another answer), which is described as:
ECMA Office Open XML ("Open XML") is an international, open standard for word-processing documents, presentations, and spreadsheets that can be freely implemented by multiple applications on multiple platforms.
You can also download its SDK here: http://msdn.microsoft.com/en-us/office/bb265236.aspx
Hope that helps.
You will likely end up writing separate code to work with each file type. There may be some similarities within, say, Office products, but for the most part you're going to need an adapter for each type.
However, you could (and should) minimize the amount of duplicate code by placing the translation logic and other non-type-specific functions in a shared library that each adapter would then reference.
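To make the adapter idea concrete, here is a minimal sketch. It is written in Python only to keep it short; the same shape works in C#. Every name in it (`translate_text`, `DocumentAdapter`, the two toy adapters, the word-for-word glossary) is hypothetical and stands in for your real translation logic and real file formats.

```python
from abc import ABC, abstractmethod


def translate_text(text, glossary):
    """Shared, format-independent logic: swap each word found in the glossary."""
    return " ".join(glossary.get(word, word) for word in text.split(" "))


class DocumentAdapter(ABC):
    """One subclass per file type; only reading and writing differ."""

    @abstractmethod
    def read_paragraphs(self, raw):
        """Return the document's text as a list of paragraphs."""

    @abstractmethod
    def write_paragraphs(self, paragraphs):
        """Rebuild the document from translated paragraphs."""

    def translate(self, raw, glossary):
        # The workflow lives here once and is reused by every adapter.
        paragraphs = self.read_paragraphs(raw)
        translated = [translate_text(p, glossary) for p in paragraphs]
        return self.write_paragraphs(translated)


class PlainTextAdapter(DocumentAdapter):
    """Toy format: one paragraph per line."""

    def read_paragraphs(self, raw):
        return raw.split("\n")

    def write_paragraphs(self, paragraphs):
        return "\n".join(paragraphs)


class HtmlParagraphAdapter(DocumentAdapter):
    """Toy format (hypothetical): every line is exactly <p>...</p>."""

    def read_paragraphs(self, raw):
        return [line[len("<p>"):-len("</p>")] for line in raw.split("\n")]

    def write_paragraphs(self, paragraphs):
        return "\n".join("<p>%s</p>" % p for p in paragraphs)
```

Supporting a new file type then means writing one more small adapter class, while the translation code itself stays in one place.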
We are using Aspose.Words. It supports DOC, DOCX, RTF and OOXML.
But it's not free.
I am using the Foxit SDK to extract text from PDF documents.
Everything works, but when I extract text from a PDF written in a language other than English, the output is incorrect.
I have also tried PDFBox in Java, but its output is even worse than what I get from the Foxit SDK.
Are there any other libraries that can solve this issue?
Or is there some other solution?
Personally, I think that if you want it done right you have to pay for it. ComponentOne has a PDF viewer for WPF. I'm not sure which framework you're working with, since your tags don't include one.
ComponentOne PDF Viewer for WPF
You might want to try the trial version of Quick PDF Library to see how it performs on your documents. http://www.quickpdflibrary.com
QP.GetPageText(7) or GetPageText(8) returns pretty good results for most PDF files.
Andrew.
Disclaimer: I do some consulting work for Quick PDF Library.
If you are on Windows, you can use the IFilter that Adobe provides. I used the IFilter that ships with Adobe Reader 8.
Here is a link to the exact example I used
http://www.codeproject.com/Articles/13391/Using-IFilter-in-C
The performance was okay (I think. I haven't used many other methods). Takes about 15 sec for a 400 page PDF.
Can anyone suggest a free solution for programmatically converting Office documents (mostly .doc) to PDF, in the form of a .NET library or a command-line application I can call from my program? Thanks
PS: I know I can use SaveAs PDF in newer versions of Office, but some of the clients where the program will run still have older versions of Office.
Won't GhostScript (GhostScript Website) do that for you? Otherwise, I think, though I'm not certain, that PDFSharp might do it. If neither works, I hope this one will: PDFCreate. In fact, after a closer look, if GhostScript won't do, I would consider trying PDFCreate first, as the website I linked for it provides some sample code.
You might also want to consult Wikipedia on the topic: List of PDF software
You could use something like PrimoPDF, which installs a virtual printer that creates a PDF document whenever you print to it. I've never actually called it from the command line, but since it's just another printer, any standard print code should work.
Cody
I am planning on generating a Word document on the web server dynamically. Is there a good way of doing this in C#? I know I could script Word to do this, but I would prefer another option.
I've worked at a company in the past that really wanted generated word documents, in the end they were perfectly satisfied with RTF docs that had a ".doc" extension. Word has no problem recognizing and opening them.
The RTF docs were generated with iText.net (a free .NET library). The API is pretty easy to use and performs extremely well, and you don't need Word on the machine. You could also extend it to generate PDF, HTML, and text docs in the future with very little effort. After four years the solution I created is still in place, so that's a little testimony in iText.net's favor.
It looks like the official iText page suggests that iText Sharp is the best .Net choice right now, so that's another option
You'd be better off generating an rtf file, which word will know how to open.
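For a sense of how little is needed, here is a rough sketch of writing RTF by hand. It is in Python for brevity, and the function names `rtf_escape` and `build_rtf` are made up for this example. It assumes plain paragraphs with no formatting; the only real work is escaping backslashes and braces and writing non-ASCII characters as `\uN?` escapes.

```python
def rtf_escape(text):
    """Escape text so it can sit inside an RTF group."""
    out = []
    for ch in text:
        if ch in "\\{}":
            # Backslash and braces are RTF syntax and must be escaped.
            out.append("\\" + ch)
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # Non-ASCII: one \uN? escape per UTF-16 code unit, N as a
            # signed 16-bit number, "?" as the fallback character.
            data = ch.encode("utf-16-le")
            for i in range(0, len(data), 2):
                unit = int.from_bytes(data[i:i + 2], "little")
                if unit > 32767:
                    unit -= 65536
                out.append("\\u%d?" % unit)
    return "".join(out)


def build_rtf(paragraphs):
    """Return a complete RTF document with one paragraph per list item."""
    body = "\\par\n".join(rtf_escape(p) for p in paragraphs)
    header = "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Arial;}}\n\\f0\\fs24 "
    return header + body + "\n}"


if __name__ == "__main__":
    # The result is pure ASCII, so it can be written with any encoding.
    with open("example.rtf", "w", encoding="ascii") as f:
        f.write(build_rtf(["Hello from the server.", "Caf\u00e9 {braces} ok"]))
```

Anything beyond plain paragraphs (fonts, tables, images) means learning more of the RTF control words, which is where a library starts to pay off.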
If you want to generate Office 2007 documents, check out the Open XML File Formats. They're simply zipped XML files. See these links:
Open XML File Formats: What is it, and how can I get started?
Introducing the Office (2007) Open XML File Formats
Edit: Check this project, can serve you as a good starting point:
DocumentMaker
It seems very simple and customizable. Look at this code snippet:
// "doc" is assumed to be a document object created earlier with DocumentMaker.
Paragraph p = new Paragraph();
p.Runs.Add(new Run("Text can have multiple format styles, they can be "));
// Format flags are combined with a bitwise OR.
p.Runs.Add(new Run("bold and italic",
    TextFormats.Format.Bold | TextFormats.Format.Italic));
doc.Paragraphs.Add(p);
Word will quite happily open an HTML file with a .doc extension. If you include an internal style sheet, you can have it fully formatted. There was a previous post on this subject:
Export to Word Document in C#
Creating the old .DOC files (pre-Word 2007) is nigh-impossible without Word itself. The format is just too complex. Microsoft has released the format description, but it's enough to reduce a grown programmer to tears. There is a reason for that too (historical), but that doesn't make things better.
The new .DOCX would be easier, although quite a bit of hassle still. However depending on which Word versions you are targeting, there are some other options too.
For one, there is the classic .RTF. The format is pretty complex still, yet well documented and has strong support across many applications and platforms. And you might use some string-replacing into template files to make things easier (it's non-binary).
Then there are the "old" Word XML files. I think they worked starting with Word XP. Kinda the predecessors of .DOCX. I've used them, not bad. And the documentation is pretty OK.
Finally, the easy way, and the one I would choose, is to produce a simple HTML file. Word has been able to load HTML files just fine since version 2000. In the simplest case, just change the extension of an HTML file to .DOC and you're done. You can also add a few Word-specific tags and comments to make it look even better in Word. Use Word's Save As... HTML option to see what they are.
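Here is a small sketch of the HTML route, in Python for brevity. The helper names `build_word_html` and `save_as_doc` and the style rule are my own choices for illustration. It writes an ordinary HTML page with an internal style sheet and saves it under a .doc name; it does not include any of the Word-specific tags mentioned above.

```python
import html


def build_word_html(title, paragraphs):
    """Return a simple HTML page that Word should open as a document."""
    body = "\n".join("<p>%s</p>" % html.escape(p) for p in paragraphs)
    return (
        "<html>\n<head>\n"
        '<meta charset="utf-8">\n'
        "<title>%s</title>\n"
        # Internal style sheet: this is what carries the formatting.
        "<style>p { font-family: Arial; font-size: 12pt; }</style>\n"
        "</head>\n<body>\n%s\n</body>\n</html>\n"
        % (html.escape(title), body)
    )


def save_as_doc(path, html_text):
    """Write the HTML text to disk; the caller picks a .doc file name."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(html_text)


if __name__ == "__main__":
    page = build_word_html("Report", ["First paragraph.", "Totals: a < b"])
    save_as_doc("report.doc", page)
```

On a web server you would return the same text in the response instead of writing a file, with a download file name ending in .doc.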
There are third party libraries about that will do the job.
A quick Google search turned up this one, for example.
I haven't tried any, so I can't give you specific advice, I'm afraid!
Let us know how you get on...
In Office 2007, Microsoft introduced a new file format called Office Open XML (.docx). Older versions of Microsoft Word cannot open this format without a converter. Since it is XML, you can create or read these files without having Word installed.
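To show what "it is XML" means in practice, here is a rough sketch that builds a bare-bones .docx with nothing but a zip library. It is in Python for brevity, and `build_docx` is a name I made up. The three parts below are, as far as I know, the minimum a .docx package needs, but treat that as an assumption and check the output by opening it in Word or LibreOffice.

```python
import io
import zipfile
from xml.sax.saxutils import escape

# Tells the package reader which content type each part has.
CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/'
    'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    '</Types>'
)

# Points from the package root to the main document part.
RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/'
    'officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>'
    '</Relationships>'
)

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def build_docx(paragraphs):
    """Return the bytes of a .docx with one plain paragraph per item."""
    body = "".join(
        '<w:p><w:r><w:t xml:space="preserve">%s</w:t></w:r></w:p>' % escape(p)
        for p in paragraphs
    )
    document = (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        '<w:document xmlns:w="%s"><w:body>%s</w:body></w:document>' % (W_NS, body)
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", RELS)
        package.writestr("word/document.xml", document)
    return buffer.getvalue()


if __name__ == "__main__":
    with open("example.docx", "wb") as f:
        f.write(build_docx(["Hello from the server.", "No Word required."]))
```

For anything beyond plain text (styles, tables, headers) a library built on the Open XML format will save a lot of hand-written XML.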
Here is a component that generates documents based on a custom template. The documents are generated from a SharePoint list, so the data is pulled from the list item into the document on the fly:
http://store.sharemuch.com/products/generate-word-documents-from-sharepoint-list
Hope that helps,
Yaroslav Pentsarskyy
Blog: www.sharemuch.com
See title...
No.
You can use WordML (Word XML)
Word 2007 version
You can create Word 2007 documents using its XML format without the need of installing Word in your server.
This can be a starting point.
I've already +1'd Mitch's reply, but as an aside: Word isn't even supported for use in service applications; it is designed to be user-interactive. So installing Word, even if it worked, wouldn't leave you in a great place.
If you're just generating the documents from scratch the solutions so far proposed work well. My situation was that I had an existing template that I needed to use and substitute in my own text in a few places (mail merge, if you will). This was several years ago - prior to Office 2007 - but we ended up going with the Aspose library of components for this. I've used the Words and Cells (Excel) components to generate documents from templates and spreadsheets on the fly to download from web sites. The interfaces are a little clunky and can be inconsistent between the various products. The installer, frankly, is awful, but the products work pretty well and made it much easier to do what needed to be done.
Word recognizes RTF natively, and if your intended document can be constructed as whatever.rtf (which, for all of its fancy formatting, is plain ASCII markup), then you should be able to write the document without Word installed.
To get the picture, create an example document and save it as an RTF file. Then view that file with an ASCII text editor (like Notepad). You'll have to learn RTF syntax, but there's at least one handbook around on that.
AS
Just to add another potential solution for you, OfficeWriter is a Word/Excel API that lets you create documents and spreadsheets in ASP.NET without using Office:
http://www.officewriter.com