Read Word documents from C# and display them inline in the browser - c#

Q 1.
How can I read MS Word documents (doc and docx) from C# without MS Office installed? I was able to read unformatted text using a StreamReader. I think I can use OpenXML for docx, but what about doc? Is there an open source solution to handle it? Is using ole32.dll an option in an unlicensed scenario?
Is using IFilter a solution? I haven't seen any samples using it, though, and I'm not sure about its support in Windows 7 and 8.
EDIT: I stumbled upon this solution and found it acceptable for my situation.
Q 2.
I need to display the doc and docx files in my web page, either inline, in a partial page, or even in an iframe. How is that possible? Is COM interoperability the only solution for this too?

Maybe you can use the redistributable Interop Assemblies from Microsoft to read your ".doc" files:
http://www.microsoft.com/en-us/download/details.aspx?id=3508
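The description says it doesn't require Office, but interop assemblies are only managed wrappers around Word's COM API, so check that it actually works on a machine without Word before relying on it.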
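For the .docx half of Q1, plain text can be pulled out with nothing but the base class library, since a .docx file is a zip package. The following is only a minimal sketch: the class name DocxTextReader is made up for illustration, and it assumes the main document part sits at the conventional path word/document.xml. It ignores headers, footnotes, tabs and line breaks, and it does not help with the binary .doc format.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of any library.
public static class DocxTextReader
{
    // WordprocessingML main namespace used inside word/document.xml.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the plain text of a .docx stream, one line per paragraph.
    public static string ReadText(Stream docx)
    {
        using var zip = new ZipArchive(docx, ZipArchiveMode.Read, leaveOpen: true);

        // Assumption: the main part is at the conventional location.
        ZipArchiveEntry entry = zip.GetEntry("word/document.xml")
            ?? throw new InvalidDataException("word/document.xml not found; is this a .docx package?");

        using Stream part = entry.Open();
        XDocument xml = XDocument.Load(part);

        // Each w:p is a paragraph; its visible text lives in w:t descendants.
        var paragraphs = xml.Descendants(W + "p")
            .Select(p => string.Concat(p.Descendants(W + "t").Select(t => t.Value)));

        return string.Join(Environment.NewLine, paragraphs);
    }
}
```

Calling DocxTextReader.ReadText(File.OpenRead(path)) should give the paragraph text. For display in a page you would still have to turn that text (or the underlying XML) into HTML yourself.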

Related

How to convert Word documents (XML based) to PDF, with C#?

I have to automate the conversion of Word documents to PDF. After some research, I found that starting with Microsoft Office 2007, Word documents are XML based. I also found a free solution, Apache FOP, that converts XML to PDF, but I still haven't managed to find a way to automate it with C#. There is nFOP (a version that runs on the .NET Framework), but I couldn't find any detailed explanation of how to use it.
You could use docx4j.NET.
That's a .NET version of docx4j, which is a Java library which converts docx to PDF using FOP.
See ConvertOutPDF.java
Before you go to the effort of downloading it, you might want to use the online demo to see whether the PDF output is close to your needs.
Disclosure: I lead the docx4j project.
An ugly solution would be to do a "Save As" using Microsoft Office Interop...
Read more here
And find the related Stack Overflow post here
I have found one library that can convert XML to PDF and vice versa in C#/.NET, known as Aspose.PDF for .NET. I hope it will solve your problem.

Read/Write/Save MS Word Document in c#

I have to open a Word document using C#, make some changes to it, and save it again. The document will have a lot of tables and styling, and I have to process it page by page. For example, I have to change all italics to normal and all caps to small letters, and save only those changes without affecting the styling, alignment, or format of the document.
Is that possible in C# .NET? Please let me know if there is any tutorial available that matches my requirement. I am basically a Java developer who recently moved to C#. I have googled for the past 2 days and didn't find any useful information.
Personally, I use Aspose.NET. But that component is not free. If you need something free, I can also recommend the Microsoft Open XML Library:
http://blogs.msdn.com/b/ericwhite/archive/2008/04/22/using-the-open-xml-sdk.aspx
I would not use Office Interop as Jim suggested. It's not very stable on a server.
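To give a feel for what such an edit looks like at the XML level, here is a rough sketch that works on the parsed word/document.xml with LINQ to XML rather than the Open XML SDK object model. RunNormalizer is a made-up name, and the interpretation is an assumption: "italics to normal" is taken to mean dropping direct w:i / w:iCs formatting, and "all caps to small letters" is taken to mean dropping w:caps and lowercasing the run text. Italics or caps inherited from a style are not touched, and page-by-page processing is not covered because pages are not stored in the file.

```csharp
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of any library.
public static class RunNormalizer
{
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Mutates a parsed word/document.xml in place.
    public static void Normalize(XDocument documentXml)
    {
        // w:i and w:iCs switch italics on; w:caps displays text in capitals.
        // Only direct run formatting is removed; everything else in w:rPr stays.
        var toggles = documentXml.Descendants(W + "rPr")
            .Elements()
            .Where(e => e.Name == W + "i" || e.Name == W + "iCs" || e.Name == W + "caps")
            .ToList(); // materialize before mutating the tree

        foreach (XElement toggle in toggles)
        {
            toggle.Remove();
        }

        // Assumption: every run's text is lowercased, including mixed-case text.
        foreach (XElement text in documentXml.Descendants(W + "t"))
        {
            text.Value = text.Value.ToLowerInvariant();
        }
    }
}
```

You would load word/document.xml from the package, call Normalize, and write the XML back to the same part. Because only those elements and the text change, the remaining styling and table structure should stay as they were.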

Open PDF and print to PDF programmatically C#

I am developing an application that can open and display PDFs, but only after I open them and print them to another PDF using CutePDF; the originals are not viewable.
I am looking for a way to programmatically open a PDF file, and print to another PDF file (not necessarily using CutePDF, just printing to another PDF is the desired functionality).
This will be integrated into a C# .NET project. Are there any suggestions how to go about doing this?
Thanks.
You could use Office Interop and generate the PDF. When you say "print to another pdf", I imagine you mean just generate one? Or are you saying you want to spool them to a PDF print driver that will essentially just create a PDF to be saved?
Use iText, which is available in Java and C# versions. I have used the Java version successfully. I recommend the iText in Action book to help you get up to speed with iText faster. The book discusses only the Java API, but I imagine you will be able to learn the principles of iText from the book and then figure out the minor differences for the C# version.
To implement this you can use the PDFFlow library for generating PDF files from C#. It has an easy fluent syntax and many features.
Here are many examples of real complex PDF documents: examples
Good luck :)

How to create an online word processor, also [HTML/XML to .doc conversion on the server using .NET]

I'm interested in learning how to create an online word processor similar to Google Docs and MS Office Web Apps. I want to do it using Microsoft technologies and tools only. I'm a beginner in ASP.NET and C#. I've planned to build its front end using TinyMCE [ http://tinymce.moxiecode.com/tryit/full.php ]. But how do I convert the data in the browser to .doc on the server? How can I do the formatting of a .doc file on the server using .NET? What tools are available in .NET for this kind of project? Thanks in advance.
Use OpenXML to generate Word docs. This is of course for Word 2007/2010, not 2003. There is plenty of documentation on how to do it. You can reverse engineer a Word doc by changing the extension from .docx to .zip, then extracting the files and viewing them in Notepad.
Thinking about it more, you might want to create an XSLT to translate the HTML markup to OpenXML. But this is a lot of work (it might already be available somewhere on the net), so you might try a 3rd party tool as suggested below.
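The rename-to-.zip trick can also be done from code. This is just an illustrative sketch using the base class library's ZipArchive; DocxPackageInspector and its method names are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;

// Hypothetical helper, not part of any library.
public static class DocxPackageInspector
{
    // Lists the names of all parts (zip entries) in a .docx package, sorted.
    public static List<string> ListParts(Stream docx)
    {
        using var zip = new ZipArchive(docx, ZipArchiveMode.Read, leaveOpen: true);
        return zip.Entries
            .Select(e => e.FullName)
            .OrderBy(name => name, StringComparer.Ordinal)
            .ToList();
    }

    // Returns the raw XML text of one part, e.g. "word/document.xml".
    public static string ReadPart(Stream docx, string partName)
    {
        using var zip = new ZipArchive(docx, ZipArchiveMode.Read, leaveOpen: true);
        ZipArchiveEntry entry = zip.GetEntry(partName)
            ?? throw new FileNotFoundException("Part not found in package: " + partName);

        using var reader = new StreamReader(entry.Open());
        return reader.ReadToEnd();
    }
}
```

Dumping the parts of a document saved from Word is a quick way to see which markup a given formatting feature produces before trying to generate it yourself.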
There are a variety of third-party libraries, such as Aspose, that can do this.
I don't think you'll find any good free ones.
You can generate .docx (OpenXML) files using the OpenXML SDK.

How to highlight text in Pdf Winforms C#

I have a PDF file which I want to open in a Windows Forms application and perform the following tasks:
View the PDF document
Zoom +/- document
Search text
Highlight a specific text
Show it in a listbox/dropdown
Select those words and highlight them in the PDF
Remove selection/highlight.
I have tried using certain libraries like pdfSharp/iTextSharp and even the Acrobat Reader OCX control.
It's really bugging me... is there any help?
I'd suggest looking at some means of converting the PDF if you don't have a direct need to edit it. Even then, it may be easier to convert to a different form, make changes, and then convert back. PDF is derived from PostScript, which makes it powerful, but also makes it a mess to deal with, and my personal preference is to skip that headache. That's not always avoidable (I once had a lot of fun creating Thai support in PDF print-at-home ticket creation without bloating the document to the point of being unusable), but it is highly recommended where possible.
Anyways, there are a variety of PDF conversion libraries out there, some of which may be available for .NET. Worst case, you may need to create a managed C++ layer to allow your C# code to access them.
Doesn't the Acrobat Reader OCX already have all those features? What exactly doesn't the OCX do that you need to do in your code?
You might try contacting Adobe and getting their full SDK for PDF. It might have controls which you can use to solve your problem.
Come to think of it, is there even an SDK for PDF from Adobe?
You have not mentioned whether you prefer a free or a commercial PDF viewer. If you are open to using a commercial PDF viewer, you may evaluate the Syncfusion PDF Viewer control, Telerik PDF Viewer, Dynamic PDF Viewer, or TallComponents. I have checked the feature sets and all seem to have the features you are looking for. I do not represent or promote any of these SDKs. I have used TallComponents and Dynamic PDF for PDF manipulation, and both have excellent support; I would call them PDF veterans in the .NET space.
