Display MSWord file content in any browser - c#

I want to display the content of a Word file in the browser, the same way we display a PDF file. I don't want to use a plugin, because I would have to install it for every browser. I want a single solution that works in all browsers.
I have searched on Google, but every link I found simply downloads the Word file and opens it.
Currently I am using an object tag to display PDF files, but it does not work for Word files. It shows the message: The plug-in is not supported.

Using a browser plug-in (such as the free Word Viewer) is by far the easiest method, and arguably the most correct - however, there are some alternatives if you really don't want to do this:
Convert the Word document to another format (e.g. HTML/PDF) on-the-fly before the response is sent. For Word 97-2003 documents, you can do this with VSTO/Automation. For Word 2007+ documents, you can use the OpenXML SDK (although you will have to write the conversion algorithm yourself).
Use an XSL stylesheet to transform the Word markup (docx) into html/css. You can do this server-side or, potentially, with client-side scripting (JavaScript). Some useful resources here and here.
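To give a feel for what "writing the conversion yourself" involves, here is a minimal sketch in Python (chosen for brevity; the same steps map onto a zip reader and an XML parser in C#). It assumes the main document part sits at the conventional path word/document.xml and it only keeps paragraph text. The function name docx_to_html is made up for this illustration and is not part of any SDK.

```python
import html
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used inside word/document.xml.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_html(source):
    """Return a bare-bones HTML fragment for a .docx (path or file-like object).

    Hypothetical helper: only paragraph text survives. Table cells are
    flattened to their paragraphs; styles, images, headers and footers
    are ignored.
    """
    with zipfile.ZipFile(source) as package:
        # Assumption: the main part lives at the conventional location.
        root = ET.fromstring(package.read("word/document.xml"))

    paragraphs = []
    for paragraph in root.iter(W_NS + "p"):
        # A paragraph's visible text is spread over <w:t> elements in its runs.
        text = "".join(node.text or "" for node in paragraph.iter(W_NS + "t"))
        if text:
            paragraphs.append("<p>" + html.escape(text) + "</p>")
    return "\n".join(paragraphs)
```

Everything that makes a Word document look like a Word document (fonts, numbering, page layout) is missing here, which is exactly the part that makes a faithful converter a large job.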

Great question. In principle, browsers only really tend to support viewing websites (e.g. html). Most, however, also support viewing PDFs, and, as you've correctly identified, you could use plugins to extend the behaviour. Crucially, though, some browsers provide document viewing with a javascript-based viewer.
I wasn't aware of it before you asked, but there are apparently javascript implementations of non-PDF document readers--for example, ViewerJS--that seem to directly support .odt. With a little digging, you might be able to find an implementation/plugin for a javascript viewer that supports .docx. However, I can't recommend one from personal experience at the moment. I would recommend searching for javascript document viewers though.

Related

How to convert office file to image

I have been searching for the last two days but did not find anything.
My requirement is to create a document viewer in my web application (C#.NET), and I don't want to use any third-party tool for this. Can I convert the files to an image, a PDF, or any other common format which can be easily rendered on a web page? I also cannot use the Interop objects.
Any help will be highly appreciated.
You mention in one of your comments that you'd like to write all the code yourself but don't know where to start. Here's how I would go about it...
First, you'll need to familiarize yourself with the Microsoft Office format specification. You can find that here (there's a link to the technical specification). Modern Office documents (.docx, .xlsx, and so on) are actually .zip archives containing XML files along with any binary data representing attachments. Just rename a .docx file to .zip and you'll be able to open it up and see the XML and any other supporting documents inside (the same is true for .xlsx, etc.).
Then you'll need to become intimately familiar with either PDF or HTML, as your job now will be to convert the various Office document structure into PDF or HTML structure, being sure to respect page layout, margins, order, etc...
As others have said, this is a large task, which is why third-party tools exist today. Also, each third-party toolset has its limitations, as this is really hard to "get right" in all situations, and there will be edge cases that work for one document and not another (because maybe they didn't use Microsoft Word to save the .docx, maybe they used OpenOffice and OpenOffice interpreted the standard slightly differently...).
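The "rename it to .zip and look inside" step can also be done in code. The sketch below is Python for brevity (in C#, System.IO.Compression gives you the same kind of zip access). The function name describe_package and the rule "anything ending in .xml or .rels counts as XML" are assumptions made for this illustration only.

```python
import zipfile


def describe_package(source):
    """List the parts of an Office package (path or file-like object).

    Returns (entry name, uncompressed size in bytes, looks_like_xml) tuples.
    Hypothetical helper: the XML check is a simple file-extension test.
    """
    with zipfile.ZipFile(source) as package:
        return [
            (
                info.filename,
                info.file_size,
                info.filename.lower().endswith((".xml", ".rels")),
            )
            for info in package.infolist()
        ]
```

Running this on a real .docx typically shows the document body, styles and relationship files as XML, and embedded pictures as binary parts.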
If you cannot use COM/Interop technologies in your solution, you can take a look at the specialized 3rd party options. I see that you prefer not to use them, however, there are no existing built-in solutions in the .NET Framework. Check out my answer in a similar thread that describes how to accomplish exactly the same task using 3rd party libraries (for example, DevExpress, since I have experience with it). In addition, take a look at the Documents demo, where you can see how to create images/thumbnails from different types of MS Office documents.
I believe what you need is an intermediate representation of the documents which can be converted into an image for the viewer to display.
Let me try to explain with the diagram below:
You can use tools like smallpdf or OfficeToPDF to do that. Just integrate them into your application.
Small PDF(https://smallpdf.com/library-detail)
officetopdf (https://officetopdf.codeplex.com/)

.NET graphic libraries to display images (pdf, .docx and any other format of image) in the browser

I am developing an ASP.NET MVC application where users are able to upload files to a repository. Those files could be PDFs, Word documents, any type of image, and so on.
When the user selects a file to be imported, I would like to display this file in the browser so they can review its contents before the upload.
I know I could use some sort of IFrame to display a PDF, but I am looking for specific classes or .NET libraries to implement this feature.
I just need a pointer in the right direction.
This is an extremely difficult problem. There are some libraries that can help. For instance, PDF files might be rendered to images with Ghostscript. Word and Excel files might be converted to PDF or images with a number of libraries. None of them, AFAIK, are very good at it, so I cannot recommend one.
You could automate MSO to perform the conversion to PDF, but that is decidedly not safe for server code. Another possibility is convert source documents to SWF files (like flexpaper) and display in flash. There are some great libraries out there, but it will limit your supported clients. Sharepoint has support for providing some of this capability as well. Others have used OpenOffice to convert MSO documents but also at a loss of quality.
I can't really advise any specific direction as it is highly dependent on what you/your company is willing to spend and the desired results. Good luck.
You could try to rely on Windows and the explorer thumbnails for it, like here, but then you'd have to make sure that:
You can abuse the server in the most elaborate way (install stuff, talk to the shell from ASP.NET)
You have a thumbnail provider installed on the server for every type that you want to preview. I guess from the moment you can see the thumbnail in explorer, you're set. So for pdf, you might need to install PDF Reader from Adobe.
Docx files should be saved with thumbnail checked (see link). There seems to be no other easy, free way to convert a docx to a thumbnail. The "best" solution I came across, was saving it automatically again somehow, and making sure the thumbnail option is checked.
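If the documents were saved with the thumbnail option, the preview image travels inside the .docx itself, so you can pull it out without going through Explorer. A minimal sketch in Python follows, under the assumption that the thumbnail is stored as docProps/thumbnail.<ext> (the usual location, although strictly speaking it is located through the package relationships). extract_thumbnail is a made-up helper name.

```python
import zipfile


def extract_thumbnail(source):
    """Return (entry_name, data) for the embedded thumbnail, or None if absent.

    Assumption: the thumbnail part is named docProps/thumbnail.<ext>.
    """
    with zipfile.ZipFile(source) as package:
        for name in package.namelist():
            if name.lower().startswith("docprops/thumbnail."):
                return name, package.read(name)
    return None
```

Check the extension of what you get back: the thumbnail may be in a format such as EMF or WMF that browsers do not display, in which case it still has to be converted to PNG or JPEG on the server.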
I don't want to say that's impossible, but it can't be done with finite effort.
What you are asking for is a browser-based solution, because you want the user to be able to "review" the document before uploading.
Therefore you cannot use a server side solution, which is essentially what is being asked by referring to a ".Net library".
.Net libraries are dependent on an installed version of .Net, which does not exist in all versions for all operating systems for which graphical browsers exist.
Next, recent changes in browser security do not allow reading the full client-side file name of the selected file in the input field.
You'd have to rely on HTML5 and its FileReader to access the file's byte stream, but even then you can only retrieve image from image files. (see sample)
Excluding browser-based solutions in Flash, ActiveX, Java, due to browser and platform support, this leaves JavaScript as the only "reasonable" solution: you'd need a library for each supported format to either convert a file into an image in an image format supported by browsers, or extract the text(+image) representation of a file.
Great answers... I just want to share the result of my research: I found a nice client-based solution supported by Mozilla Labs. It is a framework based on HTML5 and JavaScript, with no native code needed.
Here the project website:
https://github.com/mozilla/pdf.js
This is what you are capable of:
http://mozilla.github.com/pdf.js/web/viewer.html
And lastly, a great video explaining how everything works:
http://www.youtube.com/watch?v=Iv15UY-4Fg8&noredirect=1
Regarding my question, we are going to convert every possible file to PDF on the server and then render this PDF using this framework.
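For a "convert everything to PDF on the server" pipeline, the first step is usually deciding what an upload actually is, without trusting the file extension. Here is a minimal sketch in Python that looks at the leading bytes of the file. The byte signatures are the well-known ones for these formats; the names sniff_kind and plan_for and the three plan labels are invented for this illustration.

```python
# Leading "magic" bytes of the formats mentioned in this thread.
SIGNATURES = [
    (b"%PDF-", "pdf"),
    (b"\x89PNG\r\n\x1a\n", "image"),
    (b"\xff\xd8\xff", "image"),  # JPEG
    (b"GIF87a", "image"),
    (b"GIF89a", "image"),
    (b"PK\x03\x04", "zip-office"),  # docx/xlsx/pptx, but also any other zip
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "ole-office"),  # legacy doc/xls/ppt
]


def sniff_kind(header):
    """Classify a file by the first bytes of its content."""
    for magic, kind in SIGNATURES:
        if header.startswith(magic):
            return kind
    return "unknown"


def plan_for(header):
    """Decide what the server should do with an upload (hypothetical labels)."""
    kind = sniff_kind(header)
    if kind == "pdf":
        return "serve-as-is"
    if kind == "unknown":
        return "reject"
    return "convert-to-pdf"
```

Note that a zip signature only tells you the file might be a modern Office document; you would still need to look inside the archive before handing it to a converter.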

How can I convert PDF to doc without microsoft.office.interop?

I need to convert PDF files into .doc files using C#. The computer has no file system though it doesn't have Office installed. Any good ideas how I can approach this? I did some research and most people use the interop services.
You need to understand that PDF is not really implemented as a single document format.
If your PDF docs are created by rendering text to a PDF file, then direct PDF conversion is not only possible, but can be very good (reliable).
If the source of your PDF is either a scanner or a fax (essentially a scanner...), then what you have is a document with a "picture" of text. This scenario is more difficult to deal with. If you open up the markup for this, there is no 'text' to be converted. In this situation you have to deal with some manner of OCR (optical character recognition), which is less reliable due to a variety of issues.
If you have the option of intercepting the data before it is rendered to PDF (say like in SSRS or Crystal) then it would be better for you to bypass the PDF stage and move your data to a Word document.
If you are constrained to receiving faxes and then needing to interpret their content, prepare for OCR hell. It has been a while since I was there, so I hope that it has gotten better.
Even without Office installed on your machine, you have access (with Visual Studio) to the Office developer toolkit, which will allow you to build documents to be distributed in the Word formats (.doc/.docx).
An option/idea may be to convert the PDF to Html, which can be opened in Word?
Use Aspose PDF Kit to convert the PDF to text, and then convert the text to a .doc using a FileStream or Aspose doc.

How to highlight text in Pdf Winforms C#

I have a pdf file which I want to open in a Windows Forms Application and perform following tasks-
View the pdf document
Zoom +/- document
Search Text
Highlight a specific text
Show it in a listbox/dropdown
select those words and highlight in pdf
Remove selection/Highlight.
I have tried using libraries like pdfSharp and iTextSharp, and even the Acrobat Reader OCX control.
It's really bugging me. Is there any help?
I'd suggest looking at some means of converting the PDF if you don't have a direct need to edit it. Even then, it may be easier to convert to a different form, make changes, and then convert back. PDF grew out of PostScript, which makes it powerful, but also makes it a mess to deal with, and my personal preference is to skip that headache. It is not always avoidable (I once had a lot of fun creating Thai support in PDF print#home ticket creation without bloating the document until it was unusable), but skipping it is highly recommended where possible.
Anyways, there are a variety of PDF conversion libraries out there, some of which may be available for .NET. Worst case, you may need to create a managed C++ layer to allow your C# code to access them.
Doesn't the Acrobat Reader OCX already have all those features? What exactly doesn't the OCX do that you need to do in your code?
You might try contacting Adobe and getting their full SDK for PDF. It might have controls which you can use to solve your problem.
Come to think of it, is there even an SDK for PDF from Adobe?
You have not mentioned whether you prefer a free or a commercial PDF viewer. If you are open to a commercial PDF viewer, you could evaluate the SyncFusion PDF Viewer control, Telerik PDF Viewer, Dynamic PDF Viewer, or TallComponents. I have checked their feature sets and all seem to have the features you are looking for. I do not represent or promote any of these SDKs. I have used TallComponents and Dynamic PDF for PDF manipulation and both have excellent support; I would call them PDF veterans in the .NET space.

How can I programmatically create PDF bookmarks from PDF file?

So, I have used Pdf995's PDF print driver from a web browser to print web pages and eventually use PdfEdit995 to join these various PDF files into one large PDF.
Now I have a lot of large PDF documents that I wish to add bookmarks to, but am hoping there is a relatively easy way of doing this programmatically (using C#, preferably) - basically, I want to find, within each PDF, text that is large enough to qualify as a header, and use that text as the bookmark.
Any tips/advice/direction? Thanks!
It's definitely possible to do this, but I would recommend finding a PDF library that does most of the leg work. Technically you could do it all yourself with the aid of the PDF specification, but that'd probably take more time than it's worth.
The library will need to be able to let you find text in a document and then return the page and size, font, etc, of the text and create bookmarks (also known as outlines) based on that information programmatically.
My company's product, Quick PDF Library, can help you do this, and so can PDFKit.NET. I'm sure there are other libraries out there that support this functionality too. As far as free libraries go, from what I've seen I don't believe that PDFSharp or iText will meet all of your requirements in this case, but I'm sure someone will correct me if I'm wrong.
If you'd prefer to develop a solution for this entirely yourself, then the PDF reference is available online for free.
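Whichever library you end up with, the "text that is large enough to qualify as a header" decision is plain logic you can write and test separately. Below is a minimal sketch in Python. It assumes the PDF library has already given you text spans with their page number and font size. The TextSpan type, the pick_bookmarks function, and the 1.3 size ratio are all made up for this illustration. Writing the resulting outline entries back into the PDF is left to the library.

```python
from collections import Counter
from typing import NamedTuple


class TextSpan(NamedTuple):
    """One run of text as reported by a PDF library (hypothetical shape)."""
    page: int
    font_size: float
    text: str


def pick_bookmarks(spans, ratio=1.3):
    """Return (title, page) pairs for spans that look like headers.

    The body font size is taken to be the size that covers the most
    characters; a span counts as a header when its size is at least
    `ratio` times that. The ratio is an arbitrary starting point.
    """
    chars_per_size = Counter()
    for span in spans:
        chars_per_size[span.font_size] += len(span.text)
    if not chars_per_size:
        return []

    body_size = chars_per_size.most_common(1)[0][0]
    threshold = body_size * ratio
    return [
        (span.text.strip(), span.page)
        for span in spans
        if span.font_size >= threshold and span.text.strip()
    ]
```

Pages printed from web sites tend to have large text that is not a header (banners, pull quotes), so expect to tune the ratio or add extra rules such as a maximum title length.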
