I am searching from last two days but did not find any thing.
My requirement is to create a document viewer in my web application (C#.Net) and I don't want to use any third party tool for this. Can I convert the files in image or PDF or in any common formate which can be easly render on web page. I also can not use Introp object.
Any help will be highly appreciated
You mention in one of your comments that you'd like to write all the code yourself but don't know where to start. Here's how I would go about it...
First, you'll need to familiarize yourself with the Microsoft Office Format specification. You can find that here (there's a link to the technical specification). Office documents are actually a .zip file with an XML file inside along with any binary data representing attachments. Just renamed a .docx file as .zip and you'll be able to open it up and see the XML and any other supporting documents inside (same is true for xlsx, etc...).
Then you'll need to become intimately familiar with either PDF or HTML, as your job now will be to convert the various Office document structure into PDF or HTML structure, being sure to respect page layout, margins, order, etc...
As others have said, this is a large task which is why third party tools exist today. Also, each third party toolset has it's limitation as this is really hard to "get right" in all situations and there will be edge cases that work for one document and not another (because maybe they didn't use Microsoft Word to save the .docx, maybe they used OpenOffice and OpenOffice interpreted the standard slightly differently...)
If you cannot use COM/Interop technologies in your solution, you can take a look at the specialized 3rd party options. I see that you prefer not to use them, however, there are no existing built-in solutions in the .NET Framework. Check out my answer in a similar thread that describes how to accomplish exactly the same task using 3rd party libraries (for example, DevExpress, since I have experience with it). In addition, take a look at the Documents demo, where you can see how to create images/thumbnails from different types of MS Office documents.
I believe what you need is an intermediate representation of the documents which can be converted into an image for the viewer to display.
Lets me try to explain with the below diagram:
You can use tools like smallpdf or OfficeToPDF to do that. Just integrate them into your application.
Small PDF(https://smallpdf.com/library-detail)
officetopdf (https://officetopdf.codeplex.com/)
Related
First of all I want to say hi to the programming community, what I am looking for is a way to generate a report from my Windows Forms Application in word preferably, this report is basically a list of pre-configured days in a tour creation software I am creating.
I have searched everywhere and I cant seem to find information on how to start creating the report, I have all the information saved into a database, I just need to be able to get this information into word and ordered as it should be ordered.
I just want to be pointed in the right direction so I can research on it even further.
The exact thing I want to create is a word file that I wish I could share here so you can actually see what I mean.
Thank you for all your attention and help if possible.
I can point you in the right direction. Word documents are stored in a format called OpenXml which can be created and manipulated without actually using Word directly. That's good because you don't want to deal with code that actually starts an instance of Word and automates it (Interop.) It sort of works but it's not something I recommend dealing with ever.
OpenXml isn't fun either, but it's better. You can create your document "normally" using Word, save it, and then have your application use it as a template, opening a copy, populating some data, and then saving it.
Here's the reference for OpenXml with Word. I'm not saying it's pretty. It's not. The documentation is lacking. This page on adding text isn't linked from the previous page, even though many other topics are.
There are some nuget packages like this one that can help.
I once did a POC that did exactly what you're describing by opening and altering a document used a template using OpenXml. I'll see if I can dig up the code. But this is definitely a good direction to look in if Word is an absolute requirement.
This is a long shot, but can you output in HTML? If you can that's an even easier alternative.
Can you use Excel? That's also OpenXml but there's easier-to-use tools like EPPlus that simplify dealing with it, because it's not just the friendliest thing to work with.
An option that I would suggest is Crystal Reports. You can download the Crystal Reports add-in for Visual Studio for free from here. Crystal Reports is an easy way to perform reporting from various data sources including SQL. There are also a lot of free tutorials online for learning how to use CR. The syntax is a little strange, but it is easy enough to use.
The add-in allows you to create reports for your application and also build applications that can display, print, and export Crystal Reports.
You can export reports to .RTF (Rich Text Format) files. MS Word can open, edit, and convert RTF documents. It does a fairly decent job, but special formatting might take some work. This route is a ton easier than trying to write XML or anything else. I've written several reports designed for export to RTF. My boss runs the report, exports it, then edits it in Word. He loves the reports.
If you are planning on developing a lot of reports, purchasing the full version of Crystal Reports is well worth it. I believe they are on version 2016 currently.
If you do want to deal with automating Word, Microsoft's guide "Automating Applications Using the Office Object Model" Word-specific task content is here: https://msdn.microsoft.com/en-us/library/78whx7s6(v=vs.80).aspx
A larger example: https://support.microsoft.com/en-us/kb/316384
To begin, simply add an assembly reference to your project file for the correct Office Object Library (example: "Microsoft Word ##.0 Object Library"). Note that you must have Office installed to take this approach.
Good luck!
Is there a way to programmatically create PowerPoint presentations? If possible, I'd like to use C# and create PowerPoint 2003 presentations.
Yes, you can.
You will want to look into MSDN which has a pretty good introduction to it.
I might give you a word of warning, Microsoft Office interop is compatible with an API which is now more than 10 years old. Because of this, it is downright nasty to use sometimes. If you have the money to invest in a good book or two, I think it would be money well spent.
Here's a starting point for you. Use the search feature on MSDN MSDN Webpage. It's good for any Microsoft C# .NET style stuff.
Specifically in regards to your question, this link should help: Automate PowerPoint from C#. EDIT LINK NOW DEAD :(. These two links are fairly close to the original KB article:
Automate Powerpoint from C# 1/2
Automate Powerpoint from C# 2/2
Finally, to whoever downvoted this: We were all learning one day, how to do something as a beginner is most definitely programming related, regardless of how new someone might be.
OpenXML looks like the way to go from a web app.
Using the interop libraries is not recommended, as others have stated.
You can also look at Aspose Slides, a component for .NET and Java that makes it easy to generate powerpoint documents.
If you don't really need PowerPoint compatible output, consider using a markup language such as LaTeX with the Beamer package to produce a PDF of the presentation, or use HTML and javascript in a manner similar to Slidy. If you need fancy effects, it might still be easier to use SVG, and you'd have the benefit of getting output that can be reliably viewed with free software.
http://msdn.microsoft.com/hi-in/magazine/cc163471(en-us).aspx
Use this link. Although this is in VB.NET, C# supports the same.
You may also try out SlideMight, a tool for merging hierarchical data with PowerPoint templates.
SlideMight supports:
text substitution in text fields, tables and notes
image substitution, from raw data, files and URLs
images in tables nested
iterations over data to create slides
iterations to populate tables, possibly spanning multiple slides
special formatting for specific cell values
hyperlinks to generated slides
Input data format is at this time just JSON.
There are versions for Windows and Mac OS X.
More information is at http://www.SlideMight.com
Disclaimer:
I am the owner of Delftware Technology, the company that developed SlideMight.
And I am one of the developers.
You can use Essential Presentation product from Syncfusion Software Private Limited. This product can be used to
Create and manipulate PowerPoint presentations
Open, modify, and save existing PowerPoint presentations
Convert PowerPoint presentations to PDF or Image
More information is at https://help.syncfusion.com/file-formats/presentation/overview
Disclaimer:
I work for Syncfusion Software Private Limited
I am looking for ways to convert the first page of a docx (and later excel and powerpoint) document to an image. I would rather not manually parse the document's entire xml since that seems like a lot of work ;)
So I guess I'm just trying to collect some resources on how to best tackle this.
Thanks!
Rendering is the most difficult task :-)
I would go something like this:
DOCX -> PDF (via Aspose.Words) -> Render via Aspose.Pdf.Kit
Not a perfect solutions (in terms of "looks like printed from Word"), but the most suitable I can think of.
In addition, you could use some Microsoft Office interop, which would require Office to be installed.
I'm trying to create a web script that will allow me to alter PDF templates that I have uploaded and re-output them. I have tried Zend already which allows me to write to a PDF but that means leaving the PDF blank in certain space which is to primitive for what I need. PDFFlip was not any better.
We need to implement functionality so we can remove content from the PDF as well as remove and replace. I have looked at CAM::PDF and changepagestring.pl but I'm not sure it's up to the job. I was hard pressed to find any real usage examples and Perl is not a language I have used before.
This is for a web project but I am flexible with the language we use, ideally PHP or ASP.NET C# would be great. Preferably not Java unless there is no other way.
I should also point out that I looked through the FoxitReader SDK without any luck. I never tried to implement it but I found no mention of find and replace like functionality.
You can tinker with PDF text but it is not straight-forward just to search and replace. The text is designed as an end-file format not for easy editing. I wrote a blog post explaining some of the issues at http://pdf.jpedal.org/java-pdf-blog/bid/12670/PDF-text
May be as workaround it's better to hold and fill in templates in some format that is more convenient for editing? E.g., you can keep your templates as Microsoft Word templates and then export them to PDF after filling. This thread may be useful on this way.
PDF file format isn't quite appropriate for editing.
Alternatively, you may prepare your templates as PDFs containing form fields. In this case filling of form fields is common and well-known task and there a lot of pdf components for this.
I am planning on generating a Word document on the webserver dynamically. Is there good way of doing this in c#? I know I could script Word to do this but I would prefer another option.
I've worked at a company in the past that really wanted generated word documents, in the end they were perfectly satisfied with RTF docs that had a ".doc" extension. Word has no problem recognizing and opening them.
The RTF docs were generated with iText.net (free .net library), the API is pretty easy to use, performs extremely well, you don't need word on the machine, also, you could extend to generating PDF, HTML, and Text docs in the future with very little effort. After four years the solution I created is still in place, so that's a little testimony in iText.net's favor.
It looks like the official iText page suggests that iText Sharp is the best .Net choice right now, so that's another option
You'd be better off generating an rtf file, which word will know how to open.
If want to generate Office 2007 documents check the Open XML File Formats, they're simple zipped XML files, check this links:
Open XML File Formats: What is it, and how can I get started?
Introducing the Office (2007) Open XML File Formats
Edit: Check this project, can serve you as a good starting point:
DocumentMaker
Seems very simple and customizable, look this code snippet:
Paragraph p = new Paragraph();
p.Runs.Add(new Run("Text can have multiple format styles, they can be "));
p.Runs.Add(new Run("bold and italic",
TextFormats.Format.Bold | TextFormats.Format.Italic));
doc.Paragraphs.Add(p);
Word will quite happily open a HTML with a .doc extension. If you include an internal style sheet, you can have it fully formatted. There was previous post on this subject:
Export to Word Document in C#
Creating the old .DOC files (pre-Word 2007) is nigh-impossible without Word itself. The format is just too complex. Microsoft has released the format description, but it's enough to reduce a grown programmer to tears. There is a reason for that too (historical), but that doesn't make things better.
The new .DOCX would be easier, although quite a bit of hassle still. However depending on which Word versions you are targeting, there are some other options too.
For one, there is the classic .RTF. The format is pretty complex still, yet well documented and has strong support across many applications and platforms. And you might use some string-replacing into template files to make things easier (it's non-binary).
Then there are the "old" Word XML files. I think they worked starting with Word XP. Kinda the predecessors of .DOCX. I've used them, not bad. And the documentation is pretty OK.
Finally, the easy way that I would choose, is to make a simple HTML. Word can load HTML files just fine starting with version 2000. In the simplest way just change the extension of a HTML file to .DOC and you have it. You can also add a few word-specific tags and comments to make it look even better in Word. Use the Word's Save As...HTML option to see what they are.
There are third party libraries about that will do the job.
Doing a quick google came up with this one, for example.
I haven't tried any, so I can't give you specific advice, I'm afraid!
Let us know how you get on...
In Office 2007 Microsoft introduced a new file format called the Microsoft Open Office XML Format (.docx). This format is not compatible with older versions of Microsoft Word. Since this is XML you can create or read with out having a Word installed.
Here is the component that generates document based on the custom template. The documents are generated from the sharepoint list ... so the data is pulled from the list item into the document on the fly:
http://store.sharemuch.com/products/generate-word-documents-from-sharepoint-list
Hope that helps,
Yaroslav Pentsarskyy
Blog: www.sharemuch.com