Convert asp.net page to Word document - c#

I have built several PDF documents dynamically from ASP.NET pages (HTML/CSS) using plugins like Winnovative HtmlToPDFconverter. It has always been a successful outcome using the built-in functionality for those plugins, like merging existing PDF documents with dynamic content and adding pre-defined headers and footers, adding page margins, page numbers and so forth. The HTML content has overall been rendered as expected in the final PDF document(s).
Is there any way/any advice for a similar .NET plugin that can render HTML/CSHTML to a Microsoft Word document (.docx) in the same way – or is it too difficult to render native HTML5 and CSS into a desirable layout for a Microsoft Word Document?
I have googled around and found some suggestions, but I'm looking for a recommendation for a specific plugin, or a warning if this is a no-go because incompatibilities between the markups make it too difficult to get the desired layout 1:1 from HTML to a Word document.

The DevExpress HTML editor can export its HTML content to different formats, including docx and rtf. I'm not sure about its limitations (e.g. script and canvas export), but in the common case it works well.

Related

Using a PDF template with PDFSharp and/or MigraDoc

I was wondering whether it would at all be possible to have our creative department design a nice-looking PDF template for our client, e.g. a fancy letterhead, then supply it to me so I could inject various types of content into the body using PDFSharp or MigraDoc.
Currently we generate the header and footer content as part of the rendering process, and it works very well, but as you can imagine, any non-trivial layout and styling is pretty complicated to pull off in what is essentially a 2D graphics environment.
So the thought arose as to whether one of these tools would be able to take a pre-existing PDF, give me access to various objects, and allow me to e.g. replace certain text placeholders or manipulate the PDF "DOM" in a more intelligent fashion.
Something similar to working with Spreadsheets (binary and XML versions) or OpenXML, etc.
What we do: take an existing PDF page, draw it at the bottom (Z axis) of a new PDF page, and then use MigraDoc to add other contents to the page.
PDFsharp can also be used to draw on top.
The template PDF pages are used like letter heads with the corporate design of a customer and the final document will have as many pages as needed.
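The approach described above can be sketched with PDFsharp alone. This is only a minimal sketch under stated assumptions: the file names are hypothetical, the template is taken to be a single page, and the body text is drawn with PDFsharp's XGraphics rather than MigraDoc to keep the example short.

```csharp
using PdfSharp.Drawing;
using PdfSharp.Pdf;

class LetterheadDemo
{
    static void Main()
    {
        // Hypothetical file names; replace with real paths.
        const string templatePath = "letterhead.pdf";
        const string outputPath = "letter.pdf";

        // Load the existing PDF page as a form that can be drawn like an image.
        XPdfForm template = XPdfForm.FromFile(templatePath);

        PdfDocument output = new PdfDocument();
        PdfPage page = output.AddPage();

        using (XGraphics gfx = XGraphics.FromPdfPage(page))
        {
            // Draw the template first so everything drawn later sits on top of it.
            gfx.DrawImage(template, new XRect(0, 0, page.Width.Point, page.Height.Point));

            // Illustrative body content; the position is in points from the top-left corner.
            XFont font = new XFont("Arial", 12);
            gfx.DrawString("Dear customer,", font, XBrushes.Black, new XPoint(72, 200));
        }

        output.Save(outputPath);
    }
}
```

The new page uses PDFsharp's default page size, so a template with different dimensions is scaled to fit. For a multi-page result, the same template can be drawn onto every page that is added.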

Display MSWord file content in any browser

I want to display the content of a Word file in the browser, the same way we display a PDF file in the browser. I don't want to use a plugin, because then I would have to install it for every browser. I want a single solution that works in all browsers.
I have searched on Google, but all the links I found simply download the Word file and open it.
Currently I am using an object tag to display PDF files, but it does not work for Word files. It shows the message: The plug-in is not supported.
Using a browser plug-in (such as the free Word Viewer) is by far the easiest method, and arguably the most correct - however, there are some alternatives if you really don't want to do this:
Convert the Word document to another format (e.g. HTML/PDF) on-the-fly before the response is sent. For Word 97-2003 documents, you can do this with VSTO/Automation. For Word 2007+ documents, you can use the OpenXML SDK (although you will have to write the conversion algorithm yourself).
Use an XSL stylesheet to transform the Word markup (docx) into html/css. You can do this server-side or, potentially, with client-side scripting (JavaScript). Some useful resources here and here.
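As a rough illustration of the OpenXML SDK route for Word 2007+ documents, here is a minimal sketch that turns each top-level paragraph of a .docx file into an HTML paragraph. The class and method names are made up for this example, and it deliberately ignores tables, images, styles and numbering, which is exactly the part of the conversion you would have to write yourself.

```csharp
using System.Net;
using System.Text;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

static class DocxToHtml
{
    // Hypothetical helper: returns a bare-bones HTML page with one <p> per Word paragraph.
    public static string Convert(string docxPath)
    {
        var html = new StringBuilder("<html><body>\n");

        // Open the package read-only.
        using (WordprocessingDocument doc = WordprocessingDocument.Open(docxPath, false))
        {
            Body body = doc.MainDocumentPart.Document.Body;

            // Only direct child paragraphs of the body are visited; tables are skipped.
            foreach (Paragraph paragraph in body.Elements<Paragraph>())
            {
                // InnerText concatenates the text of all runs; encode it for safe HTML output.
                html.Append("<p>")
                    .Append(WebUtility.HtmlEncode(paragraph.InnerText))
                    .Append("</p>\n");
            }
        }

        return html.Append("</body></html>").ToString();
    }
}
```

The returned string can be written to the response with a text/html content type, so the browser renders it without any plugin.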
Great question. In principle, browsers only really tend to support viewing websites (e.g. html). Most, however, also support viewing PDFs, and, as you've correctly identified, you could use plugins to extend the behaviour. Crucially, though, some browsers provide document viewing with a javascript-based viewer.
I wasn't aware of it before you asked, but there are apparently javascript implementations of non-PDF document readers--for example, ViewerJS--that seem to directly support .odt. With a little digging, you might be able to find an implementation/plugin for a javascript viewer that supports .docx. However, I can't recommend one from personal experience at the moment. I would recommend searching for javascript document viewers though.

Creating PDFs Online

We are using Report Definition Language (RDL) templates to define various reports in one of our SharePoint applications. These reports are then saved as PDFs into various SharePoint document libraries. One report in particular renders, but is considered to be "failing" because of its styling needs. It appears that RDL only understands "very simple" HTML.
For Example:
Trademark characters are not rendering as superscript (they render as normal text instead)
The ability to assign Line Height fails
The ability to assign Word Spacing fails (so printers "leading" requirements fail)
These all point to known Microsoft limitations in how RDL interprets HTML, of which we are now aware.
So...
I need a better tool...and we are scratching our heads on this one!
QUESTION:
What tools take in HTML, understand CSS (well!) and can generate PDFs from C# objects?
Please keep in mind that I need the PDF generator tools you recommend (below) to understand CSS and HTML.
NOTE:
I looked at the various other Stack Exchange sites to see if there is a better forum for this particular question, but this one was the only one that seemed to fit the bill. If you are a moderator and feel this question is misplaced, please feel free to move it.
This HTML to PDF converter has the most accurate conversion of a complex html/css page. There is also a demo to try the conversion with your html
Maybe you can give Amyuni WebkitPDF a try. It is a free component for converting HTML+CSS into PDF files. From the home page:
Directly convert HTML files into PDF without the use of a web browser or a printer driver
Convert HTML files into XAML/XPS for rendering within Silverlight
Integrate and deploy the HTML conversion feature within your applications
Generate either a single continuous PDF page or split the HTML into multiple PDF pages
Amyuni WebkitPDF is distributed as a library with a sample application, and sample code for C++ and C#.
Disclaimer: I currently work as software developer at Amyuni Technologies.
I only know a workaround for the "leading space" issue. This example "leads" the value with 10 spaces:
=space(10) & Fields!FieldName.Value
This should work for any renderer. I'll update this if I come across other tricks.
Have a look at Aspose.Pdf for .NET: http://www.aspose.com/categories/.net-components/aspose.pdf-for-.net/default.aspx

Generating PDF Report from database in C#, specifically ASP

I need to generate a high quality report based on information in a SQL Server database, and I want very explicit control of the layout and appearance from inside C#.
I have several choices that I know of that are already being used for various other reports at our company:
1) SQL Server's built in Reporting Services
2) Adobe Forms
3) Crystal Reports
This information I need as PDF directly parallels what is already being displayed in the user's web browser as HTML, so creating a print stylesheet and converting the browser body to PDF is an option as well.
So this creates option 4:
4) JavaScript convert HTML to PDF (my preference at this time)
Does anybody have a recommendation as to which approach I should take, or even better an alternative? All the choices seem pretty horrible.
I've used iTextSharp with very good results. It is an open-source .NET port of a Java library. It works really well for creating PDFs from scratch. Remember that editing PDFs will always be hacky with any library, because PDF is an output format, not a read-write format.
Provided your HTML is fairly clean (remove JavaScript postbacks, anchors, ...), the iText HTMLWorker can convert HTML to PDF, if you prefer that route.
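HTML to PDF using iTextSharp (namespaces iTextSharp.text, iTextSharp.text.pdf, iTextSharp.text.html.simpleparser and System.IO):
Document doc = new Document(PageSize.A4);
PdfWriter.GetInstance(doc, Response.OutputStream); // attach the writer before opening the document
doc.Open();
HTMLWorker parser = new HTMLWorker(doc);
parser.Parse(new StringReader(html)); // html: a string holding the markup to convert (assumed variable)
doc.Close();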
Also here.
Use SSRS, it has a built in PDF rendering mode.
I have used two other PDF report libraries with great success; Active Reports and Telerik Reporting. Personally I prefer the latter when it comes to programmatic control of layout and such.
Take a look also at the DevExpress Reporting (non-free 3rd party tool):
Overview
Online Demos
Documentation
Yes, you should use the best tools to get the best solution. The best tool in this case probably is SSRS.
But that's just looking at the capabilities of the tool.
Don't forget to look at your own capabilities!
My story: I know SQL, I know C#. (Both intermediate, I'm not a guru.)
Then I laid my hands on SSRS. And burnt them, once, twice, etc.
At the end, there was a nice result. So burning your fingers is not a wrong thing to do.
But first try to pull your HTML through an HTML to PDF converter (demo version) and see if the result serves your needs.
Currently I'm using both:
SSRS for creating invoices, because amounts have to be transported from one page to the next
Winnovative to generate documents that only need page numbers
I would suggest using the .NET ReportViewer control in local mode (no report server required). It works in both WebForms and WinForms. You create a client-side report (.rdlc) file, which contains all the visuals as well as the placement of data fields, link it to the ReportViewer, and supply the data (a DataTable or a collection of objects; it doesn't matter which, as long as the fields match). In local mode it supports exporting to PDF and Excel (and Word too? I don't remember). By default this is done through a dropdown in the control itself, but you can also export programmatically to any of the supported formats. You'll end up with a byte array you can shove into a file stream.
Basically you get most of the good parts of SSRS without all of that backend complexity. There should be a ReportViewer folder in %programFiles%\Microsoft Visual Studio 10.0\ReportViewer - but versions exist for 2005 and 2008 as well. Check out http://gotreportviewer.com/
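The programmatic export mentioned above could look roughly like the sketch below. It assumes the WebForms flavour of the ReportViewer assemblies; the helper name, the .rdlc path and the data set name are placeholders that must match your own report definition.

```csharp
using System.Data;
using Microsoft.Reporting.WebForms;

static class LocalReportExport
{
    // Hypothetical helper: renders an .rdlc report to PDF bytes without a report server.
    public static byte[] RenderPdf(string rdlcPath, string dataSetName, DataTable data)
    {
        var report = new LocalReport { ReportPath = rdlcPath };

        // dataSetName must equal the data set name declared inside the .rdlc file.
        report.DataSources.Add(new ReportDataSource(dataSetName, data));

        // Render reports these values back; only the returned bytes are needed here.
        string mimeType, encoding, extension;
        string[] streamIds;
        Warning[] warnings;

        return report.Render("PDF", null, out mimeType, out encoding,
                             out extension, out streamIds, out warnings);
    }
}
```

The resulting array can be saved with File.WriteAllBytes or written to the HTTP response with an application/pdf content type.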
I think the 4th option is the best. With it, you don't need to update the layout of the HTML page or the layout of the PDF separately whenever one of them changes.
It is also more convenient making a nice design via HTML than programmatically via C# :)
Take a look at WebToPDF.NET which is a .NET component written in C# that converts HTML to PDF. The converter supports HTML 4.01, XHTML 1.0, XHTML 1.1 and CSS 2.1 including page breaks, forms and links. It passes all W3C tests (except BIDI).
You can use FastReport. It's a good tool and it has a free version.

concatenating word documents and converting them to pdf

What is the best possible way to merge multiple documents and convert them to PDF? We also need to insert a blank page for every odd page.
A fully supported, server-side automated version of this (mostly baked into the MS camp, though) involves using the OpenXML SDK to do any field inserts, then using SharePoint's Word Automation Services (SP 2010) to convert the documents to PDF, and then picking your favorite PDF toolkit (iTextSharp for me) for any post-processing (merging documents, inserting blank pages, or adding images that must be positioned relative to specific pages).
The reason for doing the document merge in PDF rather than OpenXML is simplicity - you don't have to deal with merging styles, headers etc.
The reason for doing the blank pages and image insertion in PDF is that OpenXML has no idea how to render the content, and so it has no idea where page breaks would occur naturally (you can still insert breaks like you would in Word though).
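The PDF post-processing step could be sketched with iTextSharp as follows. This assumes iTextSharp 5.x and reads the blank-page requirement as "pad every document that has an odd number of pages with one blank page", which is only one possible interpretation. The class and method names are made up for the example.

```csharp
using System.Collections.Generic;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

static class PdfMerger
{
    // Hypothetical helper: concatenates the input PDFs into one output file.
    // At least one input page is required, otherwise closing the document fails.
    public static void Merge(IEnumerable<string> inputPaths, string outputPath)
    {
        using (var stream = new FileStream(outputPath, FileMode.Create))
        {
            var document = new Document();
            var copy = new PdfCopy(document, stream);
            document.Open();

            foreach (string path in inputPaths)
            {
                var reader = new PdfReader(path);
                int pageCount = reader.NumberOfPages;

                // Page numbers are 1-based in iTextSharp.
                for (int i = 1; i <= pageCount; i++)
                    copy.AddPage(copy.GetImportedPage(reader, i));

                // Assumed rule: pad odd-length documents with a blank page
                // that has the same size as their last page.
                if (pageCount % 2 == 1)
                    copy.AddPage(reader.GetPageSize(pageCount), 0);

                copy.FreeReader(reader);
                reader.Close();
            }

            document.Close();
        }
    }
}
```

With this padding rule, each merged document starts on a front side when the result is printed double-sided.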
If you are using C# and you are OK with a server based solution then have a look at this post. It uses a .net friendly web services interface.
There is an optional SharePoint version available as well, but as you did not include a SharePoint tag I assume that won't be of interest to you.
Full disclosure, I wrote that post.
