I am trying to convert HTML to DOCX without using the Interop DLL. I have tried "Dynamically generate a MS Word document using HTML & CSS" and also Html to OpenXml.
I can't find a way to convert HTML to DOCX with all the styles and images intact. The OpenXml converter does support styles, but only when they are inline. If the styles are defined in a CSS file, they are not reflected in the output.
What alternatives are there to achieve this?
Question 1 - Preserve Styling
Pre-process the HTML to inline the styles from the CSS file before using html2openxml to convert the document.
Question 2 - Preserve Images
According to the documentation, images are supposed to work in that converter: http://html2openxml.codeplex.com/wikipage?title=ImageProcessing&referringTitle=Documentation
You may need to debug this a bit or post more information.
Edit
Maybe you forgot to set the base path:
converter.BaseImageUrl = new Uri("http://myserver:8080/");
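Putting both suggestions together, here is a rough sketch of how it might look. It assumes PreMailer.Net for the CSS inlining step (my choice, not something named above; any CSS inliner would do) and the older HtmlToOpenXml API with ParseHtml and BaseImageUrl, which may differ in newer releases. The file paths and the imageBase parameter are hypothetical.

```csharp
using System;
using System.IO;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
using HtmlToOpenXml;

static class HtmlToDocx
{
    // htmlPath, cssPath and docxPath are hypothetical file locations;
    // imageBase is the URL that relative <img src> values are resolved against.
    public static void Convert(string htmlPath, string cssPath, string docxPath, Uri imageBase)
    {
        // Step 1: fold the external stylesheet into style="..." attributes
        // (assumption: PreMailer.Net is used as the inliner).
        string html = File.ReadAllText(htmlPath);
        string css = File.ReadAllText(cssPath);
        string inlined = PreMailer.Net.PreMailer.MoveCssInline(html, css: css).Html;

        // Step 2: create an empty .docx and let the converter fill the body.
        using (WordprocessingDocument package =
            WordprocessingDocument.Create(docxPath, WordprocessingDocumentType.Document))
        {
            MainDocumentPart mainPart = package.AddMainDocumentPart();
            mainPart.Document = new Document(new Body());

            var converter = new HtmlConverter(mainPart);
            converter.BaseImageUrl = imageBase; // the base path from the edit above

            Body body = mainPart.Document.Body;
            foreach (var element in converter.ParseHtml(inlined))
                body.Append(element);

            mainPart.Document.Save();
        }
    }
}
```

Something like HtmlToDocx.Convert("page.htm", "site.css", "page.docx", new Uri("http://myserver:8080/")) would then produce the document. If the images still do not show up, inspecting the inlined HTML first should tell you whether the problem is in the markup or in the converter.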
Related
I'm using OpenHtmlToPdf to convert an HTML file to a PDF file. This works, but it doesn't support CSS grid and also doesn't render the SVG that is embedded in the HTML file.
Do you have an idea of how to fix this, or do you know of any alternative libraries that support SVG and CSS grid?
Best regards
I have built several PDF documents dynamically from ASP.NET pages (HTML/CSS) using plugins like Winnovative HtmlToPDFconverter. It has always been a successful outcome using the built-in functionality for those plugins, like merging existing PDF documents with dynamic content and adding pre-defined headers and footers, adding page margins, page numbers and so forth. The HTML content has overall been rendered as expected in the final PDF document(s).
Is there any way/any advice for a similar .NET plugin that can render HTML/CSHTML to a Microsoft Word document (.docx) in the same way – or is it too difficult to render native HTML5 and CSS into a desirable layout for a Microsoft Word Document?
I have googled around and found some suggestions, but I'm looking for recommendations for a specific plugin, or a warning if it is a no-go and too difficult to get the desired layout 1:1 from HTML to a Word document because of incompatibility between the markups.
The DevExpress HTML editor can export its HTML content to different formats, including DOCX and RTF. I'm not sure about its limitations (e.g. script and canvas export), but in the common case it works well.
In Google Chrome, when you open an XML file, you get a formatted (pretty) view of the XML if no stylesheet is referenced in the XML file itself.
I simply want to do this in my application, which uses Awesomium.
I am using the Awesomium.Windows.Forms.WebControl
I don't want to roll my own if I can avoid it.
Thanks!
I'm doing this in an internal tool for my development team. I format the XML with an XSL stylesheet that colors and indents everything, then update the web control with the resulting HTML.
Check out this link for formatting XML. The CSS styles are built in, so you can change the style colors as you wish.
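For illustration, the transform step can be sketched with XslCompiledTransform from the base class library. The embedded stylesheet below is a tiny placeholder of my own (it only prints element names and text, no attributes), not the stylesheet from my tool, so swap in a real one for proper highlighting.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

static class XmlToHtml
{
    // Placeholder stylesheet (an assumption for illustration): wraps every
    // element in an indented <div> and colors the tag names.
    const string Stylesheet = @"<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='html' indent='no'/>
  <xsl:template match='/'>
    <html><body style='font-family:monospace'><xsl:apply-templates/></body></html>
  </xsl:template>
  <xsl:template match='*'>
    <div style='margin-left:1em'>
      <span style='color:#881280'>&lt;<xsl:value-of select='name()'/>&gt;</span>
      <xsl:apply-templates/>
      <span style='color:#881280'>&lt;/<xsl:value-of select='name()'/>&gt;</span>
    </div>
  </xsl:template>
</xsl:stylesheet>";

    // Runs the XML through the stylesheet and returns the resulting HTML.
    public static string Render(string xml)
    {
        var xslt = new XslCompiledTransform();
        using (var styleReader = XmlReader.Create(new StringReader(Stylesheet)))
            xslt.Load(styleReader);

        using (var input = XmlReader.Create(new StringReader(xml)))
        using (var output = new StringWriter())
        {
            xslt.Transform(input, null, output);
            return output.ToString();
        }
    }
}
```

The returned string is what you hand to the web control; with Awesomium that would presumably be the control's LoadHTML method, but check that against the version you are using.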
See the "XML to HTML Verbatim Formatter with Syntax Highlighting" project on this page.
http://www2.informatik.hu-berlin.de/~obecker/XSLT/
I am using iTextSharp to generate a PDF invoice.
I have a template for the invoice which is very simple, but uses CSS3 for the formatting and styling.
When I display the page in a browser it works fine, but when I try to generate the page as a PDF using iTextSharp, it seems to ignore all the CSS3 formatting for some reason.
My question is: Is there a way to get it to work with CSS? Or is that a limitation of iTextSharp?
Are your stylesheets linked with an absolute path or a relative one? PDF converters don't tend to work too well with relative paths.
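If relative paths do turn out to be the cause, one way to rule them out is to rewrite the stylesheet links to absolute URLs before handing the HTML to the converter. This is only a sketch: the regular expression is a simplification that handles quoted href attributes on link tags, and the base address in the usage note is a made-up example.

```csharp
using System;
using System.Text.RegularExpressions;

static class StylesheetLinks
{
    // Rewrites href values on <link> tags so they are absolute relative to baseUri.
    // Links that are already absolute are left pointing at the same address.
    public static string MakeAbsolute(string html, Uri baseUri)
    {
        return Regex.Replace(
            html,
            @"(<link\b[^>]*?\bhref\s*=\s*[""'])([^""']+)([""'])",
            m => m.Groups[1].Value
                 + new Uri(baseUri, m.Groups[2].Value).AbsoluteUri
                 + m.Groups[3].Value,
            RegexOptions.IgnoreCase);
    }
}
```

For example, with a base of http://example.com/invoices/, a link to css/invoice.css becomes http://example.com/invoices/css/invoice.css.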
Context
What we need is to capture some user input (formatted text) from a WPF application and output a PDF with some stored images AND the user input on the last page.
What we've tried
We created the WPF app, added the iTextSharp library, retrieved the images from the DB and added them to the PDF. That works. For the user input, we added a RichTextBox control from the Extended WPF Toolkit, mainly because of its binding properties and formatters: we can bind the rich content of the control to a property. That binding works, and we already have the content in RTF format, for example:
"{\rtf1\ansi\ansicpg1252\uc1\htmautsp\deff2{\fonttbl{\f0\fcharset0 Times New Roman;}{\f2\fcharset0 Segoe UI;}}{\colortbl\red0\green0\blue0;\red255\green255\blue255;}\loch\hich\dbch\pard\plain\ltrpar\itap0{\lang1033\fs18\f2\cf0 \cf0\ql{\f2 {\ltrch This is the }{\b\ltrch RichTextBox}\li0\ri0\sa0\sb0\fi0\ql\par}}}"
Problem
The thing is, the actual output of the PDF is precisely that previously shown RTF, but the expected output (for the example) must be:
"This is the **RichTextBox**\r\n"
This obviously happens because we insert the bound RTF from the control into the PDF exactly as it comes. The question is: how can we add that content and specify that it is RTF?
PS. If you have another working idea or solution (without using a RichTextBox, or something like that), it is welcome. Thanks in advance.
Unfortunately, iTextSharp no longer supports the RTF format directly. I would suggest converting the RTF fragment to XHTML first and then importing the resulting XHTML into the final document (it seems that the official HTML support has gone away, so XHTML is the only alternative in this case).
In short, I would suggest that you:
convert the RTF fragment into XHTML;
place the XHTML stream into a new iTextSharp document (or directly into the final document, if you wish);
add the content of the aforementioned document into the target document you are going to export as PDF.
UPDATE
There is no built-in mechanism to convert from RTF to XHTML, but many open source projects exist; I would start by coupling this RTF to HTML converter with the HTML Agility Pack (which will in turn convert your HTML to XHTML).
Frankly, however, the whole flow is a bit complex to follow and I would perhaps opt for a simpler solution, maybe by using an HTML editor (alternative) directly in your project or by reverting to the FlowDocument as others have suggested.
WPF already has a good FlowDocument, and it renders well. So we created a XAML to PDF converter. It's in beta, but most strings, tables and images are converted to PDF successfully. It's an open source project available at http://xamltopdf.codeplex.com/. RTF can give you a FlowDocument, which you can convert to XAML and pass on to the XamlToPDF converter.
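The RTF to FlowDocument to XAML part can be sketched with the standard WPF TextRange class. The class and method names below are made up for illustration, the sketch assumes it runs inside the WPF application, and the call into the XamlToPDF converter itself is left out because its API is not shown here.

```csharp
using System.IO;
using System.Text;
using System.Windows;
using System.Windows.Documents;

static class RtfToFlowDocument
{
    // Loads an RTF string (such as the one bound from the RichTextBox)
    // into a new FlowDocument.
    public static FlowDocument Load(string rtf)
    {
        var document = new FlowDocument();
        var range = new TextRange(document.ContentStart, document.ContentEnd);
        using (var stream = new MemoryStream(Encoding.ASCII.GetBytes(rtf)))
            range.Load(stream, DataFormats.Rtf);
        return document;
    }

    // Serializes the document content to a XAML string.
    public static string ToXaml(FlowDocument document)
    {
        var range = new TextRange(document.ContentStart, document.ContentEnd);
        using (var stream = new MemoryStream())
        {
            range.Save(stream, DataFormats.Xaml);
            stream.Position = 0;
            using (var reader = new StreamReader(stream))
                return reader.ReadToEnd();
        }
    }
}
```

The XAML string (or the FlowDocument itself) is then what you would feed to whichever XAML to PDF step you choose.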