How do I output a webpage that contains MathML to PDF? - c#

My web application displays MathML embedded in HTML using the MathPlayer plugin. I need to output to PDF. I have PDF components (Dynamic PDF, ABCpdf), but they don't know how to parse the MathML, of course.
Is there a library that can help me translate the MathML to an image or something that I can feed to the PDF components on the fly in the web application?

Design Science has a command line Windows executable (also available as a DLL) that will convert all of the MathML in a document to EPS for use in PDF. It's the Document Composer, which is part of the MathFlow SDK. Contact us if you're interested in more info or an evaluation.

FYI, I have also found another PDF component that supports MathML called AHFormatter. I have not tried it, but it apparently works very well.

Related

How HTML to PDF conversion works (especially abcPDF)

My new project converts HTML into PDF on the fly, given a URL.
After some initial research, I came up with a solution that converts the HTML to an image and then the image to PDF.
But it's not an ideal solution, as the user cannot copy and paste text from the PDF file.
Recently I came across the abcPDF component; you can check their demo here: http://www.abcpdfeditor.com/
Now I am wondering how they are able to produce such a nice PDF with all these features. What is their logic? I don't think they parse each and every HTML tag to create the document. Do you have any idea?
Any help will be much appreciated.
In short, this is how most HTML to PDF conversion works:
HTML ----Converted To ----> EMF (Metafile/Vector Image) ----> PDF
Basically, IE's rendering engine (i.e., MSHTML) has some APIs through which you can export a loaded HTML page as EMF (Enhanced Metafile Format), which is simply a vector image.
You can make use of this open-source web browser control for this purpose:
http://groups.google.com/group/csexwb
Then you have to render the generated EMF file onto a PDF page. This is typically called EMF to PDF conversion. As far as I know, there is no free EMF to PDF conversion software available, but iTextSharp provides minimal support for the WMF format.

Pdf printer driver C#

I am looking for a free, open-source, .NET-based (preferably C#) PDF driver. Any idea where I can download one?
PDFCreator
PDFCreator easily creates PDFs from any Windows program. Use it like a printer in Word, StarCalc or any other Windows application.
If you need to use the created PDF file inside your C# application, the easiest way is to generate the PDF inside the application itself. Then you don't need to monitor a temp folder for a newly created file.
To generate a PDF inside your application, you will need a PDF-generating library for C#.
For example, the PDFFlow library. It can generate all the elements of a PDF document (text element, paragraph, image, inline image, line, table, page number, header/footer, etc.), so you can construct "any kind of document", as you said.
Hope this idea helps.

How can I convert PDF to doc without microsoft.office.interop?

I need to convert PDF files into .doc files using C#. The computer has no file system though it doesn't have Office installed. Any good ideas how I can approach this? I did some research, and most people use the interop services.
You need to understand that PDF is not really implemented as a single document format.
If your PDF docs are created by rendering text to a PDF file, then direct PDF conversion is not only possible, but can be very good (reliable).
If the source of your PDF is a scanner or a fax (essentially a scanner), then what you have is a document containing a "picture" of text. This scenario is more difficult to deal with: if you open up the markup, there is no text to be converted. In this situation you have to use some form of OCR (optical character recognition), which is less reliable due to a variety of issues.
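One rough way to tell the two cases apart is to check whether any page yields extractable text. The sketch below assumes the iTextSharp 5.x library is referenced by the project; the names `PdfTextProbe` and `HasExtractableText` are made up for illustration. A scanned PDF that has already been run through OCR may carry a hidden text layer and would also report text, so treat the result as a hint only.

```csharp
using System;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

static class PdfTextProbe
{
    // Returns true if at least one page yields non-whitespace text.
    // Hypothetical helper; assumes iTextSharp 5.x is referenced.
    public static bool HasExtractableText(string pdfPath)
    {
        PdfReader reader = new PdfReader(pdfPath);
        try
        {
            // iTextSharp page numbers are 1-based.
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                string text = PdfTextExtractor.GetTextFromPage(reader, page);
                if (text != null && text.Trim().Length > 0)
                {
                    return true;
                }
            }
            return false;
        }
        finally
        {
            reader.Close();
        }
    }
}
```

If `HasExtractableText` returns false for a file, it is probably an image-only PDF and will need OCR before anything useful can go into a Word document.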
If you have the option of intercepting the data before it is rendered to PDF (say like in SSRS or Crystal) then it would be better for you to bypass the PDF stage and move your data to a Word document.
If you are constrained to receiving faxes and then needing to interpret their content, prepare for OCR hell. It has been a while since I was there, so I hope that it has gotten better.
Even without Office installed on your machine, you have access (with Visual Studio) to the Office developer toolkit, which will allow you to build documents to be distributed in the Word formats (.doc/.docx).
Another option may be to convert the PDF to HTML, which can be opened in Word.
Use Aspose.Pdf.Kit to convert the PDF to text, and then the text to .doc using a FileStream or Aspose's Word document library.

What's the best way to generate a file and print it in .net?

I am working on a desktop project in C# with .NET. This project has a function that generates some information, and I would like to print this generated information as a document (maybe .doc, .pdf, etc.). To summarize, I need to:
Get the data generated by a function;
Generate a document containing this information, structured with a title, text, and tables (things that every document has);
Print it.
I thought of generating an .html file (because it's simple to generate this kind of file), but I couldn't find a way to print it directly from my program.
Which file format would you recommend for holding this kind of information and printing it directly from my program?
Thanks in advance.
Here's an easy way that uses a RichTextBox
http://www.codeproject.com/KB/printing/simpleprintingcs.aspx
It's not trivial to print a PDF, HTML, or a doc unless you are going to use external programs or third-party libraries. ImageMagick/GhostScript could help you print PDF.
Disclaimer: I work at Atalasoft -- If you are willing to use commercial software, my company makes PDF rendering components for .NET. There are companies that do the same for HTML.
Directly? Open printer port...
Or you can do it with framework classes:
How to: Print with a WebBrowser Control
http://msdn.microsoft.com/en-us/library/b0wes9a3.aspx
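To make the WebBrowser route concrete, here is a minimal sketch that builds an HTML document with a title, a paragraph, and a table, then prints it once the control has finished loading. The class and method names (`HtmlReportPrinter`, `BuildReportHtml`, `PrintHtml`) are hypothetical, `WebUtility.HtmlEncode` assumes .NET 4.0 or later, and `PrintHtml` assumes it is called from the UI thread of a running Windows Forms application. I have not tested it against every IE version, so check that `DocumentCompleted` fires only once in your setup.

```csharp
using System;
using System.Net;
using System.Text;
using System.Windows.Forms;

static class HtmlReportPrinter
{
    // Builds a minimal HTML document: title, one paragraph, one table.
    public static string BuildReportHtml(string title, string summary,
                                         string[] headers, string[][] rows)
    {
        var sb = new StringBuilder();
        sb.Append("<html><head><title>")
          .Append(WebUtility.HtmlEncode(title))
          .Append("</title></head><body>");
        sb.Append("<h1>").Append(WebUtility.HtmlEncode(title)).Append("</h1>");
        sb.Append("<p>").Append(WebUtility.HtmlEncode(summary)).Append("</p>");
        sb.Append("<table border=\"1\"><tr>");
        foreach (string header in headers)
        {
            sb.Append("<th>").Append(WebUtility.HtmlEncode(header)).Append("</th>");
        }
        sb.Append("</tr>");
        foreach (string[] row in rows)
        {
            sb.Append("<tr>");
            foreach (string cell in row)
            {
                sb.Append("<td>").Append(WebUtility.HtmlEncode(cell)).Append("</td>");
            }
            sb.Append("</tr>");
        }
        sb.Append("</table></body></html>");
        return sb.ToString();
    }

    // Loads the HTML into an off-screen WebBrowser and prints it
    // with the current print settings once loading has completed.
    public static void PrintHtml(string html)
    {
        var browser = new WebBrowser();
        browser.DocumentCompleted += (sender, e) =>
        {
            var loaded = (WebBrowser)sender;
            loaded.Print();
            loaded.Dispose();
        };
        browser.DocumentText = html;
    }
}
```

From a button handler you would call something like `HtmlReportPrinter.PrintHtml(HtmlReportPrinter.BuildReportHtml("Report", "Generated data", headers, rows))`. If you want the user to pick a printer, `ShowPrintDialog()` can be used in place of `Print()`.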

HTML to Image .tiff File

Is there a way to convert an HTML string into a .tiff image file?
I am using C# .NET 3.5. The requirement is to give the user an option to fax a confirmation. The confirmation is created with XML and an XSLT. Typically it is e-mailed.
Is there a way I can take the HTML string generated by the transformation and convert it to a .tiff or any image that can be faxed?
3rd party software is allowed; however, the cheaper the better.
We are using a 3rd party fax library that will only accept .tiff images, but if I can get the HTML into any image format, I can convert it into a .tiff.
Here are some free-as-in-beer possibilities:
You can use the PDFCreator printer driver that comes with Ghostscript and print directly to a TIFF file or many other formats.
If you have MS Office installed, the Microsoft Office Document Image Writer will produce a file you can convert to other formats.
But in general, your best bet is to print to a driver that will produce an image file of some kind or a Windows Metafile (.wmf) file.
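Once a driver has produced an image file, the last step (image to .tiff) can be sketched with System.Drawing alone, which is available in .NET 3.5. The names `TiffConverter` and `ConvertToTiff` are made up for illustration. Note that this saves with the default TIFF encoder settings; a fax library may expect a bitonal image with a specific compression, which this sketch does not attempt.

```csharp
using System.Drawing;
using System.Drawing.Imaging;

static class TiffConverter
{
    // Loads any image format GDI+ can read (PNG, BMP, JPEG, WMF, ...)
    // and saves it again as a TIFF file with default encoder settings.
    public static void ConvertToTiff(string sourceImagePath, string tiffPath)
    {
        using (Image image = Image.FromFile(sourceImagePath))
        {
            image.Save(tiffPath, ImageFormat.Tiff);
        }
    }
}
```

Usage would be along the lines of `TiffConverter.ConvertToTiff(@"C:\temp\confirmation.png", @"C:\temp\confirmation.tiff")`, where both paths are placeholders.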
Is there some reason why you can't just print-to-fax? Does the third-party software not support a printer driver? That's unusual these days.
A starting point might be the software from WebSuperGoo, which provides rich image editing products, cheap or for free.
I know for sure their PDF Writer can do basic HTML (http://www.websupergoo.com/helppdf6net/source/3-concepts/b-htmlstyles.htm). The result should not be too hard to convert to TIFF.
It does not cover full HTML or CSS, though. That might require using Microsoft's IE ActiveX component.
