I'm using ExpertPDF (a library for .NET/C#) to convert HTML to PDF, and my problem is that the conversion takes a lot of time.
Are there any settings that will speed up the conversion?
The HTML page contains table data with just a few images, so it is not that complex.
Has anyone else experienced this problem, or would you recommend another library for doing this?
I'm thankful for any hints I can get; there must be a way to improve the performance of this conversion...
If you really need to build the PDF from HTML, I suggest having a look at websupergoo; as far as I know, there is no free or open source library that can export PDF from HTML.
There are a lot of questions on Stack Overflow about this subject:
Generate PDF from ASP.NET from raw HTML/CSS content?
Printing a PDF in .NET
How do I programmatically create a PDF in my .NET application?
see search result
I have a project where I need to create an HTML form (no problem) and then create a PDF file from the results using C#.
I have done this before in PHP using FPDF but this one needs to be C#. Ideally I want to put the code into a user control and then stick it in an Umbraco website.
Can anyone recommend a good way to do this? PDF doesn't need to be fancy, it'll just display text, we aim to create a generic purchase order based on what the customer wants from the form, which can then be emailed to them to print off on headed paper.
Thanks
There are a couple of recent problems with iTextSharp. The most annoying is that the latest version has deprecated the HTML parser, so everything now has to go through the XMLWorkerHelper singleton and its ParseXHtml method. I find this a real pain, since HTML pages that aren't well formed render fine in a browser and parsed OK with the old method, but now crash out with an exception. It therefore takes an extra step to make sure your HTML is well formed (as XHTML) first. If you are generating your HTML from an ASPX page and using Server.Execute() to get the stream, this might be useful to you for iTextSharp:
http://jwcooney.com/2012/12/30/generate-a-pdf-from-an-asp-net-web-page-using-the-itextsharp-xmlworker-namespace/
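For reference, the XMLWorker route described above can be sketched roughly as follows. This is a minimal sketch, assuming iTextSharp 5.x with the separate XMLWorker package referenced; the class and method names XhtmlToPdf and Convert are made up for illustration, and the input must already be well-formed XHTML or the parser will throw.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;

// Hypothetical helper (not part of iTextSharp): renders well-formed XHTML to PDF bytes.
public static class XhtmlToPdf
{
    public static byte[] Convert(string xhtml)
    {
        using (var output = new MemoryStream())
        {
            var document = new Document(PageSize.A4);
            PdfWriter writer = PdfWriter.GetInstance(document, output);
            document.Open();

            // XMLWorkerHelper is the singleton mentioned above; ParseXHtml
            // throws if the markup is not well formed.
            using (var reader = new StringReader(xhtml))
            {
                XMLWorkerHelper.GetInstance().ParseXHtml(writer, document, reader);
            }

            // Closing the document finishes the PDF; ToArray still works
            // on a closed MemoryStream.
            document.Close();
            return output.ToArray();
        }
    }
}
```

If the HTML comes from a page that is not guaranteed to be well formed, it has to be cleaned up before it is passed in.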
Be mindful that iTextSharp has a distinct lack of decent documentation for the recent changes, and the Java iText documentation doesn't translate perfectly to C#. That makes the learning curve far too long and steep for practical use on a short schedule. I've basically given up on that platform, though I may still create a baseline system to get something lean working while I learn another framework.
As a result, I'm looking at PDFizer and PDFSharp libraries. If I have some success, I'll report back.
Here is a library for converting HTML to PDF:
http://pdfcrowd.com/web-html-to-pdf-net/
I like the PDFsharp library. Not sure how it would work for your needs, though.
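For a text-only document like the purchase order described in the question, a PDFsharp version might look roughly like this. It is only a sketch, assuming the PDFsharp package is referenced and that a font named "Arial" is available on the machine; PurchaseOrderPdf and Write are illustrative names, and there is no handling for text that runs past the bottom of the page.

```csharp
using PdfSharp.Drawing;
using PdfSharp.Pdf;

// Hypothetical helper (not part of PDFsharp): writes plain text lines onto one page.
public static class PurchaseOrderPdf
{
    public static void Write(string path, string[] lines)
    {
        using (var document = new PdfDocument())
        {
            PdfPage page = document.AddPage();
            using (XGraphics gfx = XGraphics.FromPdfPage(page))
            {
                // Assumption: "Arial" is installed; substitute any available font.
                var font = new XFont("Arial", 12);
                double y = 60; // distance from the top of the page, in points
                foreach (string line in lines)
                {
                    gfx.DrawString(line, font, XBrushes.Black, 40, y);
                    y += 18; // fixed line spacing; no overflow handling
                }
            }
            document.Save(path);
        }
    }
}
```

The form values would be passed in as the lines array, and the saved file can then be attached to the email.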
Hi, I'm new to programming and I have written a few applications that access PDF content by using some DLL files. My question now is: how can we write our own DLL to access PDF files? I know it's a big process, but I'm very much interested in learning about this. Can anyone please help me?
You can start by reading the PDF specification (warning 32MB behind this link) in order to understand how the PDF file format is implemented. This is necessary if you want to be able to parse it and extract the information you are interested in.
In the meantime (as this reading might keep you occupied for quite a while), if you have pressing project deadlines you probably want to use an existing library such as iTextSharp.
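To give a feel for what parsing the format involves, here is a small sketch that uses only the .NET base class library. It reads two things the specification defines: the "%PDF-x.y" header at the start of the file and the "startxref" keyword near the end, which is followed by the byte offset of the cross-reference data. PdfProbe and its method names are made up for illustration, and the 1024-byte tail window is an arbitrary choice; a real parser has to do far more than this.

```csharp
using System;
using System.Text;

// Hypothetical helper: inspects the outermost structure of a PDF file.
public static class PdfProbe
{
    // Returns the version from the "%PDF-x.y" header, or null if there is no header.
    public static string ReadVersion(byte[] pdf)
    {
        string head = Encoding.ASCII.GetString(pdf, 0, Math.Min(pdf.Length, 16));
        if (!head.StartsWith("%PDF-", StringComparison.Ordinal)) return null;
        int end = 5;
        while (end < head.Length && (char.IsDigit(head[end]) || head[end] == '.')) end++;
        return head.Substring(5, end - 5);
    }

    // Returns the offset that follows the last "startxref" keyword, or -1 if not found.
    public static long ReadStartXref(byte[] pdf)
    {
        int tailLength = Math.Min(pdf.Length, 1024); // assumed to be enough for the trailer
        string tail = Encoding.ASCII.GetString(pdf, pdf.Length - tailLength, tailLength);
        int index = tail.LastIndexOf("startxref", StringComparison.Ordinal);
        if (index < 0) return -1;

        string rest = tail.Substring(index + "startxref".Length).TrimStart();
        int end = 0;
        while (end < rest.Length && char.IsDigit(rest[end])) end++;

        long offset;
        return end > 0 && long.TryParse(rest.Substring(0, end), out offset) ? offset : -1;
    }
}
```

From that offset a parser would go on to read the cross-reference data and then the individual objects, which is where most of the work lies.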
I know it's a big process but i'm very much interested to learn about this.
That's true. I'd suggest studying some open source APIs (such as iTextSharp) and PDF SDKs.
We are using Report Definition Language (RDL) templates to define various reports in one of our SharePoint applications. These reports are then saved as PDFs into various SharePoint document libraries. One report in particular renders, but is considered to be "failing" because of the report's styling needs. It appears RDL only understands "very simple" HTML.
For Example:
Trademark characters are not rendering as superscript (they render as normal text instead)
The ability to assign Line Height fails
The ability to assign Word Spacing fails (so the printer's "leading" requirements fail)
All of these point to known Microsoft limitations in how RDL interprets HTML, of which we are now aware.
So...
I need a better tool...and we are scratching our heads on this one!
QUESTION:
What tools take in HTML, understand CSS (well!) and can generate PDFs from C# objects?
Please keep in mind that I need the PDF generator tools you recommend (below) to understand CSS and HTML.
NOTE:
I looked at the various other Stack Exchange sites to see if there is a better forum for this particular question, but this one was the only one that seemed to fit the bill. If you are a moderator and feel this question is misplaced, please feel free to move it.
This HTML to PDF converter has the most accurate conversion of a complex HTML/CSS page. There is also a demo where you can try the conversion with your own HTML.
Maybe you can give Amyuni WebkitPDF a try. It is a Free component for converting HTML+CSS into PDF files. From the home page:
Directly convert HTML files into PDF without the use of a web browser or a printer driver
Convert HTML files into XAML/XPS for rendering within Silverlight
Integrate and deploy the HTML conversion feature within your applications
Generate either a single continuous PDF page or split the HTML into multiple PDF pages
Amyuni WebkitPDF is distributed as a library with a sample application, and sample code for C++ and C#.
Disclaimer: I currently work as software developer at Amyuni Technologies.
I only know a workaround for the "leading space" issue. This example "leads" the value with 10 spaces:
=space(10) & Fields!FieldName.Value
This should work for any renderer. I'll update this if I come across other tricks.
Have a look at Aspose.Pdf for .NET: http://www.aspose.com/categories/.net-components/aspose.pdf-for-.net/default.aspx
Could you recommend a free/open source library that could be integrated into an ASP.NET application to generate PDF out of HTML fragments? I will be generating invoices that are displayed in DataGrids and tables. Is there a readily available library that would print the whole table with the DataGrid into PDF? iTextSharp seems nice, but I would have to do the tough work of adding tables and so on when everything is already in the web page.
This is a possible duplicate, but it generates the PDF from the full page, which is not what I want:
Possible duplicate question
iTextSharp does almost what you ask and is open source; however, the API for the conversion process has not been touched for years and is outdated. I would therefore recommend a commercial product.
Something like Winnovative HTML to PDF Converter.
To be honest, I look at it like this: you can save money by buying a licence for a commercial product rather than spending days developing a solution yourself.
Edit: If it is for generating invoices alone, then I would use iTextSharp, as it does not take long to learn the basics. However, if you want to be able to convert a full, rich web page into a PDF, then go down the commercial route.
These links may help:
Convert HTML to PDF in .NET
Generate PDF from ASP.NET from raw HTML/CSS content?
Creating PDF Invoices - Are there any templating solutions?
When doing invoices, I usually go a slightly different way, by starting with Aspose.Words and using the nested mail merge feature.
Another option could be the HTML to PDF feature of Aspose.Pdf.
Both libraries are commercial only; I don't know whether this is appropriate for your project.
So, I have used Pdf995's PDF print driver from a web browser to print web pages and eventually use PdfEdit995 to join these various PDF files into one large PDF.
Now I have a lot of large PDF documents that I wish to add bookmarks to, but am hoping there is a relatively easy way of doing this programmatically (using C#, preferably) - basically, I want to find, within each PDF, text that is large enough to qualify as a header, and use that text as the bookmark.
Any tips/advice/direction? Thanks!
It's definitely possible to do this, but I would recommend finding a PDF library that does most of the leg work. Technically you could do it all yourself with the aid of the PDF specification, but that'd probably take more time than it's worth.
The library will need to be able to let you find text in a document and then return the page and size, font, etc, of the text and create bookmarks (also known as outlines) based on that information programmatically.
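Once a library has handed back the text with its page and font size, the selection step itself is small. The sketch below covers only that step, using plain C# and LINQ; TextRun, Bookmark and HeaderFinder are hypothetical types standing in for whatever the chosen PDF library returns, and the font size threshold is something you would tune per document.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of a piece of text as reported by a PDF library.
public sealed class TextRun
{
    public int Page;
    public double FontSize;
    public string Text;
}

// Hypothetical bookmark (outline entry) to be created by the PDF library.
public sealed class Bookmark
{
    public int Page;
    public string Title;
}

public static class HeaderFinder
{
    // Treats every non-empty run at or above minFontSize as a header.
    public static List<Bookmark> FindHeaders(IEnumerable<TextRun> runs, double minFontSize)
    {
        return runs
            .Where(r => r.FontSize >= minFontSize && !string.IsNullOrWhiteSpace(r.Text))
            .OrderBy(r => r.Page) // stable sort keeps the reading order within a page
            .Select(r => new Bookmark { Page = r.Page, Title = r.Text.Trim() })
            .ToList();
    }
}
```

Each resulting Bookmark would then be passed to the library's own outline API, which differs from product to product.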
My company's product, Quick PDF Library, can help you do this, and so can PDFKit.NET. I'm sure there are other libraries out there that support this functionality too. As far as free libraries go, from what I've seen I don't believe that PDFSharp or iText will meet all of your requirements in this case, but I'm sure someone will correct me if I'm wrong.
If you'd prefer to develop a solution for this entirely yourself, then the PDF reference is available online for free.