what is the best possible way to merge multiple documents and convert them to pdf. also we need to insert blank pages for every odd pages.
A fully supported, server side automated version of this (mostly baked into the the MS camp though) involves using the OpenXMLSDK to do any field inserts, then using Sharepoint's Word Automation Services (SP 2010) to convert the documents to PDF, and then pick your favorite PDF toolkit (iTextSharp for me) for any post processing (merging documents, inserting blank pages, or images that must be positioned relative to specific pages).
The reason for doing the document merge in PDF rather than OpenXML is simplicity - you don't have to deal with merging styles, headers etc.
The reason for doing the blank pages and image insertion is that OpenXML has no idea how to render the content, and so it has no idea where page breaks would occur naturally (you can still insert breaks like you would in Word though).
If you are using C# and you are OK with a server based solution then have a look at this post. It uses a .net friendly web services interface.
There is an optional SharePoint version available as well, but as you did not include a SharePoint tag I assume that won't be of interest to you.
Full disclosure, I wrote that post.
Related
My customer gave me some Word and Powerpoint documents which specify how certain 'reports' generated by our product are supposed to look like.
That means, I need to modify those documents (replace placeholders etc.) and then I need to export them as PDF.
How would you solve this problem in C# ?
TL;DR: Editing the office document is no problem at all, but exporting that document to PDF (using Interop) allegedly causes issues when running it as a web server application. That's the whole problem here.
I agree that Interop is not suitable for document manipulation in server environment. I would approach this problem by preparing MS Word template documents with placeholders for data. Then I would use c# to load the data for the reports and merge the data with templates to get final documents (docx, pdf, xps or various image formats). There are 3rd party toolkits which make it quite easy. Here is the code used by one such toolkit needed for merging xml data with the template to get a pdf document:
XElement customers = XElement.Load("Customers.xml");
DocumentGenerator dg = new DocumentGenerator(customers);
DocumentGenerationResult result = dg.GenerateDocument("MyTemplate.docx", "MyReport.pdf");
You can of course also use free libraries and SDKs based on OpenXML but you should expect a steep learning curve, lots of debugging and lots of time invested.
Wkthmltopdf might be an option.
A completely different "report approach" could be, to save those office documents with the placeholders as mht (That's MHTML a web archive format). This could be done directly in MS Office or even programatically.
The placeholders could be easily exchanged by string search and replace. The mht files could directly be used to show the report instead of the PDF. A clear disadvantage of the mht format, is the HTML formatting. With PDF you have a clear and fix positioning.
We are using this kind of report creation. There are some flaws, but it works and the customer could edit the mht templates directly by right-click Open-With the prefered MS Office flavor.
You can use report generators, like FastReport.Net for solving your problems. It can assign different data for placeholders and also allow export to PDF.
I'm comfortable generating Word documents using Aspose.Word (which can also save as a PDF) but I've recently been asked to do the same thing using a PDF as the starter template. We recently bought Aspose.Total and whilst Aspose.Pdf looks like it can do some manipulations it doesn't look to be all that flexible/easy (like adding a big line of text and getting it to wrap, and shifting other content down the page if it takes up more space).
What would be the best way of using a PDF as a template for what is basically a bit like a mail merge from a database? Should I turn it into a PDF form and merge it from an XML data source? Is this even viable or would such a form still have a limitation on spacing (so that longer lines/paragraphs of text won't reflow the document where necessary)?
From what I can tell it doesn't look like InDesign can be manipulated in c# even via a COM object (which would be nasty on a web server anyway).
If I recreated the InDesign/PDF as a Word document I'm sure I could work wonders, but you know what these publishing types are like, who think Word documents are the tool of the devil. These PDFs are never going to a professional printer anyway; they're just brochures for a client to download from a web page (based on information in a database) for printing/use at home.
You have indeed many solutions for such a web to print project. Choosing one is a matter of budget, requirements and users count. Placing dynamic contents can be done at the simpliest with PDF forms fillable with xml data.
On the other hand you can work with InDesign Server and output PDF based on InDesign templates. That's generally a good choice when a large amount of users needs to get rich pdf files in parallel. But the costs are heavy.
You can also envision A pitstop server or Callas PDFToolBox Server to place dynamic texts based on variables as supplied by you. The good point here is that you don't need much coding here. Those apps are ready to use.
You can at last consider command line tools. A few of them may have some useful commands such as pdfTk or cPdf to merge texts.
We are using Report Definition laguage (RDL) templates to define various reports in one of our Sharepoint applications. These reports are (then) saved as PDFs into various SharePoint Document Library's. One report in-particular renders, but is considered to be "failing" due to the styling needs of the report. So it appears RDL only understand "very simple" HTML.
For Example:
Trademark characters are not rendering as superscript (they render as normal text instead)
The ability to assign Line Height fails
The ability to assign Word Spacing fails (so printers "leading" requirements fail)
Both of these point to various marked Microsoft limitation for RDL's to interprint various HTML...of which we are now aware.
So...
I need a better tool...and we are scratching our heads on this one!
QUESTION:
What tools take-in HTML, understand CSS (well!) and can generate PDFs from C-Sharp objects?
Please keep in-mind I need the to PDF generator tools you recommend (below) to understand CSS and HTML.
NOTE:
I looked at the various other StackEchange sites to see if there is a better forum for this particular question, but this one was the only one that seemed to fit-the-bill. If you are a mediator, and feel this question is mis-placed, please feel free to move this question.
This HTML to PDF converter has the most accurate conversion of a complex html/css page. There is also a demo to try the conversion with your html
Maybe you can give Amyuni WebkitPDF a try. It is a Free component for converting HTML+CSS into PDF files. From the home page:
Directly convert HTML files into PDF without the use of a web browser or a printer driver
Convert HTML files into XAML/XPS for rendering within Silverlight
Integrate and deploy the HTML conversion feature within your applications
Generate either a single continuous PDF page or split the HTML into multiple PDF pages
Amyuni WebkitPDF is distributed as a library with a sample application, and sample code for C++ and C#.
Disclaimer: I currently work as software developer at Amyuni Technologies.
I only know a workaround for the "leading space" issue. This example "leads" the value with 10 spaces:
=space(10) & Fields!FieldName.Value
This should work for any renderer, I'll update this if I come around other tricks.
Have a look at Aspose.Pdf for .NET: http://www.aspose.com/categories/.net-components/aspose.pdf-for-.net/default.aspx
I need to generate a high quality report based on information in a SQL Server database, and I want very explicit control of the layout and appearance from inside C#.
I have several choices that I know of that are already being used for various other reports at our company:
1) SQL Server's built in Reporting Services
2) Adobe Forms
3) Crystal Reports
This information I need as PDF directly parallels what is already being displayed in the user's web browser as HTML, so creating a print stylesheet and converting the browser body to PDF is an option as well.
So this creates option 4:
4) JavaScript convert HTML to PDF (my preference at this time)
Does anybody have a recommendation as to which approach I should take, or even better an alternative? All the choices seem pretty horrible.
I've used iTextSharp with very good results. It is an open-source .NET port of a java library. It works really well for creating PDFs from scratch. Remember that editing PDFs will always be hacky with any library, because PDF is an output format, not a read-write format.
Provided your HTML is fairly clean (remove javascript postbacks, anchors, ...),the iText HtmlWorker can convert HTML to PDF, if you prefer that route.
HTML to PDF in using iTextSharp:
Document doc = new Document(PageSize.A4);
HTMLWorker parser = new HTMLWorker(doc);
PdfWriter.GetInstance(doc, Response.OutputStream);
Also here.
Use SSRS, it has a built in PDF rendering mode.
I have used two other PDF report libraries with great success; Active Reports and Telerik Reporting. Personally I prefer the latter when it comes to programmatic control of layout and such.
Take a look also at the DevExpress Reporting (non-free 3rd party tool):
Overview
Online Demos
Documentation
Yes, you should use the best tools to get the best solution. The best tool in this case probably is SSRS.
But that's just looking at the capabilities of the tool.
Don't forget to look at your own capabilities!
My story: I know SQL, I know C#. (Both intermediate, I'm not a guru.)
Then I lay my hands on SSRS. And burnt them, once, twice, etc.
At the end, there was a nice result. So burning your fingers is not a wrong thing to do.
But first try to pull your html through an html to pdf converter (demo version) and see if the result it serves your needs.
Currently I'm using both:
SSRS for creating invoices, because amounts have to be transported from one page to the next
Winnovative to generate documents that only need page numbers
I would suggest using .Net ReportViewer control in local mode (no report server required). It works in both webforms and winforms. You create a client-side report (.rdlc) file (which contains all the visuals as well as placement of data fields), link it up to the ReportViewer, and supply the data (DataTable or collection of objects, as long as the fields match, it doesn't matter). In client mode it supports exporting to pdf and excel (and Word too? don't remember). By default these done by a dropdown in the control itself however you can programmatically export to any of the supported formats as well. You'll end up with a byte array you can shove into a file stream.
Basically you get most of the good parts of SSRS without all of that backend complexity. There should be a ReportViewer folder in %programFiles%\Microsoft Visual Studio 10.0\ReportViewer - but versions exist for 2005 and 2008 as well. Check out http://gotreportviewer.com/
I think the 4th option is the best. In this case you don't need to change either layout of the HTML page or a layout of PDF, if one of them has been changed.
It is also more convenient making a nice design via HTML than programmatically via C# :)
Take a look at WebToPDF.NET which is a .NET component written in C# that converts HTML to PDF. The converter supports HTML 4.01, XHTML 1.0, XHTML 1.1 and CSS 2.1 including page breaks, forms and links. It passes all W3C tests (except BIDI).
You can use Fast Report it's good tool and i has a free version
Well, heres my scenario.
Client/Server winforms application with SQL Express as the DB. I need to be able to print invoice, packing slips etc..
i would like the customer to be able to modify the invoices. ie. be able to put their logo or change font sizes etc...basically format the display.
Things i have considered so far are.
1) Use a template engine (similar to codesmith or mygeneration) and use templates that output HTML. Then print the html page.
2) Use ReportViewer in local mode. I've heard that users can download a plugin for web dev express and edit the local report files. can anyone confirm this?
3) Use Reportviewer in remote mode.
I don't have much experience with ReportViewer so I'm not sure if i should use local or remote mode as well.
Those of you that have done this kind of thing before whats your recommendation?
After just completing a project with it, I would heartily recommend iTextSharp to create your invoices and other forms as PDFs. In addition to creating PDFs from scratch, you can also use it to fill in PDF forms and/or templates created with Acrobat (or even MS Office/OpenOffice). And it's free.
It's pretty easy to use in Windows apps or in ASP.Net applications. Most of the documentation and the books on it (iText in Action, for example) are about the original Java version, iText. However, there are tutorials and example code on the conversion process and, for the most part, all of the functions and libraries work the same in the .Net version, so adapting the book and reference code has been no problem.
I definitely learned the hard way that HTML and CSS are great for browsers (well, great except for the "every browser interprets it different" problem), but horrible for trying to generate consistent, attractive, and precise printed output and forms.
I'm personally using Aspose Words: they use word documents as templates, and I'm using Words bookmarks function to mark and retrieve the fields I need to fill.
Aspose works nicely with Tables (ie: you can add lines to a table, etc...) and sees Word documents as XML documents. You can then save the document as MSWord or PDF.
I wouldn't say it's the greatest library in the world, but it's definitely worth having a look :)
you can use Crystal Report for this. But first you need to scan the INVOICE and save it as an image,
Next is, on your crystal report, export the image on to it, and DRAG the fields to where they must print on the invoice (IMAGE SERVES AS YOUR GUIDE). Then after everything has been set-up, DELETE THE IMAGE and try it.
hope this helps.