A File converted by Word Automation Services loses its table header - c#

I am generating a Word document using the OpenXml SDK 2.0 and everything in that respect is fine. The document has a lot of tables with multi-row table headers and everything looks exactly the way it should.
I am passing this document through the word automation service in Sharepoint 2010 Enterprise and the service returns a converted file. Sometimes the file format is the same as the input format (Docx->Docx) as I use the service to refresh the table of contents, but most conversions are to PDF.
My problem is that the document returned does not contain the same headers as the source document. If I look at the OpenXml of the document, the rows do not have the TableHeader property but they do in the source.
Has anyone experienced this before? What can I do to fix this as I can find very little about WAS and how it works. We have invested a fair bit of time into developing this and do not want to have to resort to a third party component.

This is actually a defect in Word Automation Services and how it handles multi-row headers. I got around the issue by explicitly setting the colours in the header rows so that they look like headers, but are not.
This means that if the table spans across multiple pages then the header won't be repeated, but it's the lesser of two evils.

Related

Simple Template Based PDF Reports in .Net

My customer gave me some Word and Powerpoint documents which specify how certain 'reports' generated by our product are supposed to look like.
That means, I need to modify those documents (replace placeholders etc.) and then I need to export them as PDF.
How would you solve this problem in C# ?
TL;DR: Editing the office document is no problem at all, but exporting that document to PDF (using Interop) allegedly causes issues when running it as a web server application. That's the whole problem here.
I agree that Interop is not suitable for document manipulation in server environment. I would approach this problem by preparing MS Word template documents with placeholders for data. Then I would use c# to load the data for the reports and merge the data with templates to get final documents (docx, pdf, xps or various image formats). There are 3rd party toolkits which make it quite easy. Here is the code used by one such toolkit needed for merging xml data with the template to get a pdf document:
XElement customers = XElement.Load("Customers.xml");
DocumentGenerator dg = new DocumentGenerator(customers);
DocumentGenerationResult result = dg.GenerateDocument("MyTemplate.docx", "MyReport.pdf");
You can of course also use free libraries and SDKs based on OpenXML but you should expect a steep learning curve, lots of debugging and lots of time invested.
Wkthmltopdf might be an option.
A completely different "report approach" could be, to save those office documents with the placeholders as mht (That's MHTML a web archive format). This could be done directly in MS Office or even programatically.
The placeholders could be easily exchanged by string search and replace. The mht files could directly be used to show the report instead of the PDF. A clear disadvantage of the mht format, is the HTML formatting. With PDF you have a clear and fix positioning.
We are using this kind of report creation. There are some flaws, but it works and the customer could edit the mht templates directly by right-click Open-With the prefered MS Office flavor.
You can use report generators, like FastReport.Net for solving your problems. It can assign different data for placeholders and also allow export to PDF.

Best way to generate PDF in c# using Word or InDesign?

I'm comfortable generating Word documents using Aspose.Word (which can also save as a PDF) but I've recently been asked to do the same thing using a PDF as the starter template. We recently bought Aspose.Total and whilst Aspose.Pdf looks like it can do some manipulations it doesn't look to be all that flexible/easy (like adding a big line of text and getting it to wrap, and shifting other content down the page if it takes up more space).
What would be the best way of using a PDF as a template for what is basically a bit like a mail merge from a database? Should I turn it into a PDF form and merge it from an XML data source? Is this even viable or would such a form still have a limitation on spacing (so that longer lines/paragraphs of text won't reflow the document where necessary)?
From what I can tell it doesn't look like InDesign can be manipulated in c# even via a COM object (which would be nasty on a web server anyway).
If I recreated the InDesign/PDF as a Word document I'm sure I could work wonders, but you know what these publishing types are like, who think Word documents are the tool of the devil. These PDFs are never going to a professional printer anyway; they're just brochures for a client to download from a web page (based on information in a database) for printing/use at home.
You have indeed many solutions for such a web to print project. Choosing one is a matter of budget, requirements and users count. Placing dynamic contents can be done at the simpliest with PDF forms fillable with xml data.
On the other hand you can work with InDesign Server and output PDF based on InDesign templates. That's generally a good choice when a large amount of users needs to get rich pdf files in parallel. But the costs are heavy.
You can also envision A pitstop server or Callas PDFToolBox Server to place dynamic texts based on variables as supplied by you. The good point here is that you don't need much coding here. Those apps are ready to use.
You can at last consider command line tools. A few of them may have some useful commands such as pdfTk or cPdf to merge texts.

Convert PDF document to Word document by programmatically without any third party tool (SSRS 2005)

I am using SQL Server Reporting Service 2005(SSRS 2005) to export report to Excel and PDF and VS2008. But now i want an option to Export to Word also, but it is not possible in SSRS 2005 report that i came to know after googling. Here problem is that I CAN'T USE SSRS 2008 REPORT. So i thought that i will follow the steps as....
-- Export to Word
1. Export to PDF
2. Convert that PDF to Word document
Even after so much of googling i didn't got the proper answer. I told once and even telling that i can't use any third party tools so don't give me wrong path.
There are many fundamental differences between PDF and Word making the approach you want highly undesirable as a general workflow. I'll give just one example: PDF typically does not store information about document structure - sentences, paragraphs, columns, tables... All it stores is the actual text at certain locations at a page. Word of course does have those concepts.
Is it possible to do what you want? Yes, to some extent. In the general case with guesswork and approximation. If you know which information you want to convert it might be possible to search for it in the PDF file generated by SSRS and then generate a Word file out of it. However, if SSRS allows export to text, XML, RTF or any other structure based file format (however slightly structure based), you'd have a much easier time.
If you insist on doing what you suggest here, you would have to:
1) Write code to take the PDF exported from SSRS and interpret it (find the textual content you want)
2) Recreate the necessary structural information from that information (what are paragraphs, where and what are the tables, what's the formatting etc...)
3) Write that into a file Word can read (or create a new Word document directly using automation).
This would be a considerable amount of work, but you have all of the necessary information as the PDF specification is freely downloadable from the Adobe web site and it contains all of the information you need.

concatenating word documents and converting them to pdf

what is the best possible way to merge multiple documents and convert them to pdf. also we need to insert blank pages for every odd pages.
A fully supported, server side automated version of this (mostly baked into the the MS camp though) involves using the OpenXMLSDK to do any field inserts, then using Sharepoint's Word Automation Services (SP 2010) to convert the documents to PDF, and then pick your favorite PDF toolkit (iTextSharp for me) for any post processing (merging documents, inserting blank pages, or images that must be positioned relative to specific pages).
The reason for doing the document merge in PDF rather than OpenXML is simplicity - you don't have to deal with merging styles, headers etc.
The reason for doing the blank pages and image insertion is that OpenXML has no idea how to render the content, and so it has no idea where page breaks would occur naturally (you can still insert breaks like you would in Word though).
If you are using C# and you are OK with a server based solution then have a look at this post. It uses a .net friendly web services interface.
There is an optional SharePoint version available as well, but as you did not include a SharePoint tag I assume that won't be of interest to you.
Full disclosure, I wrote that post.

C# - Templated Printing from Object(s)

I'm in need of a solution to print or export (pdf/doc) from C#. I want to be able to design a template with place holders, bind an object (or xml) to this template, and get out a finished document.
I'm not really sure if this is a reporting solution or not.
I also don't want to have to roll my own printing / graphics code -- I'd like all display concerns handled in a template.
I initially think of this as something Crystal Reports can do (although I've never used CR), but I'm not sure if I'm abusing the system here -- I'm not really interested in binding ADO.NET datasets at the moment (screw datasets). Can Crystal deal with binding to objects?
Does SSRS or WPF play in this field too?
A subset of WPF-P is XPS which can be used to present your objects via databinding.
One of the best choices if you are already using WPF.
Google Keywords: XPS, FixedDocument, FlowDocument, WPF Printing
Might read through this thread:
http://groups.google.com/group/nhusers/browse_thread/thread/e2c2b8f834ae7ea8
Seems a lot of people like iTextSharp
http://itextsharp.sourceforge.net/
For Word docs, look into Word's Mail Merge feature and Word automation. I did this recently in a form letter printing project. Basically what I did was create a Word template file (file extension .dot) and in this template file I defined MergeFields in a standard form letter. My application queries a database for the records it needs to print and then for each record it returns it matches fields in the database with these merge fields and sends the result (the merged doc) to the printer.
It's working really well and if I had a link that gave a definitive explanation, I'd provide it (check back here, I'll see if I can't find the most useful ones). Hopefully I've provided enough keywords to let you find your own resources. I can go into more detail if you need.
I've never had to export PDF files but for a project I'm working on now I'll have to. For a free solution my research has lead to iTextSharp (like Will Shaver points out) but I've only done the initial investigations and I have found a few pay solutions I might end up resorting to.

Categories