Mail merge or merge-like functionality from C#

I need to print a few thousand stickers with a few text fields (name, position, etc) as well as a barcode image.
Each staff member gets two unique stickers, and the sticker paper has 4 per sheet so that's 2 staff per sheet.
I already have all the code to generate the barcode as an Image, and the staff details are stored in a List of objects.
If possible, I'd like to avoid using MS Word directly, since my development environment is quite different from the target environment (Win7-64 with MS Office 2010 vs. WinXP-32 with MS Office 2003) and that disparity has caused me problems in the past.
What's the best way to accomplish this?
If I save the document in an XML format and replace the mail merge fields with unique tokens, I can substitute my actual values for the tokens (and even replace the binary image data with base-64 encoded image bytes). That works, but it's clunky. For starters, I'd have to save the XML file and then somehow print it transparently to the user (I don't want Word showing up). Also, the XML template is one page, but I might have several dozen pages to print. I could send each page to the printer individually, but that's not exactly ideal.
Any other suggestions?
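The layout arithmetic in the question (two stickers per staff member, four stickers per sheet) can be worked out independently of whichever printing or reporting engine ends up rendering the pages. The following is only a minimal sketch: `Staff`, `StickerLayout` and `BuildSheets` are hypothetical names, and the barcode image is left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the staff objects mentioned in the question.
public class Staff
{
    public string Name { get; set; }
    public string Position { get; set; }
}

public static class StickerLayout
{
    // Expands each staff member into `copiesPerStaff` stickers and groups the
    // stickers into sheets holding at most `stickersPerSheet` entries each.
    public static List<List<Staff>> BuildSheets(
        IEnumerable<Staff> staff, int copiesPerStaff = 2, int stickersPerSheet = 4)
    {
        if (copiesPerStaff < 1) throw new ArgumentOutOfRangeException("copiesPerStaff");
        if (stickersPerSheet < 1) throw new ArgumentOutOfRangeException("stickersPerSheet");

        var sheets = new List<List<Staff>>();
        var current = new List<Staff>();
        foreach (var person in staff)
        {
            for (int i = 0; i < copiesPerStaff; i++)
            {
                current.Add(person);
                if (current.Count == stickersPerSheet)
                {
                    sheets.Add(current);
                    current = new List<Staff>();
                }
            }
        }
        if (current.Count > 0) sheets.Add(current); // last, partially filled sheet
        return sheets;
    }
}
```

Each inner list corresponds to one printed page. With System.Drawing.Printing.PrintDocument, for example, a PrintPage handler could draw one sheet per call and set HasMorePages to true until the list is exhausted, so the whole batch goes to the printer as a single job rather than page by page.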

I would use DevExpress XtraReports, as I have used it in similar scenarios in the past with great results. If you prefer another engine such as Crystal or Telerik, the approach is the same: drag some fields into the page's detail section and assign your object list as the data source. DevExpress also has a rich text control with a built-in mail merge feature. Finally, if you do decide on Word, remember that you can automate it while keeping it invisible, so users won't see it.

Related

.NET program scan renderable text in Chart in .PDF - not for words but for values - Text Location features?

Hello, I have a chart that I need the system to review and report results for.
Example chart: http://imageshack.us/photo/my-images/651/scorecardchartexample.gif/
--Assume the chart is in .PDF and the text is renderable, i.e. "highlight-able".
--Assume the chart is placed on the page in exactly the same way and the same position every time.
--Assume the chart can change. That is to say, I need to be able to upload 1000 of these charts, all following the exact same format but with some information varying from chart to chart.
--Assume VAST expertise in .NET and little expertise in actual text interpretation.
--Assume expertise in interpreting PDFs that have editable fields. I am already doing this, but it is limited to PDFs that I created myself and in which I was able to place a value in each field.
--Assume this chart is only deliverable as a single text-renderable .PDF. That is to say, we interact with a website that creates this chart; the website has no API, so printing the chart from the webpage to PDF is all we can do (it is a government website).
Using a .NET system, I need to create a program, or incorporate an existing application into my .NET system, that will review this chart and tell what each "X" represents. That is to say, an "X" one inch to the left or in the next row indicates a different result (refer to the chart).
I need the program to perform its search and return results when triggered by the .PDF document arriving in a folder, or something similar. We can handle this part if we are creating the program from scratch; otherwise we will be limited to interacting with an existing app as needed.
We are open to a variety of strategies. Assuming such a class or object exists, we were thinking of reading text based on its location in the document, as X,Y coordinates. Another desirable route would be some sort of stringBuffer (assume C#), but it would need to navigate the chart gridlines and count white space to accurately interpret the position of each "X" and what it means based on its placement. A third option is something we are unaware of.
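The X,Y idea can be sketched without committing to a particular PDF library. Assuming some extraction library reports page coordinates for each "X", and that the chart's column and row boundaries have been measured once (the chart is always placed identically), mapping a mark to a grid cell is a simple lookup. `ChartGrid`, its methods and any boundary values are hypothetical.

```csharp
using System;

public static class ChartGrid
{
    // Returns the index i such that edges[i] <= value < edges[i + 1],
    // or -1 when the value falls outside the grid. `edges` must be ascending.
    public static int FindBand(double[] edges, double value)
    {
        for (int i = 0; i + 1 < edges.Length; i++)
        {
            if (value >= edges[i] && value < edges[i + 1]) return i;
        }
        return -1;
    }

    // Maps a mark at (x, y) to its (row, column) cell,
    // or returns null when the mark lies outside the grid.
    public static Tuple<int, int> LocateCell(
        double[] columnEdges, double[] rowEdges, double x, double y)
    {
        int column = FindBand(columnEdges, x);
        int row = FindBand(rowEdges, y);
        if (column < 0 || row < 0) return null;
        return Tuple.Create(row, column);
    }
}
```

PDF page coordinates usually start at the bottom-left corner, so with ascending row edges row 0 may be the bottom row of the chart; the meaning attached to each (row, column) pair would come from a lookup table built for each chart template.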
If something exists that is tried and true, that would of course be best. In that case, any tips on interfacing with it from .NET and C# would be welcome.
Thank you all very much in advance Code Gawds!
Reel

OK, we found some software called ClearImage. It wasn't cheap, but it is pretty neat. It analyzes any image in the same fashion that Adobe PDF analyzes a document to find form fields. After ClearImage does that, it gives you a list of "blobs"; you then dictate what each blob means and give it a unique identifier. This allows values to be assigned automatically based on "blob" placement in the image.
It also lets you "fingerprint" an image, so that if the same image shows up again it can be recognized. In my case we have 3 different templates for the chart. Each chart will differ because of what is charted on it, but every chart built from a given template has the same layout. This lets our system first identify which template has been entered and then, after that first check, move on to analyzing each blob.
Anyway, it is worth a look if anyone else comes across this question and needs this type of function. I didn't want to leave it unanswered, and I may update this as we learn more. I know this isn't exactly a coding question, but this type of task is coding intensive, and anyone looking to perform the same task may find their way here. I will endeavor to update in the spirit of stackoverflow with comments relating to integration, objects, and so on.
Should anyone have more questions about this software in relation to coding, you can ask here or post a new question. We will be happy to post the code (methods, classes, objects, etc.) we used (in C#) to integrate it into our programs.

Use iTextsharp to edit pdf template without Acrofields

I have a PDF template without AcroFields and I need to replace text in it. The text is formatted like this: ((aFieldToReplace)). There are also tables that need to be filled with an arbitrary number of rows.
Is there any good tutorial, resource or sample to be found?
"Is there a way to replace a text in a PDF file with itextsharp?" asks more or less the same question, but the answer ignores the "no AcroFields" part of the question.
EDIT:
To make it even harder, I have multiple templates that I can use. The templates all have their own formatting style (font, color, ...).
EDIT 2:
The purpose is to create a report from data in a database. The data in the database comes from several forms in an ASP.NET MVC application.
The report could have several layouts depending on the chosen template.
It should be possible to add templates dynamically, so I can't create the layout from scratch. I really need to get the layout from a template.

Quoting the excellent iText in Action:
In a PDF document, every character or glyph on a PDF page has its fixed position, regardless of the application that’s used to view the document.
[…]
Suppose you want to replace the word “edit” with the word “manipulate” in a sentence, you’d have to reflow the text. You’d have to reposition all the characters that follow that word. Maybe you’d even have to move a portion of the text to the next page. That’s not trivial, if not impossible.
[…]
Don’t expect any tool to be able to edit a PDF file the same way you’d edit a Word document.
PDF is a document display format. If you want templating you'll probably have to use something else.
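If the template can be kept in a text-based source format and only converted to PDF as the last step, the ((aFieldToReplace)) convention from the question reduces to string substitution. The following is a minimal sketch under that assumption; `TemplateFiller` and the field names are hypothetical, and the conversion to PDF is not shown.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TemplateFiller
{
    // Matches tokens such as ((customerName)); the field name is captured in group 1.
    private static readonly Regex Token = new Regex(@"\(\((\w+)\)\)");

    // Replaces every known token with its value. Unknown tokens are left as-is
    // so that missing data stays visible in the output.
    public static string Fill(string template, IDictionary<string, string> values)
    {
        return Token.Replace(template, match =>
        {
            string value;
            return values.TryGetValue(match.Groups[1].Value, out value)
                ? value
                : match.Value;
        });
    }
}
```

Because the substitution happens before layout, the text reflows naturally in the source format, which is exactly what cannot be done once the content has been fixed in place on a PDF page.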

@Frederiek:
If you can spend a bit of money, this will do exactly what you want. Check out the demo, it's quite cool. It can reflow the text, replace images, etc. Quite nice.
http://www.iceni.com/infixServer.htm
Let me know if that works for you.

Extract data from nested tables in PDF

I have a few PDF files that were created from Word or Excel files.
I need to get the information that's in the tables.
The text in the document is not an image, so I'm able to extract the text using tools such as PDFBox.
Once I have the text, I have no way of knowing which table cell it belongs to, because I don't know where the table borders are.
I've tried a few desktop tools, such as ABBYY and Solid PDF Converter, and they are able to convert the files into nice Word documents, but this doesn't suit my needs because I want to be able to do this programmatically in C#.
Some of the tables have nested tables, which I think makes this a little more difficult.
I appreciate your help.

The difficulty here is caused by the fact that the text in the PDF is not contained within any table. It might look like it is, but underneath the surface, it is not.
So there are a couple of options that I can think of. But none of them are going to be quite as satisfying as you'd probably like.
There are some companies that offer SDKs for PDF to Excel/Word conversion. Investintech and Iceni are a couple of examples. But these solutions are not free.
If you know the exact layout of the PDF files that you need to extract the table data from, then you can use any SDK that lets you extract text from a PDF and also tells you the exact co-ordinates of the extracted text. Using this method you need to know in advance where the text is going to be, so that you can extract text from a specific area on the page. It obviously won't work if you need to process any random document.
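When the extraction library does report coordinates, one hedged way to approximate table rows is to group fragments whose vertical positions are close together and then order each group from left to right. `TextFragment`, `TableRows` and the tolerance value are hypothetical; real PDFs, and nested tables in particular, will need more care than this.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of what a PDF text extractor might return per fragment.
public class TextFragment
{
    public string Text { get; set; }
    public double X { get; set; }
    public double Y { get; set; }
}

public static class TableRows
{
    // Groups fragments into rows: a fragment joins the current row when its Y
    // differs from that row's first fragment by at most `tolerance`.
    // Rows come back in descending Y order (top to bottom in PDF page
    // coordinates), and the cells in each row are ordered left to right.
    public static List<List<string>> Group(
        IEnumerable<TextFragment> fragments, double tolerance = 2.0)
    {
        var rows = new List<List<TextFragment>>();
        foreach (var fragment in fragments.OrderByDescending(f => f.Y))
        {
            var last = rows.LastOrDefault();
            if (last != null && Math.Abs(last[0].Y - fragment.Y) <= tolerance)
                last.Add(fragment);
            else
                rows.Add(new List<TextFragment> { fragment });
        }
        return rows
            .Select(r => r.OrderBy(f => f.X).Select(f => f.Text).ToList())
            .ToList();
    }
}
```

Restricting the input to fragments inside a known rectangle first, as described above, keeps headers and surrounding paragraphs out of the reconstructed rows.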
It's a difficult task, but hopefully this will give you a starting point.

Is there a way to replace a text in a PDF file with itextsharp?

I'm using itextsharp to generate the PDFs, but I need to change some text dynamically.
I know that it's possible to change the text if there are AcroFields, but my PDF doesn't have any. It just contains plain text, and I need to change some of it.
Does anyone know how to do it?

Actually, I have a blog post on how to do it! But like IanGilham said, it depends on whether you have control over the original PDF. The basic idea is that you set up a form on the page and replace the form fields with the text you want. (You can style the form so it doesn't look like a form.)
If you don't have control over the PDF, let me know how to do it!
Here is a link to the full post:
Using a template to programmatically create PDFs with C# and iTextSharp

I haven't used itextsharp, but I have been using PDFNet SDK to explore the content of a large pile of PDFs for localisation over the last few weeks.
I would say that what you require is absolutely achievable, but how difficult it is will depend entirely on how much control you have over the quality of the files. In my case, the files can be constructed from any combination of images, text in any random order, tables, forms, paths, single pixel graphics and scanned pages, some of which are composed from hundreds of smaller images. Let's just say we're having fun with it.
In the PDFTron way of doing things, you would have to implement a viewer (sample available), and add some code over a text selection. Given the complexities of the format, it may be necessary to implement a simple editor in a secondary dialog with the ability to expand the selection to the next line (or whatever other fundamental object is used to make up text). The string could then be edited and applied by copying the entire page of the document into a new page, replacing the selected elements with your new string. You would probably have to do some mathematics to get this to work well though, as just about everything in PDF is located on the page by means of an affine transform.
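For reference, the affine transform mentioned above is written in PDF as six numbers [a b c d e f], and it maps a point (x, y) to (a*x + c*y + e, b*x + d*y + f). A small sketch of that arithmetic, independent of PDFNet or iTextSharp; the `PdfMatrix` type is hypothetical and not part of either library.

```csharp
public struct PdfMatrix
{
    public readonly double A, B, C, D, E, F;

    public PdfMatrix(double a, double b, double c, double d, double e, double f)
    {
        A = a; B = b; C = c; D = d; E = e; F = f;
    }

    // Applies the transform to a point: x' = a*x + c*y + e, y' = b*x + d*y + f.
    public void Transform(double x, double y, out double tx, out double ty)
    {
        tx = A * x + C * y + E;
        ty = B * x + D * y + F;
    }

    // Returns the single matrix equivalent to applying `first` and then `second`.
    public static PdfMatrix Concat(PdfMatrix first, PdfMatrix second)
    {
        return new PdfMatrix(
            first.A * second.A + first.B * second.C,
            first.A * second.B + first.B * second.D,
            first.C * second.A + first.D * second.C,
            first.C * second.B + first.D * second.D,
            first.E * second.A + first.F * second.C + second.E,
            first.E * second.B + first.F * second.D + second.F);
    }
}
```

Text position on a page is typically the result of several such matrices concatenated together, which is why placing a replacement string so that it lines up with the original takes some care.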
Good luck. I'm sure there are people on here with some experience of itextsharp and PDF in general.

This question comes up from time to time on the mailing list. The same answer is given time and time again - NO. See this thread for the official answer from the person who created iText.
This question should be a FAQ on the itextsharp tag wiki.

C# - Templated Printing from Object(s)

I'm in need of a solution to print or export (pdf/doc) from C#. I want to be able to design a template with placeholders, bind an object (or xml) to this template, and get a finished document out.
I'm not really sure if this is a reporting solution or not.
I also don't want to have to roll my own printing / graphics code -- I'd like all display concerns handled in a template.
I initially think of this as something Crystal Reports can do (although I've never used CR), but I'm not sure if I'm abusing the system here -- I'm not really interested in binding ADO.NET datasets at the moment (screw datasets). Can Crystal deal with binding to objects?
Does SSRS or WPF play in this field too?
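The core idea in the question, placeholders in a template bound to a plain object rather than a dataset, can be sketched with reflection. This is only an illustration of the binding step, not a substitute for a reporting engine: the {PropertyName} placeholder syntax and the `ObjectTemplate` class are hypothetical, and all layout concerns are ignored.

```csharp
using System;
using System.Reflection;
using System.Text.RegularExpressions;

public static class ObjectTemplate
{
    // Matches placeholders such as {Name}; the property name is captured in group 1.
    private static readonly Regex Placeholder = new Regex(@"\{(\w+)\}");

    // Replaces each {PropertyName} placeholder with the value of the matching
    // public property of `model`. Placeholders without a matching property
    // are left untouched.
    public static string Bind(string template, object model)
    {
        if (model == null) throw new ArgumentNullException("model");
        Type type = model.GetType();
        return Placeholder.Replace(template, match =>
        {
            PropertyInfo property = type.GetProperty(match.Groups[1].Value);
            if (property == null) return match.Value;
            object value = property.GetValue(model, null);
            return value == null ? string.Empty : value.ToString();
        });
    }
}
```

For example, binding `new { Name = "Ann", Position = "Engineer" }` to the template "{Name} ({Position})" yields "Ann (Engineer)".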

A subset of WPF-P is XPS which can be used to present your objects via databinding.
One of the best choices if you are already using WPF.
Google Keywords: XPS, FixedDocument, FlowDocument, WPF Printing

Might read through this thread:
http://groups.google.com/group/nhusers/browse_thread/thread/e2c2b8f834ae7ea8
Seems a lot of people like iTextSharp
http://itextsharp.sourceforge.net/

For Word docs, look into Word's Mail Merge feature and Word automation. I did this recently in a form letter printing project. Basically what I did was create a Word template file (file extension .dot) and in this template file I defined MergeFields in a standard form letter. My application queries a database for the records it needs to print and then for each record it returns it matches fields in the database with these merge fields and sends the result (the merged doc) to the printer.
It's working really well and if I had a link that gave a definitive explanation, I'd provide it (check back here, I'll see if I can't find the most useful ones). Hopefully I've provided enough keywords to let you find your own resources. I can go into more detail if you need.
I've never had to export PDF files, but for a project I'm working on now I'll have to. For a free solution, my research has led to iTextSharp (as Will Shaver points out), but I've only done the initial investigation, and I have found a few paid solutions I might end up resorting to.
