I am generating CSV data from a C# application. This can be imported into Excel easily but I need formatting applied to the file.
One option is interop but the machine running this application will not have Office products installed so that is out.
I've been told that XML can work with Excel templates and am looking for a starter example on how to achieve this.
I have generated Excel spreadsheets using the Excel 2003 XML format several times, but you will have to consider the features that this format cannot support:
This XML Spreadsheet 2003 file format (.xml) does not retain the following features:
Auditing tracer arrows
Chart and other graphic objects
Chart sheets, macro sheets, dialog sheets
Custom views
Data consolidation references
Drawing object layers
Outlining and grouping features
Password-protected worksheet data
Scenarios
User-defined function categories
VBA projects
If that is acceptable, you can, as someone suggested, use an open source library that lets you generate the spreadsheet in code, or, as I have done, generate the XML using either an XML transform or the Spark template engine. Both have worked for me in the past, but using the Spark view engine was probably the nicest.
The best way to achieve either of these is to create a template that looks the way you want, save it in the XML Spreadsheet 2003 format, and look at the raw XML. This should make it easy for you to generate your output. You can also download the XML definition for reference.
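As a rough illustration of the "template plus raw XML" approach, here is a minimal sketch. It is in Python rather than C# or Spark, purely to keep it short; the same string templating carries over to any language. The function names, the `header` style ID and the sheet name are made up for the example, and the markup is only the small subset of the XML Spreadsheet 2003 format needed for a bold header row. A template saved from Excel itself will contain far more.

```python
from xml.sax.saxutils import escape, quoteattr

# Skeleton of an XML Spreadsheet 2003 workbook with one bold style.
# In practice you would copy this from a template saved by Excel.
WORKBOOK_TEMPLATE = """<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
 <Styles>
  <Style ss:ID="header"><Font ss:Bold="1"/></Style>
 </Styles>
 <Worksheet ss:Name={sheet_name}>
  <Table>
{rows}
  </Table>
 </Worksheet>
</Workbook>
"""


def _cell(value, style_id=None):
    """Render one <Cell>; numbers get ss:Type="Number", everything else "String"."""
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    data_type = "Number" if is_number else "String"
    style = f' ss:StyleID="{style_id}"' if style_id else ""
    return f'<Cell{style}><Data ss:Type="{data_type}">{escape(str(value))}</Data></Cell>'


def build_workbook(header, records, sheet_name="Report"):
    """Return the workbook XML as a string: a styled header row plus data rows."""
    lines = ["   <Row>" + "".join(_cell(h, "header") for h in header) + "</Row>"]
    for record in records:
        lines.append("   <Row>" + "".join(_cell(v) for v in record) + "</Row>")
    return WORKBOOK_TEMPLATE.format(sheet_name=quoteattr(sheet_name), rows="\n".join(lines))
```

Writing the returned string to a file (UTF-8, typically with an `.xml` extension) should give you something Excel can open directly, with no Office installation needed on the machine that generates it.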
You can use the excellent OpenXML wrapper ClosedXML to generate .xlsx files with formatting. Or, if you want, you can use pure OpenXML. The OpenXML SDK must be installed for ClosedXML to work.
I am developing an application which must produce a complex, well-formatted Excel report. I have previously used OpenXML successfully to fill in PowerPoint and Word templates, by first reflecting code with the OpenXML Productivity Tool and then passing a model into the reflected code and making the required changes.
I have noticed that some people recommend using ClosedXML and NPOI (e.g. here http://odetocode.com/blogs/scott/archive/2014/07/30/easily-generate-microsoft-office-files-from-c.aspx) for Excel. I've started researching ClosedXML and indeed it seems like a nice solution for creating new Excel files, but I would prefer to fill out the template I already have as it would take a month to write it from scratch.
Is it possible to reflect Excel files into ClosedXML code, similar to how it's done with the OpenXML Productivity Tool?
Do you have any examples of ClosedXML code which creates complex Excel files with multiple sheets, thousands of rows, some charts and advanced formatting?
It's quite time-consuming to edit OpenXML code when any significant changes are needed. What do you think would be the best technology to use for my task, also considering the time spent on editing?
Any other tips and tricks on manipulating Excel with C# are appreciated! Thank you for your answers.
ClosedXML answers:
You can't reflect code as you can with the OpenXML Productivity Tool, but you can load an Excel template and populate it in ClosedXML.
See the ClosedXML Wiki at https://github.com/ClosedXML/ClosedXML/wiki for examples.
I have to merge two excel files containing one sheet in each of them and I have to generate a third file containing two sheets corresponding to the two original sheets.
This task can be done using "interop" and the code works, but when the same code is run on a system that does not have MS Office installed, the process fails with an error.
Can you please guide me as to which DLL files need to be included, or how this merging could be done without using interop?
Thanks in advance.
From what I've experienced, there is unfortunately no framework way of doing this (without writing your own excel file reader). I happened across this interesting library which does just that.
http://exceldatareader.codeplex.com/
So far it has worked for our needs and requires no interop.
You should use an external component to work with Excel files. I use Syncfusion XlsIO.
If you only have raw data (no formulas, etc.) you could also just save the files using the XML Spreadsheet 2003 (*.xml) format (it's very easy to read) and process the data using standard XML tools.
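To give an idea of what "standard XML tools" means here, below is a small sketch that pulls the rows out of each worksheet of an XML Spreadsheet 2003 file. It is written in Python with only the standard library; in .NET the same thing can be done with the built-in XML classes. The function name `read_sheets` is made up for the example. It assumes plain data cells, returns every value as text, honours `ss:Index` on cells (used when empty cells are skipped) and ignores `ss:Index` on rows.

```python
import xml.etree.ElementTree as ET

SS = "urn:schemas-microsoft-com:office:spreadsheet"
NS = {"ss": SS}


def read_sheets(xml_text):
    """Map each worksheet name to a list of rows (each row is a list of cell texts)."""
    root = ET.fromstring(xml_text)
    sheets = {}
    for worksheet in root.findall("ss:Worksheet", NS):
        rows = []
        for row in worksheet.findall("ss:Table/ss:Row", NS):
            values = []
            for cell in row.findall("ss:Cell", NS):
                # ss:Index is 1-based and marks a jump over empty cells.
                index = cell.get(f"{{{SS}}}Index")
                if index is not None:
                    while len(values) < int(index) - 1:
                        values.append(None)
                data = cell.find("ss:Data", NS)
                values.append(data.text if data is not None else None)
            rows.append(values)
        sheets[worksheet.get(f"{{{SS}}}Name")] = rows
    return sheets
```

For the merging case, you would call this on both input files and then write the two row lists out as two worksheets of a new workbook in the same format.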
I am interested in writing an application that will take in an excel document of a specific format, massage the data and create a new Excel document that has different formatting.
I am curious if anyone can recommend a good place to start on this.
My first thought was to write something myself in C#. I came across this tool on CodePlex:
http://excelwrapperdotnet.codeplex.com/wikipage?title=Usage%20-%20Example&referringTitle=Documentation
But it appears to only be for Excel 2007.
Is there a best practice for doing this type of thing for Excel 2010 documents? Do I even need to program something custom to do this or does Excel offer something that might handle this?
Another nice library to modify Excel 2007/2010 documents (.xlsx) is EPPlus. It gives you a nice object model on your spreadsheets.
Excel files (.xlsx) are zip archives of XML files. They use 'Open XML'; take a look at Microsoft's Open XML documentation.
That should get you going on the right path.
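If you want to see this for yourself, any zip tool will open an .xlsx file. The short Python sketch below does the same thing in code, using only the standard library. The function names are made up for the example, and the `xl/worksheets/` prefix is simply where Excel normally puts the sheet parts; strictly speaking the locations are defined by the relationship files inside the package.

```python
import io
import zipfile


def _open_package(source):
    """Open an .xlsx package given a file path, a file-like object, or raw bytes."""
    if isinstance(source, (bytes, bytearray)):
        source = io.BytesIO(source)
    return zipfile.ZipFile(source)


def list_parts(source):
    """Return the sorted names of all parts (files) inside the package."""
    with _open_package(source) as package:
        return sorted(package.namelist())


def worksheet_parts(source):
    """Return only the parts that look like worksheets (conventional location)."""
    return [
        name
        for name in list_parts(source)
        if name.startswith("xl/worksheets/") and name.endswith(".xml")
    ]
```

Calling `list_parts("report.xlsx")` on a real workbook (the file name is just a placeholder) typically shows `[Content_Types].xml`, `xl/workbook.xml`, one XML file per sheet, and supporting parts such as styles and shared strings.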
I need to create a script that extracts some data from a complex Excel 2003 file (with multiple sheets and different tables inside a single sheet) and produces different XML files that need to be validated against a given XSD file.
My preferred language is Python; to create and validate XML files I would go with lxml.
What do you suggest for parsing XLS files?
Is xlrd the right tool to use for complex Excel files?
Or do I need to convert all the sheets to CSV manually, and read the files line by line, splitting and getting the data?
I accept C#, VB6, VBA suggestions too.
[disclaimer: I'm the author of xlrd]
xlrd is quite suited for this kind of job. Get the latest version from PyPI. Get the flavour from the tutorial found here. XLSX support is in alpha test; e-mail me if you need it. The awkwardness and lossiness of the save-as-CSV approach was one of the things that prompted me to write xlrd.
Xlrd is OK. We use it extensively to import XLS files full of references and formulas with multiple sheets and data presented in custom (not Latin-1) encoding.
I am convinced the simplest solution for this task is using Excel VBA together with the MSXML parser. Look here for some links on how to use the MSXML parser in VBA for reading XML files; you can adapt this easily for writing XML files, I think.
I can't answer whether xlrd/Python is the right tool for the job, as I don't know Python well enough.
But there are many ways to access the Excel data. The main one is VBA, which is built directly into Excel.
Then you have ADO.NET (see David Hayden's article here), which allows you to access the data from any .NET language, even IronPython.
I'm developing a printing solution for MS Office 2007. Office automation is not right for me, because it requires Office to be installed. Open XML Document Viewer is a solution for converting Word files (.docx) to HTML by XSLT transform, but it works only for .docx. Can the same technology be used for Excel spreadsheet files?
You could use this article, XSL transformation of SpreadsheetML to HTML, as a starting point to develop your own transform. You can also look at the open source XSLTs in the OpenXML/ODF Translator Add-ins for Office to get some ideas on things you may need to account for in any conversion outside of OOXML. The one thing to keep in mind is that SpreadsheetML is more similar to PresentationML than it is to WordprocessingML in file structure inside the package (i.e. for every sheet, there is a separate file).
If you're doing this from .NET, I'd do this with LINQ instead of XSLT. I've done transforms from DrawingML into SVG, and LINQ makes it easy (in terms of similar functionality to XSLT, staying within .NET, etc.)
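To show the shape of a code-based transform (as opposed to XSLT), here is a rough sketch that turns one worksheet into a bare HTML table. It is Python with only the standard library, not LINQ, but the structure maps over fairly directly. Several things are assumptions made for the example: the function names, the default sheet part `xl/worksheets/sheet1.xml` (a real converter should resolve sheets through the workbook relationships), and the decision to ignore formatting, empty-cell gaps, inline strings and date handling.

```python
import html
import io
import zipfile
import xml.etree.ElementTree as ET

MAIN = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"m": MAIN}


def _shared_strings(package):
    """Read the shared string table; cells with t="s" store an index into it."""
    if "xl/sharedStrings.xml" not in package.namelist():
        return []
    root = ET.fromstring(package.read("xl/sharedStrings.xml"))
    # Concatenate all <t> runs so rich-text entries come out as plain text.
    return [
        "".join(t.text or "" for t in si.iter(f"{{{MAIN}}}t"))
        for si in root.findall("m:si", NS)
    ]


def sheet_to_html(xlsx_bytes, sheet_part="xl/worksheets/sheet1.xml"):
    """Render the cell values of one worksheet part as a plain HTML table."""
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as package:
        strings = _shared_strings(package)
        sheet = ET.fromstring(package.read(sheet_part))
    lines = ["<table>"]
    for row in sheet.findall("m:sheetData/m:row", NS):
        cells = []
        for cell in row.findall("m:c", NS):
            value = cell.findtext("m:v", default="", namespaces=NS)
            if cell.get("t") == "s":
                value = strings[int(value)]  # look up the shared string
            cells.append(f"<td>{html.escape(value)}</td>")
        lines.append("<tr>" + "".join(cells) + "</tr>")
    lines.append("</table>")
    return "\n".join(lines)
```

Usage would be along the lines of `sheet_to_html(open("book.xlsx", "rb").read())`, where the file name is a placeholder. Carrying over fonts, fills and column widths means also reading `xl/styles.xml`, which is where most of the real work in such a converter goes.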
If you're looking at Excel 97-03 (xls) or Excel 2007 (xlsx) files then I'd recommend FlexCel. I've used it; it is very good and honestly quite cheap compared to its competition.
Note that I don't think it fully supports all the formatting present in Excel 2007 yet. But it does have built-in functionality to export to HTML.
You could write a SpreadsheetML parser. The schema is available online from Microsoft.
I wrote one a while back that covered data, structure and basic formatting, in order to pass it through a library and re-save it as an XLS file. It wasn't too difficult.