ExcelDNA / Managed XLL / Excel Interop - c#

I find it quite unbelievable that the interop API is such a mess.
A lot of methods have no comments and seem to have been done very poorly.
Has anyone else experienced the same, and if so, what library do you use to control Excel from C#?

The obvious practical problem with the VSTO/COM Interop technology is the overhead incurred when transitioning between worksheet and managed code. (And if you're trying to talk to Excel without the help of VSTO, stop doing so and save yourself some huge headaches). I thought VSTO did a pretty good job of providing a close analog of the Excel object model in the managed environment - certainly I didn't need to spend much time trying to understand much more about .NET Interop.
For longer-running automation activities the overhead's not so much of a problem, and similar concerns to VBA automation apply: reduce calls across the interface as far as possible to get best performance.
For smaller, faster worksheet function-type work (the sort of thing where we might write an XLL, say) that overhead can be a killer. ExcelDNA seems to be a great way into delivering managed code through the XLL model - and the price is right.
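To make the "reduce calls across the interface" advice concrete, here is a minimal sketch. It is written in Python only to keep it short and runnable; the same shape applies to C# interop. `FakeSheet` is a hypothetical stand-in for a worksheet behind a costly boundary, not a real Excel API; all it does is count how often the boundary is crossed.

```python
class FakeSheet:
    """Hypothetical stand-in for a worksheet behind a costly COM boundary."""

    def __init__(self):
        self.cells = {}          # (row, col) -> value
        self.boundary_calls = 0  # how many times we "crossed" into Excel

    def set_cell(self, row, col, value):
        # One crossing per cell.
        self.boundary_calls += 1
        self.cells[(row, col)] = value

    def set_block(self, top, left, rows):
        # One crossing for the whole rectangular block.
        self.boundary_calls += 1
        for r, row_values in enumerate(rows):
            for c, value in enumerate(row_values):
                self.cells[(top + r, left + c)] = value


def write_cell_by_cell(sheet, data):
    for r, row_values in enumerate(data, start=1):
        for c, value in enumerate(row_values, start=1):
            sheet.set_cell(r, c, value)


def write_in_one_call(sheet, data):
    sheet.set_block(1, 1, data)


data = [[r * c for c in range(1, 51)] for r in range(1, 101)]  # 100 x 50

slow, fast = FakeSheet(), FakeSheet()
write_cell_by_cell(slow, data)
write_in_one_call(fast, data)

print(slow.boundary_calls)  # 5000
print(fast.boundary_calls)  # 1
```

Both sheets end up with identical contents; only the number of crossings differs. With real interop the usual equivalent is to assign a whole 2-D array to a range in one go rather than looping over individual cells.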

SpreadsheetGear for .NET is an Excel compatible spreadsheet component for .NET. It will not enable you to control Excel, but it will give you an Excel compatible spreadsheet engine for ASP.NET / WinForms / etc... that can create, read, modify, view, edit, format, calculate, print and write Excel workbooks and charts from .NET. Since SpreadsheetGear is 100% safe managed code, there is no per-call performance penalty like you get with Excel.
The SpreadsheetGear API is very similar to Excel's API - except for the fact that many APIs are more strongly typed so they tend to be easier to use from C# than Excel's API.
You can see a feature list here, live ASP.NET reporting / charting / dashboard / calculation samples for VB and C# here and download the free trial here.
Disclaimer: I own SpreadsheetGear LLC

Related

c# reading xls file not working when microsoft office Excel is not installed? [duplicate]

A client wants to "Web-enable" a spreadsheet calculation -- the user specifies the values of certain cells, and is then shown the resulting values in other cells.
(They do NOT want to show the user a "spreadsheet-like" interface. This is not a UI question.)
They have a huge spreadsheet with lots of calculations over many, many sheets. But, in the end, only two things matter -- (1) you put numbers in a couple cells on one sheet, and (2) you get corresponding numbers off a couple cells in another sheet. The rest of it is a black box.
I want to present a UI to the user to enter the numbers they want, then I'd like to programmatically open the Excel file, set the numbers, tell it to re-calc, and read the result out.
Is this possible/advisable? Is there a commercial component that makes this easier? Are there pitfalls I'm not considering?
(I know I can use Office Automation to do this, but I know it's not recommended to do that server-side, since it tries to run in the context of a user, etc.)
A lot of people are saying I need to recreate the formulas in code. However, this would be staggeringly complex.
It is possible, but not advisable (and officially unsupported).
You can interact with Excel through COM or the .NET Primary Interop Assemblies, but this is meant to be a client-side process.
On the server side, no display or desktop is available and any unexpected dialog boxes (for example) will make your web app hang – your app will be flaky.
Also, attaching an Excel process to each request isn't exactly a low-resource approach.
Working out the black box and re-implementing it in a proper programming language is clearly the better (as in "more reliable and faster") option.
Related reading: KB257757: Considerations for server-side Automation of Office
You definitely don't want to be using interop on the server side; it's bad enough using it as a kludge on the client side.
I can see two options:
1) Figure out the spreadsheet logic. This may benefit you in the long term by making the business logic a known quantity, and in the short term you may find that there are actually bugs in the spreadsheet (I have encountered tons of monster spreadsheets used for years that turn out to have simple bugs in them - everyone just assumed the answers must be right).
2) Evaluate SpreadsheetGear for .NET, which is basically a replacement for interop that does it all without Excel (it replicates a huge chunk of Excel's non-visual logic and IO in .NET).
Although this is certainly possible using ASP.NET, it's very inadvisable. It's un-scalable and prone to concurrency errors.
Your best bet is to analyze the spreadsheet calculations and duplicate them. Now, granted, your business is not going to like the time it takes to do this, but it will (presumably) give them a more usable system.
Alternatively, you can simply serve up the spreadsheet to users from your website, in which case you do almost nothing.
Edit: If your stakeholders really insist on using Excel server-side, I suggest you take a good hard look at Excel Services as @John Saunders suggests. It may not get you everything you want, but it'll get you quite a bit, and should solve some of the issues you'll end up with trying to do it server-side with ASP.NET.
That's not to say that it's a panacea; your mileage will certainly vary. And SharePoint isn't exactly cheap to buy or maintain. In fact, short-term costs could easily be dwarfed by long-term costs if you go the SharePoint route--but it might be the best option to fit a requirement.
I still suggest you push back in favor of coding all of your logic in a separate .NET module. That way you can use it both server-side and client-side. Excel can easily pass calculations to a COM object, and you can very easily publish your .NET library as COM objects. In the end, you'd have a much more maintainable and usable architecture.
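As a rough illustration of what such a separate calculation module could look like, here is a minimal sketch. It is written in Python for brevity; the answer itself is talking about a .NET library. Everything in it is an assumption for illustration: the names `Inputs`, `Outputs` and `calculate` are hypothetical, and the compound-interest formula is only a placeholder for whatever the real spreadsheet computes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inputs:
    """The handful of cells the user fills in (hypothetical names)."""
    principal: float
    annual_rate: float
    years: int


@dataclass(frozen=True)
class Outputs:
    """The handful of result cells the user gets back (hypothetical names)."""
    final_balance: float
    interest_earned: float


def calculate(inputs: Inputs) -> Outputs:
    # Placeholder formula standing in for the spreadsheet's real logic:
    # yearly compound interest.
    balance = inputs.principal * (1 + inputs.annual_rate) ** inputs.years
    return Outputs(
        final_balance=round(balance, 2),
        interest_earned=round(balance - inputs.principal, 2),
    )


result = calculate(Inputs(principal=1000.0, annual_rate=0.05, years=2))
print(result.final_balance)    # 1102.5
print(result.interest_earned)  # 102.5
```

Because `calculate` takes plain values and returns plain values, with no reference to a workbook, the same function can sit behind a web page on the server or be called from the spreadsheet on the client.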
Leaving aside the discussion of whether it makes sense to manipulate an Excel sheet on the server side, one way to do this would be to use Microsoft.Office.Interop.Excel.dll.
Using this library, you can tell Excel to open a spreadsheet, and change and read its contents from .NET. I have used the library in a WinForms application, and I guess that it can also be used from ASP.NET.
Still, consider the concurrency problems already mentioned... However, if the sheet is accessed infrequently, why not...
The simplest way to do this might be to:
Upload the Excel workbook to Google Docs -- this is very clean, in my experience
Use the Google Spreadsheets Data API to update the data and return the numbers.
Here's a link to get you started on this, if you want to go that direction:
http://code.google.com/apis/spreadsheets/overview.html
Let me be more adamant than others have been: do not use Excel server-side. It is intended to be used as a desktop application, meaning it is not intended to be used from random different threads, possibly multiple threads at a time. You're better off writing your own spreadsheet than trying to use Excel (or any other Office desktop product) from a server.
This is one of the reasons that Excel Services exists. A quick search on MSDN turned up this link: http://blogs.msdn.com/excel/archive/category/11361.aspx. That's a category list, so contains a list of blog posts on the subject. See also Microsoft.Office.Excel.Server.WebServices Namespace.
It sounds like you're saying that the user has the spreadsheet open on their local system, and you want a web site to manipulate that local spreadsheet?
If that's the case, you can't really do that. Even Office automation won't help, unless you want to require them to upload the sheet to the server and download a new altered version.
What you can do is create a web service to do the calculations and add some VBA or VSTO code to the Excel sheet to talk to that service.

Wrapper for Excel Interop that allows "good" programming

I'm fairly new to programming with Interop.Excel and may lack the experience, but why is it such a pain to program with it? Casting objects everywhere, more or less no documentation on the methods (at least not in the code) and every method seems to return objects, so there is no way of telling what it actually does. Also, all arguments are objects to begin with.
Is there any good wrapper library out there that provides basic functionality (writing to cells, reading, creating sheets, deleting sheets, basic formatting and layouting) and does this in a good, clean and understandable (and by that I mean: well-documented) way?
PS: Working with C# and .NET-3.5
I'm using EPPlus and very happy with it. It's open source and free.
The owners have offered support whenever it was required.
EPPlus on codeplex
Note that EPPlus doesn't support the Excel 2003 format: .xlsx is supported while .xls is not.
Well, SpreadsheetGear is quite pricey, but in my opinion worth the money if you are doing serious commercial work with Excel. As well as being the fastest option, it doesn't even require Excel to be installed; anyone who has run Excel on a server will understand why that is huge.
http://www.spreadsheetgear.com
If you're having problems with using interop directly, you may want to try out OfficeWriter. It can do anything you've described above with Excel, and more. You can request a free trial. There's a fully documented API available at the documentation site.
DISCLAIMER: I'm one of the engineers who built the latest version.
Try ClosedXML, it wraps the OpenXML SDK to give a very intuitive way to create Excel files.

Excel macro run from the web?

I have been handed a critical macro that takes an old-school file full of invoices which thankfully is quite consistent. The macro reads this file, moves the data around to make it consistent and then generates a three-tab spreadsheet which is pretty much three CSVs. It then generates from these three CSVs another spreadsheet which has a tab for each invoice. The number of invoices can really vary.
It works, everyone is happy. We would like to put this out on the web with some security. For now, have it so that the user:
1) Logs in, uploads the old-school file and presses process, which will then spit out the same spreadsheet with each tab being an invoice.
2) Store the data in a database for future growth and use of this data, as well as reporting.
I'm teaching myself ASP.NET and C# and think this would be a great learning project. Before I jump into it, can this realistically be done and what would others recommend in this case? Should I simply re-write based off the logic in the macro or is there a way to port over existing VBA code?
You can do it with the Excel COM API, but this tends to lead to memory leaks; I would not recommend it.
Microsoft has Excel Services, which allows you to run Excel spreadsheets on the server, but it is very expensive and may not support macros.
SpreadSheetGear may be able to do it. But I have not tested it myself.
I would recommend that you rewrite the application in C#, you would get a better solution, and it may not take you any longer than getting the spreadsheet running on the server.
Using the Excel COM API from a web application is difficult. There are security issues which are non-trivial to address. If you wanted to retain the Excel processing then you could build some sort of out-of-band process which monitors an upload directory and, when it detects a new file, kicks off a process of transforming the file as the old macro used to.
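A minimal sketch of such an out-of-band process follows, in Python for brevity; the same structure would work as a small C# console app or Windows service. The function names, the directory layout and the `transform` callback are all hypothetical, and `transform` stands in for whatever actually converts the invoice file (driving Excel, or plain code).

```python
import tempfile
import time
from pathlib import Path


def process_new_files(upload_dir, done_dir, transform):
    """Run `transform` once on every file in upload_dir, then move it to done_dir."""
    upload_dir, done_dir = Path(upload_dir), Path(done_dir)
    done_dir.mkdir(parents=True, exist_ok=True)
    processed = []
    for path in sorted(upload_dir.iterdir()):
        if not path.is_file():
            continue
        transform(path)                      # e.g. drive Excel, or pure code
        path.rename(done_dir / path.name)    # so it is never picked up twice
        processed.append(path.name)
    return processed


def watch(upload_dir, done_dir, transform, poll_seconds=5.0):
    """Poll forever; a single worker means jobs never overlap."""
    while True:
        process_new_files(upload_dir, done_dir, transform)
        time.sleep(poll_seconds)


# Small demonstration with a temporary directory and a dummy transform.
with tempfile.TemporaryDirectory() as tmp:
    uploads, done = Path(tmp) / "uploads", Path(tmp) / "done"
    uploads.mkdir()
    (uploads / "invoices.txt").write_text("raw invoice data")

    seen = []
    first = process_new_files(uploads, done, lambda p: seen.append(p.read_text()))
    second = process_new_files(uploads, done, lambda p: seen.append(p.read_text()))

    print(first)   # ['invoices.txt']
    print(second)  # []
```

Because the web application only drops files into the upload directory and a single worker handles them one at a time, two users uploading at once cannot trip over each other inside Excel. A real version would also need to make sure a file has finished uploading before it is picked up.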
There is no easy transition from VBA to C#, since all the VBA code assumes the existence of Excel, which may not be the case. However, you can call macros in workbooks using the COM API.
Driving Excel from C# is surprisingly hard to get 100% right. Conversely, driving Excel from a VB6 application is surprisingly easy. But, calling this from a web application makes it harder, since you need to deal both with security and concurrency (2 users at once will trip over each other).
Microsoft don't support the use of Excel on the server (apart from Excel Services), so don't expect any help there. SpreadsheetGear is suited to this, but you'd have to pay for it.
You say this would make a good learning project - I'd disagree; it's likely to put you off programming altogether. This particular mix doesn't have a "nice" solution - it's a case of finding the least-unpleasant hack. If you want to learn ASP.NET & C#, I'd say find another pet project.

How to speed up generation of Word files from C#?

I'm working on an application that generates a relatively large amount of Word output. Currently, we're using Word Interop services to do the document creation, but it's quite slow, especially in older (pre-2007) versions of Office. We'd like to speed up the generation.
I haven't done a lot of profiling yet, but I'm pretty confident that the problem is that we're making tons of COM calls. I'm hoping that profiling will yield a subset of calls that are slower than the others, but my gut tells me that it's probably a question of COM overhead (or Word Interop overhead), and not just a few slow calls.
Also, the product can generate HTML output, and that process (a) is very fast, and (b) uses pretty much the same codepaths, just with a different subclass for the HTML-specific pieces of functionality. So I'm pretty sure that our algorithm isn't fundamentally slow.
So... I'm looking for suggestions for alternate ways to accelerate the generation of Word files.
We can't just rename the generated HTML files to .doc, and we can't generate RTF instead -- in both cases, important formatting information gets lost, and in the RTF case, inlined graphics don't work robustly.
One of the approaches we're evaluating is programmatically generating and opening a Word file (via interop) from a template that has a macro that knows how to consume a flat file and create the requisite output. We're interested in feedback about that approach, as well as any other ideas for speeding things up.
If you can afford it, I'd recommend the Aspose.Words product. Very fast, and Word does not need to be installed.
Also, it's much easier to use than Office interop.
Your macro approach is exactly how we sped up slow Excel interop (using version 2003, I think).
We found (at least with Excel) that much of the slowness was due to repeated individual calls via the interop. We started to bunch up commands (i.e. format large ranges, and then change specific cells as required rather than formatting each cell individually), and logically moved on to macros.
I think that the macro + template approach would happily translate.

Effort estimation: using C / Win32 or learning C# / .NET

I intend to write a small application to scratch a personal itch and probably make the life of some colleagues easier. Here is what I have:
10+ years of experience in C
Plenty of experience in programming against the Win16/32 API in C from the Win3.1 to 2000 days.
C library written by myself already doing about 75% of what the application shall do.
What the application shall do:
open a binary, feed it into the mentioned library.
take the resulting text output and feed it into a new Excel Workbook.
apply some formatting.
integrate nicely with the Windows environment (availability in "Open With...", remember some stuff using the registry etc.)
(maybe later) before giving the CSV data to Excel, parse it by looking up the meaning of some values in an XML file.
Except for the XML parsing part I have done all of that stuff before including COM / Office Automation in C/Win32. There is a lot of boilerplate code involved, but it is doable and the result will be a pretty small application without the need for an installer.
So why even think about C# / .Net?
no experience with parsing XML
the promise of less boilerplate code for the Windows and Excel stuff (yes, I have done C++ with OWL, MFC, ATL etc. but I am not going there anymore - not for free/fun)
Since I also have experience with C++, VB (not .Net) and a little Java / Objective-C, I suppose learning C# will all be about the .Net libraries and not actually about the language.
My considerations so far:
Learning .NET might be fun and might result in less code / first steps in a more modern environment.
Sticking with what I know will lead to a predictable outcome in terms of effort and function (except for the optional XML stuff)
VB looked great at the beginning until the projects were about 80% done, then the pain started and the DLL coding in C. I am concerned history could repeat itself if I choose .Net.
My primary objective is the functionality. Effort is a concern. The XML parsing is optional.
Please advise.
Update: one thing I forgot to mention explicitly is that I am also worried about easy deployment of the tool to my co-workers. With Win32 I am pretty sure I can come up with an EXE file < 1 MB that can be easily emailed and does not require installation. With .Net not so much. Can I create the necessary MSI or whatever in Visual Studio Express (free) or do I need 3rd party tools?
As others have your question mostly covered, I'd just like to quickly comment on your considerations:
Learning .NET might be fun and might result in less code / first steps in a more modern environment.
Totally agreed. It is definitely fun and usually it does result in less code. The investment you make now will certainly benefit you in future projects. It is way faster to program in .Net than in C. Not only is it easier, but it is also safer. You are isolated from many programming errors common in C, mostly related to memory mismanagement. You also get a very complete managed API to do stuff for which you would otherwise need to build your own framework.
Sticking with what I know will lead to a predictable outcome in terms of effort and function (except for the optional XML stuff)
Hence your indecision. :-)
VB looked great at the beginning until the projects were about 80% done, then the pain started and the DLL coding in C. I am concerned history could repeat itself if I choose .Net. My primary objective is the functionality. Effort is a concern. The XML parsing is optional.
.Net is an entirely different beast from VB. Most of the things you wouldn't be able to do in VB, or at least do them easily, are supported by .Net. For instance, Windows Services are a snap to build in .Net. Socket programming is also supported, but there are very few reasons to do it yourself, as you've got loads of communication APIs with .Net. You've got web-services, .Net Remoting, MSMQ management, and more recently WCF. Proper multithreading is supported by .Net, unlike the idiotic apartment model in VB. In case you really need to go low level, you can also actually use pointers in C#, inside of unsafe code blocks, even though I would never advise to do so.
If you really need to do things in C, then integrating is also relatively easy. You can create COM objects and use interop to work with them from .Net. You can also interact directly with plain ol' dlls using DllImport. Using www.pinvoke.net makes it easier.
When I developed in VB, sometimes I also had to go back to C++ to do stuff that I wasn't able to do in VB. Since I began programming in .Net, the only (extremely rare) scenarios where I needed to go back to C++ were when I had to use legacy COM components that used types I was having a hard time marshalling via interop. I wouldn't worry about history repeating itself.
If you're using COM, you may be interested in using C# 4.0 instead of earlier versions - the downside being that it's only in beta. But basically it makes COM stuff somewhat less ugly for various reasons.
I'd expect there to be plenty of good C libraries for XML parsing by now. I would expect the main benefit to actually be the knowledge gained. I doubt that you'll actually produce the code faster for this project, but the next one may well be a lot quicker.
How much do you care about learning new stuff?
It sounds like an ideal project for learning C# & .NET.
You know most of what you need to do so you can use that to gain a base level of understanding of C# & .NET which you can then apply to the stuff you need to learn.
As Rune says though, a key driver could be the timescales. If this is something you need in a hurry then coding it in C & using win32 directly might be the answer.
Sorry I couldn't be more definite.
I think you should use C#. With your experience the learning curve won't be too steep. The code will ultimately be cleaner (and there will be less of it) than you could probably manage with C/Win32.
There is probably going to be no problem using your existing C-library with the [DllImport] attribute.
It depends. :-) It depends on whether you want to do this quickly or if you want to learn something new. It depends on whether you will be the only maintainer of the code or if others will maintain it in the future. It depends on how complex your XML handling will be and on how complex the COM automation is.
You will probably get a working application quicker if you do it in C than in C#, both because you already have much of what you need in place and because you know C well.
But this project sounds like a good match for C# and .Net. .Net has great support for XML and COM interop is easy but clumsy in C# (much better in the next version!). So if you are interested in learning C# and .Net this would be a good project to do so.
I would definitely do this in .Net and probably C# (but I am biased). Using .Net would probably result in code that is easier to read and maintain and most probably easier to write. So if you are interested in learning C# I would suggest you go for it!
Edit:
You worry about the size of the executable if you write it in .Net. I doubt that will be a problem, since most if not all of the libraries you will use for a project like this will already be installed on your computer. 1 MB is rather large for a .Net executable, even for a big project.
A short note on installation: .NET applications are xcopy-deployable by default, so you wouldn't need an installer for the exe to be usable. Mail it around (or, with the next release of the .NET framework, optionally leave it on a network share).
You could look at building a hybrid system that uses C++/CLI and C#. C++/CLI provides a nice bridge between the two and lets you easily split different parts of the system between the managed and unmanaged worlds.
Not sure if the setup projects are included in the free versions of Visual Studio. But you could use ClickOnce (included with the framework) or WiX (an open source, XML-based MSI creation tool).
learning C# will all be about the .Net libraries and not actually about the language
No, there are many things you need to learn about the language (delegates, events, generics...). It is also object-oriented, it manages memory by itself and, yes, there are no pointers (outside unsafe code) :)
Anyway, C# and .NET are great; all you need is some effort to get up to speed.
