Excel interop writes excel files that contain "empty" cells - c#

I have a C# tool that creates Excel worksheets, which are later read in by another tool. The worksheets are created using Excel interop.
When reading the generated Excel file, I get an exception: OleDbException: Too many fields defined.
This means the file cannot be read because there are too many columns, but there should not be, as the real content only takes up about 90 columns. As a workaround I deleted all the other columns manually in Excel and tried to read the file again.
This works as expected, which means the generated file does contain non-empty cells (which are shown as empty cells in Excel...).
Is there a way to tell the interop not to create empty cells, or is there another cause I should check?
Many thanks
Tom
PS: I am experiencing this problem with the 2003 interop libraries, while I have Office 2007 installed.

I've found the solution:
The tool was copying ranges from one sheet to another.
The source range was properly defined:
GetRange(fromWorkbookName, fromSheetName, A1, V20);
The destination cell, however, was addressed by:
GetRange(toWorkbookName, toSheetName, A1).EntireRow;
With the 2003 interop this seemed to be no problem.
When doing this towards an xlsx file in Office 2010, some "empty" fields containing "#N/A" were created.
Nasty nasty...
Anyhow, the tool was not doing the copy/paste correctly (even Excel itself warns you when you manually copy a range to an entire row). After correcting this it seems to work.

Related

Calculate emits 'Number of opened and closed parentheses does not match'

I have a protected worksheet that the provider will not unlock because it has proprietary formulas.
When I use the worksheet in Excel it works just fine. However, when I use EPPlus and call calculate it emits an error 'Number of opened and closed parentheses does not match'. I suspect one of the nested formulas is poorly formatted and Excel is just more tolerant.
Is there a setting to get around this (like there is for circular references)?
Alternatively, is there a way to scan the whole workbook to find which of the nested formulas may be incorrect (so I could get the supplier of the sheet to correct it)?
I see there is a module "EPPlus/FormulaParsing/LexicalAnalysis/SyntacticAnalyzer.cs" referenced at https://github.com/antiufo/epplus/blob/master/EPPlus/FormulaParsing/LexicalAnalysis/SyntacticAnalyzer.cs but I can find no examples of how to use this.
Gee, mention that protection is involved and the first thought is that I am trying to get around it. I clearly stated that the two simply behave differently (Excel calculates, EPPlus throws an error). I mentioned the protection so you would not tell me to change the sheet (I can't) or to look at the formula in Excel (I can't).
The issue is Excel calculates just fine, EPPlus does not. The evaluation engines differ and EPPlus throws errors when Excel does not.
The error message said 'Number of opened and closed parentheses does not match', but I later figured out that the problem was quotes, not parentheses. The formula was something like =IF(A1="""", 0, 1).
As it turns out, despite the protection, EPPlus does let you circumvent the protection and read all the formulas. Thanks for the hint. And it turns out it also cannot evaluate them correctly (that is, the way Excel does). Its rules about quotes are different. It is also missing functions (NORMSINV throws a #NAME? error), and it fails on circular references that Excel handles (when the circular reference is in a VLOOKUP). Pretty much, it cannot yet be relied on to calculate a sheet as Excel would.

Reading particular columns of Excel sheet

I need to read particular columns from an Excel sheet (say Columns A,P,Q,B) and also some particular cells (say C3 or D10). I do not need it to be displayed in a DataGrid view or anything (all examples I have seen use DataGrid).
How do I do that and write them into a new CSV file?
I have no sample code as I do not know how to proceed.
I suggest you use the ExcelReaderInterop library.
using Microsoft.Office.Interop.Excel;
Detailed example can be found here:
http://www.dotnetperls.com/excel
I had to deal with a similar issue a few weeks ago and could not find a simpler way to deal with this. The post suggests this overkill approach may be due to lots of legacy code in the library.
We have successfully used the Microsoft Access Database Engine to open and read Excel files. The "2010 Access Redistributable" can also be installed on a server free of charge. What you are asking for is a multi-step process:
Open a connection to the file using the Access OleDbConnection. In the connection string the "Data Source" is the file name.
Select the appropriate worksheet, which will return a DataTable object.
Grab a row from the data table, or iterate over all of them via myDataTable.Rows.
Access the column in question.
This post shows some of the process:
SSIS Excel Source Connection. What does it use to read Excel?
Hopefully this gets you pointed in the right direction.
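The steps above could be sketched roughly as follows. This is only an illustration under several assumptions: the ACE OLE DB provider (installed by the Access Database Engine redistributable) is available, the data starts in column A of the sheet, and the class and method names (ExcelColumnsToCsv, ColumnIndex, CsvField, Export) are made up for this example. With HDR=NO the provider treats the first row as data, so columns are picked by position rather than by header name.

```csharp
using System;
using System.Data;
using System.Data.OleDb;
using System.Globalization;
using System.IO;
using System.Linq;

public static class ExcelColumnsToCsv
{
    // Converts a column letter to a zero-based index: "A" -> 0, "P" -> 15, "AA" -> 26.
    public static int ColumnIndex(string letters)
    {
        int n = 0;
        foreach (char c in letters.ToUpperInvariant())
            n = n * 26 + (c - 'A' + 1);
        return n - 1;
    }

    // Quotes a value for CSV when it contains a comma, a quote, or a line break.
    public static string CsvField(object value)
    {
        string s = Convert.ToString(value, CultureInfo.InvariantCulture) ?? "";
        if (s.IndexOfAny(new[] { ',', '"', '\r', '\n' }) < 0) return s;
        return "\"" + s.Replace("\"", "\"\"") + "\"";
    }

    // Reads the whole sheet into a DataTable, then writes only the requested columns.
    public static void Export(string excelPath, string sheetName, string[] columns, string csvPath)
    {
        // Assumed connection string for .xlsx files; the "Data Source" is the file name.
        string connectionString =
            "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + excelPath +
            ";Extended Properties=\"Excel 12.0 Xml;HDR=NO;IMEX=1\"";

        var table = new DataTable();
        using (var connection = new OleDbConnection(connectionString))
        using (var adapter = new OleDbDataAdapter("SELECT * FROM [" + sheetName + "$]", connection))
        {
            adapter.Fill(table); // Fill opens and closes the connection itself.
        }

        int[] indexes = columns.Select(ColumnIndex).ToArray();
        using (var writer = new StreamWriter(csvPath))
        {
            foreach (DataRow row in table.Rows)
            {
                // Columns beyond the used range are written as empty fields.
                var fields = indexes.Select(i => i < table.Columns.Count ? CsvField(row[i]) : "");
                writer.WriteLine(string.Join(",", fields));
            }
        }
    }
}
```

A call such as ExcelColumnsToCsv.Export(path, "Sheet1", new[] { "A", "P", "Q", "B" }, csvPath) would write those four columns in that order. A single cell such as C3 can probably be fetched the same way by querying a one-cell range like [Sheet1$C3:C3], though that is worth verifying against your provider version.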
I copied the columns that I needed to another excel sheet and saved it as CSV. Then read this csv to perform the task.
This was the easiest option, as the machine I was supposed to run the program on didn't have Microsoft Office.

Prompted to Save Changes on file created with EPPlus

I am creating a series of Excel Workbooks using EPPlus v3.1.3. When I open the newly created files, if I close it without touching anything it asks me if I want to save my changes. The only thing I've noticed changes if I say "yes" is that the app.xml file is slightly altered - there is no visible difference in the workbook, and the rest of the XML files are the same. I have tried both of these approaches:
ExcelPackage p = new ExcelPackage(new FileInfo(filename));
p.Save();
as well as
ExcelPackage p = new ExcelPackage();
p.SaveAs(new FileInfo(filename));
and both have the same problem. Is there a way to have the app.xml file output in its final form?
The reason this is an issue is because we use a SAS program to QC, and when the SAS program opens the files as they have been directly output from the EPPlus program it doesn't pick up the values from cells that have formulas in them. If it is opened and "yes" is chosen for "do you want to save changes", it works fine. However, as we are creating several hundred of these, that is not practical.
Also, I am using a template. The template appears normal.
What is particularly strange is that we have been using this system for well over a year, and this is the first time we have encountered this issue.
Is there any way around this? On either the C# or SAS side?
What you are seeing is not unusual, actually. EPPlus does not generate a full XLSX file the way Excel does; rather, it creates the raw XML content (all Office 2007 document formats are XML-based) and places it in a zip file that is given the .xlsx extension. Since the file has not been run through the Excel engine, it has not been fully formatted to Excel's liking.
If it is a simple data sheet then chances are Excel does not have to do much calculation - just basic formatting. So in that case it will not prompt you to save. But even then if you do you will see it change the XLSX file a little. If you really want to see what it is doing behind the scenes rename the file to .zip and look at the xml files inside before and after.
The problem you are running into arises because this is not just a simple table export: Excel has to run calculations when the file is opened for the first time. This could be caused by many things, such as formulas, autofilters, auto column/row height adjustments, outlining, etc. Basically, anything that will make the sheet look a little "different" after Excel gets done with it.
Unfortunately, there is no easy fix for this. Running it through excel's DOM somehow would be simplest which of course defeats the purpose of using EPPlus. The other thing you could do is see the difference between the before and after of the xml files (and there are a bunch in there you would have to look at) and mimic what excel would change/add in the "after" file version by manually editing the XML content. This is not a very pretty option depending on how extensive the changes would be. You can see how I have done it in other situations here:
Create Pivot Table Filters With EPPLUS
Adding a specific autofilter on a column
Set Gridline Color Using EPPlus?
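The before/after comparison described above does not have to be done by hand. Here is a minimal sketch that opens two .xlsx packages as zip archives and lists the parts whose bytes differ. It assumes .NET 4.5 or later (for System.IO.Compression), and the class and method names (XlsxDiff, ReadParts, ChangedParts) are invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;

public static class XlsxDiff
{
    // Reads every part of an .xlsx package (a zip archive) into memory, keyed by part name.
    public static Dictionary<string, byte[]> ReadParts(Stream xlsx)
    {
        var parts = new Dictionary<string, byte[]>();
        using (var zip = new ZipArchive(xlsx, ZipArchiveMode.Read, true))
        {
            foreach (var entry in zip.Entries)
            {
                using (var source = entry.Open())
                using (var buffer = new MemoryStream())
                {
                    source.CopyTo(buffer);
                    parts[entry.FullName] = buffer.ToArray();
                }
            }
        }
        return parts;
    }

    // Returns the names of parts that were added, removed, or changed between the two packages.
    public static List<string> ChangedParts(Stream before, Stream after)
    {
        var a = ReadParts(before);
        var b = ReadParts(after);
        return a.Keys.Union(b.Keys)
            .Where(name => !a.ContainsKey(name) || !b.ContainsKey(name) || !a[name].SequenceEqual(b[name]))
            .OrderBy(name => name, StringComparer.Ordinal)
            .ToList();
    }
}
```

Running ChangedParts on the file as written by EPPlus and on the copy re-saved by Excel (for instance via File.OpenRead) should narrow things down to the handful of XML parts worth reading side by side. Excel tends to rewrite many parts when it saves, so expect more than just docProps/app.xml in the list.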
I ran into this same issue using EPPlus (version 4.1.0, fyi) and found adding the following code before closing fixed the problem:
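// Have EPPlus compute the formula results so they are stored in the file.
p.Workbook.Calculate();
// Ask Excel not to force a full recalculation when the file is opened.
p.Workbook.FullCalcOnLoad = false;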

Merging two excel files in c# without using interop

I have to merge two excel files containing one sheet in each of them and I have to generate a third file containing two sheets corresponding to the two original sheets.
This task can be done using "interop" and the code works but when the same code is run in a system that does not contain MS Office, the process fails and an error comes up.
Can you please guide me as to which DLL files need to be included, or how this merge could be done without using interop?
Thanks in advance.
From what I've experienced, there is unfortunately no framework way of doing this (without writing your own excel file reader). I happened across this interesting library which does just that.
http://exceldatareader.codeplex.com/
So far it has worked for our needs and requires no interop.
You should use an external component to work with Excel files. I use Syncfusion XlsIO.
If you only have raw data (no formulas, etc.), you could also just save the files in the XML Spreadsheet 2003 (*.xml) format (it's very easy to read) and process the data using standard XML tools.
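As a rough sketch of that last idea, the following uses System.Xml.Linq to copy the Worksheet elements of one XML Spreadsheet 2003 document into another, producing a single workbook with both sheets. The names SpreadsheetXmlMerge and Merge are made up for this example. It assumes raw data only: cell styles are not reconciled, so any ss:StyleID references in the second file would point at the first file's Styles section.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SpreadsheetXmlMerge
{
    // Namespace used by the XML Spreadsheet 2003 format.
    static readonly XNamespace Ss = "urn:schemas-microsoft-com:office:spreadsheet";

    // Returns a copy of 'first' with all worksheets of 'second' appended.
    public static XDocument Merge(XDocument first, XDocument second)
    {
        var merged = new XDocument(first);

        // Track sheet names already in use, since they must be unique within a workbook.
        var names = new HashSet<string>(
            merged.Root.Elements(Ss + "Worksheet")
                  .Select(w => (string)w.Attribute(Ss + "Name"))
                  .Where(n => n != null),
            StringComparer.OrdinalIgnoreCase);

        foreach (var sheet in second.Root.Elements(Ss + "Worksheet"))
        {
            var copy = new XElement(sheet);
            string name = (string)copy.Attribute(Ss + "Name") ?? "Sheet";

            // Append " (2)", " (3)", ... until the name no longer collides.
            string unique = name;
            for (int i = 2; !names.Add(unique); i++)
                unique = name + " (" + i + ")";

            copy.SetAttributeValue(Ss + "Name", unique);
            merged.Root.Add(copy);
        }
        return merged;
    }
}
```

Typical use would be Merge(XDocument.Load(pathA), XDocument.Load(pathB)).Save(pathOut). Excel can open the result directly, and no Office installation is needed on the machine doing the merge.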

How do I set locked cell protection options in Excel with C# and Interop?

Here's the background info. I have an app that writes to an excel 2007 .xlsm file and I am using C# and the Excel 12.0 interop object libraries to do it, along with Visual Studio 2010. I am able to change the cell values and formulas, set the font and font style, set the cells to locked or not, etc. The last thing I need to do is to set the protection of the sheet to disallow selection of locked cells.
When I try to call this code, as a test of general sheet protection...
((Excel.Worksheet)excelApp.ThisWorkbook.Sheets[0]).Protect(Password: protectionPassword, AllowFormattingCells: false);
...I get the exception Exception from HRESULT: 0x800A03EC telling me a COM Exception was unhandled.
Also, the interop Protection object does not give me the option that I mentioned above, although that option is available in excel when I click "Protect Sheet" under the review tab.
So, now my question: How do I protect the desired sheet in Excel with the AllowSelectLockedCells option turned off, using Excel Interop in C#?
You've probably solved this since it was asked, but for the benefit of those (such as me) who stumble upon this from search engines hoping for a solution:
Three points to get this working:
_Application.ThisWorkbook actually refers to the workbook object that contains macros, not the currently-active workbook in an Excel instance. For that you need _Application.ActiveWorkbook.
Excel worksheet indexes begin at 1, not 0.
To prevent locked cells from being selected (the AllowSelectLockedCells you were looking for) you first set the EnableSelection property to XlEnableSelection.xlUnlockedCells before locking the sheet.
So the following will do what you need:
((Excel.Worksheet)excelApp.ActiveWorkbook.Sheets[1]).EnableSelection = Excel.XlEnableSelection.xlUnlockedCells;
((Excel.Worksheet)excelApp.ActiveWorkbook.Sheets[1]).Protect(Password: protectionPassword, AllowFormattingCells: false);
