Excel: Losing decimal separator when converting from strings to number - c#

I am trying to read values from several files and save them in a new .xlsx file with a different grouping. I devised a very simple setup to test different formatting and the behavior with null values, and I always open the just-created file in Excel to see the outcome. So far no problem.
However, in my test case I can achieve only one of two results: A) the test values are saved as they are (strings), or B) Excel is forced to treat them as numbers with the given format (good), but the decimal separator is lost (very bad and strange).
I have traced the problem to the last line in the code snippet below. The idea of the self-assignment comes from another post here on SO, but right now I am unable to find it.
If the line is commented out, the results match the string[,] contents, only they are formatted as text (and Excel complains with a "number formatted as text" message). If I uncomment it, the values are treated as numbers but lose their decimal separators. The problem might also be related to the fact that I am in the Czech Republic, where the decimal separator is a comma (,), which might trouble Excel. Moreover, reading the values into a double[,] contents from the start is out, since I need to indicate whether a value is absent (with an empty cell). And double?[,] contents crashes Excel...
Have you met this behavior before? I would like to 1) be able to indicate a missing value and 2) have the contents of cells formatted as numbers, not text. How can I achieve this?
excelApp = new Excel.Application();
excelWorkBooks = excelApp.Workbooks;
excelWorkBook = excelWorkBooks.Add();
excelSheets = excelWorkBook.Sheets;
excelWorkSheet = excelSheets[1]; //Beware! Excel is one-based as opposed to a zero-based C#
string[,] contents = new string[,] { { "1,23", "2,123123123", "3,1415926535" }, { "2,15", null, "" } };
int contentsHeight = contents.GetLength(0);
int contentsWidth = contents.GetLength(1);
System.Globalization.CultureInfo currentCulture = System.Threading.Thread.CurrentThread.CurrentCulture;
string numberFormat = string.Format("0" + currentCulture.NumberFormat.NumberDecimalSeparator + "00E+00");
for (int column = 0; column < contentsWidth; column++) {
excelWorkSheet.Columns[column + 1].NumberFormat = numberFormat;
}
Excel.Range range = excelWorkSheet.Range[excelWorkSheet.Cells[1, 1], excelWorkSheet.Cells[contentsHeight, contentsWidth]];
range.Value = contents;
// range.Value = range.Value; //Problematic place
EDIT: I tried changing NumberFormat from 0,00E+00 to something like 0,0, 0.0, #,# for the sake of testing, but with no success. It either crashes (decimal dot) or the values remain text.

There's no need to convert numbers to text before writing them to a cell. Excel understands numbers. A further problem is that the code is trying to set the array as the value of an entire range, as if pasting into Excel.
It's possible to set numbers, even nulls, directly using a simple loop, eg
double?[,] contents = new double?[,] { { 1.23, 2.123123123, 3.1415926535 },
                                       { 2.15, null, null } };
int contentsHeight = contents.GetLength(0);
int contentsWidth = contents.GetLength(1);
...
// Array indices are zero-based, Excel cells are one-based
for (int i = 0; i < contentsHeight; i++)
    for (int j = 0; j < contentsWidth; j++)
        excelWorkSheet.Cells[i + 1, j + 1].Value = contents[i, j];
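If the values arrive as strings with a comma as the decimal separator, as in the question, they can be converted to numbers before they are written. The following is only a minimal sketch and not part of the original answer: the names DecimalText, ToCellValue and ToCellValues are made up for illustration, and the comma separator is hard-coded as an assumption instead of being taken from the current culture. Missing, empty or unparseable entries become null, which leaves the cell empty.

```csharp
using System;
using System.Globalization;

public static class DecimalText
{
    // Assumption: the input uses ',' as decimal separator and no thousands separator.
    private static readonly NumberFormatInfo CommaDecimal = new NumberFormatInfo
    {
        NumberDecimalSeparator = ",",
        NumberGroupSeparator = " "
    };

    // Returns a boxed double for parseable text, and null for missing or unparseable text.
    public static object ToCellValue(string text)
    {
        if (string.IsNullOrWhiteSpace(text))
            return null;

        double number;
        return double.TryParse(text, NumberStyles.Float, CommaDecimal, out number)
            ? (object)number
            : null;
    }

    // Converts a whole grid of strings; the result has the same shape as the input.
    public static object[,] ToCellValues(string[,] contents)
    {
        int height = contents.GetLength(0);
        int width = contents.GetLength(1);
        var result = new object[height, width];

        for (int i = 0; i < height; i++)
            for (int j = 0; j < width; j++)
                result[i, j] = ToCellValue(contents[i, j]);

        return result;
    }
}
```

Each element of the resulting array is either a real number or null, so it can be written cell by cell in the same way as the double?[,] values in the loop above.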
Instead of using Excel through Interop though, it's better to use a package like EPPlus to generate xlsx files directly without having Excel installed. This allows generating real Excel files even on web servers, where installing Excel is impossible.
The code for this particular problem would be similar:
var file = new FileInfo("test.xlsx");
using (var pck = new ExcelPackage(file))
{
    var ws = pck.Workbook.Worksheets.Add("Rules");
    // Array indices are zero-based, worksheet cells are one-based
    for (int i = 0; i < contentsHeight; i++)
        for (int j = 0; j < contentsWidth; j++)
            ws.Cells[i + 1, j + 1].Value = contents[i, j];
    pck.Save();
}
EPPlus has some convenience methods that make loading a sheet easy, eg LoadFromDataTable or LoadFromCollection. If the data came from a DataTable, creating the sheet would be as simple as:
var file = new FileInfo("test.xlsx");
using (var pck = new ExcelPackage(file))
{
var ws = pck.Workbook.Worksheets.Add("Rules");
ws.LoadFromDataTable(myTable);
pck.Save();
}
LoadFromDataTable returns the loaded range (an ExcelRangeBase), which allows cell formatting much like Excel Interop.


Validate column with Excel formula increments the formula is expected behavior or bug?

I'm trying to create a spreadsheet where the first sheet ("Catalog") contains some pre-filled and some empty values in a column. I want the values to be in a drop down list that are restricted to values found in the second sheet ("Products").
I would expect that if I set the Excel validation formula for cells "A1:A1048576" in the "Catalog" sheet to be a list validation of "Products!A1:A100", every cell would only allow values from "Products!A1:A100". However, I'm finding that my formula gets incremented for every row in the "Catalog" sheet (i.e. in row 2 the formula becomes "Products!A2:A101", in row 3 the formula becomes "Products!A3:A102").
If version matters I'm using EPPlus.Core v1.5.4 from NuGet.
I'm not sure if this is a bug or if I'm going about applying my formula wrong?
I've already tried directly applying the validation to every cell in the column one cell at a time. I found that not only does it moderately increase the size of the resulting Excel file but more importantly it also exponentially increases the time taken to generate the Excel file. Even applying the validation one cell at a time on the first 2000 rows more than doubles the generation time.
ExcelPackage package = new ExcelPackage();
int catalogProductCount = 10;
int productCount = 100;
var catalogWorksheet = package.Workbook.Worksheets.Add($"Catalog");
for (int i = 1; i <= catalogProductCount; i++)
{
catalogWorksheet.Cells[i, 1].Value = $"Product {i}";
}
var productsWorksheet = package.Workbook.Worksheets.Add($"Products");
for (int i = 1; i <= productCount; i++)
{
productsWorksheet.Cells[i, 1].Value = $"Product {i}";
}
var productValidation = catalogWorksheet.DataValidations.AddListValidation($"A1:A1048576");
productValidation.ErrorStyle = ExcelDataValidationWarningStyle.stop;
productValidation.ErrorTitle = "An invalid product was entered";
productValidation.Error = "Select a product from the list";
productValidation.ShowErrorMessage = true;
productValidation.Formula.ExcelFormula = $"Products!A1:A{productCount}";
I guess I'm not that adept at Excel formulas.
Changing this line:
productValidation.Formula.ExcelFormula = $"Products!A1:A{productCount}";
to this:
productValidation.Formula.ExcelFormula = $"Products!$A$1:$A${productCount}";
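stopped the auto-increment issue. Without the $ signs the reference is relative, so Excel shifts it for every row the validation covers; with them it is absolute and stays fixed. Hopefully this answer will save someone else some sanity, as I wasted half a day on this issue myself.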
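The difference between the two formulas is only the $ markers that pin the column and the row. As a small sketch, a helper can build such an absolute reference so the markers are not forgotten. The class and method names here are hypothetical, and the helper assumes a sheet name without spaces or special characters (those would need quoting).

```csharp
using System;

public static class ValidationFormula
{
    // Builds a reference such as "Products!$A$1:$A$100".
    // The '$' in front of the column and the row makes both parts absolute.
    public static string AbsoluteColumnRange(string sheetName, string column, int firstRow, int lastRow)
    {
        if (firstRow < 1 || lastRow < firstRow)
            throw new ArgumentOutOfRangeException(nameof(lastRow), "Rows must satisfy 1 <= firstRow <= lastRow.");

        return $"{sheetName}!${column}${firstRow}:${column}${lastRow}";
    }
}
```

With the variables from the question, the formula could then be assigned as productValidation.Formula.ExcelFormula = ValidationFormula.AbsoluteColumnRange("Products", "A", 1, productCount);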

ExcelWorksheet.UsedRange is counting wrong if file has empty rows on top

I have a C# Windows application that reads file contents. I want to extract values from the used rows only.
I am using this code:
int rows = ExcelWorksheet.UsedRange.Rows.Count;
Everything works fine, except that when the file has empty rows at the top, the count is incorrect.
- The file has no special characters, formulas or such, just plain text.
- The application can read Excel xls and xlsx files with no issue if the file has no empty rows at the top.
Okay, now I've realized I was doing it all wrong. Of course it did not read all of my UsedRange rows, because my for loop always started reading at the first row. So I now use ((Microsoft.Office.Interop.Excel.Range)(ExcelWorksheet.UsedRange)).Row as the starting point for reading.
This code works:
int rows = ExcelWorksheet.UsedRange.Rows.Count;
int fRowIndex = ((Microsoft.Office.Interop.Excel.Range)(ExcelWorksheet.UsedRange)).Row;
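int rowCycle = 1;
// rowcounter is the absolute sheet row, rowCycle counts how many used rows have been read
for (int rowcounter = fRowIndex; rowCycle <= rows; rowcounter++, rowCycle++)
{
    //code for reading
}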
Instead of reading Excel row by row, it is better to fetch the data into C# as a range and then handle it with
Sheet.UsedRange.get_Value()
for the whole UsedRange of the sheet. Whenever you'd like to get only a part of UsedRange, do it like this:
Excel.Range cell1 = Sheet.Cells[r0, c0];
Excel.Range cell2 = Sheet.Cells[r1, c1];
Excel.Range rng = Sheet.Range[cell1, cell2];
var v = rng.get_Value();
The size of v in C# memory follows from the corner cells: it has r1-r0+1 rows and c1-c0+1 columns.
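One detail worth keeping in mind is that the array returned for a multi-cell range is indexed from 1, not from 0. The sketch below does not touch Excel at all: it builds a stand-in array with the same one-based bounds and walks it using the array's own bounds. The class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class RangeValues
{
    // Number of rows and columns covered by the corners (r0, c0) and (r1, c1), both inclusive.
    public static (int Rows, int Columns) Size(int r0, int c0, int r1, int c1)
    {
        return (r1 - r0 + 1, c1 - c0 + 1);
    }

    // Creates a 2-D object array whose indices start at 1 in both dimensions,
    // as a stand-in for the array a multi-cell range hands back.
    public static object[,] CreateOneBased(int rows, int columns)
    {
        return (object[,])Array.CreateInstance(typeof(object), new[] { rows, columns }, new[] { 1, 1 });
    }

    // Walks the array row by row using its actual bounds instead of assuming 0 or 1.
    public static List<object> Flatten(object[,] values)
    {
        var result = new List<object>();
        for (int r = values.GetLowerBound(0); r <= values.GetUpperBound(0); r++)
            for (int c = values.GetLowerBound(1); c <= values.GetUpperBound(1); c++)
                result.Add(values[r, c]);
        return result;
    }
}
```

Looping with GetLowerBound and GetUpperBound keeps the reading code correct for both one-based arrays and ordinary zero-based ones.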

Excel Interop Open/Repair HResult exception

What I do: populate & format an Excel file using a mix of Interop and ClosedXML.
First, the file is populated via Interop, then saved, closed, then I format the cells' RichText using ClosedXML.
Unfortunately, this formatting causes Excel to view my file as "corrupt" and needs to repair it.
This is the relevant part:
var workbook = new XLWorkbook(xlsPath);
var sheet = workbook.Worksheet("Error Log");
for (var rownum = 2; rownum <= 10000; rownum++)
{
var oldcell = sheet.Cell("C" + rownum);
var newcell = sheet.Cell("D" + rownum);
var oldtext = oldcell.GetFormattedString();
if(string.IsNullOrEmpty(oldtext.Trim()))
break;
XlHelper.ColorCellText(oldcell, "del", System.Drawing.Color.Red);
XlHelper.ColorCellText(newcell, "add", System.Drawing.Color.Green);
}
workbook.Save();
And the colouring method:
public static void ColorCellText(IXLCell cel, string tagName, System.Drawing.Color col)
{
var rex = new Regex("\\<g\\sid\\=[\\sa-z0-9\\.\\:\\=\\\"]+?\\>");
var txt = cel.GetFormattedString();
var mc = rex.Matches(txt);
var xlcol = XLColor.FromColor(col);
foreach (Match m in mc)
{
txt = txt.Replace(m.Value, "");
txt = txt.Replace("</g>", "");
}
var startTag = string.Format("[{0}]", tagName);
var endTag = string.Format("[/{0}]", tagName);
var crt = cel.RichText;
crt.ClearText();
while (txt.Contains(startTag) || txt.Contains(endTag))
{
var pos1 = txt.IndexOf(startTag);
if (pos1 == -1)
pos1 = 0;
var pos2 = txt.IndexOf(endTag);
if (pos2 == -1)
pos2 = txt.Length - 1;
var txtLen = pos2 - pos1 - 5;
crt.AddText(txt.Substring(0, pos1));
crt.AddText(txt.Substring(pos1 + 5, txtLen)).SetFontColor(xlcol);
txt = txt.Substring(pos2 + 6);
}
if (!string.IsNullOrEmpty(txt))
crt.AddText(txt);
}
Error in file myfile.xlsx
The following repairs were performed:
Repaired records:
string properties of /xl/sharedStrings.xml-Part (strings)
I've been through all the XML files looking for clues. In the affected sheet, in the comparison view of the Productivity Tool, some blocks appear as inserted in the repaired file and deleted in the corrupt one, although nothing significant seems to have changed, except for one thing: the style attribute of that cell. Here is an example:
<x:c r="AA2" s="59">
<x:f>
(IFERROR(VLOOKUP(G2,Legende!$A$42:$B$45,2,FALSE),0))
</x:f>
</x:c>
I have checked the styles.xml for style 59, but there is none. In the repaired file, this style has been changed to 14, which in my styles.xml is listed as a number format.
Unfortunately, a global search/replace of these invalid style indexes did not resolve the issue.
Seeing the things going on here with corrupt indexes, renamed xmls, invalid named ranges etc., I took a different route: not to use interop at all, maybe the corruption was caused by Excel in the first place and the coloring was only the last straw.
Using ClosedXml only:
Wow. Just wow. This makes it even worse. I commented out the colouring part since without that, Interop produced a readable file without errors, so that's what I expect of ClosedXml too.
This is how I open the file and address the worksheet with ClosedXml:
var wb= new XLWorkbook(xlsPath);
var errors = wb.Worksheet("Error Log");
This is how I write the values into the file:
errors.Cell(zeile, 1).SetValue(fname);
With zeile being a simple int counter.
I then dare to set a column width:
errors.Column(2).Width = 50;
errors.Column(3).Width = 50;
errors.Column(4).Width = 50;
As well as setting some values in another sheet in exactly the same fashion before saving with validation.
wb.Save(true);
wb.Dispose();
Lo and behold: The validation throws errors:
Attribute 'name' should have unique value. Its current value 'Legende' duplicates with others.
Attribute 'sheetId' should have unique value. Its current value '4' duplicates with others.
A couple more errors like attribute 'top' having invalid value '11.425781'.
Excel cannot open the file directly, must repair it. My Sheet "Legende" is now empty and the first sheet instead of third, and I get an additional fourth sheet "Restored_Table1" which contains my original "Legende" contents.
What the hell is going on with this file??
New attempt: re-create the Excel template from scratch - in LibreOffice.
I now think that the issue is entirely misleading. If I use the newly created file from LibreOffice, the validation causes a System.OutOfMemory exception due to too many validation errors. Opening in Excel requires repair, gives additional sheet and so forth.
Creating in LibreOffice, then opening in Excel, saving, then using that file as template produces a much better result albeit not perfect yet.
Since I copied parts over from the old Excel file into LO while creating the new file, I assume some corrupt remnant got copied over.
I cannot shake the feeling that this is the file itself after all and has nothing to do with how I edit it!
Will post an update tomorrow.
OK. Stuff this.
I created a completely fresh file with LibreOffice, making sure not to copy over anything at all from the original file, and I ditched Interop in favour of ClosedXml.
=> This produced a corrupt file in which my first sheet was cleared and its contents moved to a "Restored_Table1".
After I opened my fresh new template with Excel via Open/Repair and saved it, the resulting, uncoloured file was NOT corrupt.
=> Colouring it produces the "original" corruption, all sheets intact.
ClosedXml seems to be marginally slower than Interop but at this point I couldn't care less. I guess we will have to live with the "corrupt" message and just get on with it.
I hate xlsx.
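For reference, the tag handling in the colouring method earlier in this post relies on fixed offsets (5 and 6) and can pass empty strings to AddText. Whether empty runs are related to the repair message is not established here; that is only a guess. The following sketch separates the parsing from ClosedXML, derives the offsets from the tag length and drops empty segments. The names TaggedText and Split are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class TaggedText
{
    // Splits text such as "a[del]b[/del]c" into plain and tagged segments.
    // Empty segments are skipped; an unclosed start tag marks the rest of the text as tagged.
    // An end tag without a preceding start tag is left in place as plain text.
    public static List<(string Text, bool Tagged)> Split(string text, string tagName)
    {
        string startTag = "[" + tagName + "]";
        string endTag = "[/" + tagName + "]";
        var segments = new List<(string Text, bool Tagged)>();
        int pos = 0;

        while (pos < text.Length)
        {
            int open = text.IndexOf(startTag, pos, StringComparison.Ordinal);
            if (open < 0)
            {
                AddIfNotEmpty(segments, text.Substring(pos), false);
                break;
            }

            AddIfNotEmpty(segments, text.Substring(pos, open - pos), false);

            int contentStart = open + startTag.Length;
            int close = text.IndexOf(endTag, contentStart, StringComparison.Ordinal);
            if (close < 0)
            {
                AddIfNotEmpty(segments, text.Substring(contentStart), true);
                break;
            }

            AddIfNotEmpty(segments, text.Substring(contentStart, close - contentStart), true);
            pos = close + endTag.Length;
        }

        return segments;
    }

    private static void AddIfNotEmpty(List<(string Text, bool Tagged)> segments, string text, bool tagged)
    {
        if (text.Length > 0)
            segments.Add((text, tagged));
    }
}
```

Each returned segment could then be passed to AddText, with SetFontColor applied only to the tagged ones.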

EPPlus - Named Range is there but not working

Similar to
EPPlus - Named Range is not populated
In his case, his ranges were at the workbook level but he was looking at the worksheet level.
My EP code shows a count of 0 ranges at the workbook level and 15 at the sheet level, as it should be.
Opening the worksheet.Names shows all 15, with proper names.
Retrieve a range, and the formula is correct with
"OFFSET(Sheet1!$A$33, 0, Sheet1!_CurrentMonth, 1, 55 -Sheet1!_CurrentMonth)", but almost everything else returns an exception on evaluation.
It reports 1 column, which is incorrect.
And the 'FullAddress' looks correct with "'Sheet1'!_Fund1Projected", but 'FullAddressAbsolute' gives "$#REF!$-1"
Lastly, I'm using a template, xltm, to create a spreadsheet, xlsm.
public static void CreateChart()
{
var excelFullPath = "C:\\Users\\username\\Documents\\Excel\\Templates\\";
var excelFileName = "LowCashBalanceChart.xlsm";
FileInfo newFile = new FileInfo(excelFullPath + excelFileName);
if (newFile.Exists)
newFile.Delete();
FileInfo template = new FileInfo(excelFullPath + "Sample Chart.xltm");
using (ExcelPackage xlPackage = new ExcelPackage(newFile, template))
{
ExcelWorksheet worksheet = xlPackage.Workbook.Worksheets["Sheet1"]; //xlPackage.Workbook.Worksheets.FirstOrDefault();
ExcelNamedRange namedRange;
namedRange = xlPackage.Workbook.Names["_Fund1Projected"]; // fails, no ranges at the WB level
namedRange = worksheet.Names["_Fund1Projected"]; // this one works
for (int rowIndex = namedRange.Start.Row; rowIndex <= namedRange.End.Row; rowIndex++) // Exception on range.Start
// 'namedRange.Start' threw an exception of type 'System.ArgumentOutOfRangeException'
{
for (int columnIndex = namedRange.Start.Column; columnIndex <= namedRange.End.Column; columnIndex++)
{
worksheet.Cells[rowIndex, columnIndex].Value = (rowIndex * 100 + columnIndex).ToString();
}
}
xlPackage.Save();
}
}
I looked at the code on GitHub, but nothing stands out.
Tried it with the ranges at the workbook level as well with the same results.
I solved my problem; I'll put the answer here for anyone who may need it in the future.
I created a range for a 3x3 square.
Range1 = =Sheet1!$A$24:$C$26
I can write to that just fine. No exceptions.
But when we have ranges that have endpoints determined by values of other cells, it fails.
=OFFSET(Sheet1!$A$32, 0, Sheet1!_CurrentMonth, 1, 55 -Sheet1!_CurrentMonth)
The problem is that our named ranges are dynamic.
That’s why it was getting an exception.
The work-around is to not use dynamic ranges from EPPlus.
Just a little more C# code to handle the dynamic part instead of Excel handling it for you.
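As a rough sketch of what that extra C# code might look like, the function below reproduces the arithmetic of Excel's OFFSET(reference, rows, cols, height, width) and returns fixed corner coordinates. The names DynamicRange and ResolveOffset are made up for illustration, and the caller is assumed to have already read the value of _CurrentMonth from its cell.

```csharp
using System;

public static class DynamicRange
{
    // Mirrors OFFSET(reference, rows, cols, height, width) for a single-cell reference:
    // the top-left corner is shifted by the offsets, then height and width span the block.
    public static (int FromRow, int FromCol, int ToRow, int ToCol) ResolveOffset(
        int anchorRow, int anchorCol, int rowOffset, int colOffset, int height, int width)
    {
        if (height < 1 || width < 1)
            throw new ArgumentOutOfRangeException(nameof(width), "Height and width must be at least 1.");

        int fromRow = anchorRow + rowOffset;
        int fromCol = anchorCol + colOffset;
        return (fromRow, fromCol, fromRow + height - 1, fromCol + width - 1);
    }
}
```

For the formula above with anchor $A$32, a call such as DynamicRange.ResolveOffset(32, 1, 0, currentMonth, 1, 55 - currentMonth) gives the four corners, which can then be used to address the cells directly, for example through the four-argument worksheet.Cells[fromRow, fromCol, toRow, toCol] indexer.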

How do you check whether a cell is readonly in EXCEL using C#

I am importing data into Excel sheets from a database, using a DataReader. The Excel sheet template contains some macros and a few calculated formulae, so it's not a normal Excel worksheet. I therefore have to write data into the sheet only if the particular cell allows writing. If not, the data shouldn't be imported.
For this, I have an XML file which says from which column I should start writing and in which row I should stop. I have done this for many sheets. But in one sheet, the first cell of the row is "readonly" (locked) and the rest allow write access.
Since I get the entire row from the DB using the DataReader, I am stuck: I need to write to the other cells without writing to the locked cell.
I am attaching the code snippet for reference.
Please help me in doing this.
Sample ::
if (reader.HasRows)
{
minRow = 0;
minCol = 0;
Excel.Workbook SelWorkBook = excelAppln.Workbooks.Open(curfile, 0, false, 5, "", "", false, Excel.XlPlatform.xlWindows, "", true, false, 0, false, false, false);
Excel.Sheets excelSheets = SelWorkBook.Worksheets;
Excel.Worksheet excelworksheet = (Excel.Worksheet)excelSheets.get_Item(CurSheetName);
// Process each result in the result set
while (reader.Read())
{
// Create an array big enough to hold the column values
object[] values = new object[reader.FieldCount];
// Add the array to the ArrayList
rowList.Add(values);
// Get the column values into the array
reader.GetValues(values);
int iValueIndex = 0;
// If the Reading Format is by ColumnByColumn
if (CurTaskNode.ReadFormat == "ColumnbyColumn")
{
minCol = 0;
// minRow = 0;
for (int iCol = 0; iCol < CurTaskNode.HeaderData.Length; iCol++)
{
// Checking whether the Header data exists or not
if (CurTaskNode.HeaderData[minCol] != "")
{
// Assigning the Value from reader to the particular cell in excel sheet
excelworksheet.Cells[CurTaskNode.DATA_MIN_ROW + minRow, CurTaskNode.DATA_MIN_COL + minCol] = values[iValueIndex];
iValueIndex++;
}
minCol++;
}
minRow++;
}
}
SelWorkBook.Close(true, curfile, null);
}
Please help me in resolving this.
Thank You,
Ramm
OK, first check the Locked property of the cell you are about to write to. If it is locked, skip that value, which has the same effect as slicing the locked first column off the row; otherwise write it. Note that Locked only blocks edits while the sheet is protected. Here's some code, not necessarily exact, to use in place of the assignment inside your loop:
Excel.Range target = (Excel.Range)excelworksheet.Cells[CurTaskNode.DATA_MIN_ROW + minRow, CurTaskNode.DATA_MIN_COL + minCol];
// Locked comes back as object from the interop API; for a single cell it is a bool
if (!(bool)target.Locked)
{
    target.Value2 = values[iValueIndex];
}
// Advance even when the cell was skipped, so the remaining values stay aligned with their columns
iValueIndex++;
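The decision about which values to write can also be kept separate from the Excel calls, which makes it easier to check on its own. This is just a sketch under assumptions: RowWriter and PlanWrites are hypothetical names, and the isLocked callback stands in for whatever lookup reads the Locked property of the cell in a given column.

```csharp
using System;
using System.Collections.Generic;

public static class RowWriter
{
    // Pairs each value with its target column and drops the ones whose cell is locked.
    // Values keep their original column, so skipping one does not shift the others.
    public static List<KeyValuePair<int, object>> PlanWrites(
        object[] values, int firstColumn, Func<int, bool> isLocked)
    {
        var writes = new List<KeyValuePair<int, object>>();
        for (int i = 0; i < values.Length; i++)
        {
            int column = firstColumn + i;
            if (!isLocked(column))
                writes.Add(new KeyValuePair<int, object>(column, values[i]));
        }
        return writes;
    }
}
```

The returned pairs can then be written to the sheet one by one, using the key as the column index and the value as the cell content.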
