Check excel cells - c#

I need to check if excel cell is empty, null or whitespace (in case there is some string in it). This code should work, but it takes every cell (empty or not) and puts it in rowArray (i guarantee there shouldnt be more than 20 values in this:
object[] rowArray = new object[20];
for (int i = 4; i < 38; i++)
{
excelRange = (Excel.Range)excelWorkSheet.Cells[i, 2];
if (!string.IsNullOrEmpty(excelRange.Text.ToString()))
{
int j = i - 4;
rowArray[j]= excelRange.Text.ToString();
}
}

If you want to check with NULL and whitespace in the string then you can use String.IsNullOrWhiteSpace which indicates
whether a specified string is null, empty, or consists only of white-space characters.
if (!string.IsNullOrWhiteSpace(excelRange.Text.ToString()))
{
int j = i - 4;
rowArray[j]= excelRange.Text.ToString();
}

Related

How to skip empty cells while reading data from Excel using OpenXML?

I am trying to read data from Excel and store it into a DataTable using OpenXML. I want data in my DataTable as it is in Excel sheet but when there is a empty cell in Excel, it was not looking as expected.
Because code row.Descendants<Cell>().ElementAt(i) skips empty cells while reading data and in DataTable Rows and Columns are stored incorrectly. I resolved this issue using below code but when my excel has more than 26 columns, it is not working as expected and again data are stored in DataTable incorrectly.
(i.e., While reading data from AA, AB, AC columns)
Can someone help me to rewrite this code to handle this issue when there is more than 26 columns.
private static int CellReferenceToIndex(Cell cell)
{
int index = 0;
string reference = cell.CellReference.ToString().ToUpper();
foreach (char ch in reference)
{
if (Char.IsLetter(ch))
{
int value = (int)ch - (int)'A';
index = (index == 0) ? value : ((index + 1) * 26) + value;
}
else
{
return index;
}
}
return index;
}
You can use example below (taken from here and improved by few validations):
public static int GetColumnIndex(this Cell cell)
{
string columnName = string.Empty;
if (cell != null)
{
string cellReference = cell.CellReference?.ToString();
if (!string.IsNullOrEmpty(cellReference))
// Using `Regex` to "pull out" only letters from cell reference
// (leave only "AB" column name from "AB123" cell reference)
columnName = Regex.Match(cellReference, #"[A-Z]{1,3}").Value;
}
// Column name validations (not null, not empty and contains only UPPERCASED letters)
// *uppercasing may be done manually with columnName.ToUpper()
if (string.IsNullOrEmpty(columnName))
throw new ArgumentException("Column name was not defined.", nameof(columnName));
else if (!Regex.IsMatch(columnName, #"^[A-Z]{1,3}$"))
throw new ArgumentException("Column name is not valid.", nameof(columnName));
int index = 0;
int pow = 1;
// A - 1 iteration, AA - 2 iterations, AAA - 3 iterations.
// On each iteration pow value multiplies by 26
// Letter number (in alphabet) + 1 multiplied by pow value
for (int i = columnName.Length - 1; i >= 0; i--)
{
index += (columnName[i] - 'A' + 1) * pow;
pow *= 26;
}
// Index couldn't be greater than 16384
if (index >= 16384)
throw new IndexOutOfRangeException("Index of provided column name (" + index + ") exceeds max range (16384).");
return index;
}
All exception throws you can replace with return -1 and some kind of Log("...") if you have logging. Otherwise you may not be sure what's problem happened and why was returned -1.
Usage is obvious:
var cells = row.Descendants<Cell>();
foreach (Cell cell in cells)
{
int columnIndex = cell.GetColumnIndex();
// Do what you want with that
}
EDIT.
I'm not sure what you're trying to achieve. And what you mean here:
Because code row.Descendants<Cell>().ElementAt(i) skips empty cells...
I didn't see that. Look at example below:
Random ElementAt in range between 0 and Descendants<Cell>().Count() works too and shows both empty and non-empty cells:

Figure out max number of consecutive seats

I had an interviewer ask me to write a program in c# to figure out the max number of 4 members families that can sit consecutively in a venue, taking into account that the 4 members must be consecutively seated in one single row, with the following context:
N represents the number of rows availabe.
The Columns are labeled from the letter "A" to "K", purposely ommiting the letter "i" (in other words, {A,B,C,D,E,F,G,H,J,K})
M represents a list of reserved seats
Quick example:
N = 2
M = {"1A","2F","1C"}
Solution = 3
In the representation you can see that, with the reservations and the size given, only three families of 4 can be seated in a consecutive order.
How would you solve this? is it possible to not use for loops? (Linq solutions)
I got mixed up in the for loops when trying to deal with the reservations aray: My idea was to obtain all the reservations that a row has, but then I don't really know how to deal with the letters (Converting directly from letter to number is a no go because the missing "I") and you kinda need the letters to position the reserved sits anyway.
Any approach or insight on how to go about this problem would be nice.
Thanks in advance!
Here is another implementation.
I also tried to explain why certain things have been done.
Good luck.
private static int GetNumberOfAvailablePlacesForAFamilyOfFour(int numberOfRows, string[] reservedSeats)
{
// By just declaring the column names as a string of the characters
// we can query the column index by colulmnNames.IndexOf(char)
string columnNames = "ABCDEFGHJK";
// Here we transform the reserved seats to a matrix
// 1A 2F 1C becomes
// reservedSeatMatrix[0] = [0, 2] -> meaning row 1 and columns A and C, indexes 0 and 2
// reservedSeatMatrix[1] = [5] -> meaning row 2 and column F, index 5
List<List<int>> reservedSeatMatrix = new List<List<int>>();
for (int row = 0; row < numberOfRows; row++)
{
reservedSeatMatrix.Add(new List<int>());
}
foreach (string reservedSeat in reservedSeats)
{
int seatRow = Convert.ToInt32(reservedSeat.Substring(0, reservedSeat.Length - 1));
int seatColumn = columnNames.IndexOf(reservedSeat[reservedSeat.Length - 1]);
reservedSeatMatrix[seatRow - 1].Add(seatColumn);
}
// Then comes the evaluation.
// Which is simple enough to read.
int numberOfAvailablePlacesForAFamilyOfFour = 0;
for (int row = 0; row < numberOfRows; row++)
{
// Reset the number of consecutive seats at the beginning of a new row
int numberOfConsecutiveEmptySeats = 0;
for (int column = 0; column < columnNames.Length; column++)
{
if (reservedSeatMatrix[row].Contains(column))
{
// reset when a reserved seat is reached
numberOfConsecutiveEmptySeats = 0;
continue;
}
numberOfConsecutiveEmptySeats++;
if(numberOfConsecutiveEmptySeats == 4)
{
numberOfAvailablePlacesForAFamilyOfFour++;
numberOfConsecutiveEmptySeats = 0;
}
}
}
return numberOfAvailablePlacesForAFamilyOfFour;
}
static void Main(string[] args)
{
int familyPlans = GetNumberOfAvailablePlacesForAFamilyOfFour(2, new string[] { "1A", "2F", "1C" });
}
Good luck on your interview
As always, you will be asked how could you improve that? So you'd consider complexity stuff like O(N), O(wtf).
Underlying implementation would always need for or foreach. Just importantly, never do unnecessary in a loop. For example, if there's only 3 seats left in a row, you don't need to keep hunting on that row because it is not possible to find any.
This might help a bit:
var n = 2;
var m = new string[] { "1A", "2F", "1C" };
// We use 2 dimension bool array here. If it is memory constraint, we can use BitArray.
var seats = new bool[n, 10];
// If you just need the count, you don't need a list. This is for returning more information.
var results = new List<object>();
// Set reservations.
foreach (var r in m)
{
var row = r[0] - '1';
// If it's after 'H', then calculate index based on 'J'.
// 8 is index of J.
var col = r[1] > 'H' ? (8 + r[1] - 'J') : r[1] - 'A';
seats[row, col] = true;
}
// Now you should all reserved seats marked as true.
// This is O(N*M) where N is number of rows, M is number of columns.
for (int row = 0; row < n; row++)
{
int start = -1;
int length = 0;
for (int col = 0; col < 10; col++)
{
if (start < 0)
{
if (!seats[row, col])
{
// If there's no consecutive seats has started, and current seat is available, let's start!
start = col;
length = 1;
}
}
else
{
// If have started, check if we could have 4 seats.
if (!seats[row, col])
{
length++;
if (length == 4)
{
results.Add(new { row, start });
start = -1;
length = 0;
}
}
else
{
// // We won't be able to reach 4 seats, so reset
start = -1;
length = 0;
}
}
if (start < 0 && col > 6)
{
// We are on column H now (only have 3 seats left), and we do not have a consecutive sequence started yet,
// we won't be able to make it, so break and continue next row.
break;
}
}
}
var solution = results.Count;
LINQ, for and foreach are similar things. It is possible you could wrap the above into a custom iterator like:
class ConsecutiveEnumerator : IEnumerable
{
public IEnumerator GetEnumerator()
{
}
}
Then you could start using LINQ.
If you represent your matrix in simple for developers format, it will be easier. You can accomplish it either by dictionary or perform not so complex mapping by hand. In any case this will calculate count of free consecutive seats:
public static void Main(string[] args)
{
var count = 0;//total count
var N = 2; //rows
var M = 10; //columns
var familySize = 4;
var matrix = new []{Tuple.Create(0,0),Tuple.Create(1,5), Tuple.Create(0,2)}.OrderBy(x=> x.Item1).ThenBy(x=> x.Item2).GroupBy(x=> x.Item1, x=> x.Item2);
foreach(var row in matrix)
{
var prevColumn = -1;
var currColumn = 0;
var free = 0;
var div = 0;
//Instead of enumerating entire matrix, we just calculate intervals in between reserved seats.
//Then we divide them by family size to know how many families can be contained within
foreach(var column in row)
{
currColumn = column;
free = (currColumn - prevColumn - 1)/familySize;
count += free;
prevColumn = currColumn;
}
currColumn = M;
free = (currColumn - prevColumn - 1)/familySize;
count += free;
}
Console.WriteLine("Result: {0}", count);
}

Getting the column totals in a 2D array but it always throws FormatException using C#

I am planning to get an array of the averages of each column.
But my app crashes at sum[j] += int.Parse(csvArray[i,j]); due to a FormatException. I have tried using Convert.ToDouble and Double.Parse but it still throws that exception.
The increments in the for loop start at 1 because Row 0 and Column 0 of the CSV array are strings (names and timestamps). For the divisor or total count of the fields that have values per column, I only count the fields that are not BLANK, hence the IF statement. I think I need help at handling the exception.
Below is the my existing for the method of getting the averages.
public void getColumnAverages(string filePath)
{
int col = colCount(filePath);
int row = rowCount(filePath);
string[,] csvArray = csvToArray(filePath);
int[] count = new int[col];
int[] sum = new int[col];
double[] average = new double[col];
for (int i = 1; i < row; i++)
{
for (int j = 1; j < col; j++)
{
if (csvArray[i,j] != " ")
{
sum[j] += int.Parse(csvArray[i,j]);
count[j]++;
}
}
}
for (int i = 0; i < average.Length; i++)
{
average[i] = (sum[i] + 0.0) / count[i];
}
foreach(double d in average)
{
System.Diagnostics.Debug.Write(d);
}
}
}
I have uploaded the CSV file that I use when I test the prototype. It has BLANK values on some columns. Was my existing IF statement unable to handle that case?
There are also entries like this 1.324556-e09due to the number of decimals I think. I guess I have to trim it in the csvToArray(filePath) method or are there other efficient ways? Thanks a million!
So there are a few problems with your code. The main reason for your format exception is that after looking at your CSV file your numbers are surrounded by quotes. Now I can't see from your code exactly how you convert your CSV file to an array but I'm guessing that you don't clear these out - I didn't when I first ran with your CSV and experienced the exact same error.
I then ran into an error because some of the values in your CSV are decimal, so the datatype int can't be used. I'm assuming that you still want the averages of these columns so in my slightly revised verion of your method I change the arrays used to be of type double.
AS #musefan suggested, I have also changed the check for empty places to use the IsNullOrWhiteSpace method.
Finally when you output your results you receive a NaN for the first value in the averages column, this is because when you don't take into account that you never populate the first position of your arrays so as not to process the string values. I'm unsure how you'd best like to correct this behaviour as I'm not sure of the intended purpose - this might be okay - so I've not made any changes to this for the moment, pop a mention in the comments if you want help on how to sort this!
So here is the updated method:
public static void getColumnAverages(string filePath)
{
// Differs from the current implementation, reads a file in as text and
// splits by a defined delim into an array
var filePaths = #"C:\test.csv";
var csvArray = File.ReadLines(filePaths).Select(x => x.Split(',')).ToArray();
// Differs from the current implementation
var col = csvArray[0].Length;
var row = csvArray.Length;
// Update variables to use doubles
double[] count = new double[col];
double[] sum = new double[col];
double[] average = new double[col];
Console.WriteLine("Started");
for (int i = 1; i < row; i++)
{
for (int j = 1; j < col; j++)
{
// Remove the quotes from your array
var current = csvArray[i][j].Replace("\"", "");
// Added the Method IsNullOrWhiteSpace
if (!string.IsNullOrWhiteSpace(current))
{
// Parse as double not int to account for dec. values
sum[j] += double.Parse(current);
count[j]++;
}
}
}
for (int i = 0; i < average.Length; i++)
{
average[i] = (sum[i] + 0.0) / count[i];
}
foreach (double d in average)
{
System.Diagnostics.Debug.Write(d + "\n");
}
}

Get Formatted Cell Values efficiently

I would like to be able to efficiently retrieve a multi-dimensional array of formatted cell values from Excel. When I say formatted values, I mean I would like to get them exactly as they appear in Excel with all the cell NumberFormat applied.
The Range.Value and Range.Value2 properties work great for retrieving the cell values of a large number of cells into a multi-dimensional array. But those are the actual cell values (well at least with Range.Value2 is, I'm not quite sure what Range.Value is doing with respect to some of the values).
If I want to retrieve the actual text that is displayed in the cells, I can use the Range.Text property. This has some caveats. First, you need to AutoFit the cells or else you may get something like #### if not all the text is visible with the current cell width. Secondly, Range.Text does not work for more than one cell at a time so you would have to loop through all of the cells in the range and this can be extremely slow for large data sets.
The other method that I tried is to copy the range into the clipboard and then parse the clipboard text as a tab-separated data stream and transfer it into a multi-dimensional array. This seems to work great, although it is slower than getting Range.Value2, it is much faster for large datasets than getting Range.Text. However, I don't like the idea of using the system clipboard. If this was a really long operation that takes 60 seconds and while that operation is running, the user may decide to switch to another application and would be very unhappy to find that their clipboard either doesn't work or has mysterious data in it.
Is there a way that I can retrieve the formatted cell values to a multi-dimensional array efficiently?
I have added some sample code that is run from a couple ribbon buttons in a VSTO app. The first set some good test values and number formats and the second button will display what they look like when retrieved using one of these methods in a MessageBox.
The sample output on my system is(It could be different on yours due to Regional Settings):
Output using Range.Value
1/25/2008 3:19:32 PM 5.12345
2008-01-25 15:19:32 0.456
Output using Range.Value2
39472.6385648148 5.12345
2008-01-25 15:19:32 0.456
Output using Clipboard Copy
1/25/2008 15:19 5.12
2008-01-25 15:19:32 45.60%
Output using Range.Text and Autofit
1/25/2008 15:19 5.12
2008-01-25 15:19:32 45.60%
The Range.Text and Clipboard methods produce the correct output, but as explained above they both have problems: Range.Text is slow and Clipboard is bad practice.
private void SetSampleValues()
{
var sheet = (Microsoft.Office.Interop.Excel.Worksheet) Globals.ThisAddIn.Application.ActiveSheet;
sheet.Cells.ClearContents();
sheet.Cells.ClearFormats();
var range = sheet.Range["A1"];
range.NumberFormat = "General";
range.Value2 = "2008-01-25 15:19:32";
range = sheet.Range["A2"];
range.NumberFormat = "#";
range.Value2 = "2008-01-25 15:19:32";
range = sheet.Range["B1"];
range.NumberFormat = "0.00";
range.Value2 = "5.12345";
range = sheet.Range["B2"];
range.NumberFormat = "0.00%";
range.Value2 = ".456";
}
private string ArrayToString(ref object[,] vals)
{
int dim1Start = vals.GetLowerBound(0); //Excel Interop will return index-1 based arrays instead of index-0 based
int dim1End = vals.GetUpperBound(0);
int dim2Start = vals.GetLowerBound(1);
int dim2End = vals.GetUpperBound(1);
var sb = new StringBuilder();
for (int i = dim1Start; i <= dim1End; i++)
{
for (int j = dim2Start; j <= dim2End; j++)
{
sb.Append(vals[i, j]);
if (j != dim2End)
sb.Append("\t");
}
sb.Append("\n");
}
return sb.ToString();
}
private void GetCellValues()
{
var sheet = (Microsoft.Office.Interop.Excel.Worksheet)Globals.ThisAddIn.Application.ActiveSheet;
var usedRange = sheet.UsedRange;
var sb = new StringBuilder();
sb.Append("Output using Range.Value\n");
var vals = (object [,]) usedRange.Value; //1-based array
sb.Append(ArrayToString(ref vals));
sb.Append("\nOutput using Range.Value2\n");
vals = (object[,])usedRange.Value2; //1-based array
sb.Append(ArrayToString(ref vals));
sb.Append("\nOutput using Clipboard Copy\n");
string previousClipboardText = Clipboard.GetText();
usedRange.Copy();
string clipboardText = Clipboard.GetText();
Clipboard.SetText(previousClipboardText);
vals = new object[usedRange.Rows.Count, usedRange.Columns.Count]; //0-based array
ParseClipboard(clipboardText,ref vals);
sb.Append(ArrayToString(ref vals));
sb.Append("\nOutput using Range.Text and Autofit\n");
//if you dont autofit, Range.Text may give you something like #####
usedRange.Columns.AutoFit();
usedRange.Rows.AutoFit();
vals = new object[usedRange.Rows.Count, usedRange.Columns.Count];
int startRow = usedRange.Row;
int endRow = usedRange.Row + usedRange.Rows.Count - 1;
int startCol = usedRange.Column;
int endCol = usedRange.Column + usedRange.Columns.Count - 1;
for (int r = startRow; r <= endRow; r++)
{
for (int c = startCol; c <= endCol; c++)
{
vals[r - startRow, c - startCol] = sheet.Cells[r, c].Text;
}
}
sb.Append(ArrayToString(ref vals));
MessageBox.Show(sb.ToString());
}
//requires reference to Microsoft.VisualBasic to get TextFieldParser
private void ParseClipboard(string text, ref object[,] vals)
{
using (var tabReader = new TextFieldParser(new StringReader(text)))
{
tabReader.SetDelimiters("\t");
tabReader.HasFieldsEnclosedInQuotes = true;
int row = 0;
while (!tabReader.EndOfData)
{
var fields = tabReader.ReadFields();
for (int i = 0; i < fields.Length; i++)
vals[row, i] = fields[i];
row++;
}
}
}
private void button1_Click(object sender, RibbonControlEventArgs e)
{
SetSampleValues();
}
private void button2_Click(object sender, RibbonControlEventArgs e)
{
GetCellValues();
}
I've found a partial solution. Apply the NumberFormat value to the parsed double of Value2. This only works for single cells as returning an array for NumberFormat with different formats in the array returns System.DBNull.
double.Parse(o.Value2.ToString()).ToString(o.NumberFormat.ToString())
The dates don't work with this though. If you know which columns contains certain things, like a formatted date, you can use DateTime.FromOADate on the double and then value.ToString(format) with the NumberFormat. The code below gets close but is not complete.
<snip>
sb.Append("\nOutput using Range.Value2\n");
vals = (object[,])usedRange.Value2; //1-based array
var format = GetFormat(usedRange);
sb.Append(ArrayToString(ref vals, format));
</snip>
private static object[,] GetFormat(Microsoft.Office.Interop.Excel.Range range)
{
var rows = range.Rows.Count;
var cols = range.Columns.Count;
object[,] vals = new object[rows, cols];
for (int r = 1; r <= rows; ++r)
{
for (int c = 1; c <= cols; ++c)
{
vals[r-1, c-1] = range[r, c].NumberFormat;
}
}
return vals;
}
private static string ArrayToString(ref object[,] vals, object[,] numberformat = null)
{
int dim1Start = vals.GetLowerBound(0); //Excel Interop will return index-1 based arrays instead of index-0 based
int dim1End = vals.GetUpperBound(0);
int dim2Start = vals.GetLowerBound(1);
int dim2End = vals.GetUpperBound(1);
var sb = new StringBuilder();
for (int i = dim1Start; i <= dim1End; i++)
{
for (int j = dim2Start; j <= dim2End; j++)
{
if (numberformat != null)
{
var format = numberformat[i-1, j-1].ToString();
double v;
if (double.TryParse(vals[i, j].ToString(), out v))
{
if (format.Contains(#"/") || format.Contains(":"))
{// parse a date
var date = DateTime.FromOADate(v);
sb.Append(date.ToString(format));
}
else
{
sb.Append(v.ToString(format));
}
}
else
{
sb.Append(vals[i, j].ToString());
}
}
else
{
sb.Append(vals[i, j]);
}
if (j != dim2End)
sb.Append("\t");
}
sb.Append("\n");
}
return sb.ToString();
}
One solution to your problem is to use:
Range(XYZ).Value(11) = Range(ABC).Value(11)
As described here, this will:
Returns the recordset representation of the specified Range object in an XML format.
Assuming that your excel is configured in OpenXML format, this will copy the value/formula AND the formatting of range ABC and inject it into range XYZ.
Additionally, this answer explains the difference between Value and Value2.
.Value2 gives you the underlying value of the cell (could be empty, string, error, number (double) or boolean)
.Value gives you the same as .Value2 except if the cell was formatted as currency or date it gives you a VBA currency (which may truncate decimal places) or VBA date.

Rebase a 1-based array in c#

I have an array in c# that is 1-based (generated from a call to get_Value for an Excel Range
I get a 2D array for example
object[,] ExcelData = (object[,]) MySheet.UsedRange.get_Value(Excel.XlRangeValueDataType.xlRangeValueDefault);
this appears as an array for example ExcelData[1..20,1..5]
is there any way to tell the compiler to rebase this so that I do not need to add 1 to loop counters the whole time?
List<string> RowHeadings = new List<string>();
string [,] Results = new string[MaxRows, 1]
for (int Row = 0; Row < MaxRows; Row++) {
if (ExcelData[Row+1, 1] != null)
RowHeadings.Add(ExcelData[Row+1, 1]);
...
...
Results[Row, 0] = ExcelData[Row+1, 1];
& other stuff in here that requires a 0-based Row
}
It makes things less readable since when creating an array for writing the array will be zero based.
Why not just change your index?
List<string> RowHeadings = new List<string>();
for (int Row = 1; Row <= MaxRows; Row++) {
if (ExcelData[Row, 1] != null)
RowHeadings.Add(ExcelData[Row, 1]);
}
Edit: Here is an extension method that would create a new, zero-based array from your original one (basically it just creates a new array that is one element smaller and copies to that new array all elements but the first element that you are currently skipping anyhow):
public static T[] ToZeroBasedArray<T>(this T[] array)
{
int len = array.Length - 1;
T[] newArray = new T[len];
Array.Copy(array, 1, newArray, 0, len);
return newArray;
}
That being said you need to consider if the penalty (however slight) of creating a new array is worth improving the readability of the code. I am not making a judgment (it very well may be worth it) I am just making sure you don't run with this code if it will hurt the performance of your application.
Create a wrapper for the ExcelData array with a this[,] indexer and do rebasing logic there. Something like:
class ExcelDataWrapper
{
private object[,] _excelData;
public ExcelDataWrapper(object[,] excelData)
{
_excelData = excelData;
}
public object this[int x, int y]
{
return _excelData[x+1, y+1];
}
}
Since you need Row to remain as-is (based on your comments), you could just introduce another loop variable:
List<string> RowHeadings = new List<string>();
string [,] Results = new string[MaxRows, 1]
for (int Row = 0, SrcRow = 1; SrcRow <= MaxRows; Row++, SrcRow++) {
if (ExcelData[SrcRow, 1] != null)
RowHeadings.Add(ExcelData[SrcRow, 1]);
...
...
Results[Row, 0] = ExcelData[SrcRow, 1];
}
Why not use:
for (int Row = 1; Row <= MaxRows; Row++) {
Or is there something I'm missing?
EDIT: as it turns out that something is missing, I would use another counter (starting at 0) for that purpose, and use a 1 based Row index for the array. It's not good practice to use the index for another use than the index in the target array.
Is changing the loop counter too hard for you?
for (int Row = 1; Row <= MaxRows; Row++)
If the counter's range is right, you don't have to add 1 to anything inside the loop so you don't lose readability. Keep it simple.
I agree that working with base-1 arrays from .NET can be a hassle. It is also potentially error-prone, as you have to mentally make a shift each time you use it, as well as correctly remember which situations will be base 1 and which will be base 0.
The most performant approach is to simply make these mental shifts and index appropriately, using base-1 or base-0 as required.
I personally prefer to convert the two dimensional base-1 arrays to two dimensional base-0 arrays. This, unfortunately, requires the performance hit of copying over the array to a new array, as there is no way to re-base an array in place.
Here's an extension method that can do this for the 2D arrays returned by Excel:
public static TResult[,] CloneBase0<TSource, TResult>(
this TSource[,] sourceArray)
{
If (sourceArray == null)
{
throw new ArgumentNullException(
"The 'sourceArray' is null, which is invalid.");
}
int numRows = sourceArray.GetLength(0);
int numColumns = sourceArray.GetLength(1);
TResult[,] resultArray = new TResult[numRows, numColumns];
int lb1 = sourceArray.GetLowerBound(0);
int lb2 = sourceArray.GetLowerBound(1);
for (int r = 0; r < numRows; r++)
{
for (int c = 0; c < numColumns; c++)
{
resultArray[r, c] = sourceArray[lb1 + r, lb2 + c];
}
}
return resultArray;
}
And then you can use it like this:
object[,] array2DBase1 = (object[,]) MySheet.UsedRange.get_Value(Type.Missing);
object[,] array2DBase0 = array2DBase1.CloneBase0();
for (int row = 0; row < array2DBase0.GetLength(0); row++)
{
for (int column = 0; column < array2DBase0.GetLength(1); column++)
{
// Your code goes here...
}
}
For massively sized arrays, you might not want to do this, but I find that, in general, it really cleans up your code (and mind-set) to make this conversion, and then always work in base-0.
Hope this helps...
Mike
For 1 based arrays and Excel range operations as well as UDF (SharePoint) functions I use this utility function
public static object[,] ToObjectArray(this Object Range)
{
Type type = Range.GetType();
if (type.IsArray && type.Name == "Object[,]")
{
var sourceArray = Range as Object[,];
int lb1 = sourceArray.GetLowerBound(0);
int lb2 = sourceArray.GetLowerBound(1);
if (lb1 == 0 && lb2 == 0)
{
return sourceArray;
}
else
{
int numRows = sourceArray.GetLength(0);
int numColumns = sourceArray.GetLength(1);
var resultArray = new Object[numRows, numColumns];
for (int r = 0; r < numRows; r++)
{
for (int c = 0; c < numColumns; c++)
{
resultArray[r, c] = sourceArray[lb1 + r, lb2 + c];
}
}
return resultArray;
}
}
else if (type.IsCOMObject)
{
// Get the Value2 property from the object.
Object value = type.InvokeMember("Value2",
System.Reflection.BindingFlags.Instance |
System.Reflection.BindingFlags.Public |
System.Reflection.BindingFlags.GetProperty,
null,
Range,
null);
if (value == null)
value = string.Empty;
if (value is string)
return new object[,] { { value } };
else if (value is double)
return new object[,] { { value } };
else
{
object[,] range = (object[,])value;
int rows = range.GetLength(0);
int columns = range.GetLength(1);
object[,] param = new object[rows, columns];
Array.Copy(range, param, rows * columns);
return param;
}
}
else
throw new ArgumentException("Not A Excel Range Com Object");
}
Usage
public object[,] RemoveZeros(object range)
{
return this.RemoveZeros(range.ToObjectArray());
}
[ComVisible(false)]
[UdfMethod(IsVolatile = false)]
public object[,] RemoveZeros(Object[,] range)
{...}
The first function is com visible and will accept an excel range or a chained call from another function (the chained call will return a 1 based object array), the second call is UDF enabled for Excel Services in SharePoint. All of the logic is in the second function. In this example we are just reformatting a range to replace zero with string.empty.
You could use a 3rd party Excel compatible component such as SpreadsheetGear for .NET which has .NET friendly APIs - including 0 based indexing for APIs such as IRange[int rowIndex, int colIndex].
Such components will also be much faster than the Excel API in almost all cases.
Disclaimer: I own SpreadsheetGear LLC

Categories