For example, I have a sheet called EmployeeSheet, which is a single column containing the first and last name of every employee in a company. Assume this list is perfectly formatted and has no duplicates, so every cell in this sheet is unique.
I also have a sheet for each department in the company, such as FinanceSheet, ITSheet, and SalesSheet. Each sheet contains, somewhere (the sheets don't share a layout), a list of the employees in that department. However, any one employee name should appear only once across all of the department sheets (this excludes EmployeeSheet).
The solution I can think of, but can't figure out how to implement, is to build a multidimensional array (I learned a little about them in school, but only vaguely remember how to use them).
Pseudocode something like:
arrEmployees = {"Tom Hanks", "Burt Reynolds", "Your Mom"}
arrFinance = {"Tom Hanks"}
arrIT = {"Burt Reynolds"}
arrSales = {"Your Mom"}
arrSheets = {arrEmployees, arrFinance, arrIT, arrSales}
So far I've been able to get single cell values and ranges as strings by using:
Sheets shts = app.Worksheets;
Worksheet ws = (Worksheet)shts.get_Item("EmployeeSheet");
Excel.Range empRange = (Excel.Range)ws.get_Range("B2");
string empVal = empRange.Value2.ToString();
But even with that process for getting a single cell value into a string, I don't know how I would put it into an element of my array, let alone a whole range of values.
I'm sure my method is not the most efficient, and it might not even be possible, but that's why I'm here for help, so any tips are appreciated.
EDIT: This is the solution that ended up working for me, thanks to Ian Edwards' answer.
Dictionary<string, List<Point>> fields = new Dictionary<string, List<Point>>();
fields["Finance"] = new List<Point>() { new Point(2,20)};
fields["Sales"] = new List<Point>();
for (int row = 5; row <= 185; row += 20) {fields["Sales"].Add(new Point(2,row));}
List<string> names = new List<string>();
List<string> duplicates = new List<string>();
foreach (KeyValuePair<string, List<Point>> kp in fields)
{
Excel.Worksheet xlSheet = (Excel.Worksheet)workbook.Worksheets[kp.Key];
foreach (Point p in kp.Value)
{
// Skip empty cells; Cells is indexed [row, column], so Y comes first
if ((xlSheet.Cells[p.Y, p.X] as Excel.Range).Value != null)
{
string cellVal = ((xlSheet.Cells[p.Y, p.X] as Excel.Range).Value).ToString();
if (!names.Contains(cellVal))
{ names.Add(cellVal); }
else { duplicates.Add(cellVal); } } } }
Here's a little example I knocked together - the comments should explain what's going on line by line.
In the 'worksheets' dictionary you declare the names of the worksheets to check and the cell on each sheet where the list of names starts.
I assume you don't know how many names are in each list - it will keep going down each list until it encounters a blank cell.
// Load the Excel app
Microsoft.Office.Interop.Excel.Application xlApp = new Microsoft.Office.Interop.Excel.Application();
// Open the workbook
var xlWorkbook = xlApp.Workbooks.Open("XLTEST.xlsx");
// Declare the sheets and locations to look for names
Dictionary<string, Tuple<int, int>> worksheets = new Dictionary<string, Tuple<int, int>>()
{
// Declare the name of each sheet to look in and the 1-based (row, column) index of where to start looking for names on that sheet (i.e. 1,1 = A1)
{ "Sheet1", new Tuple<int, int>(1, 1) },
{ "Sheet2", new Tuple<int, int>(2, 3) },
{ "Sheet3", new Tuple<int, int>(4, 5) },
{ "Sheet4", new Tuple<int, int>(2, 3) },
};
// List to keep track of all names in all sheets
List<string> names = new List<string>();
// Iterate over every sheet we need to look at
foreach(var worksheet in worksheets)
{
string workSheetName = worksheet.Key;
// Get this excel worksheet object
var xlWorksheet = (Microsoft.Office.Interop.Excel.Worksheet)xlWorkbook.Worksheets[workSheetName];
// Get the 1-based row and column index of the first name
int row = worksheet.Value.Item1;
int column = worksheet.Value.Item2;
// Get the string contained in this cell
string name = (string)(xlWorksheet.Cells[row, column] as Microsoft.Office.Interop.Excel.Range).Value;
// name is null when the cell is empty - stop looking in this sheet and move on to the next one
while(name != null)
{
// Add the current name to the list
names.Add(name);
// Get the next name in the cell below this one
name = (string)(xlWorksheet.Cells[++row, column] as Microsoft.Office.Interop.Excel.Range).Value;
}
}
// Compare the number of names to the number of unique names
if (names.Count() != names.Distinct().Count())
{
// You have duplicate names!
}
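To report which names are duplicated, rather than only that a duplicate exists, the collected list can be grouped. This is a minimal sketch that works on any list of strings and needs no Excel objects; the class and method names (DuplicateNames, FindDuplicates) are made up for illustration, and trimming whitespace and ignoring case are assumptions about how names should be matched.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateNames
{
    // Returns each name that occurs more than once.
    // Names are trimmed and compared case-insensitively (an assumption).
    public static List<string> FindDuplicates(IEnumerable<string> names)
    {
        return names
            .Select(n => n.Trim())
            .GroupBy(n => n, StringComparer.OrdinalIgnoreCase)
            .Where(g => g.Count() > 1)   // keep only names seen at least twice
            .Select(g => g.Key)          // the key is the first spelling encountered
            .ToList();
    }
}
```

Calling DuplicateNames.FindDuplicates(names) on the list built by the loop above would return an empty list when every name is unique.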
You can use .Range to define a block of multiple cells (e.g., .Range["A1", "F500"]).
https://msdn.microsoft.com/en-us/library/microsoft.office.tools.excel.worksheet.range.aspx
You can then use .get_Value to get the values of all cells in that Range in one call. According to dotnetperls.com, get_Value() is much faster than get_Range() (see its 'Performance' section). Reading a few multi-cell ranges with get_Value will generally perform better than making lots of single-cell get_Range calls.
https://msdn.microsoft.com/en-us/library/microsoft.office.tools.excel.namedrange.get_value(v=vs.120).aspx
I store them in an object array:
(object[,])yourexcelRange.get_Value(Excel.XlRangeValueDataType.xlRangeValueDefault);
From there you can write your own comparison method to compare multiple arrays. One quirk is that this returns a 1-indexed array instead of the standard 0-indexed one.
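The 1-based indexing is easy to trip over, so here is a small sketch of reading such an array without hard-coding the bounds. It is an illustration rather than part of the answer above: the names RangeValues, Flatten and MakeOneBased are hypothetical, and MakeOneBased only builds a stand-in array so the code can run without Excel. Note also that a range consisting of a single cell returns a scalar rather than an array, so that case would need separate handling.

```csharp
using System;
using System.Collections.Generic;

public static class RangeValues
{
    // Copies every non-empty cell of a 2D value array into a list of strings.
    // GetLowerBound/GetUpperBound are used instead of 0 and Length, so the same
    // loop works for 1-based arrays and for ordinary 0-based ones.
    public static List<string> Flatten(object[,] values)
    {
        var result = new List<string>();
        for (int r = values.GetLowerBound(0); r <= values.GetUpperBound(0); r++)
        {
            for (int c = values.GetLowerBound(1); c <= values.GetUpperBound(1); c++)
            {
                object cell = values[r, c];
                if (cell != null) result.Add(cell.ToString()); // empty cells are null
            }
        }
        return result;
    }

    // Builds an empty 1-based array to stand in for the result of a range read.
    public static object[,] MakeOneBased(int rows, int cols)
    {
        return (object[,])Array.CreateInstance(
            typeof(object), new[] { rows, cols }, new[] { 1, 1 });
    }
}
```

The flattened lists from several ranges can then be compared with ordinary list or set operations.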
Related
I have an Excel sheet like this:
ProductID | SomeExplanation | AnotherColumn1 | AnotherColumn2 | AnotherColumn3
1 | X | 6 | A | 65465
2 | Y | 5 | B | 6556
3 | Z | 7 | C | 65465
I want to create a dictionary whose keys are the column headers (ProductID, SomeExplanation, AnotherColumn1, AnotherColumn2, AnotherColumn3) and whose values are lists of the cell values under each header (for example, key ProductID with values 1, 2, 3, etc.). I think there must also be a list that contains all of the dictionaries.
I am using the Aspose library for Excel and .NET Framework 4.5.
Aspose returns its cell values as objects.
So my first question is: how can I create a list of dictionaries where each dictionary holds a list of values (List<Dictionary<key, list of values>>), and how do I add values to it?
My second question is: how can I fill this list of dictionaries from an Aspose worksheet?
This is a method that accepts an Aspose Worksheet as a parameter; the worksheet can be any one of the Excel file's worksheets.
I want to iterate through all cells and assign each value to the dictionary entry for its header (row 0 of the same column).
For example: there is a list called myExcelContainer that holds the Excel columns, where each column is a dictionary entry whose key is the Excel header (for example ProductID) and whose value is the list [1, 2, 3] found under that header.
public List<Dictionary<string, List<object>>> GenerateExcelDictionary(Worksheet worksheet)
{
    // MaxDataColumn and MaxDataRow are zero-based indexes of the last column/row that contains data, so the loops use <=
    var columnMax = worksheet.Cells.MaxDataColumn;
    var rowMax = worksheet.Cells.MaxDataRow;
    var myExcelContainer = new List<Dictionary<string, List<object>>>();
    var columnKeyWithValues = new Dictionary<string, List<object>>();
    for (int column = 0; column <= columnMax; column++)
    {
        // Use the header in row 0, with spaces removed, as the dictionary key
        var columnName = worksheet.Cells[0, column].Value.ToString().Replace(" ", string.Empty);
        var values = new List<object>();
        for (int row = 1; row <= rowMax; row++)
        {
            // Store the cell's value, not the Cell object itself
            values.Add(worksheet.Cells[row, column].Value);
        }
        // Assign with the same key that was built above
        columnKeyWithValues[columnName] = values;
    }
    myExcelContainer.Add(columnKeyWithValues);
    return myExcelContainer;
}
This is the Excel container:
var myExcelContainer = new List<Dictionary<string, List<object>>>();
If you can improve the algorithm's performance, please share.
My English is not great :).
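For the column-dictionary question above, one dictionary keyed by header may already be enough, since it holds every column; the outer list only matters if several sheets are collected. The sketch below shows that shape on a plain object[,] grid so it runs without Aspose. The names ColumnTable and FromGrid are hypothetical, and the grid is assumed to be 0-based with the headers in its first row.

```csharp
using System;
using System.Collections.Generic;

public static class ColumnTable
{
    // grid[0, c] holds the header of column c; the rows below hold its values.
    public static Dictionary<string, List<object>> FromGrid(object[,] grid)
    {
        var columns = new Dictionary<string, List<object>>();
        int rowCount = grid.GetLength(0);
        int colCount = grid.GetLength(1);
        if (rowCount == 0) return columns; // no header row, nothing to build

        for (int c = 0; c < colCount; c++)
        {
            // Remove spaces from the header so it matches keys like "ProductID"
            string header = Convert.ToString(grid[0, c]).Replace(" ", string.Empty);
            var values = new List<object>(rowCount - 1);
            for (int r = 1; r < rowCount; r++)
            {
                values.Add(grid[r, c]);
            }
            columns[header] = values; // a repeated header overwrites the earlier column
        }
        return columns;
    }
}
```

With the sample sheet from the question, the entry under "ProductID" would hold 1, 2 and 3. Each column is walked exactly once, so the cost is proportional to the number of cells.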
I wrote a small method that will give me the headers of a table in excel:
private List<string> GetCurrentHeaders(int headerRow, Excel.Worksheet ws)
{
//List where specific values get saved if they exist
List<string> headers = new List<string>();
//List of all the values that need to exist as headers
var headerlist = columnName.GetAllValues();
for (int i = 0; i < headerlist.Count; i++)
{
//GetData() is a Method that outputs the Data from a cell.
//headerRow is defining one row under the row I actually need, therefore -1 )
string header = GetData(i + 1, headerRow - 1, ws);
if (headerlist.Contains(header) && !headers.Contains(header))
{
headers.Add(header);
}
}
return headers;
}
Now I have an Excel table where the first value I need is in cell A11 (row 11, column 1).
When I set a breakpoint after string header = GetData(i + 1, headerRow - 1, ws);, where i + 1 = 1 and headerRow - 1 = 11, I can see that the value it read is empty, even though the cell is not empty.
The GetData-Method just does one simple thing:
public string GetData(int row, int col, Excel.Worksheet ws)
{
string val = "";
val = Convert.ToString(ws.Cells[row, col].Value) != null
? ws.Cells[row, col].Value.ToString() : "";
val = val.Replace("\n", " ");
return val;
}
I don't understand why this can't get me the value I need, when it works on every other Excel table. The Excel file itself is no different from the others: its file extension is .xls, the data is in the same layout as in the other tables, etc.
There are a few steps to getting this right. You need to know the dimensions of your table to know where the headers are. Your method has two ways of knowing this: 1) passing the table Range to the method, or 2) giving the coordinates of a cell within the table (usually the top-left cell) and trusting the CurrentRegion property to do the job for you. The first is the most reliable, as you explicitly tell the method where to look, but it requires the caller to figure out the address, which isn't always straightforward. The CurrentRegion approach works fine too, but note that if there is an empty column within your table range, the region will only extend as far as that empty column. Having said all that, you could have the following:
List<string> GetHeaders(Worksheet worksheet, int row, int column)
{
Range currentRegion = worksheet.Cells[row, column].CurrentRegion;
Range headersRow = currentRegion.Rows[1];
var headers = headersRow.Cells
.Cast<Range>() // Enumerate the individual cells of the row so we can use LINQ
.Select(c => c.Text is DBNull ? null : c.Text as string) // The "null" value of Range.Text is DBNull, not null
.ToList();
return headers;
}
Then you can simply test if you're missing headers. The following code assumes the ActiveCell is a cell within the table Range, but you can change that easily to address a specific cell.
List<string> GetMissingHeaders(List<string> expectedHeaders)
{
Worksheet worksheet = App.ActiveSheet; // App is your Excel application
Range activeCell = App.ActiveCell; // ActiveCell belongs to the Application, not to the worksheet
var headers = GetHeaders(worksheet, activeCell.Row, activeCell.Column);
return expectedHeaders.Where(h => !headers.Contains(h)).ToList();
}
I want to copy a filtered Excel range (a particular column) to an array or list. I can copy a normal range to an array easily, but once I apply a filter I can't copy it properly. I have tried multiple ways.
I have tried Range.Cells.Value and Range.Rows.Cast<Excel.Range>(). Both give me only the first two rows, although 15 rows are visible in the sheet when I filter on a criterion:
Excel.Range srcRange = sheet.UsedRange;
srcRange.AutoFilter(field, criteria, Excel.XlAutoFilterOperator.xlFilterValues, Type.Missing, true);
Excel.Range filteredRange = sheet.UsedRange.SpecialCells(Excel.XlCellType.xlCellTypeVisible, Type.Missing);
Excel.Range rn = filteredRange.Columns[columnNumber];
//var myVal = (System.Array)rn.Rows.Cast<Excel.Range>().SelectMany(x => x.ToString());
//var myVal = (System.Array)rn.Rows.Cast<object>().SelectMany(x => x.ToString()); //this gives exception - com object cannot be casted to string type
var myVl = (System.Array)rn.Cells.Value;
arr1 = myVl.OfType<object>().Select(o => o.ToString()).ToArray();
It skips the other rows and takes only the first contiguous block! So if rows 1, 2, and 3 match the filter, the array is populated with only those first three rows, even though matching rows exist at other indexes. I know I can do something like this:
foreach (Excel.Range area in filteredRange.Areas)
{
foreach (Excel.Range row in area.Rows)
{
int index = row.Row;
string test = sheet.Cells[index, column].Value.ToString();
tmpList.Add(test);
}
}
But this is not a solution for me, as I can't write this when I want to copy values from multiple columns! So I was looking for a one-liner. I don't mind whether I store the values in an array or a list.
It would be really helpful if someone could point me in the right direction. Thanks!
You could try something like this, which walks every visible area of the filtered range:
Excel.Range filteredRange = sheet.UsedRange.SpecialCells(Excel.XlCellType.xlCellTypeVisible, Type.Missing);
foreach (Excel.Range area in filteredRange.Areas)
{
foreach (Excel.Range row in area.Rows)
{
foreach (Excel.Range column in area.Columns)
{
list.Add(sheet.Cells[row.Row, column.Column].Value.ToString());
}
}
}
I need to conditionally colorize ranges in a PivotTable. I tried to do it this way:
private void ColorizeContractItemBlocks(List<string> contractItemDescs)
{
int FIRST_DESCRIPTION_ROW = 7;
int DESCRIPTION_COL = 1;
int ROWS_BETWEEN_DESCRIPTIONS = 4;
int rowsUsed = pivotTableSheet.Cells.Rows.Count;
int currentRowBeingExamined = FIRST_DESCRIPTION_ROW;
// Loop through PivotTable data, colorizing contract items
while (currentRowBeingExamined < rowsUsed)
{
Cell descriptionCell = pivotTableSheet.Cells[currentRowBeingExamined, DESCRIPTION_COL];
String desc = descriptionCell.Value.ToString();
if (contractItemDescs.Contains(desc))
{
// args are firstRow, firstColumn, totalRows, totalColumns
Range rangeToColorize = pivotTableSheet.Cells.CreateRange(
currentRowBeingExamined, 0,
ROWS_BETWEEN_DESCRIPTIONS, _grandTotalsColumnPivotTable + 1);
Style style = workBook.Styles[workBook.Styles.Add()];
style.BackgroundColor = CONTRACT_ITEM_COLOR;
StyleFlag styleFlag = new StyleFlag();
styleFlag.All = true;
rangeToColorize.ApplyStyle(style, styleFlag);
}
currentRowBeingExamined = currentRowBeingExamined + ROWS_BETWEEN_DESCRIPTIONS;
}
}
...but it doesn't work, because rowsUsed does not take into consideration the rows on the PivotTable on the pivotTableSheet, and so my while loop is never entered.
How can I determine how many rows the PivotTable takes up on the sheet, so that I can loop through the PivotTable?
Or, am I approaching this the wrong way? Is there a different standard way of manipulating the styles/formatting of a PivotTable after it has been generated?
@B. Clay Shannon, you may consider using any of the following APIs for your requirement. I have added comments to the code for your reference.
var book = new Workbook(dir + "sample.xlsx");
var sheet = book.Worksheets[0];
var pivot = sheet.PivotTables[0];
// DataBodyRange returns CellArea that represents range between the header row & insert row
var dataBodyRange = pivot.DataBodyRange;
Console.WriteLine(dataBodyRange);
// TableRange1 returns complete Pivot Table area except page fields
var tableRange1 = pivot.TableRange1;
Console.WriteLine(tableRange1);
// TableRange2 returns complete Pivot Table area including page fields
var tableRange2 = pivot.TableRange2;
Console.WriteLine(tableRange2);
// ColumnRange returns range that represents the column area of the Pivot Table
var columnRange = pivot.ColumnRange;
Console.WriteLine(columnRange);
// RowRange returns range that represents the row area of the Pivot Table
var rowRange = pivot.RowRange;
Console.WriteLine(rowRange);
If you still face any difficulty, please share your sample spreadsheet along with the desired results (which you may create manually in Excel) in a thread at the Aspose.Cells support forum for thorough investigation.
Note: I am working as Developer Evangelist at Aspose.
The RowRange property of the pivot table should take you row by row through every element in the table:
Excel.Worksheet ws = wb.Sheets["Sheet1"];
Excel.PivotTable pt = ws.PivotTables("PivotTable1");
Excel.Range cell;
foreach (Excel.Range row in pt.RowRange)
{
cell = ws.Cells[row.Row, 5]; // for example, whatever is in column E
// do your formatting here
}
There are other ranges available -- for example, I typically only care about:
pt.DataBodyRange
Which is every cell within the actual pivot table (whatever is being aggregated).
I am writing a program that accesses an Excel template containing columns of data (with a unique ID number in the first column). Based on the first two digits of the ID number, each row is either kept or deleted. In the template, this unique ID column feeds the ListFillRange property of an ActiveX ComboBox located on the worksheet. When the non-matching rows are removed, the ListFillRange is reset, but the displayed text is not.
For example, if I select rows based on '02' being the first two digits of the unique ID in column A, I have no problem removing everything that does not start with '02', but the ComboBox text still reads "010001", since that is the first unique ID in the template, even though it doesn't exist in the new list.
I tell you all this to ask whether anyone knows a better way to access the ComboBox. I can access it as an OLEObject, but that does not allow me to change the index or text properties of the ComboBox, as they are read-only according to the following IntelliSense error in VS 2013:
Property or indexer 'Microsoft.Office.Interop.Excel._OLEObject.Index' cannot be assigned to -- it is read only.
The error appears on the line:
oleobj.Index = 1;
The code snippet is below. The current Excel application is passed as xlApp and the array comboboxes is passed. Each member of the comboboxes array contains the sheet name the combobox is on, the name of the control and the ListFillRange it has on the template. Example array member would be:
Sheet1!:cbTest:$A$1:$A$10
private void ResetComboBoxes2(string[] comboboxes, Excel.Application xlApp)
{
Excel.Worksheet wksht = new Excel.Worksheet();
Excel.Range rng;
int listEndCellNum;
string listEndCellApha;
string listEndCell;
for (int i = 0; i < comboboxes.Length; i++)
{
string[] comboBoxesSplit = comboboxes[i].Split(':');
string sheetName = comboBoxesSplit[0].ToString();
string oleObjName = comboBoxesSplit[1].ToString();
string[] rangeArray = comboBoxesSplit[2].Split(':');
string rangeStart = rangeArray[0];
listEndCellNum = wksht.Range[rangeStart].End[Excel.XlDirection.xlDown].Offset[1, 0].Row - 1;
string[] cellBreakdown = rangeStart.Split('$');
listEndCellApha = cellBreakdown[1];
listEndCell = "$" + listEndCellApha + "$" + listEndCellNum;
string listFull = rangeStart + ":" + listEndCell;
wksht = xlApp.ActiveWorkbook.Worksheets[sheetName];
foreach (Excel.OLEObject oleobj in wksht.OLEObjects())
{
if (oleobj.Name.ToString() == oleObjName)
{
oleobj.ListFillRange = listFull;
oleobj.Index = 1;
}
}
}
}
I'm not even sure there IS a way to do this properly. I could always make a chunk of VBA code to reset it before saving and access that through C# but I am hoping to do it here.
So I was able to figure out that I was overthinking it. I went back to VBA and then translated that back to C#. The result was the following code, which you will notice is considerably shorter and more succinct. I had to test the OLEObject's progID, which for all ActiveX combo boxes is "Forms.ComboBox.1", then grab that object's name, then call it by name, with an extra "Object" in there for good measure.
private void ResetComboBoxes2(string[] comboboxes, Excel.Application xlApp)
{
    for (int i = 0; i < comboboxes.Length; i++)
    {
        // Each entry looks like "Sheet1!:cbTest:$A$1:$A$10"; only the sheet name is needed here
        string sheetName = comboboxes[i].Split(':')[0];
        Excel.Worksheet wksht = xlApp.ActiveWorkbook.Worksheets[sheetName];
        foreach (Excel.OLEObject oleobj in wksht.OLEObjects())
        {
            // "Forms.ComboBox.1" is the progID of an ActiveX combo box
            if (oleobj.progID == "Forms.ComboBox.1")
            {
                // Select the first list entry through the underlying control object
                string cbName = oleobj.Name;
                wksht.OLEObjects(cbName).Object.ListIndex = 0;
            }
        }
    }
}