I have a DataTable of size m x n and want to copy all of its contents (including the column headers) to an Excel file that is already open. I have a reference to the Excel.Workbook, and I know which Worksheet the data should be copied to.
I know the easiest (and dirtiest) way is:
Excel.Worksheet outSheet; //set to desired worksheet
int rowIdx = 1;
int colIdx = 1;
//add header row
foreach (DataColumn dc in dt.Columns)
{
outSheet.Cells[rowIdx, colIdx++] = dc.ColumnName;
}
colIdx = 1; //reset to Cell 1
//add rest of rows
foreach (DataRow dr in dt.Rows)
{
colIdx = 0;
foreach (DataColumn dc in dt.Columns)
{
outSheet.Cells[rowIdx + 1, colIdx + 1] = dr[colIdx].ToString();
colIdx++;
}
rowIdx++;
}
This works but unfortunately incurs a huge time cost as it needs to access and paste data cell by cell. Is there a better way to accomplish this?
I wrote a small example for you. tl;dr: you can assign an array of values to an Excel range in a single call, but the array has to meet certain requirements (see the quote in the code). Credit goes to Eric Carter.
Stopwatch sw = new Stopwatch();
sw.Start();
Application xlApp = new Application();
Workbook xlBook = xlApp.Workbooks.Open(@"E:\Temp\StackOverflow\COM_Interop_CS\bin\Debug\demo.xlsx");
Worksheet wrkSheet = xlBook.Worksheets[1];
try
{
/// credits go to:
/// http://blogs.msdn.com/b/eric_carter/archive/2004/05/04/126190.aspx
///
/// [cite] when you want to set a range of values to an array, you must declare that array as a 2
/// dimensional array where the left-most dimension is the number of rows you are going to set and
/// the right-most dimension is the number of columns you are going to set.
///
/// Even if you are just setting one column, you can’t create a 1 dimensional array and have it work[/cite]
Range range = wrkSheet.Range["A1", "Z100000"];
int maxRows = 100000, maxCols = 26;
object[,] values = new object[maxRows, maxCols];
int counter = 0;
for (int row = 0; row < maxRows; row++)
{
for (int col = 0; col < maxCols; col++)
{
values[row, col] = counter++;
}
}
range.Value2 = values;
}
catch (Exception ex)
{
Debug.WriteLine(ex.Message);
}
xlApp.Visible = true;
sw.Stop();
Console.WriteLine("Elapsed: {0}", sw.Elapsed);
I added 100,000 rows and 26 columns in less than 10 seconds. I hope this works for you!
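For the DataTable from the question, the array handed to the range could be built roughly as follows. This is only a sketch: the names `DataTableExport` and `ToObjectArray` are hypothetical and not part of the original answer, and mapping DBNull to null (so the cell stays empty) is an assumption about the desired behaviour. It builds the array only; assigning it to a range of the same size works as in the example above.

```csharp
using System;
using System.Data;

public static class DataTableExport
{
    // Hypothetical helper: copies the column headers plus all rows into a
    // [rows + 1, columns] array (rows first, columns second), which is the
    // shape a range assignment expects.
    public static object[,] ToObjectArray(DataTable table)
    {
        int rows = table.Rows.Count;
        int cols = table.Columns.Count;
        var values = new object[rows + 1, cols];

        // Row 0 holds the column headers.
        for (int c = 0; c < cols; c++)
        {
            values[0, c] = table.Columns[c].ColumnName;
        }

        // Rows 1..rows hold the data.
        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                object v = table.Rows[r][c];
                // Assumption: DBNull becomes null so the cell is left empty.
                values[r + 1, c] = v == DBNull.Value ? null : v;
            }
        }
        return values;
    }
}
```

The target range then has to span `values.GetLength(0)` rows and `values.GetLength(1)` columns.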
I created an Excel file using this code:
Sheets worksheets = wb.Sheets;
Worksheet worksheet = (Worksheet)worksheets[4];
int rows = dt.Rows.Count;
int columns = dt.Columns.Count;
var data = new object[rows + 1, columns];
for (var column = 0; column < columns; column++)
{
data[0, column] = dt.Columns[column].ColumnName;
}
for (var row = 0; row < rows; row++)
{
for (var column = 0; column < columns; column++)
{
data[row + 1, column] = dt.Rows[row][column];
}
}
Range beginWrite = (Range)worksheet.Cells[1, 1];
Range endWrite = (Range)worksheet.Cells[rows + 1, columns];
Range sheetData = worksheet.Range[beginWrite, endWrite];
sheetData.Value2 = data;
worksheet.Select();
sheetData.Worksheet.ListObjects.Add(XlListObjectSourceType.xlSrcRange,
sheetData,
Type.Missing,
XlYesNoGuess.xlNo,
Type.Missing);
sheetData.Select();
Excel.ActiveWindow.DisplayGridlines = false;
Excel.Application.Range["2:2"].Select();
Excel.Application.Range["$A$3"].Select();
The problem is that this applies a default format style to the Excel file. I don't know how to clear all format styles in an Excel sheet.
If all you are trying to do is delete all styles, this would work:
using Excelx = Microsoft.Office.Interop.Excel;
Excelx.Workbook wb = excel.ActiveWorkbook;
foreach (Excelx.Style st in wb.Styles)
st.Delete();
Then again, you may only want to clear out custom styles (not the ones that come standard), in which case a small modification would do it:
foreach (Excelx.Style st in wb.Styles)
{
if (!st.BuiltIn)
st.Delete();
}
Styles are stored at the workbook level, so at some point you need to declare your workbook. From there, the Styles collection of the Workbook object has everything you need.
I'm using the following code snippet to write some data into an Excel file using EPPlus. My application does some big data processing, and since Excel has a limit of ~1 million rows, the worksheet runs out of space from time to time. What I am trying to achieve is this: once a System.ArgumentException (row out of range) is thrown, in other words once no space is left in the worksheet, the remainder of the data should be written to a second worksheet in the same workbook. I have tried the following code, but with no success so far. Any help will be appreciated!
try
{
for (int i = 0; i < data.Count(); i++)
{
var cell1 = ws.Cells[rowIndex, colIndex];
cell1.Value = data[i];
colIndex++;
}
rowIndex++;
}
catch (System.ArgumentException)
{
for (int i = 0; i < data.Count(); i++)
{
var cell2 = ws1.Cells[rowIndex, colIndex];
cell2.Value = data[i];
colIndex++;
}
rowIndex++;
}
You shouldn't use a catch block to handle that kind of logic; it is meant more as a last resort. It is better to engineer your code to deal with the situation, since it is very predictable.
The Excel 2007 format (.xlsx) has a hard limit of 1,048,576 rows per worksheet. With that, you know exactly how many rows you can write before moving to a new sheet. From there it is just simple for loops and math:
[TestMethod]
public void Big_Row_Count_Test()
{
var existingFile = new FileInfo(@"c:\temp\temp.xlsx");
if (existingFile.Exists)
existingFile.Delete();
const int maxExcelRows = 1048576;
using (var package = new ExcelPackage(existingFile))
{
//Assume a data row count
var rowCount = 2000000;
//Determine number of sheets
var sheetCount = (int)Math.Ceiling((double)rowCount/ maxExcelRows);
for (var i = 0; i < sheetCount; i++)
{
var ws = package.Workbook.Worksheets.Add(String.Format("Sheet{0}", i));
var sheetRowLimit = Math.Min((i + 1)*maxExcelRows, rowCount);
//Remember +1 for 1-based excel index
for (var j = i * maxExcelRows + 1; j <= sheetRowLimit; j++)
{
var cell1 = ws.Cells[j - (i*maxExcelRows), 1];
cell1.Value = j;
}
}
package.Save();
}
}
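The same arithmetic can be separated from EPPlus. The sketch below uses a hypothetical `SheetSplitter` class (not from the answer) that maps a zero-based data row index to the sheet it lands on and its 1-based row within that sheet, assuming no header row is reserved on any sheet.

```csharp
using System;

public static class SheetSplitter
{
    // Row limit per worksheet in the .xlsx format.
    public const int MaxExcelRows = 1048576;

    // Number of sheets needed to hold rowCount data rows.
    public static int SheetCount(long rowCount, int rowsPerSheet = MaxExcelRows)
    {
        if (rowCount < 0) throw new ArgumentOutOfRangeException(nameof(rowCount));
        if (rowsPerSheet <= 0) throw new ArgumentOutOfRangeException(nameof(rowsPerSheet));
        // Integer ceiling division.
        return (int)((rowCount + rowsPerSheet - 1) / rowsPerSheet);
    }

    // Zero-based data row index -> (zero-based sheet index, 1-based row in that sheet).
    public static (int Sheet, int Row) Locate(long dataIndex, int rowsPerSheet = MaxExcelRows)
    {
        if (dataIndex < 0) throw new ArgumentOutOfRangeException(nameof(dataIndex));
        if (rowsPerSheet <= 0) throw new ArgumentOutOfRangeException(nameof(rowsPerSheet));
        int sheet = (int)(dataIndex / rowsPerSheet);
        int row = (int)(dataIndex % rowsPerSheet) + 1; // +1 for Excel's 1-based rows
        return (sheet, row);
    }
}
```

With this, the writing loop needs no exception handling: for each data row, look up the target sheet and row and write there.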
I'm using C# to insert data into up to 7 columns of an Excel sheet. The program's interface lets users select up to 7 checkboxes. If they select all 7, all 7 columns in the spreadsheet will have data; if they select one checkbox, only one column will have data. I have a for loop that checks whether a column contains data, and if it doesn't, I want to remove that column with EPPlus. Here's a previous discussion on this topic:
How can I delete a Column of XLSX file with EPPlus in web app
It's quite old, so I just want to check whether there is a way to do this now.
Or is there a way to cast an EPPlus worksheet to a Microsoft Interop Excel worksheet and perform some operations on it?
Currently, I have code like this:
for(int j=1; j <= 9; j++) //looping through columns
{
int flag = 0;
for(int i = 3; i <= 10; i++) // looping through rows
{
if(worksheet.Cells[i, j].Text != "")
{
flag ++;
}
}
if (flag == 0)
{
worksheet.Column(j).Hidden = true; // hiding the column - I want to remove it instead
}
}
Can we do something like:
Excel.Application xlApp = new Microsoft.Office.Interop.Excel.Application();
xlApp = worksheet; (where worksheet is epplus worksheet)
Are you using EPPlus 4? The ability to do column inserts and deletion was added with the new Cell store model they implemented. So you can now do something like this:
[TestMethod]
public void DeleteColumn_Test()
{
//http://stackoverflow.com/questions/28359165/how-to-remove-a-column-from-excel-sheet-in-epplus
var existingFile = new FileInfo(@"c:\temp\temp.xlsx");
if (existingFile.Exists)
existingFile.Delete();
//Throw in some data
var datatable = new DataTable("tblData");
datatable.Columns.Add(new DataColumn("Col1"));
datatable.Columns.Add(new DataColumn("Col2"));
datatable.Columns.Add(new DataColumn("Col3"));
for (var i = 0; i < 20; i++)
{
var row = datatable.NewRow();
row["Col1"] = "Col1 Row" + i;
row["Col2"] = "Col2 Row" + i;
row["Col3"] = "Col3 Row" + i;
datatable.Rows.Add(row);
}
using (var pack = new ExcelPackage(existingFile))
{
var ws = pack.Workbook.Worksheets.Add("Content");
ws.Cells.LoadFromDataTable(datatable, true);
ws.DeleteColumn(2);
pack.SaveAs(existingFile);
}
}
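To remove every empty column, as the question asks, DeleteColumn can be called in a loop. Deleting a column shifts the columns to its right one position left, so it is safer to work from right to left. The sketch below keeps the scan independent of EPPlus: `EmptyColumns.FindDescending` is a hypothetical helper that takes a cell-text accessor, and the row and column bounds are whatever the caller passes in.

```csharp
using System;
using System.Collections.Generic;

public static class EmptyColumns
{
    // Returns the indexes of all columns in firstCol..lastCol whose cells are
    // empty in every row of firstRow..lastRow, ordered right to left so that
    // deleting them one by one does not shift the columns still to be deleted.
    public static List<int> FindDescending(
        Func<int, int, string> cellText,
        int firstRow, int lastRow, int firstCol, int lastCol)
    {
        var result = new List<int>();
        for (int col = lastCol; col >= firstCol; col--)
        {
            bool empty = true;
            for (int row = firstRow; row <= lastRow; row++)
            {
                if (!string.IsNullOrEmpty(cellText(row, col)))
                {
                    empty = false;
                    break;
                }
            }
            if (empty) result.Add(col);
        }
        return result;
    }
}

// Possible use with EPPlus, taking the bounds from the question's loop:
//   foreach (int col in EmptyColumns.FindDescending(
//                (r, c) => worksheet.Cells[r, c].Text, 3, 10, 1, 9))
//   {
//       worksheet.DeleteColumn(col);
//   }
```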
Is there a way to go through all the columns and rows in a spreadsheet using a for loop? Right now I am using foreach loops like this in my code (you can ignore what's going on inside):
foreach (ExcelRow row in w1.Rows)
{
foreach (ExcelCell cell in row.AllocatedCells)
{
Console.Write("row: {0}", globalVar.iRowActual);
if (globalVar.iRowActual > 1)
{
cellValue = SafeCellValue(cell);
Console.WriteLine("value is: {0}", cellValue);
}
}
globalVar.iRowActual++;
}
The problem is that I would like to assign the value of each cell to a new variable and pass it to another method. I would like to use for loops for this, and I know I can use CalculateMaxUsedColumns as the limit for the columns, but is there a similar property or method that I could use for the rows?
This is what I would like to do:
int columnCount = ws.CalculateMaxUsedColumns();
int rowCount = ws.CalculateMaxUsedRows(); // <-- the part I need help with
for(int i=0; i <columnCount; i++){
for(int j = 0; j<rowCount; j++){
.....
}
}
Any kind of help would be greatly appreciated. Thanks!!!
Here is a way you can iterate in GemBox.Spreadsheet through all the columns / rows in a spreadsheet using a for loop.
Go through the CellRange which is returned by ExcelWorksheet.GetUsedCellRange method.
ExcelFile workbook = ExcelFile.Load("Sample.xlsx");
ExcelWorksheet worksheet = workbook.Worksheets[0];
CellRange range = worksheet.GetUsedCellRange(true);
for (int r = range.FirstRowIndex; r <= range.LastRowIndex; r++)
{
for (int c = range.FirstColumnIndex; c <= range.LastColumnIndex; c++)
{
ExcelCell cell = range[r - range.FirstRowIndex, c - range.FirstColumnIndex];
string cellName = CellRange.RowColumnToPosition(r, c);
string cellRow = ExcelRowCollection.RowIndexToName(r);
string cellColumn = ExcelColumnCollection.ColumnIndexToName(c);
Console.WriteLine(string.Format("Cell name: {1}{0}Cell row: {2}{0}Cell column: {3}{0}Cell value: {4}{0}",
Environment.NewLine, cellName, cellRow, cellColumn, (cell.Value) ?? "Empty"));
}
}
EDIT
In newer versions there are some additional APIs which can simplify this. For instance, you can now use foreach and still retrieve the row and column indexes with ExcelCell.Row.Index and ExcelCell.Column.Index, and you can retrieve the names without using those static methods (without RowColumnToPosition, RowIndexToName and ColumnIndexToName).
ExcelFile workbook = ExcelFile.Load("Sample.xlsx");
ExcelWorksheet worksheet = workbook.Worksheets[0];
foreach (ExcelRow row in worksheet.Rows)
{
foreach (ExcelCell cell in row.AllocatedCells)
{
Console.WriteLine($"Cell value: {cell.Value ?? "Empty"}");
Console.WriteLine($"Cell name: {cell.Name}");
Console.WriteLine($"Row index: {cell.Row.Index}");
Console.WriteLine($"Row name: {cell.Row.Name}");
Console.WriteLine($"Column index: {cell.Column.Index}");
Console.WriteLine($"Column name: {cell.Column.Name}");
Console.WriteLine();
}
}
Also, here are two other ways to iterate through the sheet's cells in a for loop.
1) Use ExcelWorksheet.Rows.Count and ExcelWorksheet.CalculateMaxUsedColumns() to get the last used row and column.
ExcelFile workbook = ExcelFile.Load("Sample.xlsx");
ExcelWorksheet worksheet = workbook.Worksheets[0];
int rowCount = worksheet.Rows.Count;
int columnCount = worksheet.CalculateMaxUsedColumns();
for (int r = 0; r < rowCount; r++)
{
for (int c = 0; c < columnCount; c++)
{
ExcelCell cell = worksheet.Cells[r, c];
Console.WriteLine($"Cell value: {cell.Value ?? "Empty"}");
Console.WriteLine($"Cell name: {cell.Name}");
Console.WriteLine($"Row name: {cell.Row.Name}");
Console.WriteLine($"Column name: {cell.Column.Name}");
Console.WriteLine();
}
}
If you have a non-uniform spreadsheet in which rows have different column count (for instance, first row has 10 cells, second row has 100 cells, etc.), then you could use the following change in order to avoid iterating through non-allocated cells:
int rowCount = worksheet.Rows.Count;
for (int r = 0; r < rowCount; r++)
{
ExcelRow row = worksheet.Rows[r];
int columnCount = row.AllocatedCells.Count;
for (int c = 0; c < columnCount; c++)
{
ExcelCell cell = row.Cells[c];
// ...
}
}
2) Use the CellRange.GetReadEnumerator method, which iterates only through the cells in the range that are already allocated.
ExcelFile workbook = ExcelFile.Load("Sample.xlsx");
ExcelWorksheet worksheet = workbook.Worksheets[0];
CellRangeEnumerator enumerator = worksheet.Cells.GetReadEnumerator();
while (enumerator.MoveNext())
{
ExcelCell cell = enumerator.Current;
Console.WriteLine($"Cell value: {cell.Value ?? "Empty"}");
Console.WriteLine($"Cell name: {cell.Name}");
Console.WriteLine($"Row name: {cell.Row.Name}");
Console.WriteLine($"Column name: {cell.Column.Name}");
Console.WriteLine();
}
A rather huge dataset with 16,000 x 12 entries needs to be dumped into a worksheet.
I use the following function now:
for (int r = 0; r < dt.Rows.Count; ++r)
{
for (int c = 0; c < dt.Columns.Count; ++c)
{
worksheet.Cells[c + 1][r + 1] = dt.Rows[r][c].ToString();
}
}
I reduced the example to its core.
Here is what I implemented after reading the suggestion from Dave Zych.
This works great.
private static void AppendWorkSheet(Excel.Workbook workbook, DataSet data, String tableName)
{
Excel.Worksheet worksheet;
if (UsedSheets == 0) worksheet = workbook.Worksheets[1];
else worksheet = workbook.Worksheets.Add();
UsedSheets++;
DataTable dt = data.Tables[0];
var valuesArray = new object[dt.Rows.Count, dt.Columns.Count];
for (int r = 0; r < dt.Rows.Count; ++r)
{
for (int c = 0; c < dt.Columns.Count; ++c)
{
valuesArray[r, c] = dt.Rows[r][c].ToString();
}
}
Excel.Range c1 = (Excel.Range)worksheet.Cells[1, 1];
Excel.Range c2 = (Excel.Range)worksheet.Cells[dt.Rows.Count, dt.Columns.Count];
Excel.Range range = worksheet.get_Range(c1, c2);
range.Cells.Value2 = valuesArray;
worksheet.Name = tableName;
}
Build a 2D array of your values from your DataSet, and then you can set a range of values in Excel to the values of the array.
var valuesArray = new object[dt.Rows.Count, dt.Columns.Count];
for(int i = 0; i < dt.Rows.Count; i++)
{
//If you know the number of columns you have, you can specify them this way
//Otherwise use an inner for loop on columns
valuesArray[i, 0] = dt.Rows[i]["ColumnName"].ToString();
valuesArray[i, 1] = dt.Rows[i]["ColumnName2"].ToString();
...
}
//Calculate the second column value by the number of columns in your dataset
//"O" is just an example in this case
//Also note: Excel is 1 based index
var sheetRange = worksheet.get_Range("A2:O2",
string.Format("A{0}:O{0}", dt.Rows.Count + 1));
sheetRange.Cells.Value2 = valuesArray;
This is much, much faster than looping and setting each cell individually. If you're setting each cell individually, you have to talk to Excel through COM (for lack of a better phrase) for each cell (which in your case is ~192,000 times), which is incredibly slow. Looping, building your array and only talking to Excel once removes much of that overhead.
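The hard-coded "O" above can be derived from the column count instead. A sketch with hypothetical helper names (`ExcelAddress`, `ColumnName`, `BlockAddress`), assuming the block always starts in column A:

```csharp
using System;
using System.Text;

public static class ExcelAddress
{
    // 1-based column index -> column letters (1 -> "A", 26 -> "Z", 27 -> "AA").
    public static string ColumnName(int column)
    {
        if (column < 1) throw new ArgumentOutOfRangeException(nameof(column));
        var sb = new StringBuilder();
        while (column > 0)
        {
            int rem = (column - 1) % 26;      // 0..25 for the current letter
            sb.Insert(0, (char)('A' + rem));  // letters are produced right to left
            column = (column - 1) / 26;
        }
        return sb.ToString();
    }

    // A1-style address of a rowCount x columnCount block that starts in
    // column A at firstRow.
    public static string BlockAddress(int firstRow, int rowCount, int columnCount)
    {
        if (firstRow < 1) throw new ArgumentOutOfRangeException(nameof(firstRow));
        if (rowCount < 1) throw new ArgumentOutOfRangeException(nameof(rowCount));
        return string.Format("A{0}:{1}{2}",
            firstRow, ColumnName(columnCount), firstRow + rowCount - 1);
    }
}
```

For the 16,000 x 12 dataset written below a header row, `ExcelAddress.BlockAddress(2, dt.Rows.Count, dt.Columns.Count)` yields the address of a range whose size matches the array.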