Right, this is starting to drive me mad. I have an asp:GridView with check boxes, so the user can choose which information to export to Excel. When they click the button, the code below is executed. As you can see, I loop over each row in the GridView;
if the check box for a row is checked, I go to the DB, run a query that returns a DataTable, and then try to add its values to the EPPlus Excel spreadsheet. But inside the foreach (DataColumn) and foreach (DataRow) loops it doesn't allow me to use
ws.Cells[1, iColumnCount] = c.ColumnName; as it says it's read-only?
This one Excel spreadsheet could hold 1 to 10 different sets of information, depending on how many checkboxes are checked. Can someone please help and put me out of my misery? :(
Here's my full code:
protected void BtnTest_Click(object sender, EventArgs e)
{
bool ReportGenerated = false;
FileInfo newFile = new FileInfo("C:\\Users\\Scott.Atkinson\\Desktop\\book1.xls");
ExcelPackage pck = new ExcelPackage(newFile);
foreach (GridViewRow row in gvPerformanceResult.Rows)
{
object misValue = System.Reflection.Missing.Value;
CheckBox chkExcel = (CheckBox)row.FindControl("chkExportToExcel");
if (chkExcel.Checked)
{
HyperLink HypCreatedBy = (HyperLink)row.FindControl("HyperCreatedBy"); //Find the name of Sales agent
string CreatedBy = HypCreatedBy.Text;
string Fname = HypCreatedBy.Text;
string[] names = Fname.Split();
CreatedBy = names[0];
CreatedBy = CreatedBy + "." + names[1];
WebUser objUser = new WebUser(CreatedBy, true);
DataTable DT = new DataTable();
LeadOpportunities objLeadOpportunities = new LeadOpportunities();
DT = objLeadOpportunities.LoadPRCDetail("PRC", objUser.ShortAbbr, objUser.CanViewAllLead, ReportCriteria); // Load the information to export to Excel.
if (DT.Rows.Count > 0)
{
ReportGenerated = true;
//Add the Content sheet
var ws = pck.Workbook.Worksheets.Add("Content");
ws.View.ShowGridLines = true;
int iRowCount = ws.Dimension.Start.Row; //Counts how many rows have been used in the Excel Spreadsheet
int iColumnCount = ws.Dimension.Start.Column; //Counts how many Columns have been used.
if (iRowCount > 1)
iRowCount = iRowCount + 2;
else
iRowCount = 1;
iColumnCount = 0;
foreach (DataColumn c in DT.Columns)
{
iColumnCount++;
if (iRowCount == 0)
ws.Cells[1, iColumnCount] = c.ColumnName;
else
ws.Cells[iRowCount, iColumnCount] = c.ColumnName;
}
foreach (DataRow r in DT.Rows)
{
iRowCount++;
iColumnCount = 0;
foreach (DataColumn c in DT.Columns)
{
iColumnCount++;
if (iRowCount == 1)
ws.Cells[iRowCount + 1, iColumnCount] = r[c.ColumnName].ToString();
else
ws.Cells[iRowCount, iColumnCount] = r[c.ColumnName].ToString();
WorkSheet.Columns.AutoFit(); //Correct the width of the columns
}
}
pck.Save();
System.Diagnostics.Process.Start("C:\\Users\\Scott.Atkinson\\Desktop\\book1.xls");
}
}
}
}
Any help would be highly appreciated.
it doesn't allow me to use
ws.Cells[1, iColumnCount] = c.ColumnName;
That line should have been:
ws.Cells[1, iColumnCount].Value = c.ColumnName;
But it now falls over on these lines:
int iRowCount = ws.Dimension.Start.Row; //Counts how many rows have been used in the Excel Spreadsheet
int iColumnCount = ws.Dimension.Start.Column; //Counts how many Columns have been used.
Can someone help me get the row/column count?
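Note that .Dimension is null while the worksheet has no cell values yet (as with a sheet that was just added), so check it for null before reading it.
Once cells exist, the .Dimension property gives the address of the range that runs from the top-left used cell to the bottom-right used cell, so to get the row count we can use: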
var rowCount = ws.Dimension.End.Row - ws.Dimension.Start.Row + 1;
and similarly for the column count:
var colCount = ws.Dimension.End.Column - ws.Dimension.Start.Column + 1;
Related
I have a requirement to fill a DataTable from a Microsoft Excel sheet.
The sheet may hold a lot of data, so the DataTable that is supposed to hold the Excel data should be filled on demand while a foreach loop iterates over it.
That is, if there are 1,000,000 records in the sheet, the DataTable should fetch data in batches of 100, depending on the position of the current item in the foreach loop.
Any pointers or suggestions will be appreciated.
I would suggest using OpenXML to parse and read your Excel data from the file.
This will also allow you to read specific sections/regions of your workbook.
You will find more information and an example at this link:
Microsoft Docs - Parse and read a large spreadsheet document (Open XML SDK)
This will be more efficient and easier to develop than using the official Microsoft Office Excel interop.
**I am not near a PC with Visual Studio, so this code is untested and may have syntax errors until I can test it later.
It will still give you the main idea of what needs to be done.
private void ExcelDataPages(int firstRecord, int numberOfRecords)
{
Excel.Application dataApp = new Excel.Application();
Excel.Workbook dataWorkbook = null;
int x = 0;
// DisplayAlerts, Visible and AutomationSecurity are properties of the Application, not the Workbook
dataApp.DisplayAlerts = false;
dataApp.Visible = false;
dataApp.AutomationSecurity = Microsoft.Office.Core.MsoAutomationSecurity.msoAutomationSecurityLow;
// Workbooks are opened through the Application's Workbooks collection
dataWorkbook = dataApp.Workbooks.Open(@"C:\Test\YourWorkbook.xlsx");
try
{
Excel.Worksheet dataSheet = (Excel.Worksheet)dataWorkbook.Sheets["Name of Sheet"];
while (x < numberOfRecords)
{
Excel.Range currentRange = (Excel.Range)dataSheet.Rows[firstRecord + x]; //For all columns in row
foreach (Excel.Range r in currentRange.Cells) //currentRange represents all the columns in the row
{
// do what you need to with the Data here.
}
x++;
}
}
catch (Exception ex)
{
//Enter in Error handling
}
dataWorkbook.Close(false); //Depending on how quick you will access the next batch of data, you may not want to close the Workbook, reducing load time each time. This may also mean you need to move the open of the workbook to a higher level in your class, or if this is the main process of the app, make it static, stopping the garbage collector from destroying the connection.
dataApp.Quit();
}
Give the following a try. It uses the NuGet package DocumentFormat.OpenXml. The code is from Using OpenXmlReader; however, I modified it to add data to a DataTable. Since you're reading data from the same Excel file multiple times, it's faster to open the Excel file once using an instance of SpreadsheetDocument and dispose of it when finished. Since the instance of SpreadsheetDocument needs to be disposed of before your application exits, IDisposable is used.
Where it says "ToDo", you'll need to replace the code that creates the DataTable columns with your own code to create the correct columns for your project.
I tested the code below with an Excel file containing approximately 15,000 rows. When reading 100 rows at a time, the first read took approximately 500 ms - 800 ms, whereas subsequent reads took approximately 100 ms - 400 ms.
Create a class (name: HelperOpenXml)
HelperOpenXml.cs
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;
using System.Data;
using System.Diagnostics;
namespace ExcelReadSpecifiedRowsUsingOpenXml
{
public class HelperOpenXml : IDisposable
{
public string Filename { get; private set; } = string.Empty;
public int RowCount { get; private set; } = 0;
private SpreadsheetDocument spreadsheetDocument = null;
private DataTable dt = null;
public HelperOpenXml(string filename)
{
this.Filename = filename;
}
public void Dispose()
{
if (spreadsheetDocument != null)
{
try
{
spreadsheetDocument.Dispose();
dt.Clear();
}
catch (Exception)
{
throw; //rethrow, preserving the original stack trace
}
}
}
public DataTable GetRowsSax(int startRow, int endRow, bool firstRowIsHeader = false)
{
int startIndex = startRow;
int endIndex = endRow;
if (firstRowIsHeader)
{
//if first row is header, increment by 1
startIndex = startRow + 1;
endIndex = endRow + 1;
}
if (spreadsheetDocument == null)
{
//create new instance
spreadsheetDocument = SpreadsheetDocument.Open(Filename, false);
//create new instance
dt = new DataTable();
//ToDo: replace 'dt.Columns.Add(...)' below with your code to create the DataTable columns
//add columns to DataTable
dt.Columns.Add("A");
dt.Columns.Add("B");
dt.Columns.Add("C");
dt.Columns.Add("D");
dt.Columns.Add("E");
dt.Columns.Add("F");
dt.Columns.Add("G");
dt.Columns.Add("H");
dt.Columns.Add("I");
dt.Columns.Add("J");
dt.Columns.Add("K");
}
else
{
//remove existing data from DataTable
dt.Rows.Clear();
}
WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;
int numWorkSheetParts = 0;
foreach (WorksheetPart worksheetPart in workbookPart.WorksheetParts)
{
using (OpenXmlReader reader = OpenXmlReader.Create(worksheetPart))
{
int rowIndex = 0;
//use the reader to read the XML
while (reader.Read())
{
if (reader.ElementType == typeof(Row))
{
reader.ReadFirstChild();
List<string> cValues = new List<string>();
int colIndex = 0;
do
{
//only get data from desired rows
if ((rowIndex > 0 && rowIndex >= startIndex && rowIndex <= endIndex) ||
(rowIndex == 0 && !firstRowIsHeader && rowIndex >= startIndex && rowIndex <= endIndex))
{
if (reader.ElementType == typeof(Cell))
{
Cell c = (Cell)reader.LoadCurrentElement();
string cellRef = c.CellReference; //ex: A1, B1, ..., A2, B2
string cellValue = string.Empty;
//string/text data is stored in SharedString
if (c.DataType != null && c.DataType == CellValues.SharedString)
{
SharedStringItem ssi = workbookPart.SharedStringTablePart.SharedStringTable.Elements<SharedStringItem>().ElementAt(int.Parse(c.CellValue.InnerText));
cellValue = ssi.Text.Text;
}
else
{
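//CellValue is null for a cell that has no value element (e.g. a styled but empty cell)
cellValue = c.CellValue != null ? c.CellValue.InnerText : string.Empty;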
}
//Debug.WriteLine("{0}: {1} ", c.CellReference, cellValue);
//add value to List which is used to add a row to the DataTable
cValues.Add(cellValue);
}
}
colIndex += 1; //increment
} while (reader.ReadNextSibling());
if (cValues.Count > 0)
{
//if List contains data, use it to add row to DataTable
dt.Rows.Add(cValues.ToArray());
}
rowIndex += 1; //increment
if (rowIndex > endIndex)
{
break; //exit loop
}
}
}
}
numWorkSheetParts += 1; //increment
}
DisplayDataTableData(dt); //display data in DataTable
return dt;
}
private void DisplayDataTableData(DataTable dt)
{
foreach (DataColumn dc in dt.Columns)
{
Debug.WriteLine("colName: " + dc.ColumnName);
}
foreach (DataRow r in dt.Rows)
{
Debug.WriteLine(r[0].ToString() + " " + r[1].ToString());
}
}
}
}
Usage:
private string excelFilename = @"C:\Temp\Test.xlsx";
private HelperOpenXml helperOpenXml = null;
...
private void GetData(int startIndex, int endIndex, bool firstRowIsHeader)
{
helperOpenXml.GetRowsSax(startIndex, endIndex, firstRowIsHeader);
}
Note: Make sure to call Dispose() (ex: helperOpenXml.Dispose();) before your application exits.
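To read in batches of 100, the caller only needs a sequence of inclusive row ranges to hand to GetRowsSax. The following is a minimal sketch and not part of the code above: BatchRanges and Split are hypothetical names, and it assumes the total number of data rows is already known.

```csharp
using System;
using System.Collections.Generic;

public static class BatchRanges
{
    // Lazily yields inclusive, zero-based (Start, End) row ranges of at most
    // batchSize rows each, together covering rows 0 .. totalRows - 1.
    public static IEnumerable<(int Start, int End)> Split(int totalRows, int batchSize)
    {
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        for (int start = 0; start < totalRows; start += batchSize)
        {
            // The last batch may be shorter than batchSize.
            int end = Math.Min(start + batchSize, totalRows) - 1;
            yield return (start, end);
        }
    }
}
```

Each range could then be passed on as helperOpenXml.GetRowsSax(range.Start, range.End, firstRowIsHeader) inside a foreach loop. Because the iterator is lazy, the next range is only computed when the loop asks for it.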
Update:
OpenXML stores a date as a number: the count of days since 01 Jan 1900. Dates prior to 01 Jan 1900 are stored as text in the SharedString table. For more info, see Reading a date from xlsx using open xml sdk
Here's a code snippet:
Cell c = (Cell)reader.LoadCurrentElement();
...
string cellValue = string.Empty;
...
cellValue = c.CellValue.InnerText;
double dateCellValue = 0;
Double.TryParse(cellValue, out dateCellValue);
DateTime dt = DateTime.FromOADate(dateCellValue);
cellValue = dt.ToString("yyyy/MM/dd");
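The conversion in the snippet above can be isolated into a small helper. This is a sketch rather than the answer's own code: ExcelDates and TrySerialToDateText are hypothetical names, and parsing with the invariant culture assumes the raw cell text uses '.' as the decimal separator.

```csharp
using System;
using System.Globalization;

public static class ExcelDates
{
    // Converts the raw text of a numeric date cell into "yyyy/MM/dd".
    // Returns false when the text is not a number at all.
    // DateTime.FromOADate still throws for numbers outside its supported range.
    public static bool TrySerialToDateText(string rawCellText, out string dateText)
    {
        dateText = string.Empty;
        if (!double.TryParse(rawCellText, NumberStyles.Float,
                             CultureInfo.InvariantCulture, out double serial))
        {
            return false;
        }

        dateText = DateTime.FromOADate(serial)
                           .ToString("yyyy/MM/dd", CultureInfo.InvariantCulture);
        return true;
    }
}
```

For example, the raw text "44197" converts to 2021/01/01, while a non-numeric string such as a shared-string date leaves the caller free to keep the original text.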
Another simple alternative: take a look at the NuGet package ExcelDataReader. Additional information is at
https://github.com/ExcelDataReader/ExcelDataReader
Usage example:
[Fact]
void Test_ExcelDataReader()
{
System.Text.Encoding.RegisterProvider(System.Text.CodePagesEncodingProvider.Instance);
var scriptPath = Path.GetDirectoryName(Util.CurrentQueryPath); // LinqPad script path
var filePath = $@"{scriptPath}\TestExcel.xlsx";
using (var stream = File.Open(filePath, FileMode.Open, FileAccess.Read))
{
// Auto-detect format, supports:
// - Binary Excel files (2.0-2003 format; *.xls)
// - OpenXml Excel files (2007 format; *.xlsx, *.xlsb)
using (var reader = ExcelDataReader.ExcelReaderFactory.CreateReader(stream))
{
var result = reader.AsDataSet();
// The result of each spreadsheet is in result.Tables
var t0 = result.Tables[0];
Assert.True(t0.Rows[0][0].Dump("R0C0").ToString()=="Hello", "Expected 'Hello'");
Assert.True(t0.Rows[0][1].Dump("R0C1").ToString()=="World!", "Expected 'World!'");
} // using
} // using
} // fact
Before you start reading, you need to set an encoding provider as follows:
System.Text.Encoding.RegisterProvider(
System.Text.CodePagesEncodingProvider.Instance);
The cells are addressed the following way:
var t0 = result.Tables[0]; // table 0 is the first worksheet
var cell = t0.Rows[0][0]; // on table t0, read cell row 0 column 0
And you can easily loop through the rows and columns in a for loop as follows:
for (int r = 0; r < t0.Rows.Count; r++)
{
var row = t0.Rows[r];
var columns = row.ItemArray;
for (int c = 0; c < columns.Length; c++)
{
var cell = columns[c];
cell.Dump();
}
}
I use this code with the EPPlus DLL; don't forget to add a reference to it. Check that it matches your requirements before using it.
public DataTable ReadExcelDatatable(bool hasHeader = true)
{
using (var pck = new OfficeOpenXml.ExcelPackage())
{
using (var stream = File.OpenRead(this._fullPath))
{
pck.Load(stream);
}
var ws = pck.Workbook.Worksheets.First();
DataTable tbl = new DataTable();
foreach (var firstRowCell in ws.Cells[1, 1, 1, ws.Dimension.End.Column])
{
//table head: add one DataTable column per cell in the first worksheet row
tbl.Columns.Add(hasHeader ? firstRowCell.Text : string.Format("Column {0}", firstRowCell.Start.Column));
}
var startRow = hasHeader ? 2 : 1;
for (int rowNum = startRow; rowNum <= ws.Dimension.End.Row; rowNum++)
{
var wsRow = ws.Cells[rowNum, 1, rowNum, ws.Dimension.End.Column];
DataRow row = tbl.Rows.Add();
foreach (var cell in wsRow)
{
row[cell.Start.Column - 1] = cell.Text;
}
}
return tbl;
}
}
I'm going to give you a different answer. If performance is bad when loading a million rows into a DataTable, resort to using a driver to load the data: How to open a huge excel file efficiently
DataSet excelDataSet = new DataSet();
string filePath = @"c:\temp\BigBook.xlsx";
// For .XLSXs we use =Microsoft.ACE.OLEDB.12.0;, for .XLS we'd use Microsoft.Jet.OLEDB.4.0; with "';Extended Properties=\"Excel 8.0;HDR=YES;\"";
string connectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source='" + filePath + "';Extended Properties=\"Excel 12.0;HDR=YES;\"";
using (OleDbConnection conn = new OleDbConnection(connectionString))
{
conn.Open();
OleDbDataAdapter objDA = new System.Data.OleDb.OleDbDataAdapter
("select * from [Sheet1$]", conn);
objDA.Fill(excelDataSet);
//dataGridView1.DataSource = excelDataSet.Tables[0];
}
Next filter the DataSet's DataTable using a DataView. Using a DataView's RowFilter property you can specify subsets of rows based on their column values.
DataView prodView = new DataView(excelDataSet.Tables[0],
"UnitsInStock <= ReorderLevel",
"SupplierID, ProductName",
DataViewRowState.CurrentRows);
Ref: https://www.c-sharpcorner.com/article/dataview-in-C-Sharp/
Or you could use the DataTables' DefaultView RowFilter directly:
excelDataSet.Tables[0].DefaultView.RowFilter = "Amount >= 5000 and Amount <= 5999 and Name = 'StackOverflow'";
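To see the RowFilter expression work without an Excel file, here is a small self-contained sketch on an in-memory DataTable. The class name, the sample table and its Name/Amount values are made up for illustration; only DataView and RowFilter come from the answer above.

```csharp
using System;
using System.Data;

public static class RowFilterDemo
{
    // Builds a tiny sample table standing in for excelDataSet.Tables[0].
    public static DataTable BuildSample()
    {
        var table = new DataTable("Orders");
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("Amount", typeof(int));
        table.Rows.Add("StackOverflow", 5200);
        table.Rows.Add("StackOverflow", 7000);
        table.Rows.Add("Other", 5500);
        return table;
    }

    // Counts the rows that pass the filter; the table itself is not modified.
    public static int CountMatches(DataTable table, string filter)
    {
        var view = new DataView(table) { RowFilter = filter };
        return view.Count;
    }
}
```

With the sample data, the filter "Amount >= 5000 and Amount <= 5999 and Name = 'StackOverflow'" matches only the first row. The filter runs after the whole sheet has been loaded, so it narrows what you work with but does not reduce the cost of the initial Fill.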
I'm exporting a DataSet to Excel (.xlsx) and for whatever reason, I'm getting an additional (unwanted) worksheet added to my workbook.
My dataTables are setup like so:
DataTable table1 = new DataTable();
table1 = CallReport("ReportName");
DataTable table2 = new DataTable();
table2 = CallReport("ReportName");
DataTable table3 = new DataTable();
table3 = CallReport("ReportName");
//add to DataSet
DataSet exportSet = new DataSet();
table3.TableName = "Combined";
exportSet.Tables.Add(table3);
table2.TableName = "Non-Services";
exportSet.Tables.Add(table2);
table1.TableName = "Services";
exportSet.Tables.Add(table1);
I'm then creating an Excel instance, then using foreach to loop through my DataSet and write the tables to the Excel workbook:
MSClass.CreateExcelFile();
foreach (DataTable table in exportSet.Tables)
{
MSClass.WriteToExcel(table, table.TableName);
}
MSClass.SaveExcelFile(exportPath);
Here is how I'm creating the Excel file, writing to it and saving it within my MSClass:
class MSClass
{
static Microsoft.Office.Interop.Excel.Application xl;
public static void CreateExcelFile()
{
//Creates new instance of Excel.Application()
xl = new Excel.Application();
//Adds new workbook to the application instance
xl.Workbooks.Add();
}
public static void SaveExcelFile(string filePath)
{
//Check to make sure the filePath isn't empty, and if it is, just show the file instance
if (filePath != null && filePath != "")
{
//in case there's an error
try
{
//Makes sure the sheet is in the active session
Excel._Worksheet sheet = xl.ActiveSheet;
//Stops from asking to overwrite the file.
//If I didn't want to overwrite I would be changing the file name everytime.
xl.DisplayAlerts = false;
//Saves the sheet to the file path
sheet.SaveAs(filePath);
//shows the final product
xl.Visible = true;
}
catch (Exception ex)
{
//throw up an Exception for any error and show the error message
throw new Exception("Export To Excel: Excel file could not be saved! Check filePath.\n" + ex.Message);
}
}
else //no file path is given
{
//Just show the current session of the Excel Application
xl.Visible = true;
}
}
public static void WriteToExcel(DataTable table, string sheetName)
{
//Adds new worksheet to workbook
Excel._Worksheet sheet = xl.ActiveWorkbook.Sheets[1];
//Gets count of columns within supplied DataTable
int colCount = table.Columns.Count;
//Creates object array for headers for each column
object[] header = new object[colCount];
//loop to add column names to header object array
for (int i = 0; i < colCount; i++)
{
//Adds column names to header object array
header[i] = table.Columns[i].ColumnName;
}
//Get range of headers
Excel.Range headerRange = sheet.get_Range((Excel.Range)(sheet.Cells[1, 1]), (Excel.Range)(sheet.Cells[1, colCount]));
//Applies the header to Excel file
headerRange.Value = header;
//Adds color to header to make more distinct
headerRange.Interior.Color = ColorTranslator.ToOle(Color.LightGray);
//Bolds the header
headerRange.Font.Bold = true;
//gets count of rows from DataTable
int rowCount = table.Rows.Count;
//Object multidimensional array for the cells within the dataTable
object[,] cells = new object[rowCount, colCount];
//Adds the cells from DataTable to cell Object array
for (int j = 0; j < rowCount; j++)
{
for (int i = 0; i < colCount; i++)
{
//Sets cell values to DataTable values
cells[j, i] = table.Rows[j][i];
}
}
//prints the cell values to Excel cell(s)
sheet.get_Range((Excel.Range)(sheet.Cells[2, 1]), (Excel.Range)(sheet.Cells[rowCount + 1, colCount])).Value = cells;
sheet.Name = sheetName;
sheet.Activate();
xl.Worksheets.Add(sheet);
}
}
I'm unsure what is currently causing the extra sheet to be exported. The sheets export in the order: Sheet4 | Services | Non-Services | Combined
Where Sheet4 is completely blank, and the other sheets are exported in reversed order.
What is causing my extra (blank) sheet to be added to the workbook?
What I'm doing here is basically loading some data from a database using a DataSet and writing it to Excel.
I identified the following code, where I'm filling the cells, as the part slowing down the process:
Microsoft.Office.Interop.Excel.Range range1 = null;
Microsoft.Office.Interop.Excel.Range cell1 = null;
Microsoft.Office.Interop.Excel.Borders border1 = null;
for (i = 0; i <= ds.Tables[0].Rows.Count - 1; i++)
{
int s = i + 1;
for (j = 0; j <= ds.Tables[0].Columns.Count - 1; j++)
{
data = ds.Tables[0].Rows[i].ItemArray[j].ToString();
xlWorkSheet.Cells[s + 1, j + 1] = data;
range1 = xlWorkSheet.UsedRange;
cell1 = range1.Cells[s + 1, j + 1];
border1 = cell1.Borders;
if (((IList)terms).Contains(xlWorkSheet.Cells[1, j + 1].Value.ToString()))
{
cell1.Interior.Color = System.Drawing.Color.Red;
}
range1.Columns.AutoFit();
range1.HorizontalAlignment = Microsoft.Office.Interop.Excel.XlHAlign.xlHAlignCenter;
border1.LineStyle = Microsoft.Office.Interop.Excel.XlLineStyle.xlContinuous;
border1.Weight = 2d;
}
}
It sometimes takes more than a minute to load the whole thing. Is there a way to optimize it?
Cell-by-cell is the slowest possible way to interact with Excel using Interop - look up how to add data to a sheet from an array in one operation.
E.g.
Write Array to Excel Range
shows this approach.
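The array side of that approach needs no Excel at all: copy the DataTable into a 2D array first, then hand the array to Excel in a single assignment. A minimal sketch, assuming a hypothetical helper named TableArrays.ToArray; mapping DBNull to null is my own choice so that empty database values become empty cells.

```csharp
using System;
using System.Data;

public static class TableArrays
{
    // Copies a DataTable into a 2D array: an optional header row first,
    // then one array row per DataRow.
    public static object[,] ToArray(DataTable table, bool includeHeader)
    {
        int offset = includeHeader ? 1 : 0;
        int rows = table.Rows.Count;
        int cols = table.Columns.Count;
        var cells = new object[rows + offset, cols];

        if (includeHeader)
        {
            for (int c = 0; c < cols; c++)
                cells[0, c] = table.Columns[c].ColumnName;
        }

        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                object value = table.Rows[r][c];
                cells[r + offset, c] = value == DBNull.Value ? null : value;
            }
        }
        return cells;
    }
}
```

With Interop, the returned array can then be assigned to the Value property of a range that spans the same number of rows and columns, which replaces thousands of per-cell calls with one. Formatting such as borders, alignment and AutoFit can likewise be applied once to the whole range after the data is written.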
The Interop libraries are extremely slow and consume a lot of system resources.
Instead of using the Interop libraries to create Excel files, you can simply use the OpenXML library.
I'm using it in production; exporting a DataSet of over 1 million rows to an Excel file takes only about 10 seconds.
Here is sample code, quoted from:
Export DataTable to Excel with Open Xml SDK in c#
private void ExportDSToExcel(DataSet ds, string destination)
{
using (var workbook = SpreadsheetDocument.Create(destination, DocumentFormat.OpenXml.SpreadsheetDocumentType.Workbook))
{
var workbookPart = workbook.AddWorkbookPart();
workbook.WorkbookPart.Workbook = new DocumentFormat.OpenXml.Spreadsheet.Workbook();
workbook.WorkbookPart.Workbook.Sheets = new DocumentFormat.OpenXml.Spreadsheet.Sheets();
uint sheetId = 1;
foreach (DataTable table in ds.Tables)
{
var sheetPart = workbook.WorkbookPart.AddNewPart<WorksheetPart>();
var sheetData = new DocumentFormat.OpenXml.Spreadsheet.SheetData();
sheetPart.Worksheet = new DocumentFormat.OpenXml.Spreadsheet.Worksheet(sheetData);
DocumentFormat.OpenXml.Spreadsheet.Sheets sheets = workbook.WorkbookPart.Workbook.GetFirstChild<DocumentFormat.OpenXml.Spreadsheet.Sheets>();
string relationshipId = workbook.WorkbookPart.GetIdOfPart(sheetPart);
if (sheets.Elements<DocumentFormat.OpenXml.Spreadsheet.Sheet>().Count() > 0)
{
sheetId =
sheets.Elements<DocumentFormat.OpenXml.Spreadsheet.Sheet>().Select(s => s.SheetId.Value).Max() + 1;
}
DocumentFormat.OpenXml.Spreadsheet.Sheet sheet = new DocumentFormat.OpenXml.Spreadsheet.Sheet() { Id = relationshipId, SheetId = sheetId, Name = table.TableName };
sheets.Append(sheet);
DocumentFormat.OpenXml.Spreadsheet.Row headerRow = new DocumentFormat.OpenXml.Spreadsheet.Row();
List<String> columns = new List<string>();
foreach (DataColumn column in table.Columns)
{
columns.Add(column.ColumnName);
DocumentFormat.OpenXml.Spreadsheet.Cell cell = new DocumentFormat.OpenXml.Spreadsheet.Cell();
cell.DataType = DocumentFormat.OpenXml.Spreadsheet.CellValues.String;
cell.CellValue = new DocumentFormat.OpenXml.Spreadsheet.CellValue(column.ColumnName);
headerRow.AppendChild(cell);
}
sheetData.AppendChild(headerRow);
foreach (DataRow dsrow in table.Rows)
{
DocumentFormat.OpenXml.Spreadsheet.Row newRow = new DocumentFormat.OpenXml.Spreadsheet.Row();
foreach (String col in columns)
{
DocumentFormat.OpenXml.Spreadsheet.Cell cell = new DocumentFormat.OpenXml.Spreadsheet.Cell();
cell.DataType = DocumentFormat.OpenXml.Spreadsheet.CellValues.String;
cell.CellValue = new DocumentFormat.OpenXml.Spreadsheet.CellValue(dsrow[col].ToString()); //
newRow.AppendChild(cell);
}
sheetData.AppendChild(newRow);
}
}
}
}
Edit
Based on the replies below, the error I am experiencing may or may not be what prevents me from reading my Excel file. That is, I am not getting data from the line worksheet.Cells[row,col].Value in my for loop given below.
Problem
I am trying to return a DataTable with information from an Excel file. Specifically, I believe it is an xlsx file from Excel 2013. Please see the code below:
private DataTable ImportToDataTable(string Path)
{
DataTable dt = new DataTable();
FileInfo fi = new FileInfo(Path);
if(!fi.Exists)
{
throw new Exception("File " + Path + " Does not exist.");
}
using (ExcelPackage xlPackage = new ExcelPackage(fi))
{
//Get the worksheet in the workbook
ExcelWorksheet worksheet = xlPackage.Workbook.Worksheets.First();
//Obtain the worksheet size
ExcelCellAddress startCell = worksheet.Dimension.Start;
ExcelCellAddress endCell = worksheet.Dimension.End;
//Create the data column
for(int col = startCell.Column; col <= endCell.Column; col++)
{
dt.Columns.Add(col.ToString());
}
for(int row = startCell.Row; row <= endCell.Row; row++)
{
DataRow dr = dt.NewRow(); //Create a row
int i = 0;
for(int col = startCell.Column; col <= endCell.Column; col++)
{
dr[i++] = worksheet.Cells[row, col].Value.ToString();
}
dt.Rows.Add(dr);
}
}
return dt;
}
Error
This is where things get weird. I can see the proper value in startCell and endCell. However, when I look at worksheet I take a peek under Cells and I see something I don't understand:
'worksheet.Cells.Current' threw an exception of type 'System.NullReferenceException'
Attempts
Reformatting my Excel file with general fields.
Making sure no field in my Excel file was empty.
RTFM'ed the EPPlus documentation. Nothing suggestive of this error.
Looked at EPPlus errors on Stack Overflow. My problem is unique.
Honestly, I am having trouble figuring out what this error is really saying. Is something wrong with my format? Is something wrong with EPPlus? I have read on here that people had no problems reading 2013 xlsx files with EPPlus, and I am only trying to parse the Excel file row by row. If someone could help shed light on what this error means and how to rectify it, I would be most grateful. I've spent quite a long time trying to figure this out.
When we give:
dr[i++] = worksheet.Cells[row, col].Value.ToString();
it looks up the value of that cell; if the cell is empty, Value is null, and calling ToString() on it throws a NullReferenceException.
Try instead:
dr[i++] = worksheet.Cells[row, col].Text;
Hope this will help
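If you need Value rather than Text (for example, to keep the unformatted value), a null-safe conversion is another option. A minimal sketch; CellText.Safe is a hypothetical helper name, not part of EPPlus.

```csharp
public static class CellText
{
    // Returns the value's string form, or an empty string when the cell
    // value is null (which is what an empty cell yields).
    public static string Safe(object value)
    {
        return value?.ToString() ?? string.Empty;
    }
}
```

In the loop from the question this would read dr[i++] = CellText.Safe(worksheet.Cells[row, col].Value);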
Like @Thorians said, Current is really meant to be used while you are enumerating the cells. If you want to use it in its purest form and actually be able to call Current, you would need something like this:
using (var pck = new ExcelPackage(existingFile))
{
var worksheet = pck.Workbook.Worksheets.First();
//this is important to hold onto the range reference
var cells = worksheet.Cells;
//this is important to start the cellEnum object (the Enumerator)
cells.Reset();
//Can now loop the enumerator
while (cells.MoveNext())
{
//Current can now be used thanks to MoveNext
Console.WriteLine("Cell [{0}, {1}] = {2}"
, cells.Current.Start.Row
, cells.Current.Start.Column
, cells.Current.Value);
}
}
Note that you have to hold the range in a local variable (cells) for this to work properly. Otherwise Current will be null, as it is if you try `worksheet.Cells.Current`.
But it would be simpler to use a ForEach and have the CLR do the work for you.
UPDATE: Based on the comments, your code should work fine as is. Could it be your Excel file? This test:
[TestMethod]
public void Current_Cell_Test()
{
//http://stackoverflow.com/questions/32516676/trying-to-read-excel-file-with-epplus-and-getting-system-nullexception-error
//Throw in some data
var datatable = new DataTable("tblData");
datatable.Columns.AddRange(new[] { new DataColumn("Col1", typeof (int)), new DataColumn("Col2", typeof (int)),new DataColumn("Col3", typeof (object)) });
for (var i = 0; i < 10; i++)
{
var row = datatable.NewRow(); row[0] = i; row[1] = i * 10; row[2] = Path.GetRandomFileName(); datatable.Rows.Add(row);
}
//Create a test file
var fi = new FileInfo(@"c:\temp\test1.xlsx");
if (fi.Exists)
fi.Delete();
using (var pck = new ExcelPackage(fi))
{
var worksheet = pck.Workbook.Worksheets.Add("Sheet1");
worksheet.Cells.LoadFromDataTable(datatable, true);
pck.Save();
}
var dt = new DataTable();
using (ExcelPackage xlPackage = new ExcelPackage(fi))
{
//Get the worksheet in the workbook
ExcelWorksheet worksheet = xlPackage.Workbook.Worksheets.First();
//Obtain the worksheet size
ExcelCellAddress startCell = worksheet.Dimension.Start;
ExcelCellAddress endCell = worksheet.Dimension.End;
//Create the data column
for (int col = startCell.Column; col <= endCell.Column; col++)
{
dt.Columns.Add(col.ToString());
}
for (int row = startCell.Row; row <= endCell.Row; row++)
{
DataRow dr = dt.NewRow(); //Create a row
int i = 0;
for (int col = startCell.Column; col <= endCell.Column; col++)
{
dr[i++] = worksheet.Cells[row, col].Value.ToString();
}
dt.Rows.Add(dr);
}
}
Console.Write("{{dt Rows: {0} Columns: {1}}}", dt.Rows.Count, dt.Columns.Count);
}
Gives this in the output:
{dt Rows: 11 Columns: 3}
Current is the current range when enumerating.
There is nothing wrong with it throwing an exception when it is inspected in the debugger outside of an enumeration.
Code sample:
var range = ws.Cells[1,1,1,100];
foreach (var cell in range)
{
var a = range.Current.Value; // a is same as b
var b = cell.Value;
}
I was getting the same issue while reading an Excel file, and none of the solutions provided worked for me. Here is working code:
public void readXLS(string FilePath)
{
FileInfo existingFile = new FileInfo(FilePath);
using (ExcelPackage package = new ExcelPackage(existingFile))
{
//get the first worksheet in the workbook
ExcelWorksheet worksheet = package.Workbook.Worksheets[1];
int colCount = worksheet.Dimension.End.Column; //get Column Count
int rowCount = worksheet.Dimension.End.Row; //get row count
for (int row = 1; row <= rowCount; row++)
{
for (int col = 1; col <= colCount; col++)
{
Console.WriteLine(" Row:" + row + " column:" + col + " Value:" + worksheet.Cells[row, col].Value.ToString().Trim());
}
}
}
}
I am exporting Sql data to Excel. The code I am using currently is :
DataTable dt = new DataTable();
// Create sql connection string
string conString = "Data Source=DELL\\SQLSERVER1;Trusted_Connection=True;DATABASE=Zelen;CONNECTION RESET=FALSE";
SqlConnection sqlCon = new SqlConnection(conString);
sqlCon.Open();
SqlDataAdapter da = new SqlDataAdapter("select LocalSKU,ItemName, QOH,Price,Discontinued,CAST(Barcode As varchar(25)) As Barcode,Integer2,Integer3,ISNULL(SalePrice,0.0000)AS SalePrice,SaleOn,ISNULL(Price2,0.0000)AS Price2 from dbo.Inventory", sqlCon);
System.Data.DataTable dtMainSQLData = new System.Data.DataTable();
da.Fill(dtMainSQLData);
DataColumnCollection dcCollection = dtMainSQLData.Columns;
// Export Data into EXCEL Sheet
Microsoft.Office.Interop.Excel.ApplicationClass ExcelApp = new Microsoft.Office.Interop.Excel.ApplicationClass();
ExcelApp.Application.Workbooks.Add(Type.Missing);
int i = 1;
int j = 1;
int s = 1;
//header row
foreach (DataColumn col in dtMainSQLData.Columns)
{
ExcelApp.Cells[i, j] = col.ColumnName;
j++;
ExcelApp.Rows.AutoFit();
ExcelApp.Columns.AutoFit();
}
i++;
//data rows
foreach (DataRow row in dtMainSQLData.Rows)
{
for (int k = 1; k < dtMainSQLData.Columns.Count + 1; k++)
{
ExcelApp.Cells[i, k] = "'" + row[k - 1].ToString();
}
i++;
s++;
Console.Write(s);
Console.Write("\n\r");
ExcelApp.Columns.AutoFit();
ExcelApp.Rows.AutoFit();
}
var b = Environment.CurrentDirectory + @"\Sheet1.xlsx";
ExcelApp.ActiveWorkbook.SaveCopyAs(b);
ExcelApp.ActiveWorkbook.Saved = true;
ExcelApp.Quit();
Console.WriteLine(".xlsx file Exported succssessfully.");
There are 70,000 rows in my SQL database, and I am running this code in a console application.
It takes more than an hour to export them to the Excel file.
How can I export the data faster?
Examples would be appreciated.
Option 1:
See this answer. Use a library called ClosedXML to write the data to Excel.
Option 2:
Get a range big enough for all of the data and set its value to a 2-dimensional array. This works very fast without referencing another library. I tried it with 70,000 records.
// Get an excel instance
Microsoft.Office.Interop.Excel.Application excel = new Microsoft.Office.Interop.Excel.Application();
// Get a workbook
Workbook wb = excel.Workbooks.Add();
// Get a worksheet
Worksheet ws = wb.Worksheets.Add();
ws.Name = "Test Export";
// Add column names to the first row
int col = 1;
foreach (DataColumn c in table.Columns) {
ws.Cells[1, col] = c.ColumnName;
col++;
}
// Create a 2D array with the data from the table
int i = 0;
string[,] data = new string[table.Rows.Count, table.Columns.Count];
foreach (DataRow row in table.Rows) {
int j = 0;
foreach (DataColumn c in table.Columns) {
data[i,j] = row[c].ToString();
j++;
}
i++;
}
// Set the range value to the 2D array
ws.Range[ws.Cells[2, 1], ws.Cells[table.Rows.Count + 1, table.Columns.Count]].value = data;
// Auto fit columns and rows, show excel, save.. etc
excel.Columns.AutoFit();
excel.Rows.AutoFit();
excel.Visible = true;
Edit: This version exported a million records on my machine in about a minute. It uses Excel interop and breaks the rows into chunks of 100,000.
// Start a stopwatch to time the process
System.Diagnostics.Stopwatch sw = new System.Diagnostics.Stopwatch();
sw.Start();
// Check if there are rows to process
if (table != null && table.Rows.Count > 0) {
// Determine the number of chunks
int chunkSize = 100000;
double chunkCountD = (double)table.Rows.Count / (double)chunkSize;
int chunkCount = table.Rows.Count / chunkSize;
chunkCount = chunkCountD > chunkCount ? chunkCount + 1 : chunkCount;
// Instantiate excel
Microsoft.Office.Interop.Excel.Application excel = new Microsoft.Office.Interop.Excel.Application();
// Get a workbook
Workbook wb = excel.Workbooks.Add();
// Get a worksheet
Worksheet ws = wb.Worksheets.Add();
ws.Name = "Test Export";
// Add column names to excel
int col = 1;
foreach (DataColumn c in table.Columns) {
ws.Cells[1, col] = c.ColumnName;
col++;
}
// Process the rows chunk by chunk; each chunk gets its own 2D array,
// so there is no need to copy the whole table into one big array first
int processed = 0;
int data2DLength = table.Columns.Count;
for (int chunk = 1; chunk <= chunkCount; chunk++) {
if (table.Rows.Count - processed < chunkSize) chunkSize = table.Rows.Count - processed;
string[,] chunkData = new string[chunkSize, data2DLength];
int l = 0;
for (int k = processed; k < chunkSize + processed; k++) {
for (int m = 0; m < data2DLength; m++) {
chunkData[l,m] = table.Rows[k][m].ToString();
}
l++;
}
// Set the range value to the chunk 2d array
ws.Range[ws.Cells[2 + processed, 1], ws.Cells[processed + chunkSize + 1, data2DLength]].Value = chunkData;
processed += chunkSize;
}
// Auto fit columns and rows, show excel, save.. etc
excel.Columns.AutoFit();
excel.Rows.AutoFit();
excel.Visible = true;
}
// Stop the stopwatch and display the seconds elapsed
sw.Stop();
MessageBox.Show(sw.Elapsed.TotalSeconds.ToString());
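The chunk arithmetic in the edit above can also be written as a small helper. This is only a sketch: `Chunking` and `Ranges` are names I made up, and it uses nothing beyond the base class library.

```csharp
using System;
using System.Collections.Generic;

public static class Chunking
{
    // Yields (firstRowIndex, rowCount) for each chunk.
    // The last chunk is shorter when totalRows is not a multiple of chunkSize.
    public static IEnumerable<Tuple<int, int>> Ranges(int totalRows, int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");
        for (int start = 0; start < totalRows; start += chunkSize)
        {
            yield return Tuple.Create(start, Math.Min(chunkSize, totalRows - start));
        }
    }
}
```

For a chunk that starts at `start` and holds `count` rows, the target range runs from worksheet row `2 + start` to row `start + count + 1`, the same bounds used in the loop above.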
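If you save your data in CSV format, you can load that into Excel. Here is some code I have modified from The Code Project site: http://www.codeproject.com/Tips/665519/Writing-a-DataTable-to-a-CSV-file
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Diagnostics;
using System.IO;
using System.Linq;
public class Program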
{
static void Main(string[] args)
{
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
DataTable dt = new DataTable();
// Create Connection object
using (SqlConnection conn = new SqlConnection(@"<Your Connection String>"))
{
// Create Command object
conn.Open();
using (SqlCommand cmd = new SqlCommand("SELECT * FROM <Your Table>", conn))
{
using (SqlDataReader reader = cmd.ExecuteReader())
{
try
{
dt.Load(reader);
using (StreamWriter writer = new StreamWriter("C:\\Temp\\dump.csv"))
{
DataConvert.ToCSV(dt, writer, false);
}
}
catch (Exception)
{
throw;
}
}
}
}
// Stop timing
stopwatch.Stop();
// Write result
Console.WriteLine("Time elapsed: {0}",
stopwatch.Elapsed);
Console.ReadKey();
}
}
public static class DataConvert
{
public static void ToCSV(DataTable sourceTable, TextWriter writer, bool includeHeaders)
{
if (includeHeaders)
{
List<string> headerValues = new List<string>();
foreach (DataColumn column in sourceTable.Columns)
{
headerValues.Add(QuoteValue(column.ColumnName));
}
writer.WriteLine(String.Join(",", headerValues.ToArray()));
}
string[] items = null;
foreach (DataRow row in sourceTable.Rows)
{
items = row.ItemArray.Select(o => QuoteValue(o.ToString())).ToArray();
writer.WriteLine(String.Join(",", items));
}
writer.Flush();
}
private static string QuoteValue(string value)
{
return String.Concat("\"", value.Replace("\"", "\"\""), "\"");
}
}
On my PC this took 30 seconds to process 1 million records...
You can try this function once your data is loaded into a DataTable:
Public Shared Sub ExportDataSetToExcel(ByVal ds As DataTable, ByVal filename As String)
Dim response As HttpResponse = HttpContext.Current.Response
response.Clear()
response.Buffer = True
response.Charset = ""
response.ContentType = "application/vnd.ms-excel"
Using sw As New StringWriter()
Using htw As New HtmlTextWriter(sw)
Dim dg As New DataGrid()
dg.DataSource = ds
dg.DataBind()
dg.RenderControl(htw)
response.Charset = "UTF-8"
response.ContentEncoding = System.Text.Encoding.UTF8
response.BinaryWrite(System.Text.Encoding.UTF8.GetPreamble())
response.Output.Write(sw.ToString())
response.[End]()
End Using
End Using
End Sub
I prefer the OpenXmlWriter class from Microsoft's Open XML SDK. Open XML is the format all the newer Office files are in.
Export a large data query (60k+ rows) to Excel
Vincent Tan has a nice article on the topic.
http://polymathprogrammer.com/2012/08/06/how-to-properly-use-openxmlwriter-to-write-large-excel-files/
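A rough sketch of the streaming approach, assuming the DocumentFormat.OpenXml NuGet package is referenced. The class name `OpenXmlExport`, the helper `WriteTextCell` and the sheet name are my own placeholders, and every value is written as text, so treat this as a starting point rather than the article's exact code.

```csharp
using System.Data;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

public static class OpenXmlExport
{
    public static void Export(DataTable table, string path)
    {
        using (SpreadsheetDocument doc = SpreadsheetDocument.Create(path, SpreadsheetDocumentType.Workbook))
        {
            WorkbookPart workbookPart = doc.AddWorkbookPart();
            WorksheetPart sheetPart = workbookPart.AddNewPart<WorksheetPart>();

            // Stream the rows out one by one instead of building the whole sheet in memory.
            using (OpenXmlWriter writer = OpenXmlWriter.Create(sheetPart))
            {
                writer.WriteStartElement(new Worksheet());
                writer.WriteStartElement(new SheetData());

                // Header row with the column names
                writer.WriteStartElement(new Row());
                foreach (DataColumn column in table.Columns)
                    WriteTextCell(writer, column.ColumnName);
                writer.WriteEndElement(); // Row

                // Data rows
                foreach (DataRow row in table.Rows)
                {
                    writer.WriteStartElement(new Row());
                    foreach (object value in row.ItemArray)
                        WriteTextCell(writer, value.ToString());
                    writer.WriteEndElement(); // Row
                }

                writer.WriteEndElement(); // SheetData
                writer.WriteEndElement(); // Worksheet
            }

            // The workbook part only needs to list the single sheet.
            using (OpenXmlWriter writer = OpenXmlWriter.Create(workbookPart))
            {
                writer.WriteStartElement(new Workbook());
                writer.WriteStartElement(new Sheets());
                writer.WriteElement(new Sheet
                {
                    Name = "Export",
                    SheetId = 1U,
                    Id = workbookPart.GetIdOfPart(sheetPart)
                });
                writer.WriteEndElement(); // Sheets
                writer.WriteEndElement(); // Workbook
            }
        }
    }

    // Writes one cell whose value is stored as a plain string.
    private static void WriteTextCell(OpenXmlWriter writer, string text)
    {
        writer.WriteElement(new Cell
        {
            DataType = CellValues.String,
            CellValue = new CellValue(text)
        });
    }
}
```

Because each row is written and then dropped, memory use stays roughly flat no matter how many rows the table has.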