I am trying to access data from Excel in C#. Ideally I want to put the data into a list or a series collection. I was using this tutorial - http://www.aspsnippets.com/Articles/Read-Excel-file-using-OLEDB-Data-Provider-in-C-Net.aspx.
It was very helpful but I think he missed out the data adapter part. Here is the code I got following his example.
string connectionString = null;
connectionString = "Provider = Microsoft.ACE.OLEDB.12.0; Data Source = P:\\Visual Studio 2012\\Projects\\SmartSheetAPI\\SmartSheetAPI\\bin\\Debug\\OUTPUT.xls; Extended Properties = 'excel 12.0 Xml; HDR=YES; IMEX=1;';";
//Establish Connection
string dataSource = "P:\\Visual Studio 2012\\Projects\\SmartSheetAPI\\SmartSheetAPI\\bin\\Debug\\OUTPUT.xls;";
string excelConnection = "Provider=Microsoft.ACE.OLEDB.12.0; Data Source = " + dataSource + " Extended Properties='Excel 8.0; HDR=Yes'";
OleDbConnection connExcel = new OleDbConnection(connectionString);
OleDbCommand cmdExcel = new OleDbCommand();
cmdExcel.Connection = connExcel;
//Accessing Sheets
connExcel.Open();
DataTable dtExcelSchema;
dtExcelSchema = connExcel.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
connExcel.Close();
//access excel Sheets (tables in database)
DataSet dataset = new DataSet();
string SheetName = dtExcelSchema.Rows[0]["TABLE_NAME"].ToString();
cmdExcel.CommandText = "SELECT * From [" + SheetName + "]";
da.SelectCommand = cmdExcel;
da.Fill(dataset);
connExcel.Close();
If you look at the bottom three lines you will notice he uses da.SelectCommand and da.Fill to fill the dataset. But I think this requires a dataadapter and he doesn't have that in his example. I have tried creating a dataadapter as below:
SqlDataAdapter dataadapter = new SqlDataAdapter();
But I get an error stating: cannot implicitly convert type 'System.Data.OleDb.OleDbCommand' to 'System.Data.SqlClient.SqlCommand'.
I know it is working right up to the select statement. Can someone help me? I basically just want to be able to access the information I am getting from the select statement.
Accessing Excel data through an OleDb connection is always a headache. You can try a third-party library instead, such as Aspose; usage is very simple. You can try the following code after adding a reference to the library in your project.
//Creating a file stream containing the Excel file to be opened
FileStream fstream = new FileStream("C:\\book1.xls", FileMode.Open);
//Instantiating a Workbook object
//Opening the Excel file through the file stream
Workbook workbook = new Workbook(fstream);
//Accessing the first worksheet in the Excel file
Worksheet worksheet = workbook.Worksheets[0];
//Exporting the contents of 7 rows and 2 columns starting from 1st cell to DataTable
DataTable dataTable = worksheet.Cells.ExportDataTable(0, 0, 7, 2, true);
//Binding the DataTable with DataGrid
dataGrid1.DataSource = dataTable;
//Closing the file stream to free all resources
fstream.Close();
You need an OleDbDataAdapter, not a SqlDataAdapter. So, do this:
OleDbDataAdapter da = new OleDbDataAdapter(cmdExcel);
da.Fill(dataset);
Excel is an OLE DB data source, so the classes you should be using will generally be prefixed with OleDb, just as the ones for SQL Server connectivity and manipulation are prefixed with Sql.
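Since the question asks for the data to end up in a list, here is a minimal sketch of copying a filled DataTable into a List<string[]>. The class name DataTableRows and the sample table are my own, made up for illustration; in the question's code the table would be dataset.Tables[0] after da.Fill(dataset).

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;

public static class DataTableRows
{
    // Copies every row of a filled DataTable into a list of string arrays.
    // Empty cells (DBNull) become empty strings.
    public static List<string[]> ToList(DataTable table)
    {
        var rows = new List<string[]>(table.Rows.Count);
        foreach (DataRow row in table.Rows)
        {
            var cells = new string[table.Columns.Count];
            for (int i = 0; i < cells.Length; i++)
            {
                object value = row[i];
                cells[i] = value is DBNull
                    ? string.Empty
                    : Convert.ToString(value, CultureInfo.InvariantCulture);
            }
            rows.Add(cells);
        }
        return rows;
    }

    public static void Main()
    {
        // Hypothetical stand-in for dataset.Tables[0] after da.Fill(dataset).
        var table = new DataTable();
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("Age", typeof(double));
        table.Rows.Add("Johnny", 34d);
        table.Rows.Add("Joey", DBNull.Value);

        foreach (string[] cells in ToList(table))
            Console.WriteLine(string.Join(" | ", cells));
    }
}
```

Everything is converted to strings here only to keep the sketch short; if you need typed values, keep the object from row[i] and convert per column instead.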
I am reading Excel data using the OleDbDataAdapter, with the code below. My Excel file has 80 rows and 19 columns. Each column represents a different language (e.g. English, Arabic, Chinese, etc.).
Each row has certain strings.
public DataSet ReadExcelFile(string dataSource)
{
DataSet ds = new DataSet();
string connectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + dataSource
+ " ; Extended Properties='Excel 12.0; IMEX=1'";
using (OleDbConnection conn = new OleDbConnection(connectionString))
{
conn.Open();
OleDbCommand cmd = new OleDbCommand();
cmd.Connection = conn;
// Get all Sheets in Excel File
DataTable dtSheet = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
// Loop through all Sheets to get data
foreach (DataRow dr in dtSheet.Rows)
{
string sheetName = dr["TABLE_NAME"].ToString();
if (!sheetName.EndsWith("$"))
continue;
// Get all rows from the Sheet
cmd.CommandText = "SELECT * FROM [" + sheetName + "]";
DataTable dt = new DataTable();
dt.TableName = sheetName;
OleDbDataAdapter da = new OleDbDataAdapter(cmd);
da.Fill(dt);
ds.Tables.Add(dt);
}
cmd = null;
conn.Close();
}
return ds;
}
It works perfectly fine except for a few cells, where the table does not contain the complete string. This is happening for the Chinese language:
for example, my string is:
“个性化喂养模式”允许您预置常用的喂养模式。一旦设定好 , 当按“模式”键时 , 它将自动出现在喂养模式列表中。
-----------------------------------------------
您可以创建 , 编辑或删除个性化喂养模式。
-----------------------------------------------
提示 : 个性化喂养模式可能会被默认喂养列表隐藏。
-----------------------------------------------
使用“>”键选择需要的喂养模式。"
But I am getting only:
“个性化喂养模式”允许您预置常用的喂养模式。一旦设定好 , 当按“模式”键时 , 它将自动出现在喂养模式列表中。
-----------------------------------------------
您可以创建 , 编辑或删除个性化喂养模式。
-----------------------------------------------
提示 : 个性化喂养模式可能会被默认喂养列表隐藏。
-----------------------------------------------
The last row is missing.
This is happening for only 3 cells; the rest of the cells come through properly.
It seems that it is being truncated to 255 characters.
According to this, the Microsoft OLE DB provider truncates the data to 255 characters.
When you use an OLE DB provider, the data type of each column is determined automatically by the provider based on the first 8 rows. If any cell in those first 8 rows is longer than 255 characters, the column is typed as memo; otherwise it is typed as text, which can hold only 255 characters, so longer values further down are truncated. To overcome this issue, either change the registry setting (TypeGuessRows) as described in the KB article below: http://support.microsoft.com/kb/281517 or use the Microsoft.Jet.OLEDB provider to read the data.
Or you may try the OpenXml approach; see "Parse and read a large spreadsheet document (Open XML SDK)".
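To find out which cells were hit by the 255-character limit discussed above, something like the following rough check can be run over the filled DataTable. It is only a heuristic sketch: the names TruncationCheck and FindSuspectCells are made up, and a cell that legitimately contains exactly 255 characters will also be reported.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class TruncationCheck
{
    // Length at which text-typed columns are cut off, per the discussion above.
    public const int TextLimit = 255;

    // Returns "ColumnName[rowIndex]" for every string cell whose length is
    // exactly TextLimit, which suggests (but does not prove) truncation.
    public static List<string> FindSuspectCells(DataTable table)
    {
        var suspects = new List<string>();
        for (int r = 0; r < table.Rows.Count; r++)
        {
            foreach (DataColumn column in table.Columns)
            {
                string text = table.Rows[r][column] as string;
                if (text != null && text.Length == TextLimit)
                    suspects.Add(column.ColumnName + "[" + r + "]");
            }
        }
        return suspects;
    }

    public static void Main()
    {
        // Hypothetical stand-in for one of the tables returned by ReadExcelFile.
        var table = new DataTable();
        table.Columns.Add("Chinese", typeof(string));
        table.Rows.Add(new string('x', 255)); // looks truncated
        table.Rows.Add("short text");         // fine

        foreach (string cell in FindSuspectCells(table))
            Console.WriteLine("Possibly truncated: " + cell);
    }
}
```

If the three problem cells show up in this list, that is a good sign that type guessing is the cause rather than anything specific to Chinese text.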
I am trying to read Excel data into C# using ODBC. Here is my code:
string lstrFileName = "Sheet1";
//string strConnString = "Driver={Microsoft Text Driver (*.txt; *.csv)};Dbq="+path+ ";Extensions=asc,csv,tab,txt;Persist Security Info=False";
string strConnString = "Driver={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};Dbq=E:\\T1.xlsx;Extensions=xls/xlsx;Persist Security Info=False";
DataTable ds;
using (OdbcConnection oConn = new OdbcConnection(strConnString))
{
using (OdbcCommand oCmd = new OdbcCommand())
{
oCmd.Connection = oConn;
oCmd.CommandType = System.Data.CommandType.Text;
oCmd.CommandText = "select A from [" + lstrFileName + "$]";
OdbcDataAdapter oAdap = new OdbcDataAdapter();
oAdap.SelectCommand = oCmd;
ds = new DataTable();
oAdap.Fill(ds);
oAdap.Dispose();
// ds.Dispose();
}
}
My sample data:
A
1
2
3
AA
BB
In the data table, it reads 1, 2, 3 and two blank rows.
I understand that this is because the data type is decided from the first rows, but how can I convert the column to string and read all the rows?
Any suggestions?
I already tried CStr, but it did not help.
For a previous discussion of a similar problem, please check the following:
DBNull in non-empty cell when reading Excel file through OleDB
As a workaround, you may also format the column as "text" (i.e. in Excel, select the column, right-click, "Format Cells..."), though this might be impractical if you will process a large number of files or if you must not touch the file.
This is partially speculation, but when reading an Excel document as a database, the adapter has to make a judgement on datatypes and usually does a pretty good job. However, because Excel allows mixed datatypes (and databases do not), it occasionally gets it wrong.
My recommendation would be to not use a data adapter, and just read in every field as an object type. From there, you can easily convert them to strings (StringBuilder, ToString(), etc.) or even TryParse them into the types you suspect they should be, ignoring the ODBC datatype.
Something like this would be a boilerplate for that:
using (OdbcCommand oCmd = new OdbcCommand())
{
oCmd.Connection = oConn;
oCmd.CommandType = System.Data.CommandType.Text;
oCmd.CommandText = "select A from [" + lstrFileName + "$]";
using (OdbcDataReader reader = oCmd.ExecuteReader())
{
object[] fields = new object[reader.FieldCount];
while (reader.Read())
{
reader.GetValues(fields);
// do something with fields
}
}
}
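For the "do something with fields" step, a small helper along these lines could turn each object into a string and optionally try a numeric parse. This is a sketch only; FieldConversion, AsText and TryAsNumber are names I made up. Note that a cell the driver has already decided it cannot represent may still arrive as DBNull, in which case this gives you an empty string rather than the original text.

```csharp
using System;
using System.Globalization;

public static class FieldConversion
{
    // Converts one value from reader.GetValues(...) to text.
    // null and DBNull both become an empty string.
    public static string AsText(object field)
    {
        if (field == null || field is DBNull)
            return string.Empty;
        return Convert.ToString(field, CultureInfo.InvariantCulture);
    }

    // Tries to interpret the field as a number, regardless of the type
    // the driver reported for the column.
    public static bool TryAsNumber(object field, out double number)
    {
        return double.TryParse(AsText(field), NumberStyles.Float,
                               CultureInfo.InvariantCulture, out number);
    }

    public static void Main()
    {
        // Hypothetical values, mimicking a column that mixes numbers and text.
        object[] fields = { 1d, 2d, "AA", DBNull.Value };
        foreach (object field in fields)
        {
            double number;
            string kind = TryAsNumber(field, out number) ? "number" : "text";
            Console.WriteLine("'" + AsText(field) + "' -> " + kind);
        }
    }
}
```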
Additional information: The Microsoft Office Access database engine could not find the object 'C:\Users\username\Documents\sampleData.xls'. Make sure the object exists and that you spell its name and the path name correctly.
The Error is highlighted at
theDataAdapter.Fill(spreadSheetData);
Here's the sample data I used (tried in .csv , .xls , .xlsx )
Name Age Status Children
Johnny 34 Married 3
Joey 21 Single 1
Michael 16 Dating 0
Smith 42 Divorced 4
Here's the code associated:
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;
using System.IO;
using System.Data.OleDb;
namespace uploadExcelFile
{
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
}
private void btnImport_Click(object sender, EventArgs e)
{
var frmDialog = new System.Windows.Forms.OpenFileDialog();
if (frmDialog.ShowDialog() == System.Windows.Forms.DialogResult.OK)
{
string strFileName = frmDialog.FileName;
System.IO.FileInfo spreadSheetFile = new System.IO.FileInfo(strFileName);
scheduleGridView.DataSource = spreadSheetFile.ToString();
System.Diagnostics.Debug.WriteLine(frmDialog.FileName);
System.Diagnostics.Debug.WriteLine(frmDialog.SafeFileName);
String name = frmDialog.SafeFileName;
String constr = String.Format(@"Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=""Excel 12.0 Xml;HDR=YES""", frmDialog.FileName);
OleDbConnection myConnection = new OleDbConnection(constr);
OleDbCommand onlineConnection = new OleDbCommand("SELECT * FROM [" + frmDialog.FileName + "]", myConnection);
myConnection.Open();
OleDbDataAdapter theDataAdapter = new OleDbDataAdapter(onlineConnection);
DataTable spreadSheetData = myConnection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
theDataAdapter.Fill(spreadSheetData);
scheduleGridView.DataSource = spreadSheetData;
}
}
}
}
scheduleGridView is the DataGridView's name, and btnImport is the name of the import Button.
I've installed the 2007 Office System Driver: Data Connectivity Components, which gave me AccessDatabaseEngine.exe, but from there I've been stuck without understanding how to get around this. It should go without saying that the file path is correct in its entirety. There are no odd characters in the path name either (spaces, underscores, etc.).
Mini Update :: (another dead end it seems like)
Although the initial error says, "could not find the object 'C:\Users\username\Documents\sampleData.xls'",
when I look at the exception details in the debugger, the path is shown as "C:\Users\username\Documents\sampleData.xls".
So I thought the error was that it wasn't taking the path as a literal, but this article, "C# verbatim string literal not working. Very Strange backslash always double",
shows very clearly that that is not my issue.
I am guessing you may be mistaken by what is returned from the following line of code…
DataTable spreadSheetData = myConnection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
The DataTable returned from this line will have nine (9) columns (TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE, TABLE_GUID, DESCRIPTION, TABLE_PROPID, DATE_CREATED and DATE_MODIFIED). This ONE (1) DataTable simply “describes” the worksheet(s) and named range(s) in the entire selected Excel workbook. Each row in this DataTable represents either a worksheet OR a named range. To distinguish worksheets from named ranges, the “TABLE_NAME” column in this DataTable holds the name of the worksheet or range, and each worksheet name ends with a dollar sign ($). If the “TABLE_NAME” value in a row does NOT end in a dollar sign, then it is a range and not a worksheet.
Therefore, when the line
OleDbDataAdapter theDataAdapter = new OleDbDataAdapter(onlineConnection);
blows up and says it cannot find the “filename”… this is somewhat expected, because this line is looking for a “worksheet” name, not a filename. On the line creating the select command…
OleDbCommand onlineConnection = new OleDbCommand("SELECT * FROM [" + frmDialog.FileName + "]", myConnection);
This is incorrect; you have already selected the filename and opened the file with
String constr = String.Format(@"Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=""Excel 12.0 Xml;HDR=YES""", frmDialog.FileName);
OleDbConnection myConnection = new OleDbConnection(constr);
myConnection.Open();
The correct OleDbCommand line should be…
OleDbCommand onlineConnection = new OleDbCommand("SELECT * FROM [" + sheetName + "]", myConnection);
The problem here is that the current code is not getting the worksheet names. Therefore, we cannot “select” the worksheet from the workbook then fill the adapter with the worksheet.
The other issue is setting the DataGridView’s DataSource to spreadSheetData… when you get the worksheet(s) from an Excel “Workbook”, you must assume there will be more than one sheet. Therefore a DataSet will work as a container to hold all the worksheets in the workbook. Each DataTable in the DataSet would be a single worksheet and it can be surmised that the DataGridView can only display ONE (1) of these tables at a time. Given this, below are the changes described along with an added button to display the “Next” worksheet in the DataGridView since there may be more than one worksheet in the workbook. Hope this makes sense.
int sheetIndex = 0;
DataSet ds = new DataSet();
public Form1() {
InitializeComponent();
}
private void btnImport_Click(object sender, EventArgs e) {
var frmDialog = new System.Windows.Forms.OpenFileDialog();
if (frmDialog.ShowDialog() == System.Windows.Forms.DialogResult.OK) {
String constr = String.Format(@"Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=""Excel 12.0 Xml;HDR=YES""", frmDialog.FileName);
OleDbConnection myConnection = new OleDbConnection(constr);
myConnection.Open();
DataTable spreadSheetData = myConnection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
string sheetName = "";
DataTable dt;
OleDbCommand onlineConnection;
OleDbDataAdapter theDataAdapter;
// fill the "DataSet" each table in the set is a worksheet in the Excel file
foreach (DataRow dr in spreadSheetData.Rows) {
sheetName = dr["TABLE_NAME"].ToString();
sheetName = sheetName.Replace("'", "");
if (sheetName.EndsWith("$")) {
onlineConnection = new OleDbCommand("SELECT * FROM [" + sheetName + "]", myConnection);
theDataAdapter = new OleDbDataAdapter(onlineConnection);
dt = new DataTable();
dt.TableName = sheetName;
theDataAdapter.Fill(dt);
ds.Tables.Add(dt);
}
}
myConnection.Close();
scheduleGridView.DataSource = ds.Tables[0];
setLabel();
}
}
private void setLabel() {
label1.Text = "Showing worksheet " + sheetIndex + " Named: " + ds.Tables[sheetIndex].TableName + " out of a total of " + ds.Tables.Count + " worksheets";
}
private void btnNextSheet_Click(object sender, EventArgs e) {
if (sheetIndex == ds.Tables.Count - 1)
sheetIndex = 0;
else
sheetIndex++;
scheduleGridView.DataSource = ds.Tables[sheetIndex];
setLabel();
}
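The worksheet-name filtering done inside the loop above can also be pulled out into a small helper, which makes it easier to check on its own. A rough sketch, assuming the schema table has the TABLE_NAME column described earlier; the SheetNames class and the sample names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class SheetNames
{
    // Removes the single quotes the provider wraps around some sheet names.
    public static string Normalize(string tableName)
    {
        return tableName.Trim('\'');
    }

    // Worksheets end with '$' once the quotes are removed; named ranges do not.
    public static bool IsWorksheet(string tableName)
    {
        return Normalize(tableName).EndsWith("$", StringComparison.Ordinal);
    }

    // Returns the normalized worksheet names found in a schema table such as
    // the one returned by GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null).
    public static List<string> Worksheets(DataTable schema)
    {
        var names = new List<string>();
        foreach (DataRow row in schema.Rows)
        {
            string name = row["TABLE_NAME"].ToString();
            if (IsWorksheet(name))
                names.Add(Normalize(name));
        }
        return names;
    }

    public static void Main()
    {
        // Hypothetical schema rows; real values come from the provider.
        var schema = new DataTable();
        schema.Columns.Add("TABLE_NAME", typeof(string));
        schema.Rows.Add("Sheet1$");
        schema.Rows.Add("'My Data$'");
        schema.Rows.Add("Totals"); // a named range

        foreach (string name in Worksheets(schema))
            Console.WriteLine("SELECT * FROM [" + name + "]");
    }
}
```

Unlike the Replace("'", "") call in the loop above, Trim only removes quotes at the ends of the name; that is a design choice of this sketch, not something the provider requires.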
I solved it. Well there was a workaround. I used the Excel Data Reader found in this thread: How to Convert DataSet to DataTable
Which led me to https://github.com/ExcelDataReader/ExcelDataReader
The readme was fantastic. I just went to Solution Explorer, right-clicked on References, chose Manage NuGet Packages, selected Browse in the new box, and entered ExcelDataReader. Then, in the .cs file, be sure to include "using Excel;" at the top. The code mentioned in the first link was essentially enough, but here's my exact code for those wondering.
var frmDialog = new System.Windows.Forms.OpenFileDialog();
if (frmDialog.ShowDialog() == System.Windows.Forms.DialogResult.OK)
{
/*string strFileName = frmDialog.FileName;
//System.IO.FileInfo spreadSheetFile = new System.IO.FileInfo(strFileName);
System.IO.StreamReader reader = new System.IO.StreamReader(strFileName);
*/
string strFileName = frmDialog.FileName;
FileStream stream = File.Open(strFileName, FileMode.Open, FileAccess.Read);
//1. Reading from a binary Excel file ('97-2003 format; *.xls)
IExcelDataReader excelReader = ExcelReaderFactory.CreateBinaryReader(stream);
//...
//2. Reading from a OpenXml Excel file (2007 format; *.xlsx)
//IExcelDataReader excelReader = ExcelReaderFactory.CreateOpenXmlReader(stream);
//...
//3. DataSet - The result of each spreadsheet will be created in the result.Tables
//DataSet result = excelReader.AsDataSet();
//...
//4. DataSet - Create column names from first row
excelReader.IsFirstRowAsColumnNames = true;
DataSet result = excelReader.AsDataSet();
DataTable data = result.Tables[0];
//5. Data Reader methods
while (excelReader.Read())
{
//excelReader.GetInt32(0);
}
scheduleGridView.DataSource = data;
excelReader.Close();
}
I've got a Excel 97/03 document that has "blabla" in its A1 cell in sheet "Sheet1". I thought the following should be able to extract it:
string con = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=Book1.xls;" + @"Extended Properties='Excel 8.0;HDR=Yes;'";
using (OleDbConnection connection = new OleDbConnection(con))
{
connection.Open();
OleDbDataAdapter da = new OleDbDataAdapter("Select * From [Sheet1$]", connection);
DataTable dt = new DataTable();
da.Fill(dt);
dynamic cellA1 = dt.Rows[0][0].ToString();
}
But cellA1 is empty (""). Does anyone know how to fix this? I should be able to treat it as a database and get cells from it.
"HDR=Yes;" indicates that the first row contains column names, not data. "HDR=No;" indicates the opposite. Maybe that's the issue.
The DataTable is using the first row of data as its headers. To access the A1 cell, simply use the name of the first column:
dynamic cellA1 = dt.Columns[0].ToString();
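If you would rather have A1 come back as ordinary data in dt.Rows[0][0], the header flag can be switched off in the connection string. Below is a minimal sketch of building the string either way; ExcelConnection.ForXls is a made-up helper name, and the provider and "Excel 8.0" values are simply the ones from the question. With HDR=No the provider names the columns F1, F2, and so on.

```csharp
using System;

public static class ExcelConnection
{
    // Builds a Jet connection string for an .xls file.
    // firstRowIsHeader = false makes the first sheet row ordinary data.
    public static string ForXls(string path, bool firstRowIsHeader)
    {
        string hdr = firstRowIsHeader ? "Yes" : "No";
        return "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + path +
               ";Extended Properties='Excel 8.0;HDR=" + hdr + ";'";
    }

    public static void Main()
    {
        // With this string, dt.Rows[0][0] would hold the contents of A1.
        Console.WriteLine(ForXls("Book1.xls", false));
    }
}
```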
I am creating a test framework that should read parameters from an Excel sheet. I would like to be able to:
Get a row count of test records in the sheet
Get column count
Reference a particular cell, e.g. A23, and read or write values to it.
I found this code on the internet. It's great, but it appears to have been written to work with a form component. I don't necessarily need to show the Excel sheet in a data grid.
This is the code I found. It's working OK, but I need to add the functionality above. Thanks for your help :)
using System.Data;
using System.Data.OleDb;
...
OleDbConnection con = new OleDbConnection(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Book1.xls;Extended Properties=Excel 8.0");
OleDbDataAdapter da = new OleDbDataAdapter("select * from MyObject", con);
DataTable dt = new DataTable();
da.Fill(dt);
Count rows:
sheet.Range["A11"].Formula = "COUNT(A1:A10)";
Count columns:
sheet.Range["A12"].Formula = "COUNT(A1:F1)";
.NET Excel component
You can reference particular cells using this query:
Select * from [Sheet1$A1:B10]
For example, the query above accesses cells A1 to B10.
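The range syntax above can be wrapped in a tiny helper so a single cell such as A23 can be requested as a one-cell range. This is only a sketch with made-up names (RangeQuery, ForRange, ForCell). I would assume HDR=No in the connection string here, since with HDR=Yes the first row of the range is consumed as a header.

```csharp
using System;

public static class RangeQuery
{
    // Builds a query for a rectangular range, e.g. [Sheet1$A1:B10].
    public static string ForRange(string sheet, string topLeft, string bottomRight)
    {
        return "SELECT * FROM [" + sheet + "$" + topLeft + ":" + bottomRight + "]";
    }

    // A single cell is expressed as a range that starts and ends on that cell.
    public static string ForCell(string sheet, string cell)
    {
        return ForRange(sheet, cell, cell);
    }

    public static void Main()
    {
        Console.WriteLine(ForRange("Sheet1", "A1", "B10"));
        Console.WriteLine(ForCell("Sheet1", "A23"));
    }
}
```

The resulting string would be passed to OleDbDataAdapter or OleDbCommand in place of the "select * from MyObject" text in the question.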
You can use this method:
private DataTable LoadXLS(string filePath)
{
DataTable table = new DataTable();
DataRow row;
try
{
using (OleDbConnection cnLogin = new OleDbConnection())
{
cnLogin.ConnectionString = "provider=Microsoft.Jet.OLEDB.4.0;Data Source='" + filePath + "';Extended Properties=Excel 8.0;";
cnLogin.Open();
string sQuery = "SELECT * FROM [Sheet1$]";
table.Columns.Add("Tags", typeof(string));
table.Columns.Add("ReplaceWords", typeof(string));
OleDbCommand comDB = new OleDbCommand(sQuery, cnLogin);
using (OleDbDataReader drJobs = comDB.ExecuteReader(CommandBehavior.Default))
{
while (drJobs.Read())
{
row = table.NewRow();
row["Tags"] = drJobs[0].ToString();
row["ReplaceWords"] = drJobs[1].ToString();
table.Rows.Add(row);
}
}
}
return table;
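}
catch (OleDbException ex)
{
// Wrap provider errors (missing file, missing sheet) with the path for easier diagnosis
throw new InvalidOperationException("Could not read Excel file '" + filePath + "'.", ex);
}
}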
And use like this:
DataTable dtXLS = LoadXLS(path);
//and do what you need
If you need to write into Excel you need to check this out http://msdn.microsoft.com/en-us/library/dd264733.aspx
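Once the sheet is in a DataTable, the row count, column count and cell lookups asked for in the question can all be done in memory. Here is a rough sketch; the CellAddress class is made up, and it assumes the table was loaded with HDR=No so that sheet row 1 is table.Rows[0] (with a header row, every row index shifts by one). Values written this way only change the DataTable, not the Excel file.

```csharp
using System;
using System.Data;
using System.Globalization;

public static class CellAddress
{
    // Converts an A1-style address such as "A23" into zero-based indexes
    // (row 22, column 0). Throws FormatException for malformed input.
    public static void Parse(string address, out int row, out int column)
    {
        int i = 0;
        column = 0;
        while (i < address.Length)
        {
            char c = char.ToUpperInvariant(address[i]);
            if (c < 'A' || c > 'Z') break;
            column = column * 26 + (c - 'A' + 1); // A=1 ... Z=26, AA=27
            i++;
        }
        if (i == 0 || i == address.Length)
            throw new FormatException("Not a cell address: " + address);
        row = int.Parse(address.Substring(i), CultureInfo.InvariantCulture) - 1;
        column -= 1;
    }

    // Reads the value at the given address from an in-memory table.
    public static object Read(DataTable table, string address)
    {
        int row, column;
        Parse(address, out row, out column);
        return table.Rows[row][column];
    }

    public static void Main()
    {
        // Hypothetical stand-in for the table returned by LoadXLS(path).
        var table = new DataTable();
        table.Columns.Add("F1", typeof(string));
        table.Columns.Add("F2", typeof(string));
        table.Rows.Add("a1", "b1");
        table.Rows.Add("a2", "b2");

        Console.WriteLine("Rows: " + table.Rows.Count);
        Console.WriteLine("Columns: " + table.Columns.Count);
        Console.WriteLine("B2: " + Read(table, "B2"));
    }
}
```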
An easy way to handle Excel files and operations on them is the following:
Add the Microsoft.Office.Interop.Excel reference to your project (Add Reference... => search under the .NET tab => add the reference).
Create a new Excel application and open the workbook:
Excel.Application application = new Excel.Application();
Excel.Workbook workbook = application.Workbooks.Open(workBookPath);
Excel.Worksheet worksheet = workbook.Sheets[worksheetNumber];
You can get the row and column count with the following lines:
var endColumn = worksheet.Columns.CurrentRegion.EntireColumn.Count;
var endRow = worksheet.Rows.CurrentRegion.EntireRow.Count;
Reading values from a cell or a range of cells can be done in the following way (rowIndex is the number of the row that contains the cells you want to read):
System.Array values = (System.Array)worksheet.get_Range("A" +
rowIndex.ToString(), "D" + rowIndex.ToString()).Cells.Value;
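The System.Array you get back for a multi-cell range is a two-dimensional object array whose indexes start at 1 rather than 0, which trips people up. A small sketch of walking it safely via its bounds follows; the RangeValues class is made up, and the array in Main is a hand-built stand-in for what Interop returns, so no Excel installation is needed to try it. For a single-cell range the value is not an array at all, so this helper does not apply there.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class RangeValues
{
    // Flattens a two-dimensional array (any lower bounds) into strings,
    // row by row. Empty cells (null) become empty strings.
    public static List<string> Flatten(Array values)
    {
        if (values.Rank != 2)
            throw new ArgumentException("Expected a two-dimensional array.", "values");

        var result = new List<string>();
        for (int r = values.GetLowerBound(0); r <= values.GetUpperBound(0); r++)
        {
            for (int c = values.GetLowerBound(1); c <= values.GetUpperBound(1); c++)
            {
                object value = values.GetValue(r, c);
                result.Add(value == null
                    ? string.Empty
                    : Convert.ToString(value, CultureInfo.InvariantCulture));
            }
        }
        return result;
    }

    public static void Main()
    {
        // Hypothetical 1x4 array with 1-based indexes, like a range A..D of one row.
        Array values = Array.CreateInstance(typeof(object), new[] { 1, 4 }, new[] { 1, 1 });
        values.SetValue("Johnny", 1, 1);
        values.SetValue(34d, 1, 2);
        values.SetValue("Married", 1, 3);
        // Cell (1, 4) is left empty on purpose.

        Console.WriteLine(string.Join(" | ", Flatten(values)));
    }
}
```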