Storing a big Excel table into array with C# - c#

I have an Excel table with about 50 columns and over 6000 rows.
I found the following solution to read the data:
https://coderwall.com/p/app3ya/read-excel-file-in-c
It uses Microsoft.Office.Interop.Excel to read the file.
Sadly, it is really slow. Reading a file with only 50 rows allready take about a minute. I never finished loading the 6000 row file.
I then thought about using csv, but the table contains , and ; so this won't be an option.
Can anyone suggest another method?

Apart from my comment-
Here is the method I use in order to read from an Excel file and into a table. You will need to have:
using Microsoft.Office.Interop; using statement, along with adding the correct Microsoft.Office.Interop.Excel reference to your project.
Method:
public DataTable ReadExcel(string fileName, string TableName)
{
DataTable dt = new DataTable();
OleDbConnection conn = new OleDbConnection(#"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileName + ";Extended Properties=\"Excel 8.0\"");
OleDbCommand cmd = new OleDbCommand("SELECT * FROM " + TableName, conn);
try
{
conn.Open();
OleDbDataReader reader = cmd.ExecuteReader();
while (!reader.IsClosed)
{
dt.Load(reader);
}
}
finally
{
conn.Close();
}
return dt;
}
Explanation:
fileName will be the file path to the Excel file you are wanting to read the data form.
TableName will be the Excel Sheet name you are wanting to read data from.
The reason it is written this way, is because C# will read it and treat the Excel file like a database, where instead of sheets, there are tables.
You may need to alter the OleDbConnection(#"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileName + ";Extended Properties=\"Excel 8.0\"");
You can find the proper/correct Provider here: https://www.connectionstrings.com/excel/

If you're only going to read the Excel file, I suggest ExcelDataReader instead of the interop.

Related

SQL Bulk Insert in C# not inserting values

I'm completely new to C#, so I'm sure I'm going to get a lot of comments about how my code is formatted - I welcome them. Please feel free to throw any advice or constructive criticisms you might have along the way.
I'm building a very simple Windows Form App that is eventually supposed to take data from an Excel file of varying size, potentially several times per day, and insert it into a table in SQL Server 2005. Thereafter, a stored procedure within the database takes over to perform various update and insert tasks depending on the values inserted into this table.
For this reason, I've decided to use the SQL Bulk Insert method, since I can't know if the user will only insert 10 rows - or 10,000 - at any given execution.
The function I'm using looks like this:
public void BulkImportFromExcel(string excelFilePath)
{
excelApp = new Excel.Application();
excelBook = excelApp.Workbooks.Open(excelFilePath);
excelSheet = excelBook.Worksheets.get_Item(sheetName);
excelRange = excelSheet.UsedRange;
excelBook.Close(0);
try
{
using (SqlConnection sqlConn = new SqlConnection())
{
sqlConn.ConnectionString =
"Data Source=" + serverName + ";" +
"Initial Catalog=" + dbName + ";" +
"User id=" + dbUserName + ";" +
"Password=" + dbPassword + ";";
using (OleDbConnection excelConn = new OleDbConnection())
{
excelQuery = "SELECT InvLakNo FROM [" + sheetName + "$]";
excelConn.ConnectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + excelFilePath + ";Extended Properties='Excel 8.0;HDR=Yes'";
excelConn.Open();
using (OleDbCommand oleDBCmd = new OleDbCommand(excelQuery, excelConn))
{
OleDbDataReader dataReader = oleDBCmd.ExecuteReader();
using (SqlBulkCopy bulkImport = new SqlBulkCopy(sqlConn.ConnectionString))
{
bulkImport.DestinationTableName = sqlTable;
SqlBulkCopyColumnMapping InvLakNo = new SqlBulkCopyColumnMapping("InvLakNo", "InvLakNo");
bulkImport.ColumnMappings.Add(InvLakNo);
sqlQuery = "IF OBJECT_ID('ImportFromExcel') IS NOT NULL BEGIN SELECT * INTO [" + DateTime.Now.ToString().Replace(" ", "_") + "_ImportFromExcel] FROM ImportFromExcel; DROP TABLE ImportFromExcel; END CREATE TABLE ImportFromExcel (InvLakNo INT);";
using (SqlCommand sqlCmd = new SqlCommand(sqlQuery, sqlConn))
{
sqlConn.Open();
sqlCmd.ExecuteNonQuery();
while (dataReader.Read())
{
bulkImport.WriteToServer(dataReader);
}
}
}
}
}
}
}
catch(Exception ex)
{
MessageBox.Show(ex.ToString());
}
finally
{
excelApp.Quit();
}
}
The function runs without errors or warnings, and if I replace the WriteToServer with manual SQL commands, the rows are inserted; but the bulkImport isn't inserting anything.
NOTE: There is only one field in this example, and in the actual function I'm currently running to test; but in the end there will be dozens and dozens of fields being inserted, and I'll be doing a ColumnMapping for all of them.
Also, as stated, I am aware that my code is probably horrible - please feel free to give me any pointers you deem helpful. I'm ready and willing to learn.
Thanks!
I think it would be a very long and messy answer if I commented on your code and also gave pointer sample codes in the same message, so I decided to divide then into two messages. Comments first:
You are using automation to get what? You already have the sheet name as I see it and worse you are doing app.Quit() at the end. Completely remove that automation code.
If you needed some information from excel (like sheet names, column names) then you could use OleDbConnecton's GetOleDbSchemaTable method.
You might do the mapping basically in 2 ways:
Excel column ordinal to SQL table column name
Excel column name to SQL table column name
both would do. In a generic code, assuming you have column names same in both sources, but their ordinal and count may differ, you could get the column names from OleDbConnection schema table and do the mapping in a loop.
You are dropping and creating a table named "ImportFromExcel" for the purpose of temp data insertion, then why not simply create a temp SQL server table by using a # prefix in table name? OTOH that code piece is a little weird, it would do an import from "ImportFromExcel" if it is there, then drop and create a new one and attempt to do bulk import into that new one. In first run, SqlBulkCopy (SBC) would fill ImportFromExcel and on next run it would be copied to a table named (DateTime.Now ...) and then emptied via drop and create again. BTW, naming:
DateTime.Now.ToString().Replace(" ", "_") + "_ImportFromExcel"
doesn't feel right. While it looks tempting, it is not sortable, probably you would want something like this instead:
DateTime.Now.ToString("yyyyMMddHHmmss") + "_ImportFromExcel"
Or better yet:
"ImportFromExcel_" +DateTime.Now.ToString("yyyyMMddHHmmss")
so you would have something that is sorted and selectable for all the imports as a wildcard or looping for some reason.
Then you are writing to server inside a reader.Read() loop. That is not the way WriteToServer works. You wouldn't do reader.Read() but simply:
sbc.WriteToServer(reader);
In my next message e I will give simple schema reading and a simple SBC sample from excel into a temp table, as well as a suggestion how you should do that instead.
Here is the sample for reading schema information from Excel (here we read the tablenames - sheet names with tables in them):
private IEnumerable<string> GetTablesFromExcel(string dataSource)
{
IEnumerable<string> tables;
using (OleDbConnection con = new OleDbConnection("Provider=Microsoft.ACE.OLEDB.12.0;" +
string.Format("Data Source={0};", dataSource) +
"Extended Properties=\"Excel 12.0;HDR=Yes\""))
{
con.Open();
var schemaTable = con.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
tables = schemaTable.AsEnumerable().Select(t => t.Field<string>("TABLE_NAME"));
con.Close();
}
return tables;
}
And here is a sample that does SBC from excel into a temp table:
void Main()
{
string sqlConnectionString = #"server=.\SQLExpress;Trusted_Connection=yes;Database=Test";
string path = #"C:\Users\Cetin\Documents\ExcelFill.xlsx"; // sample excel sheet
string sheetName = "Sheet1$";
using (OleDbConnection cn = new OleDbConnection(
"Provider=Microsoft.ACE.OLEDB.12.0;Data Source="+path+
";Extended Properties=\"Excel 8.0;HDR=Yes\""))
using (SqlConnection scn = new SqlConnection( sqlConnectionString ))
{
scn.Open();
// create temp SQL server table
new SqlCommand(#"create table #ExcelData
(
[Id] int,
[Barkod] varchar(20)
)", scn).ExecuteNonQuery();
// get data from Excel and write to server via SBC
OleDbCommand cmd = new OleDbCommand(String.Format("select * from [{0}]",sheetName), cn);
SqlBulkCopy sbc = new SqlBulkCopy(scn);
// Mapping sample using column ordinals
sbc.ColumnMappings.Add(0,"[Id]");
sbc.ColumnMappings.Add(1,"[Barkod]");
cn.Open();
OleDbDataReader rdr = cmd.ExecuteReader();
// SqlBulkCopy properties
sbc.DestinationTableName = "#ExcelData";
// write to server via reader
sbc.WriteToServer(rdr);
if (!rdr.IsClosed) { rdr.Close(); }
cn.Close();
// Excel data is now in SQL server temp table
// It might be used to do any internal insert/update
// i.e.: Select into myTable+DateTime.Now
new SqlCommand(string.Format(#"select * into [{0}]
from [#ExcelData]",
"ImportFromExcel_" +DateTime.Now.ToString("yyyyMMddHHmmss")),scn)
.ExecuteNonQuery();
scn.Close();
}
}
While this would work, thinking in the long run, you need column names, and maybe their types differ, it might be an overkill to do this stuff using SBC and you might instead directly do it from MS SQL server's OpenQuery:
SELECT * into ... from OpenQuery(...)
the WriteToServer(IDataReader) is intended to do internally the IDataReader.Read()operation.
using (SqlCommand sqlCmd = new SqlCommand(sqlQuery, sqlConn))
{
sqlConn.Open();
sqlCmd.ExecuteNonQuery();
bulkImport.WriteToServer(dataReader);
}
You can check the MSDN doc on that function, has a working example: https://msdn.microsoft.com/en-us/library/434atets(v=vs.110).aspx

Search Excel sheet from ASP.NET C#

UPDATE:
I have found that this code works! it searches the Excel sheet and only outputs the data I need.
But can anyone explain to me why this works? how does it know that the first line in the spreadsheet is the "index"??
//Coneection String by default empty
string ConStr = "";
//connection string for that file which extantion is .xlsx
ConStr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + "C:\\TestExcel.xlsx" + ";Extended Properties='Excel 12.0 XML;HDR=YES;';";
//making query
string query = "SELECT * FROM [lol$] where ID='i2200'";
//Providing connection
OleDbConnection conn = new OleDbConnection(ConStr);
//checking that connection state is closed or not if closed the
//open the connection
if (conn.State == ConnectionState.Closed)
{
conn.Open();
}
//create command object
OleDbCommand cmd = new OleDbCommand(query, conn);
// create a data adapter and get the data into dataadapter
OleDbDataAdapter da = new OleDbDataAdapter(cmd);
DataSet ds = new DataSet();
//fill the Excel data to data set
da.Fill(ds);
foreach (DataRow row in ds.Tables[0].Rows)
{
lblud.Text = "" + row["Hylde"];
}
OLD
I have been trying to do this for several hours now but no matter what i try, I don't end up with the result i want.
So now im "starting from scratch" again. See if I have approached this incorretly.
Question:
I wan't to create a ASPX website that can search my excel sheet for specific data.
Something like Select * from [Sheet1$] where Column A = i2200
then display only Column B and C from that specific row into a Label / two labels.
See picture here: http://itguruen.dk/EXCEL.png
Does anyone have a simple way of doing this?
Thanks in advance!
Jasper
Have you thought about importing the Excel Spreadsheet into a DataTable, and then analyse that DataTable to populate your labels? You can perform SQL queries on DataTables, so you'll be able to extract the exact data you require quite easily (the hardest part will be importing the Excel Spreadsheet into the DataTable).
There's a very detailed report on this process here: http://www.aspsnippets.com/Articles/Read-and-Import-Excel-File-into-DataSet-or-DataTable-using-C-and-VBNet-in-ASPNet.aspx
Update the post so you can see the solution.
Allthough I dont really know why this works??

Reading from excel file using oledb - empty strings

I'm using oledb to read from excel file.
DataTable sheet1 = new DataTable();
string excelCS = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + path + ";" + "Extended Properties=\"Excel 12.0 Xml;HDR=NO;TypeGuessRows=0;ImportMixedTypes=Text\"";
using (OleDbConnection connection = new OleDbConnection(excelCS))
{
connection.Open();
string selectSql = #"SELECT * FROM [Sheet1$]";
using (OleDbDataAdapter adapter = new OleDbDataAdapter(selectSql, connection))
{
adapter.Fill(sheet1);
}
connection.Close();
}
But there is a problem with some cells of the file.
For some cells I get an empty value instead of text. I tried to put some other text into these cells but it didn't work - I'm still getting empty strings.
But after deleting the column and inserting again my application get the right value of cell. Important is that the problem is not with all cells in the column.
Is this a problem with cell format or something? This excel file will be generated by another system so I won't be able to modify it manually.
Has anybody any sugestions what's wrong and what can I do?
Use IMEX = 1 at the end of your connection string. That will fix your problem.
To always use IMEX=1 is a safer way to retrieve data for mixed data
columns. .."
Remember that, sometimes there are some errors involved using IMEX while you're using Update rather than Selecting.
using this method convert Execel to Dataset without Empty String in c#
public static DataSet ConvertExcelToDataTable(string FileName)
{
DataSet ds = new DataSet();
string strConn = "";
strConn = "Provider=Microsoft.ACE.OLEDB.12.0;" + "Data Source=" + FileName + ";Extended Properties=\"Excel 8.0; HDR=YES; IMEX=1;\"";
OleDbDataAdapter da = new OleDbDataAdapter
("SELECT * FROM [Sheet1$]", strConn);
da.Fill(ds);
return ds;
}
it will return dataset.
I had this issue. What I found was that on the cells that returned blank values the data looked like strings, but the rest of the data looked like numbers, so excel stored the strings in a different place as the numbers. I changed the column format to text and all the data was picked up.
This thread might help with changing the format: Format an Excel column (or cell) as Text in C#?

OleDbDataAdapter not filling all of the rows

Hey I am using the DataAdapter to read an excel file and fill a Data Table with that data.
Here is my query and connection string.
private string Query = "SELECT * FROM Sheet1";
private string ConnectString = "Provider=Microsoft.ACE.OLEDB.12.0;"
+ "Data Source=\"" + Location + "\";"
+ "Extended Properties=\"Excel 12.0 Xml;HDR=YES\"";
OleDbDataAdapter DBAddapter = new OleDbDataAdapter(Query, ConnectString);
DataTable DBTable = new DataTable();
DBAddapter.Fill(DBTable);
The problem is my excel file has 12000 records however its only filling 2502 records into my data table.
Is there a limit on how many records the data adapter can read and write to the data table?
The problem might be that the sheet would contain mixed data and it was only reading numbers. The solution is to specify:
Properties="Excel 12.0;IMEX=1";
IMEX=1 allows the reader to import all data not only numbers.

Inserting a row into Excel Spreadsheet via C# and OleDb

I need to programmatically insert a row into an Excel Spreadsheet multiple times. I need to actually insert a new row and not insert data, that is, I need to actually shift all other rows down by one.
I am currently using OleDB to insert the data itself like so:
//Note I have missed some code out for simplicities sake, this all works fine however
OleDbConnection oledbConn = null;
OleDbCommand cmd = null;
OleDbConnection = new OleDbConnection(connString);
OleDbConnection.Open();
string connString = string.Format("Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties=\"Excel 8.0; \"", TargetFile);
sting InsertCommand = string.Format("INSERT INTO [{0}${1}:{1}] Values({2})", WorksheetName, Coord, valuestring);
cmd = new OleDbCommand(InsertCommand, oledbConn);
cmd.ExecuteNonQuery();
//close etc
I want to be able to insert a row in a similar fashion. Is this possible?
At a glance, you need to specify read write, the default is read only. Perhaps:
"Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=C:\Docs\Test.xls;" & _
"Mode=ReadWrite;Extended Properties=""Excel 8.0;HDR=No"""
At a second glance and re comments, I think Interop might be the best bet.
ipavlic is right, you will be better off using an external third-party library for this. There are several available. OfficeWriter is one example:
http://www.officewriter.com
Once you open a workbook with OfficeWriter, you can use this method on the Worksheet class to insert rows. It mimics Excel's behavior, including updating/stretching formulas and other updates:
public void InsertRows(int rowNumber, int rowCount)

Categories