c# wrong reading from excel with oledb - c#

private void OnCreated(object sender, FileSystemEventArgs e)
{
excelDataSet.Clear();
string extension = Path.GetExtension(e.FullPath);
if (extension == ".xls" || extension == ".xlsx")
{
string ConnectionString = "";
if (extension == ".xls") { ConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0; Data Source = '" + e.FullPath + "';Extended Properties=\"Excel 8.0;HDR=YES;\""; }
if (extension == ".xlsx") { ConnectionString = "Provider=Microsoft.ACE.OLEDB.12.0; Data Source = '" + e.FullPath + "';Extended Properties=\"Excel 12.0;HDR=YES;\""; }
using (OleDbConnection conn = new OleDbConnection(ConnectionString))
{
conn.Open();
OleDbDataAdapter objDA = new OleDbDataAdapter("select * from [Sheet1$]", conn);
objDA.Fill(excelDataSet);
conn.Close();
conn.Dispose();
}
}
}
This is my code. It runs when my FileSystemWatcher fires. The problem is that the Excel file I read has 1 header row and 3 rows with values, but when I run this code and check the dataset's row count I get 9. I have no idea where that 9 comes from. Am I doing something wrong? I've been checking my code for the last 30-35 minutes and still can't find the mistake.
I get the columns right, but the row count is wrong. I don't need the header line, by the way.
Update: my example Excel file had 3 rows and I was getting 9 as the row count. I copied these rows so the file had 24 rows + 1 header row, and Rows.Count returned 24. So it worked fine? Is that normal?

There is a NuGet package called Linq to Excel. I have used it in several projects to query the data inside .csv and .xlsx files without any difficulty, and it is easy to implement. Its performance might be poor, but it can solve your problem.
Here is the documentation of Linq to Excel

I would highly recommend taking a look at the EPPlus library: https://github.com/JanKallman/EPPlus/wiki
I had plenty of trouble with OLEDB until I found EPPlus. It's really easy to use for creating and updating Excel files. There are plenty of good examples out there, like the one below, which is from "How do I iterate through rows in an excel table using epplus?"
var package = new ExcelPackage(new FileInfo("sample.xlsx"));
ExcelWorksheet workSheet = package.Workbook.Worksheets[1];
var start = workSheet.Dimension.Start;
var end = workSheet.Dimension.End;
for (int row = start.Row; row <= end.Row; row++)
{ // Row by row...
for (int col = start.Column; col <= end.Column; col++)
{ // ... Cell by cell...
object cellValue = workSheet.Cells[row, col].Text; // This got me the actual value I needed.
}
}
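For readers who stay with OLEDB: one commonly reported reason for a higher-than-expected row count is that the provider also returns rows Excel still counts as part of the sheet's used range (for example rows that were cleared but not deleted), and these arrive as rows of DBNull values. The question does not confirm that this is the cause here, so treat it as an assumption. A minimal sketch that drops such all-empty rows after Fill; the class and method names are hypothetical:

```csharp
using System;
using System.Data;
using System.Linq;

public static class DataTableCleanup
{
    // Removes rows in which every field is DBNull or whitespace-only text.
    // Returns the number of rows removed.
    public static int RemoveEmptyRows(DataTable table)
    {
        if (table == null) throw new ArgumentNullException(nameof(table));

        // Materialize the list first: rows cannot be removed while enumerating table.Rows.
        var emptyRows = table.Rows.Cast<DataRow>()
            .Where(row => row.ItemArray.All(
                field => field is DBNull || string.IsNullOrWhiteSpace(field.ToString())))
            .ToList();

        foreach (DataRow row in emptyRows)
        {
            table.Rows.Remove(row);
        }
        return emptyRows.Count;
    }
}
```

In the question's code this could be called right after objDA.Fill(excelDataSet), e.g. DataTableCleanup.RemoveEmptyRows(excelDataSet.Tables[0]).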

Related

c# OleDbConnection csv to excel skips first line from csv

When using this code for some reason it skips the first line of the csv file, which are the headers. What am I doing wrong?
string strFileName = path;
OleDbConnection conn = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0; Data Source = " + System.IO.Path.GetDirectoryName(strFileName) + "; Extended Properties = \"Text\"");
conn.Open();
OleDbDataAdapter adapter = new OleDbDataAdapter("SELECT * FROM " + System.IO.Path.GetFileName(strFileName), conn);
DataSet ds = new DataSet("Temp");
adapter.Fill(ds);
DataTable tb = ds.Tables[0];
string data = null;
for (int j = 0; j <= tb.Rows.Count - 1; j++)
{
for (int k = 0; k <= tb.Columns.Count - 1; k++)
{
data = tb.Rows[j].ItemArray[k].ToString();
SaturnAddIn.getInstance().Application.ActiveWorkbook.ActiveSheet.Cells[j + 1, k + 1] = data;
}
}
It will skip the first row of headers, unless you use:
Extended Properties=Text;HDR=No;
But in this case it will treat the first row as a data-row which will probably (at some stage) cause data-type errors.
Normally you would skip the first row, and create the headers in Excel manually.
This comment notes the same behavior when the FULL PATH is passed into the SELECT statement. Since the directory of the file is provided in the OleDbConnection it does not need to be provided a second time.
There are some similar notes at this answer (to a different question) that indicate that the path should be in the connection, as well.
It also recommends using a "real" CSV parser.
I also found that with HDR=YES you can still read the header values from the column names, e.g. table.Columns[0].ColumnName for the first column, by looping over table.Columns.
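As an illustration of the "real CSV parser" suggestion, here is a minimal sketch using TextFieldParser from the Microsoft.VisualBasic assembly (a reference to that assembly is assumed). The CsvLoader class and its firstRowIsHeader parameter are hypothetical names, not part of any answer above. With firstRowIsHeader set to false the first line is kept as an ordinary data row, which is the behaviour the asker wanted.

```csharp
using System;
using System.Data;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CsvLoader
{
    // Reads comma-separated text into a DataTable of string columns.
    // When firstRowIsHeader is false, the first line is kept as data and
    // the columns are named Column1, Column2, ...
    public static DataTable Load(TextReader reader, bool firstRowIsHeader)
    {
        var table = new DataTable();
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;

            bool firstLine = true;
            while (!parser.EndOfData)
            {
                string[] fields = parser.ReadFields();
                if (fields == null) continue;

                if (firstLine)
                {
                    firstLine = false;
                    // Column count is taken from the first line; a later line with
                    // more fields than this makes Rows.Add throw.
                    for (int i = 0; i < fields.Length; i++)
                    {
                        string name = firstRowIsHeader ? fields[i] : "Column" + (i + 1);
                        table.Columns.Add(name, typeof(string));
                    }
                    if (firstRowIsHeader) continue;
                }
                table.Rows.Add(fields);
            }
        }
        return table;
    }

    // Convenience overload for a file on disk.
    public static DataTable LoadFile(string path, bool firstRowIsHeader)
    {
        using (var reader = new StreamReader(path))
        {
            return Load(reader, firstRowIsHeader);
        }
    }
}
```

Because every column is typed as string, no type guessing takes place; duplicate header names would still make Columns.Add throw and need separate handling.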

OleDB, Misses the first character of data

I have CSV-reading code in an ASP.NET application I maintain. The website has been running fine for 3 years, and the CSV-reading code, which uses the Jet OLEDB 4.0 provider, does its work fine, except that once in a while a CSV with more than 4K-5K records causes a problem. Usually the problem is that a record at a random position [random row] is missing its first character.
The CSV file is just a bunch of names and addresses, one per row, in ANSI format. The CSV is comma-separated; no field contains a comma, and no fields are enclosed in single or double quotes. It also doesn't happen often: the same code handles, say, a 70K-record upload fine, but in 3 years only about 3-4 files have had this problem, and we upload about one file daily.
For those who need to see what I did:
using (System.Data.OleDb.OleDbConnection conn = new System.Data.OleDb.OleDbConnection
("Provider=Microsoft.Jet.OLEDB.4.0;Extended Properties='text;HDR=Yes;FMT=Delimited';Data Source=" + HttpContext.Current.Server.MapPath("/System/SaleList/")))
{
string sql_select = "select * from [" + this.FileName + "]";
System.Data.OleDb.OleDbDataAdapter da = new System.Data.OleDb.OleDbDataAdapter();
da.SelectCommand = new System.Data.OleDb.OleDbCommand(sql_select, conn);
DataSet ds = new DataSet();
// Read the First line of File to know the header
string[] lines = System.IO.File.ReadAllLines(HttpContext.Current.Server.MapPath("/System/SaleList/") + FileName);
string header = "";
if (lines.Length > 0)
header = lines[0];
string[] headers = header.Split(',');
CreateSchema(headers, FileName);
da.Fill(ds, "ListData");
DataTable dt = ds.Tables["ListData"];
}
This code works fine except for the issue mentioned. I cut some unrelated parts, so it might not work by copy-paste.
EDIT: More information
I tried ODBC with the Microsoft Text Driver, and then the ACE driver with OleDB. The result is the same with all three drivers.
If I swap the problem record with the preceding row, those rows are read correctly, until the next problem row [if more than one row has the problem in the original file]; if those are the only problem rows, it works fine.
So it looks like something throws off the character counter, but how to make it work reliably is still a puzzle.
EDIT 2: I have submitted it as a bug to Microsoft here: https://connect.microsoft.com/VisualStudio/feedback/details/811869/oledb-ace-driver-12-jet-4-0-or-odbc-text-driver-all-fail-to-read-data-properly-from-csv-text-file
I would suggest you examine a problem file with a hex editor - inspect the line that causes the problem and the line immediately preceding it.
In particular look at the line terminators (CR/LF? CR only? LF only?) and look for any non-printable characters.
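If a hex editor is not at hand, the same inspection can be approximated in code. The sketch below counts each kind of line terminator and flags control bytes; the LineEndingInspector and LineEndingReport names are hypothetical, and it only reports what is in the file without claiming which finding (if any) triggers the driver problem.

```csharp
using System;
using System.IO;

public sealed class LineEndingReport
{
    public int CrLf;                     // "\r\n" pairs
    public int LoneCr;                   // "\r" not followed by "\n"
    public int LoneLf;                   // "\n" not preceded by "\r"
    public int ControlBytes;             // bytes below 0x20 (except TAB, CR, LF) and 0x7F
    public long FirstControlOffset = -1; // byte offset of the first such byte, or -1
}

public static class LineEndingInspector
{
    public static LineEndingReport Inspect(byte[] data)
    {
        var report = new LineEndingReport();
        for (int i = 0; i < data.Length; i++)
        {
            byte b = data[i];
            if (b == 0x0D)
            {
                if (i + 1 < data.Length && data[i + 1] == 0x0A)
                {
                    report.CrLf++;
                    i++; // skip the LF that belongs to this pair
                }
                else
                {
                    report.LoneCr++;
                }
            }
            else if (b == 0x0A)
            {
                report.LoneLf++;
            }
            else if ((b < 0x20 && b != 0x09) || b == 0x7F)
            {
                report.ControlBytes++;
                if (report.FirstControlOffset < 0) report.FirstControlOffset = i;
            }
        }
        return report;
    }

    public static LineEndingReport InspectFile(string path)
    {
        return Inspect(File.ReadAllBytes(path));
    }
}
```

A file that mixes CrLf with LoneCr or LoneLf counts, or that has a non-zero ControlBytes count, would be a natural place to start looking around the problem row.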
Try using the ACE driver instead of JET (it's available on x86 and x64 servers; JET is x86 only!)
using (System.Data.OleDb.OleDbConnection conn
= new System.Data.OleDb.OleDbConnection
("Provider=Microsoft.ACE.OLEDB.12.0;Extended Properties='text;HDR=Yes;FMT=Delimited';Data Source="
+ HttpContext.Current.Server.MapPath("/System/SaleList/")))
{
I had the same problem of OleDB dropping characters; see here:
The characters go missing because the Microsoft.Jet.OLEDB.4.0 driver tries to guess the column data type. In my case it was treating the data as hexadecimal, not alphanumeric.
Problematic oledbProviderString:
oledbProviderString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\"{0}\";Extended Properties=\"Text;HDR=No;FMT=Delimited\"";
To fix the problem I added TypeGuessRows=0:
oledbProviderString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\"{0}\";Extended Properties=\"Text;HDR=No;FMT=Delimited;TypeGuessRows=0\"";
Repro:
Create a Book1.csv file with this content:
KU88,G6,CC
KU88,F7,CC
Step through this code:
private void button1_Click(object sender, EventArgs e)
{
string folder = @"G:\Developers\Folder";
ReproProblem(folder);
}
static string oledbProviderString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\"{0}\";Extended Properties=\"Text;HDR=No;FMT=Delimited\"";
private void ReproProblem(string folderPath)
{
using (OleDbConnection oledbConnection = new OleDbConnection(string.Format(oledbProviderString, folderPath)))
{
string sqlStatement = "Select * from [Book1.csv]";
//open the connection
oledbConnection.Open();
//Create an OleDbDataAdapter for our connection
OleDbDataAdapter adapter = new OleDbDataAdapter(sqlStatement, oledbConnection);
//Create a DataTable and fill it with data
DataTable table = new DataTable();
adapter.Fill(table);
//close the connection
oledbConnection.Close();
}
}
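Another way to avoid type guessing with the Jet/ACE text driver is a schema.ini file placed in the same folder as the CSV, declaring every column as Text. The question's CreateSchema helper may already do something along these lines, but its body is not shown, so the following is an assumption-laden sketch rather than a reconstruction; the SchemaIni class and its parameters are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

public static class SchemaIni
{
    // Builds the schema.ini section for one CSV file, typing every column as Text.
    public static string Build(string csvFileName, string[] columnNames, bool hasHeader)
    {
        var sb = new StringBuilder();
        sb.AppendLine("[" + csvFileName + "]");
        sb.AppendLine("Format=CSVDelimited");
        sb.AppendLine("ColNameHeader=" + (hasHeader ? "True" : "False"));
        for (int i = 0; i < columnNames.Length; i++)
        {
            // Names containing spaces are enclosed in quotation marks.
            string name = columnNames[i].Contains(" ")
                ? "\"" + columnNames[i] + "\""
                : columnNames[i];
            sb.AppendLine("Col" + (i + 1) + "=" + name + " Text");
        }
        return sb.ToString();
    }

    // Writes schema.ini next to the CSV. Note: this overwrites an existing schema.ini,
    // including sections for other files in the same folder.
    public static void Write(string folder, string csvFileName, string[] columnNames, bool hasHeader)
    {
        File.WriteAllText(Path.Combine(folder, "schema.ini"),
            Build(csvFileName, columnNames, hasHeader));
    }
}
```

For the repro file this might be called as SchemaIni.Write(folder, "Book1.csv", new[] { "F1", "F2", "F3" }, false) before opening the connection.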
Why don't you just read the file directly instead:
string path = HttpContext.Current.Server.MapPath("/System/SaleList/") + FileName;
string[] lines = System.IO.File.ReadAllLines(path);
DataTable mdt = new DataTable("ListData");
if (lines.Length > 0)
{
// The first line of the file holds the column headers
foreach (string header in lines[0].Split(','))
{
mdt.Columns.Add(header);
}
// Every following line is one data row
for (int i = 1; i < lines.Length; i++)
{
string[] sep = lines[i].Split(',');
mdt.Rows.Add(sep);
}
}
DataSet ds = new DataSet();
ds.Tables.Add(mdt);
DataTable dt = mdt;
I haven't debugged it. I hope there is no problem, but if there is, I'm here to help.
thank you very much

How to load large amount of data to DataTable

I am exporting data from Excel to a DataTable, but I am getting some performance issues when my Excel file contains large amount of rows...
public DataView LoadFromExcel()
{
Microsoft.Office.Interop.Excel.Application application =
new Microsoft.Office.Interop.Excel.Application();
Workbook workbook = null;
Worksheet worksheet = null;
string filename = null;
OpenFileDialog file = new OpenFileDialog();
if (true == file.ShowDialog())
{
filename = file.FileName;
}
workbook = application.Workbooks.Open(filename, true, true);
worksheet = workbook.Sheets[1];
Range range = worksheet.UsedRange;
int row = range.Rows.Count;
int columns = range.Columns.Count;
System.Data.DataTable dt = new System.Data.DataTable();
for (int i = 1; i <= columns; i++)
{
dt.Columns.Add((range.Cells[1, i] as Range).Value2.ToString());
}
for (row = 2; row <= range.Rows.Count; row++)
{
DataRow dr = dt.NewRow();
for (int column = 1; column <= range.Columns.Count; column++)
{
dr[column - 1] = (range.Cells[row, column] as
Microsoft.Office.Interop.Excel.Range).Value2.ToString();
}
dt.Rows.Add(dr);
dt.AcceptChanges();
}
workbook.Close(true, Missing.Value, Missing.Value);
application.Quit();
return dt.DefaultView;
}
Is there any way I can solve this problem? Please help.
I don't think this is the right approach.
To insert a large amount of data into a table, you should use the "bulk insert" feature of your database, and during the bulk insert you should turn off database logging and rollback features. Otherwise the bulk insert behaves just like a bunch of ordinary inserts.
I know Oracle and SQL Server have this feature, and some NoSQL databases have it too. Since you have not mentioned which database you use, it's worth googling it.
You can do it with the OLEDB provider. I have tried it with 50,000 records. It may help you; try the code below:
// txtPath.Text is the path to the Excel file.
string conString = @"Provider=Microsoft.ACE.OLEDB.12.0;" + "Data Source=" + txtPath.Text + ";" + "Extended Properties=" + "\"" + "Excel 12.0;HDR=YES;" + "\"";
OleDbConnection oleCon = new OleDbConnection(conString);
OleDbCommand oleCmd = new OleDbCommand("SELECT field1, field2, field3 FROM [Sheet1$]", oleCon);
DataTable dt = new DataTable();
oleCon.Open();
dt.Load(oleCmd.ExecuteReader());
oleCon.Close();
You have to take care of a few things:
The sheet must be named Sheet1; otherwise give the proper name in the query.
The workbook should not be open while it is being read.
The column names in the query must match those in the sheet.
The column names must be in the first row of the sheet.
I hope this helps.
Let me know if you need anything more... :)
You can use SqlBulkCopy to perform this operation.
Try reading the values into variables and filtering them first, to avoid sending bad values that can affect your database.
It is a mistake to save unvalidated data to a database, especially MS SQL. Filtering first makes saving easier and preserves the health of your DB.
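Returning to the Interop code in the question: every range.Cells[row, column] access is a separate COM call, and calling AcceptChanges inside the loop adds further overhead. A frequently used alternative is to read the whole range once through Value2, which for a multi-cell range yields a 1-based object[,] array, and then build the table in memory. The sketch below covers only the in-memory step, so it needs no Excel; the RangeTable name is hypothetical, and the first row is assumed to hold the headers, as in the question.

```csharp
using System;
using System.Data;

public static class RangeTable
{
    // values: the 2-D array read from a multi-cell range, e.g. (object[,])range.Value2.
    // Bounds are taken from the array itself, so 1-based and 0-based arrays both work.
    public static DataTable FromValues(object[,] values)
    {
        var table = new DataTable();
        int firstRow = values.GetLowerBound(0), lastRow = values.GetUpperBound(0);
        int firstCol = values.GetLowerBound(1), lastCol = values.GetUpperBound(1);
        if (lastRow < firstRow) return table;

        // First row supplies the column names; blank headers get a generated name.
        for (int c = firstCol; c <= lastCol; c++)
        {
            object header = values[firstRow, c];
            string name = header == null ? "" : header.ToString();
            if (string.IsNullOrWhiteSpace(name)) name = "Column" + (c - firstCol + 1);
            table.Columns.Add(name, typeof(object));
        }

        table.BeginLoadData();
        for (int r = firstRow + 1; r <= lastRow; r++)
        {
            var row = new object[lastCol - firstCol + 1];
            for (int c = firstCol; c <= lastCol; c++)
            {
                // Empty cells come back as null; store them as DBNull.
                row[c - firstCol] = values[r, c] ?? DBNull.Value;
            }
            table.Rows.Add(row);
        }
        table.EndLoadData();
        table.AcceptChanges(); // once, after all rows are loaded
        return table;
    }
}
```

In the question's method this would replace both loops with something like RangeTable.FromValues((object[,])worksheet.UsedRange.Value2). A range consisting of a single cell returns a scalar rather than an array, so that case would need its own handling.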

Read and write to Excel in c#

I am creating a test framework that should read parameters from an Excel sheet. I would like to be able to:
Get a row count of test records in the sheet
Get column count
Reference a particular cell, e.g. A23, and read or write values to it.
I found this code on the internet. It's great, but it appears to have been written to work with a form component. I don't necessarily need to show the Excel sheet in a data grid.
This is the code I found. It's working OK, but I need to add the functionality above. Thanks for your help :)
using System.Data;
using System.Data.OleDb;
...
OleDbConnection con = new OleDbConnection(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Book1.xls;Extended Properties=Excel 8.0");
OleDbDataAdapter da = new OleDbDataAdapter("select * from MyObject", con);
DataTable dt = new DataTable();
da.Fill(dt);
Count rows:
sheet.Range["A11"].Formula = "=COUNT(A1:A10)";
Count columns:
sheet.Range["A12"].Formula = "=COUNT(A1:F1)";
.NET Excel component
You can reference particular cells with a query like this:
Select * from [Sheet1$A1:B10]
For example, the query above reads cells A1 through B10.
see here
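The range address in such a query can be built from numeric row and column indexes. The helper below is a hypothetical sketch (the ExcelRangeQuery name and its parameters are not from any answer); it only produces the SQL text. One caveat worth keeping in mind: with HDR=YES the first row of the selected range is treated as the header, so reading a single cell such as A23 generally calls for HDR=NO in the connection string.

```csharp
using System;
using System.Text;

public static class ExcelRangeQuery
{
    // Converts a 1-based column index to its Excel letter form (1 -> A, 27 -> AA).
    public static string ColumnLetter(int index)
    {
        if (index < 1) throw new ArgumentOutOfRangeException(nameof(index));
        var sb = new StringBuilder();
        while (index > 0)
        {
            index--; // shift to 0-based so that 26 maps to Z rather than rolling over
            sb.Insert(0, (char)('A' + index % 26));
            index /= 26;
        }
        return sb.ToString();
    }

    // Builds "SELECT * FROM [Sheet$A1:B10]" for the given 1-based bounds.
    public static string Build(string sheetName, int firstCol, int firstRow, int lastCol, int lastRow)
    {
        return "SELECT * FROM [" + sheetName + "$"
            + ColumnLetter(firstCol) + firstRow + ":"
            + ColumnLetter(lastCol) + lastRow + "]";
    }
}
```

For cell A23 alone, ExcelRangeQuery.Build("Sheet1", 1, 23, 1, 23) gives a one-cell range; the row and column counts can then be read from the Rows.Count and Columns.Count of the filled DataTable.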
You can use this method:
private DataTable LoadXLS(string filePath)
{
DataTable table = new DataTable();
table.Columns.Add("Tags", typeof(string));
table.Columns.Add("ReplaceWords", typeof(string));
using (OleDbConnection cnLogin = new OleDbConnection())
{
cnLogin.ConnectionString = "provider=Microsoft.Jet.OLEDB.4.0;Data Source='" + filePath + "';Extended Properties=Excel 8.0;";
cnLogin.Open();
string sQuery = "SELECT * FROM [Sheet1$]";
using (OleDbCommand comDB = new OleDbCommand(sQuery, cnLogin))
using (OleDbDataReader drJobs = comDB.ExecuteReader(CommandBehavior.Default))
{
while (drJobs.Read())
{
// Copy the first two columns of each sheet row into the table
DataRow row = table.NewRow();
row["Tags"] = drJobs[0].ToString();
row["ReplaceWords"] = drJobs[1].ToString();
table.Rows.Add(row);
}
}
}
return table;
}
And use it like this:
DataTable dtXLS = LoadXLS(path);
//and do what you need
If you need to write to Excel, check this out: http://msdn.microsoft.com/en-us/library/dd264733.aspx
An easy way to handle Excel files and operate on them is the following:
Add the Microsoft.Office.Interop.Excel reference to your project (Add Reference... => search under the .NET tab => add the reference).
Create a new Excel application and open the workbook:
// assumes: using Excel = Microsoft.Office.Interop.Excel;
Excel.Application application = new Excel.Application();
Excel.Workbook workbook = application.Workbooks.Open(workBookPath);
Excel.Worksheet worksheet = workbook.Sheets[worksheetNumber];
You can get the row and column count with the following lines:
var endColumn = worksheet.Columns.CurrentRegion.EntireColumn.Count;
var endRow = worksheet.Rows.CurrentRegion.EntireRow.Count;
Reading values from a cell or a range of cells can be done as follows (rowIndex is the number of the row containing the cells you want to read):
System.Array values = (System.Array)worksheet.get_Range("A" +
rowIndex.ToString(), "D" + rowIndex.ToString()).Cells.Value;

Convert Excel Range to ADO.NET DataSet or DataTable, etc

I have an Excel spreadsheet that will sit out on a network share drive. It needs to be accessed by my Winforms C# 3.0 application (many users could be using the app and hitting this spreadsheet at the same time). There is a lot of data on one worksheet. This data is broken out into areas that I have named as ranges. I need to be able to access these ranges individually, return each range as a dataset, and then bind it to a grid.
I have found examples that use OLE and have got these to work. However, I have seen some warnings about using this method, plus at work we have been using Microsoft.Office.Interop.Excel as the standard thus far. I don't really want to stray from this unless I have to. Our users will be using Office 2003 on up as far as I know.
I can get the range I need with the following code:
MyDataRange = (Microsoft.Office.Interop.Excel.Range)
MyWorkSheet.get_Range("MyExcelRange", Type.Missing);
The OLE way was nice as it would take my first row and turn those into columns. My ranges (12 total) are for the most part different from each other in number of columns. Didn't know if this info would affect any recommendations.
Is there any way to use Interop and get the returned range back into a dataset?
I don't know about a built-in function, but it shouldn't be difficult to write it yourself. Pseudocode:
DataTable MakeTableFromRange(Range range)
{
table = new DataTable
for every column in range
{
add new column to table
}
for every row in range
{
add new datarow to table
for every column in range
{
table.cells[column, row].value = range[column, row].value
}
}
return table
}
I don't know what type of data you have. But for Excel data like that shown in this link, http://www.freeimagehosting.net/image.php?f8d4ef4173.png, you can use the following code to load it into a DataTable.
private void Form1_Load(object sender, EventArgs e)
{
try
{
DataTable sheetTable = loadSingleSheet(@"C:\excelFile.xls", "Sheet1$");
dataGridView1.DataSource = sheetTable;
}
catch (Exception Ex)
{
MessageBox.Show(Ex.Message, "");
}
}
private OleDbConnection returnConnection(string fileName)
{
return new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + fileName + "; Jet OLEDB:Engine Type=5;Extended Properties=\"Excel 8.0;\"");
}
private DataTable loadSingleSheet(string fileName, string sheetName)
{
DataTable sheetData = new DataTable();
using (OleDbConnection conn = this.returnConnection(fileName))
{
conn.Open();
// retrieve the data using data adapter
OleDbDataAdapter sheetAdapter = new OleDbDataAdapter("select * from [" + sheetName + "]", conn);
sheetAdapter.Fill(sheetData);
}
return sheetData;
}
It's worth taking a look at NPOI when it comes to reading/writing Excel 2003 XLS files. NPOI is a life saver.
I think you'll have to iterate your range and create DataRows to put in your DataTable.
This question on StackOverflow provides more resources:
Create Excel (.XLS and .XLSX) file from C#
This method does not work well when the same column in the Excel spreadsheet contains both text and numbers. For instance, if Range("A3")=Hello and Range("A7")=5, then it reads only Hello, and the value for Range("A7") is DBNull:
private void Form1_Load(object sender, EventArgs e)
{
try
{
DataTable sheetTable = loadSingleSheet(@"C:\excelFile.xls", "Sheet1$");
dataGridView1.DataSource = sheetTable;
}
catch (Exception Ex)
{
MessageBox.Show(Ex.Message, "");
}
}
private OleDbConnection returnConnection(string fileName)
{
return new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + fileName + "; Jet OLEDB:Engine Type=5;Extended Properties=\"Excel 8.0;\"");
}
private DataTable loadSingleSheet(string fileName, string sheetName)
{
DataTable sheetData = new DataTable();
using (OleDbConnection conn = this.returnConnection(fileName))
{
conn.Open();
// retrieve the data using data adapter
OleDbDataAdapter sheetAdapter = new OleDbDataAdapter("select * from [" + sheetName + "]", conn);
sheetAdapter.Fill(sheetData);
}
return sheetData;
}
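A commonly used mitigation for the mixed-type problem is to add IMEX=1 to Extended Properties, which asks the driver to read columns with intermixed types as text. How well it works still depends on how many rows the driver samples, so treat it as something to try rather than a guaranteed fix. The sketch below only assembles the connection string; the ExcelConnectionStrings name and its parameters are hypothetical, and paths containing semicolons or quotes are not handled.

```csharp
using System;
using System.IO;

public static class ExcelConnectionStrings
{
    // Builds an OLEDB connection string for .xls (Jet) or .xlsx (ACE) files.
    // mixedAsText adds IMEX=1 so that columns mixing text and numbers are read as text.
    public static string Build(string filePath, bool hasHeader, bool mixedAsText)
    {
        bool legacy = Path.GetExtension(filePath).ToLowerInvariant() == ".xls";
        string provider = legacy ? "Microsoft.Jet.OLEDB.4.0" : "Microsoft.ACE.OLEDB.12.0";
        string version = legacy ? "Excel 8.0" : "Excel 12.0 Xml";

        string extended = version
            + ";HDR=" + (hasHeader ? "YES" : "NO")
            + (mixedAsText ? ";IMEX=1" : "");

        return "Provider=" + provider
            + ";Data Source=" + filePath
            + ";Extended Properties=\"" + extended + "\"";
    }
}
```

In the code above, returnConnection could then be written as new OleDbConnection(ExcelConnectionStrings.Build(fileName, true, true)).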
I wrote a method that takes already-filtered data from Excel.
Taking filtered data in Range format:
Worksheet sheet = null;
sheet = (Worksheet)context.cDocumentExcel.Sheets[requiredSheetName];
DataTable dt = new DataTable();
sheet.Activate();
sheet.UsedRange.Select();
List<Range> ranges = new List<Range>();
Range usedrange = sheet.UsedRange;
foreach (Range oneRange in usedrange.SpecialCells(XlCellType.xlCellTypeVisible))
{
ranges.Add(oneRange);
}
dt = (_makeTableFromRange(ranges));
Converting from Range to DataTable
private static DataTable _makeTableFromRange(List<Range> ranges)
{
var table = new DataTable();
foreach (var range in ranges)
{
while (table.Columns.Count < range.Column)
{
table.Columns.Add();
}
while (table.Rows.Count < range.Row)
{
table.Rows.Add();
}
table.Rows[range.Row - 1][range.Column - 1] = range.Value2;
}
//clean from empty rows
var filteredRows = table.Rows.Cast<DataRow>().
Where(row => !row.ItemArray.All(field => field is System.DBNull ||
string.IsNullOrWhiteSpace(field.ToString()))).ToList();
// CopyToDataTable throws on an empty sequence, so fall back to an empty clone
table = filteredRows.Count > 0 ? filteredRows.CopyToDataTable() : table.Clone();
return table;
}
