How to read an Excel table that is not placed in the first cell - C#

I have an Excel workbook where the Table is placed after the 9th row in the worksheet. How am I supposed to read the Table at that point?
Currently, I am able to read the Excel worksheet using Microsoft.ACE.OLEDB.12.0 provider like this:
OleDbConnection connection = new OleDbConnection();
var connectionString = $"Provider=Microsoft.ACE.OLEDB.12.0; data source={fileName}; Extended Properties=Excel 8.0;";
connection.ConnectionString = connectionString;
connection.Open();
DataTable dbSchema = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
if (dbSchema == null || dbSchema.Rows.Count < 1)
{
throw new Exception("Error: Could not determine the name of the first worksheet.");
}
string firstSheetName = dbSchema.Rows[0]["TABLE_NAME"].ToString();
var adapter = new OleDbDataAdapter($"SELECT * FROM [{firstSheetName}]", connectionString);
var ds = new DataSet();
adapter.Fill(ds, "anyNameHere");
DataTable table = ds.Tables[0];
MessageBox.Show($"No of Records found: {table.Rows.Count}");
What I observe with the above code is that the table is read, but null values are returned for the cells outside the table. To get the intended result I would have to filter out every row before row n (where n is the row at which the table starts).
I would welcome a solution that achieves this by means other than OleDbConnection.

"the Table is placed after the 9th row in the worksheet"
So you know the index of the heading row.
"I would welcome if this can be achieved by other means instead of OleDbConnection"
I use ExcelDataReader.Mapping, which lets you specify the row of the heading. Here is how it works:
I have this data in an excel file
Model
public class SheetData
{
public string Name { set; get; }
public int Value { set; get; }
}
Usage (note that HeadingIndex takes the value 8 = 9 - 1, because the heading is on row 9)
using var stream = File.OpenRead(@"C:\Users\mosul\Desktop\Sample.xlsx");
using var importer = new ExcelImporter(stream);
var sheet = importer.ReadSheet();
sheet.HeadingIndex = 8;
var data = sheet.ReadRows<SheetData>().ToList();
Console.WriteLine(data.Count); // 3
That's it.
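If staying with OleDbConnection is preferred, one possible approach is to select a cell range instead of the whole sheet, so that the heading row of the table is the first row of the range. The sketch below only builds the query text. `RangeQuery`, `Build`, and the last column and last row values are hypothetical names and inputs, not part of the original posts. It assumes the schema table returns the sheet name in the unquoted form `Sheet1$` and that an upper bound for the table's last column and row is known.

```csharp
using System;

public static class RangeQuery
{
    // Builds query text such as: SELECT * FROM [Sheet1$A10:Z500]
    // sheetName is expected to end with '$', as returned by GetOleDbSchemaTable.
    public static string Build(string sheetName, int headingRow, string lastColumn, int lastRow)
    {
        if (headingRow < 1 || lastRow < headingRow)
            throw new ArgumentOutOfRangeException(nameof(headingRow));
        return $"SELECT * FROM [{sheetName}A{headingRow}:{lastColumn}{lastRow}]";
    }

    public static void Main()
    {
        // Table heading assumed to be on row 10, data assumed to end by Z500.
        Console.WriteLine(Build("Sheet1$", 10, "Z", 500));
    }
}
```

The resulting text could be passed to OleDbDataAdapter in place of the whole-sheet query. With a header row enabled (HDR=YES), the first row of the range should then be treated as the column names.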

Related

how to fix duplicate insert of excel records to database

I am inserting an Excel sheet into a database. I have been able to upload with and without sheet names; I just want to know how I can prevent the data from being inserted multiple times. For example, if my sheet has 2 records the loop inserts them twice and the table ends up looking like this:
ID DOB NAME SURNAME
1 1/02/1998 jack turner
2 2/02/1989 jill blue
1 1/02/1998 jack turner
2 2/02/1989 jill blue
Code:
public void up(string sFileName = @"filename") {
string ssqltable = "[dbo].[My_Table]";
//string sFileName = @"filename";
try{
string sConStr = string.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties='Excel 8.0;HDR=YES';", sFileName);
DataTable dt = new DataTable();
SqlConnection sqlconn = new SqlConnection(strConnString);
sqlconn.Open();
using (OleDbConnection connection = new OleDbConnection(sConStr))
{
connection.Open();
dt = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, new object[] { null, null, null, "TABLE" });
var sheets = dt.Rows[0].Field<string>("TABLE_NAME");
foreach(var sheet in sheets) //loop through the collection of sheets ;)
{
//your logic here...
string myexceldataquery = string.Format("Select * FROM [{0}]; ", sheets);
//get data
OleDbConnection oledbconn = new OleDbConnection(sConStr);
OleDbCommand oledbcmd = new OleDbCommand(myexceldataquery, oledbconn);
oledbconn.Open();
OleDbDataReader dr = oledbcmd.ExecuteReader();
{
DataTable table = new DataTable("benlist");
table.Load(dr);
// add two extra columns to data table to be added to database table
table.Columns.Add("name",typeof(string));
table.Columns.Add("surname",typeof(string));
// add data to additional columns
foreach (DataRow row in table.Rows){
row["name"] =Session["Username"].ToString();
row["surname"] = Session["Username"].ToString();
}
SqlBulkCopy bulkcopy = new SqlBulkCopy(strConnString);
bulkcopy.DestinationTableName = ssqltable;
////Mapping Table column
bulkcopy.ColumnMappings.Add("IDNumber", "[IDNumber]");
bulkcopy.ColumnMappings.Add("DOB", "[DOB]");
bulkcopy.ColumnMappings.Add("name", "[name]");
bulkcopy.ColumnMappings.Add("surname", "[surname]");
//sqlcmd.ExecuteNonQuery();
//while (dr.Read())
//{
bulkcopy.WriteToServer(table);
//}
connection.Close();
sqlconn.Close();
}
}
}
}
catch (Exception){}
ClientScript.RegisterStartupScript(GetType(), "alert", "alert('File Uploaded');", true);
}
I expect the data to be inserted once, with no duplicates, e.g.
ID DOB NAME SURNAME
1 1/02/1998 jack turner
2 2/02/1989 jill blue
So I removed the loop, and the data no longer gets duplicated when I insert it into the database table. Thanks.
reference: Getting the first sheet from an Excel document regardless of sheet name with OleDb
using (OleDbConnection connection = new OleDbConnection(sConStr))
{
connection.Open();
/// get sheet name
dt = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, new object[] { null, null, null, "TABLE" });
//var sheets = dt.Rows[0].Field<string>("TABLE_NAME");
// foreach(var sheet in sheets) //loop through the collection of sheets ;)
// {
var sheets = dt.Rows[0].Field<string>("TABLE_NAME");
//your logic here...
string myexceldataquery = string.Format("Select * FROM [{0}]; ", sheets);
//get data
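A likely explanation for why removing the loop helped: `sheets` is a single string (the first sheet name), so `foreach (var sheet in sheets)` iterates over its characters, and the bulk copy runs once per character. The sketch below illustrates only that language behaviour; `SheetLoopDemo` and `CountIterations` are hypothetical names, and the sheet name is an assumed example.

```csharp
using System;

public static class SheetLoopDemo
{
    // Mimics the shape of the original loop: the argument is one string,
    // so foreach yields its characters, not a collection of sheet names.
    public static int CountIterations(string sheets)
    {
        int iterations = 0;
        foreach (var sheet in sheets) // 'sheet' is a char here
        {
            iterations++; // the original code ran one bulk insert per pass
        }
        return iterations;
    }

    public static void Main()
    {
        Console.WriteLine(CountIterations("Sheet1$")); // 7, one pass per character
    }
}
```

To process every sheet, the loop would have to run over `dt.Rows` and read `TABLE_NAME` from each row instead.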

C# csv loaded into datatable missing columns

I've loaded a csv into a datatable but some of the columns are being read as blank when they're not blank. I initially posed this question only having an issue with the header, but I'm now also seeing this issue in my data rows so I need to reask... what is the problem with my dataset and why are some columns being read as a blank?
Currently, this setup will read data for columns 1-7 (I don't need 8-10). Data is populated for all columns correctly except column 4. Strangely, I have two files I've tested, both similar in structure, but one of them has values in column 4 and the other doesn't. The full code is setup to loop through many files, checking for start and end of data, then load to sql server.
Sample CSV:
By OrgID/Location
As of: December 6, 2017 at 10:13 AM
Date Range: summaryYM 2017M08 to 2017M08
"orgid=13778 medType=' '"
"col1","col2","col3","col4","col5","col6","col7","col8","col9","col10"
13778,140242,"2A","2017M08",0,0.058,78,".",".",
13778,140242,"2B","2017M08",0,0.014,19,".",".",
13778,140242,"2C","2017M08",0,0.083,133,".",".",
13778,140242,"2ICU","2017M08",0,0.099,114,".",".",
13778,140242,"3 ICU","2017M08",0,0.076,88,".",".",
code
//open connection to csv
string connStrCsv = string.Format(@"Provider=Microsoft.Jet.OleDb.4.0; Data Source={0};Extended Properties=""Text;HDR=NO;FMT=Delimited"""
, Path.GetDirectoryName(file));
OleDbConnection connCsv = new OleDbConnection(connStrCsv);
connCsv.Open();
//store csv data in datatable
string readCsv = "select * from [" + Path.GetFileName(file) + "]";
OleDbDataAdapter adapter = new OleDbDataAdapter(readCsv, connCsv);
DataSet ds = new DataSet();
adapter.Fill(ds, "sheet1");
DataTable table = ds.Tables["sheet1"];
connCsv.Close();
//find header to define start of data
int start = 0;
StreamReader headerSearch = null;
int incr = 0;
headerSearch = new StreamReader(file);
while (!headerSearch.EndOfStream)
{
incr++;
string line = headerSearch.ReadLine();
if (line.Contains("\"col1\",\"col2\",\"col3\",\"col4\",\"col5\",\"col6\""))
{
start = incr;
}
}
headerSearch.Close();
//load each row of excel into SQL server until first empty row
string sqlConnStr = "Data Source=mysource;Initial Catalog=mydatabase;Trusted_Connection=Yes;Integrated Security=SSPI;";
SqlConnection connSql = new SqlConnection(sqlConnStr);
connSql.Open();
int end = start;
while (table.Rows[end][0].ToString().Length != 0)
{
string sql = string.Format
(@"
delete from schema.table
where ss_col1 = {0}
and ss_col2 = '{1}'
and ss_col3 = '{2}'
and ss_col4 = '{3}';
insert into schema.table
values ({4}
,'{5}'
,'{6}'
,'{7}'
, {8}
,'{9}'
,'{10}'
,getdate()
,user_name()
,getdate()
,user_name());"
//delete statement variables
, table.Rows[end][0].ToString()
, table.Rows[end][2].ToString()
, table.Rows[end][3].ToString()
, infTypes[i]
//insert statement variables
, table.Rows[end][0].ToString()
, table.Rows[end][2].ToString()
, table.Rows[end][3].ToString()
, infTypes[i]
, table.Rows[end][4]
, table.Rows[end][5].ToString()
, table.Rows[end][6]
);
SqlCommand execSql = new SqlCommand(sql, connSql);
execSql.ExecuteNonQuery();
end++;
}
connSql.Close();
Do you have to load it into a datatable?
If "header1","header2","header3","header4","header5","header6" are unique, would it not be easier to just read the csv file until you find them?
Example...
StreamReader Reader = null;
string FilePath = "Your File Path";
try
{
Reader = new StreamReader(FilePath);
// Declared outside the loop so it stays true once the header has been seen.
bool HeaderFound = false;
while (!Reader.EndOfStream)
{
string line = Reader.ReadLine();
if (HeaderFound)
{
//Here is all your data you were looking for.
//Do whatever you need to do with it now.
}
else if (line == "What ever your headers are")
{
// The header line itself is skipped; data starts on the next line.
HeaderFound = true;
}
}
}
catch (Exception e)
{/*Deal with the issues*/}
finally
{
if (Reader != null)
{
Reader.Close();
Reader.Dispose();
}
}

Change cell value of excel file in c#?

I have an Excel file loaded in a C# Windows application.
I want to change the value in an Excel cell, e.g. change the value in cell A10, and save the file.
The Excel file contains multiple sheets.
Can anyone help with this?
var ds = new DataSet();
ds = Parse(fileName);
static DataSet Parse(string fileName)
{
string connectionString = string.Format("provider=Microsoft.Jet.OLEDB.4.0; data source={0};Extended Properties=Excel 8.0;", fileName);
DataSet data = new DataSet();
foreach (var sheetName in GetExcelSheetNames(connectionString))
{
using (OleDbConnection con = new OleDbConnection(connectionString))
{
var dataTable = new DataTable();
string query = string.Format("SELECT * FROM [{0}]", sheetName);
con.Open();
OleDbDataAdapter adapter = new OleDbDataAdapter(query, con);
adapter.Fill(dataTable);
data.Tables.Add(dataTable);
}
}
return data;
}
static string[] GetExcelSheetNames(string connectionString)
{
OleDbConnection con = null;
DataTable dt = null;
con = new OleDbConnection(connectionString);
con.Open();
dt = con.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
if (dt == null)
{
return null;
}
String[] excelSheetNames = new String[dt.Rows.Count];
int i = 0;
foreach (DataRow row in dt.Rows)
{
excelSheetNames[i] = row["TABLE_NAME"].ToString();
i++;
}
return excelSheetNames;
}
}
To specify whether your sheet has a header row, set the HDR value in the connection string. Refer to http://www.connectionstrings.com/excel/ for more information.
If your sheet has a header row, you can refer to the columns by their headers.
If your sheet does not have a header row, use F1, F2, F3 ... Fn, where F1 is the first selected column. If you don't specify where to start, columns A, B, C correspond to F1, F2, F3, etc.
e.g.
SELECT * FROM [Sheet1$] <-- Column A=F1, B=F2 etc.
SELECT * FROM [Sheet1$B1:Z100] <-- Column B=F1, C=F2 etc.
Once you know how to refer to the columns, the rest should be easy. Create an OleDbCommand object and execute your command.
UPDATE [Sheet1$A1:A1] SET F1='TestValue1' <-- trick to update only one cell
UPDATE [Sheet1$] SET F1='TestValue1', F2 = 'some value 2' WHERE WhateverCondition
I have never tried DataSets and DataAdapters with Excel OLE DB, but logically that should work too, because in the end they all boil down to a Command object.
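The single-cell UPDATE trick above could be wrapped roughly as follows. This is a minimal sketch, not a tested implementation: `CellUpdater`, `BuildUpdate`, and `SetCell` are hypothetical names, the connection string is assumed to use HDR=NO so that the column is addressable as F1, and on newer .NET versions the `System.Data.OleDb` package is assumed to be referenced.

```csharp
using System;
using System.Data.OleDb;

public static class CellUpdater
{
    // Builds the one-cell UPDATE text, e.g. UPDATE [Sheet1$A10:A10] SET F1 = ?
    // sheetName is given without the trailing '$' (assumption of this sketch).
    public static string BuildUpdate(string sheetName, string cell)
    {
        return $"UPDATE [{sheetName}${cell}:{cell}] SET F1 = ?";
    }

    // Executes the update and returns the number of rows affected.
    // Assumes connectionString has HDR=NO in its Extended Properties.
    public static int SetCell(string connectionString, string sheetName, string cell, string value)
    {
        using (var con = new OleDbConnection(connectionString))
        using (var cmd = new OleDbCommand(BuildUpdate(sheetName, cell), con))
        {
            // OLE DB parameters are positional; the name is only a label.
            cmd.Parameters.AddWithValue("?", value);
            con.Open();
            return cmd.ExecuteNonQuery();
        }
    }
}
```

A call such as `CellUpdater.SetCell(connectionString, "Sheet1", "A10", "TestValue1")` would then target cell A10 only, since the range restricts the update to that one cell.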

Read and write to Excel in c#

I am creating a test framework that should read parameters from an excel sheet. I would like to be able to :
Get a row count of test records in the sheet
Get column count
Reference a particular cell eg A23 and read or write values to it.
I found this code on the internet. It's great, but it appears to have been written to work with a form component. I don't necessarily need to show the Excel sheet in a data grid.
This is the code I found. It's working OK, but I need to add the functionality above. Thanks for your help :)
using System.Data;
using System.Data.OleDb;
...
OleDbConnection con = new OleDbConnection(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Book1.xls;Extended Properties=Excel 8.0");
OleDbDataAdapter da = new OleDbDataAdapter("select * from MyObject", con);
DataTable dt = new DataTable();
da.Fill(dt);
Count Row
sheet.Range["A11"].Formula = "=COUNT(A1:A10)";
Count Column
sheet.Range["A12"].Formula = "=COUNT(A1:F1)";
.NET Excel component
You can reference particular cells using this query:
Select * from [Sheet1$A1:B10]
For example, the query above accesses cells A1 to B10.
see here
You can use this method:
private DataTable LoadXLS(string filePath)
{
DataTable table = new DataTable();
DataRow row;
using (OleDbConnection cnLogin = new OleDbConnection())
{
cnLogin.ConnectionString = "provider=Microsoft.Jet.OLEDB.4.0;Data Source='" + filePath + "';Extended Properties=Excel 8.0;";
cnLogin.Open();
string sQuery = "SELECT * FROM [Sheet1$]";
table.Columns.Add("Tags", typeof(string));
table.Columns.Add("ReplaceWords", typeof(string));
OleDbCommand comDB = new OleDbCommand(sQuery, cnLogin);
using (OleDbDataReader drJobs = comDB.ExecuteReader(CommandBehavior.Default))
{
while (drJobs.Read())
{
row = table.NewRow();
row["Tags"] = drJobs[0].ToString();
row["ReplaceWords"] = drJobs[1].ToString();
table.Rows.Add(row);
}
}
}
return table;
}
And use like this:
DataTable dtXLS = LoadXLS(path);
//and do what you need
If you need to write into Excel you need to check this out http://msdn.microsoft.com/en-us/library/dd264733.aspx
An easy way to handle Excel files and operations on them is the following:
Add the Microsoft.Office.Interop.Excel reference to your project (Add Reference... => search under the .NET tab => add the reference).
Create a new Excel application and open the workbook:
Excel.Application application = new Excel.Application();
Excel.Workbook workbook = application.Workbooks.Open(workBookPath);
Excel.Worksheet worksheet = workbook.Sheets[worksheetNumber];
You can get the row and column count with the following lines:
var endColumn = worksheet.Columns.CurrentRegion.EntireColumn.Count;
var endRow = worksheet.Rows.CurrentRegion.EntireRow.Count;
Reading values from a cell or a range of cells can be done in the following way (rowIndex is the number of the row containing the cells you want to read):
System.Array values = (System.Array)worksheet.get_Range("A" +
rowIndex.ToString(), "D" + rowIndex.ToString()).Cells.Value;

Convert Excel Range to ADO.NET DataSet or DataTable, etc

I have an Excel spreadsheet that will sit out on a network share drive. It needs to be accessed by my Winforms C# 3.0 application (many users could be using the app and hitting this spreadsheet at the same time). There is a lot of data on one worksheet. This data is broken out into areas that I have named as ranges. I need to be able to access these ranges individually, return each range as a dataset, and then bind it to a grid.
I have found examples that use OLE and have got these to work. However, I have seen some warnings about using this method, plus at work we have been using Microsoft.Office.Interop.Excel as the standard thus far. I don't really want to stray from this unless I have to. Our users will be using Office 2003 on up as far as I know.
I can get the range I need with the following code:
MyDataRange = (Microsoft.Office.Interop.Excel.Range)
MyWorkSheet.get_Range("MyExcelRange", Type.Missing);
The OLE way was nice as it would take my first row and turn those into columns. My ranges (12 total) are for the most part different from each other in number of columns. Didn't know if this info would affect any recommendations.
Is there any way to use Interop and get the returned range back into a dataset?
I don't know about a built-in function, but it shouldn't be difficult to write it yourself. Pseudocode:
DataTable MakeTableFromRange(Range range)
{
table = new DataTable
for every column in range
{
add new column to table
}
for every row in range
{
add new datarow to table
for every column in range
{
table.cells[column, row].value = range[column, row].value
}
}
return table
}
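The pseudocode above could be turned into C# along these lines. This is a sketch under stated assumptions: it works on the two-dimensional `object[,]` that a multi-cell `Range.Value2` is expected to return, so it can be run without Excel; `RangeTable` and `FromValues` are hypothetical names; and the first row is treated as column names to mirror the OLE DB behaviour described in the question.

```csharp
using System;
using System.Data;

public static class RangeTable
{
    // values: 2-D cell array (Interop arrays are 1-based, so bounds are read, not assumed).
    public static DataTable FromValues(object[,] values)
    {
        var table = new DataTable();
        int r0 = values.GetLowerBound(0), r1 = values.GetUpperBound(0);
        int c0 = values.GetLowerBound(1), c1 = values.GetUpperBound(1);

        // First row becomes the column names; blank or repeated names fall back to F1, F2, ...
        for (int c = c0; c <= c1; c++)
        {
            string name = Convert.ToString(values[r0, c]);
            if (string.IsNullOrWhiteSpace(name) || table.Columns.Contains(name))
                name = "F" + (c - c0 + 1);
            table.Columns.Add(name, typeof(object));
        }

        // Remaining rows become data rows; empty cells are stored as DBNull.
        for (int r = r0 + 1; r <= r1; r++)
        {
            DataRow row = table.NewRow();
            for (int c = c0; c <= c1; c++)
                row[c - c0] = values[r, c] ?? DBNull.Value;
            table.Rows.Add(row);
        }
        return table;
    }

    public static void Main()
    {
        var sample = new object[,] { { "Name", "Value" }, { "a", 1.0 }, { "b", 2.0 } };
        DataTable t = FromValues(sample);
        Console.WriteLine($"{t.Columns.Count} columns, {t.Rows.Count} rows");
    }
}
```

With Interop, the input would come from something like `(object[,])MyDataRange.Value2`. A single-cell range returns a scalar rather than an array, so that case would need separate handling.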
I don't know what type of data you have. But for Excel data like that shown at http://www.freeimagehosting.net/image.php?f8d4ef4173.png, you can use the following code to load it into a DataTable.
private void Form1_Load(object sender, EventArgs e)
{
try
{
DataTable sheetTable = loadSingleSheet(@"C:\excelFile.xls", "Sheet1$");
dataGridView1.DataSource = sheetTable;
}
catch (Exception Ex)
{
MessageBox.Show(Ex.Message, "");
}
}
private OleDbConnection returnConnection(string fileName)
{
return new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + fileName + "; Jet OLEDB:Engine Type=5;Extended Properties=\"Excel 8.0;\"");
}
private DataTable loadSingleSheet(string fileName, string sheetName)
{
DataTable sheetData = new DataTable();
using (OleDbConnection conn = this.returnConnection(fileName))
{
conn.Open();
// retrieve the data using data adapter
OleDbDataAdapter sheetAdapter = new OleDbDataAdapter("select * from [" + sheetName + "]", conn);
sheetAdapter.Fill(sheetData);
}
return sheetData;
}
It's worth taking a look at NPOI when it comes to reading and writing Excel 2003 XLS files. NPOI is a lifesaver.
I think you'll have to iterate your range and create DataRows to put in your DataTable.
This question on StackOverflow provides more resources:
Create Excel (.XLS and .XLSX) file from C#
This method does not work well when the same column in the Excel spreadsheet contains both text and numbers. For instance, if Range("A3")=Hello and Range("A7")=5, it reads only Hello, and the value for Range("A7") is DBNull.
private void Form1_Load(object sender, EventArgs e)
{
try
{
DataTable sheetTable = loadSingleSheet(@"C:\excelFile.xls", "Sheet1$");
dataGridView1.DataSource = sheetTable;
}
catch (Exception Ex)
{
MessageBox.Show(Ex.Message, "");
}
}
private OleDbConnection returnConnection(string fileName)
{
return new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + fileName + "; Jet OLEDB:Engine Type=5;Extended Properties=\"Excel 8.0;\"");
}
private DataTable loadSingleSheet(string fileName, string sheetName)
{
DataTable sheetData = new DataTable();
using (OleDbConnection conn = this.returnConnection(fileName))
{
conn.Open();
// retrieve the data using data adapter
OleDbDataAdapter sheetAdapter = new OleDbDataAdapter("select * from [" + sheetName + "]", conn);
sheetAdapter.Fill(sheetData);
}
return sheetData;
}
I made a method which can take already filtered data from Excel.
Taking filtered data in Range format
Worksheet sheet = null;
sheet = (Worksheet)context.cDocumentExcel.Sheets[requiredSheetName];
DataTable dt = new DataTable();
sheet.Activate();
sheet.UsedRange.Select();
List<Range> ranges = new List<Range>();
Range usedrange = sheet.UsedRange;
foreach (Range oneRange in usedrange.SpecialCells(XlCellType.xlCellTypeVisible))
{
ranges.Add(oneRange);
}
dt = (_makeTableFromRange(ranges));
Converting from Range to DataTable
private static DataTable _makeTableFromRange(List<Range> ranges)
{
var table = new DataTable();
foreach (var range in ranges)
{
while (table.Columns.Count < range.Column)
{
table.Columns.Add();
}
while (table.Rows.Count < range.Row)
{
table.Rows.Add();
}
table.Rows[range.Row - 1][range.Column - 1] = range.Value2;
}
//clean from empty rows
var filteredRows = table.Rows.Cast<DataRow>().
Where(row => !row.ItemArray.All(field => field is System.DBNull ||
string.Compare((field as string).Trim(), string.Empty) ==
0));
table = filteredRows.CopyToDataTable();
return table;
}
