I have a problem looping through an Excel file.
I want to write code that can process multiple Excel files automatically.
Each file has a fixed header, so the "real" data begins at row 15.
I'm trying to use "UsedRange", but I don't really understand the documentation.
Currently, I have this:
var excel = new Excel.Application();
var wkb = OpenBook(excel, _myExcelFile, true, false, false);
var sheet = wkb.Sheets["B.C"] as Excel.Worksheet;
var usedRange = sheet.UsedRange;
var i = 0;
foreach (Excel.Range row in sheet.UsedRange.Rows)
{
i++;
// I get data like this (for column 2 for example) :
// Convert.ToString(row.Cells[i, 2].Value);
}
The problem is that my Excel file has more than 3000 rows, but the loop only returns 1800+ of them, and I can't figure out why.
I think the problem is with the "UsedRange" property, but I don't know what it is.
How can I loop through ALL rows in my file?
Another option is to load your Excel sheet into a DataTable. The following is untested (I don't have Excel on this PC):
System.Data.OleDb.OleDbConnection MyConnection;
System.Data.DataTable DtSet;
System.Data.OleDb.OleDbDataAdapter MyCommand;
MyConnection = new System.Data.OleDb.OleDbConnection("provider=Microsoft.Jet.OLEDB.4.0;Data Source='c:\\csharp.net-informations.xls';Extended Properties=Excel 8.0;");
MyCommand = new System.Data.OleDb.OleDbDataAdapter("select * from [Sheet1$]", MyConnection);
MyCommand.TableMappings.Add("Table", "TestTable");
DtSet = new System.Data.DataTable();
MyCommand.Fill(DtSet);
MyConnection.Close();
You can then run LINQ queries on it, for example:
var x = DtSet.AsEnumerable().Where(r => r["My Field"].ToString() == "Pick me");
or just use it like a normal DataTable.
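Because the question says the real data starts at row 15, the fixed header rows still have to be dropped once the sheet is in a DataTable. A minimal sketch, assuming the sheet was read so that every worksheet row became a table row (for example with HDR=No in the connection string); `ExcelTableHelpers`, `SkipHeaderRows` and `headerRowCount` are hypothetical names, not part of any library:

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class ExcelTableHelpers
{
    // Returns the rows that follow the fixed header block.
    // Assumption: headerRowCount is 14 when data starts on worksheet row 15
    // and the provider did not consume row 1 as column names (HDR=No).
    // On .NET Framework, AsEnumerable() needs a reference to
    // System.Data.DataSetExtensions.
    public static IEnumerable<DataRow> SkipHeaderRows(DataTable table, int headerRowCount)
    {
        if (table == null) throw new ArgumentNullException(nameof(table));
        return table.AsEnumerable().Skip(headerRowCount);
    }
}
```

It could be used as `foreach (DataRow row in ExcelTableHelpers.SkipHeaderRows(DtSet, 14))`, reading the second column with `Convert.ToString(row[1])` (column indexes on a DataRow are zero-based). If the provider treats the first row as column names, one fewer row needs to be skipped.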
I want to read a lot of cells from Excel into a two-dimensional array in C#.
Using Microsoft.Office.Interop.Excel and reading cells one by one is too slow. I know how to write an array to a range (Microsoft.Office.Interop.Excel really slow), but I would like to do it in the opposite direction.
_Excel.Application xlApp = new _Excel.Application();
_Excel.Workbook xlWorkBook;
_Excel.Worksheet xlWorkSheet;
object misValue = System.Reflection.Missing.Value;
xlWorkBook = xlApp.Workbooks.Open(path);
xlWorkSheet = xlWorkBook.Worksheets["Engineering BOM"];
_Excel.Range range = (_Excel.Range)xlWorkSheet.Cells[1, 1];
range = range.get_Resize(13000, 9);
string[,] indexMatrix = new string[13000, 9];
// below code should be much faster
for (int i = 1; i < 1300; i++)
{
for (int j = 1; j < 9; j++)
{
indexMatrix[i, j] = xlWorkSheet.Cells[i, j].Value2;
}
}
As a result, I want to have the values from the cell range in an array (the range size is exactly the same as the array size). Right now the app reads cell by cell and writes the data to the array, but that is too slow. Is there any way to copy a whole range directly?
Thank you in advance :)
You can try this. It should be faster, but note:
You have to use a DataTable (in this case it is better to use a DataTable instead of a multidimensional array).
You don't need to care about the range anymore.
So what are we going to do? Connect to the Excel file, run a query that selects all the data, and fill a DataTable. What do we need? A few lines of code.
First we declare our connection string:
For Excel 2007 or above (*.XLSX files)
string connectionString = string.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=\"Excel 12.0 Xml;HDR=No;IMEX=1\";", fullPath);
For Excel 2003 (*.XLS files)
string connectionString = string.Format("Provider=Microsoft.Jet.OLEDB.4.0; data source={0}; Extended Properties=\"Excel 8.0;HDR=No;IMEX=1\";", fullPath);
where fullPath is the full file path of your excel file
Now we have to create the connection and fill the data table:
OleDbConnection SQLConn = new OleDbConnection(connectionString);
SQLConn.Open();
OleDbDataAdapter SQLAdapter = new OleDbDataAdapter();
string sql = "SELECT * FROM [" + sheetName + "$]";
OleDbCommand selectCMD = new OleDbCommand(sql, SQLConn);
SQLAdapter.SelectCommand = selectCMD;
DataTable dtXLS = new DataTable(); // receives every row returned by the query
SQLAdapter.Fill(dtXLS);
SQLConn.Close();
where sheetName is your sheet name, and dtXLS is the DataTable populated with all your Excel values.
This should be faster.
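The two connection strings above differ only in provider and Excel version, so they can be chosen from the file extension. A small sketch under that assumption; `ExcelConnectionStrings`, `Build` and `hasHeaderRow` are hypothetical names, and the provider values are copied from the strings shown in this answer:

```csharp
using System;
using System.IO;

public static class ExcelConnectionStrings
{
    // Picks the OLE DB connection string by file extension.
    // Assumption: .xlsx uses the ACE 12.0 provider and .xls uses Jet 4.0,
    // exactly as in the two strings given in the answer.
    public static string Build(string fullPath, bool hasHeaderRow)
    {
        string hdr = hasHeaderRow ? "Yes" : "No";
        string ext = Path.GetExtension(fullPath);

        if (string.Equals(ext, ".xlsx", StringComparison.OrdinalIgnoreCase))
        {
            return string.Format(
                "Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=\"Excel 12.0 Xml;HDR={1};IMEX=1\";",
                fullPath, hdr);
        }
        if (string.Equals(ext, ".xls", StringComparison.OrdinalIgnoreCase))
        {
            return string.Format(
                "Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties=\"Excel 8.0;HDR={1};IMEX=1\";",
                fullPath, hdr);
        }
        throw new NotSupportedException("Unsupported Excel file extension: " + ext);
    }
}
```

With this helper, `string connectionString = ExcelConnectionStrings.Build(fullPath, false);` would reproduce the HDR=No strings used above.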
I guess that the range more or less defines a 'data table'. If that is right, then the fastest way would be to read it as data using OleDb or ODBC (which doesn't even need Excel to be installed):
DataTable tbl = new DataTable();
using (OleDbConnection con = new OleDbConnection(@"Provider=Microsoft.ACE.OLEDB.12.0;" +
$"Data Source={path};" +
@"Extended Properties=""Excel 12.0;HDR=Yes"""))
using (OleDbCommand cmd = new OleDbCommand(@"Select * from [Engineering BOM$A1:i13000]", con))
{
con.Open();
tbl.Load(cmd.ExecuteReader());
}
If it is not, then you could do this:
Excel.Application xl = new Excel.Application();
var wb = xl.Workbooks.Open(path);
Excel.Worksheet ws = (Excel.Worksheet)wb.Worksheets["Engineering BOM"];
var v = ws.Range["A1:I13000"].Value;
(Not sure whether Excel itself can handle such a big array allocation.)
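For a multi-cell range, the value comes back through interop as a two-dimensional object array whose indexes start at 1, whereas the question wants a zero-based string[,]. A hedged sketch of that conversion; `RangeConversion` and `ToStringMatrix` are hypothetical names, and mapping empty cells (null) to empty strings is an assumption:

```csharp
using System;
using System.Globalization;

public static class RangeConversion
{
    // Copies a 2-D array with arbitrary lower bounds (Excel interop uses 1)
    // into a zero-based string[,] of the same dimensions.
    public static string[,] ToStringMatrix(Array cells)
    {
        if (cells == null) throw new ArgumentNullException(nameof(cells));
        if (cells.Rank != 2) throw new ArgumentException("Expected a 2-D array.", nameof(cells));

        int rowBase = cells.GetLowerBound(0);
        int colBase = cells.GetLowerBound(1);
        int rows = cells.GetLength(0);
        int cols = cells.GetLength(1);

        var result = new string[rows, cols];
        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                object value = cells.GetValue(r + rowBase, c + colBase);
                // Assumption: empty cells (null) become empty strings.
                result[r, c] = Convert.ToString(value, CultureInfo.InvariantCulture) ?? string.Empty;
            }
        }
        return result;
    }
}
```

It could be called as `string[,] indexMatrix = RangeConversion.ToStringMatrix((Array)ws.Range["A1:I13000"].Value);`. Note that a single-cell range returns a scalar rather than an array, so the cast only applies to multi-cell ranges.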
I am trying to access data from Excel in C#. Ideally I want to put the data into a list or a series collection. I was using this tutorial - http://www.aspsnippets.com/Articles/Read-Excel-file-using-OLEDB-Data-Provider-in-C-Net.aspx.
It was very helpful but I think he missed out the data adapter part. Here is the code I got following his example.
string connectionString = null;
connectionString = "Provider = Microsoft.ACE.OLEDB.12.0; Data Source = P:\\Visual Studio 2012\\Projects\\SmartSheetAPI\\SmartSheetAPI\\bin\\Debug\\OUTPUT.xls; Extended Properties = 'excel 12.0 Xml; HDR=YES; IMEX=1;';";
//Establish Connection
string dataSource = "P:\\Visual Studio 2012\\Projects\\SmartSheetAPI\\SmartSheetAPI\\bin\\Debug\\OUTPUT.xls;";
string excelConnection = "Provider=Microsoft.ACE.OLEDB.12.0; Data Source = " + dataSource + " Extended Properties='Excel 8.0; HDR=Yes'";
OleDbConnection connExcel = new OleDbConnection(connectionString);
OleDbCommand cmdExcel = new OleDbCommand();
cmdExcel.Connection = connExcel;
//Accessing Sheets
connExcel.Open();
DataTable dtExcelSchema;
dtExcelSchema = connExcel.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
connExcel.Close();
//access excel Sheets (tables in database)
DataSet dataset = new DataSet();
string SheetName = dtExcelSchema.Rows[0]["TABLE_NAME"].ToString();
cmdExcel.CommandText = "SELECT * From [" + SheetName + "]";
da.SelectCommand = cmdExcel;
da.Fill(dataset);
connExcel.Close();
If you look at the bottom three lines, you will notice he uses da.SelectCommand and da.Fill to fill the dataset. But I think this requires a data adapter, and he doesn't have one in his example. I have tried creating a data adapter as below:
SqlDataAdapter dataadapter = new SqlDataAdapter();
But I get an error stating: cannot implicitly convert type 'System.Data.OleDb.OleDbCommand' to 'System.Data.SqlClient.SqlCommand'.
I know it is working right up to the select statement. Can someone help me? I basically just want to be able to access the information I am getting in the select statement.
Accessing Excel data through an OleDb connection is always a headache. You can try third-party components instead, like Aspose. Usage is very simple. You can try the following code after adding a reference to the component in your project.
//Creating a file stream containing the Excel file to be opened
FileStream fstream = new FileStream("C:\\book1.xls", FileMode.Open);
//Instantiating a Workbook object
//Opening the Excel file through the file stream
Workbook workbook = new Workbook(fstream);
//Accessing the first worksheet in the Excel file
Worksheet worksheet = workbook.Worksheets[0];
//Exporting the contents of 7 rows and 2 columns starting from 1st cell to DataTable
DataTable dataTable = worksheet.Cells.ExportDataTable(0, 0, 7, 2, true);
//Binding the DataTable with DataGrid
dataGrid1.DataSource = dataTable;
//Closing the file stream to free all resources
fstream.Close();
You need an OleDbDataAdapter, not a SqlDataAdapter. So, do this:
OleDbDataAdapter da = new OleDbDataAdapter(cmdExcel);
da.Fill(dataset);
Excel is an OLE DB data source, so the classes you should be using are generally prefixed with OleDb, just like the ones for SQL Server database connectivity and manipulation are prefixed with Sql.
Documentation
I am reading an .xlsx file using C# like this:
string strConn = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileName +
";Extended Properties=\"Excel 12.0;HDR=No;IMEX=1\";";
var output = new DataSet();
using (var conn = new OleDbConnection(strConn))
{
conn.Open();
var dt = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, new object[] { null, null, null, "TABLE" });
foreach (DataRow row in dt.Rows)
{
string sheet = row["TABLE_NAME"].ToString();
var cmd = new OleDbCommand("SELECT * FROM [+"+sheet+"+]", conn);
cmd.CommandType = CommandType.Text;
OleDbDataAdapter xlAdapter = new OleDbDataAdapter(cmd);
xlAdapter.Fill(output,"School");
}
}
But I am getting an error at xlAdapter.Fill(output,"School");
The error is:
The Microsoft Office Access database engine could not find the object '+_xlnm.Print_Area+'. Make sure the object exists and that you spell its name and the path name correctly.
I am not able to figure out what is going wrong in the code.
I believe your sheet is named _xlnm.Print_Area.
Try changing this line
var cmd = new OleDbCommand("SELECT * FROM [+"+sheet+"+]", conn);
to
var cmd = new OleDbCommand("SELECT * FROM ["+sheet+"]", conn);
When you define a print area in your sheet, a "_xlnm.Print_Area" entry is automatically added alongside the sheet. Please remove the print area from the Excel sheet or use the following code:
if (!dr["TABLE_NAME"].ToString().Contains("_xlnm#Print_Area"))
{
obj.SheetName = dr["TABLE_NAME"].ToString();
lst.Add(obj);
}
The variable sheet contains the value +_xlnm.Print_Area+.
This +_xlnm.Print_Area+ object does not actually exist.
That's why the error occurs.
Check that object.
I would check what values you are getting from row["TABLE_NAME"].ToString(). Alternatively, you can try the OpenXML SDK: How to: Parse and read a large spreadsheet document (Open XML SDK)
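One way to act on these answers is to filter the TABLE_NAME values before querying them. A rough sketch, assuming worksheet entries end with `$` (or `$'` when the name is quoted) and that built-in names such as print areas contain `_xlnm#`, as in the check shown earlier in this thread; `ExcelSchemaFilter` and `IsWorksheetName` are hypothetical names:

```csharp
using System;

public static class ExcelSchemaFilter
{
    // Returns true when a TABLE_NAME value looks like a real worksheet.
    // Assumptions: worksheets are listed as "Sheet1$" or "'My Sheet$'",
    // and built-in named ranges (print areas etc.) contain "_xlnm#".
    public static bool IsWorksheetName(string tableName)
    {
        if (string.IsNullOrEmpty(tableName)) return false;
        if (tableName.Contains("_xlnm#")) return false;

        return tableName.EndsWith("$", StringComparison.Ordinal)
            || tableName.EndsWith("$'", StringComparison.Ordinal);
    }
}
```

Inside the loop from the question this would read `if (!ExcelSchemaFilter.IsWorksheetName(sheet)) continue;` before the command is created.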
I am creating a test framework that should read parameters from an Excel sheet. I would like to be able to:
Get a row count of test records in the sheet
Get a column count
Reference a particular cell, e.g. A23, and read or write values to it.
I found this code on the internet. It's great, but it appears to have been written to work with a form component. I don't necessarily need to show the Excel sheet in a data grid.
This is the code I found. It's working OK, but I need to add the functionality above. Thanks for your help :)
using System.Data;
using System.Data.OleDb;
...
OleDbConnection con = new OleDbConnection(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Book1.xls;Extended Properties=Excel 8.0");
OleDbDataAdapter da = new OleDbDataAdapter("select * from MyObject", con);
DataTable dt = new DataTable();
da.Fill(dt);
Count rows:
sheet.Range["A11"].Formula = "=COUNT(A1:A10)";
Count columns:
sheet.Range["A12"].Formula = "=COUNT(A1:F1)";
.NET Excel component
You can reference particular cells using a query like this:
Select * from [Sheet1$A1:B10]
For example, the query above accesses cells A1 to B10.
see here
You can use this method:
private DataTable LoadXLS(string filePath)
{
DataTable table = new DataTable();
table.Columns.Add("Tags", typeof(string));
table.Columns.Add("ReplaceWords", typeof(string));
string connectionString = "provider=Microsoft.Jet.OLEDB.4.0;Data Source='" + filePath + "';Extended Properties=Excel 8.0;";
string sQuery = "SELECT * FROM [Sheet1$]";
// The using blocks close the connection, command and reader even if an exception is thrown
using (OleDbConnection cnLogin = new OleDbConnection(connectionString))
using (OleDbCommand comDB = new OleDbCommand(sQuery, cnLogin))
{
cnLogin.Open();
using (OleDbDataReader drJobs = comDB.ExecuteReader(CommandBehavior.Default))
{
while (drJobs.Read())
{
// Copy the first two columns of each sheet row into the table
DataRow row = table.NewRow();
row["Tags"] = drJobs[0].ToString();
row["ReplaceWords"] = drJobs[1].ToString();
table.Rows.Add(row);
}
}
}
return table;
}
And use it like this:
DataTable dtXLS = LoadXLS(path);
//and do what you need
If you need to write to Excel, check this out: http://msdn.microsoft.com/en-us/library/dd264733.aspx
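Once the sheet is in a DataTable, the row and column counts asked for in the question are available as `dtXLS.Rows.Count` and `dtXLS.Columns.Count`. Reading a cell by an A1-style reference such as A23 needs a small translation step. A minimal sketch, assuming table row 0 corresponds to worksheet row 1 (sheet loaded without a header row) and table columns follow sheet columns; `SheetTable`, `ColumnIndex` and `GetCell` are hypothetical names:

```csharp
using System;
using System.Data;
using System.Globalization;

public static class SheetTable
{
    // Converts column letters ("A", "Z", "AA") to a zero-based column index.
    public static int ColumnIndex(string letters)
    {
        if (string.IsNullOrEmpty(letters)) throw new FormatException("Missing column letters.");
        int index = 0;
        foreach (char ch in letters.ToUpperInvariant())
        {
            if (ch < 'A' || ch > 'Z') throw new FormatException("Invalid column: " + letters);
            index = index * 26 + (ch - 'A' + 1);
        }
        return index - 1;
    }

    // Reads the cell addressed by an A1-style reference such as "A23".
    // Assumption: table row 0 is worksheet row 1 (no header row consumed).
    public static object GetCell(DataTable table, string cellRef)
    {
        if (table == null) throw new ArgumentNullException(nameof(table));
        if (string.IsNullOrEmpty(cellRef)) throw new FormatException("Empty cell reference.");

        int split = 0;
        while (split < cellRef.Length && char.IsLetter(cellRef[split])) split++;
        if (split == 0 || split == cellRef.Length)
            throw new FormatException("Invalid cell reference: " + cellRef);

        int col = ColumnIndex(cellRef.Substring(0, split));
        int row = int.Parse(cellRef.Substring(split), CultureInfo.InvariantCulture) - 1;
        return table.Rows[row][col];
    }
}
```

For example, `object value = SheetTable.GetCell(dtXLS, "A23");`. This only reads the in-memory copy; assigning to `table.Rows[row][col]` changes the DataTable, not the Excel file.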
An easy way to handle Excel files and operations on them is the following:
Add the Microsoft.Office.Interop.Excel reference to your project (Add Reference... => search under the .NET tab => add the reference).
Create a new Excel application and open the workbook:
Excel.Application application = new Excel.Application();
Excel.Workbook workbook = application.Workbooks.Open(workBookPath);
Excel.Worksheet worksheet = workbook.Sheets[worksheetNumber];
You can get the row and column counts with the following lines:
var endColumn = worksheet.Columns.CurrentRegion.EntireColumn.Count;
var endRow = worksheet.Rows.CurrentRegion.EntireRow.Count;
Reading values from a cell or a range of cells can be done in the following way (rowIndex is the number of the row containing the cells you want to read):
System.Array values = (System.Array)worksheet.get_Range("A" +
rowIndex.ToString(), "D" + rowIndex.ToString()).Cells.Value;
I'm importing data into a SQL Server 2008 database from an Excel file where the first row contains headers (HDR=1). The thing is that the second row is also a kind of header, which I don't need to import. So how do I ignore the second row of that Excel file (I guess if the first row is the header, the actual second row in Excel counts as the first data row)?
In MySQL it is just a matter of adding IGNORE 1 LINES at the end of the import command... How do I do it in SQL Server?
Here is the part of the code doing that:
//Create Connection to Excel work book
OleDbConnection excelConnection = new OleDbConnection(excelConnectionString);
//Create OleDbCommand to fetch data from Excel
OleDbCommand cmd = new OleDbCommand("Select [task_code],[status_code],[wbs] from [task$]", excelConnection);
excelConnection.Open();
OleDbDataReader dReader;
dReader = cmd.ExecuteReader();
SqlBulkCopy sqlBulk = new SqlBulkCopy(connectionString);
//Give your Destination table name
sqlBulk.DestinationTableName = "task";
sqlBulk.WriteToServer(dReader);
sqlBulk.Close();
Thanks
Use the following:
...
OleDbDataReader dReader;
dReader = cmd.ExecuteReader();
if( !dReader.Read() || !dReader.Read())
return "No data";
SqlBulkCopy sqlBulk = new SqlBulkCopy(connectionString);
...
A quick solution would be to:
Copy the file
Use Office interop to delete the second line of the spreadsheet
Import the amended spreadsheet
To delete the line from the spreadsheet:
public static void DeleteRow(string pathToFile, string sheetName, string cellRef)
{
Application app= new Application();
Workbook workbook = app.Workbooks.Open(pathToFile);
for (int sheetNum = 1; sheetNum < workbook.Sheets.Count + 1; sheetNum++)
{
Worksheet sheet = (Worksheet)workbook.Sheets[sheetNum];
if (sheet.Name != sheetName)
{
continue;
}
Range secondRow = sheet.Range[cellRef];
secondRow.EntireRow.Delete();
}
workbook.Save();
workbook.Close();
app.Quit();
}