Check excel sheet for specific header - c#

I need to check Excel files for a sheet containing a specific header in C#.
The sheet names, order and quantity are variable. I need to check all the sheets for the one containing a specific header and then store the name in a variable for later processing.
I can currently get all the sheet names but I am not able to check if it contains what I am looking for or not.
The goal is to get the sheet name and insert it in a SQL statement to process the file in an SSIS package.
This is the code I am currently using:
public void Main()
{
string excelFile;
string connectionString;
OleDbConnection excelConnection;
DataTable tablesInFile;
string[] excelTables = new string[5];
excelFile = Dts.Variables["User::var_MonitoringFile"].Value.ToString();
connectionString = "Provider=Microsoft.ACE.OLEDB.12.0;" +
"Data Source= "+ excelFile +
";Extended Properties=\"EXCEL 8.0;HDR=YES; IMEX=1;\"";
excelConnection = new OleDbConnection(connectionString);
excelConnection.Open();
tablesInFile = excelConnection.GetSchema("Tables");
DataRow tableInFile = tablesInFile.Rows[0];
Dts.Variables["User::var_ExcelSheet"].Value = tableInFile["TABLE_NAME"].ToString();
excelConnection.Close();
Dts.TaskResult = (int)ScriptResults.Success;
}
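One possible way to do the check, sketched under assumptions rather than taken from the original post: with HDR=YES the provider exposes the first row of each sheet as column names, so each worksheet can be queried for a single row and its column names compared with the wanted header. The class name SheetFinder and both method names are hypothetical, and the sketch assumes the header sits in the first row of the sheet.

```csharp
using System;
using System.Data;
using System.Data.Common;

static class SheetFinder
{
    // Returns the TABLE_NAME (for example "Sheet1$") of the first worksheet
    // whose header row contains the given column name, or null if none does.
    // Assumes the connection is already open and was created with HDR=YES.
    public static string FindSheetWithHeader(DbConnection openConnection, string header)
    {
        DataTable tables = openConnection.GetSchema("Tables");
        foreach (DataRow row in tables.Rows)
        {
            // Names containing spaces come back wrapped in single quotes.
            string name = row["TABLE_NAME"].ToString().Trim('\'');
            if (!name.EndsWith("$")) continue; // not a worksheet (e.g. a named range)

            using (DbCommand cmd = openConnection.CreateCommand())
            {
                // One row is enough; only the column names are inspected.
                cmd.CommandText = "SELECT TOP 1 * FROM [" + name + "]";
                using (DbDataReader reader = cmd.ExecuteReader())
                {
                    if (HasColumn(reader, header)) return name;
                }
            }
        }
        return null;
    }

    // Case-insensitive search through the column names of a reader.
    public static bool HasColumn(IDataRecord record, string header)
    {
        for (int i = 0; i < record.FieldCount; i++)
        {
            if (string.Equals(record.GetName(i), header, StringComparison.OrdinalIgnoreCase))
                return true;
        }
        return false;
    }
}
```

In the script task above, the result of SheetFinder.FindSheetWithHeader(excelConnection, "YourHeader") could be assigned to Dts.Variables["User::var_ExcelSheet"].Value in place of the first schema row; "YourHeader" is a placeholder for the real header text.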

Related

C# Find and Replace Line Breaks in excel cell

I need to import an Excel file into SQL Server with SSIS. There is a complication: two columns in Excel are associated with each other.
In Excel I can use find/replace to swap the line breaks for some delimiter (say &&). Is there any way to do the replacement in C# code?
Thanks
I made the following spreadsheet to match your data and added an ID column:
Now back to SSIS, I added the following data flow:
Open up Script component and go to inputs and outputs and define your columns:
Go back to Script and click edit.
Paste in the following code to read the spreadsheet and parse it into your outputs:
public override void CreateNewOutputRows()
{
    string fileName = @"D:\imports\survey.xlsx";
    string SheetName = "Bananas";
    string cstr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileName + ";Extended Properties=\"Excel 12.0;HDR=YES;IMEX=1\"";
    using (System.Data.OleDb.OleDbConnection xlConn = new System.Data.OleDb.OleDbConnection(cstr))
    {
        xlConn.Open();
        System.Data.OleDb.OleDbCommand xlCmd = xlConn.CreateCommand();
        xlCmd.CommandText = string.Format("Select * from [{0}$]", SheetName);
        xlCmd.CommandType = CommandType.Text;
        using (System.Data.OleDb.OleDbDataReader rdr = xlCmd.ExecuteReader())
            while (rdr.Read())
            {
                // The ID cell is read as text and parsed through decimal before converting to int.
                int id = Convert.ToInt32((decimal.Parse(rdr[0].ToString())));
                // Each cell holds several entries separated by line breaks.
                string[] keys = rdr.GetString(1).Split('\n');
                string[] values = rdr.GetString(2).Split('\n');
                if (keys.Length > 0)
                {
                    // Emit one output row per key/value pair (assumes both cells have the same number of lines).
                    for (int i = 0; i < keys.Length; i++)
                    {
                        Output0Buffer.AddRow();
                        Output0Buffer.ID = id;
                        Output0Buffer.key = keys[i];
                        Output0Buffer.Pair = values[i];
                    }
                }
            }
    }
}
Finally, the output:

Excel to DataGridView

Additional information: The Microsoft Office Access database engine could not find the object 'C:\Users\username\Documents\sampleData.xls'. Make sure the object exists and that you spell its name and the path name correctly.
The Error is highlighted at
theDataAdapter.Fill(spreadSheetData);
Here's the sample data I used (tried in .csv , .xls , .xlsx )
Name Age Status Children
Johnny 34 Married 3
Joey 21 Single 1
Michael 16 Dating 0
Smith 42 Divorced 4
Here's the code associated:
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;
using System.IO;
using System.Data.OleDb;
namespace uploadExcelFile
{
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
}
private void btnImport_Click(object sender, EventArgs e)
{
var frmDialog = new System.Windows.Forms.OpenFileDialog();
if (frmDialog.ShowDialog() == System.Windows.Forms.DialogResult.OK)
{
string strFileName = frmDialog.FileName;
System.IO.FileInfo spreadSheetFile = new System.IO.FileInfo(strFileName);
scheduleGridView.DataSource = spreadSheetFile.ToString();
System.Diagnostics.Debug.WriteLine(frmDialog.FileName);
System.Diagnostics.Debug.WriteLine(frmDialog.SafeFileName);
String name = frmDialog.SafeFileName;
String constr = String.Format(@"Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=""Excel 12.0 Xml;HDR=YES""", frmDialog.FileName);
OleDbConnection myConnection = new OleDbConnection(constr);
OleDbCommand onlineConnection = new OleDbCommand("SELECT * FROM [" + frmDialog.FileName + "]", myConnection);
myConnection.Open();
OleDbDataAdapter theDataAdapter = new OleDbDataAdapter(onlineConnection);
DataTable spreadSheetData = myConnection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
theDataAdapter.Fill(spreadSheetData);
scheduleGridView.DataSource = spreadSheetData;
}
}
}
}
scheduleGridView is the DataGridView's name, and btnImport is the name of the import Button.
I've installed the 2007 Office System Driver: Data Connectivity Components, which gave me AccessDatabaseEngine.exe, but I've been stuck from there without understanding how to get around this. It should go without saying that the file path is correct in its entirety. There are no odd characters in the path name either (spaces, underscores, etc.).
Mini Update :: (another dead end it seems like)
Although the initial error says, "could not find the object 'C:\Users\username\Documents\sampleData.xls'"
In the Debugger the exception is read as
When I look at the details, the exception shows the path as "C:\Users\username\Documents\sampleData.xls"
So I thought the error was that it wasn't taking the path as a literal, but this article C# verbatim string literal not working. Very Strange backslash always double
Shows very clearly that that is not my issue.
I am guessing you may be mistaken by what is returned from the following line of code…
DataTable spreadSheetData = myConnection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
The DataTable returned from this line will have nine (9) columns (TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE, TABLE_GUID, DESCRIPTION, TABLE_PROPID, DATE_CREATED and DATE_MODIFIED). This ONE (1) DataTable simply “describes” the worksheet(s) and named range(s) in the entire selected Excel workbook. Each row in this DataTable represents either a worksheet OR a named range. To distinguish worksheets from named ranges, the “TABLE_NAME” column holds the name of the worksheet or range, and each worksheet name ends with a dollar sign ($). If the “TABLE_NAME” value in a row does NOT end in a dollar sign, then it is a range and not a worksheet.
Therefore, when the line
OleDbDataAdapter theDataAdapter = new OleDbDataAdapter(onlineConnection);
Blows up and says it cannot find the “filename”… this is somewhat expected, because the command is looking for a “worksheet” name, not a file name. On the line creating the select command…
OleDbCommand onlineConnection = new OleDbCommand("SELECT * FROM [" + frmDialog.FileName + "]", myConnection);
This is incorrect; you have already selected the file name and opened the file with
String constr = String.Format(@"Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=""Excel 12.0 Xml;HDR=YES""", frmDialog.FileName);
OleDbConnection myConnection = new OleDbConnection(constr);
myConnection.Open();
The correct OleDbCommand line should be…
OleDbCommand onlineConnection = new OleDbCommand("SELECT * FROM [" + sheetName + "]", myConnection);
The problem here is that the current code is not getting the worksheet names. Therefore, we cannot “select” the worksheet from the workbook then fill the adapter with the worksheet.
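As a rough illustration of the naming rule described earlier, the worksheet names can be filtered out of the schema table on their own. This is a minimal sketch, not the code from the question; the class name SchemaHelper and the method GetWorksheetNames are hypothetical, and the only input assumed is a DataTable with a TABLE_NAME column such as the one GetOleDbSchemaTable returns.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

static class SchemaHelper
{
    // Returns only the rows that describe worksheets: names ending in "$".
    // Named ranges (no trailing "$") are skipped.
    public static List<string> GetWorksheetNames(DataTable schema)
    {
        var names = new List<string>();
        foreach (DataRow row in schema.Rows)
        {
            // Sheet names containing spaces are wrapped in single quotes.
            string name = row["TABLE_NAME"].ToString().Trim('\'');
            if (name.EndsWith("$"))
            {
                names.Add(name);
            }
        }
        return names;
    }
}
```

Each returned name can be placed directly inside the brackets of the SELECT statement, for example "SELECT * FROM [" + names[0] + "]".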
The other issue is setting the DataGridView’s DataSource to spreadSheetData… when you get the worksheet(s) from an Excel “Workbook”, you must assume there will be more than one sheet. Therefore a DataSet will work as a container to hold all the worksheets in the workbook. Each DataTable in the DataSet would be a single worksheet and it can be surmised that the DataGridView can only display ONE (1) of these tables at a time. Given this, below are the changes described along with an added button to display the “Next” worksheet in the DataGridView since there may be more than one worksheet in the workbook. Hope this makes sense.
int sheetIndex = 0;
DataSet ds = new DataSet();
public Form1() {
InitializeComponent();
}
private void btnImport_Click(object sender, EventArgs e) {
var frmDialog = new System.Windows.Forms.OpenFileDialog();
if (frmDialog.ShowDialog() == System.Windows.Forms.DialogResult.OK) {
String constr = String.Format(@"Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=""Excel 12.0 Xml;HDR=YES""", frmDialog.FileName);
OleDbConnection myConnection = new OleDbConnection(constr);
myConnection.Open();
DataTable spreadSheetData = myConnection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
string sheetName = "";
DataTable dt;
OleDbCommand onlineConnection;
OleDbDataAdapter theDataAdapter;
// fill the "DataSet" each table in the set is a worksheet in the Excel file
foreach (DataRow dr in spreadSheetData.Rows) {
sheetName = dr["TABLE_NAME"].ToString();
sheetName = sheetName.Replace("'", "");
if (sheetName.EndsWith("$")) {
onlineConnection = new OleDbCommand("SELECT * FROM [" + sheetName + "]", myConnection);
theDataAdapter = new OleDbDataAdapter(onlineConnection);
dt = new DataTable();
dt.TableName = sheetName;
theDataAdapter.Fill(dt);
ds.Tables.Add(dt);
}
}
myConnection.Close();
scheduleGridView.DataSource = ds.Tables[0];
setLabel();
}
}
private void setLabel() {
label1.Text = "Showing worksheet " + sheetIndex + " Named: " + ds.Tables[sheetIndex].TableName + " out of a total of " + ds.Tables.Count + " worksheets";
}
private void btnNextSheet_Click(object sender, EventArgs e) {
if (sheetIndex == ds.Tables.Count - 1)
sheetIndex = 0;
else
sheetIndex++;
scheduleGridView.DataSource = ds.Tables[sheetIndex];
setLabel();
}
I solved it. Well there was a workaround. I used the Excel Data Reader found in this thread: How to Convert DataSet to DataTable
Which led me to https://github.com/ExcelDataReader/ExcelDataReader
The readme was fantastic. In Solution Explorer, right-click References, choose Manage NuGet Packages, select Browse in the new box, and enter ExcelDataReader. Then be sure to include "using Excel;" at the top of the .cs file. The code mentioned in the first link was essentially enough, but here's my exact code for those wondering.
var frmDialog = new System.Windows.Forms.OpenFileDialog();
if (frmDialog.ShowDialog() == System.Windows.Forms.DialogResult.OK)
{
/*string strFileName = frmDialog.FileName;
//System.IO.FileInfo spreadSheetFile = new System.IO.FileInfo(strFileName);
System.IO.StreamReader reader = new System.IO.StreamReader(strFileName);
*/
string strFileName = frmDialog.FileName;
FileStream stream = File.Open(strFileName, FileMode.Open, FileAccess.Read);
//1. Reading from a binary Excel file ('97-2003 format; *.xls)
IExcelDataReader excelReader = ExcelReaderFactory.CreateBinaryReader(stream);
//...
//2. Reading from a OpenXml Excel file (2007 format; *.xlsx)
//IExcelDataReader excelReader = ExcelReaderFactory.CreateOpenXmlReader(stream);
//...
//3. DataSet - The result of each spreadsheet will be created in the result.Tables
//DataSet result = excelReader.AsDataSet();
//...
//4. DataSet - Create column names from first row
excelReader.IsFirstRowAsColumnNames = true;
DataSet result = excelReader.AsDataSet();
DataTable data = result.Tables[0];
//5. Data Reader methods
while (excelReader.Read())
{
//excelReader.GetInt32(0);
}
scheduleGridView.DataSource = data;
excelReader.Close();

OleDB connection throws exception in Windows Server 2012R

I am using a function in order to open an .xls file with multiple worksheets and copy the entire content into a .csv file.
Everything works just fine on my local machine: no exceptions, no errors etc.
But when I am running it on windows server 2012R I am getting an exception when the connection is opened.
Here is the code where I am trying to open an OleDB connection and then query through the file:
static void ConvertExcelToCsv(string excelFilePath, string csvOutputFile, int worksheetNumber)
{
// connection string
var cnnStr = String.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + excelFilePath + ";Extended Properties=\"Excel 8.0;HDR=no;Format=xls\"");
var cnn = new OleDbConnection(cnnStr);
// get schema, then data
var dt = new DataTable();
try
{
cnn.Open();
var schemaTable = cnn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
if (schemaTable.Rows.Count < worksheetNumber) throw new ArgumentException("The worksheet number provided cannot be found in the spreadsheet");
string worksheet = schemaTable.Rows[worksheetNumber - 1]["table_name"].ToString().Replace("'", "");
string sql = String.Format("select * from [{0}]", worksheet);
var da = new OleDbDataAdapter(sql, cnn);
da.Fill(dt);
....
The excelFilePath is my source Excel file (.xls) and csvOutputFile is the file where the content is going to be written.
Does anyone have any idea why I am getting this exception?

The Microsoft Office Access database engine could not find an object

I'm trying to copy data from excel to sql server but facing the following error.
The Microsoft Office Access database engine could not find the object 'sheet1$'. Make sure the object exists and that you spell its name and the path name correctly.
My code is:
protected void importdatafromexcel(string filepath)
{
string sqltable = "PFDummyExcel";
string exceldataquery = "select EmployeeId,EmployeeName,Amount from [Sheet1$]";
string excelconnectionstring = @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filepath + ";Extended Properties=Excel 12.0;Persist Security Info=False";
string sqlconnectionstring = System.Configuration.ConfigurationManager.ConnectionStrings["HRGold"].ConnectionString;
SqlConnection con = new SqlConnection(sqlconnectionstring);
OleDbConnection oledb = new OleDbConnection(excelconnectionstring);
OleDbCommand oledbcmd = new OleDbCommand(exceldataquery, oledb);
oledb.Open();
OleDbDataReader dr = oledbcmd.ExecuteReader();
SqlBulkCopy bulkcopy = new SqlBulkCopy(sqlconnectionstring);
bulkcopy.DestinationTableName = sqltable;
while (dr.Read())
{
bulkcopy.WriteToServer(dr);
}
oledb.Close();
}
Please tell me how I can solve this.
This error is raised because you are trying to access a sheet named "sheet1" in the Excel file. By default the first sheet is named "Sheet1", but the user may have renamed or deleted it.
To resolve this issue, first get all the sheet names from the Excel file, then pass the right sheet name to your code above to import the data.
string filePath = "your file path";
string excelconnectionstring = @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filePath + ";Extended Properties=Excel 12.0;Persist Security Info=False";
OleDbConnection Connection = new OleDbConnection(excelconnectionstring);
// The connection has to be open before the schema can be read.
Connection.Open();
DataTable activityDataTable = Connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
Connection.Close();
if (activityDataTable != null)
{
    // validate worksheet name.
    var itemsOfWorksheet = new List<SelectListItem>();
    string worksheetName;
    for (int cnt = 0; cnt < activityDataTable.Rows.Count; cnt++)
    {
        worksheetName = activityDataTable.Rows[cnt]["TABLE_NAME"].ToString();
        if (worksheetName.Contains('\''))
        {
            worksheetName = worksheetName.Replace('\'', ' ').Trim();
        }
        // Only names ending in "$" are worksheets.
        if (worksheetName.Trim().EndsWith("$"))
            itemsOfWorksheet.Add(new SelectListItem { Text = worksheetName.TrimEnd('$'), Value = worksheetName });
    }
    // itemsOfWorksheet now holds every worksheet name
}
So you can use itemsOfWorksheet[0].Value as the sheet name in place of "sheet1$".
I had a similar issue. I sorted it out by:
Saving the Excel file from the file uploader to a temporary folder inside the website folder.
Using the path to that file in my connection string.
Everything else stayed the same, and the error "The Microsoft Office Access database engine could not find the object 'sheet1$'" was gone.

C# VS2005 Import .CSV File into SQL Database

I am trying to import a .csv file into my database. I am able to import an Excel worksheet into my database; however, because the .csv format differs from .xls, I need to write an import function specifically for .csv.
Below is my code:
protected void Button1_Click(object sender, EventArgs e)
{
if (FileUpload1.HasFile)
{
// Get the name of the Excel spreadsheet to upload.
string strFileName = Server.HtmlEncode(FileUpload1.FileName);
// Get the extension of the Excel spreadsheet.
string strExtension = Path.GetExtension(strFileName);
// Validate the file extension.
if (strExtension != ".xls" && strExtension != ".xlsx" && strExtension != ".csv" && strExtension != ".csv")
{
Response.Write("<script>alert('Failed to import DEM Conflicting Role Datasheet. Cause: Invalid Excel file.');</script>");
return;
}
// Generate the file name to save.
string strUploadFileName = @"C:\Documents and Settings\rhlim\My Documents\Visual Studio 2005\WebSites\SoD\UploadFiles\" + DateTime.Now.ToString("yyyyMMddHHmmss") + strExtension;
// Save the Excel spreadsheet on server.
FileUpload1.SaveAs(strUploadFileName);
// Create Connection to Excel Workbook
string connStr = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + strUploadFileName + ";Extended Properties=Text;";
using (OleDbConnection ExcelConnection = new OleDbConnection(connStr)){
OleDbCommand ExcelCommand = new OleDbCommand("SELECT [columns] FROM +userrolelist", ExcelConnection);
OleDbDataAdapter ExcelAdapter = new OleDbDataAdapter(ExcelCommand);
ExcelConnection.Open();
using (DbDataReader dr = ExcelCommand.ExecuteReader())
{
// SQL Server Connection String
string sqlConnectionString = "Data Source=<IP>;Initial Catalog=<DB>;User ID=<userid>;Password=<password>";
// Bulk Copy to SQL Server
using (SqlBulkCopy bulkCopy =
new SqlBulkCopy(sqlConnectionString))
{
bulkCopy.DestinationTableName = "DEMUserRoles";
bulkCopy.WriteToServer(dr);
Response.Write("<script>alert('DEM User Data imported');</script>");
}
}
}
}
else Response.Write("<script>alert('Failed to import DEM User Roles Data. Cause: No file found.');</script>");
}
The file has been successfully saved, but the error says that the path for the file is not valid, even though the file has been successfully saved as .csv, therefore I am not able to continue with the process of importing the data into my database.
Below are the screenshots of my error:
In conclusion, I get an error saying the path where the csv file is saved is not valid, although the csv file is saved successfully. I would appreciate help from someone experienced. Thank you.
If you're reading a CSV file, your connection string should specify the directory containing your CSV file.
string connStr = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" +
Path.GetDirectoryName(strUploadFileName);
You then use the filename in your SELECT statement:
"SELECT * FROM [" + Path.GetFileName(strUploadFileName) + "]"
I think you have this problem because you use "/" instead of "\"
Try to modify the path C:\.....
You need to use the backward slashes(\) on the file path.
string strUploadFileName = @"C:\Documents and Settings\rhlim\My Documents\Visual Studio 2005\WebSites\SoD\UploadFiles\" + DateTime.Now.ToString("yyyyMMddHHmmss") + strExtension;
EDIT 1: I believe FileUpload1.SaveAs converts the / to \ internally to identify the correct location.
EDIT 2: The problem is with your connection string. Even though you are using the .csv format, you need to set Excel 8.0 or Excel 12.0 Xml as the Extended Properties.
Here is the sample:
string connStr = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + strUploadFileName + ";Extended Properties=Excel 12.0 Xml;";
For other types, check the code of OLEDB section of my article.
To avoid opening an OLE DB connection at all, you can read the file directly, like this:
// Read the CSV file name & file path
// I am using the Kendo UI Uploader here
string path = "";
string filenamee = "";
if (files != null)
{
    foreach (var file in files)
    {
        var fileName = Path.GetFileName(file.FileName);
        path = Path.GetFullPath(file.FileName);
        filenamee = fileName;
    }
    // Read the CSV file data
    StreamReader sr = new StreamReader(path);
    // The first line supplies the column names
    string line = sr.ReadLine();
    string[] value = line.Split(',');
    DataTable dt = new DataTable();
    DataRow row;
    foreach (string dc in value)
    {
        dt.Columns.Add(new DataColumn(dc));
    }
    while (!sr.EndOfStream)
    {
        value = sr.ReadLine().Split(',');
        // Skip lines whose field count does not match the header
        if (value.Length == dt.Columns.Count)
        {
            row = dt.NewRow();
            row.ItemArray = value;
            dt.Rows.Add(row);
        }
    }
}
