Search entire workbook and fetch out values - c#

In my excel workbook
The first sheet tab contains
Number Name Code Subject
100 Mark ABC Mathematics
101 John XYZ Physics
The second sheet tab contains
Number Name Code Subject
103 Mark DEF Chemistry
104 John GHI Biology
I want to pass the code(which is going to be unique) as a parameter and search the entire excel workbook
and fetch name and subject..
ie..select name,subject from myexcelworkbook where code='ABC'
I am able to get sheet names, column count etc. but not able to search thro' an entire excel workbook and get the required values
const string fileName="C:\\FileName.xls";
OleDbConnectionStringBuilder connectionStringBuilder = new OleDbConnectionStringBuilder();
connectionStringBuilder.Provider = "Microsoft.ACE.OLEDB.12.0";
connectionStringBuilder.DataSource = fileName;
connectionStringBuilder.Add("Mode", "Read");
const string extendedProperties = "Excel 12.0;IMEX=1;HDR=YES";
connectionStringBuilder.Add("Extended Properties", extendedProperties);
using (OleDbConnection objConn = new OleDbConnection(connectionStringBuilder.ConnectionString))
{
objConn.Open();
Microsoft.Office.Interop.Excel.Application xlsApp = new Microsoft.Office.Interop.Excel.Application();
Workbook wb = xlsApp.Workbooks.Open(fileName, 0, true, 5, "", "", true, XlPlatform.xlWindows, "\t", false, false, 0, true);
Sheets sheets = wb.Worksheets;
for (int i =1 ; i <= wb.Worksheets.Count; i++)
{
MessageBox.Show(wb.Sheets[i].Name.ToString()); - gives sheet names inside the workbook
}
Worksheet ws = (Worksheet)sheets.get_Item(1); - gives the elements of specified sheet tab
}
//To get elements inside a specific sheet of an excel workbook/get column names
OleDbCommand objCmdSelect = new OleDbCommand("SELECT * FROM [" + "Sheet1" + "$]", objConn);
OleDbDataAdapter objAdapter1 = new OleDbDataAdapter();
objAdapter1.SelectCommand = objCmdSelect;
DataSet objDataset1 = new DataSet();
objAdapter1.Fill(objDataset1);
string columnNames = string.Empty;
// For each DataTable, print the ColumnName. use dataset.Rows to iterate row data...
foreach (System.Data.DataTable table in objDataset1.Tables)
{
foreach (DataColumn column in table.Columns)
{
if (columnNames.Length > 0)
{
columnNames = columnNames + Environment.NewLine + column.ColumnName;
}
else
{
columnNames = column.ColumnName;
}
}
}
Can someone share some ideas, so that i can find out the unique data inside the excel workbook and fetch out the needed values based on that? thanks in advance.

If the workbooks are as structured as you indicate and you will be querying multiple students then, as #Andy G suggests, you might find it easiest to just get the data into some form of record set then you run your queries on it through Linq or SQL or whatever you prefer. As you aren't wishing to modify the Excel workbook at all this is probably a better approach.
Alternatively, you can use the Excel API, like you were also trying to use. You can enumerate through the worksheets, running a Find on each. A la:
internal void GetYourData()
{
//... code to get the relevant Workbook and relevant/new Excel Application
Tuple<string, string> pupil;
string searchTerm = "ABC";
//Get the cell of the match
Range match = FindFirstOccurrenceInWorkbook(workbook, searchTerm);
if (match != null)
{
//Do whatever - per your data structure, it is probably easiest to just use .Offset(row, column) property
pupil = new Tuple<string, string>((string)match.Offset(0, -1).Value, (string)match.Offset(0, 1).Value);
}
//... code to do whatever with your results
}
internal static Range FindFirstOccurrenceInWorkbook(Workbook workbook, string searchTerm)
{
if (workbook == null) throw new ArgumentNullException("workbook");
if (searchTerm == null) throw new ArgumentNullException("searchTerm");
Sheets wss = workbook.Worksheets;
Range match = null;
foreach (Worksheet ws in wss)
{
Range cells = ws.Cells;
//Add more args as needed - this is just an example
match = cells.Find(
what: searchTerm,
after: Type.Missing,
lookIn: XlFindLookIn.xlFormulas,
lookAt: XlLookAt.xlPart);
System.Runtime.InteropServices.Marshal.ReleaseComObject(cells);
System.Runtime.InteropServices.Marshal.ReleaseComObject(ws);
if (match != null)
{
break;
}
}
System.Runtime.InteropServices.Marshal.ReleaseComObject(wss);
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();
return match;
}
Also, you can't use COM objects like you use without generating orphaned references -> this will prevent the Excel application from closing. You need to explicitly assign all object properties you want to use like Workbook.Worksheets to a variable then call System.Runtime.InteropServices.Marshal.ReleaseCOMObject(object) when the variable is about to go out of scope.

Related

how to column names of excel sheet into list dynamically using asp.net [duplicate]

How to read an Excel file using C#? I open an Excel file for reading and copy it to clipboard to search email format, but I don't know how to do it.
FileInfo finfo;
Excel.ApplicationClass ExcelObj = new Excel.ApplicationClass();
ExcelObj.Visible = false;
Excel.Workbook theWorkbook;
Excel.Worksheet worksheet;
if (listView1.Items.Count > 0)
{
foreach (ListViewItem s in listView1.Items)
{
finfo = new FileInfo(s.Text);
if (finfo.Extension == ".xls" || finfo.Extension == ".xlsx" || finfo.Extension == ".xlt" || finfo.Extension == ".xlsm" || finfo.Extension == ".csv")
{
theWorkbook = ExcelObj.Workbooks.Open(s.Text, 0, true, 5, "", "", true, Excel.XlPlatform.xlWindows, "\t", false, false, 0, true, false, false);
for (int count = 1; count <= theWorkbook.Sheets.Count; count++)
{
worksheet = (Excel.Worksheet)theWorkbook.Worksheets.get_Item(count);
worksheet.Activate();
worksheet.Visible = false;
worksheet.UsedRange.Cells.Select();
}
}
}
}
OK,
One of the more difficult concepts to grasp about Excel VSTO programming is that you don't refer to cells like an array, Worksheet[0][0] won't give you cell A1, it will error out on you. Even when you type into A1 when Excel is open, you are actually entering data into Range A1. Therefore you refer to cells as Named Ranges. Here's an example:
Excel.Worksheet sheet = workbook.Sheets["Sheet1"] as Excel.Worksheet;
Excel.Range range = sheet.get_Range("A1", Missing.Value)
You can now literally type:
range.Text // this will give you the text the user sees
range.Value2 // this will give you the actual value stored by Excel (without rounding)
If you want to do something like this:
Excel.Range range = sheet.get_Range("A1:A5", Missing.Value)
if (range1 != null)
foreach (Excel.Range r in range1)
{
string user = r.Text
string value = r.Value2
}
There might be a better way, but this has worked for me.
The reason you need to use Value2 and not Value is because the Value property is a parametrized and C# doesn't support them yet.
As for the cleanup code, i will post that when i get to work tomorrow, i don't have the code with me, but it's very boilerplate. You just close and release the objects in the reverse order you created them. You can't use a Using() block because the Excel.Application or Excel.Workbook doesn't implement IDisposable, and if you don't clean-up, you will be left with a hanging Excel objects in memory.
Note:
If you don't set the Visibility property Excel doesn't display, which can be disconcerting to your users, but if you want to just rip the data out, that is probably good enough
You could OleDb, that will work too.
I hope that gets you started, let me know if you need further clarification. I'll post a complete
here is a complete sample:
using System;
using System.IO;
using System.Reflection;
using NUnit.Framework;
using ExcelTools = Ms.Office;
using Excel = Microsoft.Office.Interop.Excel;
namespace Tests
{
[TestFixture]
public class ExcelSingle
{
[Test]
public void ProcessWorkbook()
{
string file = #"C:\Users\Chris\Desktop\TestSheet.xls";
Console.WriteLine(file);
Excel.Application excel = null;
Excel.Workbook wkb = null;
try
{
excel = new Excel.Application();
wkb = ExcelTools.OfficeUtil.OpenBook(excel, file);
Excel.Worksheet sheet = wkb.Sheets["Data"] as Excel.Worksheet;
Excel.Range range = null;
if (sheet != null)
range = sheet.get_Range("A1", Missing.Value);
string A1 = String.Empty;
if( range != null )
A1 = range.Text.ToString();
Console.WriteLine("A1 value: {0}", A1);
}
catch(Exception ex)
{
//if you need to handle stuff
Console.WriteLine(ex.Message);
}
finally
{
if (wkb != null)
ExcelTools.OfficeUtil.ReleaseRCM(wkb);
if (excel != null)
ExcelTools.OfficeUtil.ReleaseRCM(excel);
}
}
}
}
I'll post the functions from ExcelTools tomorrow, I don't have that code with me either.
Edit:
As promised, here are the Functions from ExcelTools you might need.
public static Excel.Workbook OpenBook(Excel.Application excelInstance, string fileName, bool readOnly, bool editable,
bool updateLinks) {
Excel.Workbook book = excelInstance.Workbooks.Open(
fileName, updateLinks, readOnly,
Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing,
Type.Missing, editable, Type.Missing, Type.Missing, Type.Missing,
Type.Missing, Type.Missing);
return book;
}
public static void ReleaseRCM(object o) {
try {
System.Runtime.InteropServices.Marshal.ReleaseComObject(o);
} catch {
} finally {
o = null;
}
}
To be frank, this stuff is much easier if you use VB.NET. It's in C# because I didn't write it. VB.NET does option parameters well, C# does not, hence the Type.Missing. Once you typed Type.Missing twice in a row, you run screaming from the room!
As for you question, you can try to following:
http://msdn.microsoft.com/en-us/library/microsoft.office.interop.excel.range.find(VS.80).aspx
I will post an example when I get back from my meeting... cheers
Edit: Here is an example
range = sheet.Cells.Find("Value to Find",
Type.Missing,
Type.Missing,
Type.Missing,
Type.Missing,
Excel.XlSearchDirection.xlNext,
Type.Missing,
Type.Missing, Type.Missing);
range.Text; //give you the value found
Here is another example inspired by this site:
range = sheet.Cells.Find("Value to find", Type.Missing, Type.Missing,Excel.XlLookAt.xlWhole,Excel.XlSearchOrder.xlByColumns,Excel.XlSearchDirection.xlNext,false, false, Type.Missing);
It helps to understand the parameters.
P.S. I'm one of those weird people who enjoys learning COM automation. All this code steamed from a tool I wrote for work which required me to process over 1000+ spreadsheets from the lab each Monday.
You can use Microsoft.Office.Interop.Excel assembly to process excel files.
Right click on your project and go to Add reference. Add the
Microsoft.Office.Interop.Excel assembly.
Include using
Microsoft.Office.Interop.Excel; to make use of assembly.
Here is the sample code:
using Microsoft.Office.Interop.Excel;
//create the Application object we can use in the member functions.
Microsoft.Office.Interop.Excel.Application _excelApp = new Microsoft.Office.Interop.Excel.Application();
_excelApp.Visible = true;
string fileName = "C:\\sampleExcelFile.xlsx";
//open the workbook
Workbook workbook = _excelApp.Workbooks.Open(fileName,
Type.Missing, Type.Missing, Type.Missing, Type.Missing,
Type.Missing, Type.Missing, Type.Missing, Type.Missing,
Type.Missing, Type.Missing, Type.Missing, Type.Missing,
Type.Missing, Type.Missing);
//select the first sheet
Worksheet worksheet = (Worksheet)workbook.Worksheets[1];
//find the used range in worksheet
Range excelRange = worksheet.UsedRange;
//get an object array of all of the cells in the worksheet (their values)
object[,] valueArray = (object[,])excelRange.get_Value(
XlRangeValueDataType.xlRangeValueDefault);
//access the cells
for (int row = 1; row <= worksheet.UsedRange.Rows.Count; ++row)
{
for (int col = 1; col <= worksheet.UsedRange.Columns.Count; ++col)
{
//access each cell
Debug.Print(valueArray[row, col].ToString());
}
}
//clean up stuffs
workbook.Close(false, Type.Missing, Type.Missing);
Marshal.ReleaseComObject(workbook);
_excelApp.Quit();
Marshal.FinalReleaseComObject(_excelApp);
Why don't you create OleDbConnection? There are a lot of available resources in the Internet. Here is an example
OleDbConnection con = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source="+filename+";Extended Properties=Excel 8.0");
con.Open();
try
{
//Create Dataset and fill with imformation from the Excel Spreadsheet for easier reference
DataSet myDataSet = new DataSet();
OleDbDataAdapter myCommand = new OleDbDataAdapter(" SELECT * FROM ["+listname+"$]" , con);
myCommand.Fill(myDataSet);
con.Close();
richTextBox1.AppendText("\nDataSet Filled");
//Travers through each row in the dataset
foreach (DataRow myDataRow in myDataSet.Tables[0].Rows)
{
//Stores info in Datarow into an array
Object[] cells = myDataRow.ItemArray;
//Traverse through each array and put into object cellContent as type Object
//Using Object as for some reason the Dataset reads some blank value which
//causes a hissy fit when trying to read. By using object I can convert to
//String at a later point.
foreach (object cellContent in cells)
{
//Convert object cellContect into String to read whilst replacing Line Breaks with a defined character
string cellText = cellContent.ToString();
cellText = cellText.Replace("\n", "|");
//Read the string and put into Array of characters chars
richTextBox1.AppendText("\n"+cellText);
}
}
//Thread.Sleep(15000);
}
catch (Exception ex)
{
MessageBox.Show(ex.ToString());
//Thread.Sleep(15000);
}
finally
{
con.Close();
}
try
{
DataTable sheet1 = new DataTable("Excel Sheet");
OleDbConnectionStringBuilder csbuilder = new OleDbConnectionStringBuilder();
csbuilder.Provider = "Microsoft.ACE.OLEDB.12.0";
csbuilder.DataSource = fileLocation;
csbuilder.Add("Extended Properties", "Excel 12.0 Xml;HDR=YES");
string selectSql = #"SELECT * FROM [Sheet1$]";
using (OleDbConnection connection = new OleDbConnection(csbuilder.ConnectionString))
using (OleDbDataAdapter adapter = new OleDbDataAdapter(selectSql, connection))
{
connection.Open();
adapter.Fill(sheet1);
}
}
catch (Exception e)
{
Console.WriteLine(e.Message);
}
This worked for me. Please try it and let me know for queries.
First of all, it's important to know what you mean by "open an Excel file for reading and copy it to clipboard..."
This is very important because there are many ways you could do that depending just on what you intend to do. Let me explain:
If you want to read a set of data and copy that in the clipboard and you know the data format (e.g. column names), I suggest you use an OleDbConnection to open the file, this way you can treat the xls file content as a Database Table, so you can read data with SQL instruction and treat the data as you want.
If you want to do operations on the data with the Excel object model then open it in the way you began.
Some time it's possible to treat an xls file as a kind of csv file, there are tools like File Helpers which permit you to treat and open an xls file in a simple way by mapping a structure on an arbitrary object.
Another important point is in which Excel version the file is.
I have, unfortunately I say, a strong experience working with Office automation in all ways, even if bounded in concepts like Application Automation, Data Management and Plugins, and generally I suggest only as the last resort, to using Excel automation or Office automation to read data; just if there aren't better ways to accomplish that task.
Working with automation could be heavy in performance, in terms of resource cost, could involve in other issues related for example to security and more, and last but not at least, working with COM interop it's not so "free"..
So my suggestion is think and analyze the situation within your needs and then take the better way.
Here's a 2020 answer - if you don't need to support the older .xls format (so pre 2003) you could use either:
LightweightExcelReader to access specfic cells, or cursor through all the data in a spreadsheet.
or
ExcelToEnumerable if you want to map spreadsheet data to a list of objects.
Pros :
Performance - at the time of writing (the the fastest way to read an .xlsx file)[https://github.com/ChrisHodges/ExcelToEnumerable#performance].
Simplicity - less verbose than OLE DB or OpenXml
Cons:
Neither LightweightExcelReader nor ExcelToEnumerable support .xls files.
Disclaimer: I am the author of LightweightExcelReader and ExcelToEnumerable
Use Open XML.
Here is some code to process a spreadsheet with a specific tab or sheet name and dump it to something like CSV. (I chose a pipe instead of comma).
I wish it was easier to get the value from a cell, but I think this is what we are stuck with. You can see that I reference the MSDN documents where I got most of this code. That is what Microsoft recommends.
/// <summary>
/// Got code from: https://msdn.microsoft.com/en-us/library/office/gg575571.aspx
/// </summary>
[Test]
public void WriteOutExcelFile()
{
var fileName = "ExcelFiles\\File_With_Many_Tabs.xlsx";
var sheetName = "Submission Form"; // Existing tab name.
using (var document = SpreadsheetDocument.Open(fileName, isEditable: false))
{
var workbookPart = document.WorkbookPart;
var sheet = workbookPart.Workbook.Descendants<Sheet>().FirstOrDefault(s => s.Name == sheetName);
var worksheetPart = (WorksheetPart)(workbookPart.GetPartById(sheet.Id));
var sheetData = worksheetPart.Worksheet.Elements<SheetData>().First();
foreach (var row in sheetData.Elements<Row>())
{
foreach (var cell in row.Elements<Cell>())
{
Console.Write("|" + GetCellValue(cell, workbookPart));
}
Console.Write("\n");
}
}
}
/// <summary>
/// Got code from: https://msdn.microsoft.com/en-us/library/office/hh298534.aspx
/// </summary>
/// <param name="cell"></param>
/// <param name="workbookPart"></param>
/// <returns></returns>
private string GetCellValue(Cell cell, WorkbookPart workbookPart)
{
if (cell == null)
{
return null;
}
var value = cell.CellFormula != null
? cell.CellValue.InnerText
: cell.InnerText.Trim();
// If the cell represents an integer number, you are done.
// For dates, this code returns the serialized value that
// represents the date. The code handles strings and
// Booleans individually. For shared strings, the code
// looks up the corresponding value in the shared string
// table. For Booleans, the code converts the value into
// the words TRUE or FALSE.
if (cell.DataType == null)
{
return value;
}
switch (cell.DataType.Value)
{
case CellValues.SharedString:
// For shared strings, look up the value in the
// shared strings table.
var stringTable =
workbookPart.GetPartsOfType<SharedStringTablePart>()
.FirstOrDefault();
// If the shared string table is missing, something
// is wrong. Return the index that is in
// the cell. Otherwise, look up the correct text in
// the table.
if (stringTable != null)
{
value =
stringTable.SharedStringTable
.ElementAt(int.Parse(value)).InnerText;
}
break;
case CellValues.Boolean:
switch (value)
{
case "0":
value = "FALSE";
break;
default:
value = "TRUE";
break;
}
break;
}
return value;
}
Use OLEDB Connection to communicate with excel files. it gives better result
using System.Data.OleDb;
string physicalPath = "Your Excel file physical path";
OleDbCommand cmd = new OleDbCommand();
OleDbDataAdapter da = new OleDbDataAdapter();
DataSet ds = new DataSet();
String strNewPath = physicalPath;
String connString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + strNewPath + ";Extended Properties=\"Excel 12.0;HDR=Yes;IMEX=2\"";
String query = "SELECT * FROM [Sheet1$]"; // You can use any different queries to get the data from the excel sheet
OleDbConnection conn = new OleDbConnection(connString);
if (conn.State == ConnectionState.Closed) conn.Open();
try
{
cmd = new OleDbCommand(query, conn);
da = new OleDbDataAdapter(cmd);
da.Fill(ds);
}
catch
{
// Exception Msg
}
finally
{
da.Dispose();
conn.Close();
}
The Output data will be stored in dataset, using the dataset object you can easily access the datas.
Hope this may helpful
Using OlebDB, we can read excel file in C#, easily, here is the code while working with Web-Form, where FileUpload1 is file uploading tool
string path = Server.MapPath("~/Uploads/");
if (!Directory.Exists(path))
{
Directory.CreateDirectory(path);
}
//get file path
filePath = path + Path.GetFileName(FileUpload1.FileName);
//get file extenstion
string extension = Path.GetExtension(FileUpload1.FileName);
//save file on "Uploads" folder of project
FileUpload1.SaveAs(filePath);
string conString = string.Empty;
//check file extension
switch (extension)
{
case ".xls": //Excel 97-03.
conString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=Excel03ConString;Extended Properties='Excel 8.0;HDR=YES'";
break;
case ".xlsx": //Excel 07 and above.
conString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=Excel07ConString;Extended Properties='Excel 8.0;HDR=YES'";
break;
}
//create datatable object
DataTable dt = new DataTable();
conString = string.Format(conString, filePath);
//Use OldDb to read excel
using (OleDbConnection connExcel = new OleDbConnection(conString))
{
using (OleDbCommand cmdExcel = new OleDbCommand())
{
using (OleDbDataAdapter odaExcel = new OleDbDataAdapter())
{
cmdExcel.Connection = connExcel;
//Get the name of First Sheet.
connExcel.Open();
DataTable dtExcelSchema;
dtExcelSchema = connExcel.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
string sheetName = dtExcelSchema.Rows[0]["TABLE_NAME"].ToString();
connExcel.Close();
//Read Data from First Sheet.
connExcel.Open();
cmdExcel.CommandText = "SELECT * From [" + sheetName + "]";
odaExcel.SelectCommand = cmdExcel;
odaExcel.Fill(dt);
connExcel.Close();
}
}
}
//bind datatable with GridView
GridView1.DataSource = dt;
GridView1.DataBind();
Source : https://qawithexperts.com/article/asp-net/read-excel-file-and-import-data-into-gridview-using-datatabl/209
Console application similar code example
https://qawithexperts.com/article/c-sharp/read-excel-file-in-c-console-application-example-using-oledb/168
If you need don't want to use OleDB, you can try https://github.com/ExcelDataReader/ExcelDataReader
which seems to have the ability to handle both formats (.xls and .xslx)
Excel File Reader & Writer Without Excel On u'r System
Download and add the dll for
NPOI u'r project.
Using this code to read a excel file.
using (FileStream file = new FileStream(filePath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
{
XSSFWorkbook XSSFWorkbook = new XSSFWorkbook(file);
}
ISheet objxlWorkSheet = XSSFWorkbook.GetSheetAt(0);
int intRowCount = 1;
int intColumnCount = 0;
for (; ; )
{
IRow Row = objxlWorkSheet.GetRow(intRowCount);
if (Row != null)
{
ICell Cell = Row.GetCell(0);
ICell objCell = objxlWorkSheet.GetRow(intRowCount).GetCell(intColumnCount); }}
You can use ExcelDataReader see GitHub
You need to install nugets :
-ExcelDataReader
-ExcelDataReader.DataSet
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;
using System.IO;
using ExcelDataReader;
using System.Text;
/// <summary>
/// Excel parsing in this class is performed by using a common shareware Lib found on:
/// https://github.com/ExcelDataReader/ExcelDataReader
/// </summary>
public static class ExcelParser
{
/// <summary>
/// Load, read and get values from Excel sheet
/// </summary>
public static List<FileRow> GetExcelRows(string path, string sheetName, bool skipFirstLine)
{
if (File.Exists(path))
{
return GetValues(path, sheetName, skipFirstLine);
}
else
throw new Exception("The process cannot access the file");
}
/// <summary>
/// Parse sheet names from given Excel file.
/// </summary>
public static List<string> GetSheetNames(string path)
{
using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
{
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
using (var excelReader = GetExcelDataReader(path, stream))
{
var dataset = excelReader.AsDataSet(new ExcelDataSetConfiguration()
{
ConfigureDataTable = (_) => new ExcelDataTableConfiguration()
{
UseHeaderRow = true
}
});
var names = from DataTable table in dataset.Tables
select table.TableName;
return names.ToList();
}
}
}
/// <summary>
/// Parse values from Excel sheet and add to Rows collection.
/// </summary>
public static List<FileRow> GetValues(string path, string sheetName, bool skipFirstLine)
{
var rowItems = new List<FileRow>();
using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
{
using (var excelReader = GetExcelDataReader(path, stream))
{
var dataset = excelReader.AsDataSet(new ExcelDataSetConfiguration()
{
ConfigureDataTable = (_) => new ExcelDataTableConfiguration()
{
UseHeaderRow = skipFirstLine
}
});
foreach (DataRow row in dataset.Tables[sheetName].Rows)
{
var rowItem = new FileRow();
foreach (var value in row.ItemArray)
rowItem.Values.Add(value);
rowItems.Add(rowItem);
}
}
}
return rowItems;
}
private static IExcelDataReader GetExcelDataReader(string path, Stream stream)
{
var extension = GetExtension(path);
switch (extension)
{
case "xls":
return ExcelReaderFactory.CreateBinaryReader(stream);
case "xlsx":
return ExcelReaderFactory.CreateOpenXmlReader(stream);
default:
throw new Exception(string.Format("'{0}' is not a valid Excel extension", extension));
}
}
private static string GetExtension(string path)
{
var extension = Path.GetExtension(path);
return extension == null ? null : extension.ToLower().Substring(1);
}
}
With this entity :
public class FileRow
{
public List<object> Values { get; set; }
public FileRow()
{
Values = new List<object>();
}
}
Use like that :
var txtPath = #"D:\Path\excelfile.xlsx";
var sheetNames = ExcelParser.GetSheetNames(txtPath);
var datas = ExcelParser.GetExcelRows(txtPath, sheetNames[0], true);
The recommended way to read Excel files on server side app is Open XML.
Sharing few links -
https://msdn.microsoft.com/en-us/library/office/hh298534.aspx
https://msdn.microsoft.com/en-us/library/office/ff478410.aspx
https://msdn.microsoft.com/en-us/library/office/cc823095.aspx
public void excelRead(string sheetName)
{
Excel.Application appExl = new Excel.Application();
Excel.Workbook workbook = null;
try
{
string methodName = "";
Excel.Worksheet NwSheet;
Excel.Range ShtRange;
//Opening Excel file(myData.xlsx)
appExl = new Excel.Application();
workbook = appExl.Workbooks.Open(sheetName, Missing.Value, ReadOnly: false);
NwSheet = (Excel.Worksheet)workbook.Sheets.get_Item(1);
ShtRange = NwSheet.UsedRange; //gives the used cells in sheet
int rCnt1 = 0;
int cCnt1 = 0;
for (rCnt1 = 1; rCnt1 <= ShtRange.Rows.Count; rCnt1++)
{
for (cCnt1 = 1; cCnt1 <= ShtRange.Columns.Count; cCnt1++)
{
if (Convert.ToString(NwSheet.Cells[rCnt1, cCnt1].Value2) == "Y")
{
methodName = NwSheet.Cells[rCnt1, cCnt1 - 2].Value2;
Type metdType = this.GetType();
MethodInfo mthInfo = metdType.GetMethod(methodName);
if (Convert.ToString(NwSheet.Cells[rCnt1, cCnt1 - 2].Value2) == "fn_AddNum" || Convert.ToString(NwSheet.Cells[rCnt1, cCnt1 - 2].Value2) == "fn_SubNum")
{
StaticVariable.intParam1 = Convert.ToInt32(NwSheet.Cells[rCnt1, cCnt1 + 3].Value2);
StaticVariable.intParam2 = Convert.ToInt32(NwSheet.Cells[rCnt1, cCnt1 + 4].Value2);
object[] mParam1 = new object[] { StaticVariable.intParam1, StaticVariable.intParam2 };
object result = mthInfo.Invoke(this, mParam1);
StaticVariable.intOutParam1 = Convert.ToInt32(result);
NwSheet.Cells[rCnt1, cCnt1 + 5].Value2 = Convert.ToString(StaticVariable.intOutParam1) != "" ? Convert.ToString(StaticVariable.intOutParam1) : String.Empty;
}
else
{
object[] mParam = new object[] { };
mthInfo.Invoke(this, mParam);
NwSheet.Cells[rCnt1, cCnt1 + 5].Value2 = StaticVariable.outParam1 != "" ? StaticVariable.outParam1 : String.Empty;
NwSheet.Cells[rCnt1, cCnt1 + 6].Value2 = StaticVariable.outParam2 != "" ? StaticVariable.outParam2 : String.Empty;
}
NwSheet.Cells[rCnt1, cCnt1 + 1].Value2 = StaticVariable.resultOut;
NwSheet.Cells[rCnt1, cCnt1 + 2].Value2 = StaticVariable.resultDescription;
}
else if (Convert.ToString(NwSheet.Cells[rCnt1, cCnt1].Value2) == "N")
{
MessageBox.Show("Result is No");
}
else if (Convert.ToString(NwSheet.Cells[rCnt1, cCnt1].Value2) == "EOF")
{
MessageBox.Show("End of File");
}
}
}
workbook.Save();
workbook.Close(true, Missing.Value, Missing.Value);
appExl.Quit();
System.Runtime.InteropServices.Marshal.FinalReleaseComObject(ShtRange);
System.Runtime.InteropServices.Marshal.FinalReleaseComObject(NwSheet);
System.Runtime.InteropServices.Marshal.FinalReleaseComObject(workbook);
System.Runtime.InteropServices.Marshal.FinalReleaseComObject(appExl);
}
catch (Exception)
{
workbook.Close(true, Missing.Value, Missing.Value);
}
finally
{
GC.Collect();
GC.WaitForPendingFinalizers();
System.Runtime.InteropServices.Marshal.CleanupUnusedObjectsInCurrentContext();
}
}
//code for reading excel data in datatable
public void testExcel(string sheetName)
{
try
{
MessageBox.Show(sheetName);
foreach(Process p in Process.GetProcessesByName("EXCEL"))
{
p.Kill();
}
//string fileName = "E:\\inputSheet";
Excel.Application oXL;
Workbook oWB;
Worksheet oSheet;
Range oRng;
// creat a Application object
oXL = new Excel.Application();
// get WorkBook object
oWB = oXL.Workbooks.Open(sheetName);
// get WorkSheet object
oSheet = (Microsoft.Office.Interop.Excel.Worksheet)oWB.Sheets[1];
System.Data.DataTable dt = new System.Data.DataTable();
//DataSet ds = new DataSet();
//ds.Tables.Add(dt);
DataRow dr;
StringBuilder sb = new StringBuilder();
int jValue = oSheet.UsedRange.Cells.Columns.Count;
int iValue = oSheet.UsedRange.Cells.Rows.Count;
// get data columns
for (int j = 1; j <= jValue; j++)
{
oRng = (Microsoft.Office.Interop.Excel.Range)oSheet.Cells[1, j];
string strValue = oRng.Text.ToString();
dt.Columns.Add(strValue, System.Type.GetType("System.String"));
}
//string colString = sb.ToString().Trim();
//string[] colArray = colString.Split(':');
// get data in cell
for (int i = 2; i <= iValue; i++)
{
dr = dt.NewRow();
for (int j = 1; j <= jValue; j++)
{
oRng = (Microsoft.Office.Interop.Excel.Range)oSheet.Cells[i, j];
string strValue = oRng.Text.ToString();
dr[j - 1] = strValue;
}
dt.Rows.Add(dr);
}
if(StaticVariable.dtExcel != null)
{
StaticVariable.dtExcel.Clear();
StaticVariable.dtExcel = dt.Copy();
}
else
StaticVariable.dtExcel = dt.Copy();
oWB.Close(true, Missing.Value, Missing.Value);
oXL.Quit();
MessageBox.Show(sheetName);
}
catch (Exception ex)
{
MessageBox.Show(ex.Message);
}
finally
{
}
}
//code for class initialize
public static void startTesting(TestContext context)
{
Playback.Initialize();
ReadExcel myClassObj = new ReadExcel();
string sheetName="";
StreamReader sr = new StreamReader(#"E:\SaveSheetName.txt");
sheetName = sr.ReadLine();
sr.Close();
myClassObj.excelRead(sheetName);
myClassObj.testExcel(sheetName);
}
//code for test initalize
public void runValidatonTest()
{
DataTable dtFinal = StaticVariable.dtExcel.Copy();
for (int i = 0; i < dtFinal.Rows.Count; i++)
{
if (TestContext.TestName == dtFinal.Rows[i][2].ToString() && dtFinal.Rows[i][3].ToString() == "Y" && dtFinal.Rows[i][4].ToString() == "TRUE")
{
MessageBox.Show(TestContext.TestName);
MessageBox.Show(dtFinal.Rows[i][2].ToString());
StaticVariable.runValidateResult = "true";
break;
}
}
//StaticVariable.dtExcel = dtFinal.Copy();
}
I'd recommend you to use Bytescout Spreadsheet.
https://bytescout.com/products/developer/spreadsheetsdk/bytescoutspreadsheetsdk.html
I tried it with Monodevelop in Unity3D and it is pretty straight forward. Check this sample code to see how the library works:
https://bytescout.com/products/developer/spreadsheetsdk/read-write-excel.html

export data in a list and array to excel using c#

I have a list of numbers in one class. I am trying to export the data in the list and the data in array[4,4] to Excel. I would like the list as one column with the header and the array as a table with headers (column name) and row names.
I did not find a proper solution to try
public void Data_to_Excel()
{
//start excel
NsExcel.Application excapp = new Microsoft.Office.Interop.Excel.Application();
//if you want to make excel visible
excapp.Visible = true;
//create a blank workbook
var workbook = excapp.Workbooks.Add(NsExcel.XlWBATemplate.xlWBATWorksheet);
//or open one - this is no pleasant, but yue're probably interested in the first parameter
string workbookPath = " filepath here ";
workbook = excapp.Workbooks.Open(workbookPath, 0, false, 5, "", "", false, Excel.XlPlatform.xlWindows, "", true, false, 0, true, false, false);
//Not done yet. You have to work on a specific sheet - note the cast
//You may not have any sheets at all. Then you have to add one with NsExcel.Worksheet.Add()
var sheet = (NsExcel.Worksheet)workbook.Sheets[1]; //indexing starts from 1
//do something usefull: you select now an individual cell
//var range = sheet.get_Range("A1", "A1");
//range.Value2 = "test"; //Value2 is not a typo
//now the list
string cellName;
int counter = 1;
foreach (var item in Sim_Obj.Bottom_Rows_Count_Stack)
{
cellName = "A" + counter.ToString();
var range = sheet.get_Range(cellName, cellName);
range.Value2 = item.ToString();
++counter;
}
workbook.SaveAs("Cash_Surge_Sim_Results.xlsx");
}
I have the 20 results in the rich text box in a list.
I have table layout data in an array[4,4]
I want the array data as a table form in Excel and
the list data as one column.

Selecting a range in a worksheet C#

I currently have the following code opening and reading in an excel spreadsheet:
var connectionString = string.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=Excel 12.0;", fileNameTextBox.Text);
var queryString = String.Format("SELECT * FROM [{0}]",DETAILS_SHEET_NAME);
var adapter = new OleDbDataAdapter(queryString, connectionString);
var ds = new DataSet();
adapter.Fill(ds, DETAILS_SHEET_NAME);
DataTable data = ds.Tables[DETAILS_SHEET_NAME];
dataGridView1.DataSource = data;
dataGridView1.AutoResizeColumns(DataGridViewAutoSizeColumnsMode.AllCellsExceptHeader);
This is all good and well except I'm not interested in the first row (Possibly first two rows as row 2 is headers) of the worksheet. How can I modify the select Query to select a range like I would in excel?
I'm interested in reading in say columns A-N in rows all rows from 2 onwards that contain data.
I also need to access a couple of specific cells on a different worksheet, I assume I have to build another adaptor with a different query string for each of those cells?
Modify Select statement including just the columns you need instead of wildcard "*" like in the following example:
("SELECT Column1, Column2 FROM DETAILS_SHEET_NAME");
You can apply additional logic in order to remove unnecessary rows, for example, a "paging solution" (i.e. selecting rows from N to M) like the following one:
Assuming the Database Table "TBL_ITEM" contains two columns (fields) of interest: “Item” column, representing the unique ID and “Rank”, which is used for sorting in ascending order, the general paging problem is stated as following: Select N-rows from the table ordered by Rank offsetting (i.e. skipping) (M-N) rows:
SELECT TOP N Item,
Rank FROM (SELECT TOP M Rank, Item FROM TBL_ITEM ORDER BY Rank)
AS [SUB_TAB] ORDER BY Rank DESC
This solution and its extensions/samples are thoroughly discussed in my article Pure SQL solution to Database Table Paging (link: http://www.codeproject.com/Tips/441079/Pure-SQL-solution-to-Database-Table-Paging)
Finally, you can use a code snippet shown below in Listing 2 to export a content of DataTable object in Excel file with plenty of customization features that could be added to a code;
Listing 2. Export DataTable to Excel File (2007/2010):
internal static bool Export2Excel(DataTable dataTable, bool Interactive)
{
object misValue = System.Reflection.Missing.Value;
// Note: don't include Microsoft.Office.Interop.Excel in reference (using),
// it will cause ambiguity w/System.Data: both have DataTable obj
Microsoft.Office.Interop.Excel.Application _appExcel = null;
Microsoft.Office.Interop.Excel.Workbook _excelWorkbook = null;
Microsoft.Office.Interop.Excel.Worksheet _excelWorksheet = null;
try
{
// excel app object
_appExcel = new Microsoft.Office.Interop.Excel.Application();
// make it visible to User if Interactive flag is set
_appExcel.Visible = Interactive;
// excel workbook object added to app
_excelWorkbook = _appExcel.Workbooks.Add(misValue);
_excelWorksheet = _appExcel.ActiveWorkbook.ActiveSheet
as Microsoft.Office.Interop.Excel.Worksheet;
// column names row (range obj)
Microsoft.Office.Interop.Excel.Range _columnsNameRange;
_columnsNameRange = _excelWorksheet.get_Range("A1", misValue);
_columnsNameRange = _columnsNameRange.get_Resize(1, dataTable.Columns.Count);
// data range obj
Microsoft.Office.Interop.Excel.Range _dataRange;
_dataRange = _excelWorksheet.get_Range("A2", misValue);
_dataRange = _dataRange.get_Resize(dataTable.Rows.Count, dataTable.Columns.Count);
// column names array to be assigned to columnNameRange
string[] _arrColumnNames = new string[dataTable.Columns.Count];
// 2d-array of data to be assigned to _dataRange
string[,] _arrData = new string[dataTable.Rows.Count, dataTable.Columns.Count];
// populate both arrays: _arrColumnNames and _arrData
// note: 2d-array structured as row[idx=0], col[idx=1]
for (int i = 0; i < dataTable.Columns.Count; i++)
{
for (int j = 0; j < dataTable.Rows.Count; j++)
{
_arrColumnNames[i] = dataTable.Columns[i].ColumnName;
_arrData[j, i] = dataTable.Rows[j][i].ToString();
}
}
//assign column names array to _columnsNameRange obj
_columnsNameRange.set_Value(misValue, _arrColumnNames);
//assign data array to _dataRange obj
_dataRange.set_Value(misValue, _arrData);
// save and close if Interactive flag not set
if (!Interactive)
{
// Excel 2010 - "14.0"
// Excel 2007 - "12.0"
string _ver = _appExcel.Version;
string _fileName ="TableExport_" +
DateTime.Today.ToString("yyyy_MM_dd") + "-" +
DateTime.Now.ToString("hh_mm_ss");
// check version and select file extension
if (_ver == "14.0" || _ver == "12.0") { _fileName += ".xlsx";}
else { _fileName += ".xls"; }
// save and close Excel workbook
_excelWorkbook.Close(true, "{DRIVE LETTER}:\\" + _fileName, misValue);
}
return true;
}
catch (Exception ex) { throw; }
finally
{
// quit excel app process
if (_appExcel != null)
{
_appExcel.UserControl = false;
_appExcel.Quit();
_appExcel = null;
misValue = null;
}
}
}
You can simply ask for no headers. Modify your connection string, add HDR=No.
For your second issue, I found this post, Maybe you'll find it helpful.

Optimal way to Read an Excel file (.xls/.xlsx)

I know that there are different ways to read an Excel file:
Iterop
Oledb
Open Xml SDK
Compatibility is not a question because the program will be executed in a controlled environment.
My Requirement :
Read a file to a DataTable / CUstom Entities (I don't know how to make dynamic properties/fields to an object[column names will be variating in an Excel file])
Use DataTable/Custom Entities to perform some operations using its data.
Update DataTable with the results of the operations
Write it back to excel file.
Which would be simpler.
Also if possible advice me on custom Entities (adding properties/fields to an object dynamically)
Take a look at Linq-to-Excel. It's pretty neat.
var book = new LinqToExcel.ExcelQueryFactory(#"File.xlsx");
var query =
from row in book.Worksheet("Stock Entry")
let item = new
{
Code = row["Code"].Cast<string>(),
Supplier = row["Supplier"].Cast<string>(),
Ref = row["Ref"].Cast<string>(),
}
where item.Supplier == "Walmart"
select item;
It also allows for strongly-typed row access too.
I realize this question was asked nearly 7 years ago but it's still a top Google search result for certain keywords regarding importing excel data with C#, so I wanted to provide an alternative based on some recent tech developments.
Importing Excel data has become such a common task to my everyday duties, that I've streamlined the process and documented the method on my blog: best way to read excel file in c#.
I use NPOI because it can read/write Excel files without Microsoft Office installed and it doesn't use COM+ or any interops. That means it can work in the cloud!
But the real magic comes from pairing up with NPOI Mapper from Donny Tian because it allows me to map the Excel columns to properties in my C# classes without writing any code. It's beautiful.
Here is the basic idea:
I create a .net class that matches/maps the Excel columns I'm interested in:
class CustomExcelFormat
{
[Column("District")]
public int District { get; set; }
[Column("DM")]
public string FullName { get; set; }
[Column("Email Address")]
public string EmailAddress { get; set; }
[Column("Username")]
public string Username { get; set; }
public string FirstName
{
get
{
return Username.Split('.')[0];
}
}
public string LastName
{
get
{
return Username.Split('.')[1];
}
}
}
Notice, it allows me to map based on column name if I want to!
Then when I process the excel file all I need to do is something like this:
public void Execute(string localPath, int sheetIndex)
{
IWorkbook workbook;
using (FileStream file = new FileStream(localPath, FileMode.Open, FileAccess.Read))
{
workbook = WorkbookFactory.Create(file);
}
var importer = new Mapper(workbook);
var items = importer.Take<CustomExcelFormat>(sheetIndex);
foreach(var item in items)
{
var row = item.Value;
if (string.IsNullOrEmpty(row.EmailAddress))
continue;
UpdateUser(row);
}
DataContext.SaveChanges();
}
Now, admittedly, my code does not modify the Excel file itself. I am instead saving the data to a database using Entity Framework (that's why you see "UpdateUser" and "SaveChanges" in my example). But there is already a good discussion on SO about how to save/modify a file using NPOI.
Using OLE Query, it's quite simple (e.g. sheetName is Sheet1):
DataTable LoadWorksheetInDataTable(string fileName, string sheetName)
{
DataTable sheetData = new DataTable();
using (OleDbConnection conn = this.returnConnection(fileName))
{
conn.Open();
// retrieve the data using data adapter
OleDbDataAdapter sheetAdapter = new OleDbDataAdapter("select * from [" + sheetName + "$]", conn);
sheetAdapter.Fill(sheetData);
conn.Close();
}
return sheetData;
}
private OleDbConnection returnConnection(string fileName)
{
return new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + fileName + "; Jet OLEDB:Engine Type=5;Extended Properties=\"Excel 8.0;\"");
}
For newer Excel versions:
return new OleDbConnection("Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileName + ";Extended Properties=Excel 12.0;");
You can also use Excel Data Reader an open source project on CodePlex. Its works really well to export data from Excel sheets.
The sample code given on the link specified:
FileStream stream = File.Open(filePath, FileMode.Open, FileAccess.Read);
//1. Reading from a binary Excel file ('97-2003 format; *.xls)
IExcelDataReader excelReader = ExcelReaderFactory.CreateBinaryReader(stream);
//...
//2. Reading from a OpenXml Excel file (2007 format; *.xlsx)
IExcelDataReader excelReader = ExcelReaderFactory.CreateOpenXmlReader(stream);
//...
//3. DataSet - The result of each spreadsheet will be created in the result.Tables
DataSet result = excelReader.AsDataSet();
//...
//4. DataSet - Create column names from first row
excelReader.IsFirstRowAsColumnNames = true;
DataSet result = excelReader.AsDataSet();
//5. Data Reader methods
while (excelReader.Read())
{
//excelReader.GetInt32(0);
}
//6. Free resources (IExcelDataReader is IDisposable)
excelReader.Close();
Reference: How do I import from Excel to a DataSet using Microsoft.Office.Interop.Excel?
Try to use this free way to this, https://freenetexcel.codeplex.com
Workbook workbook = new Workbook();
workbook.LoadFromFile(#"..\..\parts.xls",ExcelVersion.Version97to2003);
//Initialize worksheet
Worksheet sheet = workbook.Worksheets[0];
DataTable dataTable = sheet.ExportDataTable();
If you can restrict it to just (Open Office XML format) *.xlsx files, then probably the most popular library would be EPPLus.
Bonus is, there are no other dependencies. Just install using nuget:
Install-Package EPPlus
Try to use Aspose.cells library (not free, but trial is enough to read), it is quite good
Install-package Aspose.cells
There is sample code:
using Aspose.Cells;
using System;
namespace ExcelReader
{
class Program
{
static void Main(string[] args)
{
// Replace path for your file
readXLS(#"C:\MyExcelFile.xls"); // or "*.xlsx"
Console.ReadKey();
}
public static void readXLS(string PathToMyExcel)
{
//Open your template file.
Workbook wb = new Workbook(PathToMyExcel);
//Get the first worksheet.
Worksheet worksheet = wb.Worksheets[0];
//Get cells
Cells cells = worksheet.Cells;
// Get row and column count
int rowCount = cells.MaxDataRow;
int columnCount = cells.MaxDataColumn;
// Current cell value
string strCell = "";
Console.WriteLine(String.Format("rowCount={0}, columnCount={1}", rowCount, columnCount));
for (int row = 0; row <= rowCount; row++) // Numeration starts from 0 to MaxDataRow
{
for (int column = 0; column <= columnCount; column++) // Numeration starts from 0 to MaxDataColumn
{
strCell = "";
strCell = Convert.ToString(cells[row, column].Value);
if (String.IsNullOrEmpty(strCell))
{
continue;
}
else
{
// Do your staff here
Console.WriteLine(strCell);
}
}
}
}
}
}
Read from excel, modify and write back
/// <summary>
/// /Reads an excel file and converts it into dataset with each sheet as each table of the dataset
/// </summary>
/// <param name="filename"></param>
/// <param name="headers">If set to true the first row will be considered as headers</param>
/// <returns></returns>
public DataSet Import(string filename, bool headers = true)
{
var _xl = new Excel.Application();
var wb = _xl.Workbooks.Open(filename);
var sheets = wb.Sheets;
DataSet dataSet = null;
if (sheets != null && sheets.Count != 0)
{
dataSet = new DataSet();
foreach (var item in sheets)
{
var sheet = (Excel.Worksheet)item;
DataTable dt = null;
if (sheet != null)
{
dt = new DataTable();
var ColumnCount = ((Excel.Range)sheet.UsedRange.Rows[1, Type.Missing]).Columns.Count;
var rowCount = ((Excel.Range)sheet.UsedRange.Columns[1, Type.Missing]).Rows.Count;
for (int j = 0; j < ColumnCount; j++)
{
var cell = (Excel.Range)sheet.Cells[1, j + 1];
var column = new DataColumn(headers ? cell.Value : string.Empty);
dt.Columns.Add(column);
}
for (int i = 0; i < rowCount; i++)
{
var r = dt.NewRow();
for (int j = 0; j < ColumnCount; j++)
{
var cell = (Excel.Range)sheet.Cells[i + 1 + (headers ? 1 : 0), j + 1];
r[j] = cell.Value;
}
dt.Rows.Add(r);
}
}
dataSet.Tables.Add(dt);
}
}
_xl.Quit();
return dataSet;
}
public string Export(DataTable dt, bool headers = false)
{
var wb = _xl.Workbooks.Add();
var sheet = (Excel.Worksheet)wb.ActiveSheet;
//process columns
for (int i = 0; i < dt.Columns.Count; i++)
{
var col = dt.Columns[i];
//added columns to the top of sheet
var currentCell = (Excel.Range)sheet.Cells[1, i + 1];
currentCell.Value = col.ToString();
currentCell.Font.Bold = true;
//process rows
for (int j = 0; j < dt.Rows.Count; j++)
{
var row = dt.Rows[j];
//added rows to sheet
var cell = (Excel.Range)sheet.Cells[j + 1 + 1, i + 1];
cell.Value = row[i];
}
currentCell.EntireColumn.AutoFit();
}
var fileName="{somepath/somefile.xlsx}";
wb.SaveCopyAs(fileName);
_xl.Quit();
return fileName;
}
I used Office's NuGet Package: DocumentFormat.OpenXml and pieced together the code from that component's doc site.
With the below helper code, was similar in complexity to my other CSV file format parsing in that project...
public static async Task ImportXLSX(Stream stream, string sheetName) {
{
// This was necessary for my Blazor project, which used a BrowserFileStream object
MemoryStream ms = new MemoryStream();
await stream.CopyToAsync(ms);
using (var document = SpreadsheetDocument.Open(ms, false))
{
// Retrieve a reference to the workbook part.
WorkbookPart wbPart = document.WorkbookPart;
// Find the sheet with the supplied name, and then use that
// Sheet object to retrieve a reference to the first worksheet.
Sheet theSheet = wbPart?.Workbook.Descendants<Sheet>().Where(s => s?.Name == sheetName).FirstOrDefault();
// Throw an exception if there is no sheet.
if (theSheet == null)
{
throw new ArgumentException("sheetName");
}
WorksheetPart wsPart = (WorksheetPart)(wbPart.GetPartById(theSheet.Id));
// For shared strings, look up the value in the
// shared strings table.
var stringTable =
wbPart.GetPartsOfType<SharedStringTablePart>()
.FirstOrDefault();
// I needed to grab 4 cells from each row
// Starting at row 11, until the cell in column A is blank
int row = 11;
while (true) {
var accountNameCell = GetCell(wsPart, "A" + row.ToString());
var accountName = GetValue(accountNameCell, stringTable);
if (string.IsNullOrEmpty(accountName)) {
break;
}
var investmentNameCell = GetCell(wsPart, "B" + row.ToString());
var investmentName = GetValue(investmentNameCell, stringTable);
var symbolCell = GetCell(wsPart, "D" + row.ToString());
var symbol = GetValue(symbolCell, stringTable);
var marketValue = GetCell(wsPart, "J" + row.ToString()).InnerText;
// DO STUFF with data
row++;
}
}
}
private static string? GetValue(Cell cell, SharedStringTablePart stringTable) {
try {
return stringTable.SharedStringTable.ElementAt(int.Parse(cell.InnerText)).InnerText;
} catch (Exception) {
return null;
}
}
private static Cell GetCell(WorksheetPart wsPart, string cellReference) {
return wsPart.Worksheet.Descendants<Cell>().Where(c => c.CellReference.Value == cellReference)?.FirstOrDefault();
}

Using Excel OleDb to get sheet names IN SHEET ORDER

I'm using OleDb to read from an excel workbook with many sheets.
I need to read the sheet names, but I need them in the order they are defined in the spreadsheet; so If I have a file that looks like this;
|_____|_____|____|____|____|____|____|____|____|
|_____|_____|____|____|____|____|____|____|____|
|_____|_____|____|____|____|____|____|____|____|
\__GERMANY__/\__UK__/\__IRELAND__/
Then I need to get the dictionary
1="GERMANY",
2="UK",
3="IRELAND"
I've tried using OleDbConnection.GetOleDbSchemaTable(), and that gives me the list of names, but it alphabetically sorts them. The alpha-sort means I don't know which sheet number a particular name corresponds to. So I get;
GERMANY, IRELAND, UK
which has changed the order of UK and IRELAND.
The reason I need it to be sorted is that I have to let the user choose a range of data by name or index; they can ask for 'all the data from GERMANY to IRELAND' or 'data from sheet 1 to sheet 3'.
Any ideas would be greatly appreciated.
if I could use the office interop classes, this would be straightforward. Unfortunately, I can't because the interop classes don't work reliably in non-interactive environments such as windows services and ASP.NET sites, so I needed to use OLEDB.
Can you not just loop through the sheets from 0 to Count of names -1? that way you should get them in the correct order.
Edit
I noticed through the comments that there are a lot of concerns about using the Interop classes to retrieve the sheet names. Therefore here is an example using OLEDB to retrieve them:
/// <summary>
/// This method retrieves the excel sheet names from
/// an excel workbook.
/// </summary>
/// <param name="excelFile">The excel file.</param>
/// <returns>String[]</returns>
private String[] GetExcelSheetNames(string excelFile)
{
OleDbConnection objConn = null;
System.Data.DataTable dt = null;
try
{
// Connection String. Change the excel file to the file you
// will search.
String connString = "Provider=Microsoft.Jet.OLEDB.4.0;" +
"Data Source=" + excelFile + ";Extended Properties=Excel 8.0;";
// Create connection object by using the preceding connection string.
objConn = new OleDbConnection(connString);
// Open connection with the database.
objConn.Open();
// Get the data table containg the schema guid.
dt = objConn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
if(dt == null)
{
return null;
}
String[] excelSheets = new String[dt.Rows.Count];
int i = 0;
// Add the sheet name to the string array.
foreach(DataRow row in dt.Rows)
{
excelSheets[i] = row["TABLE_NAME"].ToString();
i++;
}
// Loop through all of the sheets if you want too...
for(int j=0; j < excelSheets.Length; j++)
{
// Query each excel sheet.
}
return excelSheets;
}
catch(Exception ex)
{
return null;
}
finally
{
// Clean up.
if(objConn != null)
{
objConn.Close();
objConn.Dispose();
}
if(dt != null)
{
dt.Dispose();
}
}
}
Extracted from Article on the CodeProject.
Since above code do not cover procedures for extracting list of sheet name for Excel 2007,following code will be applicable for both Excel(97-2003) and Excel 2007 too:
public List<string> ListSheetInExcel(string filePath)
{
OleDbConnectionStringBuilder sbConnection = new OleDbConnectionStringBuilder();
String strExtendedProperties = String.Empty;
sbConnection.DataSource = filePath;
if (Path.GetExtension(filePath).Equals(".xls"))//for 97-03 Excel file
{
sbConnection.Provider = "Microsoft.Jet.OLEDB.4.0";
strExtendedProperties = "Excel 8.0;HDR=Yes;IMEX=1";//HDR=ColumnHeader,IMEX=InterMixed
}
else if (Path.GetExtension(filePath).Equals(".xlsx")) //for 2007 Excel file
{
sbConnection.Provider = "Microsoft.ACE.OLEDB.12.0";
strExtendedProperties = "Excel 12.0;HDR=Yes;IMEX=1";
}
sbConnection.Add("Extended Properties",strExtendedProperties);
List<string> listSheet = new List<string>();
using (OleDbConnection conn = new OleDbConnection(sbConnection.ToString()))
{
conn.Open();
DataTable dtSheet = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
foreach (DataRow drSheet in dtSheet.Rows)
{
if (drSheet["TABLE_NAME"].ToString().Contains("$"))//checks whether row contains '_xlnm#_FilterDatabase' or sheet name(i.e. sheet name always ends with $ sign)
{
listSheet.Add(drSheet["TABLE_NAME"].ToString());
}
}
}
return listSheet;
}
Above function returns list of sheet in particular excel file for both excel type(97,2003,2007).
Can't find this in actual MSDN documentation, but a moderator in the forums said
I am afraid that OLEDB does not preserve the sheet order as they were in Excel
Excel Sheet Names in Sheet Order
Seems like this would be a common enough requirement that there would be a decent workaround.
This is short, fast, safe, and usable...
public static List<string> ToExcelsSheetList(string excelFilePath)
{
List<string> sheets = new List<string>();
using (OleDbConnection connection =
new OleDbConnection((excelFilePath.TrimEnd().ToLower().EndsWith("x"))
? "Provider=Microsoft.ACE.OLEDB.12.0;Data Source='" + excelFilePath + "';" + "Extended Properties='Excel 12.0 Xml;HDR=YES;'"
: "provider=Microsoft.Jet.OLEDB.4.0;Data Source='" + excelFilePath + "';Extended Properties=Excel 8.0;"))
{
connection.Open();
DataTable dt = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
foreach (DataRow drSheet in dt.Rows)
if (drSheet["TABLE_NAME"].ToString().Contains("$"))
{
string s = drSheet["TABLE_NAME"].ToString();
sheets.Add(s.StartsWith("'")?s.Substring(1, s.Length - 3): s.Substring(0, s.Length - 1));
}
connection.Close();
}
return sheets;
}
Another way:
a xls(x) file is just a collection of *.xml files stored in a *.zip container.
unzip the file "app.xml" in the folder docProps.
<?xml version="1.0" encoding="UTF-8" standalone="true"?>
-<Properties xmlns:vt="http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes" xmlns="http://schemas.openxmlformats.org/officeDocument/2006/extended-properties">
<TotalTime>0</TotalTime>
<Application>Microsoft Excel</Application>
<DocSecurity>0</DocSecurity>
<ScaleCrop>false</ScaleCrop>
-<HeadingPairs>
-<vt:vector baseType="variant" size="2">
-<vt:variant>
<vt:lpstr>Arbeitsblätter</vt:lpstr>
</vt:variant>
-<vt:variant>
<vt:i4>4</vt:i4>
</vt:variant>
</vt:vector>
</HeadingPairs>
-<TitlesOfParts>
-<vt:vector baseType="lpstr" size="4">
<vt:lpstr>Tabelle3</vt:lpstr>
<vt:lpstr>Tabelle4</vt:lpstr>
<vt:lpstr>Tabelle1</vt:lpstr>
<vt:lpstr>Tabelle2</vt:lpstr>
</vt:vector>
</TitlesOfParts>
<Company/>
<LinksUpToDate>false</LinksUpToDate>
<SharedDoc>false</SharedDoc>
<HyperlinksChanged>false</HyperlinksChanged>
<AppVersion>14.0300</AppVersion>
</Properties>
The file is a german file (Arbeitsblätter = worksheets).
The table names (Tabelle3 etc) are in the correct order. You just need to read these tags;)
regards
I have created the below function using the information provided in the answer from #kraeppy (https://stackoverflow.com/a/19930386/2617732). This requires the .net framework v4.5 to be used and requires a reference to System.IO.Compression. This only works for xlsx files and not for the older xls files.
using System.IO.Compression;
using System.Xml;
using System.Xml.Linq;
static IEnumerable<string> GetWorksheetNamesOrdered(string fileName)
{
//open the excel file
using (FileStream data = new FileStream(fileName, FileMode.Open))
{
//unzip
ZipArchive archive = new ZipArchive(data);
//select the correct file from the archive
ZipArchiveEntry appxmlFile = archive.Entries.SingleOrDefault(e => e.FullName == "docProps/app.xml");
//read the xml
XDocument xdoc = XDocument.Load(appxmlFile.Open());
//find the titles element
XElement titlesElement = xdoc.Descendants().Where(e => e.Name.LocalName == "TitlesOfParts").Single();
//extract the worksheet names
return titlesElement
.Elements().Where(e => e.Name.LocalName == "vector").Single()
.Elements().Where(e => e.Name.LocalName == "lpstr")
.Select(e => e.Value);
}
}
I like the idea of #deathApril to name the sheets as 1_Germany, 2_UK, 3_IRELAND. I also got your issue to do this rename for hundreds of sheets. If you don't have a problem to rename the sheet name then you can use this macro to do it for you. It will take less than seconds to rename all sheet names. unfortunately ODBC, OLEDB return the sheet name order by asc. There is no replacement for that. You have to either use COM or rename your name to be in the order.
Sub Macro1()
'
' Macro1 Macro
'
'
Dim i As Integer
For i = 1 To Sheets.Count
Dim prefix As String
prefix = i
If Len(prefix) < 4 Then
prefix = "000"
ElseIf Len(prefix) < 3 Then
prefix = "00"
ElseIf Len(prefix) < 2 Then
prefix = "0"
End If
Dim sheetName As String
sheetName = Sheets(i).Name
Dim names
names = Split(sheetName, "-")
If (UBound(names) > 0) And IsNumeric(names(0)) Then
'do nothing
Else
Sheets(i).Name = prefix & i & "-" & Sheets(i).Name
End If
Next
End Sub
UPDATE:
After reading #SidHoland comment regarding BIFF an idea flashed. The following steps can be done through code. Don't know if you really want to do that to get the sheet names in the same order. Let me know if you need help to do this through code.
1. Consider XLSX as a zip file. Rename *.xlsx into *.zip
2. Unzip
3. Go to unzipped folder root and open /docprops/app.xml
4. This xml contains the sheet name in the same order of what you see.
5. Parse the xml and get the sheet names
UPDATE:
Another solution - NPOI might be helpful here
http://npoi.codeplex.com/
FileStream file = new FileStream(#"yourexcelfilename", FileMode.Open, FileAccess.Read);
HSSFWorkbook hssfworkbook = new HSSFWorkbook(file);
for (int i = 0; i < hssfworkbook.NumberOfSheets; i++)
{
Console.WriteLine(hssfworkbook.GetSheetName(i));
}
file.Close();
This solution works for xls. I didn't try xlsx.
Thanks,
Esen
This worked for me. Stolen from here: How do you get the name of the first page of an excel workbook?
object opt = System.Reflection.Missing.Value;
Excel.Application app = new Microsoft.Office.Interop.Excel.Application();
Excel.Workbook workbook = app.Workbooks.Open(WorkBookToOpen,
opt, opt, opt, opt, opt, opt, opt,
opt, opt, opt, opt, opt, opt, opt);
Excel.Worksheet worksheet = workbook.Worksheets[1] as Microsoft.Office.Interop.Excel.Worksheet;
string firstSheetName = worksheet.Name;
Try this. Here is the code to get the sheet names in order.
private Dictionary<int, string> GetExcelSheetNames(string fileName)
{
Excel.Application _excel = null;
Excel.Workbook _workBook = null;
Dictionary<int, string> excelSheets = new Dictionary<int, string>();
try
{
object missing = Type.Missing;
object readOnly = true;
Excel.XlFileFormat.xlWorkbookNormal
_excel = new Excel.ApplicationClass();
_excel.Visible = false;
_workBook = _excel.Workbooks.Open(fileName, 0, readOnly, 5, missing,
missing, true, Excel.XlPlatform.xlWindows, "\\t", false, false, 0, true, true, missing);
if (_workBook != null)
{
int index = 0;
foreach (Excel.Worksheet sheet in _workBook.Sheets)
{
// Can get sheet names in order they are in workbook
excelSheets.Add(++index, sheet.Name);
}
}
}
catch (Exception e)
{
return null;
}
finally
{
if (_excel != null)
{
if (_workBook != null)
_workBook.Close(false, Type.Missing, Type.Missing);
_excel.Application.Quit();
}
_excel = null;
_workBook = null;
}
return excelSheets;
}
As per MSDN, In a case of spreadsheets inside of Excel it might not work because Excel files are not real databases. So you will be not able to get the sheets name in order of their visualization in workbook.
Code to get sheets name as per their visual appearance using interop:
Add reference to Microsoft Excel 12.0 Object Library.
Following code will give the sheets name in the actual order stored in workbook, not the sorted name.
Sample Code:
using Microsoft.Office.Interop.Excel;
string filename = "C:\\romil.xlsx";
object missing = System.Reflection.Missing.Value;
Microsoft.Office.Interop.Excel.Application excel = new Microsoft.Office.Interop.Excel.Application();
Microsoft.Office.Interop.Excel.Workbook wb =excel.Workbooks.Open(filename, missing, missing, missing, missing,missing, missing, missing, missing, missing, missing, missing, missing, missing, missing);
ArrayList sheetname = new ArrayList();
foreach (Microsoft.Office.Interop.Excel.Worksheet sheet in wb.Sheets)
{
sheetname.Add(sheet.Name);
}
I don't see any documentation that says the order in app.xml is guaranteed to be the order of the sheets. It PROBABLY is, but not according to the OOXML specification.
The workbook.xml file, on the other hand, includes the sheetId attribute, which does determine the sequence - from 1 to the number of sheets. This is according to the OOXML specification. workbook.xml is described as the place where the sequence of the sheets is kept.
So reading workbook.xml after it is extracted form the XLSX would be my recommendation. NOT app.xml. Instead of docProps/app.xml, use xl/workbook.xml and look at the element, as shown here -
`
<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
<fileVersion appName="xl" lastEdited="5" lowestEdited="5" rupBuild="9303" />
<workbookPr defaultThemeVersion="124226" />
- <bookViews>
<workbookView xWindow="120" yWindow="135" windowWidth="19035" windowHeight="8445" />
</bookViews>
- <sheets>
<sheet name="By song" sheetId="1" r:id="rId1" />
<sheet name="By actors" sheetId="2" r:id="rId2" />
<sheet name="By pit" sheetId="3" r:id="rId3" />
</sheets>
- <definedNames>
<definedName name="_xlnm._FilterDatabase" localSheetId="0" hidden="1">'By song'!$A$1:$O$59</definedName>
</definedNames>
<calcPr calcId="145621" />
</workbook>
`

Categories