Selecting a range in a worksheet C# - c#

I currently have the following code opening and reading in an excel spreadsheet:
var connectionString = string.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=Excel 12.0;", fileNameTextBox.Text);
var queryString = String.Format("SELECT * FROM [{0}]",DETAILS_SHEET_NAME);
var adapter = new OleDbDataAdapter(queryString, connectionString);
var ds = new DataSet();
adapter.Fill(ds, DETAILS_SHEET_NAME);
DataTable data = ds.Tables[DETAILS_SHEET_NAME];
dataGridView1.DataSource = data;
dataGridView1.AutoResizeColumns(DataGridViewAutoSizeColumnsMode.AllCellsExceptHeader);
This is all good and well except I'm not interested in the first row (Possibly first two rows as row 2 is headers) of the worksheet. How can I modify the select Query to select a range like I would in excel?
I'm interested in reading in say columns A-N in rows all rows from 2 onwards that contain data.
I also need to access a couple of specific cells on a different worksheet, I assume I have to build another adaptor with a different query string for each of those cells?

Modify Select statement including just the columns you need instead of wildcard "*" like in the following example:
("SELECT Column1, Column2 FROM DETAILS_SHEET_NAME");
You can apply additional logic in order to remove unnecessary rows, for example, a "paging solution" (i.e. selecting rows from N to M) like the following one:
Assuming the Database Table "TBL_ITEM" contains two columns (fields) of interest: “Item” column, representing the unique ID and “Rank”, which is used for sorting in ascending order, the general paging problem is stated as following: Select N-rows from the table ordered by Rank offsetting (i.e. skipping) (M-N) rows:
SELECT TOP N Item,
Rank FROM (SELECT TOP M Rank, Item FROM TBL_ITEM ORDER BY Rank)
AS [SUB_TAB] ORDER BY Rank DESC
This solution and its extensions/samples are thoroughly discussed in my article Pure SQL solution to Database Table Paging (link: http://www.codeproject.com/Tips/441079/Pure-SQL-solution-to-Database-Table-Paging)
Finally, you can use a code snippet shown below in Listing 2 to export a content of DataTable object in Excel file with plenty of customization features that could be added to a code;
Listing 2. Export DataTable to Excel File (2007/2010):
internal static bool Export2Excel(DataTable dataTable, bool Interactive)
{
object misValue = System.Reflection.Missing.Value;
// Note: don't include Microsoft.Office.Interop.Excel in reference (using),
// it will cause ambiguity w/System.Data: both have DataTable obj
Microsoft.Office.Interop.Excel.Application _appExcel = null;
Microsoft.Office.Interop.Excel.Workbook _excelWorkbook = null;
Microsoft.Office.Interop.Excel.Worksheet _excelWorksheet = null;
try
{
// excel app object
_appExcel = new Microsoft.Office.Interop.Excel.Application();
// make it visible to User if Interactive flag is set
_appExcel.Visible = Interactive;
// excel workbook object added to app
_excelWorkbook = _appExcel.Workbooks.Add(misValue);
_excelWorksheet = _appExcel.ActiveWorkbook.ActiveSheet
as Microsoft.Office.Interop.Excel.Worksheet;
// column names row (range obj)
Microsoft.Office.Interop.Excel.Range _columnsNameRange;
_columnsNameRange = _excelWorksheet.get_Range("A1", misValue);
_columnsNameRange = _columnsNameRange.get_Resize(1, dataTable.Columns.Count);
// data range obj
Microsoft.Office.Interop.Excel.Range _dataRange;
_dataRange = _excelWorksheet.get_Range("A2", misValue);
_dataRange = _dataRange.get_Resize(dataTable.Rows.Count, dataTable.Columns.Count);
// column names array to be assigned to columnNameRange
string[] _arrColumnNames = new string[dataTable.Columns.Count];
// 2d-array of data to be assigned to _dataRange
string[,] _arrData = new string[dataTable.Rows.Count, dataTable.Columns.Count];
// populate both arrays: _arrColumnNames and _arrData
// note: 2d-array structured as row[idx=0], col[idx=1]
for (int i = 0; i < dataTable.Columns.Count; i++)
{
for (int j = 0; j < dataTable.Rows.Count; j++)
{
_arrColumnNames[i] = dataTable.Columns[i].ColumnName;
_arrData[j, i] = dataTable.Rows[j][i].ToString();
}
}
//assign column names array to _columnsNameRange obj
_columnsNameRange.set_Value(misValue, _arrColumnNames);
//assign data array to _dataRange obj
_dataRange.set_Value(misValue, _arrData);
// save and close if Interactive flag not set
if (!Interactive)
{
// Excel 2010 - "14.0"
// Excel 2007 - "12.0"
string _ver = _appExcel.Version;
string _fileName ="TableExport_" +
DateTime.Today.ToString("yyyy_MM_dd") + "-" +
DateTime.Now.ToString("hh_mm_ss");
// check version and select file extension
if (_ver == "14.0" || _ver == "12.0") { _fileName += ".xlsx";}
else { _fileName += ".xls"; }
// save and close Excel workbook
_excelWorkbook.Close(true, "{DRIVE LETTER}:\\" + _fileName, misValue);
}
return true;
}
catch (Exception ex) { throw; }
finally
{
// quit excel app process
if (_appExcel != null)
{
_appExcel.UserControl = false;
_appExcel.Quit();
_appExcel = null;
misValue = null;
}
}
}

You can simply ask for no headers. Modify your connection string, add HDR=No.
For your second issue, I found this post, Maybe you'll find it helpful.

Related

Excel Exporting with Additional Worksheet

I'm exporting a DataSet to Excel (.xlsx) and for whatever reason, I'm getting an additional (unwanted) worksheet added to my workbook.
My dataTables are setup like so:
DataTable table1 = new DataTable();
table1 = CallReport("ReportName");
DataTable table2 = new DataTable();
table2 = CallReport("ReportName");
DataTable table3 = new DataTable();
table3 = CallReport("ReportName");
//add to DataSet
DataSet exportSet = new DataSet();
table3.TableName = "Combined";
exportSet.Tables.Add(table3);
table2.TableName = "Non-Services";
exportSet.Tables.Add(table2);
table1.TableName = "Services";
exportSet.Tables.Add(table1);
I'm then creating an Excel instance, then using foreach to loop through my DataSet and write the tables to the Excel workbook:
MSClass.CreateExcelFile();
foreach (DataTable table in exportSet.Tables)
{
MSClass.WriteToExcel(table, table.TableName);
}
MSClass.SaveExcelFile(exportPath);
Here is how I'm creating the Excel file, writing to it and saving it within my MSClass:
class MSClass
{
static Microsoft.Office.Interop.Excel.Application xl;
public static void CreateExcelFile()
{
//Creates new instance of Excel.Application()
xl = new Excel.Application();
//Adds new workbook to the application instance
xl.Workbooks.Add();
}
public static void SaveExcelFile(string filePath)
{
//Check to make sure the filePath isn't empty, and if it is, just show the file instance
if (filePath != null && filePath != "")
{
//in case there's an error
try
{
//Makes sure the sheet is in the active session
Excel._Worksheet sheet = xl.ActiveSheet;
//Stops from asking to overwrite the file.
//If I didn't want to overwrite I would be changing the file name everytime.
xl.DisplayAlerts = false;
//Saves the sheet to the file path
sheet.SaveAs(filePath);
//shows the final product
xl.Visible = true;
}
catch (Exception ex)
{
//throw up an Exception for any error and show the error message
throw new Exception("Export To Excel: Excel file could not be saved! Check filePath.\n" + ex.Message);
}
}
else //no file path is given
{
//Just show the current session of the Excel Application
xl.Visible = true;
}
}
public static void WriteToExcel(DataTable table, string sheetName)
{
//Adds new worksheet to workbook
Excel._Worksheet sheet = xl.ActiveWorkbook.Sheets[1];
//Gets count of columns within supplied DataTable
int colCount = table.Columns.Count;
//Creates object array for headers for each column
object[] header = new object[colCount];
//loop to add column names to header object array
for (int i = 0; i < colCount; i++)
{
//Adds column names to header object array
header[i] = table.Columns[i].ColumnName;
}
//Get range of headers
Excel.Range headerRange = sheet.get_Range((Excel.Range)(sheet.Cells[1, 1]), (Excel.Range)(sheet.Cells[1, colCount]));
//Applies the header to Excel file
headerRange.Value = header;
//Adds color to header to make more distinct
headerRange.Interior.Color = ColorTranslator.ToOle(Color.LightGray);
//Bolds the header
headerRange.Font.Bold = true;
//gets count of rows from DataTable
int rowCount = table.Rows.Count;
//Object multidimensional array for the cells within the dataTable
object[,] cells = new object[rowCount, colCount];
//Adds the cells from DataTable to cell Object array
for (int j = 0; j < rowCount; j++)
{
for (int i = 0; i < colCount; i++)
{
//Sets cell values to DataTable values
cells[j, i] = table.Rows[j][i];
}
}
//prints the cell values to Excel cell(s)
sheet.get_Range((Excel.Range)(sheet.Cells[2, 1]), (Excel.Range)(sheet.Cells[rowCount + 1, colCount])).Value = cells;
sheet.Name = sheetName;
sheet.Activate();
xl.Worksheets.Add(sheet);
}
}
I'm unsure what is currently causing the extra sheet to be exported. the sheets export in the order of: Sheet4 | Services | Non-Services | Combined
Where Sheet4 is completely blank, and the other sheets are exported in reversed order.
What is causing my extra (blank) sheet to be added to the workbook?

transfer from c# to excel with multiple sheets

I have a few different dictionaries with different categories of information and I need to output them all into an xls or csv file with multiple spreadsheets. Currently, I have to download each excel file for a specific date range individually and then copy and paste them together so they're on different sheets of the same file. Is there any way to download all of them together in one document? Currently, I use the following code to output their files:
writeCsvToStream(
organize.ToDictionary(k => k.Key, v => v.Value as IacTransmittal), writer
);
ms.Seek(0, SeekOrigin.Begin);
Response.Clear();
Response.AddHeader("Content-Disposition", "attachment; filename=" + fileName);
Response.AddHeader("Content-Length", ms.Length.ToString());
Response.ContentType = "application/octet-stream";
ms.CopyTo(Response.OutputStream);
Response.End();
where writeCsvToStream just creates the text for the individual file.
There are some different options you could use.
ADO.NET Excel driver - with this API you can populate data into Excel documents using SQL style syntax. Each worksheet in the workbook is a table, each column header in a worksheet is a column in that table etc.
Here is a code project article on the exporting to Excel using ADO.NET:
http://www.codeproject.com/Articles/567155/Work-with-MS-Excel-and-ADO-NET
The ADO.NET approach is safe to use in a multi-user, web app environment.
Use OpenXML to export the data
OpenXML is a schema definition for different types of documents and the later versions of Excel (the ones that use .xlsx, .xlsm etc. instead of just .xls) use this format for the documents. The OpenXML schema is huge and somewhat cumbersome, however you can do pretty much anything with it.
Here is a code project article on exporting data to Excel using OpenXML:
http://www.codeproject.com/Articles/692121/Csharp-Export-data-to-Excel-using-OpenXML-librarie
The OpenXML approach is safe to use in a multi-user, web app environment.
A third approach is to use COM automation which is the same as programmatically running an instance of the Excel desktop application and using COM to control the actions of that instance.
Here is an article on that topic:
http://support.microsoft.com/kb/302084
Note that this third approach (office automation) is not safe in a multi-user, web app environment. I.e. it should not be used on a server, only from standalone desktop applications.
If you're open to learning a new library, I highly recommend EPPlus.
I'm making a few assumptions here since you didn't post much code to translate, but an example of usage may look like this:
using OfficeOpenXml;
using OfficeOpenXml.Style;
public static void WriteXlsOutput(Dictionary<string, IacTransmittal> collection) //accepting one dictionary as a parameter
{
using (FileStream outFile = new FileStream("Example.xlsx", FileMode.Create))
{
using (ExcelPackage ePackage = new ExcelPackage(outFile))
{
//group the collection by date property on your class
foreach (IGrouping<DateTime, IacTransmittal> collectionByDate in collection
.OrderBy(i => i.Value.Date.Date)
.GroupBy(i => i.Value.Date.Date)) //assuming the property is named Date, using Date property of DateTIme so we only create new worksheets for individual days
{
ExcelWorksheet eWorksheet = ePackage.Workbook.Worksheets.Add(collectionByDate.Key.Date.ToString("yyyyMMdd")); //add a new worksheet for each unique day
Type iacType = typeof(IacTransmittal);
PropertyInfo[] iacProperties = iacType.GetProperties();
int colCount = iacProperties.Count(); //number of properties determines how many columns we need
//set column headers based on properties on your class
for (int col = 1; col <= colCount; col++)
{
eWorksheet.Cells[1, col].Value = iacProperties[col - 1].Name ; //assign the value of the cell to the name of the property
}
int rowCounter = 2;
foreach (IacTransmittal iacInfo in collectionByDate) //iterate over each instance of this class in this igrouping
{
int interiorColCount = 1;
foreach (PropertyInfo iacProp in iacProperties) //iterate over properties on the class
{
eWorksheet.Cells[rowCounter, interiorColCount].Value = iacProp.GetValue(iacInfo, null); //assign cell values by getting the value of each property in the class
interiorColCount++;
}
rowCounter++;
}
}
ePackage.Save();
}
}
}
Thanks for the ideas! I was eventually able to figure out the following
using Excel = Microsoft.Office.Interop.Excel;
Excel.Application ExcelApp = new Excel.Application();
Excel.Workbook ExcelWorkBook = null;
Excel.Worksheet ExcelWorkSheet = null;
ExcelApp.Visible = true;
ExcelWorkBook = ExcelApp.Workbooks.Add(Excel.XlWBATemplate.xlWBATWorksheet);
List<string> SheetNames = new List<string>()
{ "Sheet1", "Sheet2", "Sheet3", "Sheet4", "Sheet5", "Sheet6", "Sheet7"};
string [] headers = new string []
{ "Field 1", "Field 2", "Field 3", "Field 4", "Field 5" };
for (int i = 0; i < SheetNames.Count; i++)
ExcelWorkBook.Worksheets.Add(); //Adding New sheet in Excel Workbook
for (int k = 0; k < SheetNames.Count; k++ )
{
int r = 1; // Initialize Excel Row Start Position = 1
ExcelWorkSheet = ExcelWorkBook.Worksheets[k + 1];
//Writing Columns Name in Excel Sheet
for (int col = 1; col < headers.Length + 1; col++)
ExcelWorkSheet.Cells[r, col] = headers[col - 1];
r++;
switch (k)
{
case 0:
foreach (var kvp in Sheet1)
{
ExcelWorkSheet.Cells[r, 1] = kvp.Value.Field1;
ExcelWorkSheet.Cells[r, 2] = kvp.Value.Field2;
ExcelWorkSheet.Cells[r, 3] = kvp.Value.Field3;
ExcelWorkSheet.Cells[r, 4] = kvp.Value.Field4;
ExcelWorkSheet.Cells[r, 5] = kvp.Value.Field5;
r++;
}
break;
}
ExcelWorkSheet.Name = SheetNames[k];//Renaming the ExcelSheets
}
//Activate the first worksheet by default.
((Excel.Worksheet)ExcelApp.ActiveWorkbook.Sheets[1]).Activate();
//Save As the excel file.
ExcelApp.ActiveWorkbook.SaveCopyAs(#"out_My_Book1.xls");

Search entire workbook and fetch out values

In my excel workbook
The first sheet tab contains
Number Name Code Subject
100 Mark ABC Mathematics
101 John XYZ Physics
The second sheet tab contains
Number Name Code Subject
103 Mark DEF Chemistry
104 John GHI Biology
I want to pass the code(which is going to be unique) as a parameter and search the entire excel workbook
and fetch name and subject..
ie..select name,subject from myexcelworkbook where code='ABC'
I am able to get sheet names, column count etc. but not able to search thro' an entire excel workbook and get the required values
const string fileName="C:\\FileName.xls";
OleDbConnectionStringBuilder connectionStringBuilder = new OleDbConnectionStringBuilder();
connectionStringBuilder.Provider = "Microsoft.ACE.OLEDB.12.0";
connectionStringBuilder.DataSource = fileName;
connectionStringBuilder.Add("Mode", "Read");
const string extendedProperties = "Excel 12.0;IMEX=1;HDR=YES";
connectionStringBuilder.Add("Extended Properties", extendedProperties);
using (OleDbConnection objConn = new OleDbConnection(connectionStringBuilder.ConnectionString))
{
objConn.Open();
Microsoft.Office.Interop.Excel.Application xlsApp = new Microsoft.Office.Interop.Excel.Application();
Workbook wb = xlsApp.Workbooks.Open(fileName, 0, true, 5, "", "", true, XlPlatform.xlWindows, "\t", false, false, 0, true);
Sheets sheets = wb.Worksheets;
for (int i =1 ; i <= wb.Worksheets.Count; i++)
{
MessageBox.Show(wb.Sheets[i].Name.ToString()); - gives sheet names inside the workbook
}
Worksheet ws = (Worksheet)sheets.get_Item(1); - gives the elements of specified sheet tab
}
//To get elements inside a specific sheet of an excel workbook/get column names
OleDbCommand objCmdSelect = new OleDbCommand("SELECT * FROM [" + "Sheet1" + "$]", objConn);
OleDbDataAdapter objAdapter1 = new OleDbDataAdapter();
objAdapter1.SelectCommand = objCmdSelect;
DataSet objDataset1 = new DataSet();
objAdapter1.Fill(objDataset1);
string columnNames = string.Empty;
// For each DataTable, print the ColumnName. use dataset.Rows to iterate row data...
foreach (System.Data.DataTable table in objDataset1.Tables)
{
foreach (DataColumn column in table.Columns)
{
if (columnNames.Length > 0)
{
columnNames = columnNames + Environment.NewLine + column.ColumnName;
}
else
{
columnNames = column.ColumnName;
}
}
}
Can someone share some ideas, so that i can find out the unique data inside the excel workbook and fetch out the needed values based on that? thanks in advance.
If the workbooks are as structured as you indicate and you will be querying multiple students then, as #Andy G suggests, you might find it easiest to just get the data into some form of record set then you run your queries on it through Linq or SQL or whatever you prefer. As you aren't wishing to modify the Excel workbook at all this is probably a better approach.
Alternatively, you can use the Excel API, like you were also trying to use. You can enumerate through the worksheets, running a Find on each. A la:
internal void GetYourData()
{
//... code to get the relevant Workbook and relevant/new Excel Application
Tuple<string, string> pupil;
string searchTerm = "ABC";
//Get the cell of the match
Range match = FindFirstOccurrenceInWorkbook(workbook, searchTerm);
if (match != null)
{
//Do whatever - per your data structure, it is probably easiest to just use .Offset(row, column) property
pupil = new Tuple<string, string>((string)match.Offset(0, -1).Value, (string)match.Offset(0, 1).Value);
}
//... code to do whatever with your results
}
internal static Range FindFirstOccurrenceInWorkbook(Workbook workbook, string searchTerm)
{
if (workbook == null) throw new ArgumentNullException("workbook");
if (searchTerm == null) throw new ArgumentNullException("searchTerm");
Sheets wss = workbook.Worksheets;
Range match = null;
foreach (Worksheet ws in wss)
{
Range cells = ws.Cells;
//Add more args as needed - this is just an example
match = cells.Find(
what: searchTerm,
after: Type.Missing,
lookIn: XlFindLookIn.xlFormulas,
lookAt: XlLookAt.xlPart);
System.Runtime.InteropServices.Marshal.ReleaseComObject(cells);
System.Runtime.InteropServices.Marshal.ReleaseComObject(ws);
if (match != null)
{
break;
}
}
System.Runtime.InteropServices.Marshal.ReleaseComObject(wss);
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();
return match;
}
Also, you can't use COM objects like you use without generating orphaned references -> this will prevent the Excel application from closing. You need to explicitly assign all object properties you want to use like Workbook.Worksheets to a variable then call System.Runtime.InteropServices.Marshal.ReleaseCOMObject(object) when the variable is about to go out of scope.

Select specific cell in Excel and go to next row

Using C# how can I "jump" to a specific cell in an Excel spreadsheet and go to the next row? I need to populate an existing spreadsheet with data from a list. This is how I thought it would work:
Globals.LookupTable.Range["A2"].Select();
foreach (CFolderType ft in FolderTypes) {
Globals.LookupTable.Rows.Next.Value2 = ft.name;
}
This shall move down in column A and insert the values.
Something like this should do the work:
Microsoft.Office.Interop.Excel.Application excel = new Microsoft.Office.Interop.Excel.ApplicationClass();
excel.Workbooks.Add(System.Reflection.Missing.Value);
int rowIdx = 2;
foreach (CFolderType ft in FolderTypes)
{
Microsoft.Office.Interop.Excel.Range aCell = excel.get_Range("A" + rowIdx.ToString(), System.Reflection.Missing.Value);
aRange.Value = ft.name;
rowIdx++;
}

Using Excel OleDb to get sheet names IN SHEET ORDER

I'm using OleDb to read from an excel workbook with many sheets.
I need to read the sheet names, but I need them in the order they are defined in the spreadsheet; so If I have a file that looks like this;
|_____|_____|____|____|____|____|____|____|____|
|_____|_____|____|____|____|____|____|____|____|
|_____|_____|____|____|____|____|____|____|____|
\__GERMANY__/\__UK__/\__IRELAND__/
Then I need to get the dictionary
1="GERMANY",
2="UK",
3="IRELAND"
I've tried using OleDbConnection.GetOleDbSchemaTable(), and that gives me the list of names, but it alphabetically sorts them. The alpha-sort means I don't know which sheet number a particular name corresponds to. So I get;
GERMANY, IRELAND, UK
which has changed the order of UK and IRELAND.
The reason I need it to be sorted is that I have to let the user choose a range of data by name or index; they can ask for 'all the data from GERMANY to IRELAND' or 'data from sheet 1 to sheet 3'.
Any ideas would be greatly appreciated.
if I could use the office interop classes, this would be straightforward. Unfortunately, I can't because the interop classes don't work reliably in non-interactive environments such as windows services and ASP.NET sites, so I needed to use OLEDB.
Can you not just loop through the sheets from 0 to Count of names -1? that way you should get them in the correct order.
Edit
I noticed through the comments that there are a lot of concerns about using the Interop classes to retrieve the sheet names. Therefore here is an example using OLEDB to retrieve them:
/// <summary>
/// This method retrieves the excel sheet names from
/// an excel workbook.
/// </summary>
/// <param name="excelFile">The excel file.</param>
/// <returns>String[]</returns>
private String[] GetExcelSheetNames(string excelFile)
{
OleDbConnection objConn = null;
System.Data.DataTable dt = null;
try
{
// Connection String. Change the excel file to the file you
// will search.
String connString = "Provider=Microsoft.Jet.OLEDB.4.0;" +
"Data Source=" + excelFile + ";Extended Properties=Excel 8.0;";
// Create connection object by using the preceding connection string.
objConn = new OleDbConnection(connString);
// Open connection with the database.
objConn.Open();
// Get the data table containg the schema guid.
dt = objConn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
if(dt == null)
{
return null;
}
String[] excelSheets = new String[dt.Rows.Count];
int i = 0;
// Add the sheet name to the string array.
foreach(DataRow row in dt.Rows)
{
excelSheets[i] = row["TABLE_NAME"].ToString();
i++;
}
// Loop through all of the sheets if you want too...
for(int j=0; j < excelSheets.Length; j++)
{
// Query each excel sheet.
}
return excelSheets;
}
catch(Exception ex)
{
return null;
}
finally
{
// Clean up.
if(objConn != null)
{
objConn.Close();
objConn.Dispose();
}
if(dt != null)
{
dt.Dispose();
}
}
}
Extracted from Article on the CodeProject.
Since above code do not cover procedures for extracting list of sheet name for Excel 2007,following code will be applicable for both Excel(97-2003) and Excel 2007 too:
public List<string> ListSheetInExcel(string filePath)
{
OleDbConnectionStringBuilder sbConnection = new OleDbConnectionStringBuilder();
String strExtendedProperties = String.Empty;
sbConnection.DataSource = filePath;
if (Path.GetExtension(filePath).Equals(".xls"))//for 97-03 Excel file
{
sbConnection.Provider = "Microsoft.Jet.OLEDB.4.0";
strExtendedProperties = "Excel 8.0;HDR=Yes;IMEX=1";//HDR=ColumnHeader,IMEX=InterMixed
}
else if (Path.GetExtension(filePath).Equals(".xlsx")) //for 2007 Excel file
{
sbConnection.Provider = "Microsoft.ACE.OLEDB.12.0";
strExtendedProperties = "Excel 12.0;HDR=Yes;IMEX=1";
}
sbConnection.Add("Extended Properties",strExtendedProperties);
List<string> listSheet = new List<string>();
using (OleDbConnection conn = new OleDbConnection(sbConnection.ToString()))
{
conn.Open();
DataTable dtSheet = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
foreach (DataRow drSheet in dtSheet.Rows)
{
if (drSheet["TABLE_NAME"].ToString().Contains("$"))//checks whether row contains '_xlnm#_FilterDatabase' or sheet name(i.e. sheet name always ends with $ sign)
{
listSheet.Add(drSheet["TABLE_NAME"].ToString());
}
}
}
return listSheet;
}
Above function returns list of sheet in particular excel file for both excel type(97,2003,2007).
Can't find this in actual MSDN documentation, but a moderator in the forums said
I am afraid that OLEDB does not preserve the sheet order as they were in Excel
Excel Sheet Names in Sheet Order
Seems like this would be a common enough requirement that there would be a decent workaround.
This is short, fast, safe, and usable...
public static List<string> ToExcelsSheetList(string excelFilePath)
{
List<string> sheets = new List<string>();
using (OleDbConnection connection =
new OleDbConnection((excelFilePath.TrimEnd().ToLower().EndsWith("x"))
? "Provider=Microsoft.ACE.OLEDB.12.0;Data Source='" + excelFilePath + "';" + "Extended Properties='Excel 12.0 Xml;HDR=YES;'"
: "provider=Microsoft.Jet.OLEDB.4.0;Data Source='" + excelFilePath + "';Extended Properties=Excel 8.0;"))
{
connection.Open();
DataTable dt = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
foreach (DataRow drSheet in dt.Rows)
if (drSheet["TABLE_NAME"].ToString().Contains("$"))
{
string s = drSheet["TABLE_NAME"].ToString();
sheets.Add(s.StartsWith("'")?s.Substring(1, s.Length - 3): s.Substring(0, s.Length - 1));
}
connection.Close();
}
return sheets;
}
Another way:
a xls(x) file is just a collection of *.xml files stored in a *.zip container.
unzip the file "app.xml" in the folder docProps.
<?xml version="1.0" encoding="UTF-8" standalone="true"?>
-<Properties xmlns:vt="http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes" xmlns="http://schemas.openxmlformats.org/officeDocument/2006/extended-properties">
<TotalTime>0</TotalTime>
<Application>Microsoft Excel</Application>
<DocSecurity>0</DocSecurity>
<ScaleCrop>false</ScaleCrop>
-<HeadingPairs>
-<vt:vector baseType="variant" size="2">
-<vt:variant>
<vt:lpstr>Arbeitsblätter</vt:lpstr>
</vt:variant>
-<vt:variant>
<vt:i4>4</vt:i4>
</vt:variant>
</vt:vector>
</HeadingPairs>
-<TitlesOfParts>
-<vt:vector baseType="lpstr" size="4">
<vt:lpstr>Tabelle3</vt:lpstr>
<vt:lpstr>Tabelle4</vt:lpstr>
<vt:lpstr>Tabelle1</vt:lpstr>
<vt:lpstr>Tabelle2</vt:lpstr>
</vt:vector>
</TitlesOfParts>
<Company/>
<LinksUpToDate>false</LinksUpToDate>
<SharedDoc>false</SharedDoc>
<HyperlinksChanged>false</HyperlinksChanged>
<AppVersion>14.0300</AppVersion>
</Properties>
The file is a german file (Arbeitsblätter = worksheets).
The table names (Tabelle3 etc) are in the correct order. You just need to read these tags;)
regards
I have created the below function using the information provided in the answer from #kraeppy (https://stackoverflow.com/a/19930386/2617732). This requires the .net framework v4.5 to be used and requires a reference to System.IO.Compression. This only works for xlsx files and not for the older xls files.
using System.IO.Compression;
using System.Xml;
using System.Xml.Linq;
static IEnumerable<string> GetWorksheetNamesOrdered(string fileName)
{
//open the excel file
using (FileStream data = new FileStream(fileName, FileMode.Open))
{
//unzip
ZipArchive archive = new ZipArchive(data);
//select the correct file from the archive
ZipArchiveEntry appxmlFile = archive.Entries.SingleOrDefault(e => e.FullName == "docProps/app.xml");
//read the xml
XDocument xdoc = XDocument.Load(appxmlFile.Open());
//find the titles element
XElement titlesElement = xdoc.Descendants().Where(e => e.Name.LocalName == "TitlesOfParts").Single();
//extract the worksheet names
return titlesElement
.Elements().Where(e => e.Name.LocalName == "vector").Single()
.Elements().Where(e => e.Name.LocalName == "lpstr")
.Select(e => e.Value);
}
}
I like the idea of #deathApril to name the sheets as 1_Germany, 2_UK, 3_IRELAND. I also got your issue to do this rename for hundreds of sheets. If you don't have a problem to rename the sheet name then you can use this macro to do it for you. It will take less than seconds to rename all sheet names. unfortunately ODBC, OLEDB return the sheet name order by asc. There is no replacement for that. You have to either use COM or rename your name to be in the order.
Sub Macro1()
'
' Macro1 Macro
'
'
Dim i As Integer
For i = 1 To Sheets.Count
Dim prefix As String
prefix = i
If Len(prefix) < 4 Then
prefix = "000"
ElseIf Len(prefix) < 3 Then
prefix = "00"
ElseIf Len(prefix) < 2 Then
prefix = "0"
End If
Dim sheetName As String
sheetName = Sheets(i).Name
Dim names
names = Split(sheetName, "-")
If (UBound(names) > 0) And IsNumeric(names(0)) Then
'do nothing
Else
Sheets(i).Name = prefix & i & "-" & Sheets(i).Name
End If
Next
End Sub
UPDATE:
After reading #SidHoland comment regarding BIFF an idea flashed. The following steps can be done through code. Don't know if you really want to do that to get the sheet names in the same order. Let me know if you need help to do this through code.
1. Consider XLSX as a zip file. Rename *.xlsx into *.zip
2. Unzip
3. Go to unzipped folder root and open /docprops/app.xml
4. This xml contains the sheet name in the same order of what you see.
5. Parse the xml and get the sheet names
UPDATE:
Another solution - NPOI might be helpful here
http://npoi.codeplex.com/
FileStream file = new FileStream(#"yourexcelfilename", FileMode.Open, FileAccess.Read);
HSSFWorkbook hssfworkbook = new HSSFWorkbook(file);
for (int i = 0; i < hssfworkbook.NumberOfSheets; i++)
{
Console.WriteLine(hssfworkbook.GetSheetName(i));
}
file.Close();
This solution works for xls. I didn't try xlsx.
Thanks,
Esen
This worked for me. Stolen from here: How do you get the name of the first page of an excel workbook?
object opt = System.Reflection.Missing.Value;
Excel.Application app = new Microsoft.Office.Interop.Excel.Application();
Excel.Workbook workbook = app.Workbooks.Open(WorkBookToOpen,
opt, opt, opt, opt, opt, opt, opt,
opt, opt, opt, opt, opt, opt, opt);
Excel.Worksheet worksheet = workbook.Worksheets[1] as Microsoft.Office.Interop.Excel.Worksheet;
string firstSheetName = worksheet.Name;
Try this. Here is the code to get the sheet names in order.
private Dictionary<int, string> GetExcelSheetNames(string fileName)
{
Excel.Application _excel = null;
Excel.Workbook _workBook = null;
Dictionary<int, string> excelSheets = new Dictionary<int, string>();
try
{
object missing = Type.Missing;
object readOnly = true;
Excel.XlFileFormat.xlWorkbookNormal
_excel = new Excel.ApplicationClass();
_excel.Visible = false;
_workBook = _excel.Workbooks.Open(fileName, 0, readOnly, 5, missing,
missing, true, Excel.XlPlatform.xlWindows, "\\t", false, false, 0, true, true, missing);
if (_workBook != null)
{
int index = 0;
foreach (Excel.Worksheet sheet in _workBook.Sheets)
{
// Can get sheet names in order they are in workbook
excelSheets.Add(++index, sheet.Name);
}
}
}
catch (Exception e)
{
return null;
}
finally
{
if (_excel != null)
{
if (_workBook != null)
_workBook.Close(false, Type.Missing, Type.Missing);
_excel.Application.Quit();
}
_excel = null;
_workBook = null;
}
return excelSheets;
}
As per MSDN, In a case of spreadsheets inside of Excel it might not work because Excel files are not real databases. So you will be not able to get the sheets name in order of their visualization in workbook.
Code to get sheets name as per their visual appearance using interop:
Add reference to Microsoft Excel 12.0 Object Library.
Following code will give the sheets name in the actual order stored in workbook, not the sorted name.
Sample Code:
using Microsoft.Office.Interop.Excel;
string filename = "C:\\romil.xlsx";
object missing = System.Reflection.Missing.Value;
Microsoft.Office.Interop.Excel.Application excel = new Microsoft.Office.Interop.Excel.Application();
Microsoft.Office.Interop.Excel.Workbook wb =excel.Workbooks.Open(filename, missing, missing, missing, missing,missing, missing, missing, missing, missing, missing, missing, missing, missing, missing);
ArrayList sheetname = new ArrayList();
foreach (Microsoft.Office.Interop.Excel.Worksheet sheet in wb.Sheets)
{
sheetname.Add(sheet.Name);
}
I don't see any documentation that says the order in app.xml is guaranteed to be the order of the sheets. It PROBABLY is, but not according to the OOXML specification.
The workbook.xml file, on the other hand, includes the sheetId attribute, which does determine the sequence - from 1 to the number of sheets. This is according to the OOXML specification. workbook.xml is described as the place where the sequence of the sheets is kept.
So reading workbook.xml after it is extracted form the XLSX would be my recommendation. NOT app.xml. Instead of docProps/app.xml, use xl/workbook.xml and look at the element, as shown here -
`
<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
<fileVersion appName="xl" lastEdited="5" lowestEdited="5" rupBuild="9303" />
<workbookPr defaultThemeVersion="124226" />
- <bookViews>
<workbookView xWindow="120" yWindow="135" windowWidth="19035" windowHeight="8445" />
</bookViews>
- <sheets>
<sheet name="By song" sheetId="1" r:id="rId1" />
<sheet name="By actors" sheetId="2" r:id="rId2" />
<sheet name="By pit" sheetId="3" r:id="rId3" />
</sheets>
- <definedNames>
<definedName name="_xlnm._FilterDatabase" localSheetId="0" hidden="1">'By song'!$A$1:$O$59</definedName>
</definedNames>
<calcPr calcId="145621" />
</workbook>
`

Categories