transfer from c# to excel with multiple sheets - c#

I have a few different dictionaries with different categories of information and I need to output them all into an xls or csv file with multiple spreadsheets. Currently, I have to download each excel file for a specific date range individually and then copy and paste them together so they're on different sheets of the same file. Is there any way to download all of them together in one document? Currently, I use the following code to output their files:
writeCsvToStream(
organize.ToDictionary(k => k.Key, v => v.Value as IacTransmittal), writer
);
ms.Seek(0, SeekOrigin.Begin);
Response.Clear();
Response.AddHeader("Content-Disposition", "attachment; filename=" + fileName);
Response.AddHeader("Content-Length", ms.Length.ToString());
Response.ContentType = "application/octet-stream";
ms.CopyTo(Response.OutputStream);
Response.End();
where writeCsvToStream just creates the text for the individual file.

There are some different options you could use.
ADO.NET Excel driver - with this API you can populate data into Excel documents using SQL style syntax. Each worksheet in the workbook is a table, each column header in a worksheet is a column in that table etc.
Here is a code project article on the exporting to Excel using ADO.NET:
http://www.codeproject.com/Articles/567155/Work-with-MS-Excel-and-ADO-NET
The ADO.NET approach is safe to use in a multi-user, web app environment.
Use OpenXML to export the data
OpenXML is a schema definition for different types of documents and the later versions of Excel (the ones that use .xlsx, .xlsm etc. instead of just .xls) use this format for the documents. The OpenXML schema is huge and somewhat cumbersome, however you can do pretty much anything with it.
Here is a code project article on exporting data to Excel using OpenXML:
http://www.codeproject.com/Articles/692121/Csharp-Export-data-to-Excel-using-OpenXML-librarie
The OpenXML approach is safe to use in a multi-user, web app environment.
A third approach is to use COM automation which is the same as programmatically running an instance of the Excel desktop application and using COM to control the actions of that instance.
Here is an article on that topic:
http://support.microsoft.com/kb/302084
Note that this third approach (office automation) is not safe in a multi-user, web app environment. I.e. it should not be used on a server, only from standalone desktop applications.

If you're open to learning a new library, I highly recommend EPPlus.
I'm making a few assumptions here since you didn't post much code to translate, but an example of usage may look like this:
using OfficeOpenXml;
using OfficeOpenXml.Style;
public static void WriteXlsOutput(Dictionary<string, IacTransmittal> collection) //accepting one dictionary as a parameter
{
using (FileStream outFile = new FileStream("Example.xlsx", FileMode.Create))
{
using (ExcelPackage ePackage = new ExcelPackage(outFile))
{
//group the collection by date property on your class
foreach (IGrouping<DateTime, IacTransmittal> collectionByDate in collection
.OrderBy(i => i.Value.Date.Date)
.GroupBy(i => i.Value.Date.Date)) //assuming the property is named Date, using Date property of DateTIme so we only create new worksheets for individual days
{
ExcelWorksheet eWorksheet = ePackage.Workbook.Worksheets.Add(collectionByDate.Key.Date.ToString("yyyyMMdd")); //add a new worksheet for each unique day
Type iacType = typeof(IacTransmittal);
PropertyInfo[] iacProperties = iacType.GetProperties();
int colCount = iacProperties.Count(); //number of properties determines how many columns we need
//set column headers based on properties on your class
for (int col = 1; col <= colCount; col++)
{
eWorksheet.Cells[1, col].Value = iacProperties[col - 1].Name ; //assign the value of the cell to the name of the property
}
int rowCounter = 2;
foreach (IacTransmittal iacInfo in collectionByDate) //iterate over each instance of this class in this igrouping
{
int interiorColCount = 1;
foreach (PropertyInfo iacProp in iacProperties) //iterate over properties on the class
{
eWorksheet.Cells[rowCounter, interiorColCount].Value = iacProp.GetValue(iacInfo, null); //assign cell values by getting the value of each property in the class
interiorColCount++;
}
rowCounter++;
}
}
ePackage.Save();
}
}
}

Thanks for the ideas! I was eventually able to figure out the following
using Excel = Microsoft.Office.Interop.Excel;
Excel.Application ExcelApp = new Excel.Application();
Excel.Workbook ExcelWorkBook = null;
Excel.Worksheet ExcelWorkSheet = null;
ExcelApp.Visible = true;
ExcelWorkBook = ExcelApp.Workbooks.Add(Excel.XlWBATemplate.xlWBATWorksheet);
List<string> SheetNames = new List<string>()
{ "Sheet1", "Sheet2", "Sheet3", "Sheet4", "Sheet5", "Sheet6", "Sheet7"};
string [] headers = new string []
{ "Field 1", "Field 2", "Field 3", "Field 4", "Field 5" };
for (int i = 0; i < SheetNames.Count; i++)
ExcelWorkBook.Worksheets.Add(); //Adding New sheet in Excel Workbook
for (int k = 0; k < SheetNames.Count; k++ )
{
int r = 1; // Initialize Excel Row Start Position = 1
ExcelWorkSheet = ExcelWorkBook.Worksheets[k + 1];
//Writing Columns Name in Excel Sheet
for (int col = 1; col < headers.Length + 1; col++)
ExcelWorkSheet.Cells[r, col] = headers[col - 1];
r++;
switch (k)
{
case 0:
foreach (var kvp in Sheet1)
{
ExcelWorkSheet.Cells[r, 1] = kvp.Value.Field1;
ExcelWorkSheet.Cells[r, 2] = kvp.Value.Field2;
ExcelWorkSheet.Cells[r, 3] = kvp.Value.Field3;
ExcelWorkSheet.Cells[r, 4] = kvp.Value.Field4;
ExcelWorkSheet.Cells[r, 5] = kvp.Value.Field5;
r++;
}
break;
}
ExcelWorkSheet.Name = SheetNames[k];//Renaming the ExcelSheets
}
//Activate the first worksheet by default.
((Excel.Worksheet)ExcelApp.ActiveWorkbook.Sheets[1]).Activate();
//Save As the excel file.
ExcelApp.ActiveWorkbook.SaveCopyAs(#"out_My_Book1.xls");

Related

Reading docx file with table

I have a simple document with one table in it. I would like to read its cells content. I found many tutorials for writing, but none for reading.
I suppose I should enumerate sections, but how to know which contains a table?
var document = DocX.Create(#"mydoc.docx");
var s = document.GetSections();
foreach (var item in s)
{
}
I'm using the following namespace aliases:
using excel = Microsoft.Office.Interop.Excel;
using word = Microsoft.Office.Interop.Word;
You can specifically grab the tables using this code:
private void WordRunButton_Click(object sender, EventArgs e)
{
var excelApp = new excel.Application();
excel.Workbooks workbooks = excelApp.Workbooks;
var wordApp = new word.Application();
word.Documents documents = wordApp.Documents;
wordApp.Visible = false;
excelApp.Visible = false;
// You don't want your computer to actually load each one visibly; would ruin performance.
string[] fileDirectories = Directory.GetFiles("Some Directory", "*.doc*",
SearchOption.AllDirectories);
foreach (var item in fileDirectories)
{
word._Document document = documents.Open(item);
foreach (word.Table table in document.Tables)
{
string wordFile = item;
appendName = Path.GetFileNameWithoutExtension(wordFile) + " Table " + tableCount + ".xlsx";
//Not needed if you're not going to save each table individually
var workbook = excelApp.Workbooks.Add(1);
excel._Worksheet worksheet = (excel.Worksheet)workbook.Sheets[1];
for (int row = 1; row <= table.Rows.Count; row++)
{
for (int col = 1; col <= table.Columns.Count; col++)
{
var cell = table.Cell(row, col);
var range = cell.Range;
var text = range.Text;
var cleaned = excelApp.WorksheetFunction.Clean(text);
worksheet.Cells[row, col] = cleaned;
}
}
workbook.SaveAs(Path.Combine("Some Directory", Path.GetFileName(appendName)), excel.XlFileFormat.xlWorkbookDefault);
//Last arg can be whatever file extension you want
//just make sure it matches what you set above.
workbook.Close();
Marshal.ReleaseComObject(workbook);
tableCount++;
}
document.Close();
Marshal.ReleaseComObject(document);
}
//Microsoft apps are picky with memory. Make sure you close and release each instance once you're done with it.
//Failure to do so will result in many lingering apps in the background
excelApp.Application.Quit();
workbooks.Close();
excelApp.Quit();
Marshal.ReleaseComObject(workbooks);
Marshal.ReleaseComObject(excelApp);
wordApp.Application.Quit();
wordApp.Quit();
Marshal.ReleaseComObject(documents);
Marshal.ReleaseComObject(wordApp);
}
The document is the actual word document type (word.Document). Make sure you check for split cells if you have them!
Hope this helps!
If you only have one table in document it should be rather simple. Try this:
DocX doc = DocX.Load("C:\\Temp\\mydoc.docx");
Table t = doc.Table[0];
//read cell content
string someText = t.Rows[0].Cells[0].Paragraps[0].Text;
You can loop through table rows and table cells inside each row, and also through Paragraphs inside each Cells[i] if there are more paragraphs. You can do that with simple for loop:
for (int i = 0; i < t.Rows.Count; i++)
{
someText = t.Rows[i].Cells[0].Paragraphs[0].Text;
}
Hope it helps.

how to append data into excel file

designation table:
deg_no deg_name
1 XYZ
2 ABC
3 pqs
4 qwe
5 tyu
6 pqr
7 lkj
8 you
9 zzz
10 xxx
ds = cls.ReturnDataSet("RetriveData_Alias1",
new SqlParameter("#Field", "deg_no"),
new SqlParameter("#TblNm", "designation"));
for (int i = 0; i < ds.Tables[0].Rows.Count; i++)
{
DataSet ds1 = new DataSet();
ds1 = cls.ReturnDataSet("RetriveData_Alias1",
new SqlParameter("#Field", "user_id,user_name"),
new SqlParameter("#TblNm", "User_details"),
new SqlParameter("#WhereClause", "where deg_no ='" + ds.Tables[0].Rows[i]["deg_no"].ToString() + "' "));
}
this for loop will run until the deg_no=10 and gives the all the user details and it is gives the perfect output as i want
the output.
but i want to write these data into excel file : user_details.xls
for suppose when i=1, then it will give you
user_id user_name deg_no
1 xyz 1
and for i=2
user_id user_name deg_no
2 pqr 3
and so on...
but without create new file each time .
suppose on first loop it will insert record of user_id =1
then in second loop the details of user_id=2 is append in the same file without creating a new file.
how can i do that?
If you want to generate an Excel file from database data, a reporting solution is the most flexible approach. If on the other hand you want to add the data to an already existing file, you can manipulate Excel files using the Open XML SDK or a higher-level library like EPPlus. EPPlus is available as a NuGet package so you can easily add it to your project.
The project site of EPPlus contains various samples. Creating an Excel sheet can be as simple as :
FileInfo newFile = new FileInfo(outputDir.FullName + #"\sample6.xlsx");
ExcelPackage pck = new ExcelPackage(newFile);
//Add the Content sheet
var ws = pck.Workbook.Worksheets.Add("Content");
ws.Cells["B1"].Value = "Name";
ws.Cells["C1"].Value = "Size";
ws.Cells["D1"].Value = "Created";
ws.Cells["E1"].Value = "Last modified";
EPPlus also allows you to query Excel Sheets using LINQ and even directly convert IEnumerable collections to Excel Tables :
ws.Cells["A1"].LoadFromCollection(myList, true);
This will fill a table with the property values of the objects in myList.
EPPlus also has methods to read from DataTables and DataReaders. This allows you to read your DataSet as you already do, then add each Datatable to the sheet in the appropriate places, eg:
ws.Cells["A1"].LoadFromDataTable(ds1.Tables[0], true);
...
ws.Cells["A30"].LoadFromDataTable(ds1.Tables[0], true);
This way you can use EPPlus to generate quick&dirty Excel reports when you don't want to use a full reporting solution.
You Need to read about ASP.Net Report Viewer control
May this helping you
ASP.Net Report Viewer control Tutorial with example
or you can use
Export to EXCEL from Datatable in C#.Net
or You need Office Interop as a reference then lets do the coding
using Microsoft.Office.Interop.Excel;
public void DataSetsToExcel(List<DataSet> dataSets, string fileName)
{
Microsoft.Office.Interop.Excel.Application xlApp =
new Microsoft.Office.Interop.Excel.Application();
Workbook xlWorkbook = xlApp.Workbooks.Add(XlWBATemplate.xlWBATWorksheet);
Sheets xlSheets = null;
Worksheet xlWorksheet = null;
foreach (DataSet dataSet in dataSets)
{
System.Data.DataTable dataTable = dataSet.Tables[0];
int rowNo = dataTable.Rows.Count;
int columnNo = dataTable.Columns.Count;
int colIndex = 0;
//Create Excel Sheets
xlSheets = xlWorkbook.Sheets;
xlWorksheet = (Worksheet)xlSheets.Add(xlSheets[1],
Type.Missing, Type.Missing, Type.Missing);
xlWorksheet.Name = dataSet.DataSetName;
//Generate Field Names
foreach (DataColumn dataColumn in dataTable.Columns)
{
colIndex++;
xlApp.Cells[1, colIndex] = dataColumn.ColumnName;
}
object[,] objData = new object[rowNo, columnNo];
//Convert DataSet to Cell Data
for (int row = 0; row < rowNo; row++)
{
for (int col = 0; col < columnNo; col++)
{
objData[row, col] = dataTable.Rows[row][col];
}
}
//Add the Data
Range range = xlWorksheet.Range[xlApp.Cells[2, 1], xlApp.Cells[rowNo + 1, columnNo]];
range.Value2 = objData;
//Format Data Type of Columns
colIndex = 0;
foreach (DataColumn dataColumn in dataTable.Columns)
{
colIndex++;
string format = "#";
switch (dataColumn.DataType.Name)
{
case "Boolean":
break;
case "Byte":
break;
case "Char":
break;
case "DateTime":
format = "dd/mm/yyyy";
break;
case "Decimal":
format = "$* #,##0.00;[Red]-$* #,##0.00";
break;
case "Double":
break;
case "Int16":
format = "0";
break;
case "Int32":
format = "0";
break;
case "Int64":
format = "0";
break;
case "SByte":
break;
case "Single":
break;
case "TimeSpan":
break;
case "UInt16":
break;
case "UInt32":
break;
case "UInt64":
break;
default: //String
break;
}
//Format the Column accodring to Data Type
xlWorksheet.Range[xlApp.Cells[2, colIndex],
xlApp.Cells[rowNo + 1, colIndex]].NumberFormat = format;
}
}
//Remove the Default Worksheet
((Worksheet)xlApp.ActiveWorkbook.Sheets[xlApp.ActiveWorkbook.Sheets.Count]).Delete();
//Save
xlWorkbook.SaveAs(fileName,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
XlSaveAsAccessMode.xlNoChange,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value);
xlWorkbook.Close();
xlApp.Quit();
GC.Collect();
}
Take note you have to name your DataSet and that will be the name of the worksheet in Excel.
DataSet dataSet1 = new DataSet("My Data Set 1");
dataAdapter1.Fill(dataSet1);
DataSet dataSet2 = new DataSet("My Data Set 2");
dataAdapter1.Fill(dataSet2);
DataSet dataSet3 = new DataSet("My Data Set 3");
dataAdapter1.Fill(dataSet3);
List<DataSet> dataSets = new List<DataSet>();
dataSets.Add(dataSet1);
dataSets.Add(dataSet2);
dataSets.Add(dataSet3);
DataSetsToExcel(dataSets, "{Your File Name}")
to add Office Interop as a reference
following these steps:
On the Project menu, click Add Reference.
On the COM tab, locate Microsoft Excel Object Library, and then click Select. In Visual Studio 2010, locate Microsoft Excel --.- Object Library on the COM tab.
Click OK in the Add References dialog box to accept your selections. If you are prompted to generate wrappers for the libraries that you selected, click “Yes”.

Selecting a range in a worksheet C#

I currently have the following code opening and reading in an excel spreadsheet:
var connectionString = string.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=Excel 12.0;", fileNameTextBox.Text);
var queryString = String.Format("SELECT * FROM [{0}]",DETAILS_SHEET_NAME);
var adapter = new OleDbDataAdapter(queryString, connectionString);
var ds = new DataSet();
adapter.Fill(ds, DETAILS_SHEET_NAME);
DataTable data = ds.Tables[DETAILS_SHEET_NAME];
dataGridView1.DataSource = data;
dataGridView1.AutoResizeColumns(DataGridViewAutoSizeColumnsMode.AllCellsExceptHeader);
This is all good and well except I'm not interested in the first row (Possibly first two rows as row 2 is headers) of the worksheet. How can I modify the select Query to select a range like I would in excel?
I'm interested in reading in say columns A-N in rows all rows from 2 onwards that contain data.
I also need to access a couple of specific cells on a different worksheet, I assume I have to build another adaptor with a different query string for each of those cells?
Modify Select statement including just the columns you need instead of wildcard "*" like in the following example:
("SELECT Column1, Column2 FROM DETAILS_SHEET_NAME");
You can apply additional logic in order to remove unnecessary rows, for example, a "paging solution" (i.e. selecting rows from N to M) like the following one:
Assuming the Database Table "TBL_ITEM" contains two columns (fields) of interest: “Item” column, representing the unique ID and “Rank”, which is used for sorting in ascending order, the general paging problem is stated as following: Select N-rows from the table ordered by Rank offsetting (i.e. skipping) (M-N) rows:
SELECT TOP N Item,
Rank FROM (SELECT TOP M Rank, Item FROM TBL_ITEM ORDER BY Rank)
AS [SUB_TAB] ORDER BY Rank DESC
This solution and its extensions/samples are thoroughly discussed in my article Pure SQL solution to Database Table Paging (link: http://www.codeproject.com/Tips/441079/Pure-SQL-solution-to-Database-Table-Paging)
Finally, you can use a code snippet shown below in Listing 2 to export a content of DataTable object in Excel file with plenty of customization features that could be added to a code;
Listing 2. Export DataTable to Excel File (2007/2010):
internal static bool Export2Excel(DataTable dataTable, bool Interactive)
{
object misValue = System.Reflection.Missing.Value;
// Note: don't include Microsoft.Office.Interop.Excel in reference (using),
// it will cause ambiguity w/System.Data: both have DataTable obj
Microsoft.Office.Interop.Excel.Application _appExcel = null;
Microsoft.Office.Interop.Excel.Workbook _excelWorkbook = null;
Microsoft.Office.Interop.Excel.Worksheet _excelWorksheet = null;
try
{
// excel app object
_appExcel = new Microsoft.Office.Interop.Excel.Application();
// make it visible to User if Interactive flag is set
_appExcel.Visible = Interactive;
// excel workbook object added to app
_excelWorkbook = _appExcel.Workbooks.Add(misValue);
_excelWorksheet = _appExcel.ActiveWorkbook.ActiveSheet
as Microsoft.Office.Interop.Excel.Worksheet;
// column names row (range obj)
Microsoft.Office.Interop.Excel.Range _columnsNameRange;
_columnsNameRange = _excelWorksheet.get_Range("A1", misValue);
_columnsNameRange = _columnsNameRange.get_Resize(1, dataTable.Columns.Count);
// data range obj
Microsoft.Office.Interop.Excel.Range _dataRange;
_dataRange = _excelWorksheet.get_Range("A2", misValue);
_dataRange = _dataRange.get_Resize(dataTable.Rows.Count, dataTable.Columns.Count);
// column names array to be assigned to columnNameRange
string[] _arrColumnNames = new string[dataTable.Columns.Count];
// 2d-array of data to be assigned to _dataRange
string[,] _arrData = new string[dataTable.Rows.Count, dataTable.Columns.Count];
// populate both arrays: _arrColumnNames and _arrData
// note: 2d-array structured as row[idx=0], col[idx=1]
for (int i = 0; i < dataTable.Columns.Count; i++)
{
for (int j = 0; j < dataTable.Rows.Count; j++)
{
_arrColumnNames[i] = dataTable.Columns[i].ColumnName;
_arrData[j, i] = dataTable.Rows[j][i].ToString();
}
}
//assign column names array to _columnsNameRange obj
_columnsNameRange.set_Value(misValue, _arrColumnNames);
//assign data array to _dataRange obj
_dataRange.set_Value(misValue, _arrData);
// save and close if Interactive flag not set
if (!Interactive)
{
// Excel 2010 - "14.0"
// Excel 2007 - "12.0"
string _ver = _appExcel.Version;
string _fileName ="TableExport_" +
DateTime.Today.ToString("yyyy_MM_dd") + "-" +
DateTime.Now.ToString("hh_mm_ss");
// check version and select file extension
if (_ver == "14.0" || _ver == "12.0") { _fileName += ".xlsx";}
else { _fileName += ".xls"; }
// save and close Excel workbook
_excelWorkbook.Close(true, "{DRIVE LETTER}:\\" + _fileName, misValue);
}
return true;
}
catch (Exception ex) { throw; }
finally
{
// quit excel app process
if (_appExcel != null)
{
_appExcel.UserControl = false;
_appExcel.Quit();
_appExcel = null;
misValue = null;
}
}
}
You can simply ask for no headers. Modify your connection string, add HDR=No.
For your second issue, I found this post, Maybe you'll find it helpful.

How to merge two Excel workbook into one workbook in C#?

Let us consider that I have two Excel files (Workbooks) in local. Each Excel workbook is having 3 worksheets.
Lets say WorkBook1 is having Sheet1, Sheet2, Sheet3
Workbook2 is having Sheet1, Sheet2, Sheet3.
So here I need to merge these two excel workbook into one and the new excel workbook that is let's say Workbook3 which will have total 6 worksheets (combination of workbook1 and workbook2).
I need the code that how to perform this operation in c# without using any third party tool. If the third party tool is free version then its fine.
An easier solution is to copy the worksheets themselves, and not their cells.
This method takes any number of excel file paths and copy them into a new file:
private static void MergeWorkbooks(string destinationFilePath, params string[] sourceFilePaths)
{
var app = new Application();
app.DisplayAlerts = false; // No prompt when overriding
// Create a new workbook (index=1) and open source workbooks (index=2,3,...)
Workbook destinationWb = app.Workbooks.Add();
foreach (var sourceFilePath in sourceFilePaths)
{
app.Workbooks.Add(sourceFilePath);
}
// Copy all worksheets
Worksheet after = destinationWb.Worksheets[1];
for (int wbIndex = app.Workbooks.Count; wbIndex >= 2; wbIndex--)
{
Workbook wb = app.Workbooks[wbIndex];
for (int wsIndex = wb.Worksheets.Count; wsIndex >= 1; wsIndex--)
{
Worksheet ws = wb.Worksheets[wsIndex];
ws.Copy(After: after);
}
}
// Close source documents before saving destination. Otherwise, save will fail
for (int wbIndex = 2; wbIndex <= app.Workbooks.Count; wbIndex++)
{
Workbook wb = app.Workbooks[wbIndex];
wb.Close();
}
// Delete default worksheet
after.Delete();
// Save new workbook
destinationWb.SaveAs(destinationFilePath);
destinationWb.Close();
app.Quit();
}
Edit: notice that you might want to Move method instead of Copy in case you have dependencies between the sheets, e.g. pivot table, charts, formulas, etc. Otherwise the data source will disconnect and any changes in one sheet won't effect the other.
Here's a working sample that joins two books into a new one, hope it will give you an idea:
using System;
using Excel = Microsoft.Office.Interop.Excel;
using System.Reflection;
namespace MergeWorkBooks
{
class Program
{
static void Main(string[] args)
{
Excel.Application app = new Excel.Application();
app.Visible = true;
app.Workbooks.Add("");
app.Workbooks.Add(#"c:\MyWork\WorkBook1.xls");
app.Workbooks.Add(#"c:\MyWork\WorkBook2.xls");
for (int i = 2; i <= app.Workbooks.Count; i++)
{
int count = app.Workbooks[i].Worksheets.Count;
app.Workbooks[i].Activate();
for (int j=1; j <= count; j++)
{
Excel._Worksheet ws = (Excel._Worksheet)app.Workbooks[i].Worksheets[j];
ws.Select(Type.Missing);
ws.Cells.Select();
Excel.Range sel = (Excel.Range)app.Selection;
sel.Copy(Type.Missing);
Excel._Worksheet sheet = (Excel._Worksheet)app.Workbooks[1].Worksheets.Add(
Type.Missing, Type.Missing, Type.Missing, Type.Missing
);
sheet.Paste(Type.Missing, Type.Missing);
}
}
}
}
}
You're looking for Office Autmation libraries in C#.
Here is a sample code to help you get started.
System.Data.Odbc.OdbcDataAdapter Odbcda;
//CSV File
strConnString = "Driver={Microsoft Text Driver (*.txt; *.csv)};Dbq=" + SourceLocation + ";Extensions=asc,csv,tab,txt;Persist Security Info=False";
sqlSelect = "select * from [" + filename + "]";
System.Data.Odbc.OdbcConnection conn = new System.Data.Odbc.OdbcConnection(strConnString.Trim());
conn.Open();
Odbcda = new System.Data.Odbc.OdbcDataAdapter(sqlSelect, conn);
Odbcda.Fill(ds, DataTable);
conn.Close();
This would read the contents of an excel file into a dataset.
Create multiple datasets like this and then do a merge.
Code taken directly from here.

Using Excel OleDb to get sheet names IN SHEET ORDER

I'm using OleDb to read from an excel workbook with many sheets.
I need to read the sheet names, but I need them in the order they are defined in the spreadsheet; so If I have a file that looks like this;
|_____|_____|____|____|____|____|____|____|____|
|_____|_____|____|____|____|____|____|____|____|
|_____|_____|____|____|____|____|____|____|____|
\__GERMANY__/\__UK__/\__IRELAND__/
Then I need to get the dictionary
1="GERMANY",
2="UK",
3="IRELAND"
I've tried using OleDbConnection.GetOleDbSchemaTable(), and that gives me the list of names, but it alphabetically sorts them. The alpha-sort means I don't know which sheet number a particular name corresponds to. So I get;
GERMANY, IRELAND, UK
which has changed the order of UK and IRELAND.
The reason I need it to be sorted is that I have to let the user choose a range of data by name or index; they can ask for 'all the data from GERMANY to IRELAND' or 'data from sheet 1 to sheet 3'.
Any ideas would be greatly appreciated.
if I could use the office interop classes, this would be straightforward. Unfortunately, I can't because the interop classes don't work reliably in non-interactive environments such as windows services and ASP.NET sites, so I needed to use OLEDB.
Can you not just loop through the sheets from 0 to Count of names -1? that way you should get them in the correct order.
Edit
I noticed through the comments that there are a lot of concerns about using the Interop classes to retrieve the sheet names. Therefore here is an example using OLEDB to retrieve them:
/// <summary>
/// This method retrieves the excel sheet names from
/// an excel workbook.
/// </summary>
/// <param name="excelFile">The excel file.</param>
/// <returns>String[]</returns>
private String[] GetExcelSheetNames(string excelFile)
{
OleDbConnection objConn = null;
System.Data.DataTable dt = null;
try
{
// Connection String. Change the excel file to the file you
// will search.
String connString = "Provider=Microsoft.Jet.OLEDB.4.0;" +
"Data Source=" + excelFile + ";Extended Properties=Excel 8.0;";
// Create connection object by using the preceding connection string.
objConn = new OleDbConnection(connString);
// Open connection with the database.
objConn.Open();
// Get the data table containg the schema guid.
dt = objConn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
if(dt == null)
{
return null;
}
String[] excelSheets = new String[dt.Rows.Count];
int i = 0;
// Add the sheet name to the string array.
foreach(DataRow row in dt.Rows)
{
excelSheets[i] = row["TABLE_NAME"].ToString();
i++;
}
// Loop through all of the sheets if you want too...
for(int j=0; j < excelSheets.Length; j++)
{
// Query each excel sheet.
}
return excelSheets;
}
catch(Exception ex)
{
return null;
}
finally
{
// Clean up.
if(objConn != null)
{
objConn.Close();
objConn.Dispose();
}
if(dt != null)
{
dt.Dispose();
}
}
}
Extracted from Article on the CodeProject.
Since above code do not cover procedures for extracting list of sheet name for Excel 2007,following code will be applicable for both Excel(97-2003) and Excel 2007 too:
public List<string> ListSheetInExcel(string filePath)
{
OleDbConnectionStringBuilder sbConnection = new OleDbConnectionStringBuilder();
String strExtendedProperties = String.Empty;
sbConnection.DataSource = filePath;
if (Path.GetExtension(filePath).Equals(".xls"))//for 97-03 Excel file
{
sbConnection.Provider = "Microsoft.Jet.OLEDB.4.0";
strExtendedProperties = "Excel 8.0;HDR=Yes;IMEX=1";//HDR=ColumnHeader,IMEX=InterMixed
}
else if (Path.GetExtension(filePath).Equals(".xlsx")) //for 2007 Excel file
{
sbConnection.Provider = "Microsoft.ACE.OLEDB.12.0";
strExtendedProperties = "Excel 12.0;HDR=Yes;IMEX=1";
}
sbConnection.Add("Extended Properties",strExtendedProperties);
List<string> listSheet = new List<string>();
using (OleDbConnection conn = new OleDbConnection(sbConnection.ToString()))
{
conn.Open();
DataTable dtSheet = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
foreach (DataRow drSheet in dtSheet.Rows)
{
if (drSheet["TABLE_NAME"].ToString().Contains("$"))//checks whether row contains '_xlnm#_FilterDatabase' or sheet name(i.e. sheet name always ends with $ sign)
{
listSheet.Add(drSheet["TABLE_NAME"].ToString());
}
}
}
return listSheet;
}
Above function returns list of sheet in particular excel file for both excel type(97,2003,2007).
Can't find this in actual MSDN documentation, but a moderator in the forums said
I am afraid that OLEDB does not preserve the sheet order as they were in Excel
Excel Sheet Names in Sheet Order
Seems like this would be a common enough requirement that there would be a decent workaround.
This is short, fast, safe, and usable...
public static List<string> ToExcelsSheetList(string excelFilePath)
{
List<string> sheets = new List<string>();
using (OleDbConnection connection =
new OleDbConnection((excelFilePath.TrimEnd().ToLower().EndsWith("x"))
? "Provider=Microsoft.ACE.OLEDB.12.0;Data Source='" + excelFilePath + "';" + "Extended Properties='Excel 12.0 Xml;HDR=YES;'"
: "provider=Microsoft.Jet.OLEDB.4.0;Data Source='" + excelFilePath + "';Extended Properties=Excel 8.0;"))
{
connection.Open();
DataTable dt = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
foreach (DataRow drSheet in dt.Rows)
if (drSheet["TABLE_NAME"].ToString().Contains("$"))
{
string s = drSheet["TABLE_NAME"].ToString();
sheets.Add(s.StartsWith("'")?s.Substring(1, s.Length - 3): s.Substring(0, s.Length - 1));
}
connection.Close();
}
return sheets;
}
Another way:
a xls(x) file is just a collection of *.xml files stored in a *.zip container.
unzip the file "app.xml" in the folder docProps.
<?xml version="1.0" encoding="UTF-8" standalone="true"?>
-<Properties xmlns:vt="http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes" xmlns="http://schemas.openxmlformats.org/officeDocument/2006/extended-properties">
<TotalTime>0</TotalTime>
<Application>Microsoft Excel</Application>
<DocSecurity>0</DocSecurity>
<ScaleCrop>false</ScaleCrop>
-<HeadingPairs>
-<vt:vector baseType="variant" size="2">
-<vt:variant>
<vt:lpstr>Arbeitsblätter</vt:lpstr>
</vt:variant>
-<vt:variant>
<vt:i4>4</vt:i4>
</vt:variant>
</vt:vector>
</HeadingPairs>
-<TitlesOfParts>
-<vt:vector baseType="lpstr" size="4">
<vt:lpstr>Tabelle3</vt:lpstr>
<vt:lpstr>Tabelle4</vt:lpstr>
<vt:lpstr>Tabelle1</vt:lpstr>
<vt:lpstr>Tabelle2</vt:lpstr>
</vt:vector>
</TitlesOfParts>
<Company/>
<LinksUpToDate>false</LinksUpToDate>
<SharedDoc>false</SharedDoc>
<HyperlinksChanged>false</HyperlinksChanged>
<AppVersion>14.0300</AppVersion>
</Properties>
The file is a german file (Arbeitsblätter = worksheets).
The table names (Tabelle3 etc) are in the correct order. You just need to read these tags;)
regards
I have created the below function using the information provided in the answer from #kraeppy (https://stackoverflow.com/a/19930386/2617732). This requires the .net framework v4.5 to be used and requires a reference to System.IO.Compression. This only works for xlsx files and not for the older xls files.
using System.IO.Compression;
using System.Xml;
using System.Xml.Linq;
static IEnumerable<string> GetWorksheetNamesOrdered(string fileName)
{
//open the excel file
using (FileStream data = new FileStream(fileName, FileMode.Open))
{
//unzip
ZipArchive archive = new ZipArchive(data);
//select the correct file from the archive
ZipArchiveEntry appxmlFile = archive.Entries.SingleOrDefault(e => e.FullName == "docProps/app.xml");
//read the xml
XDocument xdoc = XDocument.Load(appxmlFile.Open());
//find the titles element
XElement titlesElement = xdoc.Descendants().Where(e => e.Name.LocalName == "TitlesOfParts").Single();
//extract the worksheet names
return titlesElement
.Elements().Where(e => e.Name.LocalName == "vector").Single()
.Elements().Where(e => e.Name.LocalName == "lpstr")
.Select(e => e.Value);
}
}
I like the idea of #deathApril to name the sheets as 1_Germany, 2_UK, 3_IRELAND. I also got your issue to do this rename for hundreds of sheets. If you don't have a problem to rename the sheet name then you can use this macro to do it for you. It will take less than seconds to rename all sheet names. unfortunately ODBC, OLEDB return the sheet name order by asc. There is no replacement for that. You have to either use COM or rename your name to be in the order.
Sub Macro1()
'
' Macro1 Macro
'
'
Dim i As Integer
For i = 1 To Sheets.Count
Dim prefix As String
prefix = i
If Len(prefix) < 4 Then
prefix = "000"
ElseIf Len(prefix) < 3 Then
prefix = "00"
ElseIf Len(prefix) < 2 Then
prefix = "0"
End If
Dim sheetName As String
sheetName = Sheets(i).Name
Dim names
names = Split(sheetName, "-")
If (UBound(names) > 0) And IsNumeric(names(0)) Then
'do nothing
Else
Sheets(i).Name = prefix & i & "-" & Sheets(i).Name
End If
Next
End Sub
UPDATE:
After reading #SidHoland comment regarding BIFF an idea flashed. The following steps can be done through code. Don't know if you really want to do that to get the sheet names in the same order. Let me know if you need help to do this through code.
1. Consider XLSX as a zip file. Rename *.xlsx into *.zip
2. Unzip
3. Go to unzipped folder root and open /docprops/app.xml
4. This xml contains the sheet name in the same order of what you see.
5. Parse the xml and get the sheet names
UPDATE:
Another solution - NPOI might be helpful here
http://npoi.codeplex.com/
FileStream file = new FileStream(#"yourexcelfilename", FileMode.Open, FileAccess.Read);
HSSFWorkbook hssfworkbook = new HSSFWorkbook(file);
for (int i = 0; i < hssfworkbook.NumberOfSheets; i++)
{
Console.WriteLine(hssfworkbook.GetSheetName(i));
}
file.Close();
This solution works for xls. I didn't try xlsx.
Thanks,
Esen
This worked for me. Stolen from here: How do you get the name of the first page of an excel workbook?
object opt = System.Reflection.Missing.Value;
Excel.Application app = new Microsoft.Office.Interop.Excel.Application();
Excel.Workbook workbook = app.Workbooks.Open(WorkBookToOpen,
opt, opt, opt, opt, opt, opt, opt,
opt, opt, opt, opt, opt, opt, opt);
Excel.Worksheet worksheet = workbook.Worksheets[1] as Microsoft.Office.Interop.Excel.Worksheet;
string firstSheetName = worksheet.Name;
Try this. Here is the code to get the sheet names in order.
private Dictionary<int, string> GetExcelSheetNames(string fileName)
{
Excel.Application _excel = null;
Excel.Workbook _workBook = null;
Dictionary<int, string> excelSheets = new Dictionary<int, string>();
try
{
object missing = Type.Missing;
object readOnly = true;
Excel.XlFileFormat.xlWorkbookNormal
_excel = new Excel.ApplicationClass();
_excel.Visible = false;
_workBook = _excel.Workbooks.Open(fileName, 0, readOnly, 5, missing,
missing, true, Excel.XlPlatform.xlWindows, "\\t", false, false, 0, true, true, missing);
if (_workBook != null)
{
int index = 0;
foreach (Excel.Worksheet sheet in _workBook.Sheets)
{
// Can get sheet names in order they are in workbook
excelSheets.Add(++index, sheet.Name);
}
}
}
catch (Exception e)
{
return null;
}
finally
{
if (_excel != null)
{
if (_workBook != null)
_workBook.Close(false, Type.Missing, Type.Missing);
_excel.Application.Quit();
}
_excel = null;
_workBook = null;
}
return excelSheets;
}
As per MSDN, In a case of spreadsheets inside of Excel it might not work because Excel files are not real databases. So you will be not able to get the sheets name in order of their visualization in workbook.
Code to get sheets name as per their visual appearance using interop:
Add reference to Microsoft Excel 12.0 Object Library.
Following code will give the sheets name in the actual order stored in workbook, not the sorted name.
Sample Code:
using Microsoft.Office.Interop.Excel;
string filename = "C:\\romil.xlsx";
object missing = System.Reflection.Missing.Value;
Microsoft.Office.Interop.Excel.Application excel = new Microsoft.Office.Interop.Excel.Application();
Microsoft.Office.Interop.Excel.Workbook wb =excel.Workbooks.Open(filename, missing, missing, missing, missing,missing, missing, missing, missing, missing, missing, missing, missing, missing, missing);
ArrayList sheetname = new ArrayList();
foreach (Microsoft.Office.Interop.Excel.Worksheet sheet in wb.Sheets)
{
sheetname.Add(sheet.Name);
}
I don't see any documentation that says the order in app.xml is guaranteed to be the order of the sheets. It PROBABLY is, but not according to the OOXML specification.
The workbook.xml file, on the other hand, includes the sheetId attribute, which does determine the sequence - from 1 to the number of sheets. This is according to the OOXML specification. workbook.xml is described as the place where the sequence of the sheets is kept.
So reading workbook.xml after it is extracted form the XLSX would be my recommendation. NOT app.xml. Instead of docProps/app.xml, use xl/workbook.xml and look at the element, as shown here -
`
<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
<fileVersion appName="xl" lastEdited="5" lowestEdited="5" rupBuild="9303" />
<workbookPr defaultThemeVersion="124226" />
- <bookViews>
<workbookView xWindow="120" yWindow="135" windowWidth="19035" windowHeight="8445" />
</bookViews>
- <sheets>
<sheet name="By song" sheetId="1" r:id="rId1" />
<sheet name="By actors" sheetId="2" r:id="rId2" />
<sheet name="By pit" sheetId="3" r:id="rId3" />
</sheets>
- <definedNames>
<definedName name="_xlnm._FilterDatabase" localSheetId="0" hidden="1">'By song'!$A$1:$O$59</definedName>
</definedNames>
<calcPr calcId="145621" />
</workbook>
`

Categories