Using Excel OleDb to get sheet names IN SHEET ORDER - c#

I'm using OleDb to read from an excel workbook with many sheets.
I need to read the sheet names, but I need them in the order they are defined in the spreadsheet; so If I have a file that looks like this;
|_____|_____|____|____|____|____|____|____|____|
|_____|_____|____|____|____|____|____|____|____|
|_____|_____|____|____|____|____|____|____|____|
\__GERMANY__/\__UK__/\__IRELAND__/
Then I need to get the dictionary
1="GERMANY",
2="UK",
3="IRELAND"
I've tried using OleDbConnection.GetOleDbSchemaTable(), and that gives me the list of names, but it alphabetically sorts them. The alpha-sort means I don't know which sheet number a particular name corresponds to. So I get;
GERMANY, IRELAND, UK
which has changed the order of UK and IRELAND.
The reason I need it to be sorted is that I have to let the user choose a range of data by name or index; they can ask for 'all the data from GERMANY to IRELAND' or 'data from sheet 1 to sheet 3'.
Any ideas would be greatly appreciated.
if I could use the office interop classes, this would be straightforward. Unfortunately, I can't because the interop classes don't work reliably in non-interactive environments such as windows services and ASP.NET sites, so I needed to use OLEDB.

Can you not just loop through the sheets from 0 to Count of names -1? that way you should get them in the correct order.
Edit
I noticed through the comments that there are a lot of concerns about using the Interop classes to retrieve the sheet names. Therefore here is an example using OLEDB to retrieve them:
/// <summary>
/// This method retrieves the excel sheet names from
/// an excel workbook.
/// </summary>
/// <param name="excelFile">The excel file.</param>
/// <returns>String[]</returns>
private String[] GetExcelSheetNames(string excelFile)
{
OleDbConnection objConn = null;
System.Data.DataTable dt = null;
try
{
// Connection String. Change the excel file to the file you
// will search.
String connString = "Provider=Microsoft.Jet.OLEDB.4.0;" +
"Data Source=" + excelFile + ";Extended Properties=Excel 8.0;";
// Create connection object by using the preceding connection string.
objConn = new OleDbConnection(connString);
// Open connection with the database.
objConn.Open();
// Get the data table containg the schema guid.
dt = objConn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
if(dt == null)
{
return null;
}
String[] excelSheets = new String[dt.Rows.Count];
int i = 0;
// Add the sheet name to the string array.
foreach(DataRow row in dt.Rows)
{
excelSheets[i] = row["TABLE_NAME"].ToString();
i++;
}
// Loop through all of the sheets if you want too...
for(int j=0; j < excelSheets.Length; j++)
{
// Query each excel sheet.
}
return excelSheets;
}
catch(Exception ex)
{
return null;
}
finally
{
// Clean up.
if(objConn != null)
{
objConn.Close();
objConn.Dispose();
}
if(dt != null)
{
dt.Dispose();
}
}
}
Extracted from Article on the CodeProject.

Since above code do not cover procedures for extracting list of sheet name for Excel 2007,following code will be applicable for both Excel(97-2003) and Excel 2007 too:
public List<string> ListSheetInExcel(string filePath)
{
OleDbConnectionStringBuilder sbConnection = new OleDbConnectionStringBuilder();
String strExtendedProperties = String.Empty;
sbConnection.DataSource = filePath;
if (Path.GetExtension(filePath).Equals(".xls"))//for 97-03 Excel file
{
sbConnection.Provider = "Microsoft.Jet.OLEDB.4.0";
strExtendedProperties = "Excel 8.0;HDR=Yes;IMEX=1";//HDR=ColumnHeader,IMEX=InterMixed
}
else if (Path.GetExtension(filePath).Equals(".xlsx")) //for 2007 Excel file
{
sbConnection.Provider = "Microsoft.ACE.OLEDB.12.0";
strExtendedProperties = "Excel 12.0;HDR=Yes;IMEX=1";
}
sbConnection.Add("Extended Properties",strExtendedProperties);
List<string> listSheet = new List<string>();
using (OleDbConnection conn = new OleDbConnection(sbConnection.ToString()))
{
conn.Open();
DataTable dtSheet = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
foreach (DataRow drSheet in dtSheet.Rows)
{
if (drSheet["TABLE_NAME"].ToString().Contains("$"))//checks whether row contains '_xlnm#_FilterDatabase' or sheet name(i.e. sheet name always ends with $ sign)
{
listSheet.Add(drSheet["TABLE_NAME"].ToString());
}
}
}
return listSheet;
}
Above function returns list of sheet in particular excel file for both excel type(97,2003,2007).

Can't find this in actual MSDN documentation, but a moderator in the forums said
I am afraid that OLEDB does not preserve the sheet order as they were in Excel
Excel Sheet Names in Sheet Order
Seems like this would be a common enough requirement that there would be a decent workaround.

This is short, fast, safe, and usable...
public static List<string> ToExcelsSheetList(string excelFilePath)
{
List<string> sheets = new List<string>();
using (OleDbConnection connection =
new OleDbConnection((excelFilePath.TrimEnd().ToLower().EndsWith("x"))
? "Provider=Microsoft.ACE.OLEDB.12.0;Data Source='" + excelFilePath + "';" + "Extended Properties='Excel 12.0 Xml;HDR=YES;'"
: "provider=Microsoft.Jet.OLEDB.4.0;Data Source='" + excelFilePath + "';Extended Properties=Excel 8.0;"))
{
connection.Open();
DataTable dt = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
foreach (DataRow drSheet in dt.Rows)
if (drSheet["TABLE_NAME"].ToString().Contains("$"))
{
string s = drSheet["TABLE_NAME"].ToString();
sheets.Add(s.StartsWith("'")?s.Substring(1, s.Length - 3): s.Substring(0, s.Length - 1));
}
connection.Close();
}
return sheets;
}

Another way:
a xls(x) file is just a collection of *.xml files stored in a *.zip container.
unzip the file "app.xml" in the folder docProps.
<?xml version="1.0" encoding="UTF-8" standalone="true"?>
-<Properties xmlns:vt="http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes" xmlns="http://schemas.openxmlformats.org/officeDocument/2006/extended-properties">
<TotalTime>0</TotalTime>
<Application>Microsoft Excel</Application>
<DocSecurity>0</DocSecurity>
<ScaleCrop>false</ScaleCrop>
-<HeadingPairs>
-<vt:vector baseType="variant" size="2">
-<vt:variant>
<vt:lpstr>Arbeitsblätter</vt:lpstr>
</vt:variant>
-<vt:variant>
<vt:i4>4</vt:i4>
</vt:variant>
</vt:vector>
</HeadingPairs>
-<TitlesOfParts>
-<vt:vector baseType="lpstr" size="4">
<vt:lpstr>Tabelle3</vt:lpstr>
<vt:lpstr>Tabelle4</vt:lpstr>
<vt:lpstr>Tabelle1</vt:lpstr>
<vt:lpstr>Tabelle2</vt:lpstr>
</vt:vector>
</TitlesOfParts>
<Company/>
<LinksUpToDate>false</LinksUpToDate>
<SharedDoc>false</SharedDoc>
<HyperlinksChanged>false</HyperlinksChanged>
<AppVersion>14.0300</AppVersion>
</Properties>
The file is a german file (Arbeitsblätter = worksheets).
The table names (Tabelle3 etc) are in the correct order. You just need to read these tags;)
regards

I have created the below function using the information provided in the answer from #kraeppy (https://stackoverflow.com/a/19930386/2617732). This requires the .net framework v4.5 to be used and requires a reference to System.IO.Compression. This only works for xlsx files and not for the older xls files.
using System.IO.Compression;
using System.Xml;
using System.Xml.Linq;
static IEnumerable<string> GetWorksheetNamesOrdered(string fileName)
{
//open the excel file
using (FileStream data = new FileStream(fileName, FileMode.Open))
{
//unzip
ZipArchive archive = new ZipArchive(data);
//select the correct file from the archive
ZipArchiveEntry appxmlFile = archive.Entries.SingleOrDefault(e => e.FullName == "docProps/app.xml");
//read the xml
XDocument xdoc = XDocument.Load(appxmlFile.Open());
//find the titles element
XElement titlesElement = xdoc.Descendants().Where(e => e.Name.LocalName == "TitlesOfParts").Single();
//extract the worksheet names
return titlesElement
.Elements().Where(e => e.Name.LocalName == "vector").Single()
.Elements().Where(e => e.Name.LocalName == "lpstr")
.Select(e => e.Value);
}
}

I like the idea of #deathApril to name the sheets as 1_Germany, 2_UK, 3_IRELAND. I also got your issue to do this rename for hundreds of sheets. If you don't have a problem to rename the sheet name then you can use this macro to do it for you. It will take less than seconds to rename all sheet names. unfortunately ODBC, OLEDB return the sheet name order by asc. There is no replacement for that. You have to either use COM or rename your name to be in the order.
Sub Macro1()
'
' Macro1 Macro
'
'
Dim i As Integer
For i = 1 To Sheets.Count
Dim prefix As String
prefix = i
If Len(prefix) < 4 Then
prefix = "000"
ElseIf Len(prefix) < 3 Then
prefix = "00"
ElseIf Len(prefix) < 2 Then
prefix = "0"
End If
Dim sheetName As String
sheetName = Sheets(i).Name
Dim names
names = Split(sheetName, "-")
If (UBound(names) > 0) And IsNumeric(names(0)) Then
'do nothing
Else
Sheets(i).Name = prefix & i & "-" & Sheets(i).Name
End If
Next
End Sub
UPDATE:
After reading #SidHoland comment regarding BIFF an idea flashed. The following steps can be done through code. Don't know if you really want to do that to get the sheet names in the same order. Let me know if you need help to do this through code.
1. Consider XLSX as a zip file. Rename *.xlsx into *.zip
2. Unzip
3. Go to unzipped folder root and open /docprops/app.xml
4. This xml contains the sheet name in the same order of what you see.
5. Parse the xml and get the sheet names
UPDATE:
Another solution - NPOI might be helpful here
http://npoi.codeplex.com/
FileStream file = new FileStream(#"yourexcelfilename", FileMode.Open, FileAccess.Read);
HSSFWorkbook hssfworkbook = new HSSFWorkbook(file);
for (int i = 0; i < hssfworkbook.NumberOfSheets; i++)
{
Console.WriteLine(hssfworkbook.GetSheetName(i));
}
file.Close();
This solution works for xls. I didn't try xlsx.
Thanks,
Esen

This worked for me. Stolen from here: How do you get the name of the first page of an excel workbook?
object opt = System.Reflection.Missing.Value;
Excel.Application app = new Microsoft.Office.Interop.Excel.Application();
Excel.Workbook workbook = app.Workbooks.Open(WorkBookToOpen,
opt, opt, opt, opt, opt, opt, opt,
opt, opt, opt, opt, opt, opt, opt);
Excel.Worksheet worksheet = workbook.Worksheets[1] as Microsoft.Office.Interop.Excel.Worksheet;
string firstSheetName = worksheet.Name;

Try this. Here is the code to get the sheet names in order.
private Dictionary<int, string> GetExcelSheetNames(string fileName)
{
Excel.Application _excel = null;
Excel.Workbook _workBook = null;
Dictionary<int, string> excelSheets = new Dictionary<int, string>();
try
{
object missing = Type.Missing;
object readOnly = true;
Excel.XlFileFormat.xlWorkbookNormal
_excel = new Excel.ApplicationClass();
_excel.Visible = false;
_workBook = _excel.Workbooks.Open(fileName, 0, readOnly, 5, missing,
missing, true, Excel.XlPlatform.xlWindows, "\\t", false, false, 0, true, true, missing);
if (_workBook != null)
{
int index = 0;
foreach (Excel.Worksheet sheet in _workBook.Sheets)
{
// Can get sheet names in order they are in workbook
excelSheets.Add(++index, sheet.Name);
}
}
}
catch (Exception e)
{
return null;
}
finally
{
if (_excel != null)
{
if (_workBook != null)
_workBook.Close(false, Type.Missing, Type.Missing);
_excel.Application.Quit();
}
_excel = null;
_workBook = null;
}
return excelSheets;
}

As per MSDN, In a case of spreadsheets inside of Excel it might not work because Excel files are not real databases. So you will be not able to get the sheets name in order of their visualization in workbook.
Code to get sheets name as per their visual appearance using interop:
Add reference to Microsoft Excel 12.0 Object Library.
Following code will give the sheets name in the actual order stored in workbook, not the sorted name.
Sample Code:
using Microsoft.Office.Interop.Excel;
string filename = "C:\\romil.xlsx";
object missing = System.Reflection.Missing.Value;
Microsoft.Office.Interop.Excel.Application excel = new Microsoft.Office.Interop.Excel.Application();
Microsoft.Office.Interop.Excel.Workbook wb =excel.Workbooks.Open(filename, missing, missing, missing, missing,missing, missing, missing, missing, missing, missing, missing, missing, missing, missing);
ArrayList sheetname = new ArrayList();
foreach (Microsoft.Office.Interop.Excel.Worksheet sheet in wb.Sheets)
{
sheetname.Add(sheet.Name);
}

I don't see any documentation that says the order in app.xml is guaranteed to be the order of the sheets. It PROBABLY is, but not according to the OOXML specification.
The workbook.xml file, on the other hand, includes the sheetId attribute, which does determine the sequence - from 1 to the number of sheets. This is according to the OOXML specification. workbook.xml is described as the place where the sequence of the sheets is kept.
So reading workbook.xml after it is extracted form the XLSX would be my recommendation. NOT app.xml. Instead of docProps/app.xml, use xl/workbook.xml and look at the element, as shown here -
`
<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
<fileVersion appName="xl" lastEdited="5" lowestEdited="5" rupBuild="9303" />
<workbookPr defaultThemeVersion="124226" />
- <bookViews>
<workbookView xWindow="120" yWindow="135" windowWidth="19035" windowHeight="8445" />
</bookViews>
- <sheets>
<sheet name="By song" sheetId="1" r:id="rId1" />
<sheet name="By actors" sheetId="2" r:id="rId2" />
<sheet name="By pit" sheetId="3" r:id="rId3" />
</sheets>
- <definedNames>
<definedName name="_xlnm._FilterDatabase" localSheetId="0" hidden="1">'By song'!$A$1:$O$59</definedName>
</definedNames>
<calcPr calcId="145621" />
</workbook>
`

Related

The process cannot access the file because it is being used by another process using spreadsheet document

Please help me to solve the problem, when i save the excel file throw this error i convert the excel file to zip file.
zip.Save(#"" + root + "/" + "" + userid + ".zip");
the process cannot access the file because it is being used by another process
my code is below
using (SpreadsheetDocument document = SpreadsheetDocument.Create(excelFilename, SpreadsheetDocumentType.Workbook))
{
WriteExcelFile(ds, document);
}
private static void WriteExcelFile(DataSet ds, SpreadsheetDocument spreadsheet)
{
spreadsheet.AddWorkbookPart();
spreadsheet.WorkbookPart.Workbook = new DocumentFormat.OpenXml.Spreadsheet.Workbook();
spreadsheet.WorkbookPart.Workbook.Append(new BookViews(new WorkbookView()));
// If we don't add a "WorkbookStylesPart", OLEDB will refuse to connect to this .xlsx file !
WorkbookStylesPart workbookStylesPart = spreadsheet.WorkbookPart.AddNewPart<WorkbookStylesPart>("rIdStyles");
Stylesheet stylesheet = new Stylesheet();
workbookStylesPart.Stylesheet = stylesheet;
// Loop through each of the DataTables in our DataSet, and create a new Excel Worksheet for each.
uint worksheetNumber = 1;
foreach (DataTable dt in ds.Tables)
{
// For each worksheet you want to create
string workSheetID = "rId" + worksheetNumber.ToString();
string worksheetName = dt.TableName;
WorksheetPart newWorksheetPart = spreadsheet.WorkbookPart.AddNewPart<WorksheetPart>();
newWorksheetPart.Worksheet = new DocumentFormat.OpenXml.Spreadsheet.Worksheet();
// create sheet data
newWorksheetPart.Worksheet.AppendChild(new DocumentFormat.OpenXml.Spreadsheet.SheetData());
// save worksheet
WriteDataTableToExcelWorksheet(dt, newWorksheetPart);
newWorksheetPart.Worksheet.Save();
// create the worksheet to workbook relation
if (worksheetNumber == 1)
spreadsheet.WorkbookPart.Workbook.AppendChild(new DocumentFormat.OpenXml.Spreadsheet.Sheets());
spreadsheet.WorkbookPart.Workbook.GetFirstChild<DocumentFormat.OpenXml.Spreadsheet.Sheets>().AppendChild(new DocumentFormat.OpenXml.Spreadsheet.Sheet()
{
Id = spreadsheet.WorkbookPart.GetIdOfPart(newWorksheetPart),
SheetId = (uint)worksheetNumber,
Name = dt.TableName
});
worksheetNumber++;
}
spreadsheet.WorkbookPart.Workbook.Save();
}
I see that you have saved your document but you have not closed it. So, it will still be open with the spreadsheetdocument object.
Add the following line after you save the document.
spreadsheet.WorkbookPart.Workbook.Close();
If it doesn't help, kindly provide the full code so that we can look into it further. Good luck.
I think you are not closing all your Workbooks before you try to put them in the Zip-File.
Put this at the end of the Method:
private static void WriteExcelFile(DataSet ds, SpreadsheetDocument spreadsheet)
{
...
spreadsheet.WorkbookPart.Workbook.Close();
}

Search entire workbook and fetch out values

In my excel workbook
The first sheet tab contains
Number Name Code Subject
100 Mark ABC Mathematics
101 John XYZ Physics
The second sheet tab contains
Number Name Code Subject
103 Mark DEF Chemistry
104 John GHI Biology
I want to pass the code(which is going to be unique) as a parameter and search the entire excel workbook
and fetch name and subject..
ie..select name,subject from myexcelworkbook where code='ABC'
I am able to get sheet names, column count etc. but not able to search thro' an entire excel workbook and get the required values
const string fileName="C:\\FileName.xls";
OleDbConnectionStringBuilder connectionStringBuilder = new OleDbConnectionStringBuilder();
connectionStringBuilder.Provider = "Microsoft.ACE.OLEDB.12.0";
connectionStringBuilder.DataSource = fileName;
connectionStringBuilder.Add("Mode", "Read");
const string extendedProperties = "Excel 12.0;IMEX=1;HDR=YES";
connectionStringBuilder.Add("Extended Properties", extendedProperties);
using (OleDbConnection objConn = new OleDbConnection(connectionStringBuilder.ConnectionString))
{
objConn.Open();
Microsoft.Office.Interop.Excel.Application xlsApp = new Microsoft.Office.Interop.Excel.Application();
Workbook wb = xlsApp.Workbooks.Open(fileName, 0, true, 5, "", "", true, XlPlatform.xlWindows, "\t", false, false, 0, true);
Sheets sheets = wb.Worksheets;
for (int i =1 ; i <= wb.Worksheets.Count; i++)
{
MessageBox.Show(wb.Sheets[i].Name.ToString()); - gives sheet names inside the workbook
}
Worksheet ws = (Worksheet)sheets.get_Item(1); - gives the elements of specified sheet tab
}
//To get elements inside a specific sheet of an excel workbook/get column names
OleDbCommand objCmdSelect = new OleDbCommand("SELECT * FROM [" + "Sheet1" + "$]", objConn);
OleDbDataAdapter objAdapter1 = new OleDbDataAdapter();
objAdapter1.SelectCommand = objCmdSelect;
DataSet objDataset1 = new DataSet();
objAdapter1.Fill(objDataset1);
string columnNames = string.Empty;
// For each DataTable, print the ColumnName. use dataset.Rows to iterate row data...
foreach (System.Data.DataTable table in objDataset1.Tables)
{
foreach (DataColumn column in table.Columns)
{
if (columnNames.Length > 0)
{
columnNames = columnNames + Environment.NewLine + column.ColumnName;
}
else
{
columnNames = column.ColumnName;
}
}
}
Can someone share some ideas, so that i can find out the unique data inside the excel workbook and fetch out the needed values based on that? thanks in advance.
If the workbooks are as structured as you indicate and you will be querying multiple students then, as #Andy G suggests, you might find it easiest to just get the data into some form of record set then you run your queries on it through Linq or SQL or whatever you prefer. As you aren't wishing to modify the Excel workbook at all this is probably a better approach.
Alternatively, you can use the Excel API, like you were also trying to use. You can enumerate through the worksheets, running a Find on each. A la:
internal void GetYourData()
{
//... code to get the relevant Workbook and relevant/new Excel Application
Tuple<string, string> pupil;
string searchTerm = "ABC";
//Get the cell of the match
Range match = FindFirstOccurrenceInWorkbook(workbook, searchTerm);
if (match != null)
{
//Do whatever - per your data structure, it is probably easiest to just use .Offset(row, column) property
pupil = new Tuple<string, string>((string)match.Offset(0, -1).Value, (string)match.Offset(0, 1).Value);
}
//... code to do whatever with your results
}
internal static Range FindFirstOccurrenceInWorkbook(Workbook workbook, string searchTerm)
{
if (workbook == null) throw new ArgumentNullException("workbook");
if (searchTerm == null) throw new ArgumentNullException("searchTerm");
Sheets wss = workbook.Worksheets;
Range match = null;
foreach (Worksheet ws in wss)
{
Range cells = ws.Cells;
//Add more args as needed - this is just an example
match = cells.Find(
what: searchTerm,
after: Type.Missing,
lookIn: XlFindLookIn.xlFormulas,
lookAt: XlLookAt.xlPart);
System.Runtime.InteropServices.Marshal.ReleaseComObject(cells);
System.Runtime.InteropServices.Marshal.ReleaseComObject(ws);
if (match != null)
{
break;
}
}
System.Runtime.InteropServices.Marshal.ReleaseComObject(wss);
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();
return match;
}
Also, you can't use COM objects like you use without generating orphaned references -> this will prevent the Excel application from closing. You need to explicitly assign all object properties you want to use like Workbook.Worksheets to a variable then call System.Runtime.InteropServices.Marshal.ReleaseCOMObject(object) when the variable is about to go out of scope.

Copy specific worksheets to new Excel file

I have an Excel file.
I need to open it, select specific sheets from it, and convert those sheets to a PDF format. I am able to convert the whole excel file, I just don't know how to convert only the specific sheets.
My idea is to copy specific sheets from an existing file to a new temporary file, and convert that whole new temporary file to PDF.
Maybe there's an easier way?
My code so far is =>
using Word = Microsoft.Office.Interop.Word;
using Excel = Microsoft.Office.Interop.Excel;
public static void ExportExcel(string infile, string outfile, int[] worksheets)
{
Excel.Application excelApp = null;
Excel.Application newExcelApp = null;
try
{
excelApp = new Excel.Application();
excelApp.Workbooks.Open(infile);
//((Microsoft.Office.Interop.Excel._Worksheet)excelApp.ActiveSheet).PageSetup.Orientation = Microsoft.Office.Interop.Excel.XlPageOrientation.xlLandscape;
excelApp.ActiveWorkbook.ExportAsFixedFormat(Excel.XlFixedFormatType.xlTypePDF, outfile);
}
finally
{
if (excelApp != null)
{
excelApp.DisplayAlerts = false;
excelApp.SaveWorkspace();
excelApp.Quit();
}
}
}
Maybe the ExportAsFixedFormat method can be set to consider only specific pages (sheets) while converting?
If not, how do I copy the sheets from one file to another?
Thanks!
You might be able to just print the sheets you want from the original file. I fired up the Macro Recorder, selected a couple of sheets, and Saved As to PDF. Here's the code:
Sheets(Array("Sheet1", "Sheet2")).Select
ActiveSheet.ExportAsFixedFormat Type:=xlTypePDF, Filename:= _
"C:\Users\doug\Documents\Book1.pdf", Quality:=xlQualityStandard, _
IncludeDocProperties:=True, IgnorePrintAreas:=False, OpenAfterPublish:=True
When I changed the selected sheets and ran again it worked as expected.
What's strange is that in the actual Save As dialog, you can go to Options and check "Selected Sheets." That's not available as a parameter to ExportAsFixedFormat, but it was automatically selected in the dialog, and maybe is the default also when called.
You could simply copy the file to the new destination, open the destination, remove the unwanted sheets and export. This is an example (tested) of my idea.
// infile is the excel file, outfile is the pdf to build, sheetToExport is the name of the sheet
public static void ExportExcel(string infile, string outfile, string sheetToExport)
{
Microsoft.Office.Interop.Excel.Application excelApp = new
Microsoft.Office.Interop.Excel.Application();
try
{
string tempFile = Path.ChangeExtension(outfile, "XLS");
File.Copy(infile, tempFile, true);
Microsoft.Office.Interop.Excel._Workbook excelWorkbook =
excelApp.Workbooks.Open(tempFile);
for(int x = excelApp.Sheets.Count; x > 0; x--)
{
_Worksheet sheet = (_Worksheet)excelApp.Sheets[x];
if(sheet != null && sheet.Name != sheetToExport)
sheet.Delete();
}
excelApp.ActiveWorkbook.ExportAsFixedFormat(XlFixedFormatType.xlTypePDF, outfile);
}
finally
{
if (excelApp != null)
{
excelApp.DisplayAlerts = false;
excelApp.SaveWorkspace();
excelApp.Quit();
}
}
}

How to merge two Excel workbook into one workbook in C#?

Let us consider that I have two Excel files (Workbooks) in local. Each Excel workbook is having 3 worksheets.
Lets say WorkBook1 is having Sheet1, Sheet2, Sheet3
Workbook2 is having Sheet1, Sheet2, Sheet3.
So here I need to merge these two excel workbook into one and the new excel workbook that is let's say Workbook3 which will have total 6 worksheets (combination of workbook1 and workbook2).
I need the code that how to perform this operation in c# without using any third party tool. If the third party tool is free version then its fine.
An easier solution is to copy the worksheets themselves, and not their cells.
This method takes any number of excel file paths and copy them into a new file:
private static void MergeWorkbooks(string destinationFilePath, params string[] sourceFilePaths)
{
var app = new Application();
app.DisplayAlerts = false; // No prompt when overriding
// Create a new workbook (index=1) and open source workbooks (index=2,3,...)
Workbook destinationWb = app.Workbooks.Add();
foreach (var sourceFilePath in sourceFilePaths)
{
app.Workbooks.Add(sourceFilePath);
}
// Copy all worksheets
Worksheet after = destinationWb.Worksheets[1];
for (int wbIndex = app.Workbooks.Count; wbIndex >= 2; wbIndex--)
{
Workbook wb = app.Workbooks[wbIndex];
for (int wsIndex = wb.Worksheets.Count; wsIndex >= 1; wsIndex--)
{
Worksheet ws = wb.Worksheets[wsIndex];
ws.Copy(After: after);
}
}
// Close source documents before saving destination. Otherwise, save will fail
for (int wbIndex = 2; wbIndex <= app.Workbooks.Count; wbIndex++)
{
Workbook wb = app.Workbooks[wbIndex];
wb.Close();
}
// Delete default worksheet
after.Delete();
// Save new workbook
destinationWb.SaveAs(destinationFilePath);
destinationWb.Close();
app.Quit();
}
Edit: notice that you might want to Move method instead of Copy in case you have dependencies between the sheets, e.g. pivot table, charts, formulas, etc. Otherwise the data source will disconnect and any changes in one sheet won't effect the other.
Here's a working sample that joins two books into a new one, hope it will give you an idea:
using System;
using Excel = Microsoft.Office.Interop.Excel;
using System.Reflection;
namespace MergeWorkBooks
{
class Program
{
static void Main(string[] args)
{
Excel.Application app = new Excel.Application();
app.Visible = true;
app.Workbooks.Add("");
app.Workbooks.Add(#"c:\MyWork\WorkBook1.xls");
app.Workbooks.Add(#"c:\MyWork\WorkBook2.xls");
for (int i = 2; i <= app.Workbooks.Count; i++)
{
int count = app.Workbooks[i].Worksheets.Count;
app.Workbooks[i].Activate();
for (int j=1; j <= count; j++)
{
Excel._Worksheet ws = (Excel._Worksheet)app.Workbooks[i].Worksheets[j];
ws.Select(Type.Missing);
ws.Cells.Select();
Excel.Range sel = (Excel.Range)app.Selection;
sel.Copy(Type.Missing);
Excel._Worksheet sheet = (Excel._Worksheet)app.Workbooks[1].Worksheets.Add(
Type.Missing, Type.Missing, Type.Missing, Type.Missing
);
sheet.Paste(Type.Missing, Type.Missing);
}
}
}
}
}
You're looking for Office Autmation libraries in C#.
Here is a sample code to help you get started.
System.Data.Odbc.OdbcDataAdapter Odbcda;
//CSV File
strConnString = "Driver={Microsoft Text Driver (*.txt; *.csv)};Dbq=" + SourceLocation + ";Extensions=asc,csv,tab,txt;Persist Security Info=False";
sqlSelect = "select * from [" + filename + "]";
System.Data.Odbc.OdbcConnection conn = new System.Data.Odbc.OdbcConnection(strConnString.Trim());
conn.Open();
Odbcda = new System.Data.Odbc.OdbcDataAdapter(sqlSelect, conn);
Odbcda.Fill(ds, DataTable);
conn.Close();
This would read the contents of an excel file into a dataset.
Create multiple datasets like this and then do a merge.
Code taken directly from here.

How to convert xml to excel file programmatically

I have a xml document holding a small data for my project where I want to convert my xml to an excel file (microsoft office excel 2003 and over)
How can I do this programmatically?
It can be achieved by using Microsoft.Office.Interop.Excel as shown below:
First of all declare these necessary references.
using System;
using System.IO;
using System.Reflection;
using System.Runtime.InteropServices;
using Microsoft.Office.Tools.Excel;
using Microsoft.VisualStudio.Tools.Applications.Runtime;
using Excel = Microsoft.Office.Interop.Excel;
using Office = Microsoft.Office.Core;
using System.Diagnostics;
using Microsoft.Win32;
Than create excel app as shown:
Excel.Application excel2; // Create Excel app
Excel.Workbook DataSource; // Create Workbook
Excel.Worksheet DataSheet; // Create Worksheet
excel2 = new Excel.Application(); // Start an Excel app
DataSource = (Excel.Workbook)excel2.Workbooks.Add(1); // Add a Workbook inside
string tempFolder = System.IO.Path.GetTempPath(); // Get folder
string FileName = openFileDialog1.FileName.ToString(); // Get xml file name
After that use below code in a loop to ensure all items in xml file are copied
// Open that xml file with excel
DataSource = excel2.Workbooks.Open(FileName, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value);
// Get items from xml file
DataSheet = DataSource.Worksheets.get_Item(1);
// Create another Excel app as object
Object xl_app;
xl_app = GetObj(1, "Excel.Application");
Excel.Application xlApp = (Excel.Application)xl_app;
// Set previous Excel app (Xml) as ReportPage
Excel.Application ReportPage = (Excel.Application)System.Runtime.InteropServices.Marshal.GetActiveObject("Excel.Application");
// Copy items from ReportPage(Xml) to current Excel object
Excel.Workbook Copy_To_Excel = ReportPage.ActiveWorkbook;
Now we have an Excel object named as Copy_To_Excel that contains all items in Xml..
We can save and close the Excel object with the name we want..
Copy_To_Excel.Save("thePath\theFileName");
Copy_To_Excel.Close();
If you have control over the XML generated, just generate an XML Spreadsheet file (the XML file standard for Excel 2002 and 2003).
These open natively in Excel, without having to change the extension. (To open by default in Excel, the file extension XML should be set to open with "XML Editor", which is an Office app that routes the XML file to Excel, Word, PowerPoint, InfoPath, or your external XML editor as needed. This is the default mapping when Office is installed, but it may be out of whack for some users, particularly devs who edit XML files in a text editor.)
Or, use the NPOI library to generate a native (97/2000 BIFF/XLS) Excel file rather than XML.
you can even read the XML file as string and use regular expressions to read the content between the tags and create a CSV file or use xpath expressions to read the XML file data and export to CSV file.
I know of no easy way to do code-based conversion from xml-based spreadsheet to xls/xlsx. But you might look into Microsoft Open Xml SDK which lets you work with xlsx. You might build xlsx spreadsheet and just feed it with your data. On the level of open-xml SDK it's like building xml file.
1.Fill the xml file into dataset,
2.convert dataset to excel by below method in asp.net
These are very simple methods.
public static void Convert(System.Data.DataSet ds, System.Web.HttpResponse response)
{
//first let's clean up the response.object
response.Clear();
response.Charset = "";
//set the response mime type for excel
response.ContentType = "application/vnd.ms-excel";
//create a string writer
System.IO.StringWriter stringWrite = new System.IO.StringWriter();
//create an htmltextwriter which uses the stringwriter
System.Web.UI.HtmlTextWriter htmlWrite = new System.Web.UI.HtmlTextWriter(stringWrite);
//instantiate a datagrid
System.Web.UI.WebControls.DataGrid dg = new System.Web.UI.WebControls.DataGrid();
//set the datagrid datasource to the dataset passed in
dg.DataSource = ds.Tables[0];
//bind the datagrid
dg.DataBind();
//tell the datagrid to render itself to our htmltextwriter
dg.RenderControl(htmlWrite);
//all that's left is to output the html
response.Write(stringWrite.ToString());
response.End();
}
For array elements separate with ',' comma and reuse the same column name
1) XML Functions
public static class XMLFunctions
{
public static List<Tuple<string, string>> GetXMlTagsAndValues(string xml)
{
var xmlList = new List<Tuple<string, string>>();
var doc = XDocument.Parse(xml);
foreach (var element in doc.Descendants())
{
// we don't care about the parent tags
if (element.Descendants().Count() > 0)
{
continue;
}
var path = element.AncestorsAndSelf().Select(e => e.Name.LocalName).Reverse();
var xPath = string.Join("/", path);
xmlList.Add(Tuple.Create(xPath, element.Value));
}
return xmlList;
}
public static System.Data.DataTable CreateDataTableFromXmlFile(string xmlFilePath)
{
System.Data.DataTable Dt = new System.Data.DataTable();
string input = File.ReadAllText(xmlFilePath);
var xmlTagsAndValues = GetXMlTagsAndValues(input);
var columnList = new List<string>();
foreach(var xml in xmlTagsAndValues)
{
if(!columnList.Contains(xml.Item1))
{
columnList.Add(xml.Item1);
Dt.Columns.Add(xml.Item1, typeof(string));
}
}
DataRow dtrow = Dt.NewRow();
var columnList2 = new Dictionary<string, string>();
foreach (var xml in xmlTagsAndValues)
{
if (!columnList2.Keys.Contains(xml.Item1))
{
dtrow[xml.Item1] = xml.Item2;
columnList2.Add(xml.Item1, xml.Item2);
}
else
{ // Here we are using the same column but appending the next value
dtrow[xml.Item1] = columnList2[xml.Item1] + "," + xml.Item2;
columnList2[xml.Item1] = columnList2[xml.Item1] + "," + xml.Item2;
}
}
Dt.Rows.Add(dtrow);
return Dt;
}
}
2) Full Excel class
using Microsoft.Office.Interop.Excel;
using _Excel = Microsoft.Office.Interop.Excel;
public class Excel
{
string path = "";
Application excel = new _Excel.Application();
Workbook wb;
Worksheet ws;
public Range xlRange;
static bool saveChanges = false;
static int excelRow = 0;
static List<string> columnHeaders = new List<string>();
public Excel(string path, int Sheet = 1)
{
this.path = path;
wb = excel.Workbooks.Open(path);
ws = wb.Worksheets[Sheet];
xlRange = ws.UsedRange;
excelRow = 0;
columnHeaders = new List<string>();
}
public void SaveFile(bool save = true)
{
saveChanges = save;
}
public void Close()
{
wb.Close(saveChanges);
System.Runtime.InteropServices.Marshal.ReleaseComObject(wb);
excel.Quit();
}
public void XMLToExcel(string xmlFilePath)
{
var dt = XMLFunctions.CreateDataTableFromXmlFile(xmlFilePath);
AddDataTableToExcel(dt);
}
public void AddDataTableToExcel(System.Data.DataTable table)
{
// Creating Header Column In Excel
for (int i = 0; i < table.Columns.Count; i++)
{
if (!columnHeaders.Contains(table.Columns[i].ColumnName))
{
ws.Cells[1, columnHeaders.Count() + 1] = table.Columns[i].ColumnName;
columnHeaders.Add(table.Columns[i].ColumnName);
}
}
// Get the rows
for (int k = 0; k < table.Columns.Count; k++)
{
var columnNumber = columnHeaders.FindIndex(x => x.Equals(table.Columns[k].ColumnName));
ws.Cells[excelRow + 2, columnNumber + 1] = table.Rows[0].ItemArray[k].ToString();
}
excelRow++;
SaveFile(true);
}
}
3) Call it
var excel = new Excel(excelFilename);
foreach (var filePath in files)
{
excel.XMLToExcel(filePath);
}
excel.Close();
For array elements having appended incremented column names (ex. column_2)
Create Data Table From XmlFile Redone
public static System.Data.DataTable CreateDataTableFromXmlFile(string xmlFilePath)
{
System.Data.DataTable Dt = new System.Data.DataTable();
string input = File.ReadAllText(xmlFilePath);
var xmlTagsAndValues = GetXMlTagsAndValues(input);
var columnList = new List<string>();
foreach (var xml in xmlTagsAndValues)
{
if (!columnList.Contains(xml.Item1))
{
columnList.Add(xml.Item1);
Dt.Columns.Add(xml.Item1, typeof(string));
}
else
{
var columnName = xml.Item1;
do
{
columnName = columnName.Increment();
} while (columnList.Contains(columnName));
columnList.Add(columnName);
Dt.Columns.Add(columnName, typeof(string));
}
}
DataRow dtrow = Dt.NewRow();
var columnList2 = new Dictionary<string, string>();
foreach (var xml in xmlTagsAndValues)
{
if (!columnList2.Keys.Contains(xml.Item1))
{
dtrow[xml.Item1] = xml.Item2;
columnList2.Add(xml.Item1, xml.Item2);
}
else
{
var columnName = xml.Item1;
do
{
columnName = columnName.Increment();
} while (columnList2.Keys.Contains(columnName));
dtrow[columnName] = xml.Item2;
columnList2[columnName] = xml.Item2;
}
}
Dt.Rows.Add(dtrow);
return Dt;
}
String Extensions
public static class StringExtensions
{
public static string Increment(this string str)
{
if (!str.Contains("_"))
{
str += "_2";
return str;
}
var number = int.Parse(str.Substring(str.LastIndexOf('_') + 1));
var stringBefore = StringFunctions.GetUntilOrEmpty(str, "_");
return $"{stringBefore}_{++number}";
}
}
Using GetUntilOrEmpty
How to open an XML file in Excel 2003.
In short, you can simply open it by using File › Open. Then, when you select an XML file, you are prompted to select one of the following methods to import the XML data:
As an XML list
As a read-only workbook
Use the XML Source task pane
Can't you simply open it in Excel? I thought Excel recognized the suffix XML?

Categories