I have a xml document holding a small data for my project where I want to convert my xml to an excel file (microsoft office excel 2003 and over)
How can I do this programmatically?
It can be achieved by using Microsoft.Office.Interop.Excel as shown below:
First of all declare these necessary references.
using System;
using System.IO;
using System.Reflection;
using System.Runtime.InteropServices;
using Microsoft.Office.Tools.Excel;
using Microsoft.VisualStudio.Tools.Applications.Runtime;
using Excel = Microsoft.Office.Interop.Excel;
using Office = Microsoft.Office.Core;
using System.Diagnostics;
using Microsoft.Win32;
Than create excel app as shown:
Excel.Application excel2; // Create Excel app
Excel.Workbook DataSource; // Create Workbook
Excel.Worksheet DataSheet; // Create Worksheet
excel2 = new Excel.Application(); // Start an Excel app
DataSource = (Excel.Workbook)excel2.Workbooks.Add(1); // Add a Workbook inside
string tempFolder = System.IO.Path.GetTempPath(); // Get folder
string FileName = openFileDialog1.FileName.ToString(); // Get xml file name
After that use below code in a loop to ensure all items in xml file are copied
// Open that xml file with excel
DataSource = excel2.Workbooks.Open(FileName, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value);
// Get items from xml file
DataSheet = DataSource.Worksheets.get_Item(1);
// Create another Excel app as object
Object xl_app;
xl_app = GetObj(1, "Excel.Application");
Excel.Application xlApp = (Excel.Application)xl_app;
// Set previous Excel app (Xml) as ReportPage
Excel.Application ReportPage = (Excel.Application)System.Runtime.InteropServices.Marshal.GetActiveObject("Excel.Application");
// Copy items from ReportPage(Xml) to current Excel object
Excel.Workbook Copy_To_Excel = ReportPage.ActiveWorkbook;
Now we have an Excel object named as Copy_To_Excel that contains all items in Xml..
We can save and close the Excel object with the name we want..
Copy_To_Excel.Save("thePath\theFileName");
Copy_To_Excel.Close();
If you have control over the XML generated, just generate an XML Spreadsheet file (the XML file standard for Excel 2002 and 2003).
These open natively in Excel, without having to change the extension. (To open by default in Excel, the file extension XML should be set to open with "XML Editor", which is an Office app that routes the XML file to Excel, Word, PowerPoint, InfoPath, or your external XML editor as needed. This is the default mapping when Office is installed, but it may be out of whack for some users, particularly devs who edit XML files in a text editor.)
Or, use the NPOI library to generate a native (97/2000 BIFF/XLS) Excel file rather than XML.
you can even read the XML file as string and use regular expressions to read the content between the tags and create a CSV file or use xpath expressions to read the XML file data and export to CSV file.
I know of no easy way to do code-based conversion from xml-based spreadsheet to xls/xlsx. But you might look into Microsoft Open Xml SDK which lets you work with xlsx. You might build xlsx spreadsheet and just feed it with your data. On the level of open-xml SDK it's like building xml file.
1.Fill the xml file into dataset,
2.convert dataset to excel by below method in asp.net
These are very simple methods.
public static void Convert(System.Data.DataSet ds, System.Web.HttpResponse response)
{
//first let's clean up the response.object
response.Clear();
response.Charset = "";
//set the response mime type for excel
response.ContentType = "application/vnd.ms-excel";
//create a string writer
System.IO.StringWriter stringWrite = new System.IO.StringWriter();
//create an htmltextwriter which uses the stringwriter
System.Web.UI.HtmlTextWriter htmlWrite = new System.Web.UI.HtmlTextWriter(stringWrite);
//instantiate a datagrid
System.Web.UI.WebControls.DataGrid dg = new System.Web.UI.WebControls.DataGrid();
//set the datagrid datasource to the dataset passed in
dg.DataSource = ds.Tables[0];
//bind the datagrid
dg.DataBind();
//tell the datagrid to render itself to our htmltextwriter
dg.RenderControl(htmlWrite);
//all that's left is to output the html
response.Write(stringWrite.ToString());
response.End();
}
For array elements separate with ',' comma and reuse the same column name
1) XML Functions
public static class XMLFunctions
{
public static List<Tuple<string, string>> GetXMlTagsAndValues(string xml)
{
var xmlList = new List<Tuple<string, string>>();
var doc = XDocument.Parse(xml);
foreach (var element in doc.Descendants())
{
// we don't care about the parent tags
if (element.Descendants().Count() > 0)
{
continue;
}
var path = element.AncestorsAndSelf().Select(e => e.Name.LocalName).Reverse();
var xPath = string.Join("/", path);
xmlList.Add(Tuple.Create(xPath, element.Value));
}
return xmlList;
}
public static System.Data.DataTable CreateDataTableFromXmlFile(string xmlFilePath)
{
System.Data.DataTable Dt = new System.Data.DataTable();
string input = File.ReadAllText(xmlFilePath);
var xmlTagsAndValues = GetXMlTagsAndValues(input);
var columnList = new List<string>();
foreach(var xml in xmlTagsAndValues)
{
if(!columnList.Contains(xml.Item1))
{
columnList.Add(xml.Item1);
Dt.Columns.Add(xml.Item1, typeof(string));
}
}
DataRow dtrow = Dt.NewRow();
var columnList2 = new Dictionary<string, string>();
foreach (var xml in xmlTagsAndValues)
{
if (!columnList2.Keys.Contains(xml.Item1))
{
dtrow[xml.Item1] = xml.Item2;
columnList2.Add(xml.Item1, xml.Item2);
}
else
{ // Here we are using the same column but appending the next value
dtrow[xml.Item1] = columnList2[xml.Item1] + "," + xml.Item2;
columnList2[xml.Item1] = columnList2[xml.Item1] + "," + xml.Item2;
}
}
Dt.Rows.Add(dtrow);
return Dt;
}
}
2) Full Excel class
using Microsoft.Office.Interop.Excel;
using _Excel = Microsoft.Office.Interop.Excel;
public class Excel
{
string path = "";
Application excel = new _Excel.Application();
Workbook wb;
Worksheet ws;
public Range xlRange;
static bool saveChanges = false;
static int excelRow = 0;
static List<string> columnHeaders = new List<string>();
public Excel(string path, int Sheet = 1)
{
this.path = path;
wb = excel.Workbooks.Open(path);
ws = wb.Worksheets[Sheet];
xlRange = ws.UsedRange;
excelRow = 0;
columnHeaders = new List<string>();
}
public void SaveFile(bool save = true)
{
saveChanges = save;
}
public void Close()
{
wb.Close(saveChanges);
System.Runtime.InteropServices.Marshal.ReleaseComObject(wb);
excel.Quit();
}
public void XMLToExcel(string xmlFilePath)
{
var dt = XMLFunctions.CreateDataTableFromXmlFile(xmlFilePath);
AddDataTableToExcel(dt);
}
public void AddDataTableToExcel(System.Data.DataTable table)
{
// Creating Header Column In Excel
for (int i = 0; i < table.Columns.Count; i++)
{
if (!columnHeaders.Contains(table.Columns[i].ColumnName))
{
ws.Cells[1, columnHeaders.Count() + 1] = table.Columns[i].ColumnName;
columnHeaders.Add(table.Columns[i].ColumnName);
}
}
// Get the rows
for (int k = 0; k < table.Columns.Count; k++)
{
var columnNumber = columnHeaders.FindIndex(x => x.Equals(table.Columns[k].ColumnName));
ws.Cells[excelRow + 2, columnNumber + 1] = table.Rows[0].ItemArray[k].ToString();
}
excelRow++;
SaveFile(true);
}
}
3) Call it
var excel = new Excel(excelFilename);
foreach (var filePath in files)
{
excel.XMLToExcel(filePath);
}
excel.Close();
For array elements having appended incremented column names (ex. column_2)
Create Data Table From XmlFile Redone
public static System.Data.DataTable CreateDataTableFromXmlFile(string xmlFilePath)
{
System.Data.DataTable Dt = new System.Data.DataTable();
string input = File.ReadAllText(xmlFilePath);
var xmlTagsAndValues = GetXMlTagsAndValues(input);
var columnList = new List<string>();
foreach (var xml in xmlTagsAndValues)
{
if (!columnList.Contains(xml.Item1))
{
columnList.Add(xml.Item1);
Dt.Columns.Add(xml.Item1, typeof(string));
}
else
{
var columnName = xml.Item1;
do
{
columnName = columnName.Increment();
} while (columnList.Contains(columnName));
columnList.Add(columnName);
Dt.Columns.Add(columnName, typeof(string));
}
}
DataRow dtrow = Dt.NewRow();
var columnList2 = new Dictionary<string, string>();
foreach (var xml in xmlTagsAndValues)
{
if (!columnList2.Keys.Contains(xml.Item1))
{
dtrow[xml.Item1] = xml.Item2;
columnList2.Add(xml.Item1, xml.Item2);
}
else
{
var columnName = xml.Item1;
do
{
columnName = columnName.Increment();
} while (columnList2.Keys.Contains(columnName));
dtrow[columnName] = xml.Item2;
columnList2[columnName] = xml.Item2;
}
}
Dt.Rows.Add(dtrow);
return Dt;
}
String Extensions
public static class StringExtensions
{
public static string Increment(this string str)
{
if (!str.Contains("_"))
{
str += "_2";
return str;
}
var number = int.Parse(str.Substring(str.LastIndexOf('_') + 1));
var stringBefore = StringFunctions.GetUntilOrEmpty(str, "_");
return $"{stringBefore}_{++number}";
}
}
Using GetUntilOrEmpty
How to open an XML file in Excel 2003.
In short, you can simply open it by using File › Open. Then, when you select an XML file, you are prompted to select one of the following methods to import the XML data:
As an XML list
As a read-only workbook
Use the XML Source task pane
Can't you simply open it in Excel? I thought Excel recognized the suffix XML?
Related
How to read an Excel file using C#? I open an Excel file for reading and copy it to clipboard to search email format, but I don't know how to do it.
FileInfo finfo;
Excel.ApplicationClass ExcelObj = new Excel.ApplicationClass();
ExcelObj.Visible = false;
Excel.Workbook theWorkbook;
Excel.Worksheet worksheet;
if (listView1.Items.Count > 0)
{
foreach (ListViewItem s in listView1.Items)
{
finfo = new FileInfo(s.Text);
if (finfo.Extension == ".xls" || finfo.Extension == ".xlsx" || finfo.Extension == ".xlt" || finfo.Extension == ".xlsm" || finfo.Extension == ".csv")
{
theWorkbook = ExcelObj.Workbooks.Open(s.Text, 0, true, 5, "", "", true, Excel.XlPlatform.xlWindows, "\t", false, false, 0, true, false, false);
for (int count = 1; count <= theWorkbook.Sheets.Count; count++)
{
worksheet = (Excel.Worksheet)theWorkbook.Worksheets.get_Item(count);
worksheet.Activate();
worksheet.Visible = false;
worksheet.UsedRange.Cells.Select();
}
}
}
}
OK,
One of the more difficult concepts to grasp about Excel VSTO programming is that you don't refer to cells like an array, Worksheet[0][0] won't give you cell A1, it will error out on you. Even when you type into A1 when Excel is open, you are actually entering data into Range A1. Therefore you refer to cells as Named Ranges. Here's an example:
Excel.Worksheet sheet = workbook.Sheets["Sheet1"] as Excel.Worksheet;
Excel.Range range = sheet.get_Range("A1", Missing.Value)
You can now literally type:
range.Text // this will give you the text the user sees
range.Value2 // this will give you the actual value stored by Excel (without rounding)
If you want to do something like this:
Excel.Range range = sheet.get_Range("A1:A5", Missing.Value)
if (range1 != null)
foreach (Excel.Range r in range1)
{
string user = r.Text
string value = r.Value2
}
There might be a better way, but this has worked for me.
The reason you need to use Value2 and not Value is because the Value property is a parametrized and C# doesn't support them yet.
As for the cleanup code, i will post that when i get to work tomorrow, i don't have the code with me, but it's very boilerplate. You just close and release the objects in the reverse order you created them. You can't use a Using() block because the Excel.Application or Excel.Workbook doesn't implement IDisposable, and if you don't clean-up, you will be left with a hanging Excel objects in memory.
Note:
If you don't set the Visibility property Excel doesn't display, which can be disconcerting to your users, but if you want to just rip the data out, that is probably good enough
You could OleDb, that will work too.
I hope that gets you started, let me know if you need further clarification. I'll post a complete
here is a complete sample:
using System;
using System.IO;
using System.Reflection;
using NUnit.Framework;
using ExcelTools = Ms.Office;
using Excel = Microsoft.Office.Interop.Excel;
namespace Tests
{
[TestFixture]
public class ExcelSingle
{
[Test]
public void ProcessWorkbook()
{
string file = #"C:\Users\Chris\Desktop\TestSheet.xls";
Console.WriteLine(file);
Excel.Application excel = null;
Excel.Workbook wkb = null;
try
{
excel = new Excel.Application();
wkb = ExcelTools.OfficeUtil.OpenBook(excel, file);
Excel.Worksheet sheet = wkb.Sheets["Data"] as Excel.Worksheet;
Excel.Range range = null;
if (sheet != null)
range = sheet.get_Range("A1", Missing.Value);
string A1 = String.Empty;
if( range != null )
A1 = range.Text.ToString();
Console.WriteLine("A1 value: {0}", A1);
}
catch(Exception ex)
{
//if you need to handle stuff
Console.WriteLine(ex.Message);
}
finally
{
if (wkb != null)
ExcelTools.OfficeUtil.ReleaseRCM(wkb);
if (excel != null)
ExcelTools.OfficeUtil.ReleaseRCM(excel);
}
}
}
}
I'll post the functions from ExcelTools tomorrow, I don't have that code with me either.
Edit:
As promised, here are the Functions from ExcelTools you might need.
public static Excel.Workbook OpenBook(Excel.Application excelInstance, string fileName, bool readOnly, bool editable,
bool updateLinks) {
Excel.Workbook book = excelInstance.Workbooks.Open(
fileName, updateLinks, readOnly,
Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing,
Type.Missing, editable, Type.Missing, Type.Missing, Type.Missing,
Type.Missing, Type.Missing);
return book;
}
public static void ReleaseRCM(object o) {
try {
System.Runtime.InteropServices.Marshal.ReleaseComObject(o);
} catch {
} finally {
o = null;
}
}
To be frank, this stuff is much easier if you use VB.NET. It's in C# because I didn't write it. VB.NET does option parameters well, C# does not, hence the Type.Missing. Once you typed Type.Missing twice in a row, you run screaming from the room!
As for you question, you can try to following:
http://msdn.microsoft.com/en-us/library/microsoft.office.interop.excel.range.find(VS.80).aspx
I will post an example when I get back from my meeting... cheers
Edit: Here is an example
range = sheet.Cells.Find("Value to Find",
Type.Missing,
Type.Missing,
Type.Missing,
Type.Missing,
Excel.XlSearchDirection.xlNext,
Type.Missing,
Type.Missing, Type.Missing);
range.Text; //give you the value found
Here is another example inspired by this site:
range = sheet.Cells.Find("Value to find", Type.Missing, Type.Missing,Excel.XlLookAt.xlWhole,Excel.XlSearchOrder.xlByColumns,Excel.XlSearchDirection.xlNext,false, false, Type.Missing);
It helps to understand the parameters.
P.S. I'm one of those weird people who enjoys learning COM automation. All this code steamed from a tool I wrote for work which required me to process over 1000+ spreadsheets from the lab each Monday.
You can use Microsoft.Office.Interop.Excel assembly to process excel files.
Right click on your project and go to Add reference. Add the
Microsoft.Office.Interop.Excel assembly.
Include using
Microsoft.Office.Interop.Excel; to make use of assembly.
Here is the sample code:
using Microsoft.Office.Interop.Excel;
//create the Application object we can use in the member functions.
Microsoft.Office.Interop.Excel.Application _excelApp = new Microsoft.Office.Interop.Excel.Application();
_excelApp.Visible = true;
string fileName = "C:\\sampleExcelFile.xlsx";
//open the workbook
Workbook workbook = _excelApp.Workbooks.Open(fileName,
Type.Missing, Type.Missing, Type.Missing, Type.Missing,
Type.Missing, Type.Missing, Type.Missing, Type.Missing,
Type.Missing, Type.Missing, Type.Missing, Type.Missing,
Type.Missing, Type.Missing);
//select the first sheet
Worksheet worksheet = (Worksheet)workbook.Worksheets[1];
//find the used range in worksheet
Range excelRange = worksheet.UsedRange;
//get an object array of all of the cells in the worksheet (their values)
object[,] valueArray = (object[,])excelRange.get_Value(
XlRangeValueDataType.xlRangeValueDefault);
//access the cells
for (int row = 1; row <= worksheet.UsedRange.Rows.Count; ++row)
{
for (int col = 1; col <= worksheet.UsedRange.Columns.Count; ++col)
{
//access each cell
Debug.Print(valueArray[row, col].ToString());
}
}
//clean up stuffs
workbook.Close(false, Type.Missing, Type.Missing);
Marshal.ReleaseComObject(workbook);
_excelApp.Quit();
Marshal.FinalReleaseComObject(_excelApp);
Why don't you create OleDbConnection? There are a lot of available resources in the Internet. Here is an example
OleDbConnection con = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source="+filename+";Extended Properties=Excel 8.0");
con.Open();
try
{
//Create Dataset and fill with imformation from the Excel Spreadsheet for easier reference
DataSet myDataSet = new DataSet();
OleDbDataAdapter myCommand = new OleDbDataAdapter(" SELECT * FROM ["+listname+"$]" , con);
myCommand.Fill(myDataSet);
con.Close();
richTextBox1.AppendText("\nDataSet Filled");
//Travers through each row in the dataset
foreach (DataRow myDataRow in myDataSet.Tables[0].Rows)
{
//Stores info in Datarow into an array
Object[] cells = myDataRow.ItemArray;
//Traverse through each array and put into object cellContent as type Object
//Using Object as for some reason the Dataset reads some blank value which
//causes a hissy fit when trying to read. By using object I can convert to
//String at a later point.
foreach (object cellContent in cells)
{
//Convert object cellContect into String to read whilst replacing Line Breaks with a defined character
string cellText = cellContent.ToString();
cellText = cellText.Replace("\n", "|");
//Read the string and put into Array of characters chars
richTextBox1.AppendText("\n"+cellText);
}
}
//Thread.Sleep(15000);
}
catch (Exception ex)
{
MessageBox.Show(ex.ToString());
//Thread.Sleep(15000);
}
finally
{
con.Close();
}
try
{
DataTable sheet1 = new DataTable("Excel Sheet");
OleDbConnectionStringBuilder csbuilder = new OleDbConnectionStringBuilder();
csbuilder.Provider = "Microsoft.ACE.OLEDB.12.0";
csbuilder.DataSource = fileLocation;
csbuilder.Add("Extended Properties", "Excel 12.0 Xml;HDR=YES");
string selectSql = #"SELECT * FROM [Sheet1$]";
using (OleDbConnection connection = new OleDbConnection(csbuilder.ConnectionString))
using (OleDbDataAdapter adapter = new OleDbDataAdapter(selectSql, connection))
{
connection.Open();
adapter.Fill(sheet1);
}
}
catch (Exception e)
{
Console.WriteLine(e.Message);
}
This worked for me. Please try it and let me know for queries.
First of all, it's important to know what you mean by "open an Excel file for reading and copy it to clipboard..."
This is very important because there are many ways you could do that depending just on what you intend to do. Let me explain:
If you want to read a set of data and copy that in the clipboard and you know the data format (e.g. column names), I suggest you use an OleDbConnection to open the file, this way you can treat the xls file content as a Database Table, so you can read data with SQL instruction and treat the data as you want.
If you want to do operations on the data with the Excel object model then open it in the way you began.
Some time it's possible to treat an xls file as a kind of csv file, there are tools like File Helpers which permit you to treat and open an xls file in a simple way by mapping a structure on an arbitrary object.
Another important point is in which Excel version the file is.
I have, unfortunately I say, a strong experience working with Office automation in all ways, even if bounded in concepts like Application Automation, Data Management and Plugins, and generally I suggest only as the last resort, to using Excel automation or Office automation to read data; just if there aren't better ways to accomplish that task.
Working with automation could be heavy in performance, in terms of resource cost, could involve in other issues related for example to security and more, and last but not at least, working with COM interop it's not so "free"..
So my suggestion is think and analyze the situation within your needs and then take the better way.
Here's a 2020 answer - if you don't need to support the older .xls format (so pre 2003) you could use either:
LightweightExcelReader to access specfic cells, or cursor through all the data in a spreadsheet.
or
ExcelToEnumerable if you want to map spreadsheet data to a list of objects.
Pros :
Performance - at the time of writing (the the fastest way to read an .xlsx file)[https://github.com/ChrisHodges/ExcelToEnumerable#performance].
Simplicity - less verbose than OLE DB or OpenXml
Cons:
Neither LightweightExcelReader nor ExcelToEnumerable support .xls files.
Disclaimer: I am the author of LightweightExcelReader and ExcelToEnumerable
Use Open XML.
Here is some code to process a spreadsheet with a specific tab or sheet name and dump it to something like CSV. (I chose a pipe instead of comma).
I wish it was easier to get the value from a cell, but I think this is what we are stuck with. You can see that I reference the MSDN documents where I got most of this code. That is what Microsoft recommends.
/// <summary>
/// Got code from: https://msdn.microsoft.com/en-us/library/office/gg575571.aspx
/// </summary>
[Test]
public void WriteOutExcelFile()
{
var fileName = "ExcelFiles\\File_With_Many_Tabs.xlsx";
var sheetName = "Submission Form"; // Existing tab name.
using (var document = SpreadsheetDocument.Open(fileName, isEditable: false))
{
var workbookPart = document.WorkbookPart;
var sheet = workbookPart.Workbook.Descendants<Sheet>().FirstOrDefault(s => s.Name == sheetName);
var worksheetPart = (WorksheetPart)(workbookPart.GetPartById(sheet.Id));
var sheetData = worksheetPart.Worksheet.Elements<SheetData>().First();
foreach (var row in sheetData.Elements<Row>())
{
foreach (var cell in row.Elements<Cell>())
{
Console.Write("|" + GetCellValue(cell, workbookPart));
}
Console.Write("\n");
}
}
}
/// <summary>
/// Got code from: https://msdn.microsoft.com/en-us/library/office/hh298534.aspx
/// </summary>
/// <param name="cell"></param>
/// <param name="workbookPart"></param>
/// <returns></returns>
private string GetCellValue(Cell cell, WorkbookPart workbookPart)
{
if (cell == null)
{
return null;
}
var value = cell.CellFormula != null
? cell.CellValue.InnerText
: cell.InnerText.Trim();
// If the cell represents an integer number, you are done.
// For dates, this code returns the serialized value that
// represents the date. The code handles strings and
// Booleans individually. For shared strings, the code
// looks up the corresponding value in the shared string
// table. For Booleans, the code converts the value into
// the words TRUE or FALSE.
if (cell.DataType == null)
{
return value;
}
switch (cell.DataType.Value)
{
case CellValues.SharedString:
// For shared strings, look up the value in the
// shared strings table.
var stringTable =
workbookPart.GetPartsOfType<SharedStringTablePart>()
.FirstOrDefault();
// If the shared string table is missing, something
// is wrong. Return the index that is in
// the cell. Otherwise, look up the correct text in
// the table.
if (stringTable != null)
{
value =
stringTable.SharedStringTable
.ElementAt(int.Parse(value)).InnerText;
}
break;
case CellValues.Boolean:
switch (value)
{
case "0":
value = "FALSE";
break;
default:
value = "TRUE";
break;
}
break;
}
return value;
}
Use OLEDB Connection to communicate with excel files. it gives better result
using System.Data.OleDb;
string physicalPath = "Your Excel file physical path";
OleDbCommand cmd = new OleDbCommand();
OleDbDataAdapter da = new OleDbDataAdapter();
DataSet ds = new DataSet();
String strNewPath = physicalPath;
String connString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + strNewPath + ";Extended Properties=\"Excel 12.0;HDR=Yes;IMEX=2\"";
String query = "SELECT * FROM [Sheet1$]"; // You can use any different queries to get the data from the excel sheet
OleDbConnection conn = new OleDbConnection(connString);
if (conn.State == ConnectionState.Closed) conn.Open();
try
{
cmd = new OleDbCommand(query, conn);
da = new OleDbDataAdapter(cmd);
da.Fill(ds);
}
catch
{
// Exception Msg
}
finally
{
da.Dispose();
conn.Close();
}
The Output data will be stored in dataset, using the dataset object you can easily access the datas.
Hope this may helpful
Using OlebDB, we can read excel file in C#, easily, here is the code while working with Web-Form, where FileUpload1 is file uploading tool
string path = Server.MapPath("~/Uploads/");
if (!Directory.Exists(path))
{
Directory.CreateDirectory(path);
}
//get file path
filePath = path + Path.GetFileName(FileUpload1.FileName);
//get file extenstion
string extension = Path.GetExtension(FileUpload1.FileName);
//save file on "Uploads" folder of project
FileUpload1.SaveAs(filePath);
string conString = string.Empty;
//check file extension
switch (extension)
{
case ".xls": //Excel 97-03.
conString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=Excel03ConString;Extended Properties='Excel 8.0;HDR=YES'";
break;
case ".xlsx": //Excel 07 and above.
conString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=Excel07ConString;Extended Properties='Excel 8.0;HDR=YES'";
break;
}
//create datatable object
DataTable dt = new DataTable();
conString = string.Format(conString, filePath);
//Use OldDb to read excel
using (OleDbConnection connExcel = new OleDbConnection(conString))
{
using (OleDbCommand cmdExcel = new OleDbCommand())
{
using (OleDbDataAdapter odaExcel = new OleDbDataAdapter())
{
cmdExcel.Connection = connExcel;
//Get the name of First Sheet.
connExcel.Open();
DataTable dtExcelSchema;
dtExcelSchema = connExcel.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
string sheetName = dtExcelSchema.Rows[0]["TABLE_NAME"].ToString();
connExcel.Close();
//Read Data from First Sheet.
connExcel.Open();
cmdExcel.CommandText = "SELECT * From [" + sheetName + "]";
odaExcel.SelectCommand = cmdExcel;
odaExcel.Fill(dt);
connExcel.Close();
}
}
}
//bind datatable with GridView
GridView1.DataSource = dt;
GridView1.DataBind();
Source : https://qawithexperts.com/article/asp-net/read-excel-file-and-import-data-into-gridview-using-datatabl/209
Console application similar code example
https://qawithexperts.com/article/c-sharp/read-excel-file-in-c-console-application-example-using-oledb/168
If you need don't want to use OleDB, you can try https://github.com/ExcelDataReader/ExcelDataReader
which seems to have the ability to handle both formats (.xls and .xslx)
Excel File Reader & Writer Without Excel On u'r System
Download and add the dll for
NPOI u'r project.
Using this code to read a excel file.
using (FileStream file = new FileStream(filePath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
{
XSSFWorkbook XSSFWorkbook = new XSSFWorkbook(file);
}
ISheet objxlWorkSheet = XSSFWorkbook.GetSheetAt(0);
int intRowCount = 1;
int intColumnCount = 0;
for (; ; )
{
IRow Row = objxlWorkSheet.GetRow(intRowCount);
if (Row != null)
{
ICell Cell = Row.GetCell(0);
ICell objCell = objxlWorkSheet.GetRow(intRowCount).GetCell(intColumnCount); }}
You can use ExcelDataReader see GitHub
You need to install nugets :
-ExcelDataReader
-ExcelDataReader.DataSet
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;
using System.IO;
using ExcelDataReader;
using System.Text;
/// <summary>
/// Excel parsing in this class is performed by using a common shareware Lib found on:
/// https://github.com/ExcelDataReader/ExcelDataReader
/// </summary>
public static class ExcelParser
{
/// <summary>
/// Load, read and get values from Excel sheet
/// </summary>
public static List<FileRow> GetExcelRows(string path, string sheetName, bool skipFirstLine)
{
if (File.Exists(path))
{
return GetValues(path, sheetName, skipFirstLine);
}
else
throw new Exception("The process cannot access the file");
}
/// <summary>
/// Parse sheet names from given Excel file.
/// </summary>
public static List<string> GetSheetNames(string path)
{
using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
{
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
using (var excelReader = GetExcelDataReader(path, stream))
{
var dataset = excelReader.AsDataSet(new ExcelDataSetConfiguration()
{
ConfigureDataTable = (_) => new ExcelDataTableConfiguration()
{
UseHeaderRow = true
}
});
var names = from DataTable table in dataset.Tables
select table.TableName;
return names.ToList();
}
}
}
/// <summary>
/// Parse values from Excel sheet and add to Rows collection.
/// </summary>
public static List<FileRow> GetValues(string path, string sheetName, bool skipFirstLine)
{
var rowItems = new List<FileRow>();
using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
{
using (var excelReader = GetExcelDataReader(path, stream))
{
var dataset = excelReader.AsDataSet(new ExcelDataSetConfiguration()
{
ConfigureDataTable = (_) => new ExcelDataTableConfiguration()
{
UseHeaderRow = skipFirstLine
}
});
foreach (DataRow row in dataset.Tables[sheetName].Rows)
{
var rowItem = new FileRow();
foreach (var value in row.ItemArray)
rowItem.Values.Add(value);
rowItems.Add(rowItem);
}
}
}
return rowItems;
}
private static IExcelDataReader GetExcelDataReader(string path, Stream stream)
{
var extension = GetExtension(path);
switch (extension)
{
case "xls":
return ExcelReaderFactory.CreateBinaryReader(stream);
case "xlsx":
return ExcelReaderFactory.CreateOpenXmlReader(stream);
default:
throw new Exception(string.Format("'{0}' is not a valid Excel extension", extension));
}
}
private static string GetExtension(string path)
{
var extension = Path.GetExtension(path);
return extension == null ? null : extension.ToLower().Substring(1);
}
}
With this entity :
public class FileRow
{
public List<object> Values { get; set; }
public FileRow()
{
Values = new List<object>();
}
}
Use like that :
var txtPath = #"D:\Path\excelfile.xlsx";
var sheetNames = ExcelParser.GetSheetNames(txtPath);
var datas = ExcelParser.GetExcelRows(txtPath, sheetNames[0], true);
The recommended way to read Excel files on server side app is Open XML.
Sharing few links -
https://msdn.microsoft.com/en-us/library/office/hh298534.aspx
https://msdn.microsoft.com/en-us/library/office/ff478410.aspx
https://msdn.microsoft.com/en-us/library/office/cc823095.aspx
public void excelRead(string sheetName)
{
Excel.Application appExl = new Excel.Application();
Excel.Workbook workbook = null;
try
{
string methodName = "";
Excel.Worksheet NwSheet;
Excel.Range ShtRange;
//Opening Excel file(myData.xlsx)
appExl = new Excel.Application();
workbook = appExl.Workbooks.Open(sheetName, Missing.Value, ReadOnly: false);
NwSheet = (Excel.Worksheet)workbook.Sheets.get_Item(1);
ShtRange = NwSheet.UsedRange; //gives the used cells in sheet
int rCnt1 = 0;
int cCnt1 = 0;
for (rCnt1 = 1; rCnt1 <= ShtRange.Rows.Count; rCnt1++)
{
for (cCnt1 = 1; cCnt1 <= ShtRange.Columns.Count; cCnt1++)
{
if (Convert.ToString(NwSheet.Cells[rCnt1, cCnt1].Value2) == "Y")
{
methodName = NwSheet.Cells[rCnt1, cCnt1 - 2].Value2;
Type metdType = this.GetType();
MethodInfo mthInfo = metdType.GetMethod(methodName);
if (Convert.ToString(NwSheet.Cells[rCnt1, cCnt1 - 2].Value2) == "fn_AddNum" || Convert.ToString(NwSheet.Cells[rCnt1, cCnt1 - 2].Value2) == "fn_SubNum")
{
StaticVariable.intParam1 = Convert.ToInt32(NwSheet.Cells[rCnt1, cCnt1 + 3].Value2);
StaticVariable.intParam2 = Convert.ToInt32(NwSheet.Cells[rCnt1, cCnt1 + 4].Value2);
object[] mParam1 = new object[] { StaticVariable.intParam1, StaticVariable.intParam2 };
object result = mthInfo.Invoke(this, mParam1);
StaticVariable.intOutParam1 = Convert.ToInt32(result);
NwSheet.Cells[rCnt1, cCnt1 + 5].Value2 = Convert.ToString(StaticVariable.intOutParam1) != "" ? Convert.ToString(StaticVariable.intOutParam1) : String.Empty;
}
else
{
object[] mParam = new object[] { };
mthInfo.Invoke(this, mParam);
NwSheet.Cells[rCnt1, cCnt1 + 5].Value2 = StaticVariable.outParam1 != "" ? StaticVariable.outParam1 : String.Empty;
NwSheet.Cells[rCnt1, cCnt1 + 6].Value2 = StaticVariable.outParam2 != "" ? StaticVariable.outParam2 : String.Empty;
}
NwSheet.Cells[rCnt1, cCnt1 + 1].Value2 = StaticVariable.resultOut;
NwSheet.Cells[rCnt1, cCnt1 + 2].Value2 = StaticVariable.resultDescription;
}
else if (Convert.ToString(NwSheet.Cells[rCnt1, cCnt1].Value2) == "N")
{
MessageBox.Show("Result is No");
}
else if (Convert.ToString(NwSheet.Cells[rCnt1, cCnt1].Value2) == "EOF")
{
MessageBox.Show("End of File");
}
}
}
workbook.Save();
workbook.Close(true, Missing.Value, Missing.Value);
appExl.Quit();
System.Runtime.InteropServices.Marshal.FinalReleaseComObject(ShtRange);
System.Runtime.InteropServices.Marshal.FinalReleaseComObject(NwSheet);
System.Runtime.InteropServices.Marshal.FinalReleaseComObject(workbook);
System.Runtime.InteropServices.Marshal.FinalReleaseComObject(appExl);
}
catch (Exception)
{
workbook.Close(true, Missing.Value, Missing.Value);
}
finally
{
GC.Collect();
GC.WaitForPendingFinalizers();
System.Runtime.InteropServices.Marshal.CleanupUnusedObjectsInCurrentContext();
}
}
//code for reading excel data in datatable
public void testExcel(string sheetName)
{
try
{
MessageBox.Show(sheetName);
foreach(Process p in Process.GetProcessesByName("EXCEL"))
{
p.Kill();
}
//string fileName = "E:\\inputSheet";
Excel.Application oXL;
Workbook oWB;
Worksheet oSheet;
Range oRng;
// creat a Application object
oXL = new Excel.Application();
// get WorkBook object
oWB = oXL.Workbooks.Open(sheetName);
// get WorkSheet object
oSheet = (Microsoft.Office.Interop.Excel.Worksheet)oWB.Sheets[1];
System.Data.DataTable dt = new System.Data.DataTable();
//DataSet ds = new DataSet();
//ds.Tables.Add(dt);
DataRow dr;
StringBuilder sb = new StringBuilder();
int jValue = oSheet.UsedRange.Cells.Columns.Count;
int iValue = oSheet.UsedRange.Cells.Rows.Count;
// get data columns
for (int j = 1; j <= jValue; j++)
{
oRng = (Microsoft.Office.Interop.Excel.Range)oSheet.Cells[1, j];
string strValue = oRng.Text.ToString();
dt.Columns.Add(strValue, System.Type.GetType("System.String"));
}
//string colString = sb.ToString().Trim();
//string[] colArray = colString.Split(':');
// get data in cell
for (int i = 2; i <= iValue; i++)
{
dr = dt.NewRow();
for (int j = 1; j <= jValue; j++)
{
oRng = (Microsoft.Office.Interop.Excel.Range)oSheet.Cells[i, j];
string strValue = oRng.Text.ToString();
dr[j - 1] = strValue;
}
dt.Rows.Add(dr);
}
if(StaticVariable.dtExcel != null)
{
StaticVariable.dtExcel.Clear();
StaticVariable.dtExcel = dt.Copy();
}
else
StaticVariable.dtExcel = dt.Copy();
oWB.Close(true, Missing.Value, Missing.Value);
oXL.Quit();
MessageBox.Show(sheetName);
}
catch (Exception ex)
{
MessageBox.Show(ex.Message);
}
finally
{
}
}
//code for class initialize
public static void startTesting(TestContext context)
{
Playback.Initialize();
ReadExcel myClassObj = new ReadExcel();
string sheetName="";
StreamReader sr = new StreamReader(#"E:\SaveSheetName.txt");
sheetName = sr.ReadLine();
sr.Close();
myClassObj.excelRead(sheetName);
myClassObj.testExcel(sheetName);
}
//code for test initalize
public void runValidatonTest()
{
DataTable dtFinal = StaticVariable.dtExcel.Copy();
for (int i = 0; i < dtFinal.Rows.Count; i++)
{
if (TestContext.TestName == dtFinal.Rows[i][2].ToString() && dtFinal.Rows[i][3].ToString() == "Y" && dtFinal.Rows[i][4].ToString() == "TRUE")
{
MessageBox.Show(TestContext.TestName);
MessageBox.Show(dtFinal.Rows[i][2].ToString());
StaticVariable.runValidateResult = "true";
break;
}
}
//StaticVariable.dtExcel = dtFinal.Copy();
}
I'd recommend you to use Bytescout Spreadsheet.
https://bytescout.com/products/developer/spreadsheetsdk/bytescoutspreadsheetsdk.html
I tried it with Monodevelop in Unity3D and it is pretty straight forward. Check this sample code to see how the library works:
https://bytescout.com/products/developer/spreadsheetsdk/read-write-excel.html
All,
I have been struggling to simply reference an Excel file so that I can extract its data into a new worksheet automatically.
I know this means to create a new Excel file and its items:
Excel.Application oXL;
Excel._Workbook oWB;
Excel._Worksheet oSheet;
Excel.Range oRng;
//Start Excel and get Application object.
oXL = new Excel.Application();
oXL.Visible = true;
//Get a new workbook.
oWB = (Excel._Workbook)(oXL.Workbooks.Add(Missing.Value));
oSheet = (Excel._Worksheet)oWB.ActiveSheet;
But this is where I am lost...I want the user to select another Excel file through a file dialog and then I want to copy data from said file into the new workbook above.
Ex: New file, user selects "MyExcel.csv". How would I reference this so that I can, say, copy Column A into the new worksheet? Whatever works with C#.
Maybe this helps:
Make sure your manage your Nugets and add the reference "Microsoft.Office.Interop.Excel" --> (References (Right click) --> Manage Nuget Packages --> Browse --> "Microsoft.Office.Interop.Excel" --> Install)
Note: This is is a copy paste into the same worksheet. To copy the data into another worksheet from another file just pass the "worksheet" as parameter to the method and paste it there.
public class Excel1
{
private readonly string excelSufix = ".xlsx";
private readonly string excelFilePrefix = "..."; //Where you excel file is located
private string getFilePath(string fileName)
{
return excelFilePrefix + fileName + excelSufix; //returns the file path by the given file name
}
public void CopyPaste(string fileName, int worksheet, string toCopyRange, string whereInsertRange)
{
//Range should look like = "A:C" or "D:F"
var excelApp = new Microsoft.Office.Interop.Excel.Application();
excelApp.Visible = true;
var workBook = excelApp.Workbooks.Open(getFilePath(fileName));
var workSheet = (Microsoft.Office.Interop.Excel.Worksheet)workBook.Sheets[worksheet];
var toCopy = workSheet.Range[toCopyRange];
var whereInsert = workSheet.Range[whereInsertRange];
whereInsert.Insert(Microsoft.Office.Interop.Excel.XlInsertShiftDirection.xlShiftToRight, toCopy.Cut());
}
}
Edit: You could add a constructor to the class where you pass the file Path and some class variables which reference the file, the worksheet etc etc... In this case, you will have independent objects for each file
Based on your description, you want to copy one column from the csv file and paste in the new excel file.
First, I convert the csv file to datatable.
Second, I select the specific column and convert one column to datatable.
Third, I convert the datatable to the new excel file.
You can try the following code.
using System;
using System.Data;
using Excel = Microsoft.Office.Interop.Excel;
class Program
{
static void Main(string[] args)
{
string path = "D:\\t.csv";
DataTable table = ReadCsv(path);
table= new DataView(table).ToTable(false, "Tests");
ExportToExcel(table, "D:\\t.xlsx");
}
public static DataTable ReadCsv(string path)
{
Microsoft.Office.Interop.Excel.Application objXL = null;
Microsoft.Office.Interop.Excel.Workbook objWB = null;
objXL = new Microsoft.Office.Interop.Excel.Application();
objWB = objXL.Workbooks.Open(path);
Microsoft.Office.Interop.Excel.Worksheet objSHT = objWB.Worksheets[1];
int rows = objSHT.UsedRange.Rows.Count;
int cols = objSHT.UsedRange.Columns.Count;
DataTable dt = new DataTable();
int noofrow = 1;
for (int c = 1; c <= cols; c++)
{
string colname = objSHT.Cells[1, c].Text;
dt.Columns.Add(colname);
noofrow = 2;
}
for (int r = noofrow; r <= rows; r++)
{
DataRow dr = dt.NewRow();
for (int c = 1; c <= cols; c++)
{
dr[c - 1] = objSHT.Cells[r, c].Text;
}
dt.Rows.Add(dr);
}
objWB.Close();
objXL.Quit();
return dt;
}
public static void ExportToExcel( DataTable tbl, string excelFilePath = null)
{
try
{
if (tbl == null || tbl.Columns.Count == 0)
throw new Exception("ExportToExcel: Null or empty input table!\n");
// load excel, and create a new workbook
var excelApp = new Excel.Application();
excelApp.Workbooks.Add();
// single worksheet
Excel._Worksheet workSheet = excelApp.ActiveSheet;
// column headings
for (var i = 0; i < tbl.Columns.Count; i++)
{
workSheet.Cells[1, i + 1] = tbl.Columns[i].ColumnName;
}
// rows
for (var i = 0; i < tbl.Rows.Count; i++)
{
// to do: format datetime values before printing
for (var j = 0; j < tbl.Columns.Count; j++)
{
workSheet.Cells[i + 2, j + 1] = tbl.Rows[i][j];
}
}
// check file path
if (!string.IsNullOrEmpty(excelFilePath))
{
try
{
workSheet.SaveAs(excelFilePath);
excelApp.Quit();
Console.WriteLine("Excel file saved!");
}
catch (Exception ex)
{
throw new Exception("ExportToExcel: Excel file could not be saved! Check filepath.\n"
+ ex.Message);
}
}
else
{ // no file path is given
excelApp.Visible = true;
}
}
catch (Exception ex)
{
throw new Exception("ExportToExcel: \n" + ex.Message);
}
}
}
Initial csv file.
The excel file:
I am using the following code to get data from excel sheet:-
private static DataTable GetDataFromCSVFile(string csv_file_path)
{
DataTable csvData = new DataTable();
using (TextFieldParser csvReader = new TextFieldParser(csv_file_path))
{
csvReader.SetDelimiters(new string[] { "," });
csvReader.HasFieldsEnclosedInQuotes = true;
string[] colFields = csvReader.ReadFields();
foreach (string column in colFields)
{
DataColumn datecolumn = new DataColumn(column);
datecolumn.AllowDBNull = true;
csvData.Columns.Add(datecolumn);
}
while (!csvReader.EndOfData)
{
string[] fieldData = csvReader.ReadFields();
//Making empty value as null
for (int i = 0; i < fieldData.Length; i++)
{
if (fieldData[i] == "")
{
fieldData[i] = null;
}
}
csvData.Rows.Add(fieldData);
}
return csvData;
}
}
but I need to retrieve same from sheets within same workbook. Any idea how can I do this using datatable in c#.
Thanks
"As users input data in Excel sheets" doesn't mean input into xls or xlsx file. It may mean that users use MS Excel to edit CSV. You can create "true excel" file with data from 4 source csv files and read a new single file, but you need another API, not TextFieldParser. To read new xlsx files i use DocumentFormat.OpenXml assembly. Quick sample is
using (var doc = SpreadsheetDocument.Open(_docPath, false))
{
var sheet = doc.WorkbookPart.Workbook.Sheets.Elements<Sheet>().First()
//Or iterate over all sheets, get sheet name, read it's data etc.
//i'm sure you can find a lot of samples
var worksheetPart = (WorksheetPart)doc.WorkbookPart.GetPartById(sheet.Id);
var sheetData = worksheetPart.Worksheet.GetFirstChild<SheetData>();
foreach (var row in sheetData.Elements<Row>())
{
//some code here
}
}
I know that there are different ways to read an Excel file:
Iterop
Oledb
Open Xml SDK
Compatibility is not a question because the program will be executed in a controlled environment.
My Requirement :
Read a file to a DataTable / CUstom Entities (I don't know how to make dynamic properties/fields to an object[column names will be variating in an Excel file])
Use DataTable/Custom Entities to perform some operations using its data.
Update DataTable with the results of the operations
Write it back to excel file.
Which would be simpler.
Also if possible advice me on custom Entities (adding properties/fields to an object dynamically)
Take a look at Linq-to-Excel. It's pretty neat.
var book = new LinqToExcel.ExcelQueryFactory(#"File.xlsx");
var query =
from row in book.Worksheet("Stock Entry")
let item = new
{
Code = row["Code"].Cast<string>(),
Supplier = row["Supplier"].Cast<string>(),
Ref = row["Ref"].Cast<string>(),
}
where item.Supplier == "Walmart"
select item;
It also allows for strongly-typed row access too.
I realize this question was asked nearly 7 years ago but it's still a top Google search result for certain keywords regarding importing excel data with C#, so I wanted to provide an alternative based on some recent tech developments.
Importing Excel data has become such a common task to my everyday duties, that I've streamlined the process and documented the method on my blog: best way to read excel file in c#.
I use NPOI because it can read/write Excel files without Microsoft Office installed and it doesn't use COM+ or any interops. That means it can work in the cloud!
But the real magic comes from pairing up with NPOI Mapper from Donny Tian because it allows me to map the Excel columns to properties in my C# classes without writing any code. It's beautiful.
Here is the basic idea:
I create a .net class that matches/maps the Excel columns I'm interested in:
class CustomExcelFormat
{
[Column("District")]
public int District { get; set; }
[Column("DM")]
public string FullName { get; set; }
[Column("Email Address")]
public string EmailAddress { get; set; }
[Column("Username")]
public string Username { get; set; }
public string FirstName
{
get
{
return Username.Split('.')[0];
}
}
public string LastName
{
get
{
return Username.Split('.')[1];
}
}
}
Notice, it allows me to map based on column name if I want to!
Then when I process the excel file all I need to do is something like this:
public void Execute(string localPath, int sheetIndex)
{
IWorkbook workbook;
using (FileStream file = new FileStream(localPath, FileMode.Open, FileAccess.Read))
{
workbook = WorkbookFactory.Create(file);
}
var importer = new Mapper(workbook);
var items = importer.Take<CustomExcelFormat>(sheetIndex);
foreach(var item in items)
{
var row = item.Value;
if (string.IsNullOrEmpty(row.EmailAddress))
continue;
UpdateUser(row);
}
DataContext.SaveChanges();
}
Now, admittedly, my code does not modify the Excel file itself. I am instead saving the data to a database using Entity Framework (that's why you see "UpdateUser" and "SaveChanges" in my example). But there is already a good discussion on SO about how to save/modify a file using NPOI.
Using OLE Query, it's quite simple (e.g. sheetName is Sheet1):
DataTable LoadWorksheetInDataTable(string fileName, string sheetName)
{
DataTable sheetData = new DataTable();
using (OleDbConnection conn = this.returnConnection(fileName))
{
conn.Open();
// retrieve the data using data adapter
OleDbDataAdapter sheetAdapter = new OleDbDataAdapter("select * from [" + sheetName + "$]", conn);
sheetAdapter.Fill(sheetData);
conn.Close();
}
return sheetData;
}
private OleDbConnection returnConnection(string fileName)
{
return new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + fileName + "; Jet OLEDB:Engine Type=5;Extended Properties=\"Excel 8.0;\"");
}
For newer Excel versions:
return new OleDbConnection("Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileName + ";Extended Properties=Excel 12.0;");
You can also use Excel Data Reader an open source project on CodePlex. Its works really well to export data from Excel sheets.
The sample code given on the link specified:
FileStream stream = File.Open(filePath, FileMode.Open, FileAccess.Read);
//1. Reading from a binary Excel file ('97-2003 format; *.xls)
IExcelDataReader excelReader = ExcelReaderFactory.CreateBinaryReader(stream);
//...
//2. Reading from a OpenXml Excel file (2007 format; *.xlsx)
IExcelDataReader excelReader = ExcelReaderFactory.CreateOpenXmlReader(stream);
//...
//3. DataSet - The result of each spreadsheet will be created in the result.Tables
DataSet result = excelReader.AsDataSet();
//...
//4. DataSet - Create column names from first row
excelReader.IsFirstRowAsColumnNames = true;
DataSet result = excelReader.AsDataSet();
//5. Data Reader methods
while (excelReader.Read())
{
//excelReader.GetInt32(0);
}
//6. Free resources (IExcelDataReader is IDisposable)
excelReader.Close();
Reference: How do I import from Excel to a DataSet using Microsoft.Office.Interop.Excel?
Try to use this free way to this, https://freenetexcel.codeplex.com
Workbook workbook = new Workbook();
workbook.LoadFromFile(#"..\..\parts.xls",ExcelVersion.Version97to2003);
//Initialize worksheet
Worksheet sheet = workbook.Worksheets[0];
DataTable dataTable = sheet.ExportDataTable();
If you can restrict it to just (Open Office XML format) *.xlsx files, then probably the most popular library would be EPPLus.
Bonus is, there are no other dependencies. Just install using nuget:
Install-Package EPPlus
Try to use Aspose.cells library (not free, but trial is enough to read), it is quite good
Install-package Aspose.cells
There is sample code:
using Aspose.Cells;
using System;
namespace ExcelReader
{
class Program
{
static void Main(string[] args)
{
// Replace path for your file
readXLS(#"C:\MyExcelFile.xls"); // or "*.xlsx"
Console.ReadKey();
}
public static void readXLS(string PathToMyExcel)
{
//Open your template file.
Workbook wb = new Workbook(PathToMyExcel);
//Get the first worksheet.
Worksheet worksheet = wb.Worksheets[0];
//Get cells
Cells cells = worksheet.Cells;
// Get row and column count
int rowCount = cells.MaxDataRow;
int columnCount = cells.MaxDataColumn;
// Current cell value
string strCell = "";
Console.WriteLine(String.Format("rowCount={0}, columnCount={1}", rowCount, columnCount));
for (int row = 0; row <= rowCount; row++) // Numeration starts from 0 to MaxDataRow
{
for (int column = 0; column <= columnCount; column++) // Numeration starts from 0 to MaxDataColumn
{
strCell = "";
strCell = Convert.ToString(cells[row, column].Value);
if (String.IsNullOrEmpty(strCell))
{
continue;
}
else
{
// Do your staff here
Console.WriteLine(strCell);
}
}
}
}
}
}
Read from excel, modify and write back
/// <summary>
/// /Reads an excel file and converts it into dataset with each sheet as each table of the dataset
/// </summary>
/// <param name="filename"></param>
/// <param name="headers">If set to true the first row will be considered as headers</param>
/// <returns></returns>
public DataSet Import(string filename, bool headers = true)
{
var _xl = new Excel.Application();
var wb = _xl.Workbooks.Open(filename);
var sheets = wb.Sheets;
DataSet dataSet = null;
if (sheets != null && sheets.Count != 0)
{
dataSet = new DataSet();
foreach (var item in sheets)
{
var sheet = (Excel.Worksheet)item;
DataTable dt = null;
if (sheet != null)
{
dt = new DataTable();
var ColumnCount = ((Excel.Range)sheet.UsedRange.Rows[1, Type.Missing]).Columns.Count;
var rowCount = ((Excel.Range)sheet.UsedRange.Columns[1, Type.Missing]).Rows.Count;
for (int j = 0; j < ColumnCount; j++)
{
var cell = (Excel.Range)sheet.Cells[1, j + 1];
var column = new DataColumn(headers ? cell.Value : string.Empty);
dt.Columns.Add(column);
}
for (int i = 0; i < rowCount; i++)
{
var r = dt.NewRow();
for (int j = 0; j < ColumnCount; j++)
{
var cell = (Excel.Range)sheet.Cells[i + 1 + (headers ? 1 : 0), j + 1];
r[j] = cell.Value;
}
dt.Rows.Add(r);
}
}
dataSet.Tables.Add(dt);
}
}
_xl.Quit();
return dataSet;
}
public string Export(DataTable dt, bool headers = false)
{
var wb = _xl.Workbooks.Add();
var sheet = (Excel.Worksheet)wb.ActiveSheet;
//process columns
for (int i = 0; i < dt.Columns.Count; i++)
{
var col = dt.Columns[i];
//added columns to the top of sheet
var currentCell = (Excel.Range)sheet.Cells[1, i + 1];
currentCell.Value = col.ToString();
currentCell.Font.Bold = true;
//process rows
for (int j = 0; j < dt.Rows.Count; j++)
{
var row = dt.Rows[j];
//added rows to sheet
var cell = (Excel.Range)sheet.Cells[j + 1 + 1, i + 1];
cell.Value = row[i];
}
currentCell.EntireColumn.AutoFit();
}
var fileName="{somepath/somefile.xlsx}";
wb.SaveCopyAs(fileName);
_xl.Quit();
return fileName;
}
I used Office's NuGet Package: DocumentFormat.OpenXml and pieced together the code from that component's doc site.
With the below helper code, was similar in complexity to my other CSV file format parsing in that project...
public static async Task ImportXLSX(Stream stream, string sheetName) {
{
// This was necessary for my Blazor project, which used a BrowserFileStream object
MemoryStream ms = new MemoryStream();
await stream.CopyToAsync(ms);
using (var document = SpreadsheetDocument.Open(ms, false))
{
// Retrieve a reference to the workbook part.
WorkbookPart wbPart = document.WorkbookPart;
// Find the sheet with the supplied name, and then use that
// Sheet object to retrieve a reference to the first worksheet.
Sheet theSheet = wbPart?.Workbook.Descendants<Sheet>().Where(s => s?.Name == sheetName).FirstOrDefault();
// Throw an exception if there is no sheet.
if (theSheet == null)
{
throw new ArgumentException("sheetName");
}
WorksheetPart wsPart = (WorksheetPart)(wbPart.GetPartById(theSheet.Id));
// For shared strings, look up the value in the
// shared strings table.
var stringTable =
wbPart.GetPartsOfType<SharedStringTablePart>()
.FirstOrDefault();
// I needed to grab 4 cells from each row
// Starting at row 11, until the cell in column A is blank
int row = 11;
while (true) {
var accountNameCell = GetCell(wsPart, "A" + row.ToString());
var accountName = GetValue(accountNameCell, stringTable);
if (string.IsNullOrEmpty(accountName)) {
break;
}
var investmentNameCell = GetCell(wsPart, "B" + row.ToString());
var investmentName = GetValue(investmentNameCell, stringTable);
var symbolCell = GetCell(wsPart, "D" + row.ToString());
var symbol = GetValue(symbolCell, stringTable);
var marketValue = GetCell(wsPart, "J" + row.ToString()).InnerText;
// DO STUFF with data
row++;
}
}
}
private static string? GetValue(Cell cell, SharedStringTablePart stringTable) {
try {
return stringTable.SharedStringTable.ElementAt(int.Parse(cell.InnerText)).InnerText;
} catch (Exception) {
return null;
}
}
private static Cell GetCell(WorksheetPart wsPart, string cellReference) {
return wsPart.Worksheet.Descendants<Cell>().Where(c => c.CellReference.Value == cellReference)?.FirstOrDefault();
}
I'm using OleDb to read from an excel workbook with many sheets.
I need to read the sheet names, but I need them in the order they are defined in the spreadsheet; so If I have a file that looks like this;
|_____|_____|____|____|____|____|____|____|____|
|_____|_____|____|____|____|____|____|____|____|
|_____|_____|____|____|____|____|____|____|____|
\__GERMANY__/\__UK__/\__IRELAND__/
Then I need to get the dictionary
1="GERMANY",
2="UK",
3="IRELAND"
I've tried using OleDbConnection.GetOleDbSchemaTable(), and that gives me the list of names, but it alphabetically sorts them. The alpha-sort means I don't know which sheet number a particular name corresponds to. So I get;
GERMANY, IRELAND, UK
which has changed the order of UK and IRELAND.
The reason I need it to be sorted is that I have to let the user choose a range of data by name or index; they can ask for 'all the data from GERMANY to IRELAND' or 'data from sheet 1 to sheet 3'.
Any ideas would be greatly appreciated.
if I could use the office interop classes, this would be straightforward. Unfortunately, I can't because the interop classes don't work reliably in non-interactive environments such as windows services and ASP.NET sites, so I needed to use OLEDB.
Can you not just loop through the sheets from 0 to Count of names -1? that way you should get them in the correct order.
Edit
I noticed through the comments that there are a lot of concerns about using the Interop classes to retrieve the sheet names. Therefore here is an example using OLEDB to retrieve them:
/// <summary>
/// This method retrieves the excel sheet names from
/// an excel workbook.
/// </summary>
/// <param name="excelFile">The excel file.</param>
/// <returns>String[]</returns>
private String[] GetExcelSheetNames(string excelFile)
{
OleDbConnection objConn = null;
System.Data.DataTable dt = null;
try
{
// Connection String. Change the excel file to the file you
// will search.
String connString = "Provider=Microsoft.Jet.OLEDB.4.0;" +
"Data Source=" + excelFile + ";Extended Properties=Excel 8.0;";
// Create connection object by using the preceding connection string.
objConn = new OleDbConnection(connString);
// Open connection with the database.
objConn.Open();
// Get the data table containg the schema guid.
dt = objConn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
if(dt == null)
{
return null;
}
String[] excelSheets = new String[dt.Rows.Count];
int i = 0;
// Add the sheet name to the string array.
foreach(DataRow row in dt.Rows)
{
excelSheets[i] = row["TABLE_NAME"].ToString();
i++;
}
// Loop through all of the sheets if you want too...
for(int j=0; j < excelSheets.Length; j++)
{
// Query each excel sheet.
}
return excelSheets;
}
catch(Exception ex)
{
return null;
}
finally
{
// Clean up.
if(objConn != null)
{
objConn.Close();
objConn.Dispose();
}
if(dt != null)
{
dt.Dispose();
}
}
}
Extracted from Article on the CodeProject.
Since above code do not cover procedures for extracting list of sheet name for Excel 2007,following code will be applicable for both Excel(97-2003) and Excel 2007 too:
public List<string> ListSheetInExcel(string filePath)
{
OleDbConnectionStringBuilder sbConnection = new OleDbConnectionStringBuilder();
String strExtendedProperties = String.Empty;
sbConnection.DataSource = filePath;
if (Path.GetExtension(filePath).Equals(".xls"))//for 97-03 Excel file
{
sbConnection.Provider = "Microsoft.Jet.OLEDB.4.0";
strExtendedProperties = "Excel 8.0;HDR=Yes;IMEX=1";//HDR=ColumnHeader,IMEX=InterMixed
}
else if (Path.GetExtension(filePath).Equals(".xlsx")) //for 2007 Excel file
{
sbConnection.Provider = "Microsoft.ACE.OLEDB.12.0";
strExtendedProperties = "Excel 12.0;HDR=Yes;IMEX=1";
}
sbConnection.Add("Extended Properties",strExtendedProperties);
List<string> listSheet = new List<string>();
using (OleDbConnection conn = new OleDbConnection(sbConnection.ToString()))
{
conn.Open();
DataTable dtSheet = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
foreach (DataRow drSheet in dtSheet.Rows)
{
if (drSheet["TABLE_NAME"].ToString().Contains("$"))//checks whether row contains '_xlnm#_FilterDatabase' or sheet name(i.e. sheet name always ends with $ sign)
{
listSheet.Add(drSheet["TABLE_NAME"].ToString());
}
}
}
return listSheet;
}
Above function returns list of sheet in particular excel file for both excel type(97,2003,2007).
Can't find this in actual MSDN documentation, but a moderator in the forums said
I am afraid that OLEDB does not preserve the sheet order as they were in Excel
Excel Sheet Names in Sheet Order
Seems like this would be a common enough requirement that there would be a decent workaround.
This is short, fast, safe, and usable...
public static List<string> ToExcelsSheetList(string excelFilePath)
{
List<string> sheets = new List<string>();
using (OleDbConnection connection =
new OleDbConnection((excelFilePath.TrimEnd().ToLower().EndsWith("x"))
? "Provider=Microsoft.ACE.OLEDB.12.0;Data Source='" + excelFilePath + "';" + "Extended Properties='Excel 12.0 Xml;HDR=YES;'"
: "provider=Microsoft.Jet.OLEDB.4.0;Data Source='" + excelFilePath + "';Extended Properties=Excel 8.0;"))
{
connection.Open();
DataTable dt = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
foreach (DataRow drSheet in dt.Rows)
if (drSheet["TABLE_NAME"].ToString().Contains("$"))
{
string s = drSheet["TABLE_NAME"].ToString();
sheets.Add(s.StartsWith("'")?s.Substring(1, s.Length - 3): s.Substring(0, s.Length - 1));
}
connection.Close();
}
return sheets;
}
Another way:
a xls(x) file is just a collection of *.xml files stored in a *.zip container.
unzip the file "app.xml" in the folder docProps.
<?xml version="1.0" encoding="UTF-8" standalone="true"?>
-<Properties xmlns:vt="http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes" xmlns="http://schemas.openxmlformats.org/officeDocument/2006/extended-properties">
<TotalTime>0</TotalTime>
<Application>Microsoft Excel</Application>
<DocSecurity>0</DocSecurity>
<ScaleCrop>false</ScaleCrop>
-<HeadingPairs>
-<vt:vector baseType="variant" size="2">
-<vt:variant>
<vt:lpstr>Arbeitsblätter</vt:lpstr>
</vt:variant>
-<vt:variant>
<vt:i4>4</vt:i4>
</vt:variant>
</vt:vector>
</HeadingPairs>
-<TitlesOfParts>
-<vt:vector baseType="lpstr" size="4">
<vt:lpstr>Tabelle3</vt:lpstr>
<vt:lpstr>Tabelle4</vt:lpstr>
<vt:lpstr>Tabelle1</vt:lpstr>
<vt:lpstr>Tabelle2</vt:lpstr>
</vt:vector>
</TitlesOfParts>
<Company/>
<LinksUpToDate>false</LinksUpToDate>
<SharedDoc>false</SharedDoc>
<HyperlinksChanged>false</HyperlinksChanged>
<AppVersion>14.0300</AppVersion>
</Properties>
The file is a german file (Arbeitsblätter = worksheets).
The table names (Tabelle3 etc) are in the correct order. You just need to read these tags;)
regards
I have created the below function using the information provided in the answer from #kraeppy (https://stackoverflow.com/a/19930386/2617732). This requires the .net framework v4.5 to be used and requires a reference to System.IO.Compression. This only works for xlsx files and not for the older xls files.
using System.IO.Compression;
using System.Xml;
using System.Xml.Linq;
static IEnumerable<string> GetWorksheetNamesOrdered(string fileName)
{
//open the excel file
using (FileStream data = new FileStream(fileName, FileMode.Open))
{
//unzip
ZipArchive archive = new ZipArchive(data);
//select the correct file from the archive
ZipArchiveEntry appxmlFile = archive.Entries.SingleOrDefault(e => e.FullName == "docProps/app.xml");
//read the xml
XDocument xdoc = XDocument.Load(appxmlFile.Open());
//find the titles element
XElement titlesElement = xdoc.Descendants().Where(e => e.Name.LocalName == "TitlesOfParts").Single();
//extract the worksheet names
return titlesElement
.Elements().Where(e => e.Name.LocalName == "vector").Single()
.Elements().Where(e => e.Name.LocalName == "lpstr")
.Select(e => e.Value);
}
}
I like the idea of #deathApril to name the sheets as 1_Germany, 2_UK, 3_IRELAND. I also got your issue to do this rename for hundreds of sheets. If you don't have a problem to rename the sheet name then you can use this macro to do it for you. It will take less than seconds to rename all sheet names. unfortunately ODBC, OLEDB return the sheet name order by asc. There is no replacement for that. You have to either use COM or rename your name to be in the order.
Sub Macro1()
'
' Macro1 Macro
'
'
Dim i As Integer
For i = 1 To Sheets.Count
Dim prefix As String
prefix = i
If Len(prefix) < 4 Then
prefix = "000"
ElseIf Len(prefix) < 3 Then
prefix = "00"
ElseIf Len(prefix) < 2 Then
prefix = "0"
End If
Dim sheetName As String
sheetName = Sheets(i).Name
Dim names
names = Split(sheetName, "-")
If (UBound(names) > 0) And IsNumeric(names(0)) Then
'do nothing
Else
Sheets(i).Name = prefix & i & "-" & Sheets(i).Name
End If
Next
End Sub
UPDATE:
After reading #SidHoland comment regarding BIFF an idea flashed. The following steps can be done through code. Don't know if you really want to do that to get the sheet names in the same order. Let me know if you need help to do this through code.
1. Consider XLSX as a zip file. Rename *.xlsx into *.zip
2. Unzip
3. Go to unzipped folder root and open /docprops/app.xml
4. This xml contains the sheet name in the same order of what you see.
5. Parse the xml and get the sheet names
UPDATE:
Another solution - NPOI might be helpful here
http://npoi.codeplex.com/
FileStream file = new FileStream(#"yourexcelfilename", FileMode.Open, FileAccess.Read);
HSSFWorkbook hssfworkbook = new HSSFWorkbook(file);
for (int i = 0; i < hssfworkbook.NumberOfSheets; i++)
{
Console.WriteLine(hssfworkbook.GetSheetName(i));
}
file.Close();
This solution works for xls. I didn't try xlsx.
Thanks,
Esen
This worked for me. Stolen from here: How do you get the name of the first page of an excel workbook?
object opt = System.Reflection.Missing.Value;
Excel.Application app = new Microsoft.Office.Interop.Excel.Application();
Excel.Workbook workbook = app.Workbooks.Open(WorkBookToOpen,
opt, opt, opt, opt, opt, opt, opt,
opt, opt, opt, opt, opt, opt, opt);
Excel.Worksheet worksheet = workbook.Worksheets[1] as Microsoft.Office.Interop.Excel.Worksheet;
string firstSheetName = worksheet.Name;
Try this. Here is the code to get the sheet names in order.
private Dictionary<int, string> GetExcelSheetNames(string fileName)
{
Excel.Application _excel = null;
Excel.Workbook _workBook = null;
Dictionary<int, string> excelSheets = new Dictionary<int, string>();
try
{
object missing = Type.Missing;
object readOnly = true;
Excel.XlFileFormat.xlWorkbookNormal
_excel = new Excel.ApplicationClass();
_excel.Visible = false;
_workBook = _excel.Workbooks.Open(fileName, 0, readOnly, 5, missing,
missing, true, Excel.XlPlatform.xlWindows, "\\t", false, false, 0, true, true, missing);
if (_workBook != null)
{
int index = 0;
foreach (Excel.Worksheet sheet in _workBook.Sheets)
{
// Can get sheet names in order they are in workbook
excelSheets.Add(++index, sheet.Name);
}
}
}
catch (Exception e)
{
return null;
}
finally
{
if (_excel != null)
{
if (_workBook != null)
_workBook.Close(false, Type.Missing, Type.Missing);
_excel.Application.Quit();
}
_excel = null;
_workBook = null;
}
return excelSheets;
}
As per MSDN, In a case of spreadsheets inside of Excel it might not work because Excel files are not real databases. So you will be not able to get the sheets name in order of their visualization in workbook.
Code to get sheets name as per their visual appearance using interop:
Add reference to Microsoft Excel 12.0 Object Library.
Following code will give the sheets name in the actual order stored in workbook, not the sorted name.
Sample Code:
using Microsoft.Office.Interop.Excel;
string filename = "C:\\romil.xlsx";
object missing = System.Reflection.Missing.Value;
Microsoft.Office.Interop.Excel.Application excel = new Microsoft.Office.Interop.Excel.Application();
Microsoft.Office.Interop.Excel.Workbook wb =excel.Workbooks.Open(filename, missing, missing, missing, missing,missing, missing, missing, missing, missing, missing, missing, missing, missing, missing);
ArrayList sheetname = new ArrayList();
foreach (Microsoft.Office.Interop.Excel.Worksheet sheet in wb.Sheets)
{
sheetname.Add(sheet.Name);
}
I don't see any documentation that says the order in app.xml is guaranteed to be the order of the sheets. It PROBABLY is, but not according to the OOXML specification.
The workbook.xml file, on the other hand, includes the sheetId attribute, which does determine the sequence - from 1 to the number of sheets. This is according to the OOXML specification. workbook.xml is described as the place where the sequence of the sheets is kept.
So reading workbook.xml after it is extracted form the XLSX would be my recommendation. NOT app.xml. Instead of docProps/app.xml, use xl/workbook.xml and look at the element, as shown here -
`
<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
<fileVersion appName="xl" lastEdited="5" lowestEdited="5" rupBuild="9303" />
<workbookPr defaultThemeVersion="124226" />
- <bookViews>
<workbookView xWindow="120" yWindow="135" windowWidth="19035" windowHeight="8445" />
</bookViews>
- <sheets>
<sheet name="By song" sheetId="1" r:id="rId1" />
<sheet name="By actors" sheetId="2" r:id="rId2" />
<sheet name="By pit" sheetId="3" r:id="rId3" />
</sheets>
- <definedNames>
<definedName name="_xlnm._FilterDatabase" localSheetId="0" hidden="1">'By song'!$A$1:$O$59</definedName>
</definedNames>
<calcPr calcId="145621" />
</workbook>
`