Write an Excel file with C# [duplicate]

Is there an easy-to-implement library that can be used to read Excel files and maybe create them later on?
Is this my best bet?
http://support.microsoft.com/kb/302084

Try this: http://epplus.codeplex.com
EPPlus is a .NET library that reads and writes Excel 2007/2010 files
using the Office Open XML format (xlsx).

If you are willing to commit yourself to a later version of Excel (2007+) you can also take a look at the OpenXML SDK. It's free, doesn't tie you to having MS Office installed on the machine it will be running on and there are quite a few resources available on how to use it online (including blogs from the OpenXML team).

There is ExcelPackage Plus (EPPlus):
http://epplus.codeplex.com/
It only works on xlsx, but Office 2003 is cycling out anyway.

You can use ExcelLibrary, although it works only for .xls, which is the 2003 format.
The aim of this project is to provide a native .NET solution to create, read and modify Excel files without using COM interop or an OLEDB connection.
I had a chance to use EPPlus and it was wonderful :). It works with the newer .xlsx format used by Excel 2007/2010.
EPPlus is a .NET library that lets you read and write Excel files and create charts, pictures, shapes and much more.
Also take a look at this SO post

I've used OLEDB and interop, and just started using EPPlus. So far EPPlus is proving to be the simplest.
http://epplus.codeplex.com/
However, I just posted a problem I have with epplus, but I posted some code you could use as reference.
c# epplus error Removed Part: Drawing shape

I like to use ExcelDataReader for reading and the aforementioned EPPlus for writing.
Here's an example of reading with it:
FileStream stream = File.Open(filePath, FileMode.Open, FileAccess.Read);
// Reading from a binary Excel file ('97-2003 format; *.xls)
// IExcelDataReader excelReader = ExcelReaderFactory.CreateBinaryReader(stream);
// Reading from an OpenXml Excel file (2007 format; *.xlsx)
IExcelDataReader excelReader = ExcelReaderFactory.CreateOpenXmlReader(stream);
// DataSet - each worksheet becomes one table in result.Tables
DataSet result = excelReader.AsDataSet();
// Free resources (IExcelDataReader is IDisposable)
excelReader.Close();
var cdm = new ValueSetRepository();
for (int i = 0; i < result.Tables.Count; i++)
{
    // CHECK if tableNames filtering is specified
    if (tableNames != null)
    {
        // CHECK if a table matches the specified table names
        var tablename = result.Tables[i].TableName;
        if (!tableNames.Contains(tablename))
        {
            continue;
        }
    }
    var lookup = new ValueSetLookup();
    lookup.CmsId = result.Tables[i].Rows[2][0].ToString();
    lookup.NqfNumber = result.Tables[i].Rows[2][1].ToString();
    lookup.Data = new List<ValueSetAttribute>();
    int row_no = 2;
    // i is the index of the table (sheet) being processed
    while (row_no < result.Tables[i].Rows.Count)
    {
        var currRow = result.Tables[i].Rows[row_no];
        var valueSetAttribute = new ValueSetAttribute()
        {
            Id = currRow[0].ToString(),
            Number = currRow[1].ToString(),
            tName = currRow[2].ToString(),
            Code = currRow[7].ToString(),
            Description = currRow[8].ToString(),
        };
        lookup.Data.Add(valueSetAttribute);
        row_no++;
    }
    cdm.AddRecord(lookup);
}

A company I used to work for did a lot of research on this and decided a product by SoftArtisans was their best bet:
OfficeWriter
I always found it strange how weak the support for Excel reading and writing was. I'm pretty sure that if you use Microsoft's libraries you have to have Excel installed anyway which is an extra expense just like OfficeWriter.

You could either go for VBA or use the free library from FileHelpers. If you are planning to buy a commercial solution, I would recommend ASPOSE.

According to this website you need to include a reference to the Microsoft Excel 12.0 Object library. From there, you need to do a few things to open up the file. There's a code sample on the website.
PS - Sorry it's not too detailed but I couldn't find the Microsoft Office developer reference with more details.

I used ExcelLibrary with great results! (So far it supports only Excel 2003 or lower versions.)
http://code.google.com/p/excellibrary/

Yes, multiple open-source libraries exist to help read and/or write Excel spreadsheets using C#.
Here is a shortlist of C# libraries:
Microsoft.Office.Interop.Excel
ExcelDataReader
NPOI
ExcelMapper - NPOI extension
EPPlus
An up-to-date curated list is maintained here.
Example: Reading Excel File using ExcelMapper
a. Install using NuGet, by running the command below in the NuGet Package Manager Console:
Install-Package ExcelMapper
b. Sample C# Code for ExcelMapper
public void ReadExcelUsingExcelMapperExtension()
{
    string filePath = @"C:\Temp\ListOfPeople.xlsx";
    // ToList() needs: using System.Linq;
    var people = new ExcelMapper(filePath).Fetch<Person>().ToList();
}
public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int Age { get; set; }
}
Disclaimer: I like the conciseness of ExcelMapper, therefore included sample code for this package. To do the same using other libraries, requires a lot more code.

Related

Open workbook in Interop.Excel from byte array

I want to open Excel from a byte[] because my file is encrypted, and I want to open it after decrypting without writing the result to a file.
The workbook has "restricted access" and I want to open my file with this protection, but without saving the decrypted content to a file.
myApp.Workbooks.Open only supports a path.
Is it possible?
As an alternative to OpenXml there's also ExcelDataReader, which in my experience is a lot faster at processing data than Interop.Excel (around 3 times or more).
It can also open encrypted Excel files directly.
The GitHub page for ExcelDataReader has some great examples of how to use it. The only thing you'd have to do is:
This:
using (var stream = File.Open(filePath, FileMode.Open, FileAccess.Read))
Becomes this:
using (var stream = new MemoryStream(yourBytes)) // yourBytes is the decrypted byte[]
And if you just want to open the password-protected Excel file you'd do this:
var conf = new ExcelReaderConfiguration { Password = "yourPassword" }; // Add this
excelReader = ExcelReaderFactory.CreateReader(stream, conf); // Change the Excel reader to this
Make sure to check the Github page for more info!
It is not possible, because Interop is really an interface for programs to drive an existing Excel installation on the computer.
I think you need to work with the Open XML format instead. Microsoft's SDK for Excel, Word and PowerPoint files is:
DocumentFormat.OpenXml
Alternatively, with EPPlus (the ExcelPackage class lives in its OfficeOpenXml namespace, not in the DocumentFormat.OpenXml package) you can use:
ExcelPackage excelPackage = new ExcelPackage(stream)
or
var pck = new OfficeOpenXml.ExcelPackage();
pck.Load(File.OpenRead(path));
pck.Load(Stream) can take any stream as input, not only one from a file.
It depends on your needs.

Convert a DataTable to .xls, .xlsx or .csv using the delimiter given as input

I want a method that writes DataTable data to .xls, .xlsx or .csv, based on the file format and delimiter provided as input.
public class DataTableExtensions
{
    /* Input params: DataTable input
       fileFormat (.xls, .csv, .xlsx)
       delimeter ('\t' (tab) or , (comma) or | (pipe symbol))
       filePath - any local folder */
    public void WriteToCsvFile(DataTable dataTable, string fileFormat, string delimeter, string filePath)
    {
        // Code to convert file based on the input
        // Code to create file
        System.IO.File.WriteAllText(filePath, fileContent.ToString());
    }
}
You said in the comments that it is only 1000 rows every 2 hours. That is an acceptable amount of data for a C# program. I would say the big question left is which output format you use.
.CSV is the simplest one. This format can be written with File.WriteAllText() (or a StreamWriter) and some string concatenation. There is no built-in CSV parser or writer code that I am aware of in C#, but there is plenty of 3rd party code.
.XLS requires the (t)rusty Office COM Interop. That requires Office to be installed and does not work from a non-interactive session (like a Windows Service), on top of all the normal issues with using COM interop.
There is the odd "export to XLS" function on existing classes, but those are rare and far between, and they are about everything you get. Unfortunately, as we always had COM Interop as a fallback, we never quite developed a standalone library for working with .XLS. Ironically, working with this old format is harder from C#/.NET than it would be from Java.
.XLSX however is easier. It can be written using the OpenXML SDK, or with an XML writer and the ZipArchive class: at their core all the ???x formats are a bunch of .XML files in a renamed .ZIP container. There should even be 3rd party code out there to make using the SDK easier.
.CSV is the lowest common denominator and probably the easiest to create. However, if a user is supposed to open this document, the lack of formatting might become an issue.
.XLSX would be my choice if you need a user to open it.
.XLS I would avoid like a swarm of angry bees.
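For the .CSV route described above, a minimal sketch using only the base class library might look like the following. The class and method names (DelimitedTextWriter, ToDelimitedText, WriteToFile) are made up for illustration, and the quoting rule (wrap a field in double quotes when it contains the delimiter, a quote or a line break) is an assumption borrowed from common CSV practice, not something the question specifies.

```csharp
using System;
using System.Data;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Text;

// Hypothetical helper: turns a DataTable into delimited text (comma, tab, pipe, ...).
public static class DelimitedTextWriter
{
    // Builds a header line plus one line per row, joined with the given delimiter.
    public static string ToDelimitedText(DataTable table, string delimiter)
    {
        var sb = new StringBuilder();
        var headers = table.Columns.Cast<DataColumn>()
                           .Select(c => Escape(c.ColumnName, delimiter));
        sb.AppendLine(string.Join(delimiter, headers));

        foreach (DataRow row in table.Rows)
        {
            // DBNull converts to an empty string; values use the invariant culture
            var fields = row.ItemArray
                            .Select(v => Escape(Convert.ToString(v, CultureInfo.InvariantCulture), delimiter));
            sb.AppendLine(string.Join(delimiter, fields));
        }
        return sb.ToString();
    }

    // Writes the delimited text to disk as UTF-8.
    public static void WriteToFile(DataTable table, string delimiter, string filePath)
    {
        File.WriteAllText(filePath, ToDelimitedText(table, delimiter), Encoding.UTF8);
    }

    // Assumed quoting rule: quote fields containing the delimiter, a quote or a line break.
    private static string Escape(string value, string delimiter)
    {
        value = value ?? string.Empty;
        bool needsQuotes = value.Contains(delimiter) || value.Contains("\"")
                           || value.Contains("\n") || value.Contains("\r");
        return needsQuotes ? "\"" + value.Replace("\"", "\"\"") + "\"" : value;
    }
}
```

Calling DelimitedTextWriter.WriteToFile(table, "\t", path) would produce a tab-separated file; the .xls and .xlsx formats still need one of the libraries discussed in this thread.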
I wrote this console program to convert a DataTable to .xls or .xlsx; for text files I wrote simple StreamWriter logic. It works well.
I first installed the package below from the Package Manager Console; the code uses the ExpertXls package.
I am not sure whether I can share its license key, so please obtain a key and put it in the config file before running.
Package Manager Console - Install-Package ExpertXls.ExcelLibrary -Version 5.0.0
Code :
--------
private static void GenerateTxtFileFromDataTable(DataTable sampleDataTable, string delimiter)
{
    // Get the license key value from config
    var _expertxlsLK = ConfigurationManager.AppSettings["ExpertxlsLK"];
    // Workbook format: Excel 2003 (.xls) or Excel 2007 (.xlsx).
    // The enum member name below is taken from ExpertXls samples; check it against your version.
    ExcelWorkbookFormat workbookFormat = ExcelWorkbookFormat.Xlsx_2007;
    // Create the workbook in the desired format with a single worksheet
    ExcelWorkbook workbook = new ExcelWorkbook(workbookFormat);
    workbook.EnableFormulaCalculations();
    workbook.LicenseKey = _expertxlsLK;
    // Get the first worksheet in the workbook
    ExcelWorksheet worksheet = workbook.Worksheets[0];
    // Set the default worksheet name
    worksheet.Name = "ClaimInformation";
    // Load data from the DataTable into the worksheet
    worksheet.LoadDataTable(sampleDataTable, 1, 1, true);
    worksheet.Workbook.EnableFormulaCalculations();
    workbook.Save(@"M:\Rupesh\test.xlsx");
    workbook.Close();
}

Save Excel as PDF

I open an Excel file in C#, make some changes, and want to save it as a PDF file.
I searched for this and found:
Microsoft.Office.Interop.Excel._Workbook oWB;
oWB.ExportAsFixedFormat(XlFixedFormatType.xlTypePDF, "D:\\xxxxx.pdf");
but this code sometimes opens a dialog in which a printer must be selected, and I don't know why.
Is there any other way to export a PDF from Excel?
I saw that Workbook.SaveAs() takes a FileFormat argument. How can we use it?
Check out Spire.Xls, below is the code for converting Excel to PDF.
Workbook workbook = new Workbook();
workbook.LoadFromFile("Sample.xlsx");
//If you want to make the excel content fit to pdf page
//workbook.ConverterSetting.SheetFitToPage = true;
workbook.SaveToFile("result.pdf", Spire.Xls.FileFormat.PDF);
It has a free version which you can use without charge:
https://www.e-iceblue.com/Introduce/free-xls-component.html (Also available on NuGet)
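Use iTextSharp. It is native .NET code and doesn't require any Excel interop. Note that it is a PDF generation library rather than an Excel converter, so you read the workbook data yourself and lay it out in the PDF -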
https://www.nuget.org/packages/itextsharp/
If you're looking for an alternative approach, then check out GemBox.Spreadsheet:
https://www.nuget.org/packages/GemBox.Spreadsheet/
Here is how you can save Excel to PDF with it:
ExcelFile excel = ExcelFile.Load("D:\\xxxxx.xlsx");
excel.Save("D:\\xxxxx.pdf");
Or you can write it as following:
XlsxLoadOptions loadOptions = new XlsxLoadOptions();
ExcelFile excel = ExcelFile.Load("D:\\xxxxx.xlsx", loadOptions);
PdfSaveOptions saveOptions = new PdfSaveOptions();
excel.Save("D:\\xxxxx.pdf", saveOptions);
You can do this using this API. Please see documentation for further detail.
http://cdn.bytescout.com/help/BytescoutPDFExtractorSDK/html/55590148-5bef-4338-ac16-1de4056a952b.htm

Delete rows from Excel

These are the approaches I tried:
A) I tried to delete rows from an Excel sheet using Microsoft.Office.Interop.Excel.
I'm doing this in a script task within an SSIS package.
I added the library to the GAC, since it was raising the error: Could not load Library.
Now it raises this error: Retrieving the COM class factory for component with CLSID {00024500-0000-0000-C000-000000000046} failed due to the following error: 80040154.
Googling this tells me I need MS Office installed for it to work, which I don't want because the server I deploy this solution on is definitely not going to have MS Office installed. I'm no expert, but I would like to know why such operations are not possible simply by adding a reference to a DLL. Why is it mandatory to install MS Office?
B) I also tried the OLEDB Jet provider, but it doesn't allow deleting rows.
The only operations it supports are Insert, Update and Select.
Things I have come across on the web:
A) An answer to an SO question suggests using NPOI, but I can't totally rely on that, because what is a free library today can become paid in the future.
B) I have also come across the EPPlus library. I have used it and understand that it's under a GNU public license, but I'm apprehensive about using it because it may become a paid tool in the future.
C) I have also come across people using the Open XML SDK by Microsoft. Before I get my hands dirty with this, I would love it if someone could tell me up front whether I should be using it. Not that I'm too lazy to try it out myself, but what would help me before I start is knowing whether this SDK needs any external programs installed on the machine, because it requires me to install an msi to be able to use it.
Is there a workaround to do this using Microsoft COM components? I'm not asking a subjective question here. I want to know the technical obstacles, if any, to using the three tools I researched above.
Thanks in advance
The point with Interop is that you must indeed have Office installed. So, bluntly said, you cannot use Interop. If you only need to support xlsx files, you can do it in XML.
See this and this link for more details about unpacking xlsx files, editing and repacking. The only thing you need then is something to unzip the file and your own XML handling code.
If the requirement is to also support xls files you have a bit of a problem. I tried this in the past without any additional installations but did not succeed, so I decided to only support xlsx. I either needed some .msi files or Office installed on the server.
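The unzip step for xlsx files can be sketched with the framework's own ZipArchive and XDocument classes, with no Office installation involved. The class and method names here (XlsxPackageInspector, ListParts, LoadPart) are hypothetical, and a part name such as "xl/worksheets/sheet1.xml" is the usual location of the first sheet rather than a guarantee; on .NET Framework the project also needs a reference to System.IO.Compression.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for peeking inside an .xlsx package, which is a ZIP archive of XML parts.
public static class XlsxPackageInspector
{
    // Returns the names of all parts stored in the package.
    public static List<string> ListParts(Stream xlsxStream)
    {
        using (var archive = new ZipArchive(xlsxStream, ZipArchiveMode.Read, leaveOpen: true))
        {
            return archive.Entries.Select(e => e.FullName).ToList();
        }
    }

    // Loads one XML part by name, or returns null when the part is absent.
    public static XDocument LoadPart(Stream xlsxStream, string partName)
    {
        using (var archive = new ZipArchive(xlsxStream, ZipArchiveMode.Read, leaveOpen: true))
        {
            ZipArchiveEntry entry = archive.GetEntry(partName);
            if (entry == null)
            {
                return null;
            }
            using (Stream partStream = entry.Open())
            {
                return XDocument.Load(partStream);
            }
        }
    }
}
```

Editing and repacking would follow the same pattern with ZipArchiveMode.Update, but keeping row indexes and cell references consistent after a deletion is still up to your own XML handling code.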
You say you are using a script task in SSIS; so why not import the Excel file you want to delete values from (preferably into a database, or keep it cached in a DataTable) and then generate a new xls file with just the data you want to keep?
OR don't use the script task at all and, inside a data flow, use a configured Excel source combined with a script component (which is basically the same thing as a script task, except that it can only be used in a data flow) and do all your work there. If you have a dynamic connection to the Excel file, you can always use variables (parameters if you're on DataTools) to configure such a connection.
Good luck!
If you want to use Microsoft.Office.Interop.Excel then, yes, you do need Excel on the server. Therefore, so long as you only want to deal with xlsx based workbooks / 2007+ then I would suggest that OpenXML is the way to go. It's a bit of a learning curve and you get to realise how much work Excel does for you in the background but is not too bad once you get used to it.
A very quick sample knocked up in LINQPad:
void Main()
{
    string fileName = @"c:\temp\delete-row-openxml.xlsx";
    using (SpreadsheetDocument doc = SpreadsheetDocument.Open(fileName, true))
    {
        // Get the necessary bits of the doc
        WorkbookPart workbookPart = doc.WorkbookPart;
        SharedStringTablePart sstpart = workbookPart.GetPartsOfType<SharedStringTablePart>().First();
        SharedStringTable sst = sstpart.SharedStringTable;
        // Get the first worksheet
        WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
        Worksheet sheet = worksheetPart.Worksheet;
        var rows = sheet.Descendants<Row>();
        // ToList() takes a snapshot so that removing rows does not disturb the enumeration
        foreach (Row row in rows.Where(r => ShouldDeleteRow(r, sst)).ToList())
        {
            row.Remove();
        }
    }
}
private bool ShouldDeleteRow(Row row, SharedStringTable sst)
{
    // Whatever logic to apply to decide whether to remove a row or not
    string txt = GetCellText(row.Elements<Cell>().FirstOrDefault(), sst);
    return (txt == "Row 3");
}
// Basic way to get the text of a cell - need to use the SharedStringTable
private string GetCellText(Cell cell, SharedStringTable sst)
{
    if (cell == null)
        return "";
    if ((cell.DataType != null) && (cell.DataType == CellValues.SharedString))
    {
        // Shared strings are stored once and referenced by index
        int ssid = int.Parse(cell.CellValue.Text);
        string str = sst.ChildElements[ssid].InnerText;
        return str;
    }
    else if (cell.CellValue != null)
    {
        return cell.CellValue.Text;
    }
    return "";
}
Note that this will clear the row not shuffle up all the other rows. To do that you'd need to provide some logic to adjust row indexes of the remaining rows.
To answer a little more of the OP question - the OpenXML msi is all that is needed apart from the standard .Net framework. The sample needs a reference to WindowsBase.dll for the packaging API and using statements for DocumentFormat.OpenXml.Packaging and DocumentFormat.OpenXml.Spreadsheet. The OpenXML API package can be referenced in VS via Nuget too so you don't even need to install the msi if you don't want. But it makes sense to do so IMHO.
One other item that you will find VERY useful is the OpenXML tools msi. This lets you open a Word or Excel doc and see the XML layout inside - most helpful.
This is how I managed to remove rows in excel and move up the data
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;
using (SpreadsheetDocument document = SpreadsheetDocument.Open(pathToFile, true))
{
    WorkbookPart wbPart = document.WorkbookPart;
    var worksheet = wbPart.WorksheetParts.First().Worksheet;
    var sheetData = worksheet.GetFirstChild<SheetData>();
    // Skip headers; ToList() takes a snapshot so removal does not disturb the enumeration
    foreach (var row in sheetData.Elements<Row>().Skip(1).ToList())
    {
        if (/* some condition on which rows to delete*/)
        {
            row.Remove();
        }
    }
    // Fix all row indexes, starting at the first data row so it is renumbered
    // even when the row that used to follow the header was deleted
    var rows = sheetData.Elements<Row>().ToList();
    for (int i = 1; i < rows.Count; i++)
    {
        uint updatedRowIndex = rows[i - 1].RowIndex.Value + 1;
        Row currentRow = rows[i];
        currentRow.RowIndex.Value = updatedRowIndex;
        foreach (Cell cell in currentRow.Elements<Cell>().ToList())
        {
            // Keep the column letters and replace the row number, e.g. "C7" -> "C6"
            string cr = Regex.Replace(cell.CellReference.Value, @"[\d-]", "");
            cell.CellReference.Value = $"{cr}{updatedRowIndex}";
        }
    }
    worksheet.Save();
}

.NET Excel File Parser

The company I'm working for is looking for a way to verify that a given .xls/.xlsx file is valid, which means checking columns, rows and other data. They're having me evaluate GrapeCity Spread and SpreadsheetGear, but I'm wondering if anyone has other suggestions for external tools to check out.
We don't need a way to export .xls files or anything like that, just the ability to import them and verify they are valid based on a set of criteria I create.
Thanks.
If you just need to compare cell values you can use the ADO.NET driver; for anything else you will need Excel or a third-party component. I am using SpreadsheetGear. When I evaluated this component 3 years ago I found an issue with conditional formatting for cells with absolute references, but the issue was quickly resolved. They respond to support requests the same day.
To my mind, the easiest way to handle this is to use an ODBC Excel data provider. I find it more straightforward to work with than the PIAs.
// Connection string for Excel 2007 (.xlsx)
string dbConnStr = @"Driver={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};Dsn=Excel Files;dbq=C:\temp\mySpreadsheet.xlsx";
// Connection string for Excel 98-2003 (.xls)
//string dbConnStr = @"Driver={Microsoft Excel Driver (*.xls)};driverid=790;dbq=C:\temp\mySpreadsheet.xls;defaultdir=c:\temp";
OdbcCommand cmd = new OdbcCommand("Select * from [SheetName$]", new OdbcConnection(dbConnStr));
cmd.Connection.Open();
OdbcDataReader dr = cmd.ExecuteReader();
foreach (System.Data.IDataRecord item in dr)
{
    // Check specific column values, etc
    string id = item["Column Name"].ToString();
}
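Whichever reader loads the sheet, the validation itself can be plain code over a DataTable. The sketch below assumes the data has already been loaded into a DataTable and uses made-up criteria (required columns must exist and must not be empty); SheetValidator and Validate are hypothetical names, not part of any library mentioned here.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical validator: checks a loaded sheet against a list of required columns.
public static class SheetValidator
{
    // Returns human-readable problems; an empty list means the sheet passed.
    public static List<string> Validate(DataTable sheet, IEnumerable<string> requiredColumns)
    {
        var required = requiredColumns.ToList();
        var problems = new List<string>();

        // Structural check first: every required column must be present
        foreach (string name in required)
        {
            if (!sheet.Columns.Contains(name))
            {
                problems.Add("Missing column: " + name);
            }
        }
        if (problems.Count > 0)
        {
            return problems;
        }

        // Content check: required cells must not be null or blank
        for (int i = 0; i < sheet.Rows.Count; i++)
        {
            foreach (string name in required)
            {
                object value = sheet.Rows[i][name];
                if (value == DBNull.Value || string.IsNullOrWhiteSpace(Convert.ToString(value)))
                {
                    problems.Add("Row " + (i + 1) + ": empty value in " + name);
                }
            }
        }
        return problems;
    }
}
```

Returning a list of messages rather than throwing on the first failure lets the caller report every problem in the file at once; real criteria (types, ranges, cross-column rules) would slot into the content check.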
You can use the Microsoft.Office.Interop.Excel library to access any workbook the same way you do in Excel VBA.
Code looks like this:
using Excel = Microsoft.Office.Interop.Excel;
Excel.Application excel = new Excel.Application();
Excel.Workbook workbook = excel.Workbooks.Open("datasheet.xls");
// A Workbook has no indexer; sheets are looked up through its Worksheets collection
Excel.Worksheet worksheet = workbook.Worksheets["Sheet1"] as Excel.Worksheet;
string someData = (worksheet.Range["A2"] as Excel.Range).Value.ToString();
worksheet = null;
workbook.Close();
excel.Quit();
Depending on your budget, the Aspose libraries are great. Not cheap but work very, very well.
You can use OLEDB from Microsoft to access the Excel data like any other database system. You can get the right connection string from connectionstrings.
Maybe the NPOI project can be useful (I have never used it though).
Best
Check out Excel Data Reader GitHub (formerly on CodePlex). I've used this a few times and it works well.
Be warned however that there are bugs reading .xlsx files where cells are skipped. Apply this patch (link is to Codeplex and out of date) I submitted for v2.0.1.0 to fix the problem. (The project maintainers don't seem active and I've had problems contacting them.)
