I have an XLS file sitting in a byte[] as the result of a file upload in my ASP.NET web application. Is there a library that can read and process the XLS file directly from the byte[]? I do not want to save the file to disk.
All I need to do is read the cell contents (I would prefer to accept a CSV file if I had the choice).
I discovered SpreadsheetGear which claims to do this, but I would rather not pay $1000 for software that does way more than I need it to.
Note that I am referring to XLS files, not XLSX files, but I would appreciate advice on both.
You could check out ExcelLibrary. If you are dealing with Open XML (.xlsx) files, check out the Open XML SDK.
EPPlus is also a solid library for working with Excel files. It has some samples that show how to interact with a file from a MemoryStream. Note that EPPlus reads only the Open XML (.xlsx) format, not .xls.
http://epplus.codeplex.com/
NPOI is a really good library, and it picks up where EPPlus leaves off: it can also read the older .xls format. http://npoi.codeplex.com/
Your reference to XLS suggests the older Excel 97 format, in which case you can use the ExcelWorkbook / ExcelWorksheet reader code provided as part of the Tarantino project in the Tarantino Bitbucket repository.
You can pass your XLS in memory as a stream and the helper methods will return a DataSet with workbook data and Tables representing Sheets. You do not need the entire Tarantino project code and can simply grab:
ExcelWorkbookReader.cs
ExcelWorksheetReader.cs
IExcelWorkbookReader.cs
IExcelWorksheetReader.cs
and add these files to your solution.
Using the interface is simple:
[HttpPost]
public ActionResult Uploadfile(HttpPostedFileBase file)
{
    var reader = new ExcelWorkbookReader();
    var data = reader.GetWorkbookData(file.InputStream);
    // Do something with the data here
    return RedirectToAction("List");
}
You can read the contents of an .xls file without an Excel library by using ADO.NET and the OLEDB driver. The worksheet must be laid out as a "table", though (a header row followed by data rows). If it is, this works fine.
The connection string should be something like this:
Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\MyExcel.xls;Extended Properties="Excel 8.0;HDR=Yes;IMEX=1";
Regards.
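A minimal sketch of how that connection string might be used, assuming the workbook already exists on disk (so it does not meet the "no disk" requirement of the question) and that the sheet is called "Sheet1". The class and method names are hypothetical. The Jet 4.0 provider is only available to 32-bit processes.

```csharp
using System.Data;
using System.Data.OleDb;

public static class OleDbExcel
{
    // Loads one worksheet into a DataTable through the Jet OLEDB provider.
    // "Sheet1" is an assumed sheet name; the provider addresses it as [Sheet1$].
    public static DataTable ReadSheet(string xlsPath, string sheetName = "Sheet1")
    {
        string connectionString =
            "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + xlsPath +
            ";Extended Properties=\"Excel 8.0;HDR=Yes;IMEX=1\";";

        var table = new DataTable();
        using (var connection = new OleDbConnection(connectionString))
        using (var adapter = new OleDbDataAdapter(
            "SELECT * FROM [" + sheetName + "$]", connection))
        {
            // Fill opens and closes the connection on its own.
            adapter.Fill(table);
        }
        return table;
    }
}
```

With HDR=Yes the first row supplies the column names, so a cell can be read as table.Rows[0]["ColumnName"].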
I am developing an application that reads data from an Excel file. When the source file is saved in the xls format, opening it throws an exception ("File contains corrupted data" when opening the Excel sheet with OpenXML). When I save the same file in the xlsx format, it works fine. Please help me solve this problem.
Use the Free Spire.XLS DLL, available via NuGet.
Sample:
Workbook workbook = new Workbook();
workbook.LoadFromFile("Input.xls");
workbook.SaveToFile("Output.xlsx", ExcelVersion.Version2013);
For reliably reading XLS files you could use ExcelDataReader, a lightweight and fast library written in C# for reading Microsoft Excel files. It supports the import of Excel files all the way back to version 2.0 of Excel (released in 1987!).
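As a rough sketch of what reading with ExcelDataReader can look like, assuming the ExcelDataReader NuGet package is installed and the workbook is already held in memory as a byte[]. The class and method names are hypothetical.

```csharp
using System;
using System.IO;
using ExcelDataReader; // NuGet package "ExcelDataReader" (assumed to be installed)

public static class XlsDump
{
    // Prints every cell of every sheet from an in-memory workbook.
    public static void PrintCells(byte[] workbookBytes)
    {
        // On .NET Core / .NET 5+ the legacy code pages must be registered first:
        // System.Text.Encoding.RegisterProvider(
        //     System.Text.CodePagesEncodingProvider.Instance);
        using (var stream = new MemoryStream(workbookBytes))
        using (IExcelDataReader reader = ExcelReaderFactory.CreateReader(stream))
        {
            do
            {
                Console.WriteLine("Sheet: " + reader.Name);
                while (reader.Read()) // advances one row at a time
                {
                    for (int col = 0; col < reader.FieldCount; col++)
                    {
                        Console.Write(reader.GetValue(col) + "\t");
                    }
                    Console.WriteLine();
                }
            } while (reader.NextResult()); // moves to the next sheet
        }
    }
}
```

Because the reader works on a stream, nothing has to be written to disk or converted to xlsx first.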
Alternatively you could use a file conversion API like Zamzar. This service has been around for 10+ years, and provides a simple REST API for file conversion - it supports XLS to XLSX conversion. You can use it in C# and it has extra features like allowing you to import and export files to and from Amazon S3, FTP servers etc.
Full disclosure: I'm the lead developer for the Zamzar API.
You cannot read xls files with OpenXML.
The solution from Microsoft is to read the xls file with Office Interop and transfer the data step by step to OpenXML. Note, however, that Interop is not recommended for use on a server.
Another solution is to use an Excel library like EasyXLS and convert between these two Excel file formats:
ExcelDocument workbook = new ExcelDocument();
workbook.easy_LoadXLSFile("Excel.xls");
workbook.easy_WriteXLSXFile("Excel.xlsx");
Find more information about converting xls to xlsx.
That said, I am not quite sure why you need to convert the file rather than just read the xls file with a technology other than OpenXML.
XLS is the older Excel file format. XLSX is the newer format, stored as OpenXML. An XLSX file is actually a zip file with the various components stored as files within it. You cannot simply rename the file to get it into the new format. To get an XLSX file you have to save the workbook in the Excel 2007+ format.
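To make the difference concrete, here is a small sketch that tells the two containers apart by their leading bytes and lists the parts inside an XLSX package. It uses only the .NET base class library; the class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;

public static class WorkbookSniffer
{
    // ZIP local file header signature: "PK" 0x03 0x04 (used by .xlsx).
    private static readonly byte[] ZipMagic = { 0x50, 0x4B, 0x03, 0x04 };

    // OLE2 compound file signature (used by the binary .xls format).
    private static readonly byte[] Ole2Magic =
        { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };

    public static bool LooksLikeXlsx(byte[] bytes)
    {
        return StartsWith(bytes, ZipMagic);
    }

    public static bool LooksLikeXls(byte[] bytes)
    {
        return StartsWith(bytes, Ole2Magic);
    }

    // Returns the names of the files stored inside a zip-based package.
    public static List<string> ListParts(byte[] packageBytes)
    {
        using (var stream = new MemoryStream(packageBytes))
        using (var archive = new ZipArchive(stream, ZipArchiveMode.Read))
        {
            return archive.Entries.Select(e => e.FullName).ToList();
        }
    }

    private static bool StartsWith(byte[] bytes, byte[] prefix)
    {
        return bytes.Length >= prefix.Length
            && bytes.Take(prefix.Length).SequenceEqual(prefix);
    }
}
```

A renamed .xls file still starts with the OLE2 signature, which is why an OpenXML reader rejects it as corrupted.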
If you're using Excel interop, the format is an option on the SaveAs method.
For more information, see the _Workbook.SaveAs method
and its FileFormat parameter:
Optional Object.
The file format to use when you save the file. For a list of valid choices,
see the FileFormat property. For an existing file, the default format is the
last file format specified; for a new file, the default is the format of the
version of Excel being used.
msdn info here:
https://msdn.microsoft.com/en-us/library/microsoft.office.interop.excel._workbook.saveas(v=office.11).aspx
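A minimal sketch of that SaveAs call, assuming Excel is installed on the machine and the project references Microsoft.Office.Interop.Excel. The class and method names are hypothetical, and both paths are assumed to be absolute.

```csharp
using Excel = Microsoft.Office.Interop.Excel;

public static class XlsToXlsx
{
    // Opens an .xls workbook in Excel and saves it in the Open XML format.
    public static void Convert(string xlsPath, string xlsxPath)
    {
        var app = new Excel.Application();
        app.DisplayAlerts = false; // suppress overwrite/compatibility prompts
        Excel.Workbook workbook = null;
        try
        {
            workbook = app.Workbooks.Open(xlsPath);
            // xlOpenXMLWorkbook selects the .xlsx format.
            workbook.SaveAs(xlsxPath, Excel.XlFileFormat.xlOpenXMLWorkbook);
        }
        finally
        {
            if (workbook != null) workbook.Close(false);
            app.Quit(); // otherwise the Excel process keeps running
        }
    }
}
```

As noted elsewhere in this thread, this approach is not recommended on a server.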
I need some guidance here.
I am reading an Excel file using a StreamReader, then getting its contents into a string with StreamReader.ReadToEnd(). Then I write the string to a different location on the file system using StreamWriter.Write().
Then I re-read the file from the location I wrote it to. However, I seem to be reading garbage values, and I can't open the Excel file from the new location...
Am I doing something here that corrupts the file? Am I missing something to do with encoding?
Excel files are binary. StreamReader is a kind of TextReader, and StreamWriter is a kind of TextWriter.
Binary and Text - not the same thing.
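A short sketch of the difference, using only the .NET base class library. The class and method names are hypothetical. The first method copies the workbook byte for byte; the second shows that pushing bytes through a text encoding (UTF-8 is assumed here) does not necessarily give the same bytes back.

```csharp
using System.IO;
using System.Linq;
using System.Text;

public static class BinaryCopy
{
    // Copies a file as raw bytes, with no text decoding involved.
    public static void CopyWorkbook(string sourcePath, string destinationPath)
    {
        byte[] bytes = File.ReadAllBytes(sourcePath);
        File.WriteAllBytes(destinationPath, bytes);
    }

    // Returns true only if decoding the bytes as UTF-8 text and encoding
    // them again reproduces the original bytes exactly.
    public static bool SurvivesTextRoundTrip(byte[] original)
    {
        string asText = Encoding.UTF8.GetString(original);
        byte[] roundTripped = Encoding.UTF8.GetBytes(asText);
        return original.SequenceEqual(roundTripped);
    }
}
```

Byte sequences that are not valid UTF-8 are replaced with U+FFFD during decoding, so the first bytes of a typical .xls file already fail the round trip.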
Depending on what Excel format you are using, you will find it very painful to read/write directly. Libraries such as NPOI make this much easier.
The version that reads/writes xlsx files is still in beta, but stable in my use. If you need xlsx format files, download from their site, instead of via NuGet. NPOI on GitHub
I would appreciate any pointers to documentation or API calls I can use.
Basically, I'm hoping there's some way to invoke Excel to make the conversion, although I haven't yet found any solutions that work for Excel 2010.
I am using the .NET framework.
Excel handles CSV files well and is the default editor for them on systems where the Excel install hasn't been customised. I use csv files in almost all cases where I need an Excel file and I work with some very non tech-savvy users!
Converting a TSV to a CSV is trivial in comparison to converting to xlsx. One of the best libraries I have used for working with flat files is Generic Parser, which can read and write files delimited by any character (amongst many other things).
I have used the LINQ to CSV library in several projects to load and manipulate CSV, TSV, and similar files.
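To illustrate how small the TSV-to-CSV step can be, here is a sketch without any library. It assumes the TSV has no quoted fields and no embedded tabs or line breaks; the class and method names are hypothetical.

```csharp
using System.Linq;

public static class TsvToCsv
{
    // Converts one tab-separated line into one comma-separated line.
    public static string ConvertLine(string tsvLine)
    {
        var fields = tsvLine.Split('\t').Select(QuoteIfNeeded);
        return string.Join(",", fields);
    }

    // Wraps a field in double quotes when it contains a comma, a quote or
    // a line break, doubling any embedded quotes (the usual CSV convention).
    private static string QuoteIfNeeded(string field)
    {
        bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return needsQuotes
            ? "\"" + field.Replace("\"", "\"\"") + "\""
            : field;
    }
}
```

Applied line by line (for example over File.ReadLines), this produces a file that Excel opens directly.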
LINQ to CSV library
As for creating Office documents, if you want easy conversion you will usually have to pay for it. It is mostly needed in commercial applications, so library writers know that there is a market for it.
That said there are some free libraries out there and I have heard good things about this one for editing Excel files:
EPPlus
I have done a lot of stuff like this using COM interop for Office. My recommendation is to check out the following link:
http://msdn.microsoft.com/en-us/library/dd264733.aspx
It should get you up and running with it. Let me know if you have any specific questions.
You can try the open source library EPPlus to generate excel files. It's easier to deploy than the full Excel application.
You can try the GroupDocs.Conversion REST API for TSV to Excel or Excel to TSV conversion. You can use it via any REST client or the GroupDocs.Conversion Cloud SDK for .NET. Please note that it is a paid API, but its free plan includes 150 API calls per month.
P.S. I am a developer evangelist at GroupDocs.
// Get Client Id and Client Key from https://dashboard.groupdocs.cloud/
var configuration = new GroupDocs.Conversion.Cloud.Sdk.Client.Configuration(ClientId, ClientKey);
var fileApi = new GroupDocs.Conversion.Cloud.Sdk.Api.FileApi(configuration);
var convertApi = new ConvertApi(configuration);
// Convert TSV to XLSX
var format = "xlsx";
var testFile = "C:/Temp/sample.tsv";
var request = new ConvertDocumentDirectRequest(format, File.OpenRead(testFile));
var result = convertApi.ConvertDocumentDirect(request);
// Save output to local drive
// Dispose the stream so the output is flushed and the file handle released
using (var fileStream = System.IO.File.Create("C:/Temp/sample.xlsx"))
{
    result.CopyTo(fileStream);
}
I'm generating a CSV file with the following code:
public ActionResult Index()
{
    var csv = "मानक हिन्दी;some other value";
    var data = Encoding.UTF8.GetBytes(csv);
    data = Encoding.UTF8.GetPreamble().Concat(data).ToArray();
    var cd = new ContentDisposition
    {
        Inline = false,
        FileName = "newExcelSheet.csv"
    };
    Response.AddHeader("Content-Disposition", cd.ToString());
    return File(data, "text/csv");
}
Now I wish to insert an image in the top row of the Excel sheet. Please help me with this problem.
Thanks :)
CSV is not a format capable of including binary data such as images. The only thing you can include in a CSV file is text.
If you need to add an image to an excel document you would have to use a proper excel file (i.e. a .xls or .xlsx file). There are various APIs that you can use to write to such files, including the Excel Object Model exposed through COM when you have Office installed.
See this question for details on how to insert images through COM.
You can't do it with the framework alone. You either use the interop assembly or download EPPlus, a free Excel .NET library that supports what you need.
Code examples on the website:
http://epplus.codeplex.com/
CSV doesn't support what you ask for AND Interop is officially NOT supported by MS in server-scenarios (like ASP.NET...).
You will need to create "real" Excel files (XLS or XLSX) - some options to create Excel files:
MS provides the free OpenXML SDK V 2.0 - see http://msdn.microsoft.com/en-us/library/bb448854%28office.14%29.aspx
This can read+write MS Office files (including Excel XLSX but not XLS!).
Another option see http://www.codeproject.com/KB/office/OpenXML.aspx
If you need more, such as rendering or formulas, there are various free and commercial libraries, such as ClosedXML, EPPlus, Aspose.Cells, SpreadsheetGear, LibXL and Flexcel.
How can I open an Excel file and read all of its data in order to perform some operations on it, say, write it to a database?
You can automate Excel using COM automation (http://support.microsoft.com/kb/302096), but if you are on a web server you will need to use a third-party library like http://sourceforge.net/projects/koogra/
You can use the Excel interop assembly (it requires Excel to be installed on the machine) to work with the Excel.Application object, and through it the Workbook and Worksheet objects, so you can open Excel files and read or manipulate them.
You can add it to your project with the Add Reference option. The library is called
Microsoft.Office.Interop.Excel
Hope this helps
Assuming that the Excel files are in a table format, I'd suggest that the best way would be using OleDB. This page has a very basic sample that should show you how to get started.
If you're unable to use OleDB for some reason, then you could use Excel Automation. This is not recommended if it's on a server though and will in general be slower and less stable than OleDB, but you will be able to do pretty much anything you need.
There are several ways:
If *.xlsx (the new XML-based format) is used, you can open the file and read the XML inside it
You can read it with Excel COM Interop (not recommended on a server!)
You can use an ODBC data source
Starting with Office 2007, you can use OpenXML to query/manipulate office documents. The .xlsx files (all Office .???x files) are zipped up XML files.
ExcelToEnumerable is a great solution if you want to map Excel data to a list of classes, e.g:
var filePath = "/Path/To/ExcelFile.xlsx";
IEnumerable<MyClass> myClasses = filePath.ExcelToEnumerable<MyClass>();
Disclaimer: I am the author of ExcelToEnumerable.