I want to read a huge Excel workbook and also modify some of its rows.
Reading and saving through structures is a complicated process. Is there a more efficient alternative?
I have no idea what your idea of »huge« looks like, but previously I used NativeExcel to access and create Excel files. It's a lot faster than the normal Excel API.
In my opinion it is best to go with OLEDB or OpenExcel (http://openexcel.codeplex.com/).
You can choose either of them. More details can be found at:
http://www.codeproject.com/Articles/8500/Reading-and-Writing-Excel-using-OLEDB
http://openexcel.codeplex.com/
OpenExcel is a cool open-source library.
This question might be a duplicate, but I couldn't find a solution to my problem so far. I'm new to Interop. I'm using an Excel file as a database.
Here is how the data is presented in the Excel file.
In my data, if a Card ID is repeated, I need to increment the Counter in the same row by 1. Similarly, I need to fetch the IP address from the same row.
I'm using the Excel Interop approach to insert data into the Excel file.
Kindly tell me how I can perform that update operation on the Excel file through C# (WPF).
Sorry for my bad English.
Thanks
I recommend using ClosedXML.
You write to the file directly and don't need Excel. The file needs to be in the newer Excel format (.xlsx, the Open XML standard) for this to work.
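The reason such libraries can work without Excel is that an .xlsx workbook is a ZIP archive of XML parts. As a minimal sketch of that idea (in Python with only the standard library, not ClosedXML itself), the sheet names can be read straight from `xl/workbook.xml`. The function name `list_sheet_names` is made up for this illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the SpreadsheetML main schema used by xl/workbook.xml.
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def list_sheet_names(xlsx_file):
    """Return the sheet names declared in xl/workbook.xml.

    `xlsx_file` may be a path or a binary file-like object.
    (Hypothetical helper, for illustration only.)
    """
    with zipfile.ZipFile(xlsx_file) as archive:
        root = ET.fromstring(archive.read("xl/workbook.xml"))
    # Every <sheet> element under <sheets> carries a "name" attribute.
    return [sheet.get("name") for sheet in root.iter(f"{{{MAIN_NS}}}sheet")]
```

This only works for the ZIP-based .xlsx/.xlsm formats, not for the older binary .xls format.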
Epplus.dll or npoi.dll will also read and write Excel files without Excel installed.
Save the data in an XML or JSON file; then, when you want to visualize it, create the Excel file from that data. That way you have a very light file that is easy to read and update.
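A rough sketch of that approach, in Python with only the standard library: the records live in a JSON file keyed by Card ID, a repeated Card ID bumps its counter, and the data is exported to CSV (which Excel opens) only when needed. The function names, the JSON layout, and the choice to start the counter at 1 are all assumptions made for this illustration.

```python
import csv
import json
from pathlib import Path


def record_card(store_path, card_id, ip_address):
    """Increment the counter for card_id, or add it; return the stored IP."""
    path = Path(store_path)
    records = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    entry = records.get(card_id)
    if entry is None:
        # First time this Card ID is seen: start counting at 1 (assumption).
        entry = records[card_id] = {"counter": 1, "ip": ip_address}
    else:
        # Repeated Card ID: bump the counter and keep the stored IP.
        entry["counter"] += 1
    path.write_text(json.dumps(records, indent=2), encoding="utf-8")
    return entry["ip"]


def export_csv(store_path, csv_path):
    """Write the JSON store as a CSV file that Excel can open."""
    records = json.loads(Path(store_path).read_text(encoding="utf-8"))
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Card ID", "Counter", "IP"])
        for card_id, entry in records.items():
            writer.writerow([card_id, entry["counter"], entry["ip"]])
```

Rewriting the whole JSON file on every update is fine for small data sets; for heavier use a real database would be the safer choice.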
I haven't done this specifically through WPF, but you can access PowerShell cmdlets through .NET, and PowerShell has commands for retrieving and writing Excel data.
That said, my experience has been that it's very tedious, inconsistent, and buggy. I would tell your client that using an Excel file as a database is impractical and prone to failure.
For one thing, you will run into read/write restrictions if the file is in use by anything else.
If you don't mind using commercial libraries, you can try Aspose.Cells. It has a rich cells API and works without the Excel interop API.
I've been asked to strip an Excel file of macros, leaving only the data. I've been asked to do this by converting the Excel file to XML and then reading that file back into Excel using C#. This seems a bit inefficient to me and I was thinking that it would be easier to simply load the source Excel file into C# and then create a new target Excel file and add the sheets from the source back into the target.
I don't know where macros live inside an Excel file, so I'm not sure if this would accomplish the task or not. So, will this work? Will simply copying the sheets from one file to another strip it of its macros, or are they actually stored at the worksheet level?
As always, any and all suggestions are welcome, including alternate suggestions or even "why are you even doing this???". :)
To do this programmatically, you can use the ZipFile class from the System.IO.Compression library in .NET from C#. (.NET Framework 4.5)
Rename the file to add a ".zip" extension, and then open the file as a ZIP archive. Look for an entry in the resulting "xl" folder called "vbaProject.bin", and delete it. Remove the ".zip" extension. Macros gone.
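The same steps can be sketched without renaming anything, since ZIP libraries open the workbook directly. The sketch below uses Python's standard `zipfile` module rather than the .NET `ZipFile` class, and the function name `strip_vba_part` is made up for this illustration. It copies every entry except the VBA part into a new archive. References to that part in `[Content_Types].xml` and the workbook relationships are left untouched, so Excel may offer to repair the result.

```python
import zipfile


def strip_vba_part(src, dst):
    """Copy the archive at src to dst, skipping any vbaProject.bin entry.

    src and dst may be paths or binary file-like objects.
    Returns the names of the entries that were dropped.
    (Hypothetical helper, for illustration only.)
    """
    dropped = []
    with zipfile.ZipFile(src) as source, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as target:
        for info in source.infolist():
            if info.filename.lower().endswith("vbaproject.bin"):
                dropped.append(info.filename)
                continue
            # Reuse the original entry metadata for every part that is kept.
            target.writestr(info, source.read(info.filename))
    return dropped
```

Writing to a new file instead of editing in place keeps the original workbook intact if anything goes wrong.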
Your best bet is to save the workbook as an xlsx, close it, open it, then save as a format of your choice.
This will strip the macros and is robust. It will also work if the VBA is locked for viewing.
Closing and reopening the workbook is necessary; otherwise the macros are retained.
If you need to use C# to do this, I agree that it would be easier to load the source Excel file into C# and create a new target file, copying over only the cells and sheets you need. Especially if you're doing this for a large number of Excel files, I would recommend creating a small console app that, when given an Excel file, automatically generates a new Excel file with just the data.
One tool that I've found extremely useful and easy to use for such tasks is EPPlus.
I have an XML file that I am trying to load into an existing workbook in Excel. I realize that I can simply open the file and it will load into Excel easily. I am trying to get it to load into a specific sheet within my already open workbook. What would be the best practice for this? I have the path of the XML file in a string, but I am lost on where to go from there.
I would probably look into the Excel COM automation API. This allows you to take data that you have in memory, or in an XML file in your case, and programmatically place it into whatever cells you want in the workbook. It's a lot more work, but it gives you a lot more control.
From C#, you would want to look at the Microsoft.Office.Interop.Excel dlls, if you choose to go this route. Hopefully someone else will come along and give you an easier answer, but that's the best that I can think of right now.
I've tried the OleDb driver, LinqToExcel, and Excel Data Reader to read .xls files, but all of them seem to have very annoying limitations. LinqToExcel and the OleDb driver both throw "Too Many Fields Defined" error messages if the excel files have phantom columns. The Excel Data Reader threw undefined exceptions, which I was never able to get to the bottom of.
Is there any excel driver that "just works", and can handle slightly mis-formatted excel files?
A commercial software package would be fine. My current requirements only specify reading dates and text from cells, though more sophisticated functionality would be a plus.
[Edit]
Needs to support both XLS and XLSX file formats.
I can recommend Aspose.Cells and Flexcel. I haven't tried SpreadsheetGear, but I have heard and read lots of good things about it.
A free option (though for the newer xlsx format only!) is the Open XML SDK 2.0 from Microsoft.
Try the open-source EPPlus library for Excel.
I have not tried it myself yet, but the project seems interesting. It is designed mainly for writing, but it works for reading too. Unfortunately, it only accepts the new xlsx format.
I've used SpreadsheetGear previously. You have to pay, but it worked very well for my needs and handled the different file formats well.
I tested a few other libraries, but SpreadsheetGear worked best in terms of maintaining the fidelity of the file when I saved copies of it. Then again, my files had lots of data validations, controls, and so on in place. For simpler files there's a range of other options.
If all else fails, you could write an Excel macro and, if needed, call the macro from within C#. I am not sure if you need that, as you do not give any reasons why you are doing this in C#.
Another way into the data is by using an interop assembly, which has become much easier since the arrival of the dynamic keyword in C# 4.
I need to create a script that extracts some data from a complex Excel 2003 file (with multiple sheets and different tables inside a single sheet) and produces different XML files that need to be validated against a given XSD file.
My preferred language is Python;
to create and validate XML files I would go with lxml.
What do you suggest for parsing XLS files?
Is xlrd the right tool to use for complex Excel files?
Or do I need to convert all the sheets to CSV manually and read the files line by line, splitting and extracting the data?
I accept C#, VB6, VBA suggestions too.
[disclaimer: I'm the author of xlrd]
xlrd is quite suited for this kind of job. Get the latest version from PyPI. Get the flavour from the tutorial found here. XLSX support is in alpha test; e-mail me if you need it. The awkwardness and lossiness of the save-as-CSV approach was one of the things that prompted me to write xlrd.
Xlrd is OK. We use it extensively to import XLS files full of references and formulas with multiple sheets and data presented in custom (not Latin-1) encoding.
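Once a reader such as xlrd has returned the rows of one table, turning them into XML needs only the standard library. The following is a minimal sketch under stated assumptions: the rows are already in memory as lists, the first row holds the column names, and the element names `table`, `row`, and `field` are placeholders rather than anything the target XSD prescribes. Reading the workbook and validating with lxml are not shown.

```python
import xml.etree.ElementTree as ET


def rows_to_xml(rows, root_tag="table", row_tag="row"):
    """Build an XML string from rows whose first row holds the column names.

    (Hypothetical helper; the tag names are placeholders.)
    """
    header, *body = rows
    root = ET.Element(root_tag)
    for values in body:
        row_el = ET.SubElement(root, row_tag)
        for name, value in zip(header, values):
            # Column names go into an attribute, so they need not be valid tag names.
            field = ET.SubElement(row_el, "field", name=str(name))
            field.text = str(value)
    return ET.tostring(root, encoding="unicode")
```

For a sheet that holds several tables, the same function could be called once per table after splitting the rows, for example at blank rows.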
I am convinced the simplest solution for this task is using Excel VBA together with the MSXML parser. Look here for some links on how to use the MSXML parser in VBA for reading XML files; I think you can easily adapt this for writing XML files.
I can't say whether xlrd/Python is the right tool for the job, as I don't know Python well enough.
But there are many ways to access the Excel data. The main one is VBA, which is built directly into Excel.
Then you have ADO.NET (see David Hayden's article here), which allows you to access the data from any .NET language, even IronPython.