Programmatically finding an Excel file's Excel version - c#

I'm using an OleDbConnection to connect to a spreadsheet from a C# program. One of the parameters in the connection string is the Excel version.
Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Book1.xls;Extended Properties="Excel 8.0;HDR=YES";
Given the path of an Excel file how can I find out which Excel format version it uses?
Thanks in advance,
T.

In addition to what was said, and apart from using Excel automation to open the file, you can try reading the file version from your code:
xls files: these are saved as structured storage (OLE compound files). You can use the technique from the article "How To Determine Which Version of Excel Wrote a Workbook".
xlsx files: you can open them as zip archives. The version is in the AppVersion element of docProps/app.xml.
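For the xlsx case, a minimal sketch using only the .NET base class library could look like the following. The class and method names (XlsxVersionReader, ReadAppVersion) are made up for illustration; the only things taken from the answer are the zip container, the docProps/app.xml part, and the AppVersion element. Both the part and the element are optional, so the sketch returns null when either is missing.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of any library.
public static class XlsxVersionReader
{
    // Returns the text of the AppVersion element in docProps/app.xml,
    // or null when the part or the element is absent.
    public static string ReadAppVersion(Stream xlsxStream)
    {
        using (var zip = new ZipArchive(xlsxStream, ZipArchiveMode.Read, leaveOpen: true))
        {
            ZipArchiveEntry entry = zip.GetEntry("docProps/app.xml");
            if (entry == null) return null;

            using (Stream part = entry.Open())
            {
                XDocument doc = XDocument.Load(part);
                // Match on the local name so the sketch does not hard-code
                // the extended-properties namespace URI.
                XElement version = doc.Descendants()
                    .FirstOrDefault(e => e.Name.LocalName == "AppVersion");
                return version?.Value;
            }
        }
    }

    // Convenience overload for a file on disk.
    public static string ReadAppVersion(string path)
    {
        using (FileStream file = File.OpenRead(path))
        {
            return ReadAppVersion(file);
        }
    }
}
```

The value is a string such as "16.0300"; the major part identifies the writing application's release (12 for Excel 2007, 14 for 2010, 15 for 2013, 16 for 2016 and later). Files produced by other tools may carry a different value or none at all.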

Download the OLE File Property Reader. Register dsofile.dll with regsvr32 and add it as a reference in your application. The following code outputs the type of the Excel file. It will not work on xlsx/docx, since those are not OLE compound object files, but it should work on all older Office formats (2007/2003/XP).
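// Requires a COM reference to the registered dsofile.dll.
var doc = new OleDocumentPropertiesClass();
// Open the workbook (second argument: read-only flag) with default options.
doc.Open(@"c:\spreadsheet.xls", false, dsoFileOpenOptions.dsoOptionDefault);
Console.WriteLine(doc.OleDocumentType);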
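Since the two answers above treat OLE compound files and zip-based files differently, it can help to check which container a file uses before choosing a connection string. The sketch below is one possible way to do that by inspecting the leading bytes. The type and method names are hypothetical, and the xlsx connection string assumes the Microsoft.ACE.OLEDB.12.0 provider is installed, which the question does not mention.

```csharp
using System;
using System.IO;

public enum ExcelContainer { Unknown, OleCompound, ZipPackage }

// Hypothetical helper, not part of any library.
public static class ExcelSniffer
{
    // First 8 bytes of every OLE2 compound file (.xls, .doc, ...).
    private static readonly byte[] OleSignature =
        { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };

    public static ExcelContainer Detect(byte[] header)
    {
        if (header.Length >= OleSignature.Length)
        {
            bool isOle = true;
            for (int i = 0; i < OleSignature.Length; i++)
            {
                if (header[i] != OleSignature[i]) { isOle = false; break; }
            }
            if (isOle) return ExcelContainer.OleCompound;
        }

        // ZIP local file header: 'P' 'K' 0x03 0x04 (xlsx, docx, ...).
        if (header.Length >= 4 && header[0] == 0x50 && header[1] == 0x4B
            && header[2] == 0x03 && header[3] == 0x04)
        {
            return ExcelContainer.ZipPackage;
        }
        return ExcelContainer.Unknown;
    }

    public static ExcelContainer Detect(string path)
    {
        var buffer = new byte[8];
        using (FileStream fs = File.OpenRead(path))
        {
            int read = fs.Read(buffer, 0, buffer.Length);
            Array.Resize(ref buffer, read);
        }
        return Detect(buffer);
    }

    // Picks a connection string based on the container, not the extension.
    public static string BuildConnectionString(string path)
    {
        switch (Detect(path))
        {
            case ExcelContainer.OleCompound:
                return "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + path
                     + ";Extended Properties=\"Excel 8.0;HDR=YES\";";
            case ExcelContainer.ZipPackage:
                // Assumption: the ACE provider is available on the machine.
                return "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + path
                     + ";Extended Properties=\"Excel 12.0 Xml;HDR=YES\";";
            default:
                throw new NotSupportedException("Unrecognized file format: " + path);
        }
    }
}
```

This only identifies the container. A zip package may be a docx rather than an xlsx, and a password-protected xlsx is stored inside an OLE compound file, so the result is a hint rather than proof that the file is a workbook of a given version.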

Related

Save ole file as office document

I have some OLE files in OLE2 format in a legacy system.
I think they are Office Word or Excel documents with embedded objects (e.g. pictures).
If I rename a file with a docx or xlsx extension, Office says the file is corrupted.
Can I extract the OLE file with an existing C# library and save it as a Word or Excel document?
[Image: OLE file structure]
NOTE:
I think the OlePres\d\d\d streams are the embedded OLE objects.
The Ole stream says it is an embedded file, not a link.
The CompObj stream indicates the file type, e.g. Microsoft Word Document.
For package-type OLE files, I followed the blog post below and extracted the file from the Ole10Native stream successfully: https://eigenein.wordpress.com/2011/08/03/how-to-extract-ole-attachment-body-from-ole10native-stream/
Update (possible solution):
For old-style files (e.g. xls, doc), simply renaming the OLE file to that extension works.
However, some of the files cannot be opened in MS Office even though they open successfully in LibreOffice. For those, I use the LibreOffice command-line tool to convert them to the same format, e.g. soffice --convert-to docx *.docx --outdir ../Converted
After that, they can be opened in MS Office.
For new-style files (e.g. xlsx, docx), extract the Package stream and save it as an xlsx or docx file.
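The LibreOffice conversion step can be driven from C# by starting soffice as a child process. The following is only a sketch: the helper names are hypothetical, soffice is assumed to be on the PATH, and the --headless flag is an addition that is not in the command quoted above (it keeps LibreOffice from opening a window).

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not part of any library.
public static class LibreOfficeConverter
{
    // Describes the process for:
    //   soffice --headless --convert-to <format> --outdir <outDir> <input>
    public static ProcessStartInfo BuildStartInfo(string inputPath, string format, string outDir)
    {
        return new ProcessStartInfo
        {
            FileName = "soffice", // assumption: resolvable via PATH
            Arguments = "--headless --convert-to " + format
                      + " --outdir \"" + outDir + "\" \"" + inputPath + "\"",
            UseShellExecute = false,
            CreateNoWindow = true
        };
    }

    // Runs the conversion and returns the process exit code.
    public static int Convert(string inputPath, string format, string outDir)
    {
        using (Process process = Process.Start(BuildStartInfo(inputPath, format, outDir)))
        {
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

A call such as LibreOfficeConverter.Convert("legacy.doc", "docx", "../Converted") would correspond to the command line in the update. Checking that the output file exists afterwards is a safer success test than relying on the exit code alone.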

How to read excel application version from xls (Excel97) files?

When using NPOI WorkbookFactory with a "modern" Excel file (*.xlsx) it produces an XSSFWorkbook, which contains the Excel-version:
xssfWorkbook.GetProperties().ExtendedProperties.AppVersion
This returns the Excel version number, e.g. "16.0300".
Is there a way to get this information for a HSSFWorkbook?
BTW: The name of the application is available in both classes:
hssfWorkbook.SummaryInformation.ApplicationName
xssfWorkbook.GetProperties().ExtendedProperties.Application
For MS Excel, this is "Microsoft Excel" in both cases.
But I couldn't find any kind of version information for HSSF.
Use case: I get a lot of files from different sources, and for support questions it would be very helpful to know which Excel version a particular source uses. Since a .xlsx can also be saved as .xls, the BIFF version alone would not be of much help.

FlexCel created corrupted xlsx file

I use the FlexCel library to create an Excel report in the .xlsx format.
When I create the file in .xls format, everything works fine.
When I try to create the file in .xlsx format, the file is created, but when I open it in Excel I get an error saying that the file is corrupt and cannot be opened. The .xlsx file is also about half the size of the .xls file.
If someone has encountered a similar problem or knows a solution, I would be very grateful for an answer.
Edit:
My code
var templateFilePath = "D:/template.xlsx";
var newReportPath = "D:/report.xlsx";
using (var fr = new FlexCelReport(true))
{
    fr.AddTable("SOReport", dataTable);
    fr.Run(
        templateFilePath,
        newReportPath
    );
}
I had the same problem exporting to xlsx with old versions of FlexCel.
I have now tested it with FlexCel v5.5.1.0 and found some interesting behaviour:
If I use Excel 2016 or later to create the xlsx template, I get a "file is corrupted" error when opening the exported xlsx file.
But if I use Excel 2013 or earlier to create the xlsx template, the exported file opens without errors.
Also note that if you open an xlsx template created by Excel 2013 or earlier in Excel 2016 or later and save it, you cannot restore it to a working state afterwards. The template will be lost.
It is late, but I hope this helps.
P.S. The FlexCel team (TMS Software) probably fixed this starting with FlexCel v6.7.16.0. I think so because the version history says something similar: https://www.tmssoftware.com/site/flexcelnet.asp?s=history
For example: "Improved: Improved compatibility with invalid xls and xlsx files".
But I cannot confirm that.

How to programmatically restore the XLS file that was modified by a thirdparty tool

We are in the process of migrating documents from AppXtender (an EMC Documentum tool) to another system.
I took an XLS file from the AppXtender physical store (the tool had renamed the .XLS file to .BIN). I knew it was an Excel file, so I tried renaming it to .XLS, but it does not open in Excel.
I learnt that AppXtender modifies the file by adding content such as "FFL 1.0" followed by the original file name (.XLS).
When I open the file in Notepad, I can see this in the first line, and there are a couple more lines with text such as "Embedded" and some numbers.
If I manually remove those lines and save the file, it still does not open in Excel.
What are these custom texts? How do I programmatically (C#) remove them from the file and restore it as a proper Excel file?
Thanks!
Karthik
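One thing that might be worth trying is to work on the raw bytes instead of editing in Notepad, because saving a binary file from a text editor usually damages it. The sketch below searches for the 8-byte OLE2 compound file signature that every .xls file starts with and copies everything from that point on. It rests on assumptions that the question does not confirm: that the tool only prepends a header and that the original workbook follows it unmodified and uncompressed. The class and method names are hypothetical.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of any library.
public static class OlePayloadExtractor
{
    // Signature at the start of every OLE2 compound file (.xls, .doc, ...).
    private static readonly byte[] OleSignature =
        { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };

    // Returns the index of the first OLE2 signature in data, or -1 if absent.
    public static int FindOleSignature(byte[] data)
    {
        for (int i = 0; i <= data.Length - OleSignature.Length; i++)
        {
            int j = 0;
            while (j < OleSignature.Length && data[i + j] == OleSignature[j]) j++;
            if (j == OleSignature.Length) return i;
        }
        return -1;
    }

    // Returns the bytes from the signature to the end of the input,
    // or null when no signature is found.
    public static byte[] ExtractPayload(byte[] data)
    {
        int start = FindOleSignature(data);
        if (start < 0) return null;

        var payload = new byte[data.Length - start];
        Array.Copy(data, start, payload, 0, payload.Length);
        return payload;
    }

    // Reads the wrapped file and writes the payload; returns false when
    // no OLE2 signature was found.
    public static bool Restore(string wrappedPath, string outputPath)
    {
        byte[] payload = ExtractPayload(File.ReadAllBytes(wrappedPath));
        if (payload == null) return false;
        File.WriteAllBytes(outputPath, payload);
        return true;
    }
}
```

If the wrapper also appends data after the workbook, or stores it in a transformed form, this will not be enough, and the file layout would have to be confirmed with the vendor's documentation.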

Converting any file to .xlsx leads error in opening it

I am saving arbitrary files with the .xlsx extension. If the original file did not have an .xlsx extension, Excel reports an error when I try to open the result. The message is:
Excel cannot open the file 'abc.xlsx' because the file format or file
extension is not valid. Verify that the file has not been corrupted
and that the file extension matches the format of the file.
Whereas if I convert to the .xls format, I can open the converted file after a warning message:
The file you are trying to open, abc.xls, is in a different format
than specified by the file extension. Verify that the file is not
corrupted and is from a trusted source before opening the file. Do you
want to open the file now?
I need to convert a file to the .xlsx format from C# code, regardless of its extension, and open it in Excel 2010.
That's not how file types work!
You cannot simply rename a file and have it converted to a different type.
Renaming *.xls to *.xlsx can appear to work because both are Excel formats that MS Excel can open, but for all other types (except a few that Excel can also handle, e.g. *.csv) you need to read the file and convert it "manually".
To write *.xlsx from C# you can use, for example, EPPlus (NuGet).
You need a library that converts the files for you.
I see that you need to open files from Office 2003, so you need to use something like NPOI.
Unfortunately, although EPPlus is a great library for Office files, it only supports the Open XML spreadsheet format (.xlsx), not .xls.
NPOI is a free open-source library for working with Office 2003 through 2010/2013 files.
Here is the link
