I'm making a library that uses OpenXML in C# to read excel files. I can read a cell text and numbers just fine, but when it comes to dates there's a problem.
There is a "date" type for cells, but apparently Excel 2007 doesn't save dates with that type; it appears to use styles instead, so I can't tell whether the value I'm reading is a date or not.
How could I detect if it is a date and return the string representation of it (ex: 29-12-2010)?
Excel stores dates as a floating-point value: the integer part is the number of days counted from 1/1/1900 (or 1/1/1904, depending on which date system is being used) and the fractional part is the proportion of a day (i.e. the time part). This is made slightly more awkward by the fact that Excel treats 1900 as a leap year.
The only thing that differentiates a date from a number is the number format mask. If you can read the format mask, you can use it to identify the value as a date rather than a number, then calculate the date value/formatting from the base date.
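As a rough illustration of reading the format mask, here is a minimal sketch. The class and method names are made up for this example. It assumes you have already resolved the cell's style to a number format id (in the Open XML SDK this would typically go through the cell's StyleIndex to a CellFormat and its NumberFormatId) and, for custom formats, to the format code string. Ids 14-22 and 45-47 are the built-in date/time formats; the check on custom codes is only a heuristic.

```csharp
using System;
using System.Text;

public static class ExcelNumberFormats
{
    // Built-in number format ids that are date or time formats.
    public static bool IsBuiltInDateFormat(uint numFmtId) =>
        (numFmtId >= 14 && numFmtId <= 22) || (numFmtId >= 45 && numFmtId <= 47);

    // Heuristic for custom format codes: drop quoted literals, [bracketed]
    // sections and escaped/padding characters, then look for date/time letters.
    public static bool LooksLikeDateFormat(string formatCode)
    {
        if (string.IsNullOrEmpty(formatCode)) return false;

        var stripped = new StringBuilder();
        bool inQuotes = false, inBrackets = false;
        for (int i = 0; i < formatCode.Length; i++)
        {
            char c = formatCode[i];
            if (inQuotes) { if (c == '"') inQuotes = false; continue; }
            if (inBrackets) { if (c == ']') inBrackets = false; continue; }
            if (c == '"') { inQuotes = true; continue; }
            if (c == '[') { inBrackets = true; continue; }
            // '\', '_' and '*' each consume the character that follows them.
            if (c == '\\' || c == '_' || c == '*') { i++; continue; }
            stripped.Append(char.ToLowerInvariant(c));
        }
        return stripped.ToString().IndexOfAny(new[] { 'd', 'm', 'y', 'h', 's' }) >= 0;
    }
}
```

With this, "dd-mm-yyyy" and "[h]:mm:ss" are reported as date/time formats, while "General" and "0.00" are not. Unusual custom formats may still be misclassified.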
Related
I need to convert a C# datetime object into the dreaded Excel date format:
https://datapub.cdlib.org/2014/04/10/abandon-all-hope-ye-who-enter-dates-in-excel/
i.e. number of days since 1 Jan 1900 expressed as a floating point number.
Is there any way to do it without resorting to DIY code?
I need it in order to create Excel-friendly CSV exports
Googling around I didn't find anything useful except that good blog post
Excel dates use the OLE Automation date format. You can retrieve it with DateTime.ToOADate
OA dates are doubles whose integer part is the day offset from 30 December 1899 (i.e. earlier dates are negative) and whose fractional part is the time of day expressed as a fraction of 24 hours.
This type was used a lot in the COM/VB6 days. Nowadays it's needed for Excel and when you need to call COM APIs that expect dates or variants with a date content.
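For the CSV export case, a minimal sketch could look like the following. The helper name is hypothetical, and it assumes dates on or after 1 March 1900, where OA dates and Excel serial numbers agree. The invariant culture is used so that the decimal separator is always a dot.

```csharp
using System;
using System.Globalization;

public static class ExcelCsv
{
    // Formats a DateTime as the serial number Excel expects for a date cell.
    // Assumes the date is on or after 1 March 1900.
    public static string ToExcelSerialField(DateTime value) =>
        value.ToOADate().ToString(CultureInfo.InvariantCulture);
}
```

For example, 29 December 2010 at midnight becomes "40541", and the same day at noon becomes "40541.5". Excel still needs a date number format on the column to display these as dates.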
You can use the following method to convert an Excel date back to a C# DateTime:
return DateTime.FromOADate(SerialDate);
In an Excel cell, I put 12, if I format it as Date, then it is 1/12/1900.
In C#, I use DateTime.FromOADate(12), it returns 1/11/1900.
but if I put 411 in Excel and format it as date, it will be 2/14/1901.
In C#, DateTime.FromOADate(411) returns 2/14/1901, too.
I am confused about the discrepancy. How can I get the right Date in C# then?
This is an Excel quirk where it emulates a Lotus 1-2-3 bug for compatibility.
The year 1900 was not a leap year, but Excel treats it as a leap year to be compatible with the Lotus 1-2-3 bug. The OLE date/time processing correctly does not.
There is this amusing anecdote about the issue: http://www.joelonsoftware.com/items/2006/06/16.html
If you need to work around this and emulate the Excel behaviour before 1 March 1900, you can set a double instead of a date, and do a DateTime to double conversion that respects the Excel bug too. Internally Excel always represents the dates as doubles anyway.
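One way to sketch such a conversion, assuming the 1900 date system and using hypothetical helper names: serial numbers below 60 are shifted by one day relative to OA dates, serial 60 is the non-existent 29 February 1900, and from serial 61 (1 March 1900) onwards the two agree.

```csharp
using System;

public static class ExcelSerialDate
{
    // Excel serial (1900 date system) -> DateTime, compensating for the
    // phantom 29 Feb 1900 that Excel inherited from Lotus 1-2-3.
    public static DateTime FromExcelSerial(double serial)
    {
        if (serial < 0)
            throw new ArgumentOutOfRangeException(nameof(serial), "Excel serials are not negative.");
        if (serial >= 60 && serial < 61)
            throw new ArgumentOutOfRangeException(nameof(serial), "Serial 60 is 29 Feb 1900, which does not exist.");
        return DateTime.FromOADate(serial < 60 ? serial + 1 : serial);
    }

    // DateTime -> Excel serial (1900 date system), the inverse of the above.
    public static double ToExcelSerial(DateTime value)
    {
        double oa = value.ToOADate();
        if (oa < 1)
            throw new ArgumentOutOfRangeException(nameof(value), "Date is before Excel's day zero.");
        return oa < 61 ? oa - 1 : oa;
    }
}
```

With this sketch, FromExcelSerial(12) gives 12 January 1900, matching what Excel displays, and FromExcelSerial(411) still gives 14 February 1901.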
I am working on some functionality that needs to know the data types of the columns of a user-supplied Excel spreadsheet. These spreadsheets can have various layouts; there is no standard format besides the first row holding the column names. The problem I am having is differentiating between integer and DateTime columns. Currently, I am using the following function to determine whether a cell is a DateTime cell or not:
private bool isOADate(double Val)
{
    try
    {
        DateTime dt = DateTime.FromOADate(Val);
        return true;
    }
    catch (ArgumentException)
    {
        return false;
    }
}
However if a cell has 1, 2, 3 etc in it, this function returns true as it is able to convert these to 1899/12/31 12:00:00 AM, 1900/01/01 12:00:00 AM, 1900/01/02 12:00:00 AM respectively. Is there a better way to determine the DateTime data type of cell? Or can you suggest an improvement on my current function so as to differentiate between them?
The VBA function IsDate() also checks the cell format: 27/01/2012 gives True, while its integer equivalent 40935 gives False.
Every cell in Excel contains either a formula, a text string or a double numeric value.
A date/time is just a number, so 11:02 on 02-Jan-2013 is 41276.45972, representing the number of days since 31-Dec-1899 (incorporating the old Lotus 123 error of believing 1900 was a leap year).
So there's no way to be certain that a cell contains a date value without knowing something about the specific context of the worksheet.
If you know that dates will fall within a certain range (in particular we can often define at least a lower bound) then the function can be enhanced to test for a minimum value.
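A range test of that kind could be sketched as follows. The method name and the bounds are assumptions for illustration, and the right bounds depend entirely on your data. Small integers such as 1, 2, 3 then fail the check, although a genuine number that happens to fall inside the range will still be mistaken for a date.

```csharp
using System;

public static class DateHeuristics
{
    // Treats a numeric cell value as a plausible date only if it falls
    // between the OA date equivalents of the given bounds (inclusive).
    public static bool IsPlausibleOADate(double value, DateTime earliest, DateTime latest)
    {
        return value >= earliest.ToOADate() && value <= latest.ToOADate();
    }
}
```

For example, with bounds of 1 January 1990 and 1 January 2100, the value 40935 (27/01/2012) passes while 3 does not.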
Further, if you can work with a reference to the cell itself and you know that cells containing values that represent date/times will be formatted appropriately, then you can test the NumberFormat property for something date/time-related (but this can quickly get complicated where custom formats are in use).
Excel stores date/times as plain numbers. In the absence of any other context, you cannot differentiate between a number and a date/time, unless the user has formatted the column as a date/time; in that case you can read the NumberFormat property and deduce the type of the column from that.
My C# Excel addon is transforming a time series into another time series.
Time series is defined as a two column range with the first column being date and second column being a floating point number.
Although my C# code returns a DateTime for the date, Excel converts it into an Excel serial date and displays the raw number, e.g. 39000, rather than a formatted date, e.g. 14/12/2012.
Thus, the user has to press CTRL + # to turn it into dates.
Is there a way to return the output of a C# Excel add-in already formatted as a date in Excel?
a) Format the cells from the code which returns the values. E.g. if you have a handle to the range object, you can do something like range.NumberFormat="yyyy-mm-dd".
b) Pre-format the cells to a date format before running the code which imports the data. This would work if the same sheet is used each time, rather than a blank or arbitrary sheet.
c) This is an ugly hack, but try returning your date values as strings, formatted as "mm/dd/yyyy". This should force Excel to interpret the values as dates. AFAIK the machine locale settings will not affect the translation in this case.
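If you try option c), note that .NET format strings differ from Excel's: in C#, lowercase "mm" means minutes, so the month needs "MM". A minimal sketch, with a hypothetical helper name, using the invariant culture so the "/" separator is not replaced by a culture-specific one:

```csharp
using System;
using System.Globalization;

public static class DateCellText
{
    // Formats a DateTime as month/day/year text, e.g. "12/14/2012".
    // "MM" is the month in .NET; "mm" would be minutes.
    public static string ToExcelDateText(DateTime value) =>
        value.ToString("MM/dd/yyyy", CultureInfo.InvariantCulture);
}
```

Whether Excel then interprets the string as a date is the part this sketch does not verify.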
To supplement Adam's c) answer:
If you have a Date/Time variable and return it to a cell from a Sub using Range.Value it will format the cell as a date (tested using VBA rather than C#: not sure if .Value is available using interop?)
If you use Range.Value2 it will NOT format the cell as a date.
If you try to do this from a UDF the UDF will NOT change the formatting of the cell.
Excel AddIn, C#, UDF,
MyUDF calls a web service to retrieve a certain date. Sometimes the date returned is not in the range Jan-1-1900 to Dec-31-9999 (the Excel date range). E.g. in one case the returned date is Jan-2-0002 (valid in C# but not valid in Excel), and then Excel crashes. I do not want to hard-code "Jan-1-1900", so I wonder if there is a way to get it programmatically. Thanks.
Inside Excel, convert the date corresponding to 1 into a string that you can parse:
=TEXT(1,"dd-mm-yyyy")
which should give you 01-01-1900 or 02-01-1904 depending on the date system chosen (hat tip to barrowc!). Note how the date systems don't only differ by four years but also by the fact that in one case the value 1 corresponds to January 1st (in 1900) and in the other it is 0 which corresponds to January 1st (in 1904). For a given Workbook, the Workbook.Date1904 Property can tell you which date system is being used.
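On the C# side, once you know which date system applies (for example from Workbook.Date1904), a guard like the following could be run before returning a value from the UDF. This is only a sketch with hypothetical names. It takes the earliest usable dates to be 1 January 1900 and 1 January 1904 respectively, and the latest to be 31 December 9999 as stated in the question.

```csharp
using System;

public static class ExcelDateRange
{
    // Latest date Excel can represent, per the range given in the question.
    public static readonly DateTime MaxDate = new DateTime(9999, 12, 31);

    // Earliest date for the chosen date system (1900 or 1904).
    public static DateTime MinDate(bool date1904) =>
        date1904 ? new DateTime(1904, 1, 1) : new DateTime(1900, 1, 1);

    // True if the date part of the value falls inside Excel's range.
    public static bool IsRepresentable(DateTime value, bool date1904) =>
        value.Date >= MinDate(date1904) && value.Date <= MaxDate;
}
```

A date such as Jan-2-0002 is then rejected up front, and the UDF can return an error value or text instead of passing it to Excel.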