How can I parse an abbreviated day-of-week name (e.g. 'Fri' or 'Sat') to NodaTime.IsoDayOfWeek? Preferably I'm looking for something that uses NodaTime's built-in pattern matching capabilities. If that's not possible, alternative approaches or workarounds are welcome. This does not have to be culture-sensitive; it's okay to assume CultureInfo.InvariantCulture.
I tried using a LocalDatePattern "ddd" with a LocalDate template, but parsing fails unless the day of week I'm parsing (e.g. 'Fri') matches the template's day of week. I could iterate over 7 different templates and return the first one that parses successfully, but surely there must be a better way?
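One possible workaround, sketched here without NodaTime's pattern API, is a lookup table built from the invariant culture's abbreviated day names. The class and method names are hypothetical. The sketch returns the ISO day number (Monday = 1 through Sunday = 7); it assumes that NodaTime.IsoDayOfWeek uses the same numeric values, so the result could be cast with (IsoDayOfWeek)isoDayNumber, which is worth verifying against the NodaTime version in use.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper: maps "Mon".."Sun" to ISO day numbers 1..7.
public static class DayNameParser
{
    private static readonly Dictionary<string, int> IsoNumberByAbbreviation = BuildMap();

    private static Dictionary<string, int> BuildMap()
    {
        // Case-insensitive so that "fri" and "FRI" are accepted as well.
        var map = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        string[] names = CultureInfo.InvariantCulture.DateTimeFormat.AbbreviatedDayNames;
        for (int i = 0; i < names.Length; i++)
        {
            // AbbreviatedDayNames is indexed like System.DayOfWeek (Sunday = 0),
            // while ISO numbering puts Sunday last (7).
            map[names[i]] = i == 0 ? 7 : i;
        }
        return map;
    }

    // Returns false (and 0) when the text is not a known abbreviation.
    // Assumption: the caller casts the number to NodaTime.IsoDayOfWeek.
    public static bool TryParseIsoDayNumber(string text, out int isoDayNumber)
    {
        isoDayNumber = 0;
        return text != null
            && IsoNumberByAbbreviation.TryGetValue(text.Trim(), out isoDayNumber);
    }
}
```

A single dictionary lookup avoids trying seven templates in turn, at the cost of bypassing NodaTime's own parsing.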
Here is a scenario.
You have a string that represents a date, e.g. "Jan 25 2016 10:10 AM".
You want to know whether it represents a date in a specific culture.
You want to know what dateTime pattern satisfies this date string.
Example:
Date string is "Jan 25 2016 10:10 AM"
Culture is en-US
The POSSIBLE format for it could be "MMM dd yyyy HH:mm tt"
Implementation:
To get the list of all dateTime patterns, call CultureInfo.DateTimeFormat.GetAllDateTimePatterns().
Then try the overload DateTime.TryParseExact(dateString, pattern, culture, DateTimeStyles.None, out resultingDate) with each of those patterns and see whether it can parse the date.
That should give you the needed dateTime pattern.
HOWEVER, if we iterate over all those patterns, none of them matches!
This is even weirder given that DateTime.TryParse(dateString, culture, DateTimeStyles.None, out resultingDate) DOES parse the date correctly!
So the question is: how does DateTime.TryParse know the pattern of a date string when this information is not part of CultureInfo, and how can I get this information from a culture?
Thanks!
I agree with xanatos: there is no perfect solution for this, and you can't assume that every format GetAllDateTimePatterns returns produces strings that the Parse or TryParse methods can parse.
From the DateTimeFormatInfo.GetAllDateTimePatterns documentation:
You can use the custom format strings in the array returned by the
GetAllDateTimePatterns method in formatting operations. However, if
you do, the string representation of a date and time value returned in
that formatting operation cannot always be parsed successfully by the
Parse and TryParse methods. Therefore, you cannot assume that the
custom format strings returned by the GetAllDateTimePatterns method
can be used to round-trip date and time values.
If you look at the Remarks section on that page, for the it-IT culture, for example, only 42 of the 96 formats that GetAllDateTimePatterns returns can be parsed by the TryParse method.
Tarek Mahmoud Sayed responded:
Parse/TryParse are implemented as finite state machine so it doesn’t
really use the date patterns in parsing. It just split the parsed
string into tokens and try to find if the token match specific part of
the date (like Month, day, day of week…etc.). in the other hand
ParseExact/TryParseExact will just parse the string according to the
passed format pattern.
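The pattern-by-pattern probing described in the question can be sketched roughly as follows. PatternProbe and FindMatchingPatterns are hypothetical names. Consistent with the quotes above, an empty result only means that no listed pattern matches exactly, not that DateTime.TryParse would reject the string.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper: lists the culture's patterns that parse the text exactly.
public static class PatternProbe
{
    public static List<string> FindMatchingPatterns(string text, CultureInfo culture)
    {
        var matches = new List<string>();
        foreach (string pattern in culture.DateTimeFormat.GetAllDateTimePatterns())
        {
            // TryParseExact follows the pattern literally; it does not guess.
            if (DateTime.TryParseExact(text, pattern, culture, DateTimeStyles.None, out _))
            {
                matches.Add(pattern);
            }
        }
        return matches;
    }
}
```

For instance, FindMatchingPatterns("01/25/2016", CultureInfo.InvariantCulture) should include "MM/dd/yyyy", the invariant short date pattern.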
In short, Parsing is really hard because there are a lot of things that can trip it up. And someone in some government could suddenly decide that country X should use D/M/Y instead of M/D/Y, or could have someone entering data used to the other format.
I talk a little about this on a blog post (toward the bottom-ish) https://web.archive.org/web/20190110065542/https://blogs.msdn.microsoft.com/shawnste/2005/04/05/culture-data-shouldnt-be-considered-stable-except-for-invariant/
DateTime.Parse attempts to guess what the input might be based on the pattern(s) and separators it sees in the specified culture. Unfortunately, some cultures are REALLY hard to guess at. For example, . has been used for time formats in some locales, so is 1.1.1 12.12.12 the 12th day of December 2012? Or the 1st day of January 2001?
ParseExact (as the other answers suggest) is more reliable as you can tell it exactly what you're looking for - even better, you can also tell the user exactly what to enter. (Hopefully this is human input). Unfortunately it requires the user to follow the template.
This is also why most date controls you encounter, especially on the web, have separate fields for month, day & year.
For machine-readable formats it's best to write the value out in some standard format and read it back in with that exact same format. We've had customers send data from one country to another using the CurrentCulture and wonder why their vendor can't read it ;-)
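As a minimal sketch of that advice, the round-trip format specifier "o" with the invariant culture gives the writer and the reader the same fixed format. MachineFormat, Write and Read are hypothetical names chosen for illustration.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: one fixed, culture-independent format for both directions.
public static class MachineFormat
{
    // "o" is the round-trip specifier, e.g. 2016-01-25T10:10:00.0000000Z.
    public static string Write(DateTime value)
    {
        return value.ToString("o", CultureInfo.InvariantCulture);
    }

    // RoundtripKind preserves the DateTimeKind encoded in the string.
    public static DateTime Read(string text)
    {
        return DateTime.ParseExact(
            text, "o", CultureInfo.InvariantCulture, DateTimeStyles.RoundtripKind);
    }
}
```

Because neither side consults CurrentCulture, the output does not change with the machine's regional settings.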
I have a large collection of text files that need to be parsed using Regexes. Some of the data that I'm collecting are dates that come in all sorts of formats, such as 12/1/15, Dec 1, 2015, 12-1-15, etc. They sometimes have a year listed and sometimes don't. My problem occurs when I have dates that span two years, e.g. 12/1 - 1/8, where the first date needs the year 2015 and the second date needs the year 2016. Currently I'm parsing them as strings and trying to convert them to DateTimes. This adds the current day's year, so if it's parsed in 2015 the second date is wrong, and if it's parsed in 2016 the first date is wrong. Is there a way to detect when Convert.ToDateTime has added the year because the string was missing one? If I could detect this, I have a way to work out which year needs to be added.
Convert.ToDateTime just uses DateTime.Parse. My understanding is that when interpreting MM/dd formats, it will always assume the current year.
In your scenario, it sounds like you will need to decide how you want to handle this. For instance, you could test whether the second date precedes the first and, if so, add a year to the second date.
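The year-rollover check can be sketched as follows, assuming the range has already been extracted as text of the form "M/d - M/d" and that the year of the first date is known from context. DateRangeResolver and its members are hypothetical names. Month and day are read manually so that the current year never leaks in; an input such as 2/29 in a non-leap year would still throw and needs separate handling.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: resolves a year-less "M/d - M/d" range.
public static class DateRangeResolver
{
    public static DateTime ParseMonthDay(string text, int year)
    {
        string[] parts = text.Trim().Split('/');
        int month = int.Parse(parts[0], CultureInfo.InvariantCulture);
        int day = int.Parse(parts[1], CultureInfo.InvariantCulture);
        return new DateTime(year, month, day);
    }

    public static (DateTime Start, DateTime End) ResolveRange(string range, int startYear)
    {
        string[] ends = range.Split('-');
        DateTime start = ParseMonthDay(ends[0], startYear);
        DateTime end = ParseMonthDay(ends[1], startYear);
        if (end < start)
        {
            // The end date falls before the start date, so it belongs to the next year.
            end = ParseMonthDay(ends[1], startYear + 1);
        }
        return (start, end);
    }
}
```

With this sketch, ResolveRange("12/1 - 1/8", 2015) yields 2015-12-01 and 2016-01-08.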
I was looking at some code in an application (someone else wrote it). In some cases it worked fine and in other cases it threw exceptions. It was converting strings to DateTime values. Here is the code:
// 5000 is the year, but is "1" the month or the day? If it's the month,
// then what about the day?
DateTime time = DateTime.Parse("1.5000"); // "1.5000" doesn't look like a date to me
time.ToString(); // returns "1/1/5000 12:00:00 AM"
// whereas if I give this string to DateTime.Parse():
time = DateTime.Parse("2341.70");
// FormatException was unhandled:
// String was not recognized as a valid DateTime.
A confusing thought
How is the string "3.5000" (which matches the 1.5000 pattern) evaluated? Does it mean 3-3-5000 or 1-3-5000? The format is ambiguous, unclear and confusing!
My questions are:
What kind of formats does DateTime.Parse accept?
What's happening in the code above?
Suggestions to improve the code?
Many people have commented on the possible reasons for the parse that you have seen being successful but your question seems to have several separate parts...
1. What kind of formats does DateTime.Parse accept?
DateTime.Parse has been written to be as inclusive as possible. If it can find some way to turn the input into a DateTime, it will do its best to do so. That means that, in addition to the usual familiar yyyy-MM-dd type formats, it accepts stranger ones such as M.yyyy or yyyy.M and so on.
2. What's happening in the code above?
That is very complicated because the DateTime.Parse method is itself very complicated. You can probably find the source code out there somewhere, but the complexity made it very hard for me to follow. Without being able to give precise details, I'm going to answer this the same way as above: the framework is trying its best to give you a date back and not throw an exception. The date it gives is its best guess as to what you meant.
3. Suggestions to improve the code?
If you are getting parse exceptions, it sounds like you are passing dates in unexpected formats. Without knowing what those inputs are, it's hard to say more. Two things could improve your code, though. The first option is to make sure a single consistent date format is used and then use DateTime.ParseExact to ensure that the input conforms to that format. You will remove all ambiguity this way, but you will sacrifice flexibility.
The second option is to use DateTime.TryParse. This will attempt to parse your date and then return a boolean saying whether it succeeded or not. If successful, the parsed date is returned in an out parameter. This won't make your code any better at recognising unknown date formats, but it will let your code know when such an unparsable format crops up so that you can deal with it (e.g. by giving the user feedback that reports the wrong format and suggests a correct one, or just by logging it).
Which method is best depends mostly on where your input is coming from. If it is user input, then I'd go with the second option. If it is automated input, then you probably want to make sure your input is standardized and then use the first option. Of course circumstances always vary, so this is not a hard and fast rule. :)
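A minimal sketch of the strict option follows, assuming yyyy-MM-dd is the agreed format. StrictDateReader and TryReadIsoDate are hypothetical names. TryParseExact reports failure through its return value, so the caller can give feedback instead of catching a FormatException.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: accepts only the agreed yyyy-MM-dd format.
public static class StrictDateReader
{
    public static bool TryReadIsoDate(string text, out DateTime value)
    {
        // No guessing: the input must match the format exactly.
        return DateTime.TryParseExact(
            text, "yyyy-MM-dd", CultureInfo.InvariantCulture,
            DateTimeStyles.None, out value);
    }
}
```

With this reader, "2013-12-13" is accepted, while inputs such as "1.5000" or "2341.70" are rejected without an exception.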
In regard to "2. What's happening in the code above?":
In some cultures, the date separator is a dot instead of a slash. So, for example, 13.12.2013 is a valid date (2013-12-13) in the format "dd.MM.yyyy". Now, by whatever design choice, the day part in this example is not mandatory and, if left out, is automatically filled with 1. So parsing 12.2013 would result in 2013-12-01. It is therefore easy to see how 1.5000 would become 5000-01-01. 2341.70 cannot be parsed, because 2341 is not a valid month. - So in this case 1.5000 is a "valid" date in the format M.yyyy.
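To make the M.yyyy reading explicit, the same interpretation can be requested directly with TryParseExact. This is only a sketch of that one interpretation, not a trace of what DateTime.Parse does internally, and the names are hypothetical.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: parses text strictly as month.year.
public static class MonthYearReader
{
    public static bool TryReadMonthYear(string text, out DateTime value)
    {
        // The dot is a literal separator; the missing day defaults to 1.
        return DateTime.TryParseExact(
            text, "M.yyyy", CultureInfo.InvariantCulture,
            DateTimeStyles.None, out value);
    }
}
```

Under this format "1.5000" parses to 1 January 5000, and "2341.70" is rejected.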
I use asp.net 4 and c#.
I need to use a validation WebControl, namely RegularExpressionValidator, to detect data entered in a TextBox that IS NOT in the format yyyy-MM-dd (String).
Any idea how to write the RegEx to apply to this control?
Thanks
Here's one possible regex:
^\d{4}-((0\d)|(1[012]))-(([012]\d)|3[01])$
Note: this will prevent months >12 and days >31, but won't check specific months for length (i.e. it won't block 30th Feb or 31st Apr). You could write a regex to do that, but it would be quite lengthy, and 29th Feb is always going to give you problems in regex.
I'd say if you need that kind of fine-grained validation, you're better off parsing the date with a date library; regex isn't the tool for you. This regex should be sufficient for basic pre-validation though.
I've also gone lenient on the year, just checking that it's four digits. But if you want some sort of sanity check (i.e. within certain bounds), it shouldn't be too hard to add. For example, if you want to match only dates in this century, you would replace the \d{4} at the beginning of the regex with 20\d{2}. Again, trying to validate a date with excessive accuracy in regex is going to be difficult and you should use a date parser, but you can get basic century-level matching quite easily to prevent the user entering anything really silly.
Finally, I've put ^ and $ to tie off the ends of the string so it can't match if the user enters a valid date and extra characters as well. (You may want to add a string length validator for this as well).
Hope that helps.
Spudley's answer above allows 00 for day and month.
I fixed it:
^\d{4}-((0[1-9])|(1[012]))-((0[1-9]|[12]\d)|3[01])$
Note: neither of these expressions check for days in a month that are invalid, e.g. 04/31, 06/31 or 02/29 on non-leap years.
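One way to close that gap, sketched under the assumption that a server-side check is acceptable (for example behind a CustomValidator), is to keep the regex as a shape check and let DateTime.TryParseExact decide whether the day actually exists. IsoDateValidator and its members are hypothetical names.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper: regex for the shape, DateTime for calendar validity.
public static class IsoDateValidator
{
    // Same shape as the corrected expression above: yyyy-MM-dd, no 00 month or day.
    private static readonly Regex Shape = new Regex(
        @"^\d{4}-(0[1-9]|1[012])-(0[1-9]|[12]\d|3[01])$",
        RegexOptions.CultureInvariant);

    public static bool HasValidShape(string text)
    {
        return Shape.IsMatch(text);
    }

    public static bool IsRealDate(string text)
    {
        // TryParseExact rejects 30 February, 31 April and 29 February in non-leap years.
        return HasValidShape(text)
            && DateTime.TryParseExact(
                text, "yyyy-MM-dd", CultureInfo.InvariantCulture,
                DateTimeStyles.None, out _);
    }
}
```

For example, "2013-02-30" passes HasValidShape but fails IsRealDate, while "2012-02-29" passes both.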
Regular expression \d\d\d\d-\d\d-\d\d should do the trick.
I would like to add a little change in Spudley's answer:
^\d{4}$|^\d{4}-((0?\d)|(1[012]))-(((0?|[12])\d)|3[01])$
so you can use dates like 2013-5-5 (the leading zero in month and day is optional but still allowed). Note that the ^\d{4}$ alternative also accepts a bare four-digit year.
Hope it helps.
Another implementation for ISO 8601 structured dates:
^\d{4}-\d{1,2}-\d{1,2}\s\d{1,2}:\d{1,2}:\d{1,2}\.?\d{0,}$
It's not quite as strict, and will accept incorrect dates, but it should validate that it follows the ISO 8601 structure even if the date is a non-existent one. It should also be fairly simple to understand for anyone with a basic Regex understanding.
If you really want to ensure the date is correct, and work with it, run DateTime.TryParse() on it.
(19|20)[0-9]{2}-(0[1-9]|1[012])-(0[1-9]|[12][0-9]|3[01])
Match result:
1999-09-12
((([0-9][0-9][0-9][1-9])|([1-9][0-9][0-9][0-9])|([0-9][1-9][0-9][0-9])|([0-9][0-9][1-9][0-9]))-((0[13578])|(1[02]))-((0[1-9])|([12][0-9])|(3[01])))|((([0-9][0-9][0-9][1-9])|([1-9][0-9][0-9][0-9])|([0-9][1-9][0-9][0-9])|([0-9][0-9][1-9][0-9]))-((0[469])|11)-((0[1-9])|([12][0-9])|(30)))|(((000[48])|([0-9]0-9)|([0-9][1-9][02468][048])|([1-9][0-9][02468][048]))-02-((0[1-9])|([12][0-9])))|((([0-9][0-9][0-9][1-9])|([1-9][0-9][0-9][0-9])|([0-9][1-9][0-9][0-9])|([0-9][0-9][1-9][0-9]))-02-((0[1-9])|([1][0-9])|([2][0-8])))
This is a regex for the yyyy-MM-dd format.
You can replace - with \/ for yyyy/MM/dd.
Tested and working.
I understand different cultures specify dates differently. Some put the day before the month (27/10/2009 vs. 10/27/2009) and others use dots instead of slashes (10.27.2009 vs. 10/27/2009). However, is there anything special that needs to be done regarding the year? Do non-Christian cultures refer to the same numeric year (2009) as Christian cultures? I created a simple C# app and called ToString() on the current date, changed the language/culture to Arabic, and it displayed the same thing. Maybe the year is a globally accepted standard???
Use the ToString() overload which takes an IFormatProvider and pass in CultureInfo.CurrentCulture and the date will format appropriately.
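On the year question itself, a small sketch can show that the numeric year depends on the calendar rather than being universal. It uses the calendar classes in System.Globalization directly, so it does not depend on which calendar a given culture selects by default (that can vary by platform and settings). CalendarYears and its members are hypothetical names.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: the same instant has different year numbers per calendar.
public static class CalendarYears
{
    public static int YearIn(Calendar calendar, DateTime date)
    {
        return calendar.GetYear(date);
    }

    // Formats with whatever calendar and patterns the current culture uses.
    public static string FormatForCurrentCulture(DateTime date)
    {
        return date.ToString("d", CultureInfo.CurrentCulture);
    }
}
```

For 27 October 2009, YearIn returns 2009 with GregorianCalendar and 2552 with ThaiBuddhistCalendar; HijriCalendar gives yet another year number. Passing the culture to ToString lets the framework apply whichever calendar that culture is configured to use.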