C# Regular Expression not matching - c#

I have a regular expression
string dateformattwo = @"^(?:(?:31(\/|-|\.)(?:0?[13578]|1[02]|(?:Jan|Mar|May|Jul|Aug|Oct|Dec)))\1|(?:(?:29|30)(\/|-|\.)(?:0?[1,3-9]|1[0-2]|(?:Jan|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec))\2))(?:(?:1[6-9]|[2-9]\d)?\d{2})$|^(?:29(\/|-|\.)(?:0?2|(?:Feb))\3(?:(?:(?:1[6-9]|[2-9]\d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))$|^(?:0?[1-9]|1\d|2[0-8])(\/|-|\.)(?:(?:0?[1-9]|(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep))|(?:1[0-2]|(?:Oct|Nov|Dec)))\4(?:(?:1[6-9]|[2-9]\d)?\d{2})";
and two strings
string value = "30.Jul.2019 This is the line that I want to match";
string value2 = "30.jul.2019";
I believe the regex is correct; however, it does not match value, although it does match value2. Why is that happening?

I couldn't get your regex to match your strings, so it's hard to say exactly what's expected here, but I can take a guess as to why it's not working: nowhere in your regex are you looking for july; it looks to me like you're only matching Jul.
Edit: each of your regexes ends with $, which asserts the position at the end of the line. Your first string fails because there are characters after the date.
Updated regex here, which, despite being a PHP-flavored regex as pointed out in the comments, still matches your desired text.
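To illustrate the point about the trailing $, here is a minimal sketch. The DatePart pattern is a simplified, hypothetical stand-in for the question's much longer regex (it only covers the dd.Mon.yyyy shape), and the class and method names are made up for the example.

```csharp
using System.Text.RegularExpressions;

public static class DateAnchorDemo
{
    // Simplified, hypothetical stand-in for the question's pattern: dd.Mon.yyyy
    private const string DatePart =
        @"\d{1,2}\.(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\.\d{4}";

    // Anchored at both ends: succeeds only when the date is the whole input.
    public static bool IsWholeInputDate(string input) =>
        Regex.IsMatch(input, "^" + DatePart + "$", RegexOptions.IgnoreCase);

    // Anchored at the start only: text after the date is allowed.
    public static bool StartsWithDate(string input) =>
        Regex.IsMatch(input, "^" + DatePart + @"\b", RegexOptions.IgnoreCase);
}
```

With the $ anchor, "30.jul.2019" passes but "30.Jul.2019 This is the line that I want to match" does not; dropping the $ (here replaced by a word boundary) lets both pass.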

Related

I want only matching string using regex

I have a string "myname 18-may 1234" and I want only "myname" from whole string using a regex.
I tried using the \b(^[a-zA-Z]*)\b regex and that gave me "myname" as a result.
But when the string changes to "1234 myname 18-may" the regex does not return "myname". Please suggest the correct way to select only "myname" whole word.
Is it also possible - given the string in "1234 myname 18-may" format - to get myname only, not may?
UPDATE
Judging by your feedback to your other question you might need
(?<!\p{L})\p{L}+(?!\p{L})
ORIGINAL ANSWER
I have come up with a lighter regex that relies on the specific nature of your data (just a couple of words in the string, only one is whole word):
\b(?<!-)\p{L}+\b
See demo
Or even a more restrictive regex that finds a match only between (white)spaces and string start/end:
(?<=^|\s)\p{L}+(?=\s|$)
The following regex is context-dependent:
\p{L}+(?=\s+\d{1,2}-\p{L}{3}\b)
See demo
This will match only the word myname.
The regex means:
\p{L}+ - Match 1 or more Unicode letters...
(?=\s+\d{1,2}-\p{L}{3}\b) - ...but only if they are followed by 1 or more whitespace characters (\s+), then 1 or 2 digits, then a hyphen and 3 Unicode letters (\p{L}{3}) that end at a word boundary (\b). This construct is a positive look-ahead: it only checks whether something can be found after the current position in the string, but it does not "consume" text.
Since the date may come before the string, you can add an alternation:
\p{L}+(?=[ ]+\d{1,2}-\p{L}{3}\b)|(?<=\d{1,2}-\p{L}{3}[ ]+)\p{L}+
See another demo
The (?<=\d{1,2}-\p{L}{3}[ ]+) is a look-behind that checks for (almost) the same thing as the look-ahead, but before myname.
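A short sketch of how the alternation above could be used from C#. The class and method names are hypothetical; the pattern is the one from the answer, and the example relies on .NET supporting variable-length look-behind.

```csharp
using System.Text.RegularExpressions;

public static class NameExtractor
{
    // Pattern from the answer: a run of letters either followed by,
    // or preceded by, a "dd-mon" token separated by spaces.
    private static readonly Regex NamePattern = new Regex(
        @"\p{L}+(?=[ ]+\d{1,2}-\p{L}{3}\b)|(?<=\d{1,2}-\p{L}{3}[ ]+)\p{L}+");

    // Returns the first matched word, or an empty string when nothing matches
    // (Match.Value is empty for a failed match).
    public static string ExtractName(string input) =>
        NamePattern.Match(input).Value;
}
```

For example, NameExtractor.ExtractName("1234 myname 18-may") yields myname, and the three-letter month is never returned on its own.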
Here is a solution without regex (it needs using System.Linq):
string input = "myname 18-may 1234";
string result = input.Split(' ').Where(x => x.All(y => char.IsLetter(y))).FirstOrDefault();
Do a replace with an empty string using this regex:
(\s*\d+\-.{3}\s*|\s*.{3}\-\d+\s*)|(\s*\d+\s*)
You will end up with just the name.
Demo

Not detecting the text between two characters?

I need to recognize the number between the tags [DN]4[-DN] so I wrote this regex:
Regex regexCount = new Regex(@"\[DN]([^)]*)\[-DN]");
Match matchCount = regexCount.Match("[DN]4[-DN]");
However, when I try to convert the matched string to an Int32, I get this error:
Input string was not in a correct format.
This is how I tried converting:
int count = Convert.ToInt32(matchCount.Value);
When I debugged, I saw that the matched value is {[DN]2[-DN]} instead of 2. However, the regex101 test gave the correct result with the same regex: regex101
What am I doing wrong folks?
You're currently returning the entire match. You need to return the matched content from your capturing group. The Groups property gets the captured groups within the regular expression.
int count = Convert.ToInt32(matchCount.Groups[1].Value);
Also, the negated character class seems incorrect; I would use the regex token \d instead.
@"\[DN](\d+)\[-DN]"
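Putting both suggestions together, a minimal sketch might look like the following. TagCountParser and TryParseCount are hypothetical names; using int.TryParse instead of Convert.ToInt32 is a choice made here so that a missing or unparsable tag does not throw.

```csharp
using System.Text.RegularExpressions;

public static class TagCountParser
{
    // Group 1 captures only the digits between [DN] and [-DN].
    private static readonly Regex CountPattern = new Regex(@"\[DN](\d+)\[-DN]");

    // Returns true and sets count when the tag is found and its content parses.
    public static bool TryParseCount(string input, out int count)
    {
        count = 0;
        Match m = CountPattern.Match(input);
        // m.Value would be the whole "[DN]4[-DN]"; Groups[1] is just "4".
        return m.Success && int.TryParse(m.Groups[1].Value, out count);
    }
}
```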

C# regex match behaviour

I've got this line in my code:
Match match = Regex.Match(actualValue, regexValue, RegexOptions.None);
I've got a simple question. When I check for success with the line:
if(match.Success)
why does the match succeed with the following values?
actualValue = "G:1"
regexValue = "A*"
The actual value does not seem to fit the pattern, at least to me, so I am probably missing something.
What I want to achieve is simply to receive an actual value and a regular expression and check whether the actual value fits the regular expression. I thought that's what I did there, but apparently I didn't.
EDIT: Another question: is there a way to treat the * as the "any char" wildcard? That is, can A* be interpreted as A followed by any characters?
Your code itself is correct; your regular expression isn't.
Based on your comments on other answers, you're after a regular expression which matches any string which starts with A, and you're assuming that '*' means "any characters". '*' in fact means "match the preceding character zero or more times", so the regular expression you've given means "match zero or more 'A' characters". Because zero occurrences are allowed, it finds an empty match at the start of any string, so it will match absolutely anything.
If you're looking for a regular expression that matches the whole string but only if it starts with 'A', the regular expression you're after is ^A.*. The '.' character in a regular expression means "match any character". This regular expression thus means "match the start of the string, followed by an 'A', followed by zero or more other characters" and will thus match the entire string provided it starts with 'A'.
However, you already have the whole string, so this is a little unnecessary - all you really want to do is get an answer to the question "does the string start with an 'A'?". A regular expression that will achieve this is simply '^A'. If it matches, the string started with an 'A'.
Of course, it should be pointed out that you don't need a regular expression to confirm this anyway. If this is genuinely all you want to do (and it's possible you've just put together a simple example, and your real scenario is more complicated), why not just use the StartsWith method?:
bool match = actualValue.StartsWith("A");
The regex matches because A* means "look for 0 or more occurrences of 'A'". It will match any string.
If you meant to look for an arbitrary number of 'A', but at least one, try A+ instead.
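The difference between these patterns can be checked with a small sketch like the one below. The class and method names are made up for the example; the patterns are the ones discussed above.

```csharp
using System.Text.RegularExpressions;

public static class StarDemo
{
    // "A*" allows zero occurrences, so the match always succeeds.
    public static bool StarSucceeds(string input) =>
        Regex.Match(input, "A*").Success;

    // Length of the first "A*" match; 0 means an empty match.
    public static int StarMatchLength(string input) =>
        Regex.Match(input, "A*").Length;

    // "A+" needs at least one 'A' somewhere in the input.
    public static bool HasAtLeastOneA(string input) =>
        Regex.IsMatch(input, "A+");

    // "^A" needs an 'A' at the very start of the input.
    public static bool StartsWithA(string input) =>
        Regex.IsMatch(input, "^A");
}
```

For "G:1", StarSucceeds is true but StarMatchLength is 0, which is why match.Success was misleading; HasAtLeastOneA and StartsWithA are both false for that input.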
Looking at the comments it looks like you're trying to match a lot of strings starting with A.
If they're separated by white space you could find all of them using the following:
bool matched = Regex.IsMatch(actualValue, @"\bA\w+");
This matches Atest, Apple and Ascii in "Atest flkjs Apple Ascii cAse".
If there is only one string you're matching and it starts with A and has no spaces:
bool matched = Regex.IsMatch(actualValue, @"^A\w+$");
This matches "Apple", but not "Apple and orange", as the second string has spaces.
As Chris noted * is not a wildcard in the way you meant with regex searches. You can find some information to get you started with regexes at regex-info.
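If the goal from the question's EDIT is to accept file-glob style patterns where * stands for "any run of characters", one possible approach is to translate the wildcard pattern into a regex first. This is only a sketch under that assumption: WildcardMatcher is a hypothetical helper, not a framework API, and it handles only the * wildcard.

```csharp
using System.Text.RegularExpressions;

public static class WildcardMatcher
{
    // Hypothetical helper: treats '*' in the pattern as "any run of characters"
    // and every other character literally.
    public static bool IsMatch(string input, string wildcardPattern)
    {
        // Regex.Escape turns '*' into "\*"; swap that for ".*" afterwards.
        string regex = "^" + Regex.Escape(wildcardPattern).Replace(@"\*", ".*") + "$";
        return Regex.IsMatch(input, regex, RegexOptions.Singleline);
    }
}
```

With this, WildcardMatcher.IsMatch("Apple", "A*") is true while WildcardMatcher.IsMatch("G:1", "A*") is false, which is the behaviour the question expected from the raw pattern.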
The Regex class takes the regular expression in its constructor.
An example in your case could be:
if (new Regex("A*").IsMatch(actualValue))
    // Do something
If you are unsure about the regex pattern, try it out here

Regex Replace in between

I have been trying really hard to understand regular expressions.
Is there any way I can replace character(s) that are between two strings?
For example
I have
sometextREPLACEsomeothertext
I want to replace REPLACE (which can be anything in real work) ONLY between sometext and someothertext with another string.
Can anyone please help me with this?
EDIT
Suppose, my input string is
sometext_REPLACE_someotherText_something_REPLACE_nothing
I want to replace the REPLACE text between sometext and someotherText,
resulting in the following output:
sometext_THISISREPLACED_someotherText_something_REPLACE_nothing
Thank you
If I understand your question correctly, you might want to use lookahead and lookbehind in your regular expression:
(?<=...) # positive lookbehind
(?=...) # positive lookahead
Thus
(?<=sometext)(\w+?)(?=someothertext)
would match any 'word' of at least 1 character that is preceded by 'sometext' and followed by 'someothertext'.
In C#:
result = Regex.Replace(subject, @"(?<=sometext)(\w+?)(?=someothertext)", "REPLACE");
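Applied to the input from the question's EDIT, the same lookaround idea could be wrapped as follows. BetweenReplacer and its parameters are hypothetical names, and the delimiters are passed in with their underscores, which is an assumption about how the real data is delimited.

```csharp
using System.Text.RegularExpressions;

public static class BetweenReplacer
{
    // Replaces whatever sits between 'left' and 'right', keeping both delimiters.
    public static string ReplaceBetween(
        string input, string left, string right, string replacement)
    {
        // Escape the delimiters so they are matched literally;
        // ".+?" lazily takes the shortest text between them.
        string pattern =
            "(?<=" + Regex.Escape(left) + ").+?(?=" + Regex.Escape(right) + ")";

        // Double any '$' so the replacement text is inserted literally.
        return Regex.Replace(input, pattern, replacement.Replace("$", "$$"));
    }
}
```

Calling BetweenReplacer.ReplaceBetween(input, "sometext_", "_someotherText", "THISISREPLACED") on the EDIT's input changes only the first REPLACE, because the second one is not preceded by sometext_.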
This is the regex to test if the string is valid:
^.*REPLACE.*$
C# replace
string s = "sdfsdfREPLACEdhfsdg";
string v = s.Replace("REPLACE", "SOMETEXT");

Regular Expression with Groups and Values in C#

I am trying to write a simple regex to convert some two digit years to four digit years in a pipe delimited file. I am using:
Regex dateFormat = new Regex(@"\|(\d\d)/(\d\d)/([\d\d)\|");
string convertedString = dateFormat.Replace(contents, @"|$1$220$3|'");
What I want is |10/31/09| to be replaced with |10312009|.
What I am getting is |10$22009|
I think the problem is .NET is evaluating $1 and $3 but is thinking there is a group in the middle with no value ($220 maybe?). How can I let .NET know that the 20 is a constant value instead of part of the group value?
Thanks in advance
Your intuition about the problem is correct: the second backreference is being interpreted as $220, not $2. To fix this, use curly braces:
dateFormat.Replace(contents, @"|$1${2}20$3|");
More info about .NET regular expressions is available here.
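A compact sketch of the two replacement strings side by side. YearExpander and its method names are hypothetical; the pattern assumes the stray "[" in the question was a typo.

```csharp
using System.Text.RegularExpressions;

public static class YearExpander
{
    // Matches |mm/dd/yy| with each two-digit part in its own group.
    private static readonly Regex DateFormat =
        new Regex(@"\|(\d\d)/(\d\d)/(\d\d)\|");

    // "$220" is read as a reference to group 220, which does not exist,
    // so it is emitted as literal text.
    public static string ExpandAmbiguous(string contents) =>
        DateFormat.Replace(contents, "|$1$220$3|");

    // "${2}20" is group 2 followed by the literal text "20".
    public static string Expand(string contents) =>
        DateFormat.Replace(contents, "|$1${2}20$3|");
}
```

For the input |10/31/09|, ExpandAmbiguous reproduces the |10$22009| result from the question, while Expand gives |10312009|.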
Your regex text doesn't parse. Was the "[" supposed to be there? Wrap the group number in {} to fix the replace issue:
Regex dateFormat = new Regex(@"\|(\d\d)/(\d\d)/(\d\d)\|");
string convertedString = dateFormat.Replace(contents, @"|${1}${2}20${3}|");
You can modify your Regex to use named groups instead. The syntax for a named group is (?<name>...). Then, in your Replace function you can use the group names instead of the group numbers.
Regex dateFormat = new Regex(@"\|(?<month>\d\d)/(?<day>\d\d)/(?<year>\d\d)\|");
string convertedString = dateFormat.Replace(contents, @"|${month}${day}20${year}|");
I don't know how to do that directly, but here is my workaround: use named groups.
Regex dateFormat = new Regex(@"\|(?<month>\d\d)/(?<date>\d\d)/(?<year>\d\d)\|");
string convertedString = dateFormat.Replace(contents, @"|${month}${date}20${year}|");
See more info at the bottom of this page.
Hope this helps.
Try this:
string contents = "|10/31/09|";
Regex dateFormat = new Regex(#"\|(?<mm>\d\d)/(?<dd>\d\d)/(?<yy>\d\d)\|");
Console.WriteLine(dateFormat.Replace(contents, "|${mm}${dd}20${yy}|"));
More information:
Call RegexObj.Replace("subject", "replacement") to perform a search-and-replace using the regex on the subject string, replacing all matches with the replacement string. In the replacement string, you can use $& to insert the entire regex match into the replacement text. You can use $1, $2, $3, etc... to insert the text matched between capturing parentheses into the replacement text. Use $$ to insert a single dollar sign into the replacement text. To replace with the first backreference immediately followed by the digit 9, use ${1}9. If you type $19, and there are less than 19 backreferences, the $19 will be interpreted as literal text, and appear in the result string as such. To insert the text from a named capturing group, use ${name}. Improper use of the $ sign may produce an undesirable result string, but will never cause an exception to be raised.
From http://www.regular-expressions.info/dotnet.html
I see problems with your regular expression, namely the unmatched [ character. The following works fine:
\|(?<month>\d{2})/(?<day>\d{2})/(?<year>\d{2})\|
That will group the month, day, and year results. You can then replace with the following string:
|$1/$2/20$3|
