Regex issue I am stuck with - C#

I have to write a regex that matches a pattern like 1-6/2011.
In this case, the numbers before the / cannot be greater than 12.
So I have to select numbers in the range 1-12.
I have written a regex:
^[1-9][0-2]?\s*[-−—]\s*[1-9][0-2]?\s*/\s*2[01][0-9][0-9]$
However, this also matches 20-6/2014.
I tried with a negative look-behind:
^[1-9](?<![2-9])[0-2]?\s*[-−—]\s*[1-9](?<![2-9])[0-2]?\s*/\s*2[01][0-9][0-9]$
With this version, single digits are not matched.

You can use the following update of your regex:
^(?:0?[1-9]|1[0-2])\s*[-−—]\s*(?:0?[1-9]|1[0-2])\s*/\s*\s*2[01][0-9]{2}$
See demo
It will not match 12-30/2014, 12-31/2014, 12-32/2014, 13-31/2014, 20-6/2014.
It will match 1-6/2011 and 02-12/2014.
C#:
var lines = "1-6/2011\r\n02-12/2014\r\n12-30/2014\r\n12-31/2014\r\n12-32/2014\r\n13-31/2014\r\n20-6/2014";
var finds = Regex.Matches(lines, @"^(?:0?[1-9]|1[0-2])\s*[-−—]\s*(?:0?[1-9]|1[0-2])\s*/\s*\s*2[01][0-9]{2}\r?$", RegexOptions.Multiline);
Mind that \r? is only necessary in case we test with Multiline mode on. You can remove it when checking separate values.
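For checking separate values, a minimal sketch could look like the one below. The helper name IsMonthRange and the sample inputs are made up for illustration; the pattern is the one above with \r? and the duplicated \s* dropped, and the code assumes C# 9 or later (top-level statements).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not from the question): validates one value at a time,
// so neither RegexOptions.Multiline nor the trailing \r? is needed.
static bool IsMonthRange(string value) =>
    Regex.IsMatch(value, @"^(?:0?[1-9]|1[0-2])\s*[-−—]\s*(?:0?[1-9]|1[0-2])\s*/\s*2[01][0-9]{2}$");

foreach (var value in new[] { "1-6/2011", "02-12/2014", "20-6/2014", "13-31/2014" })
{
    // Prints True for the first two values and False for the last two.
    Console.WriteLine($"{value}: {IsMonthRange(value)}");
}
```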

So I have to select digits between 1-12
For that you can use the regex:
(?:0?[1-9]|1[0-2])
See demo.
https://www.regex101.com/r/fJ6cR4/23

You can use this regex:
^(?:[1-9]|1[0-2])\s*-\s*(?:[1-9]|1[0-2])\s*/\s*2[01]\d{2}$

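The simplest regex to match 1-12 is 1[0-2]?|[2-9].
On its own it also matches inside 13, because 1[0-2]? can match just the 1, but that doesn't matter in an anchored full regex such as ^(?:1[0-2]?|[2-9])\/\d\d\d\d, where the number must be followed by /. Note the group around the alternation: without it, | would split the whole pattern in two.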


How to eliminate digits followed by specific string

I have quite a long regex pattern. Here is just a part of it:
string pattern = @"((?<!top=)(?<![A-Za-z])\d)+";
Given the string:
date(Account/AccountClose) gt 2019-03-25 and Brg eq '100'&$select=IdAccountCurrent&$skip=10&$top=10
It matches 2019, 03, 25, 100, 10 and 0.
I want to eliminate the last 0 from the matching result. In other words, all numbers that come right after top= should not match.
My solution works only if there is a single digit after top=. How can I achieve the desired result?
UPDATE: Unfortunately, the suggested solutions are not suited to the whole pattern. I tried to keep my example simple, but it looks like that is impossible.
So my whole regex pattern is:
string pattern = @"((?<!top=)(?<![A-Za-z])\d|-|T\d+|:|\.|\+|(?<=\d)Z)+|\bfalse\b|\btrue\b|\bnull\b|'[^']+'|\(['\d][^\)]+\)";
I need to edit this pattern to eliminate all digits right after top=.
my whole example (please see the last row in this example, last 0 should not be matched)
Just add 0-9 to the character class in your second lookbehind, so that a matched digit cannot be preceded by another digit:
((?<!top=)(?<![A-Za-z0-9])\d+)
See here for a demo.
But you can also just use word boundaries:
(?<!top=)\b(\d+)
See here for a demo.
You can change your regex to the following, where I've used \b to reject partial matches inside a number:
(?<!top=)(?<![A-Za-z])\b\d+
The way you wrote your regex, ((?<!top=)(?<![A-Za-z])\d)+ applies the lookbehind conditions to each digit individually and then repeats one or more such digits, which leaves no place for \b. That is why I removed the outer parentheses and used \b\d+ instead. This should give you all your desired matches. Let me know if you face any issues.
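To see the effect on the string from the question, here is a small sketch using the simpler word-boundary variant (?<!top=)\b\d+. It only covers the digits-only part of the pattern, not the full pattern from the update, and assumes C# 9 or later for the top-level statements.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

var query = "date(Account/AccountClose) gt 2019-03-25 and Brg eq '100'&$select=IdAccountCurrent&$skip=10&$top=10";

// (?<!top=) rejects a number that starts right after "top=";
// \b rejects a match that would start in the middle of a number (the stray "0" of "10").
var numbers = Regex.Matches(query, @"(?<!top=)\b\d+")
                   .Cast<Match>()
                   .Select(m => m.Value)
                   .ToList();

Console.WriteLine(string.Join(", ", numbers)); // 2019, 03, 25, 100, 10
```

The 10 in the output comes from $skip=10; the 10 after $top= is skipped entirely.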

Regex for checking numbers in a string

I am looking for help with a regex for checking a string that could contain 10 digits separated by letters or other characters. For example:
call1234567890
1234567890call
12.34_567.890_call
I have tried \D*(\d\D*){10}$ as suggested in other posts, but this seems to match any string that contains digits, even a single digit with characters after it. So
Silly_1_me is also being caught.
You need to include the start anchor ^ so that the regex matches the whole line; otherwise it can match just a part of the string.
@"^\D*(\d\D*){10}$"
For multiline input, it's better to use the regex below.
@"^[^\n\d]*(\d[^\n\d]*){10}$"
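As a quick check, the anchored pattern ^\D*(\d\D*){10}$ can be run against the strings from the question. This is only a sketch: the 11-digit input is an extra case added for illustration, and the code assumes C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

// Anchored at both ends: the whole string must contain exactly ten digits,
// with any number of non-digit characters around and between them.
var tenDigits = new Regex(@"^\D*(\d\D*){10}$");

var samples = new[]
{
    "call1234567890",      // ten digits after text
    "1234567890call",      // ten digits before text
    "12.34_567.890_call",  // ten digits with separators
    "Silly_1_me",          // only one digit
    "12345678901"          // eleven digits (illustrative extra case)
};

foreach (var s in samples)
{
    // Prints True for the first three samples and False for the last two.
    Console.WriteLine($"{s}: {tenDigits.IsMatch(s)}");
}
```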
^(?!(?:.*\d){11,})(?:.*\d){10}[^\d]*$
Try this. See demo.
http://regex101.com/r/hQ9xT1/21

Regular expression - second group of digits

Hi, I would like to pull the second group of digits, which comes after the hyphen (-), from the string below:
D:\data\home\Logs_Audit\VO12_LAB_20140617-000301.txt
I used \d{8} to pull 20140617, but now I want to pull 000301.
EDIT 1:
Now I would like to pull VO12_LAB from the above string. Could you please help me?
I am not good at regular expressions and I couldn't find a good tutorial to understand them.
EDIT 2:
I found that something like
\w{2,3}\d{2,3}_\w{2,3}
works for me. Do you think it is accurate enough?
You can use lookahead/lookbehind to find the group based on "anchors", like this:
(?<=[-])\\d+(?=[.]txt)
The groups before and after the \\d+ are non-capturing zero-width "markers", in the sense that they do not consume any characters from the string, only describe character combinations that need to precede and/or follow the text that you would like to match.
You can use a Positive Lookahead for this.
\d+(?=\.)
Explanation: This matches digits (1 or more times) followed by a dot.
\d+ digits (0-9) (1 or more times)
(?= look ahead to see if there is:
\. '.'
) end of look-ahead
Final Solution:
String s = @"D:\data\home\Logs\V_LAB_20140617-000301.txt";
Match m = Regex.Match(s, @"\d+(?=\.)");
if (m.Success) {
Console.WriteLine(m.Value); //=> "000301"
}
You can use this regex:
(?<=-)(\d+)
The first group will contain the digits.
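If all three parts of the file name are needed (including VO12_LAB from the edit), one alternative to lookarounds is named capturing groups. This is only a sketch: the group names and the assumed layout (name, underscore, 8-digit date, hyphen, 6-digit time, .txt) are my own reading of the single sample path, and the code assumes C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

var path = @"D:\data\home\Logs_Audit\VO12_LAB_20140617-000301.txt";

// Assumed layout: <name>_<8-digit date>-<6-digit time>.txt at the end of the path.
// [^\\]+ keeps the name inside the last path segment.
var m = Regex.Match(path, @"(?<name>[^\\]+)_(?<date>\d{8})-(?<time>\d{6})\.txt$");

if (m.Success)
{
    Console.WriteLine(m.Groups["name"].Value); // VO12_LAB
    Console.WriteLine(m.Groups["date"].Value); // 20140617
    Console.WriteLine(m.Groups["time"].Value); // 000301
}
```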

Regex Substring or Left Equivalent

Greetings beloved comrades.
I cannot figure out how to accomplish the following via a regex.
I need to take this format number 201101234 and transform it to 11-0123401, where digits 3 and 4 become the digits to the left of the dash, and the remaining five digits are inserted to the right of the dash, followed by a hardcoded 01.
I've tried http://gskinner.com/RegExr, but the syntax just defeats me.
This answer, Equivalent of Substring as a RegularExpression, sounds promising, but I can't get it to parse correctly.
I can create a SQL function to accomplish this, but I'd rather not hammer my server in order to reformat some strings.
Thanks in advance.
You can try this:
var input = "201101234";
var output = Regex.Replace(input, @"^\d{2}(\d{2})(\d{5})$", "${1}-${2}01");
Console.WriteLine(output); // 11-0123401
This will match:
two digits, followed by
two digits captured as group 1, followed by
five digits captured as group 2
And return a string which replaces that matched text with
group 1, followed by
a literal hyphen, followed by
group 2, followed by
a literal 01.
The start and end anchors ( ^ / $ ) ensure that if the input string does not exactly match this pattern, it will simply return the original string.
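A short sketch of that behaviour, with the same pattern wrapped in a helper. The name Reformat and the two non-matching inputs are made up for illustration, and the code assumes C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper name; the pattern and replacement are the ones above.
static string Reformat(string input) =>
    Regex.Replace(input, @"^\d{2}(\d{2})(\d{5})$", "${1}-${2}01");

Console.WriteLine(Reformat("201101234"));  // 11-0123401
Console.WriteLine(Reformat("2011"));       // 2011 (too short, returned unchanged)
Console.WriteLine(Reformat("2011-01234")); // 2011-01234 (not nine digits, returned unchanged)
```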
If you can use custom C# scripts, you may want to use Substring instead:
string newStr = string.Format("{0}-{1}01", old.Substring(2,2), old.Substring(4));
I don't think you really need a regex here. Substring would be better. But still if you want regex only, you can use this:
string newString = Regex.Replace(input, @"^\d{2}(\d{2})(\d+)$", "$1-${2}01");
Explanation:
^\d{2} // Match first 2 digits. Will be ignored
(\d{2}) // Match next 2 digits. Capture it in group 1
(\d+)$ // Match rest of the digits. Capture it in group 2
Now, the required digits, are in group 1 and 2, which you use in the replacement string.
Do you even SQL? Pull some levers and stuff.

regex for capturing digits and digit ranges

I have the following string:
Fat mass loss was 2121,323.222 greater for GPLC (2–2.4kg vs. 0.5kg)
I want to capture:
2121,323.222
2–2.4
0.5
That is, I want the above three results from the string.
Can anyone help me with this regex?
I noticed that the hyphen in 2–2.4kg is not really a hyphen; it is Unicode U+2013 (EN DASH).
So, here is another regex in C#:
@"[0-9]+([,.\u2013-][0-9]+)*"
Test
MatchCollection matches = Regex.Matches("Fat mass loss was 2121,323.222 greater for GPLC (2–2.4kg vs. 0.5kg)", @"[0-9]+([,.\u2013-][0-9]+)*");
foreach (Match m in matches) {
Console.WriteLine(m.Groups[0]);
}
Here are the results. My console does not support printing U+2013, so it shows as "?", but it is matched properly.
2121,323.222
2?2.4
0.5
Okay I didn't notice the C# tag until now. I will leave the answer but I know that's not what you expected, see if you can do something with it. Perhaps the title should have mentioned the programming language?
Sure:
Fat mass loss was (.*) greater for GPLC \((.*) vs. (.*)kg\)
Find your substrings in \1, \2 and \3.
If for Emacs, swap all parentheses and escaped parentheses.
How about something like this:
^.*((?:\d+,)*\d+(?:\.\d+)?).*(\d+(?:\.\d+)?(?:-\d+(?:\.\d+))?).*(\d+(?:\.\d+)).*$
A little more general, I think. I'm a little concerned about .* being greedy.
Fat mass loss was 2121,323.222 greater
for GPLC (2–2.4kg vs. 0.5kg)
a generalized extractor:
/\D+?([\d\,\.\-]+)/g
explanation:
/ # start pattern
\D+? # 1 or more non-digits, as few as possible
( # capture group 1
[\d,.-]+ # character class, 1 or more of digits, comma, period, hyphen
) # end capture group 1
/g # trailing regex g modifier (make regex continue after last match)
Sorry, I don't know C# well enough for a full writeup, but the pattern should plug right in.
see: http://www.radsoftware.com.au/articles/regexsyntaxadvanced.aspx for some implementation examples.
I came out with something like this atrocity:
-?\d(?:,?\d)*(?:\.(?:\d(?:,?\d)*\d|\d))?(?:[–-]-?\d(?:,?\d)*(?:\.(?:\d(?:,?\d)*\d|\d))?)?
Out of which -?\d(?:,?\d)*(?:\.(?:\d(?:,?\d)*\d|\d))? is repeated twice, with – in the middle (note that this is a long hyphen).
This should take care of dots and commas outside of numbers, eg: hello,23,45.2-7world - will capture 23,45.2-7.
It looks like you're trying to find all numbers in the string (possibly with commas inside the number), and all ranges of numbers such as "2-2.4". Here is a regex that should work:
\d+(?:[,.-]\d+)*
From C# 3, you can use it like this:
var input = "Fat mass loss was 2121,323.222 greater for GPLC (2-2.4kg vs. 0.5kg)";
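var pattern = @"\d+(?:[,.-]\d+)*";
var matches = Regex.Matches(input, pattern);
// Declare the loop variable as Match: on .NET Framework, MatchCollection is
// a non-generic collection, so "var" would infer object and .Value would not compile.
foreach (Match match in matches)
    Console.WriteLine(match.Value);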
Hmm, this is a tricky question, especially because the input string contains the Unicode character – (EN DASH) instead of - (HYPHEN-MINUS). Therefore the correct regex to match the numbers in the original string would be:
\d+(?:[\u2013,.]\d+)*
A more generic approach would be:
\d+(?:[\p{Pd}\p{Pc}\p{Po}]\d+)*
which matches dash punctuation, connector punctuation and other punctuation. See here for more information about those.
An implementation in C# would look like this:
string input = "Fat mass loss was 2121,323.222 greater for GPLC (2–2.4kg vs. 0.5kg)";
try {
Regex rx = new Regex(@"\d+(?:[\p{Pd}\p{Pc}\p{Po}\p{C}]\d+)*", RegexOptions.IgnoreCase | RegexOptions.Multiline);
Match match = rx.Match(input);
while (match.Success) {
// matched text: match.Value
// match start: match.Index
// match length: match.Length
match = match.NextMatch();
}
} catch (ArgumentException ex) {
// Syntax error in the regular expression
}
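To see which Unicode category each separator falls into, and what the category-based pattern (without \p{C}) picks up, something like the following sketch can help. The list of sample characters is my own choice, and the code assumes C# 9 or later.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Print the Unicode general category of a few separator characters.
// '-' and U+2013 are DashPunctuation, ',' and '.' are OtherPunctuation,
// '_' is ConnectorPunctuation.
foreach (var c in new[] { '-', '\u2013', ',', '.', '_' })
{
    Console.WriteLine($"U+{(int)c:X4}: {char.GetUnicodeCategory(c)}");
}

// \u2013 is the EN DASH from the original question.
var input = "Fat mass loss was 2121,323.222 greater for GPLC (2\u20132.4kg vs. 0.5kg)";
var values = Regex.Matches(input, @"\d+(?:[\p{Pd}\p{Pc}\p{Po}]\d+)*")
                  .Cast<Match>()
                  .Select(m => m.Value)
                  .ToArray();

Console.WriteLine(values.Length); // 3
```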
Let's try this one:
(?=\d)([0-9,.-]+)(?<=\d)
It captures all expressions that:
contain only "[0-9,.-]" characters,
start with a digit "(?=\d)",
end with a digit "(?<=\d)".
It works with a single digit expression and does not include beginning or trailing [.,-].
Hope this helps.
I got the solution to my problem.
The following is the Regex that gave my desired result:
(([0-9]+)([–.,-]*))+
