Need value in regex.match group - c#

I want a regex that captures the sign (+ or -) in one group and the figure in another group. The figure may also come without any sign.
Example
[-] 87.90
[+] 87.78
(-) 87.90
(+) 87.78
89
-89.56
- 89.98
I have used below regular expression
^\W*(\-|\+|)\W*(\d+(\.\d+)?)
With this, group 1 comes back empty.
If I use
^\W*(\-|\+)\W*(\d+(\.\d+)?)
then 3rd figure will not match. So in short I want to match figure with (+ or -) or without any sign.

Group 1 is empty because the \W* greedily matches all non-word characters, that is, all parentheses and signs.
You should specify the literal parentheses in the pattern and a character class will be a more natural construct to match either a + or a -:
(?:\(?([-+])\)?)?\p{Zs}*(\d+(\.\d+)?)
See regex demo (if you need a full string match, use ^ at the start and $ at the end of the pattern).
Regex matches:
(?:\(?([-+])\)?)? - an optional non-capturing group ((?:...)) that matches a ( optionally, followed by a plus or minus (Group 1), and then by an optional )
\p{Zs}* - zero or more Unicode space separators (horizontal spaces such as a regular space; tabs and line breaks are not included)
(\d+(\.\d+)?) - (Group 2) one or more digits followed by an optional capturing group (Group 3) that matches a period followed by one or more digits.
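A minimal sketch of how the pattern could be used from C#. Two things here are assumptions rather than part of the answer: the character classes [(\[]? and [)\]]? extend the optional parentheses so that the square-bracket samples from the question are covered too, and the fractional part is made non-capturing so only the sign and the figure are captured. The class and method names are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SignedNumberDemo
{
    // Assumption: [(\[]? and [)\]]? replace the answer's \(? and \)? so that
    // "[-] 87.90" is handled as well as "(-) 87.90".
    private static readonly Regex SignedNumber = new Regex(
        @"^(?:[(\[]?([-+])[)\]]?)?\p{Zs}*(\d+(?:\.\d+)?)$");

    // Returns false when the input does not match; sign is "" when no sign is present.
    public static bool TryParse(string input, out string sign, out string figure)
    {
        Match m = SignedNumber.Match(input);
        sign = m.Success ? m.Groups[1].Value : "";
        figure = m.Success ? m.Groups[2].Value : "";
        return m.Success;
    }

    public static void Main()
    {
        string[] samples = { "[-] 87.90", "[+] 87.78", "(-) 87.90", "(+) 87.78", "89", "-89.56", "- 89.98" };
        foreach (string s in samples)
        {
            bool ok = TryParse(s, out string sign, out string figure);
            Console.WriteLine($"{s,-10} matched={ok} sign='{sign}' figure='{figure}'");
        }
    }
}
```

When the sign is absent, Groups[1].Success is false and its Value is an empty string, so the caller can treat an empty sign as "unsigned".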

Related

Find regex pattern match string have multiple condition?

I have some strings formatted as follows:
1=case1,case2,..caseN;2=case1,..,caseN;3=case1, ..,caseN
Note: a semicolon ";" separates the groups and a comma "," separates the cases within a group. case1, case2 and so on can be anything (strings, numbers); their type doesn't matter.
I want to find regex pattern to match string
1=home,house;2=abc;3=2019,2021
however, it will not match the following:
1=home,;2=abc;3=2019,2021 (Excess comma mark at case 1)
1=;2=abc,2012;3= (must 1=..; not 1=;)
1=home,age;2 (must 2=.. not 2)
2=home;;3=sea (must ;3 not ;;3)
4=flower;k3=sea (must 3= , not k3)
I tried the pattern (\d+={1}[^;]+;). However, it still matches even when the rest of the string is not valid.
Please show me the way.
Many thanks!
Maybe this pattern helps you out:
^\b(?:(?:^|;)\d+=[^,;]+(?:,[^,;]+)*)+$
See the Online Demo
^ - Start string anchor.
\b - Word-boundary.
(?: - Opening 1st non-capture group.
(?:- Opening 2nd non-capture group.
^|; - Alternation between start string anchor or semi-colon.
) - Closing 2nd non-capture group.
\d+= - One or more digits followed by a =.
[^,;]+ - Negated character class, any character other than comma or semicolon one or more times.
(?: - Opening 3rd non-capture group.
, - A comma.
[^,;]+ - Negated character class, any character other than comma or semicolon one or more times.
)* - Close 3rd non-capture group and make it match zero or more times.
)+ - Close 1st non-capture group and make sure it matches one or more times.
$ - End string anchor.
Note: I went with a negated character class since you mentioned "case1, case2 are anything like strings, number doesn't matter their type"; therefore I read that there can be spaces, special characters or any character other than comma and semicolon.
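As a quick check, the pattern above can be run against the samples from the question. This is only a sketch; the class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaseListValidator
{
    // The pattern from the answer, unchanged.
    private static readonly Regex Pattern = new Regex(
        @"^\b(?:(?:^|;)\d+=[^,;]+(?:,[^,;]+)*)+$");

    public static bool IsValid(string input) => Pattern.IsMatch(input);

    public static void Main()
    {
        string[] samples =
        {
            "1=home,house;2=abc;3=2019,2021", // expected to be accepted
            "1=home,;2=abc;3=2019,2021",      // trailing comma in group 1
            "1=;2=abc,2012;3=",               // empty groups
            "1=home,age;2",                   // missing "="
            "2=home;;3=sea",                  // doubled semicolon
            "4=flower;k3=sea"                 // non-digit key
        };
        foreach (string s in samples)
            Console.WriteLine($"{IsValid(s),-5} {s}");
    }
}
```

Only the first sample is accepted; each of the others fails at the point noted in its comment.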
This works on regex101
^(?:\d=(?:\w{1,},)*(?:\w{1,});)*(?:\d=(?:\w{1,},)*\w{1,})$
^(?:\d+=[a-z\d]+(?:,[a-z\d]+)*(?:;|$))+$
Demo
^ : match beginning of string
(?: : begin nc group
\d+=[a-z\d]+ : match 1+ digits, then '=' then 1+ lc letters or digits
(?:,[a-z\d]+) : match ',' then 1+ lc letters or digits in nc group
* : execute nc group 0+ times
(?:;|$) : match ';' or end of string
)+ : end nc group and execute 1+ times
$ : match end of string
I don't know if C# supports recursive patterns, but, if it does, use:
^(\d+=\w+(?:,\w+)*)(?:;(?1))*$
if it doesn't:
^\d+=\w+(?:,\w+)*(?:;\d+=\w+(?:,\w+)*)*$
Demo & explanation
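For what it's worth, the .NET regex engine does not support subroutine calls such as (?1), so the second form is the one to use from C#. A sketch of one way to avoid typing the repeated part twice is to build the pattern from a string variable; the names below are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ComposedPattern
{
    // One "key=case,case,..." item, written once and reused.
    private const string Item = @"\d+=\w+(?:,\w+)*";

    // Expands to ^\d+=\w+(?:,\w+)*(?:;\d+=\w+(?:,\w+)*)*$
    private static readonly Regex Pattern = new Regex("^" + Item + "(?:;" + Item + ")*$");

    public static bool IsValid(string input) => Pattern.IsMatch(input);

    public static void Main()
    {
        Console.WriteLine(IsValid("1=home,house;2=abc;3=2019,2021"));
        Console.WriteLine(IsValid("1=home,;2=abc;3=2019,2021"));
    }
}
```

Note that \w restricts cases to word characters, which is narrower than the negated character class used in the first answer.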

C# Regular Expression Help - grouping with optional data points

Given 2 different lines I'm parsing, I need to extract the data points into regex match groups.
Example Line 1:
Header values are as follows:
DATE{space}TYPE{space}DESCR{space}VOLUME{space}RATE{space}TOTAL
[11/30/15] [CF] [DISC 1] [28270.18] [0.00150] [-42.41]
Example Line 2:
DATE{space}TYPE{space}DESCR{space}VOLUME{space}RATE{space}TOTAL
[11/30/15] [CF] [OTHER VOLUME FEES] [28186.68] [0.00008] [-2.25]
I'm using the following regex to get matches:
(?<date>^\d{1,2}[-/.]\d{1,2}[-/.]\d{1,2}[\d+])\s+(?<type>[A-Za-z]{2})\s+(?<descr>\w+\s+.*?(1))\s+.*?(?<volume>(\d+(?:\.\d+?))\s+.*?(?<rate>([0]?(\d+(?:\.\d+)?)))\s+(?<total>[-+]?\d+[.,]\d+)?.*$")
I can match the first case, but never the second case. There will always be a total, but there may NOT always be a volume or rate. In addition, volume can be whole, decimal or code (e.g. "1B").
What am I missing here?
The description field is an open field and may contain "1" in it. It can contain several words, or just one.
Your log lines contain 6 fields, but the 4th and 5th can be missing. A common way to match optional fields is an optional non-capturing group, (?:...)?. Such groups do not keep a separate capture for the text they match, which keeps the match results cleaner and the matching more efficient.
NOTE that in .NET, there is a way to make all non-named capturing groups non-capturing by use of RegexOptions.ExplicitCapture option.
Your fixed regex may look like
^(?<date>\d{1,2}[-/.]\d{1,2}[-/.]\d{1,2})\s+(?:(?<type>[A-Z]{2})\s+)?(?:(?<descr>\w.*?)\s+)?(?:(?<volume>\d*\.?\d+)\s+)?(?:(?<rate>\d*\.?\d+)\s+)?(?<total>[-+]?\d*[.,]?\d+)\s*$
See the .NET regex demo.
Details
^ - start of a line (when RegexOptions.Multiline is used)
(?<date>\d{1,2}[-/.]\d{1,2}[-/.]\d{1,2}) - Group "date": 1-2 digits and then 2 repetitions of -, / or . followed by 1-2 digits (thus, this pattern can be written as (?<date>\d{1,2}(?:[-/.]\d{1,2}){2})).
\s+ - 1 or more whitespaces
(?:(?<type>[A-Z]{2})\s+)? - an optional group matching 2 uppercase ASCII letters, captured into Group "type", and then 1+ whitespaces
(?:(?<descr>\w.*?)\s+)? - an optional group matching a word char (a letter, digit or _, and some other special chars such as diacritics) followed by any 0+ chars other than a newline char (LF), as few as possible, all of this captured into Group "descr", and then 1+ whitespaces
(?:(?<volume>\d*\.?\d+)\s+)? - an optional group matching 0+ digits, an optional . and then 1+ digits (that is, floats or integers) captured into Group "volume", then 1+ whitespace chars
(?:(?<rate>\d*\.?\d+)\s+)? - an optional group matching a float or integer values captured into Group "rate", and then 1+ whitespace chars
(?<total>[-+]?\d*[.,]?\d+) - Group "total": an optional - or + followed with 0+ digits, an optional . or , and then 1+ digits (so, positive or negative floats or integers are matched)
\s* - any 0+ trailing whitespaces
$ - end of the line.
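The named groups can then be read back in C# roughly as follows. This sketch assumes the square brackets in the question only mark field boundaries and are not part of the real lines; the third sample line (with volume and rate missing) and the class name are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FeeLineParser
{
    // The fixed regex from above, split across two literals for readability.
    private static readonly Regex Line = new Regex(
        @"^(?<date>\d{1,2}[-/.]\d{1,2}[-/.]\d{1,2})\s+(?:(?<type>[A-Z]{2})\s+)?(?:(?<descr>\w.*?)\s+)?" +
        @"(?:(?<volume>\d*\.?\d+)\s+)?(?:(?<rate>\d*\.?\d+)\s+)?(?<total>[-+]?\d*[.,]?\d+)\s*$");

    public static Match Parse(string line) => Line.Match(line);

    // An optional group that did not take part in the match reports Success == false.
    private static string ValueOrNone(Group g) => g.Success ? g.Value : "(none)";

    public static void Main()
    {
        string[] lines =
        {
            "11/30/15 CF DISC 1 28270.18 0.00150 -42.41",
            "11/30/15 CF OTHER VOLUME FEES 28186.68 0.00008 -2.25",
            "11/30/15 CF MONTHLY FEE -2.25" // hypothetical line without volume and rate
        };
        foreach (string line in lines)
        {
            Match m = Parse(line);
            Console.WriteLine(
                $"descr='{ValueOrNone(m.Groups["descr"])}' volume={ValueOrNone(m.Groups["volume"])} " +
                $"rate={ValueOrNone(m.Groups["rate"])} total={ValueOrNone(m.Groups["total"])}");
        }
    }
}
```

Because the description is matched lazily, it grows only as far as needed for the remaining fields to fit, which is why "DISC 1" keeps its trailing 1 when volume, rate and total are all present.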
(?<date>^\d{1,2}[-/.]\d{1,2}[-/.]\d{1,2}[\d+])\s+(?<type>[A-Z]{2})\s+(?<descr>\w+.*?\s+)(?<volume>\d+[.]?\d+)\s+(?<rate>\d+[.]?\d+)\s+(?<total>[-+]?\d+[.,]\d+?.*$)
Yes. This is a fairly complex regex. But if you have varying spaces inside your grouping, you can use .*?\s+ to end on the last space. This seems to work nicely for all the use cases I have.
Thanks for your comments!

Regex to insert and replace characters in a string C#

I have a string which looks like this :-
"$.ConfigSettings.DatabaseSettings.DatabaseConnections.SqlConnectionString.0.Id"
and I want the result to look like this :-
"$.ConfigSettings.DatabaseSettings.DatabaseConnections.SqlConnectionString[0].Id"
Basically, wherever there is a single digit preceded and followed by a period, I need to change it to [digit] followed by a period, i.e. "[digit].". I have seen tons of examples where people only replace the whole matched string.
How can I do this using Regex.Replace in C#?
Regex.Replace(input, @"\.(\d)(?=\.)", "[$1]")
\. - match a literal "."
(\d) - then a single digit in a capturing group ($1 in the replacement)
(?= - start a positive lookahead
\. - that matches a "."
) - end the lookahead
So, it means : (match a dot followed by a digit in a capturing group) only if it is followed by a dot
So we matched ".0" and captured "0". We replace the entire match with "[$1]", where $1 refers to the first captured group.
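Put together as a small runnable program, the replacement might look like this (the class and method names are only illustrative):

```csharp
using System;
using System.Text.RegularExpressions;

public static class IndexerRewriter
{
    // Replaces ".<digit>" with "[<digit>]" when another "." follows.
    // The lookahead does not consume that dot, so it stays in the output.
    public static string Rewrite(string path) =>
        Regex.Replace(path, @"\.(\d)(?=\.)", "[$1]");

    public static void Main()
    {
        string input = "$.ConfigSettings.DatabaseSettings.DatabaseConnections.SqlConnectionString.0.Id";
        Console.WriteLine(Rewrite(input));
        // prints: $.ConfigSettings.DatabaseSettings.DatabaseConnections.SqlConnectionString[0].Id
    }
}
```

As written, the pattern only handles single-digit indexes; a segment such as ".10." is left untouched because the lookahead expects a dot right after the first digit.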
See "Grouping Constructs in Regular Expressions" : https://msdn.microsoft.com/en-us/library/bs2twtah(v=vs.110).aspx for information about the different grouping constructs that I use in this solution.

Cannot match parentheses in regex group

This is a regular expression, evaluated in .NET
I have the following input:
${guid->newguid()}
And I want to produce two matching groups, a character sequence after the ${ and before }, which are split by -> :
guid
newguid()
The pattern I am using is the following:
([^(?<=\${)(.*?)(?=})->]+)
But this doesn't match the parentheses, I am getting only the following matches:
guid
newguid
How can I modify the regex so I get the desired groups?
Your regex - ([^(?<=\${)(.*?)(?=})->]+) - matches 1+ characters other than those defined in the negated character class (that is, 1 or more chars other than (, ?, <, etc).
I suggest using a matching regex like this:
\${([^}]*?)->([^}]*)}
See the regex demo
The results you need are in match.Groups[1] and match.Groups[2].
Pattern details:
\${ - match ${ literal character sequence
([^}]*?) - Group 1 capturing 0+ chars other than } as few as possible
-> - a literal char sequence ->
([^}]*) - Group 2 capturing 0+ chars other than } as many as possible
} - a literal }.
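A short sketch of reading the two groups in C#. The braces are escaped here purely for readability, which should be equivalent to the pattern above; the class and parameter names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PlaceholderParser
{
    // Same pattern as above, with the literal braces escaped for clarity.
    private static readonly Regex Placeholder = new Regex(@"\$\{([^}]*?)->([^}]*)\}");

    // Splits "${left->right}" into its two parts; returns false when there is no match.
    public static bool TryParse(string input, out string left, out string right)
    {
        Match match = Placeholder.Match(input);
        left = match.Success ? match.Groups[1].Value : "";
        right = match.Success ? match.Groups[2].Value : "";
        return match.Success;
    }

    public static void Main()
    {
        if (TryParse("${guid->newguid()}", out string left, out string right))
        {
            Console.WriteLine(left);  // guid
            Console.WriteLine(right); // newguid()
        }
    }
}
```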
If you know that you only have word chars inside, you may simplify the regex to a mere
\${(\w+)->(\w+\(\))}
See the regex demo. However, it is much less generic.
Is your input structure always ${identifier->identifier()}? If so, you can use ^\$\{([^-]+)->([^}]+)\}$.
Otherwise, you can modify your regex to ([^?<=\${.*??=}\->]+): with it you should match the input and get the desired values, guid and newguid(). The key change is escaping the - char: left unescaped, it is interpreted as a range operator inside the character class. The parentheses are also dropped from the class, because any ( or ) listed inside [^...(...)...] is excluded from the match.
I hope that helps!
EDIT: testing with https://regex101.com helped me a lot; it showed me that - was being interpreted as a range operator.

Regex to clean repetitions of characters

I have a pattern in the string like this:
T T and I want to turn it into T
It can be any character from [a-z].
I have tried this Regex Example but was not able to replace it.
EDIT
For example, if I have A Aa ar r then it should become Aar, i.e. each repeated pair is replaced by a single character; it does not matter whether the 1st or the 2nd one is kept.
You can use the backreferences for this.
/([a-z])\s*\1\s?/gi
Example
Some more explanation:
( begin matching group 1
[a-z] match any character from a to z
) end matching group 1
\s* match any amount of space characters
\1 match the result of matching group 1 exactly as it was again; this allows for the repetition
\s? match none or one space character; this will allow to remove multiple spaces when replacing
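The pattern above is written as a JavaScript-style literal. A rough C# equivalent, assuming the replacement is the captured character ($1) and that the i flag maps to RegexOptions.IgnoreCase (Regex.Replace already replaces every match, so no g flag is needed), might look like this; the class and method names are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RepeatCollapser
{
    // Collapses "x x" or "xx" (ignoring case) into the first occurrence of x.
    public static string Collapse(string input) =>
        Regex.Replace(input, @"([a-z])\s*\1\s?", "$1", RegexOptions.IgnoreCase);

    public static void Main()
    {
        Console.WriteLine(Collapse("T T"));       // T
        Console.WriteLine(Collapse("A Aa ar r")); // Aar
    }
}
```

Be aware that the same pattern also collapses doubled letters inside ordinary words, since \s* may match nothing.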
