I have some strings formatted as follows:
1=case1,case2,..caseN;2=case1,..,caseN;3=case1, ..,caseN
Note: a comma "," separates the cases within a group and a semicolon ";" separates the groups; case1, case2 can be anything (strings, numbers), their type doesn't matter.
I want to find a regex pattern that matches a string such as
1=home,house;2=abc;3=2019,2021
However, it should not match the following:
1=home,;2=abc;3=2019,2021 (Excess comma mark at case 1)
1=;2=abc,2012;3= (must 1=..; not 1=;)
1=home,age;2 (must 2=.. not 2)
2=home;;3=sea (must ;3 not ;;3)
4=flower;k3=sea (must 3= , not k3)
I tried the pattern (\d+={1}[^;]+;). However, it still matches even when the rest of the string is invalid.
Please show me the way.
Many thanks!
Maybe this pattern helps you out:
^\b(?:(?:^|;)\d+=[^,;]+(?:,[^,;]+)*)+$
See the Online Demo
^ - Start string anchor.
\b - Word-boundary.
(?: - Opening 1st non-capture group.
(?:- Opening 2nd non-capture group.
^|; - Alternation between start string anchor or semicolon.
) - Closing 2nd non-capture group.
\d+= - One or more digits followed by a =.
[^,;]+ - Negated character class, any character other than comma or semicolon one or more times.
(?: - Opening 3rd non-capture group.
, - A comma.
[^,;]+ - Negated character class, any character other than comma or semicolon one or more times.
)* - Close 3rd non-capture group and make it match zero or more times.
)+ - Close 1st non-capture group and make it match one or more times.
$ - End string anchor.
Note: I went with a negated character class since you mentioned "case1, case2 are anything like strings, number doesn't matter their type". I therefore read it as allowing spaces, special characters, or anything else other than a comma or semicolon.
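To check the pattern above against the samples from the question, here is a minimal sketch using Python's re module (an assumption on my part, since the question does not fix a language; the helper name is_valid is hypothetical). The pattern itself is copied unchanged.

```python
import re

# Pattern from the answer above, copied unchanged.
PATTERN = re.compile(r"^\b(?:(?:^|;)\d+=[^,;]+(?:,[^,;]+)*)+$")

def is_valid(s: str) -> bool:
    """Return True if the whole string is a ;-separated list of N=case,case groups."""
    return PATTERN.search(s) is not None

samples = [
    "1=home,house;2=abc;3=2019,2021",  # expected to match
    "1=home,;2=abc;3=2019,2021",       # trailing comma in group 1
    "1=;2=abc,2012;3=",                # empty case list
    "1=home,age;2",                    # group 2 has no '='
    "2=home;;3=sea",                   # doubled semicolon
    "4=flower;k3=sea",                 # non-digit key
]

for s in samples:
    print(f"{s!r:40} -> {is_valid(s)}")
```

Only the first sample is accepted; each of the other five fails at the point named in its comment.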
This works on regex101
^(?:\d=(?:\w{1,},)*(?:\w{1,});)*(?:\d=(?:\w{1,},)*\w{1,})$
^(?:\d+=[a-z\d]+(?:,[a-z\d]+)*(?:;|$))+$
Demo
^ : match beginning of string
(?: : begin nc group
\d+=[a-z\d]+ : match 1+ digits, then '=' then 1+ lc letters or digits
(?:,[a-z\d]+) : match ',' then 1+ lc letters or digits in nc group
* : execute nc group 0+ times
(?:;|$) : match ';' or end of string
)+ : end nc group and execute 1+ times
$ : match end of string
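One behaviour of this variant worth checking before using it: because each group may end in either ';' or end of string, a trailing semicolon is accepted, and uppercase letters are rejected unless a case-insensitive flag is set. A small sketch in Python's re (my choice of engine; the helper name matches is hypothetical):

```python
import re

# Pattern from the answer above, copied unchanged.
PATTERN = re.compile(r"^(?:\d+=[a-z\d]+(?:,[a-z\d]+)*(?:;|$))+$")

def matches(s: str) -> bool:
    return PATTERN.search(s) is not None

print(matches("1=home,house;2=abc;3=2019,2021"))  # well-formed input
print(matches("1=home,;2=abc"))                   # empty case after a comma
print(matches("1=home;"))                         # trailing semicolon
print(matches("1=Home"))                          # uppercase letter, no IGNORECASE
```

If a trailing semicolon should be rejected, the variants that put the ';' in front of every group except the first avoid the issue.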
I don't know if C# supports recursive patterns, but if it does, use:
^(\d+=\w+(?:,\w+)*)(?:;(?1))*$
if it doesn't:
^\d+=\w+(?:,\w+)*(?:;\d+=\w+(?:,\w+)*)*$
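In an engine without (?1) subroutine calls (Python's built-in re module is one), the repetition in the second pattern can be kept readable by building it from a single sub-pattern string. A minimal sketch; the names ITEM and PATTERN are mine:

```python
import re

# One "N=case,case,..." group, written once.
ITEM = r"\d+=\w+(?:,\w+)*"

# First group, then zero or more ";group" repetitions.
PATTERN = re.compile(rf"^{ITEM}(?:;{ITEM})*$")

print(PATTERN.pattern)
print(bool(PATTERN.search("1=home,house;2=abc;3=2019,2021")))
print(bool(PATTERN.search("1=home,age;2")))
```

The composed string is character-for-character the non-recursive pattern above, so it behaves identically; only the source is shorter.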
Demo & explanation
I have following regular expression
(\w)+(,(\w)+)*
which matches comma-separated letters and numbers only, for example
test123,test3,test9
I also want to allow symbols like @, #, $ wherever \w is allowed.
When I tried [(\w)$#] it did not work.
I need to use it in a DevExpress TextEdit Mask, and it reports a syntax error:
http://prntscr.com/pbyq7p
There is a reply at the bottom of this page which mentions that special characters cannot be used within [].
The available characters are listed on Mask Type: Extended Regular Expressions
The advice is to use grouping with an alternation to separate the character class and the special character.
You might try
(\w+|[@#$]+)+(,(\w+|[@#$]+))+
In parts
( Group 1
\w+ Match 1+ word chars
| Or
[@#$]+ Match 1+ times any of the listed characters
)+ Close group and repeat 1+ times
( Group 2
, Match literally
(\w+|[@#$]+) Same alternation as in group 1
)+ Close group and repeat the whole group starting with , 1+ times
Regex demo
If your data consists only of the characters a-z, digits, and those symbols, you could also try
([a-z0-9@#$]+)+(,([a-z0-9@#$]+))+
Regex demo
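To get a feel for what the first suggestion accepts, here is a rough check in Python's re. This is not the DevExpress mask engine, so it is only indicative, and it assumes the intended extra symbols are @, # and $. The helper name accepts is hypothetical.

```python
import re

# Assumption: the symbol class is [@#$]; the whole input must match.
PATTERN = re.compile(r"(\w+|[@#$]+)+(,(\w+|[@#$]+))+")

def accepts(s: str) -> bool:
    return PATTERN.fullmatch(s) is not None

print(accepts("test123,test3,test9"))  # plain comma-separated words
print(accepts("te#st,abc"))            # symbols mixed into the first item
print(accepts("abc,#$"))               # an item made only of symbols
print(accepts("abc,a#"))               # mixed item after a comma
print(accepts("abc"))                  # no comma at all
```

The last two are rejected: the group after each comma holds a single run of either word characters or symbols, and the trailing + requires at least one comma. Whether that is acceptable depends on the data.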
I am looking to validate user input using the regex below. The user is allowed to enter positive integer values separated by a comma (,) or a space. The problem is that during negative testing, when I enter a special character such as ? or a period (.), IsMatch hangs. Any help is appreciated.
new Regex("^\\s*[0-9]+\\s*(,*\\s*[0-9]+\\s*)*$")
The (,*\\s*[0-9]+\\s*)* pattern inside the regular expression contains multiple optional patterns while only [0-9]+ is obligatory, so it is a classic (a+)+ like pattern causing catastrophic backtracking with non-matching strings.
You should make sure there is at least 1 more obligatory pattern inside the quantified group, e.g.
@"^\s*[0-9]+(?:(?:\s*,\s*|\s+)[0-9]+)*\s*$"
Details
^ - start of string
\s* - optional 0+ leading whitespaces
[0-9]+ - 1+ digits
(?:(?:\s*,\s*|\s+)[0-9]+)* - 0+ repetitions of:
(?:\s*,\s*|\s+) - either , enclosed with 0+ whitespaces or just 1+ whitespaces
[0-9]+ - 1+ digits
\s* - optional 0+ trailing whitespaces
$ - end of string.
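The rewritten pattern can be exercised on both valid and invalid input. The sketch below uses Python's re rather than .NET (an assumption made only so the example is easy to run), and deliberately does not run the original pattern, since that is the one that hangs. The helper name is_valid is hypothetical.

```python
import re

# Rewritten pattern from above: every repetition must consume a separator
# and at least one digit, so there is only one way to split the input.
PATTERN = re.compile(r"^\s*[0-9]+(?:(?:\s*,\s*|\s+)[0-9]+)*\s*$")

def is_valid(s: str) -> bool:
    return PATTERN.search(s) is not None

print(is_valid("1, 2 3 ,4"))   # commas and spaces mixed
print(is_valid(" 7 "))         # single value with surrounding whitespace
print(is_valid("1,,2"))        # doubled comma
print(is_valid("1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 ?"))  # negative test
```

The negative test returns promptly because a failed match leaves the engine very few alternative ways to re-split the digits and separators.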
I have a string which looks like this :-
"$.ConfigSettings.DatabaseSettings.DatabaseConnections.SqlConnectionString.0.Id"
and I want the result to look like this :-
"$.ConfigSettings.DatabaseSettings.DatabaseConnections.SqlConnectionString[0].Id"
Basically, wherever there is a single digit preceded and followed by a period, I need to change it to [digit] followed by a period, i.e. [digit]. I have seen tons of examples where people only replace the matched string.
How can I do this using Regex.Replace in C#?
Regex.Replace(input, @"\.(\d)(?=\.)", "[$1]")
\. - match a "."
(\d) - then a single digit in a capturing group ($1 in the replacement)
(?= - start a positive lookahead
\. - that matches a "."
) - end the lookahead
So, it means : (match a dot followed by a digit in a capturing group) only if it is followed by a dot
So we matched ".0" and captured "0". We replace the entire match with "[$1]", where $1 refers to the first captured group.
See "Grouping Constructs in Regular Expressions" : https://msdn.microsoft.com/en-us/library/bs2twtah(v=vs.110).aspx for information about the different grouping constructs that I use in this solution.
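The same substitution can be tried outside .NET. Below is a sketch in Python's re, where the replacement group is written \1 instead of $1; the function name index_segments is hypothetical.

```python
import re

def index_segments(path: str) -> str:
    """Rewrite '.<digit>' as '[<digit>]' when another '.' follows."""
    # The lookahead (?=\.) checks for the following dot without consuming it,
    # so that dot stays in the output and can start the next match.
    return re.sub(r"\.(\d)(?=\.)", r"[\1]", path)

src = "$.ConfigSettings.DatabaseSettings.DatabaseConnections.SqlConnectionString.0.Id"
print(index_segments(src))
print(index_segments("a.0.1.b"))  # two adjacent single-digit segments
print(index_segments("a.10.b"))   # two digits: left alone
print(index_segments("a.0"))      # no trailing dot: left alone
```

Because the trailing dot is only looked at, not matched, consecutive segments such as .0.1. are both rewritten.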
I have a pattern in the string like this:
T T and I want to turn it into T
It can be any character from [a-z].
I have tried this Regex Example but was not able to do the replacement.
EDIT
For example, if I have A Aa ar r then it should become Aar, i.e. drop either the first or the second copy of any repeated character, no matter which character it is.
You can use backreferences for this.
/([a-z])\s*\1\s?/gi
Example
Some more explanation:
( begin matching group 1
[a-z] match any character from a to z
) end matching group 1
\s* match any number of whitespace characters
\1 match the result of matching group 1
exactly as it was again
this allows for the repetition
\s? match zero or one whitespace character
this allows removing multiple
spaces when replacing
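The explanation above does not spell out the replacement; a natural choice is to replace each match with group 1. Here is a sketch of that in Python's re, with re.IGNORECASE standing in for the i flag (re.sub already replaces all matches, like g). The function name collapse is hypothetical.

```python
import re

def collapse(s: str) -> str:
    """Replace 'x x' (optionally followed by one space) with a single 'x'."""
    return re.sub(r"([a-z])\s*\1\s?", r"\1", s, flags=re.IGNORECASE)

print(collapse("T T"))
print(collapse("A Aa ar r"))
print(collapse("abc"))  # nothing repeated: unchanged
```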
I want to match the first number/word/string in quotation marks/list in the input with Regex. For example, it should match those:
"hello world" gdfigjfoj sogjds
-14.5 fdhdfdfi dfjgdlf
test14 hfghdf hjgfjd
(a (c b 7)) (3 4) "hi"
Any ideas to a regex or how can I start?
Thank you.
If you want to match balanced parentheses, regex is not the right tool for the job. Some regex implementations do facilitate recursive pattern matching (PHP and Perl, that I know of), but AFAIK, C# cannot do that (EDIT: see Steve's comment below: .NET can do this as well, after all).
You can match up to a certain depth using regex, but that very quickly explodes in your face. For example, this:
\(([^()]|\([^()]*\))*\)
meaning
\( # match the character '('
( # start capture group 1
[^()] # match any character except '(' and ')'
| # OR
\( # match the character '('
[^()]* # match any character except '(' and ')' and repeat it zero or more times
\) # match the character ')'
)* # end capture group 1 and repeat it zero or more times
\) # match the character ')'
will match singly nested parentheses like (a (c b 7)) and (a (x) b (y) c (z) d), but will fail to match (a(b(c))).
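The depth limit is easy to confirm. A quick sketch in Python's re (my choice of engine, the pattern is unchanged), using fullmatch so that a match on an inner substring does not count:

```python
import re

# Depth-limited pattern from above: one level of nesting only.
DEPTH_ONE = re.compile(r"\(([^()]|\([^()]*\))*\)")

for s in ["(a (c b 7))", "(a (x) b (y) c (z) d)", "(a(b(c)))"]:
    print(f"{s!r:28} -> {DEPTH_ONE.fullmatch(s) is not None}")

# Anchored at the start of a longer line, it picks out the first list.
m = DEPTH_ONE.match('(a (c b 7)) (3 4) "hi"')
print(m.group(0) if m else None)
```

Note that an unanchored search on (a(b(c))) would still find the inner (b(c)), which is why the check above insists on matching the whole string.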
Any ideas to a regex or how can I start?
You can start with any tutorial on basic regex, such as this.
[Edit] I missed that you wanted to count parentheses. That cannot be done in regex - nothing that involves counting (except for non-standard lookaheads) can.
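If counting is the obstacle, a plain scanner with a depth counter is a simple alternative to a regex. The following is a sketch of that swapped-in technique in Python, not something proposed in this thread; the function name match_parens is hypothetical, and it ignores the case of parentheses inside quoted strings.

```python
from typing import Optional

def match_parens(text: str, start: int = 0) -> Optional[str]:
    """Return the balanced '(...)' substring beginning at text[start], or None."""
    if start >= len(text) or text[start] != "(":
        return None
    depth = 0
    for i in range(start, len(text)):
        ch = text[i]
        if ch == "(":
            depth += 1          # one level deeper
        elif ch == ")":
            depth -= 1          # one level back out
            if depth == 0:      # the opening parenthesis is now closed
                return text[start:i + 1]
    return None                 # input ended before the parentheses balanced

print(match_parens('(a (c b 7)) (3 4) "hi"'))
print(match_parens("(a(b(c)))"))
print(match_parens("(a(b"))
```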
For the first three cases, you could use:
^("[^"]*"|[+-]?\d+(?:\.\d+)?|\w+)
For the last one, I'm not sure if it's possible with regex to match that last closing parenthesis.
EDIT: using the suggested balanced matching for the last one:
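A sketch of the first-token pattern for the first three cases, in Python's re (the helper name first_token is hypothetical). This version requires at least one digit in the numeric branch (\d+): a branch built entirely from optional parts can match the empty string and would then pre-empt the \w+ branch on input such as test14.

```python
import re

# Quoted string, or signed number with optional fraction, or a bare word.
FIRST_TOKEN = re.compile(r'^("[^"]*"|[+-]?\d+(?:\.\d+)?|\w+)')

def first_token(line: str):
    """Return the leading token of the line, or None if no branch applies."""
    m = FIRST_TOKEN.match(line)
    return m.group(1) if m else None

for line in [
    '"hello world" gdfigjfoj sogjds',
    "-14.5 fdhdfdfi dfjgdlf",
    "test14 hfghdf hjgfjd",
    '(a (c b 7)) (3 4) "hi"',   # list case: not covered by this pattern
]:
    print(repr(first_token(line)))
```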
^\([^()]*(((?<Open>\()[^()]*)+((?<Close-Open>\))[^()]*)+)*(?(Open)(?!))\)