Match up to the comma - Regex - c#

I have created a Regex Pattern (?<=[TCC|TCC_BHPB]\s\d{3,4})[-_\s]\d{1,2}[,]
This pattern matches only:
TCC 6005_5,
What should I change at the end so that it matches both of these strings:
TCC 6005-5 ,
TCC 6005_5,

You can add a non-greedy wildcard to your expression (.*?):
(?<=(?:TCC|TCC_BHPB)\s\d{3,4})[-_\s]\d{1,2}.*?[,]
This will now also match any characters between the last digit and the comma.
As has been pointed out in the comments, [TCC|TCC_BHPB] is a character class rather than an alternation of literals, so I've changed it to (?:TCC|TCC_BHPB), which is presumably what you intended.
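To make the difference concrete, here is a minimal sketch that runs the question's pattern and the revised one against the two sample strings. The class and method names are made up for the example, and "ABC 6005_5," is a hypothetical extra input added only to show the character-class problem.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CommaMatchDemo
{
    // Pattern from the question: [TCC|TCC_BHPB] is a character class here.
    public const string Original = @"(?<=[TCC|TCC_BHPB]\s\d{3,4})[-_\s]\d{1,2}[,]";

    // Pattern from this answer: real alternation plus a lazy .*? before the comma.
    public const string Revised = @"(?<=(?:TCC|TCC_BHPB)\s\d{3,4})[-_\s]\d{1,2}.*?[,]";

    // Returns the matched text, or an empty string when nothing matches.
    public static string FirstMatch(string pattern, string input)
    {
        return Regex.Match(input, pattern).Value;
    }

    public static void Main()
    {
        // The last input is hypothetical: it does not start with TCC at all.
        foreach (string s in new[] { "TCC 6005-5 ,", "TCC 6005_5,", "ABC 6005_5," })
        {
            Console.WriteLine("input:    '" + s + "'");
            Console.WriteLine("original: '" + FirstMatch(Original, s) + "'");
            Console.WriteLine("revised:  '" + FirstMatch(Revised, s) + "'");
        }
    }
}
```

The original pattern finds nothing in "TCC 6005-5 ," because of the space before the comma, and it accepts "ABC 6005_5," because a single C satisfies the character class. The revised pattern matches "-5 ," and "_5," and rejects the ABC input.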

This part of the pattern [TCC|TCC_BHPB] is a character class that matches one of the listed characters. It might also be written for example as [|_TCBHP]
To "match" both strings, you can match all parts instead of using a positive lookbehind.
\bTCC(?:_BHPB)?\s\d{3,4}[-_\s]\d{1,2}\s?,
\bTCC - A word boundary to prevent a partial match, then match TCC
(?:_BHPB)?\s\d{3,4} - Optionally match _BHPB, then match a whitespace char and 3-4 digits (by default \d in .NET also matches non-ASCII Unicode digits; use [0-9] to match only the digits 0-9)
[-_\s]\d{1,2} - Match one of - _ or a whitespace char, followed by 1-2 digits
\s?, - Match an optional whitespace char and ,
Note that \s can also match a newline.
Using the lookbehind:
(?<=TCC(?:_BHPB)?\s\d{3,4})[-_\s]\d{1,2}\s?,
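The practical difference between the two patterns is what ends up in Match.Value. A small sketch, assuming you only need the first match; the class name and the "TCC_BHPB 123_45," input are hypothetical:

```csharp
using System;
using System.Text.RegularExpressions;

public static class TccMatchDemo
{
    // Consumes the whole code, prefix included.
    public const string FullPattern = @"\bTCC(?:_BHPB)?\s\d{3,4}[-_\s]\d{1,2}\s?,";

    // Only asserts the prefix, so the match starts at the separator.
    public const string LookbehindPattern = @"(?<=TCC(?:_BHPB)?\s\d{3,4})[-_\s]\d{1,2}\s?,";

    // Match.Value is an empty string when the match fails.
    public static string ValueOf(string pattern, string input)
    {
        return Regex.Match(input, pattern).Value;
    }

    public static void Main()
    {
        foreach (string s in new[] { "TCC 6005-5 ,", "TCC 6005_5,", "TCC_BHPB 123_45," })
        {
            Console.WriteLine("full:       '" + ValueOf(FullPattern, s) + "'");
            Console.WriteLine("lookbehind: '" + ValueOf(LookbehindPattern, s) + "'");
        }
    }
}
```

For "TCC 6005-5 ," the full pattern returns the entire string, while the lookbehind version returns only "-5 ,".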
Or, if the whitespace should be horizontal only (space separators or tabs, but never a newline), with any number of them allowed before the comma:
\bTCC(?:_BHPB)?[\p{Zs}\t][0-9]{3,4}[-_\p{Zs}\t][0-9]{1,2}[\p{Zs}\t]*,
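A quick way to see what the stricter pattern buys you is to test both against a string with a line break and a string with several spaces before the comma. This is only a sketch; the class name and both inputs are invented for the comparison.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceDemo
{
    // \s version: \s also matches \n, and only one whitespace char may precede the comma.
    public const string AnyWhitespace = @"\bTCC(?:_BHPB)?\s\d{3,4}[-_\s]\d{1,2}\s?,";

    // Horizontal-only version: space separators or tabs, any number before the comma.
    public const string HorizontalOnly =
        @"\bTCC(?:_BHPB)?[\p{Zs}\t][0-9]{3,4}[-_\p{Zs}\t][0-9]{1,2}[\p{Zs}\t]*,";

    public static bool Matches(string pattern, string input)
    {
        return Regex.IsMatch(input, pattern);
    }

    public static void Main()
    {
        string withNewline = "TCC 6005-5\n,";   // line break before the comma
        string withSpaces = "TCC 6005-5   ,";   // three spaces before the comma

        Console.WriteLine(Matches(AnyWhitespace, withNewline));   // True
        Console.WriteLine(Matches(HorizontalOnly, withNewline));  // False
        Console.WriteLine(Matches(AnyWhitespace, withSpaces));    // False
        Console.WriteLine(Matches(HorizontalOnly, withSpaces));   // True
    }
}
```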

Related

Regular expression to check format string string:string any occurrence

I am trying to build regex to match - Test get:all words:test
can start with a word then space and followed by any occurrence of word:word separated by space.
@"^[a-zA-Z]+/s(^[a-zA-Z]+:^[a-zA-Z]+/s)*"
You added extra start of string anchors, ^, inside the pattern, and you need to remove them for sure.
Besides, the whitespace patterns must be written as \s and the first \s must be moved inside the repeated group that should be converted into a non-capturing one ((?:...)) for better performance.
You can use
^[a-zA-Z]+(?:\s+[a-zA-Z]+:[a-zA-Z]+)*$
Details:
^ - start of string
[a-zA-Z]+ - one or more ASCII letters
(?:\s+[a-zA-Z]+:[a-zA-Z]+)* - zero or more repetitions of
\s+ - one or more whitespaces
[a-zA-Z]+:[a-zA-Z]+ - one or more ASCII letters, :, one or more ASCII letters
$ - end of string (or use \z to match the very end of string).
If you meant to allow any word chars (letters, digits, connector punctuation) then replace each [a-zA-Z] with \w.
If you need to support just any Unicode letters, replace each [a-zA-Z] with \p{L}.
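As a rough illustration of the pattern in use, and of the difference between $ and \z mentioned above, here is a small validator. The class and method names are hypothetical, and the inputs other than "Test get:all words:test" are made up.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordPairValidator
{
    // Ends with $: also accepts a single trailing newline.
    private static readonly Regex Lenient =
        new Regex(@"^[a-zA-Z]+(?:\s+[a-zA-Z]+:[a-zA-Z]+)*$");

    // Ends with \z: must match up to the very end of the string.
    private static readonly Regex Strict =
        new Regex(@"^[a-zA-Z]+(?:\s+[a-zA-Z]+:[a-zA-Z]+)*\z");

    public static bool IsValid(string input) { return Lenient.IsMatch(input); }

    public static bool IsValidStrict(string input) { return Strict.IsMatch(input); }

    public static void Main()
    {
        Console.WriteLine(IsValid("Test get:all words:test"));         // True
        Console.WriteLine(IsValid("Test"));                            // True
        Console.WriteLine(IsValid("Test get"));                        // False
        Console.WriteLine(IsValid("Test get:all words:test\n"));       // True
        Console.WriteLine(IsValidStrict("Test get:all words:test\n")); // False
    }
}
```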

Regex to get digits from a string when there is no separator between digits

I have a string like Acc:123-456-789 and another string like -1234567. I need an expression that matches a run of digits only when there is no separator between the digits. My attempt so far:
-*(?!\d*(?:\d*-)$)\d*$
Input strings:
Acc:123-456-789 -12323232 7894596
Desired result:
group 1 12323232
group 2 7894596
I think this ought to work:
(?<=^|\s|\s-)(\d+)(?=\s|$)
Breaking it down:
(?<=^|\s|\s-) - A positive lookbehind that matches the start of the string, whitespace, or whitespace followed by a -.
(\d+) - Matches and captures number sequences.
(?=\s|$) - A positive lookahead that matches whitespace or the end of the string.
Note: If you need to capture negative number sequences, replace (\d+) with (\-?\d+).
Remember that in a C# string literal you need to either escape the backslashes or use the @ prefix to make it a verbatim string literal (@" ").
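Putting it together in C#, a minimal sketch could look like this. The class and method names are just for illustration; the pattern is written as a verbatim string so the backslashes need no escaping.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class DigitRunExtractor
{
    // Verbatim string literal: the backslashes are passed to the regex engine as-is.
    private const string Pattern = @"(?<=^|\s|\s-)(\d+)(?=\s|$)";

    // Collects capture group 1 of every match.
    public static List<string> Extract(string input)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern))
        {
            result.Add(m.Groups[1].Value);
        }
        return result;
    }

    public static void Main()
    {
        List<string> numbers = Extract("Acc:123-456-789 -12323232 7894596");
        Console.WriteLine(string.Join(", ", numbers)); // 12323232, 7894596
    }
}
```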

C# Regular Expression for x number of groups of A-Z separated by hyphen

I am trying to match the following pattern.
A minimum of 3 'groups' of alphanumeric characters separated by a hyphen.
Eg: ABC1-AB-B5-ABC1
Each group can be any number of characters long.
I have tried the following:
^(\w*(-)){3,}?$
This gives me what I want to an extent.
ABC1-AB-B5-0001 fails, and ABC1-AB-B5-0001- passes.
I don't want the trailing hyphen to be a requirement.
I can't figure out how to modify the expression.
Your ^(\w*(-)){3,}?$ pattern even allows a string like ----- because the only required pattern here is a hyphen: \w* may match 0 word chars. The - may be both leading and trailing because of that.
You may use
\A\w+(?:-\w+){2,}\z
Details:
\A - start of string
\w+ - 1+ word chars (that is, letters, digits or _ symbols)
(?:-\w+){2,} - 2 or more sequences of:
- - a single hyphen
\w+ - 1 or more word chars
\z - the very end of string.
Or, if you do not want to allow _:
\A[^\W_]+(?:-[^\W_]+){2,}\z
or to only allow ASCII letters and digits:
\A[A-Za-z0-9]+(?:-[A-Za-z0-9]+){2,}\z
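Here is a short sketch that checks the \w version and the no-underscore version against the strings from the question. The class and method names are hypothetical, and "A_B-C-D" is an extra input added to show the underscore difference.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HyphenGroups
{
    // At least three hyphen-separated groups of word chars (underscore allowed).
    private static readonly Regex WordGroups = new Regex(@"\A\w+(?:-\w+){2,}\z");

    // Same shape, but [^\W_] excludes the underscore.
    private static readonly Regex NoUnderscore = new Regex(@"\A[^\W_]+(?:-[^\W_]+){2,}\z");

    public static bool IsValid(string input) { return WordGroups.IsMatch(input); }

    public static bool IsValidNoUnderscore(string input) { return NoUnderscore.IsMatch(input); }

    public static void Main()
    {
        Console.WriteLine(IsValid("ABC1-AB-B5-0001"));       // True
        Console.WriteLine(IsValid("ABC1-AB-B5-0001-"));      // False
        Console.WriteLine(IsValid("-----"));                 // False
        Console.WriteLine(IsValid("AB-CD"));                 // False
        Console.WriteLine(IsValid("A_B-C-D"));               // True
        Console.WriteLine(IsValidNoUnderscore("A_B-C-D"));   // False
    }
}
```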
It can be like this:
^\w+-\w+-\w+(-\w+)*$
^(\w+-){2,}(\w+)-?$
Matches 2+ groups separated by a hyphen, then a single group possibly terminated by a hyphen.
((?:-?\w+){3,})
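Matches a minimum of 3 groups, each optionally preceded by a hyphen, so a trailing hyphen is simply left out of the match. Because the hyphens are optional and the pattern is unanchored, it also matches inside a single run of word characters such as ABC (as A, B, C), so it does not validate the whole string.
Note that \w matches the underscore _ as well as letters and digits.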

C# Regex to match a string that doesn't contain a certain string?

I want to match any string that does not contain the string "DontMatchThis".
What's the regex?
Try this:
^(?!.*DontMatchThis).*$
The regex to match a string that does not contain a certain pattern is
(?s)^(?!.*DontMatchThis).*$
If you use the pattern without the (?s) (which is an inline version of the RegexOptions.Singleline flag that makes . match a newline LF symbol as well as all other characters), the DontMatchThis will only be searched for on the first line, and only a string without LF symbols will be matched with .*.
Pattern details:
(?s) - a DOTALL/Singleline modifier making . match any character
^ - start of string anchor
(?!.*DontMatchThis) - a negative lookahead checking if there are any 0 or more characters (matched with greedy .* subpattern - NOTE a lazy .*? version (matching as few characters as possible before the next subpattern match) might get the job done quicker if DontMatchThis is expected closer to the string start) followed with DontMatchThis
.* - any zero or more characters, as many as possible, up to
$ - the end of string (see Anchor Characters: Dollar ($)).
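To see why the (?s) modifier matters, here is a small sketch that runs the pattern with and without it. The class and method names are invented, and the multi-line inputs are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ForbiddenTextCheck
{
    // With (?s): . also matches \n, so the whole string is inspected.
    private const string WithSingleline = @"(?s)^(?!.*DontMatchThis).*$";

    // Without (?s): . stops at the first \n.
    private const string WithoutSingleline = @"^(?!.*DontMatchThis).*$";

    public static bool Passes(string input)
    {
        return Regex.IsMatch(input, WithSingleline);
    }

    public static bool PassesWithoutSingleline(string input)
    {
        return Regex.IsMatch(input, WithoutSingleline);
    }

    public static void Main()
    {
        Console.WriteLine(Passes("hello world"));                    // True
        Console.WriteLine(Passes("xx DontMatchThis yy"));            // False
        Console.WriteLine(Passes("line1\nDontMatchThis"));           // False
        Console.WriteLine(Passes("line1\nline2"));                   // True
        // Harmless multi-line text is rejected when (?s) is missing.
        Console.WriteLine(PassesWithoutSingleline("line1\nline2"));  // False
    }
}
```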

Regex for string with spaces and special characters - C#

I have been using Regex to match strings embedded in square brackets [*] as:
new Regex(@"\[(?<name>\S+)\]", RegexOptions.IgnoreCase);
I also need to match some codes that look like:
[TESTTABLE: A, B, C, D]
it has got spaces, comma, colon
Can you please guide me how can I modify my above Regex to include such codes.
P.S. other codes have no spaces/special charaters but are always enclosed in [...].
Regex myregex = new Regex(@"\[([^\]]*)]");
will match all characters that are not closing brackets and that are enclosed between brackets. Capture group 1 will contain the content between the brackets.
Explanation (courtesy of RegexBuddy):
Match the character “[” literally «\[»
Match the regular expression below and capture its match into backreference number 1 «([^\]]*)»
Match any character that is NOT a ] character «[^\]]*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match the character “]” literally «]»
This will also work if you have more than one pair of matching brackets in the string you're looking at. It will not work if brackets can be nested, e. g. [Blah [Blah] Blah].
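For example, a minimal sketch that collects the content of every bracket pair; the class and method names and the sample sentence are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class BracketExtractor
{
    private static readonly Regex Brackets = new Regex(@"\[([^\]]*)]");

    // Returns the text between each [ and the next ].
    public static List<string> Extract(string input)
    {
        var result = new List<string>();
        foreach (Match m in Brackets.Matches(input))
        {
            result.Add(m.Groups[1].Value);
        }
        return result;
    }

    public static void Main()
    {
        foreach (string code in Extract("Use [TESTTABLE: A, B, C, D] and [CODE1] here"))
        {
            Console.WriteLine(code);
        }
        // Nested brackets: the match ends at the first ], giving "Blah [Blah".
        Console.WriteLine(Extract("[Blah [Blah] Blah]")[0]);
    }
}
```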
\[([^\]:]*)(?::([^\]]*))?\]
Capture group 1 will contain the entire tag if it doesn't have a colon, or the part before the colon if it does.
Capture group 2 will contain the part after the colon. You can then split on ',' and trim each entry to get the individual parts.
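A sketch of the split-and-trim step in C#. Note that the * quantifier sits inside the first capture group here, so that group 1 holds the whole name and not just its last character. The class and method names are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TagParser
{
    // Group 1: text before the colon (or the whole tag); group 2: text after the colon.
    private static readonly Regex TagPattern =
        new Regex(@"\[([^\]:]*)(?::([^\]]*))?\]");

    public static string NameOf(string tag)
    {
        return TagPattern.Match(tag).Groups[1].Value;
    }

    // Splits group 2 on commas and trims each entry; empty when there is no colon.
    public static string[] PartsOf(string tag)
    {
        Group parts = TagPattern.Match(tag).Groups[2];
        if (!parts.Success)
        {
            return new string[0];
        }
        return parts.Value.Split(',').Select(p => p.Trim()).ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(NameOf("[TESTTABLE: A, B, C, D]"));                     // TESTTABLE
        Console.WriteLine(string.Join("|", PartsOf("[TESTTABLE: A, B, C, D]")));  // A|B|C|D
        Console.WriteLine(NameOf("[CODE1]"));                                     // CODE1
        Console.WriteLine(PartsOf("[CODE1]").Length);                             // 0
    }
}
```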
