C# regex to match string - c#

I'm currently having difficulty matching strings with my regex. The objective is to match:
One or two letters
One, two or three numbers
Zero or one asterisk
Such as U21, F305 and H12*. The regex that I'm using is:
\D{1,2}\d{1,3}\*?
However, it's been matching strings like:
3.0L
6HBW20
3/8"
Y1015
I'm not too bright with regex, but this is holding me from completing my project. Can anyone help me out?
Thank you.

Try using /^[a-zA-Z]{1,2}\d{1,3}\*?$/
The anchors ^ and $ are useful to make sure that you match exactly the pattern you intend. Read up on them :)

You need to anchor your match. ^ anchors the match to start of line; $ drops anchor at end of line.
Try this regular expression
#"^[\p{L}]{1,2}\d{1,3}[*]?$"

\D matches any non-digit, which is a much larger set than just letters (basically everything else, including periods, slashes, etc). Try using [a-zA-Z]{1,2} to match 1 or 2 letters.
[a-zA-Z]{1,2}\d{1,3}\*?

Related

Using RegEx, what's the best way to capture groups of digits, ignoring any whitespace in them

Given the following string...
ABC DEF GHI: 319 022 6543 QRS : 531 450
I'm trying to extract all ranges that start/end with a digit, and which may contain whitespace, but I want that whitespace itself removed.
For instance, the above should yield two results (since there are two 'ranges' that match what I aim looking for)...
3190226543
531450
My first thought was this, but this matches the spaces between the letters...
([\d\s])
Then I tried this, but it didn't seem to have any effect...
([\d+\s*])
This one comes close, but its grabbing the trailing spaces too. Also, this grabs the whitespace, but doesn't remove it.
(\d[\d\s]+)
If it's impossible to remove the spaces in a single statement, I can always post-process the groups if I can properly extract them. That most recent statement comes close, but how do I say it doesn't end with whitespace, but only a digit?
So what's the missing expression? Also, since sometimes people just post an answer, it would be helpful to explain out the RegEx too to help others figure out how to do this. I for one would love not just the solution, but an explanation. :)
Note: I know there can be some variations between RegEx on different platforms so that's fine if those differences are left up to the reader. I'm more interested in understanding the basic mechanics of the regex itself more so than the syntax. That said, if it helps, I'm using both Swift and C#.
You cannot get rid of whitespace from inside the match value within a single match operation. You will need to remove spaces as a post-processing step.
To match a string that starts with a digit and then optionally contains any amount of digits or whitespaces and then a digit you can use
\d(?:[\d\s]*\d)?
Details:
\d - a digit
(?:[\d\s]*\d)? - an optional non-capturing group matching
[\d\s]* - zero or more whitespaces / digits
\d - a digit.
See the regex demo.

Regex pattern in C# with empty space

I am having issue with a reg ex expression and can't find the answer to my question.
I am trying to build a reg ex pattern that will pull in any matches that have # around them. for example #match# or #mt# would both come back.
This works fine for that. #.*?#
However I don't want matches on ## to show up. Basically if there is nothing between the pound signs don't match.
Hope this makes sense.
Thanks.
Please use + to match 1 or more symbols:
#+.+#+
UPDATE:
If you want to only match substrings that are enclosed with single hash symbols, use:
(?<!#)#(?!#)[^#]+#(?!#)
See regex demo
Explanation:
(?<!#)#(?!#) - a # symbol that is not preceded with a # (due to the negative lookbehind (?<!#)) and not followed by a # (due to the negative lookahead (?!#))
[^#]+ - one or more symbols other than # (due to the negated character class [^#])
#(?!#) - a # symbol not followed with another # symbol.
Instead of using * to match between zero and unlimited characters, replace it with +, which will only match if there is at least one character between the #'s. The edited regex should look like this: #.+?#. Hope this helps!
Edit
Sorry for the incorrect regex, I had not expected multiple hash signs. This should work for your sentence: #+.+?#+
Edit 2
I am pretty sure I got it. Try this: (?<!#)#[^#].*?#. It might not work as expected with triple hashes though.
Try:
[^#]?#.+#[^#]?
The [^ character_group] construction matches any single character not included in the character group. Using the ? after it will let you match at the beginning/end of a string (since it matches the preceeding character zero or more times. Check out the documentation here

Why is this regex not allowing this text?

I have a username validator IsValidUsername, and I am testing "baconman" but it is failing, could someone please help me out with this regex?
if(!Regex.IsMatch(str, #"^[a-zA-Z]\\w+|[0-9][0-9_]*[a-zA-Z]+\\w*$")) {
isValid = false;
}
I want the restrictions to be: (It's very close)
Be between 5 & 17 characters long
contain at least one letter
no spaces
no special characters
You're escaping unnecessarily: if you write your regex as starting with # outside the string, you don't need both \ - just one is fine.
Either:
#"\w"
or
"\\w"
Edit: I didn't make this clear: right now due to the double escaping, you're looking for a \ in your regex and a w. So your match would need [some character]\w to match (example: "a\w" or "a\wwwwww" would match.
Your requirements are best taken care of in normal C#. They don't map well to a regular expression. Just code them up using LINQ which works on strings like it would on an IEnumerable<char>.
Also, understanding a query of a string is much easier than understanding a Regex with the requirements that you have.
It is possible to do everything as part of a Regex, however it is not pretty :-)
^(\w(?=\w*[a-zA-Z])|[a-zA-Z]|\w(?<=[a-zA-Z]\w*)){5,17}$
It does 3 checks that always results in 1 character being matched (so we can perform the length check in the end)
Either the character is any word character \w which is before [a-zA-Z]
Or it is [a-zA-Z]
Or it is any word character \w which is after [a-zA-Z]

Regular expressions to extract punctuations/characters

Hi guys I need a regex that only extracts punctuations/characters.
I have this so far :
[._^%$#!~#,-]+
but this works if there is at least 1 punctuations and still allows for any other char (digit or letter)
I need to to only allow punctuations/characters
Try this:
^[\p{S}\p{P}]+$
\p{S} matches any symbol character, and \p{P} matches any punctuation.
Note that your pattern will not match all the symbols and punctuations not present in the list.
Try anchoring the regex to the start and the end of the string (unless you're using Multiline matching) - i.e. ^ at the beginning and $ at the end:
^[._^%$#!~#,-]+$
Note - this does not endorse your actual pattern (I can't say whether this is matching all the 'special characters' you're talking about, but it will make it so that the entire string must be all 'special'.
[^a-zA-Z0-9]* u can try something like this. Should NOT accept those chars, cba to writte all beside chars beside one u typed.
\W
Matches any character that is not a word character (alphanumeric & underscore).

C# Regex Validation

Can someone please validate this for me (newbie of regex match cons).
Rather than asking the question, I am writing this:
Regex rgx = new Regex (#"^{3}[a-zA-Z0-9](\d{5})|{3}[a-zA-Z0-9](\d{9})$"
Can someone telll me if it's OK...
The accounts I am trying to match are either of:
1. BAA89345 (8 chars)
2. 12345678 (8 chars)
3. 123456789112 (12 chars)
Thanks in advance.
You can use a Regex tester. Plenty of free ones online. My Regex Tester is my current favorite.
Is the value with 3 characters then followed by digits always starting with three... can it start with less than or more than three. What are these mins and max chars prior to the digits if they can be.
You need to place your quantifiers after the characters they are supposed to quantify. Also, character classes need to be wrapped in square brackets. This should work:
#"^(?:[a-zA-Z0-9]{3}|\d{3}\d{4})\d{5}$"
There are several good, automated regex testers out there. You may want to check out regexpal.
Although that may be a perfectly valid match, I would suggest rewriting it as:
^([a-zA-Z]{3}\d{5}|\d{8}|\d{12})$
which requires the string to match one of:
[a-zA-Z]{3}\d{5} three alpha and five numbers
\d{8} 8 digits or
\d{12} twelve digits.
Makes it easier to read, too...
I'm not 100% on your objective, but there are a few problems I can see right off the bat.
When you list the acceptable characters to match, like with a-zA-Z0-9, you need to put it inside brackets, like [a-zA-Z0-9] Using a ^ at the beginning will negate the contained characters, e.g. `[^a-zA-Z0-9]
Word characters can be matched like \w, which is equivalent to [a-zA-Z0-9_].
Quantifiers need to appear at the end of the match expression. So, instead of {3}[a-zA-Z0-9], you would need to write [a-zA-Z0-9]{3} (assuming you want to match three instances of a character that matches [a-zA-Z0-9]

Categories