Regex lookbeaind only when contains colon

Regex lookbeaind only when contains colon - c#

Today I use c# Regex.IsMatch function to matching key:value format.
I have some code that checking if string format is: key:value (like: H:15).
The Regex pattern that I am using today is: [D,H,M,S]:[1-9]+\d?
I what to add the option for default key, when the input is 15, I would like to consider it like: H:15
So, I need to improve my Regex to support key:value or only value (without colon), H:15 is good and 15 is also good
I tried to use the or regex condition (|) something like : ([D,H,M,S]:[1-9]+\d?)|([1-9]+\d?)
But now it match more thinks like :1 and H:01 that are bad input for me.
I try to use also lookbehind regex without success
Any help would be greatly appreciated,
Nadav.

This should do the trick:
\b(?:[DHMS]:|(?<!:))[1-9][0-9]*\b
Demo
So, either match [DHMS]: or a word boundary not preceded by :.
Also, [1-9]+\d? looks very suspicious to me, so I replaced it with [1-9][0-9]*. Note that in .NET \d is not equivalent to [0-9] because it includes Unicode digits as well.

Looks like Avinash just beat me to it, but I added word boundaries with this expression, which works well in tests.
\b(?<=[DHMS]:)?[1-9]\d*\b

Seems like you wants something like this,
#"^(?:[DHMS]:)?[1-9]\d*$"
[DHMS] matches a single character from the given list. ? after the non-capturing group will turn the key part to an optional one. \d* matches zero or more digit characters.

Related

Extract string from a pattern preceded by any length

I'm looking for a regular expression to extract a string from a file name
eg if filename format is "anythingatallanylength_123_TESTNAME.docx", I'm interested in extracting "TESTNAME" ... probably fixed length of 8. (btw, 123 can be any three digit number)
I think I can use regex match ...
".*_[0-9][0-9][0-9]_[A-Z][A-Z][A-Z][A-Z][A-Z][A-Z][A-Z][A-Z].docx$"
However this matches the whole thing. How can I just get "TESTNAME"?
Thanks

Use parenthesis to match a specific piece of the whole regex.
You can also use the curly braces to specify counts of matching characters, and \d for [0-9].
In C#:
var myRegex = new Regex(#"*._\d{3}_([A-Za-z]{8})\.docx$");
Now "TESTNAME" or whatever your 8 letter piece is will be found in the captures collection of your regex after using it.
Also note, there will be a performance overhead for look-ahead and look-behind, as presented in some other solutions.

You can use a look-behind and a look-ahead to check parts without matching them:
(?<=_[0-9]{3}_)[A-Z]{8}(?=\.docx$)
Note that this is case-sensitive, you may want to use other character classes and/or quantifiers to fit your exact pattern.

In your file name format "anythingatallanylength_123_TESTNAME.docx", the pattern you are trying to match is a string before .docx and the underscore _. Keeping the thing in mind that any _ before doesn't get matched I came up with following solution.
Regex: (?<=_)[A-Za-z]*(?=\.docx$)
Flags used:
g global search
m multi-line search.
Explanation:
(?<=_) checks if there is an underscore before the file name.
(?=\.docx$) checks for extension at the end.
[A-Za-z]* checks the required match.
Regex101 Demo

Thanks to #Lucero #noob #JamesFaix I came up with ...
#"(?<=.*[0-9]{3})[A-Z]{8}(?=.docx$)"
So a look behind (in brackets, starting with ?<=) for anything (ie zero or more any char (denoted by "." ) followed by an underscore, followed by thee numerics, followed by underscore. Thats the end of the look behind. Now to match what I need (eight letters). Finally, the look ahead (in brackets, starting with ?=), which is the .docx
Nice work, fellas. Thunderbirds are go.

Character 'e' is not recognized by simple regular expression - why?

I wrote a very simple regular expression that need to match the next pattern:
word.otherWord
- Word must have at least 2 characters and must not start with digit.
I wrote the next expression:
[a-zA-Z][a-zA-Z](.[a-zA-Z0-9])+
I tested it using Regex tester and it seems to be working at most of the cases but when I try some inputs that ends with 'e' it's not working.
for example:
Hardware.Make does not work but Hardware.Makee is works fine, why? How can I fix it?

That's because your regex looks for inputs which length is even.
You have two characters matched by [a-zA-Z][a-zA-Z] and then another two characters matched by (.[a-zA-Z0-9]) as a group which is repeated one or more times (because of +).
You can see it here: http://regex101.com/r/fW2bC1
I think you need that:
[a-zA-Z]+(\.[a-zA-Z0-9]+)+

Actually, the dot is a regex metacharacter, which stands for "any character". You'll need to escape the dot.
For your situation, I'd do this:
[a-zA-Z]{2,}\.[a-zA-Z0-9]+
The {2,} means, at least 2 characters from the previous range.

In regex, the dot period is one of the most commonly used metacharacters and unfortunately also commonly misused metacharacter. The dot matches a single character without caring what that character is...
So u would also re-write it like
[a-zA-Z]+(\.[a-zA-Z0-9]+)+

Regex in a string

I need some help on a problem.
In fact I search to check for an image type by the hexadecimal code.
string JpgHex = "FF-D8-FF-E0-xx-xx-4A-46-49-46-00";
Then I have a condition on
string.StartsWith(pngHex).
The problem is that the "x" characters presents in my "JpgHex" string can be whatever I want.
I think I need a regex to check that but I don't know how!!
Thanks a lot!

I'm not quite clear what exactly you want to do, but the dot '.' character represents any character in Regex.
So the regex "^FF-D8-FF-E0-..-..-4A-46-49-46-00" will probably do the trick. '^' = Start of input.
If you want to allow only hex chars you can use "^FF-D8-FF-E0-[0-9A-F]{2}-[0-9A-F]{2}-4A-46-49-46-00".

Like I said, I'd need a better idea of what pattern you need to match.
Here are some examples:
Regex rgx =
new Regex(#"^FF-D8-FF-E0-[a-zA-Z0-9]{2}-[a-zA-Z0-9]{2}-4A-46-49-46-00$");
rgx.IsMatch(pngHex); // is match will return a bool.
I use [a-zA-Z0-9]{2} to denote two instances of a character, caps or small or a number. So the above regex would match :
FF-D8-FF-E0-aa-zZ-4A-46-49-46-00
FF-D8-FF-E0-11-22-4A-46-49-46-00
.. etc
Based on your need change the regex accordingly so for capitals and numbers only you change to [A-Z0-9]. The {2} denotes two occurrences.
The ^ denotes the string should start with FF and $ means the string should end with 00.
Lets say you wanted to only match two numbers, so you would use \d{2}, the whole thing would look like this:
Regex rgx = new Regex(#"^FF-D8-FF-E0-\d{2}-\d{2}-4A-46-49-46-00$");
rgx.IsMatch(pngHex);
How do I know of these magical characters? Simple, there are docs everywhere. See this MSDN page for some basic regex patterns. This page shows some quantifiers, those are things like match one or more or match only one.
Cheat-sheets also come in handy.

A regex would help you; you can use the following tool to help you test and learn: -
http://derekslager.com/blog/posts/2007/09/a-better-dotnet-regular-expression-tester.ashx
I recommend you have a play because then you'll learn!
To simply match any character in place of the x, the following should work: -
"^FF-D8-FF-E0-..-..-4A-46-49-46-00$"
In C#, it would be something like this: -
var test = "FF-D8-FF-E0-AB-CD-4A-46-49-46-00";
var foo = new Regex("^FF-D8-FF-E0-..-..-4A-46-49-46-00$");
if (foo.IsMatch(test))
{
// Do magic
}
You will need to read up on regular expressions to understand some of the characters that may not look familiar, i.e. ^ and $. See http://www.regular-expressions.info/

Why is this regex not allowing this text?

I have a username validator IsValidUsername, and I am testing "baconman" but it is failing, could someone please help me out with this regex?
if(!Regex.IsMatch(str, #"^[a-zA-Z]\\w+|[0-9][0-9_]*[a-zA-Z]+\\w*$")) {
isValid = false;
}
I want the restrictions to be: (It's very close)
Be between 5 & 17 characters long
contain at least one letter
no spaces
no special characters

You're escaping unnecessarily: if you write your regex as starting with # outside the string, you don't need both \ - just one is fine.
Either:
#"\w"
or
"\\w"
Edit: I didn't make this clear: right now due to the double escaping, you're looking for a \ in your regex and a w. So your match would need [some character]\w to match (example: "a\w" or "a\wwwwww" would match.

Your requirements are best taken care of in normal C#. They don't map well to a regular expression. Just code them up using LINQ which works on strings like it would on an IEnumerable<char>.
Also, understanding a query of a string is much easier than understanding a Regex with the requirements that you have.

It is possible to do everything as part of a Regex, however it is not pretty :-)
^(\w(?=\w*[a-zA-Z])|[a-zA-Z]|\w(?<=[a-zA-Z]\w*)){5,17}$
It does 3 checks that always results in 1 character being matched (so we can perform the length check in the end)
Either the character is any word character \w which is before [a-zA-Z]
Or it is [a-zA-Z]
Or it is any word character \w which is after [a-zA-Z]

C# Regex Validation

Can someone please validate this for me (newbie of regex match cons).
Rather than asking the question, I am writing this:
Regex rgx = new Regex (#"^{3}[a-zA-Z0-9](\d{5})|{3}[a-zA-Z0-9](\d{9})$"
Can someone telll me if it's OK...
The accounts I am trying to match are either of:
1. BAA89345 (8 chars)
2. 12345678 (8 chars)
3. 123456789112 (12 chars)
Thanks in advance.

You can use a Regex tester. Plenty of free ones online. My Regex Tester is my current favorite.

Is the value with 3 characters then followed by digits always starting with three... can it start with less than or more than three. What are these mins and max chars prior to the digits if they can be.

You need to place your quantifiers after the characters they are supposed to quantify. Also, character classes need to be wrapped in square brackets. This should work:
#"^(?:[a-zA-Z0-9]{3}|\d{3}\d{4})\d{5}$"
There are several good, automated regex testers out there. You may want to check out regexpal.

Although that may be a perfectly valid match, I would suggest rewriting it as:
^([a-zA-Z]{3}\d{5}|\d{8}|\d{12})$
which requires the string to match one of:
[a-zA-Z]{3}\d{5} three alpha and five numbers
\d{8} 8 digits or
\d{12} twelve digits.
Makes it easier to read, too...

I'm not 100% on your objective, but there are a few problems I can see right off the bat.
When you list the acceptable characters to match, like with a-zA-Z0-9, you need to put it inside brackets, like [a-zA-Z0-9] Using a ^ at the beginning will negate the contained characters, e.g. `[^a-zA-Z0-9]
Word characters can be matched like \w, which is equivalent to [a-zA-Z0-9_].
Quantifiers need to appear at the end of the match expression. So, instead of {3}[a-zA-Z0-9], you would need to write [a-zA-Z0-9]{3} (assuming you want to match three instances of a character that matches [a-zA-Z0-9]

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Regex lookbeaind only when contains colon - c#

This should do the trick: \b(?:[DHMS]:|(?<!:))[1-9][0-9]\b Demo So, either match [DHMS]: or a word boundary not preceded by :. Also, [1-9]+\d? looks very suspicious to me, so I replaced it with [1-9][0-9]. Note that in .NET \d is not equivalent to [0-9] because it includes Unicode digits as well.

Looks like Avinash just beat me to it, but I added word boundaries with this expression, which works well in tests. \b(?<=[DHMS]:)?[1-9]\d*\b

Seems like you wants something like this, #"^(?:[DHMS]:)?[1-9]\d$" [DHMS] matches a single character from the given list. ? after the non-capturing group will turn the key part to an optional one. \d matches zero or more digit characters.

Related

Extract string from a pattern preceded by any length

Character 'e' is not recognized by simple regular expression - why?

Regex in a string

Why is this regex not allowing this text?

C# Regex Validation

Categories

Resources

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Regex lookbeaind only when contains colon - c#

This should do the trick: \b(?:[DHMS]:|(?<!:))[1-9][0-9]*\b Demo So, either match [DHMS]: or a word boundary not preceded by :. Also, [1-9]+\d? looks very suspicious to me, so I replaced it with [1-9][0-9]*. Note that in .NET \d is not equivalent to [0-9] because it includes Unicode digits as well.

Looks like Avinash just beat me to it, but I added word boundaries with this expression, which works well in tests. \b(?<=[DHMS]:)?[1-9]\d*\b

Seems like you wants something like this, #"^(?:[DHMS]:)?[1-9]\d*$" [DHMS] matches a single character from the given list. ? after the non-capturing group will turn the key part to an optional one. \d* matches zero or more digit characters.

Related

Extract string from a pattern preceded by any length

Character 'e' is not recognized by simple regular expression - why?

Regex in a string

Why is this regex not allowing this text?

C# Regex Validation

Categories

Resources

This should do the trick: \b(?:[DHMS]:|(?<!:))[1-9][0-9]\b Demo So, either match [DHMS]: or a word boundary not preceded by :. Also, [1-9]+\d? looks very suspicious to me, so I replaced it with [1-9][0-9]. Note that in .NET \d is not equivalent to [0-9] because it includes Unicode digits as well.

Seems like you wants something like this, #"^(?:[DHMS]:)?[1-9]\d$" [DHMS] matches a single character from the given list. ? after the non-capturing group will turn the key part to an optional one. \d matches zero or more digit characters.