How to check if string has letter in second character with regex - c#

Say I wanted to create an ID number such as 1A45 or 4F01.
What would the regex be to make sure that the string had exactly one letter as the second character?
I am unsure how to check for specific combinations of characters.
What I have so far is:
if(!Regex.IsMatch(txtTrainID.Text, @"^[\w,\d,\w,\w]+$"))
This is obviously completely wrong. I've had trouble finding a decent, simple answer to this anywhere.

If that's the only requirement (and I am sure it's not), use anchors and a character class in the second position as in
^.[A-Za-z]
See a demo on regex101.com.
What you probably mean comes down to:
^\d[a-zA-Z]\d{2}$
The latter means one digit, one of a-zA-Z, followed by two other digits and the end of the string. See another demo on the same site.
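As a minimal sketch of how the second pattern could be used from C#: the class and method names below (TrainId, IsValid) are made up for illustration, and the pattern is the one from the answer above.

```csharp
using System.Text.RegularExpressions;

public static class TrainId
{
    // One digit, one ASCII letter, two more digits, then the end of the string.
    private static readonly Regex Pattern = new Regex(@"^\d[a-zA-Z]\d{2}$");

    // True when the whole ID has the digit-letter-digit-digit shape.
    public static bool IsValid(string id) => Pattern.IsMatch(id);
}
```

With this, TrainId.IsValid("1A45") and TrainId.IsValid("4F01") return true, while "1145" and "A145" are rejected.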

Related

ASP.NET Core RegularExpression attribute - multiple conditions

I have two regexes that should be matched:
"^[a-z0-9\\!#\\$\\^&\\-\\+%\\=_\\(\\)\\{\\}\\<\\>'\";\\:/\\.,~`\\|\\\\]+$"
and
".*(g[o0]+gle).*"
The first one accepts any alphanumeric character (plus a few extras), like helloworld123. The second one should reject any string that contains the word "google" (in different forms, like gooo0gle).
Allowed:
hello
helloworld
helloworld123
Disallowed:
hellogoogle
google
...
I want to use the RegularExpression attribute to match this string. I thought about something like:
[RegularExpression("^[a-z0-9\\!#\\$\\^&\\-\\+%\\=_\\(\\)\\{\\}\\<\\>'\";\\:/\\.,~`\\|\\\\]+$|.*(g[o0]+gle).*")]
But it's not working, since the second part (.*(g[o0]+gle).*) should be negated.
How do I do it right?
Thanks.
You can place your second regex in a negative lookahead, use the first regex as a character set, and combine both into the following regex:
^(?!.*g[o0]+gle)[-a-z0-9!#$^&+%=_(){}<>'";:\/.,~`|]+$
Here, the negative lookahead (?!.*g[o0]+gle) rejects any string that contains google or any variation supported by your regex, and the character set [-a-z0-9!#$^&+%=_(){}<>'";:\/.,~`|]+ matches one or more of the allowed characters.
Also, you don't need to escape most special characters inside a character set, so I have unescaped all of them except /. Always place the hyphen - as either the very first or the very last character in the character set; otherwise, depending on the regex dialect, you may see weird behavior.
Regex Demo
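Here is a rough sketch of how the combined pattern might be wired up in C#. NamePattern and SignupModel are hypothetical names used only for illustration; the pattern is copied from the answer, written as a verbatim string so that only the double quote needs doubling. Note that the combined set, as written in the answer, does not include the literal backslash that the question's first pattern allowed.

```csharp
using System.ComponentModel.DataAnnotations;
using System.Text.RegularExpressions;

public static class NamePattern
{
    // Negative lookahead rejects google-like words; the set lists the allowed characters.
    // In a verbatim string, "" stands for one double quote.
    public const string Value =
        @"^(?!.*g[o0]+gle)[-a-z0-9!#$^&+%=_(){}<>'"";:\/.,~`|]+$";

    public static bool IsAllowed(string input) => Regex.IsMatch(input, Value);
}

// Hypothetical model, shown only to illustrate where the attribute would go.
public class SignupModel
{
    [RegularExpression(NamePattern.Value)]
    public string Name { get; set; } = "";
}
```

NamePattern.IsAllowed("helloworld123") returns true, while "google", "hellogoogle" and "gooo0gle" are rejected.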

C# Using Regex to find words, any tips?

I am studying programming at university, and I have a task for which I specifically need to use Regex.
Basically, I need to write a program that copies text from the first file until it meets the first non-copied word from the second file, or reaches the end of the file. When it finds that word (or reaches the end of the file), it copies text from the second file until it meets the first non-copied word from the first file, or reaches the end of the file. This repeats until both files end. Upper and lower case don't matter.
For example:
File1: You are very beautiful, can you give me your number?
File2: Beautiful is Beyonce, not me.
Result: You are very Beautiful is Beyonce, not me. beautiful, can you give me your number?
Yes, I know, the result is confusing, but that is what I need to produce. Do you have any ideas or tips on how I could write this program?
First off, get yourself a Regex designer that follows .Net Regex rules.
I use: https://rad-software-regular-expression-designer.software.informer.com/
This is how you find words:
\b\w+\b
That is: "a word boundary, followed by at least one word character, followed by a closing boundary". Make sure you use the case-insensitive RegexOption.
Then loop through both strings, using a case-insensitive comparison, to find matching words.
Once done, store the first word in a variable.
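A small sketch of the word-finding step, assuming both files have already been read into strings. WordFinder and its method names are invented for this example; only the \b\w+\b pattern comes from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordFinder
{
    private static readonly Regex Word = new Regex(@"\b\w+\b", RegexOptions.IgnoreCase);

    // All words of the text, in order of appearance.
    public static List<string> Words(string text) =>
        Word.Matches(text).Cast<Match>().Select(m => m.Value).ToList();

    // First word of 'first' that also occurs in 'second', ignoring case.
    // Returns an empty string when the texts share no word.
    public static string FirstCommonWord(string first, string second)
    {
        var seen = new HashSet<string>(Words(second), StringComparer.OrdinalIgnoreCase);
        return Words(first).FirstOrDefault(w => seen.Contains(w)) ?? "";
    }
}
```

For the two example files from the question, FirstCommonWord returns "beautiful".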
Now create another Regex whose match string is a "positive lookahead zero-width assertion" around the first matching word. Don't forget the case-insensitive switch.
If you match the word itself, the word gets replaced. Instead we use a "lookaround", which just returns a zero-width location, so you get two "beautiful"s in the result.
Use Regex's instance method "Replace(left, right)".
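To illustrate only the lookahead step (not the whole assignment), here is a hedged sketch. Splicer and InsertBefore are hypothetical names, and a MatchEvaluator is used instead of a replacement string so that a "$" in the inserted text is not treated as a substitution.

```csharp
using System.Text.RegularExpressions;

public static class Splicer
{
    // Inserts 'insert' plus a space in front of the first occurrence of 'word'.
    // The lookahead matches a zero-width position, so 'word' itself is kept.
    public static string InsertBefore(string text, string word, string insert)
    {
        var at = new Regex(@"(?=\b" + Regex.Escape(word) + @"\b)", RegexOptions.IgnoreCase);
        return at.Replace(text, _ => insert + " ", 1);
    }
}
```

Calling InsertBefore with the first example file, the word "beautiful" and the second example file produces the result given in the question.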
Now, I've also thrown the code together but you're supposed to be doing this yourself.
// So I've hidden the code in this pastebin:
https://pastebin.com/fBAu1zBY
// Only go there if you've not managed to figure this out for yourself!

Parse directories from a string

Firstly, I have spent three hours trying to solve this. Also, please don't suggest not using regex. I appreciate other comments and can easily use other methods, but I am practicing regex as much as possible.
I am using VB.Net
Example string:
"Hello world this is a string C:\Example\Test E:\AnotherExample"
Pattern:
"[A-Z]{1}:.+?[^ ]*"
Works fine. However, what if the directory name contains whitespace? I have tried to match all strings that start with one uppercase letter followed by a colon and then anything else. This needs to match up until a whitespace, one uppercase letter and a colon, and then match the same sequence again.
Hope I have made sense.
How about "[A-Z]{1}:((?![A-Z]{1}:).)*", which should stop before the next drive letter and colon?
That "?!" is a "negative lookaround" or "zero-width negative lookahead" which, according to "Regular expression to match a line that doesn't contain a word?", is the way to get around the lack of inverse matching in regexes.
Not to be too picky, but most filesystems disallow a small number of characters (like <>/\:?"), so a correct pattern for a file path would be more like [A-Z]:\\((?![A-Z]{1}:)[^<>/:?"])*.
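A quick sketch of the first pattern in use. It is written in C# rather than VB.Net (the regex engine is the same), PathExtractor is a made-up name, and the redundant {1} quantifiers are dropped. Because the match runs right up to the next drive letter, each result is trimmed to remove the separating space.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class PathExtractor
{
    // A drive letter and colon, then any characters up to (not including)
    // the next "letter + colon" pair.
    private static readonly Regex DrivePath = new Regex(@"[A-Z]:((?![A-Z]:).)*");

    public static List<string> Extract(string text) =>
        DrivePath.Matches(text).Cast<Match>().Select(m => m.Value.Trim()).ToList();
}
```

For the example string from the question this yields C:\Example\Test and E:\AnotherExample, and a directory such as C:\My Folder\Sub is kept whole.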
The other important point that has been raised is how you expect to parse input like "hello path is c:\folder\file.extension this is not part of the path:P"? This is a problem you commonly run into when you start trying to parse without specifying the allowed range of inputs, or the grammar that a parser accepts. This particular problem seems pretty ad hoc and so I don't really expect you to come up with a grammar or to define how particular messages are encoded. But the next time you approach a parsing problem, see if you can first define what messages are allowed and what they mean (syntax and semantics). I think you'll find that once you've defined the structure of allowed messages, parsing can be almost trivial.

Regular Expression for UK postcodes

I have a list of post codes which should be excluded from my shipping methods.
Suppose I have to exclude the Scilly Isles, the Isle of Man and a few others.
For the above 2 areas valid post codes are IM1-IM9, IM86, IM87, IM89. And if it is IM25 or IM85 it is invalid.
I have written the following expression, but it matches even for IM25 or IM85.
var regex = new Regex("(PO3[0-9]|PO4[0-1]|GY[1-9]|JE[1-5]|IM[1-9]|TR[1-9])");
If I pass IM85 to my expression, it should return false. For IM1-IM9, IM86, IM87 and IM89 it should return true.
The same goes for TR post codes: TR1-TR27 are valid. If I give TR28, it should return false.
I am using '|' to separate multiple patterns. Is that the right way of including multiple patterns in one expression?
What do you expect? What should be matched and what not? And please give an example of the string you want to test.
If you match your pattern against "IM25" it will match because you do allow IM[1-9] in your pattern, so you get a valid partial match. If you want to avoid that (I am not sure what you want to achieve) and want to allow really only a single digit after the first letters, use a "word boundary" \b and specify exactly what you want to allow, something like this:
(PO3[0-9]|PO4[0-1]|GY[1-9]|JE[1-5]|IM([1-9]|8[6-9])|TR([1-9]|1[0-9]|2[0-7]))\b
See it here on Regexr
For the "IM" part, this also allows 6-9 as a second digit when the first digit is 8, and for the "TR" part it allows 10 through 27.
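As a sketch, here is how such a pattern could be checked from C#. The pattern is a variant adjusted to the ranges stated in the question (IM1-IM9, IM86, IM87, IM89 and TR1-TR27); the leading ^ anchor is an assumption, added so the match must start at the beginning of the string, and OutwardCodes / IsListed are hypothetical names.

```csharp
using System.Text.RegularExpressions;

public static class OutwardCodes
{
    // \b after the digits prevents partial matches such as "IM2" inside "IM25".
    private static readonly Regex Listed = new Regex(
        @"^(PO3[0-9]|PO4[01]|GY[1-9]|JE[1-5]|IM([1-9]|8[679])|TR([1-9]|1[0-9]|2[0-7]))\b");

    public static bool IsListed(string postcode) => Listed.IsMatch(postcode);
}
```

IsListed returns true for "IM1", "IM86" and "TR27", and false for "IM25", "IM85" and "TR28".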
Update
It is still not clear what the context of your task is. I assume you have a list of valid postcodes. It would probably be better to extract the post code, or only the first part of it (you could use a regex for that), and check whether it is in the list or not.
The actual validation is on the wikipedia site... Google has the answers ;) http://en.wikipedia.org/wiki/Postcodes_in_the_United_Kingdom#Validation
(GIR 0AA)|(((A[BL]|B[ABDFHLNRSTX]?|C[ABFHMORTVW]|D[ADEGHLNTY]|E[HNX]?|F[KY]|G[LUY]?|H[ADGPRSUX]|I[GMPV]|JE|K[ATWY]|L[ADELNSU]?|M[EKL]?|N[EGNPRW]?|O[LX]|P[AEHLOR]|R[GHM]|S[AEGKLMNOPRSTY]?|T[ADFNQRSW]|UB|W[ADFNRSV]|YO|ZE)[1-9]?[0-9]|((E|N|NW|SE|SW|W)1|EC[1-4]|WC[12])[A-HJKMNPR-Y]|(SW|W)([2-9]|[1-9][0-9])|EC[1-9][0-9]) [0-9][ABD-HJLNP-UW-Z]{2})
I still think you need more clarification. As a huge Regex guy, I would like to point out that multi-digit ranges are better handled on the code side than on the Regex side, just for your sanity. But I personally like to play with Regex in this way. Regex reads one character at a time, so it only recognizes the digits zero through nine, not ten and not twenty-eight. If you want to allow the following:
28 through 347
Then it becomes pretty complicated.
To put it into words, you want to allow:
If two digits, allow 2-9 for the first digit, and:
    If the first digit is a two, then allow 8/9 for the second digit,
    ElseIf the first digit is 3-9, then allow 0-9 for the second digit.
ElseIf three digits, allow 1-3 for the first digit, and:
    If the first digit is a three, then allow 0-4 for the second digit, and:
        If the second digit is a four, then allow 0-7 for the third digit,
        ElseIf the second digit is 0-3, then allow 0-9 for the third digit.
    ElseIf the first digit is 1/2, then allow 0-9 for both the second and third digits.
Then with that, you can write a proper Regex like so, which searches for a word boundary or non-Digit surrounding a 2-pair or 3-pair. With this type of Problem-Solving, you should be able to figure out your Regex issue. Otherwise, let us know more about EXACTLY What you want to Match and NOT Match:
(\b|\D)((2[89]|[3-9][0-9])(\b|\D)|(3(4[0-7]|[0-3][0-9])|[12][0-9][0-9])(\b|\D))
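To make the comparison concrete, here is a sketch that checks the 28 through 347 range both ways. The regex is an anchored variant of the one above (it validates a whole string instead of searching inside text), and RangeCheck with its two methods is a name made up for this example.

```csharp
using System.Text.RegularExpressions;

public static class RangeCheck
{
    // 28-29, 30-99, 100-299, 300-339 and 340-347, spelled out digit by digit.
    private static readonly Regex InRange = new Regex(
        @"^(2[89]|[3-9][0-9]|[12][0-9][0-9]|3(4[0-7]|[0-3][0-9]))$");

    public static bool MatchesByRegex(string digits) => InRange.IsMatch(digits);

    // The "code side" alternative: parse the number and compare it.
    public static bool MatchesByParsing(string digits) =>
        int.TryParse(digits, out int n) && n >= 28 && n <= 347;
}
```

Both methods agree for plain decimal strings such as "27", "28", "347" and "348"; the parsing version is much easier to change when the range changes.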
I have changed my approach.
Instead of going for a regular expression, which is becoming more complex, I am saving all the excluded outward codes of UK post codes.
If a post code contains one of those outward codes, I exclude it from the list.
Outward codes are in this format
XX-YYY
XXX-YYY
XXXX-YYY
In all of the above formats, X represents the outward code of a UK postcode.
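A minimal sketch of this lookup approach. It assumes the outward code is the part before the first space or hyphen, and the codes in the set are placeholder samples, not a real exclusion list; ShippingFilter and IsExcluded are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class ShippingFilter
{
    // Sample values only; the real list would come from configuration or a database.
    public static readonly HashSet<string> ExcludedOutwardCodes =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "IM1", "IM86", "TR21" };

    // Compares the whole outward code, so "IM10 1AA" is not caught by "IM1".
    public static bool IsExcluded(string postcode)
    {
        string outward = postcode.Trim().Split(' ', '-')[0];
        return ExcludedOutwardCodes.Contains(outward);
    }
}
```

Comparing whole outward codes avoids the partial-match problem that the regex version had with IM25 and IM85.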

Regular expression need to identify where sentences don't have a space between them

I need a regular expression to identify all instances where a sentence begins without a space following the previous period.
For example, this is a bad sentence:
I'm sentence one.This is sentence two.
This needs to be fixed as follows:
I'm sentence one. This is sentence two.
It's not simply a case of doing a string replace of '.' with '. ', because there are also a lot of instances where the rest of the sentences in the paragraph have the correct spacing, and this would give those an extra space.
\.(?!\s) will match dots not followed by a space. You probably want exclamation marks and question marks as well though: [\.\!\?](?!\s)
Edit:
If C# supports it, try this: [\.\!\?](?!\s|$). It won't match the punctuation at the end of the string.
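A short sketch of that last pattern applied with Regex.Replace (.NET does support lookaheads). SentenceSpacing and Fix are illustrative names; "$0" in the replacement stands for the matched punctuation mark.

```csharp
using System.Text.RegularExpressions;

public static class SentenceSpacing
{
    // Punctuation that is not followed by whitespace or the end of the string.
    private static readonly Regex MissingSpace = new Regex(@"[.!?](?!\s|$)");

    // Re-emits the matched mark and appends one space.
    public static string Fix(string text) => MissingSpace.Replace(text, "$0 ");
}
```

Fix("I'm sentence one.This is sentence two.") returns "I'm sentence one. This is sentence two.", and text that is already spaced correctly comes back unchanged. It also turns "i.e." into "i. e.", so it is only suitable for simple text.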
You could search for \w\s{1}\.[A-Z] to find a word character, followed by a single space character, followed by a period, followed by a capital letter, to identify these. For a find/replace, find (\w\s{1}\.)([A-Z]) and replace with $1 $2.
I doubt that you can create a regular expression that will work in the general case.
Any regex solution you come up with is going to have some interesting edge cases that you'll have to look at carefully. For example, the abbreviation "i.e." would become "i. e." (i.e., it will have an extra space and, if this parenthetical comment were run through the regex, it would become "i. e. ,").
Also, the proper way to quote text is to include the punctuation inside the quotes, as in "He said it was okay." If you had ["He said it was okay."This is a new sentence.], your regex solution might put a space before the final quote, or might ignore the error altogether.
Those are just two cases that come to mind immediately. There are plenty of others.
Whereas a regular expression will work in a limited set of simple sentences, real written language will quickly show that regular expressions are insufficient to provide a general solution to this problem.
If a sentence ends with, e.g., "...", you probably don't want to change this to ". . .".
I think the previous answers don't consider this case.
Try inserting a space where you find a word followed by a new word starting with an uppercase letter:
Find (\w+[\.!?])([A-Z]'?\w+) and replace with $1 $2.
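The same find/replace, sketched in C# with made-up names (CapitalSplit, Fix). The dot needs no escaping inside the character class, so the class is written as [.!?] here.

```csharp
using System.Text.RegularExpressions;

public static class CapitalSplit
{
    // Group 1: a word plus its closing punctuation.
    // Group 2: the next word, which must start with an uppercase letter.
    private static readonly Regex Joined = new Regex(@"(\w+[.!?])([A-Z]'?\w+)");

    public static string Fix(string text) => Joined.Replace(text, "$1 $2");
}
```

This fixes "I'm sentence one.This is sentence two." and leaves "i.e. fine" and "Wait...Then" alone, since neither has a word, a single punctuation mark and a capitalized word directly in a row.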
Best website ever: http://www.regular-expressions.info/reference.html
