Regex to match all alphanumeric and math operators - c#

I have the simple regex #"[a-zA-Z]" to match all characters a-z in a string but I also need math operators (*, /, +, -). I was reading over the documentation on msdn but I got lost relatively fast due to the math operators being used as other tokens in the regex
This solution works:
#"[A-Za-z\*\+\-\/]"
Thanks for the help and resources everyone.

The correct answer is
#"[A-Za-z*+/-]"
Or #"[A-Za-z-*+/]", or #"[-A-Za-z*+/]", or #"[A-Za-z*\-+/]".
Or, shorten it with a case-insensitive modifier: #"(?i)[A-Z*+/-]" (or use a corresponding RegexOptions.IgnoreCase with #"[A-Z*+/-]" since it seems you are using C#).
Inside a character class, the unescaped hyphen should either be at the start or final position to be treated as a literal, or right after a range or shorthand class. Otherwise, it must be escaped. Also, ] must be escaped if not at the beginning of the character class. Other characters do not have to be escpaed inside a character class.
To test, use an appropriate online regex tester. You need one for .NET, see Regex demo at RegexStorm.

Related

C# Regex to match attributes [duplicate]

To review regular expresions I read this tutorial. Anyways that tutorial mentions that \b matches a word boundary (between \w and \W characters). That tutorial also gives a link where you can install expresso (program that helps when creating regular expressions).
So I have created my regular expressions in expresso and I do inded get a match. Now when I copy the same regex to visual studio I do not get a match. Take a look:
Why am I not getting a match? in the immediate window I am showing the content of variable output. In expresso I do get a match and in visual studio I don't. why?
The C# language and .NET Regular Expressions both have their own distinct set of backslash-escape sequences, but the C# compiler is intercepting the "\b" in your string and converting it into an ASCII backspace character so the RegEx class never sees it. You need to make your string verbatim (prefix with an at-symbol) or double-escape the 'b' so the backslash is passed to RegEx like so:
#"\bCOMPILATION UNIT";
Or
"\\bCOMPILATION UNIT"
I'll say the .NET RegEx documentation does not make this clear. It took me a while to figure this out at first too.
Fun-fact: The \r and \n characters (carriage-return and line-break respectively) and some others are recognized by both RegEx and the C# language, so the end-result is the same, even if the compiled string is different.
You should use #"\bCOMPILATION UNIT". This is a verbatim literal. When you do "\b" instead, it parses \b into a special character. You can also do "\\b", whose double backslash is parsed into a real backslash, but it's generally easier to just use verbatims when dealing with regex.

Extract string from a pattern preceded by any length

I'm looking for a regular expression to extract a string from a file name
eg if filename format is "anythingatallanylength_123_TESTNAME.docx", I'm interested in extracting "TESTNAME" ... probably fixed length of 8. (btw, 123 can be any three digit number)
I think I can use regex match ...
".*_[0-9][0-9][0-9]_[A-Z][A-Z][A-Z][A-Z][A-Z][A-Z][A-Z][A-Z].docx$"
However this matches the whole thing. How can I just get "TESTNAME"?
Thanks
Use parenthesis to match a specific piece of the whole regex.
You can also use the curly braces to specify counts of matching characters, and \d for [0-9].
In C#:
var myRegex = new Regex(#"*._\d{3}_([A-Za-z]{8})\.docx$");
Now "TESTNAME" or whatever your 8 letter piece is will be found in the captures collection of your regex after using it.
Also note, there will be a performance overhead for look-ahead and look-behind, as presented in some other solutions.
You can use a look-behind and a look-ahead to check parts without matching them:
(?<=_[0-9]{3}_)[A-Z]{8}(?=\.docx$)
Note that this is case-sensitive, you may want to use other character classes and/or quantifiers to fit your exact pattern.
In your file name format "anythingatallanylength_123_TESTNAME.docx", the pattern you are trying to match is a string before .docx and the underscore _. Keeping the thing in mind that any _ before doesn't get matched I came up with following solution.
Regex: (?<=_)[A-Za-z]*(?=\.docx$)
Flags used:
g global search
m multi-line search.
Explanation:
(?<=_) checks if there is an underscore before the file name.
(?=\.docx$) checks for extension at the end.
[A-Za-z]* checks the required match.
Regex101 Demo
Thanks to #Lucero #noob #JamesFaix I came up with ...
#"(?<=.*[0-9]{3})[A-Z]{8}(?=.docx$)"
So a look behind (in brackets, starting with ?<=) for anything (ie zero or more any char (denoted by "." ) followed by an underscore, followed by thee numerics, followed by underscore. Thats the end of the look behind. Now to match what I need (eight letters). Finally, the look ahead (in brackets, starting with ?=), which is the .docx
Nice work, fellas. Thunderbirds are go.

Why is this regex not allowing this text?

I have a username validator IsValidUsername, and I am testing "baconman" but it is failing, could someone please help me out with this regex?
if(!Regex.IsMatch(str, #"^[a-zA-Z]\\w+|[0-9][0-9_]*[a-zA-Z]+\\w*$")) {
isValid = false;
}
I want the restrictions to be: (It's very close)
Be between 5 & 17 characters long
contain at least one letter
no spaces
no special characters
You're escaping unnecessarily: if you write your regex as starting with # outside the string, you don't need both \ - just one is fine.
Either:
#"\w"
or
"\\w"
Edit: I didn't make this clear: right now due to the double escaping, you're looking for a \ in your regex and a w. So your match would need [some character]\w to match (example: "a\w" or "a\wwwwww" would match.
Your requirements are best taken care of in normal C#. They don't map well to a regular expression. Just code them up using LINQ which works on strings like it would on an IEnumerable<char>.
Also, understanding a query of a string is much easier than understanding a Regex with the requirements that you have.
It is possible to do everything as part of a Regex, however it is not pretty :-)
^(\w(?=\w*[a-zA-Z])|[a-zA-Z]|\w(?<=[a-zA-Z]\w*)){5,17}$
It does 3 checks that always results in 1 character being matched (so we can perform the length check in the end)
Either the character is any word character \w which is before [a-zA-Z]
Or it is [a-zA-Z]
Or it is any word character \w which is after [a-zA-Z]

Simple Regex Question

I am new to regex (15 minutes of experience) so I can't figure this one out. I just want something that will match an alphanumeric string with no spaces in it. For example:
"ThisIsMyName" should match, but
"This Is My Name" should not match.
^[a-zA-Z0-9]+$ will match any letters and any numbers with no spaces (or any punctuation) in the string. It will also require at least one alphanumeric character. This uses a character class for the matching. Breakdown:
^ #Match the beginning of the string
[ #Start of a character class
a-z #The range of lowercase letters
A-Z #The range of uppercase letters
0-9 #The digits 0-9
] #End of the character class
+ #Repeat the previous one or more times
$ #End of string
Further, if you want to "capture" the match so that it can be referenced later, you can surround the regex in parens (a capture group), like so:
^([a-zA-Z0-9]+)$
Even further: since you tagged this with C#, MSDN has a little howto for using regular expressions in .NET. It can be found here. You can also note the fact that if you run the regex with the RegexOptions.IgnoreCase flag then you can simplify it to:
^([a-z0-9])+$
this will match any sequence of non-space characters:
\S+
Take a look at this link for a good basic Regex information source: http://regexlib.com/CheatSheet.aspx
They also have a handy testing tool that I use quite a bit: http://regexlib.com/RETester.aspx
That said, #eldarerathis' or #Nicolas Bottarini's answers should work for you.
I have just written a blog entry about regex, maybe it's something you may find useful:)
http://blogs.appframe.com/erikv/2010-09-23-Regular-Expression
Try using this regex to see if it works: (\w+)

regular expression for special chars and numerics

I need a regular expression for accepting alpha numerics and special charecters also
like : abc & def12
thanks in advance
Nagesh
This might be the syntax you are looking for /^[a-zA-Z0-9&:\/ ]+$/, insert any other characters you want to match between the square brackets.
I would recommend you to read up on regular expressions if you intend to use them in the future, check out this tutorial http://perldoc.perl.org/perlretut.html
^[\w]+$ this regex will match all alphanumerics, if you want to match some other chars as well just specify them in the [] brackets, i.e. if you wan't to also match ampersand you will have ^[\w&]+$ regex, if you wan't to match white characters as well (tabs, spaces, line feeds, carriage returns) you add \d and end up with ^[\w&\s]+$ and so on until you have all your special characters handled.
not got a specific answer, but regexlib.com has come in handy for me.

Categories