I tried the following regex, but the - character is not accepted in my input textbox. Please assist!
My code is as follows:
if (Regex.IsMatch(textBox_address.Text, @"^[a-zA-Z0-9#- ]+$"))
Escape the - by replacing it with \-:
^[a-zA-Z0-9#\- ]+$
As you can see in this expression, - inside [...] is used to define a range of characters. To tell the regex parser that your - does not have this meaning, escape it with \.
The same applies if you want a regex that matches only digits and [.
For that, use ^[0-9\[]+$; otherwise some regex engines cannot parse the pattern.
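Here is a minimal C# sketch of the point above, using the pattern from the question. The class and method names (HyphenDemo, IsValidAddress, Parses) are made up for illustration. It assumes the .NET regex engine, which reads "#- " as a range whose end sorts before its start and refuses to build the regex.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HyphenDemo
{
    // Escaped hyphen: treated as a literal '-' inside the character class.
    private const string Escaped = @"^[a-zA-Z0-9#\- ]+$";

    // Pattern from the question: "#- " is parsed as a (reversed) range.
    public const string Unescaped = @"^[a-zA-Z0-9#- ]+$";

    public static bool IsValidAddress(string text)
    {
        return Regex.IsMatch(text, Escaped);
    }

    // Returns false when the regex engine refuses to parse the pattern.
    public static bool Parses(string pattern)
    {
        try
        {
            new Regex(pattern);
            return true;
        }
        catch (ArgumentException)
        {
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(IsValidAddress("12-B Main St #4")); // True
        Console.WriteLine(Parses(Unescaped));                 // False
        Console.WriteLine(Parses(@"^[0-9\[]+$"));             // True
    }
}
```

Putting the hyphen last in the class, as in [a-zA-Z0-9# -], should work as well, because a trailing - cannot start a range.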
I am using this regex to parse a URL from a semicolon-separated string.
\b(?:https?:|http?:|www\.)\S+\b
It is working fine if my input text is in these formats:
"Google;\"https://google.com\""
//output - https://google.com
"Yahoo;\"www.yahoo.com\""
//output - www.yahoo.com
but in this case it returns an incorrect string:
"https://google.com;\"https://google.com\""
//output - https://google.com;\"https://google.com
How can I stop the parsing when I encounter the ';'?
Looking at your examples, I would just match any URL between quotation marks. Something like this:
(?<=")(?:https?:|www\.)[^"]*
Or as others have said, split the input string by the semicolon character using string.Split, and check each string sequentially for your desired match.
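Both suggestions can be sketched in C# roughly as follows. This is only an illustration: the names UrlDemo, FirstQuotedUrl and UrlsBySplitting are hypothetical, and the prefix check used after splitting is my own assumption about what "your desired match" means.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class UrlDemo
{
    // URL that starts right after a double quote and runs to the next quote.
    private static readonly Regex QuotedUrl =
        new Regex(@"(?<="")(?:https?:|www\.)[^""]*");

    // Prefix test applied to each field after splitting on ';' (an assumption).
    private static readonly Regex UrlPrefix =
        new Regex(@"^(?:https?:|www\.)");

    // Returns the first quoted URL, or an empty string when there is none.
    public static string FirstQuotedUrl(string input)
    {
        return QuotedUrl.Match(input).Value;
    }

    // Splits on ';', strips surrounding quotes, keeps fields that look like URLs.
    public static List<string> UrlsBySplitting(string input)
    {
        var urls = new List<string>();
        foreach (string field in input.Split(';'))
        {
            string candidate = field.Trim('"');
            if (UrlPrefix.IsMatch(candidate))
            {
                urls.Add(candidate);
            }
        }
        return urls;
    }

    public static void Main()
    {
        string input = "https://google.com;\"https://google.com\"";
        Console.WriteLine(FirstQuotedUrl(input));                     // https://google.com
        Console.WriteLine(string.Join(", ", UrlsBySplitting(input))); // https://google.com, https://google.com
    }
}
```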
For your example data you might use a positive lookahead (?=) and a positive lookbehind (?<=):
(?<=")(?:https?:|www\.).+?(?=;?\\")
Explanation:
(?<=") Positive lookbehind asserting that what is on the left is a double quote
(?:https?:|www\.) Match either http with an optional s, followed by a colon, or www.
.+? Match any character one or more times, non-greedy
(?=;?\\") Positive lookahead asserting that what follows is an optional ; followed by \"
I would personally modify the regex to look specifically for URLs and make the https:// protocol and the www prefix optional. Using \S+ can be kind of iffy because it grabs every non-whitespace character, whereas a URL allows only a limited set of characters.
Something like this should work great for your particular needs.
(https?:\/{2})?([w]{3}\.)?\w+\.[a-zA-Z]+
This makes the http (or https) protocol, together with the :// that immediately follows it, optional. It then grabs as many letters, digits, and underscores as possible up to the ., followed by a final run of letters to end the match. You can exchange the [a-zA-Z] character set for an explicit set of domains if you'd prefer.
I have a regex for validating a string, but it doesn't accept semicolons. Is it because I have to use some escape sequences? I tested my regex here and it passes, i.e. it allows the semicolon, but it doesn't in my C# app.
EDITED: I have the following regex:
^[A-Za-z0-9]{1}[A-Za-z.&amp;0-9\s\\-]{0,21}$
and tried validating: sar232 trading inc;
The &amp; entity hints at the fact that you have this regular expression inside an XML attribute, and that &amp; gets parsed as a single & symbol before the pattern is sent to the regex engine.
That means, your pattern lacks the semi-colon inside the second character class, and that is why your regex does not match the string you provided.
The solution is simple: add the semi-colon to the 2nd character class:
someattr="^[A-Za-z0-9][;A-Za-z.&amp;0-9\s\\-]{0,21}$"
                       ^
Please also note that the {1} limiting quantifier is redundant, since [A-Za-z0-9] already matches exactly one character from the indicated ranges.
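To see the effect described in this answer, here is a rough C# sketch. It assumes the pattern lives in an XML attribute, as the answer guesses; the rule element, the pattern attribute name and the EntityDemo class are invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class EntityDemo
{
    // Hypothetical config element; only the attribute value matters here.
    private const string Xml =
        @"<rule pattern=""^[A-Za-z0-9][A-Za-z.&amp;0-9\s\\-]{0,21}$"" />";

    // Same pattern with ';' added to the second character class.
    public const string Fixed = @"^[A-Za-z0-9][;A-Za-z.&0-9\s\\-]{0,21}$";

    // The XML parser decodes &amp; to a single & before the regex engine sees it.
    public static string DecodedPattern()
    {
        return XElement.Parse(Xml).Attribute("pattern").Value;
    }

    public static void Main()
    {
        const string input = "sar232 trading inc;";
        Console.WriteLine(DecodedPattern());                       // the class holds & but no ;
        Console.WriteLine(Regex.IsMatch(input, DecodedPattern())); // False
        Console.WriteLine(Regex.IsMatch(input, Fixed));            // True
    }
}
```

An online tester that receives the raw text sees the characters &, a, m, p and ; inside the class, which would explain why the semicolon is accepted there.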
I know this stuff has been talked about a lot, but I'm having a problem trying to match the following...
Example input: "test test 310-315"
I need a regex that recognizes a number followed by a dash and returns 310. How do I include the dash in the regex, though? The final match result would be: "310".
Thanks a lot - kcross
EDIT: Also, how would I do the same thing with the dash preceding the number, while taking into account that the number following the dash could be negative? I didn't think of this when I first wrote the question. For example: "test test 310--315" returns -315 and "test 310-315" returns 315.
Regex regex = new Regex(@"\d+(?=\-)");
\d+ - Looks for one or more digits
(?=\-) - Makes sure it is followed by a dash
The @ just eliminates the need to escape the backslashes to keep the compiler happy.
Also, you may want this instead:
\d+(?=\-\d+)
This checks for one or more digits, followed by a dash, followed by one or more digits, but only matches the first set.
In response to your comment, here's a regex that will check for a number following a -, while accounting for potential negative (-) numbers:
Regex regex = new Regex(@"(?<=\-)\-?\d+");
(?<=\-) - Positive lookbehind which checks that there is a preceding -
\-? - Checks for either zero or one dashes
\d+ - One or more digits
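Put together, the two patterns from this answer could be used roughly like this. It is a small sketch; DashDemo and its method names are placeholders of mine, not part of the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DashDemo
{
    // Digits that are immediately followed by a dash (the dash is not consumed).
    public static string NumberBeforeDash(string input)
    {
        return Regex.Match(input, @"\d+(?=\-)").Value;
    }

    // Optionally signed digits that are immediately preceded by a dash.
    public static string NumberAfterDash(string input)
    {
        return Regex.Match(input, @"(?<=\-)\-?\d+").Value;
    }

    public static void Main()
    {
        Console.WriteLine(NumberBeforeDash("test test 310-315")); // 310
        Console.WriteLine(NumberAfterDash("test test 310--315")); // -315
        Console.WriteLine(NumberAfterDash("test 310-315"));       // 315
    }
}
```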
(?'number'\d+)- will work (no need to escape the dash). In this example, the group containing the single number is the named group 'number'.
If you want to match both groups with an optional sign, try:
@"(?'first'-?\d+)-(?'second'-?\d+)"
To describe it: nothing complicated, just -? to match an optional -, and \d+ to match one or more digits. A literal - matches itself.
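For what it's worth, reading the two named groups back in C# might look like the sketch below. SignedPairDemo and Parse are hypothetical names, and the tuple return type assumes C# 7 or later.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SignedPairDemo
{
    private static readonly Regex Pair =
        new Regex(@"(?'first'-?\d+)-(?'second'-?\d+)");

    // Returns both numbers as text; both are empty when the input does not match.
    public static (string First, string Second) Parse(string input)
    {
        Match m = Pair.Match(input);
        return (m.Groups["first"].Value, m.Groups["second"].Value);
    }

    public static void Main()
    {
        var a = Parse("test test 310--315");
        Console.WriteLine(a.First + " / " + a.Second); // 310 / -315

        var b = Parse("test 310-315");
        Console.WriteLine(b.First + " / " + b.Second); // 310 / 315
    }
}
```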
Here's some documentation that I use:
http://www.mikesdotnetting.com/Article/46/CSharp-Regular-Expressions-Cheat-Sheet
In the comments section of that page, it suggests escaping the dash with '\-'.
Make sure you also escape your escape character \.
Inside a character class, - has a special meaning in regex (it denotes a range), and you remove that meaning with a backslash (\). Because the backslash is itself the escape character in regular C# string literals, each one has to be doubled there, so the pattern \d+\- is written as "\\d+\\-" (or as @"\d+\-" in a verbatim string).
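A short C# sketch of the escaping point: the same pattern written once as a regular string literal and once as a verbatim one. LiteralDemo is just an illustrative name.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LiteralDemo
{
    // Regular literal: every backslash meant for the regex engine is doubled.
    public const string Regular = "\\d+\\-";

    // Verbatim literal: backslashes are written once.
    public const string Verbatim = @"\d+\-";

    public static void Main()
    {
        Console.WriteLine(Regular == Verbatim);                              // True
        Console.WriteLine(Regex.Match("test test 310-315", Verbatim).Value); // 310-
    }
}
```

Note that this pattern consumes the dash as part of the match; a lookahead is needed to leave it out.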
\b\d*(?=\-) You will want to look ahead for the dash.
\b = start at a word boundary
\d = match any decimal digit
* = match the previous element zero or more times
(?=\-) = look ahead for the dash
I am trying to create a regex validation attribute in asp.net mvc to validate that an entered email has the .edu TLD.
I have tried the following but the expression never validates to true...
[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+edu
and
\w.\w@{1,1}\w[.\w]?.edu
Can anyone provide some insight?
This should work for you:
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.+-]+\.edu$
Breakdown since you said you were weak at RegEx:
^ Beginning of string
[a-zA-Z0-9._%+-]+ one or more letters, numbers, dots, underscores, percent-signs, plus-signs or dashes
@ a literal @ sign
[a-zA-Z0-9.+-]+ one or more letters, numbers, dots, plus-signs or dashes
\.edu .edu
$ End of string
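As a quick check of the breakdown above, here is a minimal C# sketch that runs the pattern (with @ as the separator) against a few addresses. EduEmailDemo and IsEduEmail are names I picked, and the sample addresses are invented.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EduEmailDemo
{
    private static readonly Regex EduEmail =
        new Regex(@"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.+-]+\.edu$");

    public static bool IsEduEmail(string email)
    {
        return EduEmail.IsMatch(email);
    }

    public static void Main()
    {
        Console.WriteLine(IsEduEmail("jane.doe@cs.example.edu"));    // True
        Console.WriteLine(IsEduEmail("jane.doe@example.com"));       // False
        Console.WriteLine(IsEduEmail("jane.doe@example.education")); // False
    }
}
```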
If you're using ASP.NET MVC validation attributes, the same pattern is also used for client-side validation, so it has to be valid in JavaScript regex syntax and not just in C# regex syntax. Many constructs are the same in both, but you have to be wary of the differences.
You want your attribute to look like the following:
[RegularExpression(@"([0-9]|[a-z]|[A-Z])+@([0-9]|[a-z]|[A-Z])+\.edu$", ErrorMessage = "text to display to user")]
The reason you include the @ before the string is to make it a verbatim string literal; otherwise, I believe, C# applies its own escape sequences before the pattern is passed to the regex engine.
(a|b|c) matches either an 'a' or 'b' or 'c'. [a-z] matches all characters between a and z, and the similar for capital letters and numerals so, ([0-9]|[a-z]|[A-Z]) matches any alphanumeric character
([0-9]|[a-z]|[A-Z])+ matches 1 or more alphanumeric characters. + in a regular expression means 1 or more of the previous
@ is for the '@' symbol in an email address. If it doesn't work, you might have to escape it, but I don't know of any special meaning for @ in a JavaScript regex.
Let's simplify it more
[RegularExpression(@"\w+@\w+\.edu$", ErrorMessage = "text to display to user")]
\w stands for any alphanumeric character including underscore
Read the regex documentation at https://developer.mozilla.org/en/JavaScript/Guide/Regular_Expressions for more information.
Other combinations are possible, for example this very simple one:
\S+@\S+\.\S+\.edu
Try this:
Regex regex = new Regex(@"^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.(edu)$", RegexOptions.IgnoreCase);
I have a regex I need to match against a path like so: "C:\Documents and Settings\User\My Documents\ScanSnap\382893.pd~". I need a regex that matches all paths except those ending in '~' or '.dat'. The problem I am having is that I don't understand how to match and negate the exact string '.dat' and only at the end of the path. i.e. I don't want to match {d,a,t} elsewhere in the path.
I have built the regex, but need to not match .dat
[\w\s:\.\\]*[^~]$[^\.dat]
[\w\s:\.\\]* This matches all word characters, whitespace, the colon, periods, and backslashes.
[^~]$[^\.dat]$ This causes matches ending in '~' to fail. It seems that I should be able to follow up with a negated match for '.dat', but the match fails in my regex tester.
I think my answer lies in grouping judging from what I've read, would someone point me in the right direction? I should add, I am using a file watching program that allows regex matching, I have only one line to specify the regex.
This entry seems similar: Regex to match multiple strings
You want to use a negative look-ahead:
^((?!\.dat$)[\w\s:\.\\])*$
By the way, your character group ([\w\s:\.\\]) doesn't allow a tilde (~) in it. Did you intend to allow a tilde in the filename if it wasn't at the end? If so:
^((?!~$|\.dat$)[\w\s:\.\\~])*$
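A rough C# check of the second pattern (the one that allows a tilde inside the path) could look like this. LookaheadDemo and Accepts are hypothetical names and the paths are made-up samples.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // Before consuming each character, assert that the rest of the string
    // is not exactly "~" or ".dat".
    private static readonly Regex Filter =
        new Regex(@"^((?!~$|\.dat$)[\w\s:\.\\~])*$");

    public static bool Accepts(string path)
    {
        return Filter.IsMatch(path);
    }

    public static void Main()
    {
        Console.WriteLine(Accepts(@"C:\My Documents\ScanSnap\382893.pdf")); // True
        Console.WriteLine(Accepts(@"C:\My Documents\ScanSnap\382893.pd~")); // False
        Console.WriteLine(Accepts(@"C:\My Documents\ScanSnap\382893.dat")); // False
        Console.WriteLine(Accepts(@"C:\My~Documents\notes.txt"));           // True
    }
}
```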
The following regex:
^.*(?<!\.dat|~)$
matches any string that does NOT end with a '~' or with '.dat'.
^ # the start of the string
.* # gobble up the entire string (without line terminators!)
(?<!\.dat|~) # looking back, there should not be '.dat' or '~'
$ # the end of the string
In plain English: match a string only when looking behind from the end of the string, there is no sub-string '.dat' or '~'.
Edit: the reason why your attempt failed is because a negated character class, [^...] will just negate a single character. A character class always matches a single character. So when you do [^.dat], you're not negating the string ".dat" but you're matching a single character other than '.', 'd', 'a' or 't'.
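The difference between the lookbehind and the negated character class can be sketched in C# as follows. It relies on .NET accepting alternatives of different lengths inside a lookbehind; PathFilterDemo and its method names are mine, and the paths are invented samples.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PathFilterDemo
{
    // Whole string, provided it does not end in ".dat" or "~".
    private static readonly Regex NotTildeOrDat =
        new Regex(@"^.*(?<!\.dat|~)$");

    public static bool ShouldMatch(string path)
    {
        return NotTildeOrDat.IsMatch(path);
    }

    // [^.dat] tests ONE character: anything except '.', 'd', 'a' or 't'.
    public static bool NegatedClassAccepts(string singleChar)
    {
        return Regex.IsMatch(singleChar, @"^[^.dat]$");
    }

    public static void Main()
    {
        Console.WriteLine(ShouldMatch(@"C:\ScanSnap\382893.pdf")); // True
        Console.WriteLine(ShouldMatch(@"C:\ScanSnap\382893.pd~")); // False
        Console.WriteLine(ShouldMatch(@"C:\ScanSnap\382893.dat")); // False
        Console.WriteLine(ShouldMatch(@"C:\data\dat"));            // True
        Console.WriteLine(NegatedClassAccepts("d"));               // False
        Console.WriteLine(NegatedClassAccepts("x"));               // True
    }
}
```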
^((?!\.dat$)[\w\s:\.\\])*$
This is just a comment on an earlier answer suggestion:
. within a character class, [], is a literal . and does not need escaping.
^((?!\.dat$)[\w\s:.\\])*$
I'm sorry to post this as a new solution, but I apparently don't have enough credibility to simply comment on an answer yet.
I believe you are looking for this:
[\w\s:\.\\]*([^~]|[^\.dat])$
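This matches, as before, all word characters, whitespace, colons, periods (.), and backslashes, and is then meant to reject a tilde (~) or '.dat' at the end of the string. Be aware that each negated class tests only a single character, so the final group accepts any last character that is not ~ or is not one of ., d, a, t; a lookaround is needed to reject the literal string '.dat'. You may also want to add a caret (^) at the very beginning if you know that the match should start at the beginning of a line.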
^[\w\s:\.\\]*([^~]|[^\.dat])$