What's wrong with my Regex expression - c#

I have a written a Regex to check whether the string consists of 9 digits or not as follows but this always returning me false even if I have my string as 123456789.
if (Regex.IsMatch(strSSN, " ^\\d{9}$"))
Can any one tell what's wrong and also if any better one provide me .. What i am achieving is the string should have only numeric data and the length should be =9

Spaces are significant in regular expressions (and of course, you can't match a space character before the start of the string). Remove it and everything should be fine:
if (Regex.IsMatch(strSSN, "^\\d{9}$"))
Also, you generally want to use verbatim strings for regexes in C#, so you don't have to double your backslashes, but this is just a convenience, not the reason for your problem:
if (Regex.IsMatch(strSSN, #"^\d{9}$"))

this should work
^\d{9}$
hope that helps

You have a whitespace between " and ^. It should be:
if (Regex.IsMatch(strSSN, "^\\d{9}$"))

could it be the space at the beginning of your pattern? That seems wierd, a space then a number anchored to the start of the line.
This should give you what you want:
if (Regex.IsMatch(strSSN, "^\\d{9}$"))

Related

Text files - how to programmatically mimic opening in Wordpad and overwriting as plain text [duplicate]

How can I replace lone instances of \n with \r\n (LF alone with CRLF) using a regular expression in C#?
I know to do it using plan String.Replace, like:
myStr.Replace("\n", "\r\n");
myStr.Replace("\r\r\n", "\r\n");
However, this is inelegant, and would destroy any "\r+\r\n" already in the text (although they are not likely to exist).
It might be faster if you use this.
(?<!\r)\n
It basically looks for any \n that is not preceded by a \r. This would most likely be faster, because in the other case, almost every letter matches [^\r], so it would capture that, and then look for the \n after that. In the example I gave, it would only stop when it found a \n, and them look before that to see if it found \r
Will this do?
[^\r]\n
Basically it matches a '\n' that is preceded with a character that is not '\r'.
If you want it to detect lines that start with just a single '\n' as well, then try
([^\r]|$)\n
Which says that it should match a '\n' but only those that is the first character of a line or those that are not preceded with '\r'
There might be special cases to check since you're messing with the definition of lines itself the '$' might not work too well. But I think you should get the idea.
EDIT: credit #Kibbee Using look-ahead s is clearly better since it won't capture the matched preceding character and should help with any edge cases as well. So here's a better regex + the code becomes:
myStr = Regex.Replace(myStr, "(?<!\r)\n", "\r\n");
I was trying to do the code below to a string and it was not working.
myStr.Replace("(?<!\r)\n", "\r\n")
I used Regex.Replace and it worked
Regex.Replace( oldValue, "(?<!\r)\n", "\r\n")
I guess that "myStr" is an object of type String, in that case, this is not regex.
\r and \n are the equivalents for CR and LF.
My best guess is that if you know that you have an \n for EACH line, no matter what, then you first should strip out every \r. Then replace all \n with \r\n.
The answer chakrit gives would also go, but then you need to use regex, but since you don't say what "myStr" is...
Edit:looking at the other examples tells me one thing.. why do the difficult things, when you can do it easy?, Because there is regex, is not the same as "must use" :D
Edit2: A tool is very valuable when fiddling with regex, xpath, and whatnot that gives you strange results, may I point you to: http://www.regexbuddy.com/
myStr.Replace("([^\r])\n", "$1\r\n");
$ may need to be a \
Try this: Replace(Char.ConvertFromUtf32(13), Char.ConvertFromUtf32(10) + Char.ConvertFromUtf32(13))
If I know the line endings must be one of CRLF or LF, something that works for me is
myStr.Replace("\r?\n", "\r\n");
This essentially does the same neslekkiM's answer except it performs only one replace operation on the string rather than two. This is also compatible with Regex engines that don't support negative lookbehinds or backreferences.

regular expression for c# verbatim like strings (processing ""-like escapes)

I'm trying to extract information out of rc-files. In these files, "-chars in strings are escaped by doubling them ("") analog to c# verbatim strings. is ther a way to extract the string?
For example, if I have the following string "this is a ""test""" I would like to obtain this is a ""test"". It also must be non-greedy (very important).
I've tried to use the following regular expression;
"(?<text>[^""]*(""(.|""|[^"])*)*)"
However the performance was awful.
I'v based it on the explanation here: http://ad.hominem.org/log/2005/05/quoted_strings.php
Has anybody any idea to cope with this using a regular expression?
You've got some nested repetition quantifiers there. That can be catastrophic for the performance.
Try something like this:
(?<=")(?:[^"]|"")*(?=")
That can now only consume either two quotes at once... or non-quote characters. The lookbehind and lookahead assert, that the actual match is preceded and followed by a quote.
This also gets you around having to capture anything. Your desired result will simply be the full string you want (without the outer quotes).
I do not assert that the outer quotes are not doubled. Because if they were, there would be no way to distinguish them from an empty string anyway.
This turns out to be a lot simpler than you'd expect. A string literal with escaped quotes looks exactly like a bunch of simple string literals run together:
"Some ""escaped"" quotes"
"Some " + "escaped" + " quotes"
So this is all you need to match it:
(?:"[^"]*")+
You'll have to strip off the leading and trailing quotes in a separate step, but that's not a big deal. You would need a separate step anyway, to unescape the escaped quotes (\" or "").
Don't if this is better or worse than m.buettner's (guessing not - he seems to know his stuff) but I thought I'd throw it out there for critique.
"(([^"]+(""[^"]+"")*)*)"
Try this (?<=^")(.*?"{2}.*?"{2})(?="$)
it will be maybe more faster, than two previous
and without any bugs.
Match a " beginning the string
Multiple times match a non-" or two "
Match a " ending the string
"([^"]|(""))*?"

Need Regex for a text box so that it accepts first letter a character and remaining as digits

Hi all i need a regex that accepts first letter as character and the remaining should be digits.
Even spacing is not allowed too..
Possible cases : a123, abc123, xyz123 and so on ...
Unacceptable : 123abc,1abc12, a 123 and so on..
i tried some think like this but i am not sure this works so can any one help me..
"[A-Z][a-z]\d{0,9}"
^[A-Za-z]+[0-9]+$
matches one or more ASCII letters, followed by one or more ASCII digits. If digits are optional, use [0-9]* instead.
If you want to also allow other letters/digits than just ASCII, use
^\p{L}+\p{D}+$
you probably need this:
"[a-zA-Z]+\d+"
How about this expression [A-Za-z]\w*
[A-Z]|[a-z]{1,}\d{1,}
But as you mentioned, possible cases should be: a123, b321, z4213213, but not abc123. Right?
So regExp shoud be [A-Z]|[a-z]\d{1,} .
I would highly recommend you do not limit yourself to ASCII as most of the other answers for this question do.
Using character classes, which I recommend, you should use:
^[\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Pc}]\d+$
See the reference for ^ and $.

Help writing a regular expression

I asked a very similar question to this one almost a month ago here.
I am trying very hard to understand regular expressions, but not a bit of it makes any sense. SLak's solution in that question worked well, but when I try to use the Regex Helper at http://gskinner.com/RegExr/ it only matches the first comma of -2.2,1.1-6.9,2.3-12.8,2.3 when given the regex ,|(?<!^|,)(?=-)
In other words I can't find a single regex tool that will even help me understand it. Well, enough whining. I'm now trying to re-write this regex so that I can do a Regex.Split() to split up the string 2.2 1.1-6.9,2.3-12.8 2.3 into -2.2, 1.1, -6.9, 2.3, -12.8, and 2.3.
The difference the aforementioned question is that there can now be leading and/or trailing whitespace, and that whitespace can act as a delimiter as can a comma.
I tried using \s|,|(?<!^|,)(?=-) but this doesn't work. I tried using this to split 293.46701,72.238185, but C# just tells me "the input string was not in a correct format". Please note that there is leading and trailing whitespace that SO does not display correctly.
EDIT: Here is the code which is executed, and the variables and values after execution of the code.
If it doesn't have to be Regex, and if it doesn't have to be slow :-) this should do it for you:
var components = "2.2 1.1-6.9,2.3-12.8 2.3".Replace("-", ",-").
Split(new[]{' ', ','},StringSplitOptions.RemoveEmptyEntries);
Components would then contain:[2.2 1.1 -6.9 2.3 -12.8 2.3]
Does it need to be split? You could do Regex.Matches(text, #"\-?[\d]+(\.[\d]+)?").
If you need split, Regex.Split(text, #"[^\d.-]+|(?=-)") should work also.
P.S. I used Regex Hero to test on the fly http://regexhero.net
Unless I'm missing the point entirely (it's Sunday night and I'm tired ;) ) I think you need to concentrate more on matching the things you do want and not the things you don't want.
Regex argsep = new Regex(#"\-?[0-9]+\.?[0-9]*");
string text_to_split = "-2.2 1.1-6.9,2.3-12.8 2.3 293.46701,72.238185";
var tmp3 = argsep.Matches(text_to_split);
This gives you a MatchCollection of each of the values you wanted.
To break that down and try and give you an understanding of what it's saying, split it up into parts:
\-? Matches a literal minus sign (\ denotes literal characters) zero or one time (?)
[0-9]+ Matches any character from 0 to 9, one or more times (+)
\.? Matches a literal full stop, zero or one time (?)
[0-9]* Matches any character from 0 to 9 again, but this time it's zero or more times (*)
You don't need to worry about things like \s (spaces) for this regex, as the things you're actually trying to match are the positive/negative numbers.
Consider using the string split function. String operations are way faster than regular expressions and much simpler to use/understand.
If the "Matches" approach doesnt work you could perhaps hack something in two steps?
Regex RE = new Regex(#"(-?[\d.]+)|,|\s+");
RE.Split(" -2.2,1.1-6.9,2.3-12.8,2.3 ")
.Where(s=>!string.IsNullOrEmpty(s))
Outputs:
-2.2
1.1
-6.9
2.3
-12.8
2.3

Regular Expression to reject special characters other than commas

I am working in asp.net. I am using Regular Expression Validator
Could you please help me in creating a regular expression for not allowing special characters other than comma. Comma has to be allowed.
I checked in regexlib, however I could not find a match. I treid with ^(a-z|A-Z|0-9)*[^#$%^&*()']*$ . When I add other characters as invalid, it does not work.
Also could you please suggest me a place where I can find a good resource of regular expressions? regexlib seems to be big; but any other place which lists very limited but most used examples?
Also, can I create expressions using C# code? Any articles for that?
[\w\s,]+
works fine, as you can see bellow.
RegExr is a great place to test your regular expressions with real time results, it also comes with a very complete list of common expressions.
[] character class \w Matches any word character (alphanumeric & underscore). \s
Matches any whitespace character (spaces, tabs, line breaks). , include comma + is greedy match; which will match the previous 1 or more times.
[\d\w\s,]*
Just a guess
To answer on any articles, I got started here, find it to be an excellent resource:
http://www.regular-expressions.info/
For your current problem, try something like this:
[\w\s,]*
Here's a breakdown:
Match a single character present in the list below «[\w\s,]*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
A word character (letters, digits, etc.) «\w»
A whitespace character (spaces, tabs, line breaks, etc.) «\s»
The character “,” «,»
For a single character that is not a comma, [^,] should work perfectly fine.
You can try [\w\s,] regular expression. This regex will match only alpha-numeric characters and comma. If any other character appears within text, then this wont match.
For your second question regarding regular expression resource, you can goto
http://www.regular-expressions.info/
This website has lot of tutorials on regex, plus it has lot of usefult information.
Also, can I create expressions using
C# code? Any articles for that?
By this, do you mean to say you want to know which class and methods for regular expression execution? Or you want tool that will create regular expression for you?
You can create expressions with C#, something like this usually does the trick:
Regex regex = new Regex(#"^[a-z | 0-9 | /,]*$", RegexOptions.IgnoreCase);
System.Console.Write("Enter Text");
String s = System.Console.ReadLine();
Match match = regex.Match(s);
if (match.Success == true)
{
System.Console.WriteLine("True");
}
else
{
System.Console.WriteLine("False");
}
System.Console.ReadLine();
You need to import the System.Text.RegularExpressions;
The regular expression above, accepts only numbers, letters (both upper and lower case) and the comma.
For a small introduction to Regular Expressions, I think that the book for MCTS 70-536 can be of a big help, I am pretty sure that you can either download it from somewhere or obtain a copy.
I am assuming that you never messed around with regular expressions in C#, hence I provided the code above.
Hope this helps.
Thank you, all..
[\w\s,]* works
Let me go through regular-expressions.info and come back if I need further support.
Let me try the C# code approach and come back if I need further support.
[This forum is awesome. Quality replies so qucik..]
Thanks again
(…) is denoting a grouping and not a character set that’s denoted with […]. So try this:
^[a-zA-Z0-9,]*$
This will only allow alphanumeric characters and the comma.

Categories