When the textbox changes, I want to add a space between numeric and alphabetic characters.
For example
34 YT 567 *Allowed*
22 KL 2345 *Allowed*
22KL 2345 *Not Allowed*
22KL2345 *Not Allowed*
22 KL2345 *Not Allowed*
This will fix an incorrect value by inserting spaces where necessary:
var correctedValue = Regex.Replace(
incorrectValue,
"(?<=[0-9])(?=[A-Za-z])|(?<=[A-Za-z])(?=[0-9])",
" ");
You can use the same pattern to detect an incorrect value using Regex.IsMatch if you want to warn the user rather than fix it automatically.
Edit:
Regex.IsMatch(MyTextBox.Text,
"(?<=[0-9])(?=[A-Za-z])|(?<=[A-Za-z])(?=[0-9])|[^a-zA-Z0-9 ]")
will return true if the user inputs a number next to a letter, or inputs any non-alphanumeric (and non-space) character.
If you want to remove non-alphanumeric characters and insert spaces you'll need to do it in two steps; first Regex.Replace with pattern [^a-zA-Z0-9 ], then the Regex.Replace call above.
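The two-step clean-up described above could be sketched roughly as follows. The class and method names (PlateNormalizer, Normalize, NeedsFixing) are made up for illustration; only the two patterns come from the answer.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative, not from the original answer.
public static class PlateNormalizer
{
    // A zero-width position between a digit and a letter, in either order.
    private const string Boundary =
        "(?<=[0-9])(?=[A-Za-z])|(?<=[A-Za-z])(?=[0-9])";

    public static string Normalize(string input)
    {
        // Step 1: drop everything that is not a letter, a digit, or a space.
        string cleaned = Regex.Replace(input, "[^a-zA-Z0-9 ]", "");
        // Step 2: insert a space at every digit/letter boundary.
        return Regex.Replace(cleaned, Boundary, " ");
    }

    // True when the input has a digit next to a letter or contains
    // a character other than letters, digits, and spaces.
    public static bool NeedsFixing(string input) =>
        Regex.IsMatch(input, Boundary + "|[^a-zA-Z0-9 ]");
}
```

With this sketch, PlateNormalizer.Normalize("22KL2345") should give "22 KL 2345", and an already valid value such as "34 YT 567" passes through unchanged.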
You can easily find bad input using Regex.
Regex rgx = new Regex("[0-9][A-Za-z]|[A-Za-z][0-9]");
if (rgx.IsMatch(MyTextBox.Text))
{
    // bad input
}
else
{
    // input was good
}
The regular expression matches a digit followed directly by a letter, or the other way around (a letter followed directly by a digit).
I have some input that is an integer stored as a string that may have 1 or 2 digits. I would like to know if it is possible to come up with a regex pattern and substitution string that allows me to add a 0 at the front of any input that has only one digit.
i.e., I'd like to find pattern and subst such that:
Regex.Replace("1",pattern,subst); // returns "01"
Regex.Replace("31",pattern,subst); // returns "31"
Edit: the question is specific to C# regex. Please do not answer with alternative methods.
Using regex you can use word boundaries around a single digit:
string num = "5";
Regex.Replace(num, @"\b\d\b", "0$&");
//=> 05
num = "31";
Regex.Replace(num, @"\b\d\b", "0$&");
//=> 31
Regex \b\d\b will match a single digit with word boundaries on either side to ensure we're only matching a single digit.
In case the digit can appear in the middle of a word, you can use a lookaround regex like this:
Console.WriteLine(Regex.Replace(num, @"(?<!\d)\d(?!\d)", "0$&"));
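To see how the two variants differ, here is a small sketch that wraps each pattern in a method. ZeroPadder and its method names are hypothetical; the patterns and the "0$&" substitution are the ones from this answer.

```csharp
using System.Text.RegularExpressions;

// Hypothetical wrapper around the two patterns discussed in the answer.
public static class ZeroPadder
{
    // Pads a digit that stands alone as a whole "word".
    public static string PadWithBoundaries(string s) =>
        Regex.Replace(s, @"\b\d\b", "0$&");

    // Pads any digit that has no digit directly before or after it,
    // even when it sits inside a word.
    public static string PadWithLookarounds(string s) =>
        Regex.Replace(s, @"(?<!\d)\d(?!\d)", "0$&");
}
```

Both methods turn "5" into "05" and leave "31" alone. They differ on input such as "a5b": the word-boundary version leaves it unchanged because there is no boundary between a letter and a digit, while the lookaround version produces "a05b".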
I am looking for a pattern for a Social Security Number, which is in the form "###-##-####". I also need the user to be allowed to type "#" as the first character.
How do I add that? I need it for a masked text box mask.
Try this regex:
^(#|\d\d\d-\d\d-\d\d\d\d)$
(Note: this is with the US format: ###-##-####)
The ^ and $ mean the "start" and "end" of the string, so that you can't match items in the middle of your text.
The | says "one or the other". So it will match a #, or the digits.
The following will match
123-45-6789
#
but this won't match
234-3333-14234
#123-45-6789
Make sure that when you type this into C# you use the correct character escaping:
string pattern = @"^(#|\d\d\d-\d\d-\d\d\d\d)$";
All you need is this (\d is any digit, \\d is \d escaped for a regular C# string, and #? means it will accept zero or one #):
#?\\d\\d\\d-\\d\\d-\\d\\d\\d\\d
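The two answers accept different inputs, which a quick sketch makes visible. SsnPatterns and its field names are hypothetical, and the ^ and $ anchors on the second pattern are an assumption added here so that both patterns are compared on whole strings.

```csharp
using System.Text.RegularExpressions;

// Hypothetical holder for the two patterns suggested in the answers.
public static class SsnPatterns
{
    // First answer: either a lone "#" or a complete SSN.
    public static readonly Regex HashOrSsn =
        new Regex(@"^(#|\d\d\d-\d\d-\d\d\d\d)$");

    // Second answer, with anchors added (assumption):
    // a complete SSN with an optional "#" in front.
    public static readonly Regex OptionalHashSsn =
        new Regex(@"^#?\d\d\d-\d\d-\d\d\d\d$");
}
```

HashOrSsn accepts "#" and "123-45-6789" but rejects "#123-45-6789"; OptionalHashSsn accepts "123-45-6789" and "#123-45-6789" but rejects a lone "#". Which one fits depends on how the question is read.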
original question removed
I am looking for a regular expression which will turn a string consisting of special characters, letters and numbers into a string containing only numbers.
There are special cases in which it’s not enough to only replace all non-numeric characters with “” (empty).
1.) Zero in brackets.
If there are only zeros in a bracket (0) these should be removed if it is the first bracket pair. (The second bracket pair containing only zeros should not be removed)
2.) Leading zero.
All leading zero should be removed (ignoring brackets)
Examples for better understanding:
123 (0) 123 would be 123123 (zero removed)
(0) 123 -123 would be 123123 (zero and all other non-numeric characters removed)
2(0) 123 (0) would be 21230 (first zero in brackets removed)
20(0)123023(0) would be 201230230 (first zero in brackets removed)
00(0)1 would be 1 (leading zeros removed)
001(1)(0) would be 110 (leading zeros removed)
0(0)02(0) would be 20 (leading zeros removed)
123(1)3 would be 12313 (characters removed)
You could use a lookbehind to match (0) only if it's not at the beginning of the string, and replace with empty string as you're doing.
(original solution removed)
Updated again to reflect new requirements
Matches leading zeroes, matches (0) only if it's the first parenthesized item, and matches any non-digit characters:
^[0\D]+|(?<=^[^(]*)\(0\)|\D
Note that most regex engines do not support variable-length lookbehinds (i.e., the use of quantifiers like *), so this will only work in a few regex engines -- .NET's being one of them.
^[0\D]+ # zeroes and non-digits at start of string
| # or
(?<=^[^(]*) # preceded by start of string and only non-"(" chars
\(0\) # "(0)"
| # or
\D # non-digit, equivalent to "[^\d]"
(tested at regexhero.net)
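A minimal sketch of applying the single pattern with Regex.Replace, replacing every match with an empty string. The class name DigitsOnlySinglePass is made up; the pattern is the one shown above and relies on .NET's variable-length lookbehind.

```csharp
using System.Text.RegularExpressions;

// Hypothetical wrapper around the single-pattern approach.
public static class DigitsOnlySinglePass
{
    // Leading zeroes/non-digits, a "(0)" that is the first parenthesized
    // item, or any other non-digit character.
    private static readonly Regex Unwanted =
        new Regex(@"^[0\D]+|(?<=^[^(]*)\(0\)|\D");

    public static string Clean(string input) => Unwanted.Replace(input, "");
}
```

For the examples from the question, Clean("2(0) 123 (0)") should return "21230" and Clean("0(0)02(0)") should return "20".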
You've changed and added requirements several times now. For multiple rules like this, you're probably better off coding for them individually. It could become complicated and difficult to debug if one condition matches and causes another condition not to match when it should. For example, in separate steps:
Remove parenthesized items as necessary.
Remove non-digit characters.
Remove leading zeroes.
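The three separate steps listed above could look something like this. It is only a sketch: DigitsOnlyMultiPass is a hypothetical name, and the step 1 pattern is one possible reading of "remove (0) only if it is the first parenthesized item".

```csharp
using System.Text.RegularExpressions;

// Hypothetical multi-pass version of the clean-up.
public static class DigitsOnlyMultiPass
{
    public static string Clean(string input)
    {
        // 1. Remove "(0)" only when no other "(" comes before it,
        //    keeping whatever text preceded it.
        string s = Regex.Replace(input, @"^([^(]*)\(0\)", "$1");
        // 2. Remove every remaining non-digit character.
        s = Regex.Replace(s, @"\D", "");
        // 3. Remove leading zeroes.
        return Regex.Replace(s, "^0+", "");
    }
}
```

Each pass is simple enough to test in isolation, which is the main argument for splitting the rules up.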
But if you absolutely need these three conditions all matched in a single regular expression (not recommended), the pattern given above does that.
Regexes get much, much simpler if you can use multiple passes. I think you could do a first pass to drop your (0) if it's not the first thing in a string, then follow it with stripping out the non-digits:
var noMidStrParenZero = Regex.Replace(text, @"^([^(]+)\(0\)", "$1");
var finalStr = Regex.Replace(noMidStrParenZero, "[^0-9]", "");
Avoids a lot of regex craziness, and it's also self-documenting to an extent.
EDIT: this version should work with your new examples too.
This regex should be pretty near the one you're searching for.
(^[^\d])|([^\d](0[^\d])?)+
(You can replace everything that is caught by an empty string)
EDIT:
Your request evolved, and is now too complex to be handled in a single pass. Assuming there is always a space before a bracket group, you can use these passes (keep this order):
string[] entries = new string[7] {
"800 (0) 123 - 1",
"800 (1) 123",
"(0)321 123",
"1 (0) 1",
"1 (12) (0) 1",
"1 (0) (0) 1",
"(9)156 (1) (0)"
};
foreach (string entry in entries)
{
    var output = Regex.Replace(entry, @"\(0\)\s*\(0\)", "0");
    output = Regex.Replace(output, @"\s\(0\)", "");
    output = Regex.Replace(output, @"[^\d]", "");
System.Console.WriteLine("---");
System.Console.WriteLine(entry);
System.Console.WriteLine(output);
}
(?: # start grouping
^ # start of string
| # OR
^\( # start of string followed by paren
| # OR
\d # a digit
) # end grouping
(0+) # capture any number of zeros
| # OR
([1-9]) # capture any non-zero digit
This works for all of your example strings, but the entire expression does match the ( followed by the zero. You can use Regex.Matches to get the match collection using a global match and then join all of the matched groups into a string to get numbers only (or just remove any non-numbers).
My Regex is removing all numeric (0-9) in my string.
I don't get why all numbers are replaced by _
EDIT: I understand that my "_" regex pattern changes the characters into underscores. But not why numbers!
Can anyone help me out? I only need to remove like all special characters.
See regex here:
string symbolPattern = "[!@#$%^&*()-=+`~{}'|]";
Regex.Replace("input here 12341234" , symbolPattern, "_");
Output: "input here ________"
The problem is that your pattern has a dash in the middle, which creates a range of ASCII characters from ) to =. Here's a breakdown:
): 41
1: 49
=: 61
As you can see, the digits (codes 48 through 57) fall inside the range 41-61, so they're matched and replaced.
You need to place the - at either the beginning or end of the character class for it to be matched literally rather than act as a range:
"[-!@#$%^&*()=+`~{}'|]"
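A short sketch makes the difference observable. SymbolMasker and its member names are hypothetical, and the patterns assume the character class begins with !@# as in a typical symbol list.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper comparing the two character classes.
public static class SymbolMasker
{
    // ")-=" is read as a range (codes 41 to 61), which contains the digits.
    public const string RangePattern = "[!@#$%^&*()-=+`~{}'|]";

    // With the hyphen first, it is a literal "-" and no range is formed.
    public const string LiteralPattern = "[-!@#$%^&*()=+`~{}'|]";

    public static string Mask(string input, string pattern) =>
        Regex.Replace(input, pattern, "_");
}
```

Mask("input here 12341234", SymbolMasker.RangePattern) reproduces the "input here ________" output from the question, while the same call with LiteralPattern leaves the digits untouched.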
You must escape the - because the sequence [)-=] contains the digits:
string symbolPattern = @"[!@#$%^&*()\-=+`~{}'|]";
Move the - to the end of the list so it is seen as a literal:
"[!@#$%^&*()=+`~{}'|-]"
Or, to the front:
"[-!@#$%^&*()=+`~{}'|]"
As it stands, it will match all characters in the range )-=, which includes all numerals.
You need to escape your special characters in your regex. For instance, * is a wildcard match. Look at what some of those special characters mean for your match.
I've not used C#, but typically the "*" character is also a control character that would need escaping.
The following matches a whole line of any characters, although the "^" and "$" are somewhat redundant:
^.*$
This matches any number of "A" characters that appear in a string:
A*
The "Owl" book from O'Reilly is what you really need to research this:
http://shop.oreilly.com/product/9780596528126.do?green=B5B9A1A7-B828-5E41-9D38-70AF661901B8&intcmp=af-mybuy-9780596528126.IP
Guys I hate Regex and I suck at writing.
I have a string that is space separated and contains several codes that I need to pull out. Each code begins with a capital letter and ends with a digit, and is only two characters long.
I'm trying to create an array of strings from the initial string and I can't get the regular expression right.
Here is what I have
String[] test = Regex.Split(originalText, "([a-zA-Z0-9]{2})");
I also tried:
String[] test = Regex.Split(originalText, "([A-Z]{1}[0-9]{1})");
I don't have any experience with Regex as I try to avoid writing them whenever possible.
Anyone have any suggestions?
Example input:
AA2410 F7 A4 Y7 B7 A 0715 0836 E0.M80
I need to pull out F7, A4, B7. E0 should be ignored.
You want to collect the results, not split on them, right?
Regex regexObj = new Regex(@"\b[A-Z][0-9]\b");
MatchCollection allMatchResults = regexObj.Matches(subjectString);
should do this. The \bs are word boundaries, making sure that only entire strings (like A1) are extracted, not substrings (like the A1 in TWA101).
If you also need to exclude "words" with non-word characters in them (like E0.M80 in your comment), you need to define your own word boundary, for example:
Regex regexObj = new Regex(@"(?<=^|\s)[A-Z][0-9](?=\s|$)");
Now A1 only matches when surrounded by whitespace (or start/end-of-string positions).
Explanation:
(?<= # Assert that we can match the following before the current position:
^ # Start of string
| # or
\s # whitespace.
)
[A-Z] # Match an uppercase ASCII letter
[0-9] # Match an ASCII digit
(?= # Assert that we can match the following after the current position:
\s # Whitespace
| # or
$ # end of string.
)
If you also need to find non-ASCII letters/digits, you can use
\p{Lu}\p{N}
instead of [A-Z][0-9]. This finds all uppercase Unicode letters and Unicode digits (like Ä٣), but I guess that's not really what you're after, is it?
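As a rough sketch, the two ASCII patterns from this answer can be run against the sample input from the question to collect the codes into an array. CodeExtractor and its method names are hypothetical.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper that collects matches instead of splitting on them.
public static class CodeExtractor
{
    // Codes delimited by ordinary word boundaries.
    public static string[] ByWordBoundary(string text) =>
        Regex.Matches(text, @"\b[A-Z][0-9]\b")
             .Cast<Match>()
             .Select(m => m.Value)
             .ToArray();

    // Codes delimited by whitespace or the start/end of the string.
    public static string[] ByWhitespace(string text) =>
        Regex.Matches(text, @"(?<=^|\s)[A-Z][0-9](?=\s|$)")
             .Cast<Match>()
             .Select(m => m.Value)
             .ToArray();
}
```

For "AA2410 F7 A4 Y7 B7 A 0715 0836 E0.M80", ByWordBoundary also picks up E0 (the "." counts as a boundary), whereas ByWhitespace returns only F7, A4, Y7 and B7.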
Do you mean that each code looks like "A00"?
Then this is the regex:
"[A-Z][0-9][0-9]"
Very simple... By the way, there's no point writing {1} in a regex. [0-9]{1} means "match exactly one digit", which is exactly like writing [0-9].
Don't give up, simple regexes make perfect sense.
This should be ok:
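String[] all_codes = Regex.Matches(originalText, @"\b[A-Z]\d\b").Cast<Match>().Select(m => m.Value).ToArray();
It gives you an array of all the codes that consist of a capital letter followed by a digit and are delimited by any kind of word boundary (white space etc.). Regex.Split would return the text between the codes instead; Cast and Select need using System.Linq.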