Allow double-byte space in Regex

I have a regex in my C# code that checks whether the name the end user entered is valid. My regex rejects double-byte characters such as the double-byte space.
The double-byte space is the character between these quotation marks: “　”.
My regex: @"^[\p{L}\p{M}\p{N}' \.\-]+$".
I have already tried to edit this regex to accept the double-byte space, but I did not reach a meaningful result.
So if anyone can edit this regex to accept the double-byte space, I would be thankful.

You need to replace the literal space with a pattern that matches any Unicode space separator. In .NET regex, that is \p{Zs}, the "Separator, Space" category, which includes the regular space and the ideographic (double-byte) space U+3000:
@"^[\p{L}\p{M}\p{N}\p{Zs}'.-]+$"
See the regex demo.
Note that this pattern does not match a TAB character. If you need to match a TAB too, just add \t to the character class:
@"^[\p{L}\p{M}\p{N}\p{Zs}\t'.-]+$"
Note that you do not need to escape . and - in this regex: . is not a special metacharacter inside square brackets, and - is not special when it is placed at the end of the character class.
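To check the behaviour described above, here is a minimal sketch. The class and method names (NameValidator, IsValidName) and the sample names are only illustrative assumptions, not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NameValidator
{
    // Letters, marks, numbers, any Zs space separator, apostrophe, dot, hyphen.
    private static readonly Regex NamePattern =
        new Regex(@"^[\p{L}\p{M}\p{N}\p{Zs}'.-]+$");

    // Hypothetical helper: true when the whole input consists of allowed characters.
    public static bool IsValidName(string input) => NamePattern.IsMatch(input);

    public static void Main()
    {
        Console.WriteLine(IsValidName("Anne-Marie O'Neil"));   // True: ASCII space is in Zs
        Console.WriteLine(IsValidName("山田\u3000太郎"));        // True: U+3000 is in Zs
        Console.WriteLine(IsValidName("John\tSmith"));          // False: TAB is not in Zs
    }
}
```

If TAB should be accepted as well, the second pattern from the answer (with \t added to the class) can be dropped in without other changes.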

Related

Update regular expression to only allow single spaces [duplicate]

Right now I have a regex that prevents the user from typing any special characters. The only allowed characters are A through Z, 0 through 9 or spaces.
I want to improve this regex to prevent the following:
No leading/trailing spaces - If the user types one or more spaces before or after the entry, do not allow.
No double-spaces - If the user types the space key more than once in a row, do not allow.
The regex I have right now to prevent special characters appears to work just fine:
^[a-zA-Z0-9 ]+$
Following some other ideas, I tried all these options but they did not work:
^\A\s+[a-zA-Z0-9 ]+$\A\s+
/s*^[a-zA-Z0-9 ]+$/s*
Could I get a helping hand with this code? Again, I just want letters A-Z, numbers 0-9, and no leading or trailing spaces.
Thanks.

You can use the following regex:
^[a-zA-Z0-9]+(?: [a-zA-Z0-9]+)*$
See regex demo.
The regex matches one or more alphanumerics at the start and then zero or more chunks, each consisting of a single space followed by one or more alphanumerics.
As an alternative, here is a regex based on lookaheads (it is less efficient, though):
^(?!.* {2})(?=\S)(?=.*\S$)[a-zA-Z0-9 ]+$
See the regex demo
The (?!.* {2}) lookahead disallows consecutive spaces, (?=\S) requires a non-whitespace character at the start of the string, and (?=.*\S$) requires one at the end.
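Both patterns can be compared side by side with a small sketch like the one below. The class name, method names and sample strings are assumptions made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SingleSpaceDemo
{
    // Alphanumeric chunks separated by exactly one space.
    private static readonly Regex Grouped =
        new Regex(@"^[a-zA-Z0-9]+(?: [a-zA-Z0-9]+)*$");

    // Lookahead-based alternative from the answer.
    private static readonly Regex Lookahead =
        new Regex(@"^(?!.* {2})(?=\S)(?=.*\S$)[a-zA-Z0-9 ]+$");

    public static bool MatchesGrouped(string s) => Grouped.IsMatch(s);
    public static bool MatchesLookahead(string s) => Lookahead.IsMatch(s);

    public static void Main()
    {
        // Only "abc 123" and "abc" are expected to pass both patterns.
        string[] samples = { "abc 123", "abc  123", " abc", "abc ", "abc" };
        foreach (var s in samples)
        {
            Console.WriteLine(
                $"\"{s}\": grouped={MatchesGrouped(s)}, lookahead={MatchesLookahead(s)}");
        }
    }
}
```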

Regex for alpha number string in c# accepting underscore and white spaces

I have already gone through many posts on SO, but I didn't find what I needed for my specific scenario.
I need a regex for an alphanumeric string
where the following conditions should be matched:
Valid string:
ameya123 (alphabets and numbers)
ameya (only alphabets)
AMeya12 (uppercase and lowercase alphabets and numbers)
Ameya_123 (alphabets and underscore and numbers)
Ameya_ 123 (alphabets, underscore and white spaces)
Invalid string:
123 (only numbers)
_ (only underscore)
(only space) (only white spaces)
any special character other than underscore
What I have tried so far:
(?=.*[a-zA-Z])(?=.*[0-9]*[\s]*[_]*)
The above regex works in an online regex editor, but it does not work in a data annotation in C#.
Please suggest.

Based on your requirements and not your attempt, what you are in need of is this:
^(?!(?:\d+|_+| +)$)[\w ]+$
The negative lookahead looks for undesired matches to fail the whole process. Those are strings containing digits only, underscores only or spaces only. If they never happen we want to have a match for ^[\w ]+$ which is nearly the same as ^[a-zA-Z0-9_ ]+$.
See live demo here
Explanation:
^ Start of line / string
(?! Start of negative lookahead
(?: Start of non-capturing group
\d+ Match digits
| Or
_+ Match underscores
| Or
[ ]+ Match spaces
)$ End of non-capturing group immediately followed by end of line / string (none of previous matches should be found)
) End of negative lookahead
[\w ]+$ Match a character inside the character set up to end of input string
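Note: \w is a shorthand for [a-zA-Z0-9_] in engines such as JavaScript and PCRE (unless the u modifier is set). In the .NET regex engine, \w also matches Unicode letters and digits unless RegexOptions.ECMAScript is set.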
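The difference in how \w behaves in .NET can be seen with a short sketch. The class and method names are hypothetical; RegexOptions.ECMAScript is the standard .NET option that restricts \w to ASCII word characters.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordClassDemo
{
    // Default .NET behaviour: \w is Unicode-aware.
    public static bool IsWordDefault(string s) => Regex.IsMatch(s, @"^\w+$");

    // With RegexOptions.ECMAScript, \w is limited to [a-zA-Z_0-9].
    public static bool IsWordEcmaScript(string s) =>
        Regex.IsMatch(s, @"^\w+$", RegexOptions.ECMAScript);

    public static void Main()
    {
        // "am\u00E9ya" contains the non-ASCII letter é (U+00E9).
        foreach (var s in new[] { "ameya_123", "am\u00E9ya" })
        {
            Console.WriteLine(
                $"{s}: default={IsWordDefault(s)}, ecmascript={IsWordEcmaScript(s)}");
        }
    }
}
```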

One problem with your regex is that in annotations, the regex must match and consume the entire string input, while your pattern only contains lookarounds that do not consume any text.
You may use
^(?!\d+$)(?![_\s]+$)[A-Za-z0-9\s_]+$
See the regex demo. Note that \w, when used for server-side validation (and thus parsed with the .NET regex engine), also matches any Unicode letters, digits and some other characters, so I'd rather stick to [A-Za-z0-9_] to be consistent between server- and client-side validation.
Details
^ - start of string (not necessary here, but good to have when debugging)
(?!\d+$) - a negative lookahead that fails the match if the whole string consists of digits
(?![_\s]+$) - a negative lookahead that fails the match if the whole string consists of underscores and/or whitespace. NOTE: if you plan to disallow only inputs like ____ or "    ", you need to split this lookahead into (?!_+$) and (?!\s+$)
[A-Za-z0-9\s_]+ - 1+ ASCII letters, digits, _ and whitespace chars
$ - end of string (not necessary here, but still good to have).
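The point about annotations needing a full-string match can be tried out directly with RegularExpressionAttribute. This is only a sketch: the AnnotationDemo class and its IsValid helper are hypothetical, and it assumes the project can reference System.ComponentModel.DataAnnotations.

```csharp
using System;
using System.ComponentModel.DataAnnotations;

public static class AnnotationDemo
{
    // The question's attempt: lookaheads only, so the match is always empty.
    public const string LookaroundOnly = @"(?=.*[a-zA-Z])(?=.*[0-9]*[\s]*[_]*)";

    // The pattern from this answer: consumes the entire input.
    public const string FullMatch = @"^(?!\d+$)(?![_\s]+$)[A-Za-z0-9\s_]+$";

    // Hypothetical helper: validates a value the way the attribute would.
    public static bool IsValid(string pattern, string value) =>
        new RegularExpressionAttribute(pattern).IsValid(value);

    public static void Main()
    {
        string[] samples = { "ameya123", "Ameya_ 123", "123", "_", "   ", "Ameya#1" };
        foreach (var s in samples)
        {
            Console.WriteLine(
                $"\"{s}\": lookaroundOnly={IsValid(LookaroundOnly, s)}, fullMatch={IsValid(FullMatch, s)}");
        }
    }
}
```

With the lookaround-only pattern, even "ameya123" is reported as invalid, because the attribute requires the match to cover the whole value.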

If I understand your requirements correctly, you need to match one or more letters (uppercase or lowercase), and possibly zero or more of digits, whitespace, or underscore. This implies the following pattern:
^[A-Za-z0-9\s_]*[A-Za-z][A-Za-z0-9\s_]*$
Demo
In the demo, I have replaced \s with \t \r, because \s was matching across all lines.
Unlike the answers given by @revo and @wiktor, I don't have a fancy looking explanation to the regex. I am beautiful even without my makeup on. Honestly, if you don't understand the pattern I gave, you might want to review a good regex tutorial.

This simple RegEx should do it:
[a-zA-Z]+[0-9_ ]*
One or more letters, followed by zero or more digits, underscores and spaces.

This one should be good:
[\w\s_]*[a-zA-Z]+[\w\s_]*

Regex to capture an exact word in a sentence

I'm having some trouble capturing a specific string inside a sentence.
The regex I'm using is \b[0-9]{9,12}\b, to capture numbers that have between 9 and 12 digits. I was using the word boundary to match the exact number, but the problem is that when a number matching this regex is followed by a dot, for example, the regex still matches, which gives me a lot of trouble.
From what I found, the problem is that \b treats some special characters as separators too, right? Is there a way to treat, for example, 123456789. as a whole string, so that the regex does not match that example?
Thanks!

The word boundary \b matches between a word character and a non-word character (or the start/end of the string). A digit is a word character, while dots and commas are non-word characters, so \b matches between a digit and a following dot. To make sure a digit sequence adjacent to a dot is not matched, you need to use lookarounds.
You can use
\b(?<!\.)[0-9]{9,12}(?!\.)\b
See the regex demo
The additional subpatterns are the lookbehind (?<!\.) and the lookahead (?!\.), which make sure there is no . immediately before or after the digit sequence.
If you have both . and , as decimal separators, you may want to adjust the pattern to
\b(?<![.,])[0-9]{9,12}(?![.,])\b
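A small sketch of how the adjusted pattern behaves on sample text follows. The NumberExtractor class, the Extract method and the sample sentence are assumptions for illustration only.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class NumberExtractor
{
    // 9 to 12 digits, not preceded or followed by a dot or a comma.
    private static readonly Regex Pattern =
        new Regex(@"\b(?<![.,])[0-9]{9,12}(?![.,])\b");

    // Hypothetical helper: returns every matching digit sequence in the text.
    public static string[] Extract(string text) =>
        Pattern.Matches(text).Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        // The first number is followed by a dot and the second by a comma,
        // so only the third one is expected in the output.
        var found = Extract("Refs: 123456789. and 1234567890, then 987654321 here");
        Console.WriteLine(string.Join(" | ", found));
    }
}
```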

Regex to match comma separated string with no comma at the end of the line

I am trying to write a regex that will allow input of all characters on the keyboard (even space) but will reject a comma at the end of the line. I have tried the following, which includes all the possible characters, but it still does not give me the correct output:
[RegularExpression("^([a-zA-Z0-9\t\n ./<>?;:\"'!@#$%^&*()[]{}_+=|\\-]+,)*[a-zA-Z0-9\t\n ./<>?;:\"'!@#$%^&*()[]{}_+=|\\-]+$", ErrorMessage = "Comma is not allowed at the end of {0} ")]

^.*[^,]$
.* matches any characters, so you don't need such a long character class.

^([a-zA-Z0-9\t\n ./<>?;:\"'!@#$%^&*()[]{}_+=|\\-]+,)*[a-zA-Z0-9\t\n ./<>?;:\"'!@#$%^&*()[]{}_+=|\\-]+(?<!,)$
Just add the lookbehind (?<!,) at the end, right before $.

a regex that will allow input of all characters on the keyboard(even space) but will restrict the input of comma at the end of the line.
Mind that you can type much more than what you typed using a keyboard. Basically, you want to allow any character but a comma at the end of the line.
So,
(?!,).(?=\r\n|\z)
This regex checks each line (because of the (?=\r\n|\z) lookahead), and the (?!,) lookahead makes sure the last character (which we match using .) is not a comma. \z is an unambiguous end-of-string anchor.
See regex demo
This will work even on the client side if you replace \z with $, since JavaScript regex does not support \z.
To also get the full line match, you can just add .* at the beginning of the pattern (as we are not using singleline flag, . does not match newline symbols):
.*(?!,).(?=\r\n|\z)
Or, to make it faster, use an atomic group or the inline multiline option with the ^ start-of-line anchor (neither will work on the client side):
(?>.*)(?!,).(?=\r\n|\z)
(?m)^.*?(?!,).(?=\r\n|\z) // The fastest of the last three
See demo
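To see which lines the multiline variant picks up, here is a minimal sketch. The class name, the method name and the three-line sample input are assumptions for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TrailingCommaDemo
{
    // Multiline variant from the answer: matches a whole line whose
    // last character (before \r\n or the end of the string) is not a comma.
    private static readonly Regex LinePattern =
        new Regex(@"(?m)^.*?(?!,).(?=\r\n|\z)");

    // Hypothetical helper: returns the lines that do not end with a comma.
    public static string[] LinesWithoutTrailingComma(string text) =>
        LinePattern.Matches(text).Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        // Three CRLF-separated lines; the second one ends with a comma.
        var lines = LinesWithoutTrailingComma("a,b,c\r\nd,e,\r\nf");
        foreach (var line in lines)
            Console.WriteLine(line);   // prints "a,b,c" and then "f"
    }
}
```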

Regular expressions to extract punctuations/characters

Hi guys, I need a regex that only accepts punctuation/special characters.
I have this so far:
[._^%$#!~@,-]+
but it matches as long as there is at least one punctuation character and still allows any other characters (digits or letters).
I need it to allow only punctuation/special characters.

Try this:
^[\p{S}\p{P}]+$
\p{S} matches any symbol character, and \p{P} matches any punctuation.
Note that your pattern will not match any symbols or punctuation characters that are not present in its list.
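As a quick check of the Unicode-category approach, here is a small sketch. The class and method names are hypothetical and the sample strings are only illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PunctuationDemo
{
    // Whole string must consist of symbol (S) or punctuation (P) characters.
    private static readonly Regex OnlySymbols = new Regex(@"^[\p{S}\p{P}]+$");

    public static bool IsOnlySymbolsOrPunctuation(string s) => OnlySymbols.IsMatch(s);

    public static void Main()
    {
        Console.WriteLine(IsOnlySymbolsOrPunctuation("!@#$%"));  // True
        Console.WriteLine(IsOnlySymbolsOrPunctuation("~^_"));    // True: _ is connector punctuation
        Console.WriteLine(IsOnlySymbolsOrPunctuation("!a#"));    // False: contains a letter
        Console.WriteLine(IsOnlySymbolsOrPunctuation("! ?"));    // False: space is a separator
    }
}
```

Note that the underscore counts as punctuation here, and whitespace does not, which may or may not be what you want.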

Try anchoring the regex to the start and the end of the string (unless you're using Multiline matching), i.e. ^ at the beginning and $ at the end:
^[._^%$#!~@,-]+$
Note: this does not endorse your actual pattern (I can't say whether it matches all the 'special characters' you're talking about), but it will make sure that the entire string must be all 'special'.

You can try something like [^a-zA-Z0-9]*. It does NOT accept letters and digits, which saves writing out every other character besides the ones you typed.

\W
Matches any character that is not a word character (alphanumeric & underscore).
