Regex only checks first character in string C# [duplicate]

This question already has answers here:
What special characters must be escaped in regular expressions?
(13 answers)
Closed 4 years ago.
Why does the following method only check the first character in the supplied string?
public static bool IsUnicodeSms(string message)
{
    var strMap = new Regex(@"^[@£$¥èéùìòÇØøÅå_ÆæßÉ!""#%&'()*+,-./0123456789:;<=>? ¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà^{}\[~]|€]+$");
    return !strMap.IsMatch(message);
}
So for example the following string returns false: "abcლ" but "ლabc" returns true.

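You have to escape ] as \]. Otherwise the first unescaped ] closes the character class early, the rest of the pattern (|€]+$) becomes a second alternative, and only the first character is tested against the class. Also put the - at the end so it cannot be read as a range:
Change this:
var strMap = new Regex(@"^[@£$¥èéùìòÇØøÅå_ÆæßÉ!""#%&'()*+,-./0123456789:;<=>? ¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà^{}\[~]|€]+$");
To this:
var strMap = new Regex(@"^[@£$¥èéùìòÇØøÅå_ÆæßÉ!""#%&'()*+,./0123456789:;<=>? ¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà^{}\[~\]|€-]+$");
Btw, you can shorten the regex with ranges. Avoid \w for this, because in .NET it matches every Unicode letter and digit (including ლ) by default:
var strMap = new Regex(@"^[@£$¥èéùìòÇØøÅå_ÆæßÉ!""#%&'()*+,./0-9A-Za-z:;<=>? ¡ÄÖÑÜ§¿äöñüà^{}\[~\]|€-]+$");
The ignore-case flag might shorten it a little more, but note that it also accepts letters that are not in the original list (for example À and ç):
var strMap = new Regex(@"(?i)^[@£$¥èéùìòÇøå_Ææß!""#%&'()*+,./0-9a-z:;<=>? ¡§¿äöñüà^{}\[~\]|€-]+$");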

You copied the code from here.
It's very flawed. It needs more escaping. From Regexp Tutorial - Character Classes or Character Sets:
the only special characters or metacharacters inside a character class are the closing bracket (]), the backslash (\), the caret (^), and the hyphen (-)
So, it needs to be:
new Regex(@"^[@£$¥èéùìòÇØøÅå_ÆæßÉ!""#%&'()*+,\-./0123456789:;<=>? ¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà^{}\[~\]|€]+$");
You can of course improve the regex even further, as @Fede demonstrates.
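To see why the original pattern seemed to look only at the first character, here is a small sketch. It assumes a shortened stand-in for the long character list (a-z plus the bracket part) and uses top-level statements (C# 9 or later); only the handling of ] matters.

```csharp
using System;
using System.Text.RegularExpressions;

// Shortened stand-in for the long GSM character list (an assumption made
// for readability); the unescaped ] is the part under test.
var broken = new Regex(@"^[a-z\[~]|€]+$");    // first ] closes the class early
var repaired = new Regex(@"^[a-z\[~\]|€]+$"); // ] escaped, class runs to the last ]

// broken is parsed as:  ^[a-z\[~]   OR   €]+$
bool brokenTail = broken.IsMatch("abcლ");        // "a" alone satisfies the first alternative
bool brokenHead = broken.IsMatch("ლabc");        // neither alternative matches
bool repairedTail = repaired.IsMatch("abcლ");    // ლ is outside the class
bool repairedAll = repaired.IsMatch("abc[~]|€"); // every character is inside the class

Console.WriteLine($"{brokenTail} {brokenHead} {repairedTail} {repairedAll}");
// prints: True False False True
```

Since IsUnicodeSms returns the negated result, the broken pattern gives false for "abcლ" and true for "ლabc", which is exactly the behaviour described in the question.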

Related

Regex not able to parse last group [duplicate]

This question already has answers here:
What is the difference between .*? and .* regular expressions?
(3 answers)
Closed 3 years ago.
This is the test. I expect the last group to be ".png", but this pattern returns "" instead.
var inputStr = @"C:\path\to\dir\[yyyy-MM-dd_HH-mm].png";
var pattern = @"(.*?)\[(.*?)\](.*?)";
var regex = new Regex(pattern);
var match = regex.Match(inputStr);
var thirdGroupValue = match.Groups[3].Value;
// ✓ EXPECTED: ".png"
// ✗ CURRENT: ""
The 1st and 2nd groups work fine.
This is because you made the * in Group 3 lazy:
(.*?)\[(.*?)\](.*?)
                 ^
                here
This means it will match as little as possible. What's the least .* can match? An empty string!
You can learn more about lazy vs greedy here.
You can fix this either by removing the ?, which makes the group greedy, or by putting a $ at the end, which forces the match to run to the end of the string:
(.*?)\[(.*?)\](.*)
or
(.*?)\[(.*?)\](.*?)$
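A quick way to compare the three variants side by side, as a sketch using the input from the question (top-level statements, C# 9 or later; the variable names are only illustrative):

```csharp
using System;
using System.Text.RegularExpressions;

var input = @"C:\path\to\dir\[yyyy-MM-dd_HH-mm].png";

// Lazy third group: stops as soon as the overall match can succeed, i.e. at once.
string lazy = Regex.Match(input, @"(.*?)\[(.*?)\](.*?)").Groups[3].Value;

// Greedy third group: takes everything that is left.
string greedy = Regex.Match(input, @"(.*?)\[(.*?)\](.*)").Groups[3].Value;

// Lazy third group with $: must keep expanding until the end of the string.
string anchored = Regex.Match(input, @"(.*?)\[(.*?)\](.*?)$").Groups[3].Value;

Console.WriteLine($"lazy='{lazy}' greedy='{greedy}' anchored='{anchored}'");
// prints: lazy='' greedy='.png' anchored='.png'
```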

Get value between parentheses [duplicate]

This question already has answers here:
How do I extract text that lies between parentheses (round brackets)?
(19 answers)
Closed 4 years ago.
I need to get all the strings that sit between opening and closing parentheses. An example string is as follows:
[CDATA[[(MyTag),xi(Tag2) ]OT(OurTag3).
The output needs to be an array with MyTag, Tag2, OurTag3, i.e. the strings need to have the parentheses removed.
The code below works but retains the parentheses. How do I adjust the regex pattern to remove the parentheses from the output?
string pattern = @"\(([^)]*)\)";
string MyString = "[CDATA[[(MyTag),xi(Tag2) ]OT(OurTag3)";
Regex re = new Regex(pattern);
foreach (Match match in re.Matches(MyString))
{
    Console.WriteLine(match.Groups[1]); // print the captured group 1
}
You should be able to use the following:
(?<=\().+?(?=\))
(?<=\() - positive lookbehind for (
.+? - non-greedy match for the content (at least one character)
(?=\)) - positive lookahead for )
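Both approaches, the capture group from the question and the lookarounds above, can be checked against the sample string. This is a sketch with illustrative variable names, written with top-level statements (C# 9 or later):

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

var input = "[CDATA[[(MyTag),xi(Tag2) ]OT(OurTag3)";

// Capture group: the parentheses are matched, but only group 1 is kept.
string[] viaGroup = Regex.Matches(input, @"\(([^)]*)\)")
    .Cast<Match>()
    .Select(m => m.Groups[1].Value)
    .ToArray();

// Lookarounds: the parentheses are asserted but never part of the match.
string[] viaLookaround = Regex.Matches(input, @"(?<=\().+?(?=\))")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

Console.WriteLine(string.Join(", ", viaGroup));      // MyTag, Tag2, OurTag3
Console.WriteLine(string.Join(", ", viaLookaround)); // MyTag, Tag2, OurTag3
```

One difference to keep in mind: the capture-group pattern also reports empty parentheses "()" as an empty string, while .+? requires at least one character between them.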

C# Regular Expression For Specific Characters [duplicate]

This question already has answers here:
regex 'literal' meaning in regex
(1 answer)
How to make a regex match case insensitive?
(1 answer)
Closed 4 years ago.
I need to build regex dynamically, so I pass to my method a string of valid characters. Then I use that string to build regex in my method
string valid = "^m><"; // Note 1st char is ^ (special char)
string input = ...; //some string I want to check
Check(valid);
public void Check(string valid)
{
    Regex reg = new Regex("[^" + valid + "]");
    if (reg.Match(input).ToString().Length > 0)
    {
        throw new Exception(...);
    }
}
I want the Match above to throw an exception if it finds any character other than the ones provided in the valid string. But in my case, even if the input contains no characters other than these, the Check method still throws the exception.
What is wrong with this regex?
This resolved it, thanks to everyone for the help:
Regex reg = new Regex("[^" + valid + "]", RegexOptions.IgnoreCase);
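When the list of valid characters comes from a caller, characters such as ], \, ^ or - can still break or change the character class. A minimal sketch of one way to guard against that, assuming two hypothetical helpers (EscapeForCharClass and HasInvalidChar are not part of .NET) and top-level statements (C# 9 or later):

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper: backslash-escapes the characters that are special inside [...].
string EscapeForCharClass(string chars)
{
    var sb = new StringBuilder();
    foreach (char c in chars)
    {
        if (c == '\\' || c == ']' || c == '[' || c == '^' || c == '-')
            sb.Append('\\');
        sb.Append(c);
    }
    return sb.ToString();
}

// Hypothetical helper: true when input contains a character outside the allowed set.
bool HasInvalidChar(string input, string valid)
{
    var reg = new Regex("[^" + EscapeForCharClass(valid) + "]", RegexOptions.IgnoreCase);
    return reg.IsMatch(input);
}

Console.WriteLine(HasInvalidChar("M<>^m", "^m><")); // False: only allowed characters
Console.WriteLine(HasInvalidChar("m<x>", "^m><"));  // True: x is not allowed
Console.WriteLine(HasInvalidChar("a-]b", "ab-]"));  // False: - and ] are treated literally
```

Regex.Escape is not used here because it targets metacharacters outside a character class and leaves ] and - untouched.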

Replace text place holders with Regular Expression [duplicate]

This question already has answers here:
Extract string between braces using RegEx, ie {{content}}
(3 answers)
Closed 6 years ago.
I have a text template that has text variables wrapped with {{ and }}.
I need a regular expression that gives me all the matches, including the {{ and }}.
For example if I have {{FirstName}} in my text I want to get {{FirstName}} back as a match to be able to replace it with the actual variable.
I already found a regular expression that probably gives me what is INSIDE { and }, but I don't know how I can modify it to return what I want.
/\{([^)]+)\}/
This pattern should do the trick:
string str = "{{FirstName}} {{LastName}}";
Regex rgx = new Regex("{{.*?}}");
foreach (var match in rgx.Matches(str))
{
    // {{FirstName}}
    // {{LastName}}
}
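If the goal is to substitute the placeholders rather than just list them, Regex.Replace with a match evaluator does both steps at once. A sketch, assuming the values live in a dictionary (the dictionary, its contents and the template text are made up for illustration) and using top-level statements (C# 9 or later):

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Illustrative data; in real code these would come from your model.
var values = new Dictionary<string, string>
{
    ["FirstName"] = "Ada",
    ["LastName"] = "Lovelace",
};

string template = "Hello {{FirstName}} {{LastName}}, see {{Unknown}}!";

// Braces are escaped explicitly; group 1 holds the name between {{ and }}.
string result = Regex.Replace(template, @"\{\{(.*?)\}\}", m =>
    values.TryGetValue(m.Groups[1].Value, out var value)
        ? value     // known placeholder: substitute its value
        : m.Value); // unknown placeholder: leave the text untouched

Console.WriteLine(result);
// prints: Hello Ada Lovelace, see {{Unknown}}!
```

Leaving unknown placeholders in place is just one possible policy; throwing or substituting an empty string would work the same way inside the evaluator.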
Maybe:
alert(/^\{{2}[\w|\s]+\}{2}$/.test('{{FirstName}}'))
^: In the beginning.
$: In the end.
\{{2}: Character { 2 times.
[\w|\s]+: Word characters, whitespace, or a literal | (inside a character class, | is not alternation), 1 or more times.
\}{2}: Character } 2 times.
UPDATE:
alert(/(^\{{2})?[\w|\s]+(\}{2})?$/.test('FirstName'))

Escaping characters in RegEx pattern string [duplicate]

This question already has answers here:
Escape Special Character in Regex
(3 answers)
Closed 7 years ago.
I'm trying to extract MTQ0ODQ3NjcyNDoxNDQ4NDc2NzI0OjE6LTM4OTc1OTc2MjM4MDc1OTM2NjY6MTQ0ODQ3NjAwMzowOjA6NTQw from the string below.
I am having issues with the \\ (backslash) characters. How do I escape these in C#? Is there any documentation that shows which characters need escaping in regex patterns, and how to escape them?
first_cursor\\&quot;:\\&quot;MTQ0ODQ3NjcyNDoxNDQ4NDc2NzI0OjE6LTM4OTc1OTc2MjM4MDc1OTM2NjY6MTQ0ODQ3NjAwMzowOjA6NTQw\\&quot;
I've tried the following to no avail. I tried to avoid having to escape the backslashes altogether:
MatchCollection matches = Regex.Matches(content, "first_cursor*.quot;([-0-9A-Za-z]+)");
Any help would be much appreciated.
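In a regular (non-verbatim) C# string literal, each backslash the regex should match is written as \\\\: escaped once for C# and once for the regex engine.
You can use the following:
MatchCollection matches = Regex.Matches(content, "first_cursor\\\\{2}&quot;:\\\\{2}&quot;([-0-9A-Za-z]+)");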
I prefer to use verbatim string literals when writing RegEx strings in C#:
string pattern = @"first_cursor\\\\&quot;:\\\\&quot;([-0-9A-Za-z]+)\\\\&quot;";
This prevents you from having to escape the backslashes twice: once for C# and again for the RegEx engine.
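Here is the verbatim-string pattern applied end to end, as a sketch. It assumes the input contains two literal backslashes followed by the text &quot; around the value, as in the question's sample, and uses top-level statements (C# 9 or later):

```csharp
using System;
using System.Text.RegularExpressions;

// Assumed input: in a verbatim string, \\ is two literal backslashes.
string content = @"first_cursor\\&quot;:\\&quot;MTQ0ODQ3NjcyNDoxNDQ4NDc2NzI0OjE6LTM4OTc1OTc2MjM4MDc1OTM2NjY6MTQ0ODQ3NjAwMzowOjA6NTQw\\&quot;";

// In the pattern, each literal backslash is escaped once for the regex engine,
// so \\\\ matches the two backslashes in the input.
string pattern = @"first_cursor\\\\&quot;:\\\\&quot;([-0-9A-Za-z]+)\\\\&quot;";

Match match = Regex.Match(content, pattern);
string cursor = match.Success ? match.Groups[1].Value : "";

Console.WriteLine(cursor);
```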
As an aside, this syntax is also useful when storing paths in strings:
string logFile = @"C:\Temp\mylog.txt";
And even supports multi-line for SQL commands and such:
string query = @"
    SELECT *
    FROM tblStudents
    WHERE FirstName = 'Bobby'
    AND LastName = 'Tables'
";
You can use negative lookbehind to eliminate some of the contenders:
var example = @"first_cursor\\&quot;:\\&quot;MTQ0ODQ3NjcyNDoxNDQ4NDc2NzI0OjE6LTM4OTc1OTc2MjM4MDc1OTM2NjY6MTQ0ODQ3NjAwMzowOjA6NTQw\\&quot;";
var regex = new Regex("(?<!&[-0-9A-Za-z]*)(?<!_[-0-9A-Za-z]*)[-0-9A-Za-z]+");
var matches = regex.Matches(example);
foreach (var match in matches)
{
    if (match.ToString() != "first")
    {
        Console.WriteLine(match);
    }
}
This gives you two matches: one for first and one for the string you are looking for. Iterate over the matches and skip "first"; the remaining match is the value you want.
