what's wrong with this regular expression - c#

I'm doing some experiments with regular expressions and I don't know why my regex doesn't match.
string line is one line from a file. A line which should match is this:
["boxusers:settings/user[boxuser11]/name"] = "username",
The number of the boxuser and the value could be different, so I tried to find a regular expression
My code is this:
string user;
string patternUser = "[\"boxusers:settings/user[boxuser\\d{2,}]/name\"] = \"";
if (Regex.Match(line,patternUser).Success)
user = Regex.Replace(Regex.Replace(line, patternUser, String.Empty), ",*", String.Empty);
So I think \d{2,} should match a number with two or more digits, and the rest is just literal text. But the regex just doesn't match.
What's going wrong?

Square brackets have a special significance in regular expressions. You need to escape them with a backslash.
var line = @"[""boxusers:settings/user[boxuser11]/name""] = ""username"", ";
string patternUser = @"\[""boxusers:settings/user\[boxuser\d{2,}\]/name""\] = """;
Console.WriteLine(Regex.Match(line, patternUser).Success);
If you don't want to use verbatim strings, you'll need to use two backslashes to escape each regex metacharacter (the first to escape the second).
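As a minimal sketch of the point above, the two literals below should produce the same pattern text. The capture group at the end and all variable names are additions for illustration, not part of the original answer, and the snippet is written as a top-level program (C# 9 or later).

```csharp
using System;
using System.Text.RegularExpressions;

// Verbatim literal: backslashes reach the regex engine unchanged, "" is one quote.
string verbatim = @"\[""boxusers:settings/user\[boxuser\d{2,}\]/name""\] = ""([^""]*)""";

// Regular literal: every regex backslash is doubled and each quote is written as \".
string regular = "\\[\"boxusers:settings/user\\[boxuser\\d{2,}\\]/name\"\\] = \"([^\"]*)\"";

string line = @"[""boxusers:settings/user[boxuser11]/name""] = ""username"",";

// Both literals describe the same pattern text.
bool samePattern = verbatim == regular;

// Group 1 (an assumption added here) captures the quoted value directly.
Match m = Regex.Match(line, verbatim);
string user = m.Success ? m.Groups[1].Value : string.Empty;

Console.WriteLine(samePattern); // True
Console.WriteLine(user);        // username
```

Capturing the value in a group avoids the two nested Regex.Replace calls from the question.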

Related

Replace specific repeating characters from a string

I have a string like "aaa\\\\\\\\test.txt".
How do I replace all the repeating \\ characters by a single \\?
I have tried
pPath = new Regex("\\{2,}").Replace(pPath, Path.DirectorySeparatorChar.ToString());
which matches on http://regexstorm.net/tester but doesn't seem to do the trick in my program.
I'm running this on Windows so the Path.DirectorySeparatorChar is a \\.
Use new Regex(@"\\{2,}") and the rest the same.
You need to leave the backslash escaped in the regular expression itself, so the C# string must contain two backslashes. The two equivalent ways to write that string literal are @"\\{2,}" or "\\\\{2,}".
Both of those literals produce the string \\{2,}, which is the correct regular expression. Your regular expression calls for one backslash character occurring two or more times, and you have to escape the backslash character. At the risk of being pedantic, if you wanted to replace two or more a characters, you would use the regular expression a{2,}, and if you want to replace two or more \ characters, you would use the regular expression \\{2,}, because \\ is the regular expression that matches a single \. Clear as mud? :)
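A short sketch of the difference, assuming a sample path with four consecutive backslashes (the variable names are hypothetical). It uses a literal backslash as the replacement rather than Path.DirectorySeparatorChar so the result does not depend on the platform.

```csharp
using System;
using System.Text.RegularExpressions;

// Verbatim literal, so this string really contains four backslashes.
string path = @"aaa\\\\test.txt";

// Regex \\{2,} : an escaped backslash repeated two or more times.
string collapsed = Regex.Replace(path, @"\\{2,}", @"\");

// The literal "\\{2,}" yields the regex \{2,}, which is just the text "{2,}".
bool brokenPatternMatchesPath = Regex.IsMatch(path, "\\{2,}");
bool brokenPatternMatchesBraces = Regex.IsMatch("{2,}", "\\{2,}");

Console.WriteLine(collapsed);                  // aaa\test.txt
Console.WriteLine(brokenPatternMatchesPath);   // False
Console.WriteLine(brokenPatternMatchesBraces); // True
```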
Not being a demi-god at regex, I would use StringBuilder and do something like this:
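string txt = "";  // the input path goes here
int count = 0;    // length of the current run of backslashes
StringBuilder bldr = new StringBuilder();
foreach (char c in txt)
{
    if (c == '\\')
    {
        count++;
        // keep only the first backslash of each run
        if (count == 1)
        {
            bldr.Append(c);
        }
    }
    else
    {
        count = 0;
        bldr.Append(c);
    }
}
string result = bldr.ToString();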

Removing special characters using Regex in C#

I have one problem in this code. I want to remove all special characters but the square brackets are not getting removed.
string regExp = "[\\\"]";
string tmp = Regex.Replace(str, regExp," ");
string[] strArray = tmp.Split(',');
obj.amcid = db.Execute("select MAX(amcid)+1 from sca_amcmaster");
foreach (string i in strArray)
{
// int myInts = int.Parse(i);
db.Execute(";EXEC insertitems1 @0,@1", i, obj.invoiceno);
}
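Square brackets are metacharacters in regular expressions: they delimit a character class, that is, a list of characters to match. So if you want to match the brackets themselves, you need to escape them with a backslash. To remove each of these characters wherever it appears, put the escaped brackets inside the class:
string regExp = @"[\[\]\\""]";
In a verbatim string the backslashes reach the regex engine unchanged and "" stands for a single double quote, so you simply need to include the backslashes before the square brackets to match them too.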
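If none of them is required in the expression, you can group each one in parentheses and follow it with the character ? (zero or one match):
string regExp = @"(\[)?(\\)?("")?(\])?";
Because every group is optional, this pattern can also match the empty string, so Regex.Replace will insert the replacement at those positions too.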
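One way to strip brackets, backslashes and double quotes in a single pass is a character class with the brackets escaped. The sketch below assumes an input shaped like a bracketed, quoted list; that input and the variable names are hypothetical, since the question does not show the actual string.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input: ["1","2","3"]
string input = @"[""1"",""2"",""3""]";

// Character class containing [, ], \ and " ; the brackets and backslash are escaped.
string cleaned = Regex.Replace(input, @"[\[\]\\""]", string.Empty);

// Splitting on commas now yields the bare items.
string[] parts = cleaned.Split(',');

Console.WriteLine(cleaned);      // 1,2,3
Console.WriteLine(parts.Length); // 3
```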

C# Regex, match but not include the first character before matched string

How can I make this C# Regex to not include the first character before the URL in the matching results:
((?!\").)https?:\/\/twitter\.com\/(?:#!\/)?(\w+)\/status(?:es)?\/(\d+)
This will match:
Xhttps://twitter.com/oppomobileindia/status/798397636780953600
Notice the first X letter.
I want it to match the URLs that start without double quotes. Also not include the first character before the https for those URLs that do not start with double quotes.
An actual example that I use in my code:
var str = "<div id=\"content\">
<p>https://twitter.com/oppomobileindia/status/798397636780953600</p>
<p>\"https://twitter.com/oppomobileindia/status/11111111111111111111</p></div>";
var pattern = @"(?<!""')https?://twitter\.com/(?:#!/)?(\w+)/status(?:es)?/(\d+)";//
var rgx = new Regex(pattern);
var results = rgx.Replace(str, "XXX");
In the above example, only the first URL should be replaced, because the second one has a double quotation mark before the URL. The replacement should also cover exactly the match, without the character before the matched string.
Use a (?<!") negative lookbehind:
var re = @"(?<!"")https?://twitter\.com/(?:#!/)?(\w+)/status(?:es)?/(\d+)";
The (?<!") means that there cannot be a " immediately before the current location.
In C#, you do not need to escape / inside the pattern since regex delimiters are not used when defining the regex.
Note on the C# syntax: if you want to define a " inside a verbatim string literal, double it. In a regular string literal, escape the " and \:
var re = "(?<!\")https?://twitter\\.com/(?:#!/)?(\\w+)/status(?:es)?/(\\d+)";
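A small sketch of the lookbehind applied to the two URLs from the question. The HTML is shortened to a single-line string here, and the variable names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// First URL is bare, second URL is preceded by a double quote.
string html = "<p>https://twitter.com/oppomobileindia/status/798397636780953600</p>" +
              "<p>\"https://twitter.com/oppomobileindia/status/11111111111111111111</p>";

// (?<!") refuses a match whose first character directly follows a double quote.
var re = new Regex(@"(?<!"")https?://twitter\.com/(?:#!/)?(\w+)/status(?:es)?/(\d+)");

int matchCount = re.Matches(html).Count;
string result = re.Replace(html, "XXX");

Console.WriteLine(matchCount); // 1
Console.WriteLine(result);
// <p>XXX</p><p>"https://twitter.com/oppomobileindia/status/11111111111111111111</p>
```

Because a lookbehind consumes no characters, the replacement covers only the URL and nothing before it.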

problem in regular expression

I am having a regular expression
Regex r = new Regex(@"(\s*)([A|B|C|E|G|H|J|K|L|M|N|P|R|S|T|V|Y|X]\d(?!.*[DFIOQU])(?:[A-Z](\s?)\d[A-Z]\d))(\s*)",RegexOptions.IgnoreCase);
and having a string
string test="LJHLJHL HJGJKDGKJ JGJK C1C 1C1 LKJLKJ";
I have to fetch C1C 1C1. This runs fine.
But if I modify the test string to
string test="LJHLJHL HJGJKDGKJ JGJK C1C 1C1 ON";
then it is unable to find the pattern, i.e. C1C 1C1.
Any idea why this expression is failing?
You have a negative look ahead:
(?!.*[DFIOQU])
That matches the "O" in "ON" and since it is a negative look ahead, the whole pattern fails. And, as an aside, I think you want to replace this:
[A|B|C|E|G|H|J|K|L|M|N|P|R|S|T|V|Y|X]
With this:
[A-CEGHJ-NPR-TVYX]
A pipe (|) is a literal character inside a character class, not an alternation, and you can use ranges to help highlight the characters that you're leaving out.
A single regex might not be the best way to parse that string. Or perhaps you just need a looser regex.
Your negative lookahead (?!.*[DFIOQU]) asserts that none of the characters D, F, I, O, Q, U appears anywhere after that position.
In your second string there is an O at the end, in ON, so the match fails.
If you remove the .* from the negative lookahead, it will only check the character directly following and not the rest of the string to the end (is this what you want?).
\s*([ABCEGHJKLMNPRSTVYX]\d(?![DFIOQU])(?:[A-Z]\s?\d[A-Z]\d))\s*
Then it works (tested on Regexr). It now checks that the character directly after the digit is not one of the characters in the class; I don't know if this is intended.
Btw, I removed the | from your first character class, since it's not needed, and also the parentheses around the whitespace, which aren't needed either.
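To see how the scope of the lookahead changes the outcome, this sketch runs both variants against the failing string from the question. The patterns are simplified (no surrounding \s* groups) and the variable names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

string test = "LJHLJHL HJGJKDGKJ JGJK C1C 1C1 ON";

// (?!.*[DFIOQU]) scans the rest of the string, so the O in ON blocks every match.
var wide = new Regex(@"[A-CEGHJ-NPR-TVYX]\d(?!.*[DFIOQU])[A-Z]\s?\d[A-Z]\d",
                     RegexOptions.IgnoreCase);

// (?![DFIOQU]) inspects only the single character right after the digit.
var narrow = new Regex(@"[A-CEGHJ-NPR-TVYX]\d(?![DFIOQU])[A-Z]\s?\d[A-Z]\d",
                       RegexOptions.IgnoreCase);

bool wideMatches = wide.IsMatch(test);
string narrowValue = narrow.Match(test).Value;

Console.WriteLine(wideMatches); // False
Console.WriteLine(narrowValue); // C1C 1C1
```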
As I understood you need to find the C1C 1C1 text in your string
I used this regex to do it:
string strRegex = @"^.*(?<c1c>C1C)\s*(?<c1c2>1C1).*$";
After that you can extract the text from the named groups:
string strRegex = @"^.*(?<c1c>C1C)\s*(?<c1c2>1C1).*$";
RegexOptions myRegexOptions = RegexOptions.Multiline;
Regex myRegex = new Regex(strRegex, myRegexOptions);
string strTargetString = @"LJHLJHL HJGJKDGKJ JGJK C1C 1C1 LKJLKJ";
string secondStr = "LJHLJHL HJGJKDGKJ JGJK C1C 1C1 ON";
Match match = myRegex.Match(strTargetString);
string c1c = match.Groups["c1c"].Value;
string c1c2 = match.Groups["c1c2"].Value;
Console.WriteLine(c1c + " " +c1c2);

Regex to match alphanumeric and spaces

What am I doing wrong here?
string q = "john s!";
string clean = Regex.Replace(q, @"([^a-zA-Z0-9]|^\s)", string.Empty);
// clean == "johns". I want "john s";
Just an FYI:
string clean = Regex.Replace(q, @"[^a-zA-Z0-9\s]", string.Empty);
would actually be better as
string clean = Regex.Replace(q, @"[^\w\s]", string.Empty);
This:
string clean = Regex.Replace(dirty, "[^a-zA-Z0-9\x20]", String.Empty);
\x20 is ascii hex for 'space' character
you can add more individual characters that you want to be allowed.
If you want for example "?" to be ok in the return string add \x3f.
I got it:
string clean = Regex.Replace(q, @"[^a-zA-Z0-9\s]", string.Empty);
Didn't know you could put \s in the brackets
The following regex is for space inclusion in textbox.
Regex r = new Regex("^[a-zA-Z\\s]+");
r.IsMatch(textbox1.Text);
This works fine for me.
I suspect ^ doesn't work the way you think it does outside of a character class.
What you're telling it to do is replace everything that isn't an alphanumeric with an empty string, OR any leading space. I think what you mean to say is that spaces are ok to not replace - try moving the \s into the [] class.
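A quick side-by-side sketch of the two patterns on the string from the question (variable names are hypothetical):

```csharp
using System;
using System.Text.RegularExpressions;

string q = "john s!";

// Original: any non-alphanumeric character (spaces included) OR a leading space.
string original = Regex.Replace(q, @"([^a-zA-Z0-9]|^\s)", string.Empty);

// With \s inside the negated class, whitespace is no longer removed.
string corrected = Regex.Replace(q, @"[^a-zA-Z0-9\s]", string.Empty);

Console.WriteLine(original);  // johns
Console.WriteLine(corrected); // john s
```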
There appear to be two problems:
You're using the ^ outside a [], where it matches the start of the line.
You're not using a * or +, which means you will only match a single character at a time.
I think you want the following regex: @"([^a-zA-Z0-9\s])+"
The regex below keeps spaces and supports letters from keyboards of different cultures:
string input = "78-selim güzel667.,?";
Regex regex = new Regex(@"[^\w\x20]|[\d]");
var result= regex.Replace(input,"");
//selim güzel
The circumflex inside the square brackets means all characters except the subsequent range. You want a circumflex outside of square brackets.
This regex will help you to filter if there is at least one alphanumeric character and zero or more special characters i.e. _ (underscore), \s whitespace, -(hyphen)
string comparer = "string you want to compare";
Regex r = new Regex(@"^([a-zA-Z0-9]+[_\s-]*)+$");
if (!r.IsMatch(comparer))
{
return false;
}
return true;
Create a set using [a-zA-Z0-9]+ for alphanumeric characters, "+" sign (a quantifier) at the end of the set will make sure that there will be at least one alphanumeric character within the comparer.
Create another set [_\s-]* for special characters, "*" quantifier is to validate that there can be special characters within comparer string.
Pack these sets into a capture group ([a-zA-Z0-9]+[_\s-]*)+ to say that the comparer string should occupy these features.
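The behavior described above can be checked with a few sample inputs. The inputs and variable names below are hypothetical, chosen only to exercise each part of the pattern.

```csharp
using System;
using System.Text.RegularExpressions;

var r = new Regex(@"^([a-zA-Z0-9]+[_\s-]*)+$");

// Alphanumeric runs separated by underscore, whitespace or hyphen are accepted.
bool mixedOk = r.IsMatch("john_doe 42");

// The string must start with an alphanumeric character.
bool leadingUnderscore = r.IsMatch("_john");

// Any other punctuation is rejected.
bool withBang = r.IsMatch("john!");

// At least one alphanumeric character is required.
bool empty = r.IsMatch("");

Console.WriteLine(mixedOk);           // True
Console.WriteLine(leadingUnderscore); // False
Console.WriteLine(withBang);          // False
Console.WriteLine(empty);             // False
```

The nested quantifiers in this pattern can backtrack heavily on long inputs that fail near the end, so it is probably best kept for short strings.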
[RegularExpression(@"^[A-Z]+[a-zA-Z""'\s-]*$")]
Above syntax also accepts space
