Extracting digits from string - c#

I'm trying to extract some digits from a string: foo=bar&hash=00000690821388874159\";\n
I tried making a group for the digits, but it always returns an empty string.
string matchString = Regex.Match(textBox1.Text, @"hash=(\d+)\\").Groups[1].Value;
I never use regex, so please tell me what I'm missing here.

There is no literal backslash in your string: the \ is only there to escape the quote, so a regex that requires a backslash after the digits doesn't match. This works:
string matchString = Regex.Match(textBox1.Text, @"hash=(\d+)""").Groups[1].Value;
http://dotnetfiddle.net/2U0lkI
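A minimal sketch of the same idea wrapped in a helper, in case it is useful. The class and method names (HashExtractor, ExtractHash) are hypothetical and not from the original post. It also assumes that only the digits are wanted, so the pattern does not require the trailing quote at all: \d+ simply stops at the first non-digit.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original question.
public static class HashExtractor
{
    // Returns the digits that follow "hash=", or an empty string if there are none.
    public static string ExtractHash(string input)
    {
        // Verbatim string (@"..."): the backslash in \d reaches the regex engine unchanged.
        Match match = Regex.Match(input, @"hash=(\d+)");
        return match.Success ? match.Groups[1].Value : string.Empty;
    }
}
```

Checking Success before reading the group makes the "no match" case explicit, although Groups[1].Value is also an empty string when the match fails.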

Related

C# regex not matching my string

I have a regex string:
string regex =
"\"\\d*\",\"(?<url>\\w|\\d|[().,-–_'])\".*";
And a string I want to match it against:
string line =
"\"4\",\"1800_in_sports\",\"24987709\",\"\",\"1906\",\"20171028152258\"";
When I try to get the url category, or even check for a match, there is no match:
var result = Regex.Match(line, regex);
string output = result.Groups["url"].Value;
If I try Regex.IsMatch(..) it also returns false.
I used http://regexstorm.net/tester to test this and it works there, but, not when I run the code.
In RegexStorm I used the pattern:
"\d{1,3}","(?<url>\w|\d|\n|[().,-–_'])+?"
In a verbatim string, write \d and \w instead of \\d and \\w.
As Dour High Arch mentioned, a verbatim string should be used. Inside a verbatim string (prefixed with @), a literal double quote is written as two double quotes.
Changing string regex to:
string regex =
@"""\d{1,3}"",""(?<url>\w|\d|\n|[().,-–_''])+?""";
Now returns a match.
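One thing worth checking with a pattern like the one above: when a quantifier follows a capturing group, Groups["url"].Value holds only the last repetition, which here would be a single character. The sketch below puts the quantifier inside the group so the whole field is captured. The class and method names are hypothetical, and the character class is an assumption about the intent: it treats the hyphen and the en dash (\u2013) as literal characters rather than as a range.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; the names are not from the original post.
public static class CsvUrlExtractor
{
    // Regex seen by the engine: "\d{1,3}","(?<url>[\w().,\-\u2013']+)"
    // \w already covers letters, digits and the underscore.
    private static readonly Regex Pattern =
        new Regex(@"""\d{1,3}"",""(?<url>[\w().,\-\u2013']+)""");

    // Returns the second quoted field, or an empty string if the line does not match.
    public static string ExtractUrl(string line)
    {
        Match match = Pattern.Match(line);
        return match.Success ? match.Groups["url"].Value : string.Empty;
    }
}
```

With the line from the question, ExtractUrl returns 1800_in_sports.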

what's wrong with this regular expression

I'm doing some experiments with regular expressions and I don't know why the regex doesn't match.
string line is one line from a file. A line which should match is this
["boxusers:settings/user[boxuser11]/name"] = "username",
The number of the boxuser and the value could be different, so I tried to find a regular expression
My code is this:
string user;
string patternUser = "[\"boxusers:settings/user[boxuser\\d{2,}]/name\"] = \"";
if (Regex.Match(line,patternUser).Success)
user = Regex.Replace(Regex.Replace(line, patternUser, String.Empty), ",*", String.Empty);
So I think \d{2,} should match a number with two or more digits, and the rest should be matched literally. But the regex just doesn't match.
What's going wrong?
Square brackets have a special significance in regular expressions. You need to escape them with a backslash.
var line = @"[""boxusers:settings/user[boxuser11]/name""] = ""username"", ";
string patternUser = @"\[""boxusers:settings/user\[boxuser\d{2,}\]/name""\] = """;
Console.WriteLine(Regex.Match(line, patternUser).Success);
If you don't want to use verbatim strings, you'll need to use two backslashes to escape each regex metacharacter (the first to escape the second).
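Once the brackets are escaped, the value could also be read with a capture group instead of two nested Regex.Replace calls. This is only a sketch under that assumption; BoxUserParser, ExtractValue and the group name value are hypothetical names introduced here.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; the names are not from the original post.
public static class BoxUserParser
{
    // Regex seen by the engine:
    // \["boxusers:settings/user\[boxuser\d{2,}\]/name"\] = "(?<value>[^"]*)"
    private static readonly Regex Pattern = new Regex(
        @"\[""boxusers:settings/user\[boxuser\d{2,}\]/name""\] = ""(?<value>[^""]*)""");

    // Returns the quoted value on the right-hand side, or an empty string if the line does not match.
    public static string ExtractValue(string line)
    {
        Match match = Pattern.Match(line);
        return match.Success ? match.Groups["value"].Value : string.Empty;
    }
}
```

For the sample line from the question, ExtractValue returns username.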

Regex.IsMatch is not working when text including "$"

Regex.IsMatch method returns the wrong result while checking the following condition,
string text = "$0.00";
Regex compareValue = new Regex(text);
bool result = compareValue.IsMatch(text);
The above code returns False. Please let me know if I missed anything.
The Regex class has a special method for escaping characters in a pattern: Regex.Escape()
Change your code like this:
string text = "$0.00";
Regex compareValue = new Regex(Regex.Escape(text)); // Escape characters in text
bool result = compareValue.IsMatch(text);
"$" is a special character in C# regex. Escape it first.
Regex compareValue = new Regex(@"\$0\.00");
bool result = compareValue.IsMatch("$0.00");
Regex expressions: https://msdn.microsoft.com/en-us/library/az24scfc(v=vs.110).aspx
Both '.' and '$' are special characters, so you need to escape them if you want to match the character itself. '.' matches any character and '$' matches the end of a string.
see: https://regex101.com/r/pK2uY6/1
You have to escape $ since it is a special (reserved) character which means "end of string". In case . means just dot (say, decimal separator) you have to escape it as well (when not escaped, . means "any symbol"):
string pattern = @"\$0\.00";
bool result = Regex.IsMatch(text, pattern);
As for your original pattern, it has no chance to match any string, since $0.00 means
$ end of string, followed by
0 zero
. any character
0 zero
0 zero
but end of string can't be followed by...
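The escaping advice in the answers above can be sketched as a small helper. LiteralMatcher and ContainsLiteral are hypothetical names; Regex.Escape and Regex.IsMatch are the real .NET methods already mentioned in this thread.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; the names are not from the original post.
public static class LiteralMatcher
{
    // True if input contains literalText, with every regex metacharacter
    // in literalText treated as an ordinary character.
    public static bool ContainsLiteral(string input, string literalText)
    {
        // For "$0.00" the escaped pattern is \$0\.00
        string pattern = Regex.Escape(literalText);
        return Regex.IsMatch(input, pattern);
    }
}
```

ContainsLiteral("$0.00", "$0.00") is true, while the unescaped Regex.IsMatch("$0.00", "$0.00") is false for the reason given above.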

C# Regex for retrieving capital string in quotation mark

Given a string, I want to retrieve a string that is in between the quotation marks, and that is fully capitalized.
For example, if a string of
oqr"awr"q q"ASRQ" asd "qIKQWIR"
has been entered, the regex would only evaluate "ASRQ" as matching string.
What is the best way to approach this?
Edit: Forgot to mention the string takes a numeric input as well I.E: "IO8917AS" is a valid input
EDIT: If you actually want "one or more characters, and none of the characters is a lower-case letter" then you probably want:
Regex regex = new Regex("\"\\P{Ll}+\"");
That will then allow digits as well... and punctuation. If you want to allow digits and upper case letters but nothing else, you can use:
Regex regex = new Regex("\"[\\p{Lu}\\d]+\"");
Or in verbatim string literal form (makes the quotes more confusing, but the backslashes less so):
Regex regex = new Regex(@"""[\p{Lu}\d]+""");
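If every match is wanted rather than only the first, the same pattern can be used with Regex.Matches. This is a sketch only: QuotedUpperFinder and FindAll are hypothetical names, and the capture group that drops the surrounding quotes is an addition for illustration.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; the names are not from the original post.
public static class QuotedUpperFinder
{
    // Returns the content of every quoted run made up only of
    // upper-case letters and digits, without the surrounding quotes.
    public static List<string> FindAll(string text)
    {
        var results = new List<string>();
        // Regex seen by the engine: "([\p{Lu}\d]+)"
        foreach (Match match in Regex.Matches(text, @"""([\p{Lu}\d]+)"""))
        {
            results.Add(match.Groups[1].Value);
        }
        return results;
    }
}
```

Note that a quoted string made of digits only would also be accepted by this pattern.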
Original answer (before digits were required)
Sounds like you just want (within the pattern)
"[A-Z]*"
So something like:
Regex regex = new Regex("\"[A-Z]*\"");
Or for full Unicode support, use the Lu Unicode character category:
Regex regex = new Regex("\"\\p{Lu}*\"");
EDIT: As noted, if you don't want to match an empty string in quotes (which is still "a string where everything is upper case") then use + instead of *, e.g.
Regex regex = new Regex("\"\\p{Lu}+\"");
Short but complete example of finding and displaying the first match:
using System;
using System.Text.RegularExpressions;
class Program
{
public static void Main()
{
Regex regex = new Regex("\"\\p{Lu}+\"");
string text = "oqr\"awr\"q q\"ASRQ\" asd \"qIKQWIR\"";
Match match = regex.Match(text);
Console.WriteLine(match.Success); // True
Console.WriteLine(match.Value); // "ASRQ"
}
}
Like this:
"\"[A-Z]+\""
The outermost quotes are not part of the regex, they delimit a C# string.
This requires at least one uppercase character between quotes and works for the English language.
Please try the following:
[\w]*"([A-Z0-9]+)"

problem in regular expression

I am having a regular expression
Regex r = new Regex(@"(\s*)([A|B|C|E|G|H|J|K|L|M|N|P|R|S|T|V|Y|X]\d(?!.*[DFIOQU])(?:[A-Z](\s?)\d[A-Z]\d))(\s*)",RegexOptions.IgnoreCase);
and having a string
string test="LJHLJHL HJGJKDGKJ JGJK C1C 1C1 LKJLKJ";
I have to fetch C1C 1C1. This runs fine.
But if I modify the test string to
string test="LJHLJHL HJGJKDGKJ JGJK C1C 1C1 ON";
then it is unable to find the pattern, i.e. C1C 1C1.
Any idea why this expression is failing?
You have a negative look ahead:
(?!.*[DFIOQU])
That matches the "O" in "ON" and since it is a negative look ahead, the whole pattern fails. And, as an aside, I think you want to replace this:
[A|B|C|E|G|H|J|K|L|M|N|P|R|S|T|V|Y|X]
With this:
[A-CEGHJ-NPR-TVYX]
A pipe (|) is a literal character inside a character class, not an alternation, and you can use ranges to help highlight the characters that you're leaving out.
A single regex might not be the best way to parse that string. Or perhaps you just need a looser regex.
With the negative lookahead (?!.*[DFIOQU]) you are asserting that none of D, F, I, O, Q or U appears anywhere after the current position.
In your second string there is an O at the end, in ON, so the match fails.
If you remove the .* from the negative lookahead, it only checks the character that directly follows instead of the rest of the string (is this what you want?).
\s*([ABCEGHJKLMNPRSTVYX]\d(?![DFIOQU])(?:[A-Z]\s?\d[A-Z]\d))\s*
Then it works, see it here on Regexr. It now checks that the character directly after the digit is not one of those in the class; I don't know if this is intended.
By the way, I removed the | from your first character class, since it's not needed, and also some parentheses around your whitespace, which are not needed either.
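A short sketch of the shortened lookahead applied to both test strings from the question. CodeFinder and Find are hypothetical names; the pattern is the one given above with RegexOptions.IgnoreCase kept from the question. Whether checking only the next character is the intended rule remains an open assumption.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; the names are not from the original post.
public static class CodeFinder
{
    // The lookahead (?![DFIOQU]) inspects only the character right after the first digit.
    private static readonly Regex Pattern = new Regex(
        @"\s*([ABCEGHJKLMNPRSTVYX]\d(?![DFIOQU])(?:[A-Z]\s?\d[A-Z]\d))\s*",
        RegexOptions.IgnoreCase);

    // Returns the first matching code (group 1), or an empty string if there is none.
    public static string Find(string input)
    {
        Match match = Pattern.Match(input);
        return match.Success ? match.Groups[1].Value : string.Empty;
    }
}
```

Both "LJHLJHL HJGJKDGKJ JGJK C1C 1C1 LKJLKJ" and "LJHLJHL HJGJKDGKJ JGJK C1C 1C1 ON" yield C1C 1C1 with this version.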
As I understand it, you need to find the C1C 1C1 text in your string.
I used this regex to do this:
string strRegex = @"^.*(?<c1c>C1C)\s*(?<c1c2>1C1).*$";
After that you can extract the text from the named groups:
string strRegex = @"^.*(?<c1c>C1C)\s*(?<c1c2>1C1).*$";
RegexOptions myRegexOptions = RegexOptions.Multiline;
Regex myRegex = new Regex(strRegex, myRegexOptions);
string strTargetString = @"LJHLJHL HJGJKDGKJ JGJK C1C 1C1 LKJLKJ";
string secondStr = "LJHLJHL HJGJKDGKJ JGJK C1C 1C1 ON";
Match match = myRegex.Match(strTargetString);
string c1c = match.Groups["c1c"].Value;
string c1c2 = match.Groups["c1c2"].Value;
Console.WriteLine(c1c + " " +c1c2);
