Regex - Get digits after a colon - c#

I have a regex:
var topPayMatch = Regex.Match(result, @"(?<=Top Pay)(\D*)(\d+(?:\.\d+)?)", RegexOptions.IgnoreCase);
I then convert the result to an int:
var topPay = Convert.ToInt32(topPayMatch.Groups[2].Value);
The problem: for Top Pay: 1,000,000 it currently grabs only the digits before the first comma, which is 1. I want the whole number, 1000000.
Likewise, for Top Pay: 888,888 I want 888888.
What should I add to my regex?

You can use something as simple as @"(?<=Top Pay: )([0-9,]+)". Note that decimals are ignored by this regex.
It matches the digits and commas that follow Top Pay: , which you can then parse to an integer.
Example:
Regex rgx = new Regex(@"(?<=Top Pay: )([0-9,]+)");
string str = "Top Pay: 1,000,000";
Match match = rgx.Match(str);
if (match.Success)
{
    string val = match.Value;
    // AllowThousands lets int.Parse accept the group separators
    int num = int.Parse(val, System.Globalization.NumberStyles.AllowThousands);
    Console.WriteLine(num);
}
Console.WriteLine("Ended");
Source:
Convert int from string with commas

If you use the lookbehind, you don't need the capture groups, and you can move the \D* into the lookbehind.
To get the values, match 1+ digits followed by optional repetitions of a comma and 1+ digits.
Note that your example data contains commas and no dots, and that ? as a quantifier means 0 or 1 times, so your optional (?:\.\d+)? part never matches and the match stops at the first comma.
(?<=Top Pay\D*)\d+(?:,\d+)*
The pattern matches:
(?<=Top Pay\D*) Positive lookbehind, assert what is to the left is Top Pay and optional non digits
\d+ Match 1+ digits
(?:,\d+)* Optionally repeat a , and 1+ digits
C# demo:
string pattern = @"(?<=Top Pay\D*)\d+(?:,\d+)*";
string input = @"Top Pay: 1,000,000
Top Pay: 888,888";
RegexOptions options = RegexOptions.IgnoreCase;
foreach (Match m in Regex.Matches(input, pattern, options))
{
    var topPayMatch = int.Parse(m.Value, System.Globalization.NumberStyles.AllowThousands);
    Console.WriteLine(topPayMatch);
}
Output
1000000
888888
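One caveat applies to both answers: int.Parse with NumberStyles.AllowThousands uses the current culture's group separator, so on a machine whose culture does not group digits with a comma the parse may fail. Below is a minimal sketch, assuming the input always uses commas as thousands separators, that pins the culture and uses TryParse instead of throwing. The helper name TryGetTopPay is hypothetical and not part of the original code.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper: extracts the number after "Top Pay" and parses it as an int.
static bool TryGetTopPay(string text, out int value)
{
    value = 0;
    Match m = Regex.Match(text, @"(?<=Top Pay\D*)\d+(?:,\d+)*", RegexOptions.IgnoreCase);
    // InvariantCulture uses ',' as the group separator regardless of machine settings.
    return m.Success
        && int.TryParse(m.Value, NumberStyles.AllowThousands, CultureInfo.InvariantCulture, out value);
}

if (TryGetTopPay("Top Pay: 1,000,000", out int pay))
{
    Console.WriteLine(pay); // 1000000
}
```

If the text can also contain decimals, the pattern and the target type (for example decimal) would both need to change.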

Related

Regex how can I merge execution

I have the following buffer returning from a .textContent
Latitude
32,6549581304256
Longitude
-16,9288643331225
I fixed the whitespace with
dwText = Regex.Replace(dwText, @"\s{2,}", "\n");
resulting in
Latitude
32,6549581304256
Longitude
-16,9288643331225
I then transformed this new output to my needs with
dwText = Regex.Replace(dwText, @"(Latitude|Longitude)(.*)\n", "$1: ");
resulting in
Latitude: 32,6549581304256
Longitude: -16,9288643331225
My question is: can I do these two replacements in one go?
dwText = Regex.Replace(dwText, @"\s{2,}", "\n");
dwText = Regex.Replace(dwText, @"(Latitude|Longitude)(.*)\n", "$1: ");
I would appreciate some help on how this can be achieved more efficiently, thank you.

Try the following (with the i flag, i.e. RegexOptions.IgnoreCase):
[\S\s]*?([a-z]+)[\S\s]*?([-\d,]+)[\S\s]*?
Replacement: $1: $2\n
Explanation
[\S\s]*? - matches anything lazily.
[a-z]+ (first capture group) - matches alphabetical words, case insensitive.
[-\d,]+ (second capture group) - matches digits, - (hyphen) and , (comma).

You can match the whitespace chars around Latitude and Longitude, capture the label and the value in 2 groups, and use those 2 groups in the replacement.
\s*\b(Latitude|Longitude)\s*(-?[0-9]+(?:,[0-9]+)?)\b
Explanation
\s* Match 0+ whitespace chars
\b(Latitude|Longitude) A word boundary, capture either Latitude or Longitude in group 1
\s* Match 0+ whitespace chars
(-?[0-9]+(?:,[0-9]+)?) Capture group 2, match an optional -, 1+ digits and an optional decimal part (with a comma as the decimal separator)
\b A word boundary
Replace with:
$1: $2\n
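As a minimal sketch of how the pattern and replacement above could be applied in a single Regex.Replace call: the sample buffer below is made up for illustration (the real .textContent value is not shown in the question), and the final TrimEnd is an assumption to drop the trailing newline that the replacement adds.

```csharp
using System;
using System.Text.RegularExpressions;

// Sample buffer with irregular whitespace (invented for illustration).
string dwText = "Latitude   \n   32,6549581304256  \n  Longitude  \n   -16,9288643331225";

// One pass: swallow the whitespace around each label and value, keep both in groups.
string pattern = @"\s*\b(Latitude|Longitude)\s*(-?[0-9]+(?:,[0-9]+)?)\b";
string cleaned = Regex.Replace(dwText, pattern, "$1: $2\n").TrimEnd();

Console.WriteLine(cleaned);
// Latitude: 32,6549581304256
// Longitude: -16,9288643331225
```

Because the leading \s* is part of each match, the separate whitespace-collapsing step is no longer needed.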

Why not parse the values out and then do whatever is needed with them?
Using named capture groups, (?<Name>...), you can organize and then extract the data.
The example below uses shortened whitespace, but the pattern also works across lines and with the original input:
var data = " Latitude 32,6549 Longitude -16,9288 ";
// [^\d-]+ skips the label text; -? keeps the sign of negative values
var pattern = @"[^\d-]+(?<Lat>-?[\d,]+)[^\d-]+(?<Long>-?[\d,]+)";
var mtch = Regex.Match(data, pattern);
Console.WriteLine($"Latitude: {mtch.Groups["Lat"].Value} Longitude: {mtch.Groups["Long"].Value}");
// Latitude: 32,6549 Longitude: -16,9288

match.regex syntax with digit character and a #

I have a string in this format:
111111#1
It starts with 5 or 6 digits, followed by a '#' and then a single digit.
I use Regex.IsMatch like this:
if (Regex.IsMatch(input, @"^d{6}#\d{1}"))
{...}
but it doesn't match my string.
What is my mistake?

You're missing the backslash on the first d, so the pattern looks for the literal letter d repeated six times instead of six digits:
Regex.IsMatch("111111#1", @"^\d{6}#\d{1}")
To accept 5 or 6 digits, as described in the question, use \d{5,6} instead of \d{6}.

This single-line regex captures two groups: the leading five or six digits, and the '#' followed by a single digit:
(\d{5,6})(#\d{1})
Example:
string pattern = @"(\d{5,6})(#\d{1})";
string input = "111111#1";
MatchCollection matches = Regex.Matches(input, pattern);
foreach (Match match in matches)
{
    var firstGroupValue = match.Groups[1].Value;  // "111111"
    var secondGroupValue = match.Groups[2].Value; // "#1"
}
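If the goal is to validate the whole string rather than find the format somewhere inside it, anchoring both ends helps. Here is a small sketch, assuming the format is exactly five or six digits, a '#', and one digit. The sample inputs are invented for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// ^ and $ anchor the whole string; {5,6} allows five or six leading digits.
var format = new Regex(@"^\d{5,6}#\d$");

foreach (string candidate in new[] { "11111#1", "111111#1", "1111#1", "1111111#1", "111111#12" })
{
    Console.WriteLine($"{candidate}: {format.IsMatch(candidate)}");
}
// 11111#1: True
// 111111#1: True
// 1111#1: False
// 1111111#1: False
// 111111#12: False
```

Without the trailing $, an input such as 111111#12 would still pass, because the pattern would match its first eight characters.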

Match words in string using negative lookbehind

I'm trying to get the words that don't start with "un" using a pattern with a negative lookbehind. This is the code:
using Regexp = System.Text.RegularExpressions.Regex;
using RegexpOptions = System.Text.RegularExpressions.RegexOptions;
string quote = "Underground; round; unstable; unique; queue";
Regexp negativeViewBackward = new Regexp(@"(?<!un)\w+\b", RegexpOptions.IgnoreCase);
MatchCollection finds = negativeViewBackward.Matches(quote);
Console.WriteLine(String.Join(", ", finds));
It always returns the full set of words, but it should return only round, queue.

The (?<!un)\w+\b pattern first checks that the current position is not immediately preceded by un (the negative lookbehind), then matches 1 or more word chars followed by a word boundary position. At the start of a word the preceding text is never un, so every word matches.
You need to use a negative lookahead after a leading word boundary:
\b(?!un)\w+\b
Details
\b - leading word boundary
(?!un) - a negative lookahead that fails the match if the next two word chars are un
\w+ - 1+ word chars
\b - a trailing word boundary.
C# demo:
string quote = "Underground; round; unstable; unique; queue";
Regex negativeViewBackward = new Regex(@"\b(?!un)\w+\b", RegexOptions.IgnoreCase);
List<string> result = negativeViewBackward.Matches(quote).Cast<Match>().Select(x => x.Value).ToList();
foreach (string s in result)
    Console.WriteLine(s);
Output:
round
queue
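To see the difference between the two approaches side by side, here is a short sketch that runs both patterns over the question's input. FindAll is a hypothetical helper introduced only for this comparison.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string quote = "Underground; round; unstable; unique; queue";

// Hypothetical helper: joins all match values of a pattern, ignoring case.
static string FindAll(string input, string pattern) =>
    string.Join(", ", Regex.Matches(input, pattern, RegexOptions.IgnoreCase)
                           .Cast<Match>()
                           .Select(m => m.Value));

// Lookbehind checks what comes BEFORE the word, so every word passes.
Console.WriteLine(FindAll(quote, @"(?<!un)\w+\b"));  // Underground, round, unstable, unique, queue

// Lookahead checks how the word itself STARTS, so "un..." words are rejected.
Console.WriteLine(FindAll(quote, @"\b(?!un)\w+\b")); // round, queue
```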

How to match a specific sentence with Regex

I'm new to Regex and I can't work out how to match this sort of sentence: Band Name #Venue 30 450, where the digits at the end represent price and quantity.
string input = "Band Name #City 25 3500";
Match m = Regex.Match(input, @"^[A-Za-z]+\s+[A-Za-z]+\s+[\d+]+\s+[\d+]$");
if (m.Success)
{
    Console.WriteLine("Success!");
}

You can use Regex with named groups, which makes it easier to extract the data later if you need it. For example:
// Note: the literal (Band) at the start only matches inputs that begin with "Band"
string pattern = @"(Band) (?<Band>[A-Za-z ]+) (?<City>#[A-Za-z ]+) (?<Price>\d+) (?<Quantity>\d+)";
string input = "Band Name #City 25 3500";
Match match = Regex.Match(input, pattern);
Console.WriteLine(match.Groups["Band"].Value);
Console.WriteLine(match.Groups["City"].Value.TrimStart('#'));
Console.WriteLine(match.Groups["Price"].Value);
Console.WriteLine(match.Groups["Quantity"].Value);
The pattern contains several named groups, written as (?<GroupName>...). It is just a basic example that can be tweaked to fit your actual needs.

This one should work:
[A-Za-z ]+ [A-Za-z ]+ #[A-Za-z ]+ \d+ \d+
With your code it'd be:
string input = "Band Name #City 25 3500";
// Verbatim string (@"...") so that \d is passed to the regex engine as written
Match m = Regex.Match(input, @"[A-Za-z ]+ [A-Za-z ]+ #[A-Za-z ]+ \d+ \d+");
if (m.Success)
{
    Console.WriteLine("Success!");
}

Here is an older, more elaborate way (1st way):
string txt = "Band Name #City 25 3500";
string re1 = ".*?";               // Everything before the #
string re2 = "(#)";               // The literal # character
string re3 = "((?:[a-z][a-z]+))"; // Word 1, here the city
string re4 = "(\\s+)";            // White space 1
string re5 = "(\\d+)";            // Integer number 1, here 25
string re6 = "(\\s+)";            // White space 2
string re7 = "(\\d+)";            // Integer number 2, here 3500
Regex r = new Regex(re1 + re2 + re3 + re4 + re5 + re6 + re7, RegexOptions.IgnoreCase | RegexOptions.Singleline);
Match m = r.Match(txt);
if (m.Success)
{
    string c1 = m.Groups[1].Value;
    string word1 = m.Groups[2].Value;
    string ws1 = m.Groups[3].Value;
    string int1 = m.Groups[4].Value;
    string ws2 = m.Groups[5].Value;
    string int2 = m.Groups[6].Value;
    Console.WriteLine($"({c1})({word1})({ws1})({int1})({ws2})({int2})");
}
This way you can store each value separately. For example, Groups[6] holds 3500, or whatever value appears in that position.
In short, the other answers are right (2nd way): just create the regex with
@"([A-Za-z ]+) ([A-Za-z ]+) #([A-Za-z ]+) (\d+) (\d+)"
and match it against any string in this format. You can build and test your own regex in an online regex tester.

This is the answer to what I was trying to do:
string input = "Band Name #Location 25 3500";
Match m = Regex.Match(input, @"([A-Za-z ]+) (#[A-Za-z ]+) (\d+) (\d+)");
if (m.Success)
{
    Console.WriteLine("Success!");
}
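Combining the ideas from the answers above, here is a rough sketch that anchors the pattern, uses named groups, and converts the trailing numbers. It assumes the band name contains only letters and spaces, the venue is a single word after '#', and price and quantity are whole numbers. The group name Venue is an illustrative choice, not something from the original answers.

```csharp
using System;
using System.Text.RegularExpressions;

string input = "Band Name #City 25 3500";

// Anchored pattern: band name (letters and spaces), '#' + venue, price, quantity.
string pattern = @"^(?<Band>[A-Za-z ]+?)\s+#(?<Venue>[A-Za-z]+)\s+(?<Price>\d+)\s+(?<Quantity>\d+)$";

Match m = Regex.Match(input, pattern);
if (m.Success)
{
    string band = m.Groups["Band"].Value;
    string venue = m.Groups["Venue"].Value;
    int price = int.Parse(m.Groups["Price"].Value);
    int quantity = int.Parse(m.Groups["Quantity"].Value);
    Console.WriteLine($"{band} | {venue} | {price} | {quantity}");
    // Band Name | City | 25 | 3500
}
```

The lazy quantifier on the band name keeps the separating whitespace out of the captured value.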

Matching a pattern in a string

I have a string
string str = "I am fine. How are you? You need exactly 4 pieces of sandwiches. Your ADAST Count is 5. Okay thank you ";
I want to get the ADAST Count value. For the above example, it is 5.
The problem is the word after ADAST Count: it can be either is or =. The two words ADAST Count will always be there.
What I have tried is
var resultString = Regex.Match(str, @"ADAST\s+count\s+is\s+\d+", RegexOptions.IgnoreCase).Value;
var number = Regex.Match(resultString, @"\d+").Value;
How can I write a pattern that matches either is or =?

You may use
ADAST\s+count\s+(?:is|=)\s+(\d+)
Note that (?:is|=) is a non-capturing group (i.e. it is used to only group alternations without pushing these submatches on to the capture stack for further retrieval) and | is an alternation operator.
Details:
ADAST - a literal string
\s+ - 1 or more whitespaces
count - a literal string
\s+ - 1 or more whitespaces
(?:is|=) - either is or =
\s+ - 1 or more whitespaces
(\d+) - Group 1 capturing one or more digits
C#:
var m = Regex.Match(str, @"ADAST\s+count\s+(?:is|=)\s+(\d+)", RegexOptions.IgnoreCase);
if (m.Success)
{
    Console.Write(m.Groups[1].Value);
}
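The pattern above requires whitespace on both sides of = as well as is. If inputs such as Count=5 can occur (an assumption, since the question only shows the is form), one possible variation keeps the mandatory spaces around is but makes them optional around =. GetAdastCount is a hypothetical helper name.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: returns the ADAST count, or null when the text has none.
static int? GetAdastCount(string text)
{
    // "is" needs surrounding whitespace; "=" may appear with or without it.
    Match m = Regex.Match(text, @"ADAST\s+count(?:\s+is\s+|\s*=\s*)(\d+)", RegexOptions.IgnoreCase);
    return m.Success ? int.Parse(m.Groups[1].Value) : (int?)null;
}

Console.WriteLine(GetAdastCount("Your ADAST Count is 5. Okay thank you")); // 5
Console.WriteLine(GetAdastCount("Your ADAST Count = 12"));                 // 12
Console.WriteLine(GetAdastCount("ADAST count=7"));                         // 7
```

Returning a nullable int lets the caller tell "no count found" apart from a count of 0.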
