How to get two numerical values from a string in C#

I have a string like this:
X LIMITED COMPANY (52100000/58447000)
I want to extract X LIMITED COMPANY, 52100000 and 58447000 separately.
I'm extracting X LIMITED COMPANY like this:
companyName = Regex.Match(mystring4, @"[a-zA-Z\s]+").Value.Trim();
But I'm stuck on extracting the numbers. They can be as small as 1 or 2, or as large as the numbers in the example. Can you show me how to extract them? Thanks.

Try a regular expression with alternation | (or):
Either word characters that are not digits: ([\w-[\d]][\w\s-[\d]]+)
Or digits only: ([0-9]+)
E.g.
string mystring4 = @"AKASYA CAM SANAYİ VE TİCARET LİMİTED ŞİRKETİ(52100000 / 58447000)";
string[] values = Regex
.Matches(mystring4, @"([\w-[\d]][\w\s-[\d]]+)|([0-9]+)")
.OfType<Match>()
.Select(match => match.Value.Trim())
.ToArray();
Test
// AKASYA CAM SANAYİ VE TİCARET LİMİTED ŞİRKETİ
// 52100000
// 58447000
Console.Write(string.Join(Environment.NewLine, values));
I suggest changing the initial pattern [a-zA-Z\s]+ to [a-zA-Z][a-zA-Z\s]+ in order to skip matches that contain only separators (e.g. " ").

Try using named groups:
var s = "X LIMITED COMPANY (52100000 / 58447000)";
var regex = new Regex(@"(?<CompanyName>[^\(]+)\((?<Num1>\d+)\s*/\s*(?<Num2>\d+)\)");
var match = regex.Match(s);
var companyName = match.Groups["CompanyName"];

If the format is fixed, you could try this:
var regex = new Regex(@"^(?<name>[^\(]+)\((?<n1>\d+)/(?<n2>\d+)\)");
var match = regex.Match(input);
var companyName = match.Groups["name"].Value;
var number1 = Convert.ToInt64(match.Groups["n1"].Value);
var number2 = Convert.ToInt64(match.Groups["n2"].Value);
This matches everything up to the opening parenthesis and puts it into a named group "name". Then it matches two numbers within the parentheses, separated by "/", and puts them into groups named "n1" and "n2" respectively. The "name" group keeps the trailing space before the parenthesis, so call Trim() on it if needed.
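As a minimal sketch, the named-group approach above could be wrapped with a success check and safe number parsing. The names CompanyParser and TryParseCompany are hypothetical, and the pattern assumes optional whitespace around the slash:

```csharp
using System;
using System.Text.RegularExpressions;

public static class CompanyParser
{
    // Name up to the first '(', then two integers separated by '/'.
    // Optional whitespace around the numbers is an assumption.
    private static readonly Regex Pattern = new Regex(
        @"^(?<name>[^(]+)\(\s*(?<n1>[0-9]+)\s*/\s*(?<n2>[0-9]+)\s*\)");

    public static bool TryParseCompany(
        string input, out string name, out long first, out long second)
    {
        name = string.Empty;
        first = 0;
        second = 0;

        Match match = Pattern.Match(input);
        if (!match.Success)
            return false;

        name = match.Groups["name"].Value.Trim();

        // long.TryParse returns false for digit runs that overflow Int64.
        return long.TryParse(match.Groups["n1"].Value, out first)
            && long.TryParse(match.Groups["n2"].Value, out second);
    }
}
```

Calling CompanyParser.TryParseCompany("X LIMITED COMPANY (52100000/58447000)", out var name, out var a, out var b) should return true and fill in the trimmed name and both numbers. Checking Success first avoids reading empty groups when the input does not follow the expected format.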

Related

Get number between characters in Regex

Having difficulty creating a regex.
I have this text:
"L\":0.01690502,\"C\":0.01690502,\"V\":33.76590433"
I need to extract only the number after C\":. This is what I currently have:
var regex = new Regex(@"(?<=C\\"":)\d +.\d + (?=\s *,\\)");
var test = regex.Match(content).ToString();
decimal.TryParse(test, out decimal closingPrice);
To extract the number after C\":, you can capture (\d+\.\d+) in a group (note the escaped dot; an unescaped . matches any character):
C\\":(\d+\.\d+)
You could also use a positive lookbehind:
(?<=C\\":)\d+\.\d+
You can use this code to fetch all pairs of letter and number.
var regex = new Regex("(?<letter>[A-Z])[^:]+:(?<number>[^,\"]+)");
var input = "L\":0.01690502,\"C\":0.01690502,\"V\":33.76590433";
var matches = regex.Matches(input).Cast<Match>().ToArray();
foreach (var match in matches)
Console.WriteLine($"Letter: {match.Groups["letter"].Value}, number: {match.Groups["number"].Value}");
If you only need the number for the letter "C", you can use this LINQ expression:
var cNumber = matches.FirstOrDefault(m => m.Groups["letter"].Value == "C")?.Groups["number"].Value ?? "";
Regex explanation:
(?<letter>[A-Z]) // capture single letter
[^:]+ // skip all chars until ':'
: // colon
(?<number>[^,"]+) // capture all until ',' or '"'
Fixed it with this.
var regex = new Regex("(?<=C\\\":)\\d+\\.\\d+(?=\\s*,)");
var test = regex.Match(content).ToString();
String literal to use in C#:
@"C\\"":([.0-9]*),"
If you wish to match only valid numbers:
@"C\\"":([0-9]+\.[0-9]+),"
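A minimal sketch that combines the capture-group idea above with culture-independent parsing. PriceExtractor and ExtractClose are hypothetical names. The sketch assumes the backslash before the quote may or may not be present in the actual content, so the pattern makes it optional:

```csharp
using System.Globalization;
using System.Text.RegularExpressions;

public static class PriceExtractor
{
    // Matches C": or C\": followed by a number with an optional fraction.
    // In the verbatim string, \\? is an optional literal backslash and "" is one quote.
    private static readonly Regex ClosePattern =
        new Regex(@"C\\?"":(?<value>[0-9]+(?:\.[0-9]+)?)");

    public static decimal? ExtractClose(string content)
    {
        Match match = ClosePattern.Match(content);
        if (!match.Success)
            return null;

        // InvariantCulture makes '.' the decimal separator regardless of system locale.
        bool ok = decimal.TryParse(
            match.Groups["value"].Value,
            NumberStyles.AllowDecimalPoint,
            CultureInfo.InvariantCulture,
            out decimal value);

        return ok ? value : (decimal?)null;
    }
}
```

Parsing with the invariant culture matters here because decimal.TryParse without a culture argument uses the current locale, which may expect a comma as the decimal separator.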

How to get the matched sub-groups in C# using Regex?

I have a string:
{Lower Left ( 460700.000, 2121200.000)}
and here is my code:
var pat = @"Lower Left\s*\(\s*[\d\.]+\,(\s)*[\d\.]+\)";
var r = new Regex(pat, RegexOptions.IgnoreCase);
var m = r.Match(s);
The m.Groups[0] now equals:
{Lower Left ( 460700.000, 2121200.000)}
But I want to get the coordinate strings in two variables, e.g. X and Y. How can I do that?
You could do like this:
string s = "{Lower Left ( 460700.000, 2121200.000)}";
var pat = @"Lower Left\s*\(\s*(\d+\.\d+)\,\s*(\d+\.\d+)\)";
var r = new Regex(pat, RegexOptions.IgnoreCase);
var m = r.Match(s);
Console.WriteLine(m.Groups[1]); // first number
Console.WriteLine(m.Groups[2]); // second number
If your number may or may not contain ., you can use:
string s = "{Lower Left ( 460700.000, 2121200.000)}";
var pat = @"Lower Left\s*\(\s*(\d+(?:\.\d+)?)\,\s*(\d+(?:\.\d+)?)\)";
var r = new Regex(pat, RegexOptions.IgnoreCase);
var m = r.Match(s);
Console.WriteLine(m.Groups[1]);
Console.WriteLine(m.Groups[2]);
This will accept 123456 (no dot) and 123.456 (one dot inside), but not 123.456.7 (two dots) or 1234. (dot at the end).
The first group (index 0) always returns the entire match, while the indexed ones contain the values of your capturing groups. So you need m.Groups[1] and m.Groups[2] respectively.
You can also name your groups:
@"Lower Left\s*\(\s*(?<X>\d+\.\d+),(\s)*(?<Y>\d+\.\d+)\)"
Where (?<identifier>anyPattern) means build a matching-group which is named identifier and has the pattern given by anyPattern.
Allowing you to access them like this:
m.Groups["X"]
m.Groups["Y"]
The square brackets ([]) are also not needed: [\d\.]+ means "one or more characters, each of which is a digit or a dot", not "a run of digits followed by a dot followed by a run of digits".
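A minimal sketch of the named-group variant that also converts the captured strings to numbers. CoordinateParser and Parse are hypothetical names, and returning a nullable tuple is just one possible design:

```csharp
using System.Globalization;
using System.Text.RegularExpressions;

public static class CoordinateParser
{
    // Named groups X and Y; the fractional part is optional.
    private static readonly Regex Pattern = new Regex(
        @"Lower Left\s*\(\s*(?<X>[0-9]+(?:\.[0-9]+)?)\s*,\s*(?<Y>[0-9]+(?:\.[0-9]+)?)\s*\)",
        RegexOptions.IgnoreCase);

    public static (double X, double Y)? Parse(string s)
    {
        Match m = Pattern.Match(s);
        if (!m.Success)
            return null;

        // InvariantCulture so that '.' is always read as the decimal separator.
        double x = double.Parse(m.Groups["X"].Value, CultureInfo.InvariantCulture);
        double y = double.Parse(m.Groups["Y"].Value, CultureInfo.InvariantCulture);
        return (x, y);
    }
}
```

The pattern uses [0-9] rather than \d because \d in .NET also matches non-ASCII digits, which double.Parse with the invariant culture would reject.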

Regexp find position of different characters in string

I have a string conforming to the following pattern:
(cc)-(nr).(nr)M(nr)(cc)whitespace(nr)
where cc is an arbitrary number of letter characters, nr is an arbitrary number of numerical characters, and M is the actual letter M.
For example:
ASF-1.15M437979CA 100000
EU-12.15M121515PO 1145
I need to find the positions of -, . and M within the string. The problem is that the leading and trailing characters can contain the letter M as well, but I need only the one in the middle.
As an alternative, extracting the first characters (up to -) and the first two numbers (as in (nr).(nr)M...) would be enough.
If you need a regex-based solution, you just need to use 3 capturing groups around the required patterns, and then access the Groups[n].Index property:
var rxt = new Regex(@"\p{L}*(-)\d+(\.)\d+(M)\d+\p{L}*\s*\d+");
// Collect matches
var matches = rxt.Matches(@"ASF-1.15M437979CA 100000 or EU-12.15M121515PO 1145");
// Now, we can get the indices
var posOfHyphen = matches.Cast<Match>().Select(p => p.Groups[1].Index);
var posOfDot = matches.Cast<Match>().Select(p => p.Groups[2].Index);
var posOfM = matches.Cast<Match>().Select(p => p.Groups[3].Index);
Output:
posOfHyphen => [3, 30]
posOfDot => [5, 33]
posOfM => [8, 36]
Regex:
string pattern = @"[A-Z]+(-)\d+(\.)\d+(M)\d+[A-Z]+";
string value = "ASF-1.15M437979CA 100000 or EU-12.15M121515PO 1145";
var match = Regex.Match(value, pattern);
if (match.Success)
{
int sep1 = match.Groups[1].Index;
int sep2 = match.Groups[2].Index;
int sep3 = match.Groups[3].Index;
}
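A minimal sketch that combines both answers: named groups for readability, and a loop over all matches rather than only the first. SeparatorLocator and Locate are hypothetical names. The pattern assumes at least one letter on each side, as in the sample strings:

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SeparatorLocator
{
    // Letters, '-', digits, '.', digits, 'M', digits, letters, whitespace, digits.
    private static readonly Regex Pattern = new Regex(
        @"\p{L}+(?<hyphen>-)\d+(?<dot>\.)\d+(?<m>M)\d+\p{L}+\s+\d+");

    // Returns zero-based positions relative to the start of the whole input.
    public static List<(int Hyphen, int Dot, int M)> Locate(string input)
    {
        var result = new List<(int Hyphen, int Dot, int M)>();
        foreach (Match match in Pattern.Matches(input))
        {
            result.Add((
                match.Groups["hyphen"].Index,
                match.Groups["dot"].Index,
                match.Groups["m"].Index));
        }
        return result;
    }
}
```

Because the M group sits between two digit runs in the pattern, an M in the leading or trailing letters is never reported as the separator.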

regex to strip number from var in string

I have a long string and I have a var inside it
var abc = '123456'
Now I wish to get the 123456 from it.
I have tried a regex but it's not working properly:
Regex regex = new Regex("(?<abc>+)=(?<var>+)");
Match m = regex.Match(body);
if (m.Success)
{
string key = m.Groups["var"].Value;
}
How can I get the number from the var abc?
Thanks for your help and time
var body = @" fsd fsda f var abc = '123456' fsda fasd f";
Regex regex = new Regex(@"var (?<name>\w*) = '(?<number>\d*)'");
Match m = regex.Match(body);
Console.WriteLine("name: " + m.Groups["name"]);
Console.WriteLine("number: " + m.Groups["number"]);
prints:
name: abc
number: 123456
Your regex is not correct:
(?<abc>+)=(?<var>+)
The + signs are quantifiers, meaning that the preceding element is repeated at least once. Here there is nothing for them to repeat, since (?<name>...) is a named capture group and its opening is not an element in itself.
You perhaps meant:
(?<abc>.+)=(?<var>.+)
And a better regex might be:
(?<abc>[^=]+)=\s*'(?<var>[^']+)'
[^=]+ will match any character except an equal sign.
\s* means any number of space characters (will also match tabs, newlines and form feeds though)
[^']+ will match any character except a single quote.
To specifically match the variable abc, you then put it like this:
(?<abc>abc)\s*=\s*'(?<var>[^']+)'
(I added some more allowances for spaces)
From the example you provided, the number can be obtained like this:
Console.WriteLine (
Regex.Match("var abc = '123456'", @"(?<var>\d+)").Groups["var"].Value); // 123456
\d+ means 1 or more numbers (digits).
But I surmise your data doesn't look like your example.
Try this:
var body = @"my word 1, my word 2, my word var abc = '123456' 3, my word x";
Regex regex = new Regex(@"(?<=var \w+ = ')\d+");
Match m = regex.Match(body);
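A minimal sketch that generalizes the answers above to any variable name. VarExtractor and TryGetQuotedNumber are hypothetical names. The sketch assumes the value is a run of digits inside single quotes, with flexible spacing around the equals sign:

```csharp
using System.Text.RegularExpressions;

public static class VarExtractor
{
    public static bool TryGetQuotedNumber(string body, string variableName, out string number)
    {
        // Regex.Escape keeps special characters in the variable name literal.
        string pattern =
            @"\bvar\s+" + Regex.Escape(variableName) + @"\s*=\s*'(?<number>[0-9]+)'";

        Match m = Regex.Match(body, pattern);

        // For a failed match, the group value is an empty string.
        number = m.Groups["number"].Value;
        return m.Success;
    }
}
```

For the sample body, VarExtractor.TryGetQuotedNumber(body, "abc", out var number) should return true with number set to 123456.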

C# Regex Split - How do I split string into 2 words

I have the following string:
String myNarrative = "ID: 4393433 This is the best narration";
I want to split this into 2 strings:
myId = "ID: 4393433";
myDesc = "This is the best narration";
How do I do this in Regex.Split()?
Thanks for your help.
If it is a fixed format as shown, use Regex.Match with Capturing Groups (see Matched Subexpressions). Split is useful for dividing up a repeating sequence with unbound multiplicity; the input does not represent such a sequence but rather a fixed set of fields/values.
// No trailing \s+ here: it would force (.*) to give back the last word.
var m = Regex.Match(myNarrative, @"ID:\s+(\d+)\s+(.*)");
if (m.Success) {
var number = m.Groups[1].Value;
var rest = m.Groups[2].Value;
} else {
// Failed to match.
}
Alternatively, one could use Named Groups and have a read through the Regular Expression Language quick-reference.
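A minimal sketch of the named-groups alternative that produces the two strings the question asks for. NarrativeSplitter and TrySplit are hypothetical names. The sketch assumes the input always starts with "ID:" followed by digits:

```csharp
using System.Text.RegularExpressions;

public static class NarrativeSplitter
{
    // "id" keeps the "ID: 123" prefix; "desc" takes the rest of the line.
    private static readonly Regex Pattern =
        new Regex(@"^(?<id>ID:\s*[0-9]+)\s+(?<desc>.*)$");

    public static bool TrySplit(string narrative, out string id, out string description)
    {
        Match m = Pattern.Match(narrative);

        // For a failed match, both group values are empty strings.
        id = m.Groups["id"].Value;
        description = m.Groups["desc"].Value;
        return m.Success;
    }
}
```

For "ID: 4393433 This is the best narration", this should yield "ID: 4393433" and "This is the best narration".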
