Regex to extract specific numbers in a String - c#

string temp = "12345&refere?X=Assess9677125?Folder_id=35478";
I need to extract the number 12345 alone and I don't need the numbers 9677125 and 35478.
What regex can I use?

Here is a regex for extracting a 5-digit number at the beginning of the string:
^(\d{5})&
If length is arbitrary:
^(\d+)&
If termination pattern is not always &:
^(\d+)[^\d]
Based on Sayse's comment, you can simplify this to:
^(\d+)
And if the terminator is itself a number (for instance 999), then:
^(\d+)999

You don't need regex if you only want to extract the first number:
string temp = "12345&refere?X=Assess9677125?Folder_id=35478";
int first = Int32.Parse(String.Join("", temp.TakeWhile(c => Char.IsDigit(c))));
Console.WriteLine(first); // 12345

If the number you want is always at the beginning of the string and terminated by an ampersand (&) you don't need a regex at all. Just split the string on the ampersand and get the first element of the resulting array:
String temp = "12345&refere?X=Assess9677125?Folder_id=35478";
var splitArray = temp.Split('&');
var number = splitArray[0]; // returns 12345
Alternatively, you can get the index of the ampersand and substring up to that point:
String temp = "12345&refere?X=Assess9677125?Folder_id=35478";
var ampersandIndex = temp.IndexOf("&");
var number = temp.Substring(0, ampersandIndex); // returns 12345

From what you have given us, this is fairly simple:
var regex = new Regex(@"^(?<number>\d+)&");
var match = regex.Match("12345&refere?X=Assess9677125?Folder_id=35478");
if (match.Success)
{
var number = int.Parse(match.Groups["number"].Value);
}
Edit: Of course you can replace the argument of new Regex with any of the combinations Giorgi has given.

Related

Get INT value between 2 matching chars

I have a string that I need to extract the product ID from. It is in this format:
shop:?id:556:token:bmgwcGJxZEpnK2RqemhaKzdBYWZjbTVZN0xaOXh5L3pmdDBFZjQrWVVES1pmYVBXVVB6SlFhejBsNndnaHNsUA==
I need to get 556 out of there; longer IDs such as 2658 are also possible.
First index of ":", I think:
str.Substring(str.LastIndexOf(':') + 1);
But then I don't know how to stop right after the match. Is a regex better? Any help appreciated.
EDIT
These do the exact same thing, separating out the first number:
LINQ:
var test = new string(str.Substring(str.IndexOfAny("0123456789".ToCharArray())).TakeWhile(char.IsDigit).ToArray());
Regex:
var test = Regex.Match(str, @"\d+").Value;
So that raises the question: which is the better approach?
If the string format is fixed, use the Split function
string str = "shop:?id:556:token:bmgwcGJxZEpnK2RqemhaKzdBYWZjbTVZN0xaOXh5L3pmdDBFZjQrWVVES1pmYVBXVVB6SlFhejBsNndnaHNsUA==";
int id = Convert.ToInt32(str.Split(':')[2]);
Console.WriteLine(id);
I'd probably use Regex:
var id = Regex.Match(input, @"\?id:(?<x>\d+)").Groups["x"].Value;
Decoded, that Regex means "literally match ?id: then start a capturing group called x and capture one or more digits into it"
The returned Match will have a Groups property that we index by x and retrieve the value
If you want it as an int, you can int.Parse the result; you won't need a TryParse because the Regex will have only matched digits
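As a minimal sketch of that last step, assuming the same sample string as the question: the fallback value -1 is a hypothetical choice for illustration, not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

string input = "shop:?id:556:token:bmgwcGJxZEpnK2RqemhaKzdBYWZjbTVZN0xaOXh5L3pmdDBFZjQrWVVES1pmYVBXVVB6SlFhejBsNndnaHNsUA==";

// Match the literal "?id:" and capture the digits that follow into group "x".
Match match = Regex.Match(input, @"\?id:(?<x>\d+)");

// -1 is a hypothetical "not found" marker chosen for this sketch.
int id = match.Success ? int.Parse(match.Groups["x"].Value) : -1;

Console.WriteLine(id);
```

One caveat: int.Parse can still throw an OverflowException if the captured digit run is too long to fit in an int, so int.TryParse may be worth it for untrusted input.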
If the format of the string is fixed then this would work:
var id = input[9..input.IndexOf(':', 10)];
And it would be more performant than Regex or Split
If you wanted a substring that works with a format change, perhaps:
var x = input.IndexOf("?id:") + 4;
var id = input[x..input.IndexOf(':',x+1)];
This will work even if the order of items changes.
string original = "shop:?id:556:token:bmgwcGJxZEpnK2RqemhaKzdBYWZjbTVZN0xaOXh5L3pmdDBFZjQrWVVES1pmYVBXVVB6SlFhejBsNndnaHNsUA==";
string startWithId = original.Substring(original.IndexOf("id:") + 3);
string onlyId = startWithId.Split(':')[0];
Console.WriteLine(onlyId);

Get number between characters in Regex

Having difficulty creating a regex.
I have this text:
"L\":0.01690502,\"C\":0.01690502,\"V\":33.76590433"
I need only the number after C\": extracted, this is what I currently have.
var regex = new Regex(@"(?<=C\\"":)\d +.\d + (?=\s *,\\)");
var test = regex.Match(content).ToString();
decimal.TryParse(test, out decimal closingPrice);
To extract the number after C\":, you can capture (\d+\.\d+) in a group (the dot is escaped so that it matches only a literal decimal point):
C\\":(\d+\.\d+)
You could also use a positive lookbehind:
(?<=C\\":)\d+\.\d+
You can use this code to fetch all pairs of letter and number.
var regex = new Regex("(?<letter>[A-Z])[^:]+:(?<number>[^,\"]+)");
var input = "L\":0.01690502,\"C\":0.01690502,\"V\":33.76590433";
var matches = regex.Matches(input).Cast<Match>().ToArray();
foreach (var match in matches)
Console.WriteLine($"Letter: {match.Groups["letter"].Value}, number: {match.Groups["number"].Value}");
If you only need the number for the letter "C", you can use this LINQ expression:
var cNumber = matches.FirstOrDefault(m => m.Groups["letter"].Value == "C")?.Groups["number"].Value ?? "";
Regex explanation:
(?<letter>[A-Z]) // capture single letter
[^:]+ // skip all chars until ':'
: // colon
(?<number>[^,"]+) // capture all until ',' or '"'
Fixed it with this.
var regex = new Regex("(?<=C\\\":)\\d+.\\d+(?=\\s*,)");
var test = regex.Match(content).ToString();
String literal to use for C#:
@"C\\"":([.0-9]*),"
If you wish to match only valid numbers:
@"C\\"":([0-9]+\.[0-9]+),"

C# How do I trim all non-numeric characters in a string

What is the fastest way to trim all letters from a string that has an alphabetic prefix?
For example, for the input string "ABC12345", I wish to have only 12345 as output.
Thanks.
Please use "char.IsDigit", try this:
static void Main(string[] args)
{
var input = "ABC12345";
var numeric = new String(input.Where(char.IsDigit).ToArray());
Console.Read();
}
You can use regular expressions to trim an alphabetic prefix:
var input = "ABC123";
var trimmed = Regex.Replace(input, @"^[A-Za-z]+", "");
// trimmed = "123"
The regular expression ^[A-Za-z]+ (the second parameter of the Replace method) does most of the work. It defines what you want to be replaced using the following rules:
The ^ character ensures a match only exists at the start of a string
The [A-Za-z] will match any uppercase or lowercase letters
The + means the upper or lowercase letters will be matched as many times in a row as possible
As this is the Replace method, the third parameter then replaces any matches with an empty string.
The other answers seem to answer what is the slowest way .. so if you really need the fastest way, then you can find the index of the first digit and get the substring:
string input = "ABC12345";
int i = 0;
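// The bounds check avoids an IndexOutOfRangeException when the string contains no digit.
while ( i < input.Length && ( input[i] < '0' || input[i] > '9' ) ) i++;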
string output = input.Substring(i);
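The shortest way to get the value would probably be the VB Val method, but note that Val stops reading at the first character it cannot recognize as part of a number, so it only helps when the digits come first:
double value = Microsoft.VisualBasic.Conversion.Val("12345ABC"); // 12345.0, whereas Val("ABC12345") returns 0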
You would have to use a regular expression. It seems you are looking for only digits and not letters.
Sample:
string result =
System.Text.RegularExpressions.Regex.Replace("Your input string", @"\D+", string.Empty);

Get a specific part from a string based on a pattern

I have a string in this format:
ABCD_EFDG20120700.0.xml
This has a pattern which has three parts to it:
First is the set of chars before the '_', the 'ABCD'
Second are the set of chars 'EFDG' after the '_'
Third are the remaining 20120700.0.xml
I can split the original string and get the number(s) from the second element in the split result using this switch:
\d+
Match m = Regex.Match(splitname[1], "\\d+");
That returns only '20120700'. But I need '20120700.0'.
How do I get the required string?
You can extend your regex to look for any number of digits, then period and then any number of digits once again:
Match m = Regex.Match(splitname[1], "\\d+\\.\\d+");
Although with such regular expression you don't even need to split the string:
string s = "ABCD_EFDG20120700.0.xml";
Match m = Regex.Match(s, "\\d+\\.\\d+");
string result = m.Value; // result is 20120700.0
I suggest using a single regex operation to get everything you want, like this:
var rgx = new Regex(@"^([^_]+)_([^\d.]+)([\d.]+\d+)\.(.*)$");
var matches = rgx.Matches(input);
if (matches.Count > 0)
{
Console.WriteLine("{0}", matches[0].Groups[0]); // All input string
Console.WriteLine("{0}", matches[0].Groups[1]); // ABCD
Console.WriteLine("{0}", matches[0].Groups[2]); // EFDG
Console.WriteLine("{0}", matches[0].Groups[3]); // 20120700.0
Console.WriteLine("{0}", matches[0].Groups[4]); // xml
}

C# Regex Split - How do I split string into 2 words

I have the following string:
String myNarrative = "ID: 4393433 This is the best narration";
I want to split this into 2 strings;
myId = "ID: 4393433";
myDesc = "This is the best narration";
How do I do this in Regex.Split()?
Thanks for your help.
If it is a fixed format as shown, use Regex.Match with Capturing Groups (see Matched Subexpressions). Split is useful for dividing up a repeating sequence with unbound multiplicity; the input does not represent such a sequence but rather a fixed set of fields/values.
// No trailing \s+ here: it would force (.*) to backtrack and drop the last word of the description.
var m = Regex.Match(inp, @"ID:\s+(\d+)\s+(.*)");
if (m.Success) {
var number = m.Groups[1].Value;
var rest = m.Groups[2].Value;
} else {
// Failed to match.
}
Alternatively, one could use Named Groups and have a read through the Regular Expression Language quick-reference.
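A minimal sketch of the named-group variant, assuming the two-field format from the question. The group names id and desc are hypothetical, chosen here for illustration, and the id group includes the "ID:" prefix because the question asks for myId = "ID: 4393433".

```csharp
using System;
using System.Text.RegularExpressions;

string myNarrative = "ID: 4393433 This is the best narration";

// "id" captures the "ID: <digits>" field, "desc" captures everything after the separating whitespace.
Match m = Regex.Match(myNarrative, @"^(?<id>ID:\s+\d+)\s+(?<desc>.*)$");

// Fall back to empty strings when the input does not have the expected shape.
string myId = m.Success ? m.Groups["id"].Value : "";
string myDesc = m.Success ? m.Groups["desc"].Value : "";

Console.WriteLine(myId);   // ID: 4393433
Console.WriteLine(myDesc); // This is the best narration
```

Named groups make the call site easier to read than numeric indexes and keep working if another group is later added to the pattern.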
