Related
I have a requirement to find and extract a number contained within a string.
For example, from these strings:
string test = "1 test"
string test1 = " 1 test"
string test2 = "test 99"
How can I do this?
\d+ is the regex for an integer number. So
//System.Text.RegularExpressions.Regex
resultString = Regex.Match(subjectString, #"\d+").Value;
returns a string containing the first occurrence of a number in subjectString.
Int32.Parse(resultString) will then give you the number.
Here's how I cleanse phone numbers to get the digits only:
string numericPhone = new String(phone.Where(Char.IsDigit).ToArray());
go through the string and use Char.IsDigit
string a = "str123";
string b = string.Empty;
int val;
for (int i=0; i< a.Length; i++)
{
if (Char.IsDigit(a[i]))
b += a[i];
}
if (b.Length>0)
val = int.Parse(b);
use regular expression ...
Regex re = new Regex(#"\d+");
Match m = re.Match("test 66");
if (m.Success)
{
Console.WriteLine(string.Format("RegEx found " + m.Value + " at position " + m.Index.ToString()));
}
else
{
Console.WriteLine("You didn't enter a string containing a number!");
}
What I use to get Phone Numbers without any punctuation...
var phone = "(787) 763-6511";
string.Join("", phone.ToCharArray().Where(Char.IsDigit));
// result: 7877636511
Regex.Split can extract numbers from strings. You get all the numbers that are found in a string.
string input = "There are 4 numbers in this string: 40, 30, and 10.";
// Split on one or more non-digit characters.
string[] numbers = Regex.Split(input, #"\D+");
foreach (string value in numbers)
{
if (!string.IsNullOrEmpty(value))
{
int i = int.Parse(value);
Console.WriteLine("Number: {0}", i);
}
}
Output:
Number: 4
Number: 40
Number: 30
Number: 10
if the number has a decimal points, you can use below
using System;
using System.Text.RegularExpressions;
namespace Rextester
{
public class Program
{
public static void Main(string[] args)
{
//Your code goes here
Console.WriteLine(Regex.Match("anything 876.8 anything", #"\d+\.*\d*").Value);
Console.WriteLine(Regex.Match("anything 876 anything", #"\d+\.*\d*").Value);
Console.WriteLine(Regex.Match("$876435", #"\d+\.*\d*").Value);
Console.WriteLine(Regex.Match("$876.435", #"\d+\.*\d*").Value);
}
}
}
results :
"anything 876.8 anything" ==> 876.8
"anything 876 anything" ==> 876
"$876435" ==> 876435
"$876.435" ==> 876.435
Sample : https://dotnetfiddle.net/IrtqVt
Here's a Linq version:
string s = "123iuow45ss";
var getNumbers = (from t in s
where char.IsDigit(t)
select t).ToArray();
Console.WriteLine(new string(getNumbers));
Another simple solution using Regex
You should need to use this
using System.Text.RegularExpressions;
and the code is
string var = "Hello3453232wor705Ld";
string mystr = Regex.Replace(var, #"\d", "");
string mynumber = Regex.Replace(var, #"\D", "");
Console.WriteLine(mystr);
Console.WriteLine(mynumber);
You can also try this
string.Join(null,System.Text.RegularExpressions.Regex.Split(expr, "[^\\d]"));
Here is another Linq approach which extracts the first number out of a string.
string input = "123 foo 456";
int result = 0;
bool success = int.TryParse(new string(input
.SkipWhile(x => !char.IsDigit(x))
.TakeWhile(x => char.IsDigit(x))
.ToArray()), out result);
Examples:
string input = "123 foo 456"; // 123
string input = "foo 456"; // 456
string input = "123 foo"; // 123
Just use a RegEx to match the string, then convert:
Match match = Regex.Match(test , #"(\d+)");
if (match.Success) {
return int.Parse(match.Groups[1].Value);
}
string input = "Hello 20, I am 30 and he is 40";
var numbers = Regex.Matches(input, #"\d+").OfType<Match>().Select(m => int.Parse(m.Value)).ToArray();
You can do this using String property like below:
return new String(input.Where(Char.IsDigit).ToArray());
which gives only number from string.
For those who want decimal number from a string with Regex in TWO line:
decimal result = 0;
decimal.TryParse(Regex.Match(s, #"\d+").Value, out result);
Same thing applys to float, long, etc...
var match=Regex.Match(#"a99b",#"\d+");
if(match.Success)
{
int val;
if(int.TryParse(match.Value,out val))
{
//val is set
}
}
The question doesn't explicitly state that you just want the characters 0 to 9 but it wouldn't be a stretch to believe that is true from your example set and comments. So here is the code that does that.
string digitsOnly = String.Empty;
foreach (char c in s)
{
// Do not use IsDigit as it will include more than the characters 0 through to 9
if (c >= '0' && c <= '9') digitsOnly += c;
}
Why you don't want to use Char.IsDigit() - Numbers include characters such as fractions, subscripts, superscripts, Roman numerals, currency numerators, encircled numbers, and script-specific digits.
Here is another simple solution using Linq which extracts only the numeric values from a string.
var numbers = string.Concat(stringInput.Where(char.IsNumber));
Example:
var numbers = string.Concat("(787) 763-6511".Where(char.IsNumber));
Gives: "7877636511"
var outputString = String.Join("", inputString.Where(Char.IsDigit));
Get all numbers in the string.
So if you use for examaple '1 plus 2' it will get '12'.
Extension method to get all positive numbers contained in a string:
public static List<long> Numbers(this string str)
{
var nums = new List<long>();
var start = -1;
for (int i = 0; i < str.Length; i++)
{
if (start < 0 && Char.IsDigit(str[i]))
{
start = i;
}
else if (start >= 0 && !Char.IsDigit(str[i]))
{
nums.Add(long.Parse(str.Substring(start, i - start)));
start = -1;
}
}
if (start >= 0)
nums.Add(long.Parse(str.Substring(start, str.Length - start)));
return nums;
}
If you want negative numbers as well simply modify this code to handle the minus sign (-)
Given this input:
"I was born in 1989, 27 years ago from now (2016)"
The resulting numbers list will be:
[1989, 27, 2016]
An interesting approach is provided here by Ahmad Mageed, uses Regex and StringBuilder to extract the integers in the order in which they appear in the string.
An example using Regex.Split based on the post by Ahmad Mageed is as follows:
var dateText = "MARCH-14-Tue";
string splitPattern = #"[^\d]";
string[] result = Regex.Split(dateText, splitPattern);
var finalresult = string.Join("", result.Where(e => !String.IsNullOrEmpty(e)));
int DayDateInt = 0;
int.TryParse(finalresult, out DayDateInt);
I have used this one-liner to pull all numbers from any string.
var phoneNumber = "(555)123-4567";
var numsOnly = string.Join("", new Regex("[0-9]").Matches(phoneNumber)); // 5551234567
string verificationCode ="dmdsnjds5344gfgk65585";
string code = "";
Regex r1 = new Regex("\\d+");
Match m1 = r1.Match(verificationCode);
while (m1.Success)
{
code += m1.Value;
m1 = m1.NextMatch();
}
Did the reverse of one of the answers to this question:
How to remove numbers from string using Regex.Replace?
// Pull out only the numbers from the string using LINQ
var numbersFromString = new String(input.Where(x => x >= '0' && x <= '9').ToArray());
var numericVal = Int32.Parse(numbersFromString);
Here is my Algorithm
//Fast, C Language friendly
public static int GetNumber(string Text)
{
int val = 0;
for(int i = 0; i < Text.Length; i++)
{
char c = Text[i];
if (c >= '0' && c <= '9')
{
val *= 10;
//(ASCII code reference)
val += c - 48;
}
}
return val;
}
static string GetdigitFromString(string str)
{
char[] refArray = { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };
char[] inputArray = str.ToCharArray();
string ext = string.Empty;
foreach (char item in inputArray)
{
if (refArray.Contains(item))
{
ext += item.ToString();
}
}
return ext;
}
here is my solution
string var = "Hello345wor705Ld";
string alpha = string.Empty;
string numer = string.Empty;
foreach (char str in var)
{
if (char.IsDigit(str))
numer += str.ToString();
else
alpha += str.ToString();
}
Console.WriteLine("String is: " + alpha);
Console.WriteLine("Numeric character is: " + numer);
Console.Read();
You will have to use Regex as \d+
\d matches digits in the given string.
string s = "kg g L000145.50\r\n";
char theCharacter = '.';
var getNumbers = (from t in s
where char.IsDigit(t) || t.Equals(theCharacter)
select t).ToArray();
var _str = string.Empty;
foreach (var item in getNumbers)
{
_str += item.ToString();
}
double _dou = Convert.ToDouble(_str);
MessageBox.Show(_dou.ToString("#,##0.00"));
Using #tim-pietzcker answer from above, the following will work for PowerShell.
PS C:\> $str = '1 test'
PS C:\> [regex]::match($str,'\d+').value
1
Below I have a piece of code that is able to extract the number values from a piece of text. So if I have string +12 is on same day, then it will extract 12.
However if I have a negative number like -12 is on the same day, I want it to extract -12, not 12.
How I can extract the minus symbol?
foreach (char c in alternativeAirportPrice.Text)
{
if (char.IsNumber(c))
{
string test = "-12 on same day";
string alternativeAirportPriceValue = string.Join("", test.ToCharArray()
.Where(x => char.IsDigit(x)).ToArray());
return alternativeAirportPriceValue;
}
}
You can use a regex pattern:
-?\d+
This will match any string of digits or a - followed by a string of digits.
string text = "-12 on the same day";
var match = Regex.Match(text, "-?\\d+");
return match.Value;
Remember to add a using directive to System.Text.RegularExpressions!
This should be what you want based on the question's desired result. Note that you don't need a foreach loop for this purpose just LINQ is enough:
string.Join("", test.Split(' ').Where(x => int.TryParse(x , out _)).ToArray());
return alternativeAirportPriceValue;
Try this:
public IEnumerable<int> ExtractNumbers(string text)
{
text += " ";
var temp = string.Empty;
for (var i = 0; i < text.Length; i++)
{
if (char.IsDigit(text[i]))
{
if ('-'.Equals(text[i - 1]))
{
temp += text[i - 1];
}
temp += text[i];
}
else if (temp.Length > 0)
{
yield return int.Parse(temp);
temp = string.Empty;
}
}
}
This way you can handle cases where the string has multiple numbers in it, as observed by #Sir Rufo
The line:
text += " ";
is there to ensure the loop will hit the "else if" block when a number is in the last position of the string, e.g. "-12 on the same day 123456"
I have a string containing regular characters, special characters and numbers. I'm trying to remove the regular characters, just keeping the numbers and the special characters. I use a loop to check if a character is a special character or a number. Then, I replace it with an empty string. However, this doesn't seem to work because I get an error "can't apply != to string or char". My code is below. If possible, please give me some ideas to fix this. Thanks.
public string convert_string_to_no(string val)
{
string str_val = "";
int val_len = val.Length;
for (int i = 0; i < val_len; i++)
{
char myChar = Convert.ToChar(val.Substring(i, 1));
if ((char.IsDigit(myChar) == false) && (myChar != "-"))
{
str_val = str_val.replace(str_val.substring(i,1),"");
}
}
return str_val;
}
you can use regular expressions to do that.its faster than using loop and clean
String test ="Q1W2-hjkxas1-EE3R4-5T";
Regex rgx = new Regex("[^0-9-]");
Console.WriteLine(rgx.Replace(test, ""));
check the working code here
It seem try to change "-" to '-', and better to construct the string and not replacing the char.
public string convert_string_to_no(string val)
{
string str_val = "";
int val_len = val.Length;
for (int i = 0; i < val_len; i++)
{
char myChar = Convert.ToChar(val.Substring(i, 1));
if (char.IsDigit(myChar) && myChar == '-')
{
str_val += myChar;
}
}
return str_val;
}
I perfer Linq:
public static class StringExtensions
{
public static string ToLimitedString(this string instance,
string validCharacters)
{
// null reference checking...
var result = new string(instance
.Where(c => validCharacters.Contains(c))
.ToArray());
return result;
}
}
usage:
var test ="Q1W2-hjkxas1-EE3R4-5T";
var limited = test.ToLimitedString("01234567890-");
Console.WriteLine(limited);
result:
12-1-34-5
DotNetFiddle Example
Not entirely sure this is possible, but say I have two strings like so:
"IAmAString-00001"
"IAmAString-00023"
What would be a quick'n'easy way to iterate from IAmAString-0001 to IAmAString-00023 by moving up the index of just the numbers on the end?
The problem is a bit more general than that, for example the string I could be dealing could be of any format but the last bunch of chars will always be numbers, so something like Super_Confusing-String#w00t0003 and in that case the last 0003 would be what I'd use to iterate through.
Any ideas?
You can use char.IsDigit:
static void Main(string[] args)
{
var s = "IAmAString-00001";
int index = -1;
for (int i = 0; i < s.Length; i++)
{
if (char.IsDigit(s[i]))
{
index = i;
break;
}
}
if (index == -1)
Console.WriteLine("digits not found");
else
Console.WriteLine("digits: {0}", s.Substring(index));
}
which produces this output:
digits: 00001
string.Format and a for loop should do what you want.
for(int i = 0; i <=23; i++)
{
string.Format("IAmAString-{0:D4}",i);
}
or something close to that (not sitting in front of a compiler).
string start = "IAmAString-00001";
string end = "IAmAString-00023";
// match constant part and ending digits
var matchstart = Regex.Match(start,#"^(.*?)(\d+)$");
int numberstart = int.Parse(matchstart.Groups[2].Value);
var matchend = Regex.Match(end,#"^(.*?)(\d+)$");
int numberend = int.Parse(matchend.Groups[2].Value);
// constant parts must be the same
if (matchstart.Groups[1].Value != matchend.Groups[1].Value)
throw new ArgumentException("");
// create a format string with same number of digits as original
string format = new string('0', matchstart.Groups[2].Length);
for (int ii = numberstart; ii <= numberend; ++ii)
Console.WriteLine(matchstart.Groups[1].Value + ii.ToString(format));
You could use a Regex:
var match=Regex.Match("Super_Confusing-String#w00t0003",#"(?<=(^.*\D)|^)\d+$");
if(match.Success)
{
var val=int.Parse(match.Value);
Console.WriteLine(val);
}
To answer more specifically, you could use named groups to extract what you need:
var match=Regex.Match(
"Super_Confusing-String#w00t0003",
#"(?<prefix>(^.*\D)|^)(?<digits>\d+)$");
if(match.Success)
{
var prefix=match.Groups["prefix"].Value;
Console.WriteLine(prefix);
var val=int.Parse(match.Groups["digits"].Value);
Console.WriteLine(val);
}
If you can assume that the last 5 characters are the number then:
string prefix = "myprefix-";
for (int i=1; i <=23; i++)
{
Console.WriteLine(myPrefix+i.ToString("D5"));
}
This function will find the trailing number.
private int FindTrailingNumber(string str)
{
string numString = "";
int numTest;
for (int i = str.Length - 1; i > 0; i--)
{
char c = str[i];
if (int.TryParse(c.ToString(), out numTest))
{
numString = c + numString;
}
}
return int.Parse(numString);
}
Assuming all your base strings are the same, this would iterate between strings.
string s1 = "asdf123";
string s2 = "asdf127";
int num1 = FindTrailingNumber(s1);
int num2 = FindTrailingNumber(s2);
string strBase = s1.Replace(num1.ToString(), "");
for (int i = num1; i <= num2; i++)
{
Console.WriteLine(strBase + i.ToString());
}
I think it would be better if you do the search from the last (Rick already upvoted you since it was ur logic :-))
static void Main(string[] args)
{
var s = "IAmAString-00001";
int index = -1;
for (int i = s.Length - 1; i >=0; i--)
{
if (!char.IsDigit(s[i]))
{
index = i;
break;
}
}
if (index == -1)
Console.WriteLine("digits not found");
else
Console.WriteLine("digits: {0}", s.Substring(index));
Console.ReadKey();
}
HTH
If the last X numbers are always digits, then:
int x = 5;
string s = "IAmAString-00001";
int num = int.Parse(s.Substring(s.Length - x, x));
Console.WriteLine("Your Number is: {0}", num);
If the last digits can be 3, 4, or 5 in length, then you will need a little more logic:
int x = 0;
string s = "IAmAString-00001";
foreach (char c in s.Reverse())//Use Reverse() so you start with digits only.
{
if(char.IsDigit(c) == false)
break;//If we start hitting non-digit characters, then exit the loop.
++x;
}
int num = int.Parse(s.Substring(s.Length - x, x));
Console.WriteLine("Your Number is: {0}", num);
I'm not good with complicated RegEx. Because of this, I always shy away from it when maximum optimization is unnecessary. The reason for this is RegEx doesn't always parse strings the way you expect it to. If there is and alternate solution that will still run fast then I'd rather go that route as it's easier for me to understand and know that it will work with any combination of strings.
For Example: if you use some of the other solutions presented here with a string like "I2AmAString-000001", then you will get "2000001" as your number instead of "1".
I have a string User name (sales) and I want to extract the text between the brackets, how would I do this?
I suspect sub-string but I can't work out how to read until the closing bracket, the length of text will vary.
If you wish to stay away from regular expressions, the simplest way I can think of is:
string input = "User name (sales)";
string output = input.Split('(', ')')[1];
A very simple way to do it is by using regular expressions:
Regex.Match("User name (sales)", #"\(([^)]*)\)").Groups[1].Value
As a response to the (very funny) comment, here's the same Regex with some explanation:
\( # Escaped parenthesis, means "starts with a '(' character"
( # Parentheses in a regex mean "put (capture) the stuff
# in between into the Groups array"
[^)] # Any character that is not a ')' character
* # Zero or more occurrences of the aforementioned "non ')' char"
) # Close the capturing group
\) # "Ends with a ')' character"
Assuming that you only have one pair of parenthesis.
string s = "User name (sales)";
int start = s.IndexOf("(") + 1;
int end = s.IndexOf(")", start);
string result = s.Substring(start, end - start);
Use this function:
public string GetSubstringByString(string a, string b, string c)
{
return c.Substring((c.IndexOf(a) + a.Length), (c.IndexOf(b) - c.IndexOf(a) - a.Length));
}
and here is the usage:
GetSubstringByString("(", ")", "User name (sales)")
and the output would be:
sales
Regular expressions might be the best tool here. If you are not famililar with them, I recommend you install Expresso - a great little regex tool.
Something like:
Regex regex = new Regex("\\((?<TextInsideBrackets>\\w+)\\)");
string incomingValue = "Username (sales)";
string insideBrackets = null;
Match match = regex.Match(incomingValue);
if(match.Success)
{
insideBrackets = match.Groups["TextInsideBrackets"].Value;
}
string input = "User name (sales)";
string output = input.Substring(input.IndexOf('(') + 1, input.IndexOf(')') - input.IndexOf('(') - 1);
A regex maybe? I think this would work...
\(([a-z]+?)\)
using System;
using System.Text.RegularExpressions;
private IEnumerable<string> GetSubStrings(string input, string start, string end)
{
Regex r = new Regex(Regex.Escape(start) +`"(.*?)"` + Regex.Escape(end));
MatchCollection matches = r.Matches(input);
foreach (Match match in matches)
yield return match.Groups[1].Value;
}
int start = input.IndexOf("(") + 1;
int length = input.IndexOf(")") - start;
output = input.Substring(start, length);
Use a Regular Expression:
string test = "(test)";
string word = Regex.Match(test, #"\((\w+)\)").Groups[1].Value;
Console.WriteLine(word);
input.Remove(input.IndexOf(')')).Substring(input.IndexOf('(') + 1);
The regex method is superior I think, but if you wanted to use the humble substring
string input= "my name is (Jayne C)";
int start = input.IndexOf("(");
int stop = input.IndexOf(")");
string output = input.Substring(start+1, stop - start - 1);
or
string input = "my name is (Jayne C)";
string output = input.Substring(input.IndexOf("(") +1, input.IndexOf(")")- input.IndexOf("(")- 1);
var input = "12(34)1(12)(14)234";
var output = "";
for (int i = 0; i < input.Length; i++)
{
if (input[i] == '(')
{
var start = i + 1;
var end = input.IndexOf(')', i + 1);
output += input.Substring(start, end - start) + ",";
}
}
if (output.Length > 0) // remove last comma
output = output.Remove(output.Length - 1);
output : "34,12,14"
Here is a general purpose readable function that avoids using regex:
// Returns the text between 'start' and 'end'.
string ExtractBetween(string text, string start, string end)
{
int iStart = text.IndexOf(start);
iStart = (iStart == -1) ? 0 : iStart + start.Length;
int iEnd = text.LastIndexOf(end);
if(iEnd == -1)
{
iEnd = text.Length;
}
int len = iEnd - iStart;
return text.Substring(iStart, len);
}
To call it in your particular example you can do:
string result = ExtractBetween("User name (sales)", "(", ")");
I'm finding that regular expressions are extremely useful but very difficult to write. So, I did some research and found this tool that makes writing them so easy.
Don't shy away from them because the syntax is difficult to figure out. They can be so powerful.
This code is faster than most solutions here (if not all), packed as String extension method, it does not support recursive nesting:
public static string GetNestedString(this string str, char start, char end)
{
int s = -1;
int i = -1;
while (++i < str.Length)
if (str[i] == start)
{
s = i;
break;
}
int e = -1;
while(++i < str.Length)
if (str[i] == end)
{
e = i;
break;
}
if (e > s)
return str.Substring(s + 1, e - s - 1);
return null;
}
This one is little longer and slower, but it handles recursive nesting more nicely:
public static string GetNestedString(this string str, char start, char end)
{
int s = -1;
int i = -1;
while (++i < str.Length)
if (str[i] == start)
{
s = i;
break;
}
int e = -1;
int depth = 0;
while (++i < str.Length)
if (str[i] == end)
{
e = i;
if (depth == 0)
break;
else
--depth;
}
else if (str[i] == start)
++depth;
if (e > s)
return str.Substring(s + 1, e - s - 1);
return null;
}
I've been using and abusing C#9 recently and I can't help throwing in Spans even in questionable scenarios... Just for the fun of it, here's a variation on the answers above:
var input = "User name (sales)";
var txtSpan = input.AsSpan();
var startPoint = txtSpan.IndexOf('(') + 1;
var length = txtSpan.LastIndexOf(')') - startPoint;
var output = txtSpan.Slice(startPoint, length);
For the OP's specific scenario, it produces the right output.
(Personally, I'd use RegEx, as posted by others. It's easier to get around the more tricky scenarios where the solution above falls apart).
A better version (as extension method) I made for my own project:
//Note: This only captures the first occurrence, but
//can be easily modified to scan across the text (I'd prefer Slicing a Span)
public static string ExtractFromBetweenChars(this string txt, char openChar, char closeChar)
{
ReadOnlySpan<char> span = txt.AsSpan();
int firstCharPos = span.IndexOf(openChar);
int lastCharPos = -1;
if (firstCharPos != -1)
{
for (int n = firstCharPos + 1; n < span.Length; n++)
{
if (span[n] == openChar) firstCharPos = n; //This allows the opening char position to change
if (span[n] == closeChar) lastCharPos = n;
if (lastCharPos > firstCharPos) break;
//This would correctly extract "sales" from this [contrived]
//example: "just (a (name (sales) )))(test"
}
return span.Slice(firstCharPos + 1, lastCharPos - firstCharPos - 1).ToString();
}
return "";
}
Much similar to #Gustavo Baiocchi Costa but offset is being calculated with another intermediate Substring.
int innerTextStart = input.IndexOf("(") + 1;
int innerTextLength = input.Substring(start).IndexOf(")");
string output = input.Substring(innerTextStart, innerTextLength);
I came across this while I was looking for a solution to a very similar implementation.
Here is a snippet from my actual code. Starts substring from the first char (index 0).
string separator = "\n"; //line terminator
string output;
string input= "HowAreYou?\nLets go there!";
output = input.Substring(0, input.IndexOf(separator));