Extract number from string with C# Regex [duplicate] - c#

I have a requirement to find and extract a number contained within a string.
For example, from these strings:
string test = "1 test";
string test1 = " 1 test";
string test2 = "test 99";
How can I do this?

\d+ is the regex for an integer number. So
//System.Text.RegularExpressions.Regex
resultString = Regex.Match(subjectString, @"\d+").Value;
returns a string containing the first occurrence of a number in subjectString.
Int32.Parse(resultString) will then give you the number.
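A minimal sketch of the same idea that avoids the exception Int32.Parse throws when the string contains no digits. The class and method names (FirstNumber, TryExtract) are made up for illustration and are not part of the original answer.

```csharp
using System.Text.RegularExpressions;

public static class FirstNumber
{
    // Returns true and sets 'value' to the first run of digits in 'text'.
    // Returns false when there are no digits or the run does not fit in an int.
    public static bool TryExtract(string text, out int value)
    {
        value = 0;
        Match m = Regex.Match(text ?? string.Empty, @"\d+");
        return m.Success && int.TryParse(m.Value, out value);
    }
}
```

FirstNumber.TryExtract("test 99", out int n) returns true with n set to 99; for a string without digits it returns false instead of throwing.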

Here's how I cleanse phone numbers to get the digits only:
string numericPhone = new String(phone.Where(Char.IsDigit).ToArray());

go through the string and use Char.IsDigit
string a = "str123";
string b = string.Empty;
int val;
for (int i=0; i< a.Length; i++)
{
if (Char.IsDigit(a[i]))
b += a[i];
}
if (b.Length>0)
val = int.Parse(b);

use regular expression ...
Regex re = new Regex(@"\d+");
Match m = re.Match("test 66");
if (m.Success)
{
Console.WriteLine(string.Format("RegEx found " + m.Value + " at position " + m.Index.ToString()));
}
else
{
Console.WriteLine("You didn't enter a string containing a number!");
}

What I use to get Phone Numbers without any punctuation...
var phone = "(787) 763-6511";
string.Join("", phone.ToCharArray().Where(Char.IsDigit));
// result: 7877636511

Regex.Split can extract numbers from strings. You get all the numbers that are found in a string.
string input = "There are 4 numbers in this string: 40, 30, and 10.";
// Split on one or more non-digit characters.
string[] numbers = Regex.Split(input, @"\D+");
foreach (string value in numbers)
{
if (!string.IsNullOrEmpty(value))
{
int i = int.Parse(value);
Console.WriteLine("Number: {0}", i);
}
}
Output:
Number: 4
Number: 40
Number: 30
Number: 10

If the number has a decimal point, you can use the following:
using System;
using System.Text.RegularExpressions;
namespace Rextester
{
public class Program
{
public static void Main(string[] args)
{
//Your code goes here
Console.WriteLine(Regex.Match("anything 876.8 anything", @"\d+\.*\d*").Value);
Console.WriteLine(Regex.Match("anything 876 anything", @"\d+\.*\d*").Value);
Console.WriteLine(Regex.Match("$876435", @"\d+\.*\d*").Value);
Console.WriteLine(Regex.Match("$876.435", @"\d+\.*\d*").Value);
}
}
}
results :
"anything 876.8 anything" ==> 876.8
"anything 876 anything" ==> 876
"$876435" ==> 876435
"$876.435" ==> 876.435
Sample : https://dotnetfiddle.net/IrtqVt
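A hedged variation on the decimal pattern above, in case the matched text needs to become a decimal value. It swaps in the stricter pattern \d+(\.\d+)? (an assumption, not the answer's pattern) so that a trailing dot is not captured, and parses with the invariant culture so '.' is always the decimal separator. The names DecimalExtractor and First are hypothetical.

```csharp
using System.Globalization;
using System.Text.RegularExpressions;

public static class DecimalExtractor
{
    // Digits with an optional fractional part, e.g. "876" or "876.8".
    // For the input "876." this matches only "876".
    private static readonly Regex Pattern = new Regex(@"\d+(\.\d+)?");

    // Returns the first number in 'text', or null when none can be parsed.
    public static decimal? First(string text)
    {
        Match m = Pattern.Match(text);
        if (!m.Success) return null;
        // InvariantCulture: '.' is the decimal separator regardless of locale.
        return decimal.TryParse(m.Value, NumberStyles.AllowDecimalPoint,
                                CultureInfo.InvariantCulture, out decimal d)
            ? d : (decimal?)null;
    }
}
```

DecimalExtractor.First("anything 876.8 anything") gives 876.8, and a string with no digits gives null.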

Here's a Linq version:
string s = "123iuow45ss";
var getNumbers = (from t in s
where char.IsDigit(t)
select t).ToArray();
Console.WriteLine(new string(getNumbers));

Another simple solution using Regex
You need this using directive:
using System.Text.RegularExpressions;
and the code is
string var = "Hello3453232wor705Ld";
string mystr = Regex.Replace(var, @"\d", "");
string mynumber = Regex.Replace(var, @"\D", "");
Console.WriteLine(mystr);
Console.WriteLine(mynumber);

You can also try this
string.Join(null,System.Text.RegularExpressions.Regex.Split(expr, "[^\\d]"));

Here is another Linq approach which extracts the first number out of a string.
string input = "123 foo 456";
int result = 0;
bool success = int.TryParse(new string(input
.SkipWhile(x => !char.IsDigit(x))
.TakeWhile(x => char.IsDigit(x))
.ToArray()), out result);
Examples:
string input = "123 foo 456"; // 123
string input = "foo 456"; // 456
string input = "123 foo"; // 123

Just use a RegEx to match the string, then convert:
Match match = Regex.Match(test, @"(\d+)");
if (match.Success) {
return int.Parse(match.Groups[1].Value);
}

string input = "Hello 20, I am 30 and he is 40";
var numbers = Regex.Matches(input, @"\d+").OfType<Match>().Select(m => int.Parse(m.Value)).ToArray();

You can do this using String property like below:
return new String(input.Where(Char.IsDigit).ToArray());
which gives only number from string.

For those who want a decimal from a string with Regex in two lines:
decimal result = 0;
decimal.TryParse(Regex.Match(s, @"\d+").Value, out result);
The same thing applies to float, long, etc.

var match = Regex.Match(@"a99b", @"\d+");
if(match.Success)
{
int val;
if(int.TryParse(match.Value,out val))
{
//val is set
}
}

The question doesn't explicitly state that you just want the characters 0 to 9 but it wouldn't be a stretch to believe that is true from your example set and comments. So here is the code that does that.
string digitsOnly = String.Empty;
foreach (char c in s)
{
// Do not use IsDigit as it will include more than the characters 0 through to 9
if (c >= '0' && c <= '9') digitsOnly += c;
}
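Why you don't want to use Char.IsDigit(): it returns true for every Unicode decimal digit, including script-specific digits such as the Arabic-Indic '٢', not just 0 through 9. Char.IsNumber is broader still: it also accepts fractions, subscripts, superscripts, Roman numerals, currency numerators, and encircled numbers.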
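To make the difference concrete, here is a small sketch comparing the two filters. The class and method names (DigitFilters, UnicodeDigits, AsciiDigits) are invented for this example.

```csharp
using System.Linq;

public static class DigitFilters
{
    // Keeps every character Char.IsDigit accepts (any Unicode decimal digit).
    public static string UnicodeDigits(string s) =>
        new string(s.Where(char.IsDigit).ToArray());

    // Keeps only the ASCII characters '0' through '9'.
    public static string AsciiDigits(string s) =>
        new string(s.Where(c => c >= '0' && c <= '9').ToArray());
}
```

For the input "a1\u0662b" (which contains the Arabic-Indic digit two), UnicodeDigits keeps both digit characters while AsciiDigits returns just "1". Only the second result is guaranteed to be accepted by int.Parse.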

Here is another simple solution using Linq which extracts only the numeric values from a string.
var numbers = string.Concat(stringInput.Where(char.IsNumber));
Example:
var numbers = string.Concat("(787) 763-6511".Where(char.IsNumber));
Gives: "7877636511"

var outputString = String.Join("", inputString.Where(Char.IsDigit));
Gets all the digits in the string.
So if you use, for example, '1 plus 2', it will return '12'.

Extension method to get all positive numbers contained in a string:
public static List<long> Numbers(this string str)
{
var nums = new List<long>();
var start = -1;
for (int i = 0; i < str.Length; i++)
{
if (start < 0 && Char.IsDigit(str[i]))
{
start = i;
}
else if (start >= 0 && !Char.IsDigit(str[i]))
{
nums.Add(long.Parse(str.Substring(start, i - start)));
start = -1;
}
}
if (start >= 0)
nums.Add(long.Parse(str.Substring(start, str.Length - start)));
return nums;
}
If you want negative numbers as well simply modify this code to handle the minus sign (-)
Given this input:
"I was born in 1989, 27 years ago from now (2016)"
The resulting numbers list will be:
[1989, 27, 2016]
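One possible way to handle the minus sign mentioned above is to let a regex do the scanning. This is a sketch under that assumption rather than a modification of the loop itself; SignedNumbers and Extract are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class SignedNumbers
{
    // Returns every integer in 'str', treating a '-' directly before the
    // digits as a sign. Note that "5-3" therefore yields 5 and -3.
    // long.Parse throws OverflowException for runs too long for a long.
    public static List<long> Extract(string str) =>
        Regex.Matches(str, @"-?[0-9]+")
             .Cast<Match>()
             .Select(m => long.Parse(m.Value))
             .ToList();
}
```

With the input "I was born in 1989, -27 years" the list contains 1989 and -27.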

Ahmad Mageed provides an interesting approach here that uses Regex and StringBuilder to extract the integers in the order in which they appear in the string.
An example using Regex.Split based on the post by Ahmad Mageed is as follows:
var dateText = "MARCH-14-Tue";
string splitPattern = @"[^\d]";
string[] result = Regex.Split(dateText, splitPattern);
var finalresult = string.Join("", result.Where(e => !String.IsNullOrEmpty(e)));
int DayDateInt = 0;
int.TryParse(finalresult, out DayDateInt);

I have used this one-liner to pull all numbers from any string.
var phoneNumber = "(555)123-4567";
var numsOnly = string.Join("", new Regex("[0-9]").Matches(phoneNumber)); // 5551234567

string verificationCode ="dmdsnjds5344gfgk65585";
string code = "";
Regex r1 = new Regex("\\d+");
Match m1 = r1.Match(verificationCode);
while (m1.Success)
{
code += m1.Value;
m1 = m1.NextMatch();
}

Did the reverse of one of the answers to this question:
How to remove numbers from string using Regex.Replace?
// Pull out only the numbers from the string using LINQ
var numbersFromString = new String(input.Where(x => x >= '0' && x <= '9').ToArray());
var numericVal = Int32.Parse(numbersFromString);

Here is my Algorithm
//Fast, C Language friendly
public static int GetNumber(string Text)
{
int val = 0;
for(int i = 0; i < Text.Length; i++)
{
char c = Text[i];
if (c >= '0' && c <= '9')
{
val *= 10;
//(ASCII code reference)
val += c - 48;
}
}
return val;
}

static string GetdigitFromString(string str)
{
char[] refArray = { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };
char[] inputArray = str.ToCharArray();
string ext = string.Empty;
foreach (char item in inputArray)
{
if (refArray.Contains(item))
{
ext += item.ToString();
}
}
return ext;
}

here is my solution
string var = "Hello345wor705Ld";
string alpha = string.Empty;
string numer = string.Empty;
foreach (char str in var)
{
if (char.IsDigit(str))
numer += str.ToString();
else
alpha += str.ToString();
}
Console.WriteLine("String is: " + alpha);
Console.WriteLine("Numeric character is: " + numer);
Console.Read();

You will have to use Regex as \d+
\d matches digits in the given string.

string s = "kg g L000145.50\r\n";
char theCharacter = '.';
var getNumbers = (from t in s
where char.IsDigit(t) || t.Equals(theCharacter)
select t).ToArray();
var _str = string.Empty;
foreach (var item in getNumbers)
{
_str += item.ToString();
}
double _dou = Convert.ToDouble(_str);
MessageBox.Show(_dou.ToString("#,##0.00"));

Using @tim-pietzcker's answer from above, the following will work for PowerShell.
PS C:\> $str = '1 test'
PS C:\> [regex]::match($str,'\d+').value
1

Related

How to stop String.Concat(); from removing whitespaces? [duplicate]

I would like to split a string with delimiters but keep the delimiters in the result.
How would I do this in C#?
If the split chars were ,, ., and ;, I'd try:
using System.Text.RegularExpressions;
...
string[] parts = Regex.Split(originalString, @"(?<=[.,;])");
(?<=PATTERN) is positive look-behind for PATTERN. It should match at any place where the preceding text fits PATTERN so there should be a match (and a split) after each occurrence of any of the characters.
If you want the delimiter to be its "own split", you can use Regex.Split e.g.:
string input = "plum-pear";
string pattern = "(-)";
string[] substrings = Regex.Split(input, pattern); // Split on hyphens
foreach (string match in substrings)
{
Console.WriteLine("'{0}'", match);
}
// The method writes the following to the console:
// 'plum'
// '-'
// 'pear'
So if you are looking for splitting a mathematical formula, you can use the following Regex
@"([*()\^\/]|(?<!E)[\+\-])"
This will ensure you can also use constants like 1E-02 and avoid having them split into 1E, - and 02
So:
Regex.Split("10E-02*x+sin(x)^2", @"([*()\^\/]|(?<!E)[\+\-])")
Yields:
10E-02
*
x
+
sin
(
x
)
^
2
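The two regex techniques described above (a look-behind that splits after each delimiter, and a capture group that returns the delimiter as its own element) can be compared side by side. This is only a sketch; KeepDelimiters, Attached, and Separate are illustrative names, and empty entries are filtered out as an extra step.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class KeepDelimiters
{
    // Look-behind: the delimiter stays attached to the end of the preceding piece.
    public static string[] Attached(string s) =>
        Regex.Split(s, @"(?<=[.,;])").Where(p => p.Length > 0).ToArray();

    // Capture group: Regex.Split includes captured text in the result,
    // so each delimiter becomes its own element.
    public static string[] Separate(string s) =>
        Regex.Split(s, @"([.,;])").Where(p => p.Length > 0).ToArray();
}
```

For "a,b;c." the first method returns "a,", "b;", "c." and the second returns "a", ",", "b", ";", "c", ".".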
Building off from BFree's answer, I had the same goal, but I wanted to split on an array of characters similar to the original Split method, and I also have multiple splits per string:
public static IEnumerable<string> SplitAndKeep(this string s, char[] delims)
{
int start = 0, index;
while ((index = s.IndexOfAny(delims, start)) != -1)
{
if(index-start > 0)
yield return s.Substring(start, index - start);
yield return s.Substring(index, 1);
start = index + 1;
}
if (start < s.Length)
{
yield return s.Substring(start);
}
}
Just in case anyone wants this answer as well...
Instead of string[] parts = Regex.Split(originalString, @"(?<=[.,;])") you could use string[] parts = Regex.Split(originalString, @"(?=yourmatch)") where yourmatch is whatever your separator is.
Supposing the original string was
777- cat
777 - dog
777 - mouse
777 - rat
777 - wolf
Regex.Split(originalString, @"(?=777)") would return
777 - cat
777 - dog
and so on
This version does not use LINQ or Regex and so it's probably relatively efficient. I think it might be easier to use than the Regex because you don't have to worry about escaping special delimiters. It returns an IList<string> which is more efficient than always converting to an array. It's an extension method, which is convenient. You can pass in the delimiters as either an array or as multiple parameters.
/// <summary>
/// Splits the given string into a list of substrings, while outputting the splitting
/// delimiters (each in its own string) as well. It's just like String.Split() except
/// the delimiters are preserved. No empty strings are output.</summary>
/// <param name="s">String to parse. Can be null or empty.</param>
/// <param name="delimiters">The delimiting characters. Can be an empty array.</param>
/// <returns></returns>
public static IList<string> SplitAndKeepDelimiters(this string s, params char[] delimiters)
{
var parts = new List<string>();
if (!string.IsNullOrEmpty(s))
{
int iFirst = 0;
do
{
int iLast = s.IndexOfAny(delimiters, iFirst);
if (iLast >= 0)
{
if (iLast > iFirst)
parts.Add(s.Substring(iFirst, iLast - iFirst)); //part before the delimiter
parts.Add(new string(s[iLast], 1));//the delimiter
iFirst = iLast + 1;
continue;
}
//No delimiters were found, but at least one character remains. Add the rest and stop.
parts.Add(s.Substring(iFirst, s.Length - iFirst));
break;
} while (iFirst < s.Length);
}
return parts;
}
Some unit tests:
text = "[a link|http://www.google.com]";
result = text.SplitAndKeepDelimiters('[', '|', ']');
Assert.IsTrue(result.Count == 5);
Assert.AreEqual(result[0], "[");
Assert.AreEqual(result[1], "a link");
Assert.AreEqual(result[2], "|");
Assert.AreEqual(result[3], "http://www.google.com");
Assert.AreEqual(result[4], "]");
A lot of answers to this! One I knocked up to split by various strings (the original answer caters for just characters i.e. length of 1). This hasn't been fully tested.
public static IEnumerable<string> SplitAndKeep(string s, params string[] delims)
{
var rows = new List<string>() { s };
foreach (string delim in delims)//delimiter counter
{
for (int i = 0; i < rows.Count; i++)//row counter
{
int index = rows[i].IndexOf(delim);
if (index > -1
&& rows[i].Length > index + 1)
{
string leftPart = rows[i].Substring(0, index + delim.Length);
string rightPart = rows[i].Substring(index + delim.Length);
rows[i] = leftPart;
rows.Insert(i + 1, rightPart);
}
}
}
return rows;
}
This seems to work, but its not been tested much.
public static string[] SplitAndKeepSeparators(string value, char[] separators, StringSplitOptions splitOptions)
{
List<string> splitValues = new List<string>();
int itemStart = 0;
for (int pos = 0; pos < value.Length; pos++)
{
for (int sepIndex = 0; sepIndex < separators.Length; sepIndex++)
{
if (separators[sepIndex] == value[pos])
{
// add the section of string before the separator
// (unless its empty and we are discarding empty sections)
if (itemStart != pos || splitOptions == StringSplitOptions.None)
{
splitValues.Add(value.Substring(itemStart, pos - itemStart));
}
itemStart = pos + 1;
// add the separator
splitValues.Add(separators[sepIndex].ToString());
break;
}
}
}
// add anything after the final separator
// (unless its empty and we are discarding empty sections)
if (itemStart != value.Length || splitOptions == StringSplitOptions.None)
{
splitValues.Add(value.Substring(itemStart, value.Length - itemStart));
}
return splitValues.ToArray();
}
Recently I wrote an extension method to do this:
public static class StringExtensions
{
public static IEnumerable<string> SplitAndKeep(this string s, string seperator)
{
string[] obj = s.Split(new string[] { seperator }, StringSplitOptions.None);
for (int i = 0; i < obj.Length; i++)
{
string result = i == obj.Length - 1 ? obj[i] : obj[i] + seperator;
yield return result;
}
}
}
I'd say the easiest way to accomplish this (except for the argument Hans Kesting brought up) is to split the string the regular way, then iterate over the array and add the delimiter to every element but the last.
To avoid adding character to new line try this :
string[] substrings = Regex.Split(input, @"(?<=[-])");
result = originalString.Split(separator);
for(int i = 0; i < result.Length - 1; i++)
result[i] += separator;
(EDIT - this is a bad answer - I misread his question and didn't see that he was splitting by multiple characters.)
(EDIT - a correct LINQ version is awkward, since the separator shouldn't get concatenated onto the final string in the split array.)
Iterate through the string character by character (which is what regex does anyway).
When you find a splitter, then spin off a substring.
pseudo code
int hold, counter;
List<String> afterSplit;
string toSplit
for(hold = 0, counter = 0; counter < toSplit.Length; counter++)
{
if(toSplit[counter] = /*split charaters*/)
{
afterSplit.Add(toSplit.Substring(hold, counter));
hold = counter;
}
}
That's sort of C# but not really. Obviously, choose the appropriate function names.
Also, I think there might be an off-by-1 error in there.
But that will do what you're asking.
veggerby's answer modified to
have no string items in the list
have fixed string as delimiter like "ab" instead of single character
var delimiter = "ab";
var text = "ab33ab9ab";
var parts = Regex.Split(text, $@"({Regex.Escape(delimiter)})")
.Where(p => p != string.Empty)
.ToList();
// parts = "ab", "33", "ab", "9", "ab"
The Regex.Escape() is there just in case your delimiter contains characters which regex interprets as special pattern commands (like *, () and thus have to be escaped.
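A short sketch of why the escaping matters, wrapped in a helper whose names (SplitOnString, SplitKeep) are assumptions made for this example. With a delimiter such as "*", the unescaped pattern "(*)" is not a valid regular expression, while the escaped one splits as expected.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class SplitOnString
{
    // Splits 'text' on a fixed string delimiter and keeps each delimiter
    // as its own element. Regex.Escape makes characters such as '*' or '('
    // match literally instead of being treated as pattern syntax.
    public static List<string> SplitKeep(string text, string delimiter) =>
        Regex.Split(text, "(" + Regex.Escape(delimiter) + ")")
             .Where(p => p.Length > 0)
             .ToList();
}
```

SplitOnString.SplitKeep("1*2*3", "*") returns "1", "*", "2", "*", "3".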
using System.Collections.Generic;
using System.Text.RegularExpressions;
namespace ConsoleApplication9
{
class Program
{
static void Main(string[] args)
{
string input = @"This;is:a.test";
char sep0 = ';', sep1 = ':', sep2 = '.';
string pattern = string.Format("[{0}{1}{2}]|[^{0}{1}{2}]+", sep0, sep1, sep2);
Regex regex = new Regex(pattern);
MatchCollection matches = regex.Matches(input);
List<string> parts=new List<string>();
foreach (Match match in matches)
{
parts.Add(match.ToString());
}
}
}
}
I wanted to do a multiline string like this but needed to keep the line breaks so I did this
string x =
@"line 1 {0}
line 2 {1}
";
foreach(var line in string.Format(x, "one", "two")
.Split("\n")
.Select(part => part.Contains('\r') ? part + '\n' : part) // renamed: a lambda parameter cannot reuse the name of the local x
.AsEnumerable()
) {
Console.Write(line);
}
yields
line 1 one
line 2 two
I came across same problem but with multiple delimiters. Here's my solution:
public static string[] SplitLeft(this string @this, char[] delimiters, int count)
{
var splits = new List<string>();
int next = -1;
while (splits.Count + 1 < count && (next = @this.IndexOfAny(delimiters, next + 1)) >= 0)
{
splits.Add(@this.Substring(0, next));
@this = new string(@this.Skip(next).ToArray());
}
splits.Add(@this);
return splits.ToArray();
}
Sample with separating CamelCase variable names:
var variableSplit = variableName.SplitLeft(
Enumerable.Range('A', 26).Select(i => (char)i).ToArray());
I wrote this code to split and keep delimiters:
private static string[] SplitKeepDelimiters(string toSplit, char[] delimiters, StringSplitOptions splitOptions = StringSplitOptions.None)
{
var tokens = new List<string>();
int idx = 0;
for (int i = 0; i < toSplit.Length; ++i)
{
if (delimiters.Contains(toSplit[i]))
{
tokens.Add(toSplit.Substring(idx, i - idx)); // token found
tokens.Add(toSplit[i].ToString()); // delimiter
idx = i + 1; // start idx for the next token
}
}
// last token
tokens.Add(toSplit.Substring(idx));
if (splitOptions == StringSplitOptions.RemoveEmptyEntries)
{
tokens = tokens.Where(token => token.Length > 0).ToList();
}
return tokens.ToArray();
}
Usage example:
string toSplit = "AAA,BBB,CCC;DD;,EE,";
char[] delimiters = new char[] {',', ';'};
string[] tokens = SplitKeepDelimiters(toSplit, delimiters, StringSplitOptions.RemoveEmptyEntries);
foreach (var token in tokens)
{
Console.WriteLine(token);
}

how to extract a negative symbol from text

Below I have a piece of code that is able to extract the number values from a piece of text. So if I have string +12 is on same day, then it will extract 12.
However if I have a negative number like -12 is on the same day, I want it to extract -12, not 12.
How I can extract the minus symbol?
foreach (char c in alternativeAirportPrice.Text)
{
if (char.IsNumber(c))
{
string test = "-12 on same day";
string alternativeAirportPriceValue = string.Join("", test.ToCharArray()
.Where(x => char.IsDigit(x)).ToArray());
return alternativeAirportPriceValue;
}
}
You can use a regex pattern:
-?\d+
This will match any string of digits or a - followed by a string of digits.
string text = "-12 on the same day";
var match = Regex.Match(text, "-?\\d+");
return match.Value;
Remember to add a using directive to System.Text.RegularExpressions!
This should be what you want based on the question's desired result. Note that you don't need a foreach loop for this purpose just LINQ is enough:
string alternativeAirportPriceValue = string.Join("", test.Split(' ').Where(x => int.TryParse(x, out _)).ToArray());
return alternativeAirportPriceValue;
Try this:
public IEnumerable<int> ExtractNumbers(string text)
{
text += " ";
var temp = string.Empty;
for (var i = 0; i < text.Length; i++)
{
if (char.IsDigit(text[i]))
{
if (i > 0 && '-'.Equals(text[i - 1])) // i > 0 guards against a digit at index 0
{
temp += text[i - 1];
}
temp += text[i];
}
else if (temp.Length > 0)
{
yield return int.Parse(temp);
temp = string.Empty;
}
}
}
This way you can handle cases where the string has multiple numbers in it, as observed by @Sir Rufo
The line:
text += " ";
is there to ensure the loop will hit the "else if" block when a number is in the last position of the string, e.g. "-12 on the same day 123456"

Getting number from a string in C#

I am scraping some website content which is like this - "Company Stock Rs. 7100".
Now, what I want is to extract the numeric value from this string. I tried split but something or the other goes wrong with my regular expression.
Please let me know how to get this value.
Use:
var result = Regex.Match(input, @"\d+").Value;
If you want to find only number which is last "entity" in the string you should use this regex:
\d+$
If you want to match last number in the string, you can use:
\d+(?!\D*\d)
int val = int.Parse(Regex.Match(input, @"\d+", RegexOptions.RightToLeft).Value);
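The RightToLeft option makes the regex engine scan from the end of the input, so the first match it reports is the last number in the string. A minimal sketch that also copes with strings containing no number; LastNumber and Find are hypothetical names.

```csharp
using System.Text.RegularExpressions;

public static class LastNumber
{
    // Scans from the end of the string, so the first match found
    // is the last run of digits in 'input'.
    // Returns null when there are no digits or the run does not fit in an int.
    public static int? Find(string input)
    {
        Match m = Regex.Match(input, @"[0-9]+", RegexOptions.RightToLeft);
        return m.Success && int.TryParse(m.Value, out int n) ? n : (int?)null;
    }
}
```

LastNumber.Find("Company Stock Rs. 7100") gives 7100, and LastNumber.Find("a 12 b 34 c") gives 34.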
I always liked LINQ:
var theNumber = new string(theString.Where(x => char.IsNumber(x)).ToArray()); // Where alone yields IEnumerable<char>, not a string
Though Regex sounds like the native choice...
This code will return the integer at the end of the string. This will work better than the regular expressions in the case that there is a number somewhere else in the string.
public int getLastInt(string line)
{
int offset = line.Length;
for (int i = line.Length - 1; i >= 0; i--)
{
char c = line[i];
if (char.IsDigit(c))
{
offset--;
}
else
{
if (offset == line.Length)
{
// No int at the end
return -1;
}
return int.Parse(line.Substring(offset));
}
}
return int.Parse(line.Substring(offset));
}
If your number is always after the last space and your string always ends with this number, you can get it this way:
str.Substring(str.LastIndexOf(" ") + 1)
Here is my answer ....it is separating numeric from string using C#....
static void Main(string[] args)
{
String details = "XSD34AB67";
string numeric = "";
string nonnumeric = "";
char[] mychar = details.ToCharArray();
foreach (char ch in mychar)
{
if (char.IsDigit(ch))
{
numeric = numeric + ch.ToString();
}
else
{
nonnumeric = nonnumeric + ch.ToString();
}
}
int i = Convert.ToInt32(numeric);
Console.WriteLine(numeric);
Console.WriteLine(nonnumeric);
Console.ReadLine();
}
You can use \d+ to match the first occurrence of a number:
string num = Regex.Match(input, @"\d+").Value;

How to get number from string in C#

I have a string in HTML, (1-3 of 3 Trip). How do I get the number 3 (before Trip) and convert it to an int? I want to use it as a count.
Found this code
public static string GetNumberFromStr(string str)
{
str = str.Trim();
Match m = Regex.Match(str, @"^[\+\-]?\d*\.?[Ee]?[\+\-]?\d*$");
return (m.Value);
}
But it can only get 1 number
Regex is unnecessary overhead in your case. try this:
int ExtractNumber(string input)
{
int number = Convert.ToInt32(input.Split(' ')[2]);
return number;
}
Other useful methods for Googlers:
// throws exception if it fails
int i = int.Parse(someString);
// returns false if it fails, returns true and changes `i` if it succeeds
bool b = int.TryParse(someString, out i);
// this one is able to convert any numeric Unicode character to a double. Returns -1 if it fails
double two = char.GetNumericValue('٢');
Forget Regex. This code splits the string using a space as a delimiter and gets the number in the index 2 position.
string trip = "1-3 of 3 trip";
string[] array = trip.Split(' ');
int theNumberYouWant = int.Parse(array[2]);
Try this:
public static int GetNumberFromStr(string str)
{
str = str.Trim();
Match m = Regex.Match(str, @"^.*of\s(?<TripCount>\d+)");
return m.Groups["TripCount"].Length > 0 ? int.Parse(m.Groups["TripCount"].Value) : 0;
}
Another way to do it:
public static int[] GetNumbersFromString(string str)
{
List<int> result = new List<int>();
string[] numbers = Regex.Split(str, @"\D+");
int i;
foreach (string value in numbers)
{
if (int.TryParse(value, out i))
{
result.Add(i);
}
}
return result.ToArray();
}
Example of how to use:
const string input = "There are 4 numbers in this string: 40, 30, and 10.";
int[] numbers = MyHelperClass.GetNumbersFromString(input);
for (int i = 0; i < numbers.Length; i++)
{
Console.WriteLine("Number {0}: {1}", i + 1, numbers[i]);
}
Output:
Number 1: 4
Number 2: 40
Number 3: 30
Number 4: 10
Thanks to: http://www.dotnetperls.com/regex-split-numbers
If I'm reading your question properly, you'll get a string that is a single digit number followed by ' Trip' and you want to get the numeric value out?
public static int GetTripNumber(string tripEntry)
{
return int.Parse(tripEntry[0].ToString()); // int.Parse needs a string, not a char
}
Not really sure if you mean that you always have "(x-y of y Trip)" as a part of the string you parse... If you look at the pattern, it only catches the "x-y" part, though with the acceptance of .Ee+- as separators. If you want to catch the "y Trip" part you will have to use another regex instead.
You could do something simple like this, if you change the return type to int instead of string:
Match m = Regex.Match(str, @"(?<maxTrips>\d+)\sTrip");
return m.Groups["maxTrips"].Length > 0 ? Convert.ToInt32(m.Groups["maxTrips"].Value) : 0;

Parse an integer from a string with trailing garbage

I need to parse a decimal integer that appears at the start of a string.
There may be trailing garbage following the decimal number. This needs to be ignored (even if it contains other numbers.)
e.g.
"1" => 1
" 42 " => 42
" 3 -.X.-" => 3
" 2 3 4 5" => 2
Is there a built-in method in the .NET framework to do this?
int.TryParse() is not suitable. It allows trailing spaces but not other trailing characters.
It would be quite easy to implement this but I would prefer to use the standard method if it exists.
You can use Linq to do this, no Regular Expressions needed:
public static int GetLeadingInt(string input)
{
return Int32.Parse(new string(input.Trim().TakeWhile(c => char.IsDigit(c) || c == '.').ToArray()));
}
This works for all your provided examples:
string[] tests = new string[] {
"1",
" 42 ",
" 3 -.X.-",
" 2 3 4 5"
};
foreach (string test in tests)
{
Console.WriteLine("Result: " + GetLeadingInt(test));
}
foreach (var m in Regex.Matches(" 3 - .x. 4", @"\d+"))
{
Console.WriteLine(m);
}
Updated per comments
Not sure why you don't like regular expressions, so I'll just post what I think is the shortest solution.
To get first int:
Match match = Regex.Match(" 3 - .x. - 4", @"\d+");
if (match.Success)
Console.WriteLine(int.Parse(match.Value));
There's no standard .NET method for doing this - although I wouldn't be surprised to find that VB had something in the Microsoft.VisualBasic assembly (which is shipped with .NET, so it's not an issue to use it even from C#).
Will the result always be non-negative (which would make things easier)?
To be honest, regular expressions are the easiest option here, but...
public static string RemoveCruftFromNumber(string text)
{
int end = 0;
// First move past leading spaces
while (end < text.Length && text[end] == ' ')
{
end++;
}
// Now move past digits
while (end < text.Length && char.IsDigit(text[end]))
{
end++;
}
return text.Substring(0, end);
}
Then you just need to call int.TryParse on the result of RemoveCruftFromNumber (don't forget that the integer may be too big to store in an int).
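Putting those two steps together, a sketch of a single helper might look like the following. It parses into a long to leave more headroom for large values and accepts only the ASCII digits 0 to 9; both choices, and the names LeadingInt and Parse, are assumptions made for this example.

```csharp
public static class LeadingInt
{
    // Parses the integer at the start of 'text', ignoring leading spaces
    // and anything after the digits. Returns null when there is no leading
    // integer or it does not fit in a long.
    public static long? Parse(string text)
    {
        int start = 0;
        // Skip leading spaces.
        while (start < text.Length && text[start] == ' ') start++;
        int end = start;
        // Advance over ASCII digits only.
        while (end < text.Length && text[end] >= '0' && text[end] <= '9') end++;
        if (end == start) return null;
        return long.TryParse(text.Substring(start, end - start), out long n)
            ? n : (long?)null;
    }
}
```

For the inputs from the question, "1", " 42 ", " 3 -.X.-" and " 2 3 4 5", this returns 1, 42, 3 and 2 respectively.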
I like @Donut's approach.
I'd like to add though, that char.IsDigit and char.IsNumber also allow for some unicode characters which are digits in other languages and scripts (see here).
If you only want to check for the digits 0 to 9 you could use "0123456789".Contains(c).
Three example implementations:
To remove trailing non-digit characters:
var digits = new string(input.Trim().TakeWhile(c =>
("0123456789").Contains(c)
).ToArray());
To remove leading non-digit characters:
var digits = new string(input.Trim().SkipWhile(c =>
!("0123456789").Contains(c)
).ToArray());
To remove all non-digit characters:
var digits = new string(input.Trim().Where(c =>
("0123456789").Contains(c)
).ToArray());
And of course: int.Parse(digits) or int.TryParse(digits, out output)
This doesn't really answer your question (about a built-in C# method), but you could try chopping off characters at the end of the input string one by one until int.TryParse() accepts it as a valid number:
for (int p = input.Length; p > 0; p--)
{
int num;
if (int.TryParse(input.Substring(0, p), out num))
return num;
}
throw new Exception("Malformed integer: " + input);
Of course, this will be slow if input is very long.
ADDENDUM (March 2016)
This could be made faster by chopping off all non-digit/non-space characters on the right before attempting each parse:
for (int p = input.Length; p > 0; p--)
{
char ch;
do
{
ch = input[--p];
} while ((ch < '0' || ch > '9') && ch != ' ' && p > 0);
p++;
int num;
if (int.TryParse(input.Substring(0, p), out num))
return num;
}
throw new Exception("Malformed integer: " + input);
string s = " 3 -.X.-".Trim();
string collectedNumber = string.Empty;
int i;
for (int x = 0; x < s.Length; x++)
{
if (int.TryParse(s[x].ToString(), out i)) // TryParse needs a string, not a char
collectedNumber += s[x];
else
break; // not a number - that's it - get out.
}
if (int.TryParse(collectedNumber, out i))
Console.WriteLine(i);
else
Console.WriteLine("no number found");
This is how I would have done it in Java:
int parseLeadingInt(String input)
{
NumberFormat fmt = NumberFormat.getIntegerInstance();
fmt.setGroupingUsed(false);
return fmt.parse(input, new ParsePosition(0)).intValue();
}
I was hoping something similar would be possible in .NET.
This is the regex-based solution I am currently using:
int? parseLeadingInt(string input)
{
int result = 0;
Match match = Regex.Match(input, "^[ \t]*\\d+");
if (match.Success && int.TryParse(match.Value, out result))
{
return result;
}
return null;
}
Might as well add mine too.
string temp = " 3 .x£";
string numbersOnly = String.Empty;
int tempInt;
for (int i = 0; i < temp.Length; i++)
{
if (Int32.TryParse(Convert.ToString(temp[i]), out tempInt))
{
numbersOnly += temp[i];
}
}
Int32.TryParse(numbersOnly, out tempInt);
MessageBox.Show(tempInt.ToString());
The message box is just for testing purposes, just delete it once you verify the method is working.
I'm not sure why you would avoid Regex in this situation.
Here's a little hackery that you can adjust to your needs.
" 3 -.X.-".ToCharArray().FindInteger().ToList().ForEach(Console.WriteLine);
public static class CharArrayExtensions
{
public static IEnumerable<char> FindInteger(this IEnumerable<char> array)
{
foreach (var c in array)
{
if(char.IsNumber(c))
yield return c;
}
}
}
EDIT:
That's true about the incorrect result (and the maintenance dev :) ).
Here's a revision:
public static int FindFirstInteger(this IEnumerable<char> array)
{
bool foundInteger = false;
var ints = new List<char>();
foreach (var c in array)
{
if(char.IsNumber(c))
{
foundInteger = true;
ints.Add(c);
}
else
{
if(foundInteger)
{
break;
}
}
}
string s = string.Empty;
ints.ForEach(i => s += i.ToString());
return int.Parse(s);
}
private string GetInt(string s)
{
int i = 0;
s = s.Trim();
while (i<s.Length && char.IsDigit(s[i])) i++;
return s.Substring(0, i);
}
Similar to Donut's above but with a TryParse:
private static bool TryGetLeadingInt(string input, out int output)
{
var trimmedString = new string(input.Trim().TakeWhile(c => char.IsDigit(c) || c == '.').ToArray());
var canParse = int.TryParse( trimmedString, out output);
return canParse;
}
