I'm essentially trying to read an XML file. One of the values has a suffix, e.g. "30d". This is meant to mean '30 days'. So I'm trying to convert this to a DateTime.Now.AddDays(30). To read this field in the XML, I decided to use an enum:
enum DurationType { Min = "m", Hours = "h", Days = "d" }
Now I'm not sure how to approach this efficiently (I'm a little daft when it comes to enums). Should I separate the suffix, in this case "d", out of the string first, then try and match it in the enum using a switch statement?
I guess if you dumb down my question, it'd be: What's the best way to get from 30d, to DateTime.Now.AddDays(30) ?
You could write an extension method that parses the string and returns the DateTime you want.
Something like:
public static DateTime AddDuration(this DateTime datetime, string str)
{
int value = 0;
int multiplier = str.EndsWith("d") ? 1440 : str.EndsWith("h") ? 60 : 1;
if (int.TryParse(str.TrimEnd(new char[]{'m','h','d'}), out value))
{
return datetime.AddMinutes(value * multiplier);
}
return datetime;
}
Usage:
var date = DateTime.Now.AddDuration("2d");
This seems like a good place to use regular expressions; specifically, capture groups.
Below is a working example:
using System;
using System.Text.RegularExpressions;
namespace RegexCaptureGroups
{
class Program
{
// Below is a breakdown of this regular expression:
// First, one or more digits followed by "d" or "D" to represent days.
// Second, one or more digits followed by "h" or "H" to represent hours.
// Third, one or more digits followed by "m" or "M" to represent minutes.
// Each component can be separated by any number of spaces, or none.
private static readonly Regex DurationRegex = new Regex(@"((?<Days>\d+)d)?\s*((?<Hours>\d+)h)?\s*((?<Minutes>\d+)m)?", RegexOptions.IgnoreCase);
public static TimeSpan ParseDuration(string input)
{
var match = DurationRegex.Match(input);
var days = match.Groups["Days"].Value;
var hours = match.Groups["Hours"].Value;
var minutes = match.Groups["Minutes"].Value;
int daysAsInt32, hoursAsInt32, minutesAsInt32;
if (!int.TryParse(days, out daysAsInt32))
daysAsInt32 = 0;
if (!int.TryParse(hours, out hoursAsInt32))
hoursAsInt32 = 0;
if (!int.TryParse(minutes, out minutesAsInt32))
minutesAsInt32 = 0;
return new TimeSpan(daysAsInt32, hoursAsInt32, minutesAsInt32, 0);
}
static void Main(string[] args)
{
Console.WriteLine(ParseDuration("30d"));
Console.WriteLine(ParseDuration("12h"));
Console.WriteLine(ParseDuration("20m"));
Console.WriteLine(ParseDuration("1d 12h"));
Console.WriteLine(ParseDuration("5d 30m"));
Console.WriteLine(ParseDuration("1d 12h 20m"));
Console.WriteLine("Press any key to exit.");
Console.ReadKey();
}
}
}
EDIT: Below is an alternative, slightly more condensed version of the above, though I'm not sure which one I prefer more. I'm usually not a fan of overly dense code.
I adjusted the regular expression to put a limit of 10 digits on each number. This means int.Parse only ever sees between one and ten digits (or an empty string when the group did not capture at all, hence the ParseInt32ZeroIfNullOrEmpty method). Note that a ten-digit value greater than int.MaxValue (2,147,483,647) will still make int.Parse throw an OverflowException; limiting each number to nine digits would rule that out.
// Below is a breakdown of this regular expression:
// First, one to ten digits followed by "d" or "D" to represent days.
// Second, one to ten digits followed by "h" or "H" to represent hours.
// Third, one to ten digits followed by "m" or "M" to represent minutes.
// Each component can be separated by any number of spaces, or none.
private static readonly Regex DurationRegex = new Regex(@"((?<Days>\d{1,10})d)?\s*((?<Hours>\d{1,10})h)?\s*((?<Minutes>\d{1,10})m)?", RegexOptions.IgnoreCase);
private static int ParseInt32ZeroIfNullOrEmpty(string input)
{
return string.IsNullOrEmpty(input) ? 0 : int.Parse(input);
}
public static TimeSpan ParseDuration(string input)
{
var match = DurationRegex.Match(input);
return new TimeSpan(
ParseInt32ZeroIfNullOrEmpty(match.Groups["Days"].Value),
ParseInt32ZeroIfNullOrEmpty(match.Groups["Hours"].Value),
ParseInt32ZeroIfNullOrEmpty(match.Groups["Minutes"].Value),
0);
}
EDIT: Just to take this one more step, I've added another version below, which handles days, hours, minutes, seconds, and milliseconds, with a variety of abbreviations for each. I split the regular expression into multiple lines for readability. Note that I also had to adjust the expression by using (\b|(?=[^a-z])) at the end of each component, because the "ms" unit was being captured as the "m" unit. The "(?=...)" syntax is a lookahead: used with "[^a-z]", it requires the next character to be a non-letter without "consuming" it.
// Below is a breakdown of this regular expression:
// First, one to ten digits followed by "d", "dy", "dys", "day", or "days".
// Second, one to ten digits followed by "h", "hr", "hrs", "hour", or "hours".
// Third, one to ten digits followed by "m", "min", "minute", or "minutes".
// Fourth, one to ten digits followed by "s", "sec", "second", or "seconds".
// Fifth, one to ten digits followed by "ms", "msec", "millisec", "millisecond", or "milliseconds".
// Each component may be separated by any number of spaces, or none.
// The expression is case-insensitive.
private static readonly Regex DurationRegex = new Regex(@"
((?<Days>\d{1,10})(d|dy|dys|day|days)(\b|(?=[^a-z])))?\s*
((?<Hours>\d{1,10})(h|hr|hrs|hour|hours)(\b|(?=[^a-z])))?\s*
((?<Minutes>\d{1,10})(m|min|minute|minutes)(\b|(?=[^a-z])))?\s*
((?<Seconds>\d{1,10})(s|sec|second|seconds)(\b|(?=[^a-z])))?\s*
((?<Milliseconds>\d{1,10})(ms|msec|millisec|millisecond|milliseconds)(\b|(?=[^a-z])))?",
RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace);
private static int ParseInt32ZeroIfNullOrEmpty(string input)
{
return string.IsNullOrEmpty(input) ? 0 : int.Parse(input);
}
public static TimeSpan ParseDuration(string input)
{
var match = DurationRegex.Match(input);
return new TimeSpan(
ParseInt32ZeroIfNullOrEmpty(match.Groups["Days"].Value),
ParseInt32ZeroIfNullOrEmpty(match.Groups["Hours"].Value),
ParseInt32ZeroIfNullOrEmpty(match.Groups["Minutes"].Value),
ParseInt32ZeroIfNullOrEmpty(match.Groups["Seconds"].Value),
ParseInt32ZeroIfNullOrEmpty(match.Groups["Milliseconds"].Value));
}
update:
Don't vote for this. I'm leaving it simply because it's an alternative approach. Instead look at sa_ddam213 and Dr. Wily's Apprentice's answers.
Should I separate the suffix, in this case "d", out of the string
first, then try and match it in the enum using a switch statement?
Yes.
For a fully working example:
private void button1_Click( object sender, EventArgs e ) {
String value = "30d";
Duration d = (Duration)Enum.Parse(typeof(Duration), value.Substring(value.Length - 1, 1).ToUpper());
DateTime result = d.From(DateTime.Now, value); // start from the current time rather than the default 0001-01-01
MessageBox.Show(result.ToString());
}
enum Duration { D, W, M, Y };
static class DurationExtensions {
public static DateTime From( this Duration duration, DateTime dateTime, Int32 period ) {
switch (duration)
{
case Duration.D: return dateTime.AddDays(period);
case Duration.W: return dateTime.AddDays((period*7));
case Duration.M: return dateTime.AddMonths(period);
case Duration.Y: return dateTime.AddYears(period);
default: throw new ArgumentOutOfRangeException("duration");
}
}
public static DateTime From( this Duration duration, DateTime dateTime, String fullValue ) {
Int32 period = Convert.ToInt32(fullValue.ToUpper().Replace(duration.ToString(), String.Empty));
return From(duration, dateTime, period);
}
}
I really don't see how using an enum helps here.
Here's how I might approach it.
string s = "30d";
DateTime result = DateTime.Now; // stays unchanged if no known suffix is found
int typeIndex = s.IndexOfAny(new char[] { 'd', 'w', 'm' });
if (typeIndex > 0)
{
int value = int.Parse(s.Substring(0, typeIndex));
switch (s[typeIndex])
{
case 'd':
result = DateTime.Now.AddDays(value);
break;
case 'w':
result = DateTime.Now.AddDays(value * 7);
break;
case 'm':
result = DateTime.Now.AddMonths(value);
break;
}
}
Depending on the reliability of your input data, you might need to use int.TryParse() instead of int.Parse(). Otherwise, this should be all you need.
Note: I've also written a sscanf() replacement for .NET that would handle this quite easily. You can see the code for that in the article A sscanf() Replacement for .NET.
Try the following code, assuming that values like "30d" are in a string 'val'.
DateTime ConvertValue(string val) {
if (val.Length > 1) {
// Everything except the last character is the numeric part.
int prefix = Convert.ToInt32(val.Remove(val.Length - 1));
switch (val[val.Length - 1]) {
case 'd': return DateTime.Now.AddDays(prefix);
case 'm': return DateTime.Now.AddMonths(prefix);
// etc.
}
}
throw new ArgumentException("string in unexpected format.");
}
Here is an example as a console application:
enum DurationType
{
[DisplayName("m")]
Min = 1,
[DisplayName("h")]
Hours = 1 * 60,
[DisplayName("d")]
Days = 1 * 60 * 24
}
internal class Program
{
private static void Main(string[] args)
{
string input1 = "10h";
string input2 = "1d10h3m";
var x = GetOffsetFromDate(DateTime.Now, input1);
var y = GetOffsetFromDate(DateTime.Now, input2);
}
private static Dictionary<string, DurationType> suffixDictionary
{
get
{
return Enum
.GetValues(typeof (DurationType))
.Cast<DurationType>()
.ToDictionary(duration => duration.GetDisplayName(), duration => duration);
}
}
public static DateTime GetOffsetFromDate(DateTime date, string input)
{
MatchCollection matches = Regex.Matches(input, @"(\d+)([a-zA-Z]+)");
foreach (Match match in matches)
{
int numberPart = Int32.Parse(match.Groups[1].Value);
string suffix = match.Groups[2].Value;
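// The enum value is the number of minutes in one unit, so scale it by the parsed number.
date = date.AddMinutes(numberPart * (int)suffixDictionary[suffix]);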
}
return date;
}
}
[AttributeUsage(AttributeTargets.Field)]
public class DisplayNameAttribute : Attribute
{
public DisplayNameAttribute(String name)
{
this.name = name;
}
protected String name;
public String Name { get { return this.name; } }
}
public static class ExtensionClass
{
public static string GetDisplayName<TValue>(this TValue value) where TValue : struct, IConvertible
{
FieldInfo fi = typeof(TValue).GetField(value.ToString());
DisplayNameAttribute attribute = (DisplayNameAttribute)fi.GetCustomAttributes(typeof(DisplayNameAttribute), false).FirstOrDefault();
if (attribute != null)
return attribute.Name;
return value.ToString();
}
}
This uses an attribute to define the suffix and the enum value to define the offset in minutes.
Requires:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Text.RegularExpressions;
Using the enum's integer value this way may be considered a hack, but with small tweaks this example still lets you parse out the enum members for other uses, such as a switch statement.
Enums can't be backed with non-numeric types, so string-based enums are out. It's possible you may be overthinking it. Without knowing any more about the problem, the most straightforward solution seems to be splitting off the last character, converting the rest to an int, and then handling each final char as a separate case.
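The approach described above can be sketched roughly as follows. This is only an illustration: the class name DurationSuffix and its TryParse method are hypothetical, and it assumes the only suffixes are m, h and d, as in the question.

```csharp
using System;
using System.Globalization;

public static class DurationSuffix
{
    // Hypothetical helper: parses "30d", "12h" or "20m" into a TimeSpan.
    // Returns false unless the text is a whole number followed by m, h or d.
    public static bool TryParse(string text, out TimeSpan duration)
    {
        duration = TimeSpan.Zero;
        if (string.IsNullOrEmpty(text) || text.Length < 2)
            return false;

        // Split off the last character and treat the rest as the number.
        char suffix = char.ToLowerInvariant(text[text.Length - 1]);
        string numberPart = text.Substring(0, text.Length - 1);

        // NumberStyles.None accepts digits only (no sign, no whitespace).
        int amount;
        if (!int.TryParse(numberPart, NumberStyles.None, CultureInfo.InvariantCulture, out amount))
            return false;

        long ticksPerUnit;
        switch (suffix)
        {
            case 'm': ticksPerUnit = TimeSpan.TicksPerMinute; break;
            case 'h': ticksPerUnit = TimeSpan.TicksPerHour; break;
            case 'd': ticksPerUnit = TimeSpan.TicksPerDay; break;
            default: return false;
        }

        // Reject amounts that would not fit in a TimeSpan.
        if (amount > TimeSpan.MaxValue.Ticks / ticksPerUnit)
            return false;

        duration = new TimeSpan(amount * ticksPerUnit);
        return true;
    }
}
```

With this in place, something like DateTime.Now.Add(duration) gives the target date once TryParse has returned true. Returning a TimeSpan rather than a DateTime is a design choice made here so the caller decides which start time to use.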
I'd suggest using a regular expression to strip off the number first and then calling Enum.Parse to get the enum value. Then you can use a switch (see Corylulu's answer) to get the right offset, based on the parsed number and enum value.
I would like to have a flexible template that can translate cases similar to:
WHnnn => WH001, WH002, WH003... (nnn stands for a three-digit number)
INVyyyyMMdd => INV20220228
ORDERyyyyMMdd-nnn => ORDER20220228-007
I know that I can use the following code to achieve a specific template:
string.Format("INV{0:yyyyMMdd}", DateTime.Now)
That should give the same result as case 2 above, but it's not flexible: customers may define their own templates, like the third case above, as long as I can understand and support them.
I know even for the third case, I can do something like this:
string.Format("ORDER{0:yyyyMMdd}-{1:d3}", DateTime.Now, 124)
But that's clumsy, as I would like the template (input) to be just like this:
ORDERyyyyMMdd-nnn
The requirement is to support all the supported patterns by string.Format in C#, but the template can be any combination of those patterns.
I would probably use a custom formatter for this case.
Create a new class that holds the date/time and the number and implements the IFormattable interface.
One tip: use an internal format such as INV{nnn} or INV[nnn], where only the part inside {} or [] is replaced with the value.
Otherwise unrelated characters could be replaced by accident. For example, the 'n' in Inv could be treated as a digit placeholder, giving output such as I7v.
In your examples the N in INV is upper case, but will that still hold after every customisation?
Code (simplified version):
internal sealed class InvoiceNumberInfo : IFormattable
{
private static readonly Regex formatMatcher = new Regex(@"^(?<before>.*?)\[(?<code>\w+?)\](?<after>.*)$");
private readonly DateTime date;
private readonly int number;
public InvoiceNumberInfo(DateTime date, int number)
{
this.date = date;
this.number = number;
}
public string ToString(string format, IFormatProvider formatProvider)
{
var output = format;
while (true)
{
var match = formatMatcher.Match(output);
if (!match.Success)
{
return output;
}
output = match.Groups["before"].Value + FormatValue(match.Groups["code"].Value) + match.Groups["after"].Value;
}
}
private string FormatValue(string code)
{
if (code[0] == 'n')
{
var numberFormat = "D" + code.Length.ToString(CultureInfo.InvariantCulture);
return this.number.ToString(numberFormat);
}
else
{
return this.date.ToString(code);
}
}
}
Use:
internal static class Program
{
public static void Main(string[] args)
{
if (args.Length == 0)
{
Console.WriteLine("No format to display");
return;
}
var inv = new InvoiceNumberInfo(DateTime.Now, number: 7);
foreach (string format in args)
{
Console.WriteLine("Format: '{0}' is formatted as {1:" + format + "}", format, inv);
}
}
}
And output:
Format: 'WH[nnn]' is formatted as WH007
Format: 'INV[yyyyMMdd]' is formatted as INV20220227
Format: 'ORDER[yyyyMMdd]-[nnn]' is formatted as ORDER20220227-007
NOTE: This is only a simplified proof of concept. Add proper error checking in your own code.
I want to know how to split a string value into two parts. In my ASP application I'm passing a string value from the view to the controller.
I then want to split the whole value into two parts.
For example, most of the time the first two characters are letters (like "PO", "SS", "GS") and the rest are digits (SS235452).
The length of the numeric part is not fixed, since it is generated randomly, so I want to split from the beginning of the string. I need some help with that.
My current code is
string approvalnumber = approvalCheck.ApprovalNumber.ToUpper();
Thanks.
Since you mentioned that the first part always has 2 letters and only the second part varies, you can use the Substring method of String as shown below.
var textPart = input.Substring(0,2);
var numPart = input.Substring(2);
The first line fetches 2 characters starting at index zero, and the second line fetches all characters from index 2 onwards. You can convert the second part to a number if required.
Note that the second line omits Substring's second parameter. That parameter is the length; when it is omitted, Substring returns everything up to the end of the string.
You could try using a regex to extract the letters and digits from the string.
This JavaScript function returns only the runs of digits from the input string.
function getNumbers(input) {
return input.match(/[0-9]+/g);
}
I'd use a RegExp. Considering the fact that you indicate ASP-NET-4 I assume you can't use tuples, out var etc. so it'd go as follows:
using System.Text.RegularExpressions;
using FluentAssertions;
using Xunit;
namespace Playground
{
public class Playground
{
public struct ProjectCodeMatch
{
public string Code { get; set; }
public int? Number { get; set; }
}
[Theory]
[InlineData("ABCDEFG123", "ABCDEFG", 123)]
[InlineData("123456", "", 123456)]
[InlineData("ABCDEFG", "ABCDEFG", null)]
[InlineData("ab123", "AB", 123)]
public void Split_Works(string input, string expectedCode, int? expectedNumber)
{
ProjectCodeMatch result;
var didParse = TryParse(input, out result);
didParse.Should().BeTrue();
result.Code.Should().Be(expectedCode);
result.Number.Should().Be(expectedNumber);
}
private static bool TryParse(string input, out ProjectCodeMatch result)
{
/*
* A word on this RegExp:
* ^ - the match must happen at the beginning of the string (nothing before that)
* (?<Code>[a-zA-Z]+) - grab any number of letters and name this part the "Code" group
* (?<Number>\d+) - grab any number of numbers and name this part the Number group
* {0,1} this group must occur at most 1 time
* $ - the match must end at the end of the string (nothing after that)
*/
var regex = new Regex(@"^(?<Code>[a-zA-Z]+){0,1}(?<Number>\d+){0,1}$");
var match = regex.Match(input);
if (!match.Success)
{
result = default(ProjectCodeMatch);
return false;
}
int number;
var isNumber = int.TryParse(match.Groups["Number"].Value, out number);
result = new ProjectCodeMatch
{
Code = match.Groups["Code"].Value.ToUpper(),
Number = isNumber ? number : (int?)null
};
return true;
}
}
}
A linq answer:
string d = "PO1232131";
string prefix = string.Join("", d.TakeWhile(a => Char.IsLetter(a)));
I have a long string of double values separated by #, in the form value1#value2#value3# etc.
I split it into a string array. Then I try to convert every element of this array to a double and I get an error. What is wrong with the type conversion here?
string a = "52.8725945#18.69872650000002#50.9028073#14.971600200000012#51.260062#15.5859949000000662452.23862099999999#19.372202799999250800000045#51.7808372#19.474096499999973#";
string[] someArray = a.Split(new char[] { '#' });
for (int i = 0; i < someArray.Length; i++)
{
Console.WriteLine(someArray[i]); // correct value
Convert.ToDouble(someArray[i]); // error
}
There are 3 problems.
1) Incorrect decimal separator
Different cultures use different decimal separators (namely , and .).
If you replace . with , it should work as expected:
Console.WriteLine(Convert.ToDouble("52,8725945"));
You can parse your doubles using overloaded method which takes culture as a second parameter. In this case you can use InvariantCulture (What is the invariant culture) e.g. using double.Parse:
double.Parse("52.8725945", System.Globalization.CultureInfo.InvariantCulture);
You should also take a look at double.TryParse. It can be used with many options, and it is especially useful for checking whether or not your string is a valid double.
2) You have an incorrect double
One of your values is incorrect, because it contains two dots:
15.5859949000000662452.23862099999999
3) Your array has an empty value at the end, which is an incorrect double
You can use overloaded Split which removes empty values:
string[] someArray = a.Split(new char[] { '#' }, StringSplitOptions.RemoveEmptyEntries);
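Putting the three points together, a minimal sketch might look like the following. The class name DoubleListParser and the rejected list are assumptions made for illustration; the calls themselves (Split with RemoveEmptyEntries, double.TryParse with InvariantCulture) are the ones discussed above.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class DoubleListParser
{
    // Hypothetical helper: splits on '#', skips empty entries, and parses each
    // piece with the invariant culture so '.' is always the decimal separator.
    // Pieces that are not valid doubles are collected in 'rejected'.
    public static List<double> Parse(string input, List<string> rejected)
    {
        var values = new List<double>();
        string[] parts = input.Split(new[] { '#' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string part in parts)
        {
            double value;
            if (double.TryParse(part, NumberStyles.Float, CultureInfo.InvariantCulture, out value))
                values.Add(value);
            else
                rejected.Add(part);
        }
        return values;
    }
}
```

Run against the string from the question, the malformed entry with two dots should end up in the rejected list instead of throwing, which makes it easier to see which input is at fault.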
Add a public static class with a helper method, which you can then call as easily as Convert.ToInt32():
using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
/// <summary>
/// Summary description for Common
/// </summary>
public static class Common
{
public static double ConvertToDouble(string Value) {
if (Value == null) {
return 0;
}
else {
double OutVal;
double.TryParse(Value, out OutVal);
if (double.IsNaN(OutVal) || double.IsInfinity(OutVal)) {
return 0;
}
return OutVal;
}
}
}
Then call the function:
double DirectExpense = Common.ConvertToDouble(dr["DrAmount"].ToString());
Most people already tried to answer your questions.
If you are still debugging, have you thought about using:
Double.TryParse(String, out Double);
This helps you determine which of the strings is wrong before you do the actual parsing.
If you have a culture-related problem, you might consider using:
Double.TryParse(String, NumberStyles, IFormatProvider, out Double);
This http://msdn.microsoft.com/en-us/library/system.double.tryparse.aspx has a really good example on how to use them.
If you need a long, Int64.TryParse is also available: http://msdn.microsoft.com/en-us/library/system.int64.tryparse.aspx
Hope that helps.
private double ConvertToDouble(string s)
{
char systemSeparator = Thread.CurrentThread.CurrentCulture.NumberFormat.CurrencyDecimalSeparator[0];
double result = 0;
try
{
if (s != null)
if (!s.Contains(","))
result = double.Parse(s, CultureInfo.InvariantCulture);
else
result = Convert.ToDouble(s.Replace(".", systemSeparator.ToString()).Replace(",", systemSeparator.ToString()));
}
catch (Exception e)
{
try
{
result = Convert.ToDouble(s);
}
catch
{
try
{
result = Convert.ToDouble(s.Replace(",", ";").Replace(".", ",").Replace(";", "."));
}
catch {
throw new Exception("Wrong string-to-double format");
}
}
}
return result;
}
The tests it passes are:
Debug.Assert(ConvertToDouble("1.000.007") == 1000007.00);
Debug.Assert(ConvertToDouble("1.000.007,00") == 1000007.00);
Debug.Assert(ConvertToDouble("1.000,07") == 1000.07);
Debug.Assert(ConvertToDouble("1,000,007") == 1000007.00);
Debug.Assert(ConvertToDouble("1,000,000.07") == 1000000.07);
Debug.Assert(ConvertToDouble("1,007") == 1.007);
Debug.Assert(ConvertToDouble("1.07") == 1.07);
Debug.Assert(ConvertToDouble("1.007") == 1007.00);
Debug.Assert(ConvertToDouble("1.000.007E-08") == 0.07);
Debug.Assert(ConvertToDouble("1,000,007E-08") == 0.07);
In your string I see: 15.5859949000000662452.23862099999999 which is not a double (it has two decimal points). Perhaps it's just a legitimate input error?
You may also want to figure out if your last String will be empty, and account for that situation.
Usually when I need to convert a currency string (like 1200,55 zł or $1,249) to a decimal value, I do it like this:
if (dataToCheck.Contains("zł")) {
decimal value = Decimal.Parse(dataToCheck.Trim(), NumberStyles.Number | NumberStyles.AllowCurrencySymbol);
}
Is there a way to check whether a string is a currency amount without checking for a specific currency?
If you just do the conversion (you should add | NumberStyles.AllowThousands | NumberStyles.AllowDecimalPoint as well), then the parse fails if the string contains the wrong currency symbol for the current UI culture, in this case by raising an exception. If it contains no currency symbol, the parse still works.
You can therefore use TryParse to allow for this and test for failure.
If your input can be in any currency, you can use the version of TryParse that takes an IFormatProvider argument, which lets you specify the culture-specific parsing information for the string. So if the parse fails for the default UI culture, you can loop round each of your supported cultures and try again. When you find the one that works, you've got both your number and the type of currency it is (Zloty, US Dollar, Euro, Rouble etc.)
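The loop over supported cultures could be sketched like this. The CurrencyParser class and its parameter names are hypothetical, and which cultures to pass in is left to the caller; whether a given string parses also depends on the culture data installed on the machine.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class CurrencyParser
{
    // Hypothetical helper: tries each culture in turn and reports the value
    // together with the first culture whose currency format accepted the text.
    public static bool TryParse(string text, IEnumerable<CultureInfo> cultures,
                                out decimal value, out CultureInfo matched)
    {
        foreach (CultureInfo culture in cultures)
        {
            // NumberStyles.Currency allows the currency symbol, thousands
            // separators and a decimal point for the given culture.
            if (decimal.TryParse(text.Trim(), NumberStyles.Currency, culture, out value))
            {
                matched = culture;
                return true;
            }
        }
        value = 0m;
        matched = null;
        return false;
    }
}
```

A call might pass something like new[] { new CultureInfo("pl-PL"), new CultureInfo("en-US") }. Keep in mind that a string with no currency symbol at all is accepted by the first culture whose number format fits, so a match on its own does not prove the text contained a currency symbol.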
As I understand it, it's better to do:
decimal value = -1;
if (Decimal.TryParse(dataToCheck.Trim(), NumberStyles.Number |
NumberStyles.AllowCurrencySymbol, currentCulture, out value))
{ /* do something with value */ }
See Jeff Atwood's description of TryParse. It doesn't throw an exception and is much faster than Parse when the input is invalid.
To check if a string is a currency amount that would be used for entering wages - I used this:
public bool TestIfWages(string wages)
{
Regex regex = new Regex(@"^\d*\.?\d?\d?$");
bool y = regex.IsMatch(wages);
return y;
}
You might try searching the string for what you think is a currency symbol, then looking it up in a dictionary to see if it really is a currency symbol. I would just look at the beginning of the string and the end of the string and pick out anything that's not a digit, then that's what you look up. (If there's stuff at both ends then I think you can assume it's not a currency.)
The advantage to this approach is that you only have to scan the string once, and you don't have to test separately for each currency.
Here's an example of what I had in mind, although it could probably use some refinement:
class Program
{
private static ISet<string> _currencySymbols = new HashSet<string>() { "$", "zł", "€", "£" };
private static bool StringIsCurrency(string str)
{
// Scan the beginning of the string until you get to the first digit
for (int i = 0; i < str.Length; i++)
{
if (char.IsDigit(str[i]))
{
if (i == 0)
{
break;
}
else
{
return StringIsCurrencySymbol(str.Substring(0, i).TrimEnd());
}
}
}
// Scan the end of the string until you get to the last digit
for (int i = 0, pos = str.Length - 1; i < str.Length; i++, pos--)
{
if (char.IsDigit(str[pos]))
{
if (i == 0)
{
break;
}
else
{
return StringIsCurrencySymbol(str.Substring(pos + 1, str.Length - pos - 1).TrimStart());
}
}
}
// No currency symbol found
return false;
}
private static bool StringIsCurrencySymbol(string symbol)
{
return _currencySymbols.Contains(symbol);
}
static void Main(string[] args)
{
Test("$1000.00");
Test("500 zł");
Test("987");
Test("book");
Test("20 €");
Test("99£");
}
private static void Test(string testString)
{
Console.WriteLine(testString + ": " + StringIsCurrency(testString));
}
}
I have a string that contains an int. How can I parse the int in C#?
Suppose I have the following strings, each of which contains an integer:
15 person
person 15
person15
15person
How can I extract the integer, or return null if no integer is found in the string?
You can remove all non-digits, and parse the string if there is anything left:
str = Regex.Replace(str, @"\D+", String.Empty);
if (str.Length > 0) {
int value = Int32.Parse(str);
// here you can use the value
}
Paste this code into a test:
public int? ParseAnInt(string s)
{
var match = System.Text.RegularExpressions.Regex.Match(s, @"\d+");
if (match.Success)
{
int result;
//still use TryParse to handle integer overflow
if (int.TryParse(match.Value, out result))
return result;
}
return null;
}
[TestMethod]
public void TestThis()
{
Assert.AreEqual(15, ParseAnInt("15 person"));
Assert.AreEqual(15, ParseAnInt("person 15"));
Assert.AreEqual(15, ParseAnInt("person15"));
Assert.AreEqual(15, ParseAnInt("15person"));
Assert.IsNull(ParseAnInt("nonumber"));
}
The method returns null if no number is found. It also handles the case where the number causes an integer overflow.
To reduce the chance of an overflow you could use long.TryParse instead.
Equally, if you anticipate multiple groups of digits and want to parse each group as a discrete number, you could use Regex.Matches, which returns an enumerable of all the matches in the input string.
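As a rough sketch of the Regex.Matches variant combined with long.TryParse, something like the following could work. The NumberExtractor class name is made up for this example, and it assumes only ASCII digits matter.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class NumberExtractor
{
    // Hypothetical helper: returns every run of ASCII digits in the input as
    // a separate long. Runs too large to fit in a long are skipped.
    public static List<long> ParseAll(string input)
    {
        var numbers = new List<long>();
        foreach (Match match in Regex.Matches(input, @"[0-9]+"))
        {
            long value;
            if (long.TryParse(match.Value, NumberStyles.None, CultureInfo.InvariantCulture, out value))
                numbers.Add(value);
        }
        return numbers;
    }
}
```

For an input such as "15per13so14n" this yields three separate numbers rather than one concatenated value, and an input without digits yields an empty list.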
Use something like this:
Regex r = new Regex(@"\d+");
Match m = r.Match(yourinputstring);
if(m.Success)
{
Dosomethingwiththevalue(m.Value);
}
Since everyone uses Regex to extract the numbers, here's a Linq way to do it:
string input = "15person";
string numerics = new string(input.Where(Char.IsDigit).ToArray());
int result = int.Parse(numerics);
This is just for the sake of completeness; it's probably not overly elegant. Regarding Jaymz' comment, this would return 151314 when 15per13so14n is passed.