Suppose I have a string that looks like this "dentist: 800-483-9767" or this "john (232)233-2323" or some other combination of numbers, letters and other types of characters, at most 25 characters long. I want to extract the digits and the letters into 2 strings so that I get this:
string digits = "8004839767";
string letters = "dentist";
What's the best way to do this?
Thanks
You can use Linq and char.IsDigit, char.IsLetter
string input = "dentist: 800-483-9767";
string digits = new string(input.Where(char.IsDigit).ToArray());
string letters = new string(input.Where(char.IsLetter).ToArray());
Result:
input = "dentist: 800-483-9767";
digits = "8004839767"
letters = "dentist"
input = "john (232)233-2323";
digits = "2322332323"
letters = "john"
If it really is about getting digits and letters (and not splitting somewhere, matching a phone number or something similar), this would be my attempt:
var input = "dentist: 0800-483-9767";
var digits = string.Join(string.Empty, input.Where(char.IsDigit));
var letters = string.Join(string.Empty, input.Where(char.IsLetter));
string input = "dentist: 800-483-9767";
string[] split = input.Split(':');
string letters = split[0];
string digits = split[1].Replace("-","").Trim();
This is similar to what juergen d posted, but it also removes the dashes from the number.
When saying "looks like", you should be more specific about what can change and what cannot.
If the format is always "letters: digits", you can do:
// Data is assumed to be a simple type with string members "letters" and "digits".
Data Process(string input)
{
    var elements = input.Split(':');
    var result = new Data();
    result.letters = elements[0].Trim();
    result.digits = elements[1].Trim().Replace("-", "");
    return result;
}
That's exactly the kind of imprecision I was talking about.
If you are sure that the letters won't contain numbers, you can use a regular expression to separate the letters from the rest. If you are sure that the letters won't contain spaces, you can Split(' '). I mean, could it be john 2 232:233-2323, where "john 2" is the name and the rest is the number?
If you want to parse it, the first thing you have to do is to identify some kind of format.
If the name won't contain spaces, then just call Split(' ', 2), take the first thing as the name, and remove everything that is not a number from the second one with a regular expression. I've never used regexp in C# before, but I think it should be Regex.Replace(input, "[^\\d]+", "", RegexOptions.None).
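A minimal sketch of that approach, assuming the name contains no spaces. The class and method names are made up for illustration, and trimming a trailing colon from the name is an extra assumption that is not part of the answer above:

```csharp
using System;
using System.Text.RegularExpressions;

public static class NameNumberParser
{
    // Hypothetical helper: splits at the first space, then strips non-digits from the remainder.
    public static (string Name, string Digits) Parse(string input)
    {
        // Split into at most two parts, at the first space only.
        string[] parts = input.Split(new[] { ' ' }, 2);

        // Assumption: a trailing colon ("dentist:") is not part of the name.
        string name = parts[0].TrimEnd(':');
        string rest = parts.Length > 1 ? parts[1] : string.Empty;

        // Remove every run of non-digit characters from the remainder.
        string digits = Regex.Replace(rest, @"[^\d]+", "");
        return (name, digits);
    }
}
```

For example, NameNumberParser.Parse("john (232)233-2323") gives the name "john" and the digits "2322332323". If the name can contain spaces (the "john 2" case), this sketch splits in the wrong place, which is exactly why the format has to be pinned down first.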
What is the fastest way to trim all letters from a string that has an alphabetic prefix?
For example, given the input string "ABC12345", I want only 12345 as output.
Thanks.
Please use "char.IsDigit", try this:
static void Main(string[] args)
{
var input = "ABC12345";
var numeric = new String(input.Where(char.IsDigit).ToArray());
Console.Read();
}
You can use Regular Expressions to trim an alphabetic prefix
var input = "ABC123";
var trimmed = Regex.Replace(input, @"^[A-Za-z]+", "");
// trimmed = "123"
The regular expression (second parameter) ^[A-Za-z]+ of the replace method does most of the work, it defines what you want to be replaced using the following rules:
The ^ character ensures a match only exists at the start of a string
The [A-Za-z] will match any uppercase or lowercase letters
The + means the upper or lowercase letters will be matched as many times in a row as possible
As this is the Replace method, the third parameter then replaces any matches with an empty string.
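To see the effect of the ^ anchor, here is a small sketch that wraps the same pattern in a helper (the class and method names are hypothetical). Only a leading run of letters is removed; letters later in the string are left alone:

```csharp
using System;
using System.Text.RegularExpressions;

public static class PrefixTrimmer
{
    // Hypothetical helper: removes a leading run of ASCII letters, if there is one.
    public static string TrimAlphaPrefix(string input)
    {
        // ^ anchors the match at the start; [A-Za-z]+ consumes one or more letters.
        return Regex.Replace(input, @"^[A-Za-z]+", "");
    }
}
```

With this, "ABC12345" becomes "12345", "ABC123DEF" becomes "123DEF", and "123ABC" is returned unchanged because the match has to begin at the first character.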
The other answers seem to show the slowest way, so if you really need the fastest way, you can find the index of the first digit and take the substring from there:
string input = "ABC12345";
int i = 0;
while (i < input.Length && (input[i] < '0' || input[i] > '9')) i++; // stop at the first digit, or at the end if there is none
string output = input.Substring(i);
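The shortest way to get a numeric value is probably the VB Val method, but note that Val stops reading at the first character it cannot recognize as part of a number, so it only helps when the digits come first:
double value = Microsoft.VisualBasic.Conversion.Val("12345ABC"); // 12345; Val("ABC12345") returns 0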
You would have to use a regular expression. It seems you are looking for only the digits and not the letters.
Sample:
string result =
System.Text.RegularExpressions.Regex.Replace("Your input string", @"\D+", string.Empty);
I'm trying to match the following cases and pull the number value:
"b 30.00"
"bill 30.00"
"bill 30"
"b 30"
I've tried:
var regex = new Regex("^b(?-i:ill)?$ ^$?d+(.d{2})?$", RegexOptions.IgnoreCase);
However, this doesn't seem to return a match, and I'm not sure how to pull the digit.
You have misunderstood how the anchors ^ and $ work; it is worth reading up on them.
var regex = new Regex(@"^[Bb](?:ill)? \d+(?:\.\d{2})?$");
or better, since you only need ASCII digits (and not all possible digits of the world):
var regex = new Regex(@"^[Bb](?:ill)? [0-9]+(?:\.[0-9]{2})?$");
If you want to match a literal . you must escape it (same thing for a literal $). Note the use of a verbatim string to avoid double backslashes.
Feel free to add capture groups around what you want to capture.
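As a sketch of what adding a capture group could look like, the ASCII-digit pattern can be wrapped so that the number lands in group 1. The class and method names are hypothetical, and parsing with the invariant culture is an assumption about the input (a dot as the decimal separator):

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class BillParser
{
    // Same pattern as above, with parentheses added around the numeric part (group 1).
    private static readonly Regex Pattern =
        new Regex(@"^[Bb](?:ill)? ([0-9]+(?:\.[0-9]{2})?)$");

    // Hypothetical helper: returns true and sets amount when the input matches.
    public static bool TryGetAmount(string input, out decimal amount)
    {
        amount = 0m;
        Match match = Pattern.Match(input);
        if (!match.Success)
        {
            return false;
        }

        // Assumption: "." is the decimal separator regardless of the current culture.
        return decimal.TryParse(
            match.Groups[1].Value,
            NumberStyles.AllowDecimalPoint,
            CultureInfo.InvariantCulture,
            out amount);
    }
}
```

All four inputs from the question ("b 30.00", "bill 30.00", "bill 30", "b 30") match and yield 30, while something like "bob 30" or "bill 30.0" is rejected.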
You didn't mention if RegEx is actually required to accomplish your goal. If RegEx is not required, and you know that your string is in a specific format, you could just split the string:
string val = "bill 30.00";
string[] split = val.Split(' ');
string name = string.Empty;
decimal currency = 0m;
if (split.Length > 1)
{
name = split[0];
decimal.TryParse(split[1], out currency);
}
new Regex(@"\b\d+(\.\d{2})?") should give you what you want
Just try the code
string Value = "bill 30.00";
string resultString = Regex.Match(Value, @"\d+").Value;
string temp = "12345&refere?X=Assess9677125?Folder_id=35478";
I need to extract the number 12345 alone and I don't need the numbers 9677125 and 35478.
What regex can I use?
Here is a regex for extracting a 5-digit number at the beginning of the string:
^(\d{5})&
If length is arbitrary:
^(\d+)&
If termination pattern is not always &:
^(\d+)[^\d]
Based on Sayse's comment, you can simply rewrite it as:
^(\d+)
and if the terminator is itself a number (for instance 999), then:
^(\d+)999
You don't need regex if you only want to extract the first number:
string temp = "12345&refere?X=Assess9677125?Folder_id=35478";
int first = Int32.Parse(String.Join("", temp.TakeWhile(c => Char.IsDigit(c))));
Console.WriteLine(first); // 12345
If the number you want is always at the beginning of the string and terminated by an ampersand (&) you don't need a regex at all. Just split the string on the ampersand and get the first element of the resulting array:
String temp = "12345&refere?X=Assess9677125?Folder_id=35478";
var splitArray = temp.Split('&');
var number = splitArray[0]; // returns 12345
Alternatively, you can get the index of the ampersand and substring up to that point:
String temp = "12345&refere?X=Assess9677125?Folder_id=35478";
var ampersandIndex = temp.IndexOf("&");
var number = temp.Substring(0, ampersandIndex); // returns 12345
From what you have given us this is fairly simple:
var regex = new Regex(@"^(?<number>\d+)&");
var match = regex.Match("12345&refere?X=Assess9677125?Folder_id=35478");
if (match.Success)
{
var number = int.Parse(match.Groups["number"].Value);
}
Edit: Of course you can replace the argument of new Regex with any of the combinations Giorgi has given.
I have the following string:
01-21-27-0000-00-048 and it is easy to split it apart because each section is separated by a -, but sometimes this string is represented as 01-21-27-0000-00048, so splitting it is not as easy because the last 2 parts are combined. How can I handle this? Also, what about the case where it might be something like 01-21-27-0000-00.048
In case anyone is curious, this is a parcel number and it varies from county to county and a county can have 1 format or they can have 100 formats.
This is a very good case for using regular expressions. Your string matches the following regexp:
(\d{2})-(\d{2})-(\d{2})-(\d{4})-(\d{2})[.-]?(\d{3})
Match the input against this expression, and harvest the six groups of digits from the match:
var str = new[] {
"01-21-27-0000-00048", "01-21-27-0000-00.048", "01-21-27-0000-00-048"
};
foreach (var s in str) {
var m = Regex.Match(s, @"(\d{2})-(\d{2})-(\d{2})-(\d{4})-(\d{2})[.-]?(\d{3})");
for (var i = 1 /* one, not zero */ ; i != m.Groups.Count ; i++) {
Console.Write("{0} ", m.Groups[i]);
}
Console.WriteLine();
}
If you would like to allow for other characters, say, letters in the segments that are separated by dashes, you could use \w instead of \d to denote a letter, a digit, or an underscore. If you would like to allow an unspecified number of such characters within a known range, say, two to four, you can use {2,4} in the regexp instead of the more specific {2}, which means "exactly two". For example,
(\w{2,3})-(\w{2})-(\w{2})-(\d{4})-(\d{2})[.-]?(\d{3})
lets the first segment contain two to three digits or letters, and also allows for letters in segments two and three.
Normalize the string first.
I.e. if you know that the last part is always three characters, then insert a - as the fourth-to-last character, then split the resultant string. Along the same line, convert the dot '.' to a dash '-' and split that string.
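A rough sketch of that normalization, assuming (as stated) that the last part is always exactly three characters. The class and method names are hypothetical:

```csharp
using System;

public static class ParcelNormalizer
{
    // Hypothetical helper: normalizes the separators, then splits on '-'.
    // Assumption: the final segment is always exactly three characters long.
    public static string[] SplitParcel(string parcel)
    {
        // Treat a dot as just another dash.
        string normalized = parcel.Replace('.', '-');

        // If the last three characters are not already set off by a dash, insert one
        // so that it ends up as the fourth-to-last character.
        int lastSegmentStart = normalized.Length - 3;
        if (lastSegmentStart > 0 && normalized[lastSegmentStart - 1] != '-')
        {
            normalized = normalized.Insert(lastSegmentStart, "-");
        }

        return normalized.Split('-');
    }
}
```

All three variants from the question ("01-21-27-0000-00-048", "01-21-27-0000-00048" and "01-21-27-0000-00.048") come out as the same six segments. Counties that use a different layout would need their own rule.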
Replace every character that is not a digit with an empty string ("").
Then any of your strings ends up in a format like
012127000000048
Now you can divide it into parts of (2, 2, 2, 4, 2, 3) digits.
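One way this could look in code, as a sketch only. The class name, the method name and the choice to throw when the digit count is not 15 are my own assumptions:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ParcelSlicer
{
    // Segment widths taken from the (2, 2, 2, 4, 2, 3) layout described above.
    private static readonly int[] Widths = { 2, 2, 2, 4, 2, 3 };

    // Hypothetical helper: keeps only the digits, then cuts them into fixed-width parts.
    public static string[] Slice(string parcel)
    {
        string digits = new string(parcel.Where(char.IsDigit).ToArray());

        // Assumption: anything other than 15 digits is treated as a format error.
        if (digits.Length != Widths.Sum())
        {
            throw new FormatException("Expected 15 digits but found " + digits.Length + ".");
        }

        var parts = new List<string>();
        int offset = 0;
        foreach (int width in Widths)
        {
            parts.Add(digits.Substring(offset, width));
            offset += width;
        }
        return parts.ToArray();
    }
}
```

Because the separators are discarded up front, the dash, dot and combined variants all produce the same six parts.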
I have the following strings:
string a = "1. testdata";
string b = "12. testdata xxx";
What I would like is to be able to extract the number into one string and the characters following the number into another. I tried using .IndexOf(".") and then remove, trim and
substrings. If possible I would like to find something simpler as I have this to do in a
lot of parts of my code.
If the format is always the same, you could do:
a.Split('.');
The solutions proposed so far are not correct.
First, after Split('.') or Split(".") you will have a space at the beginning of the second substring.
Second, if there is more than one dot, you will still have to do more work after the split.
A more robust solution is below:
string a = "11. Test string. With dots.";
var res = a.Split(new[] {". "}, 2, StringSplitOptions.None);
string number = res[0];
string val = res[1];
The argument 2 specifies the maximum number of strings to return. Thus, when there are several dots, the split happens only at the first one.
string[] list = a.Split('.');
string numbers = list[0];
string chars = list[1].Trim(); // Trim removes the space left after the dot