I have a string in this format:
ABCD_EFDG20120700.0.xml
This has a pattern which has three parts to it:
First is the set of chars before the '_', the 'ABCD'
Second are the set of chars 'EFDG' after the '_'
Third are the remaining 20120700.0.xml
I can split the original string and get the number(s) from the second element in the split result using this switch:
\d+
Match m = Regex.Match(splitname[1], "\\d+");
That returns only '20120700'. But I need '20120700.0'.
How do I get the required string?
You can extend your regex to look for any number of digits, then period and then any number of digits once again:
Match m = Regex.Match(splitname[1], "\\d+\\.\\d+");
Although with such regular expression you don't even need to split the string:
string s = "ABCD_EFDG20120700.0.xml";
Match m = Regex.Match(s, "\\d+\\.\\d+");
string result = m.Value; // result is 20120700.0
I can suggest you to use one regex operation for all you want like this:
var rgx = new Regex(#"^([^_]+)_([^\d.]+)([\d.]+\d+)\.(.*)$");
var matches = rgx.Matches(input);
if (matches.Count > 0)
{
Console.WriteLine("{0}", matches[0].Groups[0]); // All input string
Console.WriteLine("{0}", matches[0].Groups[1]); // ABCD
Console.WriteLine("{0}", matches[0].Groups[2]); // EFGH
Console.WriteLine("{0}", matches[0].Groups[3]); // 20120700.0
Console.WriteLine("{0}", matches[0].Groups[4]); // xml
}
Related
i have a string with this format :
111111#1
the number of digit character is 5 or 6 and after that i set a '#' and also set a digit charcter.
i use Regex.IsMatch like this :
if (Regex.IsMatch(string, #"^d{6}#\d{1}"))
{...}
but it cant handle my string
what is my mistake?
You're missing the backslash on the first d so it's not matching against digits:
Regex.IsMatch("111111#1", #"^\d{6}#\d{1}")
This single line Regex will capture two groups: the leading five to six digits and the '#' followed by a single digit:
(\d{5,6})(#\d{1})
Example:
string pattern = #"(\d{5,6})(#\d{1})";
string input = "111111#1";
MatchCollection matches = Regex.Matches(input, pattern);
foreach (Match match in matches)
{
var firstGroupValue = match.Groups[1]; // "111111"
var secondGroupValue = match.Groups[2]; // "#1"
}
I'm trying to separate numbers from words or characters and any other punctuation with whitespace in string wrote them together e.g. string is:
string input = "ok, here is369 and777, and 20k0 10+1.any word.";
and desired output should be:
ok, here is 369 and 777 , and 20 k 0 10 + 1 .any word.
I'm not sure if I'm on right way, but now what I'm trying to do, is to find if string contains numbers and then somehow replace it all with same values but with whitespace between. If it is possible, how can I find all individual numbers (not each digit in number to be clearer), separated or not separated by words or whitespace and attach each found number to value, which can be used for all at once to replace it with same numbers but with spaces on sides. This way it returns only first occurrence of a number in string:
class Program
{
static void Main(string[] args)
{
string input = "here is 369 and 777 and 15 2080 and 579";
string resultString = Regex.Match(input, #"\d+").Value;
Console.WriteLine(resultString);
Console.ReadLine();
}
}
output:
369
but also I'm not sure if I can get all different found number for single replacement value for each. Would be good to find out in which direction to go
If what we need is basically to add spaces around numbers, try this:
string tmp = Regex.Replace(input, #"(?<a>[0-9])(?<b>[^0-9\s])", #"${a} ${b}");
string res = Regex.Replace(tmp, #"(?<a>[^0-9\s])(?<b>[0-9])", #"${a} ${b}");
Previous answer assumed that words, numbers and punctuation should be separated:
string input = "here is369 and777, and 20k0";
var matches = Regex.Matches(input, #"([A-Za-z]+|[0-9]+|\p{P})");
foreach (Match match in matches)
Console.WriteLine("{0}", match.Groups[1].Value);
To construct the required result string in a short way:
string res = string.Join(" ", matches.Cast<Match>().Select(m => m.Groups[1].Value));
You were on the right path. Regex.Match only returns one match and you would have to use .NextMatch() to get the next value that matches your regular expression. Regex.Matches returns every possible match into a MatchCollection that you can then parse with a loop as I did in my example:
string input = "here is 369 and 777 and 15 2080 and 579";
foreach (Match match in Regex.Matches(input, #"\d+"))
{
Console.WriteLine(match.Value);
}
Console.ReadLine();
This Outputs:
369
777
15
2080
579
This provides the desired output:
string input = "ok, here is369 and777, and 20k0 10+1.any word.";
var matches = Regex.Matches(input, #"([\D]+|[0-9]+)");
foreach (Match match in matches)
Console.Write("{0} ", match.Groups[0].Value);
[\D] will match anything non digit. Please note space after {0}.
string temp = "12345&refere?X=Assess9677125?Folder_id=35478";
I need to extract the number 12345 alone and I don't need the numbers 9677125 and 35478.
What regex can I use?
Here is the regex for extracting 5 digit number in the beginning of the string:
^(\d{5})&
If length is arbitrary:
^(\d+)&
If termination pattern is not always &:
^(\d+)[^\d]
Based on the Sayse's comment you can simply rewrite as:
^(\d+)
and in case of the termination is some number(for instance 999) then:
^(\d+)999
You don't need regex if you only want to extract the first number:
string temp = "12345&refere?X=Assess9677125?Folder_id=35478";
int first = Int32.Parse(String.Join("", temp.TakeWhile(c => Char.IsDigit(c))));
Console.WriteLine(first); // 12345
If the number you want is always at the beginning of the string and terminated by an ampersand (&) you don't need a regex at all. Just split the string on the ampersand and get the first element of the resulting array:
String temp = "12345&refere?X=Assess9677125?Folder_id=35478";
var splitArray = String.Split('&', temp);
var number = splitArray[0]; // returns 12345
Alternatively, you can get the index of the ampersand and substring up to that point:
String temp = "12345&refere?X=Assess9677125?Folder_id=35478";
var ampersandIndex = temp.IndexOf("&");
var number = temp.SubString(0, ampersandIndex); // returns 12345
From what you haven given us this is fairly simple:
var regex = new Regex(#"^(?<number>\d+)&");
var match = regex.Match("12345&refere?X=Assess9677125?Folder_id=35478");
if (match.Success)
{
var number = int.Parse(match.Groups["number"].Value);
}
Edit: Of course you can replace the argument of new Regex with any of the combinations Giorgi has given.
I'm looking for a way to search a string for everything before a set of characters in C#. For Example, if this is my string value:
This is is a test.... 12345
I want build a new string with all of the characters before "12345".
So my new string would equal "This is is a test.... "
Is there a way to do this?
I've found Regex examples where you can focus on one character but not a sequence of characters.
You don't need to use a Regex:
public string GetBitBefore(string text, string end)
{
var index = text.IndexOf(end);
if (index == -1) return text;
return text.Substring(0, index);
}
You can use a lazy quantifier to match anything, followed by a lookahead:
var match = Regex.Match("This is is a test.... 12345", #".*?(?=\d{5})");
where:
.*? lazily matches everything (up to the lookahead)
(?=…) is a positive lookahead: the pattern must be matched, but is not included in the result
\d{5} matches exactly five digits. I'm assuming this is your lookahead; you can replace it
You can do so with help of regex lookahead.
.*(?=12345)
Example:
var data = "This is is a test.... 12345";
var rxStr = ".*(?=12345)";
var rx = new System.Text.RegularExpressions.Regex (rxStr,
System.Text.RegularExpressions.RegexOptions.IgnoreCase);
var match = rx.Match(data);
if (match.Success) {
Console.WriteLine (match.Value);
}
Above code snippet will print every thing upto 12345:
This is is a test....
For more detail about see regex positive lookahead
This should get you started:
var reg = new Regex("^(.+)12345$");
var match = reg.Match("This is is a test.... 12345");
var group = match.Groups[1]; // This is is a test....
Of course you'd want to do some additional validation, but this is the basic idea.
^ means start of string
$ means end of string
The asterisk tells the engine to attempt to match the preceding token zero or more times. The plus tells the engine to attempt to match the preceding token once or more
{min,max} indicate the minimum/maximum number of matches.
\d matches a single character that is a digit, \w matches a "word character" (alphanumeric characters plus underscore), and \s matches a whitespace character (includes tabs and line breaks).
[^a] means not so exclude a
The dot matches a single character, except line break characters
In your case there many way to accomplish the task.
Eg excluding digit: ^[^\d]*
If you know the set of characters and they are not only digit, don't use regex but IndexOf(). If you know the separator between first and second part as "..." you can use Split()
Take a look at this snippet:
class Program
{
static void Main(string[] args)
{
string input = "This is is a test.... 12345";
// Here we call Regex.Match.
MatchCollection matches = Regex.Matches(input, #"(?<MySentence>(\w+\s*)*)(?<MyNumberPart>\d*)");
foreach (Match item in matches)
{
Console.WriteLine(item.Groups["MySentence"]);
Console.WriteLine("******");
Console.WriteLine(item.Groups["MyNumberPart"]);
}
Console.ReadKey();
}
}
You could just split, not as optimal as the indexOf solution
string value = "oiasjdoiasj12345";
string end = "12345";
string result = value.Split(new string[] { end }, StringSplitOptions.None)[0] //Take first part of the result, not the quickest but fairly simple
In my current project I have to work alot with substring and I'm wondering if there is an easier way to get out numbers from a string.
Example:
I have a string like this:
12 text text 7 text
I want to be available to get out first number set or second number set.
So if I ask for number set 1 I will get 12 in return and if I ask for number set 2 I will get 7 in return.
Thanks!
This will create an array of integers from the string:
using System.Linq;
using System.Text.RegularExpressions;
class Program {
static void Main() {
string text = "12 text text 7 text";
int[] numbers = (from Match m in Regex.Matches(text, #"\d+") select int.Parse(m.Value)).ToArray();
}
}
Try using regular expressions, you can match [0-9]+ which will match any run of numerals within your string. The C# code to use this regex is roughly as follows:
Match match = Regex.Match(input, "[0-9]+", RegexOptions.IgnoreCase);
// Here we check the Match instance.
if (match.Success)
{
// here you get the first match
string value = match.Groups[1].Value;
}
You will of course still have to parse the returned strings.
Looks like a good match for Regex.
The basic regular expression would be \d+ to match on (one or more digits).
You would iterate through the Matches collection returned from Regex.Matches and parse each returned match in turn.
var matches = Regex.Matches(input, "\d+");
foreach(var match in matches)
{
myIntList.Add(int.Parse(match.Value));
}
You could use regex:
Regex regex = new Regex(#"^[0-9]+$");
you can split the string in parts using string.Split, and then travese the list with a foreach applying int.TryParse, something like this:
string test = "12 text text 7 text";
var numbers = new List<int>();
int i;
foreach (string s in test.Split(' '))
{
if (int.TryParse(s, out i)) numbers.Add(i);
}
Now numbers has the list of valid values