I have a big string and want to find the first occurrence of X, where X has the form "numberXnumber", e.g. 3X3 or 4X9.
How could I do this in C#?
var s = "long string.....24X10 .....1X3";
var match = Regex.Match(s, @"\d+X\d+");
if (match.Success) {
Console.WriteLine(match.Index); // 16
Console.WriteLine(match.Value); // 24X10;
}
Also take a look at NextMatch, which is a handy function:
match = match.NextMatch();
Console.WriteLine(match.Value); // 1X3
For those who love extension methods:
public static int RegexIndexOf(this string str, string pattern)
{
var m = Regex.Match(str, pattern);
return m.Success ? m.Index : -1;
}
Yes, regex could do that for you.
You could use ([0-9]+)X([0-9]+). If you know that the numbers are only single digits, you could use [0-9]X[0-9].
This may help you:
string myText = "33x99 lorem ipsum 004x44";
//the first matched group index
int firstIndex = Regex.Match(myText,"([0-9]+)(x)([0-9]+)").Index;
//first matched "x" (group = 2) index
int firstXIndex = Regex.Match(myText,"([0-9]+)(x)([0-9]+)").Groups[2].Index;
var index = new Regex("yourPattern").Match("X").Index;
http://www.regular-expressions.info/download/csharpregexdemo.zip
You can use this pattern:
\d([xX])\d
If I test
blaat3X3test
I get:
Match offset: 5
Match length: 3
Matched text: 3X3
Group 1 offset: 6
Group 1 length: 1
Group 1 text: X
Do you want the number, or the index of the number? You can get both of these, but you're probably going to want to take a look at System.Text.RegularExpressions.Regex
The actual pattern is going to be [0-9]x[0-9] if you want only single digits on each side (in 89x72 it will only match 9x7), or [0-9]+x[0-9]+ to match the longest consecutive run of digits in both directions.
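To make the difference between the two patterns concrete, here is a minimal sketch. The class name DigitPatternDemo and the helper FirstMatch are made up for this illustration; only Regex.Match and the two patterns come from the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DigitPatternDemo
{
    // Hypothetical helper: returns the text of the first match, or null if there is none.
    public static string FirstMatch(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        // Single digit on each side: only the digits adjacent to the x are matched.
        Console.WriteLine(FirstMatch("89x72", "[0-9]x[0-9]"));   // 9x7
        // One or more digits on each side: the whole number pair is matched.
        Console.WriteLine(FirstMatch("89x72", "[0-9]+x[0-9]+")); // 89x72
    }
}
```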
Related
I am trying to write a regex to use in C#.
The regex should extract a substring of the input depending on the input length.
What the regex should do:
If the input length is less than 13, get the full input
Else if the input length is greater than 25, get the substring from index 3 to index 16 (so that the first three chars are skipped)
Here is what I have come up with so far:
(?(?=.{25,}).{3}(.{13})|(?(?=.{0,13})(.{0,13})))
This is not working: when the input length is greater than 25, the result does not skip the first three chars.
Check it here
Note that a non-regex solution is rather trivial:
public string check(string s)
{
var res = "";
if (s.Length>=25)
res = s.Substring(3,13);
else if (s.Length <= 13)
res = s;
return res;
}
If you want to use a regex, you may use
^(?=.{25,}).{3}(?<res>.{13})|^(?=.{0,13}$)(?<res>.*)
See the regex demo. Compile with RegexOptions.Singleline to support newlines in the input.
Details
^ - start of string
(?=.{25,}) - if there are 25 or more chars after the start of string, match
.{3} - any 3 chars
(?<res>.{13}) - and capture 13 chars into res group
| - or
^(?=.{0,13}$) - make sure the whole string is 0 to 13 chars long and then
(?<res>.*) - grab the whole string (if no RegexOptions.Singleline is used, only 1 line will be matched).
Use it as
var res = "";
var m = Regex.Match(s, @"^(?=.{25,}).{3}(?<res>.{13})|^(?=.{0,13}$)(?<res>.*)");
if (m.Success)
{
res = m.Groups["res"].Value;
}
See a C# demo.
string temp = "12345&refere?X=Assess9677125?Folder_id=35478";
I need to extract the number 12345 alone and I don't need the numbers 9677125 and 35478.
What regex can I use?
Here is the regex for extracting 5 digit number in the beginning of the string:
^(\d{5})&
If length is arbitrary:
^(\d+)&
If termination pattern is not always &:
^(\d+)[^\d]
Based on Sayse's comment, you can simply rewrite it as:
^(\d+)
And if the terminator is itself a number (for instance 999), then:
^(\d+)999
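As a rough sketch of how the simplest of these patterns, ^(\d+), could be used from C#: the class LeadingNumberDemo and the method LeadingDigits are names invented for this example and are not part of the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LeadingNumberDemo
{
    // Hypothetical helper: returns the run of digits at the very start of the input,
    // or null when the input does not start with a digit.
    public static string LeadingDigits(string input)
    {
        Match m = Regex.Match(input, @"^(\d+)");
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        string temp = "12345&refere?X=Assess9677125?Folder_id=35478";
        Console.WriteLine(LeadingDigits(temp)); // 12345

        // The ^ anchor means digits later in the string are never picked up.
        Console.WriteLine(LeadingDigits("refere12345") == null); // True
    }
}
```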
You don't need regex if you only want to extract the first number:
string temp = "12345&refere?X=Assess9677125?Folder_id=35478";
int first = Int32.Parse(String.Join("", temp.TakeWhile(c => Char.IsDigit(c))));
Console.WriteLine(first); // 12345
If the number you want is always at the beginning of the string and terminated by an ampersand (&) you don't need a regex at all. Just split the string on the ampersand and get the first element of the resulting array:
String temp = "12345&refere?X=Assess9677125?Folder_id=35478";
var splitArray = temp.Split('&');
var number = splitArray[0]; // returns 12345
Alternatively, you can get the index of the ampersand and substring up to that point:
String temp = "12345&refere?X=Assess9677125?Folder_id=35478";
var ampersandIndex = temp.IndexOf("&");
var number = temp.Substring(0, ampersandIndex); // returns 12345
From what you have given us, this is fairly simple:
var regex = new Regex(@"^(?<number>\d+)&");
var match = regex.Match("12345&refere?X=Assess9677125?Folder_id=35478");
if (match.Success)
{
var number = int.Parse(match.Groups["number"].Value);
}
Edit: Of course you can replace the argument of new Regex with any of the combinations Giorgi has given.
I have a string in this format:
ABCD_EFDG20120700.0.xml
This has a pattern which has three parts to it:
First is the set of chars before the '_', the 'ABCD'
Second are the set of chars 'EFDG' after the '_'
Third are the remaining 20120700.0.xml
I can split the original string and get the number(s) from the second element in the split result using this pattern:
\d+
Match m = Regex.Match(splitname[1], "\\d+");
That returns only '20120700'. But I need '20120700.0'.
How do I get the required string?
You can extend your regex to look for any number of digits, then a period, and then any number of digits once again:
Match m = Regex.Match(splitname[1], "\\d+\\.\\d+");
With such a regular expression, though, you don't even need to split the string:
string s = "ABCD_EFDG20120700.0.xml";
Match m = Regex.Match(s, "\\d+\\.\\d+");
string result = m.Value; // result is 20120700.0
I suggest using a single regex for everything you need, like this:
var rgx = new Regex(@"^([^_]+)_([^\d.]+)([\d.]+\d+)\.(.*)$");
var matches = rgx.Matches(input);
if (matches.Count > 0)
{
Console.WriteLine("{0}", matches[0].Groups[0]); // All input string
Console.WriteLine("{0}", matches[0].Groups[1]); // ABCD
Console.WriteLine("{0}", matches[0].Groups[2]); // EFDG
Console.WriteLine("{0}", matches[0].Groups[3]); // 20120700.0
Console.WriteLine("{0}", matches[0].Groups[4]); // xml
}
I have a string conforming to the following pattern:
(cc)-(nr).(nr)M(nr)(cc)whitespace(nr)
where cc is an arbitrary number of letter characters, nr is an arbitrary number of numerical characters, and M is the actual letter M.
For example:
ASF-1.15M437979CA 100000
EU-12.15M121515PO 1145
I need to find the positions of -, . and M within the string. The problem is that the leading and trailing characters can contain the letter M as well, but I need only the one in the middle.
As an alternative, extracting the first characters (up to the -) and the first two numbers (as in (nr).(nr)M...) would be enough.
If you need a regex-based solution, you just need to use 3 capturing groups around the required patterns, and then access the Groups[n].Index property:
var rxt = new Regex(@"\p{L}*(-)\d+(\.)\d+(M)\d+\p{L}*\s*\d+");
// Collect matches
var matches = rxt.Matches(@"ASF-1.15M437979CA 100000 or EU-12.15M121515PO 1145");
// Now, we can get the indices
var posOfHyphen = matches.Cast<Match>().Select(p => p.Groups[1].Index);
var posOfDot = matches.Cast<Match>().Select(p => p.Groups[2].Index);
var posOfM = matches.Cast<Match>().Select(p => p.Groups[3].Index);
Output:
posOfHyphen => [3, 30]
posOfDot => [5, 33]
posOfM => [8, 36]
Regex:
string pattern = @"[A-Z]+(-)\d+(\.)\d+(M)\d+[A-Z]+";
string value = "ASF-1.15M437979CA 100000 or EU-12.15M121515PO 1145";
var match = Regex.Match(value, pattern);
if (match.Success)
{
int sep1 = match.Groups[1].Index;
int sep2 = match.Groups[2].Index;
int sep3 = match.Groups[3].Index;
}
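For the alternative mentioned in the question (pulling out the leading letters and the first two numbers instead of the positions), a sketch with named groups might look like the following. The class PartsDemo, the method ExtractParts and the group names prefix, first and second are all assumptions made for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PartsDemo
{
    // Hypothetical helper: returns { leading letters, first number, second number },
    // or null when the input does not start with the (cc)-(nr).(nr)M shape.
    public static string[] ExtractParts(string input)
    {
        Match m = Regex.Match(input, @"^(?<prefix>\p{L}+)-(?<first>\d+)\.(?<second>\d+)M");
        if (!m.Success) return null;
        return new[]
        {
            m.Groups["prefix"].Value,
            m.Groups["first"].Value,
            m.Groups["second"].Value
        };
    }

    public static void Main()
    {
        string[] parts = ExtractParts("ASF-1.15M437979CA 100000");
        Console.WriteLine(string.Join(", ", parts)); // ASF, 1, 15
    }
}
```

Because the M in the pattern must directly follow the second number, an M inside the leading or trailing letters does not interfere.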
In my current project I have to work a lot with substrings, and I'm wondering if there is an easier way to get numbers out of a string.
Example:
I have a string like this:
12 text text 7 text
I want to be able to get the first number set or the second number set.
So if I ask for number set 1 I will get 12 in return and if I ask for number set 2 I will get 7 in return.
Thanks!
This will create an array of integers from the string:
using System.Linq;
using System.Text.RegularExpressions;
class Program {
static void Main() {
string text = "12 text text 7 text";
int[] numbers = (from Match m in Regex.Matches(text, @"\d+") select int.Parse(m.Value)).ToArray();
}
}
Try using regular expressions, you can match [0-9]+ which will match any run of numerals within your string. The C# code to use this regex is roughly as follows:
Match match = Regex.Match(input, "[0-9]+");
// Here we check the Match instance.
if (match.Success)
{
// match.Value is the first run of digits; the pattern has no capture group, so Groups[1] would be empty
string value = match.Value;
}
You will of course still have to parse the returned strings.
Looks like a good match for Regex.
The basic regular expression would be \d+ to match on (one or more digits).
You would iterate through the Matches collection returned from Regex.Matches and parse each returned match in turn.
var matches = Regex.Matches(input, @"\d+");
foreach (Match match in matches)
{
myIntList.Add(int.Parse(match.Value));
}
You could use regex:
Regex regex = new Regex(@"^[0-9]+$");
You can split the string into parts using string.Split and then traverse the result with a foreach, applying int.TryParse, something like this:
string test = "12 text text 7 text";
var numbers = new List<int>();
int i;
foreach (string s in test.Split(' '))
{
if (int.TryParse(s, out i)) numbers.Add(i);
}
Now numbers holds the list of valid values.
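To answer the "number set 1" / "number set 2" wording of the question directly, the \d+ approach from the answers above can be wrapped in a small helper. This is only a sketch: the class NumberSetDemo and the method GetNumberSet are hypothetical names, and it assumes each run of digits fits in an int.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NumberSetDemo
{
    // Hypothetical helper: returns the n-th run of digits (1-based) as an int,
    // or null when the text contains fewer than n runs.
    // Assumption: every run of digits is small enough for int.Parse.
    public static int? GetNumberSet(string text, int n)
    {
        MatchCollection matches = Regex.Matches(text, @"\d+");
        if (n < 1 || n > matches.Count) return null;
        return int.Parse(matches[n - 1].Value);
    }

    public static void Main()
    {
        string text = "12 text text 7 text";
        Console.WriteLine(GetNumberSet(text, 1)); // 12
        Console.WriteLine(GetNumberSet(text, 2)); // 7
        Console.WriteLine(GetNumberSet(text, 3).HasValue); // False
    }
}
```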