I need to make a validation in a string with positional information using regex.
Example:
020005254877841100557810AAAAAA841158891BBBB
I need to get match position 5 until 10 and has to be only numbers.
How can I do this using RegEx?
hmm does it really have to be regex?
I would have done like this.
var myString = "020005254877841100557810AAAAAA841158891BBBB";
var isValid = myString.Substring(4, 5).All(Char.IsDigit);
If you absolutely HAVE to use regex, here it is.... (Though I think you should go with something like Jonas W's answer).
Match m = Regex.Match(myString, "^.{4}\d{5}.*");
if(m.Success){
//do stuff
}
The regex means, "from the beginning of the string (^), match 4 of any character (.{4}), then five digits, (\d{5}), then however many of any other characters (.*)"
If you really feel you must use a regex this will be the one:
"^....\d{5}"
You can also do multiple checks like this:
"^....(\d{5}|\D{5})
That one will match all numbers or all non-digit characters, but not a mix or anything with whitespace.
To build on what FailedDev mentioned.
string myString = "020005254877841100557810AAAAAA841158891BBBB";
myString.Substring(5, 5); //will get the substring at index 5 to 10.
double Num;
bool isNum = double.TryParse(myString, out Num); //returns true of all numbers
Hope that helps,
Related
What is the regular expression to validate a comma delimited list like this one:
12365, 45236, 458, 1, 99996332, ......
I suggest you to do in the following way:
(\d+)(,\s*\d+)*
which would work for a list containing 1 or more elements.
This regex extracts an element from a comma separated list, regardless of contents:
(.+?)(?:,|$)
If you just replace the comma with something else, it should work for any delimiter.
It depends a bit on your exact requirements. I'm assuming: all numbers, any length, numbers cannot have leading zeros nor contain commas or decimal points. individual numbers always separated by a comma then a space, and the last number does NOT have a comma and space after it. Any of these being wrong would simplify the solution.
([1-9][0-9]*,[ ])*[1-9][0-9]*
Here's how I built that mentally:
[0-9] any digit.
[1-9][0-9]* leading non-zero digit followed by any number of digits
[1-9][0-9]*, as above, followed by a comma
[1-9][0-9]*[ ] as above, followed by a space
([1-9][0-9]*[ ])* as above, repeated 0 or more times
([1-9][0-9]*[ ])*[1-9][0-9]* as above, with a final number that doesn't have a comma.
Match duplicate comma-delimited items:
(?<=,|^)([^,]*)(,\1)+(?=,|$)
Reference.
This regex can be used to split the values of a comma delimitted list. List elements may be quoted, unquoted or empty. Commas inside a pair of quotation marks are not matched.
,(?!(?<=(?:^|,)\s*"(?:[^"]|""|\\")*,)(?:[^"]|""|\\")*"\s*(?:,|$))
Reference.
/^\d+(?:, ?\d+)*$/
i used this for a list of items that had to be alphanumeric without underscores at the front of each item.
^(([0-9a-zA-Z][0-9a-zA-Z_]*)([,][0-9a-zA-Z][0-9a-zA-Z_]*)*)$
You might want to specify language just to be safe, but
(\d+, ?)+(\d+)?
ought to work
I had a slightly different requirement, to parse an encoded dictionary/hashtable with escaped commas, like this:
"1=This is something, 2=This is something,,with an escaped comma, 3=This is something else"
I think this is an elegant solution, with a trick that avoids a lot of regex complexity:
if (string.IsNullOrEmpty(encodedValues))
{
return null;
}
else
{
var retVal = new Dictionary<int, string>();
var reFields = new Regex(#"([0-9]+)\=(([A-Za-z0-9\s]|(,,))+),");
foreach (Match match in reFields.Matches(encodedValues + ","))
{
var id = match.Groups[1].Value;
var value = match.Groups[2].Value;
retVal[int.Parse(id)] = value.Replace(",,", ",");
}
return retVal;
}
I think it can be adapted to the original question with an expression like #"([0-9]+),\s?" and parse on Groups[0].
I hope it's helpful to somebody and thanks for the tips on getting it close to there, especially Asaph!
In JavaScript, use split to help out, and catch any negative digits as well:
'-1,2,-3'.match(/(-?\d+)(,\s*-?\d+)*/)[0].split(',');
// ["-1", "2", "-3"]
// may need trimming if digits are space-separated
The following will match any comma delimited word/digit/space combination
(((.)*,)*)(.)*
Why don't you work with groups:
^(\d+(, )?)+$
If you had a more complicated regex, i.e: for valid urls rather than just numbers. You could do the following where you loop through each element and test each of them individually against your regex:
const validRelativeUrlRegex = /^(^$|(?!.*(\W\W))\/[a-zA-Z0-9\/-]+[^\W_]$)/;
const relativeUrls = "/url1,/url-2,url3";
const startsWithComma = relativeUrls.startsWith(",");
const endsWithComma = relativeUrls.endsWith(",");
const areAllURLsValid = relativeUrls
.split(",")
.every(url => validRelativeUrlRegex.test(url));
const isValid = areAllURLsValid && !endsWithComma && !startsWithComma
I'm searching for a regular expression that will extract 0263563
from
;010263563=2119?
and 0267829
from
%00000026782904?;010267829=4119?
(Must be the same regular expression).
Start at the 3rd character after the semicolon and take 7 characters.
;..(\d{7})
or more general:
;..(.{7})
Or according to comment : "To clarify characters to take are digits"
;\d\d(\d{7})
Well, you need this:
;\d{2}(\d+)
The First group will contain the number you want.
;[\S]{2}([\S]{7})
Since you said characters and not numbers, but it would work either way
For rules that are that precise (start 3 chars after ;, then next 7), you could use a plain substring:
string s = "%00000026782904?;010267829=4119?";
var pos = s.IndexOf(';');
var number = s.Substring(pos+3, 7);
And of course, test whether that IndexOf really found the ;
The below regex would exactly 7 digits which must be preceded by a ; , any two characters.
(?<=;.{2})\d{7}
Code:
String input = #";010263563=2119?
%00000026782904?;010267829=4119?";
Regex rgx = new Regex(#"(?<=;.{2})\d{7}");
foreach (Match m in rgx.Matches(input))
Console.WriteLine(m.Groups[0].Value);
IDEONE
Output:
0263563
0267829
Assuming regex is not an absolute requirement, you could use:
var input = "%00000026782904?;010267829=4119?";
// output will be: 0267829
var digits = input.SkipWhile(x => x != ';').Skip(3).Take(7).ToArray();
var output = new string(digits);
Suppose I have a string
Likes (20)
I want to fetch the sub-string enclosed in round brackets (in above case its 20) from this string. This sub-string can change dynamically at runtime. It might be any other number from 0 to infinity. To achieve this my idea is to use a for loop that traverses the whole string and then when a ( is present, it starts adding the characters to another character array and when ) is encountered, it stops adding the characters and returns the array. But I think this might have poor performance. I know very little about regular expressions, so is there a regular expression solution available or any function that can do that in an efficient way?
If you don't fancy using regex you could use Split:
string foo = "Likes (20)";
string[] arr = foo.Split(new char[]{ '(', ')' }, StringSplitOptions.None);
string count = arr[1];
Count = 20
This will work fine regardless of the number in the brackets ()
e.g:
Likes (242535345)
Will give:
242535345
Works also with pure string methods:
string result = "Likes (20)";
int index = result.IndexOf('(');
if (index >= 0)
{
result = result.Substring(index + 1); // take part behind (
index = result.IndexOf(')');
if (index >= 0)
result = result.Remove(index); // remove part from )
}
Demo
For a strict matching, you can do:
Regex reg = new Regex(#"^Likes\((\d+)\)$");
Match m = reg.Match(yourstring);
this way you'll have all you need in m.Groups[1].Value.
As suggested from I4V, assuming you have only that sequence of digits in the whole string, as in your example, you can use the simpler version:
var res = Regex.Match(str,#"\d+")
and in this canse, you can get the value you are looking for with res.Value
EDIT
In case the value enclosed in brackets is not just numbers, you can just change the \d with something like [\w\d\s] if you want to allow in there alphabetic characters, digits and spaces.
Even with Linq:
var s = "Likes (20)";
var s1 = new string(s.SkipWhile(x => x != '(').Skip(1).TakeWhile(x => x != ')').ToArray());
const string likes = "Likes (20)";
int likesCount = int.Parse(likes.Substring(likes.IndexOf('(') + 1, (likes.Length - likes.IndexOf(')') + 1 )));
Matching when the part in paranthesis is supposed to be a number;
string inputstring="Likes (20)"
Regex reg=new Regex(#"\((\d+)\)")
string num= reg.Match(inputstring).Groups[1].Value
Explanation:
By definition regexp matches a substring, so unless you indicate otherwise the string you are looking for can occur at any place in your string.
\d stand for digits. It will match any single digit.
We want it to potentially be repeated several times, and we want at least one. The + sign is regexp for previous symbol or group repeated 1 or more times.
So \d+ will match one or more digits. It will match 20.
To insure that we get the number that is in paranteses we say that it should be between ( and ). These are special characters in regexp so we need to escape them.
(\d+) would match (20), and we are almost there.
Since we want the part inside the parantheses, and not including the parantheses we tell regexp that the digits part is a single group.
We do that by using parantheses in our regexp. ((\d+)) will still match (20), but now it will note that 20 is a subgroup of this match and we can fetch it by Match.Groups[].
For any string in parantheses things gets a little bit harder.
Regex reg=new Regex(#"\((.+)\)")
Would work for many strings. (the dot matches any character) But if the input is something like "This is an example(parantesis1)(parantesis2)", you would match (parantesis1)(parantesis2) with parantesis1)(parantesis2 as the captured subgroup. This is unlikely to be what you are after.
The solution can be to do the matching for "any character exept a closing paranthesis"
Regex reg=new Regex(#"\(([^\(]+)\)")
This will find (parantesis1) as the first match, with parantesis1 as .Groups[1].
It will still fail for nested paranthesis, but since regular expressions are not the correct tool for nested paranthesis I feel that this case is a bit out of scope.
If you know that the string always starts with "Likes " before the group then Saves solution is better.
I have a regular expression:
12345678|[0]{8}|[1]{8}|[2]{8}|[3]{8}|[4]{8}|[5]{8}|[6]{8}|[7]{8}|[8]{8}|[9]{8}
which matches if the string contains 12345679 or 11111111 or 22222222 ... or ... 999999999.
How can I changed this to only match if NOT the above? (I am not able to just !IsMatch in the C# unfortunately)...EDIT because that is black box code to me and I am trying to set the regex in an existing config file
This will match everything...
foundMatch = Regex.IsMatch(SubjectString, #"^(?:(?!123456789|(\d)\1{7}).)*$");
unless one of the "forbidden" sequences is found in the string.
Not using !isMatch as you can see.
Edit:
Adding your second constraint can be done with a lookahead assertion:
foundMatch = Regex.IsMatch(SubjectString, #"^(?=\d{9,12})(?:(?!123456789|(\d)\1{7}).)*$");
Works perfectly
string s = "55555555";
Regex regx = new Regex(#"^(?:12345678|(\d)\1{7})$");
if (!regx.IsMatch(s)) {
Console.WriteLine("It does not match!!!");
}
else {
Console.WriteLine("it matched");
}
Console.ReadLine();
Btw. I simplified your expression a bit and added anchors
^(?:12345678|(\d)\1{7})$
The (\d)\1{7} part takes a digit \d and the \1 checks if this digit is repeated 7 more times.
Update
This regex is doing what you want
Regex regx = new Regex(#"^(?!(?:12345678|(\d)\1{7})$).*$");
First of all, you don't need any of those [] brackets; you can just do 0{8}|1{8}| etc.
Now for your problem. Try using a negative lookahead:
#"^(?:(?!123456789|(\d)\1{7}).)*$"
That should take care of your issue without using !IsMatch.
I am not able to just !IsMatch in the C# unfortunately.
Why not? What's wrong with the following solution?
bool notMatch = !Regex.Match(yourString, "^(12345678|[0]{8}|[1]{8}|[2]{8}|[3]{8}|[4]{8}|[5]{8}|[6]{8}|[7]{8}|[8]{8}|[9]{8})$");
That will match any string that contains more than just 12345678, 11111111, ..., 99999999
i have strings in the form [abc].[some other string].[can.also.contain.periods].[our match]
i now want to match the string "our match" (i.e. without the brackets), so i played around with lookarounds and whatnot. i now get the correct match, but i don't think this is a clean solution.
(?<=\.?\[) starts with '[' or '.['
([^\[]*) our match, i couldn't find a way to not use a negated character group
`.*?` non-greedy did not work as expected with lookarounds,
it would still match from the first match
(matches might contain escaped brackets)
(?=\]$) string ends with an ]
language is .net/c#. if there is an easier solution not involving a regex i'd be also happy to know
what really irritates me is the fact, that i cannot use (.*?) to capture the string, as it seems non-greedy does not work with lookbehinds.
i also tried: Regex.Split(str, #"\]\.\[").Last().TrimEnd(']');, but i'm not really pround of this solution either
The following should do the trick. Assuming the string ends after the last match.
string input = "[abc].[some other string].[can.also.contain.periods].[our match]";
var search = new Regex("\\.\\[(.*?)\\]$", RegexOptions.RightToLeft);
string ourMatch = search.Match(input).Groups[1]);
Assuming you can guarantee the input format, and it's just the last entry you want, LastIndexOf could be used:
string input = "[abc].[some other string].[can.also.contain.periods].[our match]";
int lastBracket = input.LastIndexOf("[");
string result = input.Substring(lastBracket + 1, input.Length - lastBracket - 2);
With String.Split():
string input = "[abc].[some other string].[can.also.contain.periods].[our match]";
char[] seps = {'[',']','\\'};
string[] splitted = input.Split(seps,StringSplitOptions.RemoveEmptyEntries);
you get "out match" in splitted[7] and can.also.contain.periods is left as one string (splitted[4])
Edit: the array will have the string inside [] and then . and so on, so if you have a variable number of groups, you can use that to get the value you want (or remove the strings that are just '.')
Edited to add the backslash to the separator to treat cases like '\[abc\]'
Edit2: for nested []:
string input = #"[abc].[some other string].[can.also.contain.periods].[our [the] match]";
string[] seps2 = { "].["};
string[] splitted = input.Split(seps2, StringSplitOptions.RemoveEmptyEntries);
you our [the] match] in the last element (index 3) and you'd have to remove the extra ]
You have several options:
RegexOptions.RightToLeft - yes, .NET regex can do this! Use it!
Match the whole thing with greedy prefix, use brackets to capture the suffix that you're interested in
So generally, pattern becomes .*(pattern)
In this case, .*\[([^\]]*)\], then extract what \1 captures (see this on rubular.com)
References
regular-expressions.info/Grouping with brackets