Create a Regex pattern to validate this example expression (.1); - c#

I am looking for some help validating a string. I need a regex pattern that will catch any letters within a set of parentheses. I also need to make sure there is a semicolon right after the closing parenthesis. Any ideas? My regex is absolutely terrible.
This is what I want to match:
Total Hours Worked (.5);
Total Hours Worked (.A);
Total Hours Worked (A);
First result should be false while the last 2 should be true.
This is what I have tried:
Match validateLettersAndSemiColon = Regex.Match(StringToMatch, "[a-z]);");

This is just an example using as input the following 3 strings:
Total Hours Worked (.5);
Total Hours Worked (.A);
Total Hours Worked (A);
I am not considering nested parentheses, only that the characters allowed inside the parentheses are letters and a dot.
Here is a simple example:
string[] data = new string[] { "Total Hours Worked (.5);", "Total Hours Worked (.A);", "Total Hours Worked (A);" };
foreach (string input in data)
{
    Console.WriteLine("Result for:" + input);
    Match match = Regex.Match(input, @"\([a-z.]+\);$", RegexOptions.IgnoreCase);
    if (match.Success)
    {
        Console.WriteLine("YES");
    }
    else
    {
        Console.WriteLine("NO");
    }
}
In @"\([a-z.]+\);$", the \ before each parenthesis escapes it so that it is matched as a literal parenthesis. The [a-z.]+ part matches one or more letters or dots (you can also limit the count, but this should give you the idea). The $ at the end means the input must end with );
If you want to allow only a single dot right after the opening parenthesis, you can use the regex below instead. It makes the dot a single optional character immediately after the (
@"\(\.?[a-z]+\);$"
The result of the above would be:
Total Hours Worked (.5);
NO
Total Hours Worked (.A);
YES
Total Hours Worked (A);
YES
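The stricter pattern can be wrapped in a small helper for reuse. This is only a sketch: the names ParenthesesCheck and IsValid are made up for illustration, and it assumes the parenthesized group has to sit at the very end of the input.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the answer above.
public static class ParenthesesCheck
{
    // Literal "(", an optional single dot, one or more letters, literal ")",
    // then a semicolon at the end of the input.
    private static readonly Regex Pattern =
        new Regex(@"\(\.?[a-z]+\);$", RegexOptions.IgnoreCase);

    public static bool IsValid(string input)
    {
        return Pattern.IsMatch(input);
    }
}
```

With the three sample strings, ParenthesesCheck.IsValid returns false for "Total Hours Worked (.5);" and true for the other two.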

Your regex is /\([^)]+\);/ or /\(.+?\)/ if you don't have nested parentheses. It works even if you have two or more of these parenthesized groups on the same line.
If you have nested parentheses, use /\(.+\);/, but this will not work if you have two or more parenthesized groups on the same line.
In the end, if you have a string like:
(aba(cc);a);eeee(dd(e););
it can be pretty hard to handle with a single regex.
Edit 1
If the parenthesized group you want to validate takes up the whole string, you can use a ^ to signal the beginning of the string and a $ for the end. Thus the regex becomes
/^\([^)]+\);$/
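To see the difference between the negated character class and the greedy dot, a small sketch in C# can count the matches each pattern finds. The class and method names here are invented for the illustration.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for comparing the two patterns discussed above.
public static class GroupCounter
{
    // [^)]+ stops at the first closing parenthesis, so each "( ... );"
    // group on the line is found separately.
    public static int CountWithNegatedClass(string input)
    {
        return Regex.Matches(input, @"\([^)]+\);").Count;
    }

    // .+ is greedy and runs to the last ");" on the line, so several
    // groups collapse into a single match.
    public static int CountWithGreedyDot(string input)
    {
        return Regex.Matches(input, @"\(.+\);").Count;
    }
}
```

For the input "(a);(b);" the first method returns 2 and the second returns 1.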

Try following regex:
\([^0-9]+\)\s*;
This will match any characters within the parentheses except digits.
I would recommend putting \s* between ) and ; to allow whitespace there, as most programming languages do.

Try this:
string[] inputstrings = new string[] { "Total Hours Worked (.5);", "Total Hours Worked (.A);", "Total Hours Worked (A);" };//Collection of inputs.
Regex rgx = new Regex(@"\(\.?(?<StringValue>[a-zA-Z]*)\)\;{1}");//Regular expression to find all matches.
foreach (string input in inputstrings)//Iterate through each string in collection.
{
    Match match = rgx.Match(input);
    if (match.Success)//If a match is found.
    {
        string value = match.Groups[1].Value;//Capture first named group.
        Console.WriteLine(value);//Display captured substring.
    }
    else//If nothing is found.
    {
        Console.WriteLine("A match was not found.");
    }
}

Related

Regex for splitting string into a collection of two based on a pattern

Using the C# Regex.Split method, I would like to split strings that will always start with RepXYZ, where XYZ is a number that always has either 3 or 4 digits.
Examples
"Rep1007$chkCheckBox"
"Rep127_Group_Text"
The results should be:
{"Rep1007","$chkCheckBox"}
{"Rep127","_Group_Text"}
So far I have tried (Rep)[\d]{3,4} and ((Rep)[\d]{3,4})+ but both of those are giving me unwanted results.
Using Regex.Split often results in empty or unwanted items in the resulting array. Using (Rep)[\d]{3,4} in Regex.Split, will put Rep without the numbers into the resulting array. (Rep[\d]{3,4}) will put the Rep and the numbers into the result, but since the match is at the start, there will be an empty item in the array.
I suggest using Regex.Match here:
var match = Regex.Match(text, @"^(Rep\d+)(.*)$");
if (match.Success)
{
    Console.WriteLine(match.Groups[1].Value);
    Console.WriteLine(match.Groups[2].Value);
}
Details:
^ - start of string
(Rep\d+) - capturing group 1: Rep and any one or more digits
(.*) - capturing group 2: any zero or more chars other than a newline, as many as possible
$ - end of string.
A splitting approach is better implemented with a lookaround-based regex:
var results = Regex.Split(text, @"(?<=^Rep\d+)(?=[$_])");
(?<=^Rep\d+)(?=[$_]) splits a string at the location that is immediately preceded with Rep and one or more digits at the start of the string, and immediately followed with $ or _.
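A minimal sketch of the lookaround-based split applied to the two sample inputs from the question. The wrapper class RepSplitter and its method name are hypothetical. The pattern is the one given above, written as a C# verbatim string.

```csharp
using System.Text.RegularExpressions;

// Hypothetical wrapper around the lookaround-based split.
public static class RepSplitter
{
    // Splits at the zero-width position that follows "Rep" plus digits at the
    // start of the string and precedes "$" or "_". Nothing is consumed, so the
    // delimiter character stays in the second part.
    public static string[] SplitPrefix(string text)
    {
        return Regex.Split(text, @"(?<=^Rep\d+)(?=[$_])");
    }
}
```

RepSplitter.SplitPrefix("Rep1007$chkCheckBox") yields "Rep1007" and "$chkCheckBox", and RepSplitter.SplitPrefix("Rep127_Group_Text") yields "Rep127" and "_Group_Text".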
Try splitting on either $ or _ with String.Split (note that the delimiter itself is not kept in the result):
string input = "Rep127_Group_Text";
string[] parts = input.Split(new[] { '$', '_' }, 2);
foreach (string part in parts)
{
Console.WriteLine(part);
}
This prints:
Rep127
Group_Text

c# making a regex to accept numerical values

I am trying to make a regex for a field which accepts input in the following format:
XXX-XXX
where X is a digit from 0 to 9, so three digits before the dash and three after it.
I started with the following but I got lost in adding validation after the dash.
([0-9-])\w+([0-9-])
3 digits, a dash then 3 digits:
\d{3}-\d{3}
var example = "123-455";
var pattern = @"\A(\d){3}-(\d){3}\Z";
var result = Regex.Match(example, pattern);
This will not only search for the pattern within your string, but also make sure that the pattern starts and ends at the beginning and end of your target string. This ensures that you won't get a match e.g. for:
"silly123-456stuff" or "0123-4567".
In other words, it both looks for the pattern and limits the length of the match by anchoring it to the beginning and end of the string.
string pattern = @"^([0-9]{3})-([0-9]{3})$";
Regex rgx = new Regex(pattern);
I would add the beginning and end of line anchors to the regex
^\d{3}-\d{3}$
^ = at the beginning of the line
\d = a digit
{3} = three times
- = a dash
\d = a digit
{3} = three times
$ = the end of the line
Not setting the start and end of line could let invalid input through, such as Text123-4858
Edit: even better than line markers, the anchors proposed by Kjartan are the correct answer in this case.
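A hedged sketch of the anchored check as a reusable method. Two choices here are assumptions rather than part of the answers above: it uses \z instead of \Z so that a trailing newline is rejected as well, and [0-9] instead of \d because \d in .NET also matches non-ASCII digits by default. The class and method names are made up.

```csharp
using System.Text.RegularExpressions;

// Hypothetical validator for the "three digits, dash, three digits" format.
public static class DashCode
{
    // \A anchors at the start of the string; \z anchors at the very end,
    // without tolerating a final newline the way $ and \Z do.
    private static readonly Regex Pattern =
        new Regex(@"\A[0-9]{3}-[0-9]{3}\z");

    public static bool IsValid(string value)
    {
        return Pattern.IsMatch(value);
    }
}
```

DashCode.IsValid("123-455") is true, while "silly123-456stuff", "0123-4567" and "123-456\n" are all rejected.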

Using RegEx to match Month-Day in C#

Let me preface this by saying I am new to Regex and C# so I am still trying to figure it out. I also realize that Regex is a deep subject that takes time to understand. I have done a little research to figure this out but I don't have the time needed to properly study the art of Regex syntax as I need this program finished tomorrow. (no this is not homework, it is for my job)
I am using c# to search through a text file line by line and I am trying to use a Regex expression to check whether any lines contain any dates of the current month in the format MM-DD. The Regex expression is used within a method that is passed each line of the file.
Here is the method I am currently using:
private bool CheckTransactionDates(string line)
{
    // in the actual code this is dynamically set based on other variables
    string month = "12";
    Regex regExPattern = new Regex(@"\s" + month + @"-\d(0[1-9]|[1-2][0-9]|3[0-1])\s");
    Match match = regExPattern.Match(line);
    return match.Success;
}
Essentially I need it to match only when the date is preceded by a space and followed by a space, and only if it consists of the current month (in this case 12), a hyphen, and a valid day of the month (" 12-01 " should match but not " 12-99 "). There should always be 2 digits on either side of the hyphen.
This Regex (The only thing I can make match) will work, but also picks up items outside the necessary range:
Regex regExPattern = new Regex(@"\s" + month + @"-\d{2}\s");
I have also tried this without success:
Regex regExPattern = new Regex(@"\s" + month + @"-\d[01-30]{2}\s");
Can anyone tell me what I need to change to get the results I need?
Thanks in advance.
If you just need to find out if the line contains any valid match, something like this will work:
private bool CheckTransactionDates(string line)
{
    // in the actual code this is dynamically set based on other variables
    int month = DateTime.Now.Month;
    int daysInMonth = DateTime.DaysInMonth(DateTime.Today.Year, DateTime.Today.Month);
    Regex pattern = new Regex(string.Format(@"{0:00}-(?<DAY>[0123][0-9])", month));
    int day = 0;
    foreach (Match match in pattern.Matches(line))
    {
        if (int.TryParse(match.Groups["DAY"].Value, out day))
        {
            // day 00 passes the pattern, so check the lower bound as well
            if (day >= 1 && day <= daysInMonth)
            {
                return true;
            }
        }
    }
    return false;
}
Here's how it works:
You determine the month to search for (here, I use the current month), and the number of days in that month.
Next, the regex pattern is built using a string.Format function that puts the left-zero-padded month, followed by dash, followed by any two digit number 00 to 39 (the [0123] for the first digit, the [0-9] for the second digit). This narrows the regex matches, but not conclusively for a date. The (?<DAY>...) that surrounds it creates a regex group, which will make processing it later easier. Note that I didn't check for a whitespace, in case the line begins with a valid date. You could easily add a space to the pattern, or modify the pattern to your specific needs.
Next, we check all possible matches on that line (pattern.Matches) in a loop.
If a match is found, we then try to parse it as an integer (it should always work, based on the pattern we are matching). We use the DAY group of that match that we defined in the pattern.
After parsing that match into an integer day, we check to see if that day is a valid number for the month specified. If it is, we return true from the function, as we found a valid date.
Finally, if we found no matches, or if none of the matches is valid, we return false from the function (only if we hadn't returned true earlier).
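The steps above can be sketched with the year and month passed in explicitly, which makes the check easy to try against fixed inputs. Everything named here (TransactionDates, ContainsDateInMonth, the parameters) is hypothetical. The lookarounds (?<!\S) and (?!\S) are an addition that is not in the answer above: they assume the date must be surrounded by whitespace or sit at the start or end of the line.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical variant of the method described above.
public static class TransactionDates
{
    // Returns true if the line contains MM-DD for the given month
    // with a day number that exists in that month of that year.
    public static bool ContainsDateInMonth(string line, int year, int month)
    {
        int daysInMonth = DateTime.DaysInMonth(year, month);

        // Zero-padded month, a dash, then a two-digit candidate day (00 to 39).
        // (?<!\S) and (?!\S) reject dates glued to other non-space characters.
        string pattern = string.Format(
            @"(?<!\S){0:00}-(?<DAY>[0-3][0-9])(?!\S)", month);

        foreach (Match match in Regex.Matches(line, pattern))
        {
            int day = int.Parse(match.Groups["DAY"].Value);
            if (day >= 1 && day <= daysInMonth)
            {
                return true;
            }
        }
        return false;
    }
}
```

For example, ContainsDateInMonth(" 12-01 ", 2023, 12) is true, while " 12-99 " and " 12-32 " are rejected for the same month.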
One thing to note is that \s matches any white space character, not just a space:
\s match any white space character [\r\n\t\f ]
However, a regex that literally looks for a space, one like this: " (12-\d{2}) " (with a space on either side of the group), would match only an actual space. That said, I've got to go with the rest of the community a bit on what to do with the matches. You're going to need to go through every match and validate the date with a better approach:
// Requires: using System.Collections.Generic; using System.Globalization;
var input = string.Format(
    " 11-20 2690 E 28.76 12-02 2468 E* 387.85{0}11-15 3610 E 29.34 12-87 2534 E",
    Environment.NewLine);
var pattern = string.Format(@" ({0}-\d{{2}}) ", DateTime.Now.ToString("MM"));
var lines = new List<string>();
foreach (var line in input.Split(new string[] { Environment.NewLine },
    StringSplitOptions.RemoveEmptyEntries))
{
    // Only the first candidate date on each line is examined.
    var m = Regex.Match(line, pattern);
    if (!m.Success)
    {
        continue;
    }
    DateTime dt;
    if (!DateTime.TryParseExact(m.Value.Trim(),
        "MM-dd",
        null,
        DateTimeStyles.None,
        out dt))
    {
        continue;
    }
    lines.Add(line);
}
The reason I went through the lines one at a time is because presumably you need to know what line is good and what line is bad. My logic may not exactly match what you need but you can easily modify it.

Get sub-strings from a string that are enclosed using some specified character

Suppose I have a string
Likes (20)
I want to fetch the sub-string enclosed in round brackets (in the above case it's 20) from this string. This sub-string can change dynamically at runtime. It might be any other number from 0 upwards. To achieve this, my idea is to use a for loop that traverses the whole string: when a ( is encountered, it starts adding the characters to another character array, and when a ) is encountered, it stops adding characters and returns the array. But I think this might have poor performance. I know very little about regular expressions, so is there a regular expression solution or any function that can do this in an efficient way?
If you don't fancy using regex you could use Split:
string foo = "Likes (20)";
string[] arr = foo.Split(new char[]{ '(', ')' }, StringSplitOptions.None);
string count = arr[1];
Count = 20
This will work fine regardless of the number in the brackets ()
e.g:
Likes (242535345)
Will give:
242535345
Works also with pure string methods:
string result = "Likes (20)";
int index = result.IndexOf('(');
if (index >= 0)
{
result = result.Substring(index + 1); // take part behind (
index = result.IndexOf(')');
if (index >= 0)
result = result.Remove(index); // remove part from )
}
For strict matching, you can do:
Regex reg = new Regex(@"^Likes \((\d+)\)$");
Match m = reg.Match(yourstring);
This way you'll have all you need in m.Groups[1].Value.
As suggested by I4V, assuming you have only that sequence of digits in the whole string, as in your example, you can use the simpler version:
var res = Regex.Match(str, @"\d+");
and in this case, you can get the value you are looking for with res.Value
EDIT
In case the value enclosed in brackets is not just numbers, you can replace \d with something like [\w\d\s] if you want to allow alphabetic characters, digits and spaces there.
Even with Linq:
var s = "Likes (20)";
var s1 = new string(s.SkipWhile(x => x != '(').Skip(1).TakeWhile(x => x != ')').ToArray());
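const string likes = "Likes (20)";
int open = likes.IndexOf('(');
int close = likes.IndexOf(')');
// The length is the number of characters strictly between the two brackets.
int likesCount = int.Parse(likes.Substring(open + 1, close - open - 1));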
Matching when the part in parentheses is supposed to be a number:
string inputstring = "Likes (20)";
Regex reg = new Regex(@"\((\d+)\)");
string num = reg.Match(inputstring).Groups[1].Value;
Explanation:
By definition a regexp matches a substring, so unless you indicate otherwise the text you are looking for can occur anywhere in your string.
\d stands for a digit. It will match any single digit.
We want it to potentially be repeated several times, and we want at least one. The + sign is regexp for "the previous symbol or group repeated 1 or more times".
So \d+ will match one or more digits. It will match 20.
To ensure that we get the number that is in parentheses, we say that it should be between ( and ). These are special characters in regexp, so we need to escape them.
\(\d+\) would match (20), and we are almost there.
Since we want the part inside the parentheses, not including the parentheses themselves, we tell regexp that the digits part is a single group.
We do that by using parentheses in our regexp. \((\d+)\) will still match (20), but now it will note that 20 is a subgroup of this match, and we can fetch it via Match.Groups[].
For any string in parentheses, things get a little bit harder.
Regex reg = new Regex(@"\((.+)\)");
would work for many strings (the dot matches any character). But if the input is something like "This is an example(parantesis1)(parantesis2)", you would match (parantesis1)(parantesis2) with parantesis1)(parantesis2 as the captured subgroup. This is unlikely to be what you are after.
The solution can be to match "any character except a closing parenthesis":
Regex reg = new Regex(@"\(([^)]+)\)");
This will find (parantesis1) as the first match, with parantesis1 as .Groups[1].
It will still fail for nested parentheses, but since regular expressions are not the correct tool for nested parentheses, I feel that this case is a bit out of scope.
If you know that the string always starts with "Likes " before the group then Saves solution is better.
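As a rough sketch of the "any character except a closing parenthesis" idea, the helper below collects the contents of every non-nested parenthesized group in a string. ParenExtractor and ExtractAll are invented names, and the pattern excludes ")" inside the group.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper; it does not handle nested parentheses.
public static class ParenExtractor
{
    // Returns the text between each "(" and the next ")".
    public static List<string> ExtractAll(string input)
    {
        var results = new List<string>();
        foreach (Match m in Regex.Matches(input, @"\(([^)]+)\)"))
        {
            // Group 1 is the part inside the parentheses.
            results.Add(m.Groups[1].Value);
        }
        return results;
    }
}
```

ParenExtractor.ExtractAll("Likes (20)") gives a single item, 20, and the two-group example string above gives parantesis1 and parantesis2 as separate items.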

.NET Regex - "Not" Match

I have a regular expression:
12345678|[0]{8}|[1]{8}|[2]{8}|[3]{8}|[4]{8}|[5]{8}|[6]{8}|[7]{8}|[8]{8}|[9]{8}
which matches if the string contains 12345678 or 00000000 or 11111111 or 22222222 ... or 99999999.
How can I change this to only match if NOT the above? (I am not able to just use !IsMatch in the C#, unfortunately.) EDIT: that is because it is black box code to me and I am trying to set the regex in an existing config file.
This will match everything...
foundMatch = Regex.IsMatch(SubjectString, @"^(?:(?!123456789|(\d)\1{7}).)*$");
unless one of the "forbidden" sequences is found in the string.
Not using !isMatch as you can see.
Edit:
Adding your second constraint can be done with a lookahead assertion:
foundMatch = Regex.IsMatch(SubjectString, @"^(?=\d{9,12})(?:(?!123456789|(\d)\1{7}).)*$");
Works perfectly
string s = "55555555";
Regex regx = new Regex(@"^(?:12345678|(\d)\1{7})$");
if (!regx.IsMatch(s)) {
    Console.WriteLine("It does not match!!!");
}
else {
    Console.WriteLine("it matched");
}
Console.ReadLine();
Btw. I simplified your expression a bit and added anchors
^(?:12345678|(\d)\1{7})$
The (\d)\1{7} part takes a digit \d and the \1 checks if this digit is repeated 7 more times.
Update
This regex is doing what you want
Regex regx = new Regex(@"^(?!(?:12345678|(\d)\1{7})$).*$");
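A small sketch of how the negative-lookahead version behaves, wrapped in a hypothetical ForbiddenValues class. It assumes the goal is to reject a value only when the whole string is one of the forbidden sequences, which is what the $ inside the lookahead expresses.

```csharp
using System.Text.RegularExpressions;

// Hypothetical wrapper around the negative-lookahead pattern.
public static class ForbiddenValues
{
    // The lookahead fails at the start of the string if the entire input is
    // 12345678 or one digit repeated eight times; otherwise .* accepts it.
    private static readonly Regex Allowed =
        new Regex(@"^(?!(?:12345678|(\d)\1{7})$).*$");

    public static bool IsAllowed(string value)
    {
        return Allowed.IsMatch(value);
    }
}
```

ForbiddenValues.IsAllowed returns false for "55555555" and "12345678", and true for "12345670" or for nine fives in a row, since those are not exactly one of the forbidden values.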
First of all, you don't need any of those [] brackets; you can just do 0{8}|1{8}| etc.
Now for your problem. Try using a negative lookahead:
@"^(?:(?!123456789|(\d)\1{7}).)*$"
That should take care of your issue without using !IsMatch.
I am not able to just !IsMatch in the C# unfortunately.
Why not? What's wrong with the following solution?
bool notMatch = !Regex.IsMatch(yourString, "^(12345678|[0]{8}|[1]{8}|[2]{8}|[3]{8}|[4]{8}|[5]{8}|[6]{8}|[7]{8}|[8]{8}|[9]{8})$");
notMatch will be true for any string that is not exactly 12345678, 00000000, 11111111, ..., or 99999999.
