How to use regex in order to catch a power statement - c#

How can I use regex to catch a power statement? Here are some examples:
2^4
(2*5)^x
y^(y+1)
or more complex ones such as x^4+(x*2)^(x+1), in which case there are 2 matches ("x^4" and "(x*2)^(x+1)").
I managed to get it working without the parentheses using the expression:
Regex rPower = new Regex(@"\w\^\w");
But to deal with the possible existence of parentheses I was thinking of something along these lines, but it still isn't working...
Regex rPower = new Regex(@"(?(?=\()(.*?(?=\)))|(\w))\^(?(?=\()(.*?(?=\)))|(\w))");
Any help/explanation that includes the thought process behind it would be deeply appreciated, since I don't know much about regex and I'm just now starting to learn it.
Thanks in advance
EDIT: For clarity what I intend to do is:
If the string contains a substring that starts with a "(", it should read everything from that "(" until it finds a ")"; otherwise assume it is a "\w". This is followed by a "^", which in turn is followed by another pattern just like the first one.
Basically it will match the expression "(random_Expression)^(random_Expression)", but it may not actually be a complex expression; if it does not contain any parentheses I will assume it's a simple "\w".
I hope I made myself clear :S

You may use this:
(\([^)]*\)|\w)\^(\([^)]*\)|\w)
Sample matches:
2^2 matches 2^2
a+b^c matches b^c
(a+b)^(c+d) matches (a+b)^(c+d)
2^(a+b) matches 2^(a+b)
(a+b)^2 matches (a+b)^2
(a+b)^2+5^2-(3+2)^(2+3) matches (a+b)^2, 5^2, (3+2)^(2+3)
Obviously, you may find bugs in the expression if stuff like nested operations is used. If you are going to work with complex expressions, I guess you will have to parse them carefully with a more elaborate method.
Could you please edit or reply with an explanation, even if brief, of how the expression is working?
It is similar to your original expression \w\^\w, but it replaces each \w with (\([^)]*\)|\w). If you look closely, that matches either "something inside parentheses" (given by \([^)]*\), which doesn't work for nested brackets) or "a simple word character" (\w).
Hope that helps a bit :)
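For reference, here is a minimal C# sketch of how the pattern above could be applied with Regex.Matches. The names PowerFinder and FindPowers are made up for illustration, and nested parentheses are still not handled, as already noted.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PowerFinder
{
    // One operand: either "(...)" without nested parentheses, or a single word character.
    private const string Operand = @"(\([^)]*\)|\w)";

    // operand ^ operand
    private static readonly Regex Power = new Regex(Operand + @"\^" + Operand);

    // Returns every "base^exponent" substring found in the expression.
    public static List<string> FindPowers(string expression)
    {
        var result = new List<string>();
        foreach (Match m in Power.Matches(expression))
        {
            result.Add(m.Value);
        }
        return result;
    }

    public static void Main()
    {
        // Prints "x^4" and then "(x*2)^(x+1)".
        foreach (string power in FindPowers("x^4+(x*2)^(x+1)"))
        {
            Console.WriteLine(power);
        }
    }
}
```

Groups[1] and Groups[2] of each Match hold the base and the exponent, if you need them separately.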

Related

Reusable Non-Capture Groups [duplicate]

I can't seem to find an answer to this problem, and I'm wondering if one exists. Simplified example:
Consider a string "nnnn", where I want to find all matches of "nn", including those that overlap with each other. So the regex would provide the following 3 matches (marked with brackets):
[nn]nn
n[nn]n
nn[nn]
I realize this is not exactly what regexes are meant for, but walking the string and parsing this manually seems like an awful lot of code, considering that in reality the matches would have to be done using a pattern, not a literal string.
Update 2016:
To get nn, nn, nn, SDJMcHattie proposes in the comments (?=(nn)) (see regex101).
(?=(nn))
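As a rough C# illustration of the lookahead idea: the overall match is empty, so the engine moves on one character at a time, and the text of interest sits in group 1. OverlapDemo and FindOverlapping are illustrative names, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class OverlapDemo
{
    // Wraps the given pattern in (?=( ... )) so that every position where it
    // matches is reported, even if the occurrences overlap.
    public static List<(int Index, string Value)> FindOverlapping(string input, string inner)
    {
        var regex = new Regex("(?=(" + inner + "))");
        var hits = new List<(int Index, string Value)>();
        foreach (Match m in regex.Matches(input))
        {
            // The match itself is zero-width; group 1 carries the captured text.
            hits.Add((m.Groups[1].Index, m.Groups[1].Value));
        }
        return hits;
    }

    public static void Main()
    {
        // Prints "0: nn", "1: nn" and "2: nn".
        foreach (var hit in FindOverlapping("nnnn", "nn"))
        {
            Console.WriteLine($"{hit.Index}: {hit.Value}");
        }
    }
}
```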
Original answer (2008)
A possible solution could be to use a positive look behind:
(?<=n)n
It would give you the end position of each match (marked with brackets):
n[n]nn
nn[n]n
nnn[n]
As mentioned by Timothy Khouri, a positive lookahead is more intuitive (see example)
Rather than his proposal (?=nn)n, I would prefer the simpler form:
(n)(?=(n))
That would reference the first position of the strings you want and would capture the second n in group(2).
That is so because:
Any valid regular expression can be used inside the lookahead.
If it contains capturing parentheses, the backreferences will be saved.
So group(1) and group(2) will capture whatever 'n' represents (even if it is a complicated regex).
Using a lookahead with a capturing group works, at the expense of making your regex slower and more complicated. An alternative solution is to tell the Regex.Match() method where the next match attempt should begin. Try this:
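Regex regexObj = new Regex("nn");
Match matchObj = regexObj.Match(subjectString);
while (matchObj.Success) {
    // Process matchObj here, then restart the search one character
    // after the start of the current match so overlaps are found too.
    matchObj = regexObj.Match(subjectString, matchObj.Index + 1);
}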
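A self-contained sketch of the same loop, collecting the start index of each match. RestartDemo and FindAllStartIndexes are hypothetical names; the only library call relied on is the Regex.Match(string, int) overload already used above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RestartDemo
{
    // Collects the start index of every match, including overlapping ones,
    // by restarting the search one character after the previous match start.
    public static List<int> FindAllStartIndexes(string input, string pattern)
    {
        var regex = new Regex(pattern);
        var indexes = new List<int>();
        Match m = regex.Match(input);
        while (m.Success)
        {
            indexes.Add(m.Index);
            // Guard: an empty match at the very end would push the
            // restart position past the end of the string.
            if (m.Index + 1 > input.Length)
            {
                break;
            }
            m = regex.Match(input, m.Index + 1);
        }
        return indexes;
    }

    public static void Main()
    {
        // Prints "0, 1, 2".
        Console.WriteLine(string.Join(", ", FindAllStartIndexes("nnnn", "nn")));
    }
}
```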
AFAIK, there is no pure regex way to do that at once (ie. returning the three captures you request without loop).
Now, you can find a pattern once, and loop on the search starting with an offset (found position + 1). You have to combine regex use with simple code.
[EDIT] Great, I am downvoted when I basically said what Jan showed...
[EDIT 2] To be clear: Jan's answer is better. Not more precise, but certainly more detailed, it deserves to be chosen. I just don't understand why mine is downvoted, since I still see nothing incorrect in it. Not a big deal, just annoying.

Regular expression to match specific filename pattern containing underscores

I'm trying to create a regular expression that would match files of this pattern:
Id_Name_processID_timestamp_logName.txt
Example of filename: abcd_Service_11234_15112013_Log.txt
I don't need perfect matching; something that would match anything_anything_anything_anything_anything.txt would work for me.
I haven't tried anything, I just lost time staring at this Regex Tutorial for quite a long time. I don't know where to start :(.
Go to this site: http://regexpal.com/
Put abcd_Service_11234_15112013_Log.txt in the lower box.
Start writing your regex in the top box, until it matches (it's a simple one, really: chars, underscore, rinse and repeat) ... You'll be ok ...
My regex, a short simple one.
^\w+_\w+\.txt$
Edit:
I do agree with the 1st answer: you really need to try something on your own, but that website must be the least user-friendly page on regex. You get my answer out of sympathy ;)
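If you want the five underscore-separated fields to be checked explicitly, a slightly stricter variant could look like the sketch below. Note that \w also matches the underscore, so \w+ cannot count fields; [^_]+ is used here instead. The class name LogFileName is invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LogFileName
{
    // Four "field_" groups, one last field, then a literal ".txt".
    // [^_]+ means "one or more characters that are not an underscore".
    private static readonly Regex Pattern =
        new Regex(@"^(?:[^_]+_){4}[^_]+\.txt$", RegexOptions.IgnoreCase);

    public static bool IsMatch(string fileName)
    {
        return Pattern.IsMatch(fileName);
    }

    public static void Main()
    {
        // Prints "True" and then "False".
        Console.WriteLine(IsMatch("abcd_Service_11234_15112013_Log.txt"));
        Console.WriteLine(IsMatch("abcd_Service.txt"));
    }
}
```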

Regex to validate domain name with port

I am a new developer and don't have much exposure to regular expressions. Today I was assigned to fix a bug using regex, but after lots of effort I am unable to find the error.
Here is my requirement.
My code is:
string regex = "^([A-Za-z0-9\\-]+|[A-Za-z0-9]{1,3}\\.[A-Za-z0-9]{1,3}\\.[A-Za-z0-9] {1,3}\\.[A-Za-z0-9]{1,3}):([0-9]{1,5}|\\*)$";
Regex _hostEndPointRegex = new Regex(regex);
bool isTrue = _hostEndPointRegex.IsMatch(textBox1.Text);
It's throwing an error for the domain name like "nikhil-dev.in.abc.ni:8080".
I am not sure where the problem is.
Your regex is a bit redundant, in that you "or" in some stuff that is already covered by the other "or" block.
I just simplified what you had to
(?:[A-Za-z0-9-]+\.)+[A-Za-z0-9]{1,3}:\d{1,5}
and it works just fine...
I'm not sure why you had \ in the allowed characters as I am pretty sure \ is not allowed in a host name.
Your problem is that your or | breaks things up like this...
[A-Za-z0-9\\-]+
or
[A-Za-z0-9]{1,3}\\.[A-Za-z0-9]{1,3}\\.[A-Za-z0-9]{1,3}\\.[A-Za-z0-9]{1,3}
or
\*
Which, as the commenter said, was not including "-" in the 2nd block.
So perhaps you intended
^((?:[A-Za-z0-9\\-]+|[A-Za-z0-9]{1,3})\.[A-Za-z0-9]{1,3}\.[A-Za-z0-9]{1,3}\.[A-Za-z0-9]{1,3}):([0-9]{1,5}|\*)$
However, the first two or'ed items would be redundant, as + includes {1,3}.
ie. [A-Za-z0-9\-]+ would also match anything that this matches [A-Za-z0-9]{1,3}
You can use this tool to help test your Regex:
http://regexpal.com/
Personally I think every developer should have RegexBuddy.
The regex above, although it works, will allow invalid host names.
It should be modified to not allow punctuation in the first character.
So it should be modified to look like this:
(?:[A-Za-z0-9][A-Za-z0-9-]+\.)(?:[A-Za-z0-9-]+\.)+[A-Za-z0-9]{1,3}:\d{1,5}
Also in theory the host isn't allowed to end in a hyphen.
It is all so complicated that I would use the regex only to capture the parts and then use Uri.CheckHostName to actually check that the host name is valid.
Or you can just use the regex suggested by CodeCaster
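A possible sketch of the "capture with regex, validate with Uri.CheckHostName" idea. The helper name HostEndpoint.TryParse is made up; the upper bound of 65535 for the port and the omission of the "*" port from the question are assumptions of this sketch, not something stated in the answers above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HostEndpoint
{
    // Group 1: dotted host name, group 2: numeric port.
    private static readonly Regex Pattern =
        new Regex(@"^((?:[A-Za-z0-9-]+\.)+[A-Za-z0-9]{1,3}):([0-9]{1,5})$");

    public static bool TryParse(string text, out string host, out int port)
    {
        host = null;
        port = 0;

        Match m = Pattern.Match(text);
        if (!m.Success)
        {
            return false;
        }

        // Let the framework decide whether the captured host is acceptable.
        string candidate = m.Groups[1].Value;
        if (Uri.CheckHostName(candidate) == UriHostNameType.Unknown)
        {
            return false;
        }

        // At most five ASCII digits, so int.Parse cannot overflow.
        int parsed = int.Parse(m.Groups[2].Value);
        if (parsed > 65535) // assumed limit: highest valid TCP/UDP port
        {
            return false;
        }

        host = candidate;
        port = parsed;
        return true;
    }

    public static void Main()
    {
        string host;
        int port;
        bool ok = TryParse("nikhil-dev.in.abc.ni:8080", out host, out port);
        Console.WriteLine(ok + " " + host + " " + port);
    }
}
```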

How to Find All Matches in Regular Expressions when one Overlaps OR Contains the Other?

The question of how to find every match when they might overlap was asked in Overlapping matches in Regex. However, as far as I can see, the answers there do not cover a more general case.
How can we find all substrings that begin with "a" and end with "z"? For example, given "akzzaz", it should find "akz", "akzz", "az" and "akzzaz".
Since there may be more than one match starting at the same position, ("akz" and "akzz") and also there may be more than one match ending at the same position ("az" and "akzzaz") I cannot see how using a lookahead or lookbehind helps as in the mentioned link. (Also, please bear in mind that in the general case "a" and "z" might be more complex regular expressions)
I use C#, so, in case it matters, having any feature specific to .Net Regular Expressions is OK.
Regular expressions are designed to find one match at a time. Even a global match operation is simply repeated applications of the same regex, each starting at the end of the previous match in the target string. So no, regexes are not able to find all matches in this way.
I will stick my neck out and say that I don't believe you can even find "all strings beginning with 'a' in 'akzzaz'" with a regex. /(a.*)/g will find the entire string, while /(a.*?)/g will find just 'a' twice.
The way I would code this would be to locate all 'a's, and search each of the substrings from there to the end of the string for all 'z's. So search 'akzzaz' and 'az' for 'z', giving 'akz', 'akzz', 'akzzaz', and 'az'. That is a fairly simple thing to do, but not a job for a regex unless the actual 'a' and 'z' tokens are complex.
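The approach just described could be sketched in C# roughly as follows, with the start and end tokens given as regex patterns so that they may be more complex than a single letter. SpanFinder and FindAllSpans are hypothetical names, and Regex.Matches only reports non-overlapping occurrences of each token, which is an assumption that may not suit every pair of tokens.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SpanFinder
{
    // Pairs every occurrence of the start token with every occurrence of the
    // end token that begins after it, and returns the spanned substrings.
    public static List<string> FindAllSpans(string input, string startPattern, string endPattern)
    {
        MatchCollection starts = Regex.Matches(input, startPattern);
        MatchCollection ends = Regex.Matches(input, endPattern);
        var spans = new List<string>();

        foreach (Match s in starts)
        {
            foreach (Match e in ends)
            {
                // The end token must begin at or after the point where the start token finishes.
                if (e.Index >= s.Index + s.Length)
                {
                    spans.Add(input.Substring(s.Index, e.Index + e.Length - s.Index));
                }
            }
        }
        return spans;
    }

    public static void Main()
    {
        // Prints "akz, akzz, akzzaz, az".
        Console.WriteLine(string.Join(", ", FindAllSpans("akzzaz", "a", "z")));
    }
}
```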
For your current problem, string.StartsWith and string.EndsWith would do a better job. A regular expression is not necessarily faster in all cases.
Try this regular expression
a[akz]+z - in case a, k and z are the only characters
a[a-z]+z - in case of any alphabet
I think it's worth noting that there is actually a way for a regex to return more than one match at the same time. Although this doesn't answer your question, I think this would be a good place to mention this for others who may run into a similar situation.
The regex below for example would return all the right substrings of a string with a single match and has them in different capturing groups:
(?=(\w+)).
This regex uses a capturing group inside a zero-width assertion, and for each match at position i (each character) the capturing group is a substring of length n-i.
Doing anything that would require the regex engine to stay in the same place after a match is probably overkill for a regular expression approach.
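To see what (?=(\w+)). yields in .NET, here is a small sketch. In this form each suffix arrives as its own match, with the suffix stored in group 1 of that match. SuffixDemo and RightSubstrings are illustrative names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SuffixDemo
{
    // At each position the lookahead captures the rest of the word in group 1,
    // then "." consumes a single character so the next match starts one further on.
    public static List<string> RightSubstrings(string input)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(input, @"(?=(\w+))."))
        {
            result.Add(m.Groups[1].Value);
        }
        return result;
    }

    public static void Main()
    {
        // Prints "abcd, bcd, cd, d".
        Console.WriteLine(string.Join(", ", RightSubstrings("abcd")));
    }
}
```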

Regular Expression to reject special characters other than commas

I am working in asp.net. I am using Regular Expression Validator
Could you please help me in creating a regular expression for not allowing special characters other than comma. Comma has to be allowed.
I checked in regexlib; however, I could not find a match. I tried with ^(a-z|A-Z|0-9)*[^#$%^&*()']*$ . When I add other characters as invalid, it does not work.
Also could you please suggest me a place where I can find a good resource of regular expressions? regexlib seems to be big; but any other place which lists very limited but most used examples?
Also, can I create expressions using C# code? Any articles for that?
[\w\s,]+
works fine, as you can see below.
RegExr is a great place to test your regular expressions with real time results, it also comes with a very complete list of common expressions.
[] character class
\w matches any word character (alphanumeric & underscore)
\s matches any whitespace character (spaces, tabs, line breaks)
, includes the comma
+ is a greedy match, which will match the previous element 1 or more times
[\d\w\s,]*
Just a guess
To answer on any articles, I got started here, find it to be an excellent resource:
http://www.regular-expressions.info/
For your current problem, try something like this:
[\w\s,]*
Here's a breakdown:
Match a single character present in the list below «[\w\s,]*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
A word character (letters, digits, etc.) «\w»
A whitespace character (spaces, tabs, line breaks, etc.) «\s»
The character “,” «,»
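In plain C#, outside of a validator control, the breakdown above could be used like the sketch below. The ^ and $ anchors are added here so that the whole input has to consist of allowed characters; CommaTextValidator and IsValid are names invented for the example. Keep in mind that \w in .NET also accepts the underscore and non-ASCII letters.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CommaTextValidator
{
    // Word characters, whitespace and commas only, from start to end of the input.
    private static readonly Regex Allowed = new Regex(@"^[\w\s,]*$");

    public static bool IsValid(string text)
    {
        return Allowed.IsMatch(text);
    }

    public static void Main()
    {
        // Prints "True" and then "False".
        Console.WriteLine(IsValid("Smith, John 42"));
        Console.WriteLine(IsValid("Smith #42"));
    }
}
```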
For a single character that is not a comma, [^,] should work perfectly fine.
You can try the [\w\s,] regular expression. This regex will match only word characters (alphanumerics and underscore), whitespace and the comma. If any other character appears within the text, then this won't match.
For your second question regarding regular expression resources, you can go to
http://www.regular-expressions.info/
This website has a lot of tutorials on regex, plus a lot of useful information.
Also, can I create expressions using C# code? Any articles for that?
By this, do you mean that you want to know which classes and methods are used to execute regular expressions? Or do you want a tool that will create regular expressions for you?
You can create expressions with C#, something like this usually does the trick:
using System;
using System.Text.RegularExpressions;

// Letters, digits and commas only; IgnoreCase covers upper-case letters.
Regex regex = new Regex(@"^[a-z0-9,]*$", RegexOptions.IgnoreCase);
Console.Write("Enter Text: ");
string s = Console.ReadLine() ?? "";
Match match = regex.Match(s);
if (match.Success)
{
    Console.WriteLine("True");
}
else
{
    Console.WriteLine("False");
}
Console.ReadLine(); // keep the console window open
You need to import the System.Text.RegularExpressions namespace.
The regular expression above accepts only numbers, letters (both upper and lower case) and the comma.
For a small introduction to regular expressions, I think that the book for MCTS 70-536 can be a big help. I am pretty sure that you can either download it from somewhere or obtain a copy.
I am assuming that you never messed around with regular expressions in C#, hence I provided the code above.
Hope this helps.
Thank you, all..
[\w\s,]* works
Let me go through regular-expressions.info and come back if I need further support.
Let me try the C# code approach and come back if I need further support.
[This forum is awesome. Quality replies so quick..]
Thanks again
(…) is denoting a grouping and not a character set that’s denoted with […]. So try this:
^[a-zA-Z0-9,]*$
This will only allow alphanumeric characters and the comma.
