I have an arithmetic expression
string exp = "((2+3.1)/2)*4.456";
I want to validate it using a regular expression. The expression can only contain integers, floating point numbers, operators and parentheses.
How can I build a regular expression to validate it? Please help, or suggest any other way to validate that string.
Using Perl/PCRE we could verify such simple arithmetic expressions with the help of a pattern structured like:
expr = pnum ( op pnum )*
pnum = num | \( expr \)
where num and op are defined as required. For example:
num = -?+\d++(?:\.\d++)?+
op = [-+*/]
Which would give us the following working expression:
(?x)^ (?&expr) $
(?(DEFINE)
(?<expr> (?&pnum) (?: (?&op) (?&pnum) )*+ )
(?<pnum> (?> (?&num) | \( (?&expr) \) ) )
(?<num> -?+\d++(?:\.\d++)?+ )
(?<op> [-+*/] )
)
But such expressions cannot be used with .NET regex, as it does not support (recursive) subpattern calls such as (?&name).
Instead, the .NET regex library offers its own special feature: balancing groups.
With balancing groups we can rewrite the recursive call used in pnum and use a structure like this instead:
expr = pnum ( op pnum )* (?(p)(?!))
pnum = (?> (?<p> \( )* num (?<-p> \) )* )
What we've done here is allow any number of optional opening and closing parentheses before and after every number, counting the open parentheses with (?<p> \( ), subtracting closing parentheses from that count with (?<-p> \) ), and at the end of the expression making sure that the count of open parentheses is 0 with (?(p)(?!)).
(I believe this is equivalent to the original structure, although I haven't made any formal proof.)
Resulting in the following .NET pattern:
(?x)
^
(?> (?<p> \( )* (?>-?\d+(?:\.\d+)?) (?<-p> \) )* )
(?>(?:
[-+*/]
(?> (?<p> \( )* (?>-?\d+(?:\.\d+)?) (?<-p> \) )* )
)*)
(?(p)(?!))
$
C# Example:
using System;
using System.Text.RegularExpressions;
namespace RegexTest
{
class Program
{
static void Main(string[] args)
{
var expressions = new string[] {
"((2+3.1)/2)*4.456",
"1",
"(2)",
"2+2",
"(1+(2+3))",
"-2*(2+-2)",
"1+(3/(2+7-(4+3)))",
"1-",
"2+2)",
"(2+2",
"(1+(2+3)",
};
var regex = new Regex(@"(?x)
^
(?> (?<p> \( )* (?>-?\d+(?:\.\d+)?) (?<-p> \) )* )
(?>(?:
[-+*/]
(?> (?<p> \( )* (?>-?\d+(?:\.\d+)?) (?<-p> \) )* )
)*)
(?(p)(?!))
$
");
foreach (var expr in expressions)
{
Console.WriteLine("Expression: " + expr);
Console.WriteLine(" Result: " + (regex.IsMatch(expr) ? "Matched" : "Failed"));
}
}
}
}
Output:
Expression: ((2+3.1)/2)*4.456
Result: Matched
Expression: 1
Result: Matched
Expression: (2)
Result: Matched
Expression: 2+2
Result: Matched
Expression: (1+(2+3))
Result: Matched
Expression: -2*(2+-2)
Result: Matched
Expression: 1+(3/(2+7-(4+3)))
Result: Matched
Expression: 1-
Result: Failed
Expression: 2+2)
Result: Failed
Expression: (2+2
Result: Failed
Expression: (1+(2+3)
Result: Failed
You could write a simple lexer in F# using fslex/fsyacc. Here is an example which is very close to your requirement: http://blogs.msdn.com/b/chrsmith/archive/2008/01/18/fslex-sample.aspx
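If pulling in a lexer/parser generator feels like too much, the same grammar (expr = pnum ( op pnum )*, pnum = num | ( expr )) can also be checked with a small hand-written recursive-descent validator. The following is only a sketch under that assumption; the class and method names (ExpressionValidator, IsValid and the Parse* helpers) are made up for illustration and do not come from any library. Like the regex above, it accepts an optional leading minus on numbers and no whitespace.

```csharp
using System;

public static class ExpressionValidator
{
    // Returns true if s is a complete expression according to the grammar.
    public static bool IsValid(string s)
    {
        int pos = 0;
        return ParseExpr(s, ref pos) && pos == s.Length;
    }

    // expr = pnum ( op pnum )*
    private static bool ParseExpr(string s, ref int pos)
    {
        if (!ParsePnum(s, ref pos)) return false;
        while (pos < s.Length && "+-*/".IndexOf(s[pos]) >= 0)
        {
            pos++; // consume the operator
            if (!ParsePnum(s, ref pos)) return false;
        }
        return true;
    }

    // pnum = num | '(' expr ')'
    private static bool ParsePnum(string s, ref int pos)
    {
        if (pos < s.Length && s[pos] == '(')
        {
            pos++; // consume '('
            if (!ParseExpr(s, ref pos)) return false;
            if (pos >= s.Length || s[pos] != ')') return false;
            pos++; // consume ')'
            return true;
        }
        return ParseNum(s, ref pos);
    }

    private static bool IsDigit(char c)
    {
        return c >= '0' && c <= '9';
    }

    // num = '-'? digits ( '.' digits )?
    private static bool ParseNum(string s, ref int pos)
    {
        int i = pos;
        if (i < s.Length && s[i] == '-') i++;
        int firstDigit = i;
        while (i < s.Length && IsDigit(s[i])) i++;
        if (i == firstDigit) return false; // at least one digit is required

        if (i < s.Length && s[i] == '.')
        {
            int j = i + 1;
            while (j < s.Length && IsDigit(s[j])) j++;
            if (j > i + 1) i = j; // accept the fraction only if digits follow the dot
        }
        pos = i;
        return true;
    }
}
```

Calling ExpressionValidator.IsValid("((2+3.1)/2)*4.456") returns true, while inputs such as "1-" or "(1+(2+3)" return false. Because the validator recurses once per nesting level, extremely deep nesting could exhaust the call stack.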
Related
I'm looking for a regular expression that will match only when all curly braces are properly matched. Matching braces can be nested.
Ex.
Matches
Hello {0}{}
Hello to the following {0}: {{Object1}}, {{Object2}}
Test { {1} { {2} { {3} { {4}}}}}
Non-matches
}{Hello {0}
{{}Hello to the following {0}: {{Object1}}, {{Object2}}
Test { {1} { {2} { {3} { {4}{}
In .NET you can use balancing groups to count, which allows you to solve such problems.
For example, to make sure that { and } are balanced, you could use an expression like:
(?x)^
[^{}]*
(?:
(?:
(?'open' \{ ) # open++
[^{}]*
)+
(?:
(?'close-open' \} ) # open--, only if open > 0
[^{}]*
)+
)*
(?(open) (?!) ) # fail if open != 0
$
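To try the pattern from C#, it can be wrapped in a small helper. This is a minimal sketch: the class name BraceBalance and the method IsBalanced are hypothetical, and the pattern is the one above with only its whitespace and comments rearranged.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BraceBalance
{
    private static readonly Regex Balanced = new Regex(@"
        ^
        [^{}]*
        (?:
            (?: (?'open' \{ ) [^{}]* )+         # push one entry on 'open' per opening brace
            (?: (?'close-open' \} ) [^{}]* )+   # pop one entry per closing brace, fails if 'open' is empty
        )*
        (?(open)(?!))                           # fail if an opening brace is still unclosed
        $",
        RegexOptions.IgnorePatternWhitespace);

    // True when every opening brace has a matching closing brace, in order.
    public static bool IsBalanced(string s)
    {
        return Balanced.IsMatch(s);
    }
}
```

With the samples from the question, BraceBalance.IsBalanced("Hello {0}{}") returns true and BraceBalance.IsBalanced("}{Hello {0}") returns false.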
bool BracesMatch( string s )
{
int numOpen = 0, numClosed = 0;
foreach( char c in s.ToCharArray() )
{
if ( c == '{' ) numOpen++;
if ( c == '}' ) numClosed++;
if ( numClosed > numOpen ) return false;
}
return numOpen == numClosed;
}
This might work using .NET balancing groups as well.
# @"^[^{}]*(?:\{(?>[^{}]+|\{(?<Depth>)|\}(?<-Depth>))*(?(Depth)(?!))\}[^{}]*)*[^{}]*$"
^
[^{}]* # Anything (but only if we're not at the start of { or } )
(?:
\{ # Match opening {
(?> # Then either match (possessively):
[^{}]+ # Anything (but only if we're not at the start of { or } )
| # or
\{ # { (and increase the braces counter)
(?<Depth> )
| # or
\} # } (and decrease the braces counter).
(?<-Depth> )
)* # Repeat as needed.
(?(Depth) # Assert that the braces counter is at zero.
(?!) # Fail this part if depth > 0
)
\} # Then match a closing }.
[^{}]* # Anything (but only if we're not at the start of { or } )
)* # Repeat as needed
[^{}]* # Anything (but only if we're not at the start of { or } )
$
I have a line XX,VV,A01,A02,A03,A11,A12,A13,A14,B11,B12,B13,ZZ,DD
I need a regular expression for
If I find A01,A02,A03 or A11,A12,A13,A14 in my line, I have to replace with "AA"
If I find B11,B12,B13 I have to replace with "BB"
I have tried using
if (Regex.IsMatch(Value, "^A0[2-9]")|| Regex.IsMatch(Value, "^A1[0-5]"))
It didn't work. Basically, if I have A02, A03, A04, A05, A06, A07 or A10, A11, A12, ... A15, I have to replace them with "AA".
Description
(?:((?:[AB](?=0[2-9]|1[0-5])))[0-9]{2}(?:(?=,\s*\1),|))*
Replace With: $1$1
This regular expression will do the following:
finds consecutive comma delimited runs of A02 - A15 or B02 - B15
replaces the entire run with either AA or BB
Example
Live Demo
https://regex101.com/r/gN8aP6/1
Sample text
XX,VV,A01,A02,A03,A11,A12,A13,A14,B11,B12,B13,ZZ,DD
Sample Matches
XX,VV,A01,AA,BB,ZZ,DD
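For use from C#, the pattern and the $1$1 replacement can be passed to Regex.Replace. The sketch below assumes the pattern behaves in .NET as it does in the linked demo; the names RunCollapser and Collapse are illustrative only. Note that the pattern can also match the empty string, in which case $1 is empty and the text is left unchanged.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RunCollapser
{
    // Matches a comma-separated run of A02-A15 codes or of B02-B15 codes.
    // Group 1 keeps the letter of the run ("A" or "B").
    private static readonly Regex Run = new Regex(
        @"(?:((?:[AB](?=0[2-9]|1[0-5])))[0-9]{2}(?:(?=,\s*\1),|))*");

    // Replaces each run with its letter doubled ("AA" or "BB").
    public static string Collapse(string line)
    {
        return Run.Replace(line, "$1$1");
    }
}
```

Calling RunCollapser.Collapse on the sample text should give the line shown under Sample Matches.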
Explanation
NODE                     EXPLANATION
----------------------------------------------------------------------
(?:                      group, but do not capture (0 or more times
                         (matching the most amount possible)):
----------------------------------------------------------------------
  (                      group and capture to \1:
----------------------------------------------------------------------
    (?:                  group, but do not capture:
----------------------------------------------------------------------
      [AB]               any character of: 'A', 'B'
----------------------------------------------------------------------
      (?=                look ahead to see if there is:
----------------------------------------------------------------------
        0                '0'
----------------------------------------------------------------------
        [2-9]            any character of: '2' to '9'
----------------------------------------------------------------------
       |                 OR
----------------------------------------------------------------------
        1                '1'
----------------------------------------------------------------------
        [0-5]            any character of: '0' to '5'
----------------------------------------------------------------------
      )                  end of look-ahead
----------------------------------------------------------------------
    )                    end of grouping
----------------------------------------------------------------------
  )                      end of \1
----------------------------------------------------------------------
  [0-9]{2}               any character of: '0' to '9' (2 times)
----------------------------------------------------------------------
  (?:                    group, but do not capture:
----------------------------------------------------------------------
    (?=                  look ahead to see if there is:
----------------------------------------------------------------------
      ,                  ','
----------------------------------------------------------------------
      \s*                whitespace (\n, \r, \t, \f, and " ")
                         (0 or more times (matching the most
                         amount possible))
----------------------------------------------------------------------
      \1                 what was matched by capture \1
----------------------------------------------------------------------
    )                    end of look-ahead
----------------------------------------------------------------------
    ,                    ','
----------------------------------------------------------------------
   |                     OR
----------------------------------------------------------------------
  )                      end of grouping
----------------------------------------------------------------------
)*                       end of grouping
----------------------------------------------------------------------
You have to remove the ^ from your expressions, e.g. use A0[2-9]. The ^ anchors the match to the beginning of the string, and the codes you are looking for are not at the beginning of the line.
Online Demo
.NET Fiddle Demo
using System;
using System.Collections;
using System.Collections.Generic;
using System.Data;
using System.Diagnostics;
using System.Text.RegularExpressions;
public static class Sample1
{
public static void Main()
{
var sampleInput = "XX,VV,A01,A02,A03,A11,A12,A13,A14,B11,B12,B13,ZZ,DD";
var results = Regex.Replace(sampleInput, "A0[2-9]|A1[0-5]", "AA");
Console.WriteLine("Line: {0}", results);
}
}
I have a string such as this
(ed) (Karlsruhe Univ. (TH) (Germany, F.R.))
I need to split it into two such as this
ed
Karlsruhe Univ. (TH) (Germany, F.R.)
Basically, ignoring whitespace and parenthesis within a parenthesis
Is it possible to use a regex to achieve this?
If you can have more parentheses, it's better to use balancing groups:
string text = "(ed) (Karlsruhe Univ. (TH) (Germany, F.R.))";
var charSetOccurences = new Regex(@"\(((?:[^()]|(?<o>\()|(?<-o>\)))+(?(o)(?!)))\)");
var charSetMatches = charSetOccurences.Matches(text);
foreach (Match match in charSetMatches)
{
Console.WriteLine(match.Groups[1].Value);
}
ideone demo
Breakdown:
\(( # First '(' and begin capture
(?:
[^()] # Match all non-parens
|
(?<o> \( ) # Match '(', and capture into 'o'
|
(?<-o> \) ) # Match ')', and delete the 'o' capture
)+
(?(o)(?!)) # Fails if 'o' stack isn't empty
)\) # Close capture and match the final closing ')'
\((.*?)\)\s*\((.*)\)
You will get the two values in the two match groups \1 and \2.
Demo here: http://regex101.com/r/rP5kG2
This is also what you get if you search and replace with the pattern \1\n\2, which seems to be exactly what you need.
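string str = "(ed) (Karlsruhe Univ. (TH) (Germany, F.R.))";
Regex re = new Regex(@"\((.*?)\)\s*\((.*)\)");
Match match = re.Match(str);
Console.WriteLine(match.Groups[1].Value); // ed
Console.WriteLine(match.Groups[2].Value); // Karlsruhe Univ. (TH) (Germany, F.R.)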
In general, no.
You can't describe recursive (arbitrarily nested) patterns with a regular expression in the strict sense, since a finite automaton cannot recognize them.
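Where a regex is not an option, nesting can be handled with a plain depth counter instead. The following sketch is one possible way to do that, not the only one; ParenSplitter and TopLevelGroups are hypothetical names. It returns the text inside each top-level pair of parentheses and ignores a closing parenthesis that has no opening partner.

```csharp
using System;
using System.Collections.Generic;

public static class ParenSplitter
{
    // Returns the contents of every top-level "( ... )" group in text.
    public static List<string> TopLevelGroups(string text)
    {
        var groups = new List<string>();
        int depth = 0;   // current nesting depth
        int start = -1;  // index just after the '(' that opened the current top-level group

        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (c == '(')
            {
                if (depth == 0) start = i + 1;
                depth++;
            }
            else if (c == ')' && depth > 0)
            {
                depth--;
                if (depth == 0) groups.Add(text.Substring(start, i - start));
            }
        }
        return groups;
    }
}
```

For the string from the question, TopLevelGroups returns two items: "ed" and "Karlsruhe Univ. (TH) (Germany, F.R.)".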
I need a regular expression for getting the text between parentheses ( ). I tried the following, but it does not give me the result I need:
Regex.Match(script, @"\((.*?)\)").Value
Example:
add(mul(a,add(b,c)),d) + e - sub(f,g)
Output =>
1) mul(a,add(b,c)),d
2) f,g
.NET regular expressions can match nested constructs like this with balancing groups, even though they have no recursive subpattern calls. See Balancing Group Definitions.
var input = @"add(mul(a,add(b,c)),d) + e - sub(f,g)";
var regex = new Regex(@"
\( # Match (
(
[^()]+ # all chars except ()
| (?<Level>\() # or if ( then Level += 1
| (?<-Level>\)) # or if ) then Level -= 1
)+ # Repeat (to go from inside to outside)
(?(Level)(?!)) # fail if Level is not empty (unbalanced parentheses)
\) # Match )",
RegexOptions.IgnorePatternWhitespace);
foreach (Match c in regex.Matches(input))
{
Console.WriteLine(c.Value.Trim('(', ')'));
}
I have text like this:
This is {name1:value1}{name2:{name3:even dipper {name4:valu4} dipper} some inner text} text
I want to parse out data like that:
Name: name1
Value: value1
Name: name2
Value: {name3:even dipper {name4:valu4} dipper} some inner text
I would then recursively process each value to parse out nested fields.
Can you recommend a RegEx expression to do this?
In C# you can use balancing groups to count and balance the brackets:
{ (?'name' \w+ ) : # start of tag
(?'value' # named capture
(?> # don't backtrack
(?:
[^{}]+ # not brackets
| (?'open' { ) # count opening bracket
| (?'close-open' } ) # subtract closing bracket (matches only if open count > 0)
)*
)
(?(open)(?!)) # make sure open is not > 0
)
} # end of tag
Example:
string re = @"(?x) # enable eXtended mode (comments/spaces ignored)
{ (?'name' \w+ ) : # start of tag
(?'value' # named capture
(?> # don't backtrack
(?:
[^{}]+ # not brackets
| (?'open' { ) # count opening bracket
| (?'close-open' } ) # subtract closing bracket (matches only if open count > 0)
)*
)
(?(open)(?!)) # make sure open is not > 0
)
} # end of tag
";
string str = @"This is {name1:value1}{name2:{name3:even dipper {name4:valu4} dipper} some inner text} text";
foreach (Match m in Regex.Matches(str, re))
{
Console.WriteLine("name: {0}, value: {1}", m.Groups["name"], m.Groups["value"]);
}
Output:
name: name1, value: value1
name: name2, value: {name3:even dipper {name4:valu4} dipper} some inner text
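Since the question mentions processing each value recursively, the same pattern can simply be applied again to every captured value. The sketch below shows one way this might look; NestedFieldParser and Flatten are made-up names, the braces in the pattern are escaped as a precaution, and nested names are joined with a dot purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NestedFieldParser
{
    // Same balancing-group pattern as above, with the literal braces escaped.
    private static readonly Regex Tag = new Regex(@"
        \{ (?'name' \w+ ) :            # start of tag
        (?'value'
            (?>                        # don't backtrack
                (?:
                    [^{}]+             # not brackets
                  | (?'open' \{ )      # count opening bracket
                  | (?'close-open' \} )  # subtract closing bracket
                )*
            )
            (?(open)(?!))              # make sure open is not > 0
        )
        \}                             # end of tag",
        RegexOptions.IgnorePatternWhitespace);

    // Returns "path=value" entries for every tag, descending into nested values.
    public static List<string> Flatten(string text, string prefix = "")
    {
        var result = new List<string>();
        foreach (Match m in Tag.Matches(text))
        {
            string path = prefix + m.Groups["name"].Value;
            string value = m.Groups["value"].Value;
            result.Add(path + "=" + value);
            // Recurse into the value to pick up nested tags.
            result.AddRange(Flatten(value, path + "."));
        }
        return result;
    }
}
```

For the sample string this yields four entries: name1, name2, name2.name3 and name2.name3.name4, the last one with the value valu4.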
If using Perl/PHP/PCRE it's not complicated at all. You can use an expression like:
{(\w+): # start of tag
((?:
[^{}]+ # not a tag
| (?R) # a tag (recurse to match the whole regex)
)*)
} # end of tag