Dynamic logical expression parsing/evaluation in C# or VB?

What is the best way to evaluate an expression like the following:
(A And B) Or (A And C) Or (Not B And C)
or
(A && B) || (A && C) || (!B && C)
At runtime, I was planning on converting the above expressions to the following:
(True And False) Or (True And False) Or (Not False And True)
or
(True && False) || (True && False) || (! False && True)
Conditions:
1) The logical expression is not known until runtime.
2) The number of variables and their values are not known until runtime.
3) Variable values are never null.
I know I could create a simple assembly with a class and a method that I generate at runtime based on the inputs, but is there a better way?
I have done this before: use a StringBuilder to write the code, then call the compiler. After that, you load the assembly and call the method.
Suggestions?
Thanks.

If you're using .NET 3.5, you can parse the text and build an abstract syntax tree using the Expression classes. Then create a suitable LambdaExpression instance and compile it into a delegate, which you can then execute.
Constructing a parser and syntax tree builder for this kind of fairly simple grammar is quite an interesting exercise, and it will execute somewhat faster than invoking the compiler (and it's neater, in my view).
If you're not using .NET 3.5, it's also not complicated to implement an interpreted abstract syntax tree yourself.
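As a rough sketch of the approach described above (not the answerer's own code), a small recursive-descent parser can turn the C#-style text into an expression tree and compile it to a delegate. The class name BoolExpressionCompiler, the choice of a Dictionary<string, bool> for the variable values, and the tiny grammar (||, &&, !, parentheses, alphanumeric variable names) are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical helper: parses text such as "(A && B) || (!B && C)"
// into an expression tree and compiles it into a delegate.
public static class BoolExpressionCompiler
{
    // Dictionary<string, bool>.get_Item, used to read a variable by name.
    private static readonly MethodInfo Lookup =
        typeof(Dictionary<string, bool>).GetMethod("get_Item");

    public static Func<Dictionary<string, bool>, bool> Compile(string text)
    {
        ParameterExpression vars =
            Expression.Parameter(typeof(Dictionary<string, bool>), "vars");
        int pos = 0;
        Expression body = ParseOr(text, ref pos, vars);
        SkipSpaces(text, ref pos);
        if (pos != text.Length)
            throw new FormatException("Unexpected input at position " + pos);
        return Expression
            .Lambda<Func<Dictionary<string, bool>, bool>>(body, vars)
            .Compile();
    }

    // or := and ( "||" and )*
    private static Expression ParseOr(string s, ref int pos, ParameterExpression vars)
    {
        Expression left = ParseAnd(s, ref pos, vars);
        while (TryConsume(s, ref pos, "||"))
            left = Expression.OrElse(left, ParseAnd(s, ref pos, vars));
        return left;
    }

    // and := unary ( "&&" unary )*
    private static Expression ParseAnd(string s, ref int pos, ParameterExpression vars)
    {
        Expression left = ParseUnary(s, ref pos, vars);
        while (TryConsume(s, ref pos, "&&"))
            left = Expression.AndAlso(left, ParseUnary(s, ref pos, vars));
        return left;
    }

    // unary := "!" unary | "(" or ")" | variable
    private static Expression ParseUnary(string s, ref int pos, ParameterExpression vars)
    {
        if (TryConsume(s, ref pos, "!"))
            return Expression.Not(ParseUnary(s, ref pos, vars));
        if (TryConsume(s, ref pos, "("))
        {
            Expression inner = ParseOr(s, ref pos, vars);
            if (!TryConsume(s, ref pos, ")"))
                throw new FormatException("Expected ')' at position " + pos);
            return inner;
        }
        SkipSpaces(s, ref pos);
        int start = pos;
        while (pos < s.Length && char.IsLetterOrDigit(s[pos])) pos++;
        if (start == pos)
            throw new FormatException("Expected a variable at position " + pos);
        // Emits vars[name], so values are looked up when the delegate runs.
        string name = s.Substring(start, pos - start);
        return Expression.Call(vars, Lookup, Expression.Constant(name));
    }

    private static bool TryConsume(string s, ref int pos, string token)
    {
        SkipSpaces(s, ref pos);
        if (pos + token.Length > s.Length ||
            string.CompareOrdinal(s, pos, token, 0, token.Length) != 0)
            return false;
        pos += token.Length;
        return true;
    }

    private static void SkipSpaces(string s, ref int pos)
    {
        while (pos < s.Length && char.IsWhiteSpace(s[pos])) pos++;
    }
}
```

Calling BoolExpressionCompiler.Compile("(A && B) || (A && C) || (!B && C)") gives a delegate that can be invoked repeatedly with different dictionaries, so the parsing and compilation cost is paid once per expression rather than once per evaluation.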

Be warned: the two final expressions you're talking about are not necessarily equivalent. The && operator in C# uses short-circuit evaluation, while the logical And operator in VB does not. If you want to be sure the statements are equivalent, translate a user And to AndAlso and a user Or to OrElse.
For simple expressions you probably won't notice a difference. But if the conditions can have side effects, or if the performance difference between the two is a concern, this can be important.
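To make the difference concrete, here is a small C# sketch. C# also has a non-short-circuit & operator on bool, which behaves like VB's And, so the two forms can be compared in one language. The ShortCircuitDemo class and its call counter are hypothetical and exist only for this illustration.

```csharp
using System;

public static class ShortCircuitDemo
{
    private static int calls;

    // A condition with a side effect: it counts how often it is evaluated.
    private static bool Check(bool value)
    {
        calls++;
        return value;
    }

    // && stops after the first operand is false (like VB's AndAlso).
    public static int CountWithShortCircuit()
    {
        calls = 0;
        _ = Check(false) && Check(true);
        return calls;
    }

    // & always evaluates both operands (like VB's And).
    public static int CountWithoutShortCircuit()
    {
        calls = 0;
        _ = Check(false) & Check(true);
        return calls;
    }
}
```

Both forms produce the same Boolean result here; only the number of evaluated operands differs, which is why side effects and expensive conditions are where the difference shows.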

You can use https://github.com/mrazekv/logicalparser
It's a simple library for writing logical expressions (evaluated with a precedence table; it allows the OR, NOT and AND operators, >, >=, <=, < on integer variables, and = on string variables).

You can do this easily with:
a parser generator (like ANTLR, mentioned above) that takes boolean expressions as input and produces a postfix list, and
code to evaluate that list with a Reverse Polish Notation stack.
The grammar looks something like this:
program: exprList ;
exprList: expr                { /* expr has already appended its value */ }
        | expr OR exprList    { Append(OR); }
        | expr AND exprList   { Append(AND); }
        | NOT exprList        { Append(NOT); }
        | '(' exprList ')'    { /* Do nothing */ }
        ;
expr: var     { Append($1); }
    | TRUE    { Append(True); }
    | FALSE   { Append(False); }
    ;
To evaluate, you do this:
for each item in list
    if item is symbol or truth value, push onto RPN stack
    else if item is AND, push (pop() AND pop())
    else if item is OR, push (pop() OR pop())
    else if item is NOT, push (NOT pop())
result = pop()
For symbols, you have to substitute the truth value at runtime.
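The evaluation loop above could look roughly like this in C#. This is only a sketch: the RpnEvaluator name, the string tokens "AND", "OR", "NOT", "TRUE" and "FALSE", and the dictionary used to substitute symbol values are assumptions, and the token list is expected to already be in postfix order.

```csharp
using System;
using System.Collections.Generic;

public static class RpnEvaluator
{
    // tokens must be in postfix order, e.g. A B AND C NOT OR.
    public static bool Evaluate(IEnumerable<string> tokens,
                                IDictionary<string, bool> values)
    {
        var stack = new Stack<bool>();
        foreach (string token in tokens)
        {
            switch (token)
            {
                case "AND":
                {
                    bool right = stack.Pop();
                    bool left = stack.Pop();
                    stack.Push(left & right);
                    break;
                }
                case "OR":
                {
                    bool right = stack.Pop();
                    bool left = stack.Pop();
                    stack.Push(left | right);
                    break;
                }
                case "NOT":
                    stack.Push(!stack.Pop());
                    break;
                case "TRUE":
                    stack.Push(true);
                    break;
                case "FALSE":
                    stack.Push(false);
                    break;
                default:
                    // A symbol: substitute its truth value at runtime.
                    stack.Push(values[token]);
                    break;
            }
        }
        if (stack.Count != 1)
            throw new InvalidOperationException("Malformed postfix expression.");
        return stack.Pop();
    }
}
```

For example, (A And B) Or (A And C) Or (Not B And C) becomes the postfix list A B AND A C AND OR B NOT C AND OR. Because both operands are already on the stack when an operator is reached, this style of evaluation never short-circuits.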

You can write a simple interpreter/parser. Use something like ANTLR and reuse existing grammars.

If you are using .NET 3.5, you can create a Lambda Expression. Then you can create a delegate from it and call it like a standard delegate/method.
There are plenty of Lambda Expression samples on the internet.

One solution would be to assemble the expression as a string, and then send it to SQL Server, or whatever your database is, for evaluation. Replace the actual variables with 1=1 or 0=1 for True and False respectively, and you would end up with a query like this:
SELECT 1 WHERE (1=1 And 0=1) Or (1=1 And 1=1) Or (Not 0=1 And 1=1)
Then when you run the query, you get a 1 back when the result is true. May not be the most elegant solution, but it will work. A lot of people will probably advise against this, but I'm just going to throw it out there as a possible solution anyway.

This will not be the best answer, but I myself had this problem some time ago.
Here is my old code:
VB.Net - no warranty at all!
https://cloud.downfight.de/index.php/s/w92i9Qq1Ia216XB
Dim BoolTermParseObjekt As New BoolTermParse
MsgBox(BoolTermParseObjekt.parseTerm("1 und (((0 oder 1 und (0 oder 4))) oder 2)").ToString)
This code takes a String with multiple '(', ')', 'and', 'or' plus 'other things' and breaks the logic down to a Boolean by replacing those things with Boolean values.
Therefore, whatever 'other things' I wanted to evaluate, I had to put in Function resolveTerm(), at the comment "'funktionen ausführen und zurückgeben, einzelwert!" (roughly: "execute functions and return, single value!") on page 2.
The only evaluation there right now is "If number is > 1".
Greetings

Take a look at my library, Proviant. It's a .NET Standard library using the Shunting Yard algorithm to evaluate boolean expressions.
It could also generate a truth-table for your expressions.
You could also implement your own grammar.

Related

Why does the code get compiled when I use !!= C#

I am trying to understand how the code gets compiled when I use (!!=).
Apparently the two snippets below do the same thing.
Why are both permissible?
if (4 !!= 5)
    Console.WriteLine("vvvvvv");
the above does the same thing as:
if (4 != 5)
    Console.WriteLine("vvvvvv");

The expression 4 !!= 5 is parsed as the null-forgiving operator applied to 4, and then != applied to that expression and 5. That is, (4!) != 5.
According to the draft spec, a null forgiving expression is a kind of primary expression:
primary_expression
: ...
| null_forgiving_expression
;
null_forgiving_expression
: primary_expression '!'
;
and that:
The postfix ! operator has no runtime effect - it evaluates to the result of the underlying expression. Its only role is to change the null state of the expression to "not null", and to limit warnings given on its use.
In other words, the ! after 4 does nothing and is very redundant. The constant 4 is never null after all :)
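A minimal sketch of that parse, assuming a C# 8.0 or later compiler (the NullForgivingDemo class and its method names are made up for this example): the first method uses the !!= spelling from the question, the second writes out the grouping explicitly.

```csharp
using System;

public static class NullForgivingDemo
{
    // Lexed as: left, postfix !, then != right.
    public static bool WithDoubleBang(int left, int right)
    {
        return left !!= right;
    }

    // The same expression with the grouping spelled out.
    public static bool Explicit(int left, int right)
    {
        return (left!) != right;
    }
}
```

Both methods return the same result for every input, since the postfix ! has no runtime effect.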

This only works in C# 8.0 and later. See null-forgiving.
I believe the ! after 4 just tells the compiler to treat 4 as not null, so that it does not raise nullable warnings for it.

To complement the other answers: if you don't understand what is happening during compilation, you can use tools such as decompilers, for example the online one at https://sharplab.io/. Among other capabilities, it can show the decompiled IL (not very useful here), the decompiled C# (basically a desugared version of the code, also not very useful here) and the syntax tree, which is useful in this particular case. I used the following code (so it can be compiled in release mode without the constants being optimized out):
using System;

public class C {
    public void M(int? i) {
        if (i !!= 5)
            Console.WriteLine("vvvvvv");
    }
}
If you expand CompilationUnit -> ClassDeclaration -> MethodDeclaration -> Body -> IfStatement -> Condition -> Left, you will see that it is actually a SuppressNullableWarningExpression whose operand is i.
sharplab.io kindly highlights the part of the code represented by the selected syntax node. So, as others described, you can see that the compiler parses your code as 4 followed by the null-forgiving operator.

Parser for query filter expression tree

I am looking for a parser that can operate on a query filter. However, I'm not quite sure of the terminology so it's proving hard work. I hope that someone can help me. I've read about 'Recursive descent parsers' but I wonder if these are for full-blown language parsers rather than the logical expression evaluation that I'm looking for.
Ideally, I am looking for .NET code (C#) but also a similar parser that works in T-SQL.
What I want is for something to parse e.g.:
((a=b)|(e=1))&(c<=d)
Ideally, the operators can be definable (e.g. '<' vs 'lt', '=' vs '==' vs 'eq', etc.) and we can specify function-type labels (e.g. (left(x,1)='e')). The parser loads this, obeys operator precedence (and ideally handles the lack of any brackets) and then calls back to my code with expressions to evaluate to a boolean result (e.g. 'a=b'). I wouldn't expect the parser to understand the custom functions in the expression (though some basic ones would be useful, like string splitting). Splitting the expression (into left- and right-hand parts) would be nice.
It is preferable that the parser asks the minimum number of questions needed to work out the final result: e.g. if one side of an AND is false, there is no point evaluating the other side. It should also evaluate the easiest side first (i.e. in the above expression, 'c<=d' should be assumed to be quicker and thus evaluated first).
I can imagine that this is a lot of work to do, but also a fairly common need. Can anyone give me any pointers? If there aren't parsers that are as flexible as above, are there any basic parsers that I can use as a start?
Many Thanks
Lee

Take a look at this. ANTLR is a good parser generator and the linked-to article has working code which you may be able to adapt to your needs.

You could check out Irony. With it you define your grammar in C# code using a syntax which is not too far from BNF. They even have a simple example on their site (an expression evaluator) which seems to be quite close to what you want to achieve.
Edit: There's been a talk about Irony at this year's Lang.Net symposium.
Hope this helps!

Try Vici.Parser: download it here (free). It's the most flexible expression parser/evaluator I've found so far.

If it's possible for you, use .NET 3.5 expressions.
The compiler parses your expression for you and gives you an expression tree that you can analyze and use as you need. It's not very simple, but it is doable (actually, all implementations of the IQueryable interface do exactly this).

You can use .NET expression trees for this. And the example is actually pretty simple.
Expression<Func<int, int, int, int, bool>> test = (int a, int b, int c, int d) => ((a == b) | (c == 1)) & (c <= d);
And then just look at "test" in the debugger. Everything is already parsed for you, you can just use it.
The only problem is that in .NET 3.5 a Func can have at most 4 arguments, so I changed "e" to "c" in one place. In 4.0 this limit is raised to 16.
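Instead of the debugger, the tree can also be inspected in code. The following is a small sketch using the same lambda; the FilterTreeDemo class and the DescribeRoot method are hypothetical names chosen for this example.

```csharp
using System;
using System.Linq.Expressions;

public static class FilterTreeDemo
{
    // The same filter as above, captured as an expression tree.
    public static readonly Expression<Func<int, int, int, int, bool>> Test =
        (a, b, c, d) => ((a == b) | (c == 1)) & (c <= d);

    // Returns the node type of the root and of its two operands.
    public static string DescribeRoot()
    {
        var root = (BinaryExpression)Test.Body;
        return root.NodeType + "(" + root.Left.NodeType + ", " + root.Right.NodeType + ")";
    }
}
```

DescribeRoot() returns "And(Or, LessThanOrEqual)", which mirrors the structure of the filter, and Test.Compile() turns the same tree into an ordinary delegate. From the root you can walk Left and Right recursively to hand each comparison to your own callback.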

Why does 'Submissions.Where(s => (false && s.Status == Convert.ToInt16("")))' raise a FormatException?

I thought the query was quite trivial, but it's raising a FormatException ("Input string was not in a correct format") nonetheless:
Submissions.Where(s => (false && s.Status == Convert.ToInt16("")))
(of course, in my code, another expression that evaluates to 'false' is located before '&&')
So why is the part after '&&' evaluated, since the first part is always false and the total expression can never evaluate to true?
The situation is particularly strange because only the Convert.ToInt16("") part seems to raise an exception - other parts of my original query of more or less the same structure, like
Submissions.Where(s => (false && s.SubmissionDate <= DateTime.Now))
are evaluated correctly.

As the others have pointed out, LINQ to SQL code gets pulled apart into an expression tree before being run as SQL code against the database. Since SQL does not necessarily follow the same short-circuit boolean rules as C#, the right side of your expression code might get parsed so that the SQL can be constructed.
From MSDN:
C# specifies short circuit semantics based on lexical order of operands for logical operators && and ||. SQL on the other hand is targeted for set-based queries and therefore provides more freedom for the optimizer to decide the order of execution.
As for why you're getting an exception with this code, Convert.ToInt16("") will always throw precisely that exception because there's no way to convert an empty string into an integer. Your other example doesn't attempt an invalid conversion, hence it runs without a problem.

If Submissions is an IQueryable<T>, then this isn't a regular C# delegate, but is an expression tree. Some code (the LINQ provider) has to pull this tree apart and understand it - so if you have oddities in the expressions, then expect odd output.

Well, based on your answer to my question in the comments, since it's LINQ to SQL, it's not actually a delegate. I tried recreating it using LINQ to Objects, and sure enough there was no issue at all; VS actually pointed out "Unreachable code detected". Since in your case it's LINQ to SQL, it's building up an expression tree, which the provider has to decipher in full, and all bets are off.
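The difference between the two cases can be sketched without a database. In the hypothetical UnevaluatedBranchDemo class below (all names are assumptions for illustration, and a captured variable stands in for the "false" part), the compiled delegate short-circuits, while the expression tree still contains the Convert.ToInt16 call for any provider that walks it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

public static class UnevaluatedBranchDemo
{
    // Building the tree does not run Convert.ToInt16; it only records the call.
    public static Expression<Func<short, bool>> BuildPredicate(bool enabled, string text)
    {
        return status => enabled && status == Convert.ToInt16(text);
    }

    // Lists the names of all method calls found anywhere in the tree.
    public static List<string> MethodCallsIn(Expression expression)
    {
        var collector = new CallCollector();
        collector.Visit(expression);
        return collector.Names;
    }

    private sealed class CallCollector : ExpressionVisitor
    {
        public readonly List<string> Names = new List<string>();

        protected override Expression VisitMethodCall(MethodCallExpression node)
        {
            Names.Add(node.Method.Name);
            return base.VisitMethodCall(node);
        }
    }
}
```

BuildPredicate(false, "").Compile() can be invoked safely because the compiled && never reaches the conversion, yet MethodCallsIn on the same predicate still reports ToInt16. A provider that evaluates such sub-expressions up front while translating the tree would hit the FormatException regardless of the left-hand side.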

Suggestion: use a static Int16 to hold the result of Convert.ToInt16(""), then refer to the static in the predicate.
Better still, do you know what the result of Convert.ToInt16("") is? Yes? Then use that instead. For instance, if it's 0, then say s.Status == 0. You could even make that a constant.

In C++ and C# are multiple condition checks performed in a predetermined or random sequence?

Situation: condition check in C++ or C# with many criteria:
if (condition1 && condition2 && condition3)
{
    // Do something
}
I've always believed the sequence in which these checks are performed is not guaranteed. So it is not necessarily first condition1 then condition2 and only then condition3. I learned it in my times with C++. I think I was told that or read it somewhere.
Up until now I've always written secure code to account for possible null pointers in the following situation:
if ((object != null) && (object.SomeFunc() != value))
{
    // A bad way of checking (or so I thought)
}
So I was writing:
if (object != null)
{
    if (object.SomeFunc() != value)
    {
        // A much better and safer way
    }
}
Because I was not sure the not-null check will run first and only then the instance method will be called to perform the second check.
Now our greatest community minds are telling me the sequence in which these checks are performed is guaranteed to run in the left-to-right order.
I'm very surprised. Is it really so for both C++ and C# languages?
Has anybody else heard the version I heard before now?

The short answer is left to right, with short-circuit evaluation. The order is predictable.
// perfectly legal and quite a standard way to express in C++/C#
if( x != null && x.Count > 0 ) ...
Some languages evaluate everything in the condition before branching (VB6 for example).
// will fail in VB6 if x is Nothing.
If Not x Is Nothing And x.Count > 0 Then ...
Ref: MSDN C# Operators and their order of precedence.

They are defined to be evaluated from left-to-right, and to stop evaluating when one of them evaluates to false. That's true in both C++ and C#.

I don't think there is or has been any other way. That would be like the compiler deciding to run statements out of order for no reason. :) Now, some languages (like VB.NET) have different logical operators for short-circuiting and not short-circuiting. But, the order is always well defined at compile time.
Here is the operator precedence from the C# language spec. From the spec:
Except for the assignment operators, all binary operators are left-associative, meaning that operations are performed from left to right. For example, x + y + z is evaluated as (x + y) + z.

They have to be performed from left to right. This allows short circuit evaluation to work.
See the Wikipedia article for more information.

Is this a reasonable use of the ternary operator? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
Are there any understanding / maintainability issues that result from code like
inVar1 == 0 ? NULL : v.push_back(inVar1);
inVar2 == 0 ? NULL : v.push_back(inVar2);
and so forth.
The possibly confusing idea is using the ternary operator for program flow rather than variable assignment, which is the usual explanation.
I haven't seen coding standards at work that address this usage, so while I'm comfortable doing this I'd like to find out if there is a good reason not to.

I think it's confusing and a lot harder to read than simply typing:
if (inVar != 0)
    v.push_back(inVar);
I had to scan your example several times to figure out what the result would be with any certainty. I'd even prefer a single-line if() {} statement to your example, and I hate single-line if statements :)

The ternary operator is meant to return a value.
IMO, it should not mutate state, and the return value should be used.
Otherwise, use if statements. If statements are meant to execute code blocks.

The ternary is a good thing, and I generally promote its usage.
What you're doing here, however, tarnishes its credibility. It's shorter, yes, but it's needlessly complicated.

I think this should be avoided. You could use a 1-line if statement in its place.
if(inVar1 != 0) v.push_back(inVar1);

Compilers these days will make an if as fast as a ternary operator.
Your goal should be making the code easy for another software developer to read.
I vote for
if ( inVar != 0 )
{
    v.push_back( inVar );
}
Why the braces? Because one day you may want to put something else in there, and the braces are already in place. Most editors these days will put them in anyway.

Your use of the ternary operator gains you nothing and hurts the code's readability.
Since the ternary operator returns a value that you are not using it is odd code. The use of an if is much more clear in a case like yours.

As litb mentioned in the comments, this isn't valid C++. GCC, for example, will emit an error on this code:
error: `(&v)->std::vector<_Tp, _Alloc>::push_back [with _Tp = int, _Alloc =
std::allocator<int>](((const int&)((const int*)(&inVar1))))' has type `void'
and is not a throw-expression
However, that can be worked around by casting:
inVar1 == 0 ? (void)0 : v.push_back(inVar1);
inVar2 == 0 ? (void)0 : v.push_back(inVar2);
But at what cost? And for what purpose?
It's not like using the ternary operator here is any more concise than an if-statement in this situation:
inVar1 == 0 ? NULL : v.push_back(inVar1);
if(inVar1 != 0) v.push_back(inVar1);

While, in practice, I agree with the sentiments of those who discourage this type of writing (when reading, you have to do extra work to scan the expression for its side effects), I'd like to offer
!inVar1 ?: v.push_back(inVar1);
!inVar2 ?: v.push_back(inVar2);
...if you're going for obscure, that is. GCC allows x ?: y in place of x ? x : y. :-)

I use the ternary operator when I need to call some function with conditional arguments; in that case it is better than if.
Compare:
printf("%s while executing SQL: %s",
    is_sql_err() ? "Error" : "Warning", sql_msg());
with
if (is_sql_err())
    printf("Error while executing SQL: %s", sql_msg());
else
    printf("Warning while executing SQL: %s", sql_msg());
I find the former more appealing. It also complies with the DRY principle, unlike the latter: you don't need to write two nearly identical lines.

I think you would be better served in doing a proper if structure. I even prefer to always have braces with my if structures, in the event I have to add lines later to the conditional execution.
if (inVar != 0) {
    v.push_back(inVar);
}

I think that ternaries are sometimes a necessary evil in initializer lists for constructors. I use them mostly for constructors where I want to allocate memory and set some pointer to point at it before the body of the constructor.
An example: suppose you had an integer storage class that takes a vector as input, but whose internal representation is an array:
#include <cstddef>
#include <vector>

class foo
{
public:
    foo(std::vector<int> input);
private:
    int* array;
    unsigned int size;
};
foo::foo(std::vector<int> input):size(input.size()), array( (input.size()==0)?
    NULL : new int[input.size()])
{
    //code to copy elements and do other start up goes here
}
This is how I use the ternary operator. I don't think it is as confusing as some people do but I do think that one should limit how much they use it.

Most of the tortured ternaries (how's that for alliteration?) I see are merely attempts at putting logic that really belongs in an if statement in a place where an if statement doesn't belong or can't go.
For instance:
if (inVar1 != 0)
    v.push_back(inVar1);
if (inVar2 != 0)
    v.push_back(inVar2);
works assuming that v.push_back is void, but what if it's returning a value that needs to get passed to another function? In that case, it would have to look something like this:
SomeType st;
if (inVar1 != 0)
    st = v.push_back(inVar1);
else if (inVar2 != 0)
    st = v.push_back(inVar2);
SomeFunc(st);
But that's more to digest for such a simple piece of code. My solution: define another function.
SomeType GetST(V v, int inVar1, int inVar2){
    if (inVar1 != 0)
        return v.push_back(inVar1);
    if (inVar2 != 0)
        return v.push_back(inVar2);
    return SomeType(); // neither value was set
}
//elsewhere
SomeFunc(GetST(v, inVar1, inVar2));
At any rate, the point is this: if you have some logic that's too tortured for a ternary but will clutter up your code if it's put in an if statement, put it somewhere else!

inVar1 == 0 || v.push_back(inVar1);
inVar2 == 0 || v.push_back(inVar2);
This is a common pattern in languages like Perl.

If you have multiple method invocations in one or both of the ternary arguments, then it's wrong. All lines of code, regardless of the statement, should be short and simple, ideally not compounded.

A proper if statement is more readable, as others have mentioned. Also, when you're stepping through your code with a debugger, you won't be able to readily see which branch is taken when everything is on one line or you're using a ternary expression:
if (cond) doIt();
cond ? doIt() : noop();
Whereas the following is much nicer to step through (whether you have the braces or not):
if (cond) {
    doIt();
}

As mentioned, it's not shorter or clearer than a 1 line if statement. However, it's also no longer - and isn't really that hard to grok. If you know the ternary operator, it's pretty obvious what's happening.
After all, I don't think anyone would have a problem if it was being assigned to a variable (even if it was mutating state as well):
var2 = inVar1 == 0 ? NULL : v.push_back(inVar1);
The fact that the ternary operator always returns a value - IMO - is irrelevant. There's certainly no requirement that you use all return values...after all, an assignment returns a value.
That being said, I'd replace it with an if statement if I ran across it with a NULL branch.
But, if it replaced a 3 line if statement:
if (inVar1 == 0) {
    v.doThingOne(1);
} else {
    v.doThingTwo(2);
}
with:
inVar1 == 0 ? v.doThingOne(1) : v.doThingTwo(2);
I might leave it...depending on my mood. ;)
