Maybe monad using expression trees? - c#

Ugly:
string city = null;
if (myOrder != null && myOrder.Customer != null)
city = myOrder.Customer.City;
Better (maybe monad):
var city = myOrder
    .With(x => x.Customer)
    .With(x => x.City);
Even better? Any reason this couldn't be written?
var city = Maybe(() => myOrder.Customer.City);

Yes, it should be possible. However, it's quite a bit more complicated than it looks on the surface to implement an expression tree re-writer correctly. Especially if you want to be able to correctly handle fields, properties, indexed properties, method calls, and other constructs that are valid in arbitrary expressions.
It may also not be the most well-performing operation since to evaluate the expression you have to dynamically compile the expression tree into a lambda function each time.
There's an implementation of this pattern on CodePlex. I've never personally used it, so I can't say how well implemented it is, or whether it handles all of the cases I've described.
An alternative to creating an expression tree re-writer is to write Maybe() to accept a lambda function (rather than an expression tree) and catch any NullReferenceException thrown, returning default(T) in those cases. It rubs many people the wrong way to use exceptions for flow control like this ... but it's certainly an easier implementation to get right. I personally avoid it, since it can mask null reference errors within methods called as part of the expression, which is not desirable.
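The exception-catching alternative described above can be sketched roughly as follows. This is only an illustration: the class name MaybeByException is made up, and it assumes the exception to catch is NullReferenceException, which is what dereferencing a null reference throws.

```csharp
using System;

public static class MaybeByException
{
    // Hypothetical helper: evaluates the accessor and maps a
    // NullReferenceException raised anywhere inside it to default(T).
    public static T Maybe<T>(Func<T> accessor)
    {
        try
        {
            return accessor();
        }
        catch (NullReferenceException)
        {
            // Caveat from the answer: this also hides null-reference bugs
            // inside any method the accessor calls, not just the chain itself.
            return default(T);
        }
    }
}
```

With this in place, MaybeByException.Maybe(() => myOrder.Customer.City) would yield null when myOrder or Customer is null.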

I recently implemented some monads in C# (including a basic expression tree parser inspired by a Bartosz Milewski article).
Take a look if you're interested: https://github.com/htoma/monads/blob/master/expressionMonad/expressionMonad/Program.cs

Some points that come to my mind:
Solutions work fine for in-memory objects, but run into trouble with EF, as these static calls cannot be converted to run against persistent storage (that is, a SQL DB). This limits the scope of application quite heavily.
I will practically always want to know whether the chain produced a valid result. Hence, I will have one conditional block if(city == null) in any case.
Any current solution other than "the ugly" involves Expressions.
Hence, my choice would be something like
Expression<Func<string>> property = () => myOrder.Customer.City;
city = HasValue(property) ? property.Compile()() : "unknown";
HasValue(Expression e) walks through the LINQ expression tree recursively until it either reaches the end (returning true) or encounters a null-valued property (returning false). The implementation should be simple: use the MemberInfo exposed by the Member property of the MemberExpression class to walk the AST. One could also implement a getter this way, as Brian suggested, but I like the above better because HasValue always returns bool. Further:
Member invocations can be handled, too.
Evaluation could be written as myOrder.HasValue(x => x.Customer.City), but that brings some complications.
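A minimal sketch of such a HasValue walker, under some assumptions that are mine rather than the answer's: it handles only field and property chains (no method calls or indexers), it evaluates each link by reflection, and it treats a null final value as "no value" too. The class name NullChain is hypothetical.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

public static class NullChain
{
    // Hypothetical helper: true when every link of a field/property chain,
    // including the final value, is non-null.
    public static bool HasValue<T>(Expression<Func<T>> accessor)
    {
        return Evaluate(accessor.Body) != null;
    }

    // Evaluates constants and field/property accesses by reflection,
    // returning null as soon as any link in the chain is null.
    private static object Evaluate(Expression node)
    {
        if (node is ConstantExpression constant)
            return constant.Value;

        if (node is MemberExpression member)
        {
            object target = null;
            if (member.Expression != null) // null for static members
            {
                target = Evaluate(member.Expression);
                if (target == null)
                    return null; // the chain is broken here
            }

            if (member.Member is FieldInfo field)
                return field.GetValue(target);
            if (member.Member is PropertyInfo property)
                return property.GetValue(target);
        }

        throw new NotSupportedException("Unsupported node: " + node.NodeType);
    }
}
```

Captured locals such as myOrder appear in the tree as fields of a compiler-generated closure object, which is why the constant and field cases are enough to reach them.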

Simpler answer if the objects are cheap to create and you want to avoid null checks:
myOrder.NewIfNull().Customer.NewIfNull().City;
This will return either null or some initial value you set in the constructor or field initializer for City. NewIfNull isn't built-in, but it's real easy:
public static T NewIfNull<T>(this T input) where T:new()
{
return input ?? new T();
}

I know that my implementation of Maybe (as per CodeProject article) carries a cost, but I'm sure it's nothing compared to the idea of getting an Expression<T> involved there. Basically you're talking Reflection all the way. I wouldn't mind if it was pre-compiled, Roslyn-style, but we aren't there yet.
I'd argue that my implementation's advantage goes way beyond the mythical ?. operator. The ability to write a whole algorithm using a chain such as this means that you can inject your own creations (such as If, Do, etc.) and provide your own specialized logic.
I realize this is more complicated than what you're trying to do here, but it doesn't look like we're going to get a null-coalescing dot operator in C#5.
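For readers who have not seen the chaining style discussed here, the following is a rough sketch of what With, If and Do extension methods could look like. It is not taken from the article; the class name MaybeChain and the exact signatures are assumptions made for illustration.

```csharp
using System;

public static class MaybeChain
{
    // Returns null when the input is null; otherwise applies the evaluator.
    public static TResult With<TInput, TResult>(this TInput input, Func<TInput, TResult> evaluator)
        where TInput : class
        where TResult : class
    {
        return input == null ? null : evaluator(input);
    }

    // Continues the chain only when the input is non-null and the predicate holds.
    public static TInput If<TInput>(this TInput input, Func<TInput, bool> predicate)
        where TInput : class
    {
        return input != null && predicate(input) ? input : null;
    }

    // Runs a side effect on a non-null input and passes the input along.
    public static TInput Do<TInput>(this TInput input, Action<TInput> action)
        where TInput : class
    {
        if (input != null)
            action(input);
        return input;
    }
}
```

Each step is an ordinary delegate call, so a chain like myOrder.With(x => x.Customer).With(x => x.City) needs no expression trees or run-time compilation.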

Related

Is LINQ smart enough not to check conditional flag multiple times?

My question is: will LINQ in the following code read the flag value three times when materializing the numbers collection? I am trying to optimize my code. Here I want the Where clause to be evaluated only once if flag == true.
List<int> list = new(){1, 2, 3};
bool flag = true;
bool IsNumberBig(int num)
{
return num > 100;
}
var numbers = list.Where(l => flag || IsNumberBig(l)).ToList();
I failed to find a related question. Would be thankful to see how I could check this myself.
The value of flag will be evaluated every time the lambda is called. Obviously, that's cheaper than evaluating IsNumberBig() (or some more complex method in there), but still not free.
To optimize this, you could write something like
List<int> numbers;
if (flag)
{
numbers = list;
}
else
{
numbers = list.Where(IsNumberBig).ToList();
}
Like this, no iteration is done if flag is true (which in your case would return all elements anyway).
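One way to check this yourself, as asked in the question, is to route the flag and the predicate through local functions that count their calls. The sketch below is an illustration only; FlagReadDemo and the counter names are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FlagReadDemo
{
    // Returns how often the flag and the predicate were evaluated
    // while materializing the filtered list.
    public static (int FlagReads, int PredicateCalls) Run(List<int> list, bool flag)
    {
        int flagReads = 0;
        int predicateCalls = 0;

        bool ReadFlag() { flagReads++; return flag; }
        bool IsNumberBig(int num) { predicateCalls++; return num > 100; }

        // Same shape as the query in the question.
        list.Where(l => ReadFlag() || IsNumberBig(l)).ToList();

        return (flagReads, predicateCalls);
    }
}
```

For the three-element list from the question, Run(list, true) reports three flag reads and zero predicate calls, because || short-circuits; Run(list, false) reports three of each.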
I think it is important to note that LINQ is mostly syntactic sugar. It does not do optimization. The vast majority of optimizations are done by the compiler, or more specifically, the jitter.
One problem when discussing optimizations is that the jitter is allowed to perform any kind of optimization as long as the result is the same. But it also has to do any optimization as fast as possible, so it rarely does everything it would be allowed to do. It also depends on the version of the compiler; the more recent ones have tiered compilation to better optimize frequently used loops.
Because of all this, it can be difficult to guess what the compiler will and will not optimize, and the best approach is to just benchmark the code. Use BenchmarkDotNet with and without the check, and you will get a measured answer. It should also tell you whether the performance difference is anything to worry about.
Even though guessing what the compiler will do is difficult, a few things are worth noting. Most optimizations are done within a method; the compiler will not try to rewrite method signatures. However, small methods tend to be inlined and can therefore be optimized as part of the calling method. So if all your code were inlined, it would very likely remove the flag check. However, one of the things that prevents inlining is indirect calls, like calling a method through an interface or, in this case, calling a delegate. Just about everything in LINQ is delegates and interfaces, and this tends to hamper performance. So in general, use LINQ for convenience, not for performance.
All that said, modern processors have pretty amazing branch predictors, so I would expect the effect of an easy-to-predict branch like that to be fairly small. There are likely other things that have a larger effect on performance.
But the most important thing is to benchmark and/or profile the code instead of just guessing about performance. It is common for people to optimize completely the wrong thing, even experienced developers. If you want to get started, check out "Measure app performance in Visual Studio" and BenchmarkDotNet.
You could do this extension:
It basically wraps @PMF's answer in an extension method, so you can use it like the Where you already know. It just takes an additional condition parameter that switches the application of the predicate on or off. This has the advantage that you can chain it with other LINQ methods just like the plain old Where.
using System.Linq;
using System.Collections.Generic;
public static class LinqEx
{
public static IEnumerable<T> WhereIf<T>(this IEnumerable<T> source, bool condition, Func<T, bool> predicate)
{
return condition
? source.Where(predicate)
: source;
}
}
And then use it like:
List<int> someList = list; // assume it is a List<int> with some items
var shallFilter = true; // or false
var filteredList = someList.WhereIf(shallFilter, l => IsNumberBig(l)).ToList();
See it in Action in this Fiddle.
Mind that this would make more sense in a DB-related (LINQ to SQL) setting, since here it is really a micro-optimization. For it to show an effect, you'd have to have many items and a very costly predicate.
One word on your code, too:
var numbers = list.Where(l => flag || IsNumberBig(l)).ToList();
If flag is true, flag || X will evaluate to true, regardless of X. X will not even be evaluated.
So, you basically implemented the opposite of your requirement.
See also: Conditional logical OR operator ||

C# - Fastest way to do this type of null check?

I have the following code that performs a null or empty check on any type of object:
public static void IfNullOrEmpty(Expression<Func<string>> parameter)
{
Throw.IfNull(parameter);
if (parameter.GetValue().ToString().Length == 0)
{
throw new ArgumentException("Cannot be empty", parameter.GetName());
}
}
It calls the GetValue extension method below:
public static T GetValue<T>(this Expression<Func<T>> parameter)
{
MemberExpression member;
Expression expression;
member = (MemberExpression)parameter.Body;
expression = member.Expression;
return (T)parameter.Compile()();
}
I am passing in an expression containing a string in this method for testing. This method takes on average 2 ms on my machine (even slower on another machine I'm testing on) which adds up if it is called several times throughout the application. It seems like this method is too slow. What is the fastest way to do this type of null check?
Compiling an expression naturally requires quite some work. What I normally do if this code will run often is that I only compile the expressions once and save the compiled delegate for further usage.
It's possible to keep a "normal" cache but for a cache to be efficient you need a good hash function and I don't see how you could make that here. You need to restructure your code a bit so that every place where you use GetValue has a proper access to a compiled delegate instead. Without seeing more code I can't give you any hints about that one.
There can be many reasons why you see the following call being faster. Because of the difficulty of hashing, I don't expect caching to be the reason. More likely you are seeing the work of a modern CPU that does a lot of guessing to run code fast. If you just ran the same expressions, it's possible that the CPU is able to guess more about the next call and can run faster. There is always GC to consider too.
One way to test the guessing idea could be to create a large array with a few different expressions. Do one test where it is ordered by expression and one where it is random. If my suspicion holds true the first one should be faster.
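The "compile once and keep the delegate" advice from the start of this answer could look something like the sketch below. CachedGetter is a hypothetical type, not part of the question's code, and it only helps when the wrapped expression is long-lived (for example a field or a configuration value); a fresh lambda built on every call would still be compiled every time.

```csharp
using System;
using System.Linq.Expressions;

public sealed class CachedGetter<T>
{
    private readonly Func<T> compiled;

    // Compile once, when the getter is constructed.
    public CachedGetter(Expression<Func<T>> expression)
    {
        // Like the question's GetValue, this assumes a plain member access.
        Name = ((MemberExpression)expression.Body).Member.Name;
        compiled = expression.Compile();
    }

    // Name of the accessed member, for use in error messages.
    public string Name { get; }

    // Each read is now an ordinary delegate invocation.
    public T Value => compiled();
}
```

Because the compiled delegate keeps referring to the same captured variable, Value reflects later changes to it without recompiling.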
If I'm reading your code right, the only reason why you need an Expression is so if the check fails, you'll be able to extract the name of the parameter and pass it into the exception that you'll throw, right? If so, that's an awfully steep price to pay for a slightly more convenient error message that (hopefully) only occurs in a very tiny percentage of cases.
The 2ms overhead is slightly higher than I'd expect, but substantial overhead is hard or impossible to avoid here. You're essentially forcing the runtime to traverse an expression tree, translate it into MSIL and then translate and optimize that MSIL again into executable code through the JIT, just to do a != null check which will always succeed unless a developer has made a mistake somewhere.
You could probably come up with some sort of caching mechanism; the Entity Framework caches expressions by traversing the expression tree and building a hash of it and uses that as the key of a dictionary where the compiled expression is stored as a delegate. It'd be substantially cheaper than sending it through the JIT on every call, but it'd still be orders of magnitude more expensive when compared to a simpler != null check which takes nanoseconds or less (especially when you consider modern branch-predicting CPUs).
So in my opinion, this approach is a nice idea, but simply not worth it when you consider the cost and when the alternative is pretty painless (especially with the new nameof operator). It's also fairly brittle, because what if I'm a developer who thinks he can do this:
Throw.IfNullOrEmpty(() => clientId + "something")
Your cast to MemberExpression will fail there.
Likewise, it would be reasonable for someone to think that because you pass in an expression and an expression is only code as data that it would be safe to do this:
Throw.IfNullOrEmpty(() => parent.Child.MightBeNull.MightAlsoBeNull.ClientID)
It's perfectly possible to safely evaluate that expression if you partially traversed the expression tree, but in your example the whole expression is compiled and executed at once and will likely fail with a NullReferenceException there.
I guess it comes down to this: an argument of type Expression<Func<T>> is just not strict enough for a null check. You could do all sorts of weird stuff in it and the compiler would be perfectly happy, but you'd get unexpected results at runtime.
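The nameof-based alternative mentioned above might be sketched like this. The class name Guard and the exact signature are assumptions for illustration; the point is only that the caller passes the value and its name directly, so nothing is compiled at run time.

```csharp
using System;

public static class Guard
{
    // Hypothetical replacement for the expression-based check.
    public static void IfNullOrEmpty(string value, string paramName)
    {
        if (value == null)
            throw new ArgumentNullException(paramName);
        if (value.Length == 0)
            throw new ArgumentException("Cannot be empty", paramName);
    }
}
```

A call site would then read Guard.IfNullOrEmpty(clientId, nameof(clientId)); the name is resolved at compile time and survives renames.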

Inline helper methods in c#

From this answer I've learned that it is possible to strongly suggest inlining in C# as follows:
using System.Runtime.CompilerServices;
[MethodImpl(MethodImplOptions.AggressiveInlining)]
bool MyCondition() { return someObject != null && someObject.Count > 2; }
In a current project we use statemachines as defined by the Appccelerate StateMachine framework, which results in sequences like the following (which in our project are much longer):
fsm.In(States.A)
    .On(Events.B)
        .If(arguments => false).Goto(States.B1)
        .If(() => someVariable && somethingElse == false).Goto(States.B3)
        .If(MyCondition).Goto(States.B2);
In order to simplify these structures I would like to separate the lambda expressions (or Action delegates) into helper methods (i.e. the last statement). Reasons for doing so is that with proper method names it would increase readability of the code, and secondly when autogenerating documentation it will use the method name instead of the non-intuitive [anonymous] text.
The question, however, is whether there is any point in using AggressiveInlining, or whether simple lambda expressions involving up to 4 variables with simple comparison operators will be inlined automatically by the JIT/compiler.
My gut feeling is to inline these methods, as I believe the different parts of the state machine will get a lot of hits, so reducing method calls would be a benefit. But then again, how smart is the JIT/compiler at doing this automatically?
The problem is that you understand lambdas in C# incorrectly. When the compiler translates C# to MSIL, lambdas become classes, so there is nothing to inline. Take a look at the great Marc Gravell post on SO. So, whether or not you define the lambdas in an external class, you'll need to get an object from the heap (I'm simplifying the compiled code's behaviour). And so, I think, there will be no difference in performance for your application.
Also keep in mind that the state machine keeps the delegate in its internal data structure. The guard action cannot be inlined there.
The only possible thing is to inline the methods that the helper method calls.

Extension Methods - IsNull and IsNotNull, good or bad use?

I like readability.
So, I came up with an extension method a few minutes ago for the (x != null) type syntax, called IsNotNull. Inversely, I also created an IsNull extension method, thus
if(x == null) becomes if(x.IsNull())
and
if(x != null) becomes if(x.IsNotNull())
However, I'm worried I might be abusing extension methods. Do you think that this is a bad use of extension methods?
It doesn't seem any more readable and could confuse people reading the code, wondering if there's any logic they're unaware of in those methods.
I have used a PerformIfNotNull(Func method) (as well as an overload that takes an action) which I can pass a quick lambda expression to replace the whole if block, but if you're not doing anything other than checking for null it seems like it's not providing anything useful.
I don't find that incredibly useful, but these are:
someString.IsNullOrBlank() // Tests if it is empty after Trimming, too
someString.SafeTrim() // Avoiding Exception if someString is null
Those methods actually save you from having to do multiple checks, but replacing a single check with a method call seems useless to me.
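The two helpers named above are not shown in the answer; a plausible sketch, assuming IsNullOrBlank means "null, empty, or whitespace only" and SafeTrim returns null for a null input, could be:

```csharp
using System;

public static class StringChecks
{
    // True for null, empty, and whitespace-only strings.
    public static bool IsNullOrBlank(this string value)
    {
        return string.IsNullOrWhiteSpace(value);
    }

    // Trims the string, passing null through instead of throwing.
    public static string SafeTrim(this string value)
    {
        return value == null ? null : value.Trim();
    }
}
```

Extension methods are static calls under the hood, which is why invoking them on a null reference does not throw by itself.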
It is perfectly valid to do but I don't think it is incredibly useful. Since extension methods are simply compiler trickery I struggle to call any use of them "abuse" since they are just fluff anyhow. I only complain about extension methods when they hurt readability.
Instead I'd go with something like:
static class Check {
    public static T NotNull<T>(T instance) {
        // ... assert logic
        return instance;
    }
}
Then use it like this:
Check.NotNull(x).SomeMethod();
y = Check.NotNull(x);
Personally it's much clearer what is going on than to be clever and allow the following:
if( ((Object)null).IsNull() ) ...
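One way the elided "assert logic" in Check.NotNull might be filled in is shown below. Throwing ArgumentNullException and the optional name parameter are assumptions of this sketch, not something the answer specifies.

```csharp
using System;

public static class Check
{
    // Returns the instance unchanged, or throws if it is null.
    // The optional name is only used for the exception message.
    public static T NotNull<T>(T instance, string name = null) where T : class
    {
        if (instance == null)
            throw new ArgumentNullException(name);
        return instance;
    }
}
```

Since the method returns its argument, both Check.NotNull(x).SomeMethod() and y = Check.NotNull(x) work as in the usage shown above.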
I don't entirely agree with the reasoning saying "it may confuse".
To some extent I can see what is meant, that there is no reason to venture outside "common understanding" -- everybody understands object != null.
But in Visual Studio, we have wonderful tools where you can simply hover over the method, to reveal some additional information.
If we were to say that the extension-method was annotated with a good explanation, then I feel that the argument of confusion falls apart.
The methods .IsNotNull() and .IsNull() explain exactly what they are. I feel they are very reasonable and useful.
In all honesty it is a matter of "what you like". If you feel the methods will make it more readable in the context of your project, then go for it. If you are breaking convention in your project, then I would say the opposite.
I have had the same thoughts as you have on the subject and have asked several very very experienced developers at my place of work. And none of them have come up with a good reason (except what has been mentioned about -confusion- here) that would explain why you shouldn't do this.
Go for it :-)
There is precedent, in as much as the string class has IsNullOrEmpty
You're also introducing method call overhead for something that's a CLR intrinsic operation. The JIT might inline it away, but it might not. It's a micro-perf nitpick, to be sure, but I'd agree that it's not particularly useful. I do things like this when there's a significant readability improvement, or if I want some other behavior like "throw an ArgumentNullException and pass the arg name" that's dumb to do inline over and over again.
It can make sense if you, for instance, assume that you might want to throw an exception whenever x is null (just do it in the extension method). However, my personal preference in this particular case is to check explicitly (a null object should be null :-) ).
To follow the pattern it should be a property rather than a method (but of course that doesn't work with extensions).
Data values in the System.Data namespace have an IsNull property that determines whether the value contains a DBNull value.
The DataRow class has an IsNull method, but it doesn't determine whether the DataRow is null; it determines whether one of the fields in the data row contains a DBNull value.

When not to use lambda expressions [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
A lot of questions are being answered on Stack Overflow, with members specifying how to solve these real world/time problems using lambda expressions.
Are we overusing it, and are we considering the performance impact of using lambda expressions?
I found a few articles that explore the performance impact of lambdas vs anonymous delegates vs for/foreach loops, with different results:
Anonymous Delegates vs Lambda Expressions vs Function Calls Performance
Performance of foreach vs. List.ForEach
.NET/C# Loop Performance Test (FOR, FOREACH, LINQ, & Lambda).
DataTable.Select is faster than LINQ
What should be the evaluation criteria when choosing the appropriate solution? Except for the obvious reason that it's more concise code and readable when using lambda.
Even though I will focus on point one, I begin by giving my 2 cents on the whole issue of performance. Unless differences are big or usage is intensive, I usually don't bother about microseconds that, when added up, don't amount to any visible difference to the user. I emphasize that I stop caring only for methods that are not called intensively. Where I do have special performance considerations is in the way I design the application itself. I care about caching, about the use of threads, about clever ways to call methods (whether to make several calls or to try to make only one call), whether to pool connections or not, etc. In fact I usually don't focus on raw performance, but on scalability. I don't care if it runs better by a tiny slice of a nanosecond for a single user, but I care a lot about being able to load the system with large numbers of simultaneous users without noticing the impact.
Having said that, here goes my opinion about point 1. I love anonymous methods. They give me great flexibility and code elegance. The other great feature of anonymous methods is that they allow me to directly use local variables from the containing method (from a C# perspective, not from an IL perspective, of course). They often spare me loads of code. When do I use anonymous methods? Every single time the piece of code I need isn't needed elsewhere. If it is used in two different places, I don't like copy-paste as a reuse technique, so I'll use a plain ol' delegate. So, just like shoosh answered, it isn't good to have code duplication. In theory there are no performance differences, as anonymous methods are C# tricks, not IL stuff.
Most of what I think about anonymous methods applies to lambda expressions, as the latter can be used as a compact syntax to represent anonymous methods. Let's assume the following method:
public static void DoSomethingMethod(string[] names, Func<string, bool> myExpression)
{
Console.WriteLine("Lambda used to represent an anonymous method");
foreach (var item in names)
{
if (myExpression(item))
Console.WriteLine("Found {0}", item);
}
}
It receives an array of strings and for each one of them, it will call the method passed to it. If that method returns true, it will say "Found...". You can call this method the following way:
string[] names = {"Alice", "Bob", "Charles"};
DoSomethingMethod(names, delegate(string p) { return p == "Alice"; });
But, you can also call it the following way:
DoSomethingMethod(names, p => p == "Alice");
There is no difference in IL between the two; the one using the lambda expression is simply much more readable. Once again, there is no performance impact, as these are all C# compiler tricks (not JIT compiler tricks). Just as I didn't feel we are overusing anonymous methods, I don't feel we are overusing lambda expressions to represent anonymous methods. Of course, the same logic applies to repeated code: don't do lambdas, use regular delegates. There are other restrictions leading you back to anonymous methods or plain delegates, like out or ref argument passing.
The other nice thing about lambda expressions is that the exact same syntax doesn't have to represent an anonymous method. Lambda expressions can also represent... you guessed it, expressions. Take the following example:
public static void DoSomethingExpression(string[] names, System.Linq.Expressions.Expression<Func<string, bool>> myExpression)
{
Console.WriteLine("Lambda used to represent an expression");
BinaryExpression bExpr = myExpression.Body as BinaryExpression;
if (bExpr == null)
return;
Console.WriteLine("It is a binary expression");
Console.WriteLine("The node type is {0}", bExpr.NodeType.ToString());
Console.WriteLine("The left side is {0}", bExpr.Left.NodeType.ToString());
Console.WriteLine("The right side is {0}", bExpr.Right.NodeType.ToString());
if (bExpr.Right.NodeType == ExpressionType.Constant)
{
ConstantExpression right = (ConstantExpression)bExpr.Right;
Console.WriteLine("The value of the right side is {0}", right.Value.ToString());
}
}
Notice the slightly different signature. The second parameter receives an expression and not a delegate. The way to call this method would be:
DoSomethingExpression(names, p => p == "Alice");
Which is exactly the same as the call we made when creating an anonymous method with a lambda. The difference here is that we are not creating an anonymous method, but creating an expression tree. It is due to these expression trees that we can then translate lambda expressions to SQL, which is what Linq 2 SQL does, for instance, instead of executing stuff in the engine for each clause like the Where, the Select, etc. The nice thing is that the calling syntax is the same whether you're creating an anonymous method or sending an expression.
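To make the distinction above concrete, here is a small sketch in which the same lambda text is assigned once to a delegate type and once to an expression type. The class and field names are made up for illustration.

```csharp
using System;
using System.Linq.Expressions;

public static class LambdaForms
{
    // The same lambda text, compiled to a delegate...
    public static readonly Func<string, bool> AsDelegate = p => p == "Alice";

    // ...and captured as an expression tree that can be inspected.
    public static readonly Expression<Func<string, bool>> AsTree = p => p == "Alice";
}
```

LambdaForms.AsDelegate("Alice") returns true directly, while LambdaForms.AsTree is data: its Body.NodeType is ExpressionType.Equal, and it only becomes callable after Compile().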
My answer will not be popular.
I believe lambdas are the better choice 99% of the time, for three reasons.
First, there is ABSOLUTELY nothing wrong with assuming your developers are smart. Other answers have an underlying premise that every developer but you is stupid. Not so.
Second, lambdas (et al.) are a modern syntax, and tomorrow they will be more commonplace than they already are today. Your project's code should flow from current and emerging conventions.
Third, writing code "the old fashioned way" might seem easier to you, but it's not easier to the compiler. This is important, legacy approaches have little opportunity to be improved as the compiler is rev'ed. Lambdas (et al) which rely on the compiler to expand them can benefit as the compiler deals with them better over time.
To sum up:
Developers can handle it
Everyone is doing it
There's future potential
Again, I know this will not be a popular answer. And believe me "Simple is Best" is my mantra, too. Maintenance is an important aspect to any source. I get it. But I think we are overshadowing reality with some cliché rules of thumb.
// Jerry
Code duplication.
If you find yourself writing the same anonymous function more than once, it shouldn't be one.
Well, when we are talking about delegate usage, there shouldn't be any difference between lambdas and anonymous methods -- they are the same, just with different syntax. And named methods (used as delegates) are also identical from the runtime's viewpoint. The difference, then, is between using delegates vs. inline code, i.e.
list.ForEach(s=>s.Foo());
// vs.
foreach(var s in list) { s.Foo(); }
(where I would expect the latter to be quicker)
And equally, if you are talking about anything other than in-memory objects, lambdas are one of your most powerful tools in terms of maintaining type checking (rather than parsing strings all the time).
Certainly, there are cases when a simple foreach with code will be faster than the LINQ version, as there will be fewer invokes to do, and invokes cost a small but measurable time. However, in many cases, the code is simply not the bottleneck, and the simpler code (especially for grouping, etc) is worth a lot more than a few nanoseconds.
Note also that in .NET 4.0 there are additional Expression nodes for things like loops, commas, etc. The language doesn't support them, but the runtime does. I mention this only for completeness: I'm certainly not saying you should use manual Expression construction where foreach would do!
I'd say that the performance differences are usually so small (and in the case of loops, obviously so, if you look at the results of the 2nd article; btw, Jon Skeet has a similar article here) that you should almost never choose a solution for performance reasons alone, unless you are writing a piece of software where performance is absolutely the number one non-functional requirement and you really have to do micro-optimizations.
When to choose what? I guess it depends on the situation but also on the person. Just as an example, some people prefer List.ForEach over a normal foreach loop. I personally prefer the latter, as it is usually more readable, but who am I to argue against this?
Rules of thumb:
Write your code to be natural and readable.
Avoid code duplications (lambda expressions might require a little extra diligence).
Optimize only when there's a problem, and only with data to back up what that problem actually is.
Any time the lambda simply passes its arguments directly to another function. Don't create a lambda for function application.
Example:
var coll = new ObservableCollection<int>();
myInts.ForEach(x => coll.Add(x));
Is nicer as:
var coll = new ObservableCollection<int>();
myInts.ForEach(coll.Add);
The main exception is where C#'s type inference fails for whatever reason (and there are plenty of times that's true).
If you need recursion, don't use lambdas, or you'll end up getting very distracted!
Lambda expressions are cool. Over the older delegate syntax they have a few advantages: they can be converted to either anonymous functions or expression trees, parameter types are inferred from the declaration, they are cleaner and more concise, etc. I see no real value in not using a lambda expression when you need an anonymous function. One not-so-big advantage of the earlier style is that you can omit the parameter declaration entirely if the parameters are not used. Like
Action<int> a = delegate { }; //takes one argument, but no argument specified
This is useful when you have to declare an empty delegate that does nothing, but it is not a strong enough reason to not use lambdas.
Lambdas let you write quick anonymous methods. That makes lambdas meaningless wherever anonymous methods are meaningless, i.e. where named methods make more sense. Compared with named methods, anonymous methods can be disadvantageous (this is not about lambda expressions per se, but since lambdas widely represent anonymous methods these days, it is relevant):
1. because they tend to lead to logic duplication (reuse is difficult)
2. when it is unnecessary to write one, like:
//this is unnecessary
Func<string, int> f = x => int.Parse(x);
//this is enough
Func<string, int> f = int.Parse;
3. since writing an anonymous iterator block is impossible.
Func<IEnumerable<int>> f = () => { yield return 0; }; //impossible
4. since recursive lambdas require one more line of quirkiness, like
Func<int, int> f = null;
f = x => (x <= 1) ? 1 : x * f(x - 1);
5. well, since reflection is kinda messier, but that is moot, isn't it?
Apart from point 3, the rest are not strong reasons not to use lambdas.
Also see this thread about what is disadvantageous about Func/Action delegates, since often they are used along with lambda expressions.
