How can one refactor this Linq function? - c#

I have a function which uses the "Any" method on a list.
List<String> list = Factory.ReturnList(); //Returns a list of all countries in Europe
I need to verify that the list contains partial strings like "(DE)" and "(FR)", ...
Assert.Equal(list.Any(item=> item.Contains("(DE)")), true);
Assert.Equal(list.Any(item=> item.Contains("(FR)")), true);
Assert.Equal(list.Any(item=> item.Contains("(BE)")), true);
Assert.Equal(list.Any(item=> item.Contains("(NL)")), true);
Now I would like to write it as a function. I have seen other people use code like this:
Func<IWebElement, bool> condition = item => item.Text == "something";
Assert.Equal(list.Any(condition), true);
I already tried the following, but that didn't work:
Func<List<String>, bool> condition = list.Any(x => x.Contains(input));
Due to:
Cannot implicitly convert type 'bool' to 'System.Func<System.Collections.Generic.List<string>, bool>'
How can I do that in my example?
'input' needs to be a parameter/variable so the Func can be invoked with different parameters

You are invoking Any immediately and trying to assign its result, a bool, to your Func. That can't work: the right-hand side has to be a lambda, not the value the lambda would produce.
Instead, encapsulate the two input arguments and update the Func signature to match your intent.
Func<List<String>, string, bool> condition =
(items, input) => items.Any(x => x.Contains(input));
You can then invoke your Func like so:
Assert.True(condition(list, "(DE)"));
This gives you the highest degree of flexibility and reusability since the condition is constant, but the parameters can change.
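If you would rather keep calling list.Any(...) directly, as in the IWebElement snippet from the question, a small factory that returns the predicate is another option. This is only a sketch: the Predicates class and ContainsText method are hypothetical names, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Predicates
{
    // Hypothetical factory: returns a predicate that captures `input`,
    // so one definition serves every partial string you want to look for.
    public static Func<string, bool> ContainsText(string input)
    {
        return item => item.Contains(input);
    }

    // Convenience wrapper showing how the predicate plugs into Any.
    public static bool AnyContains(IEnumerable<string> items, string input)
    {
        return items.Any(ContainsText(input));
    }
}
```

With this in place, the assertions from the question could be written as Assert.True(list.Any(Predicates.ContainsText("(DE)"))).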

If you just want a lambda expression, use:
Func<List<String>, string, bool> condition = (list, input) => list.Any(x => x.Contains(input));

How about putting those strings into an array?
string[] countries = { "DE", "FR", "BE", "NL" };
var result = from item in list
where countries.Any(val => item.Contains(val))
select item;
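Note that the query above returns the items that match at least one code; it does not by itself prove that every code is present. If the goal is the original set of assertions, something along these lines might be closer. It is a minimal sketch, and CountryChecks, ContainsAllCodes and MissingCodes are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CountryChecks
{
    // Hypothetical helper: true only if every code appears in at least one entry.
    public static bool ContainsAllCodes(IEnumerable<string> items, params string[] codes)
    {
        var materialized = items.ToList(); // avoid re-enumerating a lazy source
        return codes.All(code => materialized.Any(item => item.Contains(code)));
    }

    // Hypothetical helper: the codes that no entry contains,
    // which is handy for a readable test failure message.
    public static List<string> MissingCodes(IEnumerable<string> items, params string[] codes)
    {
        var materialized = items.ToList();
        return codes.Where(code => !materialized.Any(item => item.Contains(code))).ToList();
    }
}
```

A test could then assert that MissingCodes(list, "(DE)", "(FR)", "(BE)", "(NL)") is empty, so a failure names the codes that were not found.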

Related

How do I use an array of values in a LINQ Expression builder?

I want to dynamically build a LINQ query so I can do something like
var list = n.Elements().Where(getQuery("a", "b"));
instead of
var list = n.Elements().Where(e => e.Name == "a" || e.Name == "b");
(Most of the time, I need to pass XNames with namespaces, not just localnames...)
My problem is in accessing the array elements:
private static Func<XElement, bool> getQuery(XName[] names)
{
var param = Expression.Parameter(typeof(XElement), "e");
Expression exp = Expression.Constant(false);
for (int i = 0; i < names.Length; i++)
{
Expression eq = Expression.Equal(
Expression.Property(param, typeof(XElement).GetProperty("Name")!.Name),
/*--->*/ Expression.Variable(names[i].GetType(), "names[i]")
);
}
var lambda = Expression.Lambda<Func<XElement, bool>>(exp, param);
return lambda.Compile();
}
Obviously the Variable expression is wrong, but I'm having difficulty building an expression capable of accessing the array values.
Do you need to create an expression and compile it? Unless I'm missing some nuance to this, all you need is a function that returns a Func<XElement, bool>.
private Func<XElement, bool> GetQuery(params string[] names)
{
return element => names.Any(n => element.Name == n);
}
This takes an array of strings and returns a Func<XElement, bool>. That function returns true if the element name matches any of the arguments.
You can then use that as you described:
var list = n.Elements().Where(GetQuery("a", "b"));
There are plenty of ways to do something like this. For increased readability an extension like this might be better:
public static class XElementExtensions
{
public static IEnumerable<XElement> WhereNamesMatch(
this IEnumerable<XElement> elements,
params string[] names)
{
return elements.Where(element =>
names.Any(n => element.Name == n));
}
}
Then the code that uses it becomes
var list = n.Elements().WhereNamesMatch("a", "b");
That's especially helpful when we have other filters in our LINQ query. All the Where and other methods can become hard to read. But if we isolate them into their own functions with clear names then the usage is easier to read, and we can re-use the extension in different queries.
If you want to write it as Expression you can do it like so:
public static Expression<Func<Person, bool>> GetQuery(Person[] names)
{
var parameter = Expression.Parameter(typeof(Person), "e");
var propertyInfo = typeof(Person).GetProperty("Name");
var expression = names.Aggregate(
(Expression)Expression.Constant(false),
(acc, next) => Expression.MakeBinary(
ExpressionType.Or,
acc,
Expression.Equal(
Expression.Constant(propertyInfo.GetValue(next)),
Expression.Property(parameter, propertyInfo))));
return Expression.Lambda<Func<Person, bool>>(expression, parameter);
}
Whether or not you compile the expression depends on what you want to achieve. If you want to pass the expression to a query provider (cf. Queryable.Where) and have e.g. the database filter your values, then you must not compile the expression.
If you want to filter a collection in memory, i.e. you enumerate all elements (cf. Enumerable.Where) and apply the predicate to each of them, then you have to compile the expression. In this case you should probably not use the Expression API at all, as it adds complexity to your code and leaves you more vulnerable to runtime errors.
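To make the two cases concrete, here is a rough sketch that builds one expression and uses it both ways. The Person class is a stand-in with a single Name property, and NameFilter with its methods is hypothetical; AsQueryable over a list merely simulates a query provider, it does not involve a database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Stand-in entity for illustration only.
public sealed class Person
{
    public string Name { get; set; } = "";
}

public static class NameFilter
{
    // Builds e => false || e.Name == names[0] || e.Name == names[1] ...
    public static Expression<Func<Person, bool>> AnyNameEquals(params string[] names)
    {
        ParameterExpression e = Expression.Parameter(typeof(Person), "e");
        MemberExpression nameProperty = Expression.Property(e, nameof(Person.Name));

        Expression body = names.Aggregate(
            (Expression)Expression.Constant(false),
            (acc, name) => Expression.OrElse(
                acc,
                Expression.Equal(nameProperty, Expression.Constant(name, typeof(string)))));

        return Expression.Lambda<Func<Person, bool>>(body, e);
    }

    // Uncompiled: Queryable.Where hands the tree to the query provider.
    public static List<string> ViaQueryProvider(IQueryable<Person> people, params string[] names)
    {
        return people.Where(AnyNameEquals(names)).Select(p => p.Name).ToList();
    }

    // Compiled: Enumerable.Where needs a delegate and filters in memory.
    public static List<string> InMemory(IEnumerable<Person> people, params string[] names)
    {
        Func<Person, bool> predicate = AnyNameEquals(names).Compile();
        return people.Where(predicate).Select(p => p.Name).ToList();
    }
}
```

OrElse is used here instead of Or so that the generated predicate short-circuits; both produce the same boolean result.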

I'm confused about this statement. Lambda operator? [duplicate]

[Route("{year:min(2000)}/{month:range(1,12)}/{key}")]
public IActionResult Post(int year, int month, string key)
{
var post = _db.Posts.FirstOrDefault(x => x.Key == key);
return View(post);
}
Hi,
I'm doing this in ASP.NET Core with C#.
Vague part for me is this: _db.Posts.FirstOrDefault(x => x.Key == key);
So what I'm guessing is that:
execute FirstOrDefault method.
parameter x is passed (I don't know what exactly is being passed, though).
then, compare x.Key with key
what is next step?
parameter x is passed (I don't know what exactly is being passed, though).
No, this does not happen. What is passed is an expression defining an anonymous function. Such expressions, when using the => operator, are commonly called lambda expressions. x is the parameter of that function: a placeholder for whichever element FirstOrDefault hands to the function when it calls it.
It will help you understand if I give you a pretend version of how the FirstOrDefault() method might be implemented:
public static T FirstOrDefault<T>(this IEnumerable<T> items, Func<T, bool> predicate)
{
foreach(T item in items)
{
if(predicate(item)) return item;
}
return default(T);
}
Some things to understand in that code:
this in front of the first parameter turns the function into an extension method. Instead of calling the method with two arguments, you skip the first argument... call it with only the second argument as if it were a member of the type from the first argument. ie, _db.Posts.FirstOrDefault(foo) instead of FirstOrDefault(_db.Posts, foo).
The key variable in the expression is a captured variable, and a lambda that captures variables like this is called a closure. key is available inside the predicate() function even though it's not passed as an argument. This is why the predicate(item) call is able to determine true or false with only item as an input.
The predicate() function call within this method was passed as an argument to the method. That is how the x => x.Key == key argument is interpreted; it becomes the predicate() method used by the FirstOrDefault() function. You can think of it as if predicate() were defined like this:
bool predicate(T x)
{
return x.Key == key;
}
The C# compiler makes this translation for you automatically; it even infers the correct type for T at compile time and handles the scope of the captured key variable.
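A small runnable sketch of the same idea, in case it helps to see the pieces together. ClosureDemo, FirstMatchOrDefault and FindByKey are illustrative names, and FirstMatchOrDefault is a simplified stand-in rather than the real LINQ implementation.

```csharp
using System;
using System.Collections.Generic;

public static class ClosureDemo
{
    // Simplified stand-in for FirstOrDefault: calls the predicate once per
    // element and returns the first element for which it answers true.
    public static T FirstMatchOrDefault<T>(IEnumerable<T> items, Func<T, bool> predicate)
    {
        foreach (T item in items)
        {
            if (predicate(item)) return item;
        }
        return default(T);
    }

    public static string FindByKey(IEnumerable<string> keys, string key)
    {
        // x is supplied by FirstMatchOrDefault on every call;
        // key is captured from this method, so it never has to be passed along.
        return FirstMatchOrDefault(keys, x => x == key);
    }
}
```

Calling FindByKey(new[] { "a", "b" }, "b") walks the array, invokes the lambda with x = "a" and then x = "b", and returns the second element.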
The other answers are close, but not completely correct.
I assume that _db is an Entity Framework DbContext, and _db.Posts is a DbSet<Post>.
As such the .FirstOrDefault() method you are seeing is actually an Extension method and the x => x.Key == key part is an Expression tree.
What happens behind the scenes is that the call to _db.Posts.FirstOrDefault(x => x.Key == key) is translated to a SQL statement like SELECT TOP(1) Key, Content, ... FROM posts WHERE Key = @key, the result of which is mapped into a Post entity.
There are a lot of language features at play to make all this work, so let's have a look!
Extension methods
Extension methods are static methods, but can be called like instance methods.
They are defined in static classes and have a 'receiver' argument. In the case of FirstOrDefault the extension method looks like this:
public static class Queryable {
public static T FirstOrDefault<T>(this IQueryable<T> source, Expression<Func<T, bool>> predicate = null) {
// do something with source and predicate and return something as a result
}
}
Its usage _db.Posts.FirstOrDefault(...) is actually syntactic sugar and will be translated by the C# compiler to a static method call a la Queryable.FirstOrDefault(_db.Posts, ...).
Note that extension methods are, despite the syntactic sugar, still static methods and do not have access to their receiver's internal state. They can only access public members.
Delegates
C# has support for pseudo-first-class functions, called delegates. There are several ways to instantiate a delegate.
They can be used to capture existing methods or they can be initialized with an anonymous function.
The most elegant way to initialize a delegate with an anonymous function is to use lambda style functions like x => x + 10 or (x, y) => x + y.
The reason you don't see type annotations in these examples is that the compiler can infer the types of the arguments in many common situations.
Here is another example:
// This is a normal function
bool IsEven(int x) {
return x % 2 == 0;
}
// This is an anonymous function captured in a delegate of type `Func<T, TResult>`
Func<int, bool> isEven = x => x % 2 == 0;
// You can also capture methods in delegates
Func<int, bool> isEvenFromMethod = IsEven;
// Methods can be called
bool a = IsEven(5); // result is false
// Delegates can be called as well
bool b = isEven(4); // result is true
// The power of delegates comes from being able to pass them around as arguments
List<int> Filter(IEnumerable<int> array, Func<int, bool> predicate) {
var result = new List<int>();
foreach (var n in array) {
if (predicate(n)) {
result.Add(n);
}
}
return result;
}
var numbers = new List<int> { 1, 2, 3, 4, 5, 6 };
var evenNumbers = Filter(numbers, isEven); // result is a list of { 2, 4, 6 }
var numbersGt4 = Filter(numbers, x => x > 4); // result is a list of { 5, 6 }
Expression trees
The C# compiler has a feature that allows you to create an Expression tree with normal-looking code.
For example Expression<Func<int, int>> add10Expr = (x => x + 10); will initialize add10Expr not with an actual function but with an expression tree, which is an object graph.
Initialized by hand it would look like this:
ParameterExpression xParameter = Expression.Parameter(typeof(int), "x");
Expression<Func<int, int>> add10Expr =
Expression.Lambda<Func<int, int>>(
Expression.Add(
xParameter,
Expression.Constant(10)
),
xParameter
);
(which is super cumbersome)
The power of expression trees comes from being able to create, inspect and transform them at runtime.
Which is what Entity Framework does: it translates these C# expression trees to SQL code.
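As a small illustration of inspecting a tree at runtime (nothing like what a real query provider does, just the basic mechanics), the sketch below reads the constant back out of an x + constant lambda and also compiles the tree into a delegate. ExpressionInspection and its method names are made up for this example.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionInspection
{
    // Returns the constant on the right-hand side of an `x + constant` body.
    public static int ReadAddedConstant(Expression<Func<int, int>> expr)
    {
        var body = expr.Body as BinaryExpression;
        if (body == null || body.NodeType != ExpressionType.Add)
            throw new ArgumentException("Expected a lambda of the form x => x + constant.", nameof(expr));

        var constant = (ConstantExpression)body.Right;
        return (int)constant.Value;
    }

    // Turns the object graph into executable code and runs it.
    public static int Evaluate(Expression<Func<int, int>> expr, int input)
    {
        Func<int, int> compiled = expr.Compile();
        return compiled(input);
    }
}
```

Passing x => x + 10 to either method makes the compiler build the tree for you, because the parameter type is Expression<Func<int, int>> rather than Func<int, int>.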
Entity Framework
With all of these features together you can write predicates and other code in C# which gets translated by Entity Framework to SQL, the results of which are "materialized" as normal C# objects.
You can write complex queries to the database all within the comfort of C#.
And best of all, your code is statically typed.
x stands for each element of the collection you called the function on, the same object you would get in foreach (var x in _db.Posts). Conceptually, FirstOrDefault iterates through that collection looking for an element where x.Key == key and returns the first one that fulfills the condition, or the default value if none does. So that call returns the first object in _db.Posts where Key == key.
Your lambda expression with FirstOrDefault is, for an in-memory collection, roughly equivalent to the following extension method
public static Post FirstOrDefault(this YourDBType _db, string Key)
{
foreach(Post x in _db.Posts)
{
if(x.Key == Key)
{
return x;
}
}
return null;
}
Strictly speaking x is the lambda's parameter, but you can think of it as a shorthand way of referring to the individual item in the collection you are working on, like the loop variable in a foreach statement. The last step in your question is "either return the first Post that has the same key we are comparing against, or return the default value of a Post object (which is null for objects)".

Extracting lambda expression from LINQ

I have the following chunk of code
var query = wordCollection.Select((word) => { return word.ToUpper(); })
.Where((word) =>
{
return String.IsNullOrEmpty(word);
})
.ToList();
Suppose I want to refactor this code and extract the lambda expression from Where clause. In Visual Studio I just select this lambda and do Refactor -> Extract Method. By doing so I have my LINQ modified to
var query = wordCollection.Select((word) => { return word.ToUpper(); })
.Where(NewMethod1())
.ToList();
and a NewMethod1() is declared as
private static Func<string, bool> NewMethod1()
{
return (word) =>
{
return String.IsNullOrEmpty(word);
};
}
The question is: why does this new method NOT have any input parameters, when the Func<string, bool> delegate suggests that NewMethod1() should take a string input parameter?
To get the expected result, mark just this part String.IsNullOrEmpty(word) and Extract the method:
private bool NewMethod(string word)
{
return String.IsNullOrEmpty(word);
}
What you originally got is because the extract created a method that returns a delegate. Not a method that matches the delegate. It is a method that returns another method. The latter accepts a string parameter word and returns a bool result.
Sure, doing the above changes your code to:
.Where((word) => NewMethod(word))
But you can safely change that to:
.Where(NewMethod)
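The difference between the two extraction results can be seen side by side in the following sketch. The names ExtractMethodDemo, IsBlank and IsBlankFactory are invented for the example; IsBlankFactory plays the role of the generated NewMethod1().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExtractMethodDemo
{
    // Matches Func<string, bool> directly, so it can be passed as a method group.
    public static bool IsBlank(string word)
    {
        return string.IsNullOrEmpty(word);
    }

    // Takes no parameters itself; it returns a delegate that takes the string.
    public static Func<string, bool> IsBlankFactory()
    {
        return word => string.IsNullOrEmpty(word);
    }

    // Where receives the method itself (no parentheses).
    public static int CountWithMethodGroup(IEnumerable<string> words)
    {
        return words.Where(IsBlank).Count();
    }

    // Where receives whatever the factory call returns (note the parentheses).
    public static int CountWithFactory(IEnumerable<string> words)
    {
        return words.Where(IsBlankFactory()).Count();
    }
}
```

Both counting methods give the same answer; the only difference is whether Where is handed a method or the result of calling a method that produces one.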
Side Note:
There is no need to use the return keyword in your LINQ queries or in any one-line lambda. You can refactor your query to look like this:
var query = wordCollection.Select(word => word.ToUpper())
.Where(word => string.IsNullOrEmpty(word))
.ToList();
You are selecting the whole lambda, so it is trying to extract the whole lambda statement as a delegate that takes in a word and returns a boolean: Func<string, bool>.
When refactoring you should have only selected the "return String.IsNullOrEmpty(word);" part.
Additionally, you are using the lambdas in an unnecessarily complex way.
You could refactor your LINQ statement to this:
var query = wordCollection.Select(word => word.ToUpper())
.Where(word => String.IsNullOrEmpty(word))
.ToList();
Or even to this:
var query = wordCollection.Select(word => word.ToUpper())
.Where(String.IsNullOrEmpty)
.ToList();

Creating, combining and caching lambda expressions

Say I have the following.
foreach (var loop in helper.Loop(x => x.LoopItems))
{
loop.Text(x => x.Name);
loop.Span(x => x.Name);
foreach (var loopItem in loop.Loop(x => x.NestedLoopItems))
{
loopItem.Text(x => x.Age);
}
}
This works just great with my current implementation; however, it has to compile the inner lambda expression as many times as there are loop items. Currently it does something like this to create the expression that accesses a List<T> indexer, e.g. x.ListItems[i]
var methodCallExpression = Expression.Call(_expression, ((PropertyInfo) _expression.Member).PropertyType.GetMethod("get_Item"), Expression.Constant(i));
var expression = Expression.Lambda<Func<TModel, T>>(methodCallExpression, _expression.GetParameter<TModel>());
It then does
var newExpression = CombineExpression(listExpression);
var enumerable = newExpression.Compile().Invoke(_htmlHelper.ViewData.Model);
And it is the compile step that seems to be the expensive one.
Would there be any way to cache this given the fact that it needs to create a new one for each loop, such that i in Expression.Constant(i) needs to increment each time modifying the expression.
I don't know what CombineExpression does, but you could change Func<TModel, T> to Func<TModel, int, T> (assuming the indexer is always an int; if not, you could add another generic type parameter to the method), since you are already passing in "i" to build the Constant in the expression.
I'm also not entirely sure what type _expression is, so I don't know whether this calls exactly the overloads I think it does.
var parameterExpression = Expression.Parameter(typeof(int), "i");
var methodCallExpression = Expression.Call(_expression, ((PropertyInfo) _expression.Member).PropertyType.GetMethod("get_Item"), parameterExpression);
var expression = Expression.Lambda<Func<TModel, int, T>>(methodCallExpression, _expression.GetParameter<TModel>(), parameterExpression);
Then you could compile expression once to get a Func<TModel, int, T>, and pass in the "i" value when invoking the Func.
Again, I don't know what CombineExpression does, but if you get a strongly typed Func out of it you can just call it directly without Invoke.
Another side note: why use expressions to get access to a list indexer? Why not just use IEnumerable<>, or, in the worst case where you don't know the type, cast the object to IEnumerable (non-generic) and iterate over that?
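For what it's worth, here is a self-contained sketch of the "index as a parameter" idea. LoopModel and IndexerAccessor are hypothetical stand-ins for the asker's model and helper, which are not shown in the question; the point is only that the delegate is compiled once and then invoked with a different i each time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Reflection;

// Stand-in model for illustration only.
public sealed class LoopModel
{
    public List<string> LoopItems { get; } = new List<string>();
}

public static class IndexerAccessor
{
    // Builds (model, i) => model.LoopItems[i]. The index is a lambda
    // parameter instead of a baked-in constant, so one compile is enough.
    public static Func<LoopModel, int, string> Build()
    {
        ParameterExpression model = Expression.Parameter(typeof(LoopModel), "model");
        ParameterExpression index = Expression.Parameter(typeof(int), "i");

        MemberExpression list = Expression.Property(model, nameof(LoopModel.LoopItems));
        MethodInfo getItem = typeof(List<string>).GetMethod("get_Item");
        MethodCallExpression item = Expression.Call(list, getItem, index);

        return Expression
            .Lambda<Func<LoopModel, int, string>>(item, model, index)
            .Compile();
    }
}
```

The compiled delegate could be stored in a field or a dictionary keyed by the member being accessed, and then called as getItem(model, i) inside the loop.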

System.Func passed in to a linq where method without enumerating

I have a method where I am trying to return all default customer addresses with the matching gender. I would like to be able to build up the filtering query bit by bit by passing in System.Func methods to the where clause.
var emailAddresses = new List<string>();
// get all customers.
IQueryable<Customer> customersQ = base.GetAllQueryable(appContext).Where(o => o.Deleted == false);
// for each customer filter, filter the query.
var genders = new List<string>() { "C" };
Func<Customer, bool> customerGender = (o => genders.Contains(o.Addresses.FirstOrDefault(a => a.IsDefaultAddress).Gender));
customersQ = customersQ.Where(customerGender).AsQueryable();
emailAddresses = (from c in customersQ
select c.Email).Distinct().ToList();
return emailAddresses;
But this method calls the database once for every address (8000 times), which is very slow.
However, if I replace the two lines
Func<Customer, bool> customerGender = (o => genders.Contains(o.Addresses.FirstOrDefault(a => a.IsDefaultAddress).Gender));
customersQ = customersQ.Where(customerGender).AsQueryable();
with one line
customersQ = customersQ.Where(o => genders.Contains(o.Addresses.FirstOrDefault(a => a.IsDefaultAddress).Gender)).AsQueryable();
Then the query only makes one call to the database and is very fast.
My question is why does this make a difference? How can I make the first method work with only calling the database once?
Use expression instead of Func:
Expression<Func<Customer, bool>> customerGender = (o =>
genders.Contains(o.Addresses.FirstOrDefault(a => a.IsDefaultAddress).Gender));
customersQ = customersQ.Where(customerGender).AsQueryable();
When you use a plain Func delegate, the Where extension of Enumerable is called. The customers are then pulled into memory and the lambda is executed for each entity; every access to o.Addresses can then trigger its own query (typically through lazy loading), which is why you see so many calls to the database. Calling AsQueryable() afterwards does not move the filter back to the database.
When you use an expression, on the other hand, the Where extension of Queryable is called and the expression is converted into a SQL query. That is why you get a single query in the second case: an in-place lambda passed to an IQueryable is converted into an expression.
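The overload choice can be observed without a database. The sketch below uses AsQueryable over an array as a stand-in for a real provider (an assumption made purely so the example runs on its own), and OverloadDemo with its methods is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class OverloadDemo
{
    // With a Func, the compiler binds to Enumerable.Where: the result is a
    // plain in-memory iterator and the provider never sees the filter.
    public static bool FuncFilterIsQueryable(IQueryable<int> source)
    {
        Func<int, bool> isEven = n => n % 2 == 0;
        IEnumerable<int> filtered = source.Where(isEven);
        return filtered is IQueryable<int>;
    }

    // With an Expression, the compiler binds to Queryable.Where: the filter
    // becomes part of the query's expression tree for the provider to translate.
    public static bool ExpressionFilterReachesProvider(IQueryable<int> source)
    {
        Expression<Func<int, bool>> isEven = n => n % 2 == 0;
        IQueryable<int> filtered = source.Where(isEven);
        var call = filtered.Expression as MethodCallExpression;
        return call != null && call.Method.Name == "Where";
    }
}
```

The first method returns false and the second returns true, which mirrors the difference between the slow and the fast version of the query.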
