Why does this LINQ query compile? - c#

After reading "Odd query expressions" by Jon Skeet, I tried the code below.
I expected the LINQ query at the end to translate to int query = proxy.Where(x => x).Select(x => x); which should not compile because Where returns an int. Instead, the code compiles, prints "Where(x => x)" to the screen, and sets query to 2. Select is never called, but it needs to be there for the code to compile. What is happening?
using System;
using System.Linq.Expressions;
public class LinqProxy
{
public Func<Expression<Func<string,string>>,int> Select { get; set; }
public Func<Expression<Func<string,string>>,int> Where { get; set; }
}
class Test
{
static void Main()
{
LinqProxy proxy = new LinqProxy();
proxy.Select = exp =>
{
Console.WriteLine("Select({0})", exp);
return 1;
};
proxy.Where = exp =>
{
Console.WriteLine("Where({0})", exp);
return 2;
};
int query = from x in proxy
where x
select x;
}
}

It's because your "select x" is effectively a no-op - the compiler doesn't bother putting the Select(x => x) call at the end. It would if you removed the where clause though. Your current query is known as a degenerate query expression. See section 7.16.2.3 of the C# 4 spec for more details. In particular:
A degenerate query expression is one that trivially selects the elements of the source. A later phase of the translation removes degenerate queries introduced by other translation steps by replacing them with their source. It is important however to ensure that the result of a query expression is never the source object itself, as that would reveal the type and identity of the source to the client of the query. Therefore this step protects degenerate queries written directly in source code by explicitly calling Select on the source. It is then up to the implementers of Select and other query operators to ensure that these methods never return the source object itself.
So, three translations (regardless of data source):
// Query                 // Translation
from x in proxy          proxy.Where(x => x)
where x
select x

from x in proxy          proxy.Select(x => x)
select x

from x in proxy          proxy.Where(x => x)
where x                       .Select(x => x * 2)
select x * 2
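To check these three translations without any real data source, here is a minimal sketch. Recorder is a hypothetical type (not from the question) whose Where and Select methods only log that they were called, so the log shows which calls the compiler actually emitted for each query.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical source type: it only records which query operators get called.
class Recorder
{
    public List<string> Calls { get; } = new List<string>();

    public Recorder Where(Func<int, bool> predicate)
    {
        Calls.Add("Where");
        return this;
    }

    public Recorder Select<T>(Func<int, T> selector)
    {
        Calls.Add("Select");
        return this;
    }
}

static class DegenerateDemo
{
    // where + trivial select: the trailing Select(x => x) is dropped.
    public static string WhereThenTrivialSelect()
    {
        var r = new Recorder();
        var q = from x in r where x > 0 select x;
        return string.Join(",", r.Calls);
    }

    // Trivial select on its own: Select is kept so the source is never returned directly.
    public static string TrivialSelectOnly()
    {
        var r = new Recorder();
        var q = from x in r select x;
        return string.Join(",", r.Calls);
    }

    // Non-trivial projection: both calls are emitted.
    public static string WhereThenProjection()
    {
        var r = new Recorder();
        var q = from x in r where x > 0 select x * 2;
        return string.Join(",", r.Calls);
    }

    static void Main()
    {
        Console.WriteLine(WhereThenTrivialSelect()); // Where
        Console.WriteLine(TrivialSelectOnly());      // Select
        Console.WriteLine(WhereThenProjection());    // Where,Select
    }
}
```

Because Recorder is not a sequence at all, this also illustrates that the translation is purely syntactic: the compiler only needs methods with suitable names and signatures.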

It compiles because the LINQ query syntax is a lexical substitution. The compiler turns
int query = from x in proxy
where x
select x;
into
int query = proxy.Where(x => x); // note it optimises the select away
and only then does it check whether the methods Where and Select actually exist on the type of proxy. Accordingly, in the specific example you gave, Select does not actually need to exist for this to compile.
If you had something like this:
int query = from x in proxy
select x.ToString();
then it would get changed into:
int query = proxy.Select(x => x.ToString());
and the Select method would be called.

Related

How to convert this lambda expression to linq

I can only read very basic lambda expressions, not complicated ones; I am just starting to study lambdas.
As the title says, how do I convert this lambda into LINQ?
var train = db.Certificates
.Join(db.TrainingSchedules, a => a.CertificateId, b => b.CertificateId, (a, b) => new { a, b })
.Where(x => x.a.Year.Value.Year == year && x.a.TrainingTypeId.Value == trainingTypeId && x.a.IsApproved.Value && x.b.EndDate >= DateTime.Now)
.Select(z => z.a).Distinct().Where(q => !db.Registrations.Where(s => s.EmployeeId == empId).Select(t => t.Certificate).Any(u => u.CertificateId == q.CertificateId));
Can someone explain to me why it has different variables, like x, q, z, b?
As the title says, how do I convert this lambda into LINQ?
Have you ever worked with LINQ, so that you know what it looks like?
If the code above is not LINQ, then what is? Lambdas are an integral part of the fluent (method) form of LINQ: most of the APIs take a Func delegate, and that is where the lambda comes in.
Now, regarding x, q, z, b: what do they represent?
The LINQ APIs are extension methods on IEnumerable<T>, which types such as List<T> and Dictionary<TK,TV> implement. Whether we write the fluent syntax or the SQL-like syntax, these variables represent each element of the collection as it is processed by the logic supplied in the Func delegate. You can, and should, use more descriptive names for them, just as in other parts of the code, where int x, float y, DateTime z is a bad way to name things.
Regarding the statement posted above consider following changes:
Rename a,b as cert,ts, which refer to the Certificate and TrainingSchedule classes respectively
Instead of generating the anonymous type new { a, b }, create a class like CertificateTrainingSchedule that has the required elements of the Certificate and TrainingSchedule classes; it will be easier to work with as you move forward
Rename x as cts, to represent the combined CertificateTrainingSchedule
If it is easier to read, split the Where clause into a chain of several calls, like:
.Where(cts => cts.a.Year.Value.Year == year)
.Where(cts => cts.a.TrainingTypeId.Value == trainingTypeId)
.Where(cts => cts.a.IsApproved.Value)
.Where(cts => cts.b.EndDate >= DateTime.Now)
Similarly the names of other variables can be modified to represent the true class and its objects, instead of random a,b,c,d. Also calls can be chained for clear understanding of the logic
Edit - // Modified Linq Query
// Joined / Merged version of Certificate and Training Schedule, add more fields as required, current one is based on certain assumptions of the fields / properties in the Certificate & TrainingSchedule classes respectively
public class CertificateTrainingSchedule
{
public int CertificateId {get; set;} // Certificate Class Property (type assumed), needed for the Registrations check below
public int Year {get; set;} // Certificate Class Property
public int TrainingTypeId {get; set;} // Certificate Class Property
public bool IsApproved {get; set;} // Certificate Class Property
public DateTime EndDate {get; set;} // TrainingSchedule Class Property
}
var train = db.Certificates
.Join(db.TrainingSchedules, cert => cert.CertificateId, ts => ts.CertificateId, (cert, ts) => new CertificateTrainingSchedule{ CertificateId = cert.CertificateId, Year = cert.Year, TrainingTypeId = cert.TrainingTypeId, IsApproved = cert.IsApproved, EndDate = ts.EndDate})
.Where(cts => cts.Year == year)
.Where(cts => cts.TrainingTypeId == trainingTypeId)
.Where(cts => cts.IsApproved)
.Where(cts => cts.EndDate >= DateTime.Now)
.Select(cts => new {cts.CertificateId, cts.Year, cts.TrainingTypeId, cts.IsApproved})
.Distinct() // Anonymous types compare by value, which avoids writing an IEqualityComparer<Certificate>
.Where(certMain => !db.Registrations.Where(reg => reg.EmployeeId == empId)
.Select(reg => reg.Certificate)
.Any(cert => cert.CertificateId == certMain.CertificateId));
I assume that by your question, you mean you want a query expression that is equivalent to calling the LINQ methods explicitly. Without a good Minimal, Complete, and Verifiable code example, it's impossible to know for sure what a correct example would be. However, the following is I believe what you're looking for:
var train =
from q in
(from x in
(from a in db.Certificates
join b in db.TrainingSchedules on a.CertificateId equals b.CertificateId
select new { a, b })
where x.a.Year.Value.Year == year && x.a.TrainingTypeId == trainingTypeId &&
x.a.IsApproved.Value && x.b.EndDate >= DateTime.Now
select x.a).Distinct()
where !(from s in db.Registrations where s.EmployeeId == empId select s.Certificate)
.Any(u => u.CertificateId == q.CertificateId)
select q;
Note that not all of the LINQ methods have a C# query expression language equivalent. In particular, there's no equivalent for Distinct() or Any(), so these are still written out explicitly.
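As a small illustration of mixing the two styles, here is a sketch over an in-memory array (the numbers are made up for the example): the query expression is wrapped in parentheses and Distinct() or Any() is then called on the result as an ordinary method.

```csharp
using System;
using System.Linq;

static class MixedSyntaxDemo
{
    // Query expression for the filter, method call for Distinct.
    public static int[] DistinctEvens(int[] numbers) =>
        (from n in numbers where n % 2 == 0 select n).Distinct().ToArray();

    // Query expression for the filter, method call for Any.
    public static bool AnyGreaterThan(int[] numbers, int limit) =>
        (from n in numbers where n > limit select n).Any();

    static void Main()
    {
        int[] numbers = { 1, 2, 2, 3, 4, 4 };
        Console.WriteLine(string.Join(",", DistinctEvens(numbers))); // 2,4
        Console.WriteLine(AnyGreaterThan(numbers, 3));               // True
    }
}
```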
Can someone explain to me why it has different variables, like x, q, z, b?
Each lambda expression has the input on the left side of the => and the result expression on the right. The variables you're referring to are the inputs. These are commonly written using single letters when writing lambda expressions, because a lambda expression is so short that the meaning can be clear without a longer variable name. For that matter, independent lambda expressions could even use the same variable name.
Note that in the query expression syntax, not all of the variables "made it". In particular, we lost z and t because those variables were superfluous.
In an expression this long, it's possible you might find longer variable names helpful. But it's a trade-off. The query expression language is meant to provide a compact way to represent queries on data sources. Longer variable names could make it harder to understand the query itself, even as it potentially makes it easier to understand the intent of each individual part of the expression. It's very much a matter of personal preference.
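To make the point about parameter names concrete, here is a minimal sketch over an in-memory array (sample data invented for the example). The two chains differ only in the names of the lambda parameters, and each lambda's parameter is scoped to that lambda alone, so reusing the same name is fine.

```csharp
using System;
using System.Linq;

static class LambdaNamesDemo
{
    // Different parameter names in each lambda.
    public static int[] WithDifferentNames(int[] numbers) =>
        numbers.Where(x => x % 2 == 0).Select(z => z * 10).ToArray();

    // The same parameter name reused in independent lambdas.
    public static int[] WithSameName(int[] numbers) =>
        numbers.Where(n => n % 2 == 0).Select(n => n * 10).ToArray();

    static void Main()
    {
        int[] numbers = { 1, 2, 3, 4, 5, 6 };
        var a = WithDifferentNames(numbers);
        var b = WithSameName(numbers);
        Console.WriteLine(string.Join(",", a)); // 20,40,60
        Console.WriteLine(a.SequenceEqual(b));  // True
    }
}
```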
I can only read very basic lambda expressions, not complicated ones; I am just starting to study lambdas.
Try reading this:
var train = db.Certificates
.Where(c => c.Year.Value.Year == year &&
c.TrainingTypeId.Value == trainingTypeId &&
c.IsApproved.Value &&
c.TrainingSchedules.Any(ts => ts.EndDate >= DateTime.Now) &&
!c.Registrations.Any(r => r.EmployeeId == empId));
If you can, then you are just fine.
Note that this is not an exact translation of the sample query, but it is functionally equivalent (it should produce the same result). The sample query is a good example of a badly written query: poor variable naming, an unnecessary multiplicative Join that then requires a Distinct operator (a GroupJoin would do the same without needing Distinct), inconsistent handling of two similar detail criteria (Join for TrainingSchedules and Any for Registrations), an overcomplicated criterion for the Registrations part, and so on.
In short, don't write such queries. Concentrate on the desired result of the query and use the most logical constructs to express it. Avoid manual joins when you have navigation properties. If you don't have navigation properties, add them to the model; it's an easy one-time action that helps a lot when writing queries. For instance, in my translation I assume you have something like this:
class Certificate
{
// Other properties ...
public ICollection<TrainingSchedule> TrainingSchedules { get; set; }
public ICollection<Registration> Registrations { get; set; }
}
class TrainingSchedule
{
// Other properties ...
public Certificate Certificate { get; set; }
}
class Registration
{
// Other properties ...
public Certificate Certificate { get; set; }
}
UPDATE: Here is the same using the query syntax:
var train =
from c in db.Certificates
where c.Year.Value.Year == year &&
c.TrainingTypeId.Value == trainingTypeId &&
c.IsApproved.Value &&
c.TrainingSchedules.Any(ts => ts.EndDate >= DateTime.Now) &&
!c.Registrations.Any(r => r.EmployeeId == empId)
select c;
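The remark about the multiplicative Join can be seen with plain in-memory objects. This is only a sketch: Cert and Schedule are hypothetical stand-ins for the entity classes, and the data is invented. A certificate with two matching schedules comes out of the Join twice, which is why Distinct was needed, whereas Any on the navigation collection yields it once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-ins for the entity classes.
class Schedule
{
    public int CertId { get; set; }
    public DateTime EndDate { get; set; }
}

class Cert
{
    public int Id { get; set; }
    public List<Schedule> Schedules { get; } = new List<Schedule>();
}

static class JoinVersusAnyDemo
{
    public static List<Cert> SampleCerts(DateTime now)
    {
        var cert = new Cert { Id = 1 };
        cert.Schedules.Add(new Schedule { CertId = 1, EndDate = now.AddDays(1) });
        cert.Schedules.Add(new Schedule { CertId = 1, EndDate = now.AddDays(2) });
        return new List<Cert> { cert };
    }

    // Join yields one row per matching schedule, so a certificate can repeat.
    public static List<Cert> ViaJoin(List<Cert> certs, DateTime now) =>
        certs.Join(certs.SelectMany(c => c.Schedules),
                   c => c.Id, s => s.CertId, (c, s) => new { c, s })
             .Where(x => x.s.EndDate >= now)
             .Select(x => x.c)
             .ToList();

    // Any asks "is there at least one?", so each certificate appears at most once.
    public static List<Cert> ViaAny(List<Cert> certs, DateTime now) =>
        certs.Where(c => c.Schedules.Any(s => s.EndDate >= now)).ToList();

    static void Main()
    {
        var now = new DateTime(2020, 1, 1);
        var certs = SampleCerts(now);
        Console.WriteLine(ViaJoin(certs, now).Count);            // 2
        Console.WriteLine(ViaJoin(certs, now).Distinct().Count()); // 1
        Console.WriteLine(ViaAny(certs, now).Count);             // 1
    }
}
```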

Using custom methods that return IQueryable in the LINQ to entities

I have two methods that return IQueryable:
IQueryable<Person> GetGoodPeople();
and
IQueryable<Person> GetBadPeople();
I need to write this query:
var q = from x in GetGoodPeople()
from y in GetBadPeople()
select new { Good = x, Bad = y };
The above code is not supported in LINQ to Entities (a NotSupportedException is thrown) unless I declare a variable and use it in the query:
var bad = GetBadPeople();
var q = from x in GetGoodPeople()
from y in bad
select new { Good = x, Bad = y };
Is there a way that I can use IQueryable methods in the linq to entities directly?
Short answer - it's not feasible. Your fix is the correct way to solve the problem.
Once Entity Framework (and LINQ to SQL as well) begins parsing the expression tree, it's too late. The call to GetBadPeople() is not executed when the query is built; it is recorded in the expression tree as a method call, and the provider then tries, and fails, to convert that call into SQL.
Here's what it may look like:
Table(Person).Take(1).SelectMany(x => value(UserQuery).GetBadPeople(), (x, y) => new <>f__AnonymousType0`2(Good = x, Bad = y))
Here, I've written GetGoodPeople() as simply returning People.Take(1). Note how that query is verbatim, but GetBadPeople() contains a function call.
Your workaround of evaluating GetBadPeople() outside of the expression is the correct solution. The expression tree then refers to the already-evaluated variable bad, which the provider can read as a constant, rather than attempting to invoke GetBadPeople().
That makes the query look like this:
Table(Person).Take(1).SelectMany(x => value(UserQuery+<>c__DisplayClass1_0).bad, (x, y) => new <>f__AnonymousType0`2(Good = x, Bad = y))
Note there's no method invocation here - we simply pass in the variable.
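The difference between the two trees can be inspected without any database. The sketch below uses a hypothetical GetBad() method over an in-memory array in place of GetBadPeople(), and looks only at the node type of each lambda body.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

static class TreeShapeDemo
{
    // Hypothetical stand-in for GetBadPeople().
    static IQueryable<int> GetBad() => new[] { 1, 2 }.AsQueryable();

    // The method call is recorded in the tree, not executed.
    public static ExpressionType DirectCallShape()
    {
        Expression<Func<int, IQueryable<int>>> direct = x => GetBad();
        return direct.Body.NodeType;
    }

    // The method runs first; the tree only reads the captured local.
    public static ExpressionType CapturedVariableShape()
    {
        var bad = GetBad();
        Expression<Func<int, IQueryable<int>>> captured = x => bad;
        return captured.Body.NodeType;
    }

    static void Main()
    {
        Console.WriteLine(DirectCallShape());       // Call
        Console.WriteLine(CapturedVariableShape()); // MemberAccess
    }
}
```

A Call node is something the provider has to translate, while a MemberAccess on a captured variable is something it can simply evaluate to get the IQueryable.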
You can approximate a Cartesian product by using an unconstrained join. It doesn't seem to be susceptible to the NotSupportedException. I checked the backend and it renders a single SQL statement.
var q = from x in GetGoodPeople()
join y in GetBadPeople()
on 1 equals 1
select new { Good = x, Bad = y };

C# LINQ to Entities does not recognize the method 'Boolean'

I have the following linq expression in lambda syntax:
var myValue = 6;
var from = 2;
var to = 8;
var res = MyList.Where(m => m.person.Id == person.Id
&& IsBetween(myValue, from, to))
.Select(x => new Person { blah blah blah })
.ToList();
IsBetween is simple generic helper method to see whether I have something in between:
public bool IsBetween<T>(T element, T start, T end)
{
return Comparer<T>.Default.Compare(element, start) >= 0
&& Comparer<T>.Default.Compare(element, end) <= 0;
}
Now I get this error, and I don't know how to get around it:
LINQ to Entities does not recognize the method 'Boolean IsBetween[Decimal](System.Decimal, System.Decimal, System.Decimal)' method, and this method cannot be translated into a store expression.
You cannot call arbitrary methods from within a LINQ to Entities query, as the query is executed within the SQL database engine. You can only call methods which the framework can translate into equivalent SQL.
If you need to call an arbitrary method, the query operator that calls it must be preceded by an AsEnumerable() operator so that the call happens client-side. Be aware that by doing this, all results of the query to the left of AsEnumerable() will potentially be loaded into memory and processed there.
In cases where the method you are calling is short enough, I would simply inline the logic. In your case, you would also need to drop the Comparer calls, and IsBetween(myValue, from, to) would simply become myValue >= from && myValue <= to.
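As a quick sanity check that the inlined comparison behaves like the helper, here is a sketch over an in-memory list of decimals (the values are invented, and low/high are used as names instead of from/to). It runs in LINQ to Objects, where both forms are allowed; in a LINQ to Entities query only the inlined form would be translatable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class InlineDemo
{
    // The helper from the question, unchanged in behavior.
    static bool IsBetween<T>(T element, T start, T end) =>
        Comparer<T>.Default.Compare(element, start) >= 0 &&
        Comparer<T>.Default.Compare(element, end) <= 0;

    public static List<decimal> ViaHelper(List<decimal> values, decimal low, decimal high) =>
        values.Where(v => IsBetween(v, low, high)).ToList();

    // Inlined form: plain comparison operators only.
    public static List<decimal> Inlined(List<decimal> values, decimal low, decimal high) =>
        values.Where(v => v >= low && v <= high).ToList();

    static void Main()
    {
        var values = new List<decimal> { 1m, 2m, 6m, 8m, 9m };
        var inlined = Inlined(values, 2m, 8m);
        Console.WriteLine(inlined.Count);                                  // 3
        Console.WriteLine(ViaHelper(values, 2m, 8m).SequenceEqual(inlined)); // True
    }
}
```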
In addition to this, if you want to pass values from MyList to the IsBetween method, take a wrapper class (here Person) that contains the properties to be passed to the method, and do something like this:
var res = MyList.Where(m => m.person.Id == person.Id)
.Select(x => new Person { p1 = x.p1, p2 = x.p2 })
.AsEnumerable()
.Where(x => (IsBetween(x.p1, x.p2)))
.ToList();

How to refactor LINQ select statement

Given the following LINQ statement, can someone tell me if it is possible to refactor the select portion into an expression tree? I have not used expression trees before and have not been able to find much information regarding Selects. Note this is to be translated into SQL and run inside SQL Server, not in memory.
var results = db.Widgets
.Select(w => new
{
Name = (w is x) ? "Widget A" : "Widget B"
});
I would like to be able to do this..
var name = [INSERT REUSABLE EXPRESSION]
var somethingElse = [INSERT REUSABLE EXPRESSION]
var results = db.Widgets.Select(w => new { Name = name, SomethingElse = somethingElse });
Obviously the intended use is for more complex statements.
You can do this using LinqKit. It'll work as long as your method is translatable to SQL. This is essentially what a complete example might look like:
public static class ReusableMethods
{
public static Expression<Func<Person, int>> GetAge()
{
return p => p.Age;
}
}
var getAge = ReusableMethods.GetAge();
var ageQuery = from p in People.AsExpandable()
select getAge.Invoke(p);
Note that:
You need to add AsExpandable() to your IQueryable.
You must assign the expression returned by your method to a local variable before using it (not sure about the exact reason why, but it's a must).
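For the simpler case where the reusable piece is the entire selector, the standard Queryable.Select overload already accepts an Expression<Func<TSource, TResult>>, so no extra library is involved. The sketch below assumes a hypothetical Person class and uses an in-memory array wrapped with AsQueryable() in place of a real database table.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical entity for the example.
class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

static class ReusableSelectors
{
    // Stored as an expression tree, so a query provider can inspect and translate it.
    public static readonly Expression<Func<Person, int>> Age = p => p.Age;
}

static class SelectorDemo
{
    public static int[] Ages(IQueryable<Person> people) =>
        people.Select(ReusableSelectors.Age).ToArray();

    static void Main()
    {
        var people = new[]
        {
            new Person { Name = "Ann", Age = 30 },
            new Person { Name = "Bob", Age = 41 },
        }.AsQueryable();

        Console.WriteLine(string.Join(",", Ages(people))); // 30,41
    }
}
```

This does not cover composing several reusable expressions inside one anonymous-type projection, which is the case the LinqKit approach above is aimed at.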

Calling a method inside a Linq query

I want to insert into my table a column named 'S' that gets a string value based on a value from a table column.
For example: for each ID (a.z) I want to get its string value, which is stored in another table. The string value is returned from another method that gets it through a LINQ query.
Is it possible to call a method from Linq?
Should I do everything in the same query?
This is the structure of the information I need to get:
a.z is the ID in the first square in table #1, from this ID I get another id in table #2, and from that I can get my string value that I need to display under column 'S'.
var q = (from a in v.A join b in v.B
on a.i equals b.j
where a.k == "aaa" && a.h == 0
select new {T = a.i, S = someMethod(a.z).ToString()});
return q;
The line S = someMethod(a.z).ToString() causes the following error:
Unable to cast object of type 'System.Data.Linq.SqlClient.SqlColumn'
to type 'System.Data.Linq.SqlClient.SqlMethodCall'.
You have to execute your method call in Linq-to-Objects context, because on the database side that method call will not make sense - you can do this using AsEnumerable() - basically the rest of the query will then be evaluated as an in memory collection using Linq-to-Objects and you can use method calls as expected:
var q = (from a in v.A join b in v.B
on a.i equals b.j
where a.k == "aaa" && a.h == 0
select new {T = a.i, Z = a.z })
.AsEnumerable()
.Select(x => new { T = x.T, S = someMethod(x.Z).ToString() });
You'll want to split it up into two statements. Return the results from the query (which is what will hit the database), and then enumerate the results a second time in a separate step to transform the translation into the new object list. This second "query" won't hit the database, so you'll be able to use the someMethod() inside it.
Linq-to-Entities is a bit of a strange thing, because it makes the transition to querying the database from C# extremely seamless: but you always have to remind yourself, "This C# is going to get translated into some SQL." And as a result, you have to ask yourself, "Can all this C# actually get executed as SQL?" If it can't - if you're calling someMethod() inside it - your query is going to have problems. And the usual solution is to split it up.
(The other answer from @BrokenGlass, using .AsEnumerable(), is basically another way to do just that.)
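The two-statement split described above might look like the following sketch. Everything here is a stand-in: the in-memory array wrapped with AsQueryable() plays the role of the database table, the property names are invented, and SomeMethod is a hypothetical replacement for the asker's someMethod.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SplitQueryDemo
{
    // Hypothetical stand-in for the asker's someMethod.
    static string SomeMethod(int z) => "item-" + z;

    public static List<string> Run()
    {
        // In-memory stand-in for the database table.
        var table = new[]
        {
            new { I = 1, Z = 10, K = "aaa" },
            new { I = 2, Z = 20, K = "bbb" },
        }.AsQueryable();

        // Statement 1: only translatable operations; ToList() runs the query.
        var rows = (from a in table
                    where a.K == "aaa"
                    select new { T = a.I, a.Z }).ToList();

        // Statement 2: runs in memory, so any C# method can be called.
        return rows.Select(r => SomeMethod(r.Z)).ToList();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",", Run())); // item-10
    }
}
```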
This is an old question, but I see nobody has mentioned one "hack" that allows you to call methods during the select without reiterating. The idea is to use a constructor: inside the constructor you can call whatever you wish (at least it works fine in LINQ with NHibernate; I'm not sure about LINQ2SQL or EF, but I guess it should be the same).
Below is the source code of a benchmark program. In my case the reiterating approach is about half as fast as the constructor approach, which is no wonder: my business logic was minimal, so things like iteration and memory allocation matter.
I also wish there were a better way to say that this or that part should not be executed on the database.
// Here are the results of selecting sum of 1 million ints on my machine:
// Name Iterations Percent
// reiterate 294 53.3575317604356%
// constructor 551 100%
public class A
{
public A()
{
}
public A(int b, int c)
{
Result = Sum(b, c);
}
public int Result { get; set; }
public static int Sum(int source1, int source2)
{
return source1 + source2;
}
}
class Program
{
static void Main(string[] args)
{
var range = Enumerable.Range(1, 1000000).ToList();
BenchmarkIt.Benchmark.This("reiterate", () =>
{
var tst = range
.Select(x => new { b = x, c = x })
.AsEnumerable()
.Select(x => new A
{
Result = A.Sum(x.b, x.c)
})
.ToList();
})
.Against.This("constructor", () =>
{
var tst = range
.Select(x => new A(x, x))
.ToList();
})
.For(60)
.Seconds()
.PrintComparison();
Console.ReadKey();
}
}
