Can these two LINQ queries be used interchangeably? - c#

a) Would the following two queries produce the same results:
var query1 = collection_1
.SelectMany(c_1 => c_1.collection_2)
.SelectMany(c_2 => c_2.collection_3)
.Select(c_3 => c_3);
var query2 = collection_1
.SelectMany(c_1 => c_1.collection_2
.SelectMany(c_2 => c_2.collection_3.Select(c_3 => c_3)));
b) I assume the two queries can't always be used interchangeably? For example, if we wanted the output elements to also contain values of c_1 and c_2, then we only achieve this with query2, but not with query1:
var query2 = collection_1
.SelectMany(c_1 => c_1.collection_2
.SelectMany(c_2 => c_2.collection_3.Select(c_3 => new { c_1, c_2, c_3 } )));
?
Thank you

The snippets you've given seem to be invalid. c_3 isn't defined in the scope of the Select statement, so unless I've misunderstood something, this won't compile.
It seems as though you're trying to select the elements of collection_3, but this is done implicitly by SelectMany, and so the final Select statements in both cases are redundant. Take them out, and the two queries are equivalent.
All you need is this:
var query = collection_1
.SelectMany(c_1 => c_1.collection_2)
.SelectMany(c_2 => c_2.collection_3);
Update: x => x is the identity mapping, so Select(x => x) is always redundant, regardless of the context. It just means "for every element in the sequence, select the element".
The second snippet is of course different, and the SelectMany and Select statements indeed need to be nested in order to select all three elements, c_1, c_2, and c_3.
As Gert says, though, you're probably better off using query comprehension syntax. It's much more succinct and makes it easier to mentally parse the workings of a query.
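To make the equivalence concrete, here is a minimal sketch using LINQ to Objects. The anonymous-type data and the names Collection2 and Collection3 are made up for illustration; only the shape of the two queries comes from the question.

```csharp
using System;
using System.Linq;

// Hypothetical three-level data: outer items -> Collection2 -> Collection3 (ints).
var collection1 = new[]
{
    new { Name = "a", Collection2 = new[] { new { Collection3 = new[] { 1, 2 } }, new { Collection3 = new[] { 3 } } } },
    new { Name = "b", Collection2 = new[] { new { Collection3 = new[] { 4, 5 } } } },
};

// Flat form: two chained SelectMany calls, no trailing Select(x => x).
var flat = collection1
    .SelectMany(c1 => c1.Collection2)
    .SelectMany(c2 => c2.Collection3)
    .ToList();

// Nested form: the inner SelectMany sits inside the outer lambda.
var nested = collection1
    .SelectMany(c1 => c1.Collection2.SelectMany(c2 => c2.Collection3))
    .ToList();

Console.WriteLine(string.Join(",", flat));      // 1,2,3,4,5
Console.WriteLine(flat.SequenceEqual(nested));  // True
```

The nested form only becomes necessary when the inner lambda needs to see c1 or c2.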

a. The queries are equal because in both cases you end up with all c_3's in c_1 through c_2.
b. You can't get to c_1 and c_2 with these queries as you suggest. If you want that, you need the overload of SelectMany that takes a result selector. This "fluent" syntax is quite clumsy though. This is typically a case where query comprehension syntax, which does the same thing, is much better:
from c_1 in collection_1
from c_2 in c_1.collection_2
from c_3 in c_2.collection_3
select new { c_1.x, c_2.y, c_3.z }
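For comparison, here is a rough sketch of the fluent version next to the comprehension version, run over in-memory data. The property names X, Y, Z and the sample values are assumptions made for the example; the SelectMany overload used is the one that takes a collection selector plus a result selector.

```csharp
using System;
using System.Linq;

// Hypothetical data; X, Y and the nested collection names are illustrative only.
var collection1 = new[]
{
    new { X = "a", Collection2 = new[] { new { Y = 10, Collection3 = new[] { 1, 2 } } } },
    new { X = "b", Collection2 = new[] { new { Y = 20, Collection3 = new[] { 3 } } } },
};

// Fluent syntax: the result selector carries c1 and c2 forward to the next step.
var fluent = collection1
    .SelectMany(c1 => c1.Collection2, (c1, c2) => new { c1, c2 })
    .SelectMany(pair => pair.c2.Collection3, (pair, c3) => new { pair.c1.X, pair.c2.Y, Z = c3 })
    .ToList();

// Comprehension syntax: all three range variables stay in scope.
var comprehension =
    (from c1 in collection1
     from c2 in c1.Collection2
     from c3 in c2.Collection3
     select new { c1.X, c2.Y, Z = c3 }).ToList();

Console.WriteLine(fluent.Count);                         // 3
Console.WriteLine(fluent.SequenceEqual(comprehension));  // True
```

The comparison works because anonymous types with the same property names and types share one type and compare by value.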


C#, lambda : How are redundant calls handled?

I'm curious about how the compiler handles the following expression:
var collapsed = elements.GroupBy(elm => elm.OrderIdentifier).Select(group => new ModelsBase.Laser.Element()
{
CuttingDurationInSeconds = group.Sum(itm => itm.CuttingDurationInSeconds),
FladderDurationInSeconds = group.Sum(itm => itm.FladderDurationInSeconds),
DeliveryDate = group.Min(itm => itm.DeliveryDate),
EfterFladderOpstilTid = group.First().EfterFladderOpstilTid,
EfterRadanOpstilTid = group.First().EfterRadanOpstilTid,
});
As you can see, I'm using group.Sum twice. Does anyone know whether the "group" list will be iterated twice to get both sums, or will it be optimized so that there is only one complete iteration of the list?
LINQ is most often not the best way to reach high performance. What you get is programming productivity: a result without many lines of code.
The possibilities for optimization are limited. For queries against SQL, there is one rule of thumb: one query is better than two.
1) there is only one round trip to the SQL_Server
2) SQL Server is made to optimize those queries, and the optimization gets better if the server knows what you want to do in the next step. Optimization is done per query, not across multiple queries.
In case of Linq to Objects, there is absolutely no gain in building huge queries.
As your example shows, it will probably cause multiple iterations. You keep your code simpler and easier to read - but you give up control and therefore performance.
The compiler certainly won't optimize any of that.
If this is using LINQ to Objects, and therefore delegates, each group will be enumerated once per call: five times for the five properties (although First() stops after the first element).
If this is using LINQ to SQL, Entity Framework or something similar, and therefore expression trees, then it's basically up to the query provider to optimize this appropriately.
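If the repeated enumeration matters in LINQ to Objects, one option is to fold both sums into a single pass with Aggregate. This is only a sketch: the sample data is invented, the tuple accumulator is my own choice, and only the property names come from the question.

```csharp
using System;
using System.Linq;

// Hypothetical elements; only the property names mirror the question.
var elements = new[]
{
    new { OrderIdentifier = "A", CuttingDurationInSeconds = 10, FladderDurationInSeconds = 1 },
    new { OrderIdentifier = "A", CuttingDurationInSeconds = 20, FladderDurationInSeconds = 2 },
    new { OrderIdentifier = "B", CuttingDurationInSeconds = 5,  FladderDurationInSeconds = 3 },
};

// One Aggregate call per group walks the group once and accumulates both sums.
var collapsed = elements
    .GroupBy(e => e.OrderIdentifier)
    .Select(g => g.Aggregate(
        (Order: g.Key, Cutting: 0, Fladder: 0),
        (acc, e) => (Order: acc.Order,
                     Cutting: acc.Cutting + e.CuttingDurationInSeconds,
                     Fladder: acc.Fladder + e.FladderDurationInSeconds)))
    .ToList();

foreach (var row in collapsed)
    Console.WriteLine($"{row.Order}: {row.Cutting} / {row.Fladder}");
// A: 30 / 3
// B: 5 / 3
```

Whether this is worth the loss in readability depends on how large the groups are; for small groups the separate Sum calls are usually fine.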
You can optimise your request by adding two fields to the grouping key:
var collapsed = elements.GroupBy(elm => new{
OrderIdentifier=elm.OrderIdentifier,
EfterFladderOpstilTid=elm.EfterFladderOpstilTid,
EfterRadanOpstilTid=elm.EfterRadanOpstilTid
})
.Select(group => new ModelsBase.Laser.Element()
{
CuttingDurationInSeconds = group.Sum(itm => itm.CuttingDurationInSeconds),
FladderDurationInSeconds = group.Sum(itm => itm.FladderDurationInSeconds),
DeliveryDate = group.Min(itm => itm.DeliveryDate),
EfterFladderOpstilTid = group.Key.EfterFladderOpstilTid,
EfterRadanOpstilTid = group.Key.EfterRadanOpstilTid,
});
Or by using a let clause:
var collapsed = from groupedElement in
(from element in elements
group element by element.OrderIdentifier into g
select g)
let First = groupedElement.First()
select new ModelsBase.Laser.Element()
{
CuttingDurationInSeconds = groupedElement.Sum(itm => itm.CuttingDurationInSeconds),
FladderDurationInSeconds = groupedElement.Sum(itm => itm.FladderDurationInSeconds),
DeliveryDate = groupedElement.Min(itm => itm.DeliveryDate),
EfterFladderOpstilTid = First.EfterFladderOpstilTid,
EfterRadanOpstilTid = First.EfterRadanOpstilTid
};

Entity Framework (using In and Select Distinct)

I am relatively new to Entity Framework 6.0 and I have come across a situation where I want to execute a query in my C# app that would be similar to this SQL Query:
select * from periods where id in (select distinct periodid from ratedetails where rateid = 3)
Is it actually possible to execute a query like this in EF or would I need to break it into smaller steps?
Assuming that you have in your Context class:
DbSet<Period> Periods...
DbSet<RateDetail> RateDetails...
You could use some Linq like this:
var distincts = dbContext.RateDetails
.Where(i => i.rateId == 3)
.Select(i => i.PeriodId)
.Distinct();
// Keep only the periods whose Id appears in the distinct PeriodId list.
var result = dbContext.Periods
.Where(p => distincts.Contains(p.Id));
Edit: Depending on your entities, you will probably need a custom comparer for Distinct(), or some more LINQ magic to split the results.
Yes, this can be done, but you should really provide a better example for your query. You are already providing a bad starting point there. Let's use this one:
SELECT value1, value2, commonValue
FROM table1
WHERE EXISTS (
SELECT 1
FROM table2
WHERE table1.commonValue = table2.commonValue
-- include some more filters here on table2
)
First, it's almost always better to use EXISTS instead of IN.
Now to turn this into a Lambda would be something like this, again you provided no objects or object graph so I will just make something up.
DbContext myContext = this.getContext();
// DbContext exposes entity sets through Set<T>().
var myResults = myContext.Set<Type1>().Where(x => myContext.Set<Type2>().Any(y => y.commonValue == x.commonValue)).Select(x => x);
EDIT - updated after you provided the new sql statement
Using your example objects this would produce the best result. Again, this is more efficient than a Contains which translates to an IN clause.
Sql you really want:
SELECT *
FROM periods
WHERE EXISTS (SELECT 1 FROM ratedetails WHERE rateid = 3 AND periods.id = ratedetails.periodid)
The lambda statement you are after:
DbContext myContext = this.getContext();
var myResults = myContext.Set<Periods>()
.Where(x => myContext.Set<RateDetails>().Any(y => y.periodid == x.id && y.rateid == 3))
.Select(x => x);
Here is a good starting point for learning about lambdas and how to use them.
Lambda Expressions (C# Programming Guide).
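As a quick sanity check that the Contains form and the Any form select the same rows, here is a small LINQ to Objects sketch. The in-memory arrays stand in for the periods and ratedetails tables and are invented for the example; it says nothing about the SQL that Entity Framework would generate for either form.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-ins for the two tables.
var periods = new[] { new { Id = 1 }, new { Id = 2 }, new { Id = 3 } };
var rateDetails = new[]
{
    new { RateId = 3, PeriodId = 1 },
    new { RateId = 3, PeriodId = 1 },
    new { RateId = 3, PeriodId = 3 },
    new { RateId = 7, PeriodId = 2 },
};

// IN-style: build the distinct id list, then test membership.
var periodIds = rateDetails.Where(r => r.RateId == 3).Select(r => r.PeriodId).Distinct();
var viaContains = periods.Where(p => periodIds.Contains(p.Id)).Select(p => p.Id).ToList();

// EXISTS-style: ask whether any matching detail row exists for each period.
var viaAny = periods
    .Where(p => rateDetails.Any(r => r.RateId == 3 && r.PeriodId == p.Id))
    .Select(p => p.Id)
    .ToList();

Console.WriteLine(string.Join(",", viaContains));  // 1,3
Console.WriteLine(viaContains.SequenceEqual(viaAny));  // True
```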
This is the inner (second) where clause of your query:
var periodIdList = ratedetails.Where(x => x.rateid == 3).Select(x => x.periodid).Distinct();
Now for the first part of the query:
var selected = periods.Where(p => periodIdList.Contains(p.id))
.ToList();

How can I Skip and Take objects until I have 10 distinct ones in LINQ to Entities?

This is the problematic code:
var distinctCatNames = allCats.Select(c => c.CatName).Distinct();
if (skip.HasValue) distinctCatNames = distinctCatNames.Skip(skip.Value);
if (take.HasValue) distinctCatNames = distinctCatNames.Take(take.Value);
var distinctCatNameList = distinctCatNames.ToList();
If you imagine I have a list of 100 cats, I want to select the 10 distinct names. It's going into a paged list so it has to use skip and take.
The above won't work, because it has to be ordered with OrderBy.
If I put the OrderBy after the distinct, I can't do Skip and Take because the result is an IOrderedQueryable, not an IQueryable (compiler error).
If I do it before, the error says DbSortClause expressions must have a type that is order comparable.
I need to make sure that under the hood it's translating my query properly, because there may be a lot of cats so I want to ensure it generates SQL that incorporates the skip/take in the query rather than getting ALL cats and then doing it on that collection.
Any ideas?
You need to order the items but then simply type the variable you store it in as an IQueryable, rather than an IOrderedQueryable:
var distinctCatNames = allCats.Select(c => c.CatName)
.Distinct()
.OrderBy(name => name)
.AsQueryable();
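An alternative to the trailing AsQueryable() is to declare the variable as IQueryable<string> explicitly, so that the later Skip and Take results can be assigned back to it. The sketch below runs against an in-memory array wrapped with AsQueryable(); the cat data and the skip/take values are made up, and it does not show what SQL a real provider would emit.

```csharp
using System;
using System.Linq;

// Hypothetical cats; AsQueryable() stands in for the real EF query source.
var allCats = new[]
{
    new { CatName = "Tom" }, new { CatName = "Felix" }, new { CatName = "Tom" },
    new { CatName = "Salem" }, new { CatName = "Garfield" }, new { CatName = "Felix" },
}.AsQueryable();

int? skip = 1;
int? take = 2;

// Declaring the variable as IQueryable<string> avoids inferring IOrderedQueryable<string>.
IQueryable<string> distinctCatNames = allCats
    .Select(c => c.CatName)
    .Distinct()
    .OrderBy(name => name);

if (skip.HasValue) distinctCatNames = distinctCatNames.Skip(skip.Value);
if (take.HasValue) distinctCatNames = distinctCatNames.Take(take.Value);

var page = distinctCatNames.ToList();
Console.WriteLine(string.Join(",", page));  // Garfield,Salem
```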

EF: how to reuse same filter in both standard query and selectMany

I am new to EF and LINQ.
The following two pieces of code work:
dbContext.Categories.Where(cat => [...big ugly code for filtering...] );
and
dbContext.Products.Where(prod => prod.PROD_UID == 1234)
.SelectMany(prod => prod.Categories.Where(
cat => [...big ugly code for filtering...] ));
But I want somehow to create only one, reusable, expression or delegate for my filter. I have the following:
private static Expression<Func<Category, bool>> Filter(filter)
{
return cat => [...big ugly code for filtering...] ;
}
but I cannot use it in SelectMany.
I am aware that:
Where clause of standard query accepts Expression<Func<Category,bool>> and returns IQueryable<Category>
Where clause of SelectMany accepts Func<Category,bool> and returns IEnumerable<Category>.
What is the best way to accomplish this? Are any tricks here?
PS: I want in the end to get all categories of a product.
It looks like you're trying to use SelectMany as a filter. SelectMany is used to flatten a collection of collections (or a collection of a type that contains another collection) into one flat collection.
I think what you want is:
dbContext.Products.Where(prod => prod.PROD_UID == 1234)
.SelectMany(prod => prod.Categories)
.Where(filter);
In which case you can reuse the same expression to filter.
EDIT
Based on your updated question it looks like you are applying Where to an IEnumerable<T> property, so the compiler is binding to IEnumerable.Where which takes a Func instead of an Expression.
You should be able to just call AsQueryable() on your collection property to bind to IQueryable.Where():
dbContext.Products.Where(prod => prod.PROD_UID == 1234)
.SelectMany(prod => prod.Categories
.AsQueryable()
.Where(filter));
The next option would be to compile the expression to turn it into a Func:
dbContext.Products.Where(prod => prod.PROD_UID == 1234)
.SelectMany(prod => prod.Categories
.Where(filter.Compile()));
But it wouldn't surprise me if the underlying data provider isn't able to translate that to SQL.
All you need to do is call the Filter function before executing the query and store it in a local variable. When the query provider sees a method it attempts to translate that method into SQL, rather than executing the method and using the result. Whenever it encounters a local variable it doesn't attempt to translate it into SQL but rather evaluates the variable to its value, and then uses that value in the query.
As for the problems that you're having due to the fact that the relationship collection isn't an IQueryable, it's probably best to simply approach the query differently and just pull directly from the categories list instead:
var filter = Filter();
dbContext.Categories.Where(filter)
.Where(cat => cat.Product.PROD_UID == 1234);
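Here is a small sketch of the "store the expression in a local variable and reuse it" idea, run against in-memory data through AsQueryable(). The string categories, the sample products and the StartsWith predicate are placeholders for the real entities and the big filter; with a real provider the same expression object would be translated to SQL instead.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// One reusable expression, held in a local variable (placeholder for the big filter).
Expression<Func<string, bool>> filter = name => name.StartsWith("B");

// Hypothetical data: categories are plain strings to keep the sketch short.
var categories = new[] { "Books", "Bikes", "Toys" }.AsQueryable();
var products = new[]
{
    new { PROD_UID = 1234, Categories = new[] { "Books", "Toys" } },
    new { PROD_UID = 5678, Categories = new[] { "Bikes" } },
}.AsQueryable();

// Use 1: directly on the category set.
var direct = categories.Where(filter).ToList();

// Use 2: flatten first, then filter, so Where still binds to the IQueryable overload.
var viaProduct = products
    .Where(prod => prod.PROD_UID == 1234)
    .SelectMany(prod => prod.Categories)
    .Where(filter)
    .ToList();

Console.WriteLine(string.Join(",", direct));      // Books,Bikes
Console.WriteLine(string.Join(",", viaProduct));  // Books
```

Because the filter is applied after SelectMany rather than inside its lambda, there is no need for Compile() or AsQueryable() on the navigation collection.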
After analyzing in more detail the SQL generated by EF, I realized that having the filter inside SelectMany is not more efficient.
So, both suggestions (initial one from DStanley and Servy's) should be ok for my case (many-to-many relation between Categories and Products)
/* 1 */
dbContext.Products.Where(prod => prod.PROD_UID == 1234)
.SelectMany(prod => prod.Categories)
.Where( Filter ); // reusable expression
This results in an INNER JOIN.
/* 2 */
dbContext.Categories.Where( Filter ) // reusable expression
.Where(c => c.Products.Where(prod => prod.PROD_UID == 1234).Any());
This results in an EXISTS (sub-select).
The execution plan seems to be absolutely identical for both in my case; so, I will choose for now #2, and will keep an eye on performance.

LINQ - Using Select - understanding select

I find LINQ a little difficult to wrap my head around. I like the concept and believe it has lots of potential. But, after writing so much SQL the syntax is just not easy for me to swallow.
A. What is the deal with multiple ways to select?
I see that I am able to create a context and perform a Select() using a method.
context.Table.Select(lambda expression);
ok...Why would I use this? How does it compare to (or does it) this type of select?
var returnVal = from o in context.Table
orderby o.Column
select o;
B. Please explain the variable nature of
from X in context.Table
Why do we stick a seemingly arbitrarily named variable here? Shouldn't this be a known type of type <Table>?
So...
var returnVal = context.Table.Select(o => o);
and
var returnVal = from o in context.Table
select o;
are the same. In the second case, C# just has nice syntactic sugar to give you something closer to normal SQL syntax. Notice I removed the orderby from your second query. If you wanted that in there, then the first one would become:
var returnVal = context.Table.OrderBy(o => o.Column).Select(o => o);
As for your last question... we're not sticking an arbitrarily named variable here. We're giving a name to each row so that we can reference it later on in the statement. It is implicitly typed because the system knows what type Table contains.
In response to your comment, I wanted to add one more thought. You mentioned things getting nasty with the normal method calls. It really can. Here's a simple example where it's immediately much cleaner (at least, if you're used to SQL syntax) in the LINQ syntax:
var returnVal = context.Table.OrderBy(o => o.Column1)
.ThenBy(o => o.Column2)
.ThenBy(o => o.Column3)
.Select(o => o);
versus
var returnVal = from o in context.Table
orderby o.Column1, o.Column2, o.Column3
select o;
A: this is the same. The compiler transforms the query expression to method calls. Exactly the same.
B: The x is the same as in foreach(var X in context.Table). You define a name for an individual element of the table/sequence.
In B, X's type is implicit. You could just as easily do something like:
from Row x in context.Table
and it would be the same. In A, there isn't any difference between using a lambda and the equivalent full-LINQ syntax, except that you would never do .Select(x => x). It's for transforming items. Say you had a list of integers, .Select(x => x * x) would return the square of each of them.
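A tiny sketch of that last point, using an in-memory array of integers (the numbers are arbitrary): the method call and the query expression produce the same squared values.

```csharp
using System;
using System.Linq;

var numbers = new[] { 1, 2, 3, 4 };

// Method syntax: Select transforms each element.
var viaMethod = numbers.Select(x => x * x).ToList();

// Query syntax: the compiler turns this into the same Select call.
var viaQuery = (from x in numbers select x * x).ToList();

Console.WriteLine(string.Join(",", viaMethod));       // 1,4,9,16
Console.WriteLine(viaMethod.SequenceEqual(viaQuery)); // True
```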
