Linq !Contains() alternative - c#

I'm trying to improve the performance of a query that uses linq and fluent API.
The query looks like this:
var result = DbContext.Set<Parent>()
.Join(DbContext.Set<Child>(),
(parent) => parent.Id,
(child) => child.ParentId,
(parent, child) => child)
.Select(x => new { x.SomeOtherId, x.AnotherId })
.Where(x => !filters.Contains(x.SomeOtherId))
.Select(x => x.AnotherId )
.Distinct()
.ToListAsync(cancellationToken);
As I understand it, .Contains degrades the performance of queries such as this one, and using a Join is better, e.g.:
.Join(filters, (c) => c.SomeOtherId, (filterId) => filterId, (c, filterId) => c.AnotherId);
This returns the rows where there is a match. However, how would I find the rows where there is not a match?
Thanks
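One way to express the "no match" case as a join is a left outer join (GroupJoin) that keeps only the rows with no matching filter value. The sketch below runs against in-memory collections (LINQ to Objects). The Row type and the AntiJoin.Unmatched helper are hypothetical names invented for illustration, and whether EF Core can translate a join against a local list depends on the version and provider, so treat this as the shape of the idea rather than a drop-in replacement for the query above.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical row shape mirroring the anonymous type in the question.
public record Row(int SomeOtherId, int AnotherId);

public static class AntiJoin
{
    // Returns the distinct AnotherId values of rows whose SomeOtherId
    // does NOT appear in filters (a left anti-join).
    public static List<int> Unmatched(IEnumerable<Row> rows, IEnumerable<int> filters)
    {
        return rows
            // Left outer join: every row is paired with its (possibly empty) set of matches.
            .GroupJoin(filters,
                row => row.SomeOtherId,
                filterId => filterId,
                (row, matches) => new { row, HasMatch = matches.Any() })
            // Keep only the rows that found no partner in filters.
            .Where(x => !x.HasMatch)
            .Select(x => x.row.AnotherId)
            .Distinct()
            .ToList();
    }
}
```

Calling AntiJoin.Unmatched(rows, filters) gives the same result as the !filters.Contains(...) version. It is worth measuring both forms against the real database before assuming the join is faster.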

Related

EF Core Linq-to-Sql GroupBy SelectMany not working with SQL Server

I am trying the following Linq with LinqPad connecting to SQL Server with EF Core:
MyTable.GroupBy(x => x.SomeField)
.OrderBy(x => x.Key)
.Take(5)
.SelectMany(x => x)
I get this error:
The LINQ expression 'x => x' could not be translated. Either rewrite the query in a form that can be translated, or switch to client evaluation explic...
However, this works:
MyTable.AsEnumerable()
.GroupBy(x => x.SomeField)
.OrderBy(x => x.Key)
.Take(5)
.SelectMany(x => x)
I was under the impression that EF Core should be able to translate such an expression.
Am I doing anything wrong?
That exception message is an EF Core message, not EF6.
In EF 6 your expression should work, though with something like a ToList() on the end. I suspect the error you are encountering is that you may be trying to do something more prior to materializing the collection, and that is conflicting with the group by SelectMany evaluation.
For instance, something like this EF might take exception to:
var results = MyTable
.GroupBy(x => x.SomeField)
.OrderBy(x => x.Key)
.Take(5)
.SelectMany(x => x)
.Select(x => new ViewModel { Id = x.Id, Name = x.Name} )
.ToList();
where something like this should work:
var results = MyTable
.GroupBy(x => x.SomeField)
.OrderBy(x => x.Key)
.Take(5)
.SelectMany(x => x.Select(y => new ViewModel { Id = y.Id, Name = y.Name} ))
.ToList();
You don't want to use:
MyTable.AsEnumerable(). ...
As this is materializing your entire table into memory, which might be ok if the table is guaranteed to remain relatively small, but if the production system grows significantly it forms a cascading performance decline over time.
Edit: Did a bit of digging, credit to this post as it does look like another limitation in EF Core's parser. (No idea how something that works in EF6 cannot be successfully integrated into EF Core... Reinventing wheels I guess)
This should work:
var results = MyTable
.GroupBy(x => x.SomeField)
.OrderBy(x => x.Key)
.Take(5)
.Select(x => x.Key)
.SelectMany(x => _context.MyTable.Where(y => y.SomeField == x)) // match rows on the grouping field
.ToList();
So for example where I had a Parent and Child table where I wanted to group by ParentId, take the top 5 parents and select all of their children:
var results = context.Children
.GroupBy(x => x.ParentId)
.OrderBy(x => x.Key) // ParentId
.Take(5)
.Select(x => x.Key) // Select the top 5 parent ID
.SelectMany(x => context.Children.Where(c => c.ParentId == x)).ToList();
EF pieces this back together by doing a SelectMany back on the DbSet against the selected group IDs.
Credit to the discussions here: How to select top N rows for each group in a Entity Framework GroupBy with EF 3.1
Edit 2: The more I look at this, the more hacky it feels. Another alternative would be to look at breaking it up into two simpler queries:
var keys = MyTable.Select(x => x.SomeField)
.Distinct() // one entry per key, as GroupBy produced; otherwise Take(5) counts rows, not keys
.OrderBy(x => x)
.Take(5)
.ToList();
var results = MyTable.Where(x => keys.Contains(x.SomeField))
.ToList();
I think that translates your original example, but the gist is to select the applicable ID/discriminating keys first, then query for the desired data using those keys. So, for my example of all children from the first 5 parents that have children:
var parentIds = context.Children
.Select(x => x.ParentId)
.Distinct() // one entry per parent, otherwise Take(5) counts child rows
.OrderBy(x => x)
.Take(5)
.ToList();
var children = context.Children
.Where(x => parentIds.Contains(x.ParentId))
.ToList();
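The two-step shape (keys first, then rows) can be tried out against an in-memory list. This is only a sketch: the Child class and the TopParents.ChildrenOfFirstParents helper are hypothetical names, and it uses LINQ to Objects rather than a DbContext. Note the Distinct() call: without it, Take(n) would count child rows rather than parents.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity with just the two fields the example needs.
public class Child
{
    public int Id { get; set; }
    public int ParentId { get; set; }
}

public static class TopParents
{
    // Returns all children that belong to the first parentCount parents
    // (ordered by ParentId) that have at least one child.
    public static List<Child> ChildrenOfFirstParents(IEnumerable<Child> children, int parentCount)
    {
        var source = children.ToList();

        // Step 1: pick the distinct parent keys we care about.
        var parentIds = source
            .Select(c => c.ParentId)
            .Distinct()
            .OrderBy(id => id)
            .Take(parentCount)
            .ToList();

        // Step 2: fetch every row whose key is in that small list.
        return source
            .Where(c => parentIds.Contains(c.ParentId))
            .ToList();
    }
}
```

Against a real database this costs two round trips, but each query is simple enough that it should translate without surprises.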
EF Core has a limitation for this kind of query, which is fixed in EF Core 6. It is a SQL limitation: there is no direct SQL translation for such a GroupBy.
When translating this GroupBy, EF Core 6 effectively creates the following query:
var results = _context.MyTable
.Select(x => new { x.SomeField })
.Distinct()
.OrderBy(x => x.SomeField)
.Take(5)
.SelectMany(x => _context.MyTable.Where(y => y.SomeField == x.SomeField))
.ToList();
It is not the most optimal query for this task, because in SQL it can be expressed with the window function ROW_NUMBER() partitioned on SomeField, and the additional JOIN can be omitted.
Also check this function, which builds such a query automatically.
_context.MyTable.TakeDistinct(5, x => x.SomeField);

EF Core 3.0 .Include does not work as expected and Super Slow

I have a LINQ query like this in EF Core 2.0. It works as is, but when I upgrade to EF Core 3.0 it always times out. I found that the issue is in query = query.Include(x => x.Questions);.
My question is: I would like to return the course with filtered questions, for example only Take(10), or with a .Where condition that only returns a certain range rather than all questions.
var query = _courseRepository.Table;
query = query.Where(x => x.Id == id);
query = query.Include(x => x.Questions);
query = query.Include(x => x.CourseYear);
query = query.Include(x => x.CourseSubject);
query = query.Include(x => x.Instructors).ThenInclude(y => y.User);
query = query.Include(x => x.Instructors).ThenInclude(y => y.Course);
query = query.Include(x => x.Instructors).ThenInclude(y => y.CourseClass);
query = query.Include(x => x.CourseSections);
query = query.Include(x => x.CourseSections).ThenInclude(y => y.Lessons);
query = query.Include(x => x.CourseClasses);
query = query.Include(x => x.UserCourses).ThenInclude(y => y.User);
var result = query.FirstOrDefault();
EF Core 3.0 changed the queries generated by .Include(), and you are experiencing the cartesian explosion problem.
Specifically, there is now the following red caution in the docs:
Caution
Since version 3.0.0, each Include will cause an additional JOIN to be
added to SQL queries produced by relational providers, whereas
previous versions generated additional SQL queries. This can
significantly change the performance of your queries, for better or
worse. In particular, LINQ queries with an exceedingly high number of
Include operators may need to be broken down into multiple separate
LINQ queries in order to avoid the cartesian explosion problem.
The solution is to execute multiple queries now per the docs.
It's very unfortunate that loading entity graphs, which is common with highly normalized data, performs so poorly, but this is the current state of EF.
See: Loading Related Data and scroll until you see red.
var query = _courseRepository.Table
.Where(x => x.Id == id) // filter first so the follow-up loads below only touch this course
.Include(x => x.Questions)
.Include(x => x.CourseClasses)
.Include(x => x.CourseYear)
.Include(x => x.CourseSubject);
var course = await query.FirstOrDefaultAsync();
query.Include(x => x.Instructors).ThenInclude(y => y.User).SelectMany(a => a.Instructors).Load();
query.Include(x => x.Instructors).ThenInclude(y => y.Course).SelectMany(a => a.Instructors).Load();
query.Include(x => x.Instructors).ThenInclude(y => y.CourseClass).SelectMany(a => a.Instructors).Load();
query.Include(x => x.CourseSections).ThenInclude(y => y.Lessons).SelectMany(a => a.CourseSections).Load();
query.Include(x => x.UserCourses).ThenInclude(y => y.User).SelectMany(a => a.UserCourses).Load();
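To get a feel for why splitting helps: when sibling collections are all JOINed into one query, every combination of child rows comes back, so the row counts multiply, whereas separate queries only add up. The back-of-the-envelope sketch below uses a hypothetical IncludeCost helper and made-up collection sizes. It is plain arithmetic, not a measurement of what EF Core actually sends over the wire.

```csharp
using System;
using System.Linq;

public static class IncludeCost
{
    // Rows returned for ONE parent when sibling collections are all JOINed
    // in a single query: counts multiply. An empty collection still yields
    // one row because of the LEFT JOIN, hence Math.Max(size, 1).
    public static long SingleQueryRows(params int[] collectionSizes) =>
        collectionSizes.Aggregate(1L, (rows, size) => rows * Math.Max(size, 1));

    // Rows returned when each collection is loaded by its own query: counts add.
    public static long SplitQueryRows(params int[] collectionSizes) =>
        collectionSizes.Sum(size => (long)size);
}
```

For example, a course with 200 questions, 10 sections and 50 user courses gives IncludeCost.SingleQueryRows(200, 10, 50) = 100000 rows in one query, versus IncludeCost.SplitQueryRows(200, 10, 50) = 260 rows across three queries.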

How to factorize a Linq request with multiple includes?

I want to simplify a LINQ query that contains multiple includes.
My model is simple: a site is linked to one contract, which is linked to one client. For that client I need to get the telephones, mails and honorifics (AppelRef) with a single request.
I want a single request because, behind the scenes, it is translated by Entity Framework into a SQL Server request.
Here is the linq request:
var search =
from IMT.Site s in imtContext.IMTObjects.OfType<IMT.Site>()
.Include(
s => s.LienContratSiteRef
.Select(l => l.Contrat)
.Select(c => c.LienContratClientRef
.Select(l => l.Client)
.Select(cl => cl.Telephones ) ) )
.Include(s => s.LienContratSiteRef
.Select(l => l.Contrat)
.Select(c => c.LienContratClientRef
.Select(l => l.Client)
.Select(cl => cl.Mails ) ) )
.Include(s => s.LienContratSiteRef
.Select(l => l.Contrat)
.Select(c => c.LienContratClientRef
.Select(l => l.Client)
.Select(cl => cl.AppelRef ) ) )
where s.Reference.ToString() == siteId
select s;
You can see that the block
.Include(
s => s.LienContratSiteRef
.Select(l => l.Contrat)
.Select(c => c.LienContratClientRef
.Select(l => l.Client)
...is repeated three times. Is there a way to factorize that code block?
Update: there are intermediary objects LienContratSiteRef and LienContratClientRef, and the relationships are 0 - *, so LienContratSiteRef.Contrat and LienContratClientRef.Client are collections.
I also tried:
.Include(
s => s.LienContratSiteRef
.Select(l => l.Contrat)
.Select(c => c.LienContratClientRef
.Select(l => l.Client)
.Select(cl => new { Tels = cl.Telephones, Mail = cl.Mails, Appel = cl.AppelRef} ) ) )
but it results in a runtime error:
The Include path expression must refer to a navigation property
defined on the type.
The s => ... inside the include could be refactored into a delegate, then just .Include that delegate multiple times.
It looks like the signature would be Func<Site, IEnumerable<Client>>?
E.g.
static IEnumerable<Client> Foo(Site site) => site.LienContratSiteRef
.Select(l => l.Contrat)
.Select(c => c.LienContratClientRef
.Select(l => l.Client));
String-based Chaining
The Include() method supports a dot-delimited string parameter, which can be used to pull a complete graph of objects down as opposed to making multiple chained Select calls:
.Include("LienContratSiteRef.Contrat.LienContratClientRef.Client")
From there, if you wanted to include multiple additional properties, I believe that you could have another include for those sub properties:
.Include("LienContratSiteRef.Contrat.LienContratClientRef.Client, Client.Telephones, ...")
Lambda-based Chaining
You should be able to accomplish something similar by chaining your lambda-based includes into a single Include() call as well, using:
.Include(c => c.LienContratSiteRef.Contrat.LienContratClientRef.Client)
It looks like you're trying to make an entity framework projection!
A projection allows you to select only the properties that you want to return (just like a SQL SELECT).
To use a projection, your code should roughly look like this:
var search = imtContext.IMTObjects.OfType<IMT.Site>()
.Where(s => s.Reference.ToString() == siteId)
.Select(s => new {
Telephones = s.LienContratSiteRef.Contrat.Select(c => c.LienContratClientRef.Client.Select(cli => cli.Telephones)),
Mails = s.LienContratSiteRef.Contrat.Select(c => c.LienContratClientRef.Client.Select(cli => cli.Mails)),
AppelRef = s.LienContratSiteRef.Contrat.Select(c => c.LienContratClientRef.Client.Select(cli => cli.AppelRef))
}).ToList();
If Telephones, Mails and AppelRef are also collections, then you can aggregate the collections together in memory, like this, after the query has been run:
var telephones = search.SelectMany(x => x.Telephones).ToList();
var mails = search.SelectMany(x => x.Mails).ToList();
var appelRefs = search.SelectMany(x => x.AppelRef).ToList();

Linq - where inside include

I know that this won't work as written, but I'm struggling to see the right answer, and this non-functional code hopefully illustrates what I'm trying to achieve:
var defaults = _cilQueryContext.DefaultCharges
.Where(dc => dc.ChargingSchedule_RowId == cs.RowId);
List<DevelopmentType> devTypes =
defaults.Select(dc => dc.DevelopmentType)
.Include(d => d.DefaultCharges)
.Include(d => d.OverrideCharges.Where(oc => oc.ChargingSchedule_RowId == cs.RowId))
.Include(d => d.OverrideCharges.Select(o => o.Zone))
.ToList();
Essentially, I had presumed this required a join, but seeing as I'm trying to select a parent object containing two related types of children, I can't see what would go in the join's "select new" clause.
As far as I am aware Include does not support this type of sub-querying. Your best option is to use projection e.g.
List<DevelopmentType> devTypes =
defaults.Include(x => x.DefaultCharges)
.Include(x => x.OverrideCharges)
.Select(x => new {
DevType = x.DevelopmentType,
Zones = x.OverrideCharges.Where(oc => oc.ChargingSchedule_RowId == cs.RowId)
.Select(oc => oc.Zone).ToList()
})
.Select(x => x.DevType)
.ToList();
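The projection idea (return each parent together with only the children that pass a filter) can be sketched against in-memory objects. Everything here is a simplification for illustration: the classes are cut down to the fields used, Zone is a plain string rather than a related entity, and DevTypeWithOverrides and FilteredChildren.ForSchedule are hypothetical names that do not appear in the question.

```csharp
using System.Collections.Generic;
using System.Linq;

// Simplified stand-ins for the entities in the question.
public class OverrideCharge
{
    public int ChargingSchedule_RowId { get; set; }
    public string Zone { get; set; } = "";
}

public class DevelopmentType
{
    public string Name { get; set; } = "";
    public List<OverrideCharge> OverrideCharges { get; set; } = new List<OverrideCharge>();
}

// Hypothetical result shape: the parent plus only its matching children.
public class DevTypeWithOverrides
{
    public DevTypeWithOverrides(DevelopmentType devType, List<OverrideCharge> overrides)
    {
        DevType = devType;
        Overrides = overrides;
    }

    public DevelopmentType DevType { get; }
    public List<OverrideCharge> Overrides { get; }
}

public static class FilteredChildren
{
    // Projects each development type alongside the override charges
    // that belong to the given charging schedule.
    public static List<DevTypeWithOverrides> ForSchedule(
        IEnumerable<DevelopmentType> devTypes, int scheduleRowId)
    {
        return devTypes
            .Select(d => new DevTypeWithOverrides(
                d,
                d.OverrideCharges
                    .Where(oc => oc.ChargingSchedule_RowId == scheduleRowId)
                    .ToList()))
            .ToList();
    }
}
```

The parent's own OverrideCharges list is left untouched. The filtered children travel next to it in the result, which is the main difference from what a filtered Include would have done.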

Subquery parent id in NHibernate QueryOver

I'm trying to understand how this may work.
What I want is to get all Trees from a given list of IDs that have apples that are not rotten.
It looks easy, but I'm new to NHibernate and not so good at SQL, and as you can see I'm stuck.
I wrote this code down here:
Tree treeitem = null;
QueryOver<Apple> qapple = QueryOver.Of<Apple>()
.Where(x => (!x.IsRotten))
.And(Restrictions.IdEq(Projections.Property<Tree>(y => y.Id)))
// Or this one...
//.And(Restrictions.EqProperty(
// Projections.Property<Apple>(y => y.Tree.Id),
// Projections.Property<Tree>(y => y.Id)))
.Select(x => x.Id);
return this.NHibernateSession.QueryOver<Tree>()
.Where(x => x.Id.IsIn(ListOfTreeId))
.WithSubquery.WhereExists<Apple>(qapple)
.SelectList(list => list
.Select(z => z.Id).WithAlias(() => treeitem.Id)
.Select(z => z.Name).WithAlias(() => treeitem.Name)
.Select(z => z.Type).WithAlias(() => treeitem.Type))
.TransformUsing(Transformers.AliasToBean<Tree>())
.List<Tree>();
And the pseudo SQL I get is something like this:
SELECT id, name, type FROM trees WHERE id IN (1, 2, 3)
AND EXIST(SELECT id FROM apples WHERE NOT rotten AND apples.idtree = apples.id)
As you can see, there's a problem with the subquery: it compares against the same table's Id instead of producing something like this:
EXIST(SELECT id FROM apples WHERE NOT rotten AND apples.idtree = tree.id)
I'm a bit lost, actually. Maybe there's another way to build this up.
Any help is welcome, thanks.
I'm not sure why you are using a result transformer when the return type is the same as the query type.
return NHibernateSession.QueryOver<Tree>()
.Where(t => t.Id.IsIn(ListOfTreeId))
.JoinQueryOver<Apple>(t => t.Apples)
.Where(a => !a.IsRotten)
.List();
Update: the compiler chooses ICollection<Apple> when it really should choose Apple, so specify the generic argument in JoinQueryOver explicitly.
Update 2: to get them unique
opt 1)
...
.TransformUsing(Transformers.DistinctRootEntity)
.List();
opt 2)
Tree treeAlias = null;
var nonRottenApples = QueryOver.Of<Apple>()
.Where(a => !a.IsRotten)
.Where(a => a.Tree.Id == treeAlias.Id)
.Select(x => x.Id); // the Select is optional
return NHibernateSession.QueryOver(() => treeAlias)
.Where(t => t.Id.IsIn(ListOfTreeId))
.WithSubquery.WhereExists(nonRottenApples)
.List();
