Modularize (refactor) Linq queries

Modularize (refactor) Linq queries - c#

I have a few Linq queries. Semantically, they are
a join b join c join d where filter1(a) && filter2(c) && filter3(d)
a join b join c where filter1(a) && filter2(c)
a join b join c join e where filter1(a) && filter2(c) && filter4(e)
...
I want to be able to factor out the shared part:
a join b join c where filter1(a) && filter2(c)
and dynamically append join d and filter3(d)
Is there a way to do this? I am already using the Predicate Builder to dynamically build conditionals (filters).
EDIT: I am using Linq-to-SQL.
EDIT: The base query looks like:
from a in As.AsExpandable()
join b in Bs on a.Id equals b.PId
join c in Cs on b.Id equals c.PId
where filter1(a) && filter2(b) && filter3(c)
select new A { ... }
filters are predicates in Predicate Builder. The type of the query is IQueryable<A>.
Next, I'd like to join this with d
from a in BaseQuery()
join d in D on a.Id equals d.PId
Currently join d .. causes a compilation error:
The type of one of the expressions in the join clause is incorrect. Type inference failed in the call to Join

Your example is a bit vague, but it is easy to create a method that returns an IQueryable<T> and reuse that method, if that’s what you mean. Here is an example:
// Reusable method
public IQueryable<SomeObject> GetSomeObjectsByFilter(Context c)
{
return
from someObject in context.SomeObjects
where c.B.A.Amount < 1000
where c.Roles.Contains(r => r.Name == "Admin")
select someObject;
}
You can reuse this method in other places like this:
var q =
from c in GetSomeObjectsByFilter(context)
where !c.D.Contains(d => d.Items.Any(i => i.Value > 100))
select c;
Because the way IQueryable works, only the final query (the collection that you start iterating) will trigger a call to the database, which allows you to build a highly maintainable system by reusing business logic that gets effectively executed inside the database, whiteout the loss of any performance.
I do this all the time and it improves the maintainability of my code big time. It works no matter which O/RM tool you run, because there is no difference in Queryable<T> composition, between writing the query in one peace, or splitting it out to different methods.
Note that you do sometimes need some smart transformations to get the duplicate parts in a single method. Things that might help are returning grouped sets, and returning a set of a different type, than what you think you need. This sounds a bit vaque, but just post a question here at SO when you have problems splitting up a method. There are enough people here that can help you with that.

I can answer half your question easily. Linq makes it simple to append .where clauses to an existing query.
Example:
var x = db.Table1.where(w => w.field1 == nTestValue);
x = x.where(w => w.field2 == nTestValue2);
I believe you can do the joins as well but I have to go find an example in some old code. I'll look if nobody else jumps in with it soon.

Related

How do I group one to many in a nested fashion with linq to entities?

I have three tables, Chart, ChartQuestion, ChartAnswer
A Chart has many ChartQuestions, a ChartQuestion has many ChartAnswers.
I'd like to do a linq query that gives me an object containing all of this data, so it'd be a ChartObject containing List<ChartQuestion> and each ChartQuestion contains a List<ChartAnswer>
I started with this:
(from chart in db.Chart
join chartQuestion in db.chartQuestion on chart.ChartId equals chartQuestion.ChartId into chartQuestions)
This seems to be the first step. However I want to include the ChartAnswers now, so I have to do another join to pull back the ChartAnswers but I don't know how to do this.
I can't do a join, and while I can do a from, I am not sure of the exact syntax.
(from chart in db.Chart
join chartQuestion in db.chartQuestion on chart.ChartId equals chartQuestion.ChartId into chartQuestions
from chartQuestionsSelection in chartQuestions
join chartAnswer in context.ChartAnswers on chartQuestions.ChartAnswerId equals chartAnswer.ChartAnswerId into chartAnswers // This is wrong
)
With that code above you end up as chartAnswers being separate to the chartQuestions rather than belonging to them, so I don't think it is correct.
Any idea?

joining a table destructors it i.e. you now have the row as a variable if you need it in a nested fashion then select in like this
(from chart in db.Chart
select new { //can also be your own class/record, perhaps a DTO
chart.Id,
chart.Name,
questions = (from chartQuestion in db.chartQuestion
where chart.ChartId == chartQuestion.ChartId
select new { //perhaps a dto
chartQuestion.Id,
chartQuestion.Question,
Answers =
(from chartAnswer in context.ChartAnswers
where chartAnswer.ChartAnswerId == chartQuestion.ChartAnswerId
select chartAnswer).ToList() //this is translated and not evaluated
}).ToList() //this is translated not evaluated
}).ToListAsync(cancellationToken) //this will evaluate the expression
If you need it by joins then you can group join it like this:
from m in _context.Chart
join d in _context.ChartQuestions
on m.ID equals d.ID into mdJoin
select new
{
chartId = m.ID,
chartName = "m.name",
quess = from d in mdJoin
join dd in _context.ChartAnswer
on d.Iddomain equals dd.DomainId into anJoin
select new
{
quesId = d.ID,
quesName = d.Question,
anss = anJoin
}
}
A better way: If you edit your DbConfigurations to include navigation properties of each then the code will become way simpler.
example:
db.Chart
.Include(x => x.chartQuestions)
.ThenInclude(x => x.chartAnswers)
You can search more about how to do navigation properties in EFCore, but ofcourse if the code base is large then this might not be feasible for you.

Try following :
(from chart in db.Chart
join chartQuestion in db.chartQuestion on chart.ChartId equals chartQuestion.ChartId
join chartAnswer in context.ChartAnswers on chartAnswer.ChartAnswerId equals chartQuestion.ChartAnswerId
)

LINQ performance question when joining IEnumerable with IQueryable

I had some serious speed issues with the LINQ in this code (variable names have been changed)
var A = _service.GetA(param1, param2); // Returns Enumerable results
var results = (from b in _B.All() // _B.All() returns IQueryable
join c in _C.All() on b.Id equals c.Id // _C.All() returns IQueryable
join a in A on a.Id equals c.Id
where b.someId == id && a.boolVariable // A bool value
select new
{
...
}).ToList();
This LINQ took over 10 seconds to execute even though the number of rows in B and C tables were less than 100k.
I looked into this and by trial and error I managed to get the LINQ execution time to 200ms by changing the code to this:
var A = _service.GetA(param1, param2).Where(a => a.boolVariable); // Returns Enumerable results
var results = (from b in _B.All() // _B.All() returns IQueryable
join c in _C.All() on b.Id equals c.Id // _C.All() returns IQueryable
join a in A on a.Id equals c.Id
where b.someId == id
select new
{
...
}).ToList();
So my question is, why does this simple change have such drastic effects on the LINQ performance? The only change is that I filter the Enumerable list beforehand, the A enumerable has about 30 items before filtering and 15 after filtering.

In your first scenario: first it joins all the records in A which would take long time to join, then filters out for a.boolVariable.
In your second scenario you have a smaller subset of records for A prior to joining - of course this would take less time to join.

Linq to objects - boondoggle?

I thought that LINQ to objects is great way for differed execution of joined data. In reality, it comes up as bad way to do things...
Here is what we had, having few, few hundred, 3K and 3.5K records in a,b,c,d correspondingly
IEnumerable<MyModel> data =
(from a in AList
from b in BList.Where(r => r.AId == a.Id)
from c in CList.Where(r => r.BId == b.Id)
from d in DList.Where(r => r.SomeId == myId && r.Some2Id == c.Some2Id)
// . . . . . .
Wasn't LINQ supposed to be great about doing it?
In reality, following works much faster, 60 times faster in fact
var dTemp = DList.Where(r => r.SomeId == myId).ToList();
var cTemp = CList.Where(c => dTemp.Any(d => d.Some2Id == c.Some2Id)).ToList();
IEnumerable<MyModel> data =
(from a in AList
from b in BList.Where(r => r.AId == a.Id)
from c in cTemp.Where(r => r.BId == b.Id)
// . . . . . .
And then I came across this article
Q: Is there a way to improve this query without abandoning single LINQ?
Or does this mean that LINQ to objects in form of joins need to be avoided and replaced by some sequential calls if performance is at stake?

Let's analyze the differences.
First query: you are performing a filter on BList, a filter on CListand two filters on DList, all in a deferred-execution manner. You then use a kind of a join.
Second query: you perform a static filter on DList and evaluate it, another static filter on CList based on DList and evaluate it and then a deferred-executed filter on both AList and BList.
The second query is faster because:
DList is not being looked at for useless values (due to previous filters)
CList only contains useful values due to previous filters
Anyway, both queries are wrong. Multiple from is basically a cross-join, as explained here. As #Reddog commented, the best way is to actually use Join:
var data = from a in AList
join b in BList on a.Id equals b.AId
join c in CList on b.Id equals c.BId
join d in DList on c.Some2Id equals d.Some2Id
where d.SomeId == someId;

Linq multiple where queries

I have an issue building a fairly hefty linq query. Basically I have a situation whereby I need to execute a subquery in a loop to filter down the number of matches that are returned from the database. Example code is in this loop below:
foreach (Guid parent in parentAttributes)
{
var subQuery = from sc in db.tSearchIndexes
join a in db.tAttributes on sc.AttributeGUID equals a.GUID
join pc in db.tPeopleIndexes on a.GUID equals pc.AttributeGUID
where a.RelatedGUID == parent && userId == pc.CPSGUID
select sc.CPSGUID;
query = query.Where(x => subQuery.Contains(x.Id));
}
When I subsequently call the ToList() on the query variable it appears that only a single one of the subqueries has been performed and I'm left with a bucketful of data I don't require. However this approach works:
IList<Guid> temp = query.Select(x => x.Id).ToList();
foreach (Guid parent in parentAttributes)
{
var subQuery = from sc in db.tSearchIndexes
join a in db.tAttributes on sc.AttributeGUID equals a.GUID
join pc in db.tPeopleIndexes on a.GUID equals pc.AttributeGUID
where a.RelatedGUID == parent && userId == pc.CPSGUID
select sc.CPSGUID;
temp = temp.Intersect(subQuery).ToList();
}
query = query.Where(x => temp.Contains(x.Id));
Unfortunately this approach is nasty as it results in multiple queries to the remote database whereby the initial approach if I could get it working would only result in a single hit. Any ideas?

I think you are hitting a special case of capturing the loop variable in the lambda expression used to filter. Also known as an access to modified closure error.
Try this:
foreach (Guid parentLoop in parentAttributes)
{
var parent = parentLoop;
var subQuery = from sc in db.tSearchIndexes
join a in db.tAttributes on sc.AttributeGUID equals a.GUID
join pc in db.tPeopleIndexes on a.GUID equals pc.AttributeGUID
where a.RelatedGUID == parent && userId == pc.CPSGUID
select sc.CPSGUID;
query = query.Where(x => subQuery.Contains(x.Id));
}
The problem is capturing the parent variable in the closure (that the LINQ syntax is converted to), which causes all the subQueryes to be run with the same parent id.
What happens is the compiler generating a class to hold the delegate and the local variables the delegate accesses. The compiler re-uses the same instance of that class for each loop; and therefore, once the query executes, all of the Wheres executes with the same parent Guid, namely the last to execute.
Declaring the parent inside the loop scope causes the compiler to essentially make a copy of the variable, with the correct value, to be captured.
This can be a bit hard to grasp at first, so if this is the first time it has hit you; I'd recommend these two articles for background and a thorough explanation:
Eric Lippert: Closing over the loop variable considered harmful and part two.
Jon Skeet: Closures

Maybe this way?
var subQuery = from sc in db.tSearchIndexes
join a in db.tAttributes on sc.AttributeGUID equals a.GUID
join pc in db.tPeopleIndexes on a.GUID equals pc.AttributeGUID
where parentAttributes.Contains(a.RelatedGUID) && userId == pc.CPSGUID
select sc.CPSGUID;

Linq how to write a JOIN

Linq to EF, I'm using asp.net 4, EF 4 and C#.
Here are two ways I came up with to query my data. Ways A and C are working fine. B however needs to implement and additional WHERE statement (as "where c.ModeContent == "NA").
My question is:
Regarding this kind of join (outer join, I suppose) what is the best approach in term of performance?
Could you show me some code to implement additional WHERE statement in B?
Any way to improve this code?
Thanks for your time! :-)
// A
var queryContents = from c in context.CmsContents
where c.ModeContent == "NA" &&
!(from o in context.CmsContentsAssignedToes select o.ContentId)
.Contains(c.ContentId)
select c;
// B - I need to implent where c.ModeContent == "NA"
var result01 = from c in context.CmsContents
join d in context.CmsContentsAssignedToes on c.ContentId equals d.ContentId into g
where !g.Any()
select c;
// C
var result02 = context.CmsContents.Where(x => x.ModeContent == "NA").Where(item1 => context.CmsContentsAssignedToes.All(item2 => item1.ContentId != item2.ContentId));

Regarding query B you can apply the condition like this:
var result01 = from c in context.CmsContents where c.ModeContent == "NA"
join d in context.CmsContentsAssignedToes on c.ContentId equals d.ContentId into g
where !g.Any()
select c;

Your query will be far more readable and maintainable (and perform at least as well) if you use your association properties instead of join:
var result = from c in context.CmsContents
where c.ModeContent == "NA"
&& !c.AssignedToes.Any()
select c;
I'm guessing that the navigation on CmsContent to CmsContentsAssignedToes is called AssignedToes. Change the name in my query if it's actually called something else.
This query can be read out loud and you know exactly what it means. The join versions you have to think about.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Modularize (refactor) Linq queries - c#

Related

How do I group one to many in a nested fashion with linq to entities?

LINQ performance question when joining IEnumerable with IQueryable

Linq to objects - boondoggle?

Linq multiple where queries

Linq how to write a JOIN

Categories

Resources