LINQ to SQL filter child collection - c#

I'm struggling with this query; I think I'm missing something.
I have two autogenerated dbml models.
public partial class RegulatorsOrganizationView
{
private int regulatorOrgId;
private string regulatorOrgName;
private EntitySet<RegulatorsView> regulatorsViews;
}
public partial class RegulatorsView
{
private int regulatorId;
private string regulatorName;
}
I need to filter by name: the input string "filterText" should be a substring of regulatorName.
If a regulator does not match, it should be filtered out of regulatorsViews.
If a RegulatorsOrganizationView has at least one match in regulatorsViews, it should be included.
If a RegulatorsOrganizationView has no matching regulators in its regulatorsViews collection, but its name contains filterText, it should still be included.
Currently I'm loading all the matching RegulatorsOrganizationViews and filtering the regulators down the line.
List<RegulatorOrganizationView> regOrgs = boatDataContext.RegulatorOrganizationView
    .Where(r => r.RegulatorsViews.Any(ar => ar.regulatorName.ToUpper().Contains(filterText.ToUpper()))
        || r.regulatorOrgName.ToUpper().Contains(filterText.ToUpper()))
    .ToList();
But this way I'm loading redundant regulators only to filter them out later on.
How can I rebuild this query to load only the matching regulators from the start?
I tried to use Select() to assign a filtered list of regulators to each organization:
regulatorsOrgs = DataContext.RegulatorOrganizationViews
    .Where(ro => ro.regulatorOrgName.ToUpper().Contains(filterText.ToUpper())
        || ro.RegulatorsViews.Any(r => r.regulatorName.ToUpper().Contains(filterText.ToUpper())))
    .Select(ro => new RegulatorOrganizationView()
    {
        regulatorOrgId = ro.regulatorOrgId,
        regulatorOrgName = ro.regulatorOrgName,
        RegulatorsViews = ro.RegulatorsViews
            .Where(r => r.regulatorName.ToUpper().Contains(filterText.ToUpper()))
            .Select(r => new RegulatorsView()
            {
                regulatorId = r.regulatorId,
                regulatorName = r.regulatorName,
            }).ToEntitySet()
    }).ToList();
But I'm getting exception: Message="The explicit construction of the entity type 'RegulatorsOrganizationView' in a query is not allowed."
Looks like filtered Include() would be an option (like in EF) but I can't find a way to use it with Linq To SQL.
Any ideas ?

In LINQ-to-SQL it's a bit messy and not intuitive to do this. You have to use DataLoadOptions:
var opt = new DataLoadOptions();
opt.AssociateWith((RegulatorsOrganizationView v)
=> v.regulatorsViews.Where(ar => ar.regulatorName.Contains(filterText)));
opt.LoadWith((RegulatorsOrganizationView v) => v.regulatorsViews);
DataContext.LoadOptions = opt;
var result = DataContext.RegulatorOrganizationViews
.Where(ro => ro.regulatorOrgName.Contains(filterText)
|| ro.regulatorsViews.Any());
So the first option says: when loading RegulatorOrganizationViews and associating their regulatorsViews, only associate the ones that meet the given condition.
The second says: when loading RegulatorOrganizationViews, also load their regulatorsViews.
The latter is like Include in Entity Framework. The former makes it behave like filtered Include, or maybe closer, a global query filter.
I removed the ToUpper calls for brevity, but you don't need them if the database collation is case-insensitive.
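The exception quoted in the question is about constructing a mapped entity type inside the query, so another route that is sometimes used is to project into a plain class or an anonymous type instead. The following is only a minimal sketch of the filtering rules from the question, run against in-memory lists rather than a DataContext. `Org`, `Regulator`, `OrgResult`, and `RegulatorFilter` are hypothetical stand-ins for the generated classes, and how LINQ to SQL would translate the nested projection is not verified here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the generated dbml classes.
public class Regulator
{
    public int RegulatorId { get; set; }
    public string RegulatorName { get; set; }
}

public class Org
{
    public int RegulatorOrgId { get; set; }
    public string RegulatorOrgName { get; set; }
    public List<Regulator> Regulators { get; set; } = new List<Regulator>();
}

// Plain result type (assumption: not a mapped entity, so it may be constructed in a query).
public class OrgResult
{
    public int RegulatorOrgId { get; set; }
    public string RegulatorOrgName { get; set; }
    public List<Regulator> Regulators { get; set; }
}

public static class RegulatorFilter
{
    // Case-insensitive "contains" check, standing in for ToUpper().Contains(...).
    private static bool Matches(string value, string filterText)
    {
        return value != null
            && value.IndexOf(filterText, StringComparison.OrdinalIgnoreCase) >= 0;
    }

    public static List<OrgResult> Filter(IEnumerable<Org> orgs, string filterText)
    {
        return orgs
            // Keep only the matching regulators on each organization.
            .Select(o => new OrgResult
            {
                RegulatorOrgId = o.RegulatorOrgId,
                RegulatorOrgName = o.RegulatorOrgName,
                Regulators = o.Regulators
                    .Where(r => Matches(r.RegulatorName, filterText))
                    .ToList()
            })
            // Keep the organization if a regulator matched or its own name matches.
            .Where(o => o.Regulators.Count > 0 || Matches(o.RegulatorOrgName, filterText))
            .ToList();
    }
}
```

Calling `RegulatorFilter.Filter(orgs, "coast")` returns organizations whose name matches (possibly with an empty regulator list) together with organizations that have at least one matching regulator, each carrying only the matching regulators.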

Related

LINQ Query - Proper "Where" clause to select only children that meet a condition

I have an IQueryable of a complex EF model, let's call it GeneralForm. This GeneralForm entity aggregates a member called Section. The Section contains a list of FormFields and each FormField has a name. I want to select only the FormFields whose names are in a list of given names.
IQueryable<GeneralForm> query = InitializeMyQuery();
What is the correct "Where" clause to do so? Something like this:
if (criteria.FormFieldNames.Any())
{
query = query.Where(gf => gf.Section.FormFields.Where(x => criteria.FormFieldNames.Contains(x.FormField.FieldName)).Any());
}
does not work, as it still retrieves all FormFields, not just the ones I want.
Any suggestion would be highly appreciated.
Thanks,
Ed
Edit 1: This is how the query is built (for privacy reason, I renamed some entities and I also removed the ones that do not really pertain to the issue I am trying to resolve):
query = (from genFormEntry in _context.GeneralForms
.Include(r => r.Sections)
.Include(r => r.Form.FormFields)
.Include(r => r.Form.FormFields.Select(x => x.FormField))
select genFormEntry);
This query retrieves Sections that have any matching form name. It doesn't do any filtering on the FormField side.
You can try to join those tables manually or, depending on your EF version, you can try using filtered includes:
if (criteria.FormFieldNames.Any())
{
query = query
.Include(gf => gf.Section.FormFields.Where(x => criteria.FormFieldNames.Contains(x.FormField.FieldName))) // Include the FormFields that match the criteria
.Where(gf => gf.Section.FormFields.Where(x => criteria.FormFieldNames.Contains(x.FormField.FieldName)).Any());
}
Edit:
EF 6.1 doesn't support filtered includes, so only two options are left. One is the manual LINQ joins mentioned above (which is pretty ugly and not versatile); the other is to rewrite the query so that it starts from the child entity, like below:
// guessing navigation property names here.
query = _context.FormFields.Include(r => r.Form.Section.GeneralForm);
// and later in your code
if (criteria.FormFieldNames.Any())
{
query = query.Where(f => criteria.FormFieldNames.Contains(f.FieldName));
}
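The idea of starting from the child side can be tried out in memory before wiring it to the real model. This is a rough sketch under stated assumptions: `FormFieldRow` and `ChildSideQuery` are hypothetical names, and a flat `GeneralFormId` key stands in for the navigation properties, which the answer above is only guessing at. It filters the fields first and then regroups them by their parent form.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical flattened view of a form field and the form it belongs to.
public class FormFieldRow
{
    public int GeneralFormId { get; set; }
    public string FieldName { get; set; }
}

public static class ChildSideQuery
{
    // Filter on the child side first, then group the surviving fields by parent form.
    public static Dictionary<int, List<string>> FieldsByForm(
        IEnumerable<FormFieldRow> fields, ICollection<string> wantedNames)
    {
        return fields
            .Where(f => wantedNames.Contains(f.FieldName))
            .GroupBy(f => f.GeneralFormId)
            .ToDictionary(g => g.Key, g => g.Select(f => f.FieldName).ToList());
    }
}
```

Forms with no matching field simply do not appear in the result, which mirrors what a child-side query returns.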

Why is Entity Framework having performance issues when calculating a sum

I am using Entity Framework in a C# application and I am using lazy loading. I am experiencing performance issues when calculating the sum of a property in a collection of elements. Let me illustrate it with a simplified version of my code:
public decimal GetPortfolioValue(Guid portfolioId) {
var portfolio = DbContext.Portfolios.FirstOrDefault( x => x.Id.Equals( portfolioId ) );
if (portfolio == null) return 0m;
return portfolio.Items
.Where( i =>
i.Status == ItemStatus.Listed
&&
_activateStatuses.Contains( i.Category.Status )
)
.Sum( i => i.Amount );
}
So I want to sum the value of all my items that have a certain status and whose parent category has a specific status as well.
When logging the queries generated by EF I see it is first fetching my Portfolio (which is fine). Then it does a query to load all Item entities that are part of this portfolio. And then it starts fetching ALL Category entities for each Item one by one. So if I have a portfolio that contains 100 items (each with a category), it literally does 100 SELECT ... FROM categories WHERE id = ... queries.
So it seems like it's just fetching all info, storing it in its memory and then calculating the sum. Why does it not do a simple join between my tables and calculate it like that?
Instead of doing 102 queries to calculate the sum of 100 items I would expect something along the lines of:
SELECT
i.id, i.amount
FROM
items i
INNER JOIN categories c ON c.id = i.category_id
WHERE
i.portfolio_id = @portfolioId
AND
i.status = 'listed'
AND
c.status IN ('active', 'pending', ...);
on which it could then calculate the sum (if it is not able to use the SUM directly in the query).
What is the problem and how can I improve the performance other than writing a pure ADO query instead of using Entity Framework?
To be complete, here are my EF entities:
public class ItemConfiguration : EntityTypeConfiguration<Item> {
ToTable("items");
...
HasRequired(p => p.Portfolio);
}
public class CategoryConfiguration : EntityTypeConfiguration<Category> {
ToTable("categories");
...
HasMany(c => c.Products).WithRequired(p => p.Category);
}
EDIT based on comments:
I didn't think it was important but the _activeStatuses is a list of enums.
private CategoryStatus[] _activeStatuses = new[] { CategoryStatus.Active, ... };
But probably more important is that I left out that the status in the database is a string ("active", "pending", ...) but I map them to an enum used in the application. And that is probably why EF cannot evaluate it? The actual code is:
... && _activateStatuses.Contains(CategoryStatusMapper.MapToEnum(i.Category.Status)) ...
EDIT2
Indeed the mapping is a big part of the problem but the query itself seems to be the biggest issue. Why is the performance difference so big between these two queries?
// Slow query
var portfolio = DbContext.Portfolios.FirstOrDefault(p => p.Id.Equals(portfolioId));
var value = portfolio.Items.Where(i => i.Status == ItemStatusConstants.Listed &&
_activeStatuses.Contains(i.Category.Status))
.Select(i => i.Amount).Sum();
// Fast query
var value = DbContext.Portfolios.Where(p => p.Id.Equals(portfolioId))
.SelectMany(p => p.Items.Where(i =>
i.Status == ItemStatusConstants.Listed &&
_activeStatuses.Contains(i.Category.Status)))
.Select(i => i.Amount).Sum();
The first query does a LOT of small SQL queries whereas the second one just combines everything into one bigger query. I'd expect even the first query to run one query to get the portfolio value.
Calling portfolio.Items lazy loads the whole Items collection into memory, and the subsequent calls, including the Where and Sum expressions, are then executed in memory. See also the Loading Related Entities article.
You need to build the query directly on the DbContext so that the Sum expression can be evaluated on the database server side.
var portfolio = DbContext.Portfolios
.Where(x => x.Id.Equals(portfolioId))
.SelectMany(x => x.Items.Where(i => i.Status == ItemStatus.Listed && _activateStatuses.Contains( i.Category.Status )).Select(i => i.Amount))
.Sum();
You also have to use the appropriate type for _activateStatuses instance as the contained values must match the type persisted in the database. If the database persists string values then you need to pass a list of string values.
var _activateStatuses = new string[] {"Active", "etc"};
You could use a LINQ expression to convert the enums to their string representation.
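A small sketch of that conversion, assuming the database stores the lower-case member names ("active", "pending") as in the SQL shown in the question. The enum members other than Active and the helper name `StatusNames.ToDbStrings` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical enum; only Active is confirmed by the question.
public enum CategoryStatus { Active, Pending, Closed }

public static class StatusNames
{
    // Convert enum values to the strings assumed to be stored in the database.
    public static string[] ToDbStrings(IEnumerable<CategoryStatus> statuses)
    {
        return statuses
            .Select(s => s.ToString().ToLowerInvariant())
            .ToArray();
    }
}
```

The resulting array is built once, outside the query, and can then be used in `Contains(i.Category.Status)`. With a case-insensitive collation the lower-casing step should not matter.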
Notes
I would recommend you turn off lazy loading on your DbContext type. As soon as you do that you will start to catch issues like this at run time via Exceptions and can then write more performant code.
I did not include error checking for if no portfolio was found but you could extend this code accordingly.
Yep, CategoryStatusMapper.MapToEnum cannot be converted to SQL, which forces the Where to run in .NET. Rather than mapping the status to the enum, _activeStatuses should contain the list of integer values from the enum so the mapping is not required.
private int[] _activeStatuses = new[] { (int)CategoryStatus.Active, ... };
So that the contains becomes
... && _activateStatuses.Contains(i.Category.Status) ...
and can all be converted to SQL
UPDATE
Given that i.Category.Status is a string in the database, then
private string[] _activeStatuses = new[] { CategoryStatus.Active.ToString(), ... };

Why can't I use include?

That's my code:
ProjetoTipoCargaModelo projAux =
dbContext.ProjetoTipoCargaDbSet.Find(idProjetoTipoCarga);
ICollection<ProjetoTipoCargaRegraModelo> regras =
projAux.ListaRegra.Where(x => x.Ativo).ToList();
IQueryable<ProjetoTipoCargaRegraModelo> pr =
dbContext.ProjetoTipoCargaDbSet.Select(
x => regras.FirstOrDefault(y => y.IdProjetoTipoCarga == x.IdProjetoTipoCarga));
var projetoCompleto = pr.
Include(x => x.ListaRegraLiberacaoInicioViagem).
Include(x => x.ListaRegraTecnologiaAceita).
Include(x => x.RegraAreaSombra).
Include(x => x.RegraAtuadorNecessario)
It shows an error at the first Include, even though I'm calling it on an IQueryable object!
What's wrong here?
My problem is applying these includes to a filtered set of results.
[Edit]
Error:
Cannot convert lambda expression to type 'string' because it is not a delegate type
It's not a runtime error, it's a compilation error.
[Edit 2]
My usings:
using System;
using System.Collections.Generic;
using System.Data.Entity;
using System.Linq;
[Edit 3]
Answer:
ICollection<ProjetoTipoCargaRegraModelo> regras = projAux.ListaRegra.Where(x => x.Ativo).ToList();
IQueryable<ProjetoTipoCargaModelo> pr = dbContext.ProjetoTipoCargaDbSet.Where(x => x.IdProjetoTipoCarga == regras.FirstOrDefault(y => y.IdProjetoTipoCarga == x.IdProjetoTipoCarga).IdProjetoTipoCarga);
var projetoCompleto = pr.
Include(x => x.ListaRegraLiberacaoInicioViagem).
Include(x => x.ListaRegraTecnologiaAceita).
Include(x => x.RegraAreaSombra).
The .Include() method only works on ObjectQuery<TEntity>
Try:
context.EntitySet.Include(...).Select(...)
instead of:
context.EntitySet.Select(...).Include(...)
or use an extension method like this:
public static class MyExtensions
{
public static IQueryable<TEntity> Include<TEntity>(
this IQueryable<TEntity> query, string path)
{
var efQuery = query as ObjectQuery<TEntity>;
if (efQuery == null)
return query;
return efQuery.Include(path);
}
}
or better yet, use the already available extension method that supports lambda expressions instead of strings as paths.
Also, do not use so many includes unless most are 1:1 or *:1 relationships; 1:* (or *:*) relationships greatly increase the IO from the database, resulting in bad performance.
Consider using multiple queries with .Future() to enable a single access to the database instead.
It's not related to the Includes.
You can't use regras in a LINQ-to-Entities query because it's an ICollection. EF can't translate that into SQL.
Your "answer" can't possibly work with EF. The object regras is an in-memory list of Regra entities (I guess). If you use that directly in...
var pr = dbContext.ProjetoTipoCargaDbSet
.Where(x => x.IdProjetoTipoCarga == regras.FirstOrDefault(y => y.IdProjetoTipoCarga == x.IdProjetoTipoCarga).IdProjetoTipoCarga);
...you should get an exception like
Unable to create a constant value of type 'Regra'. Only primitive types or enumeration types are supported...
But, boy, what a tortuous way to get where you want to be! First you get a ProjetoTipoCargaModelo object by an idProjetoTipoCarga. Then you fetch its active Regras. Then you basically use the IdProjetoTipoCarga values of the Regras to see if one of them is equal to the original idProjetoTipoCarga and if so, you use its value to get a ProjetoTipoCargaModelo object.
If you remove all the redundancies, what's left is:
var pr = dbContext.ProjetoTipoCargaDbSet
.Where(x => x.IdProjetoTipoCarga == idProjetoTipoCarga
&& x.ListaRegra.Any(r => r.Ativo));
If you use this LINQ statement, you can append your includes to pr.

Linq with boolean function to relational db in Entity Framework

Probably a few things wrong with my code here but I'm mostly having a problem with the syntax. Entry is a model for use in Entries and contains a TimeStamp for each entry. Member is a model for people who are assigned entries and contains an fk for Entry. I want to sort my list of members based off of how many entries the member has within a given period (arbitrarily chose 30 days).
A. I'm not sure that the function I created works correctly, but this is aside from the main point because I haven't really dug into it yet.
B. I cannot figure out the syntax of the Linq statement or if it's even possible.
Function:
private bool TimeCompare(DateTime TimeStamp)
{
DateTime bound = DateTime.Today.AddDays(-30);
if (bound <= TimeStamp)
{
return true;
}
return false;
}
Member list:
public PartialViewResult List()
{
var query = repository.Members.OrderByDescending(p => p.Entry.Count).Where(TimeCompare(p => p.Entry.Select(e => e.TimeStamp));
//return PartialView(repository.Members);
return PartialView(query);
}
The var query is my problem here: I can't seem to find a way to incorporate a boolean function into a .Where clause in a LINQ query.
EDIT
To summarize I am simply trying to query all entries timestamped within the past 30 days.
I also have to emphasize the relational/fk part as that appears to be forcing the Timestamp to be IEnumerable of System.Datetime instead of simple System.Datetime.
This errors with "Cannot implicitly convert timestamp to bool" on the E.TimeStamp:
var query = repository.Members.Where(p => p.Entry.First(e => e.TimeStamp) <= past30).OrderByDescending(p => p.Entry.Count);
This errors with Operator '<=' cannot be applied to operands of type 'System.Collections.Generic.IEnumerable' and 'System.DateTime'
var query = repository.Members.Where(p => p.Entry.Select(e => e.TimeStamp) <= past30).OrderByDescending(p => p.Entry.Count);
EDIT2
Syntactically correct but not semantically:
var query = repository.Members.Where(p => p.Entry.Select(e => e.TimeStamp).FirstOrDefault() <= timeComparison).OrderByDescending(p => p.Entry.Count);
The desired result is to pull all members and then sort by the number of entries they have, this pulls members with entries and then orders by the number of entries they have. Essentially the .where should somehow be nested inside of the .count.
EDIT3
Syntactically correct but results in a runtime error (Exception Details: System.ArgumentException: DbSortClause expressions must have a type that is order comparable.
Parameter name: key):
var query = repository.Members.OrderByDescending(p => p.Entry.Where(e => e.TimeStamp <= timeComparison));
EDIT4
Closer (as this line compiles) but it doesn't seem to be having any effect on the object. Regardless of how many entries I add for a user it doesn't change the sort order as desired (or at all).
var timeComparison = DateTime.Today.AddDays(-30).Day;
var query = repository.Members.OrderByDescending(p => p.Entry.Select(e => e.TimeStamp.Day <= timeComparison).FirstOrDefault());
A bit of research dictates that Linq to Entities (IE: This section)
...var query = repository.Members.OrderByDescending(...
tends to really not like it if you use your own functions, since it will try to map to a SQL variant.
Try something along the lines of this, and see if it helps:
var query = repository.Members.AsEnumerable().Where(p => p.Entry.Any(e => TimeCompare(e.TimeStamp))).OrderByDescending(p => p.Entry.Count);
Edit: I should just read what you are trying to do. You want it to grab only the ones within the last X number of days, correct? I believe the following should work, but I would need to test when I get to my home computer...
public PartialViewResult List()
{
var timeComparison = DateTime.Today.AddDays(-30);
var query = repository.Members.Where(p => p.Entry.Select(e => e.TimeStamp).FirstOrDefault() <= timeComparison).OrderByDescending(p => p.Entry.Count);
//return PartialView(repository.Members);
return PartialView(query);
}
Edit2: This may be a lack of understanding from your code, but is e the same type as p? If so, you should be able to just reference the timestamp like so:
public PartialViewResult List()
{
var timeComparison = DateTime.Today.AddDays(-30);
var query = repository.Members.Where(p => p.TimeStamp <= timeComparison).OrderByDescending(p => p.Entry.Count);
//return PartialView(repository.Members);
return PartialView(query);
}
Edit3: In Edit3, I see what you are trying to do now (I believe). You're close, but OrderByDescending would need to go on the end. Try this:
var query = repository.Members
.Select(p => p.Entry.Where(e => e.TimeStamp <= timeComparison))
.OrderByDescending(p => p.Entry.Count);
Thanks for all the help Dylan but here is the final answer:
public PartialViewResult List()
{
var timeComparison = DateTime.Today.AddDays(-30).Day;
var query = repository.Members
.OrderBy(m => m.Entry.Where(e => e.TimeStamp.Day <= timeComparison).Count());
return PartialView(query);
}
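For comparison, here is a sketch of the goal as stated in EDIT2 (keep all members, sort by how many entries fall within the last 30 days), written against in-memory lists. It compares full DateTime values rather than the Day component, since DateTime.Day is only the day of the month. The `Member` and `Entry` shapes and the `MemberRanking` helper are assumptions made for illustration, and whether the same expression translates in a given EF version is not verified here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal models matching the names used in the question.
public class Entry
{
    public DateTime TimeStamp { get; set; }
}

public class Member
{
    public string Name { get; set; }
    public List<Entry> Entry { get; set; } = new List<Entry>();
}

public static class MemberRanking
{
    // Returns every member, ordered by the number of entries on or after the cutoff.
    public static List<Member> ByRecentEntries(IEnumerable<Member> members, DateTime now, int days)
    {
        DateTime cutoff = now.Date.AddDays(-days);
        return members
            .OrderByDescending(m => m.Entry.Count(e => e.TimeStamp >= cutoff))
            .ToList();
    }
}
```

Passing `now` in as a parameter instead of reading DateTime.Today inside the method keeps the ordering easy to check with fixed dates.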

Multiple Include and Where Clauses Linq

I have a database where I'm wanting to return a list of Clients.
These clients have a list of FamilyNames.
I started with this
var query = DbContext.Clients.Include(c => c.FamilyNames).ToList() //returns all clients, including their FamilyNames...Great.
But I want somebody to be able to search for a FamilyName and, if any results are returned, show the matching clients to the user.
so I did this...
var query = DbContext.Clients.Include(c => c.FamilyNames.Where(fn => fn.familyName == textEnteredByUser)).ToList();
I tried...
var query = DbContext.Clients.Include(c => c.FamilyNames.Any(fn => fn.familyName == textEnteredByUser)).ToList();
and...
var query = DbContext.FamilyNames.Include(c => c.Clients).where(fn => fn.familyname == textEnteredByUser.Select(c => c.Clients)).ToList();
What I would like to know (obviously!) is how I could get this to work, but I would like it if at all possible to be done in one query to the database. Even if somebody can point me in the correct direction.
Kind regards
In Linq to Entities you can navigate on properties and they will be transformed to join statements.
This will return a list of clients.
var query = DbContext.Clients.Where(c => c.FamilyNames.Any(fn => fn.familyName == textEnteredByUser)).ToList();
If you want to include all their family names with eager loading, this should work:
var query = DbContext.Clients.Where(c => c.FamilyNames.Any(fn => fn.familyName == textEnteredByUser)).Include(c => c.FamilyNames).ToList();
Here is some reference about loading related entities if something doesn't work as expected.
You can use 'Projection', basically you select just the fields you want from any level into a new object, possibly anonymous.
var query = DbContext.Clients
.Where(c => c.FamilyNames.Any(fn => fn == textEnteredByUser))
// only calls that can be converted to SQL safely here
.Select(c => new {
ClientName = c.Name,
FamilyNames = c.FamilyNames
})
// force the query to be materialized so we can safely do other transforms
.ToList()
// convert the anon class to what we need
.Select(anon => new ClientViewModel() {
ClientName = anon.ClientName,
// convert IEnumerable<string> to List<string>
FamilyNames = anon.FamilyNames.ToList()
});
That creates an anonymous class with just those two properties, then forces the query to run, then performs a 2nd projection into a ViewModel class.
Usually I would be selecting into a ViewModel for passing to the UI, limiting it to just the bare minimum number of fields that the UI needs. Your needs may vary.
