I am using Entity Framework Core 2.1. My database context exposes a model with a boolean property, stored as a non-nullable bit column in an MS SQL database. I want a query that is evaluated efficiently in SQL and returns both the count of all rows in the table and the count of rows with the bit column set.
var groups = await this.context.Models
.AsNoTracking()
.GroupBy(i => 1)
.Select(g => new ViewModel
{
Count = g.Count(),
Revoked = g.Count(p => p.IsRevoked)
})
.ToArrayAsync();
To force the query to consume all rows, I call ToArrayAsync; however, EF Core logs that the group by, count and where clauses cannot be translated and will be evaluated locally.
Other attempts such as:
var query = await this.context.Models
.AsNoTracking()
.GroupBy(i => i.IsRevoked)
.ToArrayAsync();
This produces two groups which I can inspect later, but it likewise fails to evaluate the bit column in SQL.
How can I generate a single expression that produces a new object with the count of all rows and the count of the subset which have the bit field enabled?
The first technique (group by constant) worked well in EF6, provided that the predicate-based Count, which has no direct SQL equivalent, was replaced with a conditional Sum; that produced a nice GROUP BY SQL.
Unfortunately, this doesn't translate to SQL in EF Core, even in 2.1.
Fortunately, combining it with an intermediate projection produces the desired SQL translation in EF Core 2.1:
var counts = await this.context.Models
.Select(e => new { Revoked = e.IsRevoked ? 1 : 0 })
.GroupBy(e => 1)
.Select(g => new ViewModel
{
Count = g.Count(),
Revoked = g.Sum(e => e.Revoked)
})
.ToArrayAsync();
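The shape of that query can be checked without a database. The following is a minimal in-memory sketch (LINQ to Objects, not EF Core); the Model class and the Summarize helper are hypothetical stand-ins, and it only demonstrates that summing a 0/1 projection gives the same number as a predicate-based Count.

```csharp
using System;
using System.Linq;

public static class RevokedCounts
{
    // Hypothetical stand-in for the entity; only the flag matters here.
    public sealed class Model
    {
        public bool IsRevoked { get; set; }
    }

    // Same shape as the query above: project the flag to 0/1,
    // group by a constant, then Count and Sum inside the single group.
    public static (int Count, int Revoked) Summarize(Model[] models)
    {
        return models
            .Select(e => new { Revoked = e.IsRevoked ? 1 : 0 })
            .GroupBy(e => 1)
            .Select(g => (Count: g.Count(), Revoked: g.Sum(e => e.Revoked)))
            .SingleOrDefault(); // no rows means no group, so this yields (0, 0)
    }

    public static void Main()
    {
        var models = new[]
        {
            new Model { IsRevoked = true },
            new Model { IsRevoked = false },
            new Model { IsRevoked = true },
        };

        var summary = Summarize(models);
        Console.WriteLine($"Count={summary.Count}, Revoked={summary.Revoked}");
        // prints "Count=3, Revoked=2"
    }
}
```

Note that grouping by a constant produces no group at all for an empty input, so the database query returns an empty array rather than a single row of zeros in that case.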
Related
I have the following Entity Framework Core 3.0 query:
var units = await context.Units
.SelectMany(y => y.UnitsI18N)
.OrderBy(y => y.Name)
.GroupBy(y => y.LanguageCode)
.ToDictionaryAsync(y => y.Key, y => y.Select(z => z.Name));
I get the following error:
Client side GroupBy is not supported.
To run the query on the client, or part of it, I would do the following:
var units = context.Units
.SelectMany(y => y.UnitsI18N)
.OrderBy(y => y.Name)
.AsEnumerable()
.GroupBy(y => y.LanguageCode)
.ToDictionary(y => y.Key, y => y.Select(z => z.Name));
Now it works.
Why am I getting this error if I am not running the query on the client?
It seems like there is a common misconception about what LINQ GroupBy does and what SQL GROUP BY is able to do. Since I fell into the exact same trap and had to wrap my head around this recently, I decided to write a more thorough explanation of this issue.
Short answer:
The LINQ GroupBy is much different from the SQL GROUP BY statement: LINQ just divides the underlying collection into chunks depending on a key, while SQL additionally applies an aggregation function to condense each of these chunks down into a single value.
This is why EF has to perform your LINQ-kind GroupBy in memory.
Before EF Core 3.0, this was done implicitly: EF downloaded all result rows and then applied the LINQ GroupBy. However, this implicit behavior could lead the programmer to expect that the entire LINQ query is executed in SQL, with a potentially enormous performance impact when the result set is large. For this reason, implicit client-side evaluation of GroupBy was disabled completely in EF Core 3.0.
Now it is required to explicitly call functions like .AsEnumerable() or .ToList(), which download the result set and continue with in-memory LINQ operations.
Long answer:
The following table solvedExercises will be the running example for this answer:
+-----------+------------+
| StudentId | ExerciseId |
+-----------+------------+
| 1 | 1 |
| 1 | 2 |
| 2 | 2 |
| 3 | 1 |
| 3 | 2 |
| 3 | 3 |
+-----------+------------+
A record X | Y in this table denotes that student X has solved exercise Y.
In the question, a common use case of LINQ's GroupBy method is described: Take a collection and group it into chunks, where the rows in each chunk share a common key.
In our example, we might want to get a Dictionary<int, List<int>>, which contains a list of solved exercises for each student. With LINQ, this is very straightforward:
var result = solvedExercises
.GroupBy(e => e.StudentId)
.ToDictionary(e => e.Key, e => e.Select(e2 => e2.ExerciseId).ToList());
Output (for full code see dotnetfiddle):
Student #1: 1 2
Student #2: 2
Student #3: 1 2 3
This is easy to represent with C# data types, since we can nest List and Dictionary as deeply as we like.
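For reference, here is a self-contained sketch of the grouping above, run with LINQ to Objects over an in-memory copy of the table. The class and method names are made up for this illustration; only the GroupBy and ToDictionary calls come from the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SolvedExercisesDemo
{
    // In-memory copy of the solvedExercises table.
    public static readonly (int StudentId, int ExerciseId)[] SolvedExercises =
    {
        (1, 1), (1, 2), (2, 2), (3, 1), (3, 2), (3, 3),
    };

    // One dictionary entry per student, holding the list of solved exercises.
    public static Dictionary<int, List<int>> ExercisesByStudent()
    {
        return SolvedExercises
            .GroupBy(e => e.StudentId)
            .ToDictionary(g => g.Key, g => g.Select(e => e.ExerciseId).ToList());
    }

    public static void Main()
    {
        // Writes one "Student #N: ..." line per dictionary entry.
        foreach (var pair in ExercisesByStudent())
        {
            Console.WriteLine($"Student #{pair.Key}: {string.Join(" ", pair.Value)}");
        }
    }
}
```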
Now we try to imagine this as an SQL query result. SQL query results are usually represented as a table, where we can freely choose the returned columns. To represent our above query as SQL query result, we would need to
generate multiple result tables,
put the grouped rows into an array or
somehow insert a "result set separator".
As far as I know, none of these approaches is implemented in practice. At most, there are some hacky workarounds like MySQL's GROUP_CONCAT, which allows combining the result rows into a string (relevant SO answer).
Thus we see that SQL cannot yield results that match LINQ's notion of GroupBy.
Instead, SQL only allows so-called aggregation: If we, for example, wanted to count how many exercises have been passed by a student, we would write
SELECT StudentId,COUNT(ExerciseId)
FROM solvedExercises
GROUP BY StudentId
...which will yield
+-----------+-------------------+
| StudentId | COUNT(ExerciseId) |
+-----------+-------------------+
| 1 | 2 |
| 2 | 1 |
| 3 | 3 |
+-----------+-------------------+
Aggregation functions reduce a set of rows into a single value, usually a scalar. Examples are row count, sum, maximum value, minimum value, and average.
This is implemented by EF Core: Executing
var result = solvedExercises
.GroupBy(e => e.StudentId)
.Select(e => new { e.Key, Count = e.Count() })
.ToDictionary(e => e.Key, e => e.Count);
generates the above SQL. Note the Select, which tells EF which aggregation function it should use for the generated SQL query.
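The aggregating form can be tried in memory as well. This is a small sketch with LINQ to Objects (so no SQL is generated here); the ExerciseCounts class and its CountByStudent helper are hypothetical names used only for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExerciseCounts
{
    // Reduces every group to one scalar, which is what SQL's GROUP BY with COUNT does.
    public static Dictionary<int, int> CountByStudent((int StudentId, int ExerciseId)[] rows)
    {
        return rows
            .GroupBy(e => e.StudentId)
            .Select(g => new { g.Key, Count = g.Count() })
            .ToDictionary(x => x.Key, x => x.Count);
    }

    public static void Main()
    {
        var rows = new[] { (1, 1), (1, 2), (2, 2), (3, 1), (3, 2), (3, 3) };

        // Writes one "student: count" line per dictionary entry.
        foreach (var pair in CountByStudent(rows))
        {
            Console.WriteLine($"{pair.Key}: {pair.Value}");
        }
    }
}
```

The result has one flat row per key, which is why this form fits into a single two-dimensional result table.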
In summary, the LINQ GroupBy function is much more general than the SQL GROUP BY statement, which, due to SQL's restrictions, can only return a single, two-dimensional result table. Thus, queries like the one in the question and the first example in this answer have to be evaluated in memory, after downloading the SQL result set.
Instead of doing this implicitly, the EF Core 3.0 developers chose to throw an exception in this case; this prevents accidentally downloading an entire, potentially large table with millions of rows, which might go unnoticed during development due to a small test database.
Your .GroupBy(y => y.LanguageCode).ToDictionaryAsync(y => y.Key, y => y.Select(z => z.Name)); cannot be converted to SQL.
EF Core 3.0 throws an exception to make sure you know that all records in Units would have to be fetched from the database before being grouped and mapped to a Dictionary.
It's one of the top breaking changes in EF Core 3.0.
https://learn.microsoft.com/en-us/ef/core/what-is-new/ef-core-3.0/breaking-changes
One possible solution (works for me) is to load the rows into a List first and run the GroupBy on that in-memory list.
var units = (
await context.Units
.SelectMany(y => y.UnitsI18N)
.OrderBy(y => y.Name)
.ToListAsync()
)
.GroupBy(y => y.LanguageCode)
.ToDictionary(y => y.Key, y => y.Select(z => z.Name));
The LINQ GroupBy method can do things that a database query cannot. This is why EF Core throws the exception. It's not a missing feature: older versions of EF Core simply enumerated the entire table and then ran the GroupBy locally.
LINQ query syntax also has a group keyword. The compiler turns it into the same GroupBy call, so it is translated to a database query only when the projection that follows consists of the key and aggregates.
Here's a mostly working example of your query written in query syntax; because it projects the names rather than an aggregate, it may still need AsEnumerable() before the grouping on EF Core 3.x:
var kvPairs = from y in context.Units
from u in y.UnitsI18N
orderby u.Name
group u by u.LanguageCode into g
select new KeyValuePair<string,IEnumerable<string>>(g.Key, g.Select(z => z.Name));
return new Dictionary<string,IEnumerable<string>>(kvPairs);
See this article from Microsoft for more information: https://learn.microsoft.com/en-us/ef/core/querying/complex-query-operators#groupby
var test = unitOfWork.PostCategory.GetAll().Include(u=>u.category).GroupBy(g => g.category.name).Select(s => new
{
name = s.Key,
count = s.Count()
}).OrderBy(o=>o.count).ToList();
You can try this code; it works for me.
Client-Side Group-By is Supported
Tested with EF Core 3.1.15.0
The following code returns the Client side GroupBy is not supported. error:
MyEntity
.GroupBy(x => x.MyProperty)
.ToDictionaryAsync(x => x.Key, x => x.Count())
.Dump();
But if you add a .Select() that projects the key and an aggregate after the .GroupBy(), EF Core can translate the whole query, and it runs the expected SQL:
MyEntity
.GroupBy(x => x.MyProperty)
.Select(g => new { Key = g.Key, Count = g.Count() })
.ToDictionaryAsync(x => x.Key, x => x.Count)
.Dump();
Compiles to:
SELECT [t].[MyProperty] AS [Key], COUNT(*) AS [Count]
FROM [dbo].[MyEntity] AS [t]
GROUP BY [t].[MyProperty]
Source: https://stackoverflow.com/a/11564436/14565661
I have a table, Items, which has a many to one relationship with two distinct parents.
I want to select the counts of ParentA for each ParentB.
In SQL this is simple:
SELECT "ParentBId", count(distinct "ParentAId")
FROM "Items"
GROUP BY "ParentBId"
In Linq I have this statement:
var itemCounts = await _context.Items
.GroupBy(item => item.ParentBId,
(parentBId, items) => new
{
ParentBId = parentBId,
Count = items.Select(item => item.ParentAId).Distinct().Count(),
}).ToDictionaryAsync(group => group.ParentBId, group => group.Count);
When running this query, EF is blowing up with this error:
System.InvalidOperationException: Processing of the LINQ expression 'AsQueryable<string>(Select<Item, string>(
source: NavigationTreeExpression
Value: default(IGrouping<string, Item>)
Expression: (Unhandled parameter: e),
selector: (item) => item.ParentAId))' by 'NavigationExpandingExpressionVisitor' failed. This may indicate either a bug or a limitation in EF Core. See https://go.microsoft.com/fwlink/?linkid=2101433 for more detailed information.
at Microsoft.EntityFrameworkCore.Query.Internal.NavigationExpandingExpressionVisitor.VisitMethodCall(MethodCallExpression methodCallExpression)
at Microsoft.EntityFrameworkCore.Query.Internal.NavigationExpandingExpressionVisitor.VisitMethodCall(MethodCallExpression methodCallExpression)
...
The Items table does use Table per hierarchy with a discriminator column to determine what the item type is. I do not know if this is a factor.
I have seen lots of people recommend the items.Select(i => i.Field).Distinct().Count() option, but this doesn't seem to be working here. Any other suggestions?
Thanks!
Currently any kind of distinction inside groups (like Distinct inside ElementSelector of GroupBy or another GroupBy inside ElementSelector of GroupBy) isn't supported by EF Core. If you insist on using EF in this case, you have to fetch some data in memory:
var result = (await _context.Items
.Select(p => new { p.ParentAId, p.ParentBId })
.Distinct()
.ToListAsync()) // When EF supports mentioned cases above, you can remove this line!
.GroupBy(i => i.ParentBId, i => i.ParentAId)
.ToDictionary(g => g.Key, g => g.Distinct().Count());
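The in-memory part of that workaround can be sketched on its own. The example below uses LINQ to Objects with a hypothetical Item class (string keys are assumed, as suggested by the IGrouping<string, Item> in the error message) and shows why de-duplicating the (ParentAId, ParentBId) pairs first makes a plain Count per group equal to the distinct count.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctParentCounts
{
    // Hypothetical stand-in for the Items entity.
    public sealed class Item
    {
        public string ParentAId { get; set; }
        public string ParentBId { get; set; }
    }

    public static Dictionary<string, int> CountDistinctAPerB(IEnumerable<Item> items)
    {
        return items
            .Select(p => new { p.ParentAId, p.ParentBId })
            .Distinct() // anonymous types compare by value, so duplicate pairs collapse
            .GroupBy(p => p.ParentBId)
            .ToDictionary(g => g.Key, g => g.Count()); // every pair is unique by now
    }

    public static void Main()
    {
        var items = new[]
        {
            new Item { ParentAId = "A1", ParentBId = "B1" },
            new Item { ParentAId = "A1", ParentBId = "B1" }, // duplicate pair
            new Item { ParentAId = "A2", ParentBId = "B1" },
            new Item { ParentAId = "A1", ParentBId = "B2" },
        };

        var counts = CountDistinctAPerB(items);
        Console.WriteLine($"B1={counts["B1"]}, B2={counts["B2"]}");
        // prints "B1=2, B2=1"
    }
}
```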
The goal is to get the first DateTime and Last DateTime from a collection on an Entity (Foreign Key). My Entity is an organization and my collection are Invoices. I'm grouping results since Organizations unfortunately are not Unique. I'm dealing with duplicate data and cannot assume my organizations are unique so I'm grouping by a Number field on my Entity.
I'm using .NET Core 2.1.2 with Entity Framework.
I'm trying to get the following query generated from LINQ:
SELECT MIN([organization].[Id]) AS Id, MIN([organization].[Name]) AS Name,
MIN([organization].[Number]) AS Number, MIN([invoice].[Date])
AS First, MAX([invoice].[Date]) AS Last
FROM [organization]
INNER JOIN [invoice] ON [invoice].[OrganizationId] = [organization].[Id]
GROUP BY [organization].[Number], [organization].[Name]
ORDER BY [organization].[Name]
However I have no idea how to get to write the LINQ query to get it to generate this result.
I got as far as:
await _context
.Organization
.Where(z => z.Invoices.Any())
.GroupBy(organization => new
{
organization.Number,
organization.Name
})
.Select(grouping => new
{
Id = grouping.Min(organization => organization.Id),
Name = grouping.Min(organization => organization.Name),
Number= grouping.Min(organization => organization.Number),
//First = ?,
//Last = ?
})
.OrderBy(z => z.Name)
.ToListAsync();
I have no clue how to write the LINQ query in such a way that it generates the above.
I have a couple questions still:
Are the Min statements for Id, Name and Number correct ways of getting the first element in the grouping?
Do I need a join statement or is "WHERE EXISTS" better (this got generated before I changed the code)?
Does anyone know how to finish writing the LINQ statement? Because I have to get the first and last Date from the Invoices Collection on my Organization Entity:
organization.Invoices.Min(invoice => invoice.Date)
organization.Invoices.Max(invoice => invoice.Date)
Here is the trick.
To make an inner join through a collection navigation property, simply use SelectMany and project all the primitive properties that you need later (this is important for the current EF Core query translator). Then perform the GroupBy and project the key properties and aggregates. Finally, do the ordering.
So
var query = _context
.Organization
.SelectMany(organization => organization.Invoices, (organization, invoice) => new
{
organization.Id,
organization.Number,
organization.Name,
invoice.Date
})
.GroupBy(e => new
{
e.Number,
e.Name
})
.Select(g => new
{
Id = g.Min(e => e.Id),
Name = g.Key.Name,
Number = g.Key.Number,
First = g.Min(e => e.Date),
Last = g.Max(e => e.Date),
})
.OrderBy(e => e.Name);
is translated to
SELECT MIN([organization].[Id]) AS [Id], [organization].[Name], [organization].[Number],
MIN([organization.Invoice].[Date]) AS [First], MAX([organization.Invoice].[Date]) AS [Last]
FROM [Organization] AS [organization]
INNER JOIN [Invoice] AS [organization.Invoice] ON [organization].[Id] = [organization.Invoice].[OrganizationId]
GROUP BY [organization].[Number], [organization].[Name]
ORDER BY [organization].[Name]
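The flatten-then-group pattern can also be tried in memory. The sketch below runs on LINQ to Objects with hypothetical Organization and Invoice classes, so it shows the semantics (inner join, grouping on Number and Name, Min and Max per group) rather than the generated SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InvoiceRanges
{
    // Hypothetical stand-ins for the two entities.
    public sealed class Invoice
    {
        public DateTime Date { get; set; }
    }

    public sealed class Organization
    {
        public int Id { get; set; }
        public string Number { get; set; }
        public string Name { get; set; }
        public List<Invoice> Invoices { get; set; } = new List<Invoice>();
    }

    public static List<(int Id, string Name, string Number, DateTime First, DateTime Last)> Summarize(
        IEnumerable<Organization> organizations)
    {
        return organizations
            // One flat row per (organization, invoice) pair; organizations
            // without invoices produce no rows, like an INNER JOIN.
            .SelectMany(o => o.Invoices, (o, i) => new { o.Id, o.Number, o.Name, i.Date })
            .GroupBy(e => new { e.Number, e.Name })
            .Select(g => (
                Id: g.Min(e => e.Id),
                Name: g.Key.Name,
                Number: g.Key.Number,
                First: g.Min(e => e.Date),
                Last: g.Max(e => e.Date)))
            .OrderBy(r => r.Name)
            .ToList();
    }

    public static void Main()
    {
        var organizations = new[]
        {
            new Organization { Id = 7, Number = "100", Name = "Acme",
                Invoices = { new Invoice { Date = new DateTime(2018, 3, 1) } } },
            // Duplicate of the organization above under a different Id.
            new Organization { Id = 3, Number = "100", Name = "Acme",
                Invoices = { new Invoice { Date = new DateTime(2018, 1, 15) },
                             new Invoice { Date = new DateTime(2018, 6, 30) } } },
            // No invoices, so it does not appear in the result.
            new Organization { Id = 9, Number = "200", Name = "Zeta" },
        };

        foreach (var row in Summarize(organizations))
        {
            Console.WriteLine($"{row.Id} {row.Name} {row.Number} {row.First:d} {row.Last:d}");
        }
    }
}
```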
SELECT
[TimeStampDate]
,[User]
,count(*) as [Usage]
FROM [EFDP_Dev].[Admin].[AuditLog]
WHERE [target] = '995fc819-954a-49af-b056-387e11a8875d'
GROUP BY [Target], [User] ,[TimeStampDate]
ORDER BY [Target]
My database table has the columns User, TimeStampDate, and Target (which is a GUID).
I want to retrieve all items for each date for each user and display count of entries.
The above SQL query works. How can I convert it into LINQ to SQL? Am using EF 6.1 and my entity class in C# has all the above columns.
Create Filter basically returns an IQueryable of the entire AuditLogSet :
using (var filter = auditLogRepository.CreateFilter())
{
var query = filter.All
.Where(it => it.Target == '995fc819-954a-49af-b056-387e11a8875d')
.GroupBy(i => i.Target, i => i.User, i => i.TimeStamp);
audits = query.ToList();
}
Am not being allowed to group by on 3 columns in LINQ and I am also not sure how to select like the above SQL query with count. Fairly new to LINQ.
You need to specify the group-by columns in an anonymous type, like this:
var query = filter.All
.Where(it => it.Target == "995fc819-954a-49af-b056-387e11a8875d")
.GroupBy(x => new { x.User, x.TimeStampDate })
.Select(x => new
{
TimeStampDate= x.Key.TimeStampDate,
User = x.Key.User,
Usage = x.Count()
}).ToList();
Many people find query syntax simpler and easier to read (this might not be the case, I don't know), here's the query syntax version anyway.
var res=(from it in filter.All
where it.Target=="995fc819-954a-49af-b056-387e11a8875d"
group it by new {it.Target, it.User, it.TimeStampDate} into g
orderby g.Key.Target
select new
{
TimeStampDate= g.Key.TimeStampDate,
User=g.Key.User,
Usage=g.Count()
});
EDIT: By the way, you need neither the grouping by Target nor the OrderBy, since the rows are already filtered on Target; I'm leaving the exact translation of the query, though.
To use GroupBy with several columns, you need to create an anonymous object like this:
filter.All
.Where(it => it.Target == "995fc819-954a-49af-b056-387e11a8875d")
.GroupBy(i => new { i.Target, i.User, i.TimeStampDate });
It is unnecessary to group by target in your original SQL.
filter.All.Where( d => d.Target == "995fc819-954a-49af-b056-387e11a8875d")
.GroupBy(d => new {d.User ,d.TimeStampDate} )
.Select(d => new {
User = d.Key.User,
TimeStampDate = d.Key.TimeStampDate,
Usage = d.Count()
} );
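Grouping by an anonymous type works because two anonymous objects with equal property values compare as equal, so they act as one composite key. Here is a minimal in-memory sketch of that idea (LINQ to Objects, not EF); the AuditLog class and the UsageFor helper are hypothetical, and Target is assumed to be a string for simplicity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AuditUsage
{
    // Hypothetical stand-in for the entity; Target is assumed to be a string here.
    public sealed class AuditLog
    {
        public string Target { get; set; }
        public string User { get; set; }
        public DateTime TimeStampDate { get; set; }
    }

    public static List<(DateTime TimeStampDate, string User, int Usage)> UsageFor(
        IEnumerable<AuditLog> logs, string target)
    {
        return logs
            .Where(l => l.Target == target)
            // Anonymous types implement value equality, so rows with the
            // same User and TimeStampDate land in the same group.
            .GroupBy(l => new { l.User, l.TimeStampDate })
            .Select(g => (
                TimeStampDate: g.Key.TimeStampDate,
                User: g.Key.User,
                Usage: g.Count()))
            .ToList();
    }

    public static void Main()
    {
        var day = new DateTime(2015, 1, 1);
        var logs = new[]
        {
            new AuditLog { Target = "t1", User = "alice", TimeStampDate = day },
            new AuditLog { Target = "t1", User = "alice", TimeStampDate = day },
            new AuditLog { Target = "t1", User = "bob", TimeStampDate = day },
            new AuditLog { Target = "t2", User = "alice", TimeStampDate = day },
        };

        foreach (var row in UsageFor(logs, "t1"))
        {
            Console.WriteLine($"{row.User}: {row.Usage}");
        }
    }
}
```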
I'm trying to find all customer codes where the customer has a status of "A" and whose code does not contain any letter using LINQ query.
var activeCustomers = Customers.Where(x => x.Status == "A" && x.Code.Any(n => !char.IsLetter(n))).Select(x => x.Code);
When I run this query in LinqPad I get the following error:
You'll need to do this as a two-part query. First, get all the customers whose status is "A":
var activeCustomers = Customers.Where(x => x.Status == "A").ToList();
Once those are in memory, you can apply the additional char.IsLetter filter:
var codes = activeCustomers.Where(x => x.Code.Any(n => !char.IsLetter(n)))
.Select(x => x.Code)
.ToArray();
As commenters have stated, IsLetter() cannot be translated to SQL. However, you could do the following, which will first retrieve all items with Status "A" from the database, then will apply your criteria after retrieval:
var activeCustomers = Customers.Where(x => x.Status == "A").AsEnumerable().Where(x => x.Code.Any(n => !char.IsLetter(n))).Select(x => x.Code);
You'll have to determine if it's acceptable (from a performance perspective) to retrieve all customers with "A" and then process.
The AsEnumerable() transitions your LINQ query to working not with IQueryable (which works with SQL) but with IEnumerable, which is used for plain LINQ to objects.
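The switch can be observed without a database. In this sketch an in-memory array wrapped with AsQueryable stands in for the Customers table (an assumption made only for the illustration); the FilterLocally helper is a hypothetical name.

```csharp
using System;
using System.Linq;

public static class AsEnumerableDemo
{
    public static string[] FilterLocally(IQueryable<string> codes)
    {
        return codes
            // Still IQueryable: the lambda is captured as an expression tree
            // that a query provider could translate.
            .Where(c => c.Length == 3)
            // From here on the static type is IEnumerable<string>, so the next
            // Where binds to Enumerable.Where and runs as ordinary C# code.
            .AsEnumerable()
            .Where(c => c.Any(ch => !char.IsLetter(ch)))
            .ToArray();
    }

    public static void Main()
    {
        // AsQueryable wraps the array; it plays the role of the database table.
        IQueryable<string> codes = new[] { "ABC", "A12", "456" }.AsQueryable();

        Console.WriteLine(string.Join(",", FilterLocally(codes)));
        // prints "A12,456"
    }
}
```

Everything before AsEnumerable is what a database provider would have to translate; everything after it only needs to be valid C#.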
Since this is LINQ 2 SQL, there is no natural way to translate char.IsLetter into something SQL can understand. You can run a query that retrieves your potential candidates and then apply an additional in-memory filter. This also solves the issue where LINQ 2 SQL has a preference for strings while you are dealing with chars.
var activeCustomers = Customers.Where(x => x.Status == "A").ToList();
var filteredCustomers = activeCustomers.Where(x =>
x.Code.Any(n => !char.IsLetter(n))).Select(x => x.Code).ToList();
There are two performance hits here. First, you're retrieving all potential records, which isn't too desirable. Second, in your code above you were only interested in an enumerable collection of codes, which means our query is including far more data than we originally wanted.
You could tighten up the query by returning only the columns necessary to apply your filtering (projected into an anonymous type, since LINQ 2 SQL does not allow constructing an entity type inside a query):
var activeCustomers = Customers.Where(x => x.Status == "A")
.Select(x => new { x.Status, x.Code }).ToList();
You still return more rows than you need, but your query includes fewer columns.
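One detail about the in-memory filter itself is worth checking. The question asks for codes that do not contain any letter, while Any(n => !char.IsLetter(n)) accepts every code that has at least one non-letter character. The sketch below contrasts the two predicates; the helper names are made up for this comparison.

```csharp
using System;
using System.Linq;

public static class LetterFreeCodes
{
    // The predicate used in the snippets above: true if at least one character is not a letter.
    public static bool HasNonLetter(string code) => code.Any(c => !char.IsLetter(c));

    // Matches the stated goal: true only if no character is a letter.
    public static bool HasNoLetters(string code) => code.All(c => !char.IsLetter(c));

    public static void Main()
    {
        Console.WriteLine(HasNonLetter("A12")); // True
        Console.WriteLine(HasNoLetters("A12")); // False
        Console.WriteLine(HasNoLetters("412")); // True
    }
}
```

If the stated goal is what is wanted, swapping Any for All in the in-memory Where is enough; the database part of the query stays the same.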