How to translate this Queryable linq function - c#

I'm struggling to get this LINQ query translated into correct T-SQL.
Please look at the following statement:
// determine the max count of exams applied by students
IQueryable query = (from at in Database.Current.AnsweredTests
where at.TestId == id
group at by at.StudentId into s
select s.Count()).Max();
As you can see, this is syntactically wrong, because the Max extension returns an int rather than an IQueryable. What I'm trying to accomplish is to generate correct T-SQL.
Something like this:
MAX(SELECT x.COUNT()
FROM...
GROUP BY StudentId)
I want this for performance, because the current approach performs poorly. So my question is: how can I write a correct LINQ statement that uses aggregate functions like MAX and COUNT?
UPDATE:
SELECT [GroupBy1].[A1] AS [C1]
FROM ( SELECT
[Extent1].[StudentId] AS [K1],
COUNT(1) AS [A1]
FROM [dbo].[AnsweredTests] AS [Extent1]
WHERE CAST( [Extent1].[TestId] AS int) = @p__linq__0
GROUP BY [Extent1].[StudentId]
) AS [GroupBy1]
This is what the IQueryable generates (if I remove the Max extension, of course). I would like to know whether there is a way to include the MAX aggregate function in that T-SQL query to improve performance on the server side.

You could also word your query in the following way:
SELECT TOP 1 COUNT(*)
FROM AnsweredTests
WHERE TestId = @id
GROUP BY StudentId
ORDER BY COUNT(*) DESC
Following that logic, this (untested) should be what you are looking for:
var result = (from at in Database.Current.AnsweredTests
where at.TestId == id
group at by at.StudentId into s
orderby s.Count() descending
select s.Count()).First();

You can do ORDER BY DESCENDING and then take first:
var Max = (from at in Database.Current.AnsweredTests
where at.TestId == id
group at by at.StudentId into s
select new { Count = s.Count() }).OrderByDescending(o=>o.Count).First();
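Note that Max() itself is fine as long as the result is assigned to an int rather than an IQueryable. Here is a minimal in-memory sketch (LINQ to Objects, not Entity Framework) comparing both forms. The sample rows are made up for illustration, and the SQL a provider would emit for either form is not shown here.

```csharp
using System;
using System.Linq;

// Hypothetical sample data standing in for Database.Current.AnsweredTests.
var answeredTests = new[]
{
    (TestId: 1, StudentId: 10),
    (TestId: 1, StudentId: 10),
    (TestId: 1, StudentId: 10),
    (TestId: 1, StudentId: 20),
    (TestId: 2, StudentId: 20),
};
int id = 1;

// Count the answered tests per student, then take the largest count.
int maxViaMax = (from at in answeredTests
                 where at.TestId == id
                 group at by at.StudentId into s
                 select s.Count()).Max();

// Same result via ordering, as in the answers above.
int maxViaFirst = (from at in answeredTests
                   where at.TestId == id
                   group at by at.StudentId into s
                   orderby s.Count() descending
                   select s.Count()).First();

Console.WriteLine($"{maxViaMax} {maxViaFirst}"); // prints "3 3"
```

One difference to keep in mind: First() throws on an empty sequence, and so does Max() over a sequence of non-nullable int, so both need a guard if no rows match the test id.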


Get only rows with the latest date for each name

I'm trying to write a query that returns only those rows that contain the latest date for each name.
So for example, this data:
Name   Date Sold   More Columns...
-----  ----------  ---------------
Bob    2021-01-05
Mike   2021-01-18
Susan  2021-01-23
Bob    2021-02-04
Susan  2021-02-16
Mike   2021-03-02
Would produce this result:
Name   Date Sold   More Columns...
-----  ----------  ---------------
Bob    2021-02-04
Susan  2021-02-16
Mike   2021-03-02
It's sort of like a GROUP BY, but I'm not aggregating anything. I only want to filter the original rows.
How could I write such a query?
NOTE: In the end, this will be a SQL Server query but I need to write it using Entity Framework.
UPDATE: In reality, this is part of a much more complex query. It would be extremely difficult for me to implement this as a raw SQL query. If at all possible, I need to implement using Entity Framework.
Two options
Select top 1 with ties *
From YourTable
Order by row_number() over (partition by Name order by Sold_Date desc)
or slightly more performant
with cte as (
Select *
,RN = row_number() over (partition by Name order by Sold_Date desc)
From YourTable
)
Select *
From cte
Where RN=1
Adapted from "Error while flattening the IQueryable<T> after GroupBy()":
var names = _context.Items.Select(row => row.Name).Distinct();
var items =
from name in names
from item in _context.Items
.Where(row => row.Name == name)
.OrderByDescending(row => row.DateSold)
.Take(1)
select item;
var results = await items.ToArrayAsync();
Let's break this down:
A query expression which establishes the keys for our next query. Will eventually be run as a subquery.
var names = _context.Items.Select(row => row.Name).Distinct();
Another query, starting with the keys...
var items =
from name in names
... and for each key, let's find the matching row ...
from item in _context.Items
.Where(row => row.Name == name)
.OrderByDescending(row => row.DateSold)
.Take(1)
... and we want that row.
select item;
Run the combined query.
var results = await items.ToArrayAsync();
Try this ([Table] stands for your table name; it is bracketed because TABLE is a reserved word):
;with Groups as
(
Select [Name], max([Date Sold]) as [Date Sold]
From [Table]
Group By [Name]
)
Select [Table].* From Groups
Inner Join [Table] on [Table].[Name] = Groups.[Name] And [Table].[Date Sold] = Groups.[Date Sold]
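To see what the "distinct keys, then newest row per key" LINQ shape returns, here is a small in-memory sketch (LINQ to Objects, not Entity Framework) using the sample rows from the question. The tuple layout is an assumption for illustration, and the extra columns are omitted.

```csharp
using System;
using System.Linq;

// The sample rows from the question, held in memory.
var items = new[]
{
    (Name: "Bob",   DateSold: new DateTime(2021, 1, 5)),
    (Name: "Mike",  DateSold: new DateTime(2021, 1, 18)),
    (Name: "Susan", DateSold: new DateTime(2021, 1, 23)),
    (Name: "Bob",   DateSold: new DateTime(2021, 2, 4)),
    (Name: "Susan", DateSold: new DateTime(2021, 2, 16)),
    (Name: "Mike",  DateSold: new DateTime(2021, 3, 2)),
};

// Distinct keys first, then the newest row for each key.
var names = items.Select(row => row.Name).Distinct();
var latest =
    (from name in names
     from item in items
         .Where(row => row.Name == name)
         .OrderByDescending(row => row.DateSold)
         .Take(1)
     select item).ToArray();

foreach (var result in latest)
    Console.WriteLine($"{result.Name} {result.DateSold:yyyy-MM-dd}");
```

With this data the result contains one row per name: Bob 2021-02-04, Mike 2021-03-02 and Susan 2021-02-16.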

Linq generate select with no from

In SQL I can do this:
select 'a' as MyColumn
I have a LINQ query with Entity Framework that gets some data from the database, but I need to union that query with one extra row. In SQL I can do this:
select ... from ...
union
select 'a' as MyColumn
How can I generate this query with LINQ?
I tried to do this:
var query = (from ... select new {..}).Union(new List<...> { new ...() { MyColumn = 'a' } })
But I guess that Entity Framework doesn't know how to translate that in-memory list to SQL.
I need to get an IQueryable result, not a List or other in-memory collection, because I need to join that result to other LINQ queries later.
This isn't possible and you shouldn't do it. Both for the same reason: Entity Framework will try to translate the whole LINQ statement into SQL, including the local list (new List<...>).
The reason why it's not possible is that EF has no way to translate C# objects into SQL constructs.
The reason why you shouldn't do it is that it's incredibly wasteful: you build the list in C# code, EF (if it could) translates it into a SQL statement, the database runs the SQL statement and converts it to a result set, EF receives the result set and converts it into the list you originally offered it.
Just to demonstrate it, I'll show what happens if you do this with a list of primitive values which EF does know how to translate into SQL:
var ints = Enumerable.Range(1,5);
var res = Products.Select(c => c.Id).Union(ints).ToList();
This produces the following SQL statement:
SELECT
[Distinct1].[C1] AS [C1]
FROM ( SELECT DISTINCT
[UnionAll5].[ProductId] AS [C1]
FROM (SELECT
[Extent1].[ProductId] AS [ProductId]
FROM [dbo].[Product] AS [Extent1]
UNION ALL
SELECT
1 AS [C1]
FROM ( SELECT 1 AS X ) AS [SingleRowTable1]
UNION ALL
SELECT
2 AS [C1]
FROM ( SELECT 1 AS X ) AS [SingleRowTable2]
UNION ALL
SELECT
3 AS [C1]
FROM ( SELECT 1 AS X ) AS [SingleRowTable3]
UNION ALL
SELECT
4 AS [C1]
FROM ( SELECT 1 AS X ) AS [SingleRowTable4]
UNION ALL
SELECT
5 AS [C1]
FROM ( SELECT 1 AS X ) AS [SingleRowTable5]) AS [UnionAll5]
) AS [Distinct1]
As you see, for each element in the list EF generated a SingleRowTablex entry to build a "temp table" to UNION with the ids from the actual query.
Conclusion: just query what you need from the database and add to the result afterwards. It's easy enough to do that:
(from ... select new {..})
.AsEnumerable() // continue in memory
.Union(...)
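A minimal runnable sketch of that last step follows. The string array wrapped in AsQueryable() is only a stand-in for the real database query, and the values are made up. Everything after AsEnumerable() runs as LINQ to Objects.

```csharp
using System;
using System.Linq;

// Stand-in for a database query; AsQueryable() only mimics IQueryable here.
var dbRows = new[] { "x", "y" }.AsQueryable();

var result = dbRows
    .Select(s => new { MyColumn = s })
    .AsEnumerable()                          // continue in memory
    .Union(new[] { new { MyColumn = "a" } }) // add the extra row locally
    .ToList();

Console.WriteLine(string.Join(", ", result.Select(r => r.MyColumn))); // prints "x, y, a"
```

The two anonymous types unify because they have the same property name and type, which is what lets Union accept the local array.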

This Any is better or not than this contains?

I am using EF6 and I would like to get the records in a table which are in a group of IDs.
In my test for example I am using 4 IDs.
I tried two options. The first is with Any.
dbContext.MyTable
.Where(x => myIDS.Any(y=> y == x.MyID));
And the T-SQL that this LINQ expression generates is:
SELECT
*
FROM [dbo].[MiTabla] AS [Extent1]
WHERE EXISTS (SELECT
1 AS [C1]
FROM (SELECT
[UnionAll2].[C1] AS [C1]
FROM (SELECT
[UnionAll1].[C1] AS [C1]
FROM (SELECT
cast(130 as bigint) AS [C1]
FROM ( SELECT 1 AS X ) AS [SingleRowTable1]
UNION ALL
SELECT
cast(139 as bigint) AS [C1]
FROM ( SELECT 1 AS X ) AS [SingleRowTable2]) AS [UnionAll1]
UNION ALL
SELECT
cast(140 as bigint) AS [C1]
FROM ( SELECT 1 AS X ) AS [SingleRowTable3]) AS [UnionAll2]
UNION ALL
SELECT
cast(141 as bigint) AS [C1]
FROM ( SELECT 1 AS X ) AS [SingleRowTable4]) AS [UnionAll3]
WHERE [UnionAll3].[C1] = [Extent1].[MiID]
)
As can be seen, the T-SQL is a "where exists" that uses many subqueries and unions.
The second option is with Contains.
dbContext.MyTable
.Where(x => myIDS.Contains(x.MyID));
And the T-SQL:
SELECT
*
FROM [dbo].[MiTabla] AS [Extent1]
WHERE [Extent1].[MiID] IN (cast(130 as bigint), cast(139 as bigint), cast(140 as bigint), cast(141 as bigint))
Contains is translated into "where in", and the query is much less complex.
I have read that Any is usually faster, so I wonder whether Any, although more complex at first glance, is actually faster or not.
Thanks so much.
EDIT: I have run a test (I don't know if this is the best way to test this).
System.Diagnostics.Stopwatch miswContains = new System.Diagnostics.Stopwatch();
miswContains.Start();
for (int i = 0; i < 100; i++)
{
IQueryable<MyTable> iq = dbContext.MyTable
.Where(x => myIDS.Contains(x.MyID));
iq.ToArrayAsync();
}
miswContains.Stop();
System.Diagnostics.Stopwatch miswAny = new System.Diagnostics.Stopwatch();
miswAny.Start();
for (int i = 0; i < 20; i++)
{
IQueryable<MyTable> iq = dbContext.MyTable
.Where(x => myIDS.Any(y => y == x.MyID));
iq.ToArrayAsync();
}
miswAny.Stop();
The results are that miswAny is about 850 ms and miswContains is about 4251 ms.
So the second option, with Contains, is slower.
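Note that the test code above does not await ToArrayAsync() and runs the Contains loop 100 times but the Any loop only 20 times, so the two timings are not directly comparable. Below is a sketch of a fairer harness that awaits every call and uses the same iteration count for both. TimeAsync is a hypothetical helper, and Task.Delay stands in for the real queries so the sketch runs on its own.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Stand-ins for the two queries; in real code these would be something like
// () => dbContext.MyTable.Where(...).ToArrayAsync().
Func<Task> containsQuery = () => Task.Delay(1);
Func<Task> anyQuery = () => Task.Delay(1);

const int iterations = 20; // same count for both measurements
TimeSpan containsTime = await TimeAsync(containsQuery, iterations);
TimeSpan anyTime = await TimeAsync(anyQuery, iterations);

Console.WriteLine($"Contains: {containsTime.TotalMilliseconds} ms");
Console.WriteLine($"Any:      {anyTime.TotalMilliseconds} ms");

// Hypothetical helper: runs the action `runs` times, awaiting each run so the
// stopwatch covers the completed work rather than just the task creation.
static async Task<TimeSpan> TimeAsync(Func<Task> action, int runs)
{
    var stopwatch = Stopwatch.StartNew();
    for (int i = 0; i < runs; i++)
    {
        await action();
    }
    stopwatch.Stop();
    return stopwatch.Elapsed;
}
```

Even with such a harness, the first query executed usually pays one-time warm-up costs, so a warm-up run before measuring is worth considering.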
Your second option is the fastest solution I can think of (at least for not very large arrays of ids) provided your MiTabla.MiID is in an index.
If you want to read more about in clause performance: Is SQL IN bad for performance?.
If you know the ID, then using LINQ2SQL Count() method would create a much cleaner and faster SQL code (than both Any and Contains):
dbContext.MyTable
.Where(x => myIDS.Count(y=> y == x.MyID) > 0);
The generated SQL for the count should look something like this:
DECLARE @p0 Decimal(9,0) = 12345
SELECT COUNT(*) AS [value]
FROM [ids] AS [t0]
WHERE [t0].[id] = @p0
You can tell by the shape of the queries that Any is not scalable at all. It doesn't take many elements in myIDS (probably around 50) to get a SQL exception saying that the maximum nesting level has been exceeded.
Contains is much better in this respect. It can handle a couple of thousand elements before its performance is severely affected.
So I would go for the scalable solution, even though Any may be faster with small numbers. It is possible to make Contains scale even better.
"I have read that Any is usually faster"
In LINQ-to-objects that's generally true, because the enumeration stops at the first hit. But with LINQ against a SQL backend, the generated SQL is what counts.
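One way to keep Contains manageable with very long ID lists is to split the list into batches and run one query per batch. This is an assumption about what "more scalable" could look like, not something taken from the answer above. The sketch below is in-memory only: the queryable table is a stand-in for dbContext.MyTable, and the batch size is an arbitrary illustrative value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for dbContext.MyTable; AsQueryable() only mimics IQueryable here.
var table = Enumerable.Range(1, 10_000)
    .Select(n => new { MyID = (long)n })
    .AsQueryable();

// A large ID list (all even numbers up to 10000).
List<long> myIDS = Enumerable.Range(1, 5_000).Select(n => (long)n * 2).ToList();
const int batchSize = 1_000; // assumed value; tune for the actual database

var matches = new List<long>();
for (int offset = 0; offset < myIDS.Count; offset += batchSize)
{
    // Each query only sees a small slice of the ID list.
    List<long> batch = myIDS.Skip(offset).Take(batchSize).ToList();
    matches.AddRange(table.Where(x => batch.Contains(x.MyID)).Select(x => x.MyID));
}

Console.WriteLine(matches.Count); // prints 5000
```

The trade-off is several round trips instead of one, so this only pays off once a single IN list becomes too large.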

EntityFramework Group by not included in SQL statement

I'm trying to create a query similar to this:
select randomId
from myView
where ...
group by randomId
NOTE: EF doesn't support the distinct so I was thinking of going around the lack of it with the group by (or so I think)
randomId is numeric
Entity Framework V.6.0.2
This gives me the expected result in a query that takes < 1 second.
When trying to do the same with EF, I have been having some issues.
If I do the LINQ similar to this:
context.myView
.Where(...)
.GroupBy(mt => mt.randomId)
.Select(group => new { Id = group.Key, Count = group.Count() })
I get roughly the same result, but it forces a count and makes the query take > 6 seconds.
The SQL EF generates is something like this:
SELECT
1 AS [C1],
[GroupBy1].[K1] AS [randomId],
[GroupBy1].[A1] AS [C2]
FROM (
SELECT
[Extent1].[randomId] AS [K1],
COUNT(1) AS [A1]
FROM [dbo].[myView] AS [Extent1]
WHERE (...)
GROUP BY [Extent1].[randomId]
) AS [GroupBy1]
But if the count is commented out of that SQL query, it is back to < 1 second.
If I change the Select to be like:
.Select(group => new { Id = group.Key })
I will get all of rows without the group by statement in the SQL query and no Distinct whatsoever:
SELECT
[Extent1].[anotherField] AS [anotherField], -- 'this field got included automatically on this query and I dont know why, it doesnt affect outcome when removed in SQL server'
[Extent1].[randomId] AS [randomId]
FROM [dbo].[myView] AS [Extent1]
WHERE (...)
Other failed attempts:
query.GroupBy(x => x.randomId).Select(group => group.FirstOrDefault());
The query that was generated is as follows:
SELECT
[Limit1].ALL FIELDS,...
FROM (SELECT
[Extent1].[randomId] AS [randomId]
FROM [dbo].[myView] AS [Extent1]
WHERE (...) AS [Project1]
OUTER APPLY (SELECT TOP (1)
[Extent2].ALL FIELDS,...
FROM [dbo].[myView] AS [Extent2]
WHERE (...) AS [Limit1] -- same as the where above
This query performed rather poorly and still managed to return all Ids for the where clause.
Does anyone have an idea on how to force the usage of the group by without an aggregating function like a count?
In SQL it works but then again I have the distinct keyword as well...
Cheers,
J
var query = from p in TableName
select new {Id = p.ColumnNameId};
var distinctItems = query.Distinct().ToList();
Here is the LINQ query; you should be able to write an equivalent against an EF DbSet too. If you have issues, let me know.
Cheers!
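For reference, here is an in-memory sketch (LINQ to Objects) of that projection-plus-Distinct shape. The rows are hypothetical stand-ins for myView, and whether a given EF version turns this into SELECT DISTINCT should be checked against the generated SQL.

```csharp
using System;
using System.Linq;

// Hypothetical rows standing in for myView.
var myView = new[]
{
    new { randomId = 7, anotherField = "a" },
    new { randomId = 7, anotherField = "b" },
    new { randomId = 9, anotherField = "c" },
};

// Project only the key first, then remove duplicates.
var distinctIds = myView
    .Select(p => p.randomId)
    .Distinct()
    .ToList();

Console.WriteLine(string.Join(", ", distinctIds)); // prints "7, 9"
```

Projecting before calling Distinct() matters: Distinct() on the full rows would compare every column, not just randomId.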

LINQ to Entities find top records from ordered groupings

I have a problem that I know how to solve in SQL but not with Linq to Entities.
My data looks like this:
ID GROUP TIMESTAMP
-- ----- ---------
1 A 2011-06-20
2 A 2011-06-21
3 B 2011-06-21
4 B 2011-06-22
5 B 2011-06-23
6 C 2011-06-30
I want to retrieve all the Entity objects (not just the ID) such that I am only getting the most recent record from each group. (ie. the records with ids 2, 5, 6)
In SQL I would do something like this:
SELECT * FROM my_table a
WHERE a.timestamp =
(SELECT MAX(timestamp) FROM my_table b
WHERE a.group = b.group)
(For the sake of this question you can assume that timestamp is unique within each group).
I'd like to do this query against a WCF Data Service using Linq to Entities but I can't seem to have a nested query that references the outside query like this. Can anyone help?
Possibly not as clean and efficient as the hand-written version, but here's what I came up with:
var q = from a in db.MyEntities
where a.Timestamp == (from b in db.MyEntities
where b.Group == a.Group
select b.Timestamp).Max()
select a;
which translates into this SQL
SELECT
[Project1].[Id] AS [Id],
[Project1].[Group] AS [Group],
[Project1].[Timestamp] AS [Timestamp]
FROM ( SELECT
[Extent1].[Id] AS [Id],
[Extent1].[Group] AS [Group],
[Extent1].[Timestamp] AS [Timestamp],
[SSQTAB1].[A1] AS [C1]
FROM [MyEntities] AS [Extent1]
OUTER APPLY
(SELECT
MAX([Extent2].[Timestamp]) AS [A1]
FROM [MyEntities] AS [Extent2]
WHERE [Extent2].[Group] = [Extent1].[Group]) AS [SSQTAB1]
) AS [Project1]
WHERE [Project1].[Timestamp] = [Project1].[C1]
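The same correlated-subquery shape can be checked in memory against the sample data from the question. This is a LINQ to Objects sketch only, so it says nothing about what a WCF Data Service will accept; the anonymous type is an assumed stand-in for the entity.

```csharp
using System;
using System.Linq;

// The sample rows from the question, held in memory.
var myEntities = new[]
{
    new { Id = 1, Group = "A", Timestamp = new DateTime(2011, 6, 20) },
    new { Id = 2, Group = "A", Timestamp = new DateTime(2011, 6, 21) },
    new { Id = 3, Group = "B", Timestamp = new DateTime(2011, 6, 21) },
    new { Id = 4, Group = "B", Timestamp = new DateTime(2011, 6, 22) },
    new { Id = 5, Group = "B", Timestamp = new DateTime(2011, 6, 23) },
    new { Id = 6, Group = "C", Timestamp = new DateTime(2011, 6, 30) },
};

// Keep a row only if its timestamp is the maximum within its own group.
var q = from a in myEntities
        where a.Timestamp == (from b in myEntities
                              where b.Group == a.Group
                              select b.Timestamp).Max()
        select a;

Console.WriteLine(string.Join(", ", q.Select(e => e.Id))); // prints "2, 5, 6"
```

This relies on the assumption stated in the question that timestamps are unique within each group; with ties, every tied row would be returned.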
Hi, try Linqer, which converts SQL statements to LINQ queries.
Best regards
This should work:
var query = db.my_table
.GroupBy(p=>p.group)
.Select(p=>p.OrderByDescending(q=>q.timestamp).First());
Here you go. A simple way to do it:
var result = (from x in my_table
group x by x.Group into g
select new
{
g.Key,
timestamp = g.Max(x => x.TimeStamp),
g //This will return everything in g
});
