SQL is fast but when converted to LINQ it's slow - C#

This SQL is fast:
SELECT Foo,
count(*)
FROM
(SELECT Foo
FROM MyTable
GROUP BY Foo,
Bar,
Baz) AS Subquery
GROUP BY Foo
This LINQ query is also fast:
var query = from fooGrp in
(from rv in _myRepository.AsQueryable()
group rv by new {rv.Foo, rv.Bar, rv.Baz}
into grp
select grp)
group fooGrp by fooGrp.Key.Foo
into grp2
select new {grp2.Key, Count = grp2.Count()};
query.ToDictionary(x => x.Key, x => x.Count);
This LINQ query is slow, really slow!
_myRepository.AsQueryable()
.GroupBy(x => new { x.Foo, x.Bar, x.Baz })
.GroupBy(x => x.Key.Foo)
.ToDictionary(x => x.Key, x => x.Count());
I don't understand :(
What is the difference between the two LINQ expressions? They both return the expected result set.
The generated SQL for the first expression (fast) is:
SELECT
1 AS [C1],
[GroupBy1].[K1] AS [Foo],
[GroupBy1].[A1] AS [C2]
FROM ( SELECT
[Distinct1].[Foo] AS [K1],
COUNT(1) AS [A1]
FROM ( SELECT DISTINCT
[Extent1].[Foo] AS [Foo],
[Extent1].[Bar] AS [Bar],
[Extent1].[Baz] AS [Baz]
FROM [dbo].[MyTable] AS [Extent1]
) AS [Distinct1]
GROUP BY [Distinct1].[Foo]
) AS [GroupBy1]
The generated SQL for the second expression (slow) is so long that it exceeds the character limit of this post, so I cannot post it :/

So it turns out that the two LINQ expressions are not identical.
The correct LINQ expression is:
_myRepository.AsQueryable()
.GroupBy(x => new {x.Foo, x.Bar, x.Baz})
.GroupBy(x => x.Key.Foo)
.Select(x => new {x.Key, Count = x.Count()})
.ToDictionary(x => x.Key, x => x.Count);
I was missing a Select. I didn't expect to need one, because in plain SQL you can only select columns that appear in the GROUP BY clause (or aggregates of the others). LINQ, however, does all kinds of magic to include the rest of the columns in each group unless you limit them in the Select.
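To make the shape of the corrected expression concrete, here is a minimal in-memory sketch using LINQ to Objects. The Row type and the sample data are hypothetical stand-ins for MyTable, and nothing here touches EF or shows the SQL it would generate; it only illustrates projecting the key and the count before calling ToDictionary.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupCountDemo
{
    // Hypothetical row type standing in for the MyTable entity.
    public record Row(string Foo, int Bar, int Baz);

    // Counts the distinct (Foo, Bar, Baz) combinations per Foo. The explicit
    // Select keeps only the key and the count before the dictionary is built.
    public static Dictionary<string, int> CountCombinationsPerFoo(IEnumerable<Row> rows) =>
        rows.GroupBy(x => new { x.Foo, x.Bar, x.Baz })
            .GroupBy(g => g.Key.Foo)
            .Select(g => new { g.Key, Count = g.Count() })
            .ToDictionary(x => x.Key, x => x.Count);

    public static void Main()
    {
        var rows = new[]
        {
            new Row("a", 1, 1),
            new Row("a", 1, 1), // same combination again, counted once
            new Row("a", 2, 1),
            new Row("b", 1, 1),
        };

        // "a" has two distinct combinations, "b" has one.
        foreach (var pair in CountCombinationsPerFoo(rows))
            Console.WriteLine($"{pair.Key}: {pair.Value}");
    }
}
```

Against a real IQueryable the same Select is what lets the provider emit a plain COUNT, as in the fast SQL shown in the question.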

Related

Linq Query With Multiple Joins Not Giving Correct Results

I have a Linq query which is being used to replace a database function. This is the first one with multiple joins and I can't seem to figure out why it returns 0 results.
If you can see any difference which could cause the incorrect result, it would be greatly appreciated. I've been trying to solve it for longer than I should have.
LINQ query:
context.StorageAreaRacks
.Join(context.StorageAreas, sar => sar.StorageAreaId, sa => sa.Id, (sar, sa) => new { sar, sa })
.Join(context.StorageAreaTypes, xsar => xsar.sar.StorageAreaId, sat => sat.Id, (xsar, sat) => new { xsar, sat })
.Join(context.Racks, xxsar => xxsar.xsar.sar.RackId, r => r.Id, (xxsar, r) => new { xxsar, r })
.Where(x => x.xxsar.sat.IsManual == false)
.Where(x => x.r.IsEnabled == true)
.Where(x => x.r.IsVirtual == false)
.Select(x => new { x.xxsar.sat.Id, x.xxsar.sat.Name })
.Distinct()
.ToList();
This is the SQL generated by the LINQ query:
SELECT
[Distinct1].[C1] AS [C1],
[Distinct1].[Id] AS [Id],
[Distinct1].[Name] AS [Name]
FROM ( SELECT DISTINCT
[Extent2].[Id] AS [Id],
[Extent2].[Name] AS [Name],
1 AS [C1]
FROM [dbo].[StorageAreaRacks] AS [Extent1]
INNER JOIN [dbo].[StorageAreaTypes] AS [Extent2] ON [Extent1].[StorageAreaId] = [Extent2].[Id]
INNER JOIN [dbo].[Racks] AS [Extent3] ON [Extent1].[RackId] = [Extent3].[Id]
WHERE (0 = [Extent2].[IsManual]) AND (1 = [Extent3].[IsEnabled]) AND (0 = [Extent3].[IsVirtual])
) AS [Distinct1]
SQL query which produces the required results:
SELECT DISTINCT sat.Name, sat.Id
FROM StorageAreaRacks sar
JOIN StorageAreas sa on sa.id = sar.StorageAreaId
JOIN StorageAreaTypes sat on sat.id = sa.StorageAreaTypeId
JOIN Racks r on r.id = sar.RackId
WHERE sat.IsManual = 0
AND r.IsEnabled = 1
AND r.IsVirtual = 0
Using joins with LINQ method syntax is hard to read and error-prone.
Using joins with LINQ query syntax is better, but still error-prone (you can join on the wrong key, as you did) and tells you nothing about join cardinality.
The best approach for LINQ to Entities queries is to use navigation properties (as Gert Arnold suggested in the comments, and not only there; see "Don't use Linq's Join. Navigate!"), because they have none of the aforementioned drawbacks.
The whole query should be something like this:
var query = context.StorageAreaRacks
.Where(sar => !sar.StorageArea.StorageAreaType.IsManual
&& sar.Rack.IsEnabled && !sar.Rack.IsVirtual)
.Select(sar => new
{
sar.StorageArea.StorageAreaType.Id,
sar.StorageArea.StorageAreaType.Name,
})
.Distinct();
or
var query = (
from sar in context.StorageAreaRacks
let sat = sar.StorageArea.StorageAreaType
let r = sar.Rack
where !sat.IsManual && r.IsEnabled && !r.IsVirtual
select new { sat.Id, sat.Name })
.Distinct();
Simple, readable, and with almost no room for mistakes. Navigation properties are one of the most beautiful features of EF; don't miss them.
Your LINQ doesn't translate the SQL properly: it joins StorageAreaTypes on StorageAreaRack.StorageAreaId instead of on StorageAreas.StorageAreaTypeId, which is why EF drops the StorageAreas join (it has no effect on the outcome).
I think it is clearer if you elevate the members of each join to flatten the anonymous objects, and name them after their members (that is, the joined tables). There is also no reason to separate the Where clauses: LINQ can use && just as SQL uses AND. If you have boolean values, don't compare them to true or false. And there is no reason to pass range variables through that aren't used later.
Putting it all together:
var ans = context.StorageAreaRacks
.Join(context.StorageAreas, sar => sar.StorageAreaId, sa => sa.Id, (sar, sa) => new { sar, sa })
.Join(context.StorageAreaTypes, sarsa => sarsa.sa.StorageAreaTypeId, sat => sat.Id, (sarsa, sat) => new { sarsa.sar, sat })
.Join(context.Racks, sarsat => sarsat.sar.RackId, r => r.Id, (sarsat, r) => new { sarsat.sat, r })
.Where(satr => !satr.sat.IsManual && satr.r.IsEnabled && !satr.r.IsVirtual)
.Select(satr => new { satr.sat.Id, satr.sat.Name })
.Distinct()
.ToList();
However, I think when multiple joins are involved and when translating SQL, LINQ comprehension syntax can be easier to understand:
var ans = (from sar in context.StorageAreaRacks
join sa in context.StorageAreas on sar.StorageAreaId equals sa.Id
join sat in context.StorageAreaTypes on sa.StorageAreaTypeId equals sat.Id
join r in context.Racks on sar.RackId equals r.Id
where !sat.IsManual && r.IsEnabled && !r.IsVirtual
select new {
sat.Name,
sat.Id
}).Distinct().ToList();
You are missing a Where for your rack ID != null in your LINQ statement, and a Distinct().

Relevant linq query for the SQL

My query initially was
select *
from Personalization_Mapping
The equivalent LINQ query was
List<Personalization_Mapping> list = _appDbContext.Personalization_Mapping.OrderBy(s => s.ID).ToList();
Now I need only one row per CustomerID (the one with the highest ID), so I changed the SQL to
SELECT *
FROM
(SELECT
*,
ROW_NUMBER() OVER(PARTITION BY CustomerID ORDER BY ID DESC) rn
FROM
Personalization_Mapping) a
WHERE
rn = 1
Could anyone help me find the equivalent LINQ query?
Thanks in advance.
var result = _appDbContext.Personalization_Mapping.OrderByDescending(x => x.ID)
.GroupBy(x => x.CustomerID)
.Select(g => new {g, count= g.Count()})
.SelectMany(t => t.g.Select(b => b)
.Zip(Enumerable.Range(1,t.count), (j,i) => new {j.Property1, j.Property2, rn = i}));
Replace Property1 and Property2 with the actual property names of your entity.
Now apply the filter for row number 1:
result.Where(x => x.rn == 1);
Hope this helps.
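As an alternative way to express "first row per partition", a common pattern is to group by the partition key and take the top element of each group. The sketch below is an assumption-laden illustration, not part of the original answer: it runs on in-memory data with LINQ to Objects, the Mapping record and its Value property are hypothetical, and whether a given EF version translates the same shape to efficient SQL is not verified here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LatestPerCustomerDemo
{
    // Hypothetical stand-in for the Personalization_Mapping entity.
    public record Mapping(int ID, int CustomerID, string Value);

    // Mirrors ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY ID DESC) = 1:
    // for every customer, keep only the row with the highest ID.
    public static List<Mapping> LatestPerCustomer(IEnumerable<Mapping> source) =>
        source.GroupBy(m => m.CustomerID)
              .Select(g => g.OrderByDescending(m => m.ID).First())
              .ToList();

    public static void Main()
    {
        var rows = new[]
        {
            new Mapping(1, 100, "old"),
            new Mapping(2, 100, "new"),
            new Mapping(3, 200, "only"),
        };

        // Keeps ID 2 for customer 100 and ID 3 for customer 200.
        foreach (var m in LatestPerCustomer(rows))
            Console.WriteLine($"{m.CustomerID}: {m.ID} ({m.Value})");
    }
}
```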

Entity Framework grouping by column from join

I have the next query:
select VisitLines.ProcedureId, COUNT(DISTINCT VisitLines.VisitId) as nt
from Visits
LEFT JOIN VisitLines ON Visits.Id = VisitLines.VisitId
WHERE Visits.VisitStatusId = 1 AND Visits.IsActive = 1 AND VisitLines.IsActive = 1
GROUP BY VisitLines.ProcedureId
Main question: is it possible to group by a column from a join using LINQ? I'm wondering how to do it using a 'collection' column.
Is it possible to force EF to generate COUNT(DISTINCT column)? IQueryable.GroupBy.Select(x => x.Select(n => n.Number).Distinct().Count()) generates a query with several subqueries, which is much slower than COUNT(DISTINCT ).
I found it. You need to use the SelectMany overload that takes a second parameter, resultSelector:
dbContext.Visits.Where(x => x.IsActive)
.SelectMany(x => x.VisitLines, (v, vl) => new
{
v.Id,
vl.ProcedureId
})
.GroupBy(x => x.ProcedureId)
.Select(x => new
{
Id = x.Key,
VisitCount = x.Count()
}).ToArray();
It generates the desired SQL, except that I need a distinct count by visit.
If I change it to VisitCount = x.Distinct().Count(), EF generates a query with several subqueries again. But the main issue is resolved.
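For reference, the intended semantics of COUNT(DISTINCT VisitId) per ProcedureId can be sketched on in-memory data as follows. This is only an illustration with LINQ to Objects: the VisitLine record is a hypothetical stand-in for the flattened (visit, line) pairs, and the sketch says nothing about the SQL that EF would produce for the same expression.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctVisitCountDemo
{
    // Hypothetical flattened pair of a visit and one of its lines.
    public record VisitLine(int VisitId, int ProcedureId);

    // For each procedure, counts how many different visits reference it.
    public static Dictionary<int, int> DistinctVisitsPerProcedure(IEnumerable<VisitLine> lines) =>
        lines.GroupBy(l => l.ProcedureId)
             .ToDictionary(g => g.Key,
                           g => g.Select(l => l.VisitId).Distinct().Count());

    public static void Main()
    {
        var lines = new[]
        {
            new VisitLine(1, 10),
            new VisitLine(1, 10), // same visit twice for procedure 10
            new VisitLine(2, 10),
            new VisitLine(2, 20),
        };

        // Procedure 10 is used by two visits, procedure 20 by one.
        foreach (var pair in DistinctVisitsPerProcedure(lines))
            Console.WriteLine($"{pair.Key}: {pair.Value}");
    }
}
```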

How can I get the primary key of a new table generated from OrderBy / GroupBy?

var something = (from m in _db.Requests
where m.StoreID == myRequest.StoreID
where m.AcceptedTime != null
where System.Data.Entity.DbFunctions.TruncateTime(m.RequestTime) == today
group m by m.StaffID into g
let TotalPoints = g.Count()
orderby TotalPoints ascending
select new { User = g.Key});
Then I try to get the first result, which should be the staff member who appears the fewest times in my Requests table:
var thisStaff = something.Select(o=>o.User).Take(1).ToString();
However, the value of "thisStaff" is not a StaffID, which is the key of my Requests table. Instead, it contains
SELECT TOP (1) [Project1].[StaffID] AS [StaffID]
FROM ( SELECT [GroupBy1].[A1] AS [C1], [GroupBy1].[K1] AS [StaffID]
FROM ( SELECT [Extent1].[StaffID] AS [K1], COUNT(1) AS [A1]
FROM [dbo].[Requests] AS [Extent1]
WHERE ([Extent1].[StoreID] = @p__linq__0) AND
([Extent1].[AcceptedTime] IS NOT NULL) AND
((convert (datetime2, convert(varchar(255), [Extent1].[RequestTime], 102) , 102)) = @p__linq__1)
GROUP BY [Extent1].[StaffID] ) AS [GroupBy1] ) AS [Project1]
ORDER BY [Project1].[C1] ASC
Please suggest how I should change it. By the way, I've also tried the following and got almost the same result.
var something2 = _db.Requests
.Where(o => o.StoreID == myRequest.StoreID)
.Where(o => o.AcceptedTime != null)
.Where(o => System.Data.Entity.DbFunctions.TruncateTime(o.RequestTime) == today)
.GroupBy(x => x.StaffID)
.Select(x => new
{
Count = x.Count(),
Name = x.Key,
})
.OrderBy(x => x.Count)
.Take(1);
Your query is fine; you're just missing the method call that executes it against the database and returns the result (EF queries use deferred execution). FirstOrDefault() should do the trick:
var thisStaff = something.Select(o => o.User).FirstOrDefault();
The value you see is what the ToString method of the IQueryable you constructed returns: the SQL that will be run against the database when the query is executed.
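Deferred execution can be observed without a database. The following sketch uses LINQ to Objects and a hypothetical Evaluations counter to show that building a query runs nothing, and that the predicate is only evaluated once a terminal operator such as FirstOrDefault() is called. It is an illustration of the general idea, not of EF's internals.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Counts how many times the filter predicate has actually run.
    public static int Evaluations;

    // Builds a query description; nothing is filtered at this point.
    public static IEnumerable<int> BuildQuery(IEnumerable<int> source) =>
        source.Where(n =>
        {
            Evaluations++;
            return n % 2 == 0;
        });

    public static void Main()
    {
        var query = BuildQuery(new[] { 1, 2, 3, 4 });
        Console.WriteLine(Evaluations); // 0: the query has not run yet

        int first = query.FirstOrDefault(); // executes the query
        Console.WriteLine(first);           // 2: the first even number
        Console.WriteLine(Evaluations);     // 2: only 1 and 2 were examined
    }
}
```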

Returning (blanks) in many to many Linq query

A follow up to this question: Changing a linq query to filter on many-many
I have the following Linq query
public static List<string> selectedLocations = new List<string>();
// I then populate selectedLocations with a number of different strings, each
// corresponding to a valid Location
viewModel.people = (from c in db.People
select c)
.OrderBy(x => x.Name)
.ToList();
// Here I'm basically filtering my dataset to include only people whose
// Locations appear in my list of selectedLocations
viewModel.people = from c in viewModel.people
where (
from a in selectedLocations
where c.Locations.Any(o => o.Name == a)
select a
).Any()
select c;
How can I modify the query so that it also returns people that have NO location set at all?
You can do the filtering on the database side:
viewModel.people =
(from p in db.People
where !p.Locations.Any() ||
p.Locations.Any(l => selectedLocations.Contains(l.Name))
orderby p.Name
select p).ToList();
Or with lambda syntax:
viewModel.people =
db.People.Where(p => !p.Locations.Any() ||
p.Locations.Any(l => selectedLocations.Contains(l.Name)))
.OrderBy(p => p.Name)
.ToList();
EF will generate two EXISTS subqueries in this case. Something like:
SELECT [Extent1].[Name],
[Extent1].[Id]
-- other fields from People table
FROM [dbo].[People] AS [Extent1]
WHERE (NOT EXISTS (SELECT 1 AS [C1]
FROM [dbo].[PeopleLocations] AS [Extent2]
WHERE [Extent2].[PersonId] = [Extent1].[Id])
OR EXISTS (SELECT 1 AS [C1]
FROM [dbo].[PeopleLocations] AS [Extent3]
WHERE [Extent3].[PersonId] = [Extent1].[Id]
AND [Extent3].[Name] IN ('location1', 'location2')))
ORDER BY [Extent1].[Name] ASC
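The predicate used above ("no locations at all, or at least one selected location") can be checked on in-memory data. This is a minimal sketch with LINQ to Objects; the Person record, which stores location names directly as strings, is a simplifying assumption and not the asker's actual model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LocationFilterDemo
{
    // Hypothetical simplified person: locations are stored as plain names.
    public record Person(string Name, List<string> Locations);

    // Returns the names of people who have no location at all, or at least
    // one location contained in the selected set, ordered by name.
    public static List<string> Filter(IEnumerable<Person> people, ICollection<string> selected) =>
        people.Where(p => !p.Locations.Any() ||
                          p.Locations.Any(l => selected.Contains(l)))
              .OrderBy(p => p.Name)
              .Select(p => p.Name)
              .ToList();

    public static void Main()
    {
        var people = new[]
        {
            new Person("Ann", new List<string> { "Paris" }),
            new Person("Bob", new List<string>()),          // no location set
            new Person("Cid", new List<string> { "Rome" }), // not selected
        };

        // Ann matches a selected location, Bob has none; Cid is filtered out.
        foreach (var name in Filter(people, new List<string> { "Paris" }))
            Console.WriteLine(name);
    }
}
```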
