Linq-to-Sql query with group by - c#

There is a table in the database:
[id] [int] IDENTITY(1,1) NOT NULL,
[name] [varchar](150) NOT NULL,
[date_of_birth] [date] NOT NULL,
I need to get a datasource for GridView which contains 2 columns.
----------------
| age || count |
----------------
| 20 || 3 |
| 21 || 4 |
| 25 || 5 |
----------------
Is it possible to do it with one query?
What I've tried:
var dates = (from u in db.Users
group u by u.date_of_birth into g
select new { age = calculateAge(g.Key) }).ToList();
var dates1 = from d in dates
group d by d.age into g
select new { age = g.Key, count = g.Count() };
GridView1.DataSource = dates1;
GridView1.DataBind();
This works, but I think there is a simpler way to do it. Or not?
P.S. calculateAge has the following signature
private int calculateAge(DateTime date_of_birth)

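First of all, there is a logic flaw in the first query: it groups by date of birth, not by age. People born in the same year but on different days end up in different groups, and users who share a birth date collapse into a single row, so the second query counts distinct birth dates rather than users. The first group by is unnecessary; only the one in the second query is needed.
The second thing to note is that the second query works on the materialized results of the first (see the ToList() call), so it runs in .NET, not on the SQL side.
Here is how I would write the query (counting the number of people of each age):
var dates = from u in db.Users
            select new { age = calculateAge(u.date_of_birth) } into ages
            group ages by ages.age into g
            select new { age = g.Key, count = g.Count() };
Or even without the anonymous type declaration:
var dates = from u in db.Users
            select calculateAge(u.date_of_birth) into ages
            group ages by ages into g
            select new { age = g.Key, count = g.Count() };
Note that calculateAge is a local C# method, so LINQ to SQL will probably not be able to translate it inside a grouped query. If you get a "no supported translation to SQL" error, start from db.Users.AsEnumerable() so that the age calculation and the grouping run in memory.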
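To see the single-pass grouping in isolation, here is a minimal in-memory sketch (LINQ to Objects, not LINQ to SQL). The sample birth dates, the fixed today value, and the body of CalculateAge are assumptions made for illustration; the question only gives the method's signature.

```csharp
using System;
using System.Linq;

// Hypothetical sample data standing in for db.Users.
var birthDates = new[]
{
    new DateTime(2000, 1, 15),
    new DateTime(2000, 1, 15), // same birth date as the first user
    new DateTime(2000, 6, 30),
    new DateTime(1999, 3, 1),
};

// Fixed reference date so the result is reproducible.
var today = new DateTime(2021, 1, 1);

// Assumed implementation: whole years elapsed as of 'today'.
int CalculateAge(DateTime dateOfBirth)
{
    int age = today.Year - dateOfBirth.Year;
    if (dateOfBirth.Date > today.AddYears(-age)) age--; // birthday not reached yet
    return age;
}

// Group directly by age: every user is counted, including shared birth dates.
var ageCounts = birthDates
    .GroupBy(d => CalculateAge(d))
    .Select(g => new { Age = g.Key, Count = g.Count() })
    .OrderBy(x => x.Age)
    .ToList();

foreach (var row in ageCounts)
    Console.WriteLine($"{row.Age}: {row.Count}");
```

Grouping by birth date first and then by age would count the two users born on 2000-01-15 only once, which is the flaw described above.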

Related

LINQ Left Join and GroupBy

Let's say I have this SQL query:
SELECT
PromotionSlot.StaffFName,
SUM(PromotionSlot.Max_Occupancy) AS TOTAL,
SUM(DISTINCT(Booking.Quantity)) AS Occupied
From PromotionSlot
LEFT JOIN Booking
ON PromotionSlot.StaffID=Booking.StaffID
GROUP BY PromotionSlot.StaffFName
Result:
|StaffFName |TOTAL |Occupied|
-----------------------------------
|Jason |13 |1 |
|John Doe |9 |0 |
|Marry Jane |7 |2 |
This is my DB Table:
PromotionSlot TABLE: ID(PK),Max_Occupancy,StaffFName..., StaffID(FK)
Booking TABLE: ID(PK), Quantity...,StaffID(FK)
How can I translate it into LINQ? This is my attempt:
var staffData =
    (from ps in dc.PromotionSlots
     join b in dc.Bookings on ps.StaffID equals b.StaffID
     group ps by ps.StaffFName into NewGroup
     select new dataView
     {
         StaffFName = NewGroup.Key,
         Total = NewGroup.Sum(a => a.Max_Occupancy),
         // problem:
         // Occupied = NewGroup.Sum(b => b.Quantity)
     });
I planned to use Occupied = NewGroup.Sum(b => b.Quantity), but when I try to point b at the Quantity column of the Booking table, the editor shows an error (red underline). I think the problem is that group ps by ps.StaffFName into NewGroup makes only the PromotionSlot columns available, not the Booking columns, but I have no idea how to solve this.
Based on your SQL query, you need to take the distinct quantities and sum them. To make the Booking columns available in the group, do a left join (join ... into plus DefaultIfEmpty) and group both ps and the joined row:
var staffData =
    (from ps in dc.PromotionSlots
     join b in dc.Bookings on ps.StaffID equals b.StaffID into slots
     from slot in slots.DefaultIfEmpty()
     group new { ps, slot } by ps.StaffFName into NewGroup
     select new dataView
     {
         StaffFName = NewGroup.Key,
         Total = NewGroup.Sum(a => a.ps.Max_Occupancy),
         // slot is null for staff without bookings; the cast assumes Quantity is an int column
         Occupied = NewGroup.Select(x => (int?)x.slot.Quantity).Distinct().Sum() ?? 0
     });
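The left join plus grouping pattern can be tried out in memory with a minimal sketch like the one below. It uses LINQ to Objects with hypothetical sample rows (only the columns the query needs), so the null handling differs slightly from the database version: unmatched bookings are filtered out explicitly before summing.

```csharp
using System;
using System.Linq;

// Hypothetical rows; only the columns used by the query are modelled.
var promotionSlots = new[]
{
    new { StaffID = 1, StaffFName = "Jason", Max_Occupancy = 13 },
    new { StaffID = 2, StaffFName = "John Doe", Max_Occupancy = 9 },
};
var bookings = new[]
{
    new { StaffID = 1, Quantity = 1 },
};

var staffData =
    (from ps in promotionSlots
     join b in bookings on ps.StaffID equals b.StaffID into matches
     from booking in matches.DefaultIfEmpty() // null when the staff member has no booking
     group new { ps, booking } by ps.StaffFName into g
     select new
     {
         StaffFName = g.Key,
         Total = g.Sum(x => x.ps.Max_Occupancy),
         // Skip the null placeholders, then sum the distinct quantities.
         Occupied = g.Where(x => x.booking != null)
                     .Select(x => x.booking.Quantity)
                     .Distinct()
                     .Sum(),
     }).ToList();

foreach (var row in staffData)
    Console.WriteLine($"{row.StaffFName}: total {row.Total}, occupied {row.Occupied}");
```

As in the SQL version, a staff member with several bookings contributes Max_Occupancy once per joined row to Total.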

how to group by in LINQ and not select all columns

I am trying to write a LINQ query which is complex for me now since I am a newbie to LINQ.
I have a table like below...
UserId | CompanyId | ProblemDescription | CreatedTime | TimeSpentMins
-----------------------------------------------------------------------
1 | 95 | System is crashed | 2016-01-01 15:23 | 25
1 | 95 | Total is incorrect | 2016-01-01 15:45 | 45
I want to write a LINQ query that will do the job below. CreatedTime has date and time but I want to group it by only date.
SELECT UserId, CompanyId, Convert(DATE, CreatedTime) AS CreatedDate, Sum(TimeSpentMins)
FROM TransactionLogs
GROUP BY UserId, CompanyId, Convert(DATE,CreatedTime)
How can I write this LINQ? I wanted to put my code below but I got nothing :(
Use the GroupBy extension method together with EntityFunctions.TruncateTime (DbFunctions.TruncateTime in Entity Framework 6 and later, where EntityFunctions is obsolete) to group on only the date part:
var result = db.TransactionLogs
    .GroupBy(x => new
    {
        CreateTime = EntityFunctions.TruncateTime(x.CreatedTime),
        x.UserId,
        x.CompanyId
    })
    .Select(x => new
    {
        UserId = x.Key.UserId,
        CompanyId = x.Key.CompanyId,
        CreateTime = x.Key.CreateTime,
        TotalTimeSpentMins = x.Sum(z => z.TimeSpentMins)
    });
Try this one:
var result = db.TransactionLogs
    .GroupBy(t => new
    {
        t.UserId,
        t.CompanyId,
        CreatedDate = DbFunctions.TruncateTime(t.CreatedTime)
    })
    .Select(g => new
    {
        g.Key.UserId,
        g.Key.CompanyId,
        g.Key.CreatedDate,
        Total = g.Sum(t => t.TimeSpentMins)
    });
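The composite-key grouping used in both answers can be checked in memory with a small sketch. It assumes LINQ to Objects, where the DateTime.Date property plays the role that TruncateTime plays in an Entity Framework query; the third row is a hypothetical addition so that two dates appear.

```csharp
using System;
using System.Linq;

// The two rows from the question plus one hypothetical row on another day.
var logs = new[]
{
    new { UserId = 1, CompanyId = 95, CreatedTime = new DateTime(2016, 1, 1, 15, 23, 0), TimeSpentMins = 25 },
    new { UserId = 1, CompanyId = 95, CreatedTime = new DateTime(2016, 1, 1, 15, 45, 0), TimeSpentMins = 45 },
    new { UserId = 1, CompanyId = 95, CreatedTime = new DateTime(2016, 1, 2, 9, 0, 0), TimeSpentMins = 10 },
};

// Anonymous types compare by value, so they work as composite grouping keys.
var totals = logs
    .GroupBy(x => new { x.UserId, x.CompanyId, CreatedDate = x.CreatedTime.Date })
    .Select(g => new
    {
        g.Key.UserId,
        g.Key.CompanyId,
        g.Key.CreatedDate,
        TotalTimeSpentMins = g.Sum(x => x.TimeSpentMins),
    })
    .ToList();

foreach (var row in totals)
    Console.WriteLine($"{row.UserId} {row.CompanyId} {row.CreatedDate:yyyy-MM-dd} {row.TotalTimeSpentMins}");
```

Only the key members and the aggregate are projected, so the remaining columns (such as ProblemDescription) are not selected.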

Speed up the linq group by statement

I have a table like this
UserID Year EffectiveDate Type SpecialExpiryDate
1 2015 7/1/2014 A
1 2016 7/1/2015 B 10/1/2015
There is no ExpiryDate column in the table because each record is valid for only one year, so the expiry date can be calculated by adding a year to the effective date.
The result I want to get is like this (the current year's effective date and the next year's expiry date)
UserID EffectiveDate ExpiryDate
1 7/1/2014 7/1/2016
If the user's type is B, there is a special expiry date, so for this person the result will be
UserID EffectiveDate ExpiryDate
1 7/1/2014 10/1/2015
Here is the code I wrote
var result = db.Table1
.Where(x => x.Year>= 2015 && (x.Type == "A" || x.Type == "B"))
.GroupBy(y => y.UserID)
.OrderByDescending(x => x.FirstOrDefault().Year)
.Select(t => new
{
ID = t.Key,
Type = t.FirstOrDefault().Type,
EffectiveDate = t.FirstOrDefault().EffectiveDate,
ExpiryDate = t.FirstOrDefault().SpecialExpiryDate != null ? t.FirstOrDefault().SpecialExpiryDate : (t.Count() >= 2 ? NextExpiryDate : CurrentExpiryDate)
}
);
The code returns the result I need, but the result set has about 10,000 records and the query takes about 5 to 6 seconds. The project is a web search API, so I want to speed it up. Is there a better way to write the query?
Edit
Sorry, I made a mistake. In the select clause it should be
EffectiveDate = t.LastOrDefault().EffectiveDate
but LINQ in C# does not support translating LastOrDefault to SQL, which causes a new problem: what is the easiest way to get the second item of the group?
You could generate the calculated data on the fly, using a view in your database.
Something like this (a sketch in T-SQL syntax; adjust the date arithmetic for your database):
CREATE VIEW vwUsers AS
SELECT
    UserID,
    Year,
    EffectiveDate,
    DATEADD(year, 1, EffectiveDate) AS ExpiryDate, -- calculated: one year after the effective date
    Type,
    SpecialExpiryDate
FROM
    tblUsers
Then point your LINQ query at that view.
Try this:
var result =
    db
        .Table1
        .Where(x => x.Year >= 2015 && (x.Type == "A" || x.Type == "B"))
        .GroupBy(y => y.UserID)
        .SelectMany(y => y.Take(1), (y, z) => new
        {
            ID = y.Key,
            z.Type,
            z.EffectiveDate,
            ExpiryDate = z.SpecialExpiryDate != null
                ? z.SpecialExpiryDate
                : (y.Count() >= 2 ? NextExpiryDate : CurrentExpiryDate),
            z.Year,
        })
        .OrderByDescending(x => x.Year);
The .SelectMany(y => y.Take(1), ...) call effectively does the .FirstOrDefault() part of your code. Because it runs once per group rather than once per property, it may improve the speed considerably.
In a test I performed using a similarly structured query I got these sub-queries being run when using your approach:
SELECT t0.increment_id
FROM sales_flat_order AS t0
GROUP BY t0.increment_id
SELECT t0.hidden_tax_amount
FROM sales_flat_order AS t0
WHERE ((t0.increment_id IS NULL AND @n0 IS NULL) OR (t0.increment_id = @n0))
LIMIT 0, 1
-- n0 = [100000001]
SELECT t0.customer_email
FROM sales_flat_order AS t0
WHERE ((t0.increment_id IS NULL AND @n0 IS NULL) OR (t0.increment_id = @n0))
LIMIT 0, 1
-- n0 = [100000001]
SELECT t0.hidden_tax_amount
FROM sales_flat_order AS t0
WHERE ((t0.increment_id IS NULL AND @n0 IS NULL) OR (t0.increment_id = @n0))
LIMIT 0, 1
-- n0 = [100000002]
SELECT t0.customer_email
FROM sales_flat_order AS t0
WHERE ((t0.increment_id IS NULL AND @n0 IS NULL) OR (t0.increment_id = @n0))
LIMIT 0, 1
-- n0 = [100000002]
(This continued on for two sub-queries per record number.)
If I ran my approach I got this single query:
SELECT t0.increment_id, t1.hidden_tax_amount, t1.customer_email
FROM (
SELECT t2.increment_id
FROM sales_flat_order AS t2
GROUP BY t2.increment_id
) AS t0
CROSS APPLY (
SELECT t3.customer_email, t3.hidden_tax_amount
FROM sales_flat_order AS t3
WHERE ((t3.increment_id IS NULL AND t0.increment_id IS NULL) OR (t3.increment_id = t0.increment_id))
LIMIT 0, 1
) AS t1
My approach should be much faster.
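The shape of the SelectMany/Take(1) approach can be illustrated with a small in-memory sketch. It runs on LINQ to Objects with hypothetical rows, so it says nothing about the generated SQL; it only shows how one row per group is picked once and then reused for several properties. The explicit OrderBy inside the group is an addition that makes "first" well defined.

```csharp
using System;
using System.Linq;

// Hypothetical rows modelled on the table in the question.
var rows = new[]
{
    new { UserID = 1, Year = 2015, Type = "A" },
    new { UserID = 1, Year = 2016, Type = "B" },
    new { UserID = 2, Year = 2015, Type = "A" },
};

var firstPerUser = rows
    .GroupBy(r => r.UserID)
    // Pick one row per group once, then project several of its properties.
    .SelectMany(
        g => g.OrderBy(r => r.Year).Take(1),
        (g, first) => new { ID = g.Key, first.Year, first.Type, RowCount = g.Count() })
    .ToList();

foreach (var row in firstPerUser)
    Console.WriteLine($"{row.ID}: {row.Year} {row.Type} ({row.RowCount} rows)");
```

Swapping OrderBy for OrderByDescending picks the last row of each group instead, which is one way to avoid LastOrDefault.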

Gnarly Linq query

Have the following table structure
I need the count of transcriptions by statuses where the records do not have a workflow folder. This does the trick:
from p in Transcriptions
where p.WorkflowfolderID == null
group p by p.TranscriptionStatus.Description into grouped
select new
{
xp=grouped.Key,
xp1= grouped.Count(),
}
Now I need to add the number of records where the DueOn date is in the past, that is, records that are overdue. Something like:
EntityFunctions.DiffHours(p.DueOn,DateTime.Today)>0
How do I include this in the result set without firing two SQL queries? I am happy to get it as a third column with the same value in every row. Also, is there any way to get the percentage into the mix, as in:
Status | Count | % |
------------------------------
Status1 | 20 | 20%
Status2 | 30 | 30%
Status3 | 30 | 30%
Overdue |20 | 20%
I have added Overdue as a row but perfectly happy to get it as a column with the same values.
Edited Content
Well, this is the best I could come up with. It's not a single query, but there is only one SQL round trip. The result is:
Status | Count
----------------
Status1 | 20
Status2 | 30
Status3 | 30
Overdue |20
var q1= from p in Transcriptions
where p.WorkflowfolderID == null
group p by p.TranscriptionStatus.Description into grouped
select new
{
status= (string)grouped.Key,
count= grouped.Count()
};
var q2 = (from p in Transcriptions
          select new
          {
              status = "Overdue",
              count = (from x in Transcriptions
                       where x.DueOn.Value < DateTime.Now.AddHours(-24)
                       group x by x.TranscriptionID into grouped
                       select 1).Count()
          }).Distinct();
q1.Union(q2)
It is a Union clause with the % calculation to be done once the results are returned. The weird thing is that I couldn't figure out any clean way to represent the following SQL in a LINQ statement which has resulted in the rather messy LINQ in the var q2.
SELECT COUNT(*) , 'test' FROM [Transcription]
You can add a condition to Count:
from p in Transcriptions
where p.WorkflowfolderID == null
group p by p.TranscriptionStatus.Description into grouped
select new
{
xp=grouped.Key,
xp1= grouped.Count(),
xp2= grouped
.Count(p => EntityFunctions.DiffHours(p.DueOn, DateTime.Today) > 0)
}
By the way, with entity framework you can also use p.DueOn < DateTime.Today.
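A minimal in-memory sketch of the conditional count, assuming LINQ to Objects and hypothetical sample rows (the status names, due dates, and the fixed today value are made up for illustration):

```csharp
using System;
using System.Linq;

// Fixed reference date so the result is reproducible.
var today = new DateTime(2024, 1, 10);

// Hypothetical rows; DueOn is nullable, as the question's DueOn.Value suggests.
var transcriptions = new[]
{
    new { Status = "Status1", DueOn = (DateTime?)new DateTime(2024, 1, 5) },
    new { Status = "Status1", DueOn = (DateTime?)new DateTime(2024, 1, 20) },
    new { Status = "Status2", DueOn = (DateTime?)null },
};

var summary = transcriptions
    .GroupBy(t => t.Status)
    .Select(g => new
    {
        Status = g.Key,
        Count = g.Count(),
        // Count with a predicate: only rows whose due date is in the past.
        // A null DueOn compares as false, so it is never counted as overdue.
        Overdue = g.Count(t => t.DueOn < today),
    })
    .ToList();

foreach (var row in summary)
    Console.WriteLine($"{row.Status} | {row.Count} | {row.Overdue}");
```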
@Gert Arnold
from p in Transcriptions
where p.WorkflowfolderID == null
group p by p.TranscriptionStatus.Description into grouped
select new
{
status= (string)grouped.Key,
count= grouped.Count(),
overdue= grouped.Count(p => p.DueOn < EntityFunctions.AddHours(DateTime.Today, -24)),
}
The above query works as I wanted. It produces the output in this format:
Status| Count | Overdue
----------------------
status1|2|0
status2|1|1
The only downside is that the generated SQL runs two queries, both with inner joins. My original idea with the Union may be better performance-wise, but you answered my question, and for that I am grateful.
Can this query be represented in a cleaner manner than my attempt above?
SELECT COUNT(*) , 'test' FROM [Transcription]

how to order a group result with Linq?

How can I order the results from "group ... by... into..." statement in linq?
For instance:
var queryResult = from records in container.tableWhatever
where records.Time >= DateTime.Today
group records by records.tableHeader.UserId into userRecords
select new { UserID = userRecords.Key, Records = userRecords };
The query returns records from the table container.tableWhatever grouped by UserId. I want the results within each group ordered by time, descending. How can I do that?
More specifically, assume the above query returns only one group like the following:
{UserID = 1, Records= {name1 5/3/2010_7:10pm;
name2 5/3/2010_8:10pm;
name3 5/3/2010_9:10pm} }
After inserting the orderby statement into the above query, the returned results should look like this:
{UserID = 1, Records= {name3 5/3/2010_9:10pm;
name2 5/3/2010_8:10pm;
name1 5/3/2010_7:10pm} }
Thanks for help!
Simply use the OrderByDescending extension to order the records in the anonymous type.
var queryResult = from records in container.tableWhatever
where records.Time >= DateTime.Today
group records by records.tableHeader.UserId into userRecords
select new
{
UserID = userRecords.Key,
Records = userRecords.OrderByDescending( u => u.Time )
};
Could you do:
var queryResult = from records in container.tableWhatever
where records.Time >= DateTime.Today
group records by records.tableHeader.UserId into userRecords
select new { UserID = userRecords.Key, Records = userRecords.OrderByDescending(r=>r.Time) };
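The ordering inside each group can be verified with a short in-memory sketch. It assumes LINQ to Objects and a flattened record shape (UserId directly on the record rather than via tableHeader); the three rows mirror the example in the question.

```csharp
using System;
using System.Linq;

// Rows mirroring the example in the question, in ascending time order.
var records = new[]
{
    new { UserId = 1, Name = "name1", Time = new DateTime(2010, 5, 3, 19, 10, 0) },
    new { UserId = 1, Name = "name2", Time = new DateTime(2010, 5, 3, 20, 10, 0) },
    new { UserId = 1, Name = "name3", Time = new DateTime(2010, 5, 3, 21, 10, 0) },
};

var queryResult =
    (from rec in records
     group rec by rec.UserId into userRecords
     select new
     {
         UserID = userRecords.Key,
         // Sort the members of each group, newest first.
         Records = userRecords.OrderByDescending(u => u.Time).ToList(),
     }).ToList();

foreach (var grp in queryResult)
    foreach (var rec in grp.Records)
        Console.WriteLine($"{grp.UserID}: {rec.Name} {rec.Time}");
```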
