Finding consecutive attendance for a series of events - c#

I am trying to find a SQL-only solution for calculating consecutive event attendance. The events occur on different days, so I cannot use any sequential date method for determining consecutive attendance. To count consecutive attendance for a single person, I would start with the most recent event and work my way back in time. I would count each event that the person attended, and stop at the first event the person did not attend. This gives me a count of recent consecutive event attendance. Currently, all of the data is hosted in SQL tables, and below is a sample schema with data:
USERS
ID UserName MinutesWatched
--- -------- --------------
1 jdoe 30
2 ssmith 400
3 bbaker 350
4 tduke 285
EVENTS
ID Name StartDate
-- ----------- ---------
1 1st Event 07/15/2018
2 2nd Event 07/16/2018
3 3rd Event 07/18/2018
4 4th Event 07/20/2018
ATTENDANCE
ID User_ID Event_ID
-- ------- --------
1 1 1
2 1 2
3 1 3
4 1 4
5 2 4
6 2 3
7 3 4
8 3 2
9 3 1
10 4 4
11 4 3
12 4 2
For an output I am trying to get:
OUTPUT
User_ID Consecutive WatchedMinutes
------- ----------- --------------
1 4 30
2 2 400
3 1 350
4 3 285
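For reference, the counting rule described above can be sketched in a few lines of Python over the sample data. The function name `consecutive_streak` and the in-memory structures are illustrative assumptions, not part of the original application.

```python
from datetime import date

# Sample data from the EVENTS table above (event id -> start date).
events = {
    1: date(2018, 7, 15),
    2: date(2018, 7, 16),
    3: date(2018, 7, 18),
    4: date(2018, 7, 20),
}

# (user_id, event_id) pairs from the ATTENDANCE table above.
attendance = {
    (1, 1), (1, 2), (1, 3), (1, 4),
    (2, 4), (2, 3),
    (3, 4), (3, 2), (3, 1),
    (4, 4), (4, 3), (4, 2),
}


def consecutive_streak(user_id):
    """Walk events from newest to oldest and stop at the first one missed."""
    streak = 0
    for event_id in sorted(events, key=events.get, reverse=True):
        if (user_id, event_id) not in attendance:
            break
        streak += 1
    return streak


streaks = {user_id: consecutive_streak(user_id) for user_id in (1, 2, 3, 4)}
print(streaks)
```

The resulting counts match the Consecutive column of the desired output.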
I have built out C# code to do this in an iterative fashion but it is slow when I am dealing with 300,000+ users and hundreds of events. I would love to see a SQL version of this.
Below is the method that calculates top event viewers as requested by Dan. The output is actually just a string that lists the Top X event viewers.
public string GetUsersTopWatchedConsecutiveStreams(int topUserCount)
{
string results = "Top " + topUserCount + " consecutive viewers - ";
Dictionary<ChatUser, int> userinfo = new Dictionary<ChatUser, int>();
using (StorageModelContext db = new StorageModelContext())
{
IQueryable<ChatUser> allUsers = null;
if (mainViewModel.CurrentStream != null)
allUsers = db.ViewerHistory.Include("Stream").Include("User").Where(x => x.Stream.Id == mainViewModel.CurrentStream.Id).Select(x => x.User);
else
allUsers = db.ViewerHistory.Include("Stream").Include("User").Where(x => x.Stream.Id == (db.StreamHistory.OrderByDescending(s => s.StreamEnd).FirstOrDefault().Id)).Select(x => x.User);
foreach (var u in allUsers)
{
int totalStreams = 0;
var user = db.Users.Include("History").Where(x => x.UserName == u.UserName).FirstOrDefault();
if (user != null)
{
var streams = user.History;
if (streams != null)
{
var allStreams = db.StreamHistory.OrderByDescending(x => x.StreamStart);
foreach (var s in allStreams)
{
var vs = streams.Where(x => x.Stream == s);
if (vs.Count() > 0)
totalStreams++;
else
break;
}
}
}
userinfo.Add(u, totalStreams);
totalStreams = 0;
}
var top = userinfo.OrderByDescending(x => x.Value).ThenByDescending(x => x.Key.MinutesWatched).Take(topUserCount);
int cnt = 1;
foreach (var t in top)
{
results += "#" + cnt + ": " + t.Key + "(" + t.Value.ToString() + "), ";
cnt++;
}
if (cnt > 1)
results = results.Substring(0, results.Length - 2);
}
return results;
}
mainViewModel.CurrentStream is null when no event is actively running. When a live event is occurring it will contain an object with information related to the live stream event.

Maybe you want to give this one a try:
Events get a row number in descending order (by StartDate), also the attendances by user get a number in descending StartDate order. Now, the differences of the event numbers and the attendance numbers will be the same for consecutive attendances. I use these differences for grouping, count the attendances in the group and return the group with the lowest difference (by user):
WITH
evt (ID, StartDate, evt_no) AS (
SELECT ID, StartDate,
ROW_NUMBER() OVER (ORDER BY StartDate DESC)
FROM EVENTS
),
att ([User_ID], grp_no) AS (
SELECT [User_ID], evt_no -
ROW_NUMBER() OVER (PARTITION BY [User_ID] ORDER BY StartDate DESC)
FROM ATTENDANCE a
INNER JOIN evt ON a.Event_ID = evt.ID
),
con ([User_ID], Consecutive, rn) AS (
SELECT [User_ID], COUNT(*),
ROW_NUMBER() OVER (PARTITION BY User_ID ORDER BY grp_no)
FROM att
GROUP BY [User_ID], grp_no
)
SELECT u.ID AS [User_ID], u.UserName, u.MinutesWatched, con.Consecutive
FROM con
INNER JOIN USERS u ON con.[User_ID] = u.ID
WHERE con.rn = 1;
Would be interested in how long this query runs on your system.
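To try the row-number difference idea without a SQL Server instance, here is a small sketch that runs an equivalent query through Python's built-in `sqlite3` module against the question's sample data. It assumes SQLite 3.25 or later (for window functions) and stores the dates as ISO strings so that they sort correctly; both are assumptions of this sketch, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE USERS (ID INTEGER PRIMARY KEY, UserName TEXT, MinutesWatched INTEGER);
CREATE TABLE EVENTS (ID INTEGER PRIMARY KEY, Name TEXT, StartDate TEXT);
CREATE TABLE ATTENDANCE (ID INTEGER PRIMARY KEY, User_ID INTEGER, Event_ID INTEGER);
INSERT INTO USERS VALUES (1,'jdoe',30),(2,'ssmith',400),(3,'bbaker',350),(4,'tduke',285);
INSERT INTO EVENTS VALUES (1,'1st Event','2018-07-15'),(2,'2nd Event','2018-07-16'),
                          (3,'3rd Event','2018-07-18'),(4,'4th Event','2018-07-20');
INSERT INTO ATTENDANCE VALUES (1,1,1),(2,1,2),(3,1,3),(4,1,4),(5,2,4),(6,2,3),
                              (7,3,4),(8,3,2),(9,3,1),(10,4,4),(11,4,3),(12,4,2);
""")

query = """
WITH
evt AS (
    -- Number all events, newest first.
    SELECT ID, StartDate,
           ROW_NUMBER() OVER (ORDER BY StartDate DESC) AS evt_no
    FROM EVENTS
),
att AS (
    -- Event number minus per-user attendance number is constant within a run.
    SELECT a.User_ID,
           evt.evt_no - ROW_NUMBER() OVER (
               PARTITION BY a.User_ID ORDER BY evt.StartDate DESC) AS grp_no
    FROM ATTENDANCE a
    INNER JOIN evt ON a.Event_ID = evt.ID
),
con AS (
    SELECT User_ID, COUNT(*) AS Consecutive,
           ROW_NUMBER() OVER (PARTITION BY User_ID ORDER BY grp_no) AS rn
    FROM att
    GROUP BY User_ID, grp_no
)
SELECT u.ID, con.Consecutive, u.MinutesWatched
FROM con
INNER JOIN USERS u ON con.User_ID = u.ID
WHERE con.rn = 1
ORDER BY u.ID;
"""

rows = conn.execute(query).fetchall()
print(rows)
```

One thing to keep in mind: a user who skipped the most recent event still gets a row here, showing the length of their latest run. If the count should be strictly "starting at the most recent event", filtering on `grp_no = 0` looks like one possible adjustment, since that group contains only runs that begin at the newest event.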

You seem to want the largest event id that a person did not attend, which is smaller than the largest id the person did attend. Then you want to count the events the person attended after that one. Note that this assumes event ids increase with StartDate.
The following approach handles this as:
Combine the users with all events up to the maximum event
Get the largest event that doesn't match
Keep the rows after that event and count them
So, this gives the largest non-matching event id for each user:
select u.user_id, e.id as event_id,
max(case when a.event_id is null then e.id end) over (partition by u.user_id) as max_nonmatch_event_id
from (select user_id, max(event_id) as max_event_id
from attendance
group by user_id
) u join
events e
on e.id <= u.max_event_id left join
attendance a
on a.user_id = u.user_id and a.event_id = e.id
order by u.user_id, e.id;
One more subquery should do the rest:
select ue.user_id, count(*) as num_consecutive
from (select u.user_id, e.id as event_id,
max(case when a.event_id is null then e.id end) over (partition by u.user_id) as max_nonmatch_event_id
from (select user_id, max(event_id) as max_event_id
from attendance
group by user_id
) u join
events e
on e.id <= u.max_event_id left join
attendance a
on a.user_id = u.user_id and a.event_id = e.id
) ue
-- coalesce covers users who never missed an event (no non-matching id)
where ue.event_id > coalesce(ue.max_nonmatch_event_id, 0)
group by ue.user_id;
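A quick way to sanity-check this idea is to run it against the question's sample data with Python's `sqlite3` module (SQLite 3.25+ assumed for the window function). The sketch below uses `max()` to pick the largest non-attended event id and `coalesce()` for users with no gaps; both reflect my reading of the intent rather than a confirmed implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (id INTEGER PRIMARY KEY, name TEXT, start_date TEXT);
CREATE TABLE attendance (id INTEGER PRIMARY KEY, user_id INTEGER, event_id INTEGER);
INSERT INTO events VALUES (1,'1st Event','2018-07-15'),(2,'2nd Event','2018-07-16'),
                          (3,'3rd Event','2018-07-18'),(4,'4th Event','2018-07-20');
INSERT INTO attendance VALUES (1,1,1),(2,1,2),(3,1,3),(4,1,4),(5,2,4),(6,2,3),
                              (7,3,4),(8,3,2),(9,3,1),(10,4,4),(11,4,3),(12,4,2);
""")

query = """
SELECT ue.user_id, COUNT(*) AS num_consecutive
FROM (
    SELECT u.user_id,
           e.id AS event_id,
           -- Largest event id the user did NOT attend (NULL if none).
           MAX(CASE WHEN a.event_id IS NULL THEN e.id END)
               OVER (PARTITION BY u.user_id) AS max_nonmatch_event_id
    FROM (SELECT user_id, MAX(event_id) AS max_event_id
          FROM attendance
          GROUP BY user_id) u
    JOIN events e ON e.id <= u.max_event_id
    LEFT JOIN attendance a
           ON a.user_id = u.user_id AND a.event_id = e.id
) ue
WHERE ue.event_id > COALESCE(ue.max_nonmatch_event_id, 0)
GROUP BY ue.user_id
ORDER BY ue.user_id;
"""

counts = dict(conn.execute(query).fetchall())
print(counts)
```

Because the comparison is on event ids, this only works when ids are assigned in StartDate order, as they are in the sample data.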

Related

Pagination in a SQL Server stored procedure with duplicated data

I have a stored procedure in SQL Server that gets contact persons based on multiple filters (e.g. DateOfBirth, DisplayName, ...) from multiple tables. I need to alter the stored procedure to include pagination and total count, since the pagination was done in the backend. PartyId is the unique key. The caveat is that a person can have multiple emails and phones, and let's say we search for DisplayName = "Sarah", the query will return the following :
TotalCount PartyId DisplayName EmailAddress PhoneNumber
-----------------------------------------------------------------
3 1 Sarah sarah@gmail.com 1
3 1 Sarah sarah2@gmail.com 1
3 1 Sarah sarah@gmail.com 2
This is roughly what the stored procedure does. I included the assigned values for CurrentPage and PageSize, and the ORDER BY ... OFFSET at the bottom, to test the pagination:
DECLARE @CurrentPage int = 1
DECLARE @PageSize int = 1000
SELECT
COUNT(*) OVER () as TotalCount,
p.Id AS PartyId,
e.EmailAddress,
pn.PhoneNumber
etc.....
FROM
[dbo].[Party] AS p WITH(NOLOCK)
INNER JOIN
[dbo].[Email] AS e WITH(NOLOCK) ON p.[Id] = e.[PartyID]
INNER JOIN
[dbo].[PhoneNumber] AS pn WITH(NOLOCK) ON p.[Id] = pn.[PartyID]
etc.....
WHERE
p.PartyType = 1 /*Individual*/
GROUP BY
p.Id, e.EmailAddress, pn.PhoneNumber etc...
ORDER BY
p.Id
OFFSET (@CurrentPage - 1) * @PageSize ROWS
FETCH NEXT @PageSize ROWS ONLY
This is what we do in the backend to group by PartyId and assign the corresponding emails and phones.
var responseModel = unitOfWork.PartyRepository.SearchContacts(model);
if (responseModel != null && responseModel.Count == 0)
{
return null;
}
// get multiple phones/emails for a party
var emailAddresses = responseModel.GroupBy(p => new { p.PartyId, p.EmailAddress })
.Select(x => new {
x.Key.PartyId,
x.Key.EmailAddress
});
var phoneNumbers = responseModel.GroupBy(p => new { p.PartyId, p.PhoneNumber, p.PhoneNumberCreateDate })
.Select(x => new {
x.Key.PartyId,
x.Key.PhoneNumber,
x.Key.PhoneNumberCreateDate
}).OrderByDescending(p => p.PhoneNumberCreateDate);
// group by in order to avoid multiple records with different email/phones
responseModel = responseModel.GroupBy(x => x.PartyId)
.Select(grp => grp.First())
.ToList();
var list = Mapper.Map<List<SearchContactResponseModelData>>(responseModel);
// add all phones/emails to respective party
list = list.Select(x =>
{
x.EmailAddresses = new List<string>();
x.EmailAddresses.AddRange(emailAddresses.Where(y => y.PartyId == x.PartyId).Select(y => y.EmailAddress));
x.PhoneNumbers = new List<string>();
x.PhoneNumbers.AddRange(phoneNumbers.Where(y => y.PartyId == x.PartyId).Select(y => y.PhoneNumber));
return x;
}).ToList();
var sorted = SortAndPagination(model, model.SortBy, list);
SearchContactResponseModel result = new SearchContactResponseModel()
{
Data = sorted,
TotalCount = list.Count
};
return result;
And the response will be :
{
  "TotalCount": 1,
  "Data": [
    {
      "PartyId": 1,
      "DisplayName": "SARAH",
      "EmailAddresses": [
        "sarah@gmail.com",
        "sarah2@gmail.com"
      ],
      "PhoneNumbers": [
        "1",
        "2"
      ]
    }
  ]
}
The TotalCount returned from the stored procedure obviously is not the real one, and after the backend code (where we assign the emails/phones and group by id) we get the real totalCount which is 1 instead of 3.
If we have 3 persons with the name Sarah, then because of multiple phones/emails the TotalCount in the stored procedure will be, let's say, 9, while the real count is 3. If I execute the stored procedure to get persons 1 to 2, the pagination won't work because it pages over the 9 records.
How can I implement pagination in the above scenario?
You might try using a CTE to isolate the query against the Party table. This would allow you to pull the right number of rows (and the proper total row count) without having to worry about the expansion from the emails and phone numbers.
It would look something like this (rearranging your query above):
DECLARE @CurrentPage int = 1;
DECLARE @PageSize int = 1000;
WITH PartyList AS (
SELECT
COUNT(*) OVER () as TotalCount,
p.Id AS PartyId
FROM
[dbo].[Party] AS p WITH(NOLOCK)
WHERE
p.PartyType = 1 /*Individual*/
GROUP BY -- You might not need this now depending on your data
p.Id
ORDER BY
p.Id
OFFSET (@CurrentPage - 1) * @PageSize ROWS
FETCH NEXT @PageSize ROWS ONLY
)
SELECT
pl.TotalCount,
pl.PartyId,
e.EmailAddress,
pn.PhoneNumber
FROM PartyList AS pl
INNER JOIN
[dbo].[Email] AS e WITH(NOLOCK) ON pl.[PartyId] = e.[PartyID]
INNER JOIN
[dbo].[PhoneNumber] AS pn WITH(NOLOCK) ON pl.[PartyId] = pn.[PartyID];
Please be aware that the CTE will require the prior statement to end in a semicolon.
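As a rough illustration of the "page the parties first, then join" idea, the sketch below runs a similar query in SQLite through Python's `sqlite3` module. SQLite has no `OFFSET ... FETCH`, so it uses `LIMIT ... OFFSET` instead; the sample rows, the `fetch_page` helper, and the simplified table definitions are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Party (Id INTEGER PRIMARY KEY, PartyType INTEGER, DisplayName TEXT);
CREATE TABLE Email (PartyID INTEGER, EmailAddress TEXT);
CREATE TABLE PhoneNumber (PartyID INTEGER, PhoneNumber TEXT);
INSERT INTO Party VALUES (1, 1, 'Sarah'), (2, 1, 'Sarah'), (3, 1, 'Sarah');
INSERT INTO Email VALUES (1, 'sarah@gmail.com'), (1, 'sarah2@gmail.com'),
                         (2, 'sarah.b@example.com'), (3, 'sarah.c@example.com');
INSERT INTO PhoneNumber VALUES (1, '1'), (1, '2'), (2, '3'), (3, '4');
""")

PAGE_QUERY = """
WITH PartyList AS (
    -- Page and count over Party only, before emails/phones multiply the rows.
    SELECT COUNT(*) OVER () AS TotalCount, p.Id AS PartyId
    FROM Party AS p
    WHERE p.PartyType = 1
    ORDER BY p.Id
    LIMIT :page_size OFFSET (:current_page - 1) * :page_size
)
SELECT pl.TotalCount, pl.PartyId, e.EmailAddress, pn.PhoneNumber
FROM PartyList AS pl
INNER JOIN Email AS e ON pl.PartyId = e.PartyID
INNER JOIN PhoneNumber AS pn ON pl.PartyId = pn.PartyID
ORDER BY pl.PartyId;
"""


def fetch_page(current_page, page_size):
    """Return the joined rows for one page of parties (hypothetical helper)."""
    params = {"current_page": current_page, "page_size": page_size}
    return conn.execute(PAGE_QUERY, params).fetchall()


page1 = fetch_page(1, 2)
page2 = fetch_page(2, 2)
print(sorted({row[1] for row in page1}), page1[0][0])
print(sorted({row[1] for row in page2}), page2[0][0])
```

Each page holds at most `page_size` distinct parties no matter how many email/phone combinations they have, and TotalCount reflects the number of parties. Note that the inner joins drop parties without an email or phone number from the page; a LEFT JOIN may be preferable if those should still appear.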

How to filter LINQ query by table column and get count

I'm trying to get a list of students based on their status, grouped by their college.
So I have two tables, Students and Colleges. Each student record has a status that can be 'Prospect', 'Accepted' or 'WebApp'. What I need to do is get a list of students based on the selected status and then display the college's name, along with the number of students that go to that college and have their status set to the status passed in. I think this needs to be an aggregate query, since the counts are coming from the string Status field.
I'm not sure how to do this in MS SQL, since the count is coming from the same table and it's based on the status field's value.
Here is the start of my query, which takes in the search parameters, but I can't figure out how to filter on the status to return the counts.
SELECT Colleges.Name, [Status], Count([Status])
FROM Students
JOIN Colleges ON Students.UniversityId = Colleges.id OR Students.CurrentCollege = Colleges.Name
GROUP BY Students.[Status], Colleges.Name
ORDER BY Colleges.Name;
Accepts = Status('Accepted')
WebApps = Status('WebApp')
Total = Sum(Accepts + WebApps)
Select
Colleges.Name,
SUM(Case when Students.Status like 'Accepted' then 1 else 0 end) Accepts,
SUM(Case when Students.Status like 'WebApp' then 1 else 0 end) WebApps,
COUNT(*) Total
from Students
join Colleges on Students.UniversityId = Colleges.Id OR Students.CurrentCollege = Colleges.Name
Group by Colleges.Name
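The conditional aggregation above can be tried out with Python's `sqlite3` module. The sample rows below are made up for illustration, and the `Total` column here counts only 'Accepted' and 'WebApp' students, which is one reading of the `Total = Accepts + WebApps` requirement (`COUNT(*)` would also include 'Prospect' rows).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Colleges (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Students (Id INTEGER PRIMARY KEY, Status TEXT,
                       UniversityId INTEGER, CurrentCollege TEXT);
INSERT INTO Colleges VALUES (1, 'Alpha College'), (2, 'Beta College');
INSERT INTO Students VALUES
    (1, 'Accepted', 1, NULL),
    (2, 'WebApp',   1, NULL),
    (3, 'Prospect', 1, NULL),
    (4, 'Accepted', NULL, 'Beta College'),
    (5, 'Accepted', 2, NULL);
""")

query = """
SELECT c.Name,
       SUM(CASE WHEN s.Status = 'Accepted' THEN 1 ELSE 0 END) AS Accepts,
       SUM(CASE WHEN s.Status = 'WebApp'   THEN 1 ELSE 0 END) AS WebApps,
       SUM(CASE WHEN s.Status IN ('Accepted', 'WebApp') THEN 1 ELSE 0 END) AS Total
FROM Students s
JOIN Colleges c
  ON s.UniversityId = c.Id OR s.CurrentCollege = c.Name
GROUP BY c.Name
ORDER BY c.Name;
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```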
The LINQ:
var results =
(from c in db.Colleges // db is your DataContext
select new
{
CollegeName = c.Name,
AcceptedStatus = db.Students.Count(r => r.Status.ToUpper() == "ACCEPTED" && (r.UniversityId == c.Id || r.CurrentCollege == c.Name)),
WebAppStatus = db.Students.Count(r => r.Status.ToUpper() == "WEBAPP" && (r.UniversityId== c.Id || r.CurrentCollege == c.Name)),
Total = db.Students.Count(s => s.UniversityId == c.Id || s.CurrentCollege == c.Name)
}).ToList();
Try this: http://www.linqpad.net/
It's free, and you can use it to see the SQL that your LINQ queries generate.

EF Sum between 3 tables

Say we got a Database design like this.
Customer
Id Name
1 John
2 Jack
Order
Id CustomerId
1 1
2 1
3 2
OrderLine
Id OrderId ProductId Quantity
1 1 1 10
2 1 2 20
3 2 1 30
4 3 1 10
How would I create an entity framework query to calculate the total Quantity a given Customer has ordered of a given Product?
Input => CustomerId = 1 & ProductId = 1
Output => 40
This is what I have so far, though it's not complete and is still missing the Sum.
var db = new ShopTestEntities();
var orders = db.Orders;
var details = db.OrderDetails;
var query = orders.GroupJoin(details,
order => order.CustomerId,
detail => detail.ProductId,
(order, orderGroup) => new
{
CustomerID = order.CustomerId,
OrderCount = orderGroup.Count()
});
I find it's easier to use the special Linq syntax as opposed to the extension method style when I'm doing joins and groupings, so I hope you don't mind if I write it in that style.
This is the first approach that comes to mind for me:
int customerId = 1;
int productId = 1;
var query = from orderLine in db.OrderLines
join order in db.Orders on orderLine.OrderId equals order.Id
where order.CustomerId == customerId && orderLine.ProductId == productId
group orderLine by new { order.CustomerId, orderLine.ProductId } into grouped
select grouped.Sum(g => g.Quantity);
// The result will be null if there are no entries for the given product/customer.
int? quantitySum = query.SingleOrDefault();
I can't check what kind of SQL this will generate at the moment, but I think it should be something pretty reasonable. I did check that it gave the right result when using Linq To Objects.
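For comparison, this is roughly the kind of SQL one might expect such a query to boil down to, run here against the question's sample data via Python's `sqlite3` module. It is a hand-written sketch, not the SQL that Entity Framework actually generates; the table is named `Orders` because `Order` is a reserved word, and `total_quantity` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Orders (Id INTEGER PRIMARY KEY, CustomerId INTEGER);
CREATE TABLE OrderLine (Id INTEGER PRIMARY KEY, OrderId INTEGER,
                        ProductId INTEGER, Quantity INTEGER);
INSERT INTO Customer VALUES (1, 'John'), (2, 'Jack');
INSERT INTO Orders VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO OrderLine VALUES (1, 1, 1, 10), (2, 1, 2, 20), (3, 2, 1, 30), (4, 3, 1, 10);
""")


def total_quantity(customer_id, product_id):
    """Sum Quantity over all order lines of one product for one customer."""
    row = conn.execute(
        """
        SELECT SUM(ol.Quantity)
        FROM OrderLine ol
        JOIN Orders o ON ol.OrderId = o.Id
        WHERE o.CustomerId = ? AND ol.ProductId = ?
        """,
        (customer_id, product_id),
    ).fetchone()
    return row[0]  # None when there are no matching order lines


print(total_quantity(1, 1))
```

For CustomerId = 1 and ProductId = 1 this adds the quantities 10 and 30 from order lines 1 and 3.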

How to return value from 2 tables in one linq query

please consider this table:
PK_Id Number Year Month Value
-------------------------------------------------------------------------
1 1 2000 5 100000
410 4 2000 6 10000
8888 1 2001 5 100
I have Id=8888, and I want to select the record with Id=8888 and also the record for the previous year of that record (I mean Id=1). How can I do this with LINQ in one query?
Basically, we have some queries that first need to find a value from a table (which may not be the PK) and then find the corresponding records in other tables. How can I do this with LINQ and a single round trip to the database?
thanks
from a in Record
where a.PK_Id == 8888
from b in Record
where b.Number == a.Number && b.Year == a.Year - 1
select new { Current = a, Previous = b }
or
Record
.Where(a => a.PK_Id == 8888)
.SelectMany(a =>
Record
.Where(b => b.Number == a.Number && b.Year == a.Year - 1)
.Select(b => new { Current = a, Previous = b }));
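Both forms amount to a self-join of the table on the same Number and the previous Year. A small sketch with Python's `sqlite3` module and the question's sample rows shows the equivalent SQL; the table name `Record` is taken from the LINQ above, and the column aliases are assumptions of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Record (PK_Id INTEGER PRIMARY KEY, Number INTEGER,
                     Year INTEGER, Month INTEGER, Value INTEGER);
INSERT INTO Record VALUES (1, 1, 2000, 5, 100000),
                          (410, 4, 2000, 6, 10000),
                          (8888, 1, 2001, 5, 100);
""")

query = """
SELECT cur.PK_Id  AS CurrentId,  cur.Value  AS CurrentValue,
       prev.PK_Id AS PreviousId, prev.Value AS PreviousValue
FROM Record cur
JOIN Record prev
  ON prev.Number = cur.Number      -- same Number ...
 AND prev.Year = cur.Year - 1      -- ... one year earlier
WHERE cur.PK_Id = ?;
"""

row = conn.execute(query, (8888,)).fetchone()
print(row)
```

With an inner join, nothing is returned when no previous-year record exists; a LEFT JOIN would keep the current record in that case.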
If I understand your question right, then you need to filter the data of one table and join two tables.
You can join the tables and filter your data
var query = from c in Table1
join o in Table2 on c.Col1 equals o.Col2
where o.Col3 == "x"
select c;
or you can filter your data from one table and then join the tables (result will be the same)
var query = from c in Table1.Where(item => item.Col3 == "x")
join o in Table2 on c.Col1 equals o.Col2
select c;

How to perform aggregate function on last 4 rows of data?

I've got a table of the following model.
public class WeeklyNums
{
public int FranchiseId { get; set; }
public DateTime WeekEnding { get; set; }
public decimal Sales { get; set; }
}
I need a fourth column that calculates the minimum for this week and the previous three weeks. So the output would look like this.
1 7-Jan $1
1 14-Jan $2
1 21-Jan $3
1 28-Jan $4 **1**
1 4-Feb $4 **2**
1 11-Feb $6 **3**
1 18-Feb $4 **4**
1 25-Feb $8 **4**
1 3-Mar $7 **4**
I have no idea where to even start. Even some help with solving it in SQL would be helpful.
thx!
Consider using outer apply:
select yt1.*
, hist.four_week_min
from YourTable yt1
outer apply
(
select min(col1) as four_week_min
from YourTable yt2
where yt2.dt between dateadd(wk, -3, yt1.dt) and yt1.dt
) hist
Working example at SQL Fiddle.
var runningMins = from weekNum in data
select new
{
FranchiseId = weekNum.FranchiseId,
WeekEnding = weekNum.WeekEnding,
Sales = weekNum.Sales,
LastThreeWeeks = data.OrderByDescending( x => x.WeekEnding )
.Where( x => x.WeekEnding <= weekNum.WeekEnding )
.Take( 4 )
.Min( x => x.Sales )
};
SQL Query that will return minimum of the current and the three previous regardless of whether the dates are exactly three weeks apart:
With RnkItems As
(
Select DateVal, Sales
, Row_Number() Over ( Order By DateVal ) As Rnk
From SourceData
)
Select *
, (
Select Min(Sales)
From RnkItems As R1
Where R1.Rnk Between R.Rnk - 3 And R.Rnk
)
From RnkItems R
Order By 1
SQL Fiddle version
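Where window frames are available (SQL Server 2012 and later, and SQLite 3.25 and later), a `ROWS BETWEEN 3 PRECEDING AND CURRENT ROW` frame is another way to express "this row and the three before it". The sketch below runs that in SQLite through Python's `sqlite3` module; the year 2012 in the dates and the integer sales values are assumptions chosen to match the question's sample.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE WeeklyNums (FranchiseId INTEGER, WeekEnding TEXT, Sales INTEGER)")
conn.executemany(
    "INSERT INTO WeeklyNums VALUES (?, ?, ?)",
    [
        (1, "2012-01-07", 1), (1, "2012-01-14", 2), (1, "2012-01-21", 3),
        (1, "2012-01-28", 4), (1, "2012-02-04", 4), (1, "2012-02-11", 6),
        (1, "2012-02-18", 4), (1, "2012-02-25", 8), (1, "2012-03-03", 7),
    ],
)

query = """
SELECT FranchiseId, WeekEnding, Sales,
       MIN(Sales) OVER (
           PARTITION BY FranchiseId
           ORDER BY WeekEnding
           ROWS BETWEEN 3 PRECEDING AND CURRENT ROW
       ) AS FourWeekMin
FROM WeeklyNums
ORDER BY FranchiseId, WeekEnding;
"""

rows = conn.execute(query).fetchall()
mins = [row[3] for row in rows]
print(mins)
```

The first three rows have fewer than four weeks of history, so their minimum covers only the rows available so far; from the fourth row onward the values line up with the expected column in the question.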
I know I'm too late, but here's the linq version:
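var result = from w1 in db.Table
// rows for the same franchise within the four weeks ending at w1
let window = db.Table.Where(x => x.FranchiseId == w1.FranchiseId
&& x.WeekEnding <= w1.WeekEnding
&& x.WeekEnding > w1.WeekEnding.AddDays(-28))
select new
{
FranchiseId = w1.FranchiseId,
WeekEnding = w1.WeekEnding,
Sales = w1.Sales,
SalesMin = window.Min(x => x.Sales)
};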
