Pagination in a SQL Server stored procedure with duplicated data - c#

I have a stored procedure in SQL Server that gets contact persons based on multiple filters (e.g. DateOfBirth, DisplayName, ...) from multiple tables. I need to alter the stored procedure to include pagination and a total count, because until now the pagination has been done in the backend. PartyId is the unique key. The caveat is that a person can have multiple emails and phones, so if we search for DisplayName = "Sarah", the query returns the following:
TotalCount  PartyId  DisplayName  EmailAddress      PhoneNumber
-----------------------------------------------------------------
3           1        Sarah        sarah@gmail.com   1
3           1        Sarah        sarah2@gmail.com  1
3           1        Sarah        sarah@gmail.com   2
This is roughly what the stored procedure does. I included the assigned values for CurrentPage and PageSize, and the ORDER BY ... OFFSET at the bottom, to test the pagination:
DECLARE @CurrentPage int = 1
DECLARE @PageSize int = 1000
SELECT
    COUNT(*) OVER () as TotalCount,
    p.Id AS PartyId,
    e.EmailAddress,
    pn.PhoneNumber
    etc.....
FROM
    [dbo].[Party] AS p WITH(NOLOCK)
INNER JOIN
    [dbo].[Email] AS e WITH(NOLOCK) ON p.[Id] = e.[PartyID]
INNER JOIN
    [dbo].[PhoneNumber] AS pn WITH(NOLOCK) ON p.[Id] = pn.[PartyID]
    etc.....
WHERE
    p.PartyType = 1 /*Individual*/
GROUP BY
    p.Id, e.EmailAddress, pn.PhoneNumber etc...
ORDER BY
    p.Id
OFFSET (@CurrentPage - 1) * @PageSize ROWS
FETCH NEXT @PageSize ROWS ONLY
This is what we do in the backend to group by PartyId and assign the corresponding emails and phones.
var responseModel = unitOfWork.PartyRepository.SearchContacts(model);
// nothing to return when the search yields no rows (also guards against a null result)
if (responseModel == null || responseModel.Count == 0)
{
    return null;
}
// get multiple phones/emails for a party
var emailAddresses = responseModel.GroupBy(p => new { p.PartyId, p.EmailAddress })
    .Select(x => new {
        x.Key.PartyId,
        x.Key.EmailAddress
    });
var phoneNumbers = responseModel.GroupBy(p => new { p.PartyId, p.PhoneNumber, p.PhoneNumberCreateDate })
    .Select(x => new {
        x.Key.PartyId,
        x.Key.PhoneNumber,
        x.Key.PhoneNumberCreateDate
    }).OrderByDescending(p => p.PhoneNumberCreateDate);
// group by in order to avoid multiple records with different email/phones
responseModel = responseModel.GroupBy(x => x.PartyId)
    .Select(grp => grp.First())
    .ToList();
var list = Mapper.Map<List<SearchContactResponseModelData>>(responseModel);
// add all phones/emails to respective party
list = list.Select(x =>
{
    x.EmailAddresses = new List<string>();
    x.EmailAddresses.AddRange(emailAddresses.Where(y => y.PartyId == x.PartyId).Select(y => y.EmailAddress));
    x.PhoneNumbers = new List<string>();
    x.PhoneNumbers.AddRange(phoneNumbers.Where(y => y.PartyId == x.PartyId).Select(y => y.PhoneNumber));
    return x;
}).ToList();
var sorted = SortAndPagination(model, model.SortBy, list);
SearchContactResponseModel result = new SearchContactResponseModel()
{
    Data = sorted,
    TotalCount = list.Count
};
return result;
And the response will be:
{
    "TotalCount": 1,
    "Data": [
        {
            "PartyId": 1,
            "DisplayName": "SARAH",
            "EmailAddresses": [
                "sarah@gmail.com",
                "sarah2@gmail.com"
            ],
            "PhoneNumbers": [
                "1",
                "2"
            ]
        }
    ]
}
The TotalCount returned from the stored procedure is obviously not the real one; after the backend code (where we assign the emails/phones and group by id) we get the real total count, which is 1 instead of 3.
If we have 3 persons named Sarah, then because of the multiple phones/emails the TotalCount in the stored procedure will be, let's say, 9, while the real count is 3. If I execute the stored procedure to get persons 1 to 2, the pagination won't work, because it pages over the 9 rows instead of the 3 persons.
How can I implement pagination in this scenario?

You might try using a CTE to isolate the query against the Party table. This would allow you to pull the right number of rows (and the proper total row count) without having to worry about the expansion from the emails and phone numbers.
It would look something like this (rearranging your query above):
DECLARE @CurrentPage int = 1;
DECLARE @PageSize int = 1000;
WITH PartyList AS (
    SELECT
        COUNT(*) OVER () as TotalCount,
        p.Id AS PartyId
    FROM
        [dbo].[Party] AS p WITH(NOLOCK)
    WHERE
        p.PartyType = 1 /*Individual*/
    GROUP BY -- You might not need this now depending on your data
        p.Id
    ORDER BY
        p.Id
    OFFSET (@CurrentPage - 1) * @PageSize ROWS
    FETCH NEXT @PageSize ROWS ONLY
)
SELECT
    pl.TotalCount,
    pl.PartyId,
    e.EmailAddress,
    pn.PhoneNumber
FROM PartyList AS pl
INNER JOIN
    [dbo].[Email] AS e WITH(NOLOCK) ON pl.[PartyId] = e.[PartyID]
INNER JOIN
    [dbo].[PhoneNumber] AS pn WITH(NOLOCK) ON pl.[PartyId] = pn.[PartyID];
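Please be aware that a CTE requires the preceding statement to end with a semicolon. Also note that the INNER JOINs still drop any party on the page that has no email or no phone number, even though TotalCount includes it; switch to LEFT JOINs if such parties should be returned.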
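For comparison, here is the same idea in C# against in-memory rows: page over the distinct parties first, and only then attach the emails and phones. This is a minimal sketch, not the original backend code; the `ContactRow` and `ContactPage` types and the `PageContacts` helper are hypothetical names introduced for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical flat row, shaped like one line of the stored procedure's result.
public class ContactRow
{
    public int PartyId { get; set; }
    public string EmailAddress { get; set; }
    public string PhoneNumber { get; set; }
}

// Hypothetical paged item: one party with all of its emails and phones.
public class ContactPage
{
    public int PartyId { get; set; }
    public List<string> EmailAddresses { get; set; }
    public List<string> PhoneNumbers { get; set; }
}

public static class ContactPaging
{
    // Returns the requested page plus the number of distinct parties.
    public static (List<ContactPage> Data, int TotalCount) PageContacts(
        IEnumerable<ContactRow> rows, int currentPage, int pageSize)
    {
        // Group first, so the count and the page window are both per party.
        var parties = rows.GroupBy(r => r.PartyId)
                          .OrderBy(g => g.Key)
                          .ToList();

        var data = parties
            .Skip((currentPage - 1) * pageSize)
            .Take(pageSize)
            .Select(g => new ContactPage
            {
                PartyId = g.Key,
                EmailAddresses = g.Select(r => r.EmailAddress).Distinct().ToList(),
                PhoneNumbers = g.Select(r => r.PhoneNumber).Distinct().ToList()
            })
            .ToList();

        return (data, parties.Count);
    }
}
```

Fed the three "Sarah" rows from the question, `PageContacts(rows, 1, 10)` yields a TotalCount of 1 and a single item carrying two emails and two phone numbers.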

Related

Finding consecutive attendance for a series of events

I am trying to find a SQL only solution to an issue related to calculating consecutive event attendance. The events occur on different days so I cannot use any sequential date method for determining consecutive attendance. To count consecutive attendance for a single person I would start with the most recent event and work my way back in time. I would count each event that the person attended and when I hit an event the person did not attend I would stop. This allows me to have a count of recent consecutive attendance of events. Currently, all of the data is hosted in SQL tables and below is sample schema with data:
USERS
ID UserName MinutesWatched
--- -------- --------------
1 jdoe 30
2 ssmith 400
3 bbaker 350
4 tduke 285
EVENTS
ID Name StartDate
-- ----------- ---------
1 1st Event 07/15/2018
2 2nd Event 07/16/2018
3 3rd Event 07/18/2018
4 4th Event 07/20/2018
ATTENDANCE
ID User_ID Event_ID
-- ------- --------
1 1 1
2 1 2
3 1 3
4 1 4
5 2 4
6 2 3
7 3 4
8 3 2
9 3 1
10 4 4
11 4 3
12 4 2
For an output I am trying to get:
OUTPUT
User_ID Consecutive WatchedMinutes
------- ----------- --------------
1 4 30
2 2 400
3 1 350
4 3 285
I have built out C# code to do this in an iterative fashion but it is slow when I am dealing with 300,000+ users and hundreds of events. I would love to see a SQL version of this.
Below is the method that calculates top event viewers as requested by Dan. The output is actually just a string that lists the Top X event viewers.
public string GetUsersTopWatchedConsecutiveStreams(int topUserCount)
{
    string results = "Top " + topUserCount + " consecutive viewers - ";
    Dictionary<ChatUser, int> userinfo = new Dictionary<ChatUser, int>();
    using (StorageModelContext db = new StorageModelContext())
    {
        IQueryable<ChatUser> allUsers = null;
        if (mainViewModel.CurrentStream != null)
            allUsers = db.ViewerHistory.Include("Stream").Include("User").Where(x => x.Stream.Id == mainViewModel.CurrentStream.Id).Select(x => x.User);
        else
            allUsers = db.ViewerHistory.Include("Stream").Include("User").Where(x => x.Stream.Id == (db.StreamHistory.OrderByDescending(s => s.StreamEnd).FirstOrDefault().Id)).Select(x => x.User);
        foreach (var u in allUsers)
        {
            int totalStreams = 0;
            var user = db.Users.Include("History").Where(x => x.UserName == u.UserName).FirstOrDefault();
            if (user != null)
            {
                var streams = user.History;
                if (streams != null)
                {
                    var allStreams = db.StreamHistory.OrderByDescending(x => x.StreamStart);
                    foreach (var s in allStreams)
                    {
                        var vs = streams.Where(x => x.Stream == s);
                        if (vs.Count() > 0)
                            totalStreams++;
                        else
                            break;
                    }
                }
            }
            userinfo.Add(u, totalStreams);
            totalStreams = 0;
        }
        var top = userinfo.OrderByDescending(x => x.Value).ThenByDescending(x => x.Key.MinutesWatched).Take(topUserCount);
        int cnt = 1;
        foreach (var t in top)
        {
            results += "#" + cnt + ": " + t.Key + "(" + t.Value.ToString() + "), ";
            cnt++;
        }
        if (cnt > 1)
            results = results.Substring(0, results.Length - 2);
    }
    return results;
}
mainViewModel.CurrentStream is null when no event is actively running. When a live event is occurring it will contain an object with information related to the live stream event.
Maybe you want to give this one a try:
Events get a row number in descending StartDate order, and each user's attendances get a row number in descending StartDate order as well. For consecutive attendances, the difference between the event number and the attendance number stays the same. I use these differences for grouping, count the attendances in each group, and return the group with the lowest difference per user:
WITH
evt (ID, StartDate, evt_no) AS (
    SELECT ID, StartDate,
        ROW_NUMBER() OVER (ORDER BY StartDate DESC)
    FROM EVENTS
),
att ([User_ID], grp_no) AS (
    SELECT [User_ID], evt_no -
        ROW_NUMBER() OVER (PARTITION BY [User_ID] ORDER BY StartDate DESC)
    FROM ATTENDANCE a
    INNER JOIN evt ON a.Event_ID = evt.ID
),
con ([User_ID], Consecutive, rn) AS (
    SELECT [User_ID], COUNT(*),
        ROW_NUMBER() OVER (PARTITION BY [User_ID] ORDER BY grp_no)
    FROM att
    GROUP BY [User_ID], grp_no
)
SELECT u.ID AS [User_ID], u.UserName, u.MinutesWatched, con.Consecutive
FROM con
INNER JOIN USERS u ON con.[User_ID] = u.ID
WHERE con.rn = 1;
Would be interested in how long this query runs on your system.
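To make the row-number difference trick easier to follow, here is a small in-memory sketch of the same grouping in C#. It is only an illustration under stated assumptions: the `Streaks` class and `MostRecentRun` method are hypothetical names, and the event ids are assumed to be passed in already ordered by StartDate descending.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class Streaks
{
    // eventIdsNewestFirst: all event ids, ordered by StartDate descending.
    // attendance: (UserId, EventId) pairs.
    // Returns, per user, the length of that user's most recent run of consecutive events.
    public static Dictionary<int, int> MostRecentRun(
        IList<int> eventIdsNewestFirst,
        IEnumerable<(int UserId, int EventId)> attendance)
    {
        // evt_no: 1 for the newest event, 2 for the one before it, and so on.
        var evtNo = eventIdsNewestFirst
            .Select((id, index) => new { id, no = index + 1 })
            .ToDictionary(x => x.id, x => x.no);

        return attendance
            .GroupBy(a => a.UserId)
            .ToDictionary(
                g => g.Key,
                g => g.Select(a => evtNo[a.EventId])
                      .Distinct()
                      .OrderBy(no => no)
                      // grp_no: evt_no minus the attendance's own row number
                      .Select((no, index) => no - (index + 1))
                      .GroupBy(grp => grp)
                      .OrderBy(grp => grp.Key)
                      .First()
                      .Count());
    }
}
```

With the sample data (events 4, 3, 2, 1 newest first), user 3 attended events 4, 2 and 1, which gives event numbers 1, 3, 4 and group numbers 0, 1, 1, so the lowest group has a single member. Like the query, this reports a user's latest run even when that run does not include the newest event; keeping only the group whose number is 0 would restrict the result to runs that do.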
You seem to want the largest event id that a person did not attend, which is smaller than the largest id the person did attend. Then you want to count the events after it that the person attended.
The following approach handles this as:
Combine the users with all events up to the maximum event they attended
Get the largest event that doesn't match
Bring back the rows after that event and count them
So, this gives the events along with the largest non-matching event id:
-- assumes event ids increase with StartDate
select u.user_id, e.id as event_id,
       max(case when a.event_id is null then e.id end) over (partition by u.user_id) as max_nonmatch_event_id
from (select user_id, max(event_id) as max_event_id
      from attendance
      group by user_id
     ) u join
     events e
     on e.id <= u.max_event_id left join
     attendance a
     on a.user_id = u.user_id and a.event_id = e.id
order by u.user_id, e.id;
One more subquery should do the rest:
select ue.user_id, count(*) as num_consecutive
from (select u.user_id, e.id as event_id,
             max(case when a.event_id is null then e.id end) over (partition by u.user_id) as max_nonmatch_event_id
      from (select user_id, max(event_id) as max_event_id
            from attendance
            group by user_id
           ) u join
           events e
           on e.id <= u.max_event_id left join
           attendance a
           on a.user_id = u.user_id and a.event_id = e.id
     ) ue
-- a null max_nonmatch_event_id means the user missed no event at all
where ue.max_nonmatch_event_id is null or ue.event_id > ue.max_nonmatch_event_id
group by ue.user_id;

How to filter LINQ query by table column and get count

I'm trying to get a list of students based on their status, grouped by their college.
So I have two tables, Students and Colleges. Each student record has a status that can be 'Prospect', 'Accepted' or 'WebApp'. What I need to do is get a list of students based on the status selected and then display the college's name, along with the number of students that go to that college and have their status set to the status passed in. I think this needs to be an aggregate query, since the counts are derived from the string Status field.
I'm not sure how to do this in MS SQL, since the count is coming from the same table and it's based on the status field's value.
Here is the start of my query, which takes in the search parameters, but I can't figure out how to filter on the status to return the counts.
SELECT Colleges.Name, [Status], Count([Status])
FROM Students
JOIN Colleges ON Students.UniversityId = Colleges.id OR Students.CurrentCollege = Colleges.Name
GROUP BY Students.[Status], Colleges.Name
ORDER BY Colleges.Name;
Accepts = Status('Accepted')
WebApps = Status('WebApp')
Total = Sum(Accepts + WebApps)
Select
    Colleges.Name,
    SUM(Case when Students.Status like 'Accepted' then 1 else 0 end) Accepts,
    SUM(Case when Students.Status like 'WebApp' then 1 else 0 end) WebApps,
    COUNT(*) Total -- counts every student of the college, whatever the status
from Students
join Colleges on Students.UniversityId = Colleges.Id OR Students.CurrentCollege = Colleges.Name
Group by Colleges.Name
The LINQ:
var results =
    (from c in db.Colleges // db is your DataContext
     select new
     {
         CollegeName = c.Name,
         AcceptedStatus = db.Students.Count(r => r.Status.ToUpper() == "ACCEPTED" && (r.UniversityId == c.Id || r.CurrentCollege == c.Name)),
         WebAppStatus = db.Students.Count(r => r.Status.ToUpper() == "WEBAPP" && (r.UniversityId == c.Id || r.CurrentCollege == c.Name)),
         Total = db.Students.Count(s => s.UniversityId == c.Id || s.CurrentCollege == c.Name)
     }).ToList();
Try LINQPad: http://www.linqpad.net/
It's free, and it can show you the SQL generated for your LINQ queries.

LINQ query to get count of records

I'm trying to retrieve some records from database along with a count, with LINQ.
DataTable dtByRecipe = (from tbrp in context.tblRecipeParents
                        join tbrc in context.tblRecipeChilds on tbrp.RecipeParentID equals tbrc.RecipeParentID
                        join tbp in context.tblProducts on tbrc.ProductID equals tbp.ProductID
                        join tbps in context.tblProductSales.AsEnumerable()
                            on tbp.ProductID equals tbps.ProductID
                        join tbs in context.tblSales.AsEnumerable()
                            on tbps.ProductSalesID equals tbs.ProductSalesID
                        select new
                        {
                            tbrp.Recipe,
                            tbp.ProductID,
                            tbps.ProductSalesID,
                            tbrp.Yield,
                            Product = tbp.ProductCode + " - " + tbp.ProductDescription,
                            ProductYield = tbrp.Yield,
                            TotalYield = "XXX",
                            Cost = "YYY"
                        }).AsEnumerable()
                        .Select(item => new {
                            item.Recipe,
                            Count = GetCount(item.ProductID, item.ProductSalesID, context),
                            item.Yield,
                            Product = item.Product,
                            ProductYield = item.ProductYield,
                            TotalYield = "XXX",
                            Cost = "YYY"
                        }).OrderBy(o => o.Recipe).ToDataTable();

private int GetCount(int ProductID, int ProductSalesID, MTBARKER_DBEntities context)
{
    int query = (from tbps in context.tblProductSales
                 join tbp in context.tblProducts on tbps.ProductID equals tbp.ProductID
                 join tbs in context.tblSales
                     on tbps.ProductSalesID equals tbs.ProductSalesID
                 where tbp.ProductID == ProductID && tbps.ProductSalesID == ProductSalesID
                 select tbs).Count();
    return query;
}
The query above gives the expected result, but since there are around 10K records in the database it takes a long time to produce it. The problem is the following approach I used to get the count, which runs a separate query for every row:
Count = GetCount(item.ProductID, item.ProductSalesID, context),
Is there a more efficient way to avoid this issue?
A stored procedure is probably the best choice for performance here. Entity Framework can call stored procedures, so you can use them for selection and reporting queries like this one.
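If you would rather stay in LINQ, another option is to compute all the counts in one grouped pass and look each one up from a dictionary, instead of calling `GetCount` once per row. The sketch below works on in-memory rows and assumes the per-row query is the bottleneck; `SaleRow`, `SalesCounts` and `CountSales` are hypothetical names standing in for the joined tblProductSales/tblSales rows.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical row: one sale, identified by the same two keys GetCount filters on.
public class SaleRow
{
    public int ProductID { get; set; }
    public int ProductSalesID { get; set; }
}

public static class SalesCounts
{
    // Builds every (ProductID, ProductSalesID) count in a single pass over the rows.
    public static Dictionary<(int ProductID, int ProductSalesID), int> CountSales(
        IEnumerable<SaleRow> sales)
    {
        return sales
            .GroupBy(s => (s.ProductID, s.ProductSalesID))
            .ToDictionary(g => g.Key, g => g.Count());
    }
}
```

The projection can then read `counts.TryGetValue((item.ProductID, item.ProductSalesID), out var n) ? n : 0` instead of issuing a query per row. When the grouping should run in the database rather than in memory, grouping by an anonymous type (`new { s.ProductID, s.ProductSalesID }`) is the usual form, since value tuples may not translate to SQL.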

Sort the SQL Query in the order of IN

I am writing a query
SELECT * FROM EMPLOYEES WHERE EMP_ID IN (10,5,3,9,2,8,6)
I want the result to be in the following order:
Emp_id Emp_Name
10 John
5 Joe
3 Tippu
9 Rich
2 Chad
8 Chris
6 Rose
Basically, in the same order as the IN clause. Is it possible to do that? Please let me know.
PS: I can do it either in SQL or after I get the result set; sorting with LINQ or something similar in the front end would also work for me (I have the Emp IDs in an array in the front end).
Thanks
Answer to the comment about strings: this gives the same result as the original answer, but matches on names instead of ids:
string orgList = "John,Joe,Tippu,Rich,Chad,Chris,Rose";
List<string> orderArray = new List<string>(orgList.Split(",".ToCharArray()));
// the linq to do the ordering
var result = ourList.OrderBy(e => {
    int loc = orderArray.IndexOf(e.Name);
    return loc == -1 ? int.MaxValue : loc;
});
As a side note, the original answer would probably have been better with these two lines:
string orgList = "10,5,3,9,2,8,6";
// Split returns strings, so each piece has to be parsed to an int
List<int> orderArray = orgList.Split(",".ToCharArray()).Select(int.Parse).ToList();
instead of using integer constants. The code above orders by an arbitrary comma-separated list of integers.
The solution below in Linq gives this result:
void Main()
{
    // some test data
    List<Person> ourList = new List<Person>()
    {
        new Person() { ID = 1, Name = "Arron" },
        new Person() { ID = 2, Name = "Chad" },
        new Person() { ID = 3, Name = "Tippu" },
        new Person() { ID = 4, Name = "Hogan" },
        new Person() { ID = 5, Name = "Joe" },
        new Person() { ID = 6, Name = "Rose" },
        new Person() { ID = 7, Name = "Bernard" },
        new Person() { ID = 8, Name = "Chris" },
        new Person() { ID = 9, Name = "Rich" },
        new Person() { ID = 10, Name = "John" }
    };
    // what we will use to order
    List<int> orderArray = new List<int>() { 10, 5, 3, 9, 2, 8, 6 };
    // the linq to do the ordering
    var result = ourList.OrderBy(e => {
        int loc = orderArray.IndexOf(e.ID);
        return loc == -1 ? int.MaxValue : loc;
    });
    // good way to test using linqpad (get it at linqpad.net)
    result.Dump();
}
// test class so we have something to order
public class Person
{
    public int ID { get; set; }
    public string Name { get; set; }
}
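For longer lists, the repeated `IndexOf` scan can be replaced by a position lookup built once. This is a sketch of that variant, not part of the original answer; `ListOrdering` and `OrderByKeyList` are hypothetical names, and the keys are assumed to be non-null.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ListOrdering
{
    // Orders items by the position of their key in keyOrder.
    // Items whose key is not in keyOrder are placed last, in their original order.
    public static List<T> OrderByKeyList<T, TKey>(
        IEnumerable<T> items, IList<TKey> keyOrder, Func<T, TKey> keySelector)
    {
        // Map each key to its first position in the requested order.
        var position = new Dictionary<TKey, int>();
        for (int i = 0; i < keyOrder.Count; i++)
        {
            if (!position.ContainsKey(keyOrder[i]))
                position[keyOrder[i]] = i;
        }

        return items
            .OrderBy(item => position.TryGetValue(keySelector(item), out int pos) ? pos : int.MaxValue)
            .ToList();
    }
}
```

For example, `ListOrdering.OrderByKeyList(ourList, orderArray, p => p.ID)` sorts the test data above by the id list, and passing a list of names with `p => p.Name` covers the string case.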
Original bad SQL answer
WITH makeMyOrder AS
(
    SELECT 10 as ID, 1 as Ord
    UNION ALL
    SELECT 5 as ID, 2 as Ord
    UNION ALL
    SELECT 3 as ID, 3 as Ord
    UNION ALL
    SELECT 9 as ID, 4 as Ord
    UNION ALL
    SELECT 2 as ID, 5 as Ord
    UNION ALL
    SELECT 8 as ID, 6 as Ord
    UNION ALL
    SELECT 6 as ID, 7 as Ord
)
SELECT *
FROM EMPLOYEES E
JOIN makeMyOrder O ON E.EMP_ID = O.ID
ORDER BY O.Ord
What, Linq-To-SQL doesn't have a magic button you can press to make it do this? :-)
To do this in SQL Server, you need a function that will turn your list into a set and maintain the order. There are many ways to skin this cat; here's one:
CREATE FUNCTION dbo.SplitInts_Ordered
(
    @List      VARCHAR(MAX),
    @Delimiter VARCHAR(255)
)
RETURNS TABLE
AS
    RETURN (SELECT [Index] = ROW_NUMBER() OVER (ORDER BY Number), Item
    FROM (SELECT Number, Item = CONVERT(INT, SUBSTRING(@List, Number,
        CHARINDEX(@Delimiter, @List + @Delimiter, Number) - Number))
      FROM (SELECT ROW_NUMBER() OVER (ORDER BY [object_id])
        FROM sys.all_objects) AS n(Number)
      WHERE Number <= CONVERT(INT, LEN(@List))
        AND SUBSTRING(@Delimiter + @List, Number, LEN(@Delimiter)) = @Delimiter
    ) AS y);
GO
Now you can just say:
DECLARE @list VARCHAR(MAX);
SET @list = '10,5,3,9,2,8,6';
SELECT e.Emp_Id, e.Emp_Name -- never use * in production code
FROM dbo.Employees AS e -- always use schema prefix
INNER JOIN dbo.SplitInts_Ordered(@list, ',') AS x
    ON x.Item = e.Emp_Id
ORDER BY x.[Index];
A much, much, much, much, much better approach is to stop passing a comma-separated list at all, and use Table-Valued Parameters. This is a set of things, not a string or some JSON obscenity. Create a DataTable in your C# code, with two columns, the list and the order. Then create a table type:
CREATE TYPE dbo.SortedList AS TABLE(ID INT, [Order] INT);
Then a stored procedure that takes this as a parameter:
CREATE PROCEDURE dbo.GetTheList
    @x dbo.SortedList READONLY
AS
BEGIN
    SET NOCOUNT ON;
    SELECT e.Emp_Id, e.Emp_Name
    FROM dbo.Employees AS e
    INNER JOIN @x AS x
        ON x.ID = e.Emp_Id
    ORDER BY x.[Order];
END
GO
Whether you can do this with Linq-To-SQL, I'm not sure; people seem to jump on the Linq bandwagon very quickly, because it makes things so easy. Well, as long as you don't need to actually do anything.
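On the C# side, the DataTable and the parameter for the dbo.SortedList type could be built roughly as follows. This is a sketch using plain ADO.NET; the `SortedListParameter` class and its method names are hypothetical, and it assumes the System.Data.SqlClient provider is referenced (Microsoft.Data.SqlClient exposes the same `SqlParameter` members).

```csharp
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public static class SortedListParameter
{
    // Builds rows matching dbo.SortedList (ID INT, [Order] INT),
    // numbering the ids in the order they were supplied.
    public static DataTable BuildTable(IEnumerable<int> ids)
    {
        var table = new DataTable();
        table.Columns.Add("ID", typeof(int));
        table.Columns.Add("Order", typeof(int));

        int order = 1;
        foreach (int id in ids)
        {
            table.Rows.Add(id, order++);
        }
        return table;
    }

    // Wraps the table in a structured parameter for dbo.GetTheList.
    public static SqlParameter BuildParameter(IEnumerable<int> ids)
    {
        return new SqlParameter("@x", SqlDbType.Structured)
        {
            TypeName = "dbo.SortedList",
            Value = BuildTable(ids)
        };
    }
}
```

The parameter is then added to a `SqlCommand` whose `CommandType` is `CommandType.StoredProcedure` and whose `CommandText` is `dbo.GetTheList`.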

EF Sum between 3 tables

Say we got a Database design like this.
Customer
Id Name
1 John
2 Jack
Order
Id CustomerId
1 1
2 1
3 2
OrderLine
Id OrderId ProductId Quantity
1 1 1 10
2 1 2 20
3 2 1 30
4 3 1 10
How would I create an entity framework query to calculate the total Quantity a given Customer has ordered of a given Product?
Input => CustomerId = 1 & ProductId = 1
Output => 40
This is what I have so far, though it's not complete and is still missing the Sum.
var db = new ShopTestEntities();
var orders = db.Orders;
var details = db.OrderDetails;
var query = orders.GroupJoin(details,
    order => order.CustomerId,
    detail => detail.ProductId,
    (order, orderGroup) => new
    {
        CustomerID = order.CustomerId,
        OrderCount = orderGroup.Count()
    });
I find it's easier to use the special Linq syntax as opposed to the extension method style when I'm doing joins and groupings, so I hope you don't mind if I write it in that style.
This is the first approach that comes to mind for me:
int customerId = 1;
int productId = 1;
var query = from orderLine in db.OrderLines
            join order in db.Orders on orderLine.OrderId equals order.Id
            where order.CustomerId == customerId && orderLine.ProductId == productId
            group orderLine by new { order.CustomerId, orderLine.ProductId } into grouped
            // cast to int? so that an empty result comes back as null rather than 0
            select (int?)grouped.Sum(g => g.Quantity);
// The result will be null if there are no entries for the given product/customer.
int? quantitySum = query.SingleOrDefault();
I can't check what kind of SQL this will generate at the moment, but I think it should be something pretty reasonable. I did check that it gave the right result when using Linq To Objects.
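As a quick check against the sample data, here is a LINQ to Objects sketch of a simpler variant that drops the grouping, since both filter values are fixed. The `Order` and `OrderLine` classes and the `TotalQuantity` helper are hypothetical in-memory stand-ins for the entities, with Quantity assumed to be an int.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins for the Order and OrderLine entities.
public class Order
{
    public int Id { get; set; }
    public int CustomerId { get; set; }
}

public class OrderLine
{
    public int Id { get; set; }
    public int OrderId { get; set; }
    public int ProductId { get; set; }
    public int Quantity { get; set; }
}

public static class QuantityTotals
{
    // Sums the quantity of one product across all orders of one customer.
    public static int TotalQuantity(
        IEnumerable<Order> orders, IEnumerable<OrderLine> lines,
        int customerId, int productId)
    {
        return (from line in lines
                join order in orders on line.OrderId equals order.Id
                where order.CustomerId == customerId && line.ProductId == productId
                select line.Quantity).Sum();
    }
}
```

With the tables from the question, customer 1 owns orders 1 and 2, whose lines for product 1 carry quantities 10 and 30, so the total is 40. In memory an empty match sums to 0; a database provider may treat an empty SUM as NULL instead, so a nullable cast can still be needed there.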
