Solution for complex LINQ query with ROW_NUMBER() and PARTITION BY - c#

This is my first question. For a school assignment I'm writing a program in ASP.NET MVC with Rider. It is going to be a cinema web app. The query gets the show that is currently playing in every hall. So, for 6 halls I have 6 IDs, and each of the IDs should give me back:
HallId
MovieTitle
Showtime (Starttime)
The code I built is this, and it works in my query console:
SELECT "HallId", "Title", "StartAt"
FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY "HallId" ORDER BY "StartAt") rn
FROM "Showtime" where "StartAt"::time < now()::time) x
JOIN "Movie" M ON "MovieId" = M."Id"
WHERE x.rn = 1
ORDER BY "HallId"
I need a LINQ query for this, but I couldn't get it working. I use Postgres, by the way; that is why the identifiers are in double quotes.
Does anyone have an answer for me?

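Your question is not clear about the column names, but you can use something like the following LINQ query:
var result =
    (from s in dbEntities.Showtime
     join m in dbEntities.Movie on s.MovieId equals m.Id
     where s.StartAt < DateTime.Now
     // rn is not a column: ROW_NUMBER() has no direct LINQ equivalent, so the rn = 1 filter is not expressed here
     select new { s.HallId, m.Title, s.StartAt }).ToList();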

This was my solution:
After a long search, I found the following (magical) solution. It works like a charm for me:
public IEnumerable<Showtime> MovieNext()
{
    // Verbatim string: the double quotes Postgres needs around identifiers are escaped by doubling them.
    return _context.Showtime
        .FromSqlRaw(@"SELECT tbl.* FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY ""HallId"" ORDER BY ""StartAt"") row
            FROM myDb.""Showtime""
            WHERE ""StartAt"" > now()) tbl
            JOIN myDb.""Movie"" M ON ""MovieId"" = M.""Id""
            WHERE tbl.row = 1 ORDER BY ""HallId""");
}
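For comparison, the "row number 1 per partition" idea can also be written in plain LINQ as GroupBy, then OrderBy, then First. The following is only a minimal sketch against in-memory collections: the Movie and Showtime records and the NextPerHall method are hypothetical stand-ins for the entities in the question, it follows the "StartAt > now" filter of the solution above, and whether a given EF provider can translate the same shape to SQL is not something this sketch verifies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the Showtime and Movie entities in the question.
public record Movie(int Id, string Title);
public record Showtime(int HallId, int MovieId, DateTime StartAt);

public static class HallQueries
{
    // For each hall, keep the earliest showtime that starts after `now`.
    // GroupBy + OrderBy + First plays the role of ROW_NUMBER() ... WHERE rn = 1.
    public static List<(int HallId, string Title, DateTime StartAt)> NextPerHall(
        IEnumerable<Showtime> showtimes, IEnumerable<Movie> movies, DateTime now)
    {
        return showtimes
            .Where(s => s.StartAt > now)
            .GroupBy(s => s.HallId)
            .Select(g => g.OrderBy(s => s.StartAt).First())
            .Join(movies, s => s.MovieId, m => m.Id,
                  (s, m) => (s.HallId, m.Title, s.StartAt))
            .OrderBy(r => r.HallId)
            .ToList();
    }
}
```

Calling HallQueries.NextPerHall(showtimes, movies, DateTime.Now) returns one row per hall that still has an upcoming show; halls without one are simply absent from the result.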

Related

EntityFramework Group by not included in SQL statement

I'm trying to create a query similar to this:
select randomId
from myView
where ...
group by randomId
NOTE: As far as I can tell EF doesn't support DISTINCT, so I was thinking of working around the lack of it with the GROUP BY.
randomId is numeric
Entity Framework V.6.0.2
This query gives me the expected result in under 1 second.
When trying to do the same with EF, I have been having some issues.
If I do the LINQ similar to this:
context.myView
.Where(...)
.GroupBy(mt => mt.randomId)
.Select(group => new { Id = group.Key, Count = group.Count() })
I get roughly the same result, but it forces a count and makes the query take more than 6 seconds.
The SQL EF generates is something like this:
SELECT
1 AS [C1],
[GroupBy1].[K1] AS [randomId],
[GroupBy1].[A1] AS [C2]
FROM (
SELECT
[Extent1].[randomId] AS [K1],
COUNT(1) AS [A1]
FROM [dbo].[myView] AS [Extent1]
WHERE (...)
GROUP BY [Extent1].[randomId]
) AS [GroupBy1]
But if the count is commented out of that SQL, the query is back to under 1 second.
If I change the Select to:
.Select(group => new { Id = group.Key })
I get all of the rows, with no GROUP BY in the SQL query and no DISTINCT whatsoever:
SELECT
[Extent1].[anotherField] AS [anotherField], -- 'this field got included automatically on this query and I dont know why, it doesnt affect outcome when removed in SQL server'
[Extent1].[randomId] AS [randomId]
FROM [dbo].[myView] AS [Extent1]
WHERE (...)
Other failed attempts:
query.GroupBy(x => x.randomId).Select(group => group.FirstOrDefault());
The query that was generated is as follows:
SELECT
[Limit1].ALL FIELDS,...
FROM (SELECT
[Extent1].[randomId] AS [randomId]
FROM [dbo].[myView] AS [Extent1]
WHERE (...)) AS [Project1]
OUTER APPLY (SELECT TOP (1)
[Extent2].ALL FIELDS,...
FROM [dbo].[myView] AS [Extent2]
WHERE (...)) AS [Limit1] -- same as the where above
This query performed rather poorly and still managed to return all Ids for the where clause.
Does anyone have an idea on how to force the usage of the group by without an aggregating function like a count?
In SQL it works but then again I have the distinct keyword as well...
Cheers,
J
var query = from p in TableName
select new {Id = p.ColumnNameId};
var distinctItems = query.Distinct().ToList();
Here is the LINQ query; you should be able to write an equivalent against the EF DbSet too. If you have issues, let me know.
Cheers!
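To make the Distinct idea concrete, here is a minimal in-memory sketch. The tuple shape (RandomId, AnotherField) and the DistinctIds class are hypothetical stand-ins for the view in the question. The point is only that projecting the key first and then calling Distinct gives the same set of ids as grouping and taking the key; the SQL that EF generates for either form is not shown or verified here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctIds
{
    // Project the id first, then de-duplicate.
    public static List<int> ViaDistinct(IEnumerable<(int RandomId, string AnotherField)> rows) =>
        rows.Select(r => r.RandomId).Distinct().ToList();

    // Same set of ids expressed as a GroupBy that keeps only the key.
    public static List<int> ViaGroupBy(IEnumerable<(int RandomId, string AnotherField)> rows) =>
        rows.GroupBy(r => r.RandomId).Select(g => g.Key).ToList();
}
```

Both methods return each id once; projecting before Distinct matters because Distinct on the full rows would compare every column, not just the id.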

Is there a way to check if one or more rows of a result set sum up to a specific value?

This is kind of a complicated question to phrase, so bear with me. Let's say I have a query that returns a set of integers.
2387
3357
3471
4885
5867
6170
8170
9777
12970
13190
17670
20470
160159
These obviously all mean something to me, even if it's tough to see what they mean for you. For simplicity, say they represent a measurement. My first attempt is to match specific database values to a number obtained through an upload process. In this case I want to match 37,174.
Now, obviously, by looking at them you can see that no ONE record matches the amount I'm looking for. My real question is: is there any way to see whether some combination of these amounts totals the amount I'm looking for? I'd prefer something that can be rolled into a SQL query, but I use C# for all of my processing, so if there is something I'm missing that I can use, a nudge in the right direction would be appreciated. I tried a Google search, but because the question is so hard to phrase, I could not find anything relevant or useful. I'm still a newbie, so I don't know if there is a method or class in C#, or some functionality of Postgres, that will permit this.
Edit: I know how I could do it using loops, but I know that would be a poor choice for performance.
Using the Combinatorics library:
var values = new int[] { 1, 2, 3, 4, 5};
var target = 9;
var candidates = Enumerable.Range(1,values.Count())
.SelectMany(x => new Combinations<int>(values, x))
.Where(x => x.Sum() == target);
This will give you all possible combinations which match your target value. It's up to you if you'd prefer the first one (use FirstOrDefault()), or apply some more logic.
In your example, no combination adds up to 37174.
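If you would rather not pull in a library, the same brute-force search can be sketched with a bitmask over the input array. This is a swapped-in technique, not the library's implementation: the SubsetSum class and its Find method are hypothetical names, each value is used at most once, and the cost is one pass per non-empty subset (2^n - 1 of them, so 8191 for the 13 values in the question).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SubsetSum
{
    // Returns every non-empty subset of `values` whose elements add up to `target`.
    // Bit i of `mask` decides whether values[i] is part of the current subset.
    public static List<int[]> Find(int[] values, long target)
    {
        if (values.Length > 30)
            throw new ArgumentException("Too many values for a brute-force search.");

        var matches = new List<int[]>();
        int subsetCount = 1 << values.Length;
        for (int mask = 1; mask < subsetCount; mask++)
        {
            long sum = 0;
            var picked = new List<int>();
            for (int i = 0; i < values.Length; i++)
            {
                if ((mask & (1 << i)) != 0)
                {
                    sum += values[i];
                    picked.Add(values[i]);
                }
            }
            if (sum == target) matches.Add(picked.ToArray());
        }
        return matches;
    }
}
```

For the small example above, SubsetSum.Find(new[] { 1, 2, 3, 4, 5 }, 9) yields three subsets: { 4, 5 }, { 1, 3, 5 } and { 2, 3, 4 }.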
The only way to do this in SQL is a brute force approach. For instance, the following query will consider all combinations of three numbers:
select *
from t t1 join
t t2
on t1.val < t2.val join
t t3
on t2.val < t3.val
where t1.val + t2.val + t3.val = 37174;
The combinations are ordered from smallest to largest values, with no duplicates.
If you want the closest sum to your goal, then you can do something like:
select *
from t t1 join
t t2
on t1.val < t2.val join
t t3
on t2.val < t3.val
order by abs(t1.val + t2.val + t3.val - 37174)
limit 1;
If you want up-to three numbers, then include a 0 value in the list.
And, all of these generalize to a fixed number of joins.
To do a variable number, you need to use recursive queries.
You do have one advantage using SQL over, say, C# for this type of search. SQL can take advantage of multi-threaded parallelism by default.
As the count of numbers increases, the number of options increases exponentially.
E.g.:
If you have three numbers, then you have to check 7 combinations
A, B, C, A+B, A+C, B+C, A+B+C
With four, there are 15:
A, B, C, D,
A+B, A+C, A+D, B+C, B+D, C+D,
A+B+C, A+B+D, A+C+D, B+C+D
A+B+C+D
and so on.
So I would say, No, there is no simple SQL way to absolutely find the answer.
However, you could do it with a cross join to find the simpler solutions.
E.g., where your table is t with a field i containing the values, add a 0 value to the table and...
insert t (i)
select 0 union
select 2387 union
select 3357 union
select 3471 union
select 4885 union
select 5867 union
select 6170 union
select 8170 union
select 9777 union
select 12970 union
select 13190 union
select 17670 union
select 20470 union
select 160159
select *, t1.i + t2.i + t3.i + t4.i + t5.i + t6.i + t7.i
from t t1
cross join t t2
cross join t t3
cross join t t4
cross join t t5
cross join t t6
cross join t t7
where t1.i+t2.i+t3.i+t4.i+t5.i+t6.i+t7.i = 37174
Which will give you the combination...
2387 3471 4885 13190 4885 4885 3471
Now, you may have the restriction that duplicates aren't allowed, in which case there is no solution from your data.
Try something like:
WITH RECURSIVE match(val, res) AS
(SELECT st.val , 1234 - st.val as res
FROM your_table st
WHERE 1234 - st.val >= 0
UNION ALL
SELECT nt.val, match.res - nt.val
FROM your_table nt
JOIN match ON match.res - nt.val >= 0
),
final_match (val, res) AS
(SELECT match.val , match.res
FROM match
WHERE match.res = 0
UNION ALL
SELECT match.val, match.res
FROM match
JOIN final_match ON final_match.val = match.res
)
SELECT *
FROM final_match
ORDER BY res DESC;
The idea is to recursively build all combinations of numbers that can lead to your sum,
then pick one for which your_number - sum = 0.
Using SQL Server syntax - I don't know if there's a postgres equivalent of recursive CTEs
;with cte as
(
select cast(val as varchar(MAX)) as valtxt
, val
, val as summed
from #temp
union all
select C.valtxt +' + ' + CAST(C2.val as varchar(max))
, C2.val
, C.summed + C2.val
from cte C
inner join #temp C2
on c.val < C2.val
where C.summed <= 37174
)
select top 1 valtxt, summed from cte order by ABS(summed - 37174)
(where #temp is the table containing the values)

How to translate this Queryable linq function

I'm struggling to get this LINQ query to generate correct T-SQL.
Please check the following statement:
// determine the max count of exams applied by students
IQueryable query = (from at in Database.Current.AnsweredTests
where at.TestId == id
group at by at.StudentId into s
select s.Count()).Max();
As you can see, this statement is syntactically wrong, because the Max extension returns an int rather than an IQueryable. What I'm trying to accomplish is to generate correct T-SQL.
Something like this:
MAX(SELECT x.COUNT()
FROM...
GROUP BY StudentId)
I'm doing this because I want good performance, and the current approach performs poorly. So my problem is: how can I write a correct LINQ statement with aggregate functions like MAX and COUNT?
UPDATE:
SELECT [GroupBy1].[A1] AS [C1]
FROM ( SELECT
[Extent1].[StudentId] AS [K1],
COUNT(1) AS [A1]
FROM [dbo].[AnsweredTests] AS [Extent1]
WHERE CAST( [Extent1].[TestId] AS int) = @p__linq__0
GROUP BY [Extent1].[StudentId]
) AS [GroupBy1]
This is what the IQueryable generates (if I remove the Max extension, of course). I would like to know if there is a way to include the aggregate function MAX in that T-SQL query to improve performance on the server side.
You could also word your query in the following way:
SELECT TOP 1 COUNT(*)
FROM AnsweredTests
WHERE TestId = @id
GROUP BY StudentId
ORDER BY COUNT(*) DESC
Following that logic, this (untested) should be what you are looking for:
var result = (from at in Database.Current.AnsweredTests
where at.TestId == id
group at by at.StudentId into s
orderby s.Count() descending
select s.Count()).First();
You can do ORDER BY DESCENDING and then take first:
var Max = (from at in Database.Current.AnsweredTests
where at.TestId == id
group at by at.StudentId into s
select new { Count = s.Count() }).OrderByDescending(o=>o.Count).First();
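As a reference point for what both answers compute, here is a minimal in-memory sketch of "the highest number of answered tests by any one student". The tuple shape (StudentId, TestId) and the ExamStats class are hypothetical stand-ins for the AnsweredTests entity; the sketch runs on LINQ to Objects and says nothing about the SQL a provider would generate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExamStats
{
    // Count the answered tests per student for one test id, then take the largest count.
    // DefaultIfEmpty(0) makes the result 0 instead of throwing when nothing matches.
    public static int MaxAttempts(IEnumerable<(int StudentId, int TestId)> answeredTests, int testId)
    {
        return answeredTests
            .Where(at => at.TestId == testId)
            .GroupBy(at => at.StudentId)
            .Select(g => g.Count())
            .DefaultIfEmpty(0)
            .Max();
    }
}
```

Unlike First() on an ordered sequence, this form returns 0 for a test nobody has answered; with First() that case throws an InvalidOperationException.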

LINQ to Entities find top records from ordered groupings

I have a problem that I know how to solve in SQL but not with Linq to Entities.
My data looks like this:
ID GROUP TIMESTAMP
-- ----- ---------
1 A 2011-06-20
2 A 2011-06-21
3 B 2011-06-21
4 B 2011-06-22
5 B 2011-06-23
6 C 2011-06-30
I want to retrieve all the Entity objects (not just the ID) such that I am only getting the most recent record from each group. (ie. the records with ids 2, 5, 6)
In SQL I would do something like this:
SELECT * FROM my_table a
WHERE a.timestamp =
(SELECT MAX(timestamp) FROM my_table b
WHERE a.group = b.group)
(For the sake of this question you can assume that timestamp is unique within each group).
I'd like to do this query against a WCF Data Service using Linq to Entities but I can't seem to have a nested query that references the outside query like this. Can anyone help?
Possibly not as clean and efficient as the hand-written version, but here's what I came up with:
var q = from a in db.MyEntities
where a.Timestamp == (from b in db.MyEntities
where b.Group == a.Group
select b.Timestamp).Max()
select a;
which translates into this SQL
SELECT
[Project1].[Id] AS [Id],
[Project1].[Group] AS [Group],
[Project1].[Timestamp] AS [Timestamp]
FROM ( SELECT
[Extent1].[Id] AS [Id],
[Extent1].[Group] AS [Group],
[Extent1].[Timestamp] AS [Timestamp],
[SSQTAB1].[A1] AS [C1]
FROM [MyEntities] AS [Extent1]
OUTER APPLY
(SELECT
MAX([Extent2].[Timestamp]) AS [A1]
FROM [MyEntities] AS [Extent2]
WHERE [Extent2].[Group] = [Extent1].[Group]) AS [SSQTAB1]
) AS [Project1]
WHERE [Project1].[Timestamp] = [Project1].[C1]
Hi, try Linqer, which converts your SQL statements to LINQ queries.
Best regards
This should work:
var query = db.my_table
.GroupBy(p=>p.group)
.Select(p=>p.OrderByDescending(q=>q.timestamp).First());
Here you go. A simple way to do it:
var result = (from x in my_table
              group x by x.Group into g
              select new
              {
                  g.Key,
                  timestamp = g.Max(x => x.Timestamp),
                  g // This will return everything in g
              });
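To check the GroupBy/OrderByDescending/First approach against the sample data in the question, here is a minimal in-memory sketch. The Entry record and the LatestPerGroup class are hypothetical names, and this runs on LINQ to Objects only; whether a WCF Data Service or a given EF version accepts the same expression is a separate matter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the entity in the question.
public record Entry(int Id, string Group, DateTime Timestamp);

public static class LatestPerGroup
{
    // For each group, keep the whole record with the most recent timestamp.
    public static List<Entry> Get(IEnumerable<Entry> rows) =>
        rows.GroupBy(r => r.Group)
            .Select(g => g.OrderByDescending(r => r.Timestamp).First())
            .OrderBy(r => r.Id)
            .ToList();
}
```

With the six rows from the question this returns the records with ids 2, 5 and 6.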

Linq2Sql: Get every N'th row [duplicate]

Does anybody know how to write a LINQ to SQL statement to return every nth row from a table? I need to get the title of the item at the top of each page in a paged data grid, for fast user scanning. So if I wanted the first record, then every 3rd one after that, from the following names:
Amy, Eric, Jason, Joe, John, Josh, Maribel, Paul, Steve, Tom
I'd get Amy, Joe, Maribel, and Tom.
I suspect this can be done... LINQ to SQL statements already invoke the ROW_NUMBER() SQL function in conjunction with sorting and paging. I just don't know how to get back every nth item. The SQL Statement would be something like WHERE ROW_NUMBER MOD 3 = 0, but I don't know the LINQ statement to use to get the right SQL.
Sometimes, TSQL is the way to go. I would use ExecuteQuery<T> here:
var data = db.ExecuteQuery<SomeObjectType>(@"
SELECT * FROM
(SELECT *, ROW_NUMBER() OVER (ORDER BY id) AS [__row]
FROM [YourTable]) x WHERE (x.__row % 25) = 1");
You could also swap out the n:
var data = db.ExecuteQuery<SomeObjectType>(@"
DECLARE @n int = {0}
SELECT * FROM
(SELECT *, ROW_NUMBER() OVER (ORDER BY id) AS [__row]
FROM [YourTable]) x WHERE (x.__row % @n) = 1", n);
Once upon a time, there was no such thing as Row_Number, and yet such queries were possible. Behold!
var query =
from c in db.Customers
let i = (
from c2 in db.Customers
where c2.ID < c.ID
select c2).Count()
where i%3 == 0
select c;
This generates the following Sql
SELECT [t2].[ID], [t2]. --(more fields)
FROM (
SELECT [t0].[ID], [t0]. --(more fields)
(
SELECT COUNT(*)
FROM [dbo].[Customer] AS [t1]
WHERE [t1].[ID] < [t0].[ID]
) AS [value]
FROM [dbo].[Customer] AS [t0]
) AS [t2]
WHERE ([t2].[value] % @p0) = @p1
Here's an option that works, but it might be worth checking that it doesn't have any performance issues in practice:
var nth = 3;
var ids = Table
.Select(x => x.Id)
.ToArray()
.Where((x, n) => n % nth == 0)
.ToArray();
var nthRecords = Table
.Where(x => ids.Contains(x.Id));
Just googling around a bit I haven't found (or experienced) an option for Linq to SQL to directly support this.
The only option I can offer is that you write a stored procedure with the appropriate SQL query written out and then calling the sproc via Linq to SQL. Not the best solution, especially if you have any kind of complex filtering going on.
There really doesn't seem to be an easy way to do this:
How do I add ROW_NUMBER to a LINQ query or Entity?
How to find the ROW_NUMBER() of a row with Linq to SQL
But there's always:
peopleToFilter.AsEnumerable().Where((x,i) => i % AmountToSkipBy == 0)
NOTE: This still doesn't execute on the database side of things!
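For completeness, here is the indexed Where overload as a small self-contained sketch. The EveryNth class and its Take method are hypothetical names, and, as noted above, this filters in memory after the rows have been fetched, so it is only reasonable when the ordered result is small enough to load.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EveryNth
{
    // Keeps the first item and then every n-th item after it (zero-based indexes 0, n, 2n, ...).
    public static List<T> Take<T>(IEnumerable<T> ordered, int n)
    {
        if (n <= 0) throw new ArgumentOutOfRangeException(nameof(n));
        return ordered.Where((item, index) => index % n == 0).ToList();
    }
}
```

With the ten names from the question and n = 3 this returns Amy, Joe, Maribel and Tom.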
This will do the trick, but it isn't the most efficient query in the world:
var count = query.Count();
var pageSize = 10;
var pageTops = query.Take(1);
for(int i = pageSize; i < count; i += pageSize)
{
pageTops = pageTops.Concat(query.Skip(i - (i % pageSize)).Take(1));
}
return pageTops;
It dynamically constructs a query to pull the (nth, 2*nth, 3*nth, etc.) value from the given query. If you use this technique, you'll probably want to limit it to maybe ten or twenty names, similar to how Google pages its results (1-10, and Next), in order to avoid building an expression so large that the database refuses to parse it.
If you need better performance, you'll probably have to use a stored procedure or a view to represent your query, and include the row number as part of the stored proc results or the view's fields.
