Sql to Linq difficulty - group by, having - c#

I have the following query in SQL which I would like to convert to LINQ:
select profile_id from t
where child_id in (1, 2, 3, ...) -- this will be a list of integers (CHILDREN)
group by profile_id
having count(distinct child_id) = 3
I am having difficulty translating the last line of my SQL query into LINQ. The following is my work so far:
public IQueryable<ProfileChildRelationship> GetPCRelByCids(List<int> children)
{
var query = from pcr in this._entities.ProfileChildRelationships
where children.Contains(pcr.pcChildIDF)
group pcr by pcr.pcProfileIDF into g
??? having ...?
select pcr;
return query;
}
I think that my main problem is that many people convert a SQL having clause into a LINQ where clause, but in my case I do not think it is possible to write another where after the LINQ group by!
Update:
The situation: I have a number of children, each of which has many different profiles (some may be the same). A user will select a number of children, from which I would like to derive their common profiles. That is, if profile X is found for EVERY child, then I will get it; if profile Y is found for every child except one, then it would be invalid!

Sounds like you want a where clause here...
var query = from pcr in this._entities.ProfileChildRelationships
where children.Contains(pcr.pcChildIDF)
group pcr by pcr.pcProfileIDF into g
where g.Select(x => x.pcChildIDF).Distinct().Count() == 3
select g.Key; // This is the profile ID
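The same idea can be tried in memory with LINQ to Objects. The following is only a minimal sketch: the ProfileChildRelationship class here is a hypothetical stand-in for the entity, and comparing against the number of distinct selected children (instead of the hard-coded 3) is an assumption about the intent "found for EVERY child".

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for a ProfileChildRelationships row.
public sealed class ProfileChildRelationship
{
    public int pcProfileIDF { get; set; }
    public int pcChildIDF { get; set; }
}

public static class CommonProfiles
{
    // Returns the profile IDs that are linked to every child in `children`.
    public static List<int> Find(IEnumerable<ProfileChildRelationship> rows, List<int> children)
    {
        // De-duplicate the selection so the target count is the number of distinct children.
        var wanted = children.Distinct().ToList();

        var query = from pcr in rows
                    where wanted.Contains(pcr.pcChildIDF)
                    group pcr by pcr.pcProfileIDF into g
                    // A where clause placed after the grouping plays the role of HAVING.
                    where g.Select(x => x.pcChildIDF).Distinct().Count() == wanted.Count
                    select g.Key;

        return query.ToList();
    }
}
```

Calling CommonProfiles.Find(rows, new List<int> { 1, 2, 3 }) returns only the profiles that appear for all three children; a profile missing for even one child is filtered out.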

Related

MySQL IN clause: How can I get multiple rows when I use a same value multiple times?

I'm programming a C# Windows Forms application in Visual Studio, and I'm trying to read product prices, and the number of times a user has added each product to their shopping list, from my local MySQL database into a List<int>.
What I do is the following:
If a user has added a product 4 times to their shopping list, I add the barcode of the product 4 times to my List<int>.
This works, but when I join all items of the List with String.Join() into the IN clause of my query and execute it, each row is returned only once, although the IN operator contains the same barcode multiple times.
The following is how I'm adding barcodes to my List<int>:
List<int> barcodes = new List<int>();
MySqlCommand cmd = new MySqlCommand("SELECT product_barcode, amount FROM shopping_list_items WHERE shopping_list_id = " + current_shoppingListID + ";", db.connection);
var reader = cmd.ExecuteReader();
while (reader.Read())
{
    int barcode = Int32.Parse(reader["product_barcode"].ToString());
    int amount = Int32.Parse(reader["amount"].ToString());
    // Add the barcode once per unit; the counter restarts for every row.
    for (int count = 0; count < amount; count++)
    {
        barcodes.Add(barcode);
    }
}
reader.Close();
This is how I'm executing my query and assign the values to variables:
MySqlCommand cmdSum = new MySqlCommand("SELECT sum(price) AS 'total', supermarket_id FROM prices WHERE barcode IN (" + String.Join(", ", barcodes) + ") GROUP BY supermarket_id;", db.connection);
var readerSum = cmdSum.ExecuteReader();
while (readerSum.Read())
{
    // supermarket_id is an integer key, so parse it as an int for the switch.
    switch (int.Parse(readerSum["supermarket_id"].ToString()))
    {
        case 1:
            sumSupermarket1 = double.Parse(readerSum["total"].ToString());
            break;
        case 2:
            sumSupermarket2 = double.Parse(readerSum["total"].ToString());
            break;
        case 3:
            sumSupermarket3 = double.Parse(readerSum["total"].ToString());
            break;
    }
}
A simplified version of the query might look like this:
SELECT name FROM products WHERE barcode IN (13495, 13495, 13495);
With that query, I want the same row to be returned three times.
So my question now is: how can I get multiple rows although I use the same value multiple times in the IN clause of a MySQL query?
Q: how can I get multiple rows although I use the same value multiple times in the IN clause of a MySQL query?
A: We don't. That's not how IN () works.
Note that
WHERE foo IN ('fee','fi','fi','fi')
Is shorthand for
WHERE ( foo = 'fee'
OR foo = 'fi'
OR foo = 'fi'
OR foo = 'fi'
)
Understand what's happening here. MySQL is going to examine each row, and for each row it checks to see if this condition returns TRUE or not. If the row satisfies the condition, the row gets returned. Otherwise the row is not returned.
It doesn't matter that a row with foo value of 'fi' satisfies multiple conditions. All MySQL cares about is that the condition inside the parens ultimately evaluates to TRUE.
As an illustration, consider:
WHERE ( t.picked_by = 'peter piper'
OR t.picked_amount = 'peck'
OR t.name LIKE '%pickled%'
OR t.name LIKE '%pepper%'
)
There could be a row that satisfies every one of these conditions. But the WHERE clause is only asking if the entire condition evaluates to TRUE. If it does, return the row. If it doesn't, then exclude the row. We don't get four copies of a row because more than one of the conditions is satisfied.
So how do we get a set with multiple copies of a row?
As one possible option, we could use separate SELECT statements and combine the results with UNION ALL set operator. Something like this:
SELECT p1.name FROM product p1 WHERE p1.barcode IN (13495)
UNION ALL
SELECT p2.name FROM product p2 WHERE p2.barcode IN (13495)
UNION ALL
SELECT p3.name FROM product p3 WHERE p3.barcode IN (13495)
Note that the result from this query is significantly different than the result from the original query.
There are other query patterns that can return an equivalent set.
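The difference between filtering and joining can also be seen in memory with LINQ to Objects. This is only an illustrative sketch: the Product class and the sample names are hypothetical, and it mirrors the SQL semantics rather than running against MySQL.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for a row of the products table.
public sealed class Product
{
    public int Barcode { get; set; }
    public string Name { get; set; } = "";
}

public static class InVersusJoin
{
    // Filter semantics, like WHERE barcode IN (...): each product is tested once
    // and is either kept or dropped, no matter how often its barcode is listed.
    public static List<string> Filter(IEnumerable<Product> products, List<int> barcodes)
    {
        return products.Where(p => barcodes.Contains(p.Barcode))
                       .Select(p => p.Name)
                       .ToList();
    }

    // Join semantics: one output row per matching (barcode, product) pair,
    // so a barcode listed three times produces three rows.
    public static List<string> Join(IEnumerable<Product> products, List<int> barcodes)
    {
        return barcodes.Join(products, b => b, p => p.Barcode, (b, p) => p.Name)
                       .ToList();
    }
}
```

With a barcode list of { 13495, 13495, 13495 }, Filter yields the matching product once, while Join yields it three times. That is why a join against a list of values, rather than an IN list, is one way to get repeated rows.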
FOLLOWUP
Without an understanding of the use case or the specification, I'm just guessing at what we are attempting to achieve, based on the two queries shown in the code (which follows a common pattern we see in code that is vulnerable to SQL Injection).
The shopping list:
SELECT i.product_barcode
, i.amount
FROM shopping_list_item i
WHERE i.shopping_list_id = :id
What is amount? Is that the quantity ordered? We want two cans of this, or three pounds of that? Seems like we would want to multiply the unit price by the quantity ordered to get the cost. (Two cans is going to cost twice as much as one can.)
If what we are after is the total cost of the items on the shopping list from multiple stores, we could do something like this:
SELECT SUM(p.price * s.amount) AS `total`
, p.supermarket_id
FROM ( SELECT i.product_barcode
, i.amount
FROM shopping_list_item i
WHERE i.shopping_list_id = :id
) s
JOIN price p
ON p.barcode = s.product_barcode
GROUP
BY p.supermarket_id
Note that if a particular product_barcode is not available for particular supermarket_id, that item on the list will be excluded from the total, i.e. we could get a lower total for a supermarket that doesn't have everything on our list.
For performance, we can eliminate the inline view, and write the query like this:
SELECT SUM(p.price * i.amount) AS `total`
, p.supermarket_id
FROM shopping_list_item i
JOIN price p
ON p.barcode = i.product_barcode
WHERE i.shopping_list_id = :id
GROUP
BY p.supermarket_id
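For readers who want to check the arithmetic outside the database, the same join-and-sum can be sketched in memory with LINQ to Objects. The ListItem and Price classes below are hypothetical stand-ins for rows of shopping_list_item and price, not part of the original code.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a shopping_list_item row.
public sealed class ListItem
{
    public int Barcode { get; set; }
    public int Amount { get; set; }
}

// Hypothetical stand-in for a price row.
public sealed class Price
{
    public int Barcode { get; set; }
    public int SupermarketId { get; set; }
    public decimal Value { get; set; }
}

public static class BasketTotals
{
    // Total cost of the list per supermarket: SUM(price * amount) GROUP BY supermarket_id.
    public static Dictionary<int, decimal> PerSupermarket(
        IEnumerable<ListItem> items, IEnumerable<Price> prices)
    {
        var query = from i in items
                    join p in prices on i.Barcode equals p.Barcode
                    // Group the extended price (unit price times quantity) by supermarket.
                    group p.Value * i.Amount by p.SupermarketId into g
                    select new { SupermarketId = g.Key, Total = g.Sum() };

        return query.ToDictionary(x => x.SupermarketId, x => x.Total);
    }
}
```

As with the SQL, this is an inner join: an item that a supermarket does not stock simply contributes nothing to that supermarket's total.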
If we absolutely have to rip through the shopping list query, and then use the rows from that to create a second query, we could form a query that looks something like this:
SELECT SUM(p.price * i.amount) AS `total`
, p.supermarket_id
FROM ( -- shopping_list here
SELECT '13495' AS product_barcode, '1'+0 AS amount
UNION ALL SELECT '13495', '1'+0
UNION ALL SELECT '13495', '1'+0
UNION ALL SELECT '12222', '2'+0
UNION ALL SELECT '15555', '5'+0
-- end shopping_list
) i
JOIN price p
ON p.barcode = i.product_barcode
GROUP
BY p.supermarket_id
You would probably be better off investigating LINQ to SQL rather than concatenating values directly into SQL text, which invites injection.
You can use an inline table join to accomplish what you want:
"SELECT sum(price) AS 'total', supermarket_id " +
"FROM (select " + barcodes[0] + " as bc" + String.Concat(barcodes.Skip(1).Select(b => " union all select " + b)) + ") w " +
"JOIN prices p ON p.barcode = w.bc " +
"GROUP BY supermarket_id;"
Note: If you can name the column with the inline table alias (I couldn't test that) you could simplify the inline table generation.

Linq - How to group by multiple fields and count the records by one of the fields

I want to have just the same result as the following query:
select SITECODE, COUNT(SITECODE), DATEPART(MONTH, LOG_DATE)
from dbo.LOG_VEHICLE_LOOKUP GROUP BY DATEPART(MONTH, LOG_DATE), SITECODE
I'm using Entity Framework as the ORM
My current linq looks like this:
var model = from log in _repository.GetPostcodeLookupLogs()
group log by new { log.LOG_DATE.Month, log.SITECODE } into y
select new
{
y.Key.SITECODE,
y.Key.Month,
Count = y.Count()
};
Seems I was thinking the correct way. I got confused because I wanted to use
_repository.GetVehicleLookupLogs()
instead of:
_repository.GetPostcodeLookupLogs()
Both tables have the same columns, so I got some results, just not the ones I expected.

Linq correlated subquery to same table on multiple columns

I've looked at several other questions related to correlated subqueries but it's still not clear to me how to accomplish what I need. I'm using Entity Framework and C#, and have a table called STEWARDSHIP with the following columns:
STEWARDSHIP_ID (the primary key)
SITE_ID
VISIT_DATE
VISIT_TYPE_ID
I need to identify cases where the same combination of SITE_ID, VISIT_DATE, VISIT_TYPE_ID exists more than once because it could represent a duplicate entry made by end users in error, and then I need to report on the details of these entries. In SQL I would do this by joining to the temporary result of a GROUP BY/HAVING like so:
SELECT * FROM stewardship AS s2,
(SELECT site_id, visit_type_id, CAST(visit_date AS DATE) AS visit_date
FROM stewardship
GROUP BY site_id, visit_type_id, CAST(visit_date AS DATE)
HAVING COUNT(*) > 1) AS s
WHERE s2.site_id = s.site_id
AND s2.visit_type_id = s.visit_type_id
AND CAST(s2.visit_date AS DATE) = s.visit_date
What's the best way to accomplish this in Linq?
Since you're open to a different approach that should be more performant, here is the new SQL to get what I think you're after.
select distinct s1.*
from stewardship s1
inner join stewardship s2 on
s1.stewardship_id <> s2.stewardship_id and
s1.site_id = s2.site_id and
s1.visit_type_id = s2.visit_type_id and
cast(s1.visit_date as date) = cast(s2.visit_date as date)
order by s1.site_id, s1.visit_type_id
Now, to translate that to LINQ, you can use the following statement.
var duplicates = (
from s in Stewardships
join s2 in Stewardships
on new { s.Site_id, s.Visit_type_id, s.Visit_date.Date } equals new { s2.Site_id, s2.Visit_type_id, s2.Visit_date.Date }
where s.Stewardship_id != s2.Stewardship_id
select s)
.Distinct()
.OrderBy(s => s.Site_id)
.ThenBy(s => s.Visit_type_id);
Note that you cannot use anything other than an equijoin for expression joins, so I had to put the non-equijoin (ensuring our matches aren't on the same record via PK) in the where expression. You could also accomplish this with lambdas via the Except() extension method.
The order by is there for readability of the results and to match the SQL statement above.
I hope this helps!
It would be fairly similar to what you've already got.
from s in context.stewardships
group s by new {s.site_id, s.visit_type_id, s.visit_date} into g
where g.Count() > 1
select g;
This would give you groups of stewardships with matching values. You could "flatten" those results with a SelectMany afterward, but you might find them more useful to work with in groups.
Note that grouping on the raw visit_date compares the time of day as well. To do the equivalent of the cast to date you may need a provider function, for example DbFunctions.TruncateTime in Entity Framework 6.
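The grouping approach can be tried in memory with LINQ to Objects. This is a minimal sketch under stated assumptions: the Stewardship class is a hypothetical stand-in for the entity, and DateTime.Date is used for the date-only comparison, which works in memory but may need a provider-specific function when the query is translated to SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for a STEWARDSHIP row.
public sealed class Stewardship
{
    public int StewardshipId { get; set; }
    public int SiteId { get; set; }
    public DateTime VisitDate { get; set; }
    public int VisitTypeId { get; set; }
}

public static class DuplicateVisits
{
    // Returns every row whose (site, visit type, calendar day) occurs more than once.
    public static List<Stewardship> Find(IEnumerable<Stewardship> rows)
    {
        var groups = from s in rows
                     // Anonymous types compare by value, so they work as composite keys.
                     group s by new { s.SiteId, s.VisitTypeId, Day = s.VisitDate.Date } into g
                     where g.Count() > 1
                     select g;

        // Flatten the groups back into individual rows for reporting.
        return groups.SelectMany(g => g)
                     .OrderBy(s => s.StewardshipId)
                     .ToList();
    }
}
```

Two visits to the same site with the same type on the same day are reported together even if their times differ, because only the date part takes part in the key.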

Very confused about how to Group by in LINQ, and even more confused about HAVING

I'm new to Linq. I'm trying to convert this simple SQL query, but can't find any great resources on how to convert SQL to LINQ. My SQL query is:
SELECT SomeValue
FROM SomeTable
GROUP BY SomeValue
HAVING SUM (OtherValue) > 0;
I don't quite understand LINQ well enough to do this. I'm having a lot of trouble getting GROUP BY to work, I haven't even attempted to tackle HAVING SUM yet.
I've tried a few things. This is the most recent, though it's still very wrong:
from entry in SomeTable
group entry by entry.SomeValue into grp
select grp.Select(x => x.SomeValue).toList();
Any help or resources would be great.
An equivalent LINQ statement would look like this:
SomeTable.GroupBy(x => x.SomeValue)
.Where(g => g.Sum(x => x.OtherValue) > 0)
.Select(g => g.Key);
It first groups by SomeValue. The result will be a list of groups. Each group in turn contains all rows that have the same SomeValue.
The next step creates a filter (Where). This filter will return only those groups where the sum of OtherValue over the rows in the group is greater than zero.
Finally, from the filtered groups it will select the key of each group. The key of a group is the value that has been specified in the GroupBy.
To get the grouping working it looks like you want this (in query syntax instead of lambdas):
from entry in SomeTable
group entry by entry.SomeValue into grp
where grp.Sum(x => x.OtherValue) > 0
select grp.Key;

Is this LINQ Query "correct"?

I have the following LINQ query, that is returning the results that I expect, but it does not "feel" right.
Basically it is a left join. I need ALL records from the UserProfile table.
Then the LastWinnerDate is a single record from the winner table (possible multiple records) indicating the DateTime the last record was entered in that table for the user.
WinnerCount is the number of records for the user in the winner table (possible multiple records).
Video1 is basically a bool indicating there is, or is not a record for the user in the winner table matching on a third table Objective (should be 1 or 0 rows).
Quiz1 is same as Video 1 matching another record from Objective Table (should be 1 or 0 rows).
Video and Quiz is repeated 12 times because it is for a report to be displayed to a user listing all user records and indicate if they have met the objectives.
var objectiveIds = new List<int>();
objectiveIds.AddRange(GetObjectiveIds(objectiveName, false));
var q =
from up in MetaData.UserProfile
select new RankingDTO
{
UserId = up.UserID,
FirstName = up.FirstName,
LastName = up.LastName,
LastWinnerDate = (
from winner in MetaData.Winner
where objectiveIds.Contains(winner.ObjectiveID)
where winner.Active
where winner.UserID == up.UserID
orderby winner.CreatedOn descending
select winner.CreatedOn).First(),
WinnerCount = (
from winner in MetaData.Winner
where objectiveIds.Contains(winner.ObjectiveID)
where winner.Active
where winner.UserID == up.UserID
orderby winner.CreatedOn descending
select winner).Count(),
Video1 = (
from winner in MetaData.Winner
join o in MetaData.Objective on winner.ObjectiveID equals o.ObjectiveID
where o.ObjectiveNm == Constants.Promotions.SecVideo1
where winner.Active
where winner.UserID == up.UserID
select winner).Count(),
Quiz1 = (
from winner2 in MetaData.Winner
join o2 in MetaData.Objective on winner2.ObjectiveID equals o2.ObjectiveID
where o2.ObjectiveNm == Constants.Promotions.SecQuiz1
where winner2.Active
where winner2.UserID == up.UserID
select winner2).Count(),
};
You're repeating the join to the winner table several times. To avoid that, you can break the query into several consecutive selects: instead of one huge select, you write two with less code. In your example I would first select the winners collection before selecting the other result properties:
var q1 =
from up in MetaData.UserProfile
select new {up,
winners = from winner in MetaData.Winner
where winner.Active
where winner.UserID == up.UserID
select winner};
var q = from upWinnerPair in q1
select new RankingDTO
{
UserId = upWinnerPair.up.UserID,
FirstName = upWinnerPair.up.FirstName,
LastName = upWinnerPair.up.LastName,
LastWinnerDate = /* Here you will have more simple and less repeatable code
using winners collection from "upWinnerPair.winners"*/
The query itself is pretty simple: just a main outer query and a series of subselects to retrieve actual column data. While it's not the most efficient means of querying the data you're after (joins and using windowing functions will likely get you better performance), it's the only real way to represent that query using either the query or expression syntax (windowing functions in SQL have no mapping in LINQ or the LINQ-supporting extension methods).
Note that you aren't doing any actual outer joins (left or right) in your code; you're creating subqueries to retrieve the column data. It might be worth looking at the actual SQL being generated by your query. You don't specify which ORM you're using (which would determine how to examine it client-side) or which database you're using (which would determine how to examine it server-side).
If you're using the ADO.NET Entity Framework, you can cast your query to an ObjectQuery and call ToTraceString().
If you're using SQL Server, you can use SQL Server Profiler (assuming you have access to it) to view the SQL being executed, or you can run a trace manually to do the same thing.
To perform an outer join in LINQ query syntax, do this:
Assuming we have two sources alpha and beta, each having a common Id property, you can select from alpha and perform a left join on beta in this way:
from a in alpha
join btemp in beta on a.Id equals btemp.Id into bleft
from b in bleft.DefaultIfEmpty()
select new { IdA = a.Id, IdB = b.Id }
Admittedly, the syntax is a little oblique. Nonetheless, it works and will be translated into something like this in SQL:
select
a.Id as IdA,
b.Id as IdB
from alpha a
left join beta b on a.Id = b.Id
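One caveat is worth sketching. When the same pattern runs in memory with LINQ to Objects, b is null for alpha rows that have no match, so reading b.Id directly throws a NullReferenceException. The following sketch guards against that; the Alpha and Beta classes are hypothetical stand-ins for the two sources, and the tuple result type is just one convenient choice.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical element types for the two sources.
public sealed class Alpha { public int Id { get; set; } }
public sealed class Beta { public int Id { get; set; } }

public static class LeftJoinSketch
{
    // Returns one (IdA, IdB) pair per alpha row; IdB is null when no beta row matches.
    public static List<(int IdA, int? IdB)> Run(IEnumerable<Alpha> alpha, IEnumerable<Beta> beta)
    {
        var query = from a in alpha
                    join btemp in beta on a.Id equals btemp.Id into bleft
                    // DefaultIfEmpty supplies a single null when bleft has no elements.
                    from b in bleft.DefaultIfEmpty()
                    select (IdA: a.Id, IdB: b == null ? (int?)null : b.Id);

        return query.ToList();
    }
}
```

With alpha IDs 1 and 2 and a single beta ID 2, the result is (1, null) and (2, 2), which matches what a SQL left join would return.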
It looks fine to me, though I could see why the multiple sub-queries could trigger inefficiency worries in the eyes of a coder.
Take a look at what SQL is produced though (I'm guessing you're running this against a database source from your saying "table" above), before you start worrying about that. The query providers can be pretty good at producing nice efficient SQL that in turn produces a good underlying database query, and if that's happening, then happy days (it will also give you another view on being sure of the correctness).
