QueryOver with dynamic join and disjunction - c#

I need to create a query that, depending on the input of the method, uses a join or not. Given the following model:
Account
{
Group Group;
Account KeyUserInternal;
Account KeyUserExternal;
DateTime DeactivationDate;
}
Group
{
Account KeyUserInternal;
Account KeyUserExternal;
}
I want to either
get all Accounts for which a given other Account is entered as either KeyUser
or
get all Accounts for which a given other Account is entered as either KeyUser of the Account itself or as either KeyUser of the Account's Group
and then, depending on a flag, filter that result on DeactivationDate to include only active Accounts or all of them.
For this, I tried it using this method:
public IList<Account> ListByKeyUser(int keyUserId_, bool includeGroupAccounts_, bool onlyActive_)
{
Account keyUser = Get(keyUserId_);
Disjunction keyUserRestriction = new Disjunction();
keyUserRestriction.Add<Account>(acc_ => acc_.KeyUserInternal == keyUser || acc_.KeyUserExternal == keyUser);
IQueryOver<Account, Account> query = Session.QueryOver<Account>();
if (includeGroupAccounts_) {
query.JoinQueryOver(acc_ => acc_.Group, JoinType.LeftOuterJoin)
.Where(grp_ => grp_.KeyUserInternal == keyUser || grp_.KeyUserExternal == keyUser);
}
query.Where(keyUserRestriction);
if (onlyActive_) {
query.Where(acc_ => acc_.DeactivationDate > DateTime.Now);
}
return query.OrderBy(acc_ => acc_.Name).Asc.List<Account>();
}
Unfortunately, the created SQL is not exactly what I need (I excluded the SELECT list, as I don't think it's relevant):
SELECT [...]
FROM accounts this_ left outer join groups group1_ on this_.userGroup=group1_.id
WHERE
(group1_.keyUserInternal = @p0 or group1_.keyUserExternal = @p1)
and
((this_.keyUserInternal = @p2 or this_.keyUserExternal = @p3))
and this_.deactivationDate > @p4
ORDER BY this_.name asc;
What I need is this:
SELECT [...]
FROM accounts this_ left outer join groups group1_ on this_.userGroup=group1_.id
WHERE
((group1_.keyUserInternal = @p0 or group1_.keyUserExternal = @p1)
or (this_.keyUserInternal = @p2 or this_.keyUserExternal = @p3))
and this_.deactivationDate > @p4
ORDER BY this_.name asc;
Basically, I just need to somehow move the "join condition" into the "or". I tried it by adding the where into the disjunction:
if (includeGroupAccounts_) {
query.JoinQueryOver(acc_ => acc_.Group, JoinType.LeftOuterJoin);
keyUserRestriction.Add<Account>(acc_ => acc_.Group.KeyUserInternal == keyUser || acc_.Group.KeyUserExternal == keyUser);
}
But that creates:
SELECT [...]
FROM accounts this_ left outer join groups group1_ on this_.userGroup=group1_.id
WHERE
((this_.keyUserInternal = @p0 or this_.keyUserExternal = @p1)
or (this_.keyUserInternal = @p2 or this_.keyUserExternal = @p3))
and this_.deactivationDate > @p4
ORDER BY this_.name asc;
Which totally ignores the Group join...
How can I make this work?

A bit more digging returned the following as a solution:
if (includeGroupAccounts_) {
Group groupAlias = null;
query.JoinAlias(acc_ => acc_.Group, () => groupAlias, JoinType.LeftOuterJoin);
keyUserRestriction.Add(() => groupAlias.KeyUserExternal == keyUser || groupAlias.KeyUserInternal == keyUser);
}
This results in exactly the SQL I needed, where the "join where" is contained in the "or", while the "datetime check" is still an "and" for everything.
SELECT [...]
FROM accounts this_ left outer join groups groupalias1_ on this_.userGroup=groupalias1_.id
WHERE (
(this_.keyUserInternal = @p0 or this_.keyUserExternal = @p1)
or (groupalias1_.keyUserExternal = @p2 or groupalias1_.keyUserInternal = @p3)
)
and this_.deactivationDate > @p4
ORDER BY this_.name asc
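To make the intended predicate shape concrete, here is a minimal in-memory sketch using LINQ to Objects rather than NHibernate. The tuple fields (OwnKeyUser, GroupKeyUser, Active) and the sample rows are hypothetical stand-ins for the real model; the point is only that the group condition joins the OR list, while the activity check stays an AND over everything.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for Account rows: the account's own key user,
// the key user of its group (null when there is no group), and an active flag.
var accounts = new[]
{
    (Name: "A", OwnKeyUser: (int?)7, GroupKeyUser: (int?)null, Active: true),
    (Name: "B", OwnKeyUser: (int?)3, GroupKeyUser: (int?)7,    Active: true),
    (Name: "C", OwnKeyUser: (int?)3, GroupKeyUser: (int?)7,    Active: false),
    (Name: "D", OwnKeyUser: (int?)3, GroupKeyUser: (int?)null, Active: true),
};

List<string> ListByKeyUser(int keyUserId, bool includeGroupAccounts, bool onlyActive)
{
    // Alternatives that are OR-ed together (the role of the Disjunction).
    var anyOf = new List<Func<(string Name, int? OwnKeyUser, int? GroupKeyUser, bool Active), bool>>
    {
        a => a.OwnKeyUser == keyUserId
    };
    if (includeGroupAccounts)
        anyOf.Add(a => a.GroupKeyUser == keyUserId);

    return accounts
        .Where(a => anyOf.Any(p => p(a)))     // (own key user OR group key user)
        .Where(a => !onlyActive || a.Active)  // AND active, applied to everything
        .OrderBy(a => a.Name)
        .Select(a => a.Name)
        .ToList();
}

Console.WriteLine(string.Join(",", ListByKeyUser(7, false, true))); // A
Console.WriteLine(string.Join(",", ListByKeyUser(7, true, true)));  // A,B
```

Account B is only found once the group alternative is part of the OR list, which is the behaviour the JoinAlias version produces in SQL.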

Related

Is it possible to convert this SQL query to linq?

I need to count three values on a single table. In plain SQL, it is written like this way:
select
count (*) as num_products,
sum(case when CreatedAt > '{sql.ToSqlDate(_CreatedAfter)}' then 1 else 0 end) num_new,
sum(case when UpdatedAt > '{sql.ToSqlDate(_UpdatedAfter)}' then 1 else 0 end) num_updated
from
Products
While switching to EF Core, I tried to convert it to Linq, like this
var res = (from p in _db.Products
let total = _db.Products.Count()
let NewProducts = _db.Products.Count(s => s.CreatedAt > crDate.Date)
let UpdatedProducts = _db.Products.Count(s => s.UpdatedAt > updDate.Date)
select new { total, NewProducts, UpdatedProducts } );
var response = res.ToList();
but the resulting SQL query seems not optimized
SELECT
(SELECT COUNT(*) FROM [Products] AS [p0]) AS [total],
(SELECT COUNT(*) FROM [Products] AS [s]
WHERE [s].[CreatedAt] > '2019-07-31') AS [NewProducts],
(SELECT COUNT(*) FROM [Products] AS [s0]
WHERE [s0].[UpdatedAt] > '2019-07-01') AS [UpdatedProducts]
FROM
[Products] AS [p]
Maybe somebody can help to translate the original SQL query to linq?
tia
ish
A more literal translation of that query, that generates a query more likely to execute in a single scan of the target table would be:
var q =
from p in db.Products
select new
{
p.Id,
NewProduct = p.CreatedAt > DateTime.Parse("2019-07-31") ? 1 : 0,
UpdatedProduct = p.UpdatedAt > DateTime.Parse("2019-07-01") ? 1 : 0
} into counts
group counts by 1 into grouped
select new
{
ProductCount = grouped.Count(),
NewProductCount = grouped.Sum(r => r.NewProduct),
UpdatedProductCount = grouped.Sum(r => r.UpdatedProduct)
};
Which translates to something like:
SELECT COUNT(*) AS [ProductCount],
SUM([t].[NewProduct]) AS [NewProductCount],
SUM([t].[UpdatedProduct]) AS [UpdatedProductCount]
FROM (
SELECT [p].[Id], CASE
WHEN [p].[CreatedAt] > @__Parse_0
THEN 1 ELSE 0
END AS [NewProduct], CASE
WHEN [p].[UpdatedAt] > @__Parse_1
THEN 1 ELSE 0
END AS [UpdatedProduct], 1 AS [Key]
FROM [Products] AS [p]
) AS [t]
GROUP BY [t].[Key]
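The same "project flags, group by a constant, sum" pattern can be tried without a database. This is a minimal LINQ to Objects sketch with made-up sample rows, not EF Core output. It also shows one caveat: grouping by a constant yields no group at all for an empty source, so the result should be read defensively.

```csharp
using System;
using System.Linq;

// Hypothetical sample rows standing in for the Products table.
(int Id, DateTime CreatedAt, DateTime UpdatedAt)[] products =
{
    (1, new DateTime(2019, 6, 1), new DateTime(2019, 6, 15)),
    (2, new DateTime(2019, 6, 1), new DateTime(2019, 7, 10)),
    (3, new DateTime(2019, 8, 2), new DateTime(2019, 8, 2)),
};
var createdAfter = new DateTime(2019, 7, 31);
var updatedAfter = new DateTime(2019, 7, 1);

// Project 0/1 flags, put every row under one constant key, then aggregate.
var counts = products
    .Select(p => new
    {
        IsNew = p.CreatedAt > createdAfter ? 1 : 0,
        IsUpdated = p.UpdatedAt > updatedAfter ? 1 : 0
    })
    .GroupBy(_ => 1)
    .Select(g => new
    {
        Total = g.Count(),
        New = g.Sum(r => r.IsNew),
        Updated = g.Sum(r => r.IsUpdated)
    })
    .ToList();

// An empty source produces no group, so guard before reading the row.
var row = counts.SingleOrDefault();
Console.WriteLine(row is null ? "no rows" : $"{row.Total} {row.New} {row.Updated}"); // 3 1 2
```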
You do not need a from clause in your LINQ query because you aren't iterating over the rows. Just use three statements:
var total = _db.Products.Count();
var NewProducts = _db.Products.Count(s => s.CreatedAt > crDate.Date);
var UpdatedProducts = _db.Products.Count(s => s.UpdatedAt > updDate.Date) ;

Linq query in EF Core 2 when using joins and pagination order by column alias doesn't work

We are currently trying to write out sorting into our server-side pagination using Linq and EF Core 2. We are running into an issue where the column alias being produced by Linq does not work while using pagination. However if we do not paginate it works as intended.
All of the columns within the outputted queries are aliases as we have different property names in the model and database column names are different, but this shouldn't make a difference to our knowledge.
This is the Linq query without the pagination:
var source = from p in _ppmRepository.GetAll()
join jt in _jobTypeRepository.GetAll() on p.PpmFkeyInSeq equals jt.Id into jtdata
from jt in jtdata.DefaultIfEmpty()
join a in _assetRepository.GetAll() on p.PpmFkeyArSeq equals a.Id into aData
from a in aData.DefaultIfEmpty()
where p.PpmFkeyBgSeq == bldId
orderby p.PpmFreq
select new BuildingPpmListViewModel
{
PpmId = p.Id,
PpmFreq = p.PpmFreq,
PpmNextService = p.PpmNextService,
TotalCost = p.TotalCost,
PpmPeriodUnits = p.PpmPeriodUnits,
PpmFkeyPriDesc = p.PpmFkeyPriDesc,
JtTitle = jt.JtTitle,
AssetId = p.PpmFkeyArSeq,
AssetDescription = a.AssetDescription,
IsDeleted = p.IsDeleted
};
source = source.Where(i => i.JtTitle.Contains("audit") && i.AssetDescription.Contains("df"));
This is the outputted query produced by ef core which works:
SELECT [p].[PPM_SEQ] AS [PpmId], [p].[PPM_FREQ] AS [PpmFreq], [p].[PPM_NEXT_SERVICE] AS [PpmNextService],
CAST([p].[TotalCost] AS float) AS [TotalCost], [p].[PPM_PERIOD_UNITS] AS [PpmPeriodUnits], [p].[PPM_FKEY_PRI_DESC] AS [PpmFkeyPriDesc],
[t].[jt_title] AS [JtTitle], [p].[PPM_FKEY_AR_SEQ] AS [AssetId], [t0].[AR_DESCRIPTION] AS [AssetDescription], [p].[Deleted] AS [IsDeleted]
FROM [PPMs] AS [p]
LEFT JOIN (
SELECT [j].*
FROM [JobTypes] AS [j]
) AS [t] ON [p].[PPM_FKEY_IN_SEQ] = [t].[jt_seq]
LEFT JOIN (
SELECT [a].*
FROM [Assets] AS [a]
) AS [t0] ON [p].[PPM_FKEY_AR_SEQ] = [t0].[ar_seq]
WHERE ([p].[PPM_FKEY_BG_SEQ] = 172) AND ((CHARINDEX(N'audit', [t].[jt_title]) > 0) AND (CHARINDEX(N'df', [t0].[AR_DESCRIPTION]) > 0))
ORDER BY [PpmFreq]
This is the Linq query with the pagination:
var source = from p in _ppmRepository.GetAll()
join jt in _jobTypeRepository.GetAll() on p.PpmFkeyInSeq equals jt.Id into jtdata
from jt in jtdata.DefaultIfEmpty()
join a in _assetRepository.GetAll() on p.PpmFkeyArSeq equals a.Id into aData
from a in aData.DefaultIfEmpty()
where p.PpmFkeyBgSeq == bldId
orderby p.PpmFreq
select new BuildingPpmListViewModel
{
PpmId = p.Id,
PpmFreq = p.PpmFreq,
PpmNextService = p.PpmNextService,
TotalCost = p.TotalCost,
PpmPeriodUnits = p.PpmPeriodUnits,
PpmFkeyPriDesc = p.PpmFkeyPriDesc,
JtTitle = jt.JtTitle,
AssetId = p.PpmFkeyArSeq,
AssetDescription = a.AssetDescription,
IsDeleted = p.IsDeleted
};
source = source.Where(i => i.JtTitle.Contains("audit") && i.AssetDescription.Contains("df")).Skip(0).Take(50);
This is the output with pagination, where the ORDER BY inside the OVER clause refers to PpmFreq, the alias of [p].[PPM_FREQ], which SQL Server cannot find:
SELECT [t1].[PpmId], [t1].[PpmFreq], [t1].[PpmNextService], [t1].[TotalCost], [t1].[PpmPeriodUnits],
[t1].[PpmFkeyPriDesc], [t1].[JtTitle], [t1].[AssetId], [t1].[AssetDescription], [t1].[IsDeleted]
FROM (
SELECT [p].[PPM_SEQ] AS [PpmId], [p].[PPM_FREQ] AS [PpmFreq], [p].[PPM_NEXT_SERVICE] AS [PpmNextService],
CAST([p].[TotalCost] AS float) AS [TotalCost], [p].[PPM_PERIOD_UNITS] AS [PpmPeriodUnits], [p].[PPM_FKEY_PRI_DESC] AS
[PpmFkeyPriDesc], [t].[jt_title] AS [JtTitle], [p].[PPM_FKEY_AR_SEQ] AS [AssetId], [t0].[AR_DESCRIPTION] AS [AssetDescription],
[p].[Deleted] AS [IsDeleted], ROW_NUMBER() OVER(ORDER BY [PpmFreq]) AS [__RowNumber__]
FROM [PPMs] AS [p]
LEFT JOIN (
SELECT [j].*
FROM [JobTypes] AS [j]
) AS [t] ON [p].[PPM_FKEY_IN_SEQ] = [t].[jt_seq]
LEFT JOIN (
SELECT [a].*
FROM [Assets] AS [a]
) AS [t0] ON [p].[PPM_FKEY_AR_SEQ] = [t0].[ar_seq]
WHERE (([p].[PPM_FKEY_BG_SEQ] = 172)) AND ((CHARINDEX(N'audit', [t].[jt_title]) > 0)
AND (CHARINDEX(N'df', [t0].[AR_DESCRIPTION]) > 0))
) AS [t1]
WHERE ([t1].[__RowNumber__] > 0) AND ([t1].[__RowNumber__] <= (50))
This looks to be where our issues are coming from as we can slightly modify it to get a correct result from the database:
ROW_NUMBER() OVER(ORDER BY [PpmFreq]) AS [__RowNumber__]
If we modify the above statement to use the table alias and column name, [p].[PPM_FREQ], like so: ROW_NUMBER() OVER(ORDER BY [p].[PPM_FREQ]) AS [__RowNumber__], then our issues are resolved, but that doesn't seem possible with our current LINQ query.
See if the following works better:
var source = (from p in _ppmRepository.GetAll()
join jt in _jobTypeRepository.GetAll() on p.PpmFkeyInSeq equals jt.Id into jtdata
from jt in jtdata.DefaultIfEmpty()
join a in _assetRepository.GetAll() on p.PpmFkeyArSeq equals a.Id into aData
from a in aData.DefaultIfEmpty()
select new BuildingPpmListViewModel
{
PpmId = p.Id,
PpBgSeq = p.PpmFkeyBgSeq,
PpmFreq = p.PpmFreq,
PpmNextService = p.PpmNextService,
TotalCost = p.TotalCost,
PpmPeriodUnits = p.PpmPeriodUnits,
PpmFkeyPriDesc = p.PpmFkeyPriDesc,
JtTitle = jt.JtTitle,
AssetId = p.PpmFkeyArSeq,
AssetDescription = a.AssetDescription,
IsDeleted = p.IsDeleted
})
.Where(x => x.PpBgSeq == bldId)
.OrderBy(x => x.PpmFreq)
.ToList();
This turned out to be a known issue, which we later filed with the EF Core team directly.
This is a known issue which has been fixed for the upcoming 2.1 release.
You can see more details and a possible workaround here:
github.com/aspnet/EntityFrameworkCore/issues/9535
Smit Patel
If you run a nightly build, the above issue is fixed.

Why are multiple where in LINQ so slow?

Using C# and Linq to SQL, I found that my query with multiple where is orders of magnitude slower than with a single where / and.
Here is the query
using (TeradiodeDataContext dc = new TeradiodeDataContext())
{
var filterPartNumberID = 71;
var diodeIDsInBlades = (from bd in dc.BladeDiodes
select bd.DiodeID.Value).Distinct();
var diodesWithTestData = (from t in dc.Tests
join tt in dc.TestTypes on t.TestTypeID equals tt.ID
where tt.DevicePartNumberID == filterPartNumberID
select t.DeviceID.Value).Distinct();
var result = (from d in dc.Diodes
where d.DevicePartNumberID == filterPartNumberID
where diodesWithTestData.Contains(d.ID)
where !diodeIDsInBlades.Contains(d.ID)
orderby d.Name
select d);
var list = result.ToList();
// ~15 seconds
}
However, when the condition in the final query is this
where d.DevicePartNumberID == filterPartNumberID
& diodesWithTestData.Contains(d.ID)
& !diodeIDsInBlades.Contains(d.ID)
// milliseconds
it is very fast.
Comparing the SQL in result before calling ToList(), here are the queries (value 71 manually added in place of @params)
-- MULTIPLE WHERE
SELECT [t0].[ID], [t0].[Name], [t0].[M2MID], [t0].[DevicePartNumberID], [t0].[Comments], [t0].[Hold]
FROM [dbo].[Diode] AS [t0]
WHERE (NOT (EXISTS(
SELECT NULL AS [EMPTY]
FROM (
SELECT DISTINCT [t2].[value]
FROM (
SELECT [t1].[DiodeID] AS [value]
FROM [dbo].[BladeDiode] AS [t1]
) AS [t2]
) AS [t3]
WHERE [t3].[value] = [t0].[ID]
))) AND (EXISTS(
SELECT NULL AS [EMPTY]
FROM (
SELECT DISTINCT [t6].[value]
FROM (
SELECT [t4].[DeviceID] AS [value], [t5].[DevicePartNumberID]
FROM [dbo].[Test] AS [t4]
INNER JOIN [dbo].[TestType] AS [t5] ON [t4].[TestTypeID] = ([t5].[ID])
) AS [t6]
WHERE [t6].[DevicePartNumberID] = (71)
) AS [t7]
WHERE [t7].[value] = [t0].[ID]
)) AND ([t0].[DevicePartNumberID] = 71)
ORDER BY [t0].[Name]
and
-- SINGLE WHERE
SELECT [t0].[ID], [t0].[Name], [t0].[M2MID], [t0].[DevicePartNumberID], [t0].[Comments], [t0].[Hold]
FROM [dbo].[Diode] AS [t0]
WHERE ([t0].[DevicePartNumberID] = 71) AND (EXISTS(
SELECT NULL AS [EMPTY]
FROM (
SELECT DISTINCT [t3].[value]
FROM (
SELECT [t1].[DeviceID] AS [value], [t2].[DevicePartNumberID]
FROM [dbo].[Test] AS [t1]
INNER JOIN [dbo].[TestType] AS [t2] ON [t1].[TestTypeID] = ([t2].[ID])
) AS [t3]
WHERE [t3].[DevicePartNumberID] = (71)
) AS [t4]
WHERE [t4].[value] = [t0].[ID]
)) AND (NOT (EXISTS(
SELECT NULL AS [EMPTY]
FROM (
SELECT DISTINCT [t6].[value]
FROM (
SELECT [t5].[DiodeID] AS [value]
FROM [dbo].[BladeDiode] AS [t5]
) AS [t6]
) AS [t7]
WHERE [t7].[value] = [t0].[ID]
)))
ORDER BY [t0].[Name]
The two SQL queries execute in < 1 second in SSMS and produce the same results.
So I'm wondering why the first is slower on the LINQ side. It's worrying to me because I know I've used multiple where elsewhere, without being aware of such a severe performance impact.
This question even has answers using both multiple & and multiple where, and this answer even suggests using multiple where clauses.
Can anyone explain why this happens in my case?
Because writing it like this
if (someParam1 != 0)
{
myQuery = myQuery.Where(q => q.SomeField1 == someParam1);
}
if (someParam2 != 0)
{
myQuery = myQuery.Where(q => q.SomeField2 == someParam2);
}
is NOT (see the update) the same as the following (in the case where someParam1 and someParam2 are both != 0)
myQuery = from t in Table
where t.SomeField1 == someParam1
&& t.SomeField2 == someParam2
select t;
but it is the same as
myQuery = from t in Table
where t.SomeField1 == someParam1
where t.SomeField2 == someParam2
select t;
UPD
Yes, I made a mistake. The second query is the same; the first is not.
The first and second queries are not EXACTLY the same. Let me show you what I mean.
The 1st query, written with a lambda expression:
t.Where(r => r.SomeField1 == someParam1 && r.SomeField2 == someParam2)
The 2nd query:
t.Where(r => r.SomeField1 == someParam1).Where(r => r.SomeField2 == someParam2)
In this case the predicate on SomeField2 comes first in the generated SQL (this is important, see below).
In the 1st case we get this SQL:
SELECT <all field from Table>
FROM table t
WHERE t.SomeField1 = :someParam1
AND t.SomeField2 = :someParam2
In the 2nd case the SQL is:
SELECT <all field from Table>
FROM table t
WHERE t.SomeField2 = :someParam2
AND t.SomeField1 = :someParam1
As we can see, these are 2 'same' SQL statements. The OP's SQL statements are also the 'same': they differ only in the order of the predicates in the WHERE clause (as in my example). My guess is that the SQL optimizer generates 2 different execution plans, and maybe(!!!) doing NOT EXISTS, then EXISTS, and then filtering takes more time than filtering first and doing EXISTS and NOT EXISTS afterwards.
UPD2
It is a 'problem' of the LINQ provider (ORM). I'm using another ORM (linq2db), and it generates EXACTLY the same SQL for me in both cases.
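As a sanity check that the two forms are logically equivalent and differ only in how a provider renders them, here is a small LINQ to Objects sketch. The table contents are made up, and no SQL is generated here, so it says nothing about execution plans.

```csharp
using System;
using System.Linq;

// Hypothetical rows; the field names follow the example above.
(int SomeField1, int SomeField2)[] table = { (1, 2), (1, 3), (2, 2), (0, 2) };
int someParam1 = 1, someParam2 = 2;

// One Where with both conditions combined by &&.
var combined = table
    .Where(r => r.SomeField1 == someParam1 && r.SomeField2 == someParam2)
    .ToList();

// Conditions appended one by one, as in the conditional-filter pattern.
var query = table.AsEnumerable();
if (someParam1 != 0) query = query.Where(r => r.SomeField1 == someParam1);
if (someParam2 != 0) query = query.Where(r => r.SomeField2 == someParam2);
var chained = query.ToList();

Console.WriteLine(combined.SequenceEqual(chained)); // True
```

Any difference in speed therefore comes from the SQL text the provider emits and the plan the database picks for it, not from the meaning of the query.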

Use parameter as column in where in SQL Server stored procedure

BEGIN
DECLARE @SQLQuery AS NVARCHAR(MAX)
IF(@Search IS NOT NULL)
BEGIN
DECLARE @dyColumn sysname ;
IF(@Filter = 'IsNew')
BEGIN
SET @dyColumn = 'IsNew'
END
ELSE IF(@Filter = 'IsOnSale')
BEGIN
SET @dyColumn = 'IsOnSale'
END
ELSE IF(@Filter = 'IsFeatured')
BEGIN
SET @dyColumn = 'IsFeatured'
END
SET @SQLQuery = 'SELECT P.*, C.Id AS CategoryId, C.Name AS CategoryName, C.Logo AS CategoryLogo,
CO.Id AS CompanyId, CO.Name AS CompanyName, CO.Logo AS CompanyLogo, COUNT(*) OVER() TotalCount
FROM Products P
JOIN Categories C ON P.CategoryId = C.Id
JOIN Companies CO ON P.CompanyId = CO.Id
WHERE P.Name LIKE %'+@Search+'% AND '+@dyColumn+' = true
ORDER BY P.Name
OFFSET '+CAST(@PageSize AS nvarchar(100))+'*('+CAST(@PageNumber AS nvarchar(100)) +'- 1) ROWS
FETCH NEXT '+CAST(@PageSize AS nvarchar(100))+'ROWS ONLY OPTION (RECOMPILE);'
EXECUTE(@SQLQuery)
END
This is the query, and it's giving this error at run time:
Incorrect syntax near '2'.
Invalid usage of the option NEXT in the FETCH statement.
which means the query goes wrong after
WHERE P.Name LIKE %'+@Search+'%
You don't need dynamic SQL for this. Just test the value of the filter parameter in combination with the column in your WHERE expression:
SELECT
P.*
,C.Id AS CategoryId
,C.Name AS CategoryName
,C.Logo AS CategoryLogo
,CO.Id AS CompanyId
,CO.Name AS CompanyName
,CO.Logo AS CompanyLogo
,COUNT(*) OVER() TotalCount
FROM
Products P
JOIN Categories C
ON P.CategoryId = C.Id
JOIN Companies CO
ON P.CompanyId = CO.Id
WHERE
@Search IS NOT NULL
AND P.Name LIKE '%' + @Search + '%'
AND (
(@Filter= 'IsNew' AND IsNew = 1)
OR (@Filter= 'IsOnSale' AND IsOnSale = 1)
OR (@Filter= 'IsFeatured' AND IsFeatured = 1)
OR (@Filter NOT IN ('IsNew','IsOnSale','IsFeatured'))
)
ORDER BY
P.Name
OFFSET (@PageSize) * (@PageNumber - 1) ROWS
FETCH NEXT (@PageSize) ROWS ONLY OPTION (RECOMPILE);
If you really want to use dynamic SQL, then instead of executing it, SELECT @SQLQuery and look for the syntax issues by copying the output to another query window.
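The paging arithmetic is easy to get wrong, so here is a small sketch of the offset the question's dynamic SQL computes, PageSize*(PageNumber - 1), assuming 1-based page numbers. The Offset helper and the 25-row list are hypothetical; Skip and Take play the role of OFFSET and FETCH NEXT.

```csharp
using System;
using System.Linq;

// Rows to skip before page `pageNumber` (1-based) when each page holds `pageSize` rows.
int Offset(int pageSize, int pageNumber) => pageSize * (pageNumber - 1);

// Hypothetical, already ordered result set of 25 rows.
var rows = Enumerable.Range(1, 25).ToList();

// Skip/Take mirror OFFSET ... ROWS / FETCH NEXT ... ROWS ONLY.
var page2 = rows.Skip(Offset(10, 2)).Take(10).ToList();

Console.WriteLine($"{page2.First()}..{page2.Last()}"); // 11..20
```

Without the parentheses around the subtraction, 10 * 2 - 1 gives 19, and page 2 would start at row 20 instead of row 11.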

LINQ to SQL using GROUP BY and COUNT(DISTINCT)

I have to perform the following SQL query:
select answer_nbr, count(distinct user_nbr)
from tpoll_answer
where poll_nbr = 16
group by answer_nbr
The LINQ to SQL query
from a in tpoll_answer
where a.poll_nbr = 16 select a.answer_nbr, a.user_nbr distinct
maps to the following SQL query:
select distinct answer_nbr, distinct user_nbr
from tpoll_answer
where poll_nbr = 16
So far, so good. However, the problem arises when trying to GROUP the results, as I'm not able to find a LINQ to SQL query that maps to the first query I wrote here (thank you LINQPad for making this process a lot easier). The following is the only one I've found that gives me the desired result:
from answer in tpoll_answer where answer.poll_nbr = 16 _
group by a_id = answer.answer_nbr into votes = count(answer.user_nbr)
Which in turn produces the following ugly and completely non-optimized SQL query:
SELECT [t1].[answer_nbr] AS [a_id], (
SELECT COUNT(*)
FROM (
SELECT CONVERT(Bit,[t2].[user_nbr]) AS [value], [t2].[answer_nbr], [t2].[poll_nbr]
FROM [TPOLL_ANSWER] AS [t2]
) AS [t3]
WHERE ([t3].[value] = 1) AND ([t1].[answer_nbr] = [t3].[answer_nbr]) AND ([t3].[poll_nbr] = @p0)
) AS [votes]
FROM (
SELECT [t0].[answer_nbr]
FROM [TPOLL_ANSWER] AS [t0]
WHERE [t0].[poll_nbr] = @p0
GROUP BY [t0].[answer_nbr]
) AS [t1]
-- @p0: Input Int (Size = 0; Prec = 0; Scale = 0) [16]
-- Context: SqlProvider(Sql2008) Model: AttributedMetaModel Build: 3.5.30729.1
Any help will be more than appreciated.
There isn't direct support for COUNT(DISTINCT {x}), but you can simulate it from an IGrouping<,> (i.e. what group by returns); I'm afraid I only "do" C#, so you'll have to translate to VB...
select new
{
Foo= grp.Key,
Bar= grp.Select(x => x.SomeField).Distinct().Count()
};
Here's a Northwind example:
using(var ctx = new DataClasses1DataContext())
{
ctx.Log = Console.Out; // log TSQL to console
var qry = from cust in ctx.Customers
where cust.CustomerID != ""
group cust by cust.Country
into grp
select new
{
Country = grp.Key,
Count = grp.Select(x => x.City).Distinct().Count()
};
foreach(var row in qry.OrderBy(x=>x.Country))
{
Console.WriteLine("{0}: {1}", row.Country, row.Count);
}
}
The TSQL isn't quite what we'd like, but it does the job:
SELECT [t1].[Country], (
SELECT COUNT(*)
FROM (
SELECT DISTINCT [t2].[City]
FROM [dbo].[Customers] AS [t2]
WHERE ((([t1].[Country] IS NULL) AND ([t2].[Country] IS NULL)) OR (([t1].[Country] IS NOT NULL) AND ([t2].[Country] IS NOT NULL) AND ([t1].[Country] = [t2].[Country]))) AND ([t2].[CustomerID] <> @p0)
) AS [t3]
) AS [Count]
FROM (
SELECT [t0].[Country]
FROM [dbo].[Customers] AS [t0]
WHERE [t0].[CustomerID] <> @p0
GROUP BY [t0].[Country]
) AS [t1]
-- @p0: Input NVarChar (Size = 0; Prec = 0; Scale = 0) []
-- Context: SqlProvider(Sql2008) Model: AttributedMetaModel Build: 3.5.30729.1
The results, however, are correct, verifiable by running it manually:
const string sql = @"
SELECT c.Country, COUNT(DISTINCT c.City) AS [Count]
FROM Customers c
WHERE c.CustomerID != ''
GROUP BY c.Country
ORDER BY c.Country";
var qry2 = ctx.ExecuteQuery<QueryResult>(sql);
foreach(var row in qry2)
{
Console.WriteLine("{0}: {1}", row.Country, row.Count);
}
With definition:
class QueryResult
{
public string Country { get; set; }
public int Count { get; set; }
}
The Northwind example cited by Marc Gravell can be rewritten with the City column selected directly by the group statement:
from cust in ctx.Customers
where cust.CustomerID != ""
group cust.City /*here*/ by cust.Country
into grp
select new
{
Country = grp.Key,
Count = grp.Distinct().Count()
};
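Applied to the shape of the original question (distinct users per answer for one poll), the same idea can be tried in memory. This is a LINQ to Objects sketch with made-up rows; the tuple field names are assumptions modelled on the tpoll_answer columns, and the SQL a LINQ to SQL provider would emit is not shown.

```csharp
using System;
using System.Linq;

// Hypothetical rows standing in for tpoll_answer.
(int PollNbr, int AnswerNbr, int UserNbr)[] answers =
{
    (16, 1, 100),
    (16, 1, 100), // same user voting twice for answer 1
    (16, 1, 101),
    (16, 2, 100),
    (17, 1, 102), // different poll, filtered out
};

var votes = answers
    .Where(a => a.PollNbr == 16)
    // Key = answer number, element = user number, so each group holds only user ids.
    .GroupBy(a => a.AnswerNbr, a => a.UserNbr)
    .Select(g => new { AnswerNbr = g.Key, Users = g.Distinct().Count() })
    .OrderBy(v => v.AnswerNbr)
    .ToList();

foreach (var v in votes)
    Console.WriteLine($"{v.AnswerNbr}: {v.Users}");
// 1: 2
// 2: 1
```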
LINQ to SQL has no support for Count(Distinct ...). You therefore have to map a .NET method in code onto a SQL Server function (thus Count(distinct ...)) and use that.
By the way, it doesn't help if you post pseudo code copied from a toolkit in a format that's neither VB.NET nor C#.
This is how you do a distinct count query. Note that you have to filter out the nulls.
var useranswercount = (from a in tpoll_answer
where a.user_nbr != null && a.answer_nbr != null
select a.user_nbr).Distinct().Count();
If you combine this with your current grouping code, I think you'll have your solution.
A simple and clean example of how group by works in LINQ:
http://www.a2zmenu.com/LINQ/LINQ-to-SQL-Group-By-Operator.aspx
I wouldn't bother doing it in Linq2SQL. Create a stored procedure for the query you want and understand, and then map an object to the stored procedure in the framework, or just connect directly to it.
