LINQ Count multiple values from a collection - c#

In SQL what I'm trying to accomplish is
SELECT
SUM(CASE WHEN Kendo=1 THEN 1 ELSE 0 END) as KendoCount,
SUM(CASE WHEN Icenium=1 THEN 1 ELSE 0 END) as IceniumCount
FROM
Contacts
I'd like to do this in a C# program using LINQ.
Contacts is a List where Contact has many Booleans such as Kendo and Icenium and I need to know how many are true for each of the Booleans.

At least with LINQ to SQL, the downside of the Count methods is that each .Count() call issues its own SQL request. I suspect Jessie is trying to run a single scan over the table rather than one scan per predicate. Depending on the logic and the number of columns involved, the multiple-scan approach may not perform as well. Closer to the original request, try using Sum with a ternary expression, as follows (from Northwind):
from e in Employees
group e by "" into g
select new {
isUs = g.Sum (x => x.Country == "USA" ? 1 : 0),
NotUs = g.Sum (x => x.Country != "USA" ? 1 : 0)
}
LINQ to SQL generates the following (YMMV with other ORMs):
SELECT SUM(
(CASE
WHEN [t1].[Country] = @p1 THEN @p2
ELSE @p3
END)) AS [isUs], SUM(
(CASE
WHEN [t1].[Country] <> @p4 THEN @p5
ELSE @p6
END)) AS [NotUs]
FROM (
SELECT @p0 AS [value], [t0].[Country]
FROM [Employees] AS [t0]
) AS [t1]
GROUP BY [t1].[value]

var KendoCount = db.Contacts.Where(x => x.Kendo).Count();
var IceniumCount = db.Contacts.Where(x => x.Icenium).Count();

I would do this as two separate queries:
int kendoCount = db.Contacts.Count(c => c.Kendo);
int iceniumCount = db.Contacts.Count(c => c.Icenium);
Given that these queries will automatically translate into optimized SQL, this will likely be similar in speed or even potentially faster than any query option, and is far simpler to understand.
Note that, if this is for Entity Framework, you'll need to write this as:
int kendoCount = db.Contacts.Where(c => c.Kendo).Count();
int iceniumCount = db.Contacts.Where(c => c.Icenium).Count();

var result = Contacts
.GroupBy(c => new
{
ID = "",
})
.Select(c => new
{
KendoCount = c.Sum(k => k.Kendo ? 1 : 0),
IceniumCount = c.Sum(k => k.Icenium ? 1: 0),
})
.ToArray();
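Since the question describes Contacts as an in-memory List, a single pass can also be sketched with Aggregate in LINQ to Objects. The Contact class below is a hypothetical stand-in for the asker's type and ContactCounts is an illustrative name; the sketch assumes an in-memory sequence, not a database-backed query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Contact type described in the question.
public class Contact
{
    public bool Kendo { get; set; }
    public bool Icenium { get; set; }
}

public static class ContactCounts
{
    // Walks the sequence once and accumulates both counters together.
    public static (int Kendo, int Icenium) Count(IEnumerable<Contact> contacts) =>
        contacts.Aggregate(
            (Kendo: 0, Icenium: 0),
            (acc, c) => (acc.Kendo + (c.Kendo ? 1 : 0),
                         acc.Icenium + (c.Icenium ? 1 : 0)));
}
```

Adding another Boolean means adding one more element to the accumulator tuple, which stays readable for a handful of flags but gets unwieldy beyond that.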

Related

Is it possible to convert this SQL query to linq?

I need to count three values on a single table. In plain SQL, it is written like this:
select
count (*) as num_products,
sum(case when CreatedAt > '{sql.ToSqlDate(_CreatedAfter)}' then 1 else 0 end) num_new,
sum(case when UpdatedAt > '{sql.ToSqlDate(_UpdatedAfter)}' then 1 else 0 end) num_updated
from
Products
While switching to EF Core, I tried to convert it to Linq, like this
var res = (from p in _db.Products
let total = _db.Products.Count()
let NewProducts = _db.Products.Count(s => s.CreatedAt > crDate.Date)
let UpdatedProducts = _db.Products.Count(s => s.UpdatedAt > updDate.Date)
select new { total, NewProducts, UpdatedProducts } );
var response = res.ToList();
but the resulting SQL query does not seem to be optimized:
SELECT
(SELECT COUNT(*) FROM [Products] AS [p0]) AS [total],
(SELECT COUNT(*) FROM [Products] AS [s]
WHERE [s].[CreatedAt] > '2019-07-31') AS [NewProducts],
(SELECT COUNT(*) FROM [Products] AS [s0]
WHERE [s0].[UpdatedAt] > '2019-07-01') AS [UpdatedProducts]
FROM
[Products] AS [p]
Maybe somebody can help to translate the original SQL query to linq?
tia
ish
A more literal translation of that query, one that is more likely to execute in a single scan of the target table, would be:
var q =
from p in db.Products
select new
{
p.Id,
NewProduct = p.CreatedAt > DateTime.Parse("2019-07-31") ? 1 : 0,
UpdatedProduct = p.UpdatedAt > DateTime.Parse("2019-07-01") ? 1 : 0
} into counts
group counts by 1 into grouped
select new
{
ProductCount = grouped.Count(),
NewProductCount = grouped.Sum(r => r.NewProduct),
UpdatedProductCount = grouped.Sum(r => r.UpdatedProduct)
};
Which translates to something like:
SELECT COUNT(*) AS [ProductCount],
SUM([t].[NewProduct]) AS [NewProductCount],
SUM([t].[UpdatedProduct]) AS [UpdatedProductCount]
FROM (
SELECT [p].[Id], CASE
WHEN [p].[CreatedAt] > @__Parse_0
THEN 1 ELSE 0
END AS [NewProduct], CASE
WHEN [p].[UpdatedAt] > @__Parse_1
THEN 1 ELSE 0
END AS [UpdatedProduct], 1 AS [Key]
FROM [Products] AS [p]
) AS [t]
GROUP BY [t].[Key]
You do not need a from clause in your LINQ query because you aren't iterating over the rows. Just use three statements:
var total = _db.Products.Count();
var NewProducts = _db.Products.Count(s => s.CreatedAt > crDate.Date);
var UpdatedProducts = _db.Products.Count(s => s.UpdatedAt > updDate.Date) ;
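One detail of the group-by-constant pattern is worth checking before relying on it: grouping an empty sequence yields no groups at all, so there is no summary row to read. The sketch below shows this with LINQ to Objects only. ProductSummary and its parameters are illustrative names, not part of the original question, and the tuple projection assumes an in-memory sequence rather than an EF query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ProductSummary
{
    // Groups every element under the same constant key, so at most one group exists.
    // FirstOrDefault() falls back to (0, 0) when the source is empty and no group is produced.
    public static (int Total, int New) Summarize(IEnumerable<DateTime> createdAt, DateTime cutoff) =>
        createdAt
            .GroupBy(_ => 1)
            .Select(g => (Total: g.Count(), New: g.Sum(d => d > cutoff ? 1 : 0)))
            .FirstOrDefault();
}
```

With a database provider the same concern applies to the result list: it is safer to read it with FirstOrDefault() and handle the missing row than to assume exactly one row comes back.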

Convert SQL to EF Linq

I have the following query:
SELECT COUNT(1)
FROM Warehouse.WorkItems wi
WHERE wi.TaskId = (SELECT TaskId
FROM Warehouse.WorkItems
WHERE WorkItemId = @WorkItemId)
AND wi.IsComplete = 0;
And since we are using EF, I'd like to be able to use the Linq functionality to generate this query. (I know that I can give it a string query like this, but I would like to use EF+Linq to generate the query for me, for refactoring reasons.)
I really don't need to know the results of the query. I just need to know if there are any results. (The use of an Any() would be perfect, but I can't get the right code for it.)
So... Basically, how do I write that SQL query as a LINQ query?
Edit: Table Structure
WorkItemId - int - Primary Key
TaskId - int - Foreign Key on Warehouse.Tasks
IsComplete - bool
JobId - int
UserName - string
ReportName - string
ReportCriteria - string
ReportId - int - Foreign Key on Warehouse.Reports
CreatedTime - DateTime
The direct translation could be something like this
var result = db.WorkItems.Any(wi =>
!wi.IsComplete && wi.TaskId == db.WorkItems
.Where(x => x.WorkItemId == workItemId)
.Select(x => x.TaskId)
.FirstOrDefault());
Given that modern databases handle = (subquery), IN (subquery) and EXISTS (subquery) identically, you can try this instead:
var result = db.WorkItems.Any(wi =>
!wi.IsComplete && db.WorkItems.Any(x => x.WorkItemId == workItemId
&& x.TaskId == wi.TaskId));
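To make the logic of the EXISTS-style form easier to check, here is a minimal in-memory sketch of the same predicate using LINQ to Objects. The WorkItem class only mirrors the three columns the query touches, and WorkItemQueries.HasIncompleteSiblings is a hypothetical helper name; this is not what EF would execute.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in with only the columns the query uses.
public class WorkItem
{
    public int WorkItemId { get; set; }
    public int TaskId { get; set; }
    public bool IsComplete { get; set; }
}

public static class WorkItemQueries
{
    // True when any incomplete work item shares a TaskId with the given work item.
    public static bool HasIncompleteSiblings(IEnumerable<WorkItem> items, int workItemId)
    {
        // Materialize once, because the sequence is read by both the outer and inner Any().
        var list = items as IList<WorkItem> ?? items.ToList();
        return list.Any(wi =>
            !wi.IsComplete &&
            list.Any(x => x.WorkItemId == workItemId && x.TaskId == wi.TaskId));
    }
}
```

Note that the given work item itself counts as a match if it is incomplete, which is also how the original SQL behaves.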
Turns out that I just needed to approach the problem from a different angle.
I came up with about three solutions with varying Linq syntaxes:
Mixed query + method chain:
var q1 = Warehouse.WorkItems
.Where(workItem => workItem.TaskId == (from wis in Warehouse.WorkItems
where wis.WorkItemId == workItemId
select wis.TaskId).First())
.Any(workItem => !workItem.IsComplete);
Full method chain:
var q2 = Warehouse.WorkItems
.Where(workItem => workItem.TaskId == Warehouse.WorkItems
.Where(wis => wis.WorkItemId == workItemId)
.Select(wis => wis.TaskId)
.First())
.Any(workItem => !workItem.IsComplete);
Full query:
var q3 = (from wi in Warehouse.WorkItems
where wi.TaskId == (from swi in Warehouse.WorkItems
where swi.WorkItemId == workItemId
select swi.TaskId).First()
where !wi.IsComplete
select 1).Any();
The only problem with this is that it comes up with some really jacked up SQL:
SELECT
(CASE
WHEN EXISTS(
SELECT NULL AS [EMPTY]
FROM [Warehouse].[WorkItems] AS [t0]
WHERE (NOT ([t0].[IsComplete] = 1)) AND ([t0].[TaskId] = ((
SELECT TOP (1) [t1].[TaskId]
FROM [Warehouse].[WorkItems] AS [t1]
WHERE [t1].[WorkItemId] = @p0
)))
) THEN 1
ELSE 0
END) AS [value]
You can use the Any() function like so:
var result = Warehouse.WorkItems.Any(x => x.WorkItemId != null);
In short, you pass in your condition, which in this case checks whether an item in your collection has an ID.
The variable result will tell you whether any item in your collection has an ID.
Here's a helpful webpage to help you get started with LINQ: http://www.dotnetperls.com/linq
Subquery in the original SQL was a useless one, and so not a good sample for Any() usage. It is simply:
SELECT COUNT(*)
FROM Warehouse.WorkItems wi
WHERE WorkItemId = @WorkItemId
AND wi.IsComplete = 0;
Since the result can only be 0 or 1, and guessing at the purpose from the request for an Any() version, it may be written as:
SELECT CASE WHEN EXISTS ( SELECT *
FROM Warehouse.WorkItems wi
WHERE WorkItemId = @WorkItemId AND
wi.IsComplete = 0 ) THEN 1
ELSE 0
END;
Then it makes sense to use Any():
bool exists = db.WorkItems.Any( wi => wi.WorkItemId == workItemId && !wi.IsComplete );
EDIT: I misread the original query in a hurry, sorry. Here is an update on the Linq usage:
bool exists = db.WorkItems.Any( wi =>
db.WorkItems
.SingleOrDefault(x => x.WorkItemId == workItemId).TaskId == wi.TaskId
&& !wi.IsComplete );
If the count was needed as in the original SQL:
var count = db.WorkItems.Count( wi =>
db.WorkItems
.SingleOrDefault(x => x.WorkItemId == workItemId).TaskId == wi.TaskId
&& !wi.IsComplete );
Sorry again for the confusion.

Have EF Linq Select statement Select a constant or a function

I have a Select statement that is currently formatted like
dbEntity
    .GroupBy(x => x.date)
    .Select(groupedDate => new {
        Calculation1 = doCalculation1 ? groupedDate.Sum(x => x.Column1) : 0,
        Calculation2 = doCalculation2 ? groupedDate.Count() : 0
    });
In the query doCalculation1 and doCalculation2 are bools that are set earlier. This creates a case statement in the Sql being generated, like
DECLARE @p1 int = 1
DECLARE @p2 int = 0
DECLARE @p3 int = 1
DECLARE @p4 int = 0
SELECT (Case When @p1 = 1 THEN Sum(dbEntity.Column1)
Else @p2
End) as Calculation1,
(Case When @p3 = 1 THEN Count(*)
Else @p4
End) as Calculation2
What I want is for the generated SQL to look like this when doCalculation1 is true
SELECT SUM(Column1) as Calculation1, Count(*) as Calculation2
and like this when doCalculation1 is false
SELECT 0 as Calculation1, Count(*) as Calculation2
Is there any way to force a query through EF to act like this?
Edit:
bool doCalculation = true;
bool doCalculation2 = false;
dbEntity
.Where(x => x.FundType == "E")
.GroupBy(x => x.ReportDate)
.Select(dateGroup => new
{
ReportDate = dateGroup.Key,
CountInFlows = doCalculation2 ? dateGroup.Count(x => x.Flow > 0) : 0,
NetAssetEnd = doCalculation ? dateGroup.Sum(x => x.AssetsEnd) : 0
})
.ToList();
generates this sql
-- Region Parameters
DECLARE @p0 VarChar(1000) = 'E'
DECLARE @p1 Int = 0
DECLARE @p2 Decimal(5,4) = 0
DECLARE @p3 Int = 0
DECLARE @p4 Int = 1
DECLARE @p5 Decimal(1,0) = 0
-- EndRegion
SELECT [t1].[ReportDate],
(CASE
WHEN @p1 = 1 THEN (
SELECT COUNT(*)
FROM [dbEntity] AS [t2]
WHERE ([t2].[Flow] > @p2) AND ([t1].[ReportDate] = [t2].[ReportDate]) AND ([t2].[FundType] = @p0)
)
ELSE @p3
END) AS [CountInFlows],
(CASE
WHEN @p4 = 1 THEN CONVERT(Decimal(33,4),[t1].[value])
ELSE CONVERT(Decimal(33,4),@p5)
END) AS [NetAssetEnd]
FROM (
SELECT SUM([t0].[AssetsEnd]) AS [value], [t0].[ReportDate]
FROM [dbEntity] AS [t0]
WHERE [t0].[FundType] = #p0
GROUP BY [t0].[ReportDate]
) AS [t1]
which has many index scans and a spool and a join in the execution plan. It also takes about 20 seconds on average to run on the test set, with the production set going to be much larger.
I want it to run in the same speed as sql like
select reportdate, 1, sum(AssetsEnd)
from vwDailyFundFlowDetail
where fundtype = 'E'
group by reportdate
which runs in about 12 seconds on average and has the majority of the query tied up in a single index seek in the execution plan. The exact SQL output doesn't matter, but the performance appears to be much worse with the case statements.
As for why I am doing this, I need to generate dynamic select statements like I asked in Dynamically generate Linq Select. A user may select one or more of a set of calculations to perform and I will not know what is selected until the request comes in. The requests are expensive so we do not want to run them unless they are necessary. I am setting the doCalculation bools based on the user request.
This query is supposed to replace some code that inserts or deletes characters from a hardcoded SQL query stored as a string, which is then executed. That runs fairly fast but is a nightmare to maintain.
It would technically be possible to pass the Expression in your Select query through an expression tree visitor, which checks for constant values on the left-hand side of ternary operators, and replaces the ternary expression with the appropriate sub-expression.
For example:
using System.Linq.Expressions;
using System.Reflection;

public class Simplifier : ExpressionVisitor
{
    public static Expression<T> Simplify<T>(Expression<T> expr)
    {
        return (Expression<T>) new Simplifier().Visit(expr);
    }

    // Replaces "test ? a : b" with a or b when the test folds to a constant.
    protected override Expression VisitConditional(ConditionalExpression node)
    {
        var test = Visit(node.Test);
        var ifTrue = Visit(node.IfTrue);
        var ifFalse = Visit(node.IfFalse);
        var testConst = test as ConstantExpression;
        if (testConst != null)
        {
            var value = (bool) testConst.Value;
            return value ? ifTrue : ifFalse;
        }
        return Expression.Condition(test, ifTrue, ifFalse);
    }

    protected override Expression VisitMember(MemberExpression node)
    {
        // Closed-over variables are represented as field accesses to fields on a constant object.
        var field = node.Member as FieldInfo;
        var closure = node.Expression as ConstantExpression;
        // Only fold field reads; properties and other members fall through unchanged.
        if (field != null && closure != null)
        {
            var value = field.GetValue(closure.Value);
            return VisitConstant(Expression.Constant(value, field.FieldType));
        }
        return base.VisitMember(node);
    }
}
Usage example:
void Main()
{
var b = true;
Expression<Func<int, object>> expr = i => b ? i.ToString() : "N/A";
Console.WriteLine(expr.ToString()); // i => IIF(value(UserQuery+<>c__DisplayClass0).b, i.ToString(), "N/A")
Console.WriteLine(Simplifier.Simplify(expr).ToString()); // i => i.ToString()
b = false;
Console.WriteLine(Simplifier.Simplify(expr).ToString()); // i => "N/A"
}
So, you could use this in your code something like this:
Expression<Func<IGrouping<DateTime, MyEntity>, ClassYouWantToReturn>> select =
    groupedDate => new ClassYouWantToReturn {
        Calculation1 = doCalculation1 ? groupedDate.Sum(x => x.Column1) : 0,
        Calculation2 = doCalculation2 ? groupedDate.Count() : 0
    };
var q = dbEntity
    .GroupBy(x => x.date)
    .Select(Simplifier.Simplify(select));
However, this is probably more trouble than it's worth. SQL Server will almost certainly optimize the "1 == 1" case away, and allowing Entity Framework to produce the less-pretty query shouldn't prove to be a performance problem.
Update
Looking at the updated question, this appears to be one of the few instances where producing the right query really does matter, performance-wise.
Besides my suggested solution, there are a few other choices: you could use raw sql to map to your return type, or you could use LinqKit to choose a different expression based on what you want, and then "Invoke" that expression inside your Select query.
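The idea of choosing a different expression based on what the user asked for can also be sketched without any extra library, by returning one of two complete selector lambdas from an ordinary method. FlowRow, FlowSummary and SelectorFactory are hypothetical names loosely modeled on the question's columns, and whether a given provider translates the resulting projection is an assumption that would need checking against the generated SQL.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical row and result types, standing in for the asker's entity and projection.
public class FlowRow
{
    public DateTime ReportDate { get; set; }
    public decimal AssetsEnd { get; set; }
}

public class FlowSummary
{
    public DateTime ReportDate { get; set; }
    public decimal NetAssetEnd { get; set; }
}

public static class SelectorFactory
{
    // Picks the whole selector up front, so the expression tree never contains the flag itself.
    public static Expression<Func<IGrouping<DateTime, FlowRow>, FlowSummary>> Build(bool doCalculation)
    {
        if (doCalculation)
        {
            return g => new FlowSummary { ReportDate = g.Key, NetAssetEnd = g.Sum(x => x.AssetsEnd) };
        }
        // No aggregate appears in this tree at all, only the constant.
        return g => new FlowSummary { ReportDate = g.Key, NetAssetEnd = 0m };
    }
}
```

It would be used as .GroupBy(x => x.ReportDate).Select(SelectorFactory.Build(doCalculation)). The drawback is that the number of lambdas doubles with every independent flag, which is where the visitor approach or an expression-composition library becomes more attractive.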

Linq-to-SQL: arithmetic operation on consecutive elements

For example, I have a table:
Date |Value
----------|-----
2015/10/01|5
2015/09/01|8
2015/08/01|10
Is there any way using Linq-to-SQL to get a new sequence which will be an arithmetic operation between consecutive elements in the previously ordered set (for example, i.Value - (i-1).Value)? It must be executed on SQL Server 2008 side, not application side.
For example dataContext.GetTable<X>().OrderByDescending(d => d.Date).Something(.......).ToArray(); should return 3, 2.
Is it possible?
You can try this:
var q = (
from i in Items
orderby i.ItemDate descending
let prev = Items.Where(x => x.ItemDate < i.ItemDate).FirstOrDefault()
select new { Value = i.ItemValue - (prev == null ? 0 : prev.ItemValue) }
).ToArray();
EDIT:
If you slightly modify the above linq query to:
var q = (from i in Items
orderby i.ItemDate descending
let prev = Items.Where(x => x.ItemDate < i.ItemDate).FirstOrDefault()
select new { Value = (int?)i.ItemValue - prev.ItemValue }
).ToArray();
then you get the following TSQL query sent to the database:
SELECT ([t0].[ItemValue]) - ((SELECT [t2].[ItemValue]
FROM (SELECT TOP (1) [t1].[ItemValue]
FROM [Items] AS [t1]
WHERE [t1].[ItemDate] < [t0].[ItemDate]) AS [t2]
)) AS [Value]
FROM [Items] AS [t0]
ORDER BY [t0].[ItemDate] DESC
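My guess is that if you place an index on the ItemDate field this shouldn't perform too badly. Note that the subquery has no ORDER BY, so TOP (1) is not guaranteed to pick the closest earlier date; ordering the inner query by ItemDate descending before calling FirstOrDefault() would make that explicit.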
I wouldn't let SQL do this, it would create an inefficient SQL query (I think).
I could create a stored procedure, but if the amount of data is not too big I can also use Linq to objects:
List<X> items = dataContext.GetTable<X>().OrderByDescending(d => d.Date).ToList(); // Bring the data into memory
var res = items.Skip(1).Zip(items, (cur, prev) => cur.Value - prev.Value);
In the end, I might use a foreach for readability.
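The loop-based alternative mentioned above could look roughly like this once the ordered values are in memory. ConsecutiveDiff.Differences is a hypothetical helper name, and the input is assumed to be the Value column already sorted by Date descending.

```csharp
using System;
using System.Collections.Generic;

public static class ConsecutiveDiff
{
    // Returns values[i] - values[i - 1] for every adjacent pair, in order.
    public static List<int> Differences(IReadOnlyList<int> values)
    {
        var result = new List<int>();
        for (int i = 1; i < values.Count; i++)
        {
            result.Add(values[i] - values[i - 1]);
        }
        return result;
    }
}
```

For the sample table, the values ordered by date descending are 5, 8, 10, which gives 3 and 2, matching the output the question asks for.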

How can I get the primary key of a new table generated from OrderBy / GroupBy?

How can I get the primary key of a new table generated from OrderBy / GroupBy?
var something = (from m in _db.Requests
where m.StoreID == myRequest.StoreID
where m.AcceptedTime != null
where System.Data.Entity.DbFunctions.TruncateTime(m.RequestTime) == today
group m by m.StaffID into g
let TotalPoints = g.Count()
orderby TotalPoints ascending
select new { User = g.Key});
Then I try to get the first result, which should be the StaffID that appears least often in my Requests table:
var thisStaff = something.Select(o=>o.User).Take(1).ToString();
However, the value of "thisStaff" is not StaffID which is the Key of my Request table. The value in it is
SELECT TOP (1) [Project1].[StaffID] AS [StaffID]
FROM ( SELECT [GroupBy1].[A1] AS [C1], [GroupBy1].[K1] AS [StaffID]
FROM ( SELECT [Extent1].[StaffID] AS [K1], COUNT(1) AS [A1]
FROM [dbo].[Requests] AS [Extent1]
WHERE ([Extent1].[StoreID] = @p__linq__0) AND
([Extent1].[AcceptedTime] IS NOT NULL) AND
((convert (datetime2, convert(varchar(255), [Extent1].[RequestTime], 102) , 102)) = @p__linq__1)
GROUP BY [Extent1].[StaffID] ) AS [GroupBy1] ) AS [Project1]
ORDER BY [Project1].[C1] ASC
Please suggest how I should change it. By the way, I've also tried the following and got almost the same result.
var something2 = _db.Requests
.Where(o => o.StoreID == myRequest.StoreID)
.Where(o => o.AcceptedTime != null)
.Where(o => System.Data.Entity.DbFunctions.TruncateTime(o.RequestTime) == today)
.GroupBy(x => x.StaffID)
.Select(x => new
{
Count = x.Count(),
Name = x.Key,
})
.OrderBy(x => x.Count)
.Take(1);
Your query is fine; you're just missing the method call that actually executes it against the database and returns the result (EF uses deferred execution). FirstOrDefault() should do the trick:
var thisStaff = something.Select(o => o.User).FirstOrDefault();
The value that you see is the value that the IQueryable you have constructed returns for its ToString method (it will be the SQL that is run against the database when the query is executed).
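Deferred execution can be observed without a database. The sketch below uses LINQ to Objects and a read counter to show that composing OrderBy and Take touches no data until a materializing call such as FirstOrDefault() runs. DeferredDemo, Trace and Run are illustrative names, and an in-memory sequence only approximates what an EF IQueryable does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Wraps a sequence so that every element read from it invokes onRead.
    public static IEnumerable<int> Trace(IEnumerable<int> source, Action onRead) =>
        source.Select(x => { onRead(); return x; });

    public static (int ReadsBefore, int First, int ReadsAfter) Run()
    {
        int reads = 0;
        // Building the query does not enumerate the source.
        var query = Trace(new[] { 3, 1, 2 }, () => reads++).OrderBy(x => x).Take(1);
        int before = reads;
        // Enumeration happens here, when a single value is requested.
        int first = query.FirstOrDefault();
        return (before, first, reads);
    }
}
```

Calling ToString() on the composed query, as in the question, likewise does not execute it, which is why the asker saw a description of the query instead of a StaffID.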
