How to do Linq aggregates when there might be an empty set? - c#

I have a Linq collection of Things, where Thing has an Amount (decimal) property.
I'm trying to do an aggregate on this for a certain subset of Things:
var total = myThings.Sum(t => t.Amount);
and that works nicely. But then I added a condition that left me with no Things in the result:
var total = myThings.Where(t => t.OtherProperty == 123).Sum(t => t.Amount);
And instead of getting total = 0 or null, I get an error:
System.InvalidOperationException: The null value cannot be assigned to
a member with type System.Decimal which is a non-nullable value type.
That is really nasty, because I didn't expect that behavior. I would have expected total to be zero, maybe null - but certainly not to throw an exception!
What am I doing wrong? What's the workaround/fix?
EDIT - example
Thanks to all for your comments. Here's some code, copied and pasted (not simplified). It's LinqToSql (perhaps that's why you couldn't reproduce my problem):
var claims = Claim.Where(cl => cl.ID < 0);
var count = claims.Count(); // count=0
var sum = claims.Sum(cl => cl.ClaimedAmount); // throws exception

I can reproduce your problem with the following LINQPad query against Northwind:
Employees.Where(e => e.EmployeeID == -999).Sum(e => e.EmployeeID)
There are two issues here:
1. Sum() is overloaded.
2. LINQ to SQL follows SQL semantics, not C# semantics.
In SQL, SUM(no rows) returns null, not zero. However, the type inference for your query gives you decimal as the type parameter, instead of decimal?. The fix is to help type inference select the correct type, i.e.:
Employees.Where(e => e.EmployeeID == -999).Sum(e => (int?)e.EmployeeID)
Now the correct Sum() overload will be used.

To get a non-nullable result, you need to cast the amount to a nullable type, and then handle the case of Sum returning null.
decimal total = myThings.Sum(t => (decimal?)t.Amount) ?? 0;
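A minimal sketch of the cast-then-coalesce pattern, run against an in-memory list rather than a database. The Thing class and the SafeTotal helper are hypothetical names used only for illustration. Note that LINQ to Objects already returns zero for an empty sequence, so in memory this only shows that the nullable overload is selected and that the `?? 0m` is harmless; the null result itself arises only with a SQL-backed provider.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's Thing entity.
public class Thing
{
    public decimal Amount { get; set; }
    public int OtherProperty { get; set; }
}

public static class SumSketch
{
    // Casting inside the selector picks the Sum overload that returns decimal?,
    // so a null coming back from a SQL provider can be coalesced to zero.
    public static decimal SafeTotal(IEnumerable<Thing> things, int otherProperty)
    {
        return things
            .Where(t => t.OtherProperty == otherProperty)
            .Sum(t => (decimal?)t.Amount) ?? 0m;
    }

    public static void Main()
    {
        var things = new List<Thing>
        {
            new Thing { Amount = 10.5m, OtherProperty = 1 },
            new Thing { Amount = 4.5m, OtherProperty = 1 },
        };

        Console.WriteLine(SafeTotal(things, 1));   // sum of the two matching rows
        Console.WriteLine(SafeTotal(things, 123)); // no matching rows
    }
}
```

The same expression shape should carry over to an IQueryable source, since the cast lives inside the selector lambda and not around the call.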
There's another question devoted to the (ir)rationale.

It throws an exception because the result of the combined SQL query is null, and this can't be assigned to the decimal variable. If you did the following, then your variable would be null (I assume ClaimedAmount is decimal):
var claims = Claim.Where(cl => cl.ID < 0);
var count = claims.Count(); // count=0
var sum = claims.Sum(cl => cl.ClaimedAmount as decimal?);
then you should get the functionality you desire.
You could also call ToList() after the Where clause; the sum would then run in memory and return 0, but that would fall foul of what has been said elsewhere about LINQ aggregates.

If t has a property like a 'HasValue', then I would change the expression to:
var total =
myThings.Where(t => (t.HasValue) && (t.OtherProperty == 123)).Sum(t => t.Amount);

Related

How can I make Sum() return 0 instead of 'null'?

I'm trying to use LINQ-to-entities to query my DB, where I have 3 tables: Room, Conference, and Participant. Each room has many conferences, and each conference has many participants. For each room, I'm trying to get a count of its conferences, and a sum of all of the participants for all of the room's conferences. Here's my query:
var roomsData = context.Rooms
.GroupJoin(
context.Conferences
.GroupJoin(
context.Participants,
conf => conf.Id,
part => part.ConferenceId,
(conf, parts) => new { Conference = conf, ParticipantCount = parts.Count() }
),
rm => rm.Id,
data => data.Conference.RoomId,
(rm, confData) => new {
Room = rm,
ConferenceCount = confData.Count(),
ParticipantCount = confData.Sum(cd => cd.ParticipantCount)
}
);
When I try and turn this into a list, I get the error:
The cast to value type 'System.Int32' failed because the materialized value is null. Either the result type's generic parameter or the query must use a nullable type.
I can fix this by changing the Sum line to:
ParticipantCount = confData.Count() == 0 ? 0 : confData.Sum(cd => cd.ParticipantCount)
But the trouble is that this seems to generate a more complex query and add 100ms onto the query time. Is there a better way for me to tell EF that when it is summing ParticipantCount, an empty list for confData should just mean zero, rather than throwing an exception? The annoying thing is that this error only happens with EF; if I create an empty in-memory List<int> and call Sum() on that, it gives me zero, rather than throwing an exception!
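You may use the null coalescing operator ??. Because ParticipantCount is a non-nullable int, cast it to int? inside the selector and coalesce the result of Sum():
confData.Sum(cd => (int?)cd.ParticipantCount) ?? 0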
I made it work by changing the Sum line to:
ParticipantCount = (int?)confData.Sum(cd => cd.ParticipantCount)
Confusingly, it seems that even though IntelliSense tells me that the int overload for Sum() is getting used, at runtime it is actually using the int? overload because the confData list might be empty. If I explicitly tell it the return type is int? it returns null for the empty list entries, and I can later null-coalesce the nulls to zero.
Use Enumerable.DefaultIfEmpty:
ParticipantCount = confData.DefaultIfEmpty().Sum(cd => cd.ParticipantCount)
Instead of trying to get EF to generate a SQL query that returns 0 instead of null, you can fix this up as you process the query results on the client side, like this:
var results = from r in roomsData.AsEnumerable()
select new
{
r.Room,
r.ConferenceCount,
ParticipantCount = r.ParticipantCount ?? 0
};
The AsEnumerable() forces the SQL query to be evaluated and the subsequent query operators are client-side LINQ-to-Objects.
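As a rough, in-memory sketch of that client-side step: the RoomRow class below is a hypothetical stand-in for one row returned by the query (with ParticipantCount already typed as int?), and Normalize is an illustrative helper, not part of EF.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of one row as it might come back from the database.
public class RoomRow
{
    public string Room { get; set; }
    public int ConferenceCount { get; set; }
    public int? ParticipantCount { get; set; } // null when the room has no conferences
}

public static class ClientSideCoalesce
{
    // AsEnumerable() switches to LINQ to Objects, so the ?? below is evaluated
    // in C# on the client and never has to be translated to SQL.
    public static List<(string Room, int ConferenceCount, int ParticipantCount)> Normalize(
        IEnumerable<RoomRow> rows)
    {
        return rows
            .AsEnumerable()
            .Select(r => (
                Room: r.Room,
                ConferenceCount: r.ConferenceCount,
                ParticipantCount: r.ParticipantCount ?? 0))
            .ToList();
    }

    public static void Main()
    {
        var rows = new List<RoomRow>
        {
            new RoomRow { Room = "A", ConferenceCount = 2, ParticipantCount = 7 },
            new RoomRow { Room = "B", ConferenceCount = 0, ParticipantCount = null },
        };

        foreach (var r in Normalize(rows))
        {
            Console.WriteLine(r.Room + ": " + r.ParticipantCount);
        }
    }
}
```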

Handling null field in aggregate using linq

I have come across a scenario where I am summing a field in my LINQ query.
The property can actually be NULL in the database.
However, when we apply an aggregate such as Sum to that field in a collection using LINQ, it returns 0 for null.
I am avoiding the sum for a null field as follows:
TotalDays = x.Select(y => y.day.HasValue ? x.Sum(z => z.day) : null).FirstOrDefault(),
Is this a good way to do it, or is there a better one?
Null values sum to zero because naturally they can neither add nor subtract to the tally so generally one wants zero in such cases.
Consider:
(new int?[]{0, null, 3, 2}).Sum() // result is 5. Other linq providers do similar.
Where this can sometimes cause a problem is if you want to note all-null result-sets separately:
(new int?[]{null, null}).Sum() // result is 0, but maybe we want to note that there were indeed no values.
We could do this with:
source.Any(x => x.HasValue) ? source.Sum() : default(int?);
Which to bring back to your example would be:
int? totalDays = x.Any(y => y.day.HasValue) ? x.Sum(y => y.day) : default(int?);
However you might prefer to do:
int? totalDays = x.Sum(y => y.day);
if (totalDays == 0 && x.All(y => !y.day.HasValue))
totalDays = null;
Then you only examine the set to see if all values are null in the case of receiving the 0 result (any other result is not possible in this case).
Checking Any() first is more efficient when all-null results are more common, and doing Sum() first is more efficient when all-null results are less common, because either way you only perform two operations in the less common case.
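The two variants can be sketched side by side in LINQ to Objects. The helper names SumOrNullAnyFirst and SumOrNullSumFirst are made up for this illustration; both are meant to return null when the input holds no non-null value, and the ordinary sum otherwise.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NullableSum
{
    // Variant 1: look for any non-null value first, then sum.
    public static int? SumOrNullAnyFirst(IReadOnlyList<int?> source)
    {
        return source.Any(x => x.HasValue) ? source.Sum() : default(int?);
    }

    // Variant 2: sum first, and only scan for all-null when the total is 0.
    public static int? SumOrNullSumFirst(IReadOnlyList<int?> source)
    {
        int? total = source.Sum(); // nulls are skipped; empty or all-null input gives 0
        if (total == 0 && source.All(x => !x.HasValue))
        {
            total = null;
        }
        return total;
    }

    public static void Main()
    {
        var mixed = new int?[] { 0, null, 3, 2 };
        var allNull = new int?[] { null, null };

        Console.WriteLine(SumOrNullAnyFirst(mixed));
        Console.WriteLine(SumOrNullSumFirst(mixed));
        Console.WriteLine(SumOrNullAnyFirst(allNull).HasValue);
        Console.WriteLine(SumOrNullSumFirst(allNull).HasValue);
    }
}
```

Both helpers also return null for an empty input, because Any() is false and All() is true on an empty sequence.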

Sum a IQueryable Column with where clause

I have 2 queries. One finds the entries where the Value column is greater than 150,000, and I need the count of those entries. The second is the sum of those values rather than the count.
The Count works perfectly but the sum crashes and provides this error
{"The cast to value type 'System.Decimal' failed because the
materialized value is null. Either the result type's generic parameter
or the query must use a nullable type."}
Working code:
var excessCount = closedDealNonHost.Any() ? closedDealNonHost.Where(x => x.Value > 150000).Count() : 0;
Crashing Code:
var excessSum = CloseDealNonHost = closedDealNonHost.Any() ? closedDealNonHost.Where(x => x.Value > 150000).Sum(x => x.Value) : 0;
You can solve the issue by explicitly casting to decimal? in Sum like:
var excessSum = CloseDealNonHost = closedDealNonHost.Any() ? closedDealNonHost
.Where(x => x.Value > 150000)
.Sum(x => (decimal?) x.Value) : 0;
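The error comes from the SQL generated for the LINQ expression: SUM over no rows yields NULL, and on the C# side the result is materialized as a decimal, which cannot hold a null value. Casting to decimal? selects the nullable Sum overload, so the null can be returned.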
You may see: Linq To Entities: Queryable.Sum returns Null on an empty list

Sum() Returns null in Entity Framework Query

I have a big Entity Framework query that includes these lines.
var programs = from p in Repository.Query<Program>()
where p.OfficeId == CurrentOffice.Id
let totalCharges = p.ProgramBillings.Where(b => b.Amount > 0 && b.DeletedDate == null).Select(b => b.Amount).Sum()
let totalCredits = p.ProgramBillings.Where(b => b.Amount < 0 && b.DeletedDate == null).Select(b => -b.Amount).Sum()
let billingBalance = (totalCharges - totalCredits)
When I materialize the data, I get the following error:
The cast to value type 'Decimal' failed because the materialized value is null. Either the result type's generic parameter or the query must use a nullable type.
If I change my query as follows (added in two type casts), the error goes away.
var programs = from p in Repository.Query<Program>()
where p.OfficeId == CurrentOffice.Id
let totalCharges = (decimal?)p.ProgramBillings.Where(b => b.Amount > 0 && b.DeletedDate == null).Select(b => b.Amount).Sum()
let totalCredits = (decimal?)p.ProgramBillings.Where(b => b.Amount < 0 && b.DeletedDate == null).Select(b => -b.Amount).Sum()
let billingBalance = (totalCharges - totalCredits)
I do not understand this. ProgramBilling.Amount is a non-nullable Decimal. If I hover over the Sum() call, Intellisense says it returns type Decimal. And yet additional tests confirmed that, in my second version, totalCharges and totalCredits are both set to null for those rows where ProgramBillings has no data.
Questions:
I understood Sum() returned 0 for an empty collection. Under what circumstances is this not true?
And if sometimes that is not true, then why when I hover over Sum(), Intellisense shows it returns type Decimal and not Decimal? It appears Intellisense had the same understanding that I had.
EDIT:
It would seem that an easy fix is to do something like Sum() ?? 0m. But that's illegal, giving me the error:
Operator '??' cannot be applied to operands of type 'decimal' and 'decimal'
I understood Sum() returned 0 for an empty collection. Under what circumstances is this not true?
When you're not using LINQ to objects, as is the case here. Here you have a query provider that is translating this query into SQL. The SQL operation has different semantics for its SUM operator.
And if sometimes that is not true, then why when I hover over Sum(), Intellisense shows it returns type Decimal and not Decimal? It appears Intellisense had the same understanding that I had.
The C# LINQ SUM operator doesn't return a nullable value; it needs to have a non-null value, but the SQL SUM operator has different semantics, it returns null when summing an empty set, not 0. The fact that the null value is provided in a context where C# requires a non-null value is the entire reason everything is breaking. If the C# LINQ SUM operator here returned a nullable value, then null could just be returned without any problems.
It is the difference between the C# operator and the SQL operator it is being used to represent that is causing this error.
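The mismatch can be imitated in plain C# without a database. In the sketch below, SqlStyleSum is a hypothetical helper that mimics SQL's SUM by returning null for an empty input; it is not what any provider actually runs, only an analogy for the semantics described above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SumSemantics
{
    // Hypothetical helper imitating SQL's SUM: null when there are no rows.
    public static decimal? SqlStyleSum(IEnumerable<decimal> values)
    {
        decimal? total = null;
        foreach (decimal v in values)
        {
            total = (total ?? 0m) + v;
        }
        return total;
    }

    public static void Main()
    {
        var empty = new List<decimal>();

        decimal inMemory = empty.Sum();         // LINQ to Objects: zero for an empty list
        decimal? sqlStyle = SqlStyleSum(empty); // SQL-style: null for an empty list

        Console.WriteLine(inMemory);
        Console.WriteLine(sqlStyle.HasValue ? sqlStyle.ToString() : "null");

        try
        {
            // Forcing the null into a non-nullable decimal fails, much like the
            // provider fails when it materializes NULL into a decimal result.
            decimal broken = (decimal)sqlStyle;
            Console.WriteLine(broken);
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine("Failed: " + ex.Message);
        }
    }
}
```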
I've had the same issue in one of my EF queries when the collection is empty. One quick fix for this is to cast to nullable decimal:
var total = db.PaiementSet.Sum(o => (Decimal?)o.amount) ?? 0M;
hope it helps.
Add a DefaultIfEmpty(0.0M) call immediately before the .Sum() call, so that an empty set is replaced by a single zero before summing.
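A small in-memory sketch of where that call sits in the chain. TotalCharges is a hypothetical helper, and whether a given query provider translates DefaultIfEmpty with a value argument is an assumption that would need checking against the provider and version in use.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DefaultIfEmptySketch
{
    // DefaultIfEmpty(0.0M) yields a single 0 when the filtered sequence is empty,
    // so Sum() always has at least one value to add up.
    public static decimal TotalCharges(IEnumerable<decimal> amounts)
    {
        return amounts
            .Where(a => a > 0)
            .DefaultIfEmpty(0.0M)
            .Sum();
    }

    public static void Main()
    {
        Console.WriteLine(TotalCharges(new[] { 12.5m, -3m, 7.5m })); // only positive amounts are summed
        Console.WriteLine(TotalCharges(new decimal[0]));             // nothing to sum
    }
}
```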

Linq with boolean function to relational db in Entity Framework

Probably a few things wrong with my code here but I'm mostly having a problem with the syntax. Entry is a model for use in Entries and contains a TimeStamp for each entry. Member is a model for people who are assigned entries and contains an fk for Entry. I want to sort my list of members based off of how many entries the member has within a given period (arbitrarily chose 30 days).
A. I'm not sure that the function I created works correctly, but this is aside from the main point because I haven't really dug into it yet.
B. I cannot figure out the syntax of the Linq statement or if it's even possible.
Function:
private bool TimeCompare(DateTime TimeStamp)
{
DateTime bound = DateTime.Today.AddDays(-30);
if (bound <= TimeStamp)
{
return true;
}
return false;
}
Member list:
public PartialViewResult List()
{
var query = repository.Members.OrderByDescending(p => p.Entry.Count).Where(TimeCompare(p => p.Entry.Select(e => e.TimeStamp));
//return PartialView(repository.Members);
return PartialView(query);
}
The var query line is my problem here: I can't seem to find a way to incorporate a boolean function into a .Where clause in a LINQ query.
EDIT
To summarize I am simply trying to query all entries timestamped within the past 30 days.
I also have to emphasize the relational/fk part as that appears to be forcing the Timestamp to be IEnumerable of System.Datetime instead of simple System.Datetime.
This errors with "Cannot implicitly convert timestamp to bool" on the E.TimeStamp:
var query = repository.Members.Where(p => p.Entry.First(e => e.TimeStamp) <= past30).OrderByDescending(p => p.Entry.Count);
This errors with Operator '<=' cannot be applied to operands of type 'System.Collections.Generic.IEnumerable' and 'System.DateTime'
var query = repository.Members.Where(p => p.Entry.Select(e => e.TimeStamp) <= past30).OrderByDescending(p => p.Entry.Count);
EDIT2
Syntactically correct but not semantically:
var query = repository.Members.Where(p => p.Entry.Select(e => e.TimeStamp).FirstOrDefault() <= timeComparison).OrderByDescending(p => p.Entry.Count);
The desired result is to pull all members and then sort by the number of entries they have, this pulls members with entries and then orders by the number of entries they have. Essentially the .where should somehow be nested inside of the .count.
EDIT3
Syntactically correct but results in a runtime error (Exception Details: System.ArgumentException: DbSortClause expressions must have a type that is order comparable.
Parameter name: key):
var query = repository.Members.OrderByDescending(p => p.Entry.Where(e => e.TimeStamp <= timeComparison));
EDIT4
Closer (as this line compiles) but it doesn't seem to be having any effect on the object. Regardless of how many entries I add for a user it doesn't change the sort order as desired (or at all).
var timeComparison = DateTime.Today.AddDays(-30).Day;
var query = repository.Members.OrderByDescending(p => p.Entry.Select(e => e.TimeStamp.Day <= timeComparison).FirstOrDefault());
A bit of research suggests that LINQ to Entities (i.e. this section)
...var query = repository.Members.OrderByDescending(...
tends to really not like it if you use your own functions, since it will try to map them to a SQL equivalent.
Try something along the lines of this, and see if it helps:
var query = repository.Members.AsEnumerable().Where(p => p.Entry.Any(e => TimeCompare(e.TimeStamp))).OrderByDescending(p => p.Entry.Count);
Edit: I should just read what you are trying to do. You want it to grab only the ones within the last X number of days, correct? I believe the following should work, but I would need to test when I get to my home computer...
public PartialViewResult List()
{
var timeComparison = DateTime.Today.AddDays(-30);
var query = repository.Members.Where(p => p.Entry.Select(e => e.TimeStamp).FirstOrDefault() <= timeComparison).OrderByDescending(p => p.Entry.Count);
//return PartialView(repository.Members);
return PartialView(query);
}
Edit2: This may be a lack of understanding from your code, but is e the same type as p? If so, you should be able to just reference the timestamp like so:
public PartialViewResult List()
{
var timeComparison = DateTime.Today.AddDays(-30);
var query = repository.Members.Where(p => p.TimeStamp <= timeComparison).OrderByDescending(p => p.Entry.Count);
//return PartialView(repository.Members);
return PartialView(query);
}
Edit3: In Edit3, I see what you are trying to do now (I believe). You're close, but OrderByDescending would need to go on the end. Try this:
var query = repository.Members
.Select(p => p.Entry.Where(e => e.TimeStamp <= timeComparison))
.OrderByDescending(p => p.Entry.Count);
Thanks for all the help Dylan but here is the final answer:
public PartialViewResult List()
{
var timeComparison = DateTime.Today.AddDays(-30).Day;
var query = repository.Members
.OrderBy(m => m.Entry.Where(e => e.TimeStamp.Day <= timeComparison).Count());
return PartialView(query);
}
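One caveat with the code above: DateTime.Day is the day of the month (1 to 31), so comparing .Day values does not test whether an entry falls within the last 30 days. A hedged sketch of the stated goal (all members, sorted by how many entries each has in the last 30 days) using LINQ to Objects follows. The Member and Entry classes, the Entries property name, and the SortByRecentEntries helper are assumptions for illustration; the real model and provider may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal models; the real ones come from the repository.
public class Entry
{
    public DateTime TimeStamp { get; set; }
}

public class Member
{
    public string Name { get; set; }
    public List<Entry> Entries { get; set; } = new List<Entry>();
}

public static class RecentEntrySort
{
    // Keeps every member and orders by the number of entries on or after the cutoff.
    public static List<Member> SortByRecentEntries(
        IEnumerable<Member> members, DateTime now, int days = 30)
    {
        DateTime cutoff = now.Date.AddDays(-days);
        return members
            .OrderByDescending(m => m.Entries.Count(e => e.TimeStamp >= cutoff))
            .ToList();
    }

    public static void Main()
    {
        DateTime now = DateTime.Now;
        var quiet = new Member { Name = "quiet" };
        quiet.Entries.Add(new Entry { TimeStamp = now.AddDays(-90) });

        var busy = new Member { Name = "busy" };
        busy.Entries.Add(new Entry { TimeStamp = now.AddDays(-1) });
        busy.Entries.Add(new Entry { TimeStamp = now.AddDays(-5) });

        foreach (var m in SortByRecentEntries(new[] { quiet, busy }, now))
        {
            Console.WriteLine(m.Name);
        }
    }
}
```

Passing the current time in as a parameter keeps the cutoff fixed for the whole sort and makes the helper easy to test.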
