Reusable functions for use with Linq-to-Entities - c#

I have some stats code that I want to use in various places to calculate success / failure percentages of schedule Results. I recently found a bug in the code, caused by the calculation being replicated in each LINQ statement, so I decided it would be better to have common code for it. The problem, of course, is that a normal function throws a NotSupportedException when the query is executed on SQL Server, because the function doesn't exist in SQL Server.
How can I write reusable stats code that gets executed on SQL Server, or is this not possible?
Here is the code I have written for Result
public class Result
{
public double CalculateSuccessRatePercentage()
{
return this.ExecutedCount == 0 ? 100 : ((this.ExecutedCount - this.FailedCount) * 100.0 / this.ExecutedCount);
}
public double CalculateCoveragePercentage()
{
return this.PresentCount == 0 ? 0 : (this.ExecutedCount * 100.0 / this.PresentCount);
}
}
And it is used like so (results is IQueryable, and throws the exception):
schedule.SuccessRatePercentage = (int)Math.Ceiling(results.Average(r => r.CalculateSuccessRatePercentage()));
schedule.CoveragePercentage = (int)Math.Ceiling(results.Average(r => r.CalculateCoveragePercentage()));
or like this (which works, because we do this on a single result)
retSchedule.SuccessRatePercentage = (byte)Math.Ceiling(result.CalculateSuccessRatePercentage());
retSchedule.CoveragePercentage = (byte)Math.Ceiling(result.CalculateCoveragePercentage());
Edit
As per @Fred's answer I now have the following code, which works for an IQueryable:
schedule.SuccessRatePercentage = (int)Math.Ceiling(scheduleResults.Average(ScheduleResult.CalculateSuccessRatePercentageExpression()));
schedule.CoveragePercentage = (int)Math.Ceiling(scheduleResults.Average(ScheduleResult.CalculateCoveragePercentageExpression()));
The only problem, albeit a minor one, is that this code will not work for individual results i.e.
retSchedule.SuccessRatePercentage = (byte)Math.Ceiling(/* How do I use it here for result */);

You can't pass functions to SQL - you would need to declare the function on the actual SQL database and then call that from your code.
What you could do/try is this:
Expression<Func<Result, double>> CalculateCoveragePercentage()
{
return r => r.PresentCount == 0 ? 0 : (r.ExecutedCount * 100.0 / r.PresentCount);
}
It needs to be interpreted instead of executed so that EF can translate it to SQL. The problem is, I've only heard of this being possible when it's passed directly into a where clause.
Since you are able to do these calculations when you apply them directly inside of your LINQ query, I'm inclined to think that it should also be possible to declare those calculations as Expression<Func<..., ...>> and then pass them in.
The only way to know for sure is to try (unless you feel like looking into EF's ExpressionBuilder)
UPDATE:
I should have mentioned that, if this would work, you need to pass this expression into a Select statement:
// Assuming you have Results declared as a DbSet or IDbSet, such as:
DbSet<Result> Results
// You could do something like this (just to illustrate that
// it would be interpreted rather than executed):
List<double> allCoveragePercentages = Results.Select(CalculateCoveragePercentage())
.ToList();
UPDATE #2:
In order for this to work with individual results (or in any case whatsoever), you need to pass it into a clause that accepts the expression. Examples are Select, Where, Average (apparently), and anything else that does not return results.
From the top of my head (I'm sure I'm missing a few):
List: ToArray, ToDictionary, ToList, ToLookup
Single result: First, FirstOrDefault, Single, SingleOrDefault, Last, LastOrDefault
Computation: Count, Sum, Max, Min
Since the above clauses return results, most of them (as far as I know) only accept predicates (functions that return true or false). The aggregates are the exception: Sum, Max and Min, like Average, also have overloads that take a selector expression.
Your .Average(CalculateCoveragePercentage()) call works for that reason.
So if you were to get a single result with .FirstOrDefault(), you would pass your expression into a Select clause right before that: .Select(CalculateCoveragePercentage()).FirstOrDefault(). That is, if you don't need the actual entity but just the calculation. Be aware though that this particular example will return 0 if there are no results. You may or may not want this behavior.
Of course, if you already have your result (it's not an IQueryable anymore) then you can simply do:
var coveragePercentage = CalculateCoveragePercentage().Compile().Invoke(result);
But that would kind of defeat the purpose of the expression - for this situation you should just add a method to your Result class that calculates the CoveragePercentage of a given instance.
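One way to keep a single definition of the formula for both cases is to store the expression in a static field and compile it once for in-memory use. The following is only a sketch: it assumes ExecutedCount and FailedCount are int properties, the names SuccessRateExpression and StatsDemo are made up for illustration, and a real EF query would pass the same expression to Average on a DbSet instead of an in-memory IQueryable.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public class Result
{
    public int ExecutedCount { get; set; }
    public int FailedCount { get; set; }

    // Single definition of the formula, kept as an expression tree
    // so that a query provider can inspect and translate it.
    public static readonly Expression<Func<Result, double>> SuccessRateExpression =
        r => r.ExecutedCount == 0
            ? 100
            : (r.ExecutedCount - r.FailedCount) * 100.0 / r.ExecutedCount;

    // Compiled once and reused for objects that are already in memory.
    private static readonly Func<Result, double> SuccessRateCompiled =
        SuccessRateExpression.Compile();

    public double CalculateSuccessRatePercentage()
    {
        return SuccessRateCompiled(this);
    }
}

public static class StatsDemo
{
    // Hypothetical helper: works for any IQueryable<Result>.
    public static int AverageSuccessRate(IQueryable<Result> results)
    {
        return (int)Math.Ceiling(results.Average(Result.SuccessRateExpression));
    }
}
```

With this shape, result.CalculateSuccessRatePercentage() covers the single-result case without calling Compile() on every use.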

Related

Linq: let statement as a constant?

I am working on a Linq query expression statement, and want to use a let statement to set a constant value as part of the query to reduce potential errors.
In my example (this is a total mock-up), the let value "validateCount" is what the question is about.
from referenceRow in context.SomeTable
// We want 8 in the Take and the results to still be 8 after the where.
// instead of typing in the value twice (in this example), i wanted to use a constant value here
let validateCount = 8
// sub query expression in my example
let subQuery = from sub in context.SomeTable
where sub.SomeColumn == referenceRow.SomeColumn
orderby sub.SomeColumn descending
select sub
// use the sub query, take X amount, apply filter, check to see if Count() is the same as Take()
where subQuery.Take(validateCount) // this is where we fail. if we put in the constant 8, we are ok.
.Where(x => x.SomeOtherColumn == "Some value")
.Count() == validateCount // this is fine
select referenceRow
Unfortunately it seems that the let expression "validateCount" which has a single value of 8 in this example, can only work in the comparison part of .Count() but cannot be passed into the .Take() without throwing an exception.
Limit must be a DbConstantExpression or a DbParameterReferenceExpression.
Parameter name: count
I'm looking for a way to define a constant in a single code location that can be used in the rest of the query expression, both in the .Take() and the .Count(), without having to update several spots in the code.
Our application allows users to supply their own linq query expression to build their queries. I can't define anything outside the scope of this query, and must be within, using something like a 'let' statement.
The let clause generates an intermediate anonymous type projection (a Select call) in the query expression tree. The EF query provider (as indicated by the exception message) requires Skip and Take arguments to resolve to constant or variable values (i.e. values that can be evaluated locally), hence let cannot be used for that purpose.
Instead, the constants/variables used in Skip / Take expressions should be defined outside of the query and used inside.
To define a constant value you would use:
const int validateCount = 8;
var query = (from .... Take(validateCount) ...);
To define a variable value (SQL query parameter):
int validateCount = 8;
var query = (from .... Take(validateCount) ...);
Here the C# compiler will capture validateCount in a closure, and the EF query provider will happily bind it as a parameter (with that value).
Regarding the requirement that "our application allows users to supply their own linq query expression to build their queries. I can't define anything outside the scope of this query, and must be within, using something like a 'let' statement":
When supplying their own queries, the users should follow the same Skip / Take argument rules as above, i.e. define their constants and variables outside of their queries.

Get First Single matched element or First if there's no match?

Is it possible in LINQ to write a nice one-liner that gets the first matching element or, if there's no match, the first element in the collection?
E.g. you have a collection of parrots and you want yellow parrot but if there's no yellow parrots - then any will do, something like this:
Parrots.MatchedOrFirst(x => x.Yellow == true)
I'm trying to avoid double-go to SQL Server and the ORM we use in this particular case is Dapper.
What about:
var matchedOrFirst = Parrots.FirstOrDefault(x => x.Yellow == true)
?? Parrots.FirstOrDefault();
Edit
For structs, this should work:
var matchedOrFirst = Parrots.Any(x => x.Yellow == true)
? Parrots.First(x => x.Yellow == true)
: Parrots.FirstOrDefault();
Edit: It was a linq to SQL solution
First building a handy extension
public static T MatchedOrFirstOrDefault<T>(this IQueryable<T> collection, System.Linq.Expressions.Expression<Func<T, Boolean>> predicate)
{
return (from item in collection.Where(predicate) select item)
.Concat((from item in collection select item).Take(1))
.ToList() // Convert to query result
.FirstOrDefault();
}
Using the code
var matchedOrFirst = Parrots.MatchedOrFirstOrDefault(x => x.Yellow);
If you want to avoid a 2nd SQL call: since this requires branching logic, it's unlikely that Dapper will know how to convert whatever LINQ query you come up with into the appropriate SQL IIF, CASE, or other SQL-specific functions you end up needing.
I recommend you write a simple stored procedure to do that and call it from Dapper.
Depending on its usage though, if this page only has one or two queries on it already, and is located reasonably close (latency-wise) to the server, a 2nd simple SELECT won't hurt the overall application that much, unless it runs in a loop or your example is trivial compared to the actual query in terms of the cost of the first SELECT.
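If the rows are already in memory (for example after a single query has returned all candidate parrots), the match-or-first choice can be made in one pass on the client. This is a sketch for IEnumerable<T> only; the class name EnumerableMatchExtensions is made up here, and it does not reduce what the database returns.

```csharp
using System;
using System.Collections.Generic;

public static class EnumerableMatchExtensions
{
    // Returns the first element matching the predicate; if nothing matches,
    // returns the first element of the sequence; if the sequence is empty,
    // returns default(T). The source is enumerated at most once.
    public static T MatchedOrFirstOrDefault<T>(
        this IEnumerable<T> source, Func<T, bool> predicate)
    {
        bool hasFirst = false;
        T first = default(T);

        foreach (T item in source)
        {
            if (predicate(item))
            {
                return item; // a match wins immediately
            }
            if (!hasFirst)
            {
                first = item; // remember the fallback
                hasFirst = true;
            }
        }
        return first;
    }
}
```

Because it never compares against null, this version behaves the same for structs and classes.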

Linq Sum() precision

In my project I have been using Linq's Sum() a lot. It's powered by NHibernate on MySQL. In my Session Factory I have explicitly asked NHibernate to deal with exactly 8 decimal places when it comes to decimals:
public class DecimalsConvention : IPropertyConvention
{
public void Apply(IPropertyInstance instance)
{
if (instance.Type.GetUnderlyingSystemType() == typeof(decimal))
{
instance.Scale(8);
instance.Precision(20);
}
}
}
However, I found out that .Sum() rounds the result to 5 decimal places:
var opasSum = opasForThisIp.Sum(x => x.Amount); // Amount is a decimal
In the above statement opasSum equals 2.46914 while it should be 2.46913578 (calculated directly on MySQL). opasForThisIp is of type IQueryable<OutgoingPaymentAssembly>.
I need all the Linq calculations to handle 8 decimal places when it comes to decimals.
Any ideas of how to fix this?
Edit 1: I have found var opasSum = Enumerable.Sum(opasForThisIp, opa => opa.Amount); to produce the correct result; however, the question remains: why does .Sum() round the result, and how can we fix it?
Edit 2: The produced SQL seems to be problematic:
select cast(sum(outgoingpa0_.Amount) as DECIMAL(19,5)) as col_0_0_
from `OutgoingPaymentAssembly` outgoingpa0_
where outgoingpa0_.IncomingPayment_id=?p0
and (outgoingpa0_.OutgoingPaymentTransaction_id is not null);
?p0 = 24 [Type: UInt64 (0)]
Edit 3: var opasSum = opasForThisIp.ToList().Sum(x => x.Amount); also produces the correct result.
Edit 4: Converting the IQueryable<OutgoingPaymentAssembly> to an IList<OutgoingPaymentAssembly> made the original query: var opasSum = opasForThisIp.Sum(x => x.Amount); to work.
x.Amount is being converted to a lower-precision type by the LINQ provider's translation to SQL, because your collection is an IQueryable.
There are several workarounds, the easiest of which is to change the type of your collection to IList, or to call ToList() on it, forcing the query to run as LINQ-to-Objects.
var opasSum = opasForThisIp.ToList().Sum(x => x.Amount);
Note:
If you don't want to lose deferred execution by moving away from the IQueryable, you could try casting the Amount to a decimal inside of the linq query.
From MSDN decimal and numeric (Transact-SQL):
In Transact-SQL statements, a constant with a decimal point is
automatically converted into a numeric data value, using the minimum
precision and scale necessary. For example, the constant 12.345 is
converted into a numeric value with a precision of 5 and a scale of 3.
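The numbers in the question are consistent with the sum being cut to 5 decimal places by the cast to DECIMAL(19,5) in the generated SQL. The sketch below only reproduces that arithmetic in memory. The two amounts are invented so that they add up to the value from the question, and Math.Round stands in for the database cast (its midpoint rule may differ from the server's, which does not matter for this value).

```csharp
using System;
using System.Linq;

public static class SumPrecisionDemo
{
    // Summing decimals in memory (LINQ-to-Objects) keeps all 8 decimal places.
    public static decimal SumInMemory(decimal[] amounts)
    {
        return amounts.Sum();
    }

    // Stand-in for a cast to DECIMAL(19,5): keep only 5 decimal places.
    public static decimal AsScaleFive(decimal value)
    {
        return Math.Round(value, 5);
    }
}
```

For example, SumInMemory(new[] { 1.23456789m, 1.23456789m }) gives 2.46913578, and AsScaleFive of that value gives 2.46914, the figure reported for the IQueryable version of Sum().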
Edit (to include a great explanation of the different .NET collection types):
Taken from the answer to this SO question.
IQueryable is intended to allow a query provider (for example, an
ORM like LINQ to SQL or the Entity Framework) to use the expressions
contained in a query to translate the request into another format. In
other words, LINQ-to-SQL looks at the properties on the entities that
you're using along with the comparisons you're making and actually
creates a SQL statement to express (hopefully) an equivalent request.
IEnumerable is more generic than IQueryable (though all
instances of IQueryable implement IEnumerable) and only defines
a sequence. However, there are extension methods available within the
Enumerable class that define some query-type operators on that
interface and use ordinary code to evaluate these conditions.
List is just an output format, and while it implements
IEnumerable, is not directly related to querying.
In other words, when you're using IQueryable, you're defining an
expression that gets translated into something else. Even though
you're writing code, that code never gets executed, it only gets
inspected and turned into something else, like an actual SQL query.
Because of this, only certain things are valid within these
expressions. For instance, you cannot call an ordinary function that
you define from within these expressions, since LINQ-to-SQL doesn't
know how to turn your call into a SQL statement. Most of these
restrictions are only evaluated at runtime, unfortunately.
When you use IEnumerable for querying, you're using
LINQ-to-Objects, which means you are writing the actual code that is
used for evaluating your query or transforming the results, so there
are, in general, no restrictions on what you can do. You can call
other functions from within these expressions freely.

What are the mechanics of the expression tree limitation here?

The fact that I expected this to work and it didn't leads me to search for the piece of the picture that I don't see.
Imagine that query being remoted over to a database. How is the database engine supposed to reach over the internet as it is executing the query and tell the "count" variable on your machine to update itself? There is no standard mechanism for doing so, and therefore anything that would mutate a variable on the local machine cannot be put into an expression tree that would run on a remote machine.
More generally, queries that cause side effects when they are executed are very, very bad queries. Never, ever put a side effect in a query, even in the cases where doing so is legal. It can be very confusing. Remember, a query is not a for loop. A query results in an object that represents the query, not the results of the query. A query which has side effects when executed will execute those side effects twice when asked for its results twice, and execute them zero times when asked for the results zero times. A query where the first part mutates a variable will mutate that variable before the second clause executes, not during the execution of the second clause. As a result, many query clauses give totally bizarre results when they depend on side effect execution.
For more thoughts on this, see Bill Wagner's MSDN article on the subject:
http://msdn.microsoft.com/en-us/vcsharp/hh264182
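The "twice when asked twice, zero times when asked zero times" behavior is easy to observe with LINQ-to-Objects. A small sketch (the class and method names are made up for illustration):

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SideEffectDemo
{
    // Records the value of 'count' after defining the query,
    // after enumerating it once, and after enumerating it again.
    public static List<int> CountsAfterEachStep()
    {
        int count = 0;
        var observed = new List<int>();

        // Defining the query runs nothing: the lambda has not been called yet.
        IEnumerable<int> query = new[] { "a", "b", "c" }.Select(s => count++);
        observed.Add(count);

        // Each enumeration re-runs the lambda for every element.
        query.ToList();
        observed.Add(count);

        query.ToList();
        observed.Add(count);

        return observed;
    }
}
```

The recorded values are 0, 3 and 6: nothing happens until the query is enumerated, and every enumeration repeats the side effect.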
If you were writing a LINQ-to-Objects query you could do what you want by zipping:
var zipped = model.DirectTrackCampaigns.Top("10").Zip(Enumerable.Range(0, 10000), (first, second)=>new { first, second });
var res = zipped.Select(c => new Campaign { IsSelected = c.second % 2 == 0, Name = c.first.CampaignName });
That is, make a sequence of pairs of numbers and campaigns, then manipulate the pairs.
How you'd do that in LINQ-to-whatever-you're-using, I don't know.
When you pass anything to a method that expects an Expression, nothing is actually evaluated at that time. All it does is break the code apart and create an expression tree out of it.
Now, in .NET 4.0, a lot of things were added to the Expression API (including Expression.Increment and Expression.Assign, which would actually be what you're doing); however, the compiler (I think it's a limitation of the C# compiler) hasn't been updated yet to take advantage of the new 4.0 Expression stuff. Therefore, we're limited to method calls, and assignments will not work. Hypothetically, this could be supported in the future.
IsSelected = (count++) % 2 == 0
Pretty much means:
IsSelected = (count = count + 1) % 2 == 0
That's where the assignment is happening and the source of the error.
According to the different answers to this question, this should be possible with .NET 4.0 using expressions, though not in lambdas.
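To illustrate "using expressions, though not in lambdas": a tree containing an assignment can be built by hand with the Expression factory methods and compiled to a delegate. The sketch below mirrors (count++) % 2 == 0. The names are made up for illustration, and a tree like this is meant for in-process compilation; a query provider such as EF should not be expected to translate it.

```csharp
using System;
using System.Linq.Expressions;

public static class AssignExpressionDemo
{
    // Builds: start => { int count = start; return (count++) % 2 == 0; }
    public static Func<int, bool> BuildIncrementAndTest()
    {
        ParameterExpression start = Expression.Parameter(typeof(int), "start");
        ParameterExpression count = Expression.Variable(typeof(int), "count");

        BlockExpression body = Expression.Block(
            new[] { count },
            // count = start
            Expression.Assign(count, start),
            // (count++) % 2 == 0 ; PostIncrementAssign yields the old value
            Expression.Equal(
                Expression.Modulo(
                    Expression.PostIncrementAssign(count),
                    Expression.Constant(2)),
                Expression.Constant(0)));

        return Expression.Lambda<Func<int, bool>>(body, start).Compile();
    }
}
```

Writing the same body as a C# lambda assigned to an Expression<Func<int, bool>> is rejected by the compiler, which is the limitation described above.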

date difference in EF4

I need to get the difference between two dates, one from a table and one the current date, and the difference should be less than 90 days. I need to use this as a filter in the where clause of the LINQ query.
I tried doing this:
var list = from p in context.persons
where ((p.CreateDT).Subtract(DateTime.Now).Days < 90)
select p;
I get this exception:
LINQ to Entities does not recognize the method 'System.TimeSpan Subtract(System.DateTime)' method, and this method cannot be translated into a store expression.
I researched other articles but nothing helped. Any ideas?
The catch here is that EF can't translate all your fancy OO hoo-ha to plain old SQL. The trick is to flip it on its head:
var before = DateTime.Now.AddDays(-90);
var persons = context.Persons.Where(x => x.CreateDT > before);
EXPLANATION
Remember that everything in the WHERE bit of your LINQ statement must be translated from C# to SQL by the EF. It is very, very capable out of the box and handles most basic tasks, but it has no idea how to understand the most rudimentary method calls, such as DateTime.Subtract(). So the idea here is to let it do what it does best by precalculating a value and then passing that to the data tier.
The first line subtracts 90 days from the current time by adding negative 90 days. The second line passes it off to the database server.
The second line should translate to the SQL WHERE CreateDT > @BEFORETHIS
Update
It seems that EF doesn't support subtracting dates and returning a TimeSpan. Here's one way to solve the problem:
DateTime oldestDate = DateTime.Now.AddDays(-90);
var list = from p in context.persons
where p.CreateDT >= oldestDate
select p;
See this thread on Stackoverflow.
Try simply using (p.CreateDT - DateTime.Now).Days < 90 instead of calling DateTime.Subtract(). In some cases the operator overloads are implemented for Entity Framework even when the corresponding named methods are not.
If that doesn't work you could instead use ESQL or a stored procedure. As a final, dirty solution, you could call context.persons.ToList() and then call the DateTime.Subtract().
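The precomputed-cutoff approach from the answers above can be wrapped in a small helper that takes the current time as a parameter, which also makes it testable without a database. This is a sketch: Person is reduced to two assumed properties, and the names RecentPersons and CreatedWithin are made up for illustration.

```csharp
using System;
using System.Linq;

public class Person
{
    public string Name { get; set; }
    public DateTime CreateDT { get; set; }
}

public static class RecentPersons
{
    // Keeps persons created within the last 'days' days relative to 'now'.
    // The cutoff is computed locally, so the query only contains a plain
    // date comparison against a captured variable.
    public static IQueryable<Person> CreatedWithin(
        IQueryable<Person> persons, DateTime now, int days)
    {
        DateTime cutoff = now.AddDays(-days);
        return persons.Where(p => p.CreateDT > cutoff);
    }
}
```

A call such as RecentPersons.CreatedWithin(context.persons, DateTime.Now, 90) would then express the filter from the question.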
