Convert Sum to an Aggregate product expression - c#

I have this expression:
group i by i.ItemId into g
select new
{
Id = g.Key,
Score = g.Sum(i => i.Score)
}).ToDictionary(o => o.Id, o => o.Score);
and instead of g.Sum I'd like to get the mathematical product using Aggregate.
To make sure it worked the same as .Sum (but as a product), I tried to write an Aggregate call that would just return the sum:
Score = g.Aggregate(0.0, (sum, nextItem) => sum + nextItem.Score.Value)
However, this does not give the same result as using .Sum. Any ideas why?
nextItem.Score is of type double?.

public static class MyExtensions
{
public static double Product(this IEnumerable<double?> enumerable)
{
return enumerable
.Aggregate(1.0, (accumulator, current) => accumulator * current.Value);
}
}

The problem is that your example starts the multiplication with 0.0. Multiplying by zero yields zero, so the final result will always be zero.
The correct approach is to use the identity element of multiplication. Just as adding zero to a number leaves it unchanged, multiplying a number by 1 leaves it unchanged. Hence, the correct way to start a product aggregate is to seed it with 1.0.
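The effect of the seed can be checked with a minimal sketch. The class SeedDemo and its method names are made up for illustration. The sketch assumes that nulls should be skipped, which mirrors Enumerable.Sum on double? values; calling .Value on a null would throw an InvalidOperationException instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, not part of the original question or answer.
public static class SeedDemo
{
    // Sum written with Aggregate: the seed 0.0 is the identity for addition.
    // Nulls are skipped, which is what Enumerable.Sum does for double?.
    public static double SumViaAggregate(IEnumerable<double?> scores)
    {
        return scores.Aggregate(0.0, (sum, next) => next.HasValue ? sum + next.Value : sum);
    }

    // Product written with Aggregate: the seed 1.0 is the identity for multiplication.
    public static double ProductSkippingNulls(IEnumerable<double?> scores)
    {
        return scores.Aggregate(1.0, (product, next) => next.HasValue ? product * next.Value : product);
    }

    // Same product, but seeded with 0.0: every step multiplies by zero,
    // so the result is 0 regardless of the input.
    public static double ProductWithZeroSeed(IEnumerable<double?> scores)
    {
        return scores.Aggregate(0.0, (product, next) => next.HasValue ? product * next.Value : product);
    }
}
```

For the scores 2, null, 3, 4 the first method returns 9, the second returns 24, and the third returns 0.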

If you aren't sure about the initial value for your aggregate query and you don't actually need one (as in this example), I would recommend not using one at all.
You can use the Aggregate overload that doesn't take an initial value - http://msdn.microsoft.com/en-us/library/bb549218.aspx
Like this:
double product = sequence.Aggregate((acc, x) => acc * x);
which evaluates to ((item1 * item2) * item3) * ... * itemN,
instead of
double product = sequence.Aggregate(1.0, (acc, x) => acc * x);
which evaluates to (((1.0 * item1) * item2) * item3) * ... * itemN.
Edit:
There is one important difference, though. The former throws an InvalidOperationException when the input sequence is empty. The latter returns the seed value, here 1.0.
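The empty-sequence difference can be demonstrated with a short sketch. The class EmptySequenceDemo and its methods are illustrative names only; both calls use the standard Enumerable.Aggregate overloads.

```csharp
using System;
using System.Linq;

// Hypothetical demo class, not part of the original answer.
public static class EmptySequenceDemo
{
    // Seedless overload: the first element serves as the initial accumulator,
    // so an empty sequence has nothing to start from and the call throws.
    public static bool SeedlessThrowsOnEmpty()
    {
        try
        {
            Enumerable.Empty<double>().Aggregate((acc, x) => acc * x);
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    // Seeded overload: for an empty sequence the seed is returned unchanged.
    public static double SeededOnEmpty()
    {
        return Enumerable.Empty<double>().Aggregate(1.0, (acc, x) => acc * x);
    }
}
```

If an empty group is possible in your data, the seeded form is the safer choice.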

Related

How to get biggest element in HashSet of object by field?

Suppose I have the class
public class Point {
public float x, y, z;
}
And I've created this hashset:
HashSet<Point> H;
How can I get the element of H with the biggest z? It doesn't need necessarily to use Linq.
You can use Aggregate to mimic MaxBy functionality (note that you need to check if collection has any elements first):
var maxByZ = H.Aggregate((point, point1) => point.z > point1.z ? point : point1);
.NET 6 and later have a built-in MaxBy.
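The empty-collection check can also be folded into the Aggregate call by seeding it with null. This is only a sketch: Point is the class from the question, while PointExtensions and MaxByZOrNull are hypothetical names, and returning null for an empty set is an assumption about the desired behavior.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Point
{
    public float x, y, z;
}

// Hypothetical extension class, not part of the original answer.
public static class PointExtensions
{
    // Returns the point with the largest z, or null when the collection is empty.
    // The null seed means "no candidate yet", so no separate Any() check is needed.
    public static Point MaxByZOrNull(this IEnumerable<Point> points)
    {
        return points.Aggregate(
            (Point)null,
            (best, p) => (best == null || p.z > best.z) ? p : best);
    }
}
```

Usage would be var maxByZ = H.MaxByZOrNull(); followed by a null check.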
You could do this:
float maxZ = H.Max(point => point.z);
var maxPointByZ = H.Where(point => point.z == maxZ).FirstOrDefault();
This works by first retrieving the largest value of z in the set:
H.Max(point2 => point2.z) // Returns the largest value of z in the set
and then using a simple Where to get the element whose z is equal to that value. If several elements share that value, it returns the first one encountered, so you may want to sort the enumerable in advance.

How to convert from decimal to double in Linq to Entity

Suppose we have a table T with two columns, A and B, of types float and money respectively. I want to write a LINQ query like the following T-SQL statement:
Select A, B, A * B as C
From SomeTable
Where C < 1000
I tried casting as follows:
var list = (from row in model.Table
where ((decimal)row.A) * row.B < 1000
select new { A = row.A,
B = row.B ,
C = ((decimal)row.A) * row.B}
).ToList();
but it does not allow the cast operation. It throws an exception:
Casting to Decimal is not supported in Linq to Entity queries, because
the required precision and scale information cannot be inferred.
My question is how to convert double to decimal in LINQ? I don't want to fetch the data from the database.
Update:
I noticed that converting decimal to double works, but the reverse operation throws the exception. So,
why can't we convert double to decimal? Does SQL Server use the same mechanism in T-SQL too? Doesn't it affect precision?
The difference between a float (double) and a decimal is that a float is not decimal precise. If you give the float a value of 10.123, then internally it could have a value such as 10.1229999999999, which is very near to 10.123, but not exactly equal.
A decimal with a precision of x decimals will always be accurate until the x-th decimal.
The designer of your database thought that type A didn't need decimal accuracy (or he was just careless). It is not meaningful to give the result of a calculation more precision than the input parameters.
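The difference in decimal accuracy can be seen in plain C# without any database. The class PrecisionDemo below is an illustrative name; the values 0.1, 0.2 and 0.3 are chosen only because none of them has an exact binary representation.

```csharp
using System;

// Hypothetical demo class, not part of the original answer.
public static class PrecisionDemo
{
    // double is a binary floating-point type: 0.1 and 0.2 are stored as
    // nearby binary fractions, so their sum is not exactly the stored 0.3.
    public static bool DoubleSumIsExact()
    {
        double a = 0.1;
        double b = 0.2;
        return a + b == 0.3;
    }

    // decimal stores base-10 digits, so the same sum is exact.
    public static bool DecimalSumIsExact()
    {
        decimal a = 0.1m;
        decimal b = 0.2m;
        return a + b == 0.3m;
    }
}
```

The first method returns false and the second returns true.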
If you really need to convert your result into a decimal, calculate your formula as float / double, and cast to decimal after AsEnumerable:
(I'm not very familiar with your syntax, so I'll use the extension method syntax)
var list = model.Table.Where(row => row.A * (double)row.B < 1000)
.Select(row => new
{
A = row.A,
B = row.B,
})
.AsEnumerable()
.Select(row => new
{
A = row.A,
B = row.B,
C = (decimal)row.A * (decimal)row.B,
});
Meaning:
From my Table, take only the rows for which row.A * row.B < 1000.
From each selected row, select the values from columns A and B.
Transfer those two values to local memory (= AsEnumerable).
For every transferred row, create a new object with three properties:
A and B have the transferred values.
C gets the product of the decimal values of the transferred A and B.
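You can avoid AsEnumerable() by telling Entity Framework how many fractional digits you want. Here A is truncated to two fractional digits by scaling it to an integer before the cast to decimal:
var list = (from row in model.Table
where ((decimal)((int)(row.A * 100))) / 100 * row.B < 1000
select new { A = row.A,
B = row.B ,
C = ((decimal)((int)(row.A * 100))) / 100 * row.B}
).ToList();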

calculate sum of list properties excluding min and max value with linq

This is what I have so far:
decimal? total = list.Sum(item => item.Score);
What I would like to do is to exclude the min and max value in the list and then get the total value.
Is it possible to do all that in one linq statement?
list.OrderBy(item => item.Score)
.Skip(1)
.Reverse()
.Skip(1)
.Sum(item => item.Score);
You can try ordering the list first, then skip the first item (the minimum) and take all but the last (the maximum) from the rest:
decimal? total = list.OrderBy(x => x.Score)
.Skip(1)
.Take(list.Count - 2)
.Sum(x => x.Score);
This is not the nicest code imaginable, but it does have the following benefits:
It only enumerates the collection once (though it does get the first value three times).
It does not require much more memory than what is needed to hold the IEnumerator and two Tuple<int, int, long, long> objects (unlike approaches using OrderBy, ToList, sorting, etc.). This lets it work with arbitrarily large IEnumerable collections.
It is a single LINQ expression (which is what you wanted).
It handles the edge cases (values.Count() < 2) properly:
when there are no values, using Min() and Max() on an IEnumerable will throw an InvalidOperationException;
when there's one value, naive implementations will do something like Sum() - Min() - Max() on the IEnumerable, which returns the single value, negated.
I know you've already accepted an answer, but here it is: I'm using a single call to Enumerable.Aggregate.
public static long SumExcludingMinAndMax(IEnumerable<int> values)
{
// first parameter: seed (Tuple<running minimum, running maximum, count, running total>)
// second parameter: func to generate accumulate
// third parameter: func to select final result
var result = values.Aggregate(
Tuple.Create<int, int, long, long>(int.MaxValue, int.MinValue, 0, 0),
(accumulate, value) => Tuple.Create<int, int, long, long>(Math.Min(accumulate.Item1, value), Math.Max(accumulate.Item2, value), accumulate.Item3 + 1, accumulate.Item4 + value),
accumulate => accumulate.Item3 < 2 ? 0 : accumulate.Item4 - accumulate.Item1 - accumulate.Item2);
return result;
}
If you want to exclude all min- and max-values, pre-calculate both values and then use Enumerable.Where to exclude them:
decimal? min = list.Min(item => item.Score);
decimal? max = list.Max(item => item.Score);
decimal? total = list
.Where(item=> item.Score != min && item.Score != max)
.Sum(item => item.Score);
You should pre-process the list before summing, to exclude the min and max.
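The answers above differ in whether they drop one minimum and one maximum or every occurrence of them. A small sketch of both readings follows; the class TrimmedSum and its method names are hypothetical, plain decimal values stand in for item.Score, and returning 0 for inputs that are too short is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, not part of the original answers.
public static class TrimmedSum
{
    // Drops exactly one smallest and one largest value, then sums the rest.
    // With fewer than two values there is nothing left to sum, so return 0.
    public static decimal ExcludingOneMinAndMax(IReadOnlyCollection<decimal> scores)
    {
        if (scores.Count < 2) return 0m;
        return scores.Sum() - scores.Min() - scores.Max();
    }

    // Drops every occurrence of the smallest and the largest value.
    public static decimal ExcludingAllMinAndMax(IReadOnlyCollection<decimal> scores)
    {
        if (scores.Count == 0) return 0m;
        decimal min = scores.Min();
        decimal max = scores.Max();
        return scores.Where(s => s != min && s != max).Sum();
    }
}
```

For the scores 1, 1, 5, 9, 9 the first method returns 15 (one 1 and one 9 removed) and the second returns 5.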

Returning two values immediately surrounding a test value in an IEnumerable<float>

I have an IEnumerable<float> containing distinct values found in a three dimensional array.
Given a test value, I want to take two elements from my distinct IEnumerable, the closest value which is greater than or equal to the test value, and the closest value which is less than the test.
In other words, if my test value is 80.5, and my list contains:
1.0
1.65
2.345
99.439
Then I want an IEnumerable<float> or a Tuple<float,float> back which contains 2.345 and 99.439.
Is there a LINQ statement or combination of such which will do that? An approach without LINQ?
Without using LINQ, and assuming that the input collection contains only values > 0. There's no need to sort the collection first.
public Tuple<float, float> GetClosestValues(IEnumerable<float> values, float target)
{
float lower = 0;
float upper = Single.MaxValue;
foreach (var v in values)
{
if (v < target && v > lower) lower = v;
if (v >= target && v < upper) upper = v;
}
return Tuple.Create(lower, upper);
}
In a tuple:
var t = Tuple.Create(list.Where(x => x <= value).Max(),
list.Where(x => x >= value).Min()
);
Although you don't state what the output should be if the value is in the list - in this case it would be a tuple with the same value for both "nodes"
Tuple.Create(
values.OrderBy(i => i)
.SkipWhile(i => i < test)
.FirstOrDefault(),
values.OrderByDescending(i => i)
.SkipWhile(i => i >= test)
.FirstOrDefault());
Sort (ascending), skip all values less than test, take the first value greater than or equal to test.
Sort (descending), skip all values greater than or equal to test, take the first value less than test.
double[] data = { .1,5.34,3.0,5.6 };
double test = 4.0;
var result = data.Aggregate(Tuple.Create(double.MinValue, double.MaxValue),
(minMax, x) => Tuple.Create(
x < test && x > minMax.Item1 ? x : minMax.Item1,
x >= test && x < minMax.Item2 ? x : minMax.Item2));
Assuming your list is sorted (and a few other assumptions about the requirements, such as what happens if there is a direct hit):
float prev=0;
foreach(float item in YourIEnumerableFloatVar)
{
if (item > target)
{
return new Tuple<float, float> (prev, item);
}
prev = item;
}

Alternatives to nested Select in Linq

Working on a clustering project, I stumbled upon this, and I'm trying to figure out if there's a better solution than the one I've come up with.
PROBLEM: Given a List<Point> Points of points in R^n (you can think of every Point as a double array of dimension n), a double minDistance and a distance function Func<Point,Point,double> dist, write a LINQ expression that returns, for each point, the set of other points in the list that are closer to it than minDistance according to dist.
My solution is the following:
var lst = Points.Select(
x => Points.Where(z => dist(x, z) < minDistance)
.ToList() )
.ToList();
So, after noticing that
Using LINQ is probably not the best idea, because you get to calculate every distance twice
The problem doesn't have much practical use
My code, even if bad looking, works
I have the following questions:
Is it possible to translate my code into a query expression? And if so, how?
Is there a better way to solve this in dot notation?
The problem definition, that you want "for each point, the set of other points", makes it impossible to solve without the inner query; you could only disguise it in a clever manner. If you can change your data storage policy and don't have to stick to LINQ, there are, in general, many approaches to the Nearest Neighbour Search problem. You could, for example, keep the points sorted by their values on one axis, which can speed up the neighbour queries by eliminating some candidates early without a full distance calculation. Here is a paper with this approach: Flexible Metric Nearest Neighbor Classification.
Because Points is a List you can take advantage of the fact that you can access each item by its index. So you can avoid comparing each item twice with something like this:
var lst =
from i in Enumerable.Range(0, Points.Count)
from j in Enumerable.Range(i + 1, Points.Count - i - 1)
where dist(Points[i], Points[j]) < minDistance
select new
{
x = Points[i], y = Points[j]
};
This will return a set composed of all pairs of points within minDistance of each other, which is not exactly the result you wanted. If you want to turn it into some kind of Lookup so you can see which points are close to a given point, you can do this:
var lst =
(from i in Enumerable.Range(0, Points.Count)
from j in Enumerable.Range(i + 1, Points.Count - i - 1)
where dist(Points[i], Points[j]) < minDistance
select new { x = Points[i], y = Points[j] })
.SelectMany(pair => new[] { pair, new { x = pair.y, y = pair.x } })
.ToLookup(pair => pair.x, pair => pair.y);
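A self-contained variant of this lookup approach might look as follows. It is a sketch under assumptions: points are represented as double[] instead of the question's Point class, the lookup is keyed by list index rather than by point, and NeighbourDemo, Neighbours and Euclidean are illustrative names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class, not part of the original answer.
public static class NeighbourDemo
{
    // Euclidean distance; stands in for the question's dist function.
    public static double Euclidean(double[] a, double[] b)
    {
        return Math.Sqrt(a.Zip(b, (p, q) => (p - q) * (p - q)).Sum());
    }

    // Maps each point index to the indices of the points closer than minDistance.
    // dist is evaluated once per unordered pair (i < j); each hit is then
    // emitted in both directions so the lookup is symmetric.
    public static ILookup<int, int> Neighbours(
        IList<double[]> points,
        double minDistance,
        Func<double[], double[], double> dist)
    {
        return (from i in Enumerable.Range(0, points.Count)
                from j in Enumerable.Range(i + 1, points.Count - i - 1)
                where dist(points[i], points[j]) < minDistance
                from pair in new[] { new { From = i, To = j }, new { From = j, To = i } }
                select pair)
               .ToLookup(pair => pair.From, pair => pair.To);
    }
}
```

Indexing the lookup with a point that has no neighbours yields an empty sequence, so callers do not need a separate existence check.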
I think you could add a bool property to your Point class to mark that it has already been browsed, to avoid calling dist twice for the same pair, something like this:
public class Point {
//....
public bool IsBrowsed {get;set;}
}
var lst = Points.Select(
x => {
var list = Points.Where(z => !z.IsBrowsed && dist(x, z) < minDistance).ToList();
x.IsBrowsed = true;
return list;
})
.ToList();
