linq Average within a groupby - c#

In the sample below I would like to get the average of the Yes/No answers within a GroupBy. I am trying to get the count of Yes and the count of No and assign the result to PercentYes.
PercentYes = g.Average(g.Where(f => f.Answer == "Yes").Count(),g.Where(f => f.Answer == "No").Count())
https://dotnetfiddle.net/oyx6Ju
error message
Compilation error (line 41, col 20): No overload for method 'Average' takes 2 arguments

Average indeed does not take two arguments. It doesn't calculate an average of what you pass it, it calculates an average from what you call it on (in this case g), using an optional argument as a function to pull a specific value from each instance of the collection (each instance of g).
You may be over-thinking it. Not everything needs LINQ. You don't even want an average. You want a percent. Just divide the "Yes" count by the total. For example:
var yesCount = (decimal)g.Count(f => f.Answer == "Yes");
var noCount = (decimal)g.Count(f => f.Answer == "No");
var percentYes = yesCount / (yesCount + noCount);
(Note: I'm casting the "count" values to decimal to avoid integer division, which would otherwise truncate the result to 0 here unless every answer is "Yes".)
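As a rough sketch of how that division might sit inside the grouped query: the sample data and the Question field name below are assumptions made up for illustration, since the original fiddle is not reproduced here. Only Answer and PercentYes come from the question.

```csharp
using System;
using System.Linq;

// Hypothetical sample data; the Question field is an assumption.
var responses = new[]
{
    (Question: "Q1", Answer: "Yes"),
    (Question: "Q1", Answer: "No"),
    (Question: "Q1", Answer: "Yes"),
    (Question: "Q2", Answer: "No"),
};

var summary = responses
    .GroupBy(r => r.Question)
    .Select(g =>
    {
        // Store the counts as decimal so the division is not integer division.
        decimal yesCount = g.Count(f => f.Answer == "Yes");
        decimal noCount = g.Count(f => f.Answer == "No");
        decimal total = yesCount + noCount;
        return new
        {
            Question = g.Key,
            // Guard against a group that has neither answer.
            PercentYes = total == 0 ? 0m : yesCount / total
        };
    })
    .ToList();

foreach (var row in summary)
    Console.WriteLine($"{row.Question}: {row.PercentYes:P0}");
```

The guard for an empty total is an addition of this sketch; without it a group with no Yes and no No answers would throw a DivideByZeroException.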

Related

Converting a number with 000 and a -1 to its positive representation with only one 0

Read a list of non-negative integer values, sentinel -1 (i.e. end the
program and display the output), and print the list replacing each
sequence of zeros with a single zero.
Example input
100044022000301-1
Then the output will be:
10440220301
This is the last problem on my list and I don't have a clue how to solve it. I thought of removing the zeros and then adding a single 0 back afterwards.
feels bad
Something like this: use LINQ (to take the value before the sentinel -1) and a regular expression (to turn two or more consecutive 0s into a single 0):
given a list we can find out the last value before sentinel as
var value = list
.TakeWhile(item => item != sentinel)
.Last();
to turn two or more consecutive 0s into a single one we can use Regex:
string removed = Regex.Replace(value.ToString(), "0{2,}", "0");
Code:
// initial " list of non-negative integer values"
// I've declared it as long, since 100044022000301 > int.MaxValue
List<long> list = new List<long>() {
4555223,
123,
456,
100044022000301L, // we want this value (just before the sentinel)
-1L, // sentinel
789,
};
long result = long.Parse(Regex.Replace(list
.TakeWhile(item => item != -1) // up to sentinel
.Last() // last value up to sentinel
.ToString(),
"0{2,}", // change two or more consecutive 0
"0")); // into 0

What is best practice to use when checking if decimal matches a query

I have a for loop that loops a list of transactions, which all contains amount. If the amount is correct, I want that transaction to be included in a new list.
So in code:
decimal searchAmount = 33.03m;
foreach (var order in listOforders)
{
if(order.amount == searchAmount)
{
addOrderToList();
}
}
The currency used doesn't use more than two decimals, so that's okay. These three scenarios, should all add the order to the list.
order.Amount = 33.03, search.Amount = 33
order.Amount = 33.03, search.Amount = 33.03
order.Amount = 33.99, search.Amount = 33.9
Note:
This is a search. When the customer comes back, and says "I have a problem with the product I purchased, and it's not purchased on a registered customer", searching for the amount on the customers bank receipt is a great function. This is a retail brick and mortar store scenario, so some customers choose to not register themselves.
If you want to discard the fractional part completely, you can use a combination of LINQ and Math.Truncate:
var orders = listOfOrders.Where(o => Math.Truncate(o.amount) == Math.Truncate(searchAmount))
.ToList();
Math.Truncate returns the integral part of a given decimal, Where selects only appropriate orders (do read up on LINQ's deferred execution if you don't know it) and ToList materializes the query into a list.
EDIT: given your edit, this is probably what you're looking for:
var orders
= listOfOrders.Where(
o => Math.Truncate(o.amount) == Math.Truncate(searchAmount)
&& o.amount.ToString(CultureInfo.InvariantCulture)
.StartsWith(searchAmount.ToString(CultureInfo.InvariantCulture)))
.ToList();
This first verifies if the integral part of the numbers match and then uses string comparison to check if the actual amount starts with what was inputted (by your lazy user).
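A minimal sketch of that prefix check against the three scenarios from the question. MatchesSearch and the sample amounts are hypothetical names and values introduced only for this illustration.

```csharp
using System;
using System.Globalization;
using System.Linq;

// Hypothetical helper: true when the order amount "starts with" the typed search amount.
bool MatchesSearch(decimal orderAmount, decimal searchAmount)
{
    string order = orderAmount.ToString(CultureInfo.InvariantCulture);
    string search = searchAmount.ToString(CultureInfo.InvariantCulture);

    // The Truncate comparison stops "330.00" from matching a search for "33".
    return Math.Truncate(orderAmount) == Math.Truncate(searchAmount)
        && order.StartsWith(search, StringComparison.Ordinal);
}

var orderAmounts = new[] { 33.03m, 33.99m, 330.00m, 34.03m };

var matchesFor33 = orderAmounts.Where(a => MatchesSearch(a, 33m)).ToList();     // 33.03 and 33.99
var matchesFor339 = orderAmounts.Where(a => MatchesSearch(a, 33.9m)).ToList();  // 33.99 only

Console.WriteLine(string.Join(", ", matchesFor33));
Console.WriteLine(string.Join(", ", matchesFor339));
```

One caveat of this approach: decimal keeps trailing zeros, so a search amount of 33.0m is formatted as "33.0" and would no longer match 33.99. Keeping the search term as the string the user typed may be a better fit.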
Use your if condition like this. Round to 2 decimal places and compare the absolute difference.
if(Math.Abs(Math.Round(order.amount,2) - Math.Round(searchAmount,2)) <= 0.9M)
{
addOrderToList();
}
What if you use Math.Truncate(number)? Just like:
if(Math.Truncate(order.amount) == Math.Truncate(searchAmount))
If I am right, you need to define a maximum difference constant and use it something like this:
decimal maxDiff = 0.03m;
decimal searchAmount = 33.03m;
var result = listOfOrders.Where(o => Math.Abs(o.Amount - searchAmount) <= maxDiff);
You can use the Math.Floor method to compare only the integral part of each number:
decimal searchAmount = 33.03m;
var temp = Math.Floor(searchAmount);
foreach (var order in listOforders)
{
if(Math.Floor(order.amount) == temp)
{
addOrderToList();
}
}
There is no need for any call to Math, as a cast to int truncates the fractional part just as well. Your code could be changed to:
int searchAmount = 33;
listOforders.Where(o => (int)o.Amount == searchAmount)
.ToList()
.ForEach(o => addOrderToList());

How to filter or drop a value based on the previous one using Deedle in C#?

I am dealing with data from sensors. Sometimes these sensors have blackouts and brownouts, in consequence I can have the following kind of Time Series in a Frame, let's call it "myData":
[7.438984; 0.000002; 7.512345; 0.000000; 7.634912; 0.005123; 7.845627...]
Because I need only 3 decimals precision, I rounded the data from the frame:
var myRoundedData = myData.ColumnApply((Series<DateTime, double> numbers) => numbers.Select(kvp => Math.Round(kvp.Value, 3)));
I get the columns from the frame and filtered the Zeros "0.000":
var myFilteredTimeSeries = from kvp in myTimeSeries where kvp.Value != 0.000 select kvp;
So, my Time Series is partially filtered:
[7.439; 7.512; 7.635; 0.006; 7.846...]
However, the value "0.006" is not valid!
How could I implement an elegant filtering syntax based on the previous value, something like a "percent limit" in the rate of change:
if (0.006 / 7.635) * 100 < 0.1 then ---> drop / delete(0.006)
If you want to look just at the previous/next value, then you can shift the series by one and zip it with the original. This will give you a series of pairs (a value together with the previous/next value):
var r = actual.ZipInner(actual.Shift(1));
If you want to look at more elements around the specified one, then you'll need one of the windowing functions provided by Deedle:
Floating windows and chunking
The simplest example would be to use WindowInto to get a value together with 4 values before it:
var res = values.WindowInto(5, win =>
// 'win' is a series with the values - do something clever here!
);
One of the keys is to focus on methods that involve the value and its "neighbourhood", just as @tomaspetricek pointed out before (Thanks!).
My goal was to find a "free-of-noise" time stamp or keys to build a Frame and perform an AddColumn operation, which is by nature a JoinKind.Left operation.
To solve the problem I used the Pairwise() method to get focused on "Item1" (current value), and "Item2" (next value) as follows:
double filterSensibility = 5.0; // % percentage
var myBooleanFilteredTimeSeries = myTimeSeries.Pairwise().Select(kvp => (kvp.Value.Item2 / kvp.Value.Item1) * 100 < filterSensibility);
Here I could write the relation I wanted! (see question) Then based on the Time Series (example) posted before I got:
myBooleanFilteredTimeSeries = [FALSE; FALSE; FALSE; TRUE; FALSE...]
TRUE means that this value is noisy! So I get only the FALSE boolean values with:
var myDateKeysModel = from kvp in myBooleanFilteredTimeSeries where kvp.Value == false select kvp;
I created a frame from this last Time Series:
myCleanDateTimeKeysFrame = Frame.FromRecords(myDateKeysModel);
Finally, I add the original (noisy) Time Series to the previously created Frame:
myCleanDateTimeKeysFrame.AddColumn("Column Title", myOrginalTimeSeries);
...et voilà!
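For comparison, the same "percent of the previous value" rule can be sketched without Deedle, on a plain array. This is not the Deedle API and not the exact Pairwise logic above: it is a library-free illustration under the assumption that the first reading is valid, and minPercentOfPrevious is a hypothetical name for the threshold.

```csharp
using System;
using System.Collections.Generic;

double[] readings = { 7.439, 7.512, 7.635, 0.006, 7.846 };
double minPercentOfPrevious = 5.0; // hypothetical threshold, in percent

var cleaned = new List<double>();
foreach (double value in readings)
{
    // Compare against the last value that was kept, so a dropped
    // reading never becomes the reference for the next one.
    if (cleaned.Count == 0
        || value / cleaned[cleaned.Count - 1] * 100 >= minPercentOfPrevious)
    {
        cleaned.Add(value);
    }
}

Console.WriteLine(string.Join("; ", cleaned));
```

Comparing against the last kept value rather than the immediate predecessor is a design choice of this sketch: it also handles several noisy readings in a row.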

"value must be a number less than infinity" error when the variable is a integer

I'm not sure if I should ask this, but the details might help on finding out :p
I have a table like this
And I'm using compute to get the interval where a double variable stands in the Variavel column
double value = 6;
double max = Convert.ToDouble(DataAccess.Instance.tabela1vert0caso1W.Compute("MIN(Variavel)", "Variavel >= " + value.ToString(CultureInfo.InvariantCulture)));
double min = Convert.ToDouble(DataAccess.Instance.tabela1vert0caso1W.Compute("MAX(Variavel)", "Variavel <= " + value.ToString(CultureInfo.InvariantCulture)));
The problem here is that I get the infinity error on the double min line. However, it only happens when value is between 5 and 15; with any other value, the program works properly.
Any hint?
By the way I checked the value of value just before the lines, and it's still 6.
I'm not sure what the actual problem is, however, you could always use Linq-To-DataTable which is more powerful (supports the whole .NET framework) and also more readable:
var variavels = DataAccess.Instance.tabela1vert0caso1W.AsEnumerable()
.Select(row => row.Field<int>("Variavel"));
double max = variavels.Where(d => d >= value).Min(); // smallest value at or above
double min = variavels.Where(d => d <= value).Max(); // largest value at or below
It's possible for the Compute method to return DBNull.Value. It is unclear when, if at all, the MAX() function returns this, but possibly your filter matches 0 rows?
I suggest you add a check for DBNull.Value and set min to value when it occurs.
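A minimal sketch of such a check. The in-memory table and the ComputeOrNull helper are hypothetical stand-ins for illustration; only the Variavel column name comes from the question, and the column is assumed to hold doubles.

```csharp
using System;
using System.Data;
using System.Globalization;

// Hypothetical stand-in for the real table.
var table = new DataTable();
table.Columns.Add("Variavel", typeof(double));
foreach (double v in new[] { 5.0, 15.0, 25.0 })
    table.Rows.Add(v);

// Hypothetical helper: runs the aggregate and maps DBNull to null.
double? ComputeOrNull(string expression, string comparison, double value)
{
    string filter = "Variavel " + comparison + " "
        + value.ToString(CultureInfo.InvariantCulture);
    object result = table.Compute(expression, filter);

    // Compute yields DBNull.Value when no row passes the filter.
    return result is DBNull ? (double?)null : Convert.ToDouble(result);
}

double? upper = ComputeOrNull("MIN(Variavel)", ">=", 6); // 15
double? lower = ComputeOrNull("MAX(Variavel)", "<=", 6); // 5
double? none = ComputeOrNull("MAX(Variavel)", "<=", 1);  // null: nothing at or below 1

Console.WriteLine($"{lower} .. {upper}, below range: {(none.HasValue ? "found" : "no rows")}");
```

Returning a nullable value leaves it to the caller to decide what an empty interval means, for example falling back to value itself as suggested above.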

calculate sum of list properties excluding min and max value with linq

This is what I have so far:
decimal? total = list.Sum(item => item.Score);
What I would like to do is to exclude the min and max value in the list and then get the total value.
Is it possible to do all that in one linq statement?
list.OrderBy(item => item.Score)
.Skip(1)
.Reverse()
.Skip(1)
.Sum(item => item.Score);
You can try ordering the list first, then skip first item (minimum) and take all but the last (maximum) from the rest:
decimal? total = list.OrderBy(x => x.Score)
.Skip(1)
.Take(list.Count - 2)
.Sum(x => x.Score);
This is not the nicest code imaginable, but it does have these benefits:
It only enumerates the entire collection once (though it does get the first value three times).
It does not require much more memory than is needed to hold the IEnumerator and two Tuple<int, int, long, long> objects (a saving you would not get with OrderBy, ToList, sorting, etc.). This lets it work with arbitrarily large IEnumerable collections.
It is a single LINQ expression (which is what you wanted).
It handles the edge cases (values.Count() < 2) properly:
when there are no values, using Min() and Max() on an IEnumerable will throw an InvalidOperationException
when there is one value, naive implementations will do something like Sum() - Min() - Max() on the IEnumerable, which returns the single value, negated.
I know you've already accepted an answer, but here it is: I'm using a single call to Enumerable.Aggregate.
public static long SumExcludingMinAndMax(IEnumerable<int> values)
{
// first parameter: seed (Tuple<running minimum, running maximum, count, running total>)
// second parameter: func to generate accumulate
// third parameter: func to select final result
var result = values.Aggregate(
Tuple.Create<int, int, long, long>(int.MaxValue, int.MinValue, 0, 0),
(accumulate, value) => Tuple.Create<int, int, long, long>(Math.Min(accumulate.Item1, value), Math.Max(accumulate.Item2, value), accumulate.Item3 + 1, accumulate.Item4 + value),
accumulate => accumulate.Item3 < 2 ? 0 : accumulate.Item4 - accumulate.Item1 - accumulate.Item2);
return result;
}
If you want to exclude all min- and max-values, pre-calculate both values and then use Enumerable.Where to exclude them:
decimal? min = list.Min(item => item.Score);
decimal? max = list.Max(item => item.Score);
decimal? total = list
.Where(item=> item.Score != min && item.Score != max)
.Sum(item => item.Score);
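The two styles of answer differ once the minimum or maximum occurs more than once. A small sketch with made-up scores (a bare array of decimal? values stands in for the list of items) shows the gap:

```csharp
using System;
using System.Linq;

// Made-up scores with a duplicated minimum and maximum.
decimal?[] scores = { 1m, 1m, 5m, 9m, 9m };

// Drops exactly one lowest and one highest value: 1 + 5 + 9.
decimal? dropOneEach = scores
    .OrderBy(s => s)
    .Skip(1)
    .Take(scores.Length - 2)
    .Sum();

// Drops every occurrence of the minimum and maximum: only 5 remains.
decimal? min = scores.Min();
decimal? max = scores.Max();
decimal? dropAll = scores
    .Where(s => s != min && s != max)
    .Sum();

Console.WriteLine($"{dropOneEach} vs {dropAll}");
```

Which behaviour is right depends on the scoring rule, so it is worth deciding that before picking one of the queries.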
You should pre-process list before sum to exclude min and max.
