LINQ GroupBy into a new object - c#

We are working on some LINQ stuff and are new to using the GroupBy extension.
I am editing this post to include my actual code. I tried to use a simplified example at first, but it seems that made things more confusing for those trying to help. Sorry for that.
NOTE We need to sum the Amount field below. We did not attempt that yet as we are just trying to figure out how to extract the list from the groupBy.
Here is my code:
myCSTotal2.AddRange(userTotals.Where(w => w.priceinfoId == priceinfoID).GroupBy(g => g.termLength, o => new Model.MyCSTotal2
{
PriceinfoID = o.priceinfoId,
BillcodeID = o.billcodeid,
JobTypeID = o.jobtypeID,
SaleTypeID = o.saletypeID,
RegratesID = o.regratesID,
NatAccPerc = o.natAcctPerc,
NatIgnInCommCalc = o.natIgnInCommCalc,
TermLength = (int)o.termLength,
Amount = o.RMR1YrTotal / 12,
RuleEvaluation = 0
}).Select(grp => grp.ToList()));
The error we get when trying to do this is:
Argument 1: cannot convert from
IEnumerable<List<MyCSTotal2>> to IEnumerable<MyCSTotal2>
EDIT: Thanks for the help. Here is what we ended up with:
myCSTotal2.AddRange(userTotals.Where(w => w.priceinfoId == priceinfoID)
.GroupBy(g => g.termLength)
.SelectMany(cl => cl.Select( o => new Model.MyCSTotal2
{
PriceinfoID = o.priceinfoId,
BillcodeID = o.billcodeid,
JobTypeID = o.jobtypeID,
SaleTypeID = o.saletypeID,
RegratesID = o.regratesID,
NatAccPerc = o.natAcctPerc,
NatIgnInCommCalc = o.natIgnInCommCalc,
TermLength = (int)o.termLength,
Amount = cl.Sum(m=>m.RMR1YrTotal / 12),
RuleEvaluation = 0
})));

In order to flatten the groups you need to use SelectMany extension method:
SelectMany(grp => grp.ToList())
But if that is your current query, you don't need to group at all; you only need to project your collection using Select:
myCSTotal2.AddRange(userTotals.Where(w => w.priceinfoId == priceinfoID)
.Select( o => new Model.MyCSTotal2
{
PriceinfoID = o.priceinfoId,
BillcodeID = o.billcodeid,
JobTypeID = o.jobtypeID,
SaleTypeID = o.saletypeID,
RegratesID = o.regratesID,
NatAccPerc = o.natAcctPerc,
NatIgnInCommCalc = o.natIgnInCommCalc,
TermLength = (int)o.termLength,
Amount = o.RMR1YrTotal / 12,
RuleEvaluation = 0
}));
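If the goal is one summed Amount per termLength, as the note in the question suggests, a grouped projection along these lines may be closer to what is wanted. This is only a minimal sketch: UserTotal and TermTotal are hypothetical stand-ins for the question's types, and only the fields needed for the sum are shown.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the question's source and target types.
public class UserTotal
{
    public int PriceinfoId { get; set; }
    public int TermLength { get; set; }
    public decimal RMR1YrTotal { get; set; }
}

public class TermTotal
{
    public int TermLength { get; set; }
    public decimal Amount { get; set; }
}

public static class GroupSumDemo
{
    // Produces one TermTotal per distinct term length, summing the monthly amounts.
    public static List<TermTotal> SumByTerm(IEnumerable<UserTotal> userTotals, int priceinfoId) =>
        userTotals
            .Where(u => u.PriceinfoId == priceinfoId)
            .GroupBy(u => u.TermLength)
            .Select(g => new TermTotal
            {
                TermLength = g.Key,
                Amount = g.Sum(u => u.RMR1YrTotal / 12)
            })
            .ToList();

    public static void Main()
    {
        var data = new[]
        {
            new UserTotal { PriceinfoId = 1, TermLength = 12, RMR1YrTotal = 120m },
            new UserTotal { PriceinfoId = 1, TermLength = 12, RMR1YrTotal = 240m },
            new UserTotal { PriceinfoId = 1, TermLength = 36, RMR1YrTotal = 360m },
            new UserTotal { PriceinfoId = 2, TermLength = 12, RMR1YrTotal = 600m },
        };

        foreach (var t in SumByTerm(data, 1))
            Console.WriteLine($"{t.TermLength}: {t.Amount}");
    }
}
```

Note the difference from the SelectMany version in the question's edit: that one returns one object per source row, each carrying its group's total, while this returns a single object per group.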

I see no reason to use GroupBy, as there are no aggregation functions involved. If you want the items to be distinct by termLength, write a DistinctBy extension method. You will get the desired collection this way:
public static IEnumerable<TSource> DistinctBy<TSource, TKey>
(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
HashSet<TKey> seenKeys = new HashSet<TKey>();
foreach (TSource element in source)
{
if (seenKeys.Add(keySelector(element)))
{
yield return element;
}
}
}
Then use the extension like this
var collection = userTotals
.Where(w => w.priceinfoId == priceinfoID)
.DistinctBy(g => g.termLength)
.Select(o => new Model.MyCSTotal2
{
PriceinfoID = o.priceinfoId,
BillcodeID = o.billcodeid,
JobTypeID = o.jobtypeID,
SaleTypeID = o.saletypeID,
RegratesID = o.regratesID,
NatAccPerc = o.natAcctPerc,
NatIgnInCommCalc = o.natIgnInCommCalc,
TermLength = (int)o.termLength,
Amount = o.RMR1YrTotal / 12,
RuleEvaluation = 0
});
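One thing worth checking before going this route: a DistinctBy of this shape keeps the first element seen for each key and drops the rest, so it does not sum anything. The sketch below shows that behaviour on made-up sample data. The method is renamed DistinctByKey here (a hypothetical name) because .NET 6 and later already ship Enumerable.DistinctBy.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctDemo
{
    // Same logic as the DistinctBy above, renamed to avoid clashing with
    // Enumerable.DistinctBy, which is built in from .NET 6 onward.
    public static IEnumerable<TSource> DistinctByKey<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        var seenKeys = new HashSet<TKey>();
        foreach (var element in source)
        {
            // HashSet.Add returns false when the key was already present.
            if (seenKeys.Add(keySelector(element)))
                yield return element;
        }
    }

    public static void Main()
    {
        // Made-up (term, amount) pairs; two rows share the term 12.
        var rows = new[]
        {
            (Term: 12, Amount: 10m),
            (Term: 12, Amount: 20m),
            (Term: 36, Amount: 30m),
        };

        // Only the first row for each term survives; the second term-12 row is dropped.
        foreach (var row in rows.DistinctByKey(r => r.Term))
            Console.WriteLine($"{row.Term}: {row.Amount}");
    }
}
```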

Related

Can I use SqlFunctions.DateDiff() in Entity Framework with a dynamic datePartArg?

The following code throws an EntityCommandCompilationException because of the commented line:
var datePartArg = "dd";
var minutesInStatePerSegment = await db.History_WorkPlaceStates
.Where(x => selector.StartTimeUtc <= x.Started && x.Ended < selector.EndTimeUtc)
.Select(x => new {
start = x.Started,
minutes = x.Minutes,
state = x.State,
})
.GroupBy(x => new {
//This causes an exception:
segment = SqlFunctions.DateDiff(datePartArg, selector.StartTimeUtc, x.start),
state = x.state,
})
.Select(x => new {
state = x.Key.state,
segment = x.Key.segment,
minutes = x.Sum(y => y.minutes),
}).ToListAsync();
This happens because DateDiff within SQL Server can only use a literal string for its first argument, and cannot use a variable. Entity Framework generates a variable within SQL, and so we get the exception.
Is there a way to get around this problem?
You can create the following extension method to get around this. Just substitute the GroupBy function with this extension method when using DateDiff:
public static IQueryable<IGrouping<TKey, TSource>> GroupByDateDiff<TSource, TKey>(this IQueryable<TSource> source, Expression<Func<TSource, TKey>> keySelector) {
var body = (NewExpression)keySelector.Body;
var transformedBodyArguments = body.Arguments.Select(arg => arg switch {
MethodCallExpression callNode
when callNode.Method.Name == "DateDiff" && callNode.Arguments[0].NodeType != ExpressionType.Constant
=> getTransformedDateDiffCall(callNode),
_ => arg,
}).ToArray();
var updatedExpr = keySelector.Update(body.Update(transformedBodyArguments), keySelector.Parameters);
return source.GroupBy(updatedExpr);
MethodCallExpression getTransformedDateDiffCall(MethodCallExpression dateDiffCallNode) {
var dateDiffFirstArg = dateDiffCallNode.Arguments[0];
if (dateDiffFirstArg.NodeType != ExpressionType.MemberAccess) {
throw new ArgumentException($"{nameof(GroupByDateDiff)} was unable to parse the datePartArg argument to the DateDiff function.");
}
var replacementExpression = Expression.Constant((string)GetMemberValue((MemberExpression)dateDiffFirstArg));
var alternativeArgs = dateDiffCallNode.Arguments.Skip(1).Prepend(replacementExpression).ToArray();
return dateDiffCallNode.Update(dateDiffCallNode.Object, alternativeArgs);
};
}
private static object GetMemberValue(MemberExpression member) {
var objectMember = Expression.Convert(member, typeof(object));
var getterLambda = Expression.Lambda<Func<object>>(objectMember);
var getter = getterLambda.Compile();
return getter();
}
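To see why the extension method checks NodeType, it can help to look at how a literal and a captured local show up in an expression tree. The following is a small sketch that runs without Entity Framework; Echo is a hypothetical stand-in for SqlFunctions.DateDiff, since only the shape of the call matters here.

```csharp
using System;
using System.Linq.Expressions;

public static class CapturedVariableDemo
{
    // Hypothetical stand-in for SqlFunctions.DateDiff; it is never translated to SQL here.
    public static string Echo(string datePart) => datePart;

    // Returns the node type of the first argument of the method call in the lambda body.
    public static ExpressionType FirstArgNodeType(Expression<Func<string>> lambda) =>
        ((MethodCallExpression)lambda.Body).Arguments[0].NodeType;

    public static void Main()
    {
        var datePartArg = "dd";

        // A string literal is stored in the tree as a ConstantExpression.
        Console.WriteLine(FirstArgNodeType(() => Echo("dd")));        // Constant

        // A captured local becomes a field access on a compiler-generated closure object.
        Console.WriteLine(FirstArgNodeType(() => Echo(datePartArg))); // MemberAccess
    }
}
```

GroupByDateDiff above evaluates that member access up front and swaps in a constant, so the query provider sees a literal instead of a parameter.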

Is there a way to simplify this with a loop or linq statement?

I'm trying to find out if there is a way to create a loop for my example code below
// the objects below create a list of decimals
var ema12 = calc.ListCalculationData.Select(i => (double)i.Ema12);
var ema26 = calc.ListCalculationData.Select(i => (double)i.Ema26);
var ema = calc.ListCalculationData.Select(i => (double)i.Ema);
var adl = calc.ListCalculationData.Select(i => (double)i.ADL);
var r1 = GoodnessOfFit.RSquared(ema12);
var r2 = GoodnessOfFit.RSquared(ema26);
var r3 = GoodnessOfFit.RSquared(ema);
var r4 = GoodnessOfFit.RSquared(adl);
I'm trying to get something similar to the below pseudo code. Please keep in mind that each var item is a list of decimals
foreach (var item in calc.ListCalculationData.AsEnumerable())
{
var item2 = calc.ListCalculationData.Select(i => (double)item);
var r1 = GoodnessOfFit.RSquared(item2);
}
More information:
ListCalculationData is a list of my custom class that I have added below. What I'm trying to do is cycle through each variable in that class and perform a select query to perform the goodness of fit rsquared calculation on the list of decimals that the select query returns so it simplifies my code and makes it similar to my pseudo code
public class CalculationData
{
public decimal Ema { get; set; }
public decimal Ema12 { get; set; }
public decimal Ema26 { get; set; }
public decimal ADL { get; set; }
}
Update: I tried this for a local function and it fails with ; expected and invalid {
double r(Func<CalculationData, double> f) =>
{ GoodnessOfFit.RSquared(calc.ListCalculationData.Select(f), vectorArray) };
Update 2: This is what my current code looks like, based on the recommendations. It doesn't work: the compiler says the name i doesn't exist in the current context at nameof(i.Ema12), and much of it is still pseudocode.
MultipleRegressionInfo rn(Func<CalculationData, double> f, string name, int days)
{
MultipleRegressionInfo mrInfo = new MultipleRegressionInfo
{
RSquaredValue = GoodnessOfFit.RSquared(calc.ListCalculationData.Select(f), vectorArray),
ListValues = (List<double>)calc.ListCalculationData.Select(f).ToList(),
ValueName = name,
Days = days
};
listMRInfo.Add(mrInfo);
return mrInfo;
};
MultipleRegressionInfo rnList(Func<CalculationData, List<decimal>> f, string name, int days)
{
MultipleRegressionInfo mrInfo = new MultipleRegressionInfo
{
RSquaredValue = GoodnessOfFit.RSquared(calc.ListCalculationData.Select(f), vectorArray),
ListValues = (List<double>)calc.ListCalculationData.Select(f).ToList(),
ValueName = name,
Days = days
};
listMRInfo.Add(mrInfo);
return mrInfo;
};
foreach (CalculationData calc in ListCalculationData)
{
foreach (object value in calc)
{
if (value == typeof(decimal))
{
MultipleRegressionInfo r1 = rn(i => (double)i.value, nameof(i.value), 100)
}
else if (value == typeof(List<decimal>)
{
MultipleRegressionInfo r1 = rnList(i => i.value, nameof(i.value), 100)
}
}
}
You can either express each individual field as a lambda that retrieves a particular field value (I think this is better) or as a string or PropertyInfo value that uses reflection to achieve the same thing.
var getters = new Func<CalculationData, double>[] {
(i) => (double)i.Ema12,
(i) => (double)i.Ema26,
(i) => (double)i.Ema,
(i) => (double)i.ADL,
};
Then it's just a matter of getting each individual IEnumerable<double> sequence and calculating its RSquared value.
var dataseries = getters.Select((func) => calc.ListCalculationData.Select(func));
double[] results = dataseries.Select((data) => GoodnessOfFit.RSquared(data)).ToArray();
From comments:
This is similar to what I'm looking for but I have over 40 variables in my class and I added more information to try to explain what I'm trying to do but I'm trying to prevent the extra 40 lines of code to do something similar to your code
The following should do what you're asking, using reflection.
IEnumerable<Func<CalculationData, double>> getters =
typeof(CalculationData).GetProperties()
.Select<PropertyInfo, Func<CalculationData, double>>(
(PropertyInfo p) => (CalculationData x) => (double)(decimal)p.GetValue(x)
);
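The cast in the snippet above throws if the class ever gains a property that is not a decimal. A slightly more defensive sketch is shown below, assuming you also want the property names back. The Days property is a hypothetical non-decimal member added only to show the filter, and the RSquared call is left out so the example runs without the statistics library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public class CalculationData
{
    public decimal Ema { get; set; }
    public decimal Ema12 { get; set; }
    public int Days { get; set; } // hypothetical non-decimal property, skipped below
}

public static class ReflectionGettersDemo
{
    // Maps each decimal property name to the series of its values, converted to double.
    public static Dictionary<string, double[]> DecimalSeries(IEnumerable<CalculationData> rows)
    {
        var list = rows.ToList();
        return typeof(CalculationData)
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.PropertyType == typeof(decimal)) // ignore anything that is not a decimal
            .ToDictionary(
                p => p.Name,
                p => list.Select(r => (double)(decimal)p.GetValue(r)).ToArray());
    }

    public static void Main()
    {
        var rows = new[]
        {
            new CalculationData { Ema = 1.5m, Ema12 = 2.5m, Days = 10 },
            new CalculationData { Ema = 3.0m, Ema12 = 4.0m, Days = 20 },
        };

        foreach (var pair in DecimalSeries(rows))
            Console.WriteLine($"{pair.Key}: {pair.Value.Length} values");
    }
}
```

Each array in the dictionary can then be passed to GoodnessOfFit.RSquared in the same way as the hand-written getters.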
Edit: The question was edited again, and I'm no longer certain you need the indirection of the getters. See https://dotnetfiddle.net/Sb65DZ for a barebones example of how I'd write this code.
In C# 7.0 or later (Visual Studio 2017+) you can use local functions (not tested):
double r(Func<CalculationData, double> f) =>
GoodnessOfFit.RSquared(calc.ListCalculationData.Select(f));
double r1 = r(i => (double)i.Ema12), r2 = r(i => (double)i.Ema26),
r3 = r(i => (double)i.Ema) , r4 = r(i => (double)i.ADL);
or a bit less efficient lambda:
Func<Func<CalculationData, double>, double> r = f =>
GoodnessOfFit.RSquared(calc.ListCalculationData.Select(f));
double r1 = r(i => (double)i.Ema12), r2 = r(i => (double)i.Ema26),
r3 = r(i => (double)i.Ema) , r4 = r(i => (double)i.ADL);
Another alternative could be converting them to array:
Func<CalculationData, double>[] lambdas = { i => (double)i.Ema12, i => (double)i.Ema26,
i => (double)i.Ema, i => (double)i.ADL };
double[] r = Array.ConvertAll(lambdas, f =>
GoodnessOfFit.RSquared(calc.ListCalculationData.Select(f)));
To find the property with the max rsquared value using reflection, you can try this:
Tuple<double, string> maxR = typeof(CalculationData).GetProperties().Max(p => Tuple.Create(
GoodnessOfFit.RSquared(calc.ListCalculationData.Select(i => Convert.ToDouble(p.GetValue(i)))), p.Name));
double maxRvalue = maxR.Item1;
string maxRname = maxR.Item2;
You could use an extension method to collect a common sequence of operations together.
public static class CalculationDataExtensions
{
public static double CalcRSquared(
this IEnumerable<CalculationData> source,
Func<CalculationData, decimal> propertySelector)
{
IEnumerable<double> values = source
.Select(propertySelector)
.Select(x => (double)x);
return GoodnessOfFit.RSquared(values);
}
}
called by
var r1 = calc.ListCalculationData.CalcRSquared(x => x.Ema12);
var r2 = calc.ListCalculationData.CalcRSquared(x => x.Ema26);
var r3 = calc.ListCalculationData.CalcRSquared(x => x.Ema);
var r4 = calc.ListCalculationData.CalcRSquared(x => x.ADL);

How can I select the last digit of an integer in a LINQ .select?

I have this LINQ select:
var extendedPhrases = phrases
.Select(x => new ExtendedPhrase()
{
Ajlpt = x.Ajlpt,
Bjlpt = x.Bjlpt,
Created = x.Created // an int?
});
If I define:
public int? CreatedLast { get; set; }
Then how can I populate that with the last digit of x.Created?
If you are looking for the last digit of the Created property, then use the % operator like this:
var extendedPhrases = phrases
.Select(x => new ExtendedPhrase()
{
Ajlpt = x.Ajlpt,
Bjlpt = x.Bjlpt,
Created = x.Created,
CreatedLast = x.Created % 10
});
The first way to come to mind is to call .ToString().Last():
var extendedPhrases = phrases
.Select(x => new ExtendedPhrase()
{
Ajlpt = x.Ajlpt,
Bjlpt = x.Bjlpt,
Created = x.Created,
CreatedLast = x.Created?.ToString().Last()
});
If you aren't using the latest shiny C#, then null protection can be done with:
var extendedPhrases = phrases
.Select(x => new ExtendedPhrase()
{
Ajlpt = x.Ajlpt,
Bjlpt = x.Bjlpt,
Created = x.Created,
CreatedLast = x.Created.HasValue ? x.Created.ToString().Last() : (char?)null
});
Last() returns a char, so some conversion back to an int? is left as an exercise for the reader.
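For what it's worth, here is a sketch of both routes as plain helper methods, each returning an int? as the question asks. The method names are made up for illustration. The Math.Abs call is an addition of mine: in C#, % keeps the sign of the left operand, so a negative Created would otherwise give a negative "digit".

```csharp
using System;

public static class LastDigitDemo
{
    // Arithmetic route: null stays null; Math.Abs keeps the digit non-negative.
    public static int? ViaModulo(int? created) =>
        created.HasValue ? Math.Abs(created.Value % 10) : (int?)null;

    // String route: take the last character and subtract '0' to get its numeric value.
    public static int? ViaString(int? created)
    {
        if (!created.HasValue) return null;
        string s = created.Value.ToString();
        return s[s.Length - 1] - '0';
    }

    public static void Main()
    {
        Console.WriteLine(ViaModulo(1234));
        Console.WriteLine(ViaString(1234));
        Console.WriteLine(ViaModulo(-57));
        Console.WriteLine(ViaModulo(null).HasValue);
    }
}
```

Inside the Select, that would look like CreatedLast = LastDigitDemo.ViaModulo(x.Created) for an in-memory sequence.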

Add query to linq var

I never used this before. I need to add another member to the query. How can I add "link" to "source"?
var titles = regexTitles.Matches(pageContent).Cast<Match>();
var dates = regexDate.Matches(pageContent).Cast<Match>();
var link = regexLink.Matches(pageContent).Cast<Match>();
var source = titles.Zip(dates, (t, d) => new { Title = t, Date = d });
foreach (var item in source)
{
var articleTitle = item.Title.Groups[1].Value;
var articleDate = item.Date.Groups[1].Value;
//var articleLink = item.Link.Groups[1].Value;
Console.WriteLine(articleTitle);
Console.WriteLine(articleDate);
//Console.WriteLine(articleLink);
Console.WriteLine("");
}
Console.ReadLine();
It sounds like you just need another call to Zip. The first call will pair up the titles and dates, and the second call will pair up the title/date pairs with the links:
var source = titles.Zip(dates, (t, d) => new { t, d })
.Zip(link, (td, l) => new { Title = td.t,
Date = td.d,
Link = l });
or (equivalently, just using projection initializers):
var source = titles.Zip(dates, (Title, Date) => new { Title, Date })
.Zip(link, (td, Link) => new { td.Title, td.Date, Link });
(I sometimes think it would be nice to have another couple of overloads for Zip to take three or four sequences... it wouldn't be too hard. Maybe I'll add those to MoreLINQ :)
You can easily use Zip once more as in the below code:
var source = titles.Zip(dates, (t, d) => new { Title = t, Date = d })
.Zip(link, (d, l) => new { Title = d.Title,
Date = d.Date,
Link= l });
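The chained Zip can be tried out on plain strings, without the regex matches. The sketch below uses made-up data and a hypothetical Combine helper. Keep in mind that Zip stops at the shortest sequence, so if one regex finds fewer matches than the others, the extra titles or dates are silently dropped.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ZipThreeDemo
{
    // Chains two Zip calls to combine three sequences position by position.
    public static List<(string Title, string Date, string Link)> Combine(
        IEnumerable<string> titles, IEnumerable<string> dates, IEnumerable<string> links) =>
        titles.Zip(dates, (t, d) => new { t, d })
              .Zip(links, (td, l) => (Title: td.t, Date: td.d, Link: l))
              .ToList();

    public static void Main()
    {
        var titles = new[] { "First", "Second" };
        var dates = new[] { "2020-01-01", "2020-01-02" };
        var links = new[] { "/a", "/b", "/c" }; // one extra link, ignored by Zip

        foreach (var item in Combine(titles, dates, links))
            Console.WriteLine($"{item.Title} | {item.Date} | {item.Link}");
    }
}
```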

use LINQ to find the product with the cheapest value?

I'm learning LINQ and I want to find the cheapest product from the following list:
List<Product> products = new List<Product> {
new Product {Name = "Kayak", Price = 275M, ID=1},
new Product {Name = "Lifejacket", Price = 48.95M, ID=2},
new Product {Name = "Soccer ball", Price = 19.50M, ID=3},
};
I have come up with the following but somehow it feels like it is not the best way to do it:
var cheapest = products.Find(p => p.Price == products.Min(m => m.Price));
Can you show me the right way to achieve this?
You should use MinBy:
public static TSource MinBy<TSource>(
this IEnumerable<TSource> source,
Func<TSource, IComparable> projectionToComparable
) {
using (var e = source.GetEnumerator()) {
if (!e.MoveNext()) {
throw new InvalidOperationException("Sequence is empty.");
}
TSource min = e.Current;
IComparable minProjection = projectionToComparable(e.Current);
while (e.MoveNext()) {
IComparable currentProjection = projectionToComparable(e.Current);
if (currentProjection.CompareTo(minProjection) < 0) {
min = e.Current;
minProjection = currentProjection;
}
}
return min;
}
}
Just add this as a method in a public static class (EnumerableExtensions?).
Now you can say
var cheapest = products.MinBy(x => x.Price);
Alternatively, assuming you're after the Product object and not just the Price value, you could simply order them and take the first result, like so:
var cheapestProduct = products.OrderBy(p => p.Price).FirstOrDefault();
var mostExpensiveProduct = products.OrderByDescending(p => p.Price).FirstOrDefault();
You need to order by the price first and then select the first.
if(products.Any()){
var productWithCheapestPrice = products.OrderBy(p => p.Price).First();
}
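If sorting the whole list just to take one element feels heavy, a single pass with Aggregate is another option. This is only a sketch, and CheapestOrNull is a made-up name; it returns null for an empty list rather than throwing. On .NET 6 and later, the built-in Enumerable.MinBy covers the same ground.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Product
{
    public string Name { get; set; } = "";
    public decimal Price { get; set; }
    public int ID { get; set; }
}

public static class CheapestDemo
{
    // Single pass, no sort: carries the cheapest product seen so far.
    // On a price tie the earlier product wins, because the comparison is strict.
    public static Product CheapestOrNull(IEnumerable<Product> products) =>
        products.Aggregate(
            (Product)null,
            (best, p) => best == null || p.Price < best.Price ? p : best);

    public static void Main()
    {
        var products = new List<Product>
        {
            new Product { Name = "Kayak", Price = 275M, ID = 1 },
            new Product { Name = "Lifejacket", Price = 48.95M, ID = 2 },
            new Product { Name = "Soccer ball", Price = 19.50M, ID = 3 },
        };

        var cheapest = CheapestOrNull(products);
        Console.WriteLine(cheapest.Name);
    }
}
```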
