Conditional Groupby in LINQ - c#

I'm trying to write several very similar LINQ statements that differ slightly depending on some conditions. Right now I just repeat the entire statement with small amendments, but it should be possible to do this much more concisely. What I struggle with is making the GroupBy and the Select conditional within one statement. My long version is:
class data
{
public string year;
public string quarter;
public string month;
public string week;
public string tariff;
public double volume;
public double price;
}
class results
{
public string product_code;
public string tariff;
public double volume;
public double price;
}
class Program
{
public static List<results> aggregationfunction(List<data> inputdata, string tarifftype, string timecategory)
{
List<results> returndata = new List<results>();
if (tarifftype.Equals("daynight") & timecategory.Equals("yearly"))
{
returndata = inputdata.GroupBy(a => new { a.tariff, a.year })
.Select(g => new results { product_code = g.Select(a => a.year).First(), tariff = g.Select(a => a.tariff).First(), volume = g.Sum(a => a.volume), price = g.Average(a => a.price) })
.ToList();
}
else if (tarifftype.Equals("allday") & timecategory.Equals("yearly"))
{
returndata = inputdata.GroupBy(a => new { a.year })
.Select(g => new results { product_code = g.Select(a => a.year).First(), tariff = "allday", volume = g.Sum(a => a.volume), price = g.Average(a => a.price) })
.ToList();
}
else if (tarifftype.Equals("daynight") & timecategory.Equals("quarterly"))
{
returndata = inputdata.GroupBy(a => new { a.tariff, a.year, a.quarter })
.Select(g => new results { product_code = g.Select(a => a.year).First() + "_" + g.Select(a => a.quarter).First(), tariff = g.Select(a => a.tariff).First(), volume = g.Sum(a => a.volume), price = g.Average(a => a.price) })
.ToList();
}
else if (tarifftype.Equals("allday") & timecategory.Equals("quarterly"))
{
returndata = inputdata.GroupBy(a => new { a.year, a.quarter })
.Select(g => new results { product_code = g.Select(a => a.year).First() + "_" + g.Select(a => a.quarter).First(), tariff = "allday", volume = g.Sum(a => a.volume), price = g.Average(a => a.price) })
.ToList();
}
return returndata;
}
}
Any pointers would be appreciated. As you can see, the grouping and the assignment of tariff and product code differ, but that shouldn't mean I need to repeat it all, should it?

Shorter code:
return inputdata
.GroupBy(a => new
{
a.year,
quarter = timecategory == "quarterly" ? a.quarter : string.Empty,
tariff = tarifftype == "daynight" ? a.tariff : "allday"
})
.Select(g => new results
{
product_code = g.Key.year + (string.IsNullOrEmpty(g.Key.quarter) ? "" : "_" + g.Key.quarter),
tariff = g.Key.tariff,
volume = g.Sum(a => a.volume),
price = g.Average(a => a.price)
})
.ToList();

The IGrouping<TKey, data> items returned by inputdata.GroupBy (where data is the element type of inputdata) will always implement IEnumerable<data> regardless of the anonymous key type (which may be either {year} or {tariff, year} etc). So all the GroupBy values could be generalized to IEnumerable<IEnumerable<data>> and you could then break it down to two steps:
IEnumerable<IEnumerable<data>> groups;
if (condition) {
groups = inputdata.GroupBy(a => new { a.year });
} else if (condition) {
groups = inputdata.GroupBy(a => new { a.tariff, a.year });
} else {
groups = inputdata.GroupBy(a => new { a.year, a.quarter });
}
returndata = groups.Select(...)
The conditional select should be simpler to implement because you can just use conditional operators inline, or expand the selector function to a multi-line block, e.g.:
.Select(g => {
var product_code = ...;
if (condition) {
product_code += "_" + ...;
}
return new results { product_code = product_code, ... };
})
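The two steps above can be put together as a minimal sketch. The `Row` and `Result` records are hypothetical stand-ins for the question's `data` and `results` classes (records assume C# 9 or later), and the assignments to `groups` rely on `IEnumerable<T>` being covariant, so each differently keyed grouping converts to `IEnumerable<IEnumerable<Row>>`:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the question's data and results classes.
public record Row(string Year, string Quarter, string Tariff, double Volume, double Price);
public record Result(string ProductCode, string Tariff, double Volume, double Price);

public static class Aggregator
{
    public static List<Result> Aggregate(List<Row> input, string tariffType, string timeCategory)
    {
        bool byTariff = tariffType == "daynight";
        bool byQuarter = timeCategory == "quarterly";

        // Step 1: choose the grouping. Every branch has a different anonymous
        // key type, but each IGrouping<TKey, Row> is also an IEnumerable<Row>.
        IEnumerable<IEnumerable<Row>> groups;
        if (byTariff && byQuarter)
            groups = input.GroupBy(a => new { a.Tariff, a.Year, a.Quarter });
        else if (byTariff)
            groups = input.GroupBy(a => new { a.Tariff, a.Year });
        else if (byQuarter)
            groups = input.GroupBy(a => new { a.Year, a.Quarter });
        else
            groups = input.GroupBy(a => new { a.Year });

        // Step 2: one shared projection with the conditional parts inline.
        return groups.Select(g =>
        {
            Row first = g.First();
            string code = first.Year;
            if (byQuarter)
                code += "_" + first.Quarter;
            return new Result(code,
                              byTariff ? first.Tariff : "allday",
                              g.Sum(a => a.Volume),
                              g.Average(a => a.Price));
        }).ToList();
    }
}

public static class AggregatorDemo
{
    public static void Main()
    {
        var rows = new List<Row>
        {
            new Row("2020", "Q1", "day", 10, 2.0),
            new Row("2020", "Q1", "night", 20, 4.0),
            new Row("2020", "Q2", "day", 30, 6.0),
        };
        foreach (Result r in Aggregator.Aggregate(rows, "allday", "quarterly"))
            Console.WriteLine(r);
    }
}
```

Because the key type is erased in this version, the projection reads the key fields from the first element of each group rather than from `g.Key`.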

Related

Does not display data in linq C#

I have a LINQ query that totals the goods sold. The problem is that the names I pass (ProductName, CompanyName, CustomerName) are not populated in the result.
Maybe there is an error in the LINQ?
The intermediate anonymous objects have these fields, but after ToList() they are missing.
public async Task<IEnumerable<SalesReportItem>> GetReportData(DateTime dateStart, DateTime dateEnd)
{
dateStart = new DateTime(2000, 1, 1);
var context = await _contextFactory.CreateDbContextAsync();
var queryable = context.SalesTransactionRecords.Join(
context.Products,
salesTransactionRecords => salesTransactionRecords.ProductId,
products => products.Id,
(salesTransactionRecords, products) =>
new
{
salesTransactionRecords,
products
})
.Join(context.Companies,
combinedEntry => combinedEntry.salesTransactionRecords.CompanyId,
company => company.Id,
(combinedEntry, company) => new
{
combinedEntry,
company
})
.Join(context.VendorCustomers,
combinedEntryAgain => combinedEntryAgain.combinedEntry.salesTransactionRecords.CustomerId,
vendorCustomer => vendorCustomer.Id,
(combinedEntryAgain, vendorCustomer) => new
{
CompanyName = combinedEntryAgain.company.Name,
CustomerName = vendorCustomer.Name,
ProductId = combinedEntryAgain.combinedEntry.products.Id,
ProductName = combinedEntryAgain.combinedEntry.products.Name,
combinedEntryAgain.combinedEntry.salesTransactionRecords.MovementType,
combinedEntryAgain.combinedEntry.salesTransactionRecords.Period,
combinedEntryAgain.combinedEntry.salesTransactionRecords.Quantity,
combinedEntryAgain.combinedEntry.salesTransactionRecords.Amount,
}).Where(x => x.Period >= dateStart && x.Period <= dateEnd)
.GroupBy(combinedEntryAgain => new
{
combinedEntryAgain.ProductId,
combinedEntryAgain.ProductName,
combinedEntryAgain.CompanyName,
combinedEntryAgain.CustomerName,
}
).Select(x => new SalesReportItem
{
ProductId = x.Key.ProductId,
Quantity = x.Sum(a => a.Quantity),
Amount = x.Sum(x => (x.MovementType == TableMovementType.Income ? x.Amount : -(x.Amount)))
});
var items = await queryable.ToListAsync();
return _mapper.Map<IEnumerable<SalesReportItem>>(items);
}
My mistake was that I did not assign those fields in the Select; everything else was fine. The code above works once the Select sets them:
Select(x => new SalesReportItem
{
ProductId = x.Key.ProductId,
ProductName = x.Key.ProductName,
CompanyName = x.Key.CompanyName,
CustomerName = x.Key.CustomerName,
Quantity = x.Sum(r => r.MovementType == TableMovementType.Income ? r.Quantity : -r.Quantity),
Amount = x.Sum(r => r.MovementType == TableMovementType.Income ? r.Amount : -r.Amount)
});
Thanks for the help
Hans Kesting

Optimize linq query by storing value in select

I have a problem with a LINQ query. In the Select I fetch the same item twice, which makes execution much slower than I can afford. Is there a way to store the x.OrderByDescending(z => z.Date).FirstOrDefault() item inside the Select?
Execution time: 180 ms
var groups = dataContext.History
.GroupBy(a => new { a.BankName, a.AccountNo })
.Select(x => new HistoryReportItem
{
AccountNo = x.FirstOrDefault().AccountNo,
BankName = x.FirstOrDefault().BankName,
IsActive = x.FirstOrDefault().IncludeInCheck,
})
.ToList();
Execution time: 1200 ms
var groups = dataContext.History
.GroupBy(a => new { a.BankName, a.AccountNo })
.Select(x => new HistoryReportItem
{
AccountNo = x.FirstOrDefault().AccountNo,
BankName = x.FirstOrDefault().BankName,
IsActive = x.FirstOrDefault().IncludeInCheck,
LastDate = x.OrderByDescending(z => z.Date).FirstOrDefault().Date,
})
.ToList();
Execution time: 2400 ms
var groups = dataContext.History
.GroupBy(a => new { a.BankName, a.AccountNo })
.Select(x => new HistoryReportItem
{
AccountNo = x.FirstOrDefault().AccountNo,
BankName = x.FirstOrDefault().BankName,
IsActive = x.FirstOrDefault().IncludeInCheck,
LastDate = x.OrderByDescending(z => z.Date).FirstOrDefault().Date,
DataItemsCount = x.OrderByDescending(z => z.Date).FirstOrDefault().CountItemsSend
})
.ToList();
You can try doing the select in two steps:
var groups = dataContext.History
.GroupBy(a => new { a.BankName, a.AccountNo })
.Select(x => new
{
first = x.FirstOrDefault(),
lastDate = x.OrderByDescending(z => z.Date).FirstOrDefault()
})
.Select(x => new HistoryReportItem
{
AccountNo = x.first.AccountNo,
BankName = x.first.BankName,
IsActive = x.first.IncludeInCheck,
LastDate = x.lastDate.Date,
DataItemsCount = x.lastDate.CountItemsSend
})
.ToList();
If this fails, it might be because the engine can't convert it completely to SQL, and you can try adding an AsEnumerable() between the two Selects.

optimize the comparison in two lists with LINQ

I have two lists of objects:
Customer and Employee
I need to check whether at least one customer has the same name as an employee.
Currently I have:
client.ForEach(a =>
{
if (employee.Any(m => m.Name == a.Name && m.FirstName == a.FirstName))
{
// OK TRUE
}
});
Can I make this more readable by doing it another way?
Why not check it beforehand using a join?
var mergedClients = Client.Join(listSFull,
x => new { x.Name, x.FirstName},
y => new { Name = y.Name, FirstName= y.FirstName},
(x, y) => new { x, y }).ToList();
and then iterate over the new collection:
mergedClients.ForEach(a =>
{
// your logic
});
Only disadvantage of this approach (if it bothers you) is that null values will not be included.
I would go with either Join
var isDuplicated = clients.Join(employees,
c => new { c.Name, c.FirstName },
e => new { e.Name, e.FirstName },
(c, e) => new { c, e })
.Any();
or Intersect
var clientNames = clients.Select(c => new { c.Name, c.FirstName });
var employeeNames = employees.Select(e => new { e.Name, e.FirstName });
var isDuplicated = clientNames.Intersect(employeeNames).Any();
Both Join and Intersect use hashing and are close to O(n).
Note: equality (and hash code) of anonymous objects (new { , }) is evaluated as for a value type, i.e. two anonymous objects are equal (and therefore have the same hash code) when all their properties are equal.
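The equality rule for anonymous objects can be checked with a small sketch. The class and method names here are made up for illustration, and tuples stand in for the Customer and Employee types:

```csharp
using System;
using System.Linq;

public static class AnonymousEqualityDemo
{
    // Two separately created anonymous objects with the same property
    // names, types, and values compare equal via Equals and GetHashCode,
    // even though they are different instances.
    public static bool KeysEqual()
    {
        var a = new { Name = "Smith", FirstName = "Ann" };
        var b = new { Name = "Smith", FirstName = "Ann" };
        return a.Equals(b)
            && a.GetHashCode() == b.GetHashCode()
            && !ReferenceEquals(a, b);
    }

    // Intersect uses that value-based equality to find shared names.
    public static bool HasCommonName(
        (string Name, string FirstName)[] clients,
        (string Name, string FirstName)[] employees) =>
        clients.Select(c => new { c.Name, c.FirstName })
               .Intersect(employees.Select(e => new { e.Name, e.FirstName }))
               .Any();

    public static void Main()
    {
        var clients = new[] { ("Smith", "Ann"), ("Jones", "Bob") };
        var employees = new[] { ("Brown", "Cy"), ("Smith", "Ann") };
        Console.WriteLine(KeysEqual());
        Console.WriteLine(HasCommonName(clients, employees));
    }
}
```

Note that the `==` operator on anonymous objects still compares references, so the value semantics apply only to `Equals` and `GetHashCode`, which is what Join, Intersect, and GroupBy use.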
=== EDIT: Ok, I was interested myself (hope your question was about performance :P)
[TestMethod]
public void PerformanceTest()
{
var random = new Random();
var clients = Enumerable.Range(0, 10000)
.Select(_ => new Person { FirstName = $"{random.Next()}",
LastName = $"{random.Next()}" })
.ToList();
var employees = Enumerable.Range(0, 10000)
.Select(_ => new Person { FirstName = $"{random.Next()}",
LastName = $"{random.Next()}" })
.ToList();
var joinElapsedMs = MeasureAverageElapsedMs(() =>
{
var isDuplicated = clients.Join(employees,
c => new { c.FirstName, c.LastName },
e => new { e.FirstName, e.LastName },
(c, e) => new { c, e })
.Any();
});
var intersectElapsedMs = MeasureAverageElapsedMs(() =>
{
var clientNames = clients.Select(c => new { c.FirstName, c.LastName });
var employeeNames = employees.Select(e => new { e.FirstName, e.LastName });
var isDuplicated = clientNames.Intersect(employeeNames).Any();
});
var anyAnyElapsedMs = MeasureAverageElapsedMs(() =>
{
var isDuplicated = clients.Any(c => employees.Any(
e => c.FirstName == e.FirstName && c.LastName == e.LastName));
});
Console.WriteLine($"{nameof(joinElapsedMs)}: {joinElapsedMs}");
Console.WriteLine($"{nameof(intersectElapsedMs)}: {intersectElapsedMs}");
Console.WriteLine($"{nameof(anyAnyElapsedMs)}: {anyAnyElapsedMs}");
}
private static double MeasureAverageElapsedMs(Action action) =>
Enumerable.Range(0, 10).Select(_ => MeasureElapsedMs(action)).Average();
private static long MeasureElapsedMs(Action action)
{
var stopWatch = Stopwatch.StartNew();
action();
return stopWatch.ElapsedMilliseconds;
}
public class Person
{
public string FirstName { get; set; }
public string LastName { get; set; }
}
Output:
joinElapsedMs: 5.9
intersectElapsedMs: 3.5
anyAnyElapsedMs: 3185.8
Note: any-any is O(n^2) - (in worst case) every employee is iterated per each iterated client.

Linq group by with parent object

How do I group so that I don't lose the parent identifier?
I have the following:
var grouped = mymodel.GroupBy(l => new { l.AddressId })
.Select(g => new
{
AddressId = g.Key.AddressId,
Quotes = g.SelectMany(x => x.Quotes).ToList(),
}).ToList();
this returns
{ AddressId1, [Quote1, Quote2, Quote3...]}
{ AddressId2, [Quote12, Quote5, Quote8...]}
Now I would like to group these by Quote.Code and Quote.Currency, so that each address has one Quote object (that is, if all 4 quotes belonging to the address have the same Code and Currency). I would like the sum of Currency in that object.
This works, but I can't get how to add Address to this result:
var test = grouped.SelectMany(y => y.Quotes).GroupBy(x => new { x.Code, x.Currency }).Select(g => new
{
test = g.Key.ToString()
});
This gives a compile error whenever I try to add AddressId to the result:
var test1 = grouped.SelectMany(y => y.Quotes, (parent, child) => new { parent.AddressId, child }).GroupBy(x => new { x.Provider, x.Code, x.Currency, x.OriginalCurrency }).Select(g => new
{
test = g.Key.ToString(),
Sum = g.Sum(x => x.Price)
});
This gives a compiler error as well:
var test1 = grouped.Select(x => new { x.AddressId, x.Quotes.GroupBy(y => new { y.Provider, y.Code, y.Currency, y.OriginalCurrency }).Select(g => new
{
addr = x.AddressId,
test = g.Key.ToString(),
Sum = g.Sum(q => q.Price)
};
I would do that this way:
var grouped = mymodel.GroupBy(l => new { l.AddressId })
.Select(g => new
{
AddressId = g.Key.AddressId,
QuotesByCode = g.SelectMany(x => x.Quotes)
.GroupBy(x=>x.Code)
.Select(grp=>new
{
Code = grp.Key,
SumOfCurrency=grp.Sum(z=>z.Currency)
}).ToList(),
}).ToList();
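As a flat alternative, the parent identifier can be carried into the grouping key itself. This is a minimal sketch under assumptions: the `Quote`, `Address`, and `QuoteTotal` records are hypothetical shapes (C# 9 or later), and summing `Price` follows the asker's own attempts rather than anything confirmed in the thread:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model types for illustration only.
public record Quote(string Code, string Currency, decimal Price);
public record Address(int AddressId, List<Quote> Quotes);
public record QuoteTotal(int AddressId, string Code, string Currency, decimal Sum);

public static class QuoteGrouping
{
    public static List<QuoteTotal> TotalsPerAddress(IEnumerable<Address> model) =>
        model
            // Pair every quote with the id of the address that owns it.
            .SelectMany(a => a.Quotes, (a, q) => new { a.AddressId, Quote = q })
            // The quote's fields are reached through the Quote property here,
            // not directly on the flattened element.
            .GroupBy(x => new { x.AddressId, x.Quote.Code, x.Quote.Currency })
            .Select(g => new QuoteTotal(g.Key.AddressId, g.Key.Code, g.Key.Currency,
                                        g.Sum(x => x.Quote.Price)))
            .ToList();
}

public static class QuoteGroupingDemo
{
    public static void Main()
    {
        var model = new List<Address>
        {
            new Address(1, new List<Quote>
            {
                new Quote("A", "USD", 10m),
                new Quote("A", "USD", 5m),
                new Quote("B", "EUR", 7m),
            }),
            new Address(2, new List<Quote> { new Quote("A", "USD", 1m) }),
        };
        foreach (QuoteTotal t in QuoteGrouping.TotalsPerAddress(model))
            Console.WriteLine(t);
    }
}
```

The same access pattern may explain the compile error in the question's first attempt: after a SelectMany with a result selector, the grouping lambda sees the projected pair, so the quote's members have to be read through the child property.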

The LINQ expression node type 'ArrayIndex' is not supported in LINQ to Entities

var residenceRep =
ctx.ShiftEmployees
.Include(s => s.UserData.NAME)
.Include(s => s.ResidenceShift.shiftName)
.Join(ctx.calc,
sh => new { sh.empNum, sh.dayDate },
o => new { empNum = o.emp_num, dayDate = o.trans_date },
(sh, o) => new { sh, o })
.Where(s => s.sh.recordId == recordId && s.o.day_flag.Contains("R1"))
.OrderBy(r => r.sh.dayDate)
.Select(r => new
{
dayDate = r.sh.dayDate,
empNum = r.sh.empNum,
empName = r.sh.UserData.NAME,
shiftId = r.sh.shiftId,
shiftName = r.sh.ResidenceShift.shiftName,
recordId,
dayState = r.o.day_desc.Split('[', ']')[1]
}).ToList();
I get an exception:
The LINQ expression node type 'ArrayIndex' is not supported in LINQ to Entities
How could I find an alternative to Split('[', ']')[1] in this query?
You must materialize the query and do the split after loading the data:
var residenceRep =
ctx.ShiftEmployees
.Include(s => s.UserData.NAME)
.Include(s => s.ResidenceShift.shiftName)
.Join(ctx.calc,
sh => new { sh.empNum, sh.dayDate },
o => new { empNum = o.emp_num, dayDate = o.trans_date },
(sh, o) => new { sh, o })
.Where(s => s.sh.recordId == recordId && s.o.day_flag.Contains("R1"))
.OrderBy(r => r.sh.dayDate)
.Select(r => new
{
dayDate = r.sh.dayDate,
empNum = r.sh.empNum,
empName = r.sh.UserData.NAME,
shiftId = r.sh.shiftId,
shiftName = r.sh.ResidenceShift.shiftName,
recordId = r.sh.recordId,
dayState = r.o.day_desc,
})
.ToList() // the query is executed here and the data is loaded into memory
.Select(x=> {
var parts = x.dayState.Split('[', ']');
return new {
x.dayDate,
x.empNum,
x.empName,
x.shiftId,
x.shiftName,
x.recordId,
dayState = parts.Length > 1 ?parts[1]:"",
};
})
.ToList();
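The in-memory half of that approach can be sketched on its own. The helper name `SecondSegment` and the tuple row shape are assumptions for illustration; an in-memory sequence stands in for the rows already loaded from the database:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DayStateParser
{
    // Splits on '[' and ']' and returns the second segment, or "" when the
    // text is null, empty, or contains neither bracket.
    public static string SecondSegment(string text)
    {
        if (string.IsNullOrEmpty(text))
            return "";
        string[] parts = text.Split('[', ']');
        return parts.Length > 1 ? parts[1] : "";
    }

    // Runs on rows that are already in memory, so ordinary .NET string
    // methods and array indexing are allowed in the projection.
    public static List<(int EmpNum, string DayState)> Project(
        IEnumerable<(int EmpNum, string DayDesc)> loadedRows) =>
        loadedRows
            .Select(r => (r.EmpNum, DayState: SecondSegment(r.DayDesc)))
            .ToList();

    public static void Main()
    {
        var rows = new List<(int EmpNum, string DayDesc)>
        {
            (1, "Rest [R1] day"),
            (2, "no brackets"),
        };
        foreach (var row in Project(rows))
            Console.WriteLine($"{row.EmpNum}: '{row.DayState}'");
    }
}
```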
I had this issue, and the approach I chose was to fetch all the elements I wanted, save them into a List, and then filter the actual data on that list.
I know this is not the best answer, but it worked for me.
