Counting items and grouping them by levels - c#

I am using Entity Framework 6 and I have the following Linq Query:
IDictionary<BodyMassIndexLevel, Int32> bmistats =
context.Evaluations
// Get all evaluations where Height and Weight measures were done
.Where(x => x.Height != null && x.Weight != null)
// Select the date of the evaluation, the worker id and calculate BMI
.Select(x => new { Date = x.Date, Worker = x.Worker.Id, BMI = x.Weight.Value / Math.Pow(x.Height.Value / 100, 2) })
// Group by worker
.GroupBy(x => x.Worker)
// Get the most recent evaluation for each worker and so the most recent BMI
.Select(x => x.OrderByDescending(y => y.Date).Select(y => new { BMI = y.BMI }).FirstOrDefault())
// Cache the result in memory
.ToList()
// Count the number of BMIS in each level
.With(z =>
new Dictionary<BodyMassIndexLevel, Int32> {
{ BodyMassIndexLevel.SevereThinness, z.Count(w => w.BMI < 16) },
{ BodyMassIndexLevel.MildThinness, z.Count(w => w.BMI >= 16 && w.BMI < 17) },
{ BodyMassIndexLevel.ModerateThinness, z.Count(w => w.BMI >= 17 && w.BMI < 18.5) },
{ BodyMassIndexLevel.Normal, z.Count(w => w.BMI >= 18.5 && w.BMI < 25) },
{ BodyMassIndexLevel.PreObese, z.Count(w => w.BMI >= 25 && w.BMI < 30) },
{ BodyMassIndexLevel.ObeseClassI, z.Count(w => w.BMI >= 30 && w.BMI < 35) },
{ BodyMassIndexLevel.ObeseClassII, z.Count(w => w.BMI >= 35 && w.BMI < 40) },
{ BodyMassIndexLevel.ObeseClassIII, z.Count(w => w.BMI >= 40) }
}
);
I have two questions:
Is it possible to improve the performance of this query?
Can I move the Count part into the query so that ToList() is not needed?

For example:
Apply something like Truncate to the BMI after
// Select the date of the evaluation, the worker id and calculate BMI
Create a BMILevel table with columns (BMILevelName | BMIValue) containing rows like (BodyMassIndexLevel.ModerateThinness, 17), (BodyMassIndexLevel.PreObese, 25), (BodyMassIndexLevel.PreObese, 26), etc.
JOIN your select query with the BMILevel table on query.BMI = BMILevel.BMIValue, then GroupBy BMILevel.BMILevelName and finally Count each group.
Alternatively, you may define BMILevel with columns (BMILevelName | BMIValueBeginInterval | BMIValueEndInterval) containing rows like (BodyMassIndexLevel.ModerateThinness, 17, 18.5), (BodyMassIndexLevel.PreObese, 25, 30).
And thus perform
query JOIN BMILevel ON query.BMI BETWEEN BMILevel.BMIValueBeginInterval AND BMILevel.BMIValueEndInterval
Note that BETWEEN is inclusive at both ends, so a BMI exactly on a boundary would match two levels; use >= and < instead to mirror the ranges in the question.
I expect EF can translate '<', '&&', '>' within a .Where() (or Join) call properly.
UPDATE:
If you don't want to create another table, you can try creating an in-memory list of objects of a type like this:
class BMILevel {
public BMILevelEnum BMILevelName {get;set;}
public double BMILevelValueBeginInterval {get;set;}
public double BMILevelValueEndInterval {get;set;}
}
then create an in-memory collection:
var bmiLevels = new List<BMILevel> { new BMILevel {...}, ... }
and use it in the way I described above.
I don't know how good EF 6 is, but older versions were unable to handle operations with non-entities (they couldn't translate the expressions to proper SQL), which resulted in inefficient querying or errors.
The only way to make your query faster is to delegate it to SQL Server. You can use EF's abilities, which may require creating a new table. Another way is to use ADO.NET (SqlCommand, SqlConnection, etc.) and bypass EF.
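As a rough illustration of the interval-list idea above, the sketch below counts BMI values per level in memory with LINQ to Objects. The half-open intervals mirror the thresholds in the question; the BmiStats class, its method names, and the tuple layout are hypothetical, and the sketch says nothing about whether EF 6 can translate a similar join to SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum BodyMassIndexLevel
{
    SevereThinness, MildThinness, ModerateThinness, Normal,
    PreObese, ObeseClassI, ObeseClassII, ObeseClassIII
}

// Hypothetical helper, not part of the original question or answer.
public static class BmiStats
{
    // Half-open intervals [Begin, End), taken from the thresholds in the question.
    private static readonly (BodyMassIndexLevel Level, double Begin, double End)[] Levels =
    {
        (BodyMassIndexLevel.SevereThinness, double.NegativeInfinity, 16),
        (BodyMassIndexLevel.MildThinness, 16, 17),
        (BodyMassIndexLevel.ModerateThinness, 17, 18.5),
        (BodyMassIndexLevel.Normal, 18.5, 25),
        (BodyMassIndexLevel.PreObese, 25, 30),
        (BodyMassIndexLevel.ObeseClassI, 30, 35),
        (BodyMassIndexLevel.ObeseClassII, 35, 40),
        (BodyMassIndexLevel.ObeseClassIII, 40, double.PositiveInfinity)
    };

    // Finds the interval containing the value; throws for NaN or infinite input.
    public static BodyMassIndexLevel Classify(double bmi) =>
        Levels.First(l => bmi >= l.Begin && bmi < l.End).Level;

    // Groups once, then fills in zero for levels that have no values.
    public static Dictionary<BodyMassIndexLevel, int> CountByLevel(IEnumerable<double> bmis)
    {
        var counts = bmis
            .GroupBy(b => Classify(b))
            .ToDictionary(g => g.Key, g => g.Count());
        return Levels.ToDictionary(
            l => l.Level,
            l => counts.TryGetValue(l.Level, out var n) ? n : 0);
    }
}
```

Grouping once replaces the eight separate Count passes over the cached list, although the list of latest BMI values per worker still has to be loaded into memory first.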

Related

Return a list of string using LINQ through a loop of conditionals

I have a class named Skill and I received a list of it through a parameter and I need to create a list of strings by LINQ that has some rules.
My Class
public class Skill {
public int id {get;set;}
public int year {get;set;}
public int xp {get;set;}
}
Dummy data:
var skills = new List<Skill>(){
new Skill() { id=1, year = 9, xp = 95 } ,
new Skill() { id=2, year = 5 } ,
};
Rules:
// year goes at max 10
// xp goes at max 100
The list of strings I must create looks like this:
for each year up to 10, combined with each xp up to 100 (if the skill has xp)
// '1-9-95'
// '1-9-96'
// '1-9-97'
// '1-9-98'
// '1-9-99'
// '1-9-100'
// '1-10-95'
// '1-10-96'
// '1-10-97'
// '1-10-98'
// '1-10-99'
// '1-10-100'
// '2-5'
// '2-6'
// '2-7'
// '2-8'
// '2-9'
// '2-10'
I got it working with a for statement, but I was wondering about using LINQ.
You need SelectMany and Enumerable.Range:
int maxYear = 10, maxXp = 100;
List<string> resultList = skills
.Where(skill => skill.year <= maxYear && skill.xp <= maxXp) // skip invalid
.SelectMany(skill => Enumerable.Range(skill.year, maxYear - skill.year + 1)
.SelectMany(y => Enumerable.Range(skill.xp, maxXp - skill.xp + 1)
.Select(xp => $"{skill.id}-{y}-{xp}")))
.ToList();
.NET Fiddle: https://dotnetfiddle.net/c80wJs
I think I overlooked the "(if has)" part, so you want to list xp only if it is available:
int maxYear = 10, maxXp = 100;
List<string> resultList = skills
.Where(skill => skill.year <= maxYear && skill.xp <= maxXp) // skip invalid
.SelectMany(skill => Enumerable.Range(skill.year, maxYear - skill.year + 1)
.SelectMany(y => Enumerable.Range(skill.xp, skill.xp == 0 ? 1 : maxXp - skill.xp + 1)
.Select(xp => skill.xp > 0 ? $"{skill.id}-{y}-{xp}" : $"{skill.id}-{y}")))
.ToList();
.NET-fiddle for this (thanks to Rand Random): https://dotnetfiddle.net/06BIqg

Pivot the table result using linq c#

I want to pivot this table using LINQ in C#.
My Table is here
Since the question does not say what to pivot by, I pivoted on counts over periods of 50. Change it to suit your needs. Check this fiddle.
var result = myList
.GroupBy(x => x.Branch)
.Select(y => new {
Branch = y.Key,
FirstPeriod = y.Count(z => z.Quantity > 100 && z.Quantity <= 150),
SecondPeriod = y.Count(z => z.Quantity > 150 && z.Quantity <= 200),
ThirdPeriod = y.Count(z => z.Quantity > 200 && z.Quantity <= 250)
}).ToList();
References:
Excellent Pivot Example
Method used in the fiddle

C# LINQ select until amount >= 0

This is example database table
Name | Quantanity
Book I | 1
Book II | 13
Book III | 5
etc...
And I want to select these rows until I have 100 books, using a LINQ expression.
I was trying
.TakeWhile(x => (amount -= x.Quantanity) > 0);
But it gave me an error
"Expression tree cannot contain assignment operator"
int bookCount = 0;
var query = books
.OrderBy(b => b.Quantanity) // smallest first, so the total can get close to 100; otherwise overshooting is likely
.AsEnumerable()
.Select(b => {
bookCount += b.Quantanity;
return new { Book = b, RunningCount = bookCount };
})
.TakeWhile(x => x.RunningCount <= 100)
.Select(x => x.Book);
Tim's solution is good, but note that only the part before AsEnumerable() is executed by the database server. Basically, you are pulling the entire table into memory and then processing it.
Let's see if we can improve that:
int bookCount = 0;
var query1 = (from b in books
where b.Quantanity > 0 && b.Quantanity <= 100
orderby b.Quantanity
select b).Take(100).AsEnumerable();
var query = query1
.Select(b => {
bookCount += b.Quantanity;
return new { Book = b, RunningCount = bookCount };
})
.TakeWhile(x => x.RunningCount <= 100)
.Select(x => x.Book);
Since every remaining row has a quantity of at least 1, at most 100 rows are needed to reach a count of 100, so no more than 100 records are held in memory.
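Both versions above update a captured variable inside Select, so enumerating the query a second time would continue from the previous total. A minimal sketch of one way around that, assuming the rows are already in memory, is to keep the running total inside an iterator method. The RunningTotal class and TakeWhileTotalAtMost are hypothetical names, not part of LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of LINQ or of the answers above.
public static class RunningTotal
{
    // Yields items while the running sum of selector(item) stays at or below limit.
    public static IEnumerable<T> TakeWhileTotalAtMost<T>(
        IEnumerable<T> source, Func<T, int> selector, int limit)
    {
        int total = 0; // local to each enumeration, so re-enumerating starts from zero
        foreach (var item in source)
        {
            total += selector(item);
            if (total > limit)
            {
                yield break; // the first item that would exceed the limit ends the sequence
            }
            yield return item;
        }
    }
}
```

It would be called on the in-memory part of the query, passing a lambda that returns the quantity of each row and 100 as the limit.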

Grouping data between ranges using LINQ in C#

I have written the following code to create ranges between two numbers, with the data separated into 7 columns:
private List<double> GetRangeForElements(double minPrice, double maxPrice)
{
double determineRange = Math.Round(maxPrice / 7.00d, 3);
var ranges = new List<double>();
ranges.Insert(0, Math.Round(minPrice, 3));
ranges.Insert(1, determineRange);
for (int i = 2; i < 8; i++)
{
ranges.Insert(i, Math.Round(determineRange * i, 3));
}
return ranges;
}
Now I have a list of ranges when I call the method:
var ranges = GetRangeForElements(1,1500);
On the other side, I have a list whose items have the following properties:
public double TransactionPrice {get;set;}
public int SaleAmount {get;set;}
Input data would be:
Transaction price | Sale amount
114.5 | 4
331.5 | 6
169.59 | 8
695.99 | 14
1222.56 | 5
The generated range between 1 and 1500 is:
1
214.28
428.57
642.85
857.14
1071.43
1285.71
1500.00
And the desired output would be:
Range | Sales
(1 - 214.28) | 12
(214.28 - 428.57) | 6
(428.57 - 642.85) | 0
(642.85 - 857.14) | 14
(857.14 - 1071.43) | 0
(1071.43 - 1285.71) | 5
(1285.71 - 1500) | 0
I've tried something like this:
var priceGroups = _groupedItems.GroupBy(x => ranges.FirstOrDefault(r => r > x.TransactionPrice))
.Select(g => new { Price = g.Key, Sales = g.Sum(x=>x.Sales) })
.ToList();
But this doesn't give me what I want; the results I receive are completely messed up (I verified the data and results manually)...
Can someone help me out?
P.S. guys, the ranges that have no sales should simply have value set to 0...
#blinkenknight here's a picture of what I mean: min price = 2.45, max price = 2.45,
and the output of the 2nd method you posted is:
Since GetRangeForElements returns a List<double>, you cannot group by it. However, you can group by range index, and then use that index to get the range back:
var rangePairs = ranges.Select((r,i) => new {Range = r, Index = i}).ToList();
var priceGroups = _groupedItems
.GroupBy(x => rangePairs.FirstOrDefault(r => r.Range >= x.TransactionPrice)?.Index ?? -1)
.Select(g => new { Price = g.Key >= 0 ? rangePairs[g.Key].Range : g.Max(x => x.TransactionPrice), Sales = g.Sum(x=>x.Sales) })
.ToList();
Assuming that _groupedItems is a list, you could also start with the ranges and produce the results directly:
var priceGroups = ranges.Select(r => new {
Price = r
, Sales = _groupedItems.Where(x=>ranges.FirstOrDefault(y=>y >= x.TransactionPrice) == r).Sum(x => x.Sales)
});
Note: Chances are good that your GetRangeForElements has an error: it assumes that minPrice is relatively small in comparison to maxPrice / 7.00d. To see the problem, consider what happens if you pass minPrice=630 and maxPrice=700: you get 630, 100, 200, 300, ... instead of 630, 640, 650, .... To fix this, compute (maxPrice - minPrice) / 7.00d and use it as a step starting at minPrice:
private List<double> GetRangeForElements(double minPrice, double maxPrice) {
double step = (maxPrice - minPrice) / 7.0;
return Enumerable.Range(0, 8).Select(i => minPrice + i*step).ToList();
}
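Putting the corrected edges together with a per-bucket sum, here is a minimal sketch that also reports 0 for ranges without sales. It assumes the item type exposes TransactionPrice and SaleAmount as declared in the question; the Sale and RangeBuckets names are hypothetical. Each bucket includes its upper edge, and the first bucket also includes the minimum price.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type using the property names declared in the question.
public class Sale
{
    public double TransactionPrice { get; set; }
    public int SaleAmount { get; set; }
}

public static class RangeBuckets
{
    // Returns buckets + 1 evenly spaced edges from minPrice to maxPrice.
    public static List<double> GetEdges(double minPrice, double maxPrice, int buckets)
    {
        double step = (maxPrice - minPrice) / buckets;
        return Enumerable.Range(0, buckets + 1)
            .Select(i => minPrice + i * step)
            .ToList();
    }

    // Sums SaleAmount per bucket; bucket i covers (edges[i], edges[i + 1]],
    // except that the first bucket also includes edges[0] itself.
    public static int[] SumPerBucket(IReadOnlyList<Sale> sales, IReadOnlyList<double> edges)
    {
        return Enumerable.Range(0, edges.Count - 1)
            .Select(i => sales
                .Where(s => (i == 0
                                ? s.TransactionPrice >= edges[i]
                                : s.TransactionPrice > edges[i])
                            && s.TransactionPrice <= edges[i + 1])
                .Sum(s => s.SaleAmount))
            .ToArray();
    }
}
```

With the sample data from the question and edges from GetEdges(1, 1500, 7), the sums come out as 12, 6, 0, 14, 0, 5, 0. When minPrice equals maxPrice, all edges coincide and every matching sale lands in the first bucket.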

Find distinct value of one field in a period

I have a table (detecc) with these fields:
uname string
door string
dt double (seconds since 1/1/1970)
I have this query that works well:
double dt1= SeconsdSince1970(DateTime.Now);
double dt0= dt1 - 3600;
var doorSearch = new string[] { "D1", "D2" };
System.Int32 cNow = (from d in detecc
where doorSearch.Contains(d.door) &&
(d.dt >= dt0 && d.dt <= dt1)
select d.uname).Distinct().Count();
But if I want to retrieve the users (uname), I get all records (duplicates):
double dt1= SeconsdSince1970(DateTime.Now);
double dt0= dt1 - 3600;
var doorSearch = new string[] { "D1", "D2" };
var lisUname = (from d in detecc
where doorSearch.Contains(d.door) &&
(d.dt >= dt0 && d.dt <= dt1)
select d.uname).Distinct();
How can I get distinct usernames?
If you are working with MongoDB collections, try this:
// .ToList() converts to a POCO list
var lisUname = (from d in detecc
where doorSearch.Contains(d.door) &&
(d.dt >= dt0 && d.dt <= dt1)
select d.uname).ToList();
// Distinct is now executed in the C# context rather than the MongoDB context
var distinctList = lisUname.Distinct();
A cleaner syntax
var list = detecc
.Where(d => doorSearch.Contains(d.door) && (d.dt >= dt0 && d.dt <= dt1))
.Select(x => x.uname)
.ToList();
Performance
Note: .Select will always end your query and pass all data (complete documents) to native code. So you get the complete data back from the server, and your code then selects the desired fields. If you want to pull only the requested fields, another MongoDB query approach is required.
Refer to Documentation on Linq driver
For better Distinct performance, use:
var list = detecc
.Where(d => doorSearch.Contains(d.door) && (d.dt >= dt0 && d.dt <= dt1))
.Distinct() // that way Distinct is executed on the server side
.ToList();
Why chaining .Count() with MongoDB does not work
.Count() is meant to be used separately, with its own parameters.
Refer to this article: Why the MongoDB Count Property Returns All Records
// Example
long userCount = db.GetCollection("detecc")
.Count(Query.EQ("uname", searchedUName));
A more performant approach with aggregation
For a more performant approach, use MongoDB aggregation, for example:
db.collection.aggregate([
{ "$match": { "$and": [ { "prop1": "" }, { "prop2": "" } ] } },
{ "$group": { "_id": "$messageId" } }
])
Please refer to this answer: MongoDb Distinct with query C# driver
