Use LINQ to load cumulative average into MVC model - c#

I'm writing a stats web site for a church softball team.
I have a SQL view that calculates the batting average per game from stats in a table. I use that view to build a model in MVC called PlayerStats.
I want to create and load data into a model that looks like this:
public class BattingAverage
{
[Key]
public int stat_id { get; set; }
public int player_id { get; set; }
public decimal batting_avg { get; set; }
public decimal cumulative_batting_avg { get; set; }
}
In my controller I am doing this:
var model = (from s in PlayerStats
where s.season_id == sid && s.player_id == pid
orderby s.game_no
select s).AsEnumerable()
.Select(b => new BattingAverage
{
stat_id = b.stat_id,
player_id = b.player_id,
batting_avg = b.batting_avg,
cumulative_batting_avg = ***[[ WHAT TO DO HERE?? ]]***
});
I don't know how to calculate that cumulative average to load into my model. The end goal is to Json.Encode the model data for use in AmCharts.
UPDATE - I tried this:
cumulative_batting_avg = getCumulativeAvg(sid, pid, b.game_no)
And my function:
public decimal getCumulativeAvg(int season_id, int player_id, int game_no)
{
var averages = PlayerStats.Where(g => g.season_id == season_id && g.player_id == player_id && g.game_no <= game_no).ToArray();
var hits = averages.Select(a => a.Hits).Sum();
var atbats = averages.Select(a => a.AtBats).Sum();
if (atbats == 0)
{
return 0.0m; // m suffix for decimal type
}
else
{
return hits / atbats;
}
}
This returned a correct average in the first row, but then zeroes for the rest. When I put a break on the return, I see that hits and atbats are properly accumulating inside the function, but for some reason the avg isn't being added to the model. What am I doing wrong?

Yeah, basically you want a subquery that pulls the average across all of the games; is my understanding correct? You can use the let keyword so that while the main query runs in the context of the current game, the let subquery can pull across all of the games, something like this: Simple Example Subquery Linq
Which might translate to something like:
from s in PlayerStats
let cs = context.PlayerStats.Where(i => i.player_id == s.player_id).Select(i => i.batting_avg).Average()
.
.
select new {
batting_avg = s.batting_avg, /* for the current game */
cumulative_batting_avg = cs
}
Something like that, though I might be off on the syntax a little. With that type of approach, I often worry about performance with LINQ subqueries (you never know what SQL it's going to render), so you may want to consider using a stored procedure (it really depends on how much data you have).

Now from the comments I think I understand what a cumulative batting average is. It sounds like for a given game, it's an average based on the sum of hits and at bats in that game and the previous games.
So let's say that you have a collection of stats that's already filtered by a player and season:
var playerSeasonStats = PlayerStats.Where(g =>
g.season_id == season_id && g.player_id == player_id)
.ToArray();
I originally wrote the next part as a LINQ expression. But that's just a habit, and in this case it's much simpler and easier to read with a normal foreach loop.
var averages = new List<BattingAverage>();
int cumulativeHits = 0;
int cumulativeAtBats = 0;
foreach (var stat in playerSeasonStats.OrderBy(s => s.game_no))
{
cumulativeHits += stat.Hits;
cumulativeAtBats += stat.AtBats;
var average = new BattingAverage
{
player_id = stat.player_id,
stat_id = stat.stat_id,
// Cast to decimal first; int / int would truncate to a whole number.
batting_avg = stat.AtBats == 0 ? 0m : (decimal)stat.Hits / stat.AtBats,
cumulative_batting_avg = cumulativeAtBats == 0 ? 0m : (decimal)cumulativeHits / cumulativeAtBats
};
averages.Add(average);
}
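The zeroes reported in the question are consistent with integer division, assuming Hits and AtBats are int columns (the post does not state their types). In C#, dividing one int by another truncates toward zero before the result is converted to decimal. A minimal sketch of the difference, using a hypothetical BattingMath helper that is not part of the original code:

```csharp
using System;

public static class BattingMath
{
    // int / int: the fractional part is discarded, so 3 / 10 evaluates to 0.
    public static int IntegerQuotient(int hits, int atBats)
    {
        return hits / atBats;
    }

    // Casting one operand to decimal forces decimal division.
    // Returns 0 when there are no at-bats, to avoid dividing by zero.
    public static decimal Average(int hits, int atBats)
    {
        if (atBats == 0)
        {
            return 0m;
        }
        return (decimal)hits / atBats;
    }
}
```

With 3 hits in 10 at-bats, IntegerQuotient gives 0 while Average gives 0.3, which matches the symptom of the sums accumulating correctly but the returned average collapsing to zero.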

Related

How to search date in a dynamic query using IQueryable in c#

I need to search a SQL Server database table. I am using IQueryable to build a dynamic query like the one below:
var searchTerm = "12/04";
var samuraisQueryable = _context.Samurais.Include(x => x.Quotes).AsQueryable();
samuraisQueryable = samuraisQueryable.Where(x => x.Name.Contains(searchTerm) ||
x.CreatedDate.HasValue && x.CreatedDate.Value.ToString()
.Contains(searchTerm));
var results = samuraisQueryable.ToList();
The above query is just an example, actual query in my code is different and more complicated.
Samurai.cs looks like
public class Samurai
{
public Samurai()
{
Quotes = new List<Quote>();
}
public int Id { get; set; }
public string Name { get; set; }
public DateTime? CreatedDate { get; set; }
public List<Quote> Quotes { get; set; }
}
The data in the table looks like
I don't see any results because the SQL translated from the above query converts the date to a different format (CONVERT(VARCHAR(100), [s].[CreatedDate])). I tried to specify the date format in the query, but then I get an error saying the LINQ expression cannot be translated.
samuraisQueryable = samuraisQueryable.Where(x => x.Name.Contains(searchTerm) ||
x.CreatedDate.HasValue && x.CreatedDate.Value.ToString("dd/MM/yyyy")
.Contains(searchTerm));
If (comments) users will want to search partially on dates, then honestly: the best thing to do is for your code to inspect their input, and parse that into a range query. So; if you see "12/04", you might parse that into a day (in the current year, applying your choice of dd/mm vs mm/dd), and then do a range query on >= that day and < the next day. Similarly, if you see "2021", your code should do the same, but as a range query. Trying to do a naïve partial match is not only computationally expensive (and hard to write as a query): it also isn't very useful to the user. Searching just on "2" for example - just isn't meaningful as a "contains" query.
Then what you have is:
(var startInc, var endExc) = ParseRange(searchTerm);
samuraisQueryable = samuraisQueryable.Where(
x => x.CreatedDate >= startInc && x.CreatedDate < endExc);
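ParseRange is not defined in the answer, so the following is only a sketch of what such a helper might look like. It assumes two input shapes ("dd/MM" and "yyyy"), assumes day-first ordering, and takes the current year as an explicit parameter so the result is deterministic; the class name DateSearch and the extra parameter are illustrative, not part of the original.

```csharp
using System;
using System.Globalization;

public static class DateSearch
{
    // Hypothetical helper: converts a partial date into a half-open range [StartInc, EndExc).
    public static (DateTime StartInc, DateTime EndExc) ParseRange(string term, int currentYear)
    {
        var culture = CultureInfo.InvariantCulture;

        // "2021" -> the whole calendar year.
        if (term.Length == 4
            && int.TryParse(term, NumberStyles.None, culture, out var year)
            && year >= 1 && year <= 9998)
        {
            var start = new DateTime(year, 1, 1);
            return (start, start.AddYears(1));
        }

        // "12/04" -> a single day in the supplied year (assumed day/month order).
        var withYear = term + "/" + currentYear.ToString("D4", culture);
        if (DateTime.TryParseExact(withYear, "dd/MM/yyyy", culture, DateTimeStyles.None, out var day))
        {
            return (day, day.AddDays(1));
        }

        throw new FormatException("Unrecognized date search term: " + term);
    }
}
```

Because the range is half-open, the comparison `>= StartInc && < EndExc` covers every time of day on the matched date without any string conversion on the database side.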

Linq ToString() how do I convert?

I know that Linq cannot handle ToString() and I've read a few workarounds. Most seem to do the casting outside of the Linq query, but this is for the output, where I am trying to shove it into a list, and this is where it is blowing up.
As the code will show below I did some casting elsewhere already in order to make it fit in Linq, but the very last part has the tostring and this I need to rewrite too but I'm not sure how.
DateTime edate = DateTime.Parse(end);
DateTime sdate = DateTime.Parse(start);
var reading = (from rainfall in db.trend_data
join mid in db.measurements on rainfall.measurement_id equals mid.measurement_id
join siteid in db.sites on mid.site_id equals siteid.site_id
where siteid.site_name == insite && rainfall.trend_data_time >= sdate && rainfall.trend_data_time <= edate
select new GaugeData() { SiteID = siteid.site_name, Data_Time = rainfall.trend_data_time, Trend_Data = float.Parse(rainfall.trend_data_avg.ToString()) }).ToList();
Linq will handle it, however Linq2Entities will not since EF will want to relay that expression to the DbProvider which doesn't understand/translate all .Net methods.
When it comes to extracting data from entities, the path of least pain is to have your entity definitions use the .Net types that match the SQL data types. Then, when you want to load that data into view models / DTOs where you might want to perform formatting or data type translation, let the ViewModel/DTO handle that, or pre-materialize your Linq2Entity query into an anonymous type list and then process the translations with Linq2Object.
Without knowing the data type of your TrendDataAvg, here is an example with a value stored as a decimal that you want to work with as a float:
Formatting in ViewModel example:
public class TrendData // Entity
{ // ...
public decimal trend_data_avg { get; set; }
// ...
}
public class GaugeData // ViewModel
{
public decimal trend_data_avg { get; set; } // Raw value.
public float Trend_Data // Formatted value.
{
get { return Convert.ToSingle(trend_data_avg); }
}
}
var reading = (from rainfall in db.trend_data
join mid in db.measurements on rainfall.measurement_id equals mid.measurement_id
join siteid in db.sites on mid.site_id equals siteid.site_id
where siteid.site_name == insite && rainfall.trend_data_time >= sdate && rainfall.trend_data_time <= edate
select new GaugeData() { SiteID = siteid.site_name, Data_Time = rainfall.trend_data_time, trend_data_avg = rainfall.trend_data_avg }).ToList();
Anonymous Types Example:
public class GaugeData // ViewModel
{
public float Trend_Data { get; set; }
}
var reading = (from rainfall in db.trend_data
join mid in db.measurements on rainfall.measurement_id equals mid.measurement_id
join siteid in db.sites on mid.site_id equals siteid.site_id
where siteid.site_name == insite && rainfall.trend_data_time >= sdate && rainfall.trend_data_time <= edate
select new
{
siteid.site_name,
rainfall.trend_data_time,
rainfall.trend_data_avg
}).ToList() // Materializes our Linq2Entity query to POCO anon type.
.Select(x => new GaugeData
{
SiteID = x.site_name,
Data_Time = x.trend_data_time,
Trend_Data = Convert.ToSingle(x.trend_data_avg)
}).ToList();
Note: If you use the Anonymous Type method and want to utilize paging, additional filtering, etc. then be sure to do it before the initial .ToList() call so that it is processed by the Linq2EF. Otherwise you would be fetching a much larger set of data from EF than is necessary with potential performance and resource utilization issues.
Additionally, if you set up your navigation properties in your entities you can avoid all of the explicit join syntax. EF is designed to do the lifting when it comes to the relational DB, not just an alternate syntax to T-SQL.
// Given trend data has a single measurement referencing a single site.
var gaugeData = db.trend_data
.Where(x => x.trend_data_time >= sdate
&& x.trend_data_time <= edate
&& x.measurement.site.site_name == insite)
.Select(x => new
{
x.measurement.site.site_name,
x.trend_data_time,
x.trend_data_avg
}).ToList()
.Select(x => new GaugeData
{
SiteID = x.site_name,
Data_Time = x.trend_data_time,
Trend_Data = Convert.ToSingle(x.trend_data_avg)
}).ToList();
You could use the Convert.ToSingle() method; float is an alias for System.Single.
Trend_Data = Convert.ToSingle(rainfall.trend_data_avg)

Filter c# list using timestamp, Take first record of each 5 seconds

I have a scenario where I need to filter records based on timings, taking the first record in each 5-second range.
Example :
Input data :
data timings
1452 10:00:11
1455 10:00:11
1252 10:00:13
1952 10:00:15
1454 10:00:17
1451 10:00:19
1425 10:00:20
1425 10:00:21
1459 10:00:23
1422 10:00:24
Expected output
1452 10:00:11
1454 10:00:17
1459 10:00:23
I have tried to group the data based on timings like below
listSpacialRecords=listSpacialRecords.GroupBy(x => x.timings).Select(x => x.FirstOrDefault()).ToList();
But using this I can only filter out records with the same time.
I hope someone can help me resolve this.
The list contains a huge amount of data, so is there any way other than looping through the list?
This works for me:
var results =
source
.Skip(1)
.Aggregate(
source.Take(1).ToList(),
(a, x) =>
{
if (x.timings.Subtract(a.Last().timings).TotalSeconds >= 5.0)
{
a.Add(x);
}
return a;
});
I get your desired output.
This should do (assuming listSpacialRecords is in order)
var result = new List<DateTime>();
var distance = TimeSpan.FromSeconds(5);
var pivot = default(DateTime);
foreach(var record in listSpacialRecords)
{
if(record.timings > pivot)
{
result.Add(record.timings); // yield return record.timings; as an alternative if you need deferred execution
pivot = record.timings + distance;
}
}
If not, the easiest but maybe not the most efficient way would be to change the foreach a little bit:
foreach(var record in listSpacialRecords.OrderBy(r => r.timings))
Doing this using only Linq is possible, but won't benefit readability.
Assuming your class looks something like this:
public class DataNode
{
public int Data { get; set; }
public TimeSpan Timings { get; set; }
}
I wrote an extension method:
public static IEnumerable<DataNode> TimeFilter(this IEnumerable<DataNode> list, int timeDifference )
{
DataNode LastFound = null;
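foreach (var item in list.OrderBy(p => p.Timings))
{
// The first item is always returned; later items must fall after the current window.
if (LastFound == null || item.Timings > LastFound.Timings.Add(TimeSpan.FromSeconds(timeDifference)))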
{
LastFound = item;
yield return item;
}
}
}
This can then be used like this
var list = new List<DataNode>();
var result = list.TimeFilter(5);
Something like this approach may work, using the % Operator (Modulo)
Assumptions
The list is in order
You don't care if it skips missing seconds
There is always a first element
And this is only within a 24 hour period
Note : Totally untested
var seconds = listSpacialRecords
.First() // get the first element
.Timmings
.TimeOfDay // convert it to TimeSpan
.TotalSeconds; // get the total seconds of the day
var result = listSpacialRecords
.Where(x => (x.Timmings
.TimeOfDay
.TotalSeconds - seconds) % 5 == 0)
// get the difference and mod 5
.ToList();

LINQ two datatables join

I am stuck with this problem.
I am able to solve this problem by using foreach but there has to be a better solution for this.
I have two datatables.
The first has a column named "Por".
The second has two columns named "FirstPor" and "LastPor".
My goal is to use LINQ to create a new datatable based on a condition like this:
foreach ( DataRow row in FirstDatatable.Rows )
{
foreach ( DataRow secondRow in SecondDatatable.Rows )
{
if ( row["Por"] >= secondRow["FirstPor"] && row["Por"] <= secondRow["LastPor"] )
FinalDataTable.Rows.Add(row);
}
}
I am new to LINQ, and this could be a problem for me. I am able to do that via Parallel.ForEach, but I think that LINQ could be much faster. The condition is simple: get each number from the first table ("Por" column) and check which row of the second table it belongs to ( "Por" >= "FirstPor" && "Por" <= "LastPor" ). I think it is simple for anybody who works with this every day.
Yep, there is another task. The columns are of type string, so a conversion is needed in the LINQ statement.
Yes, I have just modified my Parallel code to hybrid LINQ/Parallel and seems I am done. I used what James and Rahul wrote and put that to my code. Now, the process takes 52 seconds to estimate 421 000 rows :) It's much better.
public class Range
{
public int FirstPor { get; set; }
public int LastPor { get; set; }
public int PackageId { get; set; }
}
var ranges = (from r in tmpDataTable.AsEnumerable()
select new Range
{
FirstPor = Int32.Parse(r["FirstPor"] as string),
LastPor = Int32.Parse(r["LastPor"] as string),
PackageId = Int32.Parse(r["PackageId"] as string)
}).ToList();
Parallel.ForEach<DataRow>(dt.AsEnumerable(), row =>
{
int por = Int32.Parse(row["Por"].ToString());
lock (locker)
{
row["PackageId"] = ranges.First(range => por >= range.FirstPor && por <= range.LastPor).PackageId;
}
worker.ReportProgress(por);
});
(Untested code ahead.....)
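var newRows = FirstDatatable.AsEnumerable()
.Where(row => SecondDatatable.AsEnumerable()
.Any(secondRow => Int32.Parse((string)row["Por"]) >= Int32.Parse((string)secondRow["FirstPor"])
&& Int32.Parse((string)row["Por"]) <= Int32.Parse((string)secondRow["LastPor"])));
// A row that belongs to one table cannot be added to another, so import a copy
// (this assumes FinalDataTable has the same schema as FirstDatatable).
foreach (var row in newRows) FinalDataTable.ImportRow(row);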
However, if speed is your real concern, my first suggestion is to dump the datatables and use a list. I'm going to guess that SecondDatatable is largely fixed and probably changes less than once a day. So, let's create a nice in-memory structure for that:
class Range
{
public int FirstPor {get; set;}
public int LastPor {get; set;}
}
var ranges = (from r in SecondDatatable.AsEnumerable()
select new Range
{
FirstPor = Int32.Parse((string)r["FirstPor"]),
LastPor = Int32.Parse((string)r["LastPor"])
}).ToList();
Then our code becomes:
var newRows = FirstDatatable.AsEnumerable()
.Where(row => ranges
.Any(range => Int32.Parse((string)row["Por"]) >= range.FirstPor
&& Int32.Parse((string)row["Por"]) <= range.LastPor)).ToList();
Which by itself should make this considerably faster.
Now, on a success, it will scan the ranges up until it finds one that matches. On a failure, it will have to scan the whole list before it gives up. So, the first thing we need to do to speed this up is to sort the list of ranges. Then we only have to scan up to the point where the low end of the range is higher than the value we are looking for. That should cut the processing time for those rows outside any range roughly in half.
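The sorted scan described above can be sketched as follows. This is an illustration rather than the answerer's code: the names PorRange and RangeLookup are made up here, and it assumes the "Por" values have already been parsed to int.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class PorRange
{
    public int FirstPor { get; set; }
    public int LastPor { get; set; }
}

public static class RangeLookup
{
    // Sort once, up front, by the low end of each range.
    public static List<PorRange> SortByStart(IEnumerable<PorRange> ranges)
    {
        return ranges.OrderBy(r => r.FirstPor).ToList();
    }

    // Scans the sorted ranges and stops as soon as a range starts above the value,
    // because every later range starts higher still.
    public static bool IsInAnyRange(IReadOnlyList<PorRange> sortedRanges, int por)
    {
        foreach (var range in sortedRanges)
        {
            if (range.FirstPor > por)
            {
                return false; // early exit: no later range can contain por
            }
            if (por <= range.LastPor)
            {
                return true;
            }
        }
        return false;
    }
}
```

The early exit only helps values that fall outside every range; if the ranges are also known not to overlap, a binary search over the sorted list would be the natural next step.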
Try this:-
DataTable FinalDataTable = (from x in dt1.AsEnumerable()
from y in dt2.AsEnumerable()
where x.Field<int>("Por") >= y.Field<int>("FirstPor")
&& x.Field<int>("Por") <= y.Field<int>("LastPor")
select x).CopyToDataTable();
Here is the complete Working Fiddle, (I have tested with some sample data with your existing code and my LINQ code) you can copy paste the same in your editor and test because DotNet Fiddle is not supporting AsEnumerable.

Ravendb mapreduce grouping by multiple fields

We have a site that contains streaming video and we want to display three reports of most watched videos in the last week, month and year (a rolling window).
We store a document in ravendb each time a video is watched:
public class ViewedContent
{
public string Id { get; set; }
public int ProductId { get; set; }
public DateTime DateViewed { get; set; }
}
We're having trouble figuring out how to define the indexes / mapreduces that would best support generating those three reports.
We have tried the following map / reduce.
public class ViewedContentResult
{
public int ProductId { get; set; }
public DateTime DateViewed { get; set; }
public int Count { get; set; }
}
public class ViewedContentIndex :
AbstractIndexCreationTask<ViewedContent, ViewedContentResult>
{
public ViewedContentIndex()
{
Map = docs => from doc in docs
select new
{
doc.ProductId,
DateViewed = doc.DateViewed.Date,
Count = 1
};
Reduce = results => from result in results
group result by result.DateViewed
into agg
select new
{
ProductId = agg.Key,
Count = agg.Sum(x => x.Count)
};
}
}
But, this query throws an error:
var lastSevenDays = session.Query<ViewedContent, ViewedContentIndex>()
.Where( x => x.DateViewed > DateTime.UtcNow.Date.AddDays(-7) );
Error: "DateViewed is not indexed"
Ultimately, we want to query something like:
var lastSevenDays = session.Query<ViewedContent, ViewedContentIndex>()
.Where( x => x.DateViewed > DateTime.UtcNow.Date.AddDays(-7) )
.GroupBy( x => x.ProductId )
.OrderBy( x => x.Count )
This doesn't actually compile, because the OrderBy is wrong; Count is not a valid property here.
Any help here would be appreciated.
Each report is a different GROUP BY if you're in SQL land, which tells you that you need three indexes: one with entries by week, one by month, and one by year (or maybe slightly different, depending on how you're actually going to do the query).
Now, you have a DateTime there, and that presents some problems. What you actually want to do is index the Year component of the DateTime, the Month component, and the Day component (or just one or two of these, depending on which report you want to generate).
I'm only paraphrasing your code here so obviously it won't compile, but:
public class ViewedContentIndex :
AbstractIndexCreationTask<ViewedContent, ViewedContentResult>
{
public ViewedContentIndex()
{
Map = docs => from doc in docs
select new
{
doc.ProductId,
Day = doc.DateViewed.Day,
Month = doc.DateViewed.Month,
Year = doc.DateViewed.Year,
Count = 1
};
Reduce = results => from result in results
group result by new {
result.ProductId,
result.Day,
result.Month,
result.Year
}
into agg
select new
{
ProductId = agg.Key.ProductId,
Day = agg.Key.Day,
Month = agg.Key.Month,
Year = agg.Key.Year,
Count = agg.Sum(x => x.Count)
};
}
}
Hopefully you can see what I'm trying to achieve by this - you want ALL the components in your group by, as they are what make your grouping unique.
I can't remember if RavenDB lets you do this with DateTimes and I haven't got it on this computer so can't verify this, but the theory remains the same.
So, to re-iterate
You want an index for your report by week + product id
You want an index for your report by month + product id
You want an index for your report by year + product id
I hope this helps, sorry I can't give you a compilable example, lack of raven makes it a bit difficult :-)
