We have a site that contains streaming video and we want to display three reports of most watched videos in the last week, month and year (a rolling window).
We store a document in RavenDB each time a video is watched:
public class ViewedContent
{
public string Id { get; set; }
public int ProductId { get; set; }
public DateTime DateViewed { get; set; }
}
We're having trouble figuring out how to define the indexes / map-reduces that would best support generating those three reports.
We have tried the following map / reduce:
public class ViewedContentResult
{
public int ProductId { get; set; }
public DateTime DateViewed { get; set; }
public int Count { get; set; }
}
public class ViewedContentIndex :
AbstractIndexCreationTask<ViewedContent, ViewedContentResult>
{
public ViewedContentIndex()
{
Map = docs => from doc in docs
select new
{
doc.ProductId,
DateViewed = doc.DateViewed.Date,
Count = 1
};
Reduce = results => from result in results
group result by result.DateViewed
into agg
select new
{
ProductId = agg.Key,
Count = agg.Sum(x => x.Count)
};
}
}
But, this query throws an error:
var lastSevenDays = session.Query<ViewedContent, ViewedContentIndex>()
.Where( x => x.DateViewed > DateTime.UtcNow.Date.AddDays(-7) );
Error: "DateViewed is not indexed"
Ultimately, we want to query something like:
var lastSevenDays = session.Query<ViewedContent, ViewedContentIndex>()
.Where( x => x.DateViewed > DateTime.UtcNow.Date.AddDays(-7) )
.GroupBy( x => x.ProductId )
.OrderBy( x => x.Count )
This doesn't actually compile, because the OrderBy is wrong; Count is not a valid property here.
Any help here would be appreciated.
Each report is a different GROUP BY if you're in SQL land, which tells you that you need three indexes: one with entries by week, one by month, and one by year (or maybe slightly different, depending on how you're actually going to do the query).
Now, you have a DateTime there, and that presents some problems. What you actually want to do is index the Year component of the DateTime, the Month component and the Day component (or just one or two of these, depending on which report you want to generate).
I'm only paraphrasing your code here, so it may well not compile as-is, but:
public class ViewedContentIndex :
    AbstractIndexCreationTask<ViewedContent, ViewedContentResult>
{
    public ViewedContentIndex()
    {
        Map = docs => from doc in docs
                      select new
                      {
                          doc.ProductId,
                          Day = doc.DateViewed.Day,
                          Month = doc.DateViewed.Month,
                          Year = doc.DateViewed.Year,
                          Count = 1
                      };
        // ViewedContentResult needs Day, Month and Year properties for this to work.
        // The reduce must group on the mapped fields of "result", not on "doc".
        Reduce = results => from result in results
                            group result by new
                            {
                                result.ProductId,
                                result.Day,
                                result.Month,
                                result.Year
                            }
                            into agg
                            select new
                            {
                                ProductId = agg.Key.ProductId,
                                Day = agg.Key.Day,
                                Month = agg.Key.Month,
                                Year = agg.Key.Year,
                                Count = agg.Sum(x => x.Count)
                            };
    }
}
Hopefully you can see what I'm trying to achieve by this: you want ALL the components in your group by, as they are what make your grouping unique.
I can't remember if RavenDB lets you do this with DateTimes, and I haven't got it on this computer so I can't verify it, but the theory remains the same.
So, to reiterate:
You want an index for your report by week + product id
You want an index for your report by month + product id
You want an index for your report by year + product id
I hope this helps, sorry I can't give you a compilable example, lack of raven makes it a bit difficult :-)
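For a rolling window specifically, one way to read the advice above is that once the reduce produces one row per product per day, each report is just a sum of those daily rows over a date range. The following is a minimal sketch of that arithmetic in plain LINQ to Objects, not RavenDB index or query code; DailyViews and RollingReport are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row shaped like the reduce output: one entry per product per day.
public class DailyViews
{
    public DailyViews(int productId, DateTime day, int count)
    {
        ProductId = productId;
        Day = day.Date;
        Count = count;
    }

    public int ProductId { get; }
    public DateTime Day { get; }
    public int Count { get; }
}

public static class RollingReport
{
    // Sums the daily counts per product over the last `days` days
    // and returns the most-watched products first.
    public static List<(int ProductId, int Views)> MostWatched(
        IEnumerable<DailyViews> rows, DateTime today, int days)
    {
        DateTime from = today.Date.AddDays(-days);
        return rows
            .Where(r => r.Day > from && r.Day <= today.Date)
            .GroupBy(r => r.ProductId)
            .Select(g => (ProductId: g.Key, Views: g.Sum(r => r.Count)))
            .OrderByDescending(x => x.Views)
            .ThenBy(x => x.ProductId)
            .ToList();
    }
}
```

Calling MostWatched with 7, 30 or 365 for `days` gives the three reports from the same daily rows, which is why grouping by product and day may be enough for a rolling window.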
I need to search in a SQL Server database table. I am using IQueryable to build a dynamic query like the one below:
var searchTerm = "12/04";
var samuraisQueryable = _context.Samurais.Include(x => x.Quotes).AsQueryable();
samuraisQueryable = samuraisQueryable.Where(x => x.Name.Contains(searchTerm) ||
    x.CreatedDate.HasValue && x.CreatedDate.Value.ToString()
        .Contains(searchTerm));
var results = samuraisQueryable.ToList();
The above query is just an example; the actual query in my code is different and more complicated.
Samurai.cs looks like this:
public class Samurai
{
public Samurai()
{
Quotes = new List<Quote>();
}
public int Id { get; set; }
public string Name { get; set; }
public DateTime? CreatedDate { get; set; }
public List<Quote> Quotes { get; set; }
}
The data in the table looks like
I don't see any results because the SQL translated from the above query converts the date to a different format (CONVERT(VARCHAR(100), [s].[CreatedDate])). I tried to specify the date format in the above query, but then I get an error saying that the LINQ expression cannot be translated.
samuraisQueryable = samuraisQueryable.Where(x => x.Name.Contains(searchTerm) ||
    x.CreatedDate.HasValue && x.CreatedDate.Value.ToString("dd/MM/yyyy")
        .Contains(searchTerm));
If (per the comments) users will want to search partially on dates, then honestly: the best thing to do is for your code to inspect their input, and parse that into a range query. So, if you see "12/04", you might parse that into a day (in the current year, applying your choice of dd/mm vs mm/dd), and then do a range query on >= that day and < the next day. Similarly, if you see "2021", your code should parse that into a range covering the whole year: >= 1 January 2021 and < 1 January 2022. Trying to do a naïve partial match is not only computationally expensive (and hard to write as a query): it also isn't very useful to the user. Searching just on "2", for example, just isn't meaningful as a "contains" query.
Then what you have is:
(var startInc, var endExc) = ParseRange(searchTerm);
samuraisQueryable = samuraisQueryable.Where(
    x => x.CreatedDate >= startInc && x.CreatedDate < endExc);
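ParseRange is not defined in the answer, so here is a minimal sketch of what such a helper might look like. Everything in it is an assumption: the DateSearch class name, the dd/MM ordering, the handful of accepted formats, and the extra currentYear parameter (added so the result does not depend on the clock).

```csharp
using System;
using System.Globalization;

public static class DateSearch
{
    // Hypothetical helper: turns "2021", "04/2021", "12/04/2021" or "12/04"
    // into a half-open range [StartInc, EndExc). Assumes dd/MM ordering.
    public static (DateTime StartInc, DateTime EndExc) ParseRange(string term, int currentYear)
    {
        var inv = CultureInfo.InvariantCulture;
        const DateTimeStyles none = DateTimeStyles.None;
        term = term.Trim();

        // A bare year covers the whole year.
        if (DateTime.TryParseExact(term, "yyyy", inv, none, out var year))
            return (year, year.AddYears(1));

        // Month and year cover the whole month.
        if (DateTime.TryParseExact(term, "MM/yyyy", inv, none, out var month))
            return (month, month.AddMonths(1));

        // A full date covers a single day.
        if (DateTime.TryParseExact(term, "dd/MM/yyyy", inv, none, out var day))
            return (day, day.AddDays(1));

        // Day and month only: assume the supplied current year.
        var withYear = term + "/" + currentYear.ToString("D4", inv);
        if (DateTime.TryParseExact(withYear, "dd/MM/yyyy", inv, none, out var dayThisYear))
            return (dayThisYear, dayThisYear.AddDays(1));

        throw new FormatException($"Unrecognised date search term: '{term}'");
    }
}
```

Because the helper runs before the query is built, only two plain DateTime comparisons reach the database, which the LINQ provider should be able to translate without any string conversion.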
I'm trying to make a program that sorts objects by more than one parameter.
I need the ordering to give the same weight to all the parameters. What functions do I need to use to get that result?
I tried to use OrderBy() and then ThenBy(), but that orders by the first parameter first, so the parameters don't have equal weight.
values = File.ReadAllLines(filepath)
.Skip(1)
.Select(v => Fund.FromCsv(v))
.OrderByDescending(x => x.sharp)
.ThenBy(x=>x.yearlychange)
.ToList();
For example, take the stock market: in that case I want to order the stocks by their yield over the last year, but also by standard deviation. That way I can get the stocks that have both a good yield over the last year and a good standard deviation. The result won't be the best yield of all; it will be a combination.
As you have already been told, this is not really a programming problem; it is more of an algorithm/domain one. Nevertheless, once you have the algorithm, you can of course implement it like this (based on the example you gave in the comment):
void Main()
{
var funds = new List<Fund>();
funds.Add(new Fund() { Age = 18, Money = 100000 });
funds.Add(new Fund() { Age = 20, Money = 101000 });
//Here is normally your code with loading it from CSV
//File.ReadAllLines(filepath)
// .Skip(1)
// .Select(v => Fund.FromCsv(v))
var value = funds.OrderBy(t => t.SortingFactor);
}
public class Fund
{
public int Age { get; set; }
public decimal Money { get; set; }
public decimal SortingFactor
{
get
{
//Here is your domain algorithm you must sort out first (your example data would be)
return Money / Age;
}
}
}
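I'm not sure I fully understand your aim, but if Fund is not code you can modify, another alternative is to sort on a composite key in your OrderBy. Note that an anonymous object (new { x.sharp, x.yearlychange }) does not implement IComparable, so LINQ to Objects throws when it has to compare two of them. A value tuple does work, although it still compares sharp first and yearlychange second, e.g.
values = File.ReadAllLines(filepath)
    .Skip(1)
    .Select(v => Fund.FromCsv(v))
    .OrderByDescending(x => (x.sharp, x.yearlychange))
    .ToList();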
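If "equal weight" is taken to mean that neither criterion should dominate, one possible scoring rule (an assumption, not something the question specifies) is to rank the funds under each criterion separately and then sort by the sum of the two rank positions. The sketch below uses a hypothetical Fund class with Sharp and YearlyChange properties, and treats higher values as better for both.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Fund class in the question.
public class Fund
{
    public string Name { get; set; } = "";
    public double Sharp { get; set; }
    public double YearlyChange { get; set; }
}

public static class EqualWeightSort
{
    // Orders funds by the sum of their rank positions under each criterion.
    public static List<Fund> ByRankSum(IReadOnlyList<Fund> funds)
    {
        // Position of each fund when sorted by a single key, best first (0 = best).
        Dictionary<Fund, int> Ranks(Func<Fund, double> key) =>
            funds.OrderByDescending(key)
                 .Select((fund, position) => (fund, position))
                 .ToDictionary(p => p.fund, p => p.position);

        var bySharp = Ranks(f => f.Sharp);
        var byChange = Ranks(f => f.YearlyChange);

        // A lower combined rank means better overall; OrderBy is stable, so ties keep input order.
        return funds.OrderBy(f => bySharp[f] + byChange[f]).ToList();
    }
}
```

Rank sums ignore how far apart the values are. If magnitudes matter, normalising each column to a common scale and adding the results would be the analogous choice, but which rule is right remains a domain decision.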
Hi all. Thanks in advance for your help!
I'm requesting paged data from AWS Elasticsearch using NEST. Each document in AWS Elasticsearch index has a content type (Topic, Question and Video). In response I receive a list of documents for the current page and their total results. Everything is ok with it.
So the question is: how can I receive some kind of response map as well as search results in one call?
I mean, for example, here is my response:
Doc 1 - Topic
Doc 2 - Topic
Doc 3 - Video
Doc 4 - Question
Doc 5 - Video
Total results: 5 items.
In addition to this, I'd like to receive the following 'map':
Topic - 2 items
Video - 2 items
Question - 1 item
Is it possible to do this in one request? If not, how can it be done with several requests? Maybe NEST aggregations are part of the solution, but they don't seem to have any 'count' logic.
This is the search request for paged data and some models:
public class Document
{
public int DocumentId { get; set; }
public ContentType ContentType { get; set; }
public DateTime UpdatedOn { get; set; }
public string Title { get; set; }
public string Description { get; set; }
}
public virtual DocumentSearchResponse FullTextSearch(DocumentSearchParams searchParams)
{
var resultsSearchRequest = _elasticClient.Search<Document>(s => s.Index("some_index")
.Query(q => q.Term(t => t.Field(f => f.DocumentId).Value(searchParams.ContentId))
&& q.Terms(t => t.Field(f => f.ContentType).Terms(searchParams.GetContentTypesIds()))
&& q.MultiMatch(m => m.Fields(fs => fs.Field(f => f.Title).Field(f => f.Description))
.Query(searchParams.SearchValue)
.Type(TextQueryType.MostFields)))
.Sort(ss => ss.Descending(f => f.UpdatedOn))
.From((searchParams.PageNumber - 1) * searchParams.PageSize)
.Size(searchParams.PageSize));
// Is valid check
return new DocumentSearchResponse
{
PageResults = _searchResponseHelper.ToPageResults(resultsSearchRequest.Documents),
PageResultsMap = new Dictionary<ContentType, int>(), // <- here
TotalResultsCount = resultsSearchRequest.HitsMetadata.Total.Value
};
}
So, I've figured it out. Maybe someone will find it helpful.
What I needed was 'Aggregations'. Adding the following line alongside .Query, before .From and .Size, groups the search results by the specified field:
.Aggregations(a => a.Terms("contentType", t => t.Field(f => f.ContentType)))
.Query(...)
.From(...)
.Size(...)
To convert the result into the Dictionary I need, add the following (searchAggregations is searchResponse.Aggregations, and the name used to read the buckets must match the name given to the aggregation above):
public Dictionary<ContentType, long> ToPageResultsMap(AggregateDictionary searchAggregations)
{
var pageResultsMap = new Dictionary<ContentType, long>();
var buckets = searchAggregations.Terms(DataConsts.SearchContentTypeDocFieldName).Buckets;
foreach (var item in buckets)
{
if(int.TryParse(item.Key, out int contentType))
{
pageResultsMap.Add((ContentType)contentType, item.DocCount.HasValue ? item.DocCount.Value : 0);
}
}
return pageResultsMap;
}
Writing a stats web site for church softball team.
I have a SQL view that calculates the batting average per game from stats in a table. I use that view to build a model in MVC called PlayerStats.
I want to create and load data into a model that looks like this:
public class BattingAverage
{
[Key]
public int stat_id { get; set; }
public int player_id { get; set; }
public decimal batting_avg { get; set; }
public decimal cumulative_batting_avg { get; set; }
}
In my controller I am doing this:
var model = (from s in PlayerStats
where s.season_id == sid && s.player_id == pid
orderby s.game_no
select s).AsEnumerable()
.Select(b => new BattingAverage
{
stat_id = b.stat_id,
player_id = b.player_id,
batting_avg = b.batting_avg,
cumulative_batting_avg = ***[[ WHAT TO DO HERE?? ]]***
});
I don't know how to calculate that cumulative average to load into my model. The end goal is to Json.Encode the model data for use in AmCharts.
UPDATE - I tried this:
cumulative_batting_avg = getCumulativeAvg(sid, pid, b.game_no)
And my function:
public decimal getCumulativeAvg(int season_id, int player_id, int game_no)
{
var averages = PlayerStats.Where(g => g.season_id == season_id && g.player_id == player_id && g.game_no <= game_no).ToArray();
var hits = averages.Select(a => a.Hits).Sum();
var atbats = averages.Select(a => a.AtBats).Sum();
if (atbats == 0)
{
return 0.0m; // m suffix for decimal type
}
else
{
return hits / atbats;
}
}
This returned a correct average in the first row, but then zeroes for the rest. When I put a break on the return, I see that hits and atbats are properly accumulating inside the function, but for some reason the avg isn't being added to the model. What am I doing wrong?
Yeah, basically you want a subquery that pulls the average across all of the games; is my understanding correct? You can use the let keyword so that, while the main query works in the context of the current game, the let subquery can pull across all of the games. See, for example: Simple Example Subquery Linq
Which might translate to something like:
from s in PlayerStats
let cs = context.PlayerStats.Where(i => i.player_id == s.player_id).Select(i => i.batting_avg).Average()
.
.
select new {
    batting_avg = s.batting_avg, /* for the current game */
    cumulative_batting_avg = cs
}
Something like that, though I might be a little off on the syntax. With that type of approach, I often worry about performance with LINQ subqueries (you never know what SQL it's going to render), so you may want to consider using a stored procedure (it really depends on how much data you have).
Now from the comments I think I understand what a cumulative batting average is. It sounds like for a given game, it's an average based on the sum of hits and at bats in that game and the previous games.
So let's say that you have a collection of stats that's already filtered by player and season:
var playerSeasonStats = PlayerStats.Where(g =>
        g.season_id == season_id && g.player_id == player_id)
    .ToArray();
I originally wrote the next part as a LINQ expression. But that's just a habit, and in this case it's much simpler and easier to read with a normal foreach loop. Note the casts to decimal (assuming Hits and AtBats are integers, as the int accumulators imply), so that the division is not done on integers, and the guards against dividing by zero.
var averages = new List<BattingAverage>();
int cumulativeHits = 0;
int cumulativeAtBats = 0;
foreach (var stat in playerSeasonStats.OrderBy(stat => stat.game_no))
{
    cumulativeHits += stat.Hits;
    cumulativeAtBats += stat.AtBats;
    var average = new BattingAverage
    {
        player_id = stat.player_id,
        stat_id = stat.stat_id,
        // Cast before dividing; guard against games with no at-bats
        batting_avg = stat.AtBats == 0 ? 0m : (decimal)stat.Hits / stat.AtBats,
        cumulative_batting_avg = cumulativeAtBats == 0 ? 0m : (decimal)cumulativeHits / cumulativeAtBats
    };
    averages.Add(average);
}
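The zeroes reported in the question are consistent with integer division: in C#, dividing one int by another discards the fractional part, so any average below 1 becomes 0 before it is ever converted to decimal. The following self-contained sketch shows the running calculation with the cast in place. The Batting class and the tuple input are hypothetical simplifications, not the PlayerStats model from the question.

```csharp
using System.Collections.Generic;

public static class Batting
{
    // Returns the cumulative average after each game; input must already be in game order.
    public static List<decimal> CumulativeAverages(IEnumerable<(int Hits, int AtBats)> games)
    {
        var result = new List<decimal>();
        int hits = 0, atBats = 0;
        foreach (var game in games)
        {
            hits += game.Hits;
            atBats += game.AtBats;
            // Cast before dividing: int / int would truncate to 0 for any average below 1.
            result.Add(atBats == 0 ? 0m : (decimal)hits / atBats);
        }
        return result;
    }
}
```

For games of 1 for 4, 2 for 4 and 0 for 2, this yields 0.25, 0.375 and 0.3, whereas the integer version would yield 0 each time.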
Good morning Stack Overflow, I just passed my view a list of products for the last 90 days sorted by date and need to display it in an efficient way that will group the products by date. Linq-To-Sql doesn't understand a lot of the date comparison functions, so I'm kinda lost as to what to do here.
What I would like to do is contain each group of products with the same date inside of a div with a title like "Today", "Yesterday" or "Last 30 Days". Can anybody assist?
So, as I understand it, you need groups like the history groups in Skype:
Show messages from: Yesterday * 7 days * 30 days ... etc.
Suppose you have a class Product with a Date and some other fields, like this:
public class Product
{
public DateTime Date { get; set; }
public string ProductName { get; set; }
}
Then you can create a helper class like this:
public enum ProductGroupEnum
{
Today = 0,
Yesterday = 1,
Last30Days = 30
}
public static class ProductsHelper
{
public static List<Product> GetProductsGroup(ProductGroupEnum group, IEnumerable<Product> products)
{
var daysCount = (int)group;
return products.Where(x => DateTime.Now.Date - x.Date.Date <= new TimeSpan(daysCount, 0, 0, 0)).ToList();
}
}
Instead of an enum, I think you could pass a date, e.g. DateTime.Now.AddDays(-1).
That matters because if you want a 'last 3 months' group, it is incorrect to use 90 days.
And a final code example:
var products = new List<Product>()
{
new Product() {Date = DateTime.Now.AddMinutes(-30), ProductName = "TodayProduct"},
new Product() {Date = DateTime.Now.AddDays(-1), ProductName = "YesterdayProduct"},
new Product() {Date = DateTime.Now.AddDays(-25), ProductName = "LastMonthProduct"}
};
var todayProducts = ProductsHelper.GetProductsGroup(ProductGroupEnum.Today, products);
var yesterdayProducts = ProductsHelper.GetProductsGroup(ProductGroupEnum.Yesterday, products);
So 'todayProducts' will contain only the first product,
but 'yesterdayProducts' will contain the first two products (that is, everything from yesterday up to today).
You can also easily use the 'ProductsHelper' method above in the view to filter products according to your needs.
Hope this helps.
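The helper above returns cumulative groups (yesterday's group also contains today's products). If each product should instead appear under exactly one heading, as the one-div-per-group layout in the question suggests, one option is to compute a label per date and group on it in memory, which also sidesteps LINQ to SQL's limited date translation. This is a sketch under assumptions: the DateBuckets class, the "Older" bucket and the exact day boundaries are illustrative choices.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DateBuckets
{
    // Maps a date to a display label relative to `today`.
    public static string Label(DateTime date, DateTime today)
    {
        int daysAgo = (today.Date - date.Date).Days;
        if (daysAgo <= 0) return "Today";
        if (daysAgo == 1) return "Yesterday";
        if (daysAgo <= 30) return "Last 30 Days";
        return "Older";
    }

    // Groups dates into non-overlapping buckets, newest bucket first.
    public static List<IGrouping<string, DateTime>> Group(
        IEnumerable<DateTime> dates, DateTime today) =>
        dates.OrderByDescending(d => d)
             .GroupBy(d => Label(d, today))
             .ToList();
}
```

In a view, each resulting group could be rendered as one div, with the group's Key as the title and its items as the rows. Passing `today` in explicitly, rather than reading DateTime.Now inside the helper, keeps the grouping easy to test.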