Let's say I have a class like
MyClass { string id, string eventType, datetime ts }
ts is the timestamp of the event, and id is the field on which I want to calculate frequency.
I have a hot observable of MyClass, and I want to calculate the number of events received per id in the last 30 seconds.
If the number of events is more than 5, I raise another event of MyClass (with the same id and eventType = "New"), and if it falls below 3 again, I need to update the previously raised event (with the same id and eventType = "New").
I think I need to use a sliding window. This is what I have so far:
public static IObservable<MyClass> CountFrequency(this IObservable<MyClass> source, TimeSpan withinPeriod, string marker)
{
// var scheduler = new HistoricalScheduler();
// var driveSchedule = source.Subscribe(e => scheduler.AdvanceTo(e.Timestamp));
return source.Window(TimeSpan.FromSeconds(30), TimeSpan.FromSeconds(5))
.SelectMany(sl => sl)
.GroupBy(a => a.id)
.SelectMany(go => go
.Aggregate(new MyClass(), (acc, evt) => CustomAggFrequency(acc, evt, marker))
.Select(count => count));
}
I am not able to understand:
a) How to relate the scheduler to the timestamp of the data rather than system time
b) How to code the logic of CustomAggFrequency
Any suggestions?
Will this not work? Your question is phrased such that it's hard to figure out what you want.
var span = TimeSpan.FromSeconds(30);
var shift = TimeSpan.FromSeconds(5);
var query = source
.Window(span, shift)
.SelectMany(window => window
.GroupBy(item => item.id)
// Count() yields a single value per group once the window closes.
.SelectMany(g => g.Count().Select(n => new { g.Key, Frequency = n / span.TotalSeconds }))
.ToDictionary(x => x.Key, x => x.Frequency));
Once the first 30-second window has closed, this query emits a dictionary every 5 seconds that maps each id to its frequency in hertz during the last 30 seconds. Its type is IObservable<IDictionary<string, double>>.
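The question also asks how to window on the event's own timestamp and how to code the "raise above 5, update below 3" rule. A minimal sketch of that logic in plain C# follows. Everything named here is an assumption for illustration: the `FrequencyTracker` class, the PascalCase properties on `MyClass`, and the "Cleared" event type used for the update (the question does not say what the updated event should look like).

```csharp
using System;
using System.Collections.Generic;

public class MyClass
{
    public string Id { get; set; }
    public string EventType { get; set; }
    public DateTime Ts { get; set; }
}

// Hypothetical helper: keeps, per id, the timestamps that fall inside the
// window and remembers for which ids an alert is currently raised.
public class FrequencyTracker
{
    private readonly TimeSpan window;
    private readonly int raiseAbove;  // raise when count > raiseAbove
    private readonly int clearBelow;  // update when count < clearBelow
    private readonly Dictionary<string, Queue<DateTime>> stamps =
        new Dictionary<string, Queue<DateTime>>();
    private readonly HashSet<string> raised = new HashSet<string>();

    public FrequencyTracker(TimeSpan window, int raiseAbove, int clearBelow)
    {
        this.window = window;
        this.raiseAbove = raiseAbove;
        this.clearBelow = clearBelow;
    }

    // Returns an event to publish, or null when nothing changed for this id.
    public MyClass Process(MyClass evt)
    {
        if (!stamps.TryGetValue(evt.Id, out var queue))
            stamps[evt.Id] = queue = new Queue<DateTime>();

        queue.Enqueue(evt.Ts);

        // Evict relative to the event's own timestamp, not the system clock.
        while (queue.Count > 0 && queue.Peek() <= evt.Ts - window)
            queue.Dequeue();

        if (queue.Count > raiseAbove && raised.Add(evt.Id))
            return new MyClass { Id = evt.Id, EventType = "New", Ts = evt.Ts };

        // "Cleared" is an assumed marker for the update of a raised event.
        if (queue.Count < clearBelow && raised.Remove(evt.Id))
            return new MyClass { Id = evt.Id, EventType = "Cleared", Ts = evt.Ts };

        return null;
    }
}
```

One way to plug this into the pipeline would be `source.Select(e => tracker.Process(e)).Where(e => e != null)`. Note the limitation of this sketch: because eviction is driven by incoming events, a drop below the lower threshold is only noticed when the next event for that id arrives, and events are assumed to arrive in timestamp order.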
Related
I have code that needs a GroupBy, with a separate BehaviorSubject and Switch() for each group.
We have a stream of stock market values that we group by Symbol and perform level crossing across a number of levels (defined by a BehaviorSubject and a switch to always use the latest values).
So I need to go from this:
var feed = new Subject<double>();
var levels = new BehaviorSubject<double[]>(new[] { 400.0, 500.0, 600.0, 700.0 });
levels
.Select(thresholds => feed
.Buffer(2, 1)
.Where(x => x.Count == 2)
.Select(x => new { LevelsCrossed = thresholds.GetCrossovers(x[0], x[1]), Previous = x[0], Current = x[1] })
.Where(x => x.LevelsCrossed.Any())
.SelectMany(x => x.LevelsCrossed.Select(level => new ThresholdCrossedEvent(level, x.Previous, x.Current))))
.Switch()
.Subscribe(x => Console.WriteLine(JsonConvert.SerializeObject(x)));
And adapt the above to take a stream of Tick below and group by Symbol, each with its own level threshold detection on each grouped Value.
class Tick
{
public string Symbol { get; set; } // The name.
public double Value { get; set; } // The value (double, to match the feed and thresholds below).
}
Outline:
Take Market data
Group by Symbol
Alert on levels (depending on group name, using a dictionary of BehaviorSubject)
Output
Use Switch() to always use latest values from the dictionary
With a naive implementation I have a wrapper class (ReactiveSymbolFeed below); however, mixing non-reactive and reactive code can introduce concurrency issues that Reactive Extensions otherwise handles neatly.
Questions:
Am I introducing any side effects, or will this cause issues at scale (say 100,000 messages per second across 2,000 groups)?
Since we have many groups, each with its own BehaviorSubject that needs Switch(), can we rewrite our Reactive Extensions pipeline to include the threshold levels per symbol group, or is the wrapper class the right way to do this?
Further context and the wrapper class solution
Instead I create a ReactiveSymbolFeed wrapper that will form the value part of a dictionary per symbol key.
class ReactiveSymbolFeed
{
readonly BehaviorSubject<double[]> levels;
readonly Subject<double> feed;
public ReactiveSymbolFeed(double[] levels)
{
this.feed = new Subject<double>();
this.levels = new BehaviorSubject<double[]>(levels);
this.levels
.Select(thresholds => this.feed
.Buffer(2, 1)
.Where(x => x.Count == 2)
.Select(x => new { LevelsCrossed = thresholds.GetCrossovers(x[0], x[1]), Previous = x[0], Current = x[1] })
.Where(x => x.LevelsCrossed.Any())
.SelectMany(x => x.LevelsCrossed.Select(level => new ThresholdCrossedEvent(level, x.Previous, x.Current))))
.Switch()
.DistinctUntilChanged(x => x.Threshold)
.Subscribe(x => Console.WriteLine(JsonConvert.SerializeObject(x)));
}
public void OnNext(double value) => this.feed.OnNext(value);
public void UpdateThresholds(double[] levels) => this.levels.OnNext(levels);
}
And then use with the below:
// Setup the detection thresholds per Symbol - each Symbol has 1 set of thresholds
var dictionary = new Dictionary<string, ReactiveSymbolFeed>();
dictionary.Add("AAPL", new ReactiveSymbolFeed(new[] { 120.0, 125.0, 130.0 }));
dictionary.Add("VXX", new ReactiveSymbolFeed(new[] { 10.5, 15, 18.5, 20 }));
// Create some test tick data.
var ticks = new[]
{
new Tick { Symbol = "AAPL", Value = 119.0 },
new Tick { Symbol = "VXX", Value = 10.3 },
new Tick { Symbol = "VXX", Value = 10.8 },
new Tick { Symbol = "AAPL", Value = 121.0 },
new Tick { Symbol = "AAPL", Value = 121.0 }
// Followed by many other different Symbols and Values
};
// Loop through test data and dispatch it.
foreach(var tick in ticks)
{
if(dictionary.TryGetValue(tick.Symbol, out var value))
value.OnNext(tick.Value);
}
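One possible way to fold the per-symbol thresholds into a single pipeline is sketched below, assuming the System.Reactive package. The names `SymbolLevelDetector`, `LevelsFor`, and `IsCrossed` are hypothetical, and the crossing rule is an assumption, since `GetCrossovers` and `ThresholdCrossedEvent` are not shown in the question; a tuple stands in for the event type.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Reactive.Linq;
using System.Reactive.Subjects;

public class Tick
{
    public string Symbol { get; set; }
    public double Value { get; set; }
}

// Hypothetical replacement for the dictionary of ReactiveSymbolFeed wrappers.
public class SymbolLevelDetector
{
    private readonly ConcurrentDictionary<string, BehaviorSubject<double[]>> levels =
        new ConcurrentDictionary<string, BehaviorSubject<double[]>>();

    // Creates the per-symbol subject on first use, initially with no thresholds.
    private BehaviorSubject<double[]> LevelsFor(string symbol) =>
        levels.GetOrAdd(symbol, _ => new BehaviorSubject<double[]>(new double[0]));

    public void UpdateThresholds(string symbol, double[] thresholds) =>
        LevelsFor(symbol).OnNext(thresholds);

    // Assumed crossing rule: the level lies between the two consecutive values.
    private static bool IsCrossed(double level, double previous, double current) =>
        (previous < level && current >= level) || (previous > level && current <= level);

    public IObservable<(string Symbol, double Level, double Previous, double Current)> Crossings(
        IObservable<Tick> ticks) =>
        ticks
            .GroupBy(t => t.Symbol)
            // Each group pairs its own values with the latest thresholds for its key.
            .SelectMany(g => LevelsFor(g.Key)
                .Select(thresholds => g
                    .Select(t => t.Value)
                    .Buffer(2, 1)
                    .Where(pair => pair.Count == 2)
                    .SelectMany(pair => thresholds
                        .Where(level => IsCrossed(level, pair[0], pair[1]))
                        .Select(level => (Symbol: g.Key, Level: level, Previous: pair[0], Current: pair[1]))))
                .Switch());
}
```

As with the wrapper class, a threshold update for a symbol restarts that symbol's `Buffer(2, 1)` pairing, so the first tick after an update is not compared with the one before it. Whether this holds up at the message rates mentioned in the question would need to be measured.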
I have the AssessmentItems DB object, which records which user evaluated (EvaluatorId), which submission (SubmissionId), based on which rubric item or criterion (RubricItemId), and when (DateCreated).
I group these objects by RubricItemId and DateCreated to compute some daily statistics for each assessment criterion (or rubric item).
For example, I compute the AverageScore, which works fine and returns an output like: RubricItem: 1, Day: 15/01/2019, AverageScore: 3.2.
_context.AssessmentItems
.Include(ai => ai.RubricItem)
.Include(ai => ai.Assessment)
.Where(ai => ai.RubricItem.RubricId == rubricId && ai.Assessment.Submission.ReviewRoundId == reviewRoundId)
.Select(ai => new
{
ai.Id,
DateCreated = ai.DateCreated.ToShortDateString(),//.ToString(@"yyyy-MM-dd"),
ai.CurrentScore,
ai.RubricItemId,
ai.Assessment.SubmissionId,
ai.Assessment.EvaluatorId
})
.GroupBy(ai => new { ai.RubricItemId, ai.DateCreated })
.Select(g => new
{
g.Key.RubricItemId,
g.Key.DateCreated,
AverageScore = g.Average(ai => ai.CurrentScore),
NumberOfStudentsEvaluating = g.Select(ai => ai.EvaluatorId).Distinct().Count(),
}).ToList();
What I want to do is to compute the average up to that day: instead of calculating the average for the day alone, I want to include the assessment scores of the preceding days. In the same way, when I compute NumberOfStudentsEvaluating, I want the total number of students who participated in the evaluation up to that day.
One approach to achieve this could be to iterate through the result object and compute these properties again:
foreach (var i in result)
{
i.AverageScore = result.Where(r => r.DateCreated <= i.DateCreated).Select(r => r.AverageScore).Average();
}
But, this is quite costly. I wonder if it is possible to tweak the code a bit to achieve this, or should I start from scratch with another approach.
If you split the query into two halves, you can compute the average the way you want (I also computed NumberOfStudentsEvaluating on the same criteria), but I am not sure whether EF/EF Core will be able to translate it to SQL:
var base1 = _context.AssessmentItems
.Include(ai => ai.RubricItem)
.Include(ai => ai.Assessment)
.Where(ai => ai.RubricItem.RubricId == rubricId && ai.Assessment.Submission.ReviewRoundId == reviewRoundId)
.Select(ai => new {
ai.Id,
ai.DateCreated,
ai.CurrentScore,
ai.RubricItemId,
ai.Assessment.SubmissionId,
ai.Assessment.EvaluatorId
})
.GroupBy(ai => ai.RubricItemId);
var ans1 = base1
.SelectMany(rig => rig.Select(ai => ai.DateCreated).Distinct().Select(DateCreated => new { RubricItemId = rig.Key, DateCreated, Items = rig.Where(b => b.DateCreated <= DateCreated) }))
.Select(g => new {
g.RubricItemId,
DateCreated = g.DateCreated.ToShortDateString(), //.ToString(@"yyyy-MM-dd"),
AverageScore = g.Items.Average(ai => ai.CurrentScore),
NumberOfStudentsEvaluating = g.Items.Select(ai => ai.EvaluatorId).Distinct().Count(),
}).ToList();
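If the filtered rows are small enough to load into memory first, the cumulative values can also be computed in a single pass with running totals, which avoids re-scanning earlier days for every day. This is only a sketch under that assumption: the `RunningStats` class, the tuple shape, and the `int`/`double` column types are hypothetical and would need to match the real entity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RunningStats
{
    // Input rows are assumed to be already materialised (for example via ToList()).
    public static List<(int RubricItemId, DateTime Day, double AverageScore, int Evaluators)> Cumulative(
        IEnumerable<(int RubricItemId, DateTime Created, double Score, int EvaluatorId)> items)
    {
        var result = new List<(int RubricItemId, DateTime Day, double AverageScore, int Evaluators)>();

        foreach (var rubric in items.GroupBy(i => i.RubricItemId))
        {
            double sum = 0;                       // running sum of scores so far
            int count = 0;                        // running number of assessments so far
            var evaluators = new HashSet<int>();  // distinct evaluators seen so far

            foreach (var day in rubric.GroupBy(i => i.Created.Date).OrderBy(d => d.Key))
            {
                foreach (var item in day)
                {
                    sum += item.Score;
                    count++;
                    evaluators.Add(item.EvaluatorId);
                }

                // Values cover every assessment up to and including this day.
                result.Add((rubric.Key, day.Key, sum / count, evaluators.Count));
            }
        }

        return result;
    }
}
```

Unlike averaging the daily averages, the running sum weights every assessment equally, so a day with many assessments counts for more than a day with few.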
Scenario
I'm receiving different notification ids every 100 ms (Source1), and I need to put every id in a cache together with the date it was received; if an id arrives twice, I only update the date. After that I need to look up information for the ids by invoking a service. When I receive that information in my app, I need to show it ordered by received date, refreshing the screen every 5 seconds. If an id is not refreshed by Source1 within 10 seconds, it needs to change state so that it is displayed in a different category.
Problem
I'm trying to use Reactive Extensions to solve this problem, but I'm not sure if it's the correct technology because:
I don't know where I should keep the cache or how to manage those states.
What is the best way to manage concurrency in general, so that I can invoke the external service while more ids (new or old) keep arriving?
In the end I want a clean list of information where I can see which elements are being updated and which are not.
Can anyone help me? Thanks
It sounds like the .Scan operator might meet your needs.
Try this:
var source = new Subject<int>();
var query =
source
.Scan(new Dictionary<int, DateTime>(), (a, x) =>
{
a[x] = DateTime.Now;
return new Dictionary<int, DateTime>(a);
})
.Select(x => x.OrderByDescending(y => y.Value));
You can test this with the following code:
var values = new [] { 1, 2, 1, 3, 2, 1 };
Observable
.Interval(TimeSpan.FromSeconds(5.0))
.Take(values.Length)
.Select(x => values[x])
.Subscribe(source);
It's better though to use ImmutableDictionary so then the query looks like this:
var query =
source
.Scan(
new Dictionary<int, DateTime>().ToImmutableDictionary(),
(a, x) => a.SetItem(x, DateTime.Now))
.Select(x => x.OrderByDescending(y => y.Value));
var query =
source
.Scan(ImmutableDictionary<int, DateTime>.Empty, (a, x) => a.SetItem(x, DateTime.Now))
.Select(x => Observable.Interval(TimeSpan.FromSeconds(5.0)).Select(y => x).StartWith(x))
.Switch()
.Select(x => x.OrderByDescending(y => y.Value));
This query continues to produce values whenever your source does, and it also repeats the latest value every 5 seconds after the last one came out (if the source produces a new value in the meantime, the 5-second timer resets).
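The queries above keep the cache ordered but do not mark ids that have gone quiet for 10 seconds. A small sketch of a projection that could replace the final `OrderByDescending` step follows; `ItemState`, `CacheView`, and `Snapshot` are hypothetical names, and the two states are an assumption about what "a different category" means.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum ItemState { Active, Stale }

public static class CacheView
{
    // Orders the cache by last-seen date (newest first) and flags entries
    // that have not been refreshed within staleAfter as Stale.
    public static List<(int Id, DateTime LastSeen, ItemState State)> Snapshot(
        IEnumerable<KeyValuePair<int, DateTime>> cache, DateTime now, TimeSpan staleAfter) =>
        cache
            .OrderByDescending(kv => kv.Value)
            .Select(kv => (
                Id: kv.Key,
                LastSeen: kv.Value,
                State: now - kv.Value > staleAfter ? ItemState.Stale : ItemState.Active))
            .ToList();
}
```

Used as `.Select(x => CacheView.Snapshot(x, DateTime.Now, TimeSpan.FromSeconds(10)))` at the end of the query that repeats its latest value every 5 seconds, the state would be re-evaluated on each repeat even when no new ids arrive.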
To generate a "measure" every 5 sec I'm doing something like :
var Events = Observable.
Interval(TimeSpan.FromSeconds(5)).
Select(i => factory.GenerateRandomMeasure())
I would like to do the same but based on an existing Measure collection.
I assume I have to do something like :
var Events = existingList.ToObservable();
But is it possible to add an interval so that each list item is emitted after a delay (one item every 5 seconds, for example)?
You can do either of these which work just fine:
(1)
var Events =
Observable
.Interval(TimeSpan.FromSeconds(5))
.Zip(existingList, (i, x) => x);
(2)
var Events2 =
Observable
.Generate(
0,
x => x < existingList.Count,
x => x + 1,
x => existingList[x],
x => TimeSpan.FromSeconds(5));
The first is probably more sensible and easier to write. The second is very much worth learning if you don't know it already as .Generate is very powerful and can be used in a lot of places.
I have a simple use case where:
Receive a notification of events
Perform some action on the event
Print the content after x interval
How can I do the above step in a single Rx pipeline?
Something like below:
void Main()
{
var observable = Observable.Interval(TimeSpan.FromSeconds(1));
// Receive event and call Foo()
observable.Subscribe(x=>Foo());
// After 1 minute, I want to print the result of count
// How do I do this using above observable?
}
int count = 0;
void Foo()
{
Console.Write(".");
count ++;
}
I think this does what you want:
var observable =
Observable
.Interval(TimeSpan.FromSeconds(1))
.Do(x => Foo())
.Window(() => Observable.Timer(TimeSpan.FromMinutes(1.0)));
var subscription =
observable
.Subscribe(xs => Console.WriteLine(count));
However, it's a bad idea to mix state with observables. If you had two subscriptions you'd increment count twice as fast. It's better to encapsulate your state within the observable so that each subscription would get a new instance of count.
Try this instead:
var observable =
Observable
.Defer(() =>
{
var count = 0;
return
Observable
.Interval(TimeSpan.FromSeconds(1))
.Select(x =>
{
Console.Write(".");
return ++count;
});
})
.Window(() => Observable.Timer(TimeSpan.FromMinutes(1.0)))
.SelectMany(xs => xs.LastAsync());
var subscription =
observable
.Subscribe(x => Console.WriteLine(x));
I get this kind of output:
...........................................................59
............................................................119
............................................................179
............................................................239
Remembering that it starts with 0 then this is timing pretty well.
After seeing paulpdaniels's answer, I realized that I could replace my Window/SelectMany/LastAsync with the simpler Sample operator.
Also, if we don't really need the side effect of incrementing a counter, then the whole observable shrinks down to this:
var observable =
Observable
.Interval(TimeSpan.FromSeconds(1.0))
.Do(x => Console.Write("."))
.Sample(TimeSpan.FromMinutes(1.0));
observable.Subscribe(x => Console.WriteLine(x));
Much simpler!
I would use Select + Sample:
var observable = Observable.Interval(TimeSpan.FromSeconds(1))
.Select((x, i) => {
Foo();
return i;
})
.Do(_ => Console.Write("."))
.Sample(TimeSpan.FromMinutes(1));
observable.Subscribe(x => Console.WriteLine(x));
Select has an overload that also passes the index of the current value. By returning that index and then sampling at 1-minute intervals, you get the last value emitted during each interval.