Why do I NOT need Publish on this cold observable? - c#

Since I have a cold Observable here and I subscribe to "grouped" several times, why do I NOT need Publish? I would have expected it to produce unwanted results when I run it, but to my surprise it works both with and without Publish. Why is that?
var subject = new List<string>
{
"test",
"test",
"hallo",
"test",
"hallo"
}.ToObservable();
subject
.GroupBy(x => x)
.SelectMany(grouped => grouped.Scan(0, (count, _) => ++count)
.Zip(grouped, (count, chars) => new { Chars = chars, Count = count }))
.Subscribe(result => Console.WriteLine("You typed {0} {1} times",
result.Chars, result.Count));
// I would have expected to need Publish, like this:
//subject
// .GroupBy(x => x)
// .SelectMany(grouped => grouped.Publish(sharedGroup =>
// sharedGroup.Scan(0, (count, _) => ++count)
// .Zip(sharedGroup, (count, chars) =>
// new { Chars = chars, Count = count })))
// .Subscribe(result => Console.WriteLine("You typed {0} {1} times",
// result.Chars, result.Count));
Console.ReadLine();
EDIT
As Paul noticed, since we are subscribing to the underlying cold observable twice, we should be going over the sequence twice. However, I had no luck making this effect visible. I tried inserting debug lines, but this, for example, prints "performing" just once.
var subject = new List<Func<string>>
{
() =>
{
Console.WriteLine("performing");
return "test";
},
() => "test",
() => "hallo",
() => "test",
() => "hallo"
}.ToObservable();
subject
.Select(x => x())
.GroupBy(x => x)
.SelectMany(grouped => grouped.Scan(0, (count, _) => ++count)
.Zip(grouped, (count, chars) => new { Chars = chars, Count = count }))
.Subscribe(result => Console.WriteLine("You typed {0} {1} times",
result.Chars, result.Count));
I wonder if we can make it visible that we are dealing with a cold observable and are not using Publish(). As a next step, I would like to see how Publish() (see above) makes the effect go away.
EDIT 2
As Paul suggested, I created a custom IObservable<string> for debugging purposes. However, if you set a breakpoint in its Subscribe() method, you will notice that it is hit only once.
class Program
{
static void Main(string[] args)
{
var subject = new MyObservable();
subject
.GroupBy(x => x)
.SelectMany(grouped => grouped.Scan(0, (count, _) => ++count)
.Zip(grouped, (count, chars) => new { Chars = chars, Count = count }))
.Subscribe(result => Console.WriteLine("You typed {0} {1} times",
result.Chars, result.Count));
Console.ReadLine();
}
}
class MyObservable : IObservable<string>
{
public IDisposable Subscribe(IObserver<string> observer)
{
observer.OnNext("test");
observer.OnNext("test");
observer.OnNext("hallo");
observer.OnNext("test");
observer.OnNext("hallo");
return Disposable.Empty;
}
}
So for me the question is still open. Why do I not need Publish here on this cold Observable?

You're only using your List-based source once, so you won't see duplicate subscription effects there. The key to answering your question is the following observation:
An IGroupedObservable<K, T> object flowing out of GroupBy by itself is a subject in disguise.
Internally, GroupBy keeps a Dictionary<K, ISubject<T>>. Whenever a message comes in, it gets sent into the subject with the corresponding key. You're subscribing twice to the grouping object, which is safe, as the subject decouples the producer from the consumer.
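One way to make this visible is to count how often the underlying source is subscribed to. The following is a minimal sketch, assuming the System.Reactive package is referenced; the Observable.Defer wrapper and the sourceSubscriptions counter are additions for illustration and are not part of the original code.

```csharp
using System;
using System.Reactive.Linq;

class Program
{
    static void Main()
    {
        var sourceSubscriptions = 0;

        // Cold source: the factory runs once per subscription, so the
        // counter tells us how many times the source was subscribed to.
        var source = Observable.Defer(() =>
        {
            sourceSubscriptions++;
            return new[] { "test", "test", "hallo", "test", "hallo" }.ToObservable();
        });

        source
            .GroupBy(x => x)
            // "grouped" is subscribed to twice here (once via Scan, once via Zip),
            // but both subscriptions attach to the group, not to the source.
            .SelectMany(grouped => grouped
                .Scan(0, (count, _) => count + 1)
                .Zip(grouped, (count, chars) => new { Chars = chars, Count = count }))
            .Subscribe(r => Console.WriteLine("You typed {0} {1} times", r.Chars, r.Count));

        // GroupBy subscribes to the source exactly once.
        Console.WriteLine("Source subscriptions: {0}", sourceSubscriptions); // prints 1
    }
}
```

The counter stays at 1 because only GroupBy subscribes to the source; the two subscriptions inside SelectMany go to the per-key group.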

Reusing 'grouped' in the Zip means you're effectively doing each grouping twice - however, since your source is Cold, it still works. Does that make sense?

Related

Heartbeat pattern using reactive extension

Given a simple scenario:
A and B are in a room, A talks to B. The room is dark and B couldn't see A. How could B figure out if A is pausing or A is kidnapped from the room?
When A talks, A exposes an IObservable<string> Talk, which B subscribes to with Talk.Subscribe(s => /* process what A said */). B could at the same time subscribe to an Observable.Interval Heartbeat as a heartbeat check.
My question is which operator I should use to merge/combine the two IObservables so that, if no item arrives from Talk over two items of Heartbeat, B will assume that A has been kidnapped.
Please note that I want to avoid a variable to store the state because it may cause the side effect if I don't synchronize that variable properly.
Thanks,
Imagine a state variable you want to act on, with the state representing the number of heartbeats since 'A' last spoke. That would look like this:
var stateObservable = Observable.Merge( //State represent number of heartbeats since A last spoke
aSource.Select(_ => new Func<int, int>(i => 0)), //When a talks, set state to 0
bHeartbeat.Select(_ => new Func<int, int>(i => i + 1)) //when b heartbeats, increment state
)
.Scan(0, (state, func) => func(state));
We represent incidents of A speaking as a function resetting the state to 0, and incidents of B heartbeating as a function incrementing the state. We then accumulate with the Scan function.
The rest is now easy:
var isKidnapped = stateObservable
.Where(state => state >= 2)
.Take(1);
isKidnapped.Subscribe(_ => Console.WriteLine("A is kidnapped"));
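To try the single-source version without waiting on real timers, the heartbeat can be driven by hand. This is a minimal sketch, assuming the System.Reactive package; using a Subject<long> as a manual stand-in for Observable.Interval is an assumption made here for determinism, not part of the answer above.

```csharp
using System;
using System.Reactive.Linq;
using System.Reactive.Subjects;

class Program
{
    static void Main()
    {
        var aSource = new Subject<string>();
        // Manual stand-in for Observable.Interval so the timing is deterministic.
        var bHeartbeat = new Subject<long>();

        // State = number of heartbeats since A last spoke.
        var stateObservable = Observable.Merge(
                aSource.Select(_ => new Func<int, int>(i => 0)),        // A talks: reset
                bHeartbeat.Select(_ => new Func<int, int>(i => i + 1))) // heartbeat: increment
            .Scan(0, (state, func) => func(state));

        stateObservable
            .Where(state => state >= 2)
            .Take(1)
            .Subscribe(_ => Console.WriteLine("A is kidnapped"));

        aSource.OnNext("Hello");      // state 0
        bHeartbeat.OnNext(0);         // state 1
        aSource.OnNext("Still here"); // state 0
        bHeartbeat.OnNext(1);         // state 1
        bHeartbeat.OnNext(2);         // state 2: prints "A is kidnapped"
    }
}
```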
EDIT:
Here's an example with n A sources:
var aSources = new Subject<Tuple<string, Subject<string>>>();
var bHeartbeat = Observable.Interval(TimeSpan.FromSeconds(1)).Publish().RefCount();
var stateObservable = aSources.SelectMany(t =>
Observable.Merge(
t.Item2.Select(_ => new Func<int, int>(i => 0)),
bHeartbeat.Select(_ => new Func<int, int>(i => i + 1))
)
.Scan(0, (state, func) => func(state))
.Where(state => state >= 2)
.Take(1)
.Select(_ => t.Item1)
);
stateObservable.Subscribe(s => Console.WriteLine($"{s} is kidnapped"));
aSources
.SelectMany(t => t.Item2.Select(s => Tuple.Create(t.Item1, s)))
.Subscribe(t => Console.WriteLine($"{t.Item1} says '{t.Item2}'"));
bHeartbeat.Subscribe(_ => Console.WriteLine("**Heartbeat**"));
var a = new Subject<string>();
var c = new Subject<string>();
var d = new Subject<string>();
var e = new Subject<string>();
var f = new Subject<string>();
aSources.OnNext(Tuple.Create("A", a));
aSources.OnNext(Tuple.Create("C", c));
aSources.OnNext(Tuple.Create("D", d));
aSources.OnNext(Tuple.Create("E", e));
aSources.OnNext(Tuple.Create("F", f));
a.OnNext("Hello");
c.OnNext("My name is C");
d.OnNext("D is for Dog");
await Task.Delay(TimeSpan.FromMilliseconds(1200));
e.OnNext("Easy-E here");
a.OnNext("A is for Apple");
await Task.Delay(TimeSpan.FromMilliseconds(2200));

Managing state in a reactive pipeline

I am constructing a reactive pipeline that needs to expand (SelectMany) and then flatten (in this case, ToArray) whilst maintaining access to a piece of state obtained at the beginning of the pipeline.
Here is pseudo-code for what I am attempting:
return Observable
.Start(() => this.GetSearchResults(query))
.SelectMany(results => results.Hits) // results.Hits is a list of IDs, but results also has a bool property that I want to keep through to the end of my pipeline
.SelectMany(hit => GetById(hit.Id)) // asynchronously load each result
.ToArray() // now need to pull all the results together into a containing data structure, and also include the bool flag from above in it
.Select(resolvedResults => new ...); // need access to both resolvedResults and the bool mentioned in the first comment above
So I'm trying to find a way to cleanly access some state determined at the beginning of the pipeline from the code at the end of the pipeline.
The first thing I tried was using anonymous types to bundle the bool with each result. This quickly got out of hand and was wasteful from a performance perspective.
The second thing I tried was using a subject as follows:
var state = new AsyncSubject<bool>();
return Observable
.Start(() => this.GetSearchResults(query))
.Do(results =>
{
state.OnNext(results.Flag);
state.OnCompleted();
})
.SelectMany(results => results.Hits)
.SelectMany(hit => GetById(hit.Id))
.ToArray()
.Zip(
state,
(results, state) => new ResultContainer(state, results));
This seems to work fine, but feels a little icky to me.
So what I'm wondering is whether there is a cleaner way to manage state in a reactive pipeline.
For reference, here is the actual code (rather than just pseudo-code):
public IObservable<ISearchResults<IContact>> Search(string query, int maximumResultCount = 100, float minimumScore = 0.1F)
{
Ensure.ArgumentNotNull(query, nameof(query));
var moreHitsAvailable = new AsyncSubject<bool>();
return Observable
.Start(
() => this.searchIndexService.Search<IContact>(query, maximumResultCount, minimumScore),
this.schedulerService.DataStoreScheduler)
.Do(
results =>
{
moreHitsAvailable.OnNext(results.MoreHitsAreAvailable);
moreHitsAvailable.OnCompleted();
})
.SelectMany(
results => results
.Hits
.Select(
hit => new
{
Id = hit.Id,
ParsedId = ContactId.Parse(hit.Id)
}))
.SelectMany(
result => this
.GetById(result.ParsedId)
.Select(
contact => new
{
Id = result.Id,
Contact = contact
}))
.Do(
result =>
{
if (result.Contact == null)
{
this.logger.Warn("Failed to find contact with ID '{0}' provided by the search index. Index may be out of date.", result.Id);
}
})
.Select(result => result.Contact)
.Where(contact => contact != null)
.ToArray()
.Zip(
moreHitsAvailable,
(results, more) => new SearchResults<IContact>(more, results.ToImmutableList()))
.PublishLast()
.ConnectUntilCompleted();
}
You could pop out to Query Comprehension Syntax and do something like this
var x = from result in Observable.Start(() => this.GetSearchResults())
let hasMore = result.MoreHitsAreAvailable
from hit in result.Hits
from contact in GetById(hit.Id)
select new { hasMore , contact};
How you deal with the duplicate hasMore values is up to you. Since every element carries the same value (all true or all false), you could group by it.
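A different way to avoid the duplicated flag altogether is to nest the rest of the pipeline inside the first SelectMany, so the initial results object stays in scope as a closure variable. This is a sketch of that alternative, not the approach from the answer above; it assumes the System.Reactive package, and SearchResults, GetSearchResults and GetById are hypothetical stand-ins for the types and methods in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;

// Hypothetical stand-in for the search result type in the question.
class SearchResults
{
    public bool Flag;
    public List<int> Hits = new List<int>();
}

class Program
{
    // Hypothetical stubs replacing the real search and lookup calls.
    static SearchResults GetSearchResults(string query) =>
        new SearchResults { Flag = true, Hits = { 1, 2, 3 } };

    static IObservable<string> GetById(int id) =>
        Observable.Return("contact-" + id);

    static void Main()
    {
        var pipeline = Observable
            .Start(() => GetSearchResults("query"))
            .SelectMany(results => results.Hits
                .ToObservable()
                .SelectMany(hit => GetById(hit))
                .ToArray()
                // "results" is still in scope here, so no subject or bundling is needed.
                .Select(contacts => new { HasMore = results.Flag, Contacts = contacts }));

        var result = pipeline.Wait(); // block until the single result arrives
        Console.WriteLine("{0}: {1}", result.HasMore, string.Join(", ", result.Contacts));
    }
}
```

The trade-off is one extra level of nesting in exchange for not threading the flag through every intermediate element.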

Using rx to subscribe to event and perform logging after time interval

I have a simple use case where:
Receive a notification of events
Perform some action on the event
Print the content after x interval
How can I do the above step in a single Rx pipeline?
Something like below:
void Main()
{
var observable = Observable.Interval(TimeSpan.FromSeconds(1));
// Receive event and call Foo()
observable.Subscribe(x=>Foo());
// After 1 minute, I want to print the result of count
// How do I do this using above observable?
}
int count = 0;
void Foo()
{
Console.Write(".");
count ++;
}
I think this does what you want:
var observable =
Observable
.Interval(TimeSpan.FromSeconds(1))
.Do(x => Foo())
.Window(() => Observable.Timer(TimeSpan.FromMinutes(1.0)));
var subscription =
observable
.Subscribe(xs => Console.WriteLine(count));
However, it's a bad idea to mix state with observables. If you had two subscriptions you'd increment count twice as fast. It's better to encapsulate your state within the observable so that each subscription would get a new instance of count.
Try this instead:
var observable =
Observable
.Defer(() =>
{
var count = 0;
return
Observable
.Interval(TimeSpan.FromSeconds(1))
.Select(x =>
{
Console.Write(".");
return ++count;
});
})
.Window(() => Observable.Timer(TimeSpan.FromMinutes(1.0)))
.SelectMany(xs => xs.LastAsync());
var subscription =
observable
.Subscribe(x => Console.WriteLine(x));
I get this kind of output:
...........................................................59
............................................................119
............................................................179
............................................................239
Remembering that it starts with 0 then this is timing pretty well.
After seeing paulpdaniels' answer I realized that I could replace my Window/SelectMany/LastAsync with the simpler Sample operator.
Also, if we don't really need the side-effect of incrementing a counter then this whole observable shrinks down to this:
var observable =
Observable
.Interval(TimeSpan.FromSeconds(1.0))
.Do(x => Console.Write("."))
.Sample(TimeSpan.FromMinutes(1.0));
observable.Subscribe(x => Console.WriteLine(x));
Much simpler!
I would use Select + Sample:
var observable = Observable.Interval(TimeSpan.FromSeconds(1))
.Select((x, i) => {
Foo();
return i;
})
.Do(_ => Console.Write("."))
.Sample(TimeSpan.FromMinutes(1));
observable.Subscribe(x => Console.WriteLine(x));
Select has an overload that also passes the index of the current value. By returning that index and then sampling at 1-minute intervals, you get the last value emitted during each interval.

Rx how to group by a key a complex object and later do SelectMany without "stopping" the stream?

This is related to my other question here. James World presented a solution as follows:
// idStream is an IObservable<int> of the input stream of IDs
// alarmInterval is a Func<int, TimeSpan> that gets the interval given the ID
var idAlarmStream = idStream
.GroupByUntil(key => key, grp => grp.Throttle(alarmInterval(grp.Key)))
.SelectMany(grp => grp.IgnoreElements().Concat(Observable.Return(grp.Key)));
EDIT 2:
Question: How do I start the timers immediately, without waiting for the first events to arrive? That's the root problem in my question, I guess. To that end, I planned on sending off dummy objects with the IDs I know should be there. But as I describe below, I ended up with some other problems. Nevertheless, I think solving that too would be interesting.
Forwards with the other interesting parts then! Now, if I'd like to group a complex object like the following and group by the key as follows (won't compile)
var idAlarmStream = idStream
.Select(i => new { Id = i, IsTest = true })
.GroupByUntil(key => key.Id, grp => grp.Throttle(alarmInterval(grp.Key)))
.SelectMany(grp => grp.IgnoreElements().Concat(Observable.Return(grp.Key)));
then I get into trouble. I'm unable to modify the part about SelectMany, Concat and Observable.Return so that the query would work as before. For instance, if I make query as
var idAlarmStream = idStream
.Select(i => new { Id = i, IsTest = true })
.GroupByUntil(key => key.Id, grp => grp.Throttle(alarmInterval(grp.Key)))
.SelectMany(grp => grp.IgnoreElements().Concat(Observable.Return(grp.Key.First())))
.Subscribe(i => Console.WriteLine(i.Id + "-" + i.IsTest));
Then two events are needed before an output can be observed in the Subscribe. It's the effect of the call to First, I gather. Furthermore, I would like to use the complex object's attributes in the call to alarmInterval too.
Can someone offer an explanation of what's going on, perhaps even a solution? The problem with the unmodified solution is that the grouping doesn't look at the IDs alone for the key value, but also at the IsTest field.
EDIT: As a note, the problem could probably be solved by first creating an explicit class or struct that implements a custom IEquatable, and then using James' code as-is so that grouping would happen by IDs alone. It feels like a hack though.
Also, if you want to count the number of times you've seen an item before the alarm goes off you can do it like this, taking advantage of the counter overload in Select.
var idAlarmStream = idStream
.Select(i => new { Id = i, IsTest = true })
.GroupByUntil(key => key.Id, grp => grp.Throttle(alarmInterval(grp.Key)))
.SelectMany(grp => grp.Select((alarm, count) => new { count, alarm }).TakeLast(1));
Note, this will be 0 for the first (seed) item - which is probably what you want anyway.
You are creating an anonymous type in your Select. Let's call it A1. I will assume your idStream is an IObservable<int>. Since the Id is the Key in the GroupByUntil, you do not need to worry about key comparison - int equality is fine.
The GroupByUntil is an IObservable<IGroupedObservable<int, A1>>.
The SelectMany as written is trying to be an IObservable<A1>. You need to just Concat(Observable.Return(grp.Key)) here - but the type of the Key and the type of the Group elements must match or the SelectMany won't work. So the key would have to be an A1 too. Anonymous types use structural equality and the return type would be a stream of A1 - but you can't declare that as a public return type.
If you just want the Id, you should add a .Select(x => x.Id) after the Throttle:
var idAlarmStream = idStream
.Select(i => new { Id = i, IsTest = true })
.GroupByUntil(key => key.Id, grp => grp.Throttle(alarmInterval(grp.Key))
.Select(x => x.Id))
.SelectMany(grp => grp.IgnoreElements().Concat(Observable.Return(grp.Key)));
If you want A1 instead - you'll need to create a concrete type that implements Equality.
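A concrete type along those lines might look like the following sketch. The class name Alarm and the choice to compare by Id alone (as the question's edit suggests) are assumptions for illustration; only the base class library is used.

```csharp
using System;

// Hypothetical replacement for the anonymous type A1.
// Equality is based on Id alone, so IsTest does not affect grouping.
sealed class Alarm : IEquatable<Alarm>
{
    public Alarm(int id, bool isTest)
    {
        Id = id;
        IsTest = isTest;
    }

    public int Id { get; }
    public bool IsTest { get; }

    public bool Equals(Alarm other) => other != null && Id == other.Id;

    public override bool Equals(object obj) => Equals(obj as Alarm);

    // Must agree with Equals: equal objects need equal hash codes.
    public override int GetHashCode() => Id.GetHashCode();
}

class Program
{
    static void Main()
    {
        var a = new Alarm(1, isTest: true);
        var b = new Alarm(1, isTest: false);
        Console.WriteLine(a.Equals(b)); // prints True
    }
}
```

Because the default equality comparer picks up IEquatable<Alarm>, instances of such a type can be used directly as grouping keys.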
EDIT
I've not tested it, but you could also flatten it more simply like this, which I think is easier. It outputs A1 though, so you'll have to deal with that if you need to return the stream somewhere.
var idAlarmStream = idStream
.Select(i => new { Id = i, IsTest = true })
.GroupByUntil(key => key.Id, grp => grp.Throttle(alarmInterval(grp.Key)))
.SelectMany(grp => grp.TakeLast(1));

How to efficiently limit and then concatenate a result with a linq / lambda expression?

I am in the process of creating a service to make it easy for a user to select a protocol from the IANA - Protocol Registry.
As you might imagine, searching the registry for the term http pulls up a lot of hits. Since amt-soap-http is going to be selected by a user much less frequently than straight http, I decided that it would be a good idea to pull out everything that starts with http and then concatenate that with the remaining results.
The below lambda expression is the result of that thought process:
var records = this._ianaRegistryService.GetAllLike(term).ToList();
var results = records.Where(r => r.Name.StartsWith(term))
.OrderBy(r => r.Name)
.Concat(records.Where(r => !r.Name.StartsWith(term))
.OrderBy(r => r.Name))
.Take(MaxResultSize);
Unfortunately, I feel like I am iterating through my results more times than necessary. Premature optimization considerations aside, is there a combination of lambda expressions that would be more efficient than the above?
It might be more efficient as a two-step ordering:
var results = records.OrderBy(r => r.Name.StartsWith(term) ? 1 : 2)
.ThenBy(r => r.Name)
.Take(MaxResultSize);
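Here is a small self-contained run of that two-step ordering. The sample names and the result limit of 3 are made up for illustration, and ordinal comparison is chosen here so the ordering does not depend on the current culture.

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        var term = "http";
        // Made-up sample data standing in for the registry records.
        var records = new[] { "amt-soap-http", "https", "http", "http-alt", "ehttp" };

        var results = records
            // Names starting with the term sort into bucket 1, all others into bucket 2.
            .OrderBy(name => name.StartsWith(term, StringComparison.Ordinal) ? 1 : 2)
            // Within each bucket, sort by name.
            .ThenBy(name => name, StringComparer.Ordinal)
            .Take(3);

        Console.WriteLine(string.Join(", ", results)); // prints http, http-alt, https
    }
}
```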
Explaining what I am trying to do in comments is getting hard, so I will post it as another answer.
Suppose I want to sort a list of random integers first by whether they are even or odd and then in numerical order (simulating StartsWith with mod 2).
Here is the test case: action2 is the same as other answer.
If you run this code you will see that my suggestion (action1) is two times faster.
void Test()
{
Random rnd = new Random();
List<int> records = new List<int>();
for(int i=0;i<2000000;i++)
{
records.Add(rnd.Next());
}
Action action1 = () =>
{
var res1 = records.GroupBy(r => r % 2)
.OrderBy(x => x.Key)
.Select(x => x.OrderBy(y => y))
.SelectMany(x => x)
.ToList();
};
Action action2 = () =>
{
var res2 = records.OrderBy(x => x % 2).ThenBy(x => x).ToList();
};
//Avoid counting JIT
action1();
action2();
var sw = Stopwatch.StartNew();
action1();
long t1 = sw.ElapsedMilliseconds;
sw.Restart();
action2();
long t2 = sw.ElapsedMilliseconds;
Console.WriteLine(t1 + " " + t2);
}
