How can I combine two streams ordered then grouped by timestamp? - c#

I have two streams of objects that each have a Timestamp value. Both streams are in order, so for example the timestamps might be Ta = 1,3,6,6,7 in one stream and Tb = 1,2,5,5,6,8 in the other. Objects in both streams are of the same type.
What I'd like to be able to do is to put each of these events on the bus in order of timestamp, i.e., put A1, then B1, B2, A3 and so on. Furthermore, since some streams have several (sequential) elements with the same timestamp, I want those elements grouped so that each new event is an array. So we would put [A3] on the bus, followed by [A15,A25] and so on.
I've tried to implement this by making two ConcurrentQueue structures, putting each event at the back of its queue, then looking at the front of each queue, choosing the earlier event first and traversing that queue until all events with the same timestamp have been collected.
However, I've encountered two problems:
If I leave these queues unbounded, I quickly run out of memory as the read op is a lot faster than the handlers receiving the events. (I've got a few gigabytes of data).
I sometimes end up with a situation where I handle the event, say, A15 before A25 has arrived. I somehow need to guard against this.
I'm thinking that Rx can help in this regard but I don't see an obvious combinator(s) to make this possible. Thus, any advice is much appreciated.

Rx is indeed a good fit for this problem IMO.
IObservables can't 'OrderBy' for obvious reasons (you would have to observe the entire stream first to guarantee the correct output order), so my answer below makes the assumption (that you stated) that your 2 source event streams are in order.
It was an interesting problem in the end. The standard Rx operators are missing a GroupByUntilChanged that would have solved this easily, as long as it called OnCompleted on the previous group observable when the first element of the next group was observed. However, looking at the implementation of DistinctUntilChanged, it doesn't follow this pattern and only calls OnCompleted when the source observable completes (even though it knows there will be no more elements after the first non-distinct element... weird???). Anyway, for those reasons, I decided against a GroupByUntilChanged method (to not break Rx conventions) and went instead for a ToEnumerableUntilChanged.
Disclaimer: This is my first Rx extension so would appreciate feedback on my choices made. Also, one main concern of mine is the anonymous observable holding the distinctElements list.
Firstly, your application code is quite simple:
public class Event
{
    public DateTime Timestamp { get; set; }
}

private IObservable<Event> eventStream1;
private IObservable<Event> eventStream2;

public IObservable<IEnumerable<Event>> CombineAndGroup()
{
    return eventStream1.CombineLatest(eventStream2, (e1, e2) => e1.Timestamp < e2.Timestamp ? e1 : e2)
        .ToEnumerableUntilChanged(e => e.Timestamp);
}
Now for the ToEnumerableUntilChanged implementation (wall of code warning):
public static IObservable<IEnumerable<TSource>> ToEnumerableUntilChanged<TSource, TKey>(this IObservable<TSource> source, Func<TSource, TKey> keySelector)
{
    // TODO: Follow Rx conventions and create a superset overload that takes the IComparer as a parameter
    var comparer = EqualityComparer<TKey>.Default;
    return Observable.Create<IEnumerable<TSource>>(observer =>
    {
        var currentKey = default(TKey);
        var hasCurrentKey = false;
        var distinctElements = new List<TSource>();
        return source.Subscribe(value =>
        {
            TKey elementKey;
            try
            {
                elementKey = keySelector(value);
            }
            catch (Exception ex)
            {
                observer.OnError(ex);
                return;
            }
            if (!hasCurrentKey)
            {
                hasCurrentKey = true;
                currentKey = elementKey;
                distinctElements.Add(value);
                return;
            }
            bool keysMatch;
            try
            {
                keysMatch = comparer.Equals(currentKey, elementKey);
            }
            catch (Exception ex)
            {
                observer.OnError(ex);
                return;
            }
            if (keysMatch)
            {
                distinctElements.Add(value);
                return;
            }
            observer.OnNext(distinctElements);
            distinctElements.Clear();
            distinctElements.Add(value);
            currentKey = elementKey;
        }, observer.OnError, () =>
        {
            if (distinctElements.Count > 0)
                observer.OnNext(distinctElements);
            observer.OnCompleted();
        });
    });
}
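For reference, here is a minimal sketch (not part of the original answer) of how the extension could be exercised on a single, already-ordered stream. The sample offsets mirror Ta from the question; baseTime is an arbitrary starting point introduced purely for the example:
var baseTime = new DateTime(2012, 1, 1);

// Build an ordered stream of Events with timestamps 1, 3, 6, 6, 7 seconds after baseTime.
var source = new[] { 1, 3, 6, 6, 7 }
    .Select(t => new Event { Timestamp = baseTime.AddSeconds(t) })
    .ToObservable();

// Prints [1], [3], [6, 6], [7] - one group per run of equal timestamps.
source.ToEnumerableUntilChanged(e => e.Timestamp)
    .Subscribe(group => Console.WriteLine(
        "[" + string.Join(", ", group.Select(e => (e.Timestamp - baseTime).TotalSeconds)) + "]"));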

Related

For Each Row: The collection has been changed, the enumeration operation may not be performed [duplicate]

I can't get to the bottom of this error, because when the debugger is attached, it does not seem to occur.
Collection was modified; enumeration operation may not execute
Below is the code.
This is a WCF server in a Windows service. The method NotifySubscribers() is called by the service whenever there is a data event (at random intervals, but not very often - about 800 times per day).
When a Windows Forms client subscribes, the subscriber ID is added to the subscribers dictionary, and when the client unsubscribes, it is deleted from the dictionary. The error happens when (or after) a client unsubscribes. It appears that the next time the NotifySubscribers() method is called, the foreach() loop fails with the error in the subject line. The method writes the error into the application log as shown in the code below. When a debugger is attached and a client unsubscribes, the code executes fine.
Do you see a problem with this code? Do I need to make the dictionary thread-safe?
[ServiceBehavior(InstanceContextMode=InstanceContextMode.Single)]
public class SubscriptionServer : ISubscriptionServer
{
    private static IDictionary<Guid, Subscriber> subscribers;

    public SubscriptionServer()
    {
        subscribers = new Dictionary<Guid, Subscriber>();
    }

    public void NotifySubscribers(DataRecord sr)
    {
        foreach(Subscriber s in subscribers.Values)
        {
            try
            {
                s.Callback.SignalData(sr);
            }
            catch (Exception e)
            {
                DCS.WriteToApplicationLog(e.Message,
                    System.Diagnostics.EventLogEntryType.Error);
                UnsubscribeEvent(s.ClientId);
            }
        }
    }

    public Guid SubscribeEvent(string clientDescription)
    {
        Subscriber subscriber = new Subscriber();
        subscriber.Callback = OperationContext.Current.
            GetCallbackChannel<IDCSCallback>();
        subscribers.Add(subscriber.ClientId, subscriber);
        return subscriber.ClientId;
    }

    public void UnsubscribeEvent(Guid clientId)
    {
        try
        {
            subscribers.Remove(clientId);
        }
        catch(Exception e)
        {
            System.Diagnostics.Debug.WriteLine("Unsubscribe Error " +
                e.Message);
        }
    }
}
What's likely happening is that SignalData is indirectly changing the subscribers dictionary under the hood during the loop and leading to that message. You can verify this by changing
foreach(Subscriber s in subscribers.Values)
To
foreach(Subscriber s in subscribers.Values.ToList())
If I'm right, the problem will disappear.
Calling subscribers.Values.ToList() copies the values of subscribers.Values to a separate list at the start of the foreach. Nothing else has access to this list (it doesn't even have a variable name!), so nothing can modify it inside the loop.
When a subscriber unsubscribes, you are changing the contents of the collection of Subscribers during enumeration.
There are several ways to fix this, one being changing the foreach loop to use an explicit .ToList():
public void NotifySubscribers(DataRecord sr)
{
    foreach(Subscriber s in subscribers.Values.ToList())
                                               ^^^^^^^^^
    {
        ...
A more efficient way, in my opinion, is to keep a second list, declared as a field, into which you put everything that is "to be removed". Then, after you finish your main loop (without the .ToList()), you loop over the "to be removed" list and remove each entry. So in your class you add:
private List<Guid> toBeRemoved = new List<Guid>();
Then you change it to:
public void NotifySubscribers(DataRecord sr)
{
    toBeRemoved.Clear();

    ...your unchanged code skipped...

    foreach ( Guid clientId in toBeRemoved )
    {
        try
        {
            subscribers.Remove(clientId);
        }
        catch(Exception e)
        {
            System.Diagnostics.Debug.WriteLine("Unsubscribe Error " +
                e.Message);
        }
    }
}

...your unchanged code skipped...

public void UnsubscribeEvent(Guid clientId)
{
    toBeRemoved.Add( clientId );
}
This will not only solve your problem, it will prevent you from having to keep creating a list from your dictionary, which is expensive if there are a lot of subscribers in there. Assuming the list of subscribers to be removed on any given iteration is lower than the total number in the list, this should be faster. But of course feel free to profile it to be sure that's the case if there's any doubt in your specific usage situation.
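Putting those pieces together, a rough sketch of the whole notify method under this scheme (the try/catch body is unchanged from the question):
public void NotifySubscribers(DataRecord sr)
{
    toBeRemoved.Clear();
    foreach (Subscriber s in subscribers.Values)
    {
        try
        {
            s.Callback.SignalData(sr);
        }
        catch (Exception e)
        {
            DCS.WriteToApplicationLog(e.Message,
                System.Diagnostics.EventLogEntryType.Error);
            UnsubscribeEvent(s.ClientId);   // now only records the id; no removal during enumeration
        }
    }
    foreach (Guid clientId in toBeRemoved)
    {
        subscribers.Remove(clientId);
    }
}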
Why this error?
In general, .NET collections do not support being enumerated and modified at the same time. If you try to modify a collection during enumeration, it throws an exception. So the issue behind this error is that we cannot modify a list or dictionary while we are looping over that same collection.
One of the solutions
If we iterate over a dictionary using a list of its keys, we can modify the dictionary in the loop body, because we are actually enumerating the separate key list rather than the dictionary itself.
Example
// Get the key collection from the dictionary into a list to loop through
List<int> keys = new List<int>(Dictionary.Keys);

// Iterate the key collection using a simple foreach loop
foreach (int key in keys)
{
    // Now we can perform any modification on the values of the dictionary.
    Dictionary[key] = Dictionary[key] - 1;
}
Here is a blog post about this solution.
And for a deep dive in StackOverflow: Why this error occurs?
Okay, so what helped me was iterating backwards. I was trying to remove entries from a list while iterating forwards, which broke the loop because a removed entry no longer existed:
for (int x = myList.Count - 1; x > -1; x--)
{
    myList.RemoveAt(x);
}
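The same backwards pattern works with a filter, so only matching entries are removed; because the loop walks from the end, removing an element never shifts the indices of the elements that have not been visited yet. ShouldRemove here is just a placeholder for whatever condition you are removing on:
for (int x = myList.Count - 1; x > -1; x--)
{
    if (ShouldRemove(myList[x]))   // hypothetical predicate, not part of the original answer
        myList.RemoveAt(x);
}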
The accepted answer is imprecise and, in the worst case, incorrect: if changes are made while ToList() itself is executing, you can still end up with an error. Besides a lock, whose performance and thread-safety implications need to be considered if you have a public member, a proper solution can be to use immutable types.
In general, an immutable type means that you can't change the state of it once created.
So your code should look like:
public class SubscriptionServer : ISubscriptionServer
{
    private static ImmutableDictionary<Guid, Subscriber> subscribers = ImmutableDictionary<Guid, Subscriber>.Empty;

    public void SubscribeEvent(string id)
    {
        subscribers = subscribers.Add(Guid.NewGuid(), new Subscriber());
    }

    public void NotifyEvent()
    {
        foreach(var sub in subscribers.Values)
        {
            //.....This is always safe
        }
    }
    //.........
}
This can be especially useful if you have a public member. Other classes can always foreach on the immutable types without worrying about the collection being modified.
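For completeness, removal works the same way: every mutation returns a new dictionary, so an enumeration that is already in progress keeps seeing the old snapshot. A minimal sketch (my own addition) of an unsubscribe under this scheme:
public void UnsubscribeEvent(Guid clientId)
{
    // Swaps in a new dictionary; any foreach over the old one is unaffected.
    // If several threads can mutate concurrently, the swap itself still needs
    // a lock or ImmutableInterlocked to avoid lost updates.
    subscribers = subscribers.Remove(clientId);
}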
I want to point out another case not reflected in any of the answers. I have a Dictionary<TKey,TValue> shared in a multi-threaded app, which uses a ReaderWriterLockSlim to protect the read and write operations. This is a reading method that throws the exception:
public IEnumerable<Data> GetInfo()
{
    IEnumerable<Data> info = null;
    _cacheLock.EnterReadLock();
    try
    {
        info = _cache.Values.SelectMany(ce => ce.Data); // Add .ToList() here to avoid the exception.
    }
    finally
    {
        _cacheLock.ExitReadLock();
    }
    return info;
}
In general, it works fine, but from time to time I get the exception. The problem is a subtlety of LINQ: this code returns an IEnumerable<Data>, which has still not been enumerated when the section protected by the lock is left. So it can be changed by other threads before being enumerated, leading to the exception. The solution is to force the enumeration, for example with .ToList() as shown in the comment. That way the enumerable has already been enumerated before leaving the protected section.
So, if you use LINQ in a multi-threaded application, be sure to always materialize your queries before leaving the protected regions.
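In other words, the only change needed is to materialize inside the lock:
_cacheLock.EnterReadLock();
try
{
    // ToList() forces the query to run while the read lock is still held.
    info = _cache.Values.SelectMany(ce => ce.Data).ToList();
}
finally
{
    _cacheLock.ExitReadLock();
}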
InvalidOperationException: an InvalidOperationException occurs, reporting "collection was modified" in a foreach loop.
Use a break statement once the object is removed.
For example:
ArrayList list = new ArrayList();
foreach (var item in list)
{
    if (condition)
    {
        list.Remove(item);
        break;
    }
}
Actually, the problem seems to be that you are removing elements from the list and expecting to continue reading the list as if nothing had happened.
What you really need to do is start from the end and work back to the beginning. Even if you remove elements from the list, you will still be able to continue reading it.
I had the same issue, and it was solved when I used a for loop instead of foreach.
// foreach (var item in itemsToBeLast)
for (int i = 0; i < itemsToBeLast.Count; i++)
{
    var matchingItem = itemsToBeLast.FirstOrDefault(item => item.Detach);
    if (matchingItem != null)
    {
        itemsToBeLast.Remove(matchingItem);
        continue;
    }
    allItems.Add(itemsToBeLast[i]); // (attachDetachItem);
}
I've seen many options for this but to me this one was the best.
ListItemCollection collection = new ListItemCollection();
foreach (ListItem item in ListBox1.Items)
{
    if (item.Selected)
        collection.Add(item);
}
Then simply loop through the collection.
Be aware that a ListItemCollection can contain duplicates. By default there is nothing preventing duplicates being added to the collection. To avoid duplicates you can do this:
ListItemCollection collection = new ListItemCollection();
foreach (ListItem item in ListBox1.Items)
{
    if (item.Selected && !collection.Contains(item))
        collection.Add(item);
}
This approach covers the situation where the function is called again while it is still executing (and each item only needs to be used once):
while (list.Count > 0)
{
    string item = list[0];
    list.RemoveAt(0);
    // do here what you need to do with item
}
If the function gets called while it is still executing, items will not be iterated from the start again, because they are deleted as soon as they are used.
This should not affect performance much for small lists.
There is a link where this is elaborated very well and a solution is also given. For reference, the original link:
https://bensonxion.wordpress.com/2012/05/07/serializing-an-ienumerable-produces-collection-was-modified-enumeration-operation-may-not-execute/
When we use the .NET serialization classes to serialize an object whose definition contains an enumerable type (i.e. a collection), you will easily get an InvalidOperationException saying "Collection was modified; enumeration operation may not execute" when your code runs under multi-threaded scenarios. The root cause is that the serialization classes iterate through the collection via an enumerator, so the problem comes down to trying to iterate through a collection while modifying it.
As a first solution, we can simply use a lock as a synchronization mechanism to ensure that operations on the List object can only be executed by one thread at a time. Obviously there is a performance penalty: if you serialize a collection of those objects, the lock is taken for each of them.
.NET 4.0 makes dealing with multi-threading scenarios handy. For this serialized-collection-field problem, I found we can take advantage of the ConcurrentQueue class (check MSDN), which is a thread-safe, FIFO collection and makes the code lock-free.
Using this class is simple: replace the collection type with it, use Enqueue to add an element to the end of the ConcurrentQueue, and remove the lock code. Or, if the scenario you are working on requires list-like behaviour, you will need a bit more code to adapt the ConcurrentQueue to your fields.
By the way, ConcurrentQueue doesn't have a Clear method, because the underlying algorithm doesn't permit atomically clearing the collection, so you have to do it yourself; the fastest way is to re-create a new, empty ConcurrentQueue as a replacement.
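A minimal sketch of that substitution (the Holder type and its members are made up for illustration; the original field is assumed to have been a List<string> guarded by a lock):
public class Holder
{
    // Was: private List<string> _items = new List<string>(); protected by a lock.
    private ConcurrentQueue<string> _items = new ConcurrentQueue<string>();

    public void Add(string item)
    {
        _items.Enqueue(item);   // thread-safe, no lock required
    }

    public void Clear()
    {
        // Older frameworks have no ConcurrentQueue.Clear(); swapping in a fresh
        // instance is the workaround described above.
        _items = new ConcurrentQueue<string>();
    }
}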
Here is a specific scenario that warrants a specialized approach:
The Dictionary is enumerated frequently.
The Dictionary is modified infrequently.
In this scenario creating a copy of the Dictionary (or the Dictionary.Values) before every enumeration can be quite costly. My idea about solving this problem is to reuse the same cached copy in multiple enumerations, and watch an IEnumerator of the original Dictionary for exceptions. The enumerator will be cached along with the copied data, and interrogated before starting a new enumeration. In case of an exception the cached copy will be discarded, and a new one will be created. Here is my implementation of this idea:
using System;
using System.Collections;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

public class EnumerableSnapshot<T> : IEnumerable<T>, IDisposable
{
    private IEnumerable<T> _source;
    private IEnumerator<T> _enumerator;
    private ReadOnlyCollection<T> _cached;

    public EnumerableSnapshot(IEnumerable<T> source)
    {
        _source = source ?? throw new ArgumentNullException(nameof(source));
    }

    public IEnumerator<T> GetEnumerator()
    {
        if (_source == null) throw new ObjectDisposedException(this.GetType().Name);
        if (_enumerator == null)
        {
            _enumerator = _source.GetEnumerator();
            _cached = new ReadOnlyCollection<T>(_source.ToArray());
        }
        else
        {
            var modified = false;
            if (_source is ICollection collection) // C# 7 syntax
            {
                modified = _cached.Count != collection.Count;
            }
            if (!modified)
            {
                try
                {
                    _enumerator.MoveNext();
                }
                catch (InvalidOperationException)
                {
                    modified = true;
                }
            }
            if (modified)
            {
                _enumerator.Dispose();
                _enumerator = _source.GetEnumerator();
                _cached = new ReadOnlyCollection<T>(_source.ToArray());
            }
        }
        return _cached.GetEnumerator();
    }

    public void Dispose()
    {
        _enumerator?.Dispose();
        _enumerator = null;
        _cached = null;
        _source = null;
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class EnumerableSnapshotExtensions
{
    public static EnumerableSnapshot<T> ToEnumerableSnapshot<T>(
        this IEnumerable<T> source) => new EnumerableSnapshot<T>(source);
}
Usage example:
private static IDictionary<Guid, Subscriber> _subscribers;
private static EnumerableSnapshot<Subscriber> _subscribersSnapshot;

// ...(in the constructor)
_subscribers = new Dictionary<Guid, Subscriber>();
_subscribersSnapshot = _subscribers.Values.ToEnumerableSnapshot();

// ...(elsewhere)
foreach (var subscriber in _subscribersSnapshot)
{
    //...
}
Unfortunately this idea cannot be used currently with the class Dictionary in .NET Core 3.0, because this class does not throw a Collection was modified exception when enumerated and the methods Remove and Clear are invoked. All other containers I checked are behaving consistently. I checked systematically these classes:
List<T>, Collection<T>, ObservableCollection<T>, HashSet<T>, SortedSet<T>, Dictionary<T,V> and SortedDictionary<T,V>. Only the two aforementioned methods of the Dictionary class in .NET Core are not invalidating the enumeration.
Update: I fixed the above problem by comparing also the lengths of the cached and the original collection. This fix assumes that the dictionary will be passed directly as an argument to the EnumerableSnapshot's constructor, and its identity will not be hidden by (for example) a projection like: dictionary.Select(e => e).ToEnumerableSnapshot().
Important: The above class is not thread safe. It is intended to be used from code running exclusively in a single thread.
You can copy the subscribers dictionary to a temporary dictionary of the same type and then iterate over that temporary dictionary with a foreach loop.
A different way to solve this problem is, instead of removing the elements, to create a new dictionary, add only the elements you did not want to remove, and then replace the original dictionary with the new one. I don't think this is much of an efficiency problem, because it does not increase the number of times you iterate over the structure.
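A sketch of that rebuild approach, using the subscribers dictionary from the question and a hypothetical ShouldKeep predicate standing in for your own test:
var kept = new Dictionary<Guid, Subscriber>();
foreach (var pair in subscribers)
{
    if (ShouldKeep(pair.Value))      // placeholder condition, not part of the original answer
        kept.Add(pair.Key, pair.Value);
}
subscribers = kept;                  // swap in the rebuilt dictionary in one step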

How to get the count of items in an observable stream without maintaining state yourself?

How do I get the number of students in this school at any given point in time using the Rx idiom and without having to maintain state in the School class myself?
using System;
using System.Reactive.Subjects;
using System.Threading; // for Interlocked

namespace SchoolManagementSystem
{
    public class School
    {
        private ISubject<Student> _subject = null;
        private int _maxNumberOfSeats;
        private int _numberOfStudentsAdmitted;

        public string Name { get; set; }

        public School(string name, int maxNumberOfSeats)
        {
            Name = name;
            _maxNumberOfSeats = maxNumberOfSeats;
            _numberOfStudentsAdmitted = 0;
            _subject = new ReplaySubject<Student>();
        }

        public void AdmitStudent(Student student)
        {
            try
            {
                if (student == null)
                    throw new ArgumentNullException("student");
                if (_numberOfStudentsAdmitted == _maxNumberOfSeats)
                {
                    _subject.OnCompleted();
                }
                // Obviously can't do this because this will
                // create a kind of dead lock in that it will
                // wait for the _subject to complete, but I am
                // using the same _subject to issue notifications.
                // _numberOfStudentsAdmitted = _subject.Count().Wait();
                // OR to keep track of state myself
                Interlocked.Increment(ref _numberOfStudentsAdmitted);
                _subject?.OnNext(student);
            }
            catch (Exception ex)
            {
                _subject.OnError(ex);
            }
        }

        public IObservable<Student> Students
        {
            get
            {
                return _subject;
            }
        }
    }
}
Or is this just not in tandem with the principles of components designed using Rx?
Is this something that should be the responsibility of the client (to get the count and do all side-effects in the onNext handler)? And that the observables should simply act as stateless signal-sources or gates much like the hardware interrupt routines that simply signal to the CPU that something of interest has happened?
In that case, we lose the criteria for the observable to signal completion. How then it is supposed to know when to complete?
You can use the Count() method on your _subject sequence. It will itself create an observable sequence where each value produced represents the latest total number of students in _subject.
You could then react to this sequence of student count values. The Zip() operation could be useful in that regard, since it has the advantage on completing the resulting sequence when any of its inner sequences complete, which you can force with a TakeWhile.
The result looks something like this:
Observable.Zip(
    _subject.Select(student => student != null ? student : throw new ArgumentNullException("student")),
    _subject.Count().TakeWhile(studentCount => studentCount < _maxNumberOfSeats),
    (student, count) => student
);
All that would be left to do in the AdmitStudent method body would simply be to push any new student to the sequence with _subject?.OnNext(student) (like you already do), but without the extra logic. You could also modify this a bit to make sure that _subject itself also gets completed once the maximum student count is reached, but I'm not certain about your business rules, so I'll leave that for you to decide.
One last thing I can recommend is to play with the extensions for Rx types and to have a look around this website, which uses them liberally.
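If all you want is a running total you can observe, one further option (my own suggestion, not part of the answer above) is to let Scan carry the counter so the School class itself stays stateless:
// Emits 1, 2, 3, ... as students are admitted; the count lives inside the Scan operator.
IObservable<int> studentCount = _subject.Scan(0, (count, _) => count + 1);

studentCount.Subscribe(n => Console.WriteLine("Students admitted so far: " + n));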

How to transform an exception to an event and to resubscribe to the faulted IObservable?

How should one approach transforming exceptions in an IObservable stream into plain domain objects, and resubscribing to the stream transparently?
Addendum: As James points out in the comment, my use case idea was something like having a should-be continuous stream over an unreliable source, e.g. a network. In case of a glitch, just try to reconnect to the source, but notify the downstream processor.
In fact, this relates to my other question at Translating a piece of asynchronous C# code to F# (with Reactive Extensions and FSharpx), which in turn stems from How to implement polling using Observables?.
In fact, now that I think of it, I could first use the code at How to write a generic, recursive extension method in F#? ("RetryAfterDelay") (with some more parameters to adjust the RetryAfterDelay behaviour) and chain it with this implementation. When the retries are exhausted, a domain error will be produced and the poller will be reinitiated. Granted, there probably will be a more efficient way, but nevertheless. :) ... Or provide just a callback function to log errors instead of transforming them into domain events. Well, choices abound...
But back to the original code...
For instance, if I have
public enum EventTypeEnum
{
    None = 0,
    Normal = 1,
    Faulted = 2
}

public class Event
{
    public EventTypeEnum Type { get; set; }
}

private static IObservable<int> FaultingSequence1()
{
    var subject = new ReplaySubject<int>();
    subject.OnNext(1);
    subject.OnNext(2);
    subject.OnError(new InvalidOperationException("Something went wrong!"));
    return subject;
}

private static IEnumerable<int> FaultingSequence2()
{
    for (int i = 0; i < 3; ++i)
    {
        yield return 1;
    }
    throw new InvalidOperationException("Something went wrong!");
}
// Additional pondering: why isn't FaultingSequence2().ToObservable() also protected by Catch?
//
// This part is for illustrative purposes here. This is the piece I'd like to
// behave so that exceptions get transformed into Events with EventTypeEnum.Faulted
// and passed along to the subscribed stream while resubscribing to
// FaultingSequence1. That is, the subscriber would learn about the fault through a
// domain event type.
// Retry does the resubscribing, but only on OnError.
var stream = FaultingSequence1().Catch<int, Exception>(ex =>
{
    Console.WriteLine("Exception: {0}", ex);
    return Observable.Throw<int>(ex);
}).Retry().Select(i => new Event { Type = EventTypeEnum.Normal });

// How do I get this to print "Event type: Normal", "Event type: Normal", "Event type: Faulted"?
stream.Subscribe(i => Console.WriteLine("Event type: {0}", i.Type));
This problem has really got me now! Any advice?
There's an operator called Materialize which converts each event into a Notification<T>:
OnNext: OnNext of a Notification<T> with Kind OnNext, containing the value.
OnError: OnNext of a Notification<T> with Kind OnError, containing the exception, followed by OnCompleted.
OnCompleted: OnNext of a Notification<T> with Kind OnCompleted, followed by OnCompleted.
So the subscription still completes when either OnError or OnCompleted is invoked, but OnError is never invoked on the subscriber. So you should be able to do something like this:
source
    .Materialize()
    .Repeat();
However, this will resubscribe to the source even when the original subscription completes naturally (via OnCompleted).
So maybe you still want OnError to be invoked, but you also want the exception from the original OnError to be passed through OnNext inside of a Notification<T>. For that, you could use something like this:
source
    .Materialize()
    .SelectMany(notification =>
        notification.Kind == NotificationKind.OnError
            // Re-raise the error after passing the notification downstream
            // (source is assumed to be IObservable<int> here).
            ? Observable.Return(notification).Concat(Observable.Throw<Notification<int>>(notification.Exception))
            : Observable.Return(notification))
    .Retry();
In this manner, if the subscription completes naturally (via OnCompleted), then the source will not be resubscribed.
Once you have that set up, it's easy enough to map each kind of notification to whatever domain object you want to use:
source
    .Materialize()
    .SelectMany(notification =>
        notification.Kind == NotificationKind.OnError
            ? Observable.Return(notification).Concat(Observable.Throw<Notification<int>>(notification.Exception))
            : Observable.Return(notification))
    .Retry()
    .Select(notification => {
        switch (notification.Kind) {
            case (NotificationKind.OnNext): return // something.
            case (NotificationKind.OnError): return // something.
            case (NotificationKind.OnCompleted): return // something.
            default: throw new NotImplementedException();
        }
    });
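Tying that back to the types in the question, the mapping might look roughly like the sketch below. This is my own filling-in of the "// something" placeholders above: source is assumed to be the IObservable<int> from FaultingSequence1(), and OnCompleted notifications are simply filtered out.
// Sketch only: map the surviving notifications onto the question's Event type.
IObservable<Event> events = source
    .Materialize()
    .SelectMany(notification =>
        notification.Kind == NotificationKind.OnError
            ? Observable.Return(notification).Concat(Observable.Throw<Notification<int>>(notification.Exception))
            : Observable.Return(notification))
    .Retry()
    .Where(notification => notification.Kind != NotificationKind.OnCompleted)
    .Select(notification => new Event
    {
        Type = notification.Kind == NotificationKind.OnError
            ? EventTypeEnum.Faulted
            : EventTypeEnum.Normal
    });

// Prints "Event type: Normal", "Event type: Normal", "Event type: Faulted", then resubscribes.
events.Subscribe(e => Console.WriteLine("Event type: {0}", e.Type));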

How to expose IObservable<T> properties without using Subject<T> backing field

In this answer to a question about Subject<T> Enigmativity mentioned:
as an aside, you should try to avoid using subjects at all. The general rule is that if you're using a subject then you're doing something wrong.
I often use subjects as backing fields for IObservable properties, which would have probably been .NET events in the days before Rx. e.g. instead of something like
public class Thing
{
    public event EventHandler SomethingHappened;

    private void DoSomething()
    {
        Blah();
        SomethingHappened(this, EventArgs.Empty);
    }
}
I might do
public class Thing
{
    private readonly Subject<Unit> somethingHappened = new Subject<Unit>();

    public IObservable<Unit> SomethingHappened
    {
        get { return somethingHappened; }
    }

    private void DoSomething()
    {
        Blah();
        somethingHappened.OnNext(Unit.Default);
    }
}
So, if I want to avoid using Subject, what would be the correct way of doing this kind of thing? Or should I stick to using .NET events in my interfaces, even when they'll be consumed by Rx code (so probably FromEventPattern)?
Also, a bit more details on why using Subject like this is a bad idea would be helpful.
Update: To make this question a bit more concrete, I'm talking about using Subject<T> as a way to get from non-Rx code (maybe you're working with some other legacy code) into the Rx world. So, something like:
class MyVolumeCallback : LegacyApiForSomeHardware
{
    private readonly Subject<int> volumeChanged = new Subject<int>();

    public IObservable<int> VolumeChanged
    {
        get
        {
            return volumeChanged.AsObservable();
        }
    }

    protected override void UserChangedVolume(int newVolume)
    {
        volumeChanged.OnNext(newVolume);
    }
}
Where, instead of using events, the LegacyApiForSomeHardware type makes you override virtual methods as a way of getting "this just happened" notifications.
For one thing, someone can cast the SomethingHappened back to an ISubject and feed things into it from the outside. At the very least, apply AsObservable to it in order to hide the subject-ness of the underlying object.
Also, a subject's broadcasting of callbacks is no more robust than a plain .NET event. For example, if one observer throws, the observers that are next in the chain won't be called.
static void D()
{
    Action<int> a = null;

    a += x =>
    {
        Console.WriteLine("1> " + x);
    };

    a += x =>
    {
        Console.WriteLine("2> " + x);
        if (x == 42)
            throw new Exception();
    };

    a += x =>
    {
        Console.WriteLine("3> " + x);
    };

    a(41);

    try
    {
        a(42); // 2> throwing will prevent 3> from observing 42
    }
    catch { }

    a(43);
}

static void S()
{
    Subject<int> s = new Subject<int>();

    s.Subscribe(x =>
    {
        Console.WriteLine("1> " + x);
    });

    s.Subscribe(x =>
    {
        Console.WriteLine("2> " + x);
        if (x == 42)
            throw new Exception();
    });

    s.Subscribe(x =>
    {
        Console.WriteLine("3> " + x);
    });

    s.OnNext(41);

    try
    {
        s.OnNext(42); // 2> throwing will prevent 3> from observing 42
    }
    catch { }

    s.OnNext(43);
}
In general, the caller is dead once an observer throws, unless you protect every On* call (but don't swallow exceptions arbitrarily, as shown above). This is the same for multicast delegates; exceptions will swing back at you.
Most of the time, you can achieve what you want to do without a subject, e.g. by using Observable.Create to construct a new sequence. Such sequences don't have an "observer list" that results from multiple subscriptions; each observer has its own "session" (the cold observable model), so an exception from an observer is nothing more than a suicide command in a confined area rather than blowing yourself up in the middle of a square.
Essentially, subjects are best used at the edges of the reactive query graph (for ingress streams that need to be addressable by another party that feeds in the data, though you could use regular .NET events for this and bridge them to Rx using FromEvent* methods) and for sharing subscriptions within a reactive query graph (using Publish, Replay, etc. which are Multicast calls in disguise, using a subject). One of the dangers of using subjects - which are very stateful due to their observer list and potential recording of messages - is to use them when trying to write a query operator using subjects. 99.999% of the time, such stories have a sad ending.
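As a concrete illustration of the event-bridging option mentioned above, the first Thing example from the question could be consumed without any Subject in between; a minimal sketch, assuming SomethingHappened stays a plain .NET event:
var thing = new Thing();

// Bridge the .NET event into Rx; subscribing attaches the handler and
// disposing the subscription detaches it again.
var somethingHappened = Observable.FromEventPattern(
    h => thing.SomethingHappened += h,
    h => thing.SomethingHappened -= h);

using (somethingHappened.Subscribe(_ => Console.WriteLine("Something happened")))
{
    // ... cause thing.DoSomething() to run while subscribed ...
}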
In an answer on the Rx forum, Dave Sexton (of Rxx) said, as part of an answer to something:
Subjects are the stateful components of Rx. They are useful for when
you need to create an event-like observable as a field or a local
variable.
Which is exactly what's happening with this question. He also wrote an in-depth follow-up blog post, To Use Subject Or Not To Use Subject?, which concludes with:
When should I use a subject?
When all of the following are true:
you don't have an observable or anything that can be converted into one.
you require a hot observable.
the scope of your observable is a type.
you don't need to define a similar event and no similar event already exists.
Why should I use a subject in that case?
Because you've got no choice!
So, answering the inner question of "details on why using Subject like this is a bad idea": it's not a bad idea; this is one of the few places where using a Subject is the correct way to do things.
While I can't speak for Enigmativity directly, I imagine it's because it's very low-level, something you don't really need to use directly; everything that's offered by the Subject<T> class can be achieved by using the classes in the System.Reactive.Linq namespace.
Taking the example from the Subject<T> documentation:
Subject<string> mySubject = new Subject<string>();

//*** Create news feed #1 and subscribe mySubject to it ***//
NewsHeadlineFeed NewsFeed1 = new NewsHeadlineFeed("Headline News Feed #1");
NewsFeed1.HeadlineFeed.Subscribe(mySubject);

//*** Create news feed #2 and subscribe mySubject to it ***//
NewsHeadlineFeed NewsFeed2 = new NewsHeadlineFeed("Headline News Feed #2");
NewsFeed2.HeadlineFeed.Subscribe(mySubject);
This is easily achieved with the Merge extension method on the Observable class:
IObservable<string> feeds =
    new NewsHeadlineFeed("Headline News Feed #1").HeadlineFeed.Merge(
        new NewsHeadlineFeed("Headline News Feed #2").HeadlineFeed);
Which you can then subscribe to normally. Using Subject<T> just makes the code more complex. If you're going to use Subject<T> then you should be doing some very low-level processing of observables where the extension methods fail you.
One approach for classes which have simple one-off events, is to provide a ToObservable method which creates a meaningful cold observable based on an event.
This is more readable than using the Observable factory methods, and allows developers who don't use Rx to make use of the API.
public IObservable<T> ToObservable()
{
    return Observable.Create<T>(observer =>
    {
        Action notifier = () =>
        {
            switch (Status)
            {
                case FutureStatus.Completed:
                    observer.OnNext(Value);
                    observer.OnCompleted();
                    break;
                case FutureStatus.Cancelled:
                    observer.OnCompleted();
                    break;
                case FutureStatus.Faulted:
                    observer.OnError(Exception);
                    break;
            }
        };
        Resolve += notifier;
        return () => Resolve -= notifier;
    });
}

Best data structure for thread-safe list of subscriptions?

I am trying to build a subscription list. Let's take the example:
list of Publishers, each having a list of Magazines, each having a list of subscribers
Publishers --> Magazines --> Subscribers
It makes sense to use a Dictionary within a Dictionary within a Dictionary in C#. Is it possible to do this without locking the entire structure when adding/removing a subscriber, and without race conditions?
Also the code gets messy very quickly in C# which makes me think I am not going down the right path. Is there an easier way to do this? Here are the constructor and subscribe method:
Note: The code uses Source, Type, Subscriber instead of the names above
Source ---> Type ---> Subscriber
public class SubscriptionCollection<SourceT, TypeT, SubscriberT>
{
    // Race conditions here I'm sure! Not locking anything yet but should revisit at some point
    ConcurrentDictionary<SourceT, ConcurrentDictionary<TypeT, ConcurrentDictionary<SubscriberT, SubscriptionInfo>>> SourceTypeSubs;

    public SubscriptionCollection()
    {
        SourceTypeSubs = new ConcurrentDictionary<SourceT, ConcurrentDictionary<TypeT, ConcurrentDictionary<SubscriberT, SubscriptionInfo>>>();
    }

    public void Subscribe(SourceT sourceT, TypeT typeT, SubscriberT subT)
    {
        ConcurrentDictionary<TypeT, ConcurrentDictionary<SubscriberT, SubscriptionInfo>> typesANDsubs;
        if (SourceTypeSubs.TryGetValue(sourceT, out typesANDsubs))
        {
            ConcurrentDictionary<SubscriberT, SubscriptionInfo> subs;
            if (typesANDsubs.TryGetValue(typeT, out subs))
            {
                SubscriptionInfo subInfo;
                if (subs.TryGetValue(subT, out subInfo))
                {
                    // Subscription already exists - do nothing
                }
                else
                {
                    subs.TryAdd(subT, new SubscriptionInfo());
                }
            }
            else
            {
                // This type does not exist - first add type, then subscription
                var newType = new ConcurrentDictionary<SubscriberT, SubscriptionInfo>();
                newType.TryAdd(subT, new SubscriptionInfo());
                typesANDsubs.TryAdd(typeT, newType);
            }
        }
        else
        {
            // This source does not exist - first add source, then type, then subscription
            var newSource = new ConcurrentDictionary<TypeT, ConcurrentDictionary<SubscriberT, SubscriptionInfo>>();
            var newType = new ConcurrentDictionary<SubscriberT, SubscriptionInfo>();
            newType.TryAdd(subT, new SubscriptionInfo());
            newSource.TryAdd(typeT, newType);
            SourceTypeSubs.TryAdd(sourceT, newSource);
        }
    }
}
If you use ConcurrentDictionary, like you already do, you don't need locking, that's already taken care of.
But you still have to think about race conditions and how to deal with them. Fortunately, ConcurrentDictionary gives you exactly what you need. For example, if two threads both try to subscribe to a source that doesn't exist yet at the same time, only one of them will succeed. But that's why TryAdd() returns whether the addition was successful; you can't just ignore its return value. If it returns false, you know some other thread already added that source, so you can retrieve the dictionary now.
Another option is to use the GetOrAdd() method. It retrieves already existing value, and creates it if it doesn't exist yet.
I would rewrite your code like this (and make it much simpler along the way):
public void Subscribe(SourceT sourceT, TypeT typeT, SubscriberT subT)
{
    var typesAndSubs = SourceTypeSubs.GetOrAdd(sourceT,
        _ => new ConcurrentDictionary<TypeT, ConcurrentDictionary<SubscriberT, SubscriptionInfo>>());

    var subs = typesAndSubs.GetOrAdd(typeT,
        _ => new ConcurrentDictionary<SubscriberT, SubscriptionInfo>());

    subs.GetOrAdd(subT, _ => new SubscriptionInfo());
}
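Usage then stays simple; for example, with string keys purely for illustration:
var subscriptions = new SubscriptionCollection<string, string, string>();

// Safe to call concurrently from multiple threads.
subscriptions.Subscribe("PublisherA", "MagazineX", "alice");
subscriptions.Subscribe("PublisherA", "MagazineX", "bob");
subscriptions.Subscribe("PublisherA", "MagazineY", "alice");
Note that the value factory passed to GetOrAdd may run more than once under contention, but only one of the created dictionaries is actually stored, which is harmless here because the inner dictionaries are cheap to construct and have no external side effects.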
