I've been doing some work lately with the Reactive Framework and have been absolutely loving it so far. I'm looking at replacing a traditional polling message queue with some filtered IObservables to clean up my server operations. In the old way, I dealt with messages coming into the server like so:
// Start spinning the process message loop
Task.Factory.StartNew(() =>
{
while (true)
{
Command command = m_CommandQueue.Take();
ProcessMessage(command);
}
}, TaskCreationOptions.LongRunning);
This results in a continuously polling thread that hands commands from clients off to the ProcessMessage method, where a series of if/else-if statements determines the type of each command and delegates work based on that type.
I am replacing this with an event driven system using Reactive for which I've written the following code:
private BlockingCollection<BesiegedMessage> m_MessageQueue = new BlockingCollection<BesiegedMessage>();
private IObservable<BesiegedMessage> m_MessagePublisher;
m_MessagePublisher = m_MessageQueue
.GetConsumingEnumerable()
.ToObservable(TaskPoolScheduler.Default);
// All generic Server messages (containing no properties) will be processed here
IDisposable genericServerMessageSubscriber = m_MessagePublisher
.Where(message => message is GenericServerMessage)
.Subscribe(message =>
{
// do something with the generic server message here
});
My question is: while this works, is it good practice to use a BlockingCollection as the backing for an IObservable like this? I don't see where Take() is ever called this way, which makes me think the messages will pile up on the queue without being removed after they have been processed.
Would it be more efficient to look into Subjects as the backing collection to drive the filtered IObservables that will be picking up these messages? Is there anything else I'm missing here that might benefit the architecture of this system?
Here is a complete worked example, tested under Visual Studio 2012.
Create a new C# console app.
Right click on your project, select "Manage NuGet Packages", and add "Reactive Extensions - Main Library".
Add this C# code:
using System;
using System.Collections.Concurrent;
using System.Reactive.Concurrency;
using System.Reactive.Linq;
namespace DemoRX
{
class Program
{
static void Main(string[] args)
{
BlockingCollection<string> myQueue = new BlockingCollection<string>();
{
IObservable<string> ob = myQueue.
GetConsumingEnumerable().
ToObservable(TaskPoolScheduler.Default);
ob.Subscribe(p =>
{
// This handler will get called whenever
// anything appears on myQueue in the future.
Console.Write("Consuming: {0}\n",p);
});
}
// Now, adding items to myQueue will trigger the item to be consumed
// in the predefined handler.
myQueue.Add("a");
myQueue.Add("b");
myQueue.Add("c");
Console.Write("[any key to exit]\n");
Console.ReadKey();
}
}
}
You will see this on the console:
[any key to exit]
Consuming: a
Consuming: b
Consuming: c
The really nice thing about using RX is that you can use the full power of LINQ to filter out any unwanted messages. For example, add a .Where clause to filter by "a", and observe what happens:
ob.Where(o => (o == "a")).Subscribe(p =>
{
// This will get called whenever something appears on myQueue.
Console.Write("Consuming: {0}\n",p);
});
Philosophical notes
The advantage of this method over spinning up a dedicated thread to poll the queue is that you don't have to worry about disposing of the thread properly when the program exits. This means you don't have to bother with IDisposable or CancellationToken (which is otherwise required when dealing with a BlockingCollection, or your program might hang on exit with a thread that refuses to die).
Believe me, it's not as easy as you think to write completely robust code to consume events coming out of a BlockingCollection. I much prefer the RX method shown above: it's cleaner, more robust, has less code, and it lets you filter with LINQ.
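For comparison, here is a minimal sketch (my own, not part of the worked example above) of the manual consumer loop you would otherwise need, including the CancellationToken plumbing that keeps the thread from hanging on exit:
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;
class ManualConsumerDemo
{
    static void Main()
    {
        var queue = new BlockingCollection<string>();
        var cts = new CancellationTokenSource();
        var consumer = Task.Factory.StartNew(() =>
        {
            try
            {
                while (true)
                {
                    // Throws OperationCanceledException when cts is cancelled,
                    // which is what finally lets the thread die on shutdown.
                    string item = queue.Take(cts.Token);
                    Console.Write("Consuming: {0}\n", item);
                }
            }
            catch (OperationCanceledException) { /* normal shutdown */ }
        }, TaskCreationOptions.LongRunning);
        queue.Add("a");
        queue.Add("b");
        Console.Write("[any key to exit]\n");
        Console.ReadKey();
        cts.Cancel();      // without this, Take() would block forever
        consumer.Wait();
    }
}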
Latency
I was surprised at how fast this method is.
On my Xeon X5650 @ 2.67GHz, it takes 5 seconds to process 10 million events, which works out to approximately 0.5 microseconds per event. It took 4.5 seconds to put the items into the BlockingCollection, so RX was taking them out and processing them almost as fast as they were going in.
Threading
In all of my tests, RX only spun up one thread to handle the tasks on the queue.
This means that we have a very nice pattern: we can use RX to collect incoming data from multiple threads, place them into a shared queue, then process the queue contents on a single thread (which is, by definition, thread safe).
This pattern eliminates a huge amount of headaches when dealing with multithreaded code, by decoupling the producer and consumer of data via a queue, where the producer could be multi-threaded and the consumer is single-threaded and thus thread-safe. This is the concept that makes Erlang so robust. For more information on this pattern, see Multi-threading made ridiculously simple.
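To make the pattern concrete, here is a small sketch (my own, reusing the same BlockingCollection-plus-ToObservable setup as the worked example above) with several producer threads feeding the single consumer:
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Reactive.Concurrency;
using System.Reactive.Linq;
using System.Threading.Tasks;
class MultiProducerDemo
{
    static void Main()
    {
        var queue = new BlockingCollection<string>();
        // One subscription: everything on the queue is processed on a single TaskPool thread.
        var consumer = queue.GetConsumingEnumerable()
                            .ToObservable(TaskPoolScheduler.Default)
                            .Subscribe(s => Console.Write("Consuming: {0}\n", s));
        // Three producers on three different threads share the same queue.
        var producers = Enumerable.Range(0, 3)
            .Select(p => Task.Factory.StartNew(() =>
            {
                for (int i = 0; i < 5; i++)
                    queue.Add(string.Format("producer {0}, item {1}", p, i));
            }))
            .ToArray();
        Task.WaitAll(producers);
        Console.Write("[any key to exit]\n");
        Console.ReadKey();
        consumer.Dispose();
    }
}
However many producers you add, the consumer side stays on one thread, which matches the threading observation above.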
Here's something pulled directly from my posterior - any real solution would be very much dependent on your actual usage, but here's "The cheapest pseudo Message Queue system ever":
Thoughts/motivations:
Deliberate exposure of IObservable<T> such that subscribers can do any filtering/cross subscriptions they want to
The overall Queue is typeless, but Register and Publish are type-safe(ish)
YMMV with the Publish() where it is - try experimenting with moving it around
Generally Subject is a no-no, although in this case it does make for some SIMPLE code.
One could "internalize" the registration to actually do the subscription as well, but then the queue would need to manage the IDisposables created - bah, let your consumers deal with it!
The Code:
// Namespaces needed (from the Rx NuGet package):
using System;
using System.Reactive.Linq;      // OfType, Publish, RefCount
using System.Reactive.Subjects;  // Subject<T>
public class TheCheapestPubSubEver
{
private Subject<object> _inner = new Subject<object>();
public IObservable<T> Register<T>()
{
return _inner.OfType<T>().Publish().RefCount();
}
public void Publish<T>(T message)
{
_inner.OnNext(message);
}
}
Usage:
void Main()
{
var queue = new TheCheapestPubSubEver();
var ofString = queue.Register<string>();
var ofInt = queue.Register<int>();
using(ofInt.Subscribe(i => Console.WriteLine("An int! {0}", i)))
using(ofString.Subscribe(s => Console.WriteLine("A string! {0}", s)))
{
queue.Publish("Foo");
queue.Publish(1);
Console.ReadLine();
}
}
Output:
A string! Foo
An int! 1
HOWEVER, this doesn't strictly enforce "consuming consumers" - multiple Registers of a specific type would result in multiple observer calls - that is:
var queue = new TheCheapestPubSubEver();
var ofString = queue.Register<string>();
var anotherOfString = queue.Register<string>();
var ofInt = queue.Register<int>();
using(ofInt.Subscribe(i => Console.WriteLine("An int! {0}", i)))
using(ofString.Subscribe(s => Console.WriteLine("A string! {0}", s)))
using(anotherOfString.Subscribe(s => Console.WriteLine("Another string! {0}", s)))
{
queue.Publish("Foo");
queue.Publish(1);
Console.ReadLine();
}
Results in:
A string! Foo
Another string! Foo
An int! 1
I haven't used BlockingCollection in this context, so I'm conjecturing here - you should run it yourself to confirm or disprove the following.
BlockingCollection might only further complicate things here (or provide little help). Take a look at this post from Jon - simply to confirm. GetConsumingEnumerable will provide a 'per subscriber' enumerable, and the subscribers will eventually exhaust it between them - something to keep in mind with Rx.
Also, IEnumerable<>.ToObservable further flattens out the 'source'. As it works (you can look up the source - something I'd recommend with Rx more than anything), each Subscribe creates its own 'enumerator', so each subscriber gets its own version of the feed. I'm really not sure how that pans out in an Observable scenario like this.
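As a quick check of that last point (my own sketch, not from the answer above): with a consuming enumerable, two subscriptions to the same ToObservable end up competing for items rather than each receiving every item.
using System;
using System.Collections.Concurrent;
using System.Reactive.Concurrency;
using System.Reactive.Linq;
class CompetingConsumersDemo
{
    static void Main()
    {
        var queue = new BlockingCollection<int>();
        var ob = queue.GetConsumingEnumerable().ToObservable(TaskPoolScheduler.Default);
        // Each Subscribe starts its own enumeration, and every item is Taken exactly once,
        // so A and B split the items between them instead of both seeing all of them.
        using (ob.Subscribe(i => Console.WriteLine("A got {0}", i)))
        using (ob.Subscribe(i => Console.WriteLine("B got {0}", i)))
        {
            for (int i = 0; i < 10; i++) queue.Add(i);
            Console.ReadKey();
        }
    }
}
Which handler receives a given item is effectively arbitrary, so this behaves like competing consumers, not a broadcast - which is exactly why a Subject (or Publish) comes into play for app-wide messages.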
Anyhow - if you want to provide app-wide messages, IMO you'd need to introduce a Subject or state in some other form (e.g. Publish etc.). And in that sense, I don't think BlockingCollection will help - but again, it's best that you try it out yourself.
Note (a philosophical one)
If you want to combine message types, or combine different sources - e.g. in a more 'real world' scenario - it gets more complex, and quite interesting I must say.
Keep an eye on having them 'rooted' into a single shared stream (and avoid what Jer rightly suggested avoiding).
I'd recommend that you don't try to evade using Subject. For what you need, that's your friend - never mind all the no-state discussions and how Subject is bad - you effectively have state (and you need 'state') - Rx kicks in 'after the fact', so you enjoy its benefits regardless.
I encourage you to go that way, as I love how it turned out for me.
My issue here is that we have turned a Queue (which I normally associate with destructive reads by one consumer, especially if you are using BlockingCollection) into a broadcast (send to anyone and everyone listening right now).
These seem like two conflicting ideas.
I have seen this done, but it was then thrown away because it was the "right solution to the wrong question".
Related
I have an Async processing pipeline. I'm implementing a constraint such that I need to limit the number of submissions to the next stage. For my component, I have:
a single input source (items are tagged with a source id)
a single destination that I need to propagate the inputs to in a round-robin fashion
If capacity is available for multiple clients, I'll forward a message for each (i.e. if I wake because client 3's semaphore has finally become available, I may first send a message for client 2, then 3, etc)
The processing loop is thus waiting on one or more of the following conditions to continue processing:
more input has arrived (it might be for a client that is not at its limit)
capacity has been released for a client that we are holding data for
Ideally, I'd thus use Task.WhenAny with
a task representing the input c.Reader.WaitToReadAsync(ct).AsTask()
N tasks representing the clients for which we are holding data, but it's not yet valid for submission (the Wait for the SemaphoreSlim would fail)
SemaphoreSlim's AvailableWaitHandle would be ideal - I want to know when it's available but I don't want to reserve it yet as I have a chain of work to process - I just want to know if one of my trigger conditions has arisen
Is there a way to await the AvailableWaitHandle ?
My current approach is a hack derived from this answer to a similar question by @usr - posting for reference.
My actual code is here - there's also some more detail about the whole problem in my self-answer below
I want to know when it's available but I don't want to reserve it yet as I have a chain of work to process
This is very strange and it seems like SemaphoreSlim may not be what you want to use. SemaphoreSlim is a kind of mutual exclusion object that can allow multiple takers. It is sometimes used for throttling. But I would not want to use it as a signal.
It seems like something more like an asynchronous manual-reset event would be what you really want. Or, if you wanted to maintain a locking/concurrent-collection kind of concept, an asynchronous monitor or condition variable.
That said, it is possible to use a SemaphoreSlim as a signal. I just strongly hesitate to suggest this as a solution, since it seems like this requirement is highlighting a mistake in the choice of synchronization primitive.
Is there a way to await the AvailableWaitHandle?
Yes. You can await anything by using TaskCompletionSource. For WaitHandles in particular, ThreadPool.RegisterWaitForSingleObject gives you an efficient wait.
So, what you want to do is create a TCS, register the handle with the thread pool, and complete the TCS in the callback for that handle. Keep in mind that you want to be sure that the TCS is eventually completed and that everything is disposed properly.
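A minimal sketch of that TCS-plus-RegisterWaitForSingleObject approach (my own, simplified; it ignores a few edge cases a production implementation should handle):
using System;
using System.Threading;
using System.Threading.Tasks;
static class WaitHandleExtensions
{
    // Completes the returned task when the handle is signaled, or with false on timeout.
    public static Task<bool> WaitOneAsync(this WaitHandle handle, TimeSpan timeout)
    {
        var tcs = new TaskCompletionSource<bool>();
        // Efficient wait: no thread is blocked while waiting for the handle.
        var registration = ThreadPool.RegisterWaitForSingleObject(
            handle,
            (state, timedOut) => ((TaskCompletionSource<bool>)state).TrySetResult(!timedOut),
            tcs,
            timeout,
            executeOnlyOnce: true);
        // Always clean up the registration once the task completes.
        tcs.Task.ContinueWith(_ => registration.Unregister(null), TaskScheduler.Default);
        return tcs.Task;
    }
}
With something like that in place, semaphore.AvailableWaitHandle.WaitOneAsync(...) gives you a task you can pass to Task.WhenAny without actually taking the semaphore.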
I have support for this in my AsyncEx library (WaitHandleAsyncFactory.FromWaitHandle); code is here.
My AsyncEx library also has support for asynchronous manual-reset events, monitors, and condition variables.
Variation of @usr's answer which solved my problem:
static class SemaphoreSlimExtensions
{
    public static Task AwaitButReleaseAsync(this SemaphoreSlim s) =>
        s.WaitAsync().ContinueWith(_ => s.Release(), TaskContinuationOptions.ExecuteSynchronously);
    public static bool TryTake(this SemaphoreSlim s) =>
        s.Wait(0);
}
In my use case, the await is just a trigger for synchronous logic that then walks the full set - the TryTake helper is in my case a natural way to handle the conditional acquisition of the semaphore and the processing that's contingent on that. My wait looks like this:
IEnumerable<SemaphoreSlim> throttled = Enumerable.Empty<SemaphoreSlim>();
while (!ct.IsCancellationRequested)
{
    var throttledClients = from s in throttled select s.AwaitButReleaseAsync();
    var timeout = 3000;
    var otherConditions = new[] { input.Reader.WaitToReadAsync().AsTask(), Task.Delay(timeout, ct) };
    await Task.WhenAny(throttledClients.Concat(otherConditions));
    throttled = propagateStuff();
}
The actual code is here - I have other cases that follow the same general pattern. The bottom line is that I want to separate the concern of waiting for the availability of capacity on a SemaphoreSlim from actually reserving that capacity.
I want to use threads from the ThreadPool to run the same procedure at different times.
Here is what I am trying to accomplish:
add an item to the hash,
note the time when the item was created, and within X0 minutes
go back and remove/do something from/with the item in the hash.
From what I have read, using .Sleep() to delay execution is a terrible idea.
What would be a better idea?
(Unfortunately, I can't use the Task Parallel Library, and am limited to .NET 3.5.)
I don't know how TPL would be useful here anyway. I don't recall anything in it that involves scheduling things for future execution.
.NET includes two different basic Timer classes (in System.Timers and System.Threading), and a third one specifically for Forms (in case you're doing that). Those are the "go-to" API for this specific application.
One alternative you might consider is creating a single thread that consumes a queue of scheduled tasks, essentially implementing your own timer. In that one thread, you'd wind up using Thread.Sleep(). Yes, normally one would want to avoid that, but in a dedicated thread specifically for the purpose, it's fine. Whether you'd find that more desirable than the use of one of the Timer classes, I don't know, since I don't really understand the resistance to using one of the Timer classes.
Note that the System.Threading.Timer class has a "one-shot" mode. By passing Timeout.Infinite as the repeat interval, the timer callback is executed only once, after the initial due time interval has elapsed. The only "management" necessary is to retain a reference to the Timer instance to ensure it's not garbage-collected before the timer period elapses.
It even uses ThreadPool threads.
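A minimal sketch of that one-shot usage on .NET 3.5 (my own illustration; RemoveItem is a hypothetical stand-in for whatever the delayed work on the hash item is):
using System;
using System.Threading;
class DelayedWork
{
    // Keep this reference alive so the timer isn't garbage-collected before it fires.
    private Timer _timer;
    public void ScheduleRemoval(string key, TimeSpan delay)
    {
        _timer = new Timer(
            state => RemoveItem((string)state), // runs once, on a ThreadPool thread
            key,
            (int)delay.TotalMilliseconds,       // due time
            Timeout.Infinite);                  // no repeat period: one-shot mode
    }
    private void RemoveItem(string key)
    {
        // placeholder: remove the item from the hash / do the delayed work here
    }
}
Once the callback has run you can Dispose the timer (or reuse it via Change) to release it.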
Here's an approach using Microsoft's Reactive Framework (NuGet Rx-Main):
var query =
Observable.Create<HashAction>(o =>
{
var hash = "Create Hash Somehow";
return Observable
.Return(new HashAction()
{
Action = "Add",
Hash = hash
})
.Concat(
Observable
.Timer(TimeSpan.FromMinutes(1.0))
.Select(x => new HashAction()
{
Action = "Remove",
Hash = hash
}))
.Subscribe(o);
});
query.Subscribe(x =>
{
if (x.Action == "Add")
{
/* Add Hash */
}
if (x.Action == "Remove")
{
/* Remove Hash */
}
});
Now, it's a bit contrived, as you don't give very much concrete detail about what you're trying to do. I don't understand what "add an item to the hash" means, let alone what "go back and remove/do something from/with item from the hash" means. A concrete example would be very useful.
I am currently getting to grips with the Reactive Extensions framework for .NET and I am working my way through the various introduction resources I've found (mainly http://www.introtorx.com)
Our application involves a number of hardware interfaces that detect network frames; these will be my IObservables. I then have a variety of components that will consume those frames, or perform some manner of transform on the data and produce a new type of frame. There will also be other components that need to display every n'th frame, for example.
I am convinced that Rx is going to be useful for our application, however I am struggling with the implementation details for the IObserver interface.
Most (if not all) of the resources I have been reading have said that I should not implement the IObservable interface myself but use one of the provided functions or classes.
From my research it appears that creating a Subject<IBaseFrame> would give me what I need: I would have my single thread that reads data from the hardware interface and then calls the OnNext method of my Subject<IBaseFrame> instance. The different IObserver components would then receive their notifications from that Subject.
My confusion comes from the advice given in the appendix of this tutorial, where it says:
Avoid the use of the subject types. Rx is effectively a functional programming paradigm. Using subjects means we are now managing state, which is potentially mutating. Dealing with both mutating state and asynchronous programming at the same time is very hard to get right. Furthermore, many of the operators (extension methods) have been carefully written to ensure that correct and consistent lifetime of subscriptions and sequences is maintained; when you introduce subjects, you can break this. Future releases may also see significant performance degradation if you explicitly use subjects.
My application is quite performance critical; I am obviously going to test the performance of the Rx patterns before they go into production code. However, I am worried that I am doing something against the spirit of the Rx framework by using the Subject class, and that a future version of the framework will hurt performance.
Is there a better way of doing what I want? The hardware polling thread is going to be running continuously whether there are any observers or not (the HW buffer will back up otherwise), so this is a very hot sequence. I need to then pass the received frames out to multiple observers.
Any advice would be greatly appreciated.
OK, if we ignore my dogmatic ways and set the whole "subjects are good/bad" debate aside, let us look at the problem space.
I bet you have one of two styles of system you need to integrate with:
The system raises an event or a call back when a message arrives
You need to poll the system to see if there are any messages to process
For option 1, easy: we just wrap it with the appropriate FromEvent method and we are done. To the pub!
For option 2, we now need to consider how we poll, and how to do it efficiently. Also, when we get a value, how do we publish it?
I would imagine that you would want a dedicated thread for polling. You wouldn't want some other coder hammering the ThreadPool/TaskPool and leaving you in a ThreadPool-starvation situation. Alternatively, you don't want the hassle of context switching (I guess). So, assuming we have our own thread, we will probably have some sort of while/sleep loop that we sit in to poll. When the check finds some messages, we publish them. Well, all of this sounds perfect for Observable.Create. Now, we probably can't use a while loop, as that won't allow us to ever return a Disposable to allow cancellation. Luckily you have read the whole book, so you are savvy with recursive scheduling!
I imagine something like this could work. #NotTested
public class MessageListener
{
private readonly IObservable<IMessage> _messages;
private readonly IScheduler _scheduler;
public MessageListener()
{
_scheduler = new EventLoopScheduler();
var messages = ListenToMessages()
.SubscribeOn(_scheduler)
.Publish();
_messages = messages;
messages.Connect();
}
public IObservable<IMessage> Messages
{
get {return _messages;}
}
private IObservable<IMessage> ListenToMessages()
{
return Observable.Create<IMessage>(o=>
{
return _scheduler.Schedule(recurse=>
{
try
{
var messages = GetMessages();
foreach (var msg in messages)
{
o.OnNext(msg);
}
recurse();
}
catch (Exception ex)
{
o.OnError(ex);
}
});
});
}
private IEnumerable<IMessage> GetMessages()
{
//Do some work here that gets messages from a queue,
// file system, database or other system that cant push
// new data at us.
//
//This may return an empty result when no new data is found.
yield break; //placeholder so the sample compiles
}
}
The reason I really don't like Subjects is that it is usually a case of the developer not having a clear design on the problem. Hack in a subject, poke it here, there and everywhere, and then let the poor support dev guess at WTF was going on. When you use the Create/Generate etc. methods, you are localizing the effects on the sequence. You can see it all in one method, and you know no one else is throwing in a nasty side effect. If I see a subject field, I now have to go looking for all the places in the class it is being used. If some MFer exposes one publicly, then all bets are off: who knows how this sequence is being used!
Async/Concurrency/Rx is hard. You don't need to make it harder by allowing side effects and causality programming to spin your head even more.
In general you should avoid using Subject, however for the thing you are doing here I think they work quite well. I asked a similar question when I came across the "avoid subjects" message in Rx tutorials.
To quote Dave Sexton (of Rxx)
"Subjects are the stateful components of Rx. They are useful for when
you need to create an event-like observable as a field or a local
variable."
I tend to use them as the entry point into Rx. So if I have some code that needs to say 'something happened' (like you have), I would use a Subject and call OnNext. Then expose that as an IObservable for others to subscribe to (you can use AsObservable() on your subject to make sure nobody can cast to a Subject and mess things up).
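A minimal sketch of that shape, using the IBaseFrame type from the question (the class and member names here are mine):
using System;
using System.Reactive.Linq;
using System.Reactive.Subjects;
public class FrameSource
{
    private readonly Subject<IBaseFrame> _frames = new Subject<IBaseFrame>();
    // Consumers subscribe here; AsObservable() means they cannot cast this
    // back to a Subject and call OnNext themselves.
    public IObservable<IBaseFrame> Frames
    {
        get { return _frames.AsObservable(); }
    }
    // Called by the hardware polling thread whenever a frame arrives.
    public void OnFrameReceived(IBaseFrame frame)
    {
        _frames.OnNext(frame);
    }
}
The hardware polling thread is the only caller of OnNext; everything else only ever sees the IObservable.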
You could also achieve this with a .NET event and use FromEventPattern, but if I'm only going to turn the event into an IObservable anyway, I don't see the benefit of having an event instead of a Subject (which might mean I'm missing something here).
However, what you should avoid quite strongly is subscribing to an IObservable with a Subject, i.e. don't pass a Subject into the IObservable.Subscribe method.
Often when you're managing a Subject, you're actually just reimplementing features already in Rx, and probably in not as robust, simple and extensible a way.
When you're trying to adapt some asynchronous data flow into Rx (or create an asynchronous data flow from one that's not currently asynchronous), the most common cases are usually:
The source of data is an event: As Lee says, this is the simplest case: use FromEvent and head to the pub.
The source of data is from a synchronous operation and you want polled updates (e.g. a web service or database call): in this case you could use Lee's suggested approach, or for simple cases, you could use something like Observable.Interval.Select(_ => <db fetch>). You may want to use DistinctUntilChanged() to prevent publishing updates when nothing has changed in the source data (a short sketch of this follows the list).
The source of data is some kind of asynchronous api that calls your callback: In this case, use Observable.Create to hook up your callback to call OnNext/OnError/OnComplete on the observer.
The source of data is a call that blocks until new data is available (eg some synchronous socket read operations): In this case, you can use Observable.Create to wrap the imperative code that reads from the socket and publishes to the Observer.OnNext when data is read. This may be similar to what you're doing with the Subject.
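To make the polled-updates case above concrete, here is a minimal sketch; FetchStatus() is a hypothetical stand-in for the synchronous web service or database call:
using System;
using System.Reactive.Linq;
class PollingExample
{
    // Hypothetical stand-in for a synchronous web service or database call.
    static string FetchStatus()
    {
        return DateTime.UtcNow.Minute % 2 == 0 ? "even" : "odd";
    }
    static void Main()
    {
        // Poll every 5 seconds, but only publish when the fetched value changes.
        IObservable<string> polledStatus =
            Observable.Interval(TimeSpan.FromSeconds(5))
                      .Select(_ => FetchStatus())
                      .DistinctUntilChanged();
        using (polledStatus.Subscribe(s => Console.WriteLine("Status: {0}", s)))
        {
            Console.ReadLine(); // keep polling until Enter is pressed
        }
    }
}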
Using Observable.Create vs creating a class that manages a Subject is fairly equivalent to using the yield keyword vs creating a whole class that implements IEnumerator. Of course, you can write an IEnumerator to be as clean and as good a citizen as the yield code, but which one is better encapsulated and feels a neater design? The same is true for Observable.Create vs managing Subjects.
Observable.Create gives you a clean pattern for lazy setup and clean teardown. How do you achieve this with a class wrapping a Subject? You need some kind of Start method... how do you know when to call it? Or do you just always start it, even when no one is listening? And when you're done, how do you get it to stop reading from the socket/polling the database, etc? You have to have some kind of Stop method, and you have to still have access not just to the IObservable you're subscribed to, but the class that created the Subject in the first place.
With Observable.Create, it's all wrapped up in one place. The body of Observable.Create is not run until someone subscribes, so if no one subscribes, you never use your resource. And Observable.Create returns a Disposable that can cleanly shutdown your resource/callbacks, etc - this is called when the Observer unsubscribes. The lifetimes of the resources you're using to generate the Observable are neatly tied to the lifetime of the Observable itself.
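Here is a small sketch of that lazy setup and clean teardown (my own illustration; the once-a-second counter stands in for a blocking socket read): nothing runs until someone subscribes, and disposing the subscription stops the loop.
using System;
using System.Reactive.Disposables;
using System.Reactive.Linq;
using System.Threading;
class CreateTeardownExample
{
    static IObservable<int> CountEverySecond()
    {
        return Observable.Create<int>(observer =>
        {
            var cts = new CancellationTokenSource();
            var thread = new Thread(() =>
            {
                int i = 0;
                while (!cts.IsCancellationRequested)
                {
                    observer.OnNext(i++);   // stand-in for "read from the socket"
                    Thread.Sleep(1000);
                }
                observer.OnCompleted();
            });
            thread.Start();
            // The returned disposable runs when the observer unsubscribes.
            return Disposable.Create(() => cts.Cancel());
        });
    }
    static void Main()
    {
        using (CountEverySecond().Subscribe(Console.WriteLine))
        {
            Console.ReadLine(); // unsubscribe (and stop the thread) on Enter
        }
    }
}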
The quoted block of text pretty much explains why you shouldn't be using Subject<T>, but to put it more simply: you are combining the functions of observer and observable, while injecting some sort of state in between (whether you're encapsulating or extending).
This is where you run into trouble; these responsibilities should be separate and distinct from each other.
That said, in your specific case, I'd recommend that you break your concerns into smaller parts.
First, you have your thread that is hot, and always monitoring the hardware for signals to raise notifications for. How would you do this normally? Events. So let's start with that.
Let's define the EventArgs that your event will fire.
// The event args that has the information.
public class BaseFrameEventArgs : EventArgs
{
public BaseFrameEventArgs(IBaseFrame baseFrame)
{
// Validate parameters.
if (baseFrame == null) throw new ArgumentNullException("baseFrame");
// Set values.
BaseFrame = baseFrame;
}
// Poor man's immutability.
public IBaseFrame BaseFrame { get; private set; }
}
Now, the class that will fire the event. Note, this could be a static class (since you always have a thread running that monitors the hardware buffer), or something you call on demand which subscribes to that. You'll have to modify this as appropriate.
public class BaseFrameMonitor
{
// You want to make this access thread safe
public event EventHandler<BaseFrameEventArgs> HardwareEvent;
public BaseFrameMonitor()
{
// Create/subscribe to your thread that
// drains hardware signals.
}
}
So now you have a class that exposes an event. Observables work well with events. So much so that there's first-class support for converting streams of events (think of an event stream as multiple firings of an event) into IObservable<T> implementations if you follow the standard event pattern, through the static FromEventPattern method on the Observable class.
With the source of your events, and the FromEventPattern method, we can create an IObservable<EventPattern<BaseFrameEventArgs>> easily (the EventPattern<TEventArgs> class embodies what you'd see in a .NET event, notably, an instance derived from EventArgs and an object representing the sender), like so:
// The event source.
// Or you might not need this if your class is static and exposes
// the event as a static event.
var source = new BaseFrameMonitor();
// Create the observable. It's going to be hot
// as the events are hot.
IObservable<EventPattern<BaseFrameEventArgs>> observable = Observable.
FromEventPattern<BaseFrameEventArgs>(
h => source.HardwareEvent += h,
h => source.HardwareEvent -= h);
Of course, you want an IObservable<IBaseFrame>, but that's easy, using the Select extension method on the Observable class to create a projection (just like you would in LINQ, and we can wrap all of this up in an easy-to-use method):
public IObservable<IBaseFrame> CreateHardwareObservable()
{
// The event source.
// Or you might not need this if your class is static and exposes
// the event as a static event.
var source = new BaseFrameMonitor();
// Create the observable. It's going to be hot
// as the events are hot.
IObservable<EventPattern<BaseFrameEventArgs>> observable = Observable.
FromEventPattern<BaseFrameEventArgs>(
h => source.HardwareEvent += h,
h => source.HardwareEvent -= h);
// Return the observable, but projected.
return observable.Select(i => i.EventArgs.BaseFrame);
}
It is a bad generalization to say that Subjects should not be used in a public interface.
While it is certainly true that this is not how a reactive programming approach should look, it is definitely a good improvement/refactoring option for your classic code.
If you have a normal property with a public set accessor and you want to notify about changes, nothing speaks against replacing it with a BehaviorSubject.
INPC and other additional events are just not that clean, and they personally wear me out.
For this purpose you can and should use BehaviorSubjects as public properties instead of normal properties, and ditch INPC and other events.
Additionally, the Subject interface makes the users of your API more aware of the functionality of your properties, and they are more likely to subscribe instead of just getting the current value.
It is the best choice if you want others to listen for/subscribe to changes of a property.
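A minimal sketch of what that looks like (the Thermostat type and the values are made up for illustration):
using System;
using System.Reactive.Subjects;
public class Thermostat
{
    // Seeded with an initial value; callers can both read it and subscribe to it.
    public BehaviorSubject<double> Temperature { get; private set; }
    public Thermostat()
    {
        Temperature = new BehaviorSubject<double>(20.0);
    }
}
class Program
{
    static void Main()
    {
        var thermostat = new Thermostat();
        using (thermostat.Temperature.Subscribe(t => Console.WriteLine("Temperature is now {0}", t)))
        {
            thermostat.Temperature.OnNext(21.5);             // pushes the change to subscribers
            Console.WriteLine(thermostat.Temperature.Value); // still readable like a plain property
        }
    }
}
New subscribers immediately get the current value, and Value still allows a plain synchronous read, which is what makes it a drop-in replacement for a property plus a change event.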
I'm currently reading from my MSMQ like so (simplified for brevity):
public void Start()
{
this.queue.ReceiveCompleted += this.ReceiveCompleted;
this.queue.BeginReceive();
}
void ReceiveCompleted(object sender, ReceiveCompletedEventArgs e)
{
this.queue.EndReceive(e.AsyncResult);
try
{
var m = e.Message;
m.Formatter = this.formatter;
this.Handle(m.Body);
}
finally
{
this.queue.BeginReceive();
}
}
However, this only allows me to process messages serially. How do I modify this code to allow parallel message processing?
I know I can move the this.queue.BeginReceive(); out of the finally and into the top of ReceiveCompleted, but what's to stop that spawning as many threads as I have messages? How do I sensibly control the level of parallelism so I don't flood the thread pool? Is there some inbuilt mechanism for this, or do I have to write my own manager?
Edit: My aim is to process messages faster. The processing of the messages involves an async call to a 3rd party, so currently my implementation is wasting a lot of time in getting through the queue.
Thanks
I think it would be simpler to just host more instances of your queue reader. Then you can quickly scale up and down depending on need by deploying/undeploying more instances.
Also, it becomes a management concern rather than a development concern, which is what scaling should be.
You could use a "Producer Consumer Pattern"...
.NET 4 and up has Concurrent collections which are thread-safe and implemented "mostly lock-free" (thus perform well with multi-threading)...
You could use BlockingCollection combined with TPL to achieve what you want without much worry of threadpool starvation or similar... you would only change the line this.Handle(m.Body); to something like MyBlockingCollection.Add(m.Body); and spin up "consumer threads" which work on MyBlockingCollection and do the actual work (i.e. call this.Handle on the next item from MyBlockingCollection which they get by calling TryTake for example) ... see the above link for a basic sample...
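Here is a minimal sketch of that idea (class and method names are placeholders, not from your code): the receive callback only enqueues message bodies, and a fixed number of consumer tasks drain the collection, which caps the level of parallelism.
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;
class QueueProcessor
{
    private readonly BlockingCollection<object> _work = new BlockingCollection<object>();
    public void Enqueue(object messageBody)
    {
        _work.Add(messageBody); // called from ReceiveCompleted instead of this.Handle(m.Body)
    }
    public void StartConsumers(int degreeOfParallelism)
    {
        for (int i = 0; i < degreeOfParallelism; i++)
        {
            Task.Factory.StartNew(() =>
            {
                // GetConsumingEnumerable blocks until items arrive and
                // completes once CompleteAdding() has been called.
                foreach (var body in _work.GetConsumingEnumerable())
                {
                    Handle(body);
                }
            }, TaskCreationOptions.LongRunning);
        }
    }
    public void Stop()
    {
        _work.CompleteAdding(); // lets the consumer loops drain and exit
    }
    private void Handle(object body)
    {
        // placeholder for the real message processing
    }
}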
My application handles a data feed. When a new packet comes in, a dispatcher collects it, and raises an event that a proper listener can pick up and do what it needs to do.
I am trying to simulate the live feed to perform some testing. I made a class that feeds the dispatcher with packets, based on the number of active listeners.
This is the code I use to start the Feed() method which sits in memory and generate a packet every given interval:
foreach (var item in Listeners)
{
object listener = item;
Task.Factory.StartNew(()=> Feed(listener), TaskCreationOptions.LongRunning);
}
The Feed() method works something like this:
while(run)
{
packet = GenerateThePacket(listener.Id); // Make a packet with the listener id
FeedHandler.OnPacketRecieved(this, packet); // Raises the FeedHandler's event as if it came from outside.
Thread.Sleep(1000/interval); // interval determines how many packets per second
}
So, if I have 100 listeners, it'll start 100 instances of Feed(), each with a different listener id, and fire up PacketRecieved events at the same time with the requested interval.
I guess many of you already know what's bad about it, but I'll explain the problem anyway:
When I use an interval of 1 or 2 it works great. When I choose 10 (that is, a packet every 100ms) it doesn't work right. Each thread fires at a different rate: the latest one created works well and fast (10/sec), while the first ones created work really slowly (1/sec or less).
I guess that 100 threads can't operate at the same time and so they are just waiting. I think.
What exactly is happening, and how can I implement a true feed generator that simulates 10 packets a second simultaneously for 100 listeners?
I think you're approaching this from the wrong angle....
Have a read through these:
http://blogs.msdn.com/b/pfxteam/archive/2010/04/21/9997559.aspx (link to pdf on there)
http://www.sadev.co.za/content/pulled-apart-part-vii-plinq-not-easy-first-assumed
In a nutshell, the Task library will get a thread from the thread pool, and if one is not available, the tasks will be queued until a thread is available... so the number of threads that can run concurrently depends on your system and the size of your thread pool.
For me, there are two ways to go: use the Parallel.ForEach static method, or use the PLINQ AsParallel() option as described in the articles above. At the end of the day, it's down to you which one to use.
Using plinq... something like this:
var parallelQuery = Listeners.AsParallel().Select(item=> Feed(item)); //creates the parallel query
parallelQuery.ForAll(item=> <dosomething>); //begin the parallel process and do whatever you need to do for each result.
your feed method/object can look like this:
while(run)
{
packet = GenerateThePacket(listener.Id);
FeedHandler.OnPacketRecieved(this, packet); // Raises the FeedHandler's event as if it came from outside.
//No more Thread.Sleep
}
This is just a basic intro for you, but the links I've added above are quite helpful and informative. It's up to you which method to use.
Keep in mind there are additional options you can add.... all in the links above.
Hope this helps!