We are just starting to use RabbitMQ with C#. My current plan is to configure in the database the number and kind of consumers to run on a given server. We have an existing Windows service, and when it starts I want to spawn all of the RabbitMQ consumers. My question is: what is the best way to spawn these from a Windows service?
My current plan is to read the configuration out of the database and spawn a long running task for each consumer.
var t = new Task(() =>
{
var instance = LoadConsumerClass(consumerEnum, consumerName);
instance.StartConsuming();//blocking call
}, TaskCreationOptions.LongRunning);
t.Start();
Is this better or worse than creating a thread for each consumer?
var messageConsumer = LoadConsumerClass(consumerEnum, consumerName);
var thread = new Thread(messageConsumer.StartConsuming);
thread.Start();
I'm hoping that more than a few others have already tried what I'm doing and can provide me with some ideas for what worked well and what didn't.
In EasyNetQ we have a single dispatcher thread for all consumers on a single connection. We also provide a facility to return a Task from the message handler, so it's easy to do async IO if you want to make a database call, go to the file system, or make a web service request.
Having said that it's perfectly legitimate to have each consumer consuming on a different thread. I guess it depends on your message throughput, how many consumers you have and the nature of your message handlers.
I'd stick with Tasks as they give you more features and generally allow for less boilerplate code.
Also, if I understand your code correctly, you'd be sharing a channel (IModel) in the second case. This might cause trouble, as the default IModel implementation is not thread safe (or used not to be). There are more subtle nuances regarding thread safety that you'd have to watch out for.
But it depends on your usage patterns. If you don't expect many messages/sec on each consumer, or if your app can handle messages fast, then perhaps a single thread for all consumers will be your best option.
Task is great, but you're not really going to use all the stuff it can do. The only thing you need is to do work in parallel.
I faced the same question a couple of months ago. What I ended up with is a thread per computation type (per queue), which blocks on message arrival and doesn't consume CPU while waiting for messages.
Open a new channel for each one of the threads.
As for connections: if your application is meant to deal with a high load of messages, I suggest opening a connection for every X workers (figure out your X), since only one channel at a time can send messages through the connection. If one worker is consuming a large message, the others are blocked at the connection level waiting for it to be free.
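The "one long-running task per consumer" idea from the question could be hosted roughly like this. It is only a sketch: IMessageConsumer, IdleConsumer and ConsumerHost are hypothetical names invented here (they are not part of RabbitMQ.Client or EasyNetQ), and it assumes each consumer's blocking call returns once a CancellationToken is signalled, so that the service can stop cleanly.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical consumer contract (an assumption, not a RabbitMQ.Client type).
public interface IMessageConsumer
{
    // Blocks until the token is cancelled.
    void StartConsuming(CancellationToken token);
}

// Stand-in consumer that only waits for shutdown; a real one would
// open its own channel and process messages inside this call.
public sealed class IdleConsumer : IMessageConsumer
{
    public volatile bool Stopped;

    public void StartConsuming(CancellationToken token)
    {
        token.WaitHandle.WaitOne(); // blocks without using CPU
        Stopped = true;
    }
}

public sealed class ConsumerHost
{
    private readonly CancellationTokenSource _cts = new CancellationTokenSource();
    private readonly List<Task> _tasks = new List<Task>();

    // Call from the service's OnStart: one long-running task per consumer.
    public void Start(IEnumerable<IMessageConsumer> consumers)
    {
        foreach (var consumer in consumers)
        {
            var c = consumer; // fresh variable per iteration for the closure
            _tasks.Add(Task.Factory.StartNew(
                () => c.StartConsuming(_cts.Token),
                CancellationToken.None,
                TaskCreationOptions.LongRunning,
                TaskScheduler.Default));
        }
    }

    // Call from OnStop: signal cancellation, then wait for the consumers to exit.
    // Returns false if they did not all finish in time or one of them faulted.
    public bool Stop(TimeSpan timeout)
    {
        _cts.Cancel();
        try { return Task.WaitAll(_tasks.ToArray(), timeout); }
        catch (AggregateException) { return false; }
    }
}
```

Usage would be something like host.Start(consumersLoadedFromConfig) in OnStart and host.Stop(TimeSpan.FromSeconds(30)) in OnStop. The same shape works with dedicated Thread objects; the task version mainly saves the bookkeeping for waiting on them and observing exceptions.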
Related
Is async/await useful in a backend / webservice scenario?
Given the case there is only one thread for all requests / work. If this thread awaits a task it is not blocked but it also has no other work to do so it just idles. (It can't accept another request because the current execution is waiting for the task to resolve).
Given the case there is one thread per request / "work item". The Thread still idles because the other request is handled by another thread.
The only case I can imagine is doing two async operations at the same time, like reading a file and sending an HTTP request. But this sounds like a rare case. I should read the file first and then post the content, not post something I haven't even read.
Given the case there is one thread per request / "work item". The Thread still idles because the other request is handled by another thread.
That's closer to reality but the server doesn't just keep adding threads ad infinitum - at some point it'll let requests queue if there's not a thread free to handle the request. And that's where freeing up a thread that's got no other work to usefully do at the moment starts winning.
It's hard to read your question without feeling that you misunderstand how webservers work and how async/await & threads work. To make it simple, just think of it like this: async/await is almost always good to use when you query an external resource (e.g. database, web service/API, system file, etc). If you follow this simple rule, you don't need to think too deeply about each situation.
However, as you read and learn more on these subjects and gain experience, deeper thinking becomes essential in each case, because there are always exceptions to any rule. There are scenarios where the overhead of using async/await and threads may outweigh their benefits. For example, Microsoft decided not to use it for the logger in ASP.NET Core, and there is even a comment about it in the source code.
In your case, the webserver uses many more threads than you seem to think, and for many more reasons than you seem to think. Also, when a thread is idling while waiting for something, it cannot do anything else. What async/await does is untie the thread from the current awaited task so the thread can go back to the pool and do something else. When the awaited task is finished, a thread (possibly a different one) is pulled out of the pool to continue the job. You seem to understand this to some degree, but perhaps you just don't know what other things a thread in a webserver can do. Believe me, there is a lot to do.
Finally, remember that threads are generic workers, they can do anything. Webservers may have specialized threads for different tasks, but they fall into two or three categories. Threads can still do anything within their category. Webservers can even move threads to different categories when required. All of that is done for you so you don't need to think about it in most cases and you can just focus on freeing the threads so the webserver can do its job.
Given the case there is only one thread for all requests / work.
I'd argue that this is a very unusual case. Even before multi-core servers became standard, ASP.NET used 50+ threads per core.
If this thread awaits a task it is not blocked but it also has no other work to do so it just idles.
No, it goes back into the pool to handle other requests. MOST web services will love handling as many requests as possible with as few resources as possible. Servers handling only one client are a rare edge case. Extremely rare. Most web services will handle as many requests as the plethora of clients throw at them.
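To make the "thread goes back into the pool" point concrete, here is a small experiment one could run. It is a sketch with invented names (AsyncDemo, HandleRequestAsync, RunAsync), and Task.Delay is only a stand-in for a real database or HTTP call. It starts many simulated requests at once and reports how many distinct threads ended up resuming them.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class AsyncDemo
{
    // Simulates a request that spends most of its time waiting on I/O.
    private static async Task<int> HandleRequestAsync(int id)
    {
        await Task.Delay(200); // stand-in for a database or HTTP call; no thread is held here
        return Environment.CurrentManagedThreadId; // the thread that resumed this request
    }

    // Runs many simulated requests concurrently and reports how many
    // completed and how many distinct threads were used to resume them.
    public static async Task<(int Completed, int DistinctThreads)> RunAsync(int requestCount)
    {
        var tasks = Enumerable.Range(0, requestCount)
                              .Select(i => HandleRequestAsync(i))
                              .ToArray();
        int[] threadIds = await Task.WhenAll(tasks);
        return (threadIds.Length, threadIds.Distinct().Count());
    }
}
```

Calling AsyncDemo.RunAsync(500) should finish in roughly the time of one delay rather than 500 of them, and the distinct thread count is typically a small number, because no thread is tied up while a request is waiting. The exact count depends on the machine and the thread pool, so treat it as an observation rather than a guarantee.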
Let's assume that I have several layers:
1. Manager that reads data from a socket
2. Manager that subscribes to #1 and takes care of persisting the data
3. Manager that subscribes to #2 and takes care of deserializing the data and propagating it to typed managers that are interested in certain event types
4. WPF Controllers that display the data (are subscribed to #3)
As of right now I use
Task.Factory.StartNew(() => subscriber.Publish(data));
on each layer. The reason for this is that I don't want to rely on every manager doing its work quickly, or on, e.g., the socket manager never getting stuck.
Is this a good approach?
Edit
Let's say that Socket manager receives a price update
There are 10 managers subscribed to Socket manager so when Socket manager propagates the message .StartNew is called 10 times.
Managers #2,#3 do nothing else but to propagate the message by .StartNew to a single subscriber
So ultimately per 1 message from socket 30x .StartNew() is called.
It seems a reasonable approach.
However, if one could meaningfully do:
subscriber.PublishAsync(data).LogExceptions(Log);
Where LogExceptions is something like:
// I'm thinking of Log4Net here, but of course something else could be used.
public static Task LogExceptions(this Task task, ILog log)
{
return task.ContinueWith(ta => LogFailedTask(ta, log), TaskContinuationOptions.OnlyOnFaulted);
}
private static void LogFailedTask(Task ta, ILog log)
{
var aggEx = ta.Exception;
if(aggEx != null)
{
log.Error("Error in asynchronous event");
int errCount = 0;
foreach(var ex in aggEx.InnerExceptions)
log.Error("Asynchronous error " + ++errCount, ex);
}
}
So that fire-and-forget use of tasks still has errors logged, and PublishAsync in turn makes use of tasks where appropriate, then I'd be happier still. In particular, if the "publishing" has anything that would block a thread but can be handled with async, like writing to or reading from a database or file system, then the thread use could scale better.
Regarding Task.Run vs. TaskFactory.StartNew: Task.Run is essentially shorthand for Task.Factory.StartNew with sensible default options, so the two are largely equivalent under the hood. Please read the following link: http://blogs.msdn.com/b/pfxteam/archive/2014/12/12/10229468.aspx
Even though these methods use the ThreadPool for decent performance, there is overhead associated with constantly creating new Tasks on the fly. Task is generally used more for infrequent, fire-and-forget workloads. Your statement of "30x .StartNew() per 1 message from the socket" is a bit concerning. How often do socket messages arrive? If you are really concerned with latency, I think the better way of doing this is for each manager to have its own dedicated thread. You can use a blocking queue (in .NET, BlockingCollection<T>) so that each thread waits to consume the next item its parent places in the queue. This would be preferable to a simple spinlock, for example.
This is the sort of architecture used regularly in financial market messaging subscription and decoding that needs the fastest possible performance. Also keep in mind that more threads do not always equate to faster performance. If the threads have any shared data dependencies, they will all be contending for the same locks, causing context switching on one another, etc. This is why a preset number of dedicated threads can usually win out vs. a greater number of threads created on-the-fly. The only exception I can think of would be "embarrassingly parallel" tasks where there are no shared data dependencies at all. Note that dependencies can exist on both the input side and the output side (anywhere there is a lock the threads could run into).
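A dedicated thread per manager, fed by a blocking queue, might look roughly like the following. This is a minimal sketch: Stage<T> and its members are names made up for illustration, and it uses BlockingCollection<T> from System.Collections.Concurrent as the blocking queue.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// One pipeline stage: a dedicated thread that blocks on its own input queue.
public sealed class Stage<T> : IDisposable
{
    private readonly BlockingCollection<T> _input = new BlockingCollection<T>();
    private readonly Thread _worker;

    public Stage(string name, Action<T> handler)
    {
        _worker = new Thread(() =>
        {
            // GetConsumingEnumerable blocks (without spinning) until an item arrives,
            // and ends once CompleteAdding has been called and the queue is empty.
            foreach (var item in _input.GetConsumingEnumerable())
                handler(item);
        })
        { IsBackground = true, Name = name };
        _worker.Start();
    }

    // Called by the upstream stage; enqueues and returns immediately.
    public void Post(T item) => _input.Add(item);

    // Lets the worker drain what is left, then stops it.
    public void Dispose()
    {
        _input.CompleteAdding();
        _worker.Join();
        _input.Dispose();
    }
}
```

Each layer then calls Post on the next stage instead of StartNew, so a message costs one enqueue per hop and the number of threads stays fixed. Because every stage has exactly one worker, items also leave each stage in the order they arrived, which matters for price updates.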
I would like to rephrase my previous question, "How to create Singleton with async method?"
Imagine a messaging application (like ICQ): something that should always be connected to the server and can post messages.
I need to implement a class Connection. It should be a singleton, because it contains a "socket" inside, and that socket should persist for the entire application lifetime.
Then I want to implement an async method Connection.postMessage
Because postMessage can take a significant amount of time:
postMessage should be async
postMessage should queue messages if necessary
Note that my application posts dozens of messages per second, so it is not appropriate to create a new Thread for each postMessage call.
I definitely need to create exactly one extra thread for posting messages, but I don't know where and how.
upd: good example http://msdn.microsoft.com/en-us/library/yy12yx1f(v=vs.80).aspx
No, PostMessage (itself) should not be async.
It should
be Thread-safe
ensure the Processing thread is running
queue the message (ConcurrentQueue)
return
And the Processing Thread should
Wait on the Queue
Process the messages
maybe Terminate itself when idle for xx milliseconds
What you have is a classic Producer/Consumer situation with 1 Consumer and multiple Producers.
PostMessage is the entry-point for all producers.
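The steps above can be sketched as follows. Treat it as an illustration under assumptions rather than a finished design: the send delegate stands in for the real socket write, the capacity of 1000 is arbitrary, and the constructor is left public only so the class can be exercised without the singleton. It uses a BlockingCollection<T> (which wraps a ConcurrentQueue by default) so the processing thread can wait on the queue without polling. For simplicity the processing thread is started once and kept alive, instead of terminating itself when idle.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public sealed class Connection : IDisposable
{
    // Lazy<T> gives thread-safe, on-demand creation of the single instance.
    // Console.WriteLine is only a placeholder for the real socket send.
    private static readonly Lazy<Connection> _instance =
        new Lazy<Connection>(() => new Connection(Console.WriteLine, 1000));

    public static Connection Instance => _instance.Value;

    private readonly BlockingCollection<string> _queue;
    private readonly Thread _sender;

    public Connection(Action<string> send, int capacity)
    {
        _queue = new BlockingCollection<string>(capacity); // bounded, so it cannot grow forever
        _sender = new Thread(() =>
        {
            // The one consumer: waits on the queue and processes messages in order.
            foreach (var message in _queue.GetConsumingEnumerable())
                send(message);
        })
        { IsBackground = true, Name = "Connection sender" };
        _sender.Start();
    }

    // Thread-safe entry point for all producers: queue the message and return.
    // Returns false (instead of blocking) if the queue is currently full.
    public bool PostMessage(string message) => _queue.TryAdd(message);

    // Sends whatever is still queued, then stops the sender thread.
    public void Dispose()
    {
        _queue.CompleteAdding();
        _sender.Join();
        _queue.Dispose();
    }
}
```

Callers just write Connection.Instance.PostMessage("hello") from any thread. Whether a full queue should reject the message, as here, or block the caller (by using Add instead of TryAdd) is a policy choice that depends on the application.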
jp,
You're looking at a classic producer/consumer problem here... During initialisation the Connection should create a MessageQueue and start a Sender in its own background thread.
Then the connection just posts messages to the queue, for the Sender to pick up and forward when ready.
The tricky bit is managing the maximum queue size... If the producer consistently outruns the consumer, then the queue can grow to an unmanageable size. The simplest approach is to block the producer thread until the queue is no longer full. This can be done with a back-off-ARQ, i.e.: while(queue.isFull) sleep(100, "milliseconds"); queue.add(message); If you don't require 100% transmission (like a chat app, for instance) then you can simply throw a MessageQueueFullException, and the poor client will just have to get over it... just always allow them to resubmit later, letting the user manage the retries for you.
That's how I'd tackle it anyway. I'll be interested to see what other suggestions are mooted.
Hope things work out for you. Cheers. Keith.
Everything that I read about sockets in .NET says that the asynchronous pattern gives better performance (especially with the new SocketAsyncEventArgs which saves on the allocation).
I think this makes sense if we're talking about a server with many client connections where it's not possible to allocate one thread per connection. Then I can see the advantage of using the ThreadPool threads and getting async callbacks on them.
But in my app, I'm the client and I just need to listen to one server sending market tick data over one tcp connection. Right now, I create a single thread, set the priority to Highest, and call Socket.Receive() with it. My thread blocks on this call and wakes up once new data arrives.
If I were to switch this to an async pattern so that I get a callback when there's new data, I see two issues
The threadpool threads will have default priority so it seems they will be strictly worse than my own thread which has Highest priority.
I'll still have to send everything through a single thread at some point. Say that I get N callbacks at almost the same time on N different threadpool threads notifying me that there's new data. The N byte arrays that they deliver can't be processed on the threadpool threads because there's no guarantee that they represent N unique market data messages because TCP is stream based. I'll have to lock and put the bytes into an array anyway and signal some other thread that can process what's in the array. So I'm not sure what having N threadpool threads is buying me.
Am I thinking about this wrong? Is there a reason to use the async pattern in my specific case of one client connected to one server?
UPDATE:
So I think that I was mis-understanding the async pattern in (2) above. I would get a callback on one worker thread when there was data available. Then I would begin another async receive and get another callback, etc. I wouldn't get N callbacks at the same time.
The question still is the same though. Is there any reason that the callbacks would be better in my specific situation where I'm the client and only connected to one server.
The slowest part of your application will be the network communication. It's highly likely that you will make almost no difference to performance for a one thread, one connection client by tweaking things like this. The network communication itself will dwarf all other contributions to processing or context switching time.
Say that I get N callbacks at almost the same time on N different threadpool threads notifying me that there's new data.
Why is that going to happen? If you have one socket, you Begin an operation on it to receive data, and you get exactly one callback when it's done. You then decide whether to do another operation. It sounds like you're overcomplicating it, though maybe I'm oversimplifying it with regard to what you're trying to do.
In summary, I'd say: pick the simplest programming model that gets you what you want; considering choices available in your scenario, they would be unlikely to make any noticeable difference to performance whichever one you go with. With the blocking model, you're "wasting" a thread that could be doing some real work, but hey... maybe you don't have any real work for it to do.
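For what it's worth, the "one receive at a time" model looks roughly like this when written with async/await over a Stream. This is a sketch under assumptions: FrameReader and its method names are invented, and the wire format (a 4-byte little-endian length followed by the payload) is made up for illustration, since the question doesn't say how the market data messages are delimited. Each read is only issued after the previous one has completed, so there are never N callbacks in flight, and the loop itself reassembles whole messages from the TCP byte stream.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class FrameReader
{
    // Reads exactly 'count' bytes, looping because a single read on a TCP
    // stream may return fewer bytes than requested. Returns false if the
    // stream ends before 'count' bytes have arrived.
    private static async Task<bool> ReadExactAsync(Stream stream, byte[] buffer, int count)
    {
        int offset = 0;
        while (offset < count)
        {
            int read = await stream.ReadAsync(buffer, offset, count - offset);
            if (read == 0) return false; // remote side closed the connection
            offset += read;
        }
        return true;
    }

    // Assumed wire format: 4-byte little-endian length, then that many payload bytes.
    // Invokes onMessage once per complete message, in arrival order.
    public static async Task ReadMessagesAsync(Stream stream, Action<byte[]> onMessage)
    {
        var header = new byte[4];
        while (await ReadExactAsync(stream, header, 4))
        {
            int length = header[0] | header[1] << 8 | header[2] << 16 | header[3] << 24;
            if (length < 0) throw new InvalidDataException("Negative frame length.");

            var payload = new byte[length];
            if (!await ReadExactAsync(stream, payload, length)) break; // truncated final frame
            onMessage(payload);
        }
    }
}
```

With a real connection you would pass something like new NetworkStream(socket). A blocking loop on a dedicated thread has exactly the same structure with Read in place of ReadAsync, which is why the choice is unlikely to matter much for a single connection.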
The number one rule of performance is only try to improve it when you have to.
I see you mention standards but never mention problems. If you are not having any, then you don't need to worry about what the standards say.
"This class was specifically designed for network server applications that require high performance."
As I understand, you are a client here, having only a single connection.
Data on this connection arrives in order, consumed by a single thread.
You will probably lose performance if you instead receive small amounts on separate threads, just so that you can assemble them later in a serialized (and thus effectively single-threaded) manner.
Much Ado about Nothing.
You do not really need to speed this up, you probably cannot.
What you can do, however is to dispatch work units to other threads after you receive them.
You do not need SocketAsyncEventArgs for this. This might speed things up.
As always, measure & measure.
Also, just because you can, it does not mean you should.
If the performance is enough for the foreseeable future, why complicate matters?
I’m looking for the best way of using threads considering scalability and performance.
In my site I have two scenarios that need threading:
UI trigger: for example, the user clicks a button, and the server should read data from the DB and send some emails. Those actions take time and I don't want the user's request to be delayed. This scenario happens very frequently.
Background service: when the app starts, it triggers a thread that runs every 10 min, reads from the DB and sends emails.
The solutions I found:
A. Use thread pool - BeginInvoke:
This is what I use today for both scenarios.
It works fine, but it uses the same threads that serve the pages, so I think I may run into scalability issues. Can this become a problem?
B. No use of the pool – ThreadStart:
I know starting a new thread takes more resources than using a thread pool.
Can this approach work better for my scenarios?
What is the best way to reuse the opened threads?
C. Custom thread pool:
Because my scenarios occur frequently, maybe the best way is to start a new thread pool?
Thanks.
I would personally put this into a different service. Make your UI action write to the database, and have a separate service which either polls the database or reacts to a trigger, and sends the emails at that point.
By separating it into a different service, you don't need to worry about AppDomain recycling etc., and you can put it on an entirely different server if and when you want to. I think it'll give you a more flexible solution.
I do this kind of thing by calling a webservice, which then calls a method using a delegate asynchronously. The original webservice call returns a Guid to allow tracking of the processing.
For the first scenario, use ASP.NET Asynchronous Pages. Async Pages are a very good choice when it comes to scalability, because during async execution the HTTP request thread is released and can be reused.
I agree with Jon Skeet that for the second scenario you should use a separate service; a Windows service is a good choice here.
Out of your three solutions, don't use BeginInvoke. As you said, it will have a negative impact on scalability.
Between the other two, if the tasks are truly background and the user isn't waiting for a response, then a single, permanent thread should do the job. A thread pool makes more sense when you have multiple tasks that should be executing in parallel.
However, keep in mind that web servers sometimes crash, AppPools recycle, etc. So if any of the queued work needs to be reliably executed, then moving it out of process is probably a better idea (such as into a Windows Service). One way of doing that, which preserves the order of requests and maintains persistence, is to use Service Broker. You write the request to a Service Broker queue from your web tier (with an async request), and then read those messages from a service running on the same machine or a different one. You can also scale nicely that way by simply adding more instances of the service (or more threads in it).
In case it helps, I walk through using both a background thread and Service Broker in detail in my book, including code examples: Ultra-Fast ASP.NET.