We're in the process of moving our .NET platform from MSMQ to ActiveMQ. We pump 30+ million persistent messages through it a day, so throughput and capacity are critical to us. Our MSMQ-dependent applications are configured to write to local/private queues first; a local service then routes those messages to their respective remote queues for processing. This keeps the initial enqueue/write fast (yes, we could also enqueue asynchronously), and messages aren't lost if the remote servers are unavailable.
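In code terms, today's pattern looks roughly like this (a simplified sketch; the queue path and payload are illustrative):

```csharp
using System.Messaging;

// Application side: always enqueue to a local, transactional private queue.
// A separate router service (not shown) receives from this queue and re-sends
// each message to the appropriate remote processing server.
const string localPath = @".\private$\outbound";

if (!MessageQueue.Exists(localPath))
    MessageQueue.Create(localPath, true);   // true = transactional queue

using (var localQueue = new MessageQueue(localPath))
using (var tx = new MessageQueueTransaction())
{
    tx.Begin();
    localQueue.Send("order payload", "order label", tx);   // fast, durable local write
    tx.Commit();
}
```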
We were going to use the same paradigm for ActiveMQ, but we've now decided to move most of our application servers to VMs with NAS storage. That significantly reduces per-message write performance, since every persistent write goes to the NAS, and I feel I need to rethink our approach to queueing. I'd like to know what is considered best practice for using ActiveMQ with persistent, high-throughput needs. Should I consider dedicated queue servers (that aren't VMs)? That would mean every write from the application goes directly over the network. And how do I deal with high-availability requirements?
Any suggestions are appreciated.
You can deploy ActiveMQ instances in a network of brokers, and the topology can include local instances as well as remote instances. I have deployed topologies containing a local instance of ActiveMQ so that messages are persisted as close to the sender as possible, and the messages are then forwarded to remote ActiveMQ instances based on demand. With this style of topology, I recommend configuring the network connector(s) so that they do not forward messages from every destination. That is, instead of openly forwarding messages for all destinations, you can narrow the set of destinations that get forwarded using the excludedDestinations property.
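As a rough sketch (not a drop-in config - the broker name, URI, and destination pattern are illustrative), the relevant piece of the local broker's activemq.xml would look something like this:

```xml
<broker xmlns="http://activemq.apache.org/schema/core" brokerName="local-broker">

  <networkConnectors>
    <!-- Forward messages on demand to the remote broker, but never forward
         anything sent to queues under the LOCAL. prefix -->
    <networkConnector name="to-central" uri="static:(tcp://remote-broker:61616)">
      <excludedDestinations>
        <queue physicalName="LOCAL.>"/>
      </excludedDestinations>
    </networkConnector>
  </networkConnectors>

  <persistenceAdapter>
    <kahaDB directory="${activemq.data}/kahadb"/>
  </persistenceAdapter>

</broker>
```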
As far as high availability with ActiveMQ, the master/slave configuration is designed for exactly this. It comes in three flavors depending on your needs.
Hope that helps.
Bruce
In my project I have a cloud-hosted virtual machine running a C# application which needs to:
accept TCP connection from several external clients (approximately 500)
receive data asynchronously from the connected clients (not high frequency, approximately 1 message per minute)
do some processing on received data
forward received data to other actors
reply back to connected clients and possibly do some asynchronous sending (based on internal time-checks)
The design seems quite straightforward to me. I provide a listener which accepts incoming TCP connections; when a new connection is established, a new thread is spawned. That thread runs in a loop (performing the activities in points 2 to 5 above) and checks whether its socket is still alive (if the socket is dead, the thread exits the loop and eventually terminates; later, a new connection will be attempted by the external client the socket belonged to).
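In simplified code, the current approach is essentially this (stripped down; the port and processing steps are placeholders):

```csharp
using System.Net;
using System.Net.Sockets;
using System.Threading;

class NaiveServer
{
    static void Main()
    {
        var listener = new TcpListener(IPAddress.Any, 9000);
        listener.Start();

        while (true)
        {
            TcpClient client = listener.AcceptTcpClient();   // blocks for the next client
            new Thread(() => HandleClient(client)) { IsBackground = true }.Start();
        }
    }

    static void HandleClient(TcpClient client)
    {
        using (NetworkStream stream = client.GetStream())
        {
            var buffer = new byte[4096];
            while (client.Connected)                          // loop until the socket dies
            {
                int read = stream.Read(buffer, 0, buffer.Length);
                if (read == 0) break;                         // remote side closed

                // points 2 to 5: process the data, forward it, reply as needed...
            }
        }
        // thread ends here; the external client will reconnect later
    }
}
```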
So now the issue is that for a limited number of external clients (I would say 200-300) everything runs smoothly, but as that number grows (or when the clients send data at a higher frequency) the communication becomes very slow and congested.
I was thinking about some better design, for example:
using Tasks instead of Threads
using ThreadPool
replace the 1-thread-per-1-socket model with something like 1 thread per 10 sockets
or even some scaling strategies:
open two different TCP listeners (on different ports) within the same application (reconfiguring the clients so that half of them target each listener)
run two identical applications with two different TCP listeners (on different ports) on the same virtual machine
set up two different virtual machines, each running the same application (reconfiguring the clients so that half of them target each virtual machine's address)
Finally, the questions: is the current design poor or naive? Do you see any major flaw in the way I handle the communication? Do you have any more robust and efficient option (among those mentioned above, or any other)?
Thanks
The number of listeners is unlikely to be a limiting factor. Here at Stack Overflow we handle ~60k sockets per instance, and the only reason we need multiple listeners is so we can split the traffic over multiple ports to avoid ephemeral port exhaustion at the load balancer. Likewise, I should note that those 60k per-instance socket servers run at basically zero CPU, so: it is premature to think about multiple exes, VMs, etc. That is not the problem. The problem is the code, and distributing a poor socket infrastructure over multiple processes just hides the problem.
Writing high performance socket servers is hard, but the good news is: you can avoid most of this. Kestrel (the ASP.NET Core http server) can act as a perfectly good TCP server, dealing with most of the horrible bits of async, sockets, buffer management, etc for you, so all you have to worry about is the actual data processing. The "pipelines" API even deals with back-buffers for you, so you don't need to worry about over-read.
An extensive walkthrough of this is in my 3-and-a-bit part blog series starting here - it is simply way too much information to try and post here. But it links through to a demo server - a dummy redis server hosted via Kestrel. It can also be hosted without Kestrel, using Pipelines.Sockets.Unofficial, but... frankly I'd use Kestrel. The server shown there is broadly similar (in terms of broad initialization - not the actual things it does) to our 60k-per-instance web-socket tier.
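To give a flavour of what that looks like (a minimal sketch, not the code from the blog series; the handler, port, and echo logic are made up), a raw TCP endpoint hosted in Kestrel is roughly:

```csharp
using System.Buffers;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Connections;
using Microsoft.AspNetCore.Hosting;
using Microsoft.Extensions.Hosting;

// Kestrel calls this once per client connection; the pipelines API hands you
// buffers - no threads, raw sockets, or buffer pools to manage yourself.
public class EchoHandler : ConnectionHandler
{
    public override async Task OnConnectedAsync(ConnectionContext connection)
    {
        while (true)
        {
            ReadResult result = await connection.Transport.Input.ReadAsync();
            ReadOnlySequence<byte> buffer = result.Buffer;

            foreach (ReadOnlyMemory<byte> segment in buffer)
            {
                await connection.Transport.Output.WriteAsync(segment); // echo back
            }

            // Tell the pipe what we consumed; it manages the back-buffer for us.
            connection.Transport.Input.AdvanceTo(buffer.End);

            if (result.IsCompleted) break;   // client closed the connection
        }
    }
}

public class Program
{
    public static void Main(string[] args)
    {
        Host.CreateDefaultBuilder(args)
            .ConfigureWebHostDefaults(web => web
                .UseKestrel(k => k.ListenAnyIP(5000,
                    o => o.UseConnectionHandler<EchoHandler>()))  // raw TCP, not HTTP
                .Configure(app => { }))
            .Build()
            .Run();
    }
}
```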
I have messages that I want to send from multiple WPF client applications to a service that can be processed some time after being sent.
Because of expected intermittent connectivity issues between client and server and necessary down time for the service, I'm inclined to create a WCF service with a queued endpoint. This has worked well for me in the past when the client machines were actually other servers and few in number.
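Something like this is what I have in mind (a sketch only - the contract, operation, and queue names are made up):

```csharp
using System.ServiceModel;

// Queued endpoints require one-way operations: the client fires the message into
// its local outgoing queue and never waits for a reply.
[ServiceContract]
public interface ISubmissionService
{
    [OperationContract(IsOneWay = true)]
    void Submit(string payload);
}

public class SubmissionService : ISubmissionService
{
    public void Submit(string payload)
    {
        // processed whenever the service is up, possibly long after the client sent it
    }
}

public static class QueuedHost
{
    public static void Run()
    {
        var binding = new NetMsmqBinding(NetMsmqSecurityMode.Transport)
        {
            Durable = true,      // messages survive machine restarts
            ExactlyOnce = true   // transactional queue semantics
        };

        var host = new ServiceHost(typeof(SubmissionService));
        host.AddServiceEndpoint(typeof(ISubmissionService), binding,
            "net.msmq://localhost/private/submissions");
        host.Open();
    }
}
```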
I'm concerned about doing this with many client machines primarily because I think it will be difficult to monitor so many outgoing queues to confirm that no traffic is being trapped on the client machines.
Has anyone tried doing this before?
If so, would you recommend it? Why or why not?
Even if you haven't done it, can you think of other pitfalls beside the operational issue of monitoring all those outbound queues?
Your question may be better worded as:
Should a system be rolled out with many nodes all using MSMQ?
If so, this is the essence of messaging and is what such systems are designed for, irrespective of whether they are JMS, Apache MQ, WebSphere, SonicMQ, or MSMQ.
Also, "traffic is being trapped on the client machines" - how do you define trapped? Remember, the application may be quite happy for the message to be sitting locally for days before being forwarded to the remote host. Messaging systems have timeouts generally for both reaching the destination and for the destination to process it.
I think you will be fine.
I wanted to create my own mini-project with fictitious high-volume data (options trading) consumed by a WPF application, to better understand the design concepts and considerations that go into a real-time system and to learn what sort of techniques and approaches are used. Please don't mention third-party solutions like Tibco - this is for learning purposes. My intention is that the WPF application refreshes its UI every 5 seconds.
When designing my fictitious market data server, given that high-volume performance is a criterion, a few quick ideas come to mind: multicast UDP (is this too low-level / a bad direction?), a messaging architecture using a queue, e.g. MSMQ or RabbitMQ, or a remote service host that the client app sends requests to, e.g. via a WCF TCP binding or a web service.
One thought I had was that the clients maintain their own local queues and subscribe to topics that the pricing server publishes via a messaging solution. Or maybe the server would broadcast the data to all clients equally and leave it to the clients to filter and collate the data locally? In people's experience, what are the pros and cons of each approach, and is there any other approach I've missed here? I guess it comes down to this - should the clients be pulling data, or should the server be pushing it out to them?
The other question is: what wire format would these messages take? I'm primarily used to working with rich business object classes, separated into a repository layer, a domain model (with methods for validation and workflow logic) and a simple service layer. Could I still leverage this approach and maintain my performance goals, or would I need a more lightweight data payload format?
I would start designing such a system from the higher layers before going down to network level optimizations.
RabbitMQ provides different types of exchanges for routing messages. The approach of broadcasting all messages to every client (a fanout exchange) is marginally faster on the RabbitMQ server side, but it will only work efficiently for low message volumes and when the clients are connected via high-speed links (e.g. local gigabit Ethernet). Using a direct or topic exchange instead may significantly lower your network delays. You can read more about exchange types on the RabbitMQ website.
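For illustration, a topic exchange in the .NET client looks roughly like this (the exchange name, routing keys, and payload are made up):

```csharp
using System.Text;
using RabbitMQ.Client;

// Publisher and subscriber shown together for brevity; in practice they are
// separate processes. The broker does the per-symbol filtering, not the client.
var factory = new ConnectionFactory { HostName = "localhost" };
using IConnection connection = factory.CreateConnection();
using IModel channel = connection.CreateModel();

channel.ExchangeDeclare("prices", ExchangeType.Topic, durable: false);

// Publisher side: the routing key encodes the instrument, e.g. "options.MSFT".
byte[] body = Encoding.UTF8.GetBytes("MSFT 450C Jun bid=1.23 ask=1.25");
channel.BasicPublish(exchange: "prices", routingKey: "options.MSFT",
                     basicProperties: null, body: body);

// Subscriber side: each client gets its own throw-away queue and binds it only
// to the symbols it cares about.
string queueName = channel.QueueDeclare().QueueName;   // exclusive, auto-delete
channel.QueueBind(queueName, "prices", "options.MSFT");
channel.QueueBind(queueName, "prices", "options.AAPL");
```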
Your last question is about the wire format. In theory RabbitMQ allows any string (or even binary) payload, so it's a matter of trying to squeeze more information into fewer bytes. In my experience, as long as your messages do not exceed the network MTU, the gains from compression or a clever encoding scheme are marginal.
In general, think about how much time you are spending on each optimization and what the expected ROI is. IMO some optimizations are more useful than others. If I were you, I would look very carefully at the RabbitMQ configuration parameters. For example, look at whether you can set up the RabbitMQ server with per-process message queues.
I have just finished creating an API where the requests from the API are forwarded to a back-end service via MassTransit/RabbitMQ using the Request/Response pattern. We are now looking at pushing this into production, and we want to have multiple instances of the application (both API and service) running on different servers, with a load balancer distributing the requests between them.
This leaves us in a position where we could potentially lose all of the messages if one of the servers is taken out of the pool for any reason. I am looking at creating a RabbitMQ cluster between the servers (each server has a local install) and was wondering how I would go about setting up the competing consumers in this instance.
Does RabbitMQ or MassTransit handle this so that only one consumer will receive the request, or will all consumers receive it and attempt to respond? Also, with the RabbitMQ cluster, how will MassTransit/RabbitMQ handle a node failing?
You should take a look at this document.
http://www.rabbitmq.com/distributed.html
It explains the common distributed scenarios quite nicely. For your scenario I think federation would be a better fit than clustering. If you go for clustering, you should look at mirrored queues.
If all you need is performance, you are better off having a single server handle your message queuing, with the other servers connecting to it to produce/consume messages.
I don't know how MassTransit works, but if Request/Response is used you should get a single delivery of each message to a single consumer; if the message is not acked (e.g. the consumer crashes), another consumer should pick it up.
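In plain RabbitMQ terms (not MassTransit-specific; the queue name and handler are made up), competing consumers are simply several consumers sharing one queue:

```csharp
using System;
using System.Text;
using RabbitMQ.Client;
using RabbitMQ.Client.Events;

// Run this same program on each server: because every instance consumes from the
// SAME queue, the broker round-robins deliveries and each request reaches exactly
// one consumer. Unacked messages are redelivered if a consumer dies.
var factory = new ConnectionFactory { HostName = "localhost" };
using IConnection connection = factory.CreateConnection();
using IModel channel = connection.CreateModel();

channel.QueueDeclare("requests", durable: true, exclusive: false,
                     autoDelete: false, arguments: null);
channel.BasicQos(0, prefetchCount: 1, global: false);   // one in-flight message at a time

var consumer = new EventingBasicConsumer(channel);
consumer.Received += (sender, ea) =>
{
    string request = Encoding.UTF8.GetString(ea.Body.ToArray());
    // ... handle the request and publish the response ...
    channel.BasicAck(ea.DeliveryTag, multiple: false);
};
channel.BasicConsume("requests", autoAck: false, consumer);

Console.ReadLine();   // keep the process alive while consuming
```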
I have a business application with a master application and multiple slave applications (geographically distributed) connected to each other. All the slave applications interact through the master application, and the master application has to handle all the incoming requests as well as respond to previous requests.
We're dealing with a huge volume of data being transferred between the master and child sites, so I need to handle all the incoming requests and responses simultaneously and effectively. To be precise, I want all the nodes to communicate in a fail-safe manner.
I was looking at MSMQ for our requirement. I'd like your opinion on how best this can be handled in .NET using MSMQ or any other proprietary or open-source message queuing tool.
Thank you.
Regards
NLV
MSMQ is a reliable messaging protocol and will be able to achieve what you described above. If you look into the WCF offerings, fundamentally all of the messaging types will let you handle concurrent requests quite efficiently. The good thing about using WCF is that through configuration you can tweak the binding, the transport protocol, and the number of concurrent requests or threads, so you can keep adjusting until you find what is optimal for your situation. It also takes care of the plumbing code for you, and you don't necessarily have to write code that is tied specifically to MSMQ.
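As a rough sketch of the sort of knobs involved (the contract, queue address, and numbers below are only illustrative, not recommendations):

```csharp
using System.ServiceModel;
using System.ServiceModel.Description;

[ServiceContract]
public interface IMasterService
{
    [OperationContract(IsOneWay = true)]
    void Submit(string payload);
}

// PerCall + Multiple lets WCF dispatch incoming requests on many threads at once.
[ServiceBehavior(InstanceContextMode = InstanceContextMode.PerCall,
                 ConcurrencyMode = ConcurrencyMode.Multiple)]
public class MasterService : IMasterService
{
    public void Submit(string payload) { /* handle a request from a slave site */ }
}

public static class MasterHost
{
    public static void Run()
    {
        var host = new ServiceHost(typeof(MasterService));

        // The binding is just configuration: swap NetMsmqBinding for NetTcpBinding
        // (or expose both endpoints) without touching the service code.
        host.AddServiceEndpoint(typeof(IMasterService),
            new NetMsmqBinding(NetMsmqSecurityMode.Transport),
            "net.msmq://localhost/private/master-inbox");

        // Throttling caps how many requests are processed concurrently.
        host.Description.Behaviors.Add(new ServiceThrottlingBehavior
        {
            MaxConcurrentCalls = 64,
            MaxConcurrentInstances = 64
        });

        host.Open();
    }
}
```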