I am just getting started with RabbitMQ, because on a website I am working on we want to decouple some resource-intensive tasks such as sending emails, generating PDFs, etc.
I have started by following the very simple C# "Hello world" tutorial on the RabbitMQ website (https://www.rabbitmq.com/tutorials/tutorial-one-dotnet.html). This was useful for getting a very brief understanding of how RabbitMQ hangs together, but it has left me with a number of questions that, genuinely and surprisingly, I can't find answers to online.
The "Hello world" example sends a basic string. In my example of sending emails, would my Publisher send all the data needed to send the email (recipient, subject, etc.), perhaps in JSON format?
How would you typically structure a consumer to execute a method to DoSomething? Ideally, I would like it to be fluent, so that if a message is of a particular type the Consumer executes the method SendEmail(), and if the message is of a different type it executes the method GeneratePDF(), and so on.
I have a Publisher and a Consumer; however, I have a number of different tasks that I want the Consumer to process, i.e. send emails or generate PDFs. Should I have multiple consumers (one for each type of task), or multiple queues (again, one for each task)?
These are some of the basic questions that are currently preventing me from seeing how RabbitMQ is used in real-world scenarios. Any help would be greatly appreciated.
With messaging, you typically send small-ish packets of data (like simple JSON objects), much as you would when defining an HTTP-based API. The function definitions and input/output specifications can be identical - think of messaging as just a different transport mechanism.
For questions 2-3, yes, you want to set up multiple queues with multiple consumer types. Each consumer type would subscribe to exactly one queue whose messages are intended only for that consumer. Use routing logic on the publisher to determine which queue the messages end up in.
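As a rough illustration of both points, here is a minimal sketch using the RabbitMQ .NET client. The EmailMessage contract, the "email" queue name, and the Publisher class are hypothetical names chosen for this example, not anything prescribed by RabbitMQ:

```csharp
using System.Text;
using System.Text.Json;
using RabbitMQ.Client;

// Hypothetical payload contract; your real contract would carry
// whatever SendEmail() needs on the consumer side.
public record EmailMessage(string Recipient, string Subject, string Body);

public static class Publisher
{
    public static void PublishEmail(IModel channel, EmailMessage message)
    {
        // One queue per task type; the email consumer subscribes
        // only to this queue.
        channel.QueueDeclare("email", durable: true, exclusive: false, autoDelete: false);

        // Serialize the full message data as JSON, as in question 1.
        var body = Encoding.UTF8.GetBytes(JsonSerializer.Serialize(message));
        channel.BasicPublish(exchange: "", routingKey: "email", basicProperties: null, body: body);
    }
}
```

A PDF worker would get its own queue (say "pdf") with its own contract, so each consumer only ever sees messages it knows how to handle and never needs a type switch.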
I am developing a service in a multiservice architecture using RabbitMQ and the MassTransit library.
The service receives transactions via a Consumer. In accordance with the filtering rules (which are set in a JSON configuration file and imported into the service via Options), the address the transaction information needs to be sent to is determined, and the item is published to a separate queue for later sending.
In the Consumer of the sending queue, I just send the data to the address that was specified for that transaction.
Now there is a need to send data in batches, and here the MassTransit Batch Consumer functionality could help.
But there are difficulties with dispatching. For example, the Consumer receives 4 transactions: 2 of them need to be sent to one address, the other 2 to another. In the code, I build two arrays of transactions, one for each address, and try to send them. If both arrays are sent successfully, everything is fine. If both arrays fail, the entire Batch goes to retry, which is also good. But if one of the arrays is sent successfully and the other is not, the entire Batch still goes to retry.
The actual question is: is it possible to create two separate queues for one entity (using one interface) and send data to each of them separately according to rules? Or is there another way to solve this problem that would divide transactions into Batches according to the sending address?
is it possible to create two separate queues for one entity
I would ask that you try to simplify this process. If the architecture is so confusing that it takes readers 30 minutes to understand the question, it's too complex. Think about supporting this code in 12 months' time.
However, one option is to use a Batch consumer that re-sends to another Batch consumer.
The first Batch consumer reads a custom message header (say __filterby) to split the messages across two different queues (endpoints).
The code then re-batches to a dedicated endpoint/consumer based on that logic, meaning one endpoint/queue per address. Here is some pseudo code to explain what I mean.
public async Task Consume(ConsumeContext<Batch<OrderAudit>> context)
{
    // Split the batch on the custom __filterby header
    var arrayA = context.Message
        .Where(m => m.Headers.Get<string>("__filterby") == "arraya")
        .Select(m => m.Message);
    // Send arrayA to its dedicated endpoint

    var arrayB = context.Message
        .Where(m => m.Headers.Get<string>("__filterby") == "arrayb")
        .Select(m => m.Message);
    // Send arrayB to its dedicated endpoint
}
Also, this feels close to having a RabbitMQ exchange direct traffic to multiple queues based on a topic/routing key. You could re-work the solution to fit this pattern.
References that might help
https://masstransit-project.com/troubleshooting/common-gotchas.html#sharing-a-queue
https://masstransit-project.com/usage/producers.html#message-initializers
https://www.rabbitmq.com/tutorials/tutorial-five-python.html
I read about the multiple exchange types in RabbitMQ, like fanout etc., for multicasting/broadcasting messages.
One way to broadcast/multicast could be to put identifiers in the RabbitMQ message body itself, rather than differentiating through the routing key/headers etc.
What is the benefit of using the routing key/headers to decide the consumer, versus pushing all the data through a NameValueCollection and deciding in a single consumer what action is to be taken?
One benefit I see is that by having one type of object for each consumer, each consumer has just a single responsibility. Is there any other compelling reason to opt for this approach?
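Another benefit of routing keys is that the broker does the filtering before a message ever reaches a consumer. A minimal sketch with the RabbitMQ .NET client (the "tasks" exchange and "email" queue/routing-key names are assumptions for illustration):

```csharp
using System.Text;
using RabbitMQ.Client;
using RabbitMQ.Client.Events;

public static class EmailConsumerSetup
{
    public static void Start(IModel channel)
    {
        channel.ExchangeDeclare("tasks", ExchangeType.Direct);
        channel.QueueDeclare("email", durable: true, exclusive: false, autoDelete: false);

        // The broker only delivers messages published with routing key
        // "email" to this queue -- no if/else dispatch inside the consumer.
        channel.QueueBind(queue: "email", exchange: "tasks", routingKey: "email");

        var consumer = new EventingBasicConsumer(channel);
        consumer.Received += (_, ea) =>
        {
            var payload = Encoding.UTF8.GetString(ea.Body.ToArray());
            // SendEmail(payload) -- single responsibility, no type switch
            channel.BasicAck(ea.DeliveryTag, multiple: false);
        };
        channel.BasicConsume(queue: "email", autoAck: false, consumer: consumer);
    }
}
```

With body-based identifiers, every consumer would instead receive every message and have to deserialize it just to decide whether to act on it or throw it away.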
You could use MassTransit (http://masstransit-project.com/) to broadcast and consume your messages through RabbitMQ. I think it is a good approach, since you are thinking about different contracts/consumers.
I recommend you to take a look at the documentation to see if it is a good fit for your needs.
I've been tasked with taking a RabbitMQ queue, processing the messages (key and values) to filter out unneeded items (based on the key), and delaying the results before making them available via a webservice.
Being new to RabbitMQ, it seems like my best approach would be to write a Windows client that retrieves messages from the queue, filters them accordingly, and puts them into a custom class collection (System.Collections.Queue). Whenever an item in this collection has been stored for X seconds, the message data would be pushed into a public collection, overwriting the existing value based on the key.
This publicly accessible collection would be exposed as a REST service returning json data.
This would loop indefinitely for as long as the client was running.
The end client is a JavaScript widget that will connect to this webservice, and will probably poll it every second. It seems like my approach would work, but I am concerned the process would be too intensive. I get the feeling there might be a better solution.
I was originally thinking node.js might be a good fit for this project, but I'm predominantly an asp.net developer, so I'm happy to consider other solutions, perhaps like SignalR, Web API, WCF?
It seems what you're worried about is polling, not RabbitMQ. A push solution is a better fit, if you can dictate a technology that supports it. Since SignalR and other websocket solutions allow push notifications, they're a great fit for this.
I don't think you personally need a Windows application - you can do worker tasks in ASP.NET easily. The .NET Task Parallel Library now has great support for producer/consumer patterns using blocking collections as well.
So you could just do:
- message received: add to blocking queue
- blocking queue consumer gets the message and starts a new task for it
- the task sleeps for the amount of time wanted
- then adds the result to the web API's output queue
- new messages are pushed out through the websocket
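A minimal sketch of that pipeline using BlockingCollection and the TPL (the DelayedRelay name, the 10-second hold, and the string payloads are illustrative assumptions; a real version would push to SignalR instead of just updating a dictionary):

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public class DelayedRelay
{
    private readonly BlockingCollection<(string Key, string Value)> _incoming = new();
    private readonly ConcurrentDictionary<string, string> _latest = new();

    // Producer side: called when a RabbitMQ message arrives.
    public void Enqueue(string key, string value) => _incoming.Add((key, value));

    // The web API / websocket layer reads the latest value per key from here.
    public IReadOnlyDictionary<string, string> Latest => _latest;

    // Consumer side: blocks until items arrive, then delays each one
    // before overwriting the publicly visible value for its key.
    public Task RunAsync() => Task.Run(() =>
    {
        foreach (var (key, value) in _incoming.GetConsumingEnumerable())
        {
            _ = Task.Run(async () =>
            {
                await Task.Delay(TimeSpan.FromSeconds(10)); // hold for X seconds
                _latest[key] = value; // overwrite existing value by key
            });
        }
    });
}
```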
After reading through the pub/sub project sample in MassTransit, it left me scratching my head.
In the sample, the client application publishes a request for the subscriber application to update the password of a fictitious user. This sample code works fine, and it's easy to follow the bouncing ball of this project.
HOWEVER--
In a real-world environment, the purpose of pub/sub (in my understanding) is to have a small number of publishers interacting with a large number of subscribers. In the case of a subscriber performing any sort of CRUD operation, shouldn't the communication pattern prevent more than one subscriber from handling the message? It would be less than desirable to have twenty subscribers attempt to update the same database record, for instance.
Is this just a case of a misguided sample project?
If pub/sub can be used for CRUD operations, how do you configure the framework to only allow one subscriber to perform an operation?
Am I just completely missing some basic info on the purpose of pub/sub?
Thanks for any clarification provided...
David
The scenario you refer to is usually called "competing consumers", and is quite typical of pub/sub.
If each consumer has its own, unique queue name, each consumer will receive its own copy of the messages.
Alternatively, to get competing-consumer behaviour, have the consumers share the same queue name; the consumers then compete for each message (so each message will be received only once).
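A sketch of both patterns with the RabbitMQ .NET client (the queue and exchange names are illustrative only):

```csharp
using RabbitMQ.Client;
using RabbitMQ.Client.Events;

public static class SubscriptionPatterns
{
    // Competing consumers: every worker consumes from the SAME queue,
    // so the broker delivers each message to exactly one of them.
    // Twenty workers can run safely; only one handles each update.
    public static void AddCompetingWorker(IModel channel)
    {
        channel.QueueDeclare("password-updates", durable: true, exclusive: false, autoDelete: false);
        channel.BasicConsume("password-updates", autoAck: true, new EventingBasicConsumer(channel));
    }

    // True pub/sub: each subscriber declares its OWN queue and binds it
    // to a fanout exchange, so every subscriber gets its own copy.
    public static void AddBroadcastSubscriber(IModel channel, string subscriberName)
    {
        channel.ExchangeDeclare("password-updated", ExchangeType.Fanout);
        var queue = channel.QueueDeclare($"password-updated.{subscriberName}").QueueName;
        channel.QueueBind(queue, "password-updated", routingKey: "");
        channel.BasicConsume(queue, autoAck: true, new EventingBasicConsumer(channel));
    }
}
```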
You can have n-to-n, many-to-few, or few-to-many publishers to subscribers in any pub/sub system. It's really a matter of how many actors you want responding to a given message.
The sample project might not be the best, but we feel it shows what's going on. In real-world cases, though, it can be used for CRUD-type behaviours; however, it's more along the lines of many front ends sending "load data" type messages to middleware (cache), requesting a response with the same data. If that data gets updated on the front end somehow, the front end must publish a message indicating that, and multiple middleware pieces need to update (cache, backend store, etc.). [see CQRS]
Messaging in general is more about working with disconnected systems. Your specific question is more about the structure of consumers and publishers. I've seen implementations of MassTransit where most of the routes were static and it wasn't really pub/sub at all, but just a lot of sends along a known topography of systems. For really understanding the concepts, the best book I know of is Enterprise Service Bus: Theory in Practice.
I hope this helps!
Edit: Also see our documentation, some of the concepts are touched on there.
I am having a little trouble deciding which way to go while designing the message flow in our system.
Because of the volatile nature of our business processes (e.g. calculating freight costs), we use a workflow framework to be able to change the process on the fly.
The general process should look something like this:
The interface is a service which connects to the customer's system via whatever interface the customer provides (webservices, TCP endpoints, database polling, files, you name it). Then a command is sent to the executor containing the received data and the ID of the workflow to be executed.
The first problem comes at the point where we want to distribute load on multiple worker services.
Say we have different processes like printing parcel labels, calculating prices, sending notification mails. Printing the labels should never be delayed because a ton of mailing workflows is executed. So we want to be able to route commands to different workers based on the work they do.
Because all commands are like "execute workflow XY", we would be required to implement our own content-based routing. NServiceBus does not support this out of the box, mostly because it's considered an anti-pattern.
Is there a better way to do this, when you are not able to use different message types to route your messages?
The second problem comes when we want to add monitoring. Because an endpoint can only subscribe to one queue for each message type, we cannot just let all executors publish an "I completed a workflow" message. The current solution would be to Bus.Send the message to a pre-configured auditing endpoint. This feels a little like cheating to me ;)
Is there a better way to consolidate the published messages of multiple workers into one queue again? If it were not for problem #1, all workers could use the same input queue; however, this is not possible in this scenario.
You can try to make your routing headers-based rather than content-based, which should be much easier. You are not interested in whether the workflow prints labels or not; you are interested in whether the command is priority or not. So you can probably add this information to the message header...
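As a hedged sketch of what that could look like with NServiceBus (the ExecuteWorkflow command, the "Workflow.Priority" header name, and the endpoint names are all assumptions chosen for illustration, not NServiceBus conventions):

```csharp
using System;
using System.Threading.Tasks;
using NServiceBus;

// Hypothetical command: every workflow run uses the same message type,
// so the priority decision lives in a header, not in the message body.
public class ExecuteWorkflow : ICommand
{
    public Guid WorkflowId { get; set; }
}

public class Dispatcher
{
    public Task DispatchAsync(IMessageSession session, Guid workflowId, bool isPriority)
    {
        var options = new SendOptions();
        options.SetHeader("Workflow.Priority", isPriority ? "high" : "normal");

        // Route priority work (e.g. label printing) to its own endpoint
        // so a flood of mail workflows can never delay it.
        options.SetDestination(isPriority ? "Workers.Priority" : "Workers.Bulk");

        return session.Send(new ExecuteWorkflow { WorkflowId = workflowId }, options);
    }
}
```

This keeps the routing decision in one place on the sending side, rather than requiring the workers to inspect message content.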