PubSub example in MassTransit - c#

After reading through the pub/sub project sample in MassTransit, it left me scratching my head.
In the sample, the client application publishes a request for the subscriber application to update the password of a fictitious user. This sample code works fine, and it's easy to follow the bouncing ball of this project.
HOWEVER--
In a real-world environment, the purpose of pub/sub (in my understanding) is to have a small number of publishers interacting with a large number of subscribers. In the case of a subscriber performing any sort of CRUD operation, shouldn't the communication pattern prevent more than one subscriber from handling the message? It would be less than desirable to have twenty subscribers attempt to update the same database record, for instance.
Is this just a case of a misguided sample project?
If pub/sub can be used for CRUD operations, how do you configure the framework to only allow one subscriber to perform an operation?
Am I just completely missing some basic info on the purpose of pub/sub?
Thanks for any clarification provided...
David

The scenario you refer to is usually called 'competing consumers', and it is quite typical of pub/sub.
If each consumer has its own, unique queue name, each consumer will receive its own copy of every message.
Alternatively, if consumers share the same queue name, they compete for each message, so each message is received by only one of them.
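To make the difference concrete, here is a minimal sketch of both setups. It is a toy in-memory broker written in Python, not MassTransit's API; the class `InMemoryBroker`, the queue names, and the round-robin delivery are all assumptions made for illustration.

```python
from collections import defaultdict, deque


class InMemoryBroker:
    """Toy broker: every distinct queue name gets its own copy of each message."""

    def __init__(self):
        self.queues = defaultdict(deque)     # queue name -> pending messages
        self.consumers = defaultdict(list)   # queue name -> consumer callbacks
        self._next = defaultdict(int)        # queue name -> round-robin cursor

    def subscribe(self, queue_name, consumer):
        self.consumers[queue_name].append(consumer)

    def publish(self, message):
        # Fan out: one copy per queue, no matter how many consumers share it.
        for name in self.consumers:
            self.queues[name].append(message)

    def deliver_all(self):
        # Each queued message goes to exactly one consumer on that queue.
        for name in list(self.queues):
            pending = self.queues[name]
            consumers = self.consumers[name]
            while pending:
                consumer = consumers[self._next[name] % len(consumers)]
                self._next[name] += 1
                consumer(pending.popleft())


broker = InMemoryBroker()
password_updates = []   # written by two competing workers
audit_log = []          # a separate subscriber with its own queue

# Same queue name: the two workers compete for each message.
broker.subscribe("update-password", lambda m: password_updates.append(("worker-1", m)))
broker.subscribe("update-password", lambda m: password_updates.append(("worker-2", m)))
# Unique queue name: this subscriber gets its own copy of everything.
broker.subscribe("audit", lambda m: audit_log.append(m))

broker.publish("change password for user 42")
broker.publish("change password for user 43")
broker.deliver_all()
```

Each password change is handled by only one of the two workers, while the audit subscriber still sees both messages.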

You can have n-to-n, many-to-few, or few-to-many publishers to subscribers in any pub/sub system. It's really a matter of how many actors you want responding to a given message.
The sample project might not be the best, but we feel it shows what's going on. In real-world cases, though, it can be used for CRUD-type behaviours; however, it's more along the lines of many front ends sending "load data" type messages to middleware (a cache) requesting a response with that data. If that data gets updated on the front end somehow, the front end must publish a message indicating that, and multiple middleware pieces need to update (cache, backend store, etc.). [see CQRS]
Messaging in general is more about working with disconnected systems. Your specific world is more about the structure of consumers and publishers. I've seen implementations of MassTransit where most of the routes were static and it wasn't really pub/sub at all, just a lot of sends along a known topology of systems. For really understanding the concepts, the best book I know of is Enterprise Service Bus: Theory in Practice.
I hope this helps!
Edit: Also see our documentation; some of the concepts are touched on there.

Related

How to guarantee delivery of messages in MassTransit?

I'm trying to figure out how to implement a fault-tolerant message publication solution using MassTransit. We'll focus on the simple scenario where we only need to commit a database change, and publish an event indicating that change. Because there is no (built-in) mechanism that allows an atomic "commit and publish", when our process crashes, we will get into an inconsistent state (some messages would only be committed to the database, and some might only be published to the message queue).
This documentation page offers a solution: because we assume message handling is idempotent, we can rely on the entire operation being retried in case of failure, and these partial commits will eventually be resolved. This is a great solution, but it has one caveat: it assumes that the operation we are performing was triggered by a message, so that if we don't send an ack, processing will be retried. This is not a reasonable assumption, as messaging is typically used only for internal communication inside the system, not for communication with the outside world. What should I do when I need to save-and-publish when handling an HTTP request from an external client?
One possible solution is to hack our way into the approach presented in the article, by only publishing (or sending) a message, and listening to it ourselves, then in the message handler we do the commit and the publishing of the actual event we want others to listen to. The main problem I have with this is that it assumes we never have to return anything in the HTTP response. What if we need to indicate the success or failure of the database transaction back to the HTTP client? (example: if we rely on a UNIQUE constraint to tell us whether or not we should accept the request, and we want to indicate failure to the client). We could solve it by using request-response over the message queue (with ourselves), but this is ugly and increases latency and complexity considerably, for what is actually a very common scenario.
The approach I see most often on the internet for this problem is to use an outbox that is persisted to the same database we need to write to anyway, so that the two operations can be wrapped in a regular ACID database transaction. A background task then polls this database for new events and publishes them to the message broker. Unlike other frameworks, I understand that MassTransit does not support this behavior out of the box. So I guess my question boils down to: before rushing to implement this relatively complex mechanism myself (once per database technology), is there another solution I'm missing? What is the accepted solution to this problem in the MassTransit community?
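For readers who have not seen the outbox approach, a minimal sketch of it follows. It uses Python with an in-memory SQLite database purely for illustration; the table names, `register_user`, and `relay_outbox` are hypothetical and have nothing to do with MassTransit itself.

```python
import json
import sqlite3


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (email TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "payload TEXT NOT NULL, "
        "sent INTEGER NOT NULL DEFAULT 0)"
    )
    conn.commit()
    return conn


def register_user(conn, email):
    """Write the row and its event in one transaction; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
            conn.execute(
                "INSERT INTO outbox (payload) VALUES (?)",
                (json.dumps({"event": "UserRegistered", "email": email}),),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # uniqueness violation: no row and no event were stored


def relay_outbox(conn, publish):
    """Background step: publish unsent events in insertion order, then mark them sent."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE sent = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        publish(json.loads(payload))  # a crash after this line causes a re-publish
        with conn:
            conn.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))
    return len(rows)


conn = open_db()
published = []  # stands in for the message broker
first = register_user(conn, "david@example.com")      # accepted
duplicate = register_user(conn, "david@example.com")  # rejected by the key constraint
relayed = relay_outbox(conn, published.append)
relayed_again = relay_outbox(conn, published.append)
```

The HTTP handler can return the result of `register_user` directly to the client, and the relay gives at-least-once publication, so subscribers still need to be idempotent.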
This has been asked several times, in a variety of forms, here and other places. But the short answer is simple.
In your controller, write to the message broker only. Let the consumer deal with the database, in the context of consuming a reliable message, with all the nice retry and redelivery options that are available in that context. Then you get all the benefits of the InMemoryOutbox, without adding the extreme complexity of having three parties (HTTP, database, and broker) in a single conversation.
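As a rough illustration of why the consumer side is the comfortable place for the database work, here is a sketch of a retried, idempotent handler. It is plain Python, not MassTransit code; `TransientError`, `consume_with_retry`, and `RegisterUserConsumer` are invented names, and the in-memory lists stand in for the database and the broker.

```python
class TransientError(Exception):
    """Stand-in for a recoverable failure such as a dropped database connection."""


def consume_with_retry(handler, message, max_attempts=3):
    """Call handler(message) until it succeeds; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        try:
            handler(message)
            return attempt
        except TransientError:
            if attempt == max_attempts:
                raise  # a real broker would move the message to an error queue


class RegisterUserConsumer:
    """Idempotent handler: a redelivered message id has no additional effect."""

    def __init__(self, fail_first=0):
        self.handled_ids = set()   # ids of messages already processed
        self.users = []            # stands in for the database table
        self.published = []        # stands in for events sent after the commit
        self._failures_left = fail_first

    def __call__(self, message):
        if self._failures_left > 0:
            self._failures_left -= 1
            raise TransientError("database unavailable")
        if message["id"] in self.handled_ids:
            return  # already handled on an earlier delivery
        self.users.append(message["email"])
        self.published.append({"event": "UserRegistered", "email": message["email"]})
        self.handled_ids.add(message["id"])


consumer = RegisterUserConsumer(fail_first=2)
command = {"id": "msg-1", "email": "david@example.com"}
attempts = consume_with_retry(consumer, command)     # two failures, then success
redelivery = consume_with_retry(consumer, command)   # duplicate delivery, ignored
```

The controller's only job in this arrangement is to send `command`; every failure after that point is handled by retrying the message.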

Reliable publish/subscribe

Is there any way to set up a publish/subscribe system (1 publisher, 0..* subscribers) where every subscriber is guaranteed to receive every message exactly once and in the same order the messages were sent? Most bus systems (e.g. NServiceBus) will not guarantee this.
I don't care if I have to implement the system myself, but at least the transactional asynchronous communication (e.g. queue, or similar) should be able to do this.
Any hints/suggestions?
In my (limited) experience you can use RabbitMQ to achieve this:
https://www.rabbitmq.com
Specifically, I recommend their publish/subscribe tutorial:
https://www.rabbitmq.com/tutorials/tutorial-three-python.html
Please note that specific tutorial does not involve the concept of confirms which would be the next step to make sure that the messages are delivered to the consumers.
As for message ordering it may also be achievable, depending on your use case, as explained in this stackoverflow post:
RabbitMQ - Message order of delivery
Hope it helps.
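If the broker can only promise at-least-once delivery, one common way to get "exactly once, in order" as seen by the handler is to do the bookkeeping on the subscriber side. The sketch below assumes the publisher stamps every message with an increasing sequence number starting at 1; that numbering, and the class name `OrderedSubscriber`, are assumptions for illustration and not something RabbitMQ provides by itself.

```python
class OrderedSubscriber:
    """Hands each message to the handler once, in sequence order."""

    def __init__(self, handler, first_seq=1):
        self.handler = handler
        self.expected = first_seq   # next sequence number to hand over
        self.pending = {}           # early arrivals waiting for their turn

    def receive(self, seq, body):
        if seq < self.expected or seq in self.pending:
            return  # duplicate delivery: drop it
        self.pending[seq] = body
        # Release every message that is now contiguous with what was delivered.
        while self.expected in self.pending:
            self.handler(self.pending.pop(self.expected))
            self.expected += 1


delivered = []
subscriber = OrderedSubscriber(delivered.append)

# Simulated deliveries with reordering and duplicates.
for seq, body in [(2, "b"), (1, "a"), (2, "b"), (4, "d"), (3, "c"), (1, "a")]:
    subscriber.receive(seq, body)
```

A real subscriber would also have to persist `expected` together with the effects of the handler, otherwise a restart loses the deduplication state.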

Is it overkill to use a service bus if all messages are sent locally?

I have a mail reading service that reads every email from an inbox, parses it and inserts it into a database. The issue I'm running into is that there is no guarantee that I will be parsing the emails in the order they were received (this is a business requirement). My fix for this would be to introduce some sort of queueing system. This way I would process the items in the order they came in. This would also give me the benefit of decoupling the reading of the emails from parsing them and inserting them into the database.
So my question is: is it overkill to use a service bus (such as NServiceBus) if I only plan on sending messages locally? That is, the service that reads emails and the service that parses them and inserts them into the database would reside on the same machine.
Thank you.
Yes, this is clearly overkill, especially since NServiceBus doesn't guarantee that messages are delivered in order.
You can just use a Queue<T>, assuming you know how to get the messages out in order (this appears to be where you are having trouble, not that you are or aren't using a queue or whatever; you have to know how to get the items into the queue in the right order to begin with).
KISS and YAGNI apply here, all day, every day.
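A minimal sketch of that idea, in Python rather than C# (a `deque` plays the role of `Queue<T>` here). The `received` field and both function names are assumptions; the point is only that the ordering has to be established before the items go into the queue.

```python
from collections import deque
from datetime import datetime


def enqueue_in_received_order(emails):
    """Return a FIFO queue of emails sorted by their received timestamp."""
    return deque(sorted(emails, key=lambda e: e["received"]))


def process_all(queue, parse):
    """Drain the queue front to back, so the oldest email is handled first."""
    results = []
    while queue:
        results.append(parse(queue.popleft()))
    return results


# The mailbox may hand the emails over in any order.
inbox = [
    {"subject": "third", "received": datetime(2024, 1, 1, 9, 30)},
    {"subject": "first", "received": datetime(2024, 1, 1, 9, 0)},
    {"subject": "second", "received": datetime(2024, 1, 1, 9, 15)},
]

subjects = process_all(enqueue_in_received_order(inbox), lambda e: e["subject"])
```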
I would just use MSMQ for your persistence issues. Once a message is in, it's guaranteed to be there, regardless of the machine losing power or some other application crashing.
The would word I don't like. In my opinion: make your system as flexible as possible without exceeding the limits of acceptable performance for your application (which only you may know).
In general: be prepared for the worst marketing decision you can think of.
It depends. For your application, I agree with Jason, a service bus will not help you process messages in order any more than a local data structure will. And, as Jason said, it will most likely be more difficult considering the order of messages in a service bus is not guaranteed.
However, sending messages locally with a service bus can be very useful. It makes it very easy to send messages to other processes asynchronously. Since the consumer of the message is in a different process, you don't really have any threading concerns. Messages can be durable, so you don't have to worry about something being missed, and it's very easy to add additional processing for a message after the fact by just adding a new subscriber. As an extra bonus, if the system ever becomes too big to run comfortably on one machine, it would be trivial to distribute the bus.
For your solution, it is unnecessary and might even cause issues. But there are cases where it makes sense to use a service bus locally.
This is the kind of job where ZeroMQ works well, and the side benefit to you is that you learn how to use a tool which can be used with other languages and on other platforms as well.

MSMQ one (queue) to many (listeners) scenario

I have this scenario: one client sends messages into an MSMQ queue, and there are 3 processes listening on this queue. I want each of those processes to pick up a different message and process it.
I know that this is a common usage scenario for queues, and I already have working code for it using MSMQ, .NET and C#.
However, I am wondering if MSMQ is my best option here: the documentation clearly states that MSMQ is meant for "one to one" communication, meaning that there shouldn't be more than one listener.
That leaves me wondering whether what I am doing is the right solution for my use case. Or is it the other way round: do I have to create one queue per listener and distribute the messages in a preceding part of the workflow?
A link to a working example demonstrating the usage of MSMQ in this type of scenario would be greatly appreciated.
Thanks
As I understand it you are using multiple listeners to do something like load balancing. This is an absolutely valid scenario and it is often used in clustered environments or in load balancing scenarios where a single listener is not able to consume all incoming messages. Btw. clustered BizTalk consumes MSMQ messages in the same way.
One-to-one means that one message is passed to one listener; it doesn't mean that each queue can have only a single listener. If all listeners do the same processing and it doesn't matter which one picks up the message, it is still one-to-one.
It is also possible to use one queue to deliver one message to multiple listeners. This scenario is not recommended with MSMQ even though it is technically possible with triggers.
If your listeners listen only for messages with some special properties, identifying which listener should consume the message (i.e. you search for messages in the queue), you should definitely use three queues instead.
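The load-balancing case can be sketched without MSMQ at all. In the Python example below a thread-safe `queue.Queue` stands in for the MSMQ queue and three threads stand in for the three listener processes; `run_listeners` is a made-up helper. Removing a message from the queue is atomic, so no message is handled twice.

```python
import queue
import threading


def run_listeners(messages, listener_count=3):
    """Let several listeners drain one shared queue; return who handled what."""
    shared = queue.Queue()
    for message in messages:
        shared.put(message)

    handled = []              # (listener id, message) pairs
    lock = threading.Lock()   # protects the handled list

    def listen(listener_id):
        while True:
            try:
                # Taking a message removes it for every other listener.
                message = shared.get_nowait()
            except queue.Empty:
                return
            with lock:
                handled.append((listener_id, message))

    threads = [threading.Thread(target=listen, args=(i,)) for i in range(listener_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return handled


handled = run_listeners([f"msg-{n}" for n in range(10)])
```

Which listener gets which message varies from run to run, but every message shows up in `handled` exactly once.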
"the documentation clearly states that MSMQ is meant for "one to one" communication, meaning that there shouldn't be more than one listener."
You have a link for this?
MSMQ uses two delivery methods:
1-1 : one sender, one destination queue
1-M : one sender multicasting to many destination queues
Also, you can have multiple listeners on a queue.
The number of listeners is up to you.
Of course, there will be contention between multiple listeners so if you want messages to be processed only once you need to code/configure for that.
It sounds like you need a service bus - however, they tend to be somewhat heavyweight, so it might be overkill. With a service bus, you can set up publish-subscribe scenarios in which any number of listeners can subscribe to messages. NServiceBus is a service bus that is somewhat simple to use (and it is built on top of MSMQ); there is a free version of it that is capped to 30 messages per second. Rhino ESB also claims to be lightweight.

Message Granularity for Message Queues and Service Buses

I'm working on an application that may generate thousands of messages in a fairly tight loop on a client, to be processed on a server. The chain of events is something like:
Client processes item, places in local queue.
Local queue processing picks up messages and calls web service.
Web service creates message in service bus on server.
Service bus processes message to database.
The idea being that all communications are asynchronous, as there will be many clients for the web service. I know that MSMQ can do this directly, but we don't always have that kind of admin capability on the clients to set things up like security etc.
My question is about the granularity of the messages at each stage. The simplest method would mean that each item processed on the client generates one client message, one web service call and one service bus message. That's fine, but I know it's better for the web service calls to be batched up if possible, except that there's a tradeoff between coarse-grained web service DTOs and short-running transactions on the database. This particular scenario does not require a "business transaction" where all or none of the items are processed; I'm just looking to achieve the best balance of message size vs. number of web service calls vs. database transactions.
Any advice?
Chatty interfaces (i.e. lots and lots of messages) tend to have a high overhead from dispatching each incoming message (and, on the client, the reply) to the correct code to process it; this is a fixed cost per message. Big messages, on the other hand, tend to use more resources while they are being processed.
Additionally a lot of web service calls in progress will mean a lot of TCP/IP connections to manage, and concurrency issues (including locking in a database) might become an issue.
But without some details of the processing of the message it is hard to be specific, other than the general advice against chatty interfaces because of the fixed overheads.
Measure first, optimize later. Unless you can make a back-of-the-envelope estimate that shows that the simplest solution yields unacceptably high loads, try it, establish good supervisory measurements, see how it performs and scales. Then start thinking about how much to batch and where.
This approach, of course, requires you to be able to change the web service interface after deployment, so you need a versioning approach to deal with clients which may not have been redesigned, supporting several WS versions in parallel. But not thinking about versioning almost always traps you in suboptimal interfaces, anyway.
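One way to keep that decision open is to make the batch size a parameter from the start, so that "one item per call" is just the case `batch_size=1` and can be changed after measuring. The Python sketch below assumes a hypothetical `call_service` callable that stands in for the web service call; none of the names come from the question.

```python
from itertools import islice


def batches(items, batch_size):
    """Yield lists of up to batch_size items, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def send_all(items, batch_size, call_service):
    """Make one service call per batch; return the number of calls made."""
    calls = 0
    for batch in batches(items, batch_size):
        call_service(batch)
        calls += 1
    return calls


received = []  # stands in for the server side of the web service
calls_unbatched = send_all(range(1000), 1, received.append)
calls_batched = send_all(range(1000), 50, received.append)
small_example = list(batches(range(5), 2))
```

With 1000 items, a batch size of 1 costs 1000 calls and a batch size of 50 costs 20, which gives two data points to compare against database transaction time.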
Abstract the message queue and have a swappable message queue backend. This way you can test many backends and give yourself an easy bail-out should you pick the wrong one or grow to like a new one that appears. The overhead of messaging is usually packing and handling the request. Different systems are designed for different levels of traffic and different symmetries over time.
If you abstract out the basic features you can swap the mechanics in and out as your needs change, or are more accurately assessed.
You can also translate messages from differing queue types at various portions of the application or message route as the recipient's stresses change because they are handling, for example 1000:1/s vs 10:1/s on a higher level.
Good Luck
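A minimal sketch of such an abstraction, in Python. The interface `MessageQueue` and both backends are invented for illustration; a real backend would wrap MSMQ, a service bus, or a web service client behind the same two methods.

```python
from abc import ABC, abstractmethod
from collections import deque


class MessageQueue(ABC):
    """The only surface the application code is allowed to depend on."""

    @abstractmethod
    def send(self, message):
        """Enqueue one message."""

    @abstractmethod
    def receive(self):
        """Return the next message, or None when the queue is empty."""


class InMemoryQueue(MessageQueue):
    """Simplest possible backend, useful for tests."""

    def __init__(self):
        self._items = deque()

    def send(self, message):
        self._items.append(message)

    def receive(self):
        return self._items.popleft() if self._items else None


class CountingQueue(MessageQueue):
    """Wraps any other backend and counts traffic, e.g. to compare candidates."""

    def __init__(self, inner):
        self._inner = inner
        self.sent = 0

    def send(self, message):
        self.sent += 1
        self._inner.send(message)

    def receive(self):
        return self._inner.receive()


def round_trip(mq, items):
    """Application code: knows only the MessageQueue interface."""
    for item in items:
        mq.send(item)
    out = []
    while (message := mq.receive()) is not None:
        out.append(message)
    return out


plain = round_trip(InMemoryQueue(), ["a", "b", "c"])
counting_backend = CountingQueue(InMemoryQueue())
counted = round_trip(counting_backend, ["a", "b", "c"])
```

`round_trip` behaves the same with either backend, which is the property that makes swapping them cheap later.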
