I'm trying to figure out how to implement a fault-tolerant message publication solution using MassTransit. We'll focus on the simple scenario where we only need to commit a database change and publish an event indicating that change. Because there is no (built-in) mechanism that allows an atomic "commit and publish", when our process crashes we will end up in an inconsistent state (some changes would be committed to the database without their event ever being published, and some events might be published without the change being committed).
This documentation page offers a solution: because we assume message handling is idempotent, we can rely on the entire operation being retried in case of failure, and these partial commits will eventually be resolved. This is a great solution, but it has one caveat: it assumes that the operation we are performing was triggered by a message, and that if we don't ack it, processing will be retried. This is not a reasonable assumption, as messaging is typically used only for internal communication inside the system, not for communication with the outside world. What should I do when I need to save-and-publish while handling an HTTP request from an external client?
One possible solution is to hack our way into the approach presented in the article by only publishing (or sending) a message and subscribing to it ourselves; then, in the message handler, we do the commit and publish the actual event we want others to listen to. The main problem I have with this is that it assumes we never have to return anything in the HTTP response. What if we need to indicate the success or failure of the database transaction back to the HTTP client (for example, if we rely on a UNIQUE constraint to decide whether to accept the request, and we want to report a rejection to the client)? We could solve it by using request-response over the message queue (with ourselves), but this is ugly and increases latency and complexity considerably, for what is actually a very common scenario.
The approach I see most often on the internet for solving this problem is to use an outbox that is persisted to the same database we need to write to anyway, so the two operations can be wrapped in a regular ACID database transaction. A background task then polls this database for new events and publishes them to the message broker. Unlike some other frameworks, I understand that MassTransit does not support this behavior out of the box. So I guess my question boils down to: before rushing to implement this relatively complex mechanism myself (once per database technology), is there another solution I'm missing? What is the accepted solution to this problem in the MassTransit community?
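To make the pattern concrete, here is a rough sketch of what I mean; the Orders/Outbox tables, their columns, and the event type name are all made up for illustration, and this is not an existing MassTransit feature:

using System;
using System.Data.SqlClient;
using System.Text.Json;

public static class OutboxExample
{
    // Commit the state change and the pending event in one ACID transaction.
    public static void SaveAndEnqueueEvent(string connectionString, Guid orderId)
    {
        using var connection = new SqlConnection(connectionString);
        connection.Open();
        using var transaction = connection.BeginTransaction();

        // 1. The actual database change.
        using (var insertOrder = new SqlCommand(
            "INSERT INTO Orders (Id) VALUES (@id)", connection, transaction))
        {
            insertOrder.Parameters.AddWithValue("@id", orderId);
            insertOrder.ExecuteNonQuery();
        }

        // 2. The event, persisted to a hypothetical Outbox table in the same transaction.
        using (var insertEvent = new SqlCommand(
            "INSERT INTO Outbox (Id, Type, Payload) VALUES (@id, @type, @payload)",
            connection, transaction))
        {
            insertEvent.Parameters.AddWithValue("@id", Guid.NewGuid());
            insertEvent.Parameters.AddWithValue("@type", "OrderSubmitted");
            insertEvent.Parameters.AddWithValue("@payload",
                JsonSerializer.Serialize(new { OrderId = orderId }));
            insertEvent.ExecuteNonQuery();
        }

        transaction.Commit();

        // A separate background task polls Outbox for unpublished rows, publishes
        // them to the broker, and only then marks or deletes them.
    }
}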
This has been asked several times, in a variety of forms, here and elsewhere. The short answer is simple.
In your controller, write to the message broker only. Let the consumer deal with the database, in the context of consuming a reliable message, with all the nice retry and redelivery options that are available in that context. Then you get all the benefits of the InMemoryOutbox without the extreme complexity of having three parties (HTTP, database, and broker) in a single conversation.
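A minimal sketch of that shape; the message contracts, route, and consumer names are illustrative, not something shipped with MassTransit:

using System;
using System.Threading.Tasks;
using MassTransit;
using Microsoft.AspNetCore.Mvc;

public record SubmitOrder(Guid OrderId);     // illustrative message contracts
public record OrderSubmitted(Guid OrderId);

[ApiController]
public class OrderController : ControllerBase
{
    readonly IPublishEndpoint _publishEndpoint;

    public OrderController(IPublishEndpoint publishEndpoint) =>
        _publishEndpoint = publishEndpoint;

    // The controller writes to the broker only and acknowledges acceptance.
    [HttpPost("orders")]
    public async Task<IActionResult> Post(SubmitOrder command)
    {
        await _publishEndpoint.Publish(command);
        return Accepted(); // 202: accepted for processing, not "committed"
    }
}

// The consumer owns the database work, inside the consume context where
// retry, redelivery, and the InMemoryOutbox apply.
public class SubmitOrderConsumer : IConsumer<SubmitOrder>
{
    public async Task Consume(ConsumeContext<SubmitOrder> context)
    {
        // save the change to the database here
        await context.Publish(new OrderSubmitted(context.Message.OrderId));
    }
}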
Related
We are running multiple instances of a Windows service that reads messages from a topic, runs a report, converts the results into a PDF, and emails them to a user. In case of exceptions we simply log the exception and move on.
The use case we want to handle is service shutdown: we want to preserve the jobs that are currently running so they can be reprocessed by another instance of the service, or by this one when it is restarted.
Is there a way of requeueing a message? The hacky solution would be to just republish the message from the consuming service, but there must be another way.
When incoming messages are processed, their data is put in an internal queue structure (not a message queue) and processed in batches on parallel threads, so the IBM MQ transaction support seems hard to apply. Is that what I should be using, though?
Your requirement seems hard to meet unless you get rid of that "internal queue structure (not a message queue)", at least as long as it is not backed by transaction-oriented middleware. The MQ queue / topic works well for multi-threaded consumers, so it is not apparent what you gain from the intermediate step of moving the data to just another queue. If you start your transaction by consuming the message from MQ, the whole thing can be rolled back when something goes wrong.
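For illustration, a rough consume-under-syncpoint sketch using the IBM MQ classes for .NET; the queue manager and queue names are placeholders, and you should check the option constants and error handling against your client version:

using System;
using IBM.WMQ;   // IBM MQ classes for .NET

var queueManager = new MQQueueManager("QM1");
var queue = queueManager.AccessQueue("THE.REPORTING.QUEUE",
    MQC.MQOO_INPUT_AS_Q_DEF | MQC.MQOO_FAIL_IF_QUIESCING);

var gmo = new MQGetMessageOptions();
gmo.Options = MQC.MQGMO_SYNCPOINT | MQC.MQGMO_WAIT;   // get under syncpoint
gmo.WaitInterval = 5000;

var message = new MQMessage();
try
{
    queue.Get(message, gmo);        // throws MQException if nothing arrives in time
    // run the report, render the PDF, send the email ...
    queueManager.Commit();          // only now is the message really gone
}
catch (Exception)
{
    queueManager.Backout();         // the message returns to the queue for redelivery
    throw;
}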
If I understood your use case correctly, you can use Durable subscriptions:
Durable subscriptions continue to exist when a subscribing application's connection to the queue manager is closed.
The details are explained in DEFINE SUB (create a durable subscription). Example:
DEFINE QLOCAL(THE.REPORTING.QUEUE) REPLACE DEFPSIST(YES)
DEFINE TOPIC(THE.REPORTING.TOPIC) REPLACE +
TOPICSTR('/Path/To/My/Interesting/Thing') DEFPSIST(YES) DURSUB(YES)
DEFINE SUB(THE.REPORTING.SUB) REPLACE +
TOPICOBJ(THE.REPORTING.TOPIC) DEST(THE.REPORTING.QUEUE)
Your service instances can consume now from THE.REPORTING.QUEUE.
While I readily admit that my knowledge is shaky, from what I understood from IBM’s [sketchy, inadequate, obtuse] documentation there really is no good built-in solution. With transactions, the queue manager assumes all is well unless it receives a rollback request, and when it does, it rolls back to a syncpoint; so if you’re trying to roll back one message but two other messages have completed in the meantime, it will roll back all three.
We ended up coding our own solution: we changed the way we log messages and mark them as completed in the DB. Then, on both startup and shutdown, we find the uncompleted messages and programmatically publish them back to the queue, limiting the DB search by machine name so that multiple running instances of the service won’t duplicate each other’s processing.
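Roughly the shape of the startup/shutdown sweep; the MessageLog table and its columns are just our own naming, so adjust to taste:

using System;
using System.Data.SqlClient;

public static class MessageRecovery
{
    public static void RequeueUncompletedMessages(string connectionString, Action<string> republish)
    {
        // Rows logged by this machine that never reached the "completed" state.
        const string findUncompleted = @"
            SELECT Body
            FROM MessageLog
            WHERE Completed = 0 AND MachineName = @machineName";

        using var connection = new SqlConnection(connectionString);
        connection.Open();
        using var command = new SqlCommand(findUncompleted, connection);
        command.Parameters.AddWithValue("@machineName", Environment.MachineName);

        using var reader = command.ExecuteReader();
        while (reader.Read())
        {
            // republish puts the body back on the queue/topic; the row is only
            // marked completed once the message has actually been processed.
            republish(reader.GetString(0));
        }
    }
}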
Consider a web application that implements every database action except querying (i.e. add, update, remove) as an NServiceBus message, so that whenever a user calls a web API, the back-end maps it to an await endpointInstance.Request call and returns the response on the same HTTP connection.
The challenge is when a message handler needs to send some other messages and wait for their responses to finish its job. NServiceBus does not allow calling Request inside a message handler.
I ended up using a saga to implement message handlers that rely on other handlers' responses. But the problem with the saga is that I can't send the result back in the same HTTP request, because the saga uses the publish/subscribe pattern.
All our web APIs need to respond within the same HTTP request (the connection should be kept open until the result is received or a timeout occurs).
Is there any clean solution (preferably without using Saga)?
An example scenario:
the user calls http://test.com/purchase?itemId=5&paymentId=133
the web server calls await endpointInstance.Request<PurchaseResult>(new PurchaseMessage(itemId, paymentId));
the PurchaseMessage handler should call await endpointInstance.Request<AddPaymentResult>(new AddPaymentMessage(paymentId));
if the AddPaymentResult is successful, store the purchase details in the database and return true as the PurchaseResult; otherwise return false
You're trying to achieve something that we (at Particular Software) are trying to actively prevent. Let me explain.
With Remote Procedure Calls (RPC) you call another component out-of-process. That is what makes the procedure call 'remote'. Whereas with regular programming you do everything in-process and it is blazing fast, with RPC you have the overhead of serialization, latency, and more. Basically, you have to deal with the fallacies of distributed computing.
Still, people do it for various reasons. Sometimes it's because you want to use a Web API (or an 'old fashioned' web service) that offers functionality you don't want to develop yourself. The oldest example in the book is searching for an address by postal code. Or deducting money from someone's bank account. If you're building a CRM, you can use these remote components. These days a lot of people build distributed monoliths because they are taught at conferences that this is a good thing. In an architecture diagram it looks really nice, but there's still temporal coupling that can cause a lot of headaches.
Some of these headaches come from the fact that you're trying to do stuff in an atomic action. Back in the day, with in-process calls to code/classes/etc., this was easy and fast, until you hit limitations like tons of locks on a database.
A solution to this is asynchronous communication. You send some information via fire-and-forget. This solves the temporal coupling. Instead of having a database that is getting dozens and dozens of requests to update data and, as a result, a website that is grinding to a halt, you have various options to make sure this doesn't happen. This is a really good thing, because instead of a single atomic operation you have various smaller operations and many ways to distribute work, scale your system, and so on.
It also brings additional challenges, because not everyone is able to work with fire-and-forget. Some systems that were already built try to introduce asynchronous communication via messaging (and hopefully NServiceBus). Some parts can work flawlessly with this, but other parts can't, mainly the user interface (UI), because it was built to get an immediate result. So when you send a message from the UI, you expect a result!
With NServiceBus we've built a package called "Client-Side Callbacks" to make exactly this possible. We highly recommend our customers not to use it, except for the specific scenario I just described. It is much better to migrate your entire UI to deal with the fact that you don't receive an immediate answer, but we understand this is so much work that not many will be able to achieve it.
However, once that first message has been sent and the UI has received a result, there is no need to use callbacks anymore. So I'd like to propose this scenario:
the user calls http://test.com/purchase?itemId=5&paymentId=133
the web server calls await endpointInstance.Request<PurchaseResult>() as before;
the PurchaseMessage handler retrieves the info it needs, sends or publishes a message to (an)other component(s), and then replies back to the web server with an answer;
the next handler works with the sent/published message and continues the process.
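In code, roughly, using the message types from your scenario; the property shapes and handler names are illustrative:

using System.Threading.Tasks;
using NServiceBus;

public class PurchaseMessage : ICommand { public int ItemId { get; set; } public int PaymentId { get; set; } }
public class AddPaymentMessage : ICommand { public int PaymentId { get; set; } }
public class PurchaseResult : IMessage { public bool Accepted { get; set; } }

public class PurchaseMessageHandler : IHandleMessages<PurchaseMessage>
{
    public async Task Handle(PurchaseMessage message, IMessageHandlerContext context)
    {
        // Continue the rest of the process asynchronously in (an)other component(s)...
        await context.Send(new AddPaymentMessage { PaymentId = message.PaymentId });

        // ...and reply to the originating web server right away, so the HTTP
        // request can complete via the client-side callback.
        await context.Reply(new PurchaseResult { Accepted = true });
    }
}

public class AddPaymentMessageHandler : IHandleMessages<AddPaymentMessage>
{
    public Task Handle(AddPaymentMessage message, IMessageHandlerContext context)
    {
        // store the purchase/payment details, publish follow-up events, etc.
        return Task.CompletedTask;
    }
}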
Let us know if you need more information. You can always contact us by sending an email to support@particular.net.
At the minute I am trying to put together an asynchronous TCP server to receive data which I then want to process, extracting values and inserting them into SQL Server.
The basic concept I thought would work best is that once the data is received and confirmed as a complete message, it should be passed off to some sort of collection to await processing on a FIFO basis, which will parse the values and insert them into SQL Server. I suppose this is what's known as the producer/consumer pattern.
I have been looking into the best collection / way of doing this and have so far seen BlockingCollection, ConcurrentQueue, and BufferBlock with async/await, and I think this may be the way to go, but to be honest I'm not sure.
The best example I have found is on Stephen Cleary's blog, in particular this article:
http://blog.stephencleary.com/2012/11/async-producerconsumer-queue-using.html
My main reservation is that I in no way want to slow down or interrupt the receiving of messages, which to me suggests using the multiple-producer/consumer example that can be seen at the link above. But what I want to know is:
Am I correct in this assumption, or is there a more suitable way of doing this in my scenario?
And if I am correct in my assumption, could anyone suggest the best way of implementing it, taking my use case into consideration?
Any and all help is much appreciated.
There's a common pitfall with this kind of scenario. It is usually wrong to report success back to the client when the work has yet to be done. Most of the time I've seen this design, it's because of an efficiency "requirement" self-imposed by the developer, not by the client or for technical reasons. So first, take a step back and make absolutely sure that you do want to return a "successful completion" message to the client when the operation has not actually completed yet.
If you are sure that's what you want to do, then there's another question you must ask: is it acceptable to lose requests? That is, after you tell the client that the operation successfully completed, will the system still be stable if the operation does not actually ever complete?
The answer to that question is usually "no." At that point, the most common architectural solution is to have an out-of-process reliable queue (such as an Azure queue or MSMQ), with an independent backend (such as an Azure worker role or Win32 service) that processes the queue messages. This definitely complicates the architecture, but it is a necessary complication if the system must return completion messages early and must not lose messages.
On the other hand, if losing messages is acceptable, then you can keep them in-memory. It is only in this case that you can use one of the in-memory producer/consumer types mentioned on my blog. This is a very rare situation, but it does happen from time to time.
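If you do land in that rare in-memory case, a minimal sketch with TPL Dataflow's BufferBlock (one of the types covered on the blog) might look roughly like this; the parsing/insert step is just a placeholder:

using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;   // NuGet: System.Threading.Tasks.Dataflow

var queue = new BufferBlock<string>();

// Producer side: the socket-receive loop posts each complete message.
async Task ProduceAsync(string completeMessage) =>
    await queue.SendAsync(completeMessage);

// Consumer side: drains the queue in FIFO order until Complete() is called.
async Task ConsumeAsync()
{
    while (await queue.OutputAvailableAsync())
    {
        var message = await queue.ReceiveAsync();
        // parse the values and insert them into SQL Server here
        Console.WriteLine($"processed: {message}");
    }
}

await ProduceAsync("example");
queue.Complete();
await ConsumeAsync();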
In general, I would avoid using BlockingCollection and friends for this sort of work. Doing so encourages you to architect the entire system into a single process, which is the enemy of scalability and reliability.
I second Stephen Cleary's suggestion of using an out-of-process queue to manage the work. I disagree that this necessarily complicates the architecture, though - in fact, I think it can make things quite a bit simpler. Specifically, a major complication of the original requirement ("put together an asynchronous tcp server") disappears. Asynchronous TCP servers are a pain in the butt to write and easy to screw up - why not just skip that part altogether and be free to focus all of your energy on the post-processing code?
When I built a system like this, I used a Redis list as the task queue. Tasks were serialized to JSON, and clients would add their task to the queue with an RPUSH command. Worker processes retrieve the next task from the queue with BLPOP, do their thing, then go back to waiting for the next task.
Advantages:
No locks. All synchronization comes for free from Redis (or whatever task queue you choose).
Everything in the system is single-threaded. Multi-threading is hard.
I'm free to spin up as many worker processes as I want, across as many nodes as I want.
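If you're on .NET, the sketch translates roughly to this with StackExchange.Redis (that client multiplexes connections, so it exposes a non-blocking ListLeftPop rather than BLPOP and the worker below polls instead of blocking; the key name and task shape are made up):

using System;
using System.Text.Json;
using System.Threading;
using StackExchange.Redis;

var redis = ConnectionMultiplexer.Connect("localhost");
var db = redis.GetDatabase();

// Producer: serialize the task and push it onto the list (RPUSH).
void Enqueue(object task) =>
    db.ListRightPush("task-queue", JsonSerializer.Serialize(task));

// Worker: pop the next task (LPOP) and process it; each worker stays single-threaded.
void RunWorker(CancellationToken token)
{
    while (!token.IsCancellationRequested)
    {
        RedisValue next = db.ListLeftPop("task-queue");
        if (next.IsNullOrEmpty)
        {
            Thread.Sleep(100);   // nothing queued; a BLPOP-capable client would block here
            continue;
        }
        // deserialize, do the work, then loop for the next task
    }
}

Enqueue(new { ReportId = 42 });
RunWorker(new CancellationTokenSource(TimeSpan.FromSeconds(1)).Token);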
Reading through the pub/sub sample project in MassTransit left me scratching my head.
In the sample, the client application publishes a request for the subscriber application to update the password of a fictitious user. This sample code works fine, and it's easy to follow the bouncing ball of this project.
HOWEVER--
In a real-world environment, the purpose of pub/sub (in my understanding) is to have a small number of publishers interacting with a large number of subscribers. In the case of a subscriber performing any sort of CRUD operation, shouldn't the communication pattern prevent more than one subscriber from handling the message? It would be less than desirable to have twenty subscribers attempt to update the same database record, for instance.
Is this just a case of a misguided sample project?
If pub/sub can be used for CRUD operations, how do you configure the framework to only allow one subscriber to perform an operation?
Am I just completely missing some basic info on the purpose of pub/sub?
Thanks for any clarification provided...
David
The scenario you refer to is usually called 'competing consumers', and is quite typical of pub/sub.
If each consumer has its own unique queue name, each consumer will receive its own copy of every message.
Alternatively, to get competing-consumer behaviour, the consumers share the same queue name; they then compete for each message, so each message is received only once.
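In MassTransit the queue name is what you control; a rough sketch (the endpoint, consumer, and message names are illustrative, and the exact host/endpoint configuration syntax varies a little between versions):

using System;
using System.Threading.Tasks;
using MassTransit;

public record UpdatePassword(string UserName, string NewPassword);

public class UpdatePasswordConsumer : IConsumer<UpdatePassword>
{
    public Task Consume(ConsumeContext<UpdatePassword> context)
    {
        // perform the single CRUD operation here
        return Task.CompletedTask;
    }
}

public class BusSetup
{
    public static IBusControl Configure() =>
        Bus.Factory.CreateUsingRabbitMq(cfg =>
        {
            cfg.Host("localhost");

            // Every instance that uses the SAME queue name competes for messages:
            // each published UpdatePassword is handled by exactly one of them.
            cfg.ReceiveEndpoint("update-password", e =>
            {
                e.Consumer<UpdatePasswordConsumer>();
            });

            // An endpoint with a DIFFERENT queue name would instead receive its
            // own copy of every message (true broadcast).
        });
}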
You can have n-to-n, many-to-few, or few-to-many publishers to subscribers in any pub/sub system. It's really a matter of how many actors you want responding to a given message.
The sample project might not be the best, but we feel it shows what's going on. In real-world cases it can be used for CRUD-type behaviours; however, it's more along the lines of many front ends sending "load data" type messages to middleware (a cache) requesting a response with the same data. If that data gets updated on the front end somehow, the front end must publish a message indicating that, and multiple middleware pieces need to update (cache, backend store, etc.). [see CQRS]
Messaging in general is more about working with disconnected systems. Your specific question is more about the structure of consumers and publishers. I've seen implementations of MassTransit where most of the routes were static and it wasn't really pub/sub at all, just a lot of sends along a known topology of systems. For really understanding the concepts, the best book I know of is Enterprise Service Bus: Theory in Practice.
I hope this helps!
Edit: Also see our documentation, some of the concepts are touched on there.
I'm working on an application that may generate thousands of messages in a fairly tight loop on a client, to be processed on a server. The chain of events is something like:
Client processes item, places in local queue.
Local queue processing picks up messages and calls web service.
Web service creates message in service bus on server.
Service bus processes message to database.
The idea is that all communication is asynchronous, as there will be many clients for the web service. I know that MSMQ can do this directly, but we don't always have that kind of admin capability on the clients to set up things like security, etc.
My question is about the granularity of the messages at each stage. The simplest method would mean that each item processed on the client generates one client message / web service call / service bus message. That's fine, but I know it's better to batch up the web service calls if possible, except there's a tradeoff between large-granularity web service DTOs and short-running transactions on the database. This particular scenario does not require a "business transaction" in which all items are processed or none are; I'm just looking for the best balance of message size vs. number of web service calls vs. database transactions.
Any advice?
Chatty interfaces (i.e. lots and lots of messages) tend to have a high overhead from dispatching the incoming message (and, on the client, the reply) to the correct code to process it; this is a fixed cost per message. Big messages, on the other hand, tend to spend the resources on processing the message itself.
Additionally, a lot of web service calls in progress means a lot of TCP/IP connections to manage, and concurrency (including locking in the database) might become an issue.
But without some details of the processing of the message it is hard to be specific, other than the general advice against chatty interfaces because of the fixed overheads.
Measure first, optimize later. Unless you can make a back-of-the-envelope estimate that shows that the simplest solution yields unacceptably high loads, try it, establish good supervisory measurements, see how it performs and scales. Then start thinking about how much to batch and where.
This approach, of course, requires you to be able to change the web service interface after deployment, so you need a versioning approach to deal with clients which may not have been redesigned, supporting several WS versions in parallel. But not thinking about versioning almost always traps you in suboptimal interfaces, anyway.
Abstract the message queue
and have a swappable message queue backend. This way you can test many backends and give yourself an easy bail-out should you pick the wrong one, or grow to like a new one that appears. The overhead of messaging is usually in packing and handling the request. Different systems are designed for different levels of traffic and different symmetries over time.
If you abstract out the basic features, you can swap the mechanics in and out as your needs change, or are more accurately assessed.
You can also translate messages between different queue types at various points in the application or message route as the load on a recipient changes, for example a tier handling 1000 messages per second versus 10 per second at a higher level.
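Something as small as the following is usually enough of a seam; the interface is the point, and the names are just an example:

using System.Threading.Tasks;

// The application codes against this seam only; MSMQ, a cloud queue, Redis,
// or an in-memory stub for tests can sit behind it and be swapped later.
public interface IMessageQueue
{
    Task SendAsync<T>(T message);
    Task<T?> TryReceiveAsync<T>();
}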
Good Luck