I understand that RabbitMQ with ack, by default, will re-queue the message if it detects that the consumer/worker has died.
What about the situation where the consumer/worker is still alive but its process has stalled out for too long and didn't ack?
I would like to set an explicit timeout so that if a message has been dispatched to a consumer and that consumer holds it without an ack for too long, the message gets re-queued.
I recognize that this might result in messages being processed in duplicate, but sometimes the consequence of that is not as bad as delayed message delivery.
This can also happen with errant exception handling: if an exception gets swallowed and the task terminates, the message is never ack'd and never re-queued.
A timeout for a RabbitMQ consumer can be set explicitly on the consumer side. I think this is clear, but just to mention it: there must not be any automatic ACKs in this case. One solution is a multithreaded consumer in which one thread processes the message and ACKs it only after processing has finished, while a second, timeout thread does one of the following once the timeout has expired:
terminate the connection to the broker, as a consequence of which the message would be requeued
ACK the received message and re-publish it explicitly
NACK the received message; per the documentation, a NACK instructs the broker to either discard the message or requeue it, so the requeue flag has to be set on the NACK for the message to go back on the queue
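The two-thread arrangement described above can be sketched roughly as follows. This is a minimal, broker-agnostic sketch, not code for any particular client library: `consume_with_timeout`, `handler`, `ack`, and `on_timeout` are hypothetical names, and the two callbacks stand in for whatever your client uses to acknowledge a message and to close the connection, re-publish, or NACK with requeue.

```python
import threading
import time


def consume_with_timeout(message, handler, ack, on_timeout, timeout_s):
    """Process one message in a worker thread, guarded by a timeout.

    `ack` and `on_timeout` are hypothetical callbacks standing in for broker
    operations (acknowledge; close connection / re-publish / NACK with requeue).
    Returns "acked", "timed_out", or "failed" (handler raised before the timeout).
    """
    lock = threading.Lock()
    outcome = []                      # set exactly once: "acked" or "timed_out"
    finished = threading.Event()

    def settle(name, action):
        # Only the first caller wins, so a late handler can never ack a
        # message that the timeout path has already given back to the broker.
        with lock:
            if outcome:
                return
            outcome.append(name)
            action(message)

    def work():
        try:
            handler(message)
            settle("acked", ack)      # ack only after successful processing
        finally:
            finished.set()

    threading.Thread(target=work, daemon=True).start()

    # The calling thread plays the role of the timeout thread.
    if not finished.wait(timeout_s):
        settle("timed_out", on_timeout)
    return outcome[0] if outcome else "failed"


if __name__ == "__main__":
    result = consume_with_timeout(
        "job-1",
        handler=lambda m: time.sleep(0.5),          # simulated stalled handler
        ack=lambda m: print("ack", m),
        on_timeout=lambda m: print("requeue", m),
        timeout_s=0.1,
    )
    print(result)
```

Note that the stalled handler thread is not killed here; it is only prevented from acknowledging late. Many client libraries also restrict which thread may use a channel or connection, so check that before calling the real ack or NACK from a second thread.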
All of this assumes that at least some part of the process isn't stuck. If the whole process is stuck, perhaps the heartbeat between the broker and the consumer stops, and that is how the broker learns that the consumer has died (honestly, I didn't test this situation, so I'm assuming). If that is not the case, or simply to be extra safe, you could add some kind of watchdog process that pings the consumer(s) and kills them if there's no reply, which again would leave the messages un-ACKed and cause them to be requeued.
Related
I have a data queue with numbered messages that must be processed in order. When the subscriber receives a message (basicconsume with manual ack) and determines that it missed a message, I'm trying to do the following:
1. Nack the message with requeue
2. Stop listening to data queue (basiccancel)
3. Start listening to error queue to get missed message, process msg (basicconsume)
4. Stop listening to error queue (basiccancel)
5. Start listening to data queue (basicconsume)
When I do this, the NACKed message (#1) is immediately picked back up before the stop listening step (#2) takes effect. It seems like this should be straightforward and I'm just missing something. I'd like to avoid putting the NACKed message into a third queue, but it is what it is.
Thanks
I have a situation similar to the one described here, but I cannot comment there because I have only just registered on this site.
A workaround for "pausing" with SetNumberOfWorkers(0) works in most cases. However, if SetNumberOfWorkers(0) is called during a lengthy message handler, I receive the following error at the end of the message handler:
An error occurred when attempting to complete the transaction context
Rebus.Exceptions.RebusApplicationException: Could not complete message with ID <...> and lock token <...> ---> Microsoft.Azure.ServiceBus.MessageLockLostException: The lock supplied is invalid. Either the lock expired, or the message has already been removed from the queue.
Note that "Worker Rebus 1 worker 1 stopped" messages are received for all workers almost immediately after calling SetNumberOfWorkers(0), even though the handler is still running.
After bringing the number of workers back to normal, all further messages throw a similar error at the end of the handler.
Any advice on how to pause Rebus correctly?
(I need to pause because my microservice has to update some resources periodically, and handlers can't run during those updates.)
I'd like to write a parallel execution module based on Solace, using the request-reply pattern.
I have:
Multiple message consumers, which publish messages into the same queue.
Multiple message producers, which read the queue and create reply messages.
Message execution time is between 10 seconds and 10 minutes.
Queue access type is non-exclusive (i.e. it does round-robin between all consumers).
Each producer and consumer is asynchronous, i.e. the Solace API blocks execution only while connecting.
What I'd like to have: while a producer is working on a message, it should not receive any other messages. This is extremely important, because some tasks block an executor for several minutes, while other executors may be free again after a couple of seconds.
The scheme below could work, but it contains a blocking call that I'd like to avoid.
while (true)
{
    var inputMessage = flow.ReceiveMsg( /*timeout 1s*/1_000); // <--- blocking code, I'd like to avoid it
    flow.Ack(inputMessage.ADMessageId);
    var reply = await ProcessMessageAsync(inputMessage); // execute plus handle exceptions
    session.SendReply(inputMessage, reply);
}
Messages are only pushed to the consuming applications.
That being said, your desired behavior can be obtained by setting the "max-delivered-unacked-msgs-per-flow" on your queue to 1.
This means that each consumer bound to the queue is only allowed to have one outstanding unacknowledged message.
The next message will only be sent to that consumer after it has acknowledged the previous one.
Details about this feature can be found here.
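To illustrate why a limit of one unacknowledged message per flow gives the desired behavior, here is a toy model in Python. It does not use the Solace API at all: `ToyQueue` and its methods are hypothetical names invented for this sketch, and the dispatch loop is only an approximation of how a broker distributes messages on a non-exclusive queue.

```python
from collections import deque


class ToyQueue:
    """Toy model of a non-exclusive queue with a per-consumer unacked limit."""

    def __init__(self, max_unacked_per_flow=1):
        self.max_unacked = max_unacked_per_flow
        self.pending = deque()   # messages not yet delivered to anyone
        self.unacked = {}        # consumer name -> delivered, unacked messages

    def bind(self, consumer):
        self.unacked[consumer] = []
        self._dispatch()

    def publish(self, message):
        self.pending.append(message)
        self._dispatch()

    def ack(self, consumer, message):
        self.unacked[consumer].remove(message)
        self._dispatch()         # the ack frees capacity for the next message

    def _dispatch(self):
        # Hand pending messages only to consumers that are below their limit.
        progress = True
        while self.pending and progress:
            progress = False
            for held in self.unacked.values():
                if self.pending and len(held) < self.max_unacked:
                    held.append(self.pending.popleft())
                    progress = True


if __name__ == "__main__":
    q = ToyQueue(max_unacked_per_flow=1)
    q.bind("A")
    q.bind("B")
    for m in ("m1", "m2", "m3"):
        q.publish(m)
    q.ack("B", "m2")             # B finishes quickly, A is still busy with m1
    print(q.unacked)
```

In this model the third message waits in the queue until one of the consumers acknowledges, and it then goes to whichever consumer became free first instead of piling up behind a long-running task.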
Do note that your code snippet does not appear to be valid.
IFlow.ReceiveMsg is only used in transacted sessions, which makes use of ITransactedSession.Commit to acknowledge messages.
The template code when you create a worker role with a queue client provides a message pump implementation. The code has a comment in it saying:
// Initiates the message pump and callback is invoked for each message that is received, calling close on the client will stop the pump.
sourceClient.OnMessage(received =>
{
//blah blah implementation
});
What actually happens when you call close() on the sourceClient? Do messages that are currently being processed continue? I.e. is this a graceful shutdown of the message pump? Or will calling close affect messages that are currently being processed by the message pump?
The documentation would lead me to believe it is, but there is this outstanding feedback item which would imply that there is no graceful shutdown mechanism for a message pump: https://feedback.azure.com/forums/216926-service-bus/suggestions/4345733-provide-gracefull-shutdown-feature-to-message-pump
So what does sourceClient.close() actually do?
In the full framework client (WindowsAzure.ServiceBus), QueueClient does not stop the message pump gracefully. Messages in flight that were not completed will have their delivery count increased.
So what does sourceClient.close() actually do?
That client is a closed-source project. Your best bet would be to raise an issue for it here.
How do I configure MassTransit to retry context.Publish() before failing, for example when the RabbitMQ server is temporarily unavailable?
The problem with retry in this context is that the only real reason a Publish call would fail is if the broker connection was lost (for any reason: network, etc.).
In that case, the connection which was used to receive the message is also lost, meaning that another node connected to the broker may have already picked up the message. So a retry in this case would be bad, since it would reconnect to the broker and send, but then the message could not be acknowledged (since it was likely picked up on another thread/worker).
The usual course of action here is to let it fail, and when the receive endpoint reconnects, the message will be redelivered to a consumer which will then call Publish and reach the desired outcome.
You should make sure that your consumer can handle this (search for idempotent) properly to avoid a failure causing a break in your business logic.
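As a rough illustration of what "idempotent" means for a consumer, here is a minimal Python sketch that skips messages whose ID has already been processed. It is not MassTransit code: `IdempotentHandler` and its in-memory `_seen` set are hypothetical, and a real service would need a durable store shared by all consumer instances.

```python
class IdempotentHandler:
    """Wrap a handler so that a redelivered message is processed only once."""

    def __init__(self, handler):
        self._handler = handler
        self._seen = set()   # assumption: in-memory only; use a durable store in practice

    def handle(self, message_id, body):
        """Return True if the message was processed, False if it was a duplicate."""
        if message_id in self._seen:
            return False     # redelivery of a message we already handled
        self._handler(body)
        # Record the ID only after success, so a failed attempt can be retried.
        self._seen.add(message_id)
        return True


if __name__ == "__main__":
    processed = []
    consumer = IdempotentHandler(processed.append)
    print(consumer.handle("id-1", "order created"))   # first delivery
    print(consumer.handle("id-1", "order created"))   # redelivery after a reconnect
    print(processed)
```

With this shape, a redelivery after a lost connection re-runs the handler only if the earlier attempt never completed.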
Updated Jan 2022: Since v7, MassTransit retries all publish/send calls until the cancellationToken is canceled.