.NET RabbitMQ client Subscriber.Next hangs - C#

I am using the RabbitMQ .NET client in a Windows service. I have millions of messages coming in bulk, which get processed and whose output is then put on another queue. I create the connection factory with a heartbeat of 30 and create a new connection whenever a connection or subscriber is lost. In production my code probably works in most cases; in my integration tests, however, I know it fails most of the time. Here is my code:
public void ReceiveAll(Func<IDictionary<ulong, byte[]>, IOnStreamWatchResult> onReceiveAllCallback, int batchSize, CancellationToken cancellationToken)
{
    IModel channel = null;
    Subscription subscription = null;
    while (!cancellationToken.IsCancellationRequested)
    {
        if (subscription == null || subscription.Model.IsClosed)
        {
            channel = _channelFactory.CreateChannel(ref _connection, _messageQueueConfig, _connectionFactory);
            // This instructs the channel to not prefetch more than batch count into shared queue
            channel.BasicQos(0, Convert.ToUInt16(batchSize), false);
            subscription = new Subscription(channel, _messageQueueConfig.Queue, false);
        }
        try
        {
            BasicDeliverEventArgs message;
            var dequeuedMessages = new Dictionary<ulong, byte[]>();
            do
            {
                if (subscription.Next(_messageQueueConfig.DequeueTimeout.Milliseconds, out message))
                {
                    if (message == null)
                    {
                        // This means channel is closed and the messages in shared queue would get moved back to ready state
                        DisposeChannelAndSubcription(ref channel, ref subscription);
                        ReceiveAll(onReceiveAllCallback, batchSize, cancellationToken);
                    }
                    else
                    {
                        dequeuedMessages.Add(message.DeliveryTag, message.Body);
                    }
                }
            } while (message != null && batchSize > dequeuedMessages.Count && !cancellationToken.IsCancellationRequested);
            if (cancellationToken.IsCancellationRequested)
            {
                if (dequeuedMessages.Any())
                {
                    NackUnProcessedMessages(subscription, dequeuedMessages.Keys);
                }
                DisposeChannelAndSubcription(ref channel, ref subscription);
                dequeuedMessages.Clear();
                break;
            }
            try
            {
                var onStreamWatchResult = onReceiveAllCallback(dequeuedMessages);
                AckProcessedMessages(subscription, onStreamWatchResult.Processed);
                NackUnProcessedMessages(subscription, onStreamWatchResult.UnProcessed);
                dequeuedMessages.Clear();
            }
            catch (Exception unhandledException)
            {
                NackUnProcessedMessages(subscription, dequeuedMessages.Keys);
            }
        }
        catch (EndOfStreamException endOfStreamException)
        {
            DisposeChannelAndSubcription(ref channel, ref subscription);
        }
        catch (OperationInterruptedException operationInterruptedException)
        {
            DisposeChannelAndSubcription(ref channel, ref subscription);
        }
    }
}
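(The _channelFactory.CreateChannel helper used above is not shown in the question. A hypothetical sketch of what it might do, based on the description at the top (a heartbeat of 30 and a fresh connection whenever the old one is lost), assuming the older RabbitMQ.Client API where RequestedHeartbeat is a ushort number of seconds:)
// Hypothetical sketch only; MessageQueueConfig stands in for whatever config type the question uses.
public IModel CreateChannel(ref IConnection connection, MessageQueueConfig messageQueueConfig, ConnectionFactory connectionFactory)
{
    if (connection == null || !connection.IsOpen)
    {
        connectionFactory.RequestedHeartbeat = 30; // seconds in the older ushort-based API
        connection = connectionFactory.CreateConnection();
    }
    return connection.CreateModel();
}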
The batch size is set to 4 because I put 4 messages on the queue in my integration test, which is just a Windows service that I run after the unit tests.
The issue is that almost always the subscriber pre-fetches 4 messages as expected and returns true for the first two Next iterations, but after that it returns false. I believe that is happening because my messages are not getting nacked properly. In my integration test I ack 2 messages and nack 2 messages, and then read the 2 nacked messages again to clear the queue. However, after nacking, the messages are not returned to the ready state, and so the test hangs. What am I doing wrong here? Am I misunderstanding something in the nacking documentation? Here is my nacking code:
subscription.Model.BasicNack(deliveryTag, false, true);
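(The NackUnProcessedMessages helper used in the loop above is not shown either; a hypothetical version that nacks each delivery tag individually with requeue might look like this:)
// Hypothetical helper, not part of the original question: nack each unprocessed
// delivery tag, asking the broker to requeue it (put it back into the ready state).
private void NackUnProcessedMessages(Subscription subscription, IEnumerable<ulong> deliveryTags)
{
    foreach (var deliveryTag in deliveryTags)
    {
        // multiple: false (only this tag), requeue: true
        subscription.Model.BasicNack(deliveryTag, false, true);
    }
}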

Related

TCP server connection causing processor overload

I have a TCP/IP server that is supposed to allow a connection to remain open as messages are sent across it. However, it seems that some clients open a new connection for each message, which causes CPU usage to max out. I tried to fix this by adding a time-out, but I still see the problem occasionally. I suspect that my solution was not the best choice, but I'm not sure what would be.
Below is my basic code with logging, error handling and processing removed.
private void StartListening()
{
    try
    {
        _tcpListener = new TcpListener( IPAddress.Any, _settings.Port );
        _tcpListener.Start();
        while (DeviceState == State.Running)
        {
            var incomingConnection = _tcpListener.AcceptTcpClient();
            var processThread = new Thread( ReceiveMessage );
            processThread.Start( incomingConnection );
        }
    }
    catch (Exception e)
    {
        // Unfortunately, a SocketException is expected when stopping AcceptTcpClient
        if (DeviceState == State.Running) { throw; }
    }
    finally { _tcpListener?.Stop(); }
}
I believe the actual issue is that multiple process threads are being created, but are not being closed. Below is the code for ReceiveMessage.
private void ReceiveMessage( object IncomingConnection )
{
    var buffer = new byte[_settings.BufferSize];
    int bytesReceived = 0;
    var messageData = String.Empty;
    bool isConnected = true;
    using (TcpClient connection = (TcpClient)IncomingConnection)
    using (NetworkStream netStream = connection.GetStream())
    {
        netStream.ReadTimeout = 1000;
        try
        {
            while (DeviceState == State.Running && isConnected)
            {
                // An IOException will be thrown and captured if no message comes in each second. This is the
                // only way to send a signal to close the connection when shutting down. The exception is caught,
                // and the connection is checked to confirm that it is still open. If it is, and the Router has
                // not been shut down, the server will continue listening.
                try { bytesReceived = netStream.Read( buffer, 0, buffer.Length ); }
                catch (IOException e)
                {
                    if (e.InnerException is SocketException se && se.SocketErrorCode == SocketError.TimedOut)
                    {
                        bytesReceived = 0;
                        if (GlobalSettings.IsLeaveConnectionOpen)
                            isConnected = GetConnectionState(connection);
                        else
                            isConnected = false;
                    }
                    else
                        throw;
                }
                if (bytesReceived > 0)
                {
                    messageData += Encoding.UTF8.GetString( buffer, 0, bytesReceived );
                    string ack = ProcessMessage( messageData );
                    var writeBuffer = Encoding.UTF8.GetBytes( ack );
                    if (netStream.CanWrite) { netStream.Write( writeBuffer, 0, writeBuffer.Length ); }
                    messageData = String.Empty;
                }
            }
        }
        catch (Exception e) { ... }
        finally { FileLogger.Log( "Closing the message stream.", Verbose.Debug, DeviceName ); }
    }
}
For most clients the code runs correctly, but there are a few that seem to create a new connection for each message. I suspect that the issue lies in how I handle the IOException. For the systems that fail, the code does not seem to reach the finally block until 30 seconds after the first message comes in, and each message creates a new ReceiveMessage thread. So the logs show messages coming in, and 30 seconds in they start to show multiple messages about the message stream being closed.
Below is how I check the connection, in case this is important.
public static bool GetConnectionState( TcpClient tcpClient )
{
    var state = IPGlobalProperties.GetIPGlobalProperties()
        .GetActiveTcpConnections()
        .FirstOrDefault( x => x.LocalEndPoint.Equals( tcpClient.Client.LocalEndPoint )
                              && x.RemoteEndPoint.Equals( tcpClient.Client.RemoteEndPoint ) );
    return state != null ? state.State == TcpState.Established : false;
}
You're reinventing the wheel (in a worse way) at quite a few levels:
You're doing pseudo-blocking sockets. That, combined with creating a whole new thread for every connection in an OS like Linux which doesn't have real threads, can get expensive fast. Instead you should create a pure blocking socket with no read timeout (-1) and just listen on it. Unlike UDP, TCP will catch the connection being terminated by the client without you needing to poll for it.
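(For illustration, a minimal sketch of that blocking approach, with placeholder names: no read timeout, and a plain Read loop where a return value of 0 means the client closed the connection.)
// Sketch: fully blocking read loop. Read blocks until data arrives or the peer
// closes the connection; it returns 0 on close, so no timeout polling is needed.
private void ReceiveMessageBlocking( object incomingConnection )
{
    var buffer = new byte[4096];
    using (var connection = (TcpClient)incomingConnection)
    using (var netStream = connection.GetStream())
    {
        netStream.ReadTimeout = Timeout.Infinite; // -1, which is also the default
        int bytesReceived;
        while ((bytesReceived = netStream.Read( buffer, 0, buffer.Length )) > 0)
        {
            // process bytesReceived bytes from buffer here
        }
        // falling out of the loop means the client closed the connection
    }
}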
And the reason you seem to be doing the above is that you're reinventing the standard TCP Keep-Alive mechanism. It's already written and works efficiently, so simply use it. And as a bonus, the standard Keep-Alive mechanism sits on the client side, not the server side, so there's even less processing for you.
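(Enabling it is a one-line socket option on the side that should send the probes; for example, on the client. A sketch not tied to the poster's code, with a hypothetical endpoint:)
// Sketch: turn on the built-in TCP keep-alive mechanism for a client socket.
// Probe interval and timeout fall back to the OS defaults in this minimal form.
var client = new TcpClient();
client.Connect( "server.example.com", 5000 ); // hypothetical endpoint
client.Client.SetSocketOption( SocketOptionLevel.Socket, SocketOptionName.KeepAlive, true );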
Edit: a third point. You really need to cache the threads you so painstakingly create. The system thread pool won't suffice if you have that many long-lived connections with a single socket communication per thread, but you can build your own expandable thread pool. You can also share multiple sockets on one thread using select, but that's going to change your logic quite a bit.

Queue contains 2 Active Messages, But ReceiveBatch and Receive return null

I am trying to write some unit tests to verify all of my queue operations are working as expected, but I have run into the strangest situation:
I have the following code in my [TestInitialize] method:
var ns = NamespaceManager.CreateFromConnectionString(config.ServiceBusConnectionString);
var queueDescription = ns.GetQueue("happy-birthday");
Client = QueueClient.CreateFromConnectionString(config.QueueConnectionString, ReceiveMode.ReceiveAndDelete);
if (queueDescription.MessageCount > 0)
{
    while (Client.Peek() != null)
    {
        var msg = Client.Receive();
        msg.Complete();
    }
}
My queue has a few active messages (confirmed with the queueDescription object), and the Azure portal confirms there are two active messages that should be "completed" by the code above. However, Client.Receive() just stalls for a 30-second wait and then returns null.
I do not understand why Client.Peek() returns the message, but when I call Client.Receive() I get null back.
I identified that the problem was due to my assumption about what "Deferred" means.
I had assumed deferring was the way I put the same message back into the queue, when in fact deferred messages are set aside and must be processed directly by retrieving them by sequence number.
I was able to retrieve the message by following these steps:
Peek at the message
Confirm State == Deferred
Use the peeked message to get the SequenceNumber and receive it directly from the queue
Mark the message as complete
This is how I was able to get the message queue empty:
NameSpace = NamespaceManager.CreateFromConnectionString(ConnectionString);
var queueInfo = NameSpace.GetQueue("happy-birthday");
Client = QueueClient.CreateFromConnectionString(connectionString, "happy-birthday");
if (queueInfo.MessageCount > 0)
{
    var message = Client.Peek();
    while (message != null)
    {
        if (message.State == MessageState.Deferred)
        {
            message = Client.Receive(message.SequenceNumber);
        }
        else
        {
            message = Client.Receive();
        }
        message.Complete();
        message = Client.Peek();
    }
}

Detecting dropped connections

I have a server and many clients. The server needs to know when a client disconnects ungracefully (without sending a TCP FIN), so that it doesn't keep a hanging connection and the other disposable objects associated with that client.
Anyway, I read this and decided to go with adding a "keepalive message to the application protocol" (containing only header bytes) and an "explicit timer assuming the worst", as described in the linked blog.
When a client connects (by the way, I am using TcpListener and TcpClient), the server starts a System.Threading.Timer that counts down 30 seconds. Whenever the server receives something from that client, it resets the timer. When the timer reaches 0, it disconnects the client and disposes whatever it needs to dispose. The client application also has a timer, and when it hasn't sent anything for 15 seconds (half of the server's value, just to be sure), it sends the keepalive message.
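(For illustration, a minimal sketch of that reset-on-receive idle timer; the names are placeholders, and DisconnectClient stands in for whatever cleanup the server does:)
// Sketch: per-client idle timer that is reset every time something is received.
private Timer _idleTimer; // System.Threading.Timer
private static readonly TimeSpan IdleLimit = TimeSpan.FromSeconds(30);

private void StartIdleTimer(TcpClient client)
{
    // Fire once after 30 seconds of silence; no periodic signalling.
    _idleTimer = new Timer(_ => DisconnectClient(client), null, IdleLimit, Timeout.InfiniteTimeSpan);
}

private void OnDataReceived()
{
    // Real data or a keepalive arrived, so push the deadline out again.
    _idleTimer.Change(IdleLimit, Timeout.InfiniteTimeSpan);
}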
My question is: is there an easier way to achieve this? Maybe some option on TcpClient? I tried TcpClient.ReceiveTimeout, but that doesn't seem to work with ReadAsync.
As Stephen points out, using heartbeat messages in the application protocol is the only surefire method of ensuring that the connection is alive and that both applications are operating correctly. Be warned that many an engineer has created a heartbeat thread that continues to operate even when the application threads have failed.
Using the classes below will solve your asynchronous socket question.
public sealed class SocketAwaitable : INotifyCompletion
{
    private readonly static Action SENTINEL = () => { };

    internal bool m_wasCompleted;
    internal Action m_continuation;
    internal SocketAsyncEventArgs m_eventArgs;

    public SocketAwaitable(SocketAsyncEventArgs eventArgs)
    {
        if (eventArgs == null) throw new ArgumentNullException("eventArgs");
        m_eventArgs = eventArgs;
        eventArgs.Completed += delegate
        {
            var prev = m_continuation ?? Interlocked.CompareExchange(
                ref m_continuation, SENTINEL, null);
            if (prev != null) prev();
        };
    }

    internal void Reset()
    {
        m_wasCompleted = false;
        m_continuation = null;
    }

    public SocketAwaitable GetAwaiter() { return this; }

    public bool IsCompleted { get { return m_wasCompleted; } }

    public void OnCompleted(Action continuation)
    {
        if (m_continuation == SENTINEL ||
            Interlocked.CompareExchange(
                ref m_continuation, continuation, null) == SENTINEL)
        {
            Task.Run(continuation);
        }
    }

    public void GetResult()
    {
        if (m_eventArgs.SocketError != SocketError.Success)
            throw new SocketException((int)m_eventArgs.SocketError);
    }
}
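The class above only makes a SocketAsyncEventArgs awaitable; to actually read with it you also need a small extension method and a receive loop. A sketch of that companion code, following the same awaitable-socket pattern the class comes from (it has to live in the same assembly as SocketAwaitable because it touches the internal fields):
// Companion sketch, not part of the original answer.
public static class SocketExtensions
{
    public static SocketAwaitable ReceiveAsync(this Socket socket, SocketAwaitable awaitable)
    {
        awaitable.Reset();
        // If the receive completed synchronously, flag the awaitable as completed
        // so that awaiting it continues immediately.
        if (!socket.ReceiveAsync(awaitable.m_eventArgs))
            awaitable.m_wasCompleted = true;
        return awaitable;
    }
}

// Usage sketch inside some async method:
// var args = new SocketAsyncEventArgs();
// args.SetBuffer(new byte[4096], 0, 4096);
// var awaitable = new SocketAwaitable(args);
// while (true)
// {
//     await socket.ReceiveAsync(awaitable);
//     int bytesRead = args.BytesTransferred;
//     if (bytesRead <= 0) break; // 0 means the peer closed the connection
//     // process args.Buffer[0..bytesRead) here
// }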

Re-queue message on exception

I'm looking for a solid way of re-queuing messages that couldn't be handled properly - at this time.
I've been looking at http://dotnetcodr.com/2014/06/16/rabbitmq-in-net-c-basic-error-handling-in-receiver/ and it seems that requeuing messages is supported by the RabbitMQ API.
else // reject the message but push back to queue for later re-try
{
    Console.WriteLine("Rejecting message and putting it back to the queue: {0}", message);
    model.BasicReject(deliveryArguments.DeliveryTag, true);
}
However, I'm using EasyNetQ, so I'm wondering how I would do something similar there.
bus.Subscribe<MyMessage>("my_subscription_id", msg => {
    try
    {
        // do work... could be long running
    }
    catch (Exception)
    {
        // something went wrong - requeue message
    }
});
Is this even a good approach? Not ACKing the message could cause problems if the work takes longer than the ACK timeout enforced by the RabbitMQ server.
So I came up with this solution, which replaces the default error strategy used by EasyNetQ.
public class DeadLetterStrategy : DefaultConsumerErrorStrategy
{
    public DeadLetterStrategy(IConnectionFactory connectionFactory, ISerializer serializer, IEasyNetQLogger logger, IConventions conventions, ITypeNameSerializer typeNameSerializer)
        : base(connectionFactory, serializer, logger, conventions, typeNameSerializer)
    {
    }

    public override AckStrategy HandleConsumerError(ConsumerExecutionContext context, Exception exception)
    {
        object deathHeaderObject;
        if (!context.Properties.Headers.TryGetValue("x-death", out deathHeaderObject))
            return AckStrategies.NackWithoutRequeue;
        var deathHeaders = deathHeaderObject as IList;
        if (deathHeaders == null)
            return AckStrategies.NackWithoutRequeue;
        var retries = 0;
        foreach (IDictionary header in deathHeaders)
        {
            var count = int.Parse(header["count"].ToString());
            retries += count;
        }
        if (retries < 3)
            return AckStrategies.NackWithoutRequeue;
        return base.HandleConsumerError(context, exception);
    }
}
You replace it like this:
RabbitHutch.CreateBus("host=localhost", serviceRegister => serviceRegister.Register<IConsumerErrorStrategy, DeadLetterStrategy>())
You have to use the AdvancedBus, so you have to set everything up manually.
using (var bus = RabbitHutch.CreateBus("host=localhost", serviceRegister => serviceRegister.Register<IConsumerErrorStrategy, DeadLetterStrategy>()))
{
    var deadExchange = bus.Advanced.ExchangeDeclare("exchange.text.dead", ExchangeType.Direct);
    var textExchange = bus.Advanced.ExchangeDeclare("exchange.text", ExchangeType.Direct);
    var queue = bus.Advanced.QueueDeclare("queue.text", deadLetterExchange: deadExchange.Name);
    bus.Advanced.Bind(deadExchange, queue, "");
    bus.Advanced.Bind(textExchange, queue, "");
    bus.Advanced.Consume<TextMessage>(queue, (message, info) => HandleTextMessage(message, info));
}
This will dead-letter a failed message 3 times. After that it will go to the default error queue provided by EasyNetQ for error handling; you can subscribe to that queue.
A message is dead-lettered when an exception propagates out of your consumer method, so this handler would trigger a dead letter:
static void HandleTextMessage(IMessage<TextMessage> textMessage, MessageReceivedInfo info)
{
    throw new Exception("This is a test!");
}
To the best of my knowledge, there is no way to manually ack, nack or reject a message with EasyNetQ.
I see you have opened an issue ticket with the EasyNetQ team regarding this... but no answer yet.
FWIW, this is a very appropriate thing to do. All of the libraries that I use (in NodeJS) support this feature set, and it is common. I'm surprised EasyNetQ doesn't support it.

RabbitMQ unacked message not going back to queue for consumer to process again

I use RabbitMQ as my message queue server, with the .NET C# client.
When there is an error while processing a message from the queue, the message is not acknowledged and stays stuck in the queue without being processed again, which is how I understood the documentation.
I don't know if I am missing some configuration or block of code.
My idea now is to manually ack the message when an error occurs and then manually push the message back onto the queue again.
I hope there is a better solution.
Thank you so much.
My code:
public void Subscribe(string queueName)
{
    while (!Cancelled)
    {
        try
        {
            if (subscription == null)
            {
                try
                {
                    // try to open connection
                    connection = connectionFactory.CreateConnection();
                }
                catch (BrokerUnreachableException ex)
                {
                    // You probably want to log the error and cancel after N tries,
                    // otherwise start the loop over to try to connect again after a second or so.
                    log.Error(ex);
                    continue;
                }
                // create channel
                channel = connection.CreateModel();
                // This instructs the channel not to prefetch more than one message
                channel.BasicQos(0, 1, false);
                // Create a new, durable exchange
                channel.ExchangeDeclare(exchangeName, ExchangeType.Direct, true, false, null);
                // Create a new, durable queue
                channel.QueueDeclare(queueName, true, false, false, null);
                // Bind the queue to the exchange
                channel.QueueBind(queueName, exchangeName, queueName);
                // create subscription
                subscription = new Subscription(channel, queueName, false);
            }
            BasicDeliverEventArgs eventArgs;
            var gotMessage = subscription.Next(250, out eventArgs); // 250 milliseconds
            if (gotMessage)
            {
                if (eventArgs == null)
                {
                    // This means the connection is closed.
                    DisposeAllConnectionObjects();
                    continue; // move on to a new iteration
                }
                // process message
                channel.BasicAck(eventArgs.DeliveryTag, false);
            }
        }
        catch (OperationInterruptedException ex)
        {
            log.Error(ex);
            DisposeAllConnectionObjects();
        }
    }
    DisposeAllConnectionObjects();
}
private void DisposeAllConnectionObjects()
{
    // dispose subscription
    if (subscription != null)
    {
        // IDisposable is implemented explicitly for some reason.
        ((IDisposable)subscription).Dispose();
        subscription = null;
    }
    // dispose channel
    if (channel != null)
    {
        channel.Dispose();
        channel = null;
    }
    // check if connection is not null and dispose it
    if (connection != null)
    {
        try
        {
            connection.Dispose();
        }
        catch (EndOfStreamException ex)
        {
            log.Error(ex);
        }
        catch (OperationInterruptedException ex) // errors can be thrown while disposing the connection
        {
            log.Error(ex);
        }
        catch (Exception ex)
        {
            log.Error(ex);
        }
        connection = null;
    }
}
I think you may have misunderstood the RabbitMQ documentation. If a message does not get ack'ed by the consumer, the Rabbit broker will requeue the message onto the queue for consumption.
I don't believe your suggested method of ack'ing and then requeuing a message is a good idea; it will just make the problem more complex.
If you want to explicitly "reject" a message because the consumer had a problem processing it, you could use the Nack feature of Rabbit.
For example, within your catch exception blocks, you could use:
subscription.Model.BasicNack(eventArgs.DeliveryTag, false, true);
This will inform the Rabbit broker to requeue the message. You pass the delivery tag, false to say you are not nacking multiple messages, and true to requeue the message.
If you want to reject the message and NOT requeue, just change true to false.
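That is, the same call with the last argument flipped:
// reject without requeuing; the message is discarded (or dead-lettered if the queue has a DLX)
subscription.Model.BasicNack(eventArgs.DeliveryTag, false, false);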
Additionally, you have created a subscription, so I think you should perform your acks directly on it rather than through the channel.
Change:
channel.BasicAck(eventArgs.DeliveryTag, false);
To:
subscription.Ack();
This way of ack'ing is much cleaner, since you keep everything subscription-related on the subscription object rather than messing around with the channel you've already subscribed to.
