Consume the same message again if processing of the message fails - C#

I am using Confluent.Kafka .NET client version 1.3.0. I am following the docs:
var consumerConfig = new ConsumerConfig
{
    BootstrapServers = "server1, server2",
    AutoOffsetReset = AutoOffsetReset.Earliest,
    EnableAutoCommit = true,
    EnableAutoOffsetStore = false,
    GroupId = this.groupId,
    SecurityProtocol = SecurityProtocol.SaslPlaintext,
    SaslMechanism = SaslMechanism.Plain,
    SaslUsername = this.kafkaUsername,
    SaslPassword = this.kafkaPassword,
};
using (var consumer = new ConsumerBuilder<Ignore, string>(consumerConfig).Build())
{
    var cancellationToken = new CancellationTokenSource();
    Console.CancelKeyPress += (_, e) =>
    {
        e.Cancel = true;
        cancellationToken.Cancel();
    };
    consumer.Subscribe("my-topic");
    while (true)
    {
        try
        {
            var consumerResult = consumer.Consume();
            // process message
            consumer.StoreOffset(consumerResult);
        }
        catch (ConsumeException e)
        {
            // log
        }
        catch (KafkaException e)
        {
            // log
        }
        catch (OperationCanceledException e)
        {
            // log
        }
    }
}
The problem is that even if I comment out the line consumer.StoreOffset(consumerResult);, I keep getting the next unconsumed message the next time I call Consume, i.e. the offset keeps advancing, which doesn't seem to match what the documentation claims, namely at-least-once delivery.
Even if I set EnableAutoCommit = false, remove EnableAutoOffsetStore = false from the config, and replace consumer.StoreOffset(consumerResult) with consumer.Commit(), I still see the same behavior: even if I comment out the Commit, I keep getting the next unconsumed messages.
I feel like I am missing something fundamental here, but can't figure what. Any help is appreciated!

You may want to have retry logic that attempts to process each message a fixed number of times, say 5. If processing doesn't succeed within those 5 retries, you may want to add the message to another topic for handling all failed messages, which takes precedence over your actual topic. Or you may want to add the failed message back to the same topic so that it will be picked up again once all the other messages have been consumed.
If the processing of a message succeeds within those 5 retries, you can move on to the next message in the queue.
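A rough sketch of that idea, assuming a hypothetical failure topic name ("my-topic-failures"), an already-built IProducer<Null, string> called failureProducer, and a ProcessMessage method of your own; the retry count of 5 and all these names are illustrative, not part of the original answer:

    const int maxProcessRetries = 5; // illustrative limit

    var consumeResult = consumer.Consume(cancellationToken.Token);
    var processed = false;

    for (var attempt = 1; attempt <= maxProcessRetries && !processed; attempt++)
    {
        try
        {
            ProcessMessage(consumeResult.Message.Value); // your own processing logic
            processed = true;
        }
        catch (Exception)
        {
            // log and retry; optionally back off between attempts
        }
    }

    if (!processed)
    {
        // Hand the message off to a separate topic for failed messages.
        failureProducer.Produce("my-topic-failures",
            new Message<Null, string> { Value = consumeResult.Message.Value });
    }

    // In both cases mark the offset so the consumer moves past this message.
    consumer.StoreOffset(consumeResult);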

Sorry, I can't add a comment yet.
The Kafka consumer consumes messages in batches, so you may still be iterating through a batch that was pre-fetched by the background thread.
You can check whether your consumer really commits offsets or not using the Kafka utility kafka-consumer-groups.sh:
kafka-consumer-groups.sh --bootstrap-server kafka-host:9092 --group consumer_group --describe
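If you prefer to check from code rather than the shell, the Confluent.Kafka consumer can also report committed offsets directly. A small sketch, meant to run inside the question's consumer code (the topic name and partition 0 are assumptions for illustration):

    // Ask the broker for the committed offset of a partition this consumer uses.
    var partitions = new List<TopicPartition> { new TopicPartition("my-topic", 0) };
    var committed = consumer.Committed(partitions, TimeSpan.FromSeconds(10));
    foreach (var tpo in committed)
    {
        Console.WriteLine($"{tpo.TopicPartition}: committed offset {tpo.Offset}");
    }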

I had the same situation, and here is my solution:
Set a maximum number of retries for each operation.
For consuming, just retry.
For saving, re-assign the current offset and then retry.
Here is the code:
var saveRetries = 0;
var consumeRetries = 0;
ConsumeResult<string, string> consumeResult;
while (true)
{
    try
    {
        consumeResult = consumer.Consume();
        consumeRetries = 0;
    }
    catch (ConsumeException e)
    {
        // Log and retry to consume, up to {MaxConsumeRetries} times
        if (consumeRetries++ >= MaxConsumeRetries)
        {
            throw new OperationCanceledException($"Too many consume retries ({MaxConsumeRetries}). Please check configuration and run the service again.");
        }
        continue;
    }
    catch (OperationCanceledException oe)
    {
        // Log
        consumer.Close();
        break;
    }
    try
    {
        SaveResult(consumeResult);
        saveRetries = 0;
    }
    catch (ArgumentException ae)
    {
        // Log and retry to save, up to {MaxSaveRetries} times
        if (saveRetries++ < MaxSaveRetries)
        {
            // Assign the same offset, and try again.
            consumer.Assign(consumeResult.TopicPartitionOffset);
            continue;
        }
    }
    try
    {
        consumer.StoreOffset(consumeResult);
    }
    catch (KafkaException ke)
    {
        // Log and let it continue
    }
}

Related

TCP server connection causing processor overload

I have a TCP/IP server that is supposed to allow a connection to remain open as messages are sent across it. However, it seems that some clients open a new connection for each message, which causes the CPU usage to max out. I tried to fix this by adding a time-out but still seem to have the problem occasionally. I suspect that my solution was not the best choice, but I'm not sure what would be.
Below is my basic code with logging, error handling and processing removed.
private void StartListening()
{
    try
    {
        _tcpListener = new TcpListener( IPAddress.Any, _settings.Port );
        _tcpListener.Start();
        while (DeviceState == State.Running)
        {
            var incomingConnection = _tcpListener.AcceptTcpClient();
            var processThread = new Thread( ReceiveMessage );
            processThread.Start( incomingConnection );
        }
    }
    catch (Exception e)
    {
        // Unfortunately, a SocketException is expected when stopping AcceptTcpClient
        if (DeviceState == State.Running) { throw; }
    }
    finally { _tcpListener?.Stop(); }
}
I believe the actual issue is that multiple process threads are being created, but are not being closed. Below is the code for ReceiveMessage.
private void ReceiveMessage( object IncomingConnection )
{
    var buffer = new byte[_settings.BufferSize];
    int bytesReceived = 0;
    var messageData = String.Empty;
    bool isConnected = true;
    using (TcpClient connection = (TcpClient)IncomingConnection)
    using (NetworkStream netStream = connection.GetStream())
    {
        netStream.ReadTimeout = 1000;
        try
        {
            while (DeviceState == State.Running && isConnected)
            {
                // An IOException will be thrown and captured if no message comes in each second. This is the
                // only way to send a signal to close the connection when shutting down. The exception is caught,
                // and the connection is checked to confirm that it is still open. If it is, and the Router has
                // not been shut down, the server will continue listening.
                try { bytesReceived = netStream.Read( buffer, 0, buffer.Length ); }
                catch (IOException e)
                {
                    if (e.InnerException is SocketException se && se.SocketErrorCode == SocketError.TimedOut)
                    {
                        bytesReceived = 0;
                        if (GlobalSettings.IsLeaveConnectionOpen)
                            isConnected = GetConnectionState(connection);
                        else
                            isConnected = false;
                    }
                    else
                        throw;
                }
                if (bytesReceived > 0)
                {
                    messageData += Encoding.UTF8.GetString( buffer, 0, bytesReceived );
                    string ack = ProcessMessage( messageData );
                    var writeBuffer = Encoding.UTF8.GetBytes( ack );
                    if (netStream.CanWrite) { netStream.Write( writeBuffer, 0, writeBuffer.Length ); }
                    messageData = String.Empty;
                }
            }
        }
        catch (Exception e) { ... }
        finally { FileLogger.Log( "Closing the message stream.", Verbose.Debug, DeviceName ); }
    }
}
For most clients the code is running correctly, but there are a few that seem to create a new connection for each message. I suspect that the issue lies around how I handle the IOException. For the systems that fail, the code does not seem to reach the finally statement until 30 seconds after the first message comes in, and each message creates a new ReceiveMessage thread. So the logs will show messages coming in, and 30 seconds in it will start to show multiple messages about the message stream being closed.
Below is how I check the connection, in case this is important.
public static bool GetConnectionState( TcpClient tcpClient )
{
    var state = IPGlobalProperties.GetIPGlobalProperties()
        .GetActiveTcpConnections()
        .FirstOrDefault( x => x.LocalEndPoint.Equals( tcpClient.Client.LocalEndPoint )
            && x.RemoteEndPoint.Equals( tcpClient.Client.RemoteEndPoint ) );
    return state != null ? state.State == TcpState.Established : false;
}
You're reinventing the wheel (in a worse way) at quite a few levels:
You're doing pseudo-blocking sockets. That, combined with creating a whole new thread for every connection in an OS like Linux which doesn't have real threads, can get expensive fast. Instead you should create a pure blocking socket with no read timeout (-1) and just listen on it. Unlike UDP, TCP will catch the connection being terminated by the client without you needing to poll for it.
And the reason you seem to be doing the above is that you are reinventing the standard TCP Keep-Alive mechanism. It's already written and works efficiently; simply use it. And as a bonus, the standard Keep-Alive mechanism is on the client side, not the server side, so even less processing for you.
Edit: And 3. You really need to cache the threads you so painstakingly created. The system thread pool won't suffice if you have that many long-term connections with a single socket communication per thread, but you can build your own expandable thread pool. You can also share multiple sockets on one thread using select, but that's going to change your logic quite a bit.
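For what it's worth, a minimal sketch of enabling keep-alive on the accepted socket and dropping the pseudo-timeout; either end can turn the option on, and the probe timing comes from the OS (or from the newer per-socket options in .NET Core 3.0+), so nothing here is specific to this server:

    using System.Net.Sockets;

    var incomingConnection = _tcpListener.AcceptTcpClient();

    // Let the OS probe the peer periodically instead of polling the connection table.
    incomingConnection.Client.SetSocketOption(
        SocketOptionLevel.Socket, SocketOptionName.KeepAlive, true);

    // With keep-alive on and no ReadTimeout set, a plain blocking Read returns 0 when
    // the client closes the connection and throws once a dead peer is detected.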

RabbitMQ with manual ack throwing "Shared Queue Closed"

The consumer code below (not too far removed from the worker sample) throws System.IO.EndOfStreamException ("SharedQueue closed") after a couple of iterations of a single message being nacked.
public void Consume ()
{
    using (var connection = connectionFactory.CreateConnection ()) {
        using (var channel = connection.CreateModel ()) {
            channel.QueueDeclare (queueName, true, false, false, null);
            // 0 = “Don't send me a new message until I’ve finished”,
            // 1 = “Send me one message at a time”
            channel.BasicQos (0, 1, false);
            var consumer = new QueueingBasicConsumer (channel);
            channel.BasicConsume (queueName, false, consumer);
            Console.WriteLine (" [*] Waiting for messages. " +
                "To exit press CTRL+C");
            while (true) {
                BasicDeliverEventArgs ea;
                try {
                    // block until a message can be dequeued
                    ea = (BasicDeliverEventArgs)consumer.Queue.Dequeue ();
                    var body = ea.Body;
                    Console.WriteLine (" [x] Received, executing");
                    T thing = messageSerializer.Deserialize<T> (body);
                    try {
                        executor.DynamicInvoke (thing);
                    } catch {
                        channel.BasicNack (ea.DeliveryTag, false, true);
                    }
                    channel.BasicAck (ea.DeliveryTag, false);
                } catch (OperationCanceledException) {
                    logger.Error ("Bugger");
                }
            }
        }
    }
}
I've read a few Google results, but AFAIK this typically happens when the stream has been closed due to manually acking when auto-ack is set?
Thanks in advance.
Doh! I guess I'll leave this question in place in the rare event anybody else is as stupid as I am. Note in the code above that the BasicNack will occur and execution will then continue on to the Ack operation. This is the (obvious) cause of the issue. I need spanking.
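The fix implied by that observation is simply to make sure a delivery is never both Nacked and Acked; a minimal sketch using the question's own variable names:

    try {
        executor.DynamicInvoke (thing);
        // Ack only on success; a nacked delivery must not also be acked.
        channel.BasicAck (ea.DeliveryTag, false);
    } catch {
        // Reject and requeue so the broker redelivers the message.
        channel.BasicNack (ea.DeliveryTag, false, true);
    }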
This exception is thrown when the channel is closed.
It's happening to me too, but only when I am debugging my code, after a couple of messages.
So my guess is that we are hitting a timeout on the channel after a while.
Check your Rabbit configuration.

Rabbit MQ unack message not back to queue for consumer to process again

I use RabbitMQ as my message queue server, with the .NET C# client.
When there is an error processing a message from the queue, the message is not acknowledged but stays stuck in the queue and is not processed again, which is not what I understood from the documentation.
I don't know if I am missing some configuration or block of code.
My idea now is to manually ack the message when an error occurs and push it back onto the queue so it is retried.
I hope there is a better solution.
Thank you so much.
My code:
public void Subscribe(string queueName)
{
    while (!Cancelled)
    {
        try
        {
            if (subscription == null)
            {
                try
                {
                    // try to open connection
                    connection = connectionFactory.CreateConnection();
                }
                catch (BrokerUnreachableException ex)
                {
                    // You probably want to log the error and cancel after N tries,
                    // otherwise start the loop over to try to connect again after a second or so.
                    log.Error(ex);
                    continue;
                }
                // create channel
                channel = connection.CreateModel();
                // This instructs the channel not to prefetch more than one message
                channel.BasicQos(0, 1, false);
                // Create a new, durable exchange
                channel.ExchangeDeclare(exchangeName, ExchangeType.Direct, true, false, null);
                // Create a new, durable queue
                channel.QueueDeclare(queueName, true, false, false, null);
                // Bind the queue to the exchange
                channel.QueueBind(queueName, exchangeName, queueName);
                // create subscription
                subscription = new Subscription(channel, queueName, false);
            }
            BasicDeliverEventArgs eventArgs;
            var gotMessage = subscription.Next(250, out eventArgs); // 250 milliseconds
            if (gotMessage)
            {
                if (eventArgs == null)
                {
                    // This means the connection is closed.
                    DisposeAllConnectionObjects();
                    continue; // move to a new iteration
                }
                // process message
                channel.BasicAck(eventArgs.DeliveryTag, false);
            }
        }
        catch (OperationInterruptedException ex)
        {
            log.Error(ex);
            DisposeAllConnectionObjects();
        }
    }
    DisposeAllConnectionObjects();
}

private void DisposeAllConnectionObjects()
{
    // dispose subscription
    if (subscription != null)
    {
        // IDisposable is implemented explicitly for some reason.
        ((IDisposable)subscription).Dispose();
        subscription = null;
    }
    // dispose channel
    if (channel != null)
    {
        channel.Dispose();
        channel = null;
    }
    // check if connection is not null and dispose it
    if (connection != null)
    {
        try
        {
            connection.Dispose();
        }
        catch (EndOfStreamException ex)
        {
            log.Error(ex);
        }
        catch (OperationInterruptedException ex) // handle the error thrown from disposing the connection
        {
            log.Error(ex);
        }
        catch (Exception ex)
        {
            log.Error(ex);
        }
        connection = null;
    }
}
I think you may have misunderstood the RabbitMQ documentation. If a message does not get ack'ed from the consumer, the Rabbit broker will requeue the message onto the queue for consumption.
I don't believe your suggested method for ack'ing and then requeuing a message is a good idea, and will just make the problem more complex.
If you want to explicitly "reject" a message because the consumer had a problem processing it, you could use the Nack feature of Rabbit.
For example, within your catch exception blocks, you could use:
subscription.Model.BasicNack(eventArgs.DeliveryTag, false, true);
This will inform the Rabbit broker to requeue the message. Basically, you pass the delivery tag, false to say it is not multiple messages, and true to requeue the message.
If you want to reject the message and NOT requeue, just change true to false.
Additionally, you have created a subscription, so I think you should perform your ack's directly on this, not through the channel.
Change:
channel.BasicAck(eventArgs.DeliveryTag, false);
To:
subscription.Ack();
This method of ack'ing is much cleaner since you are then keeping everything subscription-related on the subscription object, rather than messing around with the channel that you've already subscribed to.
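Putting both suggestions together, the processing block from the question might end up looking roughly like this (ProcessMessage is a hypothetical stand-in for your own handling code):

    if (gotMessage)
    {
        if (eventArgs == null)
        {
            // This means the connection is closed.
            DisposeAllConnectionObjects();
            continue;
        }
        try
        {
            ProcessMessage(eventArgs.Body); // hypothetical processing method
            // Ack through the subscription rather than the channel.
            subscription.Ack();
        }
        catch (Exception ex)
        {
            log.Error(ex);
            // Reject and requeue so the broker redelivers the message.
            subscription.Model.BasicNack(eventArgs.DeliveryTag, false, true);
        }
    }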

Keep trying to talk to server when the Internet is down

So my application is exchanging request/responses with a server (no problems), until the internet connection dies for a couple of seconds, then comes back. Then a code like this:
response = (HttpWebResponse)request.GetResponse();
will throw an exception, with a status like ReceiveFailure, ConnectFailure, KeepAliveFailure etc.
Now, it's quite important that if the internet connection comes back, I am able to continue communicating with the server, otherwise I'd have to start again from the beginning and that will take a long time.
How would you go about resuming this communication when the internet is back?
At the moment, I keep on checking for a possibility to communicate with the server, until it is possible (at least theoretically). My code attempt looks like this:
try
{
    response = (HttpWebResponse)request.GetResponse();
}
catch (WebException ex)
{
    // We have a problem receiving stuff from the server.
    // We'll keep on trying for a while
    if (ex.Status == WebExceptionStatus.ReceiveFailure ||
        ex.Status == WebExceptionStatus.ConnectFailure ||
        ex.Status == WebExceptionStatus.KeepAliveFailure)
    {
        bool stillNoInternet = true;
        // keep trying to talk to the server
        while (stillNoInternet)
        {
            try
            {
                response = (HttpWebResponse)request.GetResponse();
                stillNoInternet = false;
            }
            catch
            {
                stillNoInternet = true;
            }
        }
    }
}
However, the problem is that the second try-catch statement keeps throwing an exception even when the internet is back.
What am I doing wrong? Is there another way to go about fixing this?
Thanks!
You should recreate the request each time, and you should execute the retries in a loop with a wait between each retry. The wait time should progressively increase with each failure.
E.g.
ExecuteWithRetry (delegate {
    // retry the whole connection attempt each time
    HttpWebRequest request = ...;
    response = request.GetResponse();
    ...
});

private void ExecuteWithRetry (Action action) {
    // Use a maximum count, we don't want to loop forever
    // Alternatively, you could use a time based limit (eg, try for up to 30 minutes)
    const int maxRetries = 5;
    bool done = false;
    int attempts = 0;
    while (!done) {
        attempts++;
        try {
            action ();
            done = true;
        } catch (WebException ex) {
            if (!IsRetryable (ex)) {
                throw;
            }
            if (attempts >= maxRetries) {
                throw;
            }
            // Back-off and retry a bit later, don't just repeatedly hammer the connection
            Thread.Sleep (SleepTime (attempts));
        }
    }
}

private int SleepTime (int retryCount) {
    // I just made these times up, choose correct values depending on your needs.
    // Progressively increase the wait time as the number of attempts increases.
    switch (retryCount) {
        case 0: return 0;
        case 1: return 1000;
        case 2: return 5000;
        case 3: return 10000;
        default: return 30000;
    }
}

private bool IsRetryable (WebException ex) {
    return
        ex.Status == WebExceptionStatus.ReceiveFailure ||
        ex.Status == WebExceptionStatus.ConnectFailure ||
        ex.Status == WebExceptionStatus.KeepAliveFailure;
}
I think what you are trying to do is this:
HttpWebResponse RetryGetResponse(HttpWebRequest request)
{
    while (true)
    {
        try
        {
            return (HttpWebResponse)request.GetResponse();
        }
        catch (WebException ex)
        {
            if (ex.Status != WebExceptionStatus.ReceiveFailure &&
                ex.Status != WebExceptionStatus.ConnectFailure &&
                ex.Status != WebExceptionStatus.KeepAliveFailure)
            {
                throw;
            }
        }
    }
}
When you want to retry something on failure then instead of thinking of this as something that you want to do when something fails, think of it instead as looping until you succeed. (or a failure that you don't want to retry on). The above will keep on retrying until either a response is returned or a different exception is thrown.
It would also be a good idea to introduce a maximum retry limit of some sort (for example stop retrying after 1 hour).
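For example, a time-based cap can be bolted onto the loop above; the one-hour figure is just the example mentioned, not a recommendation:

    HttpWebResponse RetryGetResponse(HttpWebRequest request, TimeSpan maxRetryTime)
    {
        var deadline = DateTime.UtcNow + maxRetryTime; // e.g. TimeSpan.FromHours(1)
        while (true)
        {
            try
            {
                return (HttpWebResponse)request.GetResponse();
            }
            catch (WebException ex)
            {
                if (ex.Status != WebExceptionStatus.ReceiveFailure &&
                    ex.Status != WebExceptionStatus.ConnectFailure &&
                    ex.Status != WebExceptionStatus.KeepAliveFailure)
                {
                    throw;
                }
                if (DateTime.UtcNow >= deadline)
                {
                    throw; // give up once the retry window has elapsed
                }
            }
        }
    }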
If it's still doing it when you get the connection back, then my guess is that it's simply returning the same result again.
You might want to try recreating the request anew each time; other than that, I don't see much wrong with the code or logic, apart from the fact that you're forever blocking this thread. But that might be okay if this is, in itself, a worker thread.

How to handle WCF exceptions (consolidated list with code)

I'm attempting to extend this answer on SO to make a WCF client retry on transient network failures and handle other situations that require a retry such as authentication expiration.
Question:
What are the WCF exceptions that need to be handled, and what is the correct way to handle them?
Here are a few sample techniques that I'm hoping to see instead of, or in addition to, proxy.Abort():
Delay X seconds prior to retry
Close and recreate a New() WCF client. Dispose the old one.
Don't retry and rethrow this error
Retry N times, then throw
Since it's unlikely one person knows all the exceptions or ways to resolve them, do share what you know. I'll aggregate the answers and approaches in the code sample below.
// USAGE SAMPLE
//int newOrderId = 0; // need a value for definite assignment
//Service<IOrderService>.Use(orderService =>
//{
//    newOrderId = orderService.PlaceOrder(request);
//});

/// <summary>
/// A safe WCF Proxy suitable when sessionmode=false
/// </summary>
/// <param name="codeBlock"></param>
public static void Use(UseServiceDelegateVoid<T> codeBlock)
{
    IClientChannel proxy = (IClientChannel)_channelFactory.CreateChannel();
    bool success = false;
    try
    {
        codeBlock((T)proxy);
        proxy.Close();
        success = true;
    }
    catch (CommunicationObjectAbortedException e)
    {
        // Object should be discarded if this is reached.
        // Debugging discovered the following exception here:
        // "Connection can not be established because it has been aborted"
        throw e;
    }
    catch (CommunicationObjectFaultedException e)
    {
        throw e;
    }
    catch (MessageSecurityException e)
    {
        throw e;
    }
    catch (ChannelTerminatedException)
    {
        proxy.Abort(); // Possibly retry?
    }
    catch (ServerTooBusyException)
    {
        proxy.Abort(); // Possibly retry?
    }
    catch (EndpointNotFoundException)
    {
        proxy.Abort(); // Possibly retry?
    }
    catch (FaultException)
    {
        proxy.Abort();
    }
    catch (CommunicationException)
    {
        proxy.Abort();
    }
    catch (TimeoutException)
    {
        // Sample error found during debug:
        // The message could not be transferred within the allotted timeout of
        // 00:01:00. There was no space available in the reliable channel's
        // transfer window. The time allotted to this operation may have been a
        // portion of a longer timeout.
        proxy.Abort();
    }
    catch (ObjectDisposedException)
    {
        //todo: handle this duplex callback exception. Occurs when client disappears.
        // Source: https://stackoverflow.com/questions/1427926/detecting-client-death-in-wcf-duplex-contracts/1428238#1428238
    }
    finally
    {
        if (!success)
        {
            proxy.Abort();
        }
    }
}
EDIT: There seem to be some inefficiencies with closing and reopening the client multiple times. I'm exploring solutions here and will update and expand this code if one is found. (Or, if David Khaykin posts an answer, I'll mark it as accepted.)
After tinkering around with this for a few years, the code below is my preferred strategy (after seeing this blog posting from the wayback machine) for dealing with WCF retries and handling exceptions.
I investigated every exception, what I would want to do with that exception, and noticed a common trait; every exception that needed a "retry" inherited from a common base class. I also noticed that every permFail exception that put the client into an invalid state also came from a shared base class.
The following example traps every WCF exception a client could throw, and is extensible for your own custom channel errors.
Sample WCF Client Usage
Once you generate your client side proxy, this is all you need to implement it.
Service<IOrderService>.Use(orderService =>
{
    orderService.PlaceOrder(request);
});
ServiceDelegate.cs
Add this file to your solution. No changes are needed to this file, unless you want to alter the number of retries or what exceptions you want to handle.
public delegate void UseServiceDelegate<T>(T proxy);

public static class Service<T>
{
    public static ChannelFactory<T> _channelFactory = new ChannelFactory<T>("");

    public static void Use(UseServiceDelegate<T> codeBlock)
    {
        IClientChannel proxy = null;
        bool success = false;
        Exception mostRecentEx = null;
        int millisecondsToSleep = 1000;

        for (int i = 0; i < 5; i++) // Attempt a maximum of 5 times
        {
            // Proxy can't be reused
            proxy = (IClientChannel)_channelFactory.CreateChannel();
            try
            {
                codeBlock((T)proxy);
                proxy.Close();
                success = true;
                break;
            }
            catch (FaultException customFaultEx)
            {
                mostRecentEx = customFaultEx;
                proxy.Abort();
                // Custom resolution for this app-level exception
                Thread.Sleep(millisecondsToSleep * (i + 1));
            }
            // The following is typically thrown on the client when a channel is terminated due to the server closing the connection.
            catch (ChannelTerminatedException cte)
            {
                mostRecentEx = cte;
                proxy.Abort();
                // delay (backoff) and retry
                Thread.Sleep(millisecondsToSleep * (i + 1));
            }
            // The following is thrown when a remote endpoint could not be found or reached. The endpoint may not be found or
            // reachable because the remote endpoint is down, the remote endpoint is unreachable, or because the remote network is unreachable.
            catch (EndpointNotFoundException enfe)
            {
                mostRecentEx = enfe;
                proxy.Abort();
                // delay (backoff) and retry
                Thread.Sleep(millisecondsToSleep * (i + 1));
            }
            // The following exception is thrown when a server is too busy to accept a message.
            catch (ServerTooBusyException stbe)
            {
                mostRecentEx = stbe;
                proxy.Abort();
                // delay (backoff) and retry
                Thread.Sleep(millisecondsToSleep * (i + 1));
            }
            catch (TimeoutException timeoutEx)
            {
                mostRecentEx = timeoutEx;
                proxy.Abort();
                // delay (backoff) and retry
                Thread.Sleep(millisecondsToSleep * (i + 1));
            }
            catch (CommunicationException comException)
            {
                mostRecentEx = comException;
                proxy.Abort();
                // delay (backoff) and retry
                Thread.Sleep(millisecondsToSleep * (i + 1));
            }
            catch (Exception)
            {
                // rethrow any other exception not defined here
                // You may want to define a custom Exception class to pass information such as failure count, and failure type
                proxy.Abort();
                throw;
            }
        }

        if (success == false && mostRecentEx != null)
        {
            proxy.Abort();
            throw new Exception("WCF call failed after 5 retries.", mostRecentEx);
        }
    }
}
I started a project on Codeplex that has the following features:
Allows efficient reuse of the client proxy
Cleans up all resources, including EventHandlers
Operates on Duplex channels
Operates on Per-call services
Supports config constructor, or by factory
http://smartwcfclient.codeplex.com/
It is a work in progress and is very heavily commented. I'd appreciate any feedback on improving it.
Sample usage when in instance mode:
var reusableSW = new LC.Utils.WCF.ServiceWrapper<IProcessDataDuplex>(channelFactory);

reusableSW.Reuse(client =>
{
    client.CheckIn(count.ToString());
});

reusableSW.Dispose();
We have a WCF client that deals with almost any type of failure at the server. The catch list is very long, but it does not have to be. If you look closely, you will see that many exceptions are child definitions of the Exception class (and a few other classes).
Thus you can simplify things a lot if you want to, as sketched after the list below. That said, here are some typical errors that we catch:
Server timeout
Server too busy
Server unavailable
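A minimal sketch of that simplification, catching only the two base classes that cover the transport-level failures listed above (the helper name is illustrative):

    using System;
    using System.ServiceModel;

    static bool IsRetryable(Exception ex)
    {
        // Application-level faults (FaultException) also derive from CommunicationException,
        // so exclude them explicitly if you don't want to retry business errors.
        if (ex is FaultException) return false;
        return ex is CommunicationException || ex is TimeoutException;
    }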
The links below may help with handling WCF exceptions:
http://www.codeproject.com/KB/WCF/WCFErrorHandling.aspx
http://msdn.microsoft.com/en-us/library/cc949036.aspx
