I am trying to implement exponential backoff with Azure Service Bus.
Basically, I have a catch block; on any error I currently retry after a fixed 1-second delay, and I await that delay.
Goal:
I want to use an exponential delay. After each retry I want to increase the delay exponentially, and I don't want the handler to sit waiting: until the retry is due, it should be free to process other messages.
The current catch block looks like this:
catch (Exception ex)
{
_logger.Error(ex, $"Failed to process request {requestId}");
totalAttempts++;
if (totalAttempts == MaxAttempts)
return new Response { Error = ex.ToString() };
await Task.Delay(TimeSpan.FromSeconds(1));
}
I tried the code below, but it does not increase the delay exponentially. I am using this for the first time.
while (true)
try
{
executemethod();
}
catch (Exception ex)
{
_logger.Error(ex, $"Failed to process request {requestId}");
totalAttempts++;
if (totalAttempts == MaxAttempts)
return new Response { Error = ex.ToString() };
queueClient.RetryPolicy = new RetryExponential(TimeSpan.FromSeconds(5), TimeSpan.FromSeconds(30), 10);
}
I am not sure whether I am doing this correctly.
Goal: retry with a back-off strategy and make the processing thread available for other messages while waiting for the next retry. Consider exponential back-off with dead-lettering after reaching the maximum number of delivery attempts.
Many Azure-oriented .NET libraries implement Retry internally. Service Bus client has it built-in as well. Check out Retry guidance for Azure services. As per documentation:
When using the built-in RetryExponential implementation, do not
implement a fallback operation as the policy reacts to Server Busy
exceptions and automatically switches to an appropriate retry mode.
Other than that, you should set the policy once, before you make a request, and not inside the processing loop.
For retry policies, you can also use an external library like Polly.
I have implemented an exponential backoff retry. Basically, if there is any exception I clone the message and resubmit it to the queue with some delay.
Now I am facing 2 issues: 1) the delivery count does not increase when I clone the message and resubmit it to the queue.
2) I want to move the message to the dead-letter queue once the max delivery count is reached.
Code :
catch (Exception ex)
{
_logger.Error(ex, $"Failed to process request {requestId}");
var clone = messageResult.Message.Clone();
clone.ScheduledEnqueueTimeUtc = DateTime.UtcNow.AddSeconds(45);
await messageResult.ResendMessage(clone);
if (retryCount == MaxAttempts)
{
//messageResult.dea
}
return new PdfResponse { Error = ex.ToString() };
}
Please help me with this.
When you clone a message it becomes a new message, that means system properties are not cloned which gives the cloned message a fresh delivery count starting at 1 again. See also https://docs.azure.cn/zh-cn/dotnet/api/microsoft.azure.servicebus.message.clone?view=azure-dotnet
You can look into the PeekLock feature of Azure Service Bus. When using PeekLock, the message becomes invisible on the queue until you explicitly abandon it (put it back on the queue with the delivery count increased) or complete it if processing worked out as expected. Another option is to explicitly dead-letter the message.
The feature is documented here: https://learn.microsoft.com/en-us/azure/service-bus-messaging/message-transfers-locks-settlement#peeklock
The important point is that if you perform none of the above-mentioned actions (abandon, complete or dead-letter), Azure Service Bus will automatically make the message visible again once the lock expires (the LockDuration property). Abandoning it makes it visible again immediately.
So to get a delayed retry and dead letter behaviour (when maximum delivery count has been reached) you can use the following options:
Option 1. Retry via Azure service bus auto-unlock
When processing of the message cannot be performed at the moment for some reason catch the exception and make sure none of the mentioned actions (abandon, complete or deadletter) are performed. This will keep the message invisible for the remaining time and will make it again visible after the configured lock duration has been reached. And the delivery count will also be increased by Azure Service Bus as expected.
Option 2. Implement your own retry policy
Perform your own retry policy in your code and retry processing of the message. If your maximum retries have been reached abandon the message which will make it visible again for the next queue reading step after the retry time has been reached. In this case the delivery count is increased as well.
Note: If you choose option 2, make sure your total retry period fits within the configured LockDuration, so that the message does not become visible on the queue again while you are still retrying it. You could also renew the lock between retries by calling the RenewLock() method on the message.
If you implement the retry policy in your code, I recommend looking into Polly for .NET, which already gives you features such as Retry and Circuit Breaker policies. See https://github.com/App-vNext/Polly
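Because a cloned message starts over with a fresh delivery count, one way to keep track of retries across resubmissions is to carry your own counter in the message's custom properties. Below is a minimal sketch of just the decision logic. The property name "RetryCount", the limits, and the RetryPlanner class are all hypothetical, and the sketch deliberately contains no Service Bus calls: applying the result to ScheduledEnqueueTimeUtc on the clone, or dead-lettering the original, is left to whichever client library you use.

```csharp
using System;
using System.Collections.Generic;

public static class RetryPlanner
{
    // Hypothetical property name and limits, chosen only for illustration.
    public const string RetryCountKey = "RetryCount";
    public const int MaxAttempts = 5;
    private static readonly TimeSpan BaseDelay = TimeSpan.FromSeconds(5);
    private static readonly TimeSpan MaxDelay = TimeSpan.FromMinutes(5);

    public sealed class Decision
    {
        public Decision(bool deadLetter, DateTime? scheduledEnqueueTimeUtc)
        {
            DeadLetter = deadLetter;
            ScheduledEnqueueTimeUtc = scheduledEnqueueTimeUtc;
        }

        public bool DeadLetter { get; }
        public DateTime? ScheduledEnqueueTimeUtc { get; }
    }

    // Increments the counter stored in the properties and decides whether the
    // message should be resubmitted later or dead-lettered.
    public static Decision Next(IDictionary<string, object> properties, DateTime utcNow)
    {
        int attempts = properties.TryGetValue(RetryCountKey, out var raw)
            ? Convert.ToInt32(raw)
            : 0;
        attempts++;
        properties[RetryCountKey] = attempts;

        if (attempts >= MaxAttempts)
            return new Decision(true, null);

        // 5 s, 10 s, 20 s, 40 s, ... capped at MaxDelay.
        double seconds = Math.Min(
            BaseDelay.TotalSeconds * Math.Pow(2, attempts - 1),
            MaxDelay.TotalSeconds);
        return new Decision(false, utcNow.AddSeconds(seconds));
    }
}
```

In the catch block you would copy the original message's properties onto the clone, call RetryPlanner.Next on them, and then either schedule the clone for the returned time or dead-letter the original. Since nothing is awaited while the delay runs, the handler is free to pick up other messages.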
I have a WPF application in which I want to return a list of data (or any data) when the user requests it. To get the data I need to call a WCF service. What if the service is down for any reason, and I want to fix the broken service, or wait until it is alive again, and then return the data? Let me show you what I am doing:
public List<MyData> GetMyData()
{
try
{
var data =GetOrCreateChannel().GetMyData(); //GetOrCreateChannel method create WCF service channel
return data;
}
catch(Exception ex)
{
_log.error(ex);
FixedBrokenService();
return GetMyData(); //Call again this method.
}
}
In the above method, if the service is not running, execution goes to the catch block and calls the same method again, and this repeats for as long as the service is down. Once the service is alive again, it returns the data. I want to know whether this approach is fine. What if the service is down for 2 to 3 hours? The method will call itself recursively and the stack will keep growing. Is there any other approach?
What if the service is down for 2 to 3 hours? The method will call itself recursively and the stack will keep growing. Is there any other approach?
I think you're asking because you already sense there might be some other way to improve what you've got so far; my guess is you're looking for some standard.
If so, I'd recommend Google's Exponential backoff guideline, here applied to Google Maps calls.
The idea is to introduce a delay between subsequent calls to the web service, increasing it in case of repeated failures.
A simple change would be:
public List<MyData> GetMyData()
{
List<MyData> data = null;
int delayMilliseconds = 100;
bool waitingForResults = true;
while (waitingForResults)
{
try
{
data = GetOrCreateChannel().GetMyData();
waitingForResults = false; // if this executes, you've got your data and can exit
}
catch (Exception ex)
{
_log.error(ex);
FixedBrokenService();
Thread.Sleep(delayMilliseconds); // wait before retrying
delayMilliseconds = delayMilliseconds * 2; // increase your delay
}
}
return data;
}
This way you won't have to deal with recursion either; don't forget to add
using System.Threading; to the top.
Since you mentioned WPF, we might want to take Jeroen's suggestion and wait in another thread: this means that your WPF GUI won't be frozen while you try reconnecting, but it will be enabled and perhaps show a spinner, a wait message or something like that (e.g. "Reconnecting in x seconds").
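This requires making GetMyData asynchronous (declared as async Task<List<MyData>>, with callers awaiting it), changing the line Thread.Sleep(delayMilliseconds); to await WaitAsync(delayMilliseconds);, adding using System.Threading.Tasks; to the top, and adding this method below GetMyData:
private static Task WaitAsync(int delayMilliseconds)
{
    // Task.Delay completes after the interval without blocking the calling thread.
    // Calling Thread.Sleep here would still freeze the caller, and returning an
    // unstarted new Task(() => { }) would never complete when awaited.
    return Task.Delay(delayMilliseconds);
}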
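Putting the pieces together, a non-blocking retry loop might look like the sketch below. It is only an illustration under a few assumptions: the AsyncRetry class and its parameter names are made up here, the blocking service call is pushed to a thread-pool thread with Task.Run so the UI thread stays responsive, and the delay is capped so it cannot grow without bound during a long outage.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncRetry
{
    // Retries 'operation' until it succeeds, doubling the wait after each
    // failure up to 'maxDelay'. The waiting is done with Task.Delay, so no
    // thread is blocked between attempts.
    public static async Task<T> UntilSuccessAsync<T>(
        Func<T> operation,
        TimeSpan initialDelay,
        TimeSpan maxDelay,
        Action<Exception> onError = null)
    {
        TimeSpan delay = initialDelay;
        while (true)
        {
            try
            {
                // Run the (possibly blocking) call off the calling thread.
                return await Task.Run(operation);
            }
            catch (Exception ex)
            {
                // Let the caller log the failure or attempt a repair.
                onError?.Invoke(ex);
            }

            await Task.Delay(delay);
            delay = TimeSpan.FromTicks(Math.Min(delay.Ticks * 2, maxDelay.Ticks));
        }
    }
}
```

With the names from the question, a call could read: var data = await AsyncRetry.UntilSuccessAsync(() => GetOrCreateChannel().GetMyData(), TimeSpan.FromMilliseconds(100), TimeSpan.FromSeconds(30), ex => { _log.error(ex); FixedBrokenService(); }); In a real application you would probably also pass a CancellationToken so the user can give up.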
Try using a WCF client based on ClientBase (there are tons of examples). You can subscribe to the InnerChannel.Faulted event. When that event fires, it means the service has failed somehow.
Instead of immediately retrying the connection in the catch block, you can write a separate thread that retries connecting the client once the service has gone down.
Everything fails one day or another. Are there any recommendations or best practices on how to handle errors when publishing messages to Amazon SQS?
I am running the Amazon .NET SDK and send a couple of thousand SQS messages a day. It hasn't come to my attention that publishing has failed, but that could simply be because no problem has surfaced yet.
However, how should I handle an error in the following basic code (pretty much a straightforward usage example from the SDK documentation):
public static string sendSqs(string data)
{
IAmazonSQS sqs = AWSClientFactory.CreateAmazonSQSClient(RegionEndpoint.EUWest1);
SendMessageRequest sendMessageRequest = new SendMessageRequest();
CreateQueueRequest sqsRequest = new CreateQueueRequest();
sqsRequest.QueueName = "mySqsQueue";
CreateQueueResponse createQueueResponse = sqs.CreateQueue(sqsRequest);
sendMessageRequest.QueueUrl = createQueueResponse.QueueUrl;
sendMessageRequest.MessageBody = data;
SendMessageResponse sendMessageresponse = sqs.SendMessage(sendMessageRequest);
return sendMessageresponse.MessageId;
}
First (kinda unrelated) I would recommend separating the client from the send message:
public class QueueStuff
{
    // Get only one of these: create the client once and reuse it for every call
    private static readonly IAmazonSQS SQS =
        AWSClientFactory.CreateAmazonSQSClient(RegionEndpoint.EUWest1);

    //...use SQS elsewhere...
}
Finally to answer your question: check the Common Errors and SendMessage (in your case) pages and catch relevant exceptions. What you do will depend on your app and how it should handle losing messages. An example might be:
public static string sendSqs(string data)
{
    SendMessageRequest sendMessageRequest = new SendMessageRequest();
    CreateQueueRequest sqsRequest = new CreateQueueRequest();
    sqsRequest.QueueName = "mySqsQueue";
    CreateQueueResponse createQueueResponse = SQS.CreateQueue(sqsRequest);
    sendMessageRequest.QueueUrl = createQueueResponse.QueueUrl;
    sendMessageRequest.MessageBody = data;
    try
    {
        SendMessageResponse sendMessageResponse = SQS.SendMessage(sendMessageRequest);
        return sendMessageResponse.MessageId;
    }
    catch (InvalidMessageContentsException ex) // Catch or bubble the exception up.
    {
        // I can't do anything about this so toss the message...
        LOGGER.log("Invalid data in request: " + data, ex);
        return null;
    }
    catch (AmazonSQSException ex) when (ex.ErrorCode == "Throttling") // I can do something about this!
    {
        // Exponential backoff and retry go here. The error-code string is an
        // assumption; check the Common Errors page for the exact value.
        throw; // placeholder until the backoff is implemented
    }
}
Exceptions like Throttling or ServiceUnavailable are commonly overlooked but can be handled properly. It's commonly recommended that you implement an exponential backoff for these: when you're throttled, you back off until the service is available again. An example of implementation and usage in Java: https://gist.github.com/alph486/f123ea139e6ea56e696f .
You shouldn't need to do much of your own error handling at all; the AWS SDK for .NET handles retries for transient failures under the hood.
It will automatically retry any request that fails if:
your access to the AWS service is being throttled
the request times out
the HTTP connection fails
It uses an exponential backoff strategy for multiple retries. On the first failure, it sleeps for 400 ms, then tries again. If that attempt fails, it sleeps for 1600 ms before trying again. If that fails, it sleeps for 6400 ms, and so on, to a maximum of 30 seconds.
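The numbers quoted above follow a simple pattern: each delay is four times the previous one, capped at 30 seconds. The sketch below only reproduces that arithmetic so you can see how the schedule grows. It is not the SDK's own code, and the BackoffSchedule class and its parameter names are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class BackoffSchedule
{
    // Returns the delay (in milliseconds) before each retry:
    // firstDelayMs, firstDelayMs * factor, firstDelayMs * factor^2, ...
    // with every value capped at maxDelayMs.
    public static IReadOnlyList<int> DelaysMs(
        int retries,
        int firstDelayMs = 400,
        int factor = 4,
        int maxDelayMs = 30000)
    {
        var delays = new List<int>(retries);
        long delay = firstDelayMs;
        for (int i = 0; i < retries; i++)
        {
            delays.Add((int)Math.Min(delay, maxDelayMs));
            // Cap the running value as well so it cannot overflow.
            delay = Math.Min(delay * factor, maxDelayMs);
        }
        return delays;
    }
}
```

For example, BackoffSchedule.DelaysMs(5) yields 400, 1600, 6400, 25600 and 30000, so five retries add up to roughly 64 seconds of waiting in the worst case.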
When the configured maximum number of retries is reached, the SDK will throw. You can configure the maximum number of retries like this:
var sqsClient = AWSClientFactory.CreateAmazonSQSClient(
new AmazonSQSConfig
{
MaxErrorRetry = 4 // the default is 4.
});
If the API call ends up throwing, it means that something is really wrong, like SQS has gone down in your region, or your request is invalid.
Source: The AWS SDK for .NET Source Code on GitHub.
I am reading from a REST service and need to handle "Wait and retry" for a heavily used service that will give me an error:
Too many queries per second
or
Server Busy
Generally speaking, since I have many REST services to call, how can I generically handle backoff logic that would occur when an exception occurs?
Is there any framework that has this built in? I'm just looking to write clean code that doesn't worry too much about plumbing and infrastructure.
You can wrap the attempt up within a method that handles the retry logic for you. For example, if you're using WebClient's async methods:
public async Task<T> RetryQuery<T>(Func<Task<T>> operation, int numberOfAttempts, int msecsBetweenRetries = 500)
{
while (numberOfAttempts > 0)
{
try
{
T value = await operation();
return value;
}
catch
{
// Failed case - retry
--numberOfAttempts;
}
await Task.Delay(msecsBetweenRetries);
}
throw new ApplicationException("Operation failed repeatedly");
}
You could then use this via:
// Try 3 times with 500 ms wait times in between
string result = await RetryQuery(() => webClient.DownloadStringTaskAsync(url), 3);
Try to determine how many requests can be active at a time and use a Semaphore.
It is a way to handle resource locking where there are multiple identical resources, but only a limited number of them.
Here's the MSDN documentation on semaphores
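For async REST calls, the semaphore idea could be sketched with SemaphoreSlim as below. This is only an illustration: the ThrottledCaller class is a made-up name, and the right limit depends on what the remote service tolerates.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class ThrottledCaller
{
    private readonly SemaphoreSlim _slots;

    public ThrottledCaller(int maxConcurrentRequests)
    {
        // One slot per request that is allowed to be in flight at once.
        _slots = new SemaphoreSlim(maxConcurrentRequests, maxConcurrentRequests);
    }

    // Waits (without blocking a thread) until a slot is free, runs the
    // request, and always gives the slot back, even if the request throws.
    public async Task<T> RunAsync<T>(Func<Task<T>> request)
    {
        await _slots.WaitAsync();
        try
        {
            return await request();
        }
        finally
        {
            _slots.Release();
        }
    }
}
```

A single shared instance, for example new ThrottledCaller(4), would then wrap each call: await caller.RunAsync(() => webClient.DownloadStringTaskAsync(url)). This limits concurrency, not requests per second, so it complements a retry with backoff rather than replacing it.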
I recommend you look into the Transient Fault Handling Application Block, part of the Enterprise Library.
In the past, the EL has IMO been over-engineered and not that useful, but they've taken steps to address that; the TFHAB is one of the newer blocks that follows better design guidelines (again, IMO).
I'm writing a server for a game, and I want to be able to handle thousands of concurrent users. For this reason, I went with non-blocking sockets and use the poll method. However, I do create multiple threads to handle database and web calls, and some of these threads will send a response to the user. In one of these threads, on send, I get the error "A non-blocking socket operation could not be completed immediately". What could cause this problem? I imagine it's because a poll is occurring at the same time as send is called. If I used beginAsync, would that stop this error? I thought about locking the socket, but I don't want my main thread to be blocked for this.
I don't know what kind of non-blocking, polling socket calls you are using, but I would recommend that you use the Async socket calls (instead of the Begin ones). For more information on the difference between Async calls vs Begin see: What's the difference between BeginConnect and ConnectAsync?
The asynchronous calls automatically do "polling" on the OS level, which will be much more efficient than your polling. As a matter of fact, they use IO completion ports, which are probably the fastest and most efficient thing you can use on Windows to handle a large amount of client connections/requests.
As far as the error, I would consider this to be the normal operation of non-blocking sockets, so you just have to handle it gracefully.
Update
Your server should probably do something like this:
// Process the accept for the socket listener.
private void ProcessAccept(SocketAsyncEventArgs e)
{
Socket s = e.AcceptSocket;
if (s.Connected)
{
try
{
SocketAsyncEventArgs readEventArgs = this.readWritePool.Pop();
if (readEventArgs != null)
{
// Get the socket for the accepted client connection and put it into the
// ReadEventArg object user token.
readEventArgs.UserToken = new Token(s, this.bufferSize);
Interlocked.Increment(ref this.numConnectedSockets);
Console.WriteLine("Client connection accepted. There are {0} clients connected to the server",
this.numConnectedSockets);
if (!s.ReceiveAsync(readEventArgs))
{
this.ProcessReceive(readEventArgs);
}
}
else
{
Console.WriteLine("There are no more available sockets to allocate.");
}
}
catch (SocketException ex)
{
Token token = e.UserToken as Token;
Console.WriteLine("Error when processing data received from {0}:\r\n{1}",
token.Connection.RemoteEndPoint, ex.ToString());
}
catch (Exception ex)
{
Console.WriteLine(ex.ToString());
}
// Accept the next connection request.
this.StartAccept(e);
}
}
Code sample courtesy of code project: http://www.codeproject.com/Articles/22918/How-To-Use-the-SocketAsyncEventArgs-Class
When a non-blocking socket tries to read data but finds none you get that error: the socket would like to wait for data but can't because it has to return immediately, being non-blocking.
I'd suggest you switch to blocking sockets, find out why data is missing, adjust accordingly then revert to non-blocking ones. Or, you could handle the error and retry the operation.
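For the send case from the question, "handle the error and retry" could look like the sketch below. It assumes a connected, non-blocking TCP socket. The NonBlockingSend class and the timeout value are made up for this illustration. On WouldBlock it waits with Socket.Poll until the socket reports it is writable again, then continues from where it left off, since a non-blocking Send may also accept only part of the buffer.

```csharp
using System;
using System.Net.Sockets;

public static class NonBlockingSend
{
    // Sends the whole buffer on a non-blocking socket. When the send buffer
    // is full (WouldBlock), waits until the socket is writable and resumes.
    public static void SendAll(Socket socket, byte[] buffer, int pollTimeoutMicroseconds = 100000)
    {
        int sent = 0;
        while (sent < buffer.Length)
        {
            try
            {
                // Send returns how many bytes were actually accepted.
                sent += socket.Send(buffer, sent, buffer.Length - sent, SocketFlags.None);
            }
            catch (SocketException ex) when (ex.SocketErrorCode == SocketError.WouldBlock)
            {
                // Wait (up to the timeout) for room in the send buffer, then retry.
                socket.Poll(pollTimeoutMicroseconds, SelectMode.SelectWrite);
            }
        }
    }
}
```

Note that this loop keeps the calling thread busy until everything is sent, and it never gives up if the peer stops reading, so a real server would want an overall timeout or would queue the remaining bytes and return to its poll loop instead.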
I was also receiving this exception on sending data and just found the solution.
You get the exception because the socket's send buffer is full. Because you are trying to send the data via a non-blocking send, the exception is raised to tell you that the send could not complete immediately; one way to get the data out is to fall back to a blocking send.
The data is not sent once the exception is raised, so you have to resend it. Your individual send call now becomes:
try
{
m_socket.Send(buffer, bufferSize, SocketFlags.None);
}
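catch (SocketException e)
{
if (e.SocketErrorCode == SocketError.WouldBlock)
{
// Send buffer was full: resend the same data with a blocking call
m_socket.Blocking = true;
m_socket.Send(buffer, bufferSize, SocketFlags.None);
m_socket.Blocking = false;
}
else
{
throw; // don't swallow unrelated socket errors
}
}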
It would also be a good idea to increase the socket's SendBufferSize. By default I think it is 8kb. For my needs I had to increase it to 2MB, and afterwards the Send call no longer threw that exception.
This exception is too general. Per MSDN,
If you receive a SocketException, use the SocketException.ErrorCode property to obtain the specific error code. After you have obtained this code, refer to the Windows Sockets version 2 API error code documentation in the MSDN library for a detailed description of the error.
Sockets error codes are here.