How to programmatically mark an Azure WebJob as failed?

Is there a way to mark a WebJob (triggered, not continuous) as failed, without throwing an exception? I need to check that certain conditions are true to mark the job as successful.

Here is the relevant code from the TriggeredFunctionExecutor class in the Azure WebJobs SDK:
public async Task<FunctionResult> TryExecuteAsync(TriggeredFunctionData input, CancellationToken cancellationToken)
{
    IFunctionInstance instance = _instanceFactory.Create((TTriggerValue)input.TriggerValue, input.ParentId);
    IDelayedException exception = await _executor.TryExecuteAsync(instance, cancellationToken);
    FunctionResult result = exception != null ?
        new FunctionResult(exception.Exception)
        : new FunctionResult(true);
    return result;
}
From this we can see that a WebJob's status depends on whether your WebJob/Function executes without throwing an exception. We can't set the final status of a running WebJob programmatically.
"I need to check that certain conditions are true to mark the job as successful."
Throwing an exception is the only way I found. Alternatively, you could store the WebJob's execution result somewhere else (for example, Azure Table Storage). The ExecutionContext class gives you the current invocation ID. In your WebJob, save the current invocation ID and the status you want to Azure Table Storage; later you can query the status from Table Storage by invocation ID.
public static void ProcessQueueMessage([QueueTrigger("myqueue")] string message, ExecutionContext context, TextWriter log)
{
    log.WriteLine(message);
    SaveStatusToTableStorage(context.InvocationId, "Fail/Success");
}
To use ExecutionContext as a parameter, you need to install the Azure WebJobs SDK Extensions package using NuGet and call the UseCore method before you run your WebJob.
var config = new JobHostConfiguration();
config.UseCore();
var host = new JobHost(config);
host.RunAndBlock();
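SaveStatusToTableStorage is not defined in the answer above. A minimal sketch of what it might look like, assuming the classic WindowsAzure.Storage NuGet package on .NET Framework (where the synchronous table methods are available). The JobStatusEntity and JobStatusStore classes, the "WebJobStatus" table name, and the "webjob-status" partition key are hypothetical names chosen for illustration:

```csharp
using System;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

// Hypothetical entity: one row per WebJob invocation.
public class JobStatusEntity : TableEntity
{
    // Parameterless constructor is needed when entities are read back.
    public JobStatusEntity() { }

    public JobStatusEntity(Guid invocationId, string status)
    {
        PartitionKey = "webjob-status";     // assumed single partition
        RowKey = invocationId.ToString();   // the invocation id identifies the run
        Status = status;
    }

    public string Status { get; set; }
}

// Hypothetical helper that owns the table reference.
public class JobStatusStore
{
    private readonly CloudTable _table;

    public JobStatusStore(string storageConnectionString)
    {
        _table = CloudStorageAccount.Parse(storageConnectionString)
            .CreateCloudTableClient()
            .GetTableReference("WebJobStatus"); // assumed table name
        _table.CreateIfNotExists();
    }

    // Writes (or overwrites) the status row for this invocation.
    public void SaveStatusToTableStorage(Guid invocationId, string status)
    {
        _table.Execute(TableOperation.InsertOrReplace(new JobStatusEntity(invocationId, status)));
    }
}
```

Inside the function you would then call something like store.SaveStatusToTableStorage(context.InvocationId, "Fail"), where store is a JobStatusStore created once at startup with your storage connection string.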

Throwing an unhandled exception will result in a Failed execution.
But I have noticed that it also leads to bad handling of your message: the message is dequeued but not moved to your poison queue as your configuration specifies (though maybe that was due to my SDK version).

@Jean NETR-VALERE The newer versions of the WebJobs packages do act as you say: if an exception is thrown, the job fails and continues to be run over and over until you finally clear your queue. This is absolutely horrible behavior and I have no clue why they changed this.
Yes, they did change it to work this way; I use an older version of the WebJobs package for exactly this reason. About 3 months ago I upgraded to the newer version, and shortly afterwards could not understand why the above behavior was happening. Once I reverted to the older version, it started working correctly again: after failing 5 times, the message is moved to the poison queue and never run again. My point is that if you want the correct (IMO) behavior, see if you can go back to version 1.1.0 and you will be happy. Hope that helps.

To mark a triggered web job as failed you just need to set process exit code to non-zero.
System.Environment.ExitCode = 1;
Throwing an unhandled exception also sets a non-zero exit code, which is how Azure determines failure.
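As a rough sketch of the exit-code approach, assuming the triggered WebJob is a plain console executable (not a function hosted by JobHost): the job does its work, checks its own success conditions, and sets the exit code at the end instead of throwing. The TriggeredJob class, the ConditionsAreMet and ToExitCode helpers, and the placeholder counts are hypothetical:

```csharp
using System;

public static class TriggeredJob
{
    // Hypothetical success condition: every expected item was processed.
    public static bool ConditionsAreMet(int processed, int expected) => processed == expected;

    // 0 means success; any non-zero value marks the triggered run as failed.
    public static int ToExitCode(bool success) => success ? 0 : 1;

    public static void Main()
    {
        // Placeholder values standing in for the job's real work and its results.
        int expected = 5;
        int processed = 3;

        bool success = ConditionsAreMet(processed, expected);
        if (!success)
        {
            Console.Error.WriteLine($"Processed {processed} of {expected} items; marking run as failed.");
        }

        // Main returns void, so the runtime reports Environment.ExitCode when the process ends.
        Environment.ExitCode = ToExitCode(success);
    }
}
```

Note that if Main is declared to return int, its return value is used as the exit code instead, so in that case return the non-zero value directly.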

Related

Defer message in azure function V2: The lock supplied is invalid

In my azure function, at some point I would like to defer my message. But if I do, I get an exception:
[7/30/2020 5:59:02 PM] Message processing error (Action=Complete, ClientId=MessageReceiver1UserCreated/Subscriptions/MySubscription, EntityPath=UserCreated/Subscriptions/MySubscription, Endpoint=xxxxxxxxxxx.servicebus.windows.net)
[7/30/2020 5:59:02 PM] Microsoft.Azure.ServiceBus: The lock supplied is invalid. Either the lock expired, or the message has already been removed from the queue, or was received by a different receiver instance.
This is my code
[FunctionName("UserCreated")]
public static async Task Run(
    [ServiceBusTrigger("UserCreated", "MySubscription", Connection = "ServiceBusConnectionString")]UserCreated userCreated,
    ILogger log,
    string lockToken,
    MessageReceiver messageReceiver)
{
    //some logic.....
    await messageReceiver.DeferAsync(lockToken);
}
Honestly, I have no clue what I am doing wrong. The code examples I found, and the Stack Overflow post "Azure Function V2 Service Bus Message Deferral", did not help me.
I understand that the message is automatically completed after the function completes. So I tried to disable autocomplete, but I did not manage to find a working solution there either.
Using packages:
Microsoft.Azure.WebJobs.Extensions.ServiceBus 4.1.0
(references) Microsoft.Azure.ServiceBus 4.1.1

As the error message states, the message may be losing the lock before reaching the Defer instruction. Try to extend the lock timeout on your service bus. I think it may fix the issue.

Here is a bit of an explanation of what a lock does in a Service Bus queue. According to the error you describe, your lock is expiring before you are able to defer. Auto-renewal should be handled by Functions, but it is not guaranteed, so the best way to tackle this is to extend the maximum duration of the lock.
The easiest way to do this is to go to the Azure portal and find the Service Bus subscription you wish to change. Once you select it, you will see its message lock duration.
By clicking the Change button under the message lock duration, you can modify the duration based on your needs.

Thanks for all answers however none actually explained the real cause.
TL;DR
If you want to complete, defer, abandon or remove the message yourself, you have to disable autocomplete in the host.json file.
Root cause
The error message states:
The lock supplied is invalid. Either the lock expired, or the message has already been removed from the queue, or was received by a different receiver instance.
In my case the message had already been "removed" from the queue, because I had called messageReceiver.DeferAsync(lockToken);
After that statement, when the function returns, the runtime automatically tries to complete the message (which is already deferred), and that call fails because the lock is no longer valid.
Therefore you have to disable autocompletion of the message.
Solution
Disable autocomplete in host.json:
"extensions": {
    "serviceBus": {
        "messageHandlerOptions": {
            "autoComplete": false
        }
    }
}
Be careful
When you disable autocomplete, you are responsible for settling the message yourself. You always have to make a decision (complete, defer, abandon, or remove it); otherwise the message will become available again after the lock times out.
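With autoComplete set to false, the function from the question could settle the message explicitly on every path. A minimal sketch, assuming the same packages as the question (Microsoft.Azure.WebJobs.Extensions.ServiceBus 4.x with Microsoft.Azure.ServiceBus); the UserCreated properties and the IsReadyToProcess check are hypothetical stand-ins for the real message type and logic:

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.ServiceBus.Core;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

// Hypothetical message type; the real one is not shown in the question.
public class UserCreated
{
    public string UserId { get; set; }
}

public static class UserCreatedFunction
{
    // Hypothetical condition deciding whether the message can be handled now.
    private static bool IsReadyToProcess(UserCreated message) =>
        !string.IsNullOrEmpty(message.UserId);

    [FunctionName("UserCreated")]
    public static async Task Run(
        [ServiceBusTrigger("UserCreated", "MySubscription", Connection = "ServiceBusConnectionString")] UserCreated userCreated,
        string lockToken,
        MessageReceiver messageReceiver,
        ILogger log)
    {
        if (IsReadyToProcess(userCreated))
        {
            // ... handle the message, then settle it explicitly,
            // because the runtime no longer completes it for us.
            await messageReceiver.CompleteAsync(lockToken);
            log.LogInformation("Message completed.");
        }
        else
        {
            // Deferring also settles the message; nothing else must touch this lock afterwards.
            await messageReceiver.DeferAsync(lockToken);
            log.LogInformation("Message deferred.");
        }
    }
}
```

Each branch settles the message exactly once, which is the property that matters here: a second settle call on the same lock token produces the "lock supplied is invalid" error described above.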

Azure Service Bus MessageLockLostException when Completing Locked Message

I'm getting a MessageLockLostException when performing a complete operation on Azure Service Bus after performing a long operation of 30 minutes to over an hour. I want this process to scale and be resilient to failures so I keep hold of the Message lock and renew it well within the default lock duration of 1 minute. However when I try to complete the message at the end, even though I can see all the lock renewals have occurred at the correct time I get a MessageLockLostException. I want to scale this up in the future however there is currently only one instance of the application and I can confirm that the message still exists on the Service Bus Subscription after it errors so the problem is definitely around the lock.
Here are the steps I take.
1. Obtain a message and configure a lock
messages = await Receiver.ReceiveAsync(1, TimeSpan.FromSeconds(10)).ConfigureAwait(false);
var message = messages[0];
var messageBody = GetTypedMessageContent(message);
Messages.TryAdd(messageBody, message);
LockTimers.TryAdd(
    messageBody,
    new Timer(
        async _ =>
        {
            if (Messages.TryGetValue(messageBody, out var msg))
            {
                await Receiver.RenewLockAsync(msg.SystemProperties.LockToken).ConfigureAwait(false);
            }
        },
        null,
        TimeSpan.FromSeconds(Config.ReceiverInfo.LockRenewalTimeThreshold),
        TimeSpan.FromSeconds(Config.ReceiverInfo.LockRenewalTimeThreshold)));
2. Perform the long running process
3. Complete the message
internal async Task Complete(T message)
{
    if (Messages.TryGetValue(message, out var msg))
    {
        await Receiver.RenewLockAsync(msg.SystemProperties.LockToken);
        await Receiver.CompleteAsync(msg.SystemProperties.LockToken).ConfigureAwait(false);
    }
}
The code above is a stripped-down version of what's there. I removed some try/catch error handling and logging, but I can confirm that when debugging the issue I can see the timer execute on time. It's just the CompleteAsync call that fails.
Additional info:
Service Bus Topic has Partitioning Enabled
I have tried renewing it at 80% of the threshold (48 seconds), 30% of the threshold (18 seconds) and 10% of the threshold (6 seconds)
I've searched around for an answer and the closest thing I found was this article, but it's from 2016.
I couldn't get it to fail in a standalone console application, so I don't know if it's something I'm doing in my application. I can confirm that the lock renewal occurs for the duration of the processing and returns the correct DateTime for the updated lock. I'd expect CompleteAsync to fail if the lock was truly lost.
I'm using the Microsoft.Azure.ServiceBus nuget package Version="4.1.3"
My Application is Dotnet Core 3.1 and uses a Service Bus Wrapper Package which is written in Dotnet Standard 2.1
The message completes if you don't hold onto it for a long time and occasionally completes even when you do.
Any help or advice on how I could complete my Service Bus message successfully after an hour would be great

The issue here wasn't with my code. It was with partitioning on the Service Bus topic. If you search around, there are some issues on the Microsoft GitHub about completion of messages. The fix I used was the subscription forwarding feature: I forward the message to a new topic with partitioning disabled and read it from that new topic. With the exact same code I was then able to keep the message locked for a long time and still complete it successfully.

Service Bus message abandoned despite WebJobs SDK handler completed successfully

I have implemented a long running process as a WebJob using the WebJobs SDK.
The long running process is awaited because I want the result.
public async Task ProcessMessage([ServiceBusTrigger("queuename")] MyMessage message)
{
    await Run(message.SomeProperty); // takes several minutes
    // I want to do something with the result here later..
}
What I can't figure out is why the message is sometimes abandoned, which of course triggers the handler again. I've tried to debug (locally), setting breakpoints before ProcessMessage finishes, and I can see that it appears to finish successfully.
The Service Bus part of the WebJobs SDK takes care of message lock renewal, so that shouldn't be a problem as far as I've understood.
What am I missing and how do I troubleshoot?

[Edited previously incorrect response]
The WebJobs SDK relies on the automatic lock renewals done by MessageReceiver.OnMessageAsync. These renewals are governed by the OnMessageOptions.AutoRenewTimeout setting, which can be configured like so in the v1.1.0 release of the WebJobs SDK:
JobHostConfiguration config = new JobHostConfiguration();
ServiceBusConfiguration sbConfig = new ServiceBusConfiguration();
sbConfig.OnMessageOptions = new OnMessageOptions
{
    MaxConcurrentCalls = 16,
    AutoRenewTimeout = TimeSpan.FromMinutes(10)
};
config.UseServiceBus(sbConfig);
You can also customize these values via a custom MessageProcessor. See the release notes here for more details on these new features.

Bug with Azure Batch, initializing job object from taskitem

To add a task, as shown in the official tutorial from Microsoft, I have to make a chain of initialization. Here is the code.
var cred = new BatchCredentials(Credentials.AzureBatch.Name, Credentials.AzureBatch.AccountKey);
var batchClient = BatchClient.Connect(Credentials.AzureBatch.Uri, cred);
var workItemManager = batchClient.OpenWorkItemManager();
_job = workItemManager.GetJob(Credentials.AzureBatch.Name, "job-0000000001");
The problem is that execution stops on this line:
_job = workItemManager.GetJob(Credentials.AzureBatch.Name, "job-0000000001");
It then throws an exception with the description {"The remote server returned an error: (404) Not Found."}.
I assume a job with that name was not found on the server. But according to the tutorial, the job is given this name when it is created automatically, together with the work item.
What's wrong?

Your code doesn't show the work item creation part; I assume you have already done that. If not, you need to create the work item first.
Work item and job creation are not synchronized, so it's possible that your work item has been created but the job has not. Just catch the exception and retry until you find the job.
@ccoxton is right that you can download the Batch Explorer from https://code.msdn.microsoft.com/windowsazure/Azure-Batch-Explorer-c1d37768. This should give you a view of what's happening on the server.
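The catch-and-retry suggestion above can be sketched as a small generic helper. This is only an illustration: the Retry class, its UntilSuccess method, and the attempt count and delay are hypothetical, and it retries on any exception rather than only on a 404:

```csharp
using System;
using System.Threading;

public static class Retry
{
    // Calls action until it succeeds or maxAttempts is reached.
    // The exception from the final attempt is allowed to propagate.
    public static T UntilSuccess<T>(Func<T> action, int maxAttempts, TimeSpan delay)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return action();
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // The job may not exist yet; wait a little and try again.
                Thread.Sleep(delay);
            }
        }
    }
}
```

With the code from the question, the call might look like _job = Retry.UntilSuccess(() => workItemManager.GetJob(Credentials.AzureBatch.Name, "job-0000000001"), 10, TimeSpan.FromSeconds(2)); capping the attempts avoids looping forever if the work item was never created.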

Download the Azure Batch Explorer application, and connect your account to it. This will show you the running pools, work items, and jobs. You must have a running work item for that code to work. There could have been a problem with the code you used to create the work item.

Download the Batch Explorer code from here:
https://github.com/Azure/azure-batch-samples/tree/master/CSharp/BatchExplorer

Converting Microsoft EWS StreamingNotification Example to a service

I've been trying to convert Microsoft's EWS Streaming Notification Example to a service
(MS source: http://www.microsoft.com/en-us/download/details.aspx?id=27154).
I tested it as a console app. I then used a generic service template and got it to the point where it would compile, install, and start. It stops after about 10 seconds with the ubiquitous "the service on local computer started and then stopped."
So I went back in, upgraded to C# 2013 Express, and used NLog to add a bunch of trace logging so I could see where it was when it exited.
The last place I can trace it to is the SynchronizeChanges function in the example code:
public static void SynchronizeChanges(FolderId folderId)
{
    logger.Trace("Entering SynchronizeChanges");
    bool moreChangesAvailable;
    do
    {
        logger.Trace("Synchronizing changes...");
        //Console.WriteLine("Synchronizing changes...");
        // Get all changes since the last call. The synchronization cookie is stored in the
        // _SynchronizationState field.
        // Only the ids are requested. Additional properties should be fetched via GetItem
        // calls.
        logger.Trace("Getting changes into var changes.");
        var changes = _ExchangeService.SyncFolderItems(folderId, PropertySet.IdOnly, null, 512,
            SyncFolderItemsScope.NormalItems,
            _SynchronizationState);
        // Update the synchronization cookie
        logger.Trace("Updating _SynchronizationState");
The log file shows the trace message "Getting changes into var changes." but not the "Updating _SynchronizationState" message,
so it never gets past var changes = _ExchangeService.SyncFolderItems.
I cannot for the life of me figure out why it's just exiting. There are many examples of EWS streaming notifications. I have 3 that compile and run just fine, but as far as I can tell nobody has posted an example of it done as a service.

If you don't see the "Updating..." message it's likely the sync threw an exception. Wrap it in a try/catch.
OK, so now that I see the error, this looks like your garden-variety permissions problem. When you ran this as a console app, you likely presented the default credentials to Exchange, which were for your login ID. For a Windows service, if you're running the service with one of the built-in accounts (e.g. Local System), your default credentials will not have access to Exchange.
To rectify, either (1) run the service under the account you did the console app with, or (2) add those credentials to the Exchange Service object.
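Both suggestions in this answer can be sketched together, assuming the EWS Managed API (Microsoft.Exchange.WebServices.Data). The ExchangeConnection class, its method names, the logError callback, and the choice of ExchangeVersion.Exchange2010_SP1 are assumptions for illustration; use the version and logging that match your environment:

```csharp
using System;
using Microsoft.Exchange.WebServices.Data;

public static class ExchangeConnection
{
    // Option (2): give the service object explicit credentials instead of relying on
    // the default credentials of the account the Windows service runs under.
    public static ExchangeService Create(string user, string password, string domain, Uri ewsUrl)
    {
        return new ExchangeService(ExchangeVersion.Exchange2010_SP1) // assumed server version
        {
            Credentials = new WebCredentials(user, password, domain),
            Url = ewsUrl
        };
    }

    // Wraps the sync call so that a failure is logged before the service stops.
    public static ChangeCollection<ItemChange> SyncWithLogging(
        ExchangeService service, FolderId folderId, string syncState, Action<Exception> logError)
    {
        try
        {
            return service.SyncFolderItems(folderId, PropertySet.IdOnly, null, 512,
                SyncFolderItemsScope.NormalItems, syncState);
        }
        catch (Exception ex)
        {
            // For example: ex => logger.Error(ex.ToString()) with the NLog logger from the question.
            logError(ex);
            throw;
        }
    }
}
```

Logging the exception first makes the underlying cause (such as an authorization failure) visible in the log file rather than only as a stopped service.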
