Worker Service mysteriously stops doing work - c#

My good ladies and gentlemen, I recently had my first go at Worker Services in .NET Core 3.1, and only my second go at Windows services in general (the first one was written in .NET Framework and works fine to this day). If anyone could shed some light on what I'm missing in the example below, that would be great.
So, to keep it simple, my problem is this:
My supposedly long-running (forever) Worker Service unexpectedly stops doing work at an arbitrary time of day, but it is still shown as "Running" in Service Manager (that's probably how Windows deals with services). It doesn't happen every day, but every now and then it stops doing work until I manually stop and restart it in Service Manager.
I have also stumbled upon this question, which seemed to deal with my problem, but even after wrapping all of my service's code blocks in try-catch, even at the top level, I still get nothing registered in my Log table, or even in the file I set up to write to if my DB connection fails. The service seems to just stop calling DoWork() from ExecuteAsync().
Ok here's how my code's logically structured, I have excluded implementation and I'm just showing what happens until DoWork is called:
public class Worker : BackgroundService
{
private readonly IConfiguration _configuration;
public Worker(IConfiguration configuration)
{
_configuration = configuration;
}
public override Task StartAsync(CancellationToken cancellationToken)
{
return base.StartAsync(cancellationToken);
}
protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
try
{
while (true)
{
try //paranoid try-catch
{
await DoWork();
await Task.Delay(TimeSpan.FromSeconds(45), stoppingToken);
}
catch (Exception e)
{
await Log(e, customMessage: "Proccess failed at top level.");
}
}
}
catch (Exception e)
{
await Log(e, customMessage: "Proccess failed at topmost level.");
}
}
private async Task DoWork()
{
try
{
}
catch (Exception e)
{
await Log(e);
}
}
public async Task Log(Exception e, string user = null, string emailID = null, string customMessage = null)
{
}
}
As you can see, I am not handling cancellation, as in the question I linked above. Now that I think about it maybe I should, and something is inadvertently sending cancellation? The reason I didn't is because I'm not sure what events exactly signal the cancellation. Only the manual stopping of service, or something else maybe? And if it is the cancellation that was sent that caused my service to stop doing work, shouldn't it also stop my service from running?
Btw I just tested cancellation on dummy service which implements my logic with while(true) and it catches the stopping exception, even though it's a bit awkward, as it catches it and logs it multiple times before stopping, so I presume it may not be the cancellation token that is causing my DoWork not to fire.
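The repeated logging seen in that dummy test is what a while(true) loop with a catch-all would be expected to produce: once the token is cancelled, Task.Delay throws on every iteration, the catch logs it, and the loop goes around again. A minimal sketch of a loop that treats cancellation as a normal exit is shown here. It deliberately avoids BackgroundService so that it is self-contained; the class name, the RunAsync helper, and the console logging are assumptions for illustration, not part of the original service.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class PollingLoopSketch
{
    // Runs doWork repeatedly until stoppingToken is cancelled.
    // Returns the number of iterations that completed without an exception.
    public static async Task<int> RunAsync(
        Func<Task> doWork, TimeSpan interval, CancellationToken stoppingToken)
    {
        int iterations = 0;
        while (!stoppingToken.IsCancellationRequested)
        {
            try
            {
                await doWork();
                iterations++;
            }
            catch (Exception e)
            {
                // Placeholder for the real Log(...) call (assumption).
                Console.Error.WriteLine($"Iteration failed: {e.Message}");
            }

            try
            {
                // The delay runs even after a failed iteration, so a
                // persistent error cannot turn into a tight retry loop.
                await Task.Delay(interval, stoppingToken);
            }
            catch (OperationCanceledException)
            {
                // Expected during shutdown: leave the loop instead of logging.
                break;
            }
        }
        return iterations;
    }
}
```

Inside a BackgroundService, the body of ExecuteAsync could follow the same shape, with DoWork passed as the delegate and the stoppingToken parameter forwarded unchanged.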

OK guys, I've fixed it. See the comment below.
My guess was that the deadlock was probably caused by too many concurrent calls from different threads to the database over the same connection.
Not that I knew that this was the cause (I still don't, and can only guess why this happens, so if someone can clarify why it happens and why the calls don't get queued, please do), but it seemed like a good starting point for a fix.
What I did was just limit possible concurrent calls to 1:
Instantiate SemaphoreSlim on a class level:
private static SemaphoreSlim Semaphore = new SemaphoreSlim(1);
Insert a SemaphoreSlim.WaitAsync before each of my DB calls and its respective SemaphoreSlim.Release in a finally block after the call:
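// Acquire the semaphore before entering try, so that Release() only runs after a successful wait.
await Semaphore.WaitAsync();
try
{
// Await the task first, then convert the scalar result to a string.
var id = (await sqlCommand.ExecuteScalarAsync())?.ToString();
}
finally
{
Semaphore.Release();
}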
I thought this would decrease the performance but to my pleasant surprise I felt no noticeable difference.
Also, I was tempted to set Semaphore's initial count to more than 1 thread but I figured if deadlock happens for many threads, then it might happen for 2-10 threads. Does anyone perhaps know anything more about this number? Is it processor related, SQL related, or perhaps C# related?

Have you implemented a dispose method to close the database connection after finishing the DoWork method? I had a deadlock problem using worker service and realized the database connection wasn’t disposed. After implementing a dispose method, it works for me to solve the problem.

Old question, and I don't know how or even if Manus's issue was eventually resolved, but in my experience, exceptions in threads other than the main thread caused this problem for us in a Windows Service. We didn't see this when running the same code as a Windows Forms application. A try/catch wrapped around the main processing line does not catch them; we had to add it inside the threaded method.

Related

Azure functions - Parallel tasks seems not to run simultaneously

I have a question referencing the usage of concurrently running tasks in Azure Functions, on the consumption plan.
One part of our application allows users to connect their mail accounts and then downloads messages every 15 minutes. We have a single Azure Function that does this for all users. The thing is, as the user count increases, the function needs more time to execute.
In order to mitigate a timeout, I've changed our function logic. You can find some code below. It now creates a separate task for each user and then waits for all of them to finish. There is also some exception handling implemented, but that's not the topic for today.
The problem is that when I check the logs, the tasks appear to have been executed not simultaneously but one after another. Now I wonder whether I made a mistake in my code, or whether Azure Functions simply cannot run tasks this way (I haven't found anything suggesting that on the Microsoft sites; quite the opposite, actually).
PS - I do know about durable functions, however, for some reason I'd like to resolve this issue without them.
My code:
List<Task<List<MailMessage>>> tasks = new List<Task<List<MailMessage>>>();
foreach (var account in accounts)
{
using (var cancellationTokenSource = new CancellationTokenSource(TimeSpan.FromMinutes(6)))
{
try
{
tasks.Add(GetMailsForUser(account, cancellationTokenSource.Token, log));
}
catch (TaskCanceledException)
{
log.LogInformation("Task was cancelled");
}
}
}
try
{
await Task.WhenAll(tasks.ToArray());
}
catch(AggregateException aex)
{
aex.Handle(ex =>
{
TaskCanceledException tcex = ex as TaskCanceledException;
if (tcex != null)
{
log.LogInformation("Handling cancellation of task {0}", tcex.Task.Id);
return true;
}
return false;
});
}
log.LogInformation($"Finished downloading messages.");
private async Task<List<MailMessage>> GetMailsForUser(MailAccount account, CancellationToken cancellationToken, ILogger log)
{
log.LogInformation($"[{account.UserID}] Started downloading data for account {account.EmailAddress}");
IEnumerable<MailMessage> mails;
try
{
using (var client = _mailClientFactory.GetIncomingMailClient(account))
{
mails = client.GetNewest(false);
}
log.LogInformation($"[{account.UserID}] Downloaded {mails.Count()} messages for account {account.EmailAddress}.");
return mails.ToList();
}
catch (Exception ex)
{
log.LogWarning($"[{account.UserID}] Failed to download messages for account {account.EmailAddress}");
log.LogError($"[{account.UserID}] {ex.Message} {ex.StackTrace}");
return new List<MailMessage>();
}
}
Output:
Azure Functions on a consumption plan scale out automatically. The problem is that the load needs to be high enough to trigger the scale-out.
What is probably happening is that scaling is not being triggered, so everything runs on the same instance, and therefore the calls run sequentially.
There is a discussion on this with some code to test it here: https://learn.microsoft.com/en-us/answers/questions/51368/http-triggered-azure-function-not-scaling-to-extra.html
The compiler will give you a warning for GetMailsForUser:
CS1998: This async method lacks 'await' operators and will run synchronously. Consider using the 'await' operator to await non-blocking API calls, or 'await Task.Run(…)' to do CPU-bound work on a background thread.
It's telling you it will run synchronously, which is the behaviour you're seeing. In the warning message there's a couple of recommendations:
Use await. This would be the most ideal solution, since it will reduce the resources your Azure Function uses. However, this means your _mailClientFactory will need to support asynchronous APIs, which may be too much work to take on right now (many SMTP libraries still do not support async).
Use thread pool threads. Task.Run is one option, or you could use PLINQ or Parallel. This solution will consume one thread per account, and you'll eventually hit scaling issues there.
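The second recommendation can be sketched as follows. This is only an illustration under stated assumptions: FetchBlocking is a hypothetical stand-in for the blocking client.GetNewest(false) call, and accounts are reduced to plain strings so that the example is self-contained.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelFetchSketch
{
    // Hypothetical stand-in for a blocking mail download (assumption).
    private static List<string> FetchBlocking(string account)
    {
        Thread.Sleep(300); // simulates blocking network I/O
        return new List<string> { $"message for {account}" };
    }

    // Starts one thread-pool work item per account, then awaits them all.
    public static async Task<List<string>[]> FetchAllAsync(IEnumerable<string> accounts)
    {
        // ToList() forces every Task.Run call to happen before the first await.
        List<Task<List<string>>> tasks = accounts
            .Select(account => Task.Run(() => FetchBlocking(account)))
            .ToList();

        // Results come back in the same order as the input accounts.
        return await Task.WhenAll(tasks);
    }
}
```

Because each blocking call occupies a thread-pool thread for its whole duration, this trades threads for parallelism; with many accounts, some form of throttling would likely be needed.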
If you want to identify which task is running in which Function instance, use the invocation ID: ctx.InvocationId.ToString(). Maybe prefix all your logs with this ID.
Your code isn't written in a way that lets the runtime run it in parallel. See this: Executing tasks in parallel
You can also get more information about the trigger from the trigger metadata, depending on the trigger type. This just gives you more insight into which function is handling which message.

How to handle a deadlock in third-party code

We have a third-party method Foo which sometimes runs into a deadlock for unknown reasons.
We run a single-threaded TCP server and call this method every 30 seconds to check that the external system is available.
To mitigate the deadlock in the third-party code, we put the ping call in a Task.Run so that the server itself does not deadlock.
Like
async Task<bool> WrappedFoo()
{
var timeout = 10000;
var task = Task.Run(() => ThirdPartyCode.Foo());
var delay = Task.Delay(timeout);
if (delay == await Task.WhenAny(delay, task ))
{
return false;
}
else
{
return await task ;
}
}
But this (in our opinion) has the potential to starve the application of free threads: if a call to ThirdPartyCode.Foo deadlocks, its thread never recovers, and if this happens often enough we might run out of resources.
Is there a general approach how one should handle deadlocking third-party code?
A CancellationToken won't work because the third-party-api does not provide any cancellation options.
Update:
The method at hand is from the SAPNCO.dll provided by SAP to establish and test rfc-connections to a sap-system, therefore the method is not a simple network-ping. I renamed the method in the question to avoid further misunderstandings
Is there a general approach how one should handle deadlocking third-party code?
Yes, but it's not easy or simple.
The problem with misbehaving code is that it can not only leak resources (e.g., threads), but it can also indefinitely hold onto important resources (e.g., some internal "handle" or "lock").
The only way to forcefully reclaim threads and other resources is to end the process. The OS is used to cleaning up misbehaving processes and is very good at it. So, the solution here is to start a child process to do the API call. Your main application can communicate with its child process by redirected stdin/stdout, and if the child process ever times out, the main application can terminate it and restart it.
This is, unfortunately, the only reliable way to cancel uncancelable code.
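A minimal sketch of the child-process approach, under stated assumptions: the risky call is assumed to live in a small helper executable (hypothetical, not part of the original question) that reports success through exit code 0. For brevity, the sketch uses the exit code instead of redirected stdin/stdout.

```csharp
using System;
using System.ComponentModel;
using System.Diagnostics;

public static class ChildProcessProbe
{
    // Runs a helper executable and returns true only if it exits with
    // code 0 within timeoutMs. A hung child is killed, which lets the OS
    // reclaim its threads and handles.
    public static bool TryRunWithTimeout(string exePath, string arguments, int timeoutMs)
    {
        var startInfo = new ProcessStartInfo(exePath, arguments)
        {
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (Process process = Process.Start(startInfo))
        {
            if (process == null)
            {
                return false;
            }

            if (process.WaitForExit(timeoutMs))
            {
                return process.ExitCode == 0;
            }

            try
            {
                process.Kill(); // timed out: terminate the stuck child
            }
            catch (InvalidOperationException)
            {
                // The process exited between the timeout and the Kill call.
            }
            catch (Win32Exception)
            {
                // The process could not be terminated; report failure anyway.
            }

            return false;
        }
    }
}
```

The server would then call something like TryRunWithTimeout(helperPath, "", 10000) every 30 seconds, where helperPath points at the hypothetical helper. Starting a process per check is comparatively expensive, so a long-lived child that is restarted only after a timeout, as described above, may be preferable.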
Cancelling a task is a collaborative operation in that you pass a CancellationToken to the desired method and externally you use CancellationTokenSource.Cancel:
public void Caller()
{
try
{
CancellationTokenSource cts=new CancellationTokenSource();
Task longRunning= Task.Run(()=>CancellableThirdParty(cts.Token),cts.Token);
Thread.Sleep(3000); //or condition /signal
cts.Cancel();
}catch(OperationCanceledException ex)
{
//treat somehow
}
}
public void CancellableThirdParty(CancellationToken token)
{
while(true)
{
// token.ThrowIfCancellationRequested() -- if you don't treat the cancellation here
if(token.IsCancellationRequested)
{
// code to treat the cancellation signal
//throw new OperationCanceledException($"[Reason]");
}
}
}
As you can see in the code above, in order to cancel an ongoing task, the method running inside it must be structured around the CancellationToken.IsCancellationRequested flag or simply the CancellationToken.ThrowIfCancellationRequested method,
so that the caller just issues the CancellationTokenSource.Cancel.
Unfortunately, if the third-party code is not designed around CancellationToken (it does not accept a CancellationToken parameter), then there is not much you can do.
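Your code isn't cancelling the blocked operation. You can use a CancellationTokenSource with a timeout and pass its token to Task.Run, but note that this token only prevents the delegate from starting if cancellation has already been requested; it cannot interrupt a call that is already blocked: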
var cts=new CancellationTokenSource(timeout);
try
{
await Task.Run(() => ThirdPartyCode.Ping(),cts.Token);
return true;
}
catch(TaskCanceledException)
{
return false;
}
It's quite possible that blocking is caused due to networking or DNS issues, not actual deadlock.
That still wastes a thread waiting for a network operation to complete. You could use .NET's own Ping.SendPingAsync to ping asynchronously and specify a timeout:
var ping=new Ping();
var reply=await ping.SendPingAsync(ip,timeout);
return reply.Status==IPStatus.Success;
The PingReply class contains far more detailed information than a simple success/failure. The Status property alone differentiates between routing problems, unreachable destinations, time outs etc

Task gets stuck in "[Scheduled and Waiting to Run]"

I've run into an issue with tasks I can't seem to figure out. This application makes repeated HTTP calls via WebClient to several servers. It maintains a dictionary of tasks that are running the HTTP calls, and every five seconds it checks for results, then once the results are in it makes an HTTP call again. This goes on for the lifetime of the application.
Recently, it has started having a problem where the tasks are randomly getting stuck in WaitingForActivation. In the debugger the task shows as "[Scheduled and waiting to run]", but it never runs.
This is the function that it's running, when I click on the "Scheduled" task in the debugger, it points to the DownloadStringTaskAsync() line:
private static async Task<string> DownloadString(string url)
{
using (var client = new WebClient()) {
try {
var result = await client.DownloadStringTaskAsync(url).ConfigureAwait(false);
return result;
} catch (WebException) {
return null;
}
}
}
The code that is actually creating the task that runs the above function is this. It only hits this line once the existing task is completed, Task.IsCompleted never returns true since it's stuck in scheduled status. Task.Status gets stuck in WaitingForActivation.
tasks[resource] = Task.Run(() => DownloadString("http://" + resources[resource] + ":8181/busy"));
The odd thing about this is that, as far as I can tell, this code ran perfectly fine for two years, until we recently did a server migration which included an upgraded OS and spawned a network packet loss issue. That's when we started noticing this particular problem, though I don't see how either of those would be related.
Also, this tends to only happen after the application has been running for several thousand seconds. It runs perfectly fine for a while until tasks, one-by-one, start getting stuck. After about a day, there's usually four or five tasks stuck in scheduled. Since it usually takes time for the first task to get stuck, that seems to me like there would be a race condition of some sort, but I don't see how that could be the case.
Is there a reason a task would get stuck in scheduled and never actually run?
I'm not familiar with the ancient WebClient (maybe it contains bugs), but I can suggest the way Microsoft recommends for getting a response from a server: System.Net.Http.HttpClient. HttpClient also handles multiple requests per endpoint faster, especially in .NET Core/.NET 5.
// HttpClient is intended to be instantiated once per application, rather than per-use
private static readonly HttpClient client = new HttpClient();
private static async Task<string> DownloadString(string url)
{
try
{
return await client.GetStringAsync(url).ConfigureAwait(false);
}
catch (HttpRequestException ex)
{
Debug.WriteLine(ex.Message);
return null;
}
}
Also remove Task.Run; it is redundant here.
tasks[resource] = DownloadString($"http://{resources[resource]}:8181/busy");
Read the "Asynchronous programming" article. You need to understand the difference between I/O-bound and CPU-bound work, and not spawn threads unless you specifically need concurrency. You don't need a thread here.

Creating a c# windows service to poll a database

I am wanting to write a service that polls a database and performs an operation depending on the data being brought back.
I am not sure what is the best way of doing this, I can find a few blogs about it and this stack overflow question Polling Service - C#. However I am wary that they are all quite old and possibly out of date.
Can anyone advise me on the current advice or best practices (if there are any) on doing something like this or point me in the direction of a more recent blog post about this. From what I can gather either using a timer or tpl tasks are two potential ways of doing this.
If timers are still suggested, how will they behave when the service is stopped? The operations I intend these services to perform could take 30+ minutes. This is why I lean towards tasks: I can use a task cancellation token. However, these throw exceptions when cancelled (correct me if I am wrong), and I don't think I really want that behaviour (although correct me if you think there is a reason I would want it).
Sorry that I may be asking quite a lot in a single question but I'm not entirely sure myself what I am asking.
Go with a Windows service for this. Using a scheduled task is not a bad idea per se, but since you said the polls can occur every 2 minutes, you are probably better off going with the service. The service will allow you to maintain state between polls, and you will have more control over the timing of the polls as well. You said the operation might take 30+ minutes once it is kicked off, so you may want to defer polls until the operation completes. That is a bit easier to do when the logic is run as a service.
In the end it does not really matter what mechanism you use to generate the polls. You could use a timer or a dedicated thread/task that sleeps, or whatever. Personally, I find the dedicated thread/task easier to work with than a timer for these kinds of things because it is easier to control the polling interval. Also, you should definitely use the cooperative cancellation mechanism provided with the TPL. It does not necessarily throw exceptions. It only does so if you call ThrowIfCancellationRequested. You can use IsCancellationRequested instead to just check the cancellation token's state.
Here is a very generic template you might use to get started.
public class YourService : ServiceBase
{
private CancellationTokenSource cts = new CancellationTokenSource();
private Task mainTask = null;
protected override void OnStart(string[] args)
{
mainTask = new Task(Poll, cts.Token, TaskCreationOptions.LongRunning);
mainTask.Start();
}
protected override void OnStop()
{
cts.Cancel();
mainTask.Wait();
}
private void Poll()
{
CancellationToken cancellation = cts.Token;
TimeSpan interval = TimeSpan.Zero;
while (!cancellation.WaitHandle.WaitOne(interval))
{
try
{
// Put your code to poll here.
// Occasionally check the cancellation state.
if (cancellation.IsCancellationRequested)
{
break;
}
interval = WaitAfterSuccessInterval;
}
catch (Exception caught)
{
// Log the exception.
interval = WaitAfterErrorInterval;
}
}
}
}
Like I said, I normally use a dedicated thread/task instead of a timer. I do this because my polling interval is almost never constant. I usually start slowing the polls down if a transient error is detected (like network or server availability issues) that way my log file does not fill up with the same error message over and over again in rapid succession.
You have a few options. To start with what could be essentially the easiest option, you could decide to create your app as a console application and run the executable as a task in the Windows Task Scheduler. All you would need to do is assign your executable as the program to start in the task and have the task scheduler handle the timing interval for you. This is probably the preferred way if you don't care about state and will prevent you from having to worry about creating and managing a windows service if you don't really need to. See the following link for how to use the scheduler.
Windows Task Scheduler
The next way you could do this would be to create a Windows service and use a timer in that service, specifically System.Timers.Timer. You set the timer's interval to the amount of time you would like to pass before your process runs. Then you subscribe to the timer's Elapsed event, which fires every time that interval elapses. In the event handler you run your process; it can kick off additional threads if you like. After that initial setup, you call the timer's Start() method or set the Enabled property to true to start the timer. A good example of what this looks like can be found on the MSDN page describing the class. There are plenty of tutorials that show how to set up a Windows service, so I won't go into that specifically.
MSDN: System.Timers.Timer
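A rough sketch of the timer approach follows. The wrapper class and its names are assumptions for illustration. It also makes one design choice that the answer does not prescribe: AutoReset is set to false and the timer is re-armed only after the poll finishes, so a slow poll cannot overlap with the next one.

```csharp
using System;

public sealed class PollingTimerSketch : IDisposable
{
    private readonly System.Timers.Timer _timer;
    private readonly Action _poll;
    private volatile bool _stopped = true;

    public PollingTimerSketch(TimeSpan interval, Action poll)
    {
        _poll = poll;
        // AutoReset = false: Elapsed fires once per Start() call.
        _timer = new System.Timers.Timer(interval.TotalMilliseconds) { AutoReset = false };
        _timer.Elapsed += OnElapsed;
    }

    public void Start()
    {
        _stopped = false;
        _timer.Start();
    }

    public void Stop()
    {
        _stopped = true;
        _timer.Stop();
    }

    private void OnElapsed(object sender, System.Timers.ElapsedEventArgs e)
    {
        try
        {
            _poll();
        }
        catch (Exception ex)
        {
            // Placeholder for real logging (assumption).
            Console.Error.WriteLine($"Poll failed: {ex.Message}");
        }
        finally
        {
            // Re-arm only after the poll has finished and only while running.
            if (!_stopped)
            {
                _timer.Start();
            }
        }
    }

    public void Dispose()
    {
        Stop();
        _timer.Dispose();
    }
}
```

In a service, Start() would typically be called from OnStart and Dispose() from OnStop. A poll that is already in progress is not interrupted by Stop(), so a long-running poll still needs its own cancellation check.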
Finally and more complex would be to set up a windows service that listens for a SqlDependency. This technique is useful if things can occur in the database outside your application yet you need to be made aware of it in your application or some other service. The following link has a good tutorial on how to set up a SqlDependency in an application.
Using SqlDependency To Monitor SQL Database Changes
Two things I would like to point out from your original post that are not specific to the question you had.
If you are writing a true windows service you don't want the service to stop. The service should be running constantly and if an exception does occur it should be handled appropriately and not stop the service.
A cancellation token doesn't have to throw an exception. The exception is only thrown if you call ThrowIfCancellationRequested(); if you don't call it, you can check IsCancellationRequested in your threads instead and return from the thread gracefully when cancellation has been requested. (The bool argument of CancellationTokenSource.Cancel only controls how exceptions thrown by registered callbacks are propagated.)
For example:
CancellationTokenSource cts = new CancellationTokenSource();
ParallelOptions options = new ParallelOptions
{
CancellationToken = cts.Token
};
Parallel.ForEach(data, options, i =>
{
try
{
if (cts.IsCancellationRequested) return;
//do stuff
}
catch (Exception ex)
{
cts.Cancel(false);
}
});

Fire and forget async method in ASP.NET MVC

The general answer to fire-and-forget questions, such as here and here, is not to use async/await, but to use Task.Run or TaskFactory.StartNew and pass in the synchronous method instead. However, sometimes the method that I want to fire and forget is async and there is no equivalent sync method.
Update Note/Warning: As Stephen Cleary pointed out below, it is dangerous to continue working on a request after you have sent the response. The reason is because the AppDomain may be shut down while that work is still in progress. See the link in his response for more information. Anyways, I just wanted to point that out upfront, so that I don't send anyone down the wrong path.
I think my case is valid because the actual work is done by a different system (different computer on a different server) so I only need to know that the message has left for that system. If there is an exception there is nothing that the server or user can do about it and it does not affect the user, all I need to do is refer to the exception log and clean up manually (or implement some automated mechanism). If the AppDomain is shut down I will have a residual file in a remote system, but I will pick that up as part of my usual maintenance cycle and since its existence is no longer known by my web server (database) and its name is uniquely timestamped, it will not cause any issues while it still lingers.
It would be ideal if I had access to a persistence mechanism as Stephen Cleary pointed out, but unfortunately I don't at this time.
I considered just pretending that the DeleteFoo request has completed fine on the client side (javascript) while keeping the request open, but I need information in the response to continue, so it would hold things up.
So, the original question...
for example:
//External library
public async Task DeleteFooAsync();
In my asp.net mvc code I want to call DeleteFooAsync in a fire-and-forget fashion - I don't want to hold up the response waiting for DeleteFooAsync to complete. If DeleteFooAsync fails (or throws an exception) for some reason, there is nothing that the user or the program can do about it so I just want to log an error.
Now, I know that any exceptions will result in unobserved exceptions, so the simplest case I can think of is:
//In my code
Task deleteTask = DeleteFooAsync();
//In my App_Start
TaskScheduler.UnobservedTaskException += ( sender, e ) =>
{
m_log.Debug( "Unobserved exception! This exception would have been unobserved: {0}", e.Exception );
e.SetObserved();
};
Are there any risks in doing this?
The other option that I can think of is to make my own wrapper such as:
private async Task DeleteFooWrapperAsync()
{
try
{
await DeleteFooAsync();
}
catch(Exception exception )
{
m_log.Error("DeleteFooAsync failed: " + exception.ToString());
}
}
and then call that with TaskFactory.StartNew (probably wrapping in an async action). However this seems like a lot of wrapper code each time I want to call an async method in a fire-and-forget fashion.
My question is, what is the correct way to call an async method in a fire-and-forget fashion?
UPDATE:
Well, I found that the following in my controller (note that the controller action needs to be async because there are other async calls that are awaited):
[AcceptVerbs( HttpVerbs.Post )]
public async Task<JsonResult> DeleteItemAsync()
{
Task deleteTask = DeleteFooAsync();
...
}
caused an exception of the form:
Unhandled Exception: System.NullReferenceException: Object reference
not set to an instance of an object. at System.Web.ThreadContext.AssociateWithCurrentThread(Boolean setImpersonationContext)
This is discussed here and seems to be to do with the SynchronizationContext and 'the returned Task was transitioned to a terminal state before all async work completed'.
So, the only method that worked was:
Task foo = Task.Run( () => DeleteFooAsync() );
My understanding of why this works is that Task.Run runs DeleteFooAsync on a thread-pool thread, outside the request's SynchronizationContext.
Sadly, Scott's suggestion below does not work for handling exceptions in this case, because foo is not a DeleteFooAsync task anymore, but rather the task from Task.Run, so does not handle the exceptions from DeleteFooAsync. My UnobservedTaskException does eventually get called, so at least that still works.
So, I guess the question still stands, how do you do fire-and-forget an async method in asp.net mvc?
First off, let me point out that "fire and forget" is almost always a mistake in ASP.NET applications. "Fire and forget" is only an acceptable approach if you don't care whether DeleteFooAsync actually completes.
If you're willing to accept that limitation, I have some code on my blog that will register tasks with the ASP.NET runtime, and it accepts both synchronous and asynchronous work.
You can write a one-time wrapper method for logging exceptions as such:
private async Task LogExceptionsAsync(Func<Task> code)
{
try
{
await code();
}
catch(Exception exception)
{
m_log.Error("Call failed: " + exception.ToString());
}
}
And then use the BackgroundTaskManager from my blog as such:
BackgroundTaskManager.Run(() => LogExceptionsAsync(() => DeleteFooAsync()));
Alternatively, you can keep TaskScheduler.UnobservedTaskException and just call it like this:
BackgroundTaskManager.Run(() => DeleteFooAsync());
As of .NET 4.5.2, you can do the following
HostingEnvironment.QueueBackgroundWorkItem(async cancellationToken => await LongMethodAsync());
But it only works within an ASP.NET app domain:
The HostingEnvironment.QueueBackgroundWorkItem method lets you
schedule small background work items. ASP.NET tracks these items and
prevents IIS from abruptly terminating the worker process until all
background work items have completed. This method can't be called
outside an ASP.NET managed app domain.
More here: https://msdn.microsoft.com/en-us/library/ms171868(v=vs.110).aspx#v452
The best way to handle it is use the ContinueWith method and pass in the OnlyOnFaulted option.
private void button1_Click(object sender, EventArgs e)
{
var deleteFooTask = DeleteFooAsync();
deleteFooTask.ContinueWith(ErrorHandler, TaskContinuationOptions.OnlyOnFaulted);
}
private void ErrorHandler(Task obj)
{
MessageBox.Show(String.Format("Exception happened in the background of DeleteFooAsync.\n{0}", obj.Exception));
}
public async Task DeleteFooAsync()
{
await Task.Delay(5000);
throw new Exception("Oops");
}
Where I put my message box you would put your logger.
