I have a headless UWP app that is supposed to get data from a sensor every 10 minutes and send it to the cloud.
When I run the code from the headless app on the Raspberry Pi, it stops measuring after 3 or 4 hours without logging any error (and I log a lot). It is exactly 3 or 4 hours: if I start the app at 8, it just stops at 11 or 12.
It looks like execution is stopped entirely, because I have a cancellation token in place that worked well in tests but no longer fires here. The App manager in the Device Portal still shows the app as running.
I also noticed in the Performance page in the Device Portal that the memory goes down by about 8 MB during the measurements.
The strange thing is that I ran the same code from a headed app on the RPi and on a laptop and it went very well. It worked continuously for over 16 hours until I stopped it. On both the laptop and the RPi there was no memory issue, the app used the same amount of RAM over the whole period.
What could cause this behavior when running as a headless app?
Here is how I call the code from the headless app:
BackgroundTaskDeferral deferral;
private ISettingsReader settings;
private ILogger logger;
private IFlowManager<PalmSenseMeasurement> flow;
private IServiceProvider services;
IBackgroundTaskInstance myTaskInstance;
public async void Run(IBackgroundTaskInstance taskInstance)
{
taskInstance.Canceled += TaskInstance_Canceled;
deferral = taskInstance.GetDeferral();
myTaskInstance = taskInstance;
try
{
SetProperties();
var flowTask = flow.RunFlowAsync();
await flowTask;
}
catch (Exception ex)
{
logger.LogCritical("#####---->Exception occurred in StartupTask (Run): {0}", ex.ToString());
}
}
private void SetProperties()
{
services = SensorHubContainer.Services;
settings = services.GetService<ISettingsReader>();
flow = services.GetService<IFlowManager<PalmSenseMeasurement>>();
logger = services.GetService<ILogger<StartupTask>>();
}
private void TaskInstance_Canceled(IBackgroundTaskInstance sender, BackgroundTaskCancellationReason reason)
{
logger.LogDebug("StartupTask.TaskInstance_Canceled() - {0}", reason.ToString());
deferral.Complete();
}
And here is how I call the code from the headed app:
private async Task GetMeasurementsAsync()
{
try
{
flow = services.GetService<IFlowManager<PalmSenseMeasurement>>();
await flow.RunFlowAsync();
}
catch (Exception ex)
{
Measurements.Add(new MeasurementResult() { ErrorMessage = ex.Message });
}
}
The RunFlowAsync method looks like this:
public async Task RunFlowAsync()
{
var loopInterval = settings.NoOfSecondsForLoopInterval;
while (true)
{
try
{
logger.LogInformation("Starting a new loop in {0} seconds...", loopInterval);
//check for previous unsent files
await resender.TryResendMeasuresAsync();
await Task.Delay(TimeSpan.FromSeconds(loopInterval));
await DoMeasureAndSend();
logger.LogInformation("Loop finished");
}
catch (Exception ex)
{
logger.LogError("Error in Flow<{0}>! Error {1}", typeof(T).FullName, ex);
#if DEBUG
Debug.WriteLine(ex.ToString());
#endif
}
}
}
The problem was from a 3rd party library that I had to use and it had to be called differently from a headless app.
Internally it was creating its own TaskScheduler if SynchronizationContext.Current was null.
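The answer does not show how the library had to be called. As a rough illustration only, one way to make sure SynchronizationContext.Current is not null on the calling thread is to install a context before invoking the library. The sketch below is a minimal single-threaded pump. The names QueueSynchronizationContext and ContextRunner are hypothetical, and it assumes the library only checks whether SynchronizationContext.Current is null. It is not necessarily what the author did.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: queues posted callbacks and runs them on the thread
// that calls RunOnCurrentThread.
public sealed class QueueSynchronizationContext : SynchronizationContext
{
    private readonly BlockingCollection<KeyValuePair<SendOrPostCallback, object>> queue =
        new BlockingCollection<KeyValuePair<SendOrPostCallback, object>>();

    // Await continuations captured under this context arrive here.
    public override void Post(SendOrPostCallback d, object state)
    {
        queue.Add(new KeyValuePair<SendOrPostCallback, object>(d, state));
    }

    // Lets the pump loop exit once the queue has drained.
    public void Complete()
    {
        queue.CompleteAdding();
    }

    // Blocks the calling thread and executes queued callbacks until Complete is called.
    public void RunOnCurrentThread()
    {
        foreach (var item in queue.GetConsumingEnumerable())
        {
            item.Key(item.Value);
        }
    }
}

public static class ContextRunner
{
    // Runs asyncWork with a non-null SynchronizationContext.Current and
    // blocks the calling thread until the returned task has finished.
    public static void Run(Func<Task> asyncWork)
    {
        var previous = SynchronizationContext.Current;
        var context = new QueueSynchronizationContext();
        SynchronizationContext.SetSynchronizationContext(context);
        try
        {
            Task task = asyncWork();
            // Stop pumping when the work is done, whatever its outcome.
            task.ContinueWith(_ => context.Complete(), TaskScheduler.Default);
            context.RunOnCurrentThread();
            // Rethrow any exception from the work.
            task.GetAwaiter().GetResult();
        }
        finally
        {
            SynchronizationContext.SetSynchronizationContext(previous);
        }
    }
}
```

In a background task this would wrap the call into the library, for example ContextRunner.Run(() => flow.RunFlowAsync()). Whether that is appropriate depends on what the library expects from the context.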
I'm able to successfully run a .NET 5 Console Application with a BackgroundService in an Azure Kubernetes cluster on Ubuntu 18.04. In fact, the BackgroundService is all that really runs: just grabs messages from a queue, executes some actions, then terminates when Kubernetes tells it to stop, or the occasional exception.
It's this last scenario which is giving me problems. When the BackgroundService hits an unrecoverable exception, I'd like the container to stop (complete, or whatever state will cause Kubernetes to either restart or destroy/recreate the container).
Unfortunately, any time an exception is encountered, the BackgroundService appears to hit the StopAsync() function (from what I can see in the logs and console output), but the container stays in a running state and never restarts. My Main() is as appears below:
public static async Task Main(string[] args)
{
// Build service host and execute.
var host = CreateHostBuilder(args)
.UseConsoleLifetime()
.Build();
// Attach application event handlers.
AppDomain.CurrentDomain.ProcessExit += OnProcessExit;
AppDomain.CurrentDomain.UnhandledException += new UnhandledExceptionEventHandler(OnUnhandledException);
try
{
Console.WriteLine("Beginning WebSec.Scanner.");
await host.StartAsync();
await host.WaitForShutdownAsync();
Console.WriteLine("WebSec.Scanner has completed.");
}
finally
{
Console.WriteLine("Cleaning up...");
// Ensure host is properly disposed.
if (host is IAsyncDisposable ad)
{
await ad.DisposeAsync();
}
else if (host is IDisposable d)
{
d.Dispose();
}
}
}
If relevant, those event handlers for ProcessExit and UnhandledException exist to flush the AppInsights telemetry channel (maybe that's blocking it?):
private static void OnProcessExit(object sender, EventArgs e)
{
// Ensure AppInsights logs are submitted upstream.
Console.WriteLine("Flushing logs to AppInsights");
TelemetryChannel.Flush();
}
private static void OnUnhandledException(object sender, UnhandledExceptionEventArgs e)
{
var thrownException = (Exception)e.ExceptionObject;
Console.WriteLine("Unhandled exception thrown: {0}", thrownException.Message);
// Ensure AppInsights logs are submitted upstream.
Console.WriteLine("Flushing logs to AppInsights");
TelemetryChannel.Flush();
}
I am only overriding ExecuteAsync() in the BackgroundService:
protected async override Task ExecuteAsync(CancellationToken stoppingToken)
{
this.logger.LogInformation(
"Service started.");
try
{
// Loop until the service is terminated.
while (!stoppingToken.IsCancellationRequested)
{
// Do some work...
}
}
catch (Exception ex)
{
this.logger.LogWarning(
ex,
"Terminating due to exception.");
}
this.logger.LogInformation(
"Service ending.");
}
My Dockerfile is simple and has this line to run the service:
ENTRYPOINT ["dotnet", "MyService.dll"]
Am I missing something obvious? I feel like there's something about running this as a Linux container that I'm forgetting in order to make this run properly.
Thank you!
Here is a full example of how to use IHostApplicationLifetime.StopApplication().
void Main()
{
var host = Host.CreateDefaultBuilder()
.ConfigureServices((context, services) =>
{
services.AddHostedService<MyService>();
})
.Build();
Console.WriteLine("Starting service");
host.Run();
Console.WriteLine("Ended service");
}
// You can define other methods, fields, classes and namespaces here
public class MyService : BackgroundService
{
private readonly IHostApplicationLifetime _lifetime;
private readonly Random _rnd = new Random();
public MyService(IHostApplicationLifetime lifetime)
{
_lifetime = lifetime;
}
protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
try
{
while (true)
{
stoppingToken.ThrowIfCancellationRequested();
var nextNumber = _rnd.Next(10);
if (nextNumber < 8)
{
Console.WriteLine($"We have number {nextNumber}");
}
else
{
throw new Exception("Number too high");
}
await Task.Delay(1000);
}
}
// If the application is shutting down, ignore it
catch (OperationCanceledException e) when (e.CancellationToken == stoppingToken)
{
Console.WriteLine("Application is shutting itself down");
}
// Otherwise, we have a real exception, so must ask the application
// to shut itself down.
catch (Exception e)
{
Console.WriteLine("Oh dear. We have an exception. Let's end the process.");
// Signal to the OS that this was an error condition by
// setting the exit code.
Environment.ExitCode = 1;
_lifetime.StopApplication();
}
}
}
Typical output from this program will look like:
Starting service
We have number 0
info: Microsoft.Hosting.Lifetime[0]
Application started. Press Ctrl+C to shut down.
info: Microsoft.Hosting.Lifetime[0]
Hosting environment: Production
info: Microsoft.Hosting.Lifetime[0]
Content root path: C:\Users\rowla\AppData\Local\Temp\LINQPad6\_spgznchd\shadow-1
We have number 2
Oh dear. We have an exception. Let's end the process.
info: Microsoft.Hosting.Lifetime[0]
Application is shutting down...
Ended service
When using Microsoft.Azure.Devices.Client.DeviceClient on .NET Framework 4.8, closing the application leaves multiple threads running, specifically in DotNetty.Common.dll!DotNetty.Common.Concurrency.SingleThreadEventExecutor.PollTask.
Versions 1.34.0 & 1.35.0 of Microsoft.Azure.Devices have this same problem.
Are we using DeviceClient improperly?
Is it an async thing I'm not understanding?
Am I missing a call to shut it down properly?
From examples online, I shouldn't have to do anything special and it should close itself out.
However, it still hangs. The code below is close to our implementation. I have not yet made a stand-alone app, so I haven't reproduced this problem with only the DeviceClient code running.
When the program exits, is_running is set to false, and the program closes down other threads. Eventually we call
Environment.Exit(0);
This should be all the relevant code
private async void thread_method() // async is required because the body uses await
{
using (var _deviceClient = DeviceClient.CreateFromConnectionString(connection, TransportType.Mqtt))
{
while (is_running)
{
var db = new Database(); // roughly an open entity framework connection
List <class> unprocessed_messages = db.GetUnprocessed();
List<List<Messages>> processed = breakup_method(unprocessed_messages);
foreach (var sublist in processed)
{
if (!await SendMessages(sublist , _deviceClient))
break;
// the processed sublist was successful
db.SaveChanges(); // make sure we dont send again
}
}
Thread.Sleep(500);
await _deviceClient.CloseAsync();
}
}
private async Task<bool> SendMessages(List<Message> messages, DeviceClient _deviceClient)
{
try
{
CancellationTokenSource cancellationTokenSource = new CancellationTokenSource(5000);
CancellationToken cancellationToken = cancellationTokenSource.Token;
await _deviceClient.SendEventBatchAsync(messages, cancellationToken);
if (cancellationToken.IsCancellationRequested)
return false;
return true;
}
catch (Exception e)
{
// logging
}
return false;
}
A different approach, which doesn't actively send anything:
just open, sleep until the program exits, then close,
all in a using statement.
8 threads are still running PollTask. I was waiting for them to close during the whole time it took me to write up everything above, which was at least 5 minutes.
private async void thread_method() // async is required because the body uses await
{
using (var _deviceClient = DeviceClient.CreateFromConnectionString(connection, TransportType.Mqtt))
{
await _deviceClient.OpenAsync();
while (is_running) Thread.Sleep(500);
await _deviceClient.CloseAsync();
}
}
Last update: a stand-alone console app shows the same behavior.
So it is 100% not my problem.
// Repost just in case
class Program
{
private static string _connection_string = $"HostName={url};DeviceId={the_id};SharedAccessKey={key}"; // fill in your own values
public static bool is_running = false;
static void Main(string[] args)
{
is_running = true;
new System.Threading.Thread(new System.Threading.ThreadStart(thread_method)).Start();
Console.WriteLine("enter to exit");
String line = Console.ReadLine();
is_running = false;
}
public static async void thread_method()
{
using (var _deviceClient = DeviceClient.CreateFromConnectionString(_connection_string, TransportType.Mqtt))
{
await _deviceClient.OpenAsync();
while (is_running) System.Threading.Thread.Sleep(500);
await _deviceClient.CloseAsync();
}
}
}
https://github.com/Azure/azure-sdk-for-net/issues/24550
Bumped to the proper location
https://github.com/Azure/azure-iot-sdk-csharp/issues/2194
It was not a configuration issue: a DotNetty bug was causing the hang.
The fix is to get a newer Azure SDK version, Microsoft.Azure.Devices > 1.35.0.
I'm trying to write a winforms app that retrieves a list of ids from a SOAP web service, then goes through that list, calling another web service to get individual details. To speed things up I'm doing these requests in parallel, in batches of 15.
When the web service calls all return OK, my application works great. However, in my testing, if 2 or more requests time out (I'm monitoring the requests using Fiddler and can see I'm getting a 408), the process appears to hang and I never get the completion message my program should display.
What I'd like to do if possible is retry these requests, or just tell the user to press the button again because an error occurred.
When adding the service reference, I made sure the generate task-based operations option was used.
Here is the code I've got so far (with some details renamed so as not to give away the company I'm interfacing with):
private async void btnDownload_Click(object sender, EventArgs e)
{
try
{
Stopwatch watch = new Stopwatch();
watch.Start();
IProgress<string> progress = new Progress<string>(s =>
{
lbinfo.Items.Insert(0, s);
});
await GetData(progress);
watch.Stop();
progress.Report("Took " + (watch.ElapsedMilliseconds / 1000f) + " seconds. All done.");
}
catch(Exception ex)
{
MessageBox.Show("Main Click: "+ex.ToString());
}
}
private async Task GetData(IProgress<string> progress)
{
try
{
DownloadSoapClient client = new DownloadSoapClient();
client.Open();
progress.Report("Getting master list.");
List<int> uniqueIds = await GetMasterList(client); //this always seems to work
progress.Report("Downloaded master list. Found "+uniqueIds.Count +" unique ids.");
var detailedData = await GetIdDataRaw(uniqueIds, client,progress);
client.Close();
}
catch (Exception ex)
{
progress.Report("GetData: "+ex);
}
}
private async Task<List<DownloadResponse>> GetIdDataRaw(List<int> ids, DownloadSoapClient client, IProgress<string> progress)
{
using (var throttler = new SemaphoreSlim(15))
{
var allTasks = ids.Select(async x =>
{
await throttler.WaitAsync();
try
{
progress.Report(x.ToString());
return await client.DownloadAsync(username, password, x);
}
catch (Exception ex)
{
progress.Report("Error getting id:" + x + " " + ex.Message);
return null;
}
finally
{
throttler.Release();
}
});
return (await Task.WhenAll(allTasks)).ToList();
}
}
Typically the master list is around 1000 entries, and when only 1 times out, as expected I get a message saying there was an error getting id xxxx, and then "Took xx seconds. All done.".
So far I've tried a few other things, such as creating a client for each request, and refactoring to use Task.WhenAny in a while loop, but these exhibit the same behaviour when 2 or more requests fail.
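For the retry part of the question, a per-request timeout plus a bounded retry loop can be sketched as below. This is only an illustration. RetryHelper and WithTimeoutAndRetryAsync are hypothetical names, and the helper does not explain why the original code hangs. An attempt that times out is abandoned rather than aborted, so the underlying call may keep running in the background.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class RetryHelper
{
    // Runs 'operation' up to maxAttempts times. An attempt counts as failed
    // if it throws or does not finish within 'timeout'.
    public static async Task<T> WithTimeoutAndRetryAsync<T>(
        Func<Task<T>> operation, TimeSpan timeout, int maxAttempts)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        Exception lastError = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            using (var timerCancellation = new CancellationTokenSource())
            {
                Task<T> work = operation();
                Task timer = Task.Delay(timeout, timerCancellation.Token);
                Task finished = await Task.WhenAny(work, timer).ConfigureAwait(false);

                if (finished == work)
                {
                    timerCancellation.Cancel(); // the timer is no longer needed
                    try
                    {
                        return await work.ConfigureAwait(false);
                    }
                    catch (Exception ex)
                    {
                        lastError = ex; // fall through to the next attempt
                    }
                }
                else
                {
                    // Observe a late failure so it does not surface as an unobserved exception.
                    _ = work.ContinueWith(t => { var ignored = t.Exception; },
                        TaskContinuationOptions.OnlyOnFaulted);
                    lastError = new TimeoutException("Attempt " + attempt + " timed out.");
                }
            }
        }
        throw new InvalidOperationException("All " + maxAttempts + " attempts failed.", lastError);
    }
}
```

Inside the Select lambda the call could then become return await RetryHelper.WithTimeoutAndRetryAsync(() => client.DownloadAsync(username, password, x), TimeSpan.FromSeconds(30), 3);. The 30 second timeout and 3 attempts are arbitrary example values.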
I'm trying to use the following code to set up a failure condition, namely one where there is no network path available, so the code shouldn't be able to send to the Service Bus at all. I know this because I disable my network ports when I test.
I am still having trouble with the async nature of the code, though. In a console application like mine, I don't know how to attach something that would log the exception that I know should be generated.
How do I see that exception text?
public async Task TestQueueExists()
{
_queueClient = new QueueClient(AppSettings.McasServiceBusConnectionString,
AppSettings.ListServSyncQueueName);
Logger.Information(
$"Queue Created to: {_queueClient.QueueName} with ReceiveMode: {_queueClient.ReceiveMode}");
try
{
await _queueClient.SendAsync(new Message("Test".ToUtf8Bytes()));
}
catch (Exception e)
{
Console.WriteLine(e);
throw;
}
}
From your code, I assume you are using the Azure Service Bus .NET Standard client library Microsoft.Azure.ServiceBus. In my test, the following code captured the exception:
try
{
await _queueClient
.SendAsync(new Message(Encoding.UTF8.GetBytes("hello world")))
.ContinueWith(t =>
{
Console.WriteLine(t.Status + "," + t.IsFaulted + "," + t.Exception.InnerException);
}, TaskContinuationOptions.OnlyOnFaulted);
Console.WriteLine("Done");
}
catch (Exception e)
{
Console.WriteLine(e);
}
If the network is down, you can capture the exception as follows:
The issue I ran into was that I was using a console application, and a console app's Main is synchronous out of the box. I had to modify my Main to return a Task and mark it async for the exception to be caught, which makes sense.
i.e.
internal class Program
{
static async Task Main(string[] args)
{
[..]
}
}
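An async Task Main requires C# 7.1 or later. On an older compiler, a synchronous Main can block on the task instead. The sketch below uses GetAwaiter().GetResult(), which rethrows the original exception rather than wrapping it in an AggregateException. EntryPointSketch, FailingSendAsync and RunBlocking are hypothetical names, and FailingSendAsync merely stands in for a network call that fails.

```csharp
using System;
using System.Threading.Tasks;

public static class EntryPointSketch
{
    // Stand-in for an async call that fails, e.g. a send with no network path.
    public static async Task FailingSendAsync()
    {
        await Task.Yield();
        throw new InvalidOperationException("simulated send failure");
    }

    // What a synchronous Main could call: blocks until the work finishes,
    // logs any exception and returns a process exit code.
    public static int RunBlocking(Func<Task> work)
    {
        try
        {
            work().GetAwaiter().GetResult();
            return 0;
        }
        catch (Exception e)
        {
            Console.WriteLine(e);
            return 1;
        }
    }
}
```

A synchronous Main would then be static int Main(string[] args) => EntryPointSketch.RunBlocking(TestQueueExists);, assuming TestQueueExists is reachable from Main.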
I have a headless UWP application that uses an external library to connect to a serial device and send some commands. It runs an infinite loop (while true) with a 10 minute pause between loops. The measurement process takes around 4 minutes.
The external library needs to run 3 measurements and after each it signals by raising an event. When the event is raised the 4th time I know that I can return the results.
After 4 hours (+/- a few seconds) the library stops raising events (usually it raises the event one or 2 times and then it halts, no errors, nothing).
I implemented in DoMeasureAsync() below a CancellationTokenSource that is supposed to cancel the TaskCompletionSource's task after 8 minutes, so that the task returns and the loop continues.
Problem:
When the measurement does not complete (the NMeasureCompletionSource never gets its result set in class CMeasure), the task from nMeasureCompletionSource is never cancelled, even though the delegate defined in RespondToCancellationAsync() should run after the 8 minutes.
If the measurement runs ok, I can see in the logs that the code in the
taskAtHand.ContinueWith((x) =>
{
Logger.LogDebug("Disposing CancellationTokenSource...");
cancellationTokenSource.Dispose();
});
gets called.
Edit:
Is it possible that the GC kicks in after the 4 hours and deallocates some variables, leaving the app unable to send commands to the sensor? - This is not the case.
What am I missing here?
//this gets called in a while (true) loop
public Task<PMeasurement> DoMeasureAsync()
{
nMeasureCompletionSource = new TaskCompletionSource<PMeasurement>();
cancellationTokenSource = new CancellationTokenSource(TimeSpan.FromMinutes(8));
var t = cMeasure.Run(nMeasureCompletionSource, cancellationTokenSource.Token);
var taskAtHand = nMeasureCompletionSource.Task;
taskAtHand.ContinueWith((x) =>
{
Logger.LogDebug("Disposing CancellationTokenSource...");
cancellationTokenSource.Dispose();
});
return taskAtHand;
}
public class CMeasure
{
public async Task Run(TaskCompletionSource<PMeasurement> tcs, CancellationToken cancellationToken)
{
try
{
NMeasureCompletionSource = tcs;
CancellationToken = cancellationToken;
CancellationToken.Register(async () => await RespondToCancellationAsync(), useSynchronizationContext: false);
CloseDevice(); //Closing device if for some reason is still open
await Task.Delay(2500);
TheDevice = await GetDevice();
measurementsdone = 0;
Process(); //start the first measurement
}
catch (Exception ex)
{
DisconnectCommManagerAndCloseDevice();
NMeasureCompletionSource.SetException(ex);
}
}
public async Task RespondToCancellationAsync()
{
if (!NMeasureCompletionSource.Task.IsCompleted)
{
Logger.LogDebug("Measure Completion Source is not completed. Cancelling...");
NMeasureCompletionSource.SetCanceled();
}
DisconnectCommManagerAndCloseDevice();
await Task.Delay(2500);
}
private void Process()
{
if (measurementsdone < 3)
{
var message = Comm.Measure(m); //start a new measurement on the device
}
else
{
...
NMeasureCompletionSource.SetResult(result);
}
}
//the method called when the event is raised by the external library
private void Comm_EndMeasurement(object sender, EventArgs e)
{
measurementsdone++;
Process();
}
}
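The timeout wiring in the code above can be reduced to a small, self-contained pattern: a CancellationTokenSource created with a delay, whose token registration cancels the TaskCompletionSource. The sketch below is an illustration with hypothetical names (TimeoutCompletion, WaitWithTimeoutAsync), not the code from the question. It uses TrySetCanceled instead of checking IsCompleted and then calling SetCanceled, which avoids an InvalidOperationException if the result is set at the same moment the timeout fires.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TimeoutCompletion
{
    // Awaits tcs.Task, but cancels it if nothing has completed it within 'timeout'.
    // Throws TaskCanceledException to the caller when the timeout wins.
    public static async Task<T> WaitWithTimeoutAsync<T>(TaskCompletionSource<T> tcs, TimeSpan timeout)
    {
        using (var cts = new CancellationTokenSource(timeout))
        using (cts.Token.Register(() => tcs.TrySetCanceled()))
        {
            // Both the registration and the source are disposed as soon as
            // the task completes, whichever way it completes.
            return await tcs.Task.ConfigureAwait(false);
        }
    }
}
```

The event handler side would call tcs.TrySetResult(result) for the same reason, so that a late event after a timeout does not throw.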
After more testing I have reached the conclusion that there is no memory leak and that all the objects are disposed. The cancellation works well also.
So far it appears that my problem comes from the execution of the headless app on the Raspberry Pi. Although I am using the deferral = taskInstance.GetDeferral(); it seems that the execution is stopped at some point...
I will test more and come back with the results (possibly in a new post, but I will put a link here as well).
Edit:
Here is the new post: UWP - Headless app stops after 3 or 4 hours
Edit 2:
The problem was from a 3rd party library that I had to use and it had to be called differently from a headless app. Internally it was creating its own TaskScheduler if SynchronizationContext.Current was null.