Reliable background queuing for tasks on IIS - C#

I am trying to implement a queue to handle tasks that are queued using Task.Run(function()).
Here are my requirements:
1) The threadpool should not stop after the application context has died. I need to be sure the calls will be made. (I believe this requires the thread to be a foreground thread)
2) Additionally, I need to be able to log these errors. Each function has its own error handling implemented within. I would consider it fire-and-forget because I don't actually need to pass the data back to the caller, but the information needs to be logged.
3) The queue will remove tasks as they complete. I may need some way of managing the size of the queue to prevent overuse of resources, possibly by setting a time limit for each task and forcing it to cancel after the allotted time to free space in the queue.
Specification:
- .Net 4.0 Framework
- IIS

I was able to achieve the desired functionality by referencing Stephen Cleary's AspNetBackgroundTasks.
By using a singleton pattern, I was able to create a single instance of an object that serves as a wrapper for managing the tasks. The code is able to delay shutdown by implementing IRegisteredObject.
When a shutdown is pending, ASP.NET notifies the registered object. A TaskCompletionSource is moved to the completed state only when the count of running tasks reaches zero, so the application waits for all tasks to finish before it is allowed to shut down.
There are risks in this design, similar to those of any notification system that keeps its state in main memory: if power is lost, the queued work is lost.
Additionally, remember to make atomic changes to shared variables or use thread-safe locking techniques.
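The counting-and-draining idea described above could be sketched roughly as follows. This is a minimal illustration, not the AspNetBackgroundTasks source: the BackgroundTaskTracker class and its member names are hypothetical, and the System.Web wiring is deliberately left out. In a real ASP.NET application, Stop() would be called from IRegisteredObject.Stop, and the object would call HostingEnvironment.UnregisterObject once the returned task completes. The sketch uses Task.Factory.StartNew and ContinueWith because Task.Run was only added in .NET 4.5.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: counts running work items and signals when the count drains to zero.
public sealed class BackgroundTaskTracker
{
    private readonly TaskCompletionSource<bool> _drained = new TaskCompletionSource<bool>();
    private int _running;   // work items currently executing
    private int _stopping;  // 0 = accepting work, 1 = shutdown requested

    // Starts work on the thread pool; returns false if shutdown has already begun.
    public bool TryRun(Action work, Action<Exception> logError)
    {
        // Count first, then check the flag, so Stop() can never miss a running item.
        Interlocked.Increment(ref _running);
        if (Interlocked.CompareExchange(ref _stopping, 0, 0) == 1)
        {
            Release();
            return false;
        }

        Task.Factory.StartNew(work).ContinueWith(t =>
        {
            try
            {
                // Reading t.Exception also marks the exception as observed.
                if (t.IsFaulted) logError(t.Exception.GetBaseException());
            }
            catch (Exception)
            {
                // The logger itself failed; swallow so the continuation never faults.
            }
            finally
            {
                Release();
            }
        });
        return true;
    }

    // Requests shutdown; the returned task completes once no work is running.
    public Task Stop()
    {
        Interlocked.Exchange(ref _stopping, 1);
        if (Interlocked.CompareExchange(ref _running, 0, 0) == 0)
        {
            _drained.TrySetResult(true);
        }
        return _drained.Task;
    }

    // Atomically decrements the count and signals if this was the last item during shutdown.
    private void Release()
    {
        if (Interlocked.Decrement(ref _running) == 0 &&
            Interlocked.CompareExchange(ref _stopping, 0, 0) == 1)
        {
            _drained.TrySetResult(true);
        }
    }
}
```

All shared state is touched only through Interlocked operations, which is one way to satisfy the atomic-update advice above. As noted, nothing here survives a process kill or power loss.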

Related

Long-running task without IHostedService running the entire life of the application?

I have a website page that needs the option of performing an operation that could take several minutes. To avoid performance issues and time outs, I want to run this operation outside of the HTTP request.
After some research, I found IHostedService and BackgroundService, which can be registered as a singleton using AddHostedService<T>().
But my concern is that a hosted service is always running. Doesn't that seem like a waste of resources when I just want it to run on demand?
Does anyone know a better option to run a lengthy task, or a way to use IHostedService that doesn't need to run endlessly?
Note that the operation calls and waits for an API call. And so I cannot report the progress of the operation, nor can I set a flag in a common database regarding whether the operation has completed.
One option to run a lengthy task on demand while avoiding performance issues and time outs is to use a message queue. You can have your Razor Pages website send a message to the queue when the operation is requested, and have a separate service, such as a background worker, consume messages from the queue and perform the operation. This allows you to decouple the task from the web request, and also allows for the possibility of adding more worker instances to handle the workload.
Another option is to use a task scheduler that runs on demand, such as Hangfire. It allows you to schedule background jobs and monitor their progress, which can be useful in your scenario where you cannot report the progress of the operation.
You can also use IHostedService, but you need to make sure that the service is only running when it is needed. You can use a flag or a semaphore to control whether the service is running or not. You can set the flag or semaphore when the operation is requested, and clear it when the operation is completed. The service can then check the flag or semaphore in its main loop, and exit if the flag is not set.
In summary, a message queue, a task scheduler, and IHostedService with a controlling flag/semaphore are all viable options for running a lengthy task on demand. The best option depends on your specific use case and requirements.
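As a rough in-process illustration of the queue idea, here is a sketch built on System.Threading.Channels. The OnDemandWorkQueue class and its member names are hypothetical. The assumption is that RunAsync would be called from a hosted service (for example as the body of BackgroundService.ExecuteAsync) and that the page handler would call TryEnqueue. While the channel is empty, the consumer is parked in an asynchronous wait and does not occupy a thread, which speaks to the concern about a service that is "always running".

```csharp
using System;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical in-memory job queue; jobs are lost if the process dies.
public sealed class OnDemandWorkQueue
{
    // Bounded so that a burst of requests cannot grow memory without limit.
    private readonly Channel<Func<CancellationToken, Task>> _channel =
        Channel.CreateBounded<Func<CancellationToken, Task>>(100);

    // Returns false when the queue is full or has been completed.
    public bool TryEnqueue(Func<CancellationToken, Task> job)
    {
        return _channel.Writer.TryWrite(job);
    }

    // Signals that no more jobs will be added; RunAsync ends after the backlog is processed.
    public void Complete()
    {
        _channel.Writer.Complete();
    }

    // Consumer loop: processes jobs one at a time until completed or cancelled.
    public async Task RunAsync(CancellationToken stoppingToken)
    {
        await foreach (var job in _channel.Reader.ReadAllAsync(stoppingToken))
        {
            try
            {
                await job(stoppingToken);
            }
            catch (Exception ex)
            {
                // One failing job must not stop the consumer.
                Console.Error.WriteLine("Background job failed: " + ex.Message);
            }
        }
    }
}
```

A durable message queue or a job scheduler such as Hangfire would be needed if jobs must survive a restart; this sketch only decouples the work from the HTTP request.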

Are there benefits to using background workers in ASP.NET if there isn't app recycling?

Background: I have a simple ASP.NET Core 3.1 site. Very rarely (three or four times per week), a user might fill out a form that triggers an email to be sent.
I don't want to delay the page response while running the 'send email' operation (even though it only takes a second or two), so from everything I've read, it seems like the code that should handle the email should be a background worker/hosted service, and the Razor pages code should place the data object to be sent in a collection that gets monitored by the background service.
What I'm not fully understanding is why this is necessary in modern ASP.NET Core.
If I were doing this in a normal C# application (not ASP), I'd simply make the 'send email' method async (it's using MailKit, which has async methods), and call the async method without awaiting, allowing the work to be done on the threadpool while allowing the response thread to continue.
But existing answers and blog posts say that calling an async method without an await in ASP is dangerous, due to the fact that IIS can restart ASP processes (application pool recycling).
Yet, most things I've read say Application Recycling is an artifact of old ASP when memory leaks were common, and it's not really a thing on .Net Core. Additionally, many ASP applications aren't even hosted in IIS anymore.
Further, as far as I can tell, IHostedService/Background Worker objects aren't doing anything special - they don't seem to add any additional threading; they just look like singletons that have additional notification for environment startup and shutdown.
So:
Is calling a fire-and-forget async method in ASP.NET Core still considered poor practice, especially if the fire and forget task is short-lived? If so, why? [see edit below for clarification]
Other than notifications for shutdown, is there any reason why a background service is considered better than borrowing a managed threadpool thread (via Task.Run or QueueBackgroundWorkItem)? Wouldn't waking a background service (if it was awaiting on object to be placed in a collection) consume a pool thread in the same way?
Edit: I acknowledge that starting a task, and reporting success to the user, when there's a chance that operation could be terminated, is poor form. There's benefit to being notified of a shutdown and being able to finalize tasks.
Perhaps a better question is, does the old behavior of cycling still exist in modern ASP (on IIS or Kestrel)? Are there other reasons an orderly shutdown might be triggered (other than server shutdown/manual stop)?
I would still call it a poor practice.
The main concern here as well as in the referenced post is mainly about the promise of task completion.
Without being aware of the ghost background tasks, the runtime will not be able to notify the tasks to gracefully stop. This may or may not cause serious issues depending on the status of the tasks at the point the termination occurs.
Using a fire-and-forget task often means your task is at risk of having to start all over again when the process restarts. And sometimes this is not possible due to loss of context. Imagine your fire-and-forget task is calling another web API with parameters provided by a web request. The parameters are likely to get wiped out from memory if the process restarts.
And remember, the recycling is not always triggered by IIS / server. It could also be triggered by people. Say when your application runs into a memory leak issue, and you may want to recycle the app process every 1 hour as a temporary relief. Then you need to make sure you don't break your background tasks.
In terms of hosting - it is still possible to host ASP.Net Core applications in-process, in which the app pool gets recycled by IIS after a configured time period, or by default 29 hours.
In terms of lifetime - hosted services are types you register to DI, so DI features could be used, for example, this built-in hosted service implements IDisposable, which means proper clean up could be done upon shutting down.
Frankly, background tasks and hosted services both allow you to do fire and forget. But when you need reliability and resilience, hosted services win.
To answer the second half of your question, the app will wait for all hosted services' StopAsync methods to finish before shutting down (up to the host's configured shutdown timeout). As long as you await your Tasks in the hosted service, this effectively means you can assume your Tasks will be allowed to finish running before the app shuts down. The app could still be force-shutdown, in which case nothing is guaranteed anymore.
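To make the "await your Tasks" point concrete, here is a hedged sketch of tracking in-flight work so that a hosted service can wait for it on shutdown. The InFlightTasks class and its members are hypothetical. The assumption is that DrainAsync would be awaited from the service's StopAsync(CancellationToken), whose token is signalled when the shutdown should no longer be graceful.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical tracker for tasks started on behalf of requests.
public sealed class InFlightTasks
{
    private readonly ConcurrentDictionary<Task, byte> _tasks =
        new ConcurrentDictionary<Task, byte>();

    // Remembers the task until it finishes. Tracked tasks are expected to
    // handle and log their own errors.
    public void Track(Task task)
    {
        _tasks.TryAdd(task, 0);
        task.ContinueWith(t => { _tasks.TryRemove(t, out _); }, TaskScheduler.Default);
    }

    // Returns true if all tracked tasks finished before shutdownToken fired,
    // false if the host ran out of patience first.
    public async Task<bool> DrainAsync(CancellationToken shutdownToken)
    {
        using var linked = CancellationTokenSource.CreateLinkedTokenSource(shutdownToken);
        Task all = Task.WhenAll(_tasks.Keys);
        Task giveUp = Task.Delay(Timeout.Infinite, linked.Token);

        Task first = await Task.WhenAny(all, giveUp);
        linked.Cancel(); // release the delay if the work finished first
        return first == all;
    }
}
```

A false result corresponds to the forced-shutdown case described above: the remaining work is simply abandoned.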
If you need more guarantees about your background tasks, you should move them to run in a separate process. You could use something like Runly to make it easier to break out functionality into background jobs. It also makes it easy to provide real-time feedback to the user so that you are not lying to the user when you say "everything is done" while something is still running in the background.
Full disclosure: I cofounded Runly.

Converting threaded app to service

I currently have an application which is basically a wrapper for ~10 "LongRunning" Tasks. Each thread should keep running indefinitely, but sometimes they lock up or crash, and sometimes the wrapper app spontaneously exits (I haven't been able to track that down yet). Additionally, the wrapper application can currently only be running for one user, and that user has to be the one to restart the threads or relaunch the whole app.
I currently have a monitor utility to let me know when the threads stop doing work so that they can be manually restarted, but I'd like to automatically restart them instead. I'd also like the wrapper to be available to everyone to check the status of the threads, and for the threads to be running even when the wrapper isn't.
Based on these goals, I think I want to separate the threads into a Windows Service, and convert the wrapper into something which can just connect to the service to check its status and manipulate it.
How would I go about doing this? Is this a reasonable architecture? Should I turn each thread into a separate service, or should I have a single multi-threaded service?
Edit: All the tasks log to the same set of output files (via a TextWriter.Synchronized(StreamWriter)), and I would want to maintain that behavior.
They also all currently share the same database connection, which means I need to get them all to agree to close the connection at the same time when it's necessary. However, if they were split up they could each use their own database connection, and I wouldn't need to worry about synchronizing that. I actually suspect that this step is one of the current failure points, so splitting it up would be a Good Thing.
I would suggest staying with one multithreaded service if possible. Just make sure that threads are handled correctly when Service Stop is triggered. Put break flags inside blocks of code that take a long time to execute. This way your service will stay responsive to the Stop event. Log any exceptions, and make sure to wait for all threads to exit before the service is finally stopped. This will prevent you from running the same "task" in multiple threads.
Maintaining one service is, in the end, easier than maintaining multiple services.
Splitting to multiple services would be reasonable if you require some separate functionalities that can run or not beside each other.
I don't think moving the threads to a Windows Service removes any of the problems. The service will still crash randomly and the threads will still exit randomly.
I assume that your long-running tasks implement a kind of worker loop. Wrap the body of that loop in a try-catch and log all exceptions. Don't rethrow them so that the task does not ever exit. Examine the logs to find the bugs.
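A minimal sketch of such a worker loop, combining the stop flag and the catch-and-log advice from the answers above. The WorkerLoop class and its parameter names are hypothetical, and a CancellationToken stands in for the "break flag".

```csharp
using System;
using System.Threading;

// Hypothetical worker loop: keeps running until asked to stop, and never
// lets an exception from one iteration end the loop.
public static class WorkerLoop
{
    public static void Run(Action doOneUnitOfWork, Action<Exception> log,
                           TimeSpan pauseAfterError, CancellationToken stop)
    {
        while (!stop.IsCancellationRequested)
        {
            try
            {
                doOneUnitOfWork();
            }
            catch (Exception ex)
            {
                // Log and carry on instead of rethrowing.
                log(ex);
                // Brief pause so a persistent failure does not spin the CPU;
                // the wait ends early if a stop is requested.
                stop.WaitHandle.WaitOne(pauseAfterError);
            }
        }
    }
}
```

In a Windows Service, each long-running task would presumably call Run with a shared token, and OnStop would cancel that token and then wait for the tasks to exit. Long iterations should also check the token internally so that Stop stays responsive.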

How to create a "Spool" service for a class in C#

I am looking into C# programming and am fairly new to the language. I would like to think I have a good understanding of object-oriented programming in general, and of what running multiple threads means at a high level, but when it comes to actual implementation I am, as I said, a beginner.
What I am looking to do is to create a tool that will have many threads running and interacting with each other independently; each will serve its own task and may call others.
My strategy to ensure communication (without losing anything when multiple updates occur at the same time from different threads) is to create, for each class, a spool-like task that can be called externally to add tasks to a given thread, or a spool service for these. I am not sure if I should place this on the class itself, or make it external and have the class call the spool for new tasks and keep track of the spool. Here I am in particular considering how to signal the class if an empty spool gets a task (a listener approach, so tasks can subscribe to spools if they want to be awoken when new work arrives), or whether to use a "check every X seconds if out of tasks and next task is not scheduled" approach.
What would be a good strategy for this: should I create the spool in the actual class, or externally? What are the critical regions in the implementation? The "busy wait check" approach only needs critical regions for adding new jobs to and removing jobs from the spool, while the signaling approach needs them for adding/removing jobs and also for going to sleep on a signal. That suddenly places a high requirement on what the spool does once a critical region has been entered, as this could result in blocks, causing other blocks and possibly unforeseen deadlocks.
I use such a model often, on various systems. I define a class for the agents, say 'AgentClass' and one for the requests, say 'RequestClass'. The agent has two abstract methods, 'submit(RequestClass *message)' and 'signal()'. Typically, a thread in the agent constructs a producer-consumer queue and waits on it for RequestClass instances, the submit() method queueing the passed RequestClass instances to the queue. The RequestClass usually contains a 'command' enumeration that tells the agent what needs doing, together with all data required to perform the request and the 'sender' agent instance. When an agent gets a request, it switches on the enumeration to call the correct function to do the request. The agent acts only on the data in the RequestClass - results, error messages etc. are placed in data members of the RequestClass. When the agent has performed the request, (or failed and generated error data), it can either submit() the request back to the sender, (i.e. the request has been performed asynchronously), or call the sender's signal() function, which signals an event upon which the sender was waiting, (i.e. the request was performed synchronously).
I usually construct a fixed number of RequestClass instances at startup and store them in a global 'pool' P-C queue. Any agent/thread/whatever that needs to send a request can dequeue a RequestClass instance, fill in data, submit() it to the agent and then wait asynchronously or synchronously for the request to be performed. When done with, the RequestClass is returned to the pool. I do this to avoid continual malloc/free/new/dispose, ease debugging, (I dump the pool level to a status bar using a timer, so I always notice if a request leaks or gets double-freed), and to eliminate the need for explicit thread termination on app close, (if multiple threads are only ever reading/writing to data areas that outlive the application forms etc, the app will close easily and the OS can deal with all the threads - there are hundreds of posts about 'cleanly shutting down threads upon app close' - I never bother!).
Such message-passing designs are quite resistant to deadlocks since the only locks, (if any), are in the P-C queues, though you can certainly achieve it if you try hard enough:)
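In C#, a stripped-down version of this agent/request pattern might look like the sketch below. It is a simplification under stated assumptions, not the design above in full: the Agent, Request and Command names and the Add command are invented for illustration, there is no request pool, and a per-request event stands in for the sender's signal() method (only the synchronous case is shown). BlockingCollection plays the role of the producer-consumer queue.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public enum Command { Add, Shutdown }

// Carries the command, its input data, and the results or error text.
public sealed class Request
{
    public Command Command;
    public int A;
    public int B;
    public int Result;
    public string Error;
    // Signalled by the agent once the request has been handled.
    public readonly ManualResetEventSlim Done = new ManualResetEventSlim(false);
}

public sealed class Agent
{
    // The only shared state between threads is this thread-safe queue.
    private readonly BlockingCollection<Request> _queue = new BlockingCollection<Request>();
    private readonly Thread _thread;

    public Agent()
    {
        _thread = new Thread(Loop) { IsBackground = true };
        _thread.Start();
    }

    // Called from any thread; throws InvalidOperationException after a Shutdown command.
    public void Submit(Request request)
    {
        _queue.Add(request);
    }

    private void Loop()
    {
        // Blocks while the queue is empty; ends after CompleteAdding and an empty queue.
        foreach (Request request in _queue.GetConsumingEnumerable())
        {
            try
            {
                switch (request.Command)
                {
                    case Command.Add:
                        request.Result = request.A + request.B;
                        break;
                    case Command.Shutdown:
                        _queue.CompleteAdding();
                        break;
                }
            }
            catch (Exception ex)
            {
                // Errors travel back inside the request, not as exceptions.
                request.Error = ex.Message;
            }
            finally
            {
                request.Done.Set();
            }
        }
    }
}
```

A sender would fill in a Request, call Submit, and then wait on Done before reading Result or Error. Because the agent blocks inside the queue while it is empty, no polling interval or hand-written critical region is needed.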
Is this the sort of system that you seem to need , or have I got it wrong?
Rgds,
Martin

What's the thread context for events in .Net using existing APIs?

When using APIs handling asynchronous events in .Net I find myself unable to predict how the library will scale for large numbers of objects.
For example, using the Microsoft.Office.Interop.UccApi library, when I create an endpoint it gets events when phone events happen. Now let's say I want to create 1000 endpoints. The number of events per endpoint is small, but is what's happening behind the scenes in the API able to keep up with the event flow? I don't know because it never says how it's architected.
Let's say I want to create all 1000 objects in the main thread. Then I want to put the Login method into a large thread pool so all objects login in parallel. Then once all the objects have logged in the next phase will begin.
Are the event callbacks the API raises happening in the original creating thread? A separate threadpool? Or the same threadpool I'm accessing with ThreadPool.QueueUserWorkItem?
Would I be better off putting each object in its own thread? Grouping a few objects in each thread? Or is it fine just creating all 1000 objects in the main thread and through .Net magic it will all be OK?
thanx
The events from interop assemblies are just wrappers around the COM connection points. The thread on which the call from the connection point arrives depends on the threading model of the object that advised on that connection point. COM will ensure the proper thread switching for this.
If your objects are implemented on the main thread, which in .Net is usually an STA, all events should arrive on that same thread. If you want your calls to arrive on a random thread from the COM thread pool (which I think is the same as the CLR thread pool), you need to create your objects on a thread that is configured as an MTA.
I would strongly advise against creating a thread for each object: 1) If you create these threads as STA, each of them will have a message queue, wasting system resources; 2) If you create them as MTA, nothing guarantees that the event call will arrive on your thread; 3) You'll have 1000 idle threads doing nothing and just waiting on an event to shut down; and 4) Starting up and shutting down all these threads will have a terrible perf cost on your application.
It really depends on a lot of things, primarily how powerful your hardware is. The threadpool does have a certain number of threads (which you can increase) that it will make available for your application. So if all of your events are firing at the same time some will most likely be waiting for a few moments while your threadpool waits for threads to become free again. The tradeoff is that you don't have the performance hit of creating new threads all the time either. Probably creating 1000 threads isn't the right answer either.
It may turn out that this is ideal, both because of the performance gains in reusing threads but also because having 1000 threads all running simultaneously might be more memory / CPU usage than it's worth.
I just wanted to note that in .NET 2.0 and greater it's possible to programmatically change the maximum number of threads in the thread pool using ThreadPool.SetMaxThreads(). Given this you can put a hard cap on the number of threads and so ensure the scheduler won't be brought to its knees by the overhead.
Even more useful in this sort of case, you can set the minimum number of threads with ThreadPool.SetMinThreads(). With this you can ensure that you only pay the "horrible performance price" Franci is talking about once, at application startup. You could balance this against the expected peak number of users and so ensure you won't be creating tons of new threads.
A single new thread creation won't destroy you. What I would be worried about is the case where a lot of threads need to be created at the same time. If you can say that this will only happen at startup you would be golden.
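As a small sketch of the SetMinThreads suggestion, the helper below raises the worker-thread minimum without lowering it and without exceeding the current maximum. The ThreadPoolTuning class and its method name are hypothetical, and any value passed in should come from measurement rather than from this example.

```csharp
using System;
using System.Threading;

// Hypothetical helper around the thread pool limits.
public static class ThreadPoolTuning
{
    // Raises the minimum number of worker threads to at least 'desired'.
    // Returns the result of ThreadPool.SetMinThreads (false if the change was rejected).
    public static bool RaiseMinWorkerThreads(int desired)
    {
        ThreadPool.GetMinThreads(out int minWorker, out int minIo);
        ThreadPool.GetMaxThreads(out int maxWorker, out _);

        // Never lower the current minimum, and never ask for more than the maximum.
        int target = Math.Min(Math.Max(minWorker, desired), maxWorker);
        return ThreadPool.SetMinThreads(target, minIo);
    }
}
```

Calling something like ThreadPoolTuning.RaiseMinWorkerThreads(50) once at startup would let the pool add threads up to that number on demand without the usual throttling delay.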
