We have a lot of requests in our system, so we use Tasks with Web API. In some places we have strict speed requirements and can't wait for the Task to complete, so I have created a Worker for this. It creates a nested container so that Entity Framework's DbContext won't get disposed, etc. But it looks like Task.Run spawns a new thread each time; how well will this scale?
public class BackgroundWorker<TScope> : IBusinessWorker<TScope>, IRegisteredObject where TScope : class
{
private readonly IBusinessScope<TScope> _scope;
private bool _started;
private bool _stopping;
public BackgroundWorker(IBusinessScope<TScope> scope)
{
_scope = scope;
}
public void Run(Func<TScope, Task> action)
{
if(_stopping) throw new Exception("App pool is recycling, can't queue work");
if(_started) throw new Exception("You can't call Run multiple times");
_started = true;
HostingEnvironment.RegisterObject(this);
Task.Run(() =>
action(_scope.EntryPoint).ContinueWith(t =>
{
_scope.Dispose();
HostingEnvironment.UnregisterObject(this);
}));
}
public void Stop(bool immediate)
{
_stopping = true;
if(immediate)
HostingEnvironment.UnregisterObject(this);
}
}
Used like
backgroundWorker.Run(async ctx => await ctx.AddRange(foos).Save());
If I google this, the examples all end up using Task.Run, but doesn't that defeat the purpose?
Update:
Did a test
var guid = Guid.NewGuid();
_businessWorker.Run(async ctx => {
System.Diagnostics.Debug.WriteLine("{0}: {1}", guid, Thread.CurrentThread.ManagedThreadId);
await Task.Delay(1);
System.Diagnostics.Debug.WriteLine("{0}: {1}", guid, Thread.CurrentThread.ManagedThreadId);
});
This outputs
3bdbe90b-c31e-4709-95d8-f7516210b0ac: 17
3bdbe90b-c31e-4709-95d8-f7516210b0ac: 9
6548fd26-d209-4427-9a91-40fc30aa509e: 15
6548fd26-d209-4427-9a91-40fc30aa509e: 19
7411b043-4fae-44bf-b93f-4273a532afa1: 7
7411b043-4fae-44bf-b93f-4273a532afa1: 17
Which indicates that Task.Run actually works the way I think it should.
With real DB code it looks like this
a939713d-d728-46c9-be33-aa57704cf242: 19 <--
a939713d-d728-46c9-be33-aa57704cf242: 19 <-- Same thread used for the entire work item
7e588a42-afd0-4ab5-ba6b-f8520c889cde: 7
7e588a42-afd0-4ab5-ba6b-f8520c889cde: 19 <-- Reused the first work item's thread when work #2 continued
6f3b067f-f478-43f9-8411-8142b449c28b: 8
6f3b067f-f478-43f9-8411-8142b449c28b: 18
Update:
Tried Luaan's approach. It seems to work with Tasks spawned by Entity Framework or Web API's HttpClient, but with manually created Tasks like the one below it does not work well: some are executed and some are not. With Task.Run, all of them are executed.
_businessWorkerFactory().Run(async ctx =>
{
var guid = Guid.NewGuid();
System.Diagnostics.Debug.WriteLine("{0}: {1}", guid, Thread.CurrentThread.ManagedThreadId);
var completion = new TaskCompletionSource<bool>();
ThreadPool.QueueUserWorkItem(obj =>
{
Thread.Sleep(1000);
completion.SetResult(true);
});
await completion.Task;
System.Diagnostics.Debug.WriteLine("{0}: {1}", guid, Thread.CurrentThread.ManagedThreadId);
});
Task.Run schedules the task to run on a thread pool thread. The same thread pool that handles requests.
In an ASP.NET application, sending work to the thread pool steals threads that might be needed to handle requests.
Given your requirements, I think you would be better queuing that work to another service/process using something like MSMQ.
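For illustration only, here is a rough sketch of queuing the work through MSMQ with System.Messaging; the queue path and the SaveFoosCommand type are assumptions, not part of the question's code.
using System.Messaging;

// Hypothetical command type; in the question's case it would carry the data to save.
public class SaveFoosCommand { public int[] FooIds { get; set; } }

public static class BackgroundQueue
{
    // Assumed, pre-created private queue.
    private const string QueuePath = @".\Private$\backgroundWork";

    // Called from the web request: enqueue the command and return immediately.
    public static void Enqueue(SaveFoosCommand command)
    {
        using (var queue = new MessageQueue(QueuePath))
        {
            queue.Send(command, "save-foos");
        }
    }

    // Run inside a separate worker service, outside the web process.
    public static SaveFoosCommand DequeueOne()
    {
        using (var queue = new MessageQueue(QueuePath))
        {
            queue.Formatter = new XmlMessageFormatter(new[] { typeof(SaveFoosCommand) });
            return (SaveFoosCommand)queue.Receive().Body;
        }
    }
}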
Task.Run doesn't spawn a new thread - it borrows one from the thread pool (assuming the thread pool task scheduler - there's different schedulers, and you can write your own as well). When you use await inside of Task.Run, it will still work as usual - freeing the thread pool thread until a callback is posted.
However, exactly for that reason, there's little point in using Task.Run for I/O work. If you have asynchronous I/O to do, just do it - it will work exactly the same, without requiring a context switch. You must make it asynchronous though - if it's just blocking code, you're taking up valuable threads from the thread pool.
Note that you don't need to wait for an asynchronous request to finish. If the asynchronous action you are performing doesn't need too much time to set up (that is, it returns the Task almost immediately, even though it isn't finished), you can just call it:
public async Task SomeAsync()
{
var request = new MyRequest();
await request.MakeRequestAsync();
...
}
public void Start()
{
var task = SomeAsync();
// Now the task is started, and we can use it for future reference. Or just wire up
// some error handling continuations etc. - though it's usually a better idea to do that
// within SomeAsync directly.
}
Related
I am trying to understand some code (for performance reasons) that is processing tasks from a queue. The code is C# on .NET Framework 4.8 (and I didn't write this stuff).
I have this code creating a timer that, from what I can tell, should use a new thread every 10 seconds:
_myTimer = new Timer(new TimerCallback(OnTimerGo), null, 0, 10000 );
Inside OnTimerGo it calls DoTask(); inside DoTask() it grabs a task off a queue and then does this:
System.Threading.Tasks.Task.Factory.StartNew(ProcessTask, task).ContinueWith(c => DoTask());
My reading of this is that a new thread should start running OnTimerGo every 10 seconds, and that thread should in parallel run ProcessTask on tasks as fast as it can get them from the queue.
I inserted some code to call ThreadPool.GetMaxThreads and ThreadPool.GetAvailableThreads to figure out how many threads were in use. Then I queued up 10,000 things for it to do and let it loose.
I never see more than 4 threads in use at a time. This is running on a c4.4xlarge EC2 instance, so 16 vCPUs and 30 GB of memory. GetMaxThreads and GetAvailableThreads both return over 2,000, so I would expect more threads. Looking at the logging, I can see that a total of roughly 50 different threads (by thread id) end up doing the work over the course of 20 minutes. Since the timer fires every 10 seconds, I would expect 100 threads to be doing the work (or for it to finish sooner).
Looking at the code, the only time a running thread should stop is if it asks for a task from the queue and doesn't get one. Some other logging shows that there are never more than 2 tasks running in a thread, probably because the work is pretty fast. So the threads shouldn't be exiting, and I can even see from the logs that many of them end up doing as many as 500 tasks over the 20 minutes.
So... what am I missing here? Are ThreadPool.GetMaxThreads and ThreadPool.GetAvailableThreads not accurate if run from inside a thread? Is something shutting down some of the threads while letting others keep going?
EDIT: adding more code
public static void StartScheduler()
{
lock (TimerLock)
{
if (_timerShutdown == false)
{
_myTimer = new Timer(new TimerCallback(OnTimerGo), null, 0, 10 );
const int numberOfSecondsPerMinute = 60;
const int margin = 1;
var pollEventsPerMinute = (numberOfSecondsPerMinute/SystemPreferences.TaskPollingIntervalSeconds);
_numberOfTimerCallsForHeartbeat = pollEventsPerMinute - margin;
}
}
}
private static void OnTimerGo(object state)
{
try
{
_lastTimer = DateTime.UtcNow;
var currentTickCount = Interlocked.Increment(ref _timerCallCount);
if (currentTickCount == _numberOfTimerCallsForHeartbeat)
{
Interlocked.Exchange(ref _timerCallCount, 0);
MonitoringTools.SendHeartbeatMetric(Heartbeat);
}
CheckForTasks();
}
catch (Exception e)
{
Log.Warn("Scheduler: OnTimerGo exception", e);
}
}
public static void CheckForTasks()
{
try
{
if (DoTask())
_lastStart = DateTime.UtcNow;
_lastStartOrCheck = DateTime.UtcNow;
}
catch (Exception e)
{
Log.Error("Unexpected exception checking for tasks", e);
}
}
private static bool DoTask()
{
Func<DataContext, bool> a = db =>
{
var mtid = Thread.CurrentThread.ManagedThreadId;
int totalThreads = Process.GetCurrentProcess().Threads.Count;
int maxWorkerThreads;
int maxPortThreads;
ThreadPool.GetMaxThreads(out maxWorkerThreads, out maxPortThreads);
int AvailableWorkerThreads;
int AvailablePortThreads;
ThreadPool.GetAvailableThreads(out AvailableWorkerThreads, out AvailablePortThreads);
int usedWorkerThreads = maxWorkerThreads - AvailableWorkerThreads;
string usedThreadMessage = $"Thread {mtid}: Threads in Use count: {usedWorkerThreads}";
Log.Info(usedThreadMessage);
var taskTypeAndTasks = GetTaskListTypeAndTasks();
var task = GetNextTask(db, taskTypeAndTasks.Key, taskTypeAndTasks.Value);
if (_timerShutdown)
{
Log.Debug("Task processing stopped.");
return false;
}
if (task == null)
{
Log.DebugFormat("DoTask: Idle in thread {0} ({1} tasks running)", mtid, _processingTaskLock);
return false;
}
Log.DebugFormat("DoTask: starting task {2}:{0} on thread {1}", task.Id, mtid, task.Class);
System.Threading.Tasks.Task.Factory.StartNew(ProcessTask, task).ContinueWith(c => DoTask());
Log.DebugFormat("DoTask: done ({0})", mtid);
return true;
};
return DbExtensions.WithDbWrite(ctx => a(ctx));
}
The Task.Factory.StartNew by default doesn't create a new thread. It borrows a thread from the ThreadPool instead.
The ThreadPool is intended as a small pool of reusable threads, to help amortize the cost of running frequent and lightweight operations like callbacks, continuations, event handlers etc. Depleting the ThreadPool of available workers by scheduling too much work on it results in a situation called saturation or starvation. And as you've already figured out, it's not a happy situation to be in.
You can prevent the saturation of the ThreadPool by running your long-running work on dedicated threads instead of ThreadPool threads. This can be done by passing the TaskCreationOptions.LongRunning as argument to the Task.Factory.StartNew:
_ = Task.Factory.StartNew(ProcessTask, task, CancellationToken.None,
TaskCreationOptions.LongRunning,
TaskScheduler.Default).ContinueWith(t => DoTask(), CancellationToken.None,
TaskContinuationOptions.ExecuteSynchronously,
TaskScheduler.Default);
The above code schedules the ProcessTask(task) on a new thread, and after the invocation is completed either successfully or unsuccessfully, the DoTask will be invoked on the same thread. Finally the thread will be terminated. The discard _ signifies that the continuation Task (the task returned by the ContinueWith) is fire-and-forget. Which, to put it mildly, is architecturally suspicious. 😃
In case you are wondering why I pass the TaskScheduler.Default explicitly as argument to StartNew and ContinueWith, check out this link.
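As a brief illustration of why that matters (my own sketch, not from the linked answer): StartNew and ContinueWith default to TaskScheduler.Current, which inside another task or a UI context may not be the thread pool at all.
// Sketch: a nested StartNew without an explicit scheduler uses TaskScheduler.Current,
// which could be a UI or custom scheduler if the outer code was started on one.
Task.Factory.StartNew(() =>
{
    // Pinning the inner work to the thread pool explicitly avoids that surprise.
    Task.Factory.StartNew(
        () => Console.WriteLine("runs on the thread pool"),
        CancellationToken.None,
        TaskCreationOptions.None,
        TaskScheduler.Default);
});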
My reading of this is that a new thread should start running OnTimerGo every 10 seconds, and that thread should in parallel run ProcessTask on tasks as fast as it can get them from the queue.
Well, that is definitely not what's happening. There's a lot of uncertainty about your code, but it's clear that another DoTask only starts AFTER ProcessTask completes. And that is not parallel execution. Your line of code is this:
System.Threading.Tasks.Task.Factory.StartNew(ProcessTask, task).ContinueWith(c => DoTask());
I suggest you start another DoTask right away, like this:
System.Threading.Tasks.Task.Factory.StartNew(ProcessTask, task);
DoTask();
Make sure your code is ready for parallel execution, though.
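To make that last point concrete, here is a rough sketch (not the poster's code, with a hypothetical in-memory queue standing in for the database): "ready for parallel execution" means two concurrent callers can never grab the same item, and shared state must be thread-safe.
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

static class ParallelSafeExample
{
    // Hypothetical work queue; the real code reads tasks from the database instead.
    private static readonly ConcurrentQueue<string> Queue = new ConcurrentQueue<string>();

    public static bool DoTask()
    {
        // TryDequeue is atomic, so two threads calling DoTask at once never get the same item.
        if (!Queue.TryDequeue(out var item))
            return false;

        Task.Factory.StartNew(ProcessTask, item,
            CancellationToken.None,
            TaskCreationOptions.LongRunning,
            TaskScheduler.Default);
        return true;
    }

    private static void ProcessTask(object state)
    {
        Console.WriteLine($"Processing {state} on thread {Thread.CurrentThread.ManagedThreadId}");
    }
}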
Background
I have an MVC 5 application and wanted to test if the requests were running in parallel. To do so I used the code below, and opened multiple pages all making the same request.
Code
Below is a relatively simple method with which I wanted to test the parallel nature of the requests.
public async Task<ActionResult> Login(string returnUrl, string message = "")
{
var rng = new Random();
var wait = rng.Next(3, 10);
var threadGuid = Guid.NewGuid();
DebugHelper.WriteToDebugLog($"Thread {threadGuid} about to wait {wait} seconds");
await Task.Delay(wait * 1000);
DebugHelper.WriteToDebugLog($"Thread {threadGuid} finished");
return View();
}
The class DebugHelper is just used so that I can write to a file safely.
public static class DebugHelper
{
private static readonly object WriteLock = new object();
public static void WriteToDebugLog(string message, string path = "C:\\Temp\\Log.txt")
{
lock (WriteLock)
{
File.AppendAllLines(path, new string[] { "", GetDateString(), message });
}
}
}
Output
I'm consistently getting this type of output which suggests the threads are blocking each other.
2020-03-24T13:43:43.1431913Z
Thread 6e42a6c5-d3cb-4541-b8aa-34b290952973 about to wait 7 seconds
2020-03-24T13:43:50.1564077Z
Thread 6e42a6c5-d3cb-4541-b8aa-34b290952973 finished
2020-03-24T13:43:50.1853278Z
Thread 90923f55-befd-4224-bdd8-b67f787839fc about to wait 4 seconds
2020-03-24T13:43:54.1943271Z
Thread 90923f55-befd-4224-bdd8-b67f787839fc finished
2020-03-24T13:43:54.2312257Z
Thread fa2d8d30-b762-4262-b188-0b34da5f4f04 about to wait 3 seconds
2020-03-24T13:43:57.2370556Z
Thread fa2d8d30-b762-4262-b188-0b34da5f4f04 finished
2020-03-24T13:43:57.2679690Z
Thread 37311a0e-d19e-4563-b92a-5e5e3def379a about to wait 8 seconds
2020-03-24T13:44:05.2812367Z
Thread 37311a0e-d19e-4563-b92a-5e5e3def379a finished
Question
Why is this occurring?
I was under the impression that any ASP.NET application was multithreaded to begin with, so even in a situation where I don't have the async/await setup, I thought it would run these threads simultaneously.
Update
As suggested in the answers/comments, my methodology was wrong. After using the following code I could see quite clearly in the logs that it was indeed running in parallel.
var targetTime = DateTime.UtcNow + TimeSpan.FromSeconds(5);
while(DateTime.UtcNow < targetTime)
{
DebugHelper.WriteToDebugLog($"Thread {threadGuid} with ID {threadId} doing stuff");
await Task.Delay(1000);
}
It simply boils down to the fact that your debug logging with its WriteLock and synchronous File.AppendAllLines forces a synchronization lock onto all asynchronous functions that call it.
You would do far better to have an asynchronous debug-logging mechanism that allows your tasks to continue running.
Producer/consumer pattern, semaphores, events, and asynchronous file-access APIs all spring to mind.
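For example, here is a minimal producer/consumer logger sketch (my own, not the poster's DebugHelper) that keeps request threads free by handing lines to a single background writer:
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

public static class AsyncDebugLog
{
    // BlockingCollection gives us the producer/consumer pattern with almost no code.
    private static readonly BlockingCollection<string> Lines = new BlockingCollection<string>();

    static AsyncDebugLog()
    {
        // One long-running consumer owns the file, so no lock is needed on the hot path.
        Task.Factory.StartNew(() =>
        {
            foreach (var line in Lines.GetConsumingEnumerable())
                File.AppendAllLines(@"C:\Temp\Log.txt", new[] { "", DateTime.UtcNow.ToString("o"), line });
        }, TaskCreationOptions.LongRunning);
    }

    // Callers return immediately; the actual write happens on the consumer thread.
    public static void WriteToDebugLog(string message) => Lines.Add(message);
}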
If you are using session state at all, it can lock the user to a single thread. Check for controller-level, page-level, or filter/attribute session use. If you are unsure, try adding
[SessionState(System.Web.SessionState.SessionStateBehavior.ReadOnly)]
to the controller.
Also, await by default continues on the synchronization context that began the await. Try using ConfigureAwait(false) to allow it to be flexible about which thread it resumes on.
await Task.Delay(wait * 1000).ConfigureAwait(false);
We could abort a Thread like this:
Thread thread = new Thread(SomeMethod);
.
.
.
thread.Abort();
But can I abort a Task (in .NET 4.0) in the same way, not via the cancellation mechanism? I want to kill the Task immediately.
The guidance on not using a thread abort is controversial. I think there is still a place for it, but only in exceptional circumstances. However, you should always attempt to design around it and see it as a last resort.
Example;
You have a simple Windows Forms application that connects to a blocking, synchronous web service, and executes a function on that web service within a Parallel loop.
CancellationTokenSource cts = new CancellationTokenSource();
ParallelOptions po = new ParallelOptions();
po.CancellationToken = cts.Token;
po.MaxDegreeOfParallelism = System.Environment.ProcessorCount;
Parallel.ForEach(iListOfItems, po, (item, loopState) =>
{
Thread.Sleep(120000); // pretend web service call
});
Say in this example, the blocking call takes 2 mins to complete. Now I set my MaxDegreeOfParallelism to say ProcessorCount. iListOfItems has 1000 items within it to process.
The user clicks the process button and the loop commences; we have 'up to' 20 threads executing against the 1000 items in the iListOfItems collection. Each iteration executes on its own thread, and each thread created by Parallel.ForEach is a foreground thread. This means that regardless of the main application shutting down, the app domain will be kept alive until all threads have finished.
However the user needs to close the application for some reason, say they close the form.
These 20 threads will continue to execute until all 1000 items are processed. This is not ideal in this scenario, as the application will not exit as the user expects and will continue to run behind the scenes, as can be seen by taking a look in Task Manager.
Say the user then tries to rebuild the app (VS 2010): it reports that the exe is locked, so they would have to go into Task Manager to kill it or just wait until all 1000 items are processed.
I would not blame you for saying "but of course! I should be cancelling these threads using the CancellationTokenSource object and calling Cancel" ... but there are some problems with this as of .NET 4.0. Firstly, this is still never going to result in a thread abort, which would offer up an abort exception followed by thread termination, so the app domain will instead need to wait for the threads to finish normally. That means waiting for the last blocking call, which would be the very last running iteration (thread) that ultimately gets to call po.CancellationToken.ThrowIfCancellationRequested.
In the example this would mean the app domain could still stay alive for up to 2 mins, even though the form has been closed and cancel called.
Note that calling Cancel on a CancellationTokenSource does not throw an exception on the processing thread(s), which is what would be needed to interrupt the blocking call, similar to a thread abort, and stop the execution. Instead, an exception is cached, ready for when all the other threads (concurrent iterations) eventually finish and return; the exception is then thrown in the initiating thread (where the loop is declared).
I chose not to use the Cancel option on a CancellationTokenSource object. This is wasteful and arguably violates the well-known anti-pattern of controlling the flow of code with exceptions.
Instead, it is arguably 'better' to implement a simple thread-safe property, e.g. bool stopExecuting. Then, within the loop, check the value of stopExecuting and, if the value has been set to true by the external influence, take an alternate path to close down gracefully. Since we should not call Cancel, this precludes checking CancellationTokenSource.IsCancellationRequested, which would otherwise be another option.
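A minimal sketch of what such a thread-safe stopExecuting flag could look like (one of several valid approaches, here using volatile):
// The shutdown code (e.g. the form's FormClosing handler) sets this; the loop threads only read it.
// volatile guarantees the loop threads promptly see the updated value.
private static volatile bool stopExecuting;

private static void RequestStop()
{
    stopExecuting = true;
}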
Something like the following if condition would be appropriate within the loop;
if (loopState.ShouldExitCurrentIteration || loopState.IsExceptional || stopExecuting) {loopState.Stop(); return;}
The iteration will now exit in a 'controlled' manner as well as terminating further iterations, but as I said, this does little for our issue of having to wait on the long running and blocking call(s) that are made within each iteration (parallel loop thread), since these have to complete before each thread can get to the option of checking if it should stop.
In summary, as the user closes the form, the 20 threads will be signaled to stop via stopExecuting, but they will only stop when they have finished executing their long running function call.
We can't do anything about the fact that the application domain will always stay alive and only be released when all foreground threads have completed. And this means there will be a delay associated with waiting for any blocking calls made within the loop to complete.
Only a true thread abort can interrupt the blocking call, and you must mitigate leaving the system in an unstable/undefined state as best you can in the aborted thread's exception handler, which goes without question. Whether that's appropriate is a matter for the programmer to decide, based on what resource handles they chose to maintain and how easy it is to close them in a thread's finally block. You could register with a token to terminate on cancel as a semi-workaround, i.e.
CancellationTokenSource cts = new CancellationTokenSource();
ParallelOptions po = new ParallelOptions();
po.CancellationToken = cts.Token;
po.MaxDegreeOfParallelism = System.Environment.ProcessorCount;
Parallel.ForEach(iListOfItems, po, (item, loopState) =>
{
using (cts.Token.Register(Thread.CurrentThread.Abort))
{
try
{
Thread.Sleep(120000); // pretend web service call
}
catch (ThreadAbortException ex)
{
// log etc.
}
finally
{
// clean up here
}
}
});
but this will still result in an exception in the declaring thread.
All things considered, interrupting blocking calls within the parallel loop constructs could have been a method on the options, avoiding the use of more obscure parts of the library. And the lack of an option to cancel without throwing an exception in the declaring method strikes me as a possible oversight.
But can I abort a Task (in .NET 4.0) in the same way, not via the
cancellation mechanism? I want to kill the Task immediately.
Other answerers have told you not to do it. But yes, you can do it. You can supply Thread.Abort() as the delegate to be called by the Task's cancellation mechanism. Here is how you could configure this:
class HardAborter
{
public bool WasAborted { get; private set; }
private CancellationTokenSource Canceller { get; set; }
private Task<object> Worker { get; set; }
public void Start(Func<object> DoFunc)
{
WasAborted = false;
// start a task with a means to do a hard abort (unsafe!)
Canceller = new CancellationTokenSource();
Worker = Task.Factory.StartNew(() =>
{
try
{
// specify this thread's Abort() as the cancel delegate
using (Canceller.Token.Register(Thread.CurrentThread.Abort))
{
return DoFunc();
}
}
catch (ThreadAbortException)
{
WasAborted = true;
return false;
}
}, Canceller.Token);
}
public void Abort()
{
Canceller.Cancel();
}
}
disclaimer: don't do this.
Here is an example of what not to do:
var doNotDoThis = new HardAborter();
// start a thread writing to the console
doNotDoThis.Start(() =>
{
while (true)
{
Thread.Sleep(100);
Console.Write(".");
}
return null;
});
// wait a second to see some output and show the WasAborted value as false
Thread.Sleep(1000);
Console.WriteLine("WasAborted: " + doNotDoThis.WasAborted);
// wait another second, abort, and print the time
Thread.Sleep(1000);
doNotDoThis.Abort();
Console.WriteLine("Abort triggered at " + DateTime.Now);
// wait until the abort finishes and print the time
while (!doNotDoThis.WasAborted) { Thread.CurrentThread.Join(0); }
Console.WriteLine("WasAborted: " + doNotDoThis.WasAborted + " at " + DateTime.Now);
Console.ReadKey();
You shouldn't use Thread.Abort()
Tasks can be Cancelled but not aborted.
The Thread.Abort() method is (severely) deprecated.
Both Threads and Tasks should cooperate when being stopped; otherwise you run the risk of leaving the system in an unstable/undefined state.
If you do need to run a Process and kill it from the outside, the only safe option is to run it in a separate AppDomain.
This answer is about .net 3.5 and earlier.
Thread-abort handling has been improved since then, among other things by changing the way finally blocks work.
But Thread.Abort is still a suspect solution that you should always try to avoid.
And in .NET Core and .NET 5+, Thread.Abort() now throws a PlatformNotSupportedException, which rather underscores the 'deprecated' point.
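As a hedged illustration of that cooperation (a standalone sketch, not tied to any code above), the Task body itself has to observe the token:
using System;
using System.Threading;
using System.Threading.Tasks;

class CooperativeCancellationDemo
{
    static void Main()
    {
        var cts = new CancellationTokenSource();

        var worker = Task.Run(() =>
        {
            while (true)
            {
                // The work checks the token at safe points; ThrowIfCancellationRequested
                // moves the task to the Canceled state instead of tearing the thread down.
                cts.Token.ThrowIfCancellationRequested();
                Thread.Sleep(100); // pretend unit of work
            }
        }, cts.Token);

        cts.CancelAfter(TimeSpan.FromSeconds(1));
        try { worker.Wait(); }
        catch (AggregateException ex) when (ex.InnerException is OperationCanceledException)
        {
            Console.WriteLine("Worker cancelled cooperatively.");
        }
    }
}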
Everyone knows (hopefully) that it's bad to terminate a thread. The problem is when you don't own the piece of code you're calling. If this code is running in some do/while infinite loop, itself calling native functions, etc., you're basically stuck. When this happens during your own code's termination, Stop, or Dispose call, it's kind of OK to start shooting the bad guys (so you don't become a bad guy yourself).
So, for what it's worth, I've written these two blocking functions that use their own native thread, not a thread from the pool or a thread created by the CLR. They will stop the thread if a timeout occurs:
// returns true if the call went to completion successfully, false otherwise
public static bool RunWithAbort(this Action action, int milliseconds) => RunWithAbort(action, new TimeSpan(0, 0, 0, 0, milliseconds));
public static bool RunWithAbort(this Action action, TimeSpan delay)
{
if (action == null)
throw new ArgumentNullException(nameof(action));
var source = new CancellationTokenSource(delay);
var success = false;
var handle = IntPtr.Zero;
var fn = new Action(() =>
{
using (source.Token.Register(() => TerminateThread(handle, 0)))
{
action();
success = true;
}
});
handle = CreateThread(IntPtr.Zero, IntPtr.Zero, fn, IntPtr.Zero, 0, out var id);
WaitForSingleObject(handle, 100 + (int)delay.TotalMilliseconds);
CloseHandle(handle);
return success;
}
// returns what the function should return if the call ran to completion successfully, default(T) otherwise
public static T RunWithAbort<T>(this Func<T> func, int milliseconds) => RunWithAbort(func, new TimeSpan(0, 0, 0, 0, milliseconds));
public static T RunWithAbort<T>(this Func<T> func, TimeSpan delay)
{
if (func == null)
throw new ArgumentNullException(nameof(func));
var source = new CancellationTokenSource(delay);
var item = default(T);
var handle = IntPtr.Zero;
var fn = new Action(() =>
{
using (source.Token.Register(() => TerminateThread(handle, 0)))
{
item = func();
}
});
handle = CreateThread(IntPtr.Zero, IntPtr.Zero, fn, IntPtr.Zero, 0, out var id);
WaitForSingleObject(handle, 100 + (int)delay.TotalMilliseconds);
CloseHandle(handle);
return item;
}
[DllImport("kernel32")]
private static extern bool TerminateThread(IntPtr hThread, int dwExitCode);
[DllImport("kernel32")]
private static extern IntPtr CreateThread(IntPtr lpThreadAttributes, IntPtr dwStackSize, Delegate lpStartAddress, IntPtr lpParameter, int dwCreationFlags, out int lpThreadId);
[DllImport("kernel32")]
private static extern bool CloseHandle(IntPtr hObject);
[DllImport("kernel32")]
private static extern int WaitForSingleObject(IntPtr hHandle, int dwMilliseconds);
While it's possible to abort a thread, in practice it's almost always a very bad idea to do so. Aborting a thread means the thread is not given a chance to clean up after itself, leaving resources unreleased and things in unknown states.
In practice, if you abort a thread, you should only do so in conjunction with killing the process. Sadly, all too many people think ThreadAbort is a viable way of stopping something and continuing on; it's not.
Since Tasks run on threads, you can call Thread.Abort on them, but as with ordinary threads you almost never want to do this, except as a last resort.
I faced a similar problem with Excel's Application.Workbooks.
If the application is busy, the method hangs forever. My approach was simply to try to get it in a task and wait; if it takes too long, I just leave the task be and move on (there is no harm "in this case": Excel will unfreeze the moment the user finishes whatever is keeping it busy).
In this case, it's impossible to use a cancellation token. The advantage is that I don't need excessive code, aborting threads, etc.
public static List<Workbook> GetAllOpenWorkbooks()
{
//gets all open Excel applications
List<Application> applications = GetAllOpenApplications();
//this is what we want to get from the third party library that may freeze
List<Workbook> books = null;
//as Excel may freeze here due to being busy, we try to get the workbooks asynchronously
Task task = Task.Run(() =>
{
try
{
books = applications
.SelectMany(app => app.Workbooks.OfType<Workbook>()).ToList();
}
catch { }
});
//wait for task completion
task.Wait(5000);
return books; //handle outside if books is null
}
This is my implementation of the idea presented by @Simon-Mourier, using a .NET thread; short and simple code:
public static bool RunWithAbort(this Action action, int milliseconds)
{
if (action == null) throw new ArgumentNullException(nameof(action));
var success = false;
var thread = new Thread(() =>
{
action();
success = true;
});
thread.IsBackground = true;
thread.Start();
thread.Join(milliseconds);
thread.Abort();
return success;
}
You can "abort" a task by running it on a thread you control and aborting that thread. This causes the task to complete in a faulted state with a ThreadAbortException. You can control thread creation with a custom task scheduler, as described in this answer. Note that the caveat about aborting a thread applies.
(If you don't ensure the task is created on its own thread, aborting it would abort either a thread-pool thread or the thread initiating the task, neither of which you typically want to do.)
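A compact sketch of that "thread you control" idea without a full custom scheduler (my own illustration, with all the Thread.Abort caveats above): a plain Thread runs the delegate and completes a TaskCompletionSource, so you keep a handle to the thread that you could abort.
using System;
using System.Threading;
using System.Threading.Tasks;

static class DedicatedThreadTask
{
    // Runs 'work' on its own dedicated thread and exposes that thread via the out parameter.
    public static Task Start(Action work, out Thread thread)
    {
        var tcs = new TaskCompletionSource<object>();
        thread = new Thread(() =>
        {
            try { work(); tcs.SetResult(null); }
            catch (Exception ex) { tcs.SetException(ex); } // a ThreadAbortException lands here too
        }) { IsBackground = true };
        thread.Start();
        return tcs.Task;
    }
}

// Usage (last resort only):
// Thread worker;
// Task task = DedicatedThreadTask.Start(() => { while (true) { } }, out worker);
// worker.Abort(); // the task then completes in a faulted state with a ThreadAbortException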
using System;
using System.Threading;
using System.Threading.Tasks;
...
var cts = new CancellationTokenSource();
var task = Task.Run(() => { while (true) { } });
Parallel.Invoke(() =>
{
task.Wait(cts.Token);
}, () =>
{
Thread.Sleep(1000);
cts.Cancel();
});
This is a simple snippet that uses a CancellationTokenSource to stop waiting on a never-ending task. Note that the task itself is not aborted and keeps running; only the Wait is cancelled.
I have a simple implementation of an HTTP server; the code is shown below. It was tested on a server machine with 32 cores. If I wrap the processContext method in a Task.Run call, the performance at least doubles. Given that this produced a performance gain in this particular case, I am now confused: when I have a method returning a Task that I don't wish to await, what strategy should I follow? Should I call it directly, or should I wrap it in Task.Run?
class Program
{
private static ConcurrentBag<DateTime> _trails = new ConcurrentBag<DateTime>();
static void Main(string[] args)
{
Console.WriteLine($"Is server GC: {(GCSettings.IsServerGC ? "true" : "false")}");
string prefix = args[0];
CancellationTokenSource cancellationSource = new CancellationTokenSource();
HttpListener httpListener = new HttpListener();
httpListener.Prefixes.Add(prefix);
httpListener.Start();
Task.Run(async () =>
{
while (!cancellationSource.Token.IsCancellationRequested)
{
HttpListenerContext context = null;
try
{
context = await httpListener.GetContextAsync();
if (cancellationSource.Token.IsCancellationRequested)
{
context.Response.Abort();
break;
}
}
catch (ObjectDisposedException)
{
return;
}
catch (HttpListenerException ex)
{
if (cancellationSource.Token.IsCancellationRequested && ex.ErrorCode == 995)
{
break;
}
throw;
}
// Uncommenting the line below and commenting out the next one at least doubles the performance
// Task childProcessingTask = Task.Run(async () => await processContext(context));
var dt = processContext(context);
}
});
using (Timer t = new Timer(o => Console.Title = $"Async Server: {_trails.Count}", null, 0, 5000))
{
Console.WriteLine("Running...");
Console.ReadLine();
cancellationSource.Cancel();
Console.WriteLine("Stopped accepting new request. Waiting for pending requests...");
Console.WriteLine("Stopped");
httpListener.Close();
}
var gTrails = _trails.GroupBy(t => new DateTime(t.Year, t.Month, t.Day, t.Hour, t.Minute, 0))
.Select(g => new { MinuteDt = g.Key, Count = g.Count() })
.OrderBy(x => x.MinuteDt).ToList();
gTrails.ForEach(x => Console.WriteLine($"{x.MinuteDt:HH:mm}\t{x.Count}"));
if (gTrails.Count > 2)
{
decimal avg = gTrails.Take(gTrails.Count - 1).Skip(1).Average(g => (decimal)g.Count);
Console.WriteLine($"Average: {avg:0.00}/min, {avg / 60.0m:00.0}/sec");
}
Console.ReadLine();
}
private static async Task processContext(HttpListenerContext context)
{
DateTime requestDt = DateTime.Now;
Stopwatch sw = Stopwatch.StartNew();
string requestId = context.Request.QueryString["requestId"];
byte[] requestIdBytes = Encoding.ASCII.GetBytes(requestId);
context.Response.ContentLength64 = requestIdBytes.Length;
await context.Response.OutputStream.WriteAsync(requestIdBytes, 0, requestIdBytes.Length);
try
{
context.Response.Close();
}
catch { }
_trails.Add(requestDt);
}
}
Why does wrapping an awaitable async method in Task.Run improve the performance at least twofold?
Many async methods have a synchronous part that runs on the caller's thread.
In the case of your processContext method, the code from the start of the method to the first await runs on the caller's thread.
If you don't use Task.Run, then after the connection is accepted, your software first runs the synchronous portion of the processContext method. When it hits await, the task's context goes to the heap, the thread is freed, and it resumes another iteration of the while (!cancellationSource.Token.IsCancellationRequested) loop.
Soon the task is finished and the scheduler wants to resume it.
But it can't resume it on the same thread where it started, because that thread is likely busy listening for new connections and starting other child tasks.
You have many cores, so there's a very good chance the task will resume on another core. If it happens on another core of the same CPU, that core will have to wait for data (like local variables, HttpListenerContext instance variables, etc.) from the L3 cache, because L1 and L2 caches are per core. If it happens on another CPU, the core will have to wait for system RAM, which is even slower.
If you use Task.Run, the thread that was running that endless while(!IsCancellationRequested) loop stays doing that, and immediately continues to another iteration of that loop, all data already on the cache of that core.
The processContext method will start running on some other core from the very beginning. If you’re only sending a few bytes, that await WriteAsync is going to return very fast. The scheduler ain’t stupid. If you have 32 cores and not that many tasks, the scheduler will likely resume processContext task on the same core where it was started, with all your session-specific data already on the cache of that core.
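A tiny sketch (separate from the question's server code) that makes the "synchronous part" visible: everything before the first await runs on the caller's thread.
using System;
using System.Threading;
using System.Threading.Tasks;

class SyncPrefixDemo
{
    static async Task ProcessAsync()
    {
        // This line is the synchronous part: it runs on whichever thread called ProcessAsync.
        Console.WriteLine($"Before await: thread {Thread.CurrentThread.ManagedThreadId}");
        await Task.Delay(10); // the caller's thread is released here
        Console.WriteLine($"After await:  thread {Thread.CurrentThread.ManagedThreadId}");
    }

    static void Main()
    {
        Console.WriteLine($"Caller:       thread {Thread.CurrentThread.ManagedThreadId}");
        ProcessAsync().Wait(); // console apps have no sync context, so this Wait does not deadlock
    }
}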
what strategy should I follow?
Test several techniques and pick whatever works best with your workload. Task.Run isn't always faster; it just happens to be in your situation.
Or understand how things work under the hood; you'll then be able to make educated guesses. Less precise, but much faster to implement.
Our application uses the TPL to serialize (potentially) long running units of work. The creation of work (tasks) is user-driven and may be cancelled at any time. In order to have a responsive user interface, if the current piece of work is no longer required we would like to abandon what we were doing, and immediately start a different task.
Tasks are queued up something like this:
private Task workQueue;
private void DoWorkAsync
(Action<WorkCompletedEventArgs> callback, CancellationToken token)
{
if (workQueue == null)
{
workQueue = Task.Factory.StartNew
(() => DoWork(callback, token), token);
}
else
{
workQueue.ContinueWith(t => DoWork(callback, token), token);
}
}
The DoWork method contains a long running call, so it is not as simple as constantly checking the status of token.IsCancellationRequested and bailing if/when a cancel is detected. The long running work will block the Task continuations until it finishes, even if the task is cancelled.
I have come up with two sample methods to work around this issue, but am not convinced that either are proper. I created simple console applications to demonstrate how they work.
The important point to note is that the continuation fires before the original task completes.
Attempt #1: An inner task
static void Main(string[] args)
{
CancellationTokenSource cts = new CancellationTokenSource();
var token = cts.Token;
token.Register(() => Console.WriteLine("Token cancelled"));
// Initial work
var t = Task.Factory.StartNew(() =>
{
Console.WriteLine("Doing work");
// Wrap the long running work in a task, and then wait for it to complete
// or the token to be cancelled.
var innerT = Task.Factory.StartNew(() => Thread.Sleep(3000), token);
innerT.Wait(token);
token.ThrowIfCancellationRequested();
Console.WriteLine("Completed.");
}
, token);
// Second chunk of work which, in the real world, would be identical to the
// first chunk of work.
t.ContinueWith((lastTask) =>
{
Console.WriteLine("Continuation started");
});
// Give the user 3s to cancel the first batch of work
Console.ReadKey();
if (t.Status == TaskStatus.Running)
{
Console.WriteLine("Cancel requested");
cts.Cancel();
Console.ReadKey();
}
}
This works, but the "innerT" Task feels extremely kludgy to me. It also has the drawback of forcing me to refactor all parts of my code that queue up work in this manner, since it necessitates wrapping every long-running call in a new Task.
Attempt #2: TaskCompletionSource tinkering
static void Main(string[] args)
{
var tcs = new TaskCompletionSource<object>();
//Wire up the token's cancellation to trigger the TaskCompletionSource's cancellation
CancellationTokenSource cts = new CancellationTokenSource();
var token = cts.Token;
token.Register(() =>
{
Console.WriteLine("Token cancelled");
tcs.SetCanceled();
});
var innerT = Task.Factory.StartNew(() =>
{
Console.WriteLine("Doing work");
Thread.Sleep(3000);
Console.WriteLine("Completed.");
// When the work has complete, set the TaskCompletionSource so that the
// continuation will fire.
tcs.SetResult(null);
});
// Second chunk of work which, in the real world, would be identical to the
// first chunk of work.
// Note that we continue when the TaskCompletionSource's task finishes,
// not the above innerT task.
tcs.Task.ContinueWith((lastTask) =>
{
Console.WriteLine("Continuation started");
});
// Give the user 3s to cancel the first batch of work
Console.ReadKey();
if (innerT.Status == TaskStatus.Running)
{
Console.WriteLine("Cancel requested");
cts.Cancel();
Console.ReadKey();
}
}
Again this works, but now I have two problems:
a) It feels like I'm abusing TaskCompletionSource by never using its result, and just setting null when I've finished my work.
b) In order to properly wire up continuations I need to keep a handle on the previous unit of work's unique TaskCompletionSource, and not the task that was created for it. This is technically possible, but again feels clunky and strange.
Where to go from here?
To reiterate, my question is: is either of these methods the "correct" way to tackle this problem, or is there a more correct/elegant solution that will allow me to prematurely abort a long-running task and immediately start a continuation? My preference is for a low-impact solution, but I'd be willing to undertake some huge refactoring if it's the right thing to do.
Alternatively, is the TPL even the correct tool for the job, or am I missing a better task-queuing mechanism? My target framework is .NET 4.0.
The real issue here is that the long-running call in DoWork is not cancellation-aware. If I understand correctly, what you're doing here is not really cancelling the long-running work, but merely allowing the continuation to execute and, when the work completes on the cancelled task, ignoring the result. For example, if you used the inner task pattern to call CrunchNumbers(), which takes several minutes, cancelling the outer task will allow continuation to occur, but CrunchNumbers() will continue to execute in the background until completion.
I don't think there's any real way around this other than making your long-running calls support cancellation. Often this isn't possible (they may be blocking API calls, with no API support for cancellation.) When this is the case, it's really a flaw in the API; you may check to see if there are alternate API calls that could be used to perform the operation in a way that can be cancelled. One hack approach to this is to capture a reference to the underlying Thread being used by the Task when the Task is started and then call Thread.Interrupt. This will wake up the thread from various sleep states and allow it to terminate, but in a potentially nasty way. Worst case, you can even call Thread.Abort, but that's even more problematic and not recommended.
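A hedged sketch of that Thread.Interrupt hack (illustrative only; the names and timings are made up): the task records its own thread, and the canceller interrupts it out of its blocking wait.
using System;
using System.Threading;
using System.Threading.Tasks;

class InterruptHackDemo
{
    static void Main()
    {
        Thread workerThread = null;

        var longRunning = Task.Factory.StartNew(() =>
        {
            workerThread = Thread.CurrentThread; // capture the thread actually running the work
            try
            {
                Thread.Sleep(Timeout.Infinite);  // stand-in for a blocking, non-cancellable call
            }
            catch (ThreadInterruptedException)
            {
                Console.WriteLine("Blocking call interrupted; cleaning up.");
            }
        });

        Thread.Sleep(500);         // give the task time to start (a real app would synchronize properly)
        workerThread?.Interrupt(); // wakes the thread out of Sleep/Wait/Join with ThreadInterruptedException
        longRunning.Wait();
    }
}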
Here is a stab at a delegate-based wrapper. It's untested, but I think it will do the trick; feel free to edit the answer if you make it work and have fixes/improvements.
public sealed class AbandonableTask
{
private readonly CancellationToken _token;
private readonly Action _beginWork;
private readonly Action _blockingWork;
private readonly Action<Task> _afterComplete;
private AbandonableTask(CancellationToken token,
Action beginWork,
Action blockingWork,
Action<Task> afterComplete)
{
if (blockingWork == null) throw new ArgumentNullException("blockingWork");
_token = token;
_beginWork = beginWork;
_blockingWork = blockingWork;
_afterComplete = afterComplete;
}
private void RunTask()
{
if (_beginWork != null)
_beginWork();
var innerTask = new Task(_blockingWork,
_token,
TaskCreationOptions.LongRunning);
innerTask.Start();
innerTask.Wait(_token);
if (innerTask.IsCompleted && _afterComplete != null)
{
_afterComplete(innerTask);
}
}
public static Task Start(CancellationToken token,
Action blockingWork,
Action beginWork = null,
Action<Task> afterComplete = null)
{
if (blockingWork == null) throw new ArgumentNullException("blockingWork");
var worker = new AbandonableTask(token, beginWork, blockingWork, afterComplete);
var outerTask = new Task(worker.RunTask, token);
outerTask.Start();
return outerTask;
}
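For reference, usage might look something like the following (my own sketch, not part of the original answer): the continuation fires as soon as the token is cancelled, even though the blocking inner work keeps running in the background until it finishes.
var cts = new CancellationTokenSource();

var outer = AbandonableTask.Start(
    cts.Token,
    blockingWork: () => Thread.Sleep(10000),                   // the long-running, non-cancellable call
    beginWork: () => Console.WriteLine("Work started"),
    afterComplete: t => Console.WriteLine("Work finished"));

outer.ContinueWith(t => Console.WriteLine("Continuation started"));

Thread.Sleep(500); // let the work start

// Abandon the work early: the outer task completes (cancelled) and the continuation runs,
// but the blocking inner work continues in the background until it finishes on its own.
cts.Cancel();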
}