I am trying to understand some code (for performance reasons) that is processing tasks from a queue. The code is C# .NET Framework 4.8 (And I didn't write this stuff)
I have this code creating a timer that from what I can tell should use a new thread every 10 seconds
_myTimer = new Timer(new TimerCallback(OnTimerGo), null, 0, 10000 );
Inside the onTimerGo it calls DoTask() inside of DoTask() it grabs a task off a queue and then does this
System.Threading.Tasks.Task.Factory.StartNew(ProcessTask, task).ContinueWith(c => DoTask());
My reading of this is that a new thread should start running OnTimerGo every 10 seconds, and that thread should in parralel run ProcessTask on tasks as fast as it can get them from the queue.
I inserted some code to call ThreadPool.GetMaxThreads and ThreadPool.GetAvailableThreads to figure out how many threads were in use. Then I queued up 10,000 things for it to do and let it loose.
I never see more then 4 threads in use at a time. This is running on a c4.4xlarge ec2 instance... so 16 vCPU 30 gb mem. The get max and available return over 2k. So I would expect more threads. By looking at the logging I can see that a total of 50ish different threads (by thread id) end up doing the work over the course of 20 minutes. Since the timer is set to every 10 seconds, I would expect 100 threads to be doing the work (or for it to finish sooner).
Looking at the code, the only time a running thread should stop is if it asks for a task from the queue and doesn't get one. Some other logging shows that there are never more than 2 tasks running in a thread. This is probably because they work is pretty fast. So the threads shouldn't be exiting, and I can even see from the logs that many of them end up doing as many as 500 tasks over the 20 minutes.
so... what am I missing here. Are the ThreadPool.GetMaxThreads and ThreadPool.GetAvailableThreads not accurate if run from inside a thread? Is something shutting down some of the threads while letting others keep going?
EDIT: adding more code
public static void StartScheduler()
{
lock (TimerLock)
{
if (_timerShutdown == false)
{
_myTimer = new Timer(new TimerCallback(OnTimerGo), null, 0, 10 );
const int numberOfSecondsPerMinute = 60;
const int margin = 1;
var pollEventsPerMinute = (numberOfSecondsPerMinute/SystemPreferences.TaskPollingIntervalSeconds);
_numberOfTimerCallsForHeartbeat = pollEventsPerMinute - margin;
}
}
}
private static void OnTimerGo(object state)
{
try
{
_lastTimer = DateTime.UtcNow;
var currentTickCount = Interlocked.Increment(ref _timerCallCount);
if (currentTickCount == _numberOfTimerCallsForHeartbeat)
{
Interlocked.Exchange(ref _timerCallCount, 0);
MonitoringTools.SendHeartbeatMetric(Heartbeat);
}
CheckForTasks();
}
catch (Exception e)
{
Log.Warn("Scheduler: OnTimerGo exception", e);
}
}
public static void CheckForTasks()
{
try
{
if (DoTask())
_lastStart = DateTime.UtcNow;
_lastStartOrCheck = DateTime.UtcNow;
}
catch (Exception e)
{
Log.Error("Unexpected exception checking for tasks", e);
}
}
private static bool DoTask()
{
Func<DataContext, bool> a = db =>
{
var mtid = Thread.CurrentThread.ManagedThreadId;
int totalThreads = Process.GetCurrentProcess().Threads.Count;
int maxWorkerThreads;
int maxPortThreads;
ThreadPool.GetMaxThreads(out maxWorkerThreads, out maxPortThreads);
int AvailableWorkerThreads;
int AvailablePortThreads;
ThreadPool.GetAvailableThreads(out AvailableWorkerThreads, out AvailablePortThreads);
int usedWorkerThreads = maxWorkerThreads - AvailableWorkerThreads;
string usedThreadMessage = $"Thread {mtid}: Threads in Use count: {usedWorkerThreads}";
Log.Info(usedThreadMessage);
var taskTypeAndTasks = GetTaskListTypeAndTasks();
var task = GetNextTask(db, taskTypeAndTasks.Key, taskTypeAndTasks.Value);
if (_timerShutdown)
{
Log.Debug("Task processing stopped.");
return false;
}
if (task == null)
{
Log.DebugFormat("DoTask: Idle in thread {0} ({1} tasks running)", mtid, _processingTaskLock);
return false;
}
Log.DebugFormat("DoTask: starting task {2}:{0} on thread {1}", task.Id, mtid, task.Class);
System.Threading.Tasks.Task.Factory.StartNew(ProcessTask, task).ContinueWith(c => DoTask());
Log.DebugFormat("DoTask: done ({0})", mtid);
return true;
};
return DbExtensions.WithDbWrite(ctx => a(ctx));
}
The Task.Factory.StartNew by default doesn't create a new thread. It borrows a thread from the ThreadPool instead.
The ThreadPool is intended as a small pool of reusable threads, to help amortize the cost of running frequent and lightweight operations like callbacks, continuations, event handers etc. Depleting the ThreadPool from available workers by scheduling too much work on it, results in a situation that is called saturation or starvation. And as you've already figured out, it's not a happy situation to be.
You can prevent the saturation of the ThreadPool by running your long-running work on dedicated threads instead of ThreadPool threads. This can be done by passing the TaskCreationOptions.LongRunning as argument to the Task.Factory.StartNew:
_ = Task.Factory.StartNew(ProcessTask, task, CancellationToken.None,
TaskCreationOptions.LongRunning,
TaskScheduler.Default).ContinueWith(t => DoTask(), CancellationToken.None,
TaskContinuationOptions.ExecuteSynchronously,
TaskScheduler.Default);
The above code schedules the ProcessTask(task) on a new thread, and after the invocation is completed either successfully or unsuccessfully, the DoTask will be invoked on the same thread. Finally the thread will be terminated. The discard _ signifies that the continuation Task (the task returned by the ContinueWith) is fire-and-forget. Which, to put it mildly, is architecturally suspicious. 😃
In case you are wondering why I pass the TaskScheduler.Default explicitly as argument to StartNew and ContinueWith, check out this link.
My reading of this is that a new thread should start running OnTimerGo every 10 seconds, and that thread should in parralel run ProcessTask on tasks as fast as it can get them from the queue.
Well, that is definitely not what's happening. It's a lot of uncertainty about your code, but it's clear that another DoTask is starting AFTER ProcessTask completes. And that is not parallel execution. Your line of code is this
System.Threading.Tasks.Task.Factory.StartNew(ProcessTask, task).ContinueWith(c => DoTask());
I suggest you to start another DoTask right there like this:
System.Threading.Tasks.Task.Factory.StartNew(ProcessTask, task);
DoTask();
Make sure your code is ready for parallel execution, though.
Related
We could abort a Thread like this:
Thread thread = new Thread(SomeMethod);
.
.
.
thread.Abort();
But can I abort a Task (in .Net 4.0) in the same way not by cancellation mechanism. I want to kill the Task immediately.
The guidance on not using a thread abort is controversial. I think there is still a place for it but in exceptional circumstance. However you should always attempt to design around it and see it as a last resort.
Example;
You have a simple windows form application that connects to a blocking synchronous web service. Within which it executes a function on the web service within a Parallel loop.
CancellationTokenSource cts = new CancellationTokenSource();
ParallelOptions po = new ParallelOptions();
po.CancellationToken = cts.Token;
po.MaxDegreeOfParallelism = System.Environment.ProcessorCount;
Parallel.ForEach(iListOfItems, po, (item, loopState) =>
{
Thread.Sleep(120000); // pretend web service call
});
Say in this example, the blocking call takes 2 mins to complete. Now I set my MaxDegreeOfParallelism to say ProcessorCount. iListOfItems has 1000 items within it to process.
The user clicks the process button and the loop commences, we have 'up-to' 20 threads executing against 1000 items in the iListOfItems collection. Each iteration executes on its own thread. Each thread will utilise a foreground thread when created by Parallel.ForEach. This means regardless of the main application shutdown, the app domain will be kept alive until all threads have finished.
However the user needs to close the application for some reason, say they close the form.
These 20 threads will continue to execute until all 1000 items are processed. This is not ideal in this scenario, as the application will not exit as the user expects and will continue to run behind the scenes, as can be seen by taking a look in task manger.
Say the user tries to rebuild the app again (VS 2010), it reports the exe is locked, then they would have to go into task manager to kill it or just wait until all 1000 items are processed.
I would not blame you for saying, but of course! I should be cancelling these threads using the CancellationTokenSource object and calling Cancel ... but there are some problems with this as of .net 4.0. Firstly this is still never going to result in a thread abort which would offer up an abort exception followed by thread termination, so the app domain will instead need to wait for the threads to finish normally, and this means waiting for the last blocking call, which would be the very last running iteration (thread) that ultimately gets to call po.CancellationToken.ThrowIfCancellationRequested.
In the example this would mean the app domain could still stay alive for up to 2 mins, even though the form has been closed and cancel called.
Note that Calling Cancel on CancellationTokenSource does not throw an exception on the processing thread(s), which would indeed act to interrupt the blocking call similar to a thread abort and stop the execution. An exception is cached ready for when all the other threads (concurrent iterations) eventually finish and return, the exception is thrown in the initiating thread (where the loop is declared).
I chose not to use the Cancel option on a CancellationTokenSource object. This is wasteful and arguably violates the well known anti-patten of controlling the flow of the code by Exceptions.
Instead, it is arguably 'better' to implement a simple thread safe property i.e. Bool stopExecuting. Then within the loop, check the value of stopExecuting and if the value is set to true by the external influence, we can take an alternate path to close down gracefully. Since we should not call cancel, this precludes checking CancellationTokenSource.IsCancellationRequested which would otherwise be another option.
Something like the following if condition would be appropriate within the loop;
if (loopState.ShouldExitCurrentIteration || loopState.IsExceptional || stopExecuting) {loopState.Stop(); return;}
The iteration will now exit in a 'controlled' manner as well as terminating further iterations, but as I said, this does little for our issue of having to wait on the long running and blocking call(s) that are made within each iteration (parallel loop thread), since these have to complete before each thread can get to the option of checking if it should stop.
In summary, as the user closes the form, the 20 threads will be signaled to stop via stopExecuting, but they will only stop when they have finished executing their long running function call.
We can't do anything about the fact that the application domain will always stay alive and only be released when all foreground threads have completed. And this means there will be a delay associated with waiting for any blocking calls made within the loop to complete.
Only a true thread abort can interrupt the blocking call, and you must mitigate leaving the system in a unstable/undefined state the best you can in the aborted thread's exception handler which goes without question. Whether that's appropriate is a matter for the programmer to decide, based on what resource handles they chose to maintain and how easy it is to close them in a thread's finally block. You could register with a token to terminate on cancel as a semi workaround i.e.
CancellationTokenSource cts = new CancellationTokenSource();
ParallelOptions po = new ParallelOptions();
po.CancellationToken = cts.Token;
po.MaxDegreeOfParallelism = System.Environment.ProcessorCount;
Parallel.ForEach(iListOfItems, po, (item, loopState) =>
{
using (cts.Token.Register(Thread.CurrentThread.Abort))
{
Try
{
Thread.Sleep(120000); // pretend web service call
}
Catch(ThreadAbortException ex)
{
// log etc.
}
Finally
{
// clean up here
}
}
});
but this will still result in an exception in the declaring thread.
All things considered, interrupt blocking calls using the parallel.loop constructs could have been a method on the options, avoiding the use of more obscure parts of the library. But why there is no option to cancel and avoid throwing an exception in the declaring method strikes me as a possible oversight.
But can I abort a Task (in .Net 4.0) in the same way not by
cancellation mechanism. I want to kill the Task immediately.
Other answerers have told you not to do it. But yes, you can do it. You can supply Thread.Abort() as the delegate to be called by the Task's cancellation mechanism. Here is how you could configure this:
class HardAborter
{
public bool WasAborted { get; private set; }
private CancellationTokenSource Canceller { get; set; }
private Task<object> Worker { get; set; }
public void Start(Func<object> DoFunc)
{
WasAborted = false;
// start a task with a means to do a hard abort (unsafe!)
Canceller = new CancellationTokenSource();
Worker = Task.Factory.StartNew(() =>
{
try
{
// specify this thread's Abort() as the cancel delegate
using (Canceller.Token.Register(Thread.CurrentThread.Abort))
{
return DoFunc();
}
}
catch (ThreadAbortException)
{
WasAborted = true;
return false;
}
}, Canceller.Token);
}
public void Abort()
{
Canceller.Cancel();
}
}
disclaimer: don't do this.
Here is an example of what not to do:
var doNotDoThis = new HardAborter();
// start a thread writing to the console
doNotDoThis.Start(() =>
{
while (true)
{
Thread.Sleep(100);
Console.Write(".");
}
return null;
});
// wait a second to see some output and show the WasAborted value as false
Thread.Sleep(1000);
Console.WriteLine("WasAborted: " + doNotDoThis.WasAborted);
// wait another second, abort, and print the time
Thread.Sleep(1000);
doNotDoThis.Abort();
Console.WriteLine("Abort triggered at " + DateTime.Now);
// wait until the abort finishes and print the time
while (!doNotDoThis.WasAborted) { Thread.CurrentThread.Join(0); }
Console.WriteLine("WasAborted: " + doNotDoThis.WasAborted + " at " + DateTime.Now);
Console.ReadKey();
You shouldn't use Thread.Abort()
Tasks can be Cancelled but not aborted.
The Thread.Abort() method is (severely) deprecated.
Both Threads and Tasks should cooperate when being stopped, otherwise you run the risk of leaving the system in a unstable/undefined state.
If you do need to run a Process and kill it from the outside, the only safe option is to run it in a separate AppDomain.
This answer is about .net 3.5 and earlier.
Thread-abort handling has been improved since then, a.o. by changing the way finally blocks work.
But Thread.Abort is still a suspect solution that you should always try to avoid.
And in .net Core (.net 5+) Thread.Abort() will now throw a PlatformNotSupportedException .
Kind of underscoring the 'deprecated' point.
Everyone knows (hopefully) its bad to terminate thread. The problem is when you don't own a piece of code you're calling. If this code is running in some do/while infinite loop , itself calling some native functions, etc. you're basically stuck. When this happens in your own code termination, stop or Dispose call, it's kinda ok to start shooting the bad guys (so you don't become a bad guy yourself).
So, for what it's worth, I've written those two blocking functions that use their own native thread, not a thread from the pool or some thread created by the CLR. They will stop the thread if a timeout occurs:
// returns true if the call went to completion successfully, false otherwise
public static bool RunWithAbort(this Action action, int milliseconds) => RunWithAbort(action, new TimeSpan(0, 0, 0, 0, milliseconds));
public static bool RunWithAbort(this Action action, TimeSpan delay)
{
if (action == null)
throw new ArgumentNullException(nameof(action));
var source = new CancellationTokenSource(delay);
var success = false;
var handle = IntPtr.Zero;
var fn = new Action(() =>
{
using (source.Token.Register(() => TerminateThread(handle, 0)))
{
action();
success = true;
}
});
handle = CreateThread(IntPtr.Zero, IntPtr.Zero, fn, IntPtr.Zero, 0, out var id);
WaitForSingleObject(handle, 100 + (int)delay.TotalMilliseconds);
CloseHandle(handle);
return success;
}
// returns what's the function should return if the call went to completion successfully, default(T) otherwise
public static T RunWithAbort<T>(this Func<T> func, int milliseconds) => RunWithAbort(func, new TimeSpan(0, 0, 0, 0, milliseconds));
public static T RunWithAbort<T>(this Func<T> func, TimeSpan delay)
{
if (func == null)
throw new ArgumentNullException(nameof(func));
var source = new CancellationTokenSource(delay);
var item = default(T);
var handle = IntPtr.Zero;
var fn = new Action(() =>
{
using (source.Token.Register(() => TerminateThread(handle, 0)))
{
item = func();
}
});
handle = CreateThread(IntPtr.Zero, IntPtr.Zero, fn, IntPtr.Zero, 0, out var id);
WaitForSingleObject(handle, 100 + (int)delay.TotalMilliseconds);
CloseHandle(handle);
return item;
}
[DllImport("kernel32")]
private static extern bool TerminateThread(IntPtr hThread, int dwExitCode);
[DllImport("kernel32")]
private static extern IntPtr CreateThread(IntPtr lpThreadAttributes, IntPtr dwStackSize, Delegate lpStartAddress, IntPtr lpParameter, int dwCreationFlags, out int lpThreadId);
[DllImport("kernel32")]
private static extern bool CloseHandle(IntPtr hObject);
[DllImport("kernel32")]
private static extern int WaitForSingleObject(IntPtr hHandle, int dwMilliseconds);
While it's possible to abort a thread, in practice it's almost always a very bad idea to do so. Aborthing a thread means the thread is not given a chance to clean up after itself, leaving resources undeleted, and things in unknown states.
In practice, if you abort a thread, you should only do so in conjunction with killing the process. Sadly, all too many people think ThreadAbort is a viable way of stopping something and continuing on, it's not.
Since Tasks run as threads, you can call ThreadAbort on them, but as with generic threads you almost never want to do this, except as a last resort.
I faced a similar problem with Excel's Application.Workbooks.
If the application is busy, the method hangs eternally. My approach was simply to try to get it in a task and wait, if it takes too long, I just leave the task be and go away (there is no harm "in this case", Excel will unfreeze the moment the user finishes whatever is busy).
In this case, it's impossible to use a cancellation token. The advantage is that I don't need excessive code, aborting threads, etc.
public static List<Workbook> GetAllOpenWorkbooks()
{
//gets all open Excel applications
List<Application> applications = GetAllOpenApplications();
//this is what we want to get from the third party library that may freeze
List<Workbook> books = null;
//as Excel may freeze here due to being busy, we try to get the workbooks asynchronously
Task task = Task.Run(() =>
{
try
{
books = applications
.SelectMany(app => app.Workbooks.OfType<Workbook>()).ToList();
}
catch { }
});
//wait for task completion
task.Wait(5000);
return books; //handle outside if books is null
}
This is my implementation of an idea presented by #Simon-Mourier, using the dotnet thread, short and simple code:
public static bool RunWithAbort(this Action action, int milliseconds)
{
if (action == null) throw new ArgumentNullException(nameof(action));
var success = false;
var thread = new Thread(() =>
{
action();
success = true;
});
thread.IsBackground = true;
thread.Start();
thread.Join(milliseconds);
thread.Abort();
return success;
}
You can "abort" a task by running it on a thread you control and aborting that thread. This causes the task to complete in a faulted state with a ThreadAbortException. You can control thread creation with a custom task scheduler, as described in this answer. Note that the caveat about aborting a thread applies.
(If you don't ensure the task is created on its own thread, aborting it would abort either a thread-pool thread or the thread initiating the task, neither of which you typically want to do.)
using System;
using System.Threading;
using System.Threading.Tasks;
...
var cts = new CancellationTokenSource();
var task = Task.Run(() => { while (true) { } });
Parallel.Invoke(() =>
{
task.Wait(cts.Token);
}, () =>
{
Thread.Sleep(1000);
cts.Cancel();
});
This is a simple snippet to abort a never-ending task with CancellationTokenSource.
I am creating a console program, which can test read / write to a Cache by simulating multiple clients, and have written following code. Please help me understand:
Is it correct way to achieve the multi client simulation
What can I do more to make it a genuine load test
void Main()
{
List<Task<long>> taskList = new List<Task<long>>();
for (int i = 0; i < 500; i++)
{
taskList.Add(TestAsync());
}
Task.WaitAll(taskList.ToArray());
long averageTime = taskList.Average(t => t.Result);
}
public static async Task<long> TestAsync()
{
// Returns the total time taken using Stop Watch in the same module
return await Task.Factory.StartNew(() => // Call Cache Read / Write);
}
Adjusted your code slightly to see how many threads we have at a particular time.
static volatile int currentExecutionCount = 0;
static void Main(string[] args)
{
List<Task<long>> taskList = new List<Task<long>>();
var timer = new Timer(Print, null, TimeSpan.FromSeconds(1), TimeSpan.FromSeconds(1));
for (int i = 0; i < 1000; i++)
{
taskList.Add(DoMagic());
}
Task.WaitAll(taskList.ToArray());
timer.Change(Timeout.Infinite, Timeout.Infinite);
timer = null;
//to check that we have all the threads executed
Console.WriteLine("Done " + taskList.Sum(t => t.Result));
Console.ReadLine();
}
static void Print(object state)
{
Console.WriteLine(currentExecutionCount);
}
static async Task<long> DoMagic()
{
return await Task.Factory.StartNew(() =>
{
Interlocked.Increment(ref currentExecutionCount);
//place your code here
Thread.Sleep(TimeSpan.FromMilliseconds(1000));
Interlocked.Decrement(ref currentExecutionCount);
return 4;
}
//this thing should give a hint to scheduller to use new threads and not scheduled
, TaskCreationOptions.LongRunning
);
}
The result is: inside a virtual machine I have from 2 to 10 threads running simultaneously if I don't use the hint. With the hint — up to 100. And on real machine I can see 1000 threads at once. Process explorer confirms this. Some details on the hint that would be helpful.
If it is very busy, then apparently your clients have to wait a while before their requests are serviced. Your program does not measure this, because your stopwatch starts running when the service request starts.
If you also want to measure what happen with the average time before a request is finished, you should start your stopwatch when the request is made, not when the request is serviced.
Your program takes only threads from the thread pool. If you start more tasks then there are threads, some tasks will have to wait before TestAsync starts running. This wait time would be measured if you remember the time Task.Run is called.
Besides the flaw in time measurements, how many service requests do you expect simultaneously? Are there enough free threads in your thread pool to simulate this? If you expect about 50 service requests at the same time, and the size of your thread pool is only 20 threads, then you'll never run 50 service requests at the same time. Vice versa: if your thread pool is way bigger than your number of expected simultaneous service requests, then you'll measure longer times than are actual the case.
Consider changing the number of threads in your thread pool, and make sure no one else uses any threads of the pool.
We have alot of requests in our system so we use Tasks with WebApi. On some places we have high requirements on speed so we cant wait for the Task to complete, I have created a Worker for this. It creates a nested container so that Entity frameworks DbContext wont get disposed etc. But it looks like Task.Run spawns a new thread for each time, how well will this scale?
public class BackgroundWorker<TScope> : IBusinessWorker<TScope>, IRegisteredObject where TScope : class
{
private readonly IBusinessScope<TScope> _scope;
private bool _started;
private bool _stopping;
public BackgroundWorker(IBusinessScope<TScope> scope)
{
_scope = scope;
}
public void Run(Func<TScope, Task> action)
{
if(_stopping) throw new Exception("App pool is recycling, cant queue work");
if(_started) throw new Exception("You cant call Run multiple times");
_started = true;
HostingEnvironment.RegisterObject(this);
Task.Run(() =>
action(_scope.EntryPoint).ContinueWith(t =>
{
_scope.Dispose();
HostingEnvironment.UnregisterObject(this);
}));
}
public void Stop(bool immediate)
{
_stopping = true;
if(immediate)
HostingEnvironment.UnregisterObject(this);
}
}
Used like
backgroundWorker.Run(async ctx => await ctx.AddRange(foos).Save());
If I google they all end up using Task.Run but doesn't that kill the purpose?
Update:
Did a test
var guid = Guid.NewGuid();
_businessWorker.Run(async ctx => {
System.Diagnostics.Debug.WriteLine("{0}: {1}", guid, Thread.CurrentThread.ManagedThreadId);
await Task.Delay(1);
System.Diagnostics.Debug.WriteLine("{0}: {1}", guid, Thread.CurrentThread.ManagedThreadId);
});
This outputs
3bdbe90b-c31e-4709-95d8-f7516210b0ac: 17
3bdbe90b-c31e-4709-95d8-f7516210b0ac: 9
6548fd26-d209-4427-9a91-40fc30aa509e: 15
6548fd26-d209-4427-9a91-40fc30aa509e: 19
7411b043-4fae-44bf-b93f-4273a532afa1: 7
7411b043-4fae-44bf-b93f-4273a532afa1: 17
Which indicates that Task.Run actually works like i think it should
With real DB code it looks like this
a939713d-d728-46c9-be33-aa57704cf242: 19 <--
a939713d-d728-46c9-be33-aa57704cf242: 19 <-- Used same for entire work
7e588a42-afd0-4ab5-ba6b-f8520c889cde: 7
7e588a42-afd0-4ab5-ba6b-f8520c889cde: 19 <-- Reused first works thread when work #2 continued
6f3b067f-f478-43f9-8411-8142b449c28b: 8
6f3b067f-f478-43f9-8411-8142b449c28b: 18
update:
Tried Luaan's approach, seems to work with Tasks spawned from EntityFramework or WebApi HttpClient, but with manual Tasks etc like below it does not work well, some are executed some are not. With Task.Run all are executed
_businessWorkerFactory().Run(async ctx =>
{
var guid = Guid.NewGuid();
System.Diagnostics.Debug.WriteLine("{0}: {1}", guid, Thread.CurrentThread.ManagedThreadId);
var completion = new TaskCompletionSource<bool>();
ThreadPool.QueueUserWorkItem(obj =>
{
Thread.Sleep(1000);
completion.SetResult(true);
});
await completion.Task;
System.Diagnostics.Debug.WriteLine("{0}: {1}", guid, Thread.CurrentThread.ManagedThreadId);
});
Task.Run schedules the task to run on a thread pool thread. The same thread pool that handles requests.
On an ASP.NET application, sending work to the thread pool steals threads that might be necessary to handle requests.
Given your requirements, I think you would be better queuing that work to another service/process using something like MSMQ.
Task.Run doesn't spawn a new thread - it borrows one from the thread pool (assuming the thread pool task scheduler - there's different schedulers, and you can write your own as well). When you use await inside of Task.Run, it will still work as usual - freeing the thread pool thread until a callback is posted.
However, exactly for that reason, there's little point in using Task.Run for I/O work. If you have asynchronous I/O to do, just do it - it will work exactly the same, without requiring a context switch. You must make it asynchronous though - if it's just blocking code, you're taking up valuable threads from the thread pool.
Note that you don't need for an asynchronous request to finish. If the asynchronous action you are performing doesn't need too much time to setup (that is, it returns the Task almost immediately, even though it isn't finished), you can just call it:
public async Task SomeAsync()
{
var request = new MyRequest();
await request.MakeRequestAsync();
...
}
public void Start()
{
var task = SomeAsync();
// Now the task is started, and we can use it for future reference. Or just wire up
// some error handling continuations etc. - though it's usually a better idea to do that
// within SomeAsync directly.
}
static void Main(string[] args)
{
// do async method for each stock in the list
Task<IEnumerable<string>> symbols = Helper.getStockSymbols("amex", 0);
List<Task> tasks = new List<Task>();
try
{
for (int i = 0; i < symbols.Result.Count(); i++)
{
if (i < symbols.Result.Count())
{
string symbol = symbols.Result.ElementAtOrDefault(i);
Task t = Task.Run(() => getCalculationsDataAsync(symbol, "amex"));
tasks.Add(t);
Task e = t.ContinueWith((d) => getThreadStatus(tasks));
}
}
// don't exit until they choose to
while (args.FirstOrDefault() != "exit")
{
// do nothing
}
}
catch (Exception ex)
{
Console.WriteLine(ex.Message);
}
}
public static void getThreadStatus(List<Task> taskList)
{
int count = 0;
foreach (Task t in taskList)
{
if (t.Status == TaskStatus.Running)
{
count += 1;
}
}
Console.WriteLine(count + " threads are running.");
}
public static async Task getCalculationsDataAsync(string symbol, string market)
{
// do calculation here
}
What I'm trying to do in my code is run a new task for each stock in my list and run them all at the same time. I have a 4 core processor and I believe that means I can only run 4 tasks at once. I tried testing how many tasks were running by inserting the continuewith method that you see in my code that will tell me how many tasks are running. When I run this code, it tells me that 0 tasks are running so my questions are:
How can I complete my objective by running these tasks at the same exact time?
Why is it telling me that 0 tasks are running? I can only assume this is because the current task is finished and it hasn't started a new one yet if the tasks are running one after the other.
I am not sure why you are seeing no task running. Are you sure your Task.Run() is being hit? That is to say is i < symbols.Result.Count() satisfied?
Regardless of the above, let us try and achieve what you want. Firstly, no, it is not correct to say that because your machine has four physical cores/'threads', that you can use a maximum of four Threads. These are not the same things. A Google on this subject will bring a plethora of information your way about this.
Basically starting a thread the way you have will start background threads on a thread pool, and this thread pool can hold/govern numerous threads see Threading in C# by J. Albahari for more information. Whenever you start a thread, a few hundred microseconds are spent organizing such things as a fresh private local variable stack. Each thread also consumes (by default) around 1 MB of memory. The thread pool cuts these overheads by sharing and recycling threads, allowing multithreading to be applied at a very granular level without a performance penalty. This is useful when leveraging multicore processors to execute computationally intensive code in parallel in “divide-and-conquer” style.
The thread pool also keeps a lid on the total number of worker threads it will run simultaneously. Too many active threads throttle the operating system with administrative burden and render CPU caches ineffective. Once a limit is reached, jobs queue up and start only when another finishes.
Okay, now for your code. Let us say we have a list of stock types
List<string> types = new List<string>() { "AMEX", "AMEC", "BP" };
To dispatch multiple threads to do for for each of these (not using async/await), you could do something like
foreach (string t in types)
{
Task.Factory.StartNew(() => DoSomeCalculationForType(t));
}
This will fire of three background thread pool threads and is non-blocking, so this code will return to caller almost immediately.
If you want to set up post processing you can do this via continuations and continuation chaining. This is all described in the Albahari link I provided above.
I hope this helps.
--
Edit. to address comments:
Beginning with the .NET Framework version 4, the default size of the thread pool for a process depends on several factors, such as the size of the virtual address space. A process can call the GetMaxThreads method to determine the number of threads.
However, there's something else at play: the thread pool doesn't immediately create new threads in all situations. In order to cope with bursts of small tasks, it limits how quickly it creates new threads. IIRC, it will create one thread every 0.5 seconds if there are outstanding tasks, up to the maximum number of threads. I can't immediately see that figure documented though, so it may well change. I strongly suspect that's what you're seeing though. Try queuing a lot of items and then monitor the number of threads over time.
Basically let the thread pool dispatch what it wants, its optimizer will do the best job for your circumstance.
I think the problem is that the tasks have a status of Running either very briefly or can skip that status all together and go straight from WaitingForActivation to RanToCompletion.
I modified your program slightly and i can see the tasks starting and completing but they don't appear to be in the Running state whenever I check.
static void Main(string[] args)
{
// do async method for each stock in the list
Task<IEnumerable<string>> symbols = Task.FromResult(Enumerable.Range(1, 5).Select (e => e.ToString()));
List<Task> tasks = new List<Task>();
try
{
for (int i = 0; i < symbols.Result.Count(); i++)
{
string symbol = symbols.Result.ElementAtOrDefault(i);
Task t = getCalculationsDataAsync(symbol, "amex", tasks);
tasks.Add(t);
}
Console.WriteLine("Tasks Count:"+ tasks.Count());
Task.WaitAll(tasks.ToArray());
}
catch (Exception ex)
{
Console.WriteLine(ex.Message);
}
}
public static void getThreadStatus(List<Task> taskList)
{
int count = 0;
foreach (Task t in taskList)
{
Console.WriteLine("Status " + t.Status);
if (t.Status == TaskStatus.Running)
{
count += 1;
Console.WriteLine("A task is running");
}
}
//Console.WriteLine(count + " threads are running.");
}
public static async Task getCalculationsDataAsync(string symbol, string market, List<Task> tasks)
{
Console.WriteLine("Starting task");
var delay = new Random((int)DateTime.Now.Ticks).Next(5000);
Console.WriteLine("Delay:" + delay);
await Task.Delay(delay);
Console.WriteLine("Finished task");
getThreadStatus(tasks);
}
Output
Starting task
Delay:1784
Starting task
Delay:2906
Starting task
Delay:2906
Starting task
Delay:2906
Starting task
Delay:2906
Tasks Count:5
Finished task
Status WaitingForActivation
Status WaitingForActivation
Status WaitingForActivation
Status WaitingForActivation
Status WaitingForActivation
Finished task
Finished task
Finished task
Status RanToCompletion
Status RanToCompletion
Status WaitingForActivation
Status WaitingForActivation
Status RanToCompletion
Status WaitingForActivation
Status WaitingForActivation
Status WaitingForActivation
Status WaitingForActivation
Status WaitingForActivation
Status WaitingForActivation
Status WaitingForActivation
Status WaitingForActivation
Status WaitingForActivation
Status WaitingForActivation
Finished task
Status RanToCompletion
Status RanToCompletion
Status RanToCompletion
Status RanToCompletion
Status WaitingForActivation
As I describe on my blog, there are two kinds of tasks: Delegate Tasks and Promise Tasks. Only Delegate Tasks ever actually run. Promise Tasks only complete. The task returned by Task.Run is a Promise Task, not a Delegate Task; this is necessary so that Task.Run can understand async code and only complete when the async code is completed.
So, checking for TaskStatus.Running is not going to work the way you want. Instead, you can create a counter:
private static int _count;
static void Main(string[] args)
{
...
for (int i = 0; i < symbols.Result.Count(); i++)
{
if (i < symbols.Result.Count())
{
string symbol = symbols.Result.ElementAtOrDefault(i);
Task t = Task.Run(() => getCalculationsDataWithCountAsync(symbol, "amex"));
tasks.Add(t);
}
}
...
}
public static async Task getCalculationsDataWithCountAsync(string symbol, string market)
{
Console.WriteLine(Interlocked.Increment(ref _count) + " threads are running.");
try
{
await getCalculationsDataAsync(symbol, market);
}
finally
{
Console.WriteLine(Interlocked.Decrement(ref _count) + " threads are running.");
}
}
Note that I used a separate async "wrapper" method instead of messing with ContinueWith. Code using await instead of ContinueWith more correctly handles a number of edge cases, and IMO is clearer to read.
Also, remember that async and await free up threads while they're "awaiting." So, you can potentially have hundreds of (asynchronous) tasks going simultaneously; this does not imply that there are that many threads.
I have a method called WaitForAction, which takes an Action delegate and executes it in a new Task. The method blocks until the task completes or until a timeout expires. It uses ManualResetEvent to wait for timeout/completion.
The following code shows an attempt to test the method in a multi-threaded environment.
class Program
{
public static void Main()
{
List<Foo> list = new List<Foo>();
for (int i = 0; i < 10; i++)
{
Foo foo = new Foo();
list.Add(foo);
foo.Bar();
}
SpinWait.SpinUntil(() => list.Count(f => f.finished || f.failed) == 10, 2000);
Debug.WriteLine(list.Count(f => f.finished));
}
}
public class Foo
{
public volatile bool finished = false;
public volatile bool failed = false;
public void Bar()
{
Task.Factory.StartNew(() =>
{
try
{
WaitForAction(1000, () => { });
finished = true;
}
catch
{
failed = true;
}
});
}
private void WaitForAction(int iMsToWait, Action action)
{
using (ManualResetEvent waitHandle = new ManualResetEvent(false))
{
Task.Factory.StartNew(() =>
{
action();
waitHandle.SafeSet();
});
if (waitHandle.SafeWaitOne(iMsToWait) == false)
{
throw new Exception("Timeout");
}
}
}
}
As the Action is doing nothing I would expect the 10 tasks started by calling Foo.Bar 10 times to complete well within the timeout. Sometimes this happens, but usually the program takes 2 seconds to execute and reports that only 2 instances of Foo 'finished' without error. In other words, 8 calls to WaitForAction have timed out.
I'm assuming that WaitForAction is thread safe, as each call on a Task-provided thread has its own stack. I have more or less proved this by logging the thread ID and wait handle ID for each call.
I realise that this code presented is a daft example, but I am interested in the principle. Is it possible for the task scheduler to be scheduling a task running the action delegate to the same threadpool thread that is already waiting for another action to complete? Or is there something else going on that I've missed?
Task.Factory utilizes the ThreadPool by default. With every call to WaitHandle.WaitOne, you block a worker thread. The .Net 4/4.5 thread pool starts with a small number of worker threads depending on your hardware platform (e.g., 4 on my machine) and it re-evaluates the pool size periodically (I believe it is every 1 second), creating new workers if necessary.
Since your program blocks all worker threads, and the thread pool doesn't grow fast enough, your waithandles timeout as you saw.
To confirm this, you can either 1) increase the timeouts or 2) increase the beginning thread pool size by adding the following line to the beginning of your program:
ThreadPool.SetMinThreads(32, 4);
then you should see the timeouts don't occur.
I believe your question was more academic than anything else, but you can read about a better implementation of a task timeout mechanism here, e.g.
var task = Task.Run(someAction);
if (task == await Task.WhenAny(task, Task.Delay(millisecondsTimeout)))
await task;
else
throw new TimeoutException();