I'm talking about a single-threaded (not TaskEx for Windows Phone), synchronous (no async/await), pure Task. (OK, even a basic Task is designed to be async, which makes the question somewhat senseless.)
Can it be useful in some cases (I have a fairly common app which pulls data from the server, deserializes it and shows the results), or is Task just a basis for await TaskEx.Run()?
EDIT1: i mean, how this
void Foo()
{
DoSmth();
}
void Main()
{
int a = 1;
Foo();
int b = 1;
}
would differ from
void Main()
{
int a = 1;
Task.Run( () => DoSmth() );
int b = 1;
}
Calling Foo(); is also kinda a "promise that next code would be called after Foo() is done".
EDIT2: I just ran this in a WP7 app:
Debug.WriteLine("OnLoaded {0} ", Thread.CurrentThread.ManagedThreadId);
Task.Factory.StartNew(() =>
{
Thread.Sleep(5000);
Debug.WriteLine("Run Id: {0}", Thread.CurrentThread.ManagedThreadId);
});
Debug.WriteLine("Done");
Got the output:
OnLoaded 1
Done
Run Id: 4
So, is Task.Factory.StartNew() the same as TaskEx.Run() ?
EDIT3: so, here is a short summary (as Task.Factory.StartNew() is the same as TaskEx.Run()):
Thread.Sleep(5000); // UI is frozen for 5 seconds
int a = 1; // this is called after 5 seconds
TaskEx.Run(() =>
{
Thread.Sleep(5000);
int a = 1; // this is called after 5 seconds
});
int b = 2; // UI is not frozen, this is called instantly
await TaskEx.Run(() => // UI is not frozen, but...
{
Thread.Sleep(5000);
int a = 1; // this is called after 5 seconds
});
int b = 2; // this is called when the task is done
A Task is just a way to represent something that will complete in the future. This is most commonly an asynchronous operation or something running in a background thread (via Task.Run/TaskEx.Run).
A "synchronous pure Task" really doesn't make sense - the entire purpose of a Task is to represent something that is not synchronous.
Can it be useful in some cases (I have a fairly common app which pulls data from the server, deserializes it and shows the results),
In this case, since the data is pulled from a server, that is by its nature a good candidate for an asynchronous operation. This would make it a perfect candidate for Task (or Task<T>).
In response to your edit:
In the first version, everything is just run sequentially.
The second version, using Task.Run, actually causes DoSmth() to execute in a background thread. The Task returned can be used with await to asynchronously wait for it to complete, if you want to do so. This means that DoSmth() will potentially run at the same time as the assignment to b (and subsequent operations).
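To make the difference concrete, here is a minimal sketch. The class and method names are made up for illustration, and DoSmth is only a stand-in for the method in the question. It reports which thread ran the work in each version and shows that the caller is only blocked when it asks for the result.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SequentialVsTaskRun
{
    // Hypothetical stand-in for the question's DoSmth():
    // returns the id of the thread it ran on.
    static int DoSmth()
    {
        Thread.Sleep(200); // simulate some work
        return Thread.CurrentThread.ManagedThreadId;
    }

    // Version 1: a plain call runs on the caller's thread and blocks it.
    public static int RunSequentially()
    {
        return DoSmth();
    }

    // Version 2: Task.Run queues DoSmth to the thread pool and returns
    // immediately with a Task that represents the pending result.
    public static Task<int> RunInBackground()
    {
        return Task.Run(() => DoSmth());
    }

    public static void Main()
    {
        Console.WriteLine("Caller thread: " + Thread.CurrentThread.ManagedThreadId);
        Console.WriteLine("Sequential call ran on: " + RunSequentially());

        Task<int> task = RunInBackground();
        Console.WriteLine("Caller keeps going while the task runs");
        // Reading Result blocks the caller until the task has finished.
        Console.WriteLine("Task.Run work ran on: " + task.Result);
    }
}
```

When started from the main thread, the second id is typically that of a thread pool thread, but the exact ids vary from run to run.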
I am trying to understand some code (for performance reasons) that is processing tasks from a queue. The code is C# .NET Framework 4.8 (And I didn't write this stuff)
I have this code creating a timer that from what I can tell should use a new thread every 10 seconds
_myTimer = new Timer(new TimerCallback(OnTimerGo), null, 0, 10000 );
Inside OnTimerGo it calls DoTask(). Inside DoTask() it grabs a task off a queue and then does this:
System.Threading.Tasks.Task.Factory.StartNew(ProcessTask, task).ContinueWith(c => DoTask());
My reading of this is that a new thread should start running OnTimerGo every 10 seconds, and that thread should in parallel run ProcessTask on tasks as fast as it can get them from the queue.
I inserted some code to call ThreadPool.GetMaxThreads and ThreadPool.GetAvailableThreads to figure out how many threads were in use. Then I queued up 10,000 things for it to do and let it loose.
I never see more than 4 threads in use at a time. This is running on a c4.4xlarge EC2 instance, so 16 vCPUs and 30 GB of memory. GetMaxThreads and GetAvailableThreads both return over 2k, so I would expect more threads. By looking at the logging I can see that a total of 50ish different threads (by thread id) end up doing the work over the course of 20 minutes. Since the timer is set to every 10 seconds, I would expect 100 threads to be doing the work (or for it to finish sooner).
Looking at the code, the only time a running thread should stop is if it asks for a task from the queue and doesn't get one. Some other logging shows that there are never more than 2 tasks running in a thread. This is probably because the work is pretty fast. So the threads shouldn't be exiting, and I can even see from the logs that many of them end up doing as many as 500 tasks over the 20 minutes.
So what am I missing here? Are ThreadPool.GetMaxThreads and ThreadPool.GetAvailableThreads not accurate if run from inside a thread? Is something shutting down some of the threads while letting others keep going?
EDIT: adding more code
public static void StartScheduler()
{
lock (TimerLock)
{
if (_timerShutdown == false)
{
_myTimer = new Timer(new TimerCallback(OnTimerGo), null, 0, 10 );
const int numberOfSecondsPerMinute = 60;
const int margin = 1;
var pollEventsPerMinute = (numberOfSecondsPerMinute/SystemPreferences.TaskPollingIntervalSeconds);
_numberOfTimerCallsForHeartbeat = pollEventsPerMinute - margin;
}
}
}
private static void OnTimerGo(object state)
{
try
{
_lastTimer = DateTime.UtcNow;
var currentTickCount = Interlocked.Increment(ref _timerCallCount);
if (currentTickCount == _numberOfTimerCallsForHeartbeat)
{
Interlocked.Exchange(ref _timerCallCount, 0);
MonitoringTools.SendHeartbeatMetric(Heartbeat);
}
CheckForTasks();
}
catch (Exception e)
{
Log.Warn("Scheduler: OnTimerGo exception", e);
}
}
public static void CheckForTasks()
{
try
{
if (DoTask())
_lastStart = DateTime.UtcNow;
_lastStartOrCheck = DateTime.UtcNow;
}
catch (Exception e)
{
Log.Error("Unexpected exception checking for tasks", e);
}
}
private static bool DoTask()
{
Func<DataContext, bool> a = db =>
{
var mtid = Thread.CurrentThread.ManagedThreadId;
int totalThreads = Process.GetCurrentProcess().Threads.Count;
int maxWorkerThreads;
int maxPortThreads;
ThreadPool.GetMaxThreads(out maxWorkerThreads, out maxPortThreads);
int AvailableWorkerThreads;
int AvailablePortThreads;
ThreadPool.GetAvailableThreads(out AvailableWorkerThreads, out AvailablePortThreads);
int usedWorkerThreads = maxWorkerThreads - AvailableWorkerThreads;
string usedThreadMessage = $"Thread {mtid}: Threads in Use count: {usedWorkerThreads}";
Log.Info(usedThreadMessage);
var taskTypeAndTasks = GetTaskListTypeAndTasks();
var task = GetNextTask(db, taskTypeAndTasks.Key, taskTypeAndTasks.Value);
if (_timerShutdown)
{
Log.Debug("Task processing stopped.");
return false;
}
if (task == null)
{
Log.DebugFormat("DoTask: Idle in thread {0} ({1} tasks running)", mtid, _processingTaskLock);
return false;
}
Log.DebugFormat("DoTask: starting task {2}:{0} on thread {1}", task.Id, mtid, task.Class);
System.Threading.Tasks.Task.Factory.StartNew(ProcessTask, task).ContinueWith(c => DoTask());
Log.DebugFormat("DoTask: done ({0})", mtid);
return true;
};
return DbExtensions.WithDbWrite(ctx => a(ctx));
}
Task.Factory.StartNew does not create a new thread by default. It borrows a thread from the ThreadPool instead.
The ThreadPool is intended as a small pool of reusable threads, to help amortize the cost of running frequent and lightweight operations like callbacks, continuations, event handlers etc. Depleting the ThreadPool of available workers by scheduling too much work on it results in a situation that is called saturation or starvation. And as you've already figured out, it's not a happy situation to be in.
You can prevent the saturation of the ThreadPool by running your long-running work on dedicated threads instead of ThreadPool threads. This can be done by passing TaskCreationOptions.LongRunning as an argument to Task.Factory.StartNew:
_ = Task.Factory.StartNew(ProcessTask, task, CancellationToken.None,
TaskCreationOptions.LongRunning,
TaskScheduler.Default).ContinueWith(t => DoTask(), CancellationToken.None,
TaskContinuationOptions.ExecuteSynchronously,
TaskScheduler.Default);
The above code schedules the ProcessTask(task) on a new thread, and after the invocation is completed either successfully or unsuccessfully, the DoTask will be invoked on the same thread. Finally the thread will be terminated. The discard _ signifies that the continuation Task (the task returned by the ContinueWith) is fire-and-forget. Which, to put it mildly, is architecturally suspicious. 😃
In case you are wondering why I pass the TaskScheduler.Default explicitly as argument to StartNew and ContinueWith, check out this link.
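One way to check where the work actually runs is a small sketch like the following. The class and method names are hypothetical. It relies on the behaviour of the default scheduler in current .NET implementations, where LongRunning tasks get a dedicated thread; the documentation describes the option only as a hint, so treat the result as an observation rather than a guarantee.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LongRunningDemo
{
    // Starts a task with the given options on the default scheduler and
    // reports whether its body executed on a thread pool thread.
    public static bool RanOnPoolThread(TaskCreationOptions options)
    {
        Task<bool> task = Task.Factory.StartNew(
            () => Thread.CurrentThread.IsThreadPoolThread,
            CancellationToken.None,
            options,
            TaskScheduler.Default);
        return task.Result; // blocks until the task has run
    }

    public static void Main()
    {
        Console.WriteLine("None:        pool thread = "
            + RanOnPoolThread(TaskCreationOptions.None));
        Console.WriteLine("LongRunning: pool thread = "
            + RanOnPoolThread(TaskCreationOptions.LongRunning));
    }
}
```

If the second line reports false, the long-running work is not consuming a pool worker, which is the point of the suggestion above.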
My reading of this is that a new thread should start running OnTimerGo every 10 seconds, and that thread should in parallel run ProcessTask on tasks as fast as it can get them from the queue.
Well, that is definitely not what's happening. There is a lot of uncertainty about your code, but it's clear that another DoTask starts only AFTER ProcessTask completes, and that is not parallel execution. Your line of code is this:
System.Threading.Tasks.Task.Factory.StartNew(ProcessTask, task).ContinueWith(c => DoTask());
I suggest you start another DoTask right there, like this:
System.Threading.Tasks.Task.Factory.StartNew(ProcessTask, task);
DoTask();
Make sure your code is ready for parallel execution, though.
This is the code that I wrote to better understand asynchronous methods. I knew that an asynchronous method is not the same as multithreading, but it does not seem so in this particular scenario:
class Program
{
static void Main(string[] args)
{
Thread.CurrentThread.CurrentCulture = new System.Globalization.CultureInfo("en-US");
//the line above just makes sure that the console output uses . to represent doubles instead of ,
ExecuteAsync();
Console.ReadLine();
}
private static async Task ParallelAsyncMethod() //this is the method where async parallel execution is taking place
{
List<Task<string>> tasks = new List<Task<string>>();
for (int i = 0; i < 5; i++)
{
tasks.Add(Task.Run(() => DownloadWebsite()));
}
var strings = await Task.WhenAll(tasks);
foreach (var str in strings)
{
Console.WriteLine(str);
}
}
private static string DownloadWebsite() //Imitating a website download
{
Thread.Sleep(1500); //making the thread sleep for 1500 miliseconds before returning
return "Download finished";
}
private static async void ExecuteAsync()
{
var watch = Stopwatch.StartNew();
await ParallelAsyncMethod();
watch.Stop();
Console.WriteLine($"It took the machine {watch.ElapsedMilliseconds} milliseconds" +
$" or {Convert.ToDouble(watch.ElapsedMilliseconds) / 1000} seconds to complete this task");
Console.ReadLine();
}
}
//OUTPUT:
/*
Download finished
Download finished
Download finished
Download finished
Download finished
It took the machine 1537 milliseconds or 1.537 seconds to complete this task
*/
As you can see, the DownloadWebsite method waits for 1.5 seconds and then returns "Download finished". The method called ParallelAsyncMethod adds five of these calls to the "tasks" list and then starts the parallel asynchronous execution. I also tracked the amount of time that it takes for the ExecuteAsync method to be executed. The result is always somewhere around 1540 milliseconds. Here is my question: if the DownloadWebsite method required a thread to sleep 5 times for 1500 milliseconds, does it mean that the parallel execution of these methods required 5 different threads? If not, then how come it only took the program 1540 milliseconds to be executed and not ~7500 ms?
I knew that an asynchronous method is not the same as multi-threading
That is correct: an asynchronous method releases the current thread whilst I/O occurs, and schedules a continuation after its completion.
Async and threads are completely unrelated concepts.
but it does not seem so in this particular scenario
That is because you explicitly run DownloadWebsite on the ThreadPool using Task.Run, which imitates asynchronous code by returning a Task after instructing the provided delegate to run.
Because you are not waiting for each Task to complete before starting the next, multiple threads can be used simultaneously.
Currently each thread is being blocked, as you have used Thread.Sleep in the implementation of DownloadWebsite, meaning you are actually running 5 synchronous methods on the ThreadPool.
In production code your DownloadWebsite method should be written asynchronously, maybe using HttpClient.GetAsync:
private static async Task<string> DownloadWebsiteAsync()
{
//...
await httpClient.GetAsync(/* ... */);
//...
}
In that case, GetAsync returns a Task, and releases the current thread whilst waiting for the HTTP response.
You can still run multiple async methods concurrently, but as the thread is released each time, this may well use less than 5 separate threads and may even use a single thread.
Ensure that you don't use Task.Run with an asynchronous method; this simply adds unnecessary overhead:
var tasks = new List<Task<string>>();
for (int i = 0; i < 5; i++)
{
tasks.Add(DownloadWebsiteAsync()); // No need for Task.Run
}
var strings = await Task.WhenAll(tasks);
As an aside, if you want to imitate an async operation, use Task.Delay instead of Thread.Sleep as the former is non-blocking:
private static async Task<string> DownloadWebsite() //Imitating a website download
{
await Task.Delay(1500); // Release the thread for ~1500ms before continuing
return "Download finished";
}
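Putting the pieces above together, a self-contained sketch might look like this. The class name and the RunAsync helper are made up for illustration, and Task.Delay again only imitates a real download. Five simulated downloads are started without Task.Run and awaited together, and the elapsed time is measured.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class ConcurrentDelays
{
    // Imitates an asynchronous download without blocking a thread.
    static async Task<string> DownloadWebsiteAsync()
    {
        await Task.Delay(1500);
        return "Download finished";
    }

    // Starts `count` simulated downloads at once and returns the elapsed
    // time in milliseconds once all of them have completed.
    public static async Task<long> RunAsync(int count)
    {
        var watch = Stopwatch.StartNew();
        var tasks = Enumerable.Range(0, count)
                              .Select(_ => DownloadWebsiteAsync())
                              .ToList(); // ToList starts every download now
        string[] results = await Task.WhenAll(tasks);
        watch.Stop();

        foreach (string result in results)
        {
            Console.WriteLine(result);
        }
        return watch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        long elapsed = RunAsync(5).GetAwaiter().GetResult();
        Console.WriteLine("Elapsed: " + elapsed + " ms");
    }
}
```

The total should stay close to a single delay rather than five, even though no thread is held for the duration of the waits.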
How can I check if a thread returned to the thread pool, using VS C# 2015 debugger?
What's problematic in my case is the fact that it cannot be detected by debugging line by line.
async Task foo()
{
int y = 0;
await Task.Delay(5);
// (1) thread 2000 returns to thread pool here...
while (y<5) y++;
}
async Task testAsync()
{
Task task = foo();
// (2) ... and here thread 2000 is back from the thread pool, to run the code below. I want
// to confirm that it was in the thread pool in the meantime, using debugger.
int i = 0;
while (i < 100)
{
Console.WriteLine("Async 1 before: " + i++);
}
await task;
}
In the first line of testAsync running on thread 2000, foo is called. Once it encounters await Task.Delay(5), thread 2000 returns to thread pool (allegedly, I'm trying to confirm this), and the method waits for Task.Delay(5) to complete. In the meantime, the control returns to the caller and the first loop of testAsync is executed on thread 2000 as well.
So between two consecutive lines of code, the thread returned to thread pool and came back from there. How can I confirm this with debugger? Possibly with Threads debugger window?
To clarify a bit more what I'm asking: foo is running on thread 2000. There are two possible scenarios:
When it hits await Task.Delay(5), thread 2000 returns to the thread pool for a very short time, and the control returns to the caller, at line (2), which will execute on thread 2000 taken from the thread pool. If this is true, you can't detect it easily, because Thread 2000 was in the thread pool during time between two consecutive lines of code.
When it hits await Task.Delay(5), thread 2000 doesn't return to thread pool, but immediately executes code in testAsync starting from line (2)
I'd like to verify which one is really happening.
There is a major mistake in your assumption:
When it hits await Task.Delay(5), thread 2000 returns to the thread pool
Since you don't await foo() yet, when thread 2000 hits Task.Delay(5) it just creates a new Task and returns to testAsync() (to int i = 0;). It moves on to the while block, and only then you await task. At this point, if task is not completed yet, and assuming the rest of the code is awaited, thread 2000 will return to the thread pool. Otherwise, if task is already completed, it will synchronously continue from foo() (at while (y<5) y++;).
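The order of events described above can be observed with a small sketch. The names and log messages are made up for illustration. Each step is appended to a shared list, which makes it visible that the caller continues before the code after the await inside the async method runs.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AwaitOrderDemo
{
    // Appends under a lock, since the continuation may run on another thread.
    static void Log(List<string> log, string message)
    {
        lock (log) { log.Add(message); }
    }

    static async Task FooAsync(List<string> log)
    {
        Log(log, "foo: before await");
        await Task.Delay(200);          // not complete yet: control returns to the caller
        Log(log, "foo: after await");
    }

    public static async Task<List<string>> RunAsync()
    {
        var log = new List<string>();
        Task task = FooAsync(log);      // runs synchronously up to the first await
        Log(log, "caller: continues");  // executed while the delay is still pending
        await task;
        Log(log, "caller: after await");
        return log;
    }

    public static void Main()
    {
        foreach (string line in RunAsync().GetAwaiter().GetResult())
        {
            Console.WriteLine(line);
        }
    }
}
```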
EDIT:
what if the main method called testAsync?
When a synchronous method calls an async method and waits for it, it must block the thread if the async method returns an uncompleted Task:
void Main()
{
var task = foo();
task.Wait(); //Will block the thread if foo() is not completed.
}
Note that in the above case the thread is not returning to the thread pool - it is completely suspended by the OS.
Maybe you can give an example of how to call testAsync so that thread 2000 returns to the thread pool?
Assuming thread 2k is the main thread, it cannot return to the thread pool. But you can use Task.Run(()=> foo()) to run foo() on the thread pool, and since the calling thread is the main thread, another thread pool thread will pick up that Task. So the following code:
static void Main(string[] args)
{
Console.WriteLine("main started on thread {0}", Thread.CurrentThread.ManagedThreadId);
var testAsyncTask = Task.Run(() => testAsync());
testAsyncTask.Wait();
}
static async Task testAsync()
{
Console.WriteLine("testAsync started on thread {0}", Thread.CurrentThread.ManagedThreadId);
await Task.Delay(1000);
Console.WriteLine("testAsync continued on thread {0}", Thread.CurrentThread.ManagedThreadId);
}
Produced (on my PC) the following output:
main started on thread 1
testAsync started on thread 3
testAsync continued on thread 4
Press any key to continue . . .
Threads 3 and 4 came from and returned to the thread pool.
You can print out the Thread.CurrentThread.ManagedThreadId to the console. Note that the thread-pool is free to re-use that same thread to run continuations on it, so there's no guarantee that it'll be different:
void Main()
{
TestAsync().Wait();
}
public async Task FooAsync()
{
int y = 0;
await Task.Delay(5);
Console.WriteLine($"After awaiting in FooAsync: {Thread.CurrentThread.ManagedThreadId}");
while (y < 5) y++;
}
public async Task TestAsync()
{
Console.WriteLine($"Before awaiting in TestAsync: {Thread.CurrentThread.ManagedThreadId}");
Task task = FooAsync();
int i = 0;
while (i < 100)
{
var x = i++;
}
await task;
Console.WriteLine($"After awaiting in TestAsync: {Thread.CurrentThread.ManagedThreadId}");
}
Another thing you can check is ThreadPool.GetAvailableThreads to determine if another worker has been handed out for use:
async Task FooAsync()
{
int y = 0;
await Task.Delay(5);
Console.WriteLine("Thread-Pool threads after first await:");
int avaliableWorkers;
int avaliableIo;
ThreadPool.GetAvailableThreads(out avaliableWorkers, out avaliableIo);
Console.WriteLine($"Available Workers: {avaliableWorkers}, Available IO: {avaliableIo}");
while (y < 1000000000) y++;
}
async Task TestAsync()
{
int avaliableWorkers;
int avaliableIo;
ThreadPool.GetAvailableThreads(out avaliableWorkers, out avaliableIo);
Console.WriteLine("Thread-Pool threads before first await:");
Console.WriteLine($"Available Workers: {avaliableWorkers}, Available IO: {avaliableIo}");
Console.WriteLine("-------------------------------------------------------------");
Task task = FooAsync();
int i = 0;
while (i < 100)
{
var x = i++;
}
await task;
}
On my machine, this yields:
Thread-Pool threads before first await:
Available Workers: 1023, Available IO: 1000
----------------------------------------------
Thread-Pool threads after first await:
Available Workers: 1022, Available IO: 1000
I'd like to verify which one is really happening.
There is no way to "verify" that with debugger, because the debugger is made to simulate the logical (synchronous) flow - see Walkthrough: Using the Debugger with Async Methods.
In order to understand what is happening (FYI it's your case (2)), you need to learn how await works starting from Asynchronous Programming with Async and Await - What Happens in an Async Method section, Control Flow in Async Programs and many other sources.
Look at this snippet:
static void Main(string[] args)
{
Task.Run(() =>
{
// Initial thread pool thread
var t = testAsync();
t.Wait();
});
Console.ReadLine();
}
If we make the lambda async and use await t; instead of t.Wait();, this is the point where the initial thread will be returned to the thread pool. As I mentioned above, you cannot verify that with the debugger. But look at the above code and think logically: we are blocking the initial thread, so if it wasn't free, your testAsync and foo methods would not be able to resume. But they do, and this can easily be verified by putting a breakpoint after the await lines.
I'm trying to write my own scheduler; the rationale behind it is that all the submitted actions will be executed in order, according to a delay. For example, if at time 0 I schedule action A with delay 5 and at time 1 I schedule action B with delay 2, then B should be executed first at time 3 and A should be executed second, at time 5.
Basically, what I am trying to do is something like:
public class MyScheduler
{
Task _task = new Task(() => { });
public MyScheduler()
{
_task.Start();
}
public void Schedule(Action action, long delay)
{
Task.Delay(TimeSpan.FromTicks(delay)).ContinueWith(_ =>
{
lock (_task)
{
_task = _task.ContinueWith(task => action());
}
});
}
}
A relevant test for this code would be:
var waiter = new Waiter(3);
int _count = 0;
mysched = new MyScheduler();
mysched.Schedule(() => { _count++; waiter.Signal(); });
mysched.Schedule(() => { Task.Delay(100).Wait(); _count *= 3; waiter.Signal(); });
mysched.Schedule(() => { _count++; waiter.Signal(); });
waiter.Await();
Assert.AreEqual(4, _count);
In the above code, Waiter is a class with an internal variable initialized in the constructor; the Signal method decrements that internal variable and the Await method loops (and sleeps 10 ms on each iteration) until the internal variable is less than or equal to zero.
The aim of the test is to show that the scheduled actions have been performed in order.
Most of the time this is true and the test passes, but on a few occasions the resulting value for _count is 2 instead of 4. I have spent a lot of time trying to figure out why this happens, but I can't seem to work it out, and my lack of experience with C# is not helping either.
Does anyone have any suggestions?
For one thing, _count is not synchronized for access from different threads.
I recommend that you not use ContinueWith at all; it is a very low-level method and is very easy to get the details wrong (for example, the default scheduler is TaskScheduler.Current, which is almost never what you want). Your general logic code should use await instead of ContinueWith.
Regarding the scheduler, these days it is almost impossible to make a good use case for developing your own. There are better ones available that are developed by geniuses and extremely well-tested. Consider Reactive Extensions: they provide several schedulers, and they all support scheduling.
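To illustrate the suggestion of using await instead of ContinueWith, here is a rough sketch of a delay-based scheduler. It is not a replacement for a tested library: AwaitScheduler, ScheduleAsync and the demo class are hypothetical names, and the SemaphoreSlim only guarantees that actions never overlap. Actions that become due at nearly the same moment may still run in either order.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public class AwaitScheduler
{
    // Lets one action run at a time, so scheduled actions never overlap.
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);

    public async Task ScheduleAsync(Action action, TimeSpan delay)
    {
        await Task.Delay(delay).ConfigureAwait(false);
        await _gate.WaitAsync().ConfigureAwait(false);
        try
        {
            action();
        }
        finally
        {
            _gate.Release();
        }
    }
}

public static class AwaitSchedulerDemo
{
    public static async Task<List<string>> RunAsync()
    {
        var scheduler = new AwaitScheduler();
        var order = new List<string>(); // only modified inside the gate

        // A is scheduled first but with the longer delay, so B should run first.
        Task a = scheduler.ScheduleAsync(() => order.Add("A"), TimeSpan.FromMilliseconds(500));
        Task b = scheduler.ScheduleAsync(() => order.Add("B"), TimeSpan.FromMilliseconds(100));

        await Task.WhenAll(a, b); // also surfaces any exception thrown by an action
        return order;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", RunAsync().GetAwaiter().GetResult()));
    }
}
```

Because the returned tasks are awaited, a test no longer needs a polling helper like Waiter, and shared state touched only inside the actions is protected by the gate.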
I have a method called WaitForAction, which takes an Action delegate and executes it in a new Task. The method blocks until the task completes or until a timeout expires. It uses ManualResetEvent to wait for timeout/completion.
The following code shows an attempt to test the method in a multi-threaded environment.
class Program
{
public static void Main()
{
List<Foo> list = new List<Foo>();
for (int i = 0; i < 10; i++)
{
Foo foo = new Foo();
list.Add(foo);
foo.Bar();
}
SpinWait.SpinUntil(() => list.Count(f => f.finished || f.failed) == 10, 2000);
Debug.WriteLine(list.Count(f => f.finished));
}
}
public class Foo
{
public volatile bool finished = false;
public volatile bool failed = false;
public void Bar()
{
Task.Factory.StartNew(() =>
{
try
{
WaitForAction(1000, () => { });
finished = true;
}
catch
{
failed = true;
}
});
}
private void WaitForAction(int iMsToWait, Action action)
{
using (ManualResetEvent waitHandle = new ManualResetEvent(false))
{
Task.Factory.StartNew(() =>
{
action();
waitHandle.SafeSet();
});
if (waitHandle.SafeWaitOne(iMsToWait) == false)
{
throw new Exception("Timeout");
}
}
}
}
As the Action is doing nothing I would expect the 10 tasks started by calling Foo.Bar 10 times to complete well within the timeout. Sometimes this happens, but usually the program takes 2 seconds to execute and reports that only 2 instances of Foo 'finished' without error. In other words, 8 calls to WaitForAction have timed out.
I'm assuming that WaitForAction is thread safe, as each call on a Task-provided thread has its own stack. I have more or less proved this by logging the thread ID and wait handle ID for each call.
I realise that this code presented is a daft example, but I am interested in the principle. Is it possible for the task scheduler to be scheduling a task running the action delegate to the same threadpool thread that is already waiting for another action to complete? Or is there something else going on that I've missed?
Task.Factory utilizes the ThreadPool by default. With every call to WaitHandle.WaitOne, you block a worker thread. The .Net 4/4.5 thread pool starts with a small number of worker threads depending on your hardware platform (e.g., 4 on my machine) and it re-evaluates the pool size periodically (I believe it is every 1 second), creating new workers if necessary.
Since your program blocks all worker threads, and the thread pool doesn't grow fast enough, your wait handles time out as you saw.
To confirm this, you can either 1) increase the timeouts or 2) increase the beginning thread pool size by adding the following line to the beginning of your program:
ThreadPool.SetMinThreads(32, 4);
Then you should see that the timeouts no longer occur.
I believe your question was more academic than anything else, but you can read about a better implementation of a task timeout mechanism here, e.g.
var task = Task.Run(someAction);
if (task == await Task.WhenAny(task, Task.Delay(millisecondsTimeout)))
await task;
else
throw new TimeoutException();
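Wrapped into a reusable method, the fragment above might look like the following sketch. TimeoutHelper and RunWithTimeoutAsync are made-up names. Note that on timeout the action is not cancelled; it keeps running in the background, and only the waiting stops.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TimeoutHelper
{
    // Runs the action on the thread pool and waits for it asynchronously,
    // throwing TimeoutException if it has not finished in time.
    public static async Task RunWithTimeoutAsync(Action action, int millisecondsTimeout)
    {
        Task task = Task.Run(action);
        Task finished = await Task.WhenAny(task, Task.Delay(millisecondsTimeout));
        if (finished != task)
        {
            throw new TimeoutException();
        }
        await task; // rethrows any exception thrown by the action
    }

    public static void Main()
    {
        RunWithTimeoutAsync(() => { }, 1000).GetAwaiter().GetResult();
        Console.WriteLine("Fast action completed");

        try
        {
            RunWithTimeoutAsync(() => Thread.Sleep(1000), 50).GetAwaiter().GetResult();
        }
        catch (TimeoutException)
        {
            Console.WriteLine("Slow action timed out");
        }
    }
}
```

Unlike the ManualResetEvent version, the waiting side does not block a pool worker, so ten concurrent callers do not starve the pool of the threads their own actions need.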