If I have the following block of code in a method (using .NET 4 and the Task Parallel Library):
var task = new Task(() => DoSomethingLongRunning());
task.Start();
and the method returns, will that task go out of scope and be garbage collected, or will it run to completion? I haven't noticed any issues with GCing, but want to make sure I'm not setting myself up for a race condition with the GC.
Update:
After I answered this question (a long time ago!) I found out that it's not true that Tasks will always run to completion - there's a small, let's say "corner" case, where tasks may not finish.
The reason for that is this: As I have answered previously, Tasks are essentially threads; but they are background threads. Background threads are automatically aborted when all foreground threads finish. So, if you don't do anything with the task and the program ends, there's a chance the task won't complete.
You should always await on tasks. More information can be found on the excellent answer Jon gave me.
Original:
Task are scheduled to the ThreadPool, meaning that they are essentially threads¹ (actually, they encapsulate threads).
From the Thread documentation:
It is not necessary to retain a
reference to a Thread object once you
have started the thread. The thread
continues to execute until the thread
procedure is complete.
So, no, there is no need to retain a reference to it.
Also, the documentation states that the preferred way to create a Task is to use it's factory:
You can also use the StartNew method
to create and start a task in one
operation. This is the preferred way
to create and start tasks if creation
and scheduling do not have to be
separated (...)
Hope it helps.
¹ Accordingly to the documentation:
A task represents an asynchronous
operation, and in some ways it
resembles the creation of a new thread
or ThreadPool work item, but at a
higher level of abstraction.
The task will run to completion. Even if there aren't any other references to it (not being rooted I believe is the term), the thread pool will still hold a reference to it, and prevent it from being Garbage Collected at least (I say at least, because even after it completes, there is no guarantee that it will be Garbage Collected) until completion.
Related
So I am developing a UWP application that has a large number of threads. Previously I would start all the threads with System.Threading.Tasks.Task.Run(), save their thread handles to an array, then Task.WaitAll() for completion and get the results. This currently is taking too much memory though.
I have changed my code to only wait for a smaller amount of threads and copy their results out before continuing on to more of the threads. Since UWP the UWP implementation of Task does not implement IDisposable, what is the proper way to signal the framework that I am done with a task so it can be garbage collected? I would like to read out the results of the treads after a certain number of them come in and dispose of the threads resources to make space for the next threads.
Thanks so much!
Just to point out an issue which might be degrading the performance of your application: You are deliberately blocking the thread until all Tasks complete rather than actually await for them. That would make sense, if you are not performing Asynchronous work inside them, but if you are, you should definitely switch to:
Task.WhenAll rather than Task.WaitAll , such as this:
List<Tasks> tasks = new List<Tasks> { Method1(), Method2(), ... };
Task result = await Task.WhenAll(tasks);
This way, you are actually leveraging the asynchrony of your app, and you will not block the current thread until all the tasks are completed, like Task.WaitAll() does.
Since you are utilizing the Task.Run() method, instead of the Task.Factory.StartNew(), the TaskScheduler used is the default, and utilizes Threads from the Thread Pool. So you will not actually end up blocking the UI thread, but blocking many Thread Pool threads, is also not good.
Taking from Microsoft documentation, for one of the cases where Thread Pools should not be used:
You have tasks that cause the thread to block for long periods of
time. The thread pool has a maximum number of threads, so a large
number of blocked thread pool threads might prevent tasks from
starting.
Edit:
I do not need anything else but I will look in to that! Thanks! So is
there any way I can get it to run the Tasks like a FIFO with just the
API's available with the default thread pool?
You should take a look, into Continuations
A continuation is nothing else other than a task which is activated whenever it's antecedent task/tasks have completed. If you have specific tasks which you only want to execute after another task has completed you should take a look into Continuations, since they are extremely flexible, and you can actually create really complex flow of Tasks to better suit your needs.
Garbage collection on a .Net application always works the same, when a variable is not needed anymore (out of scope) it is collected.
Why do you think the threads are consuming the memory? It is much likely than the process inside the threads is the one consuming the memory.
It is considered that tasks are usually better choice than threads as they avoid wasting OS threads and give more programmatic control, but I wonder is there actually a use case where tasks performs worse than threads (so threads should be used instead)?
A task, when is about to be executed, will be executed in a thread context. A thread from the .NET thread pool will be used for the execution of a task. That being said there isn't any comparison between them.
Specifically, a task will be assigned to a thread of the thread pool to be executed, provided that there is a free thread. If there isn't any available thread, then the task will be placed in a queue waiting for one of the used threads to be free and it will be assigned to this thread (if the task is the first in the queue...). If the task will wait a long time in the queue (there is a specific time interval for this, but I don't remember it at this moment), then a new thread will be created, in order to service this task.
When I create a task as
Task task = Task.Factory.StartNew(() => someMethod(args));
in C# 4.0+, how can I get the reference of the thread(s) of this task?
Is it possible that the task is executed in the same thread that created the task or spawn more than one thread?
Update:
The reasons are:
I'd like to identify the task's thread in debugger (and attribute a name for it), etc.
Is created task executed always in separate thread from the one in which a task was created?
Is it one, zero or more than one thread?
Is it executed on a single and the same core?
It is important to know since, for example, I can put to sleep the main thread thinking that I am freezing the background worker
Update:
Useful answer:
Specifying a Thread's Name when using Task.StartNew
Is created task executed always in separate thread from the one in which a task was created?
No, there are certain situations in which the TPL is able to determine that the task can be executed on the same thread that created it, either because the relevant task creation option (or task scheduler) was supplied, or as an optimization because the calling thread would otherwise not have anything to do. You don't really need to worry about this though; it's not like you're going to end up blocking the UI thread because the TPL choose to execute it's code in that context. That won't happen unless you specifically indicate that it should. For all intents and purposes you can assume that this never happens (unless you force it to happen) but behind the scenes, without you ever needing to realize it, yes, it can happen.
Is it one, zero or more than one thread?
By default, tasks are executed in the thread pool. The thread pool will vary in the number of threads it contains based on the workload it's given. It will start out at one, but grow if there is sufficient need, and shrink if that need disappears. If you specify the LongRunning option, a new thread will be created just for that Task. If you specify a custom TaskScheduler, you can have it do whatever you want it to.
Is it executed on a single and the same core?
Potentially, but not assuredly.
It is important to know since, for example, I can put to sleep the main thread thinking that I am freezing the background worker
Putting the main thread to sleep will not prevent background workers from working. That's the whole point of creating the background workers, the two tasks don't stop each other from doing work. Note that if the background workers ever try to access the UI either to report progress or display results, and the UI is blocked, then they will be waiting for the UI thread to be free at that point.
You can use:
System.Threading.Thread.CurrentThread
But as said in the comments, you use the TPL to abstract threading away, so going back to this "low level" is a likely indicator of poor design.
Task.Factory.StartNew() queues the task for execution (see here). The actual thread that executes the task and when it gets executed is up to the TaskScheduler specified (the current TaskScheduler is used if none is specified).
In .Net 4 the default TaskScheduler uses the ThreadPool to execute tasks (see here) so if a ThreadPool Thread queued the task the same thread can possibly execute it later on.
The number of threads is dictated by the ThreadPool.
You shouldn't really care about which core your tasks are executed on.
Queuing a Task for execution will most likely schedule it to be executed on a ThreadPool Thread so you won't be at risk of accidentally putting the main thread to sleep
I understand that the TPL does not necessarily create a new thread for every task in a parallel set, but does it always create at least one? eg:
private void MyFunc()
{
Task.Factory.StartNew(() =>
{
//do something that takes a while
});
DoSomethingTimely(); //is this line guaranteed to be hit immediately?
}
EDIT: To clarify: Yes, I mean is it guaranteed that the thread executing MyFunc() is not going to be used to execute //do something that takes a while.
That's up to whatever the current default TaskScheduler is. You can just about envisage someone doing something horrific like implementing a SynchronousTaskScheduler that executes the task body during QueueTask and sets it to complete before returning.
Assuming you're not letting someone else muck about with your task schedulers, you shouldn't have to worry about it.
It depends on what you mean by "immediately" but I think it's reasonable to assume that the TPL isn't going to hijack your currently executing thread to synchronously run the code in your task, if that's what you mean. At least not with the normal scheduler... you could probably write your own scheduler which does do so, but you can normally assume that StartNew will schedule the task rather than just running it inline.
Your main question and the question in your code are completely different questions. But the answers to the two questions are:
1) No, there's no guarantee a thread will be started. What is created and started is a task. Ultimately, some thread will have to execute that task, but whether one will be created is unspecified. An existing thread could be re-used.
2) It depends what you mean by "immediately". Strictly speaking, there is no timeliness guarantee. But you have told the system to execute that task, and it will at least start it as soon as it finishes everything it considers more important. Strict fairness or timeliness is not guaranteed.
Yes, it will hit very shortly after dispatching the task to run.
No, it will not create a new thread, but claim one. When they say it doesn't always create a new thread, they are referring to the fact that it re-uses threads from a thread pool.
The size of the pool is based on the number of CPU cores detected. But it will always contain at least 1 thread. ;)
In short: Yes, this is guranteed.
Longer: If StartNew() does not create a new thread, it will reuse another: Either by it being free, or by queueing.
DoSomethingTimely will get called very quickly, but what does that have to do with creating a new thread, or adding the task to a queue?
I want to implement a timeout on the execution of tasks in a project that uses the CCR. Basically when I post an item to a Port or enqueue a Task to a DispatcherQueue I want to be able to abort the task or the thread that its running on if it takes longer than some configured time. How can I do this?
Can you confirm what you are asking? Are you running a long-lived task in the Dispatcher? Killing the thread would break the CCR model, so you need to be able to signal to the thread to finish its work and yield. Assuming it's a loop that is not finishing quick enough, you might choose to enqueue a timer:
var resultTimeoutPort = new Port<DateTime>();
dispatcherQueue.EnqueueTimer(TimeSpan.FromSeconds(RESULT_TIMEOUT),
resultTimeoutPort);
and ensure the blocking thread has available a reference to resultTimeoutPort. In the blocking loop, one of the exit conditions might be:
do
{
//foomungus amount of work
}while(resultTimeoutPort.Test()==null&&
someOtherCondition)
Please post more info if I'm barking up the wrong tree.
You could register the thread (Thread.CurrentThread) at the beginning of your CCR "Receive" handler (or in a method that calls your method via a delegate). Then you can do your periodic check and abort if necessary basically the same way you would have done it if you created the thread manually. The catch is that if you use your own Microsoft.Ccr.Core.Dispatcher with a fixed number of threads, I don't think there is a way to get those threads back once you abort them (based on my testing). So, if your dispatcher has 5 threads, you'll only be able to abort 5 times before posting will no longer work regardless of what tasks have been registered. However, if you construct a DispatcherQueue using the CLR thread pool, any CCR threads you abort will be replaced automatically and you won't have that problem. From what I've seen, although the CCR dispatcher is recommended, I think using the CLR thread pool is the way to go in this situation.