TPL wait curiosity - c#

I was playing around with the TPL and found something very strange.
The code waits for tasks to end, but while doing this dummy test I found that a couple of tasks were executed after the Wait call. Am I missing something, or is this a TPL problem?
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            Task tareas = null;
            Console.WriteLine("Start process");
            for (int i = 0; i < 4000; i++)
            {
                var n = i.ToString();
                tareas = Task.Factory.StartNew(() =>
                {
                    var random = new Random();
                    Thread.Sleep(random.Next(200, 500));
                    Console.WriteLine("Task completed: " + n);
                });
            }
            tareas.Wait();
            Console.WriteLine("Process end");
            Console.Read();
        }
    }
}
Output:
Start process
Task completed: 4
Task completed: 3
...
Process end
Task completed: 3996
Task completed: 3991
Task completed: 3993

Well, tareas.Wait(); only waits for the last task you created to complete. As your tasks complete in a random amount of time, it's perfectly normal that some tasks are not yet completed once the last one your created completes.
Here's an example of what could happen:
Task 1 created
Task 2 created
Task 3 created
Task 4 created
Task 1 completed
Task 3 completed
Wait for task 4 completion
Task 4 completed
// task 2 ran for longer than the other tasks
// because of the random value the thread slept for
Task 2 completed

As others have noted, you're waiting on only a single task, the last one scheduled.
Tasks scheduled on the default TaskScheduler use the thread pool to execute the work items as described here. With this scheduler, ordering is not guaranteed. The framework will attempt to balance the queues running on each thread in the pool, but it is possible (and not uncommon) that the queue in which the last scheduled task is put will finish running through all its tasks before the other queues.
This is without even considering the Wait. Check out the section labeled “Task Inlining” in the document I linked – if a task is being waited on, the thread executing the wait can say 'well, I can't do anything until this task is complete, let me see if I can go ahead and execute it myself'. In that case, the task gets removed from the thread pool queue entirely.
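If the intent is to wait for all of the tasks rather than only the last one, a minimal sketch (not the original code) is to collect them in a list and call Task.WaitAll:
var tareas = new List<Task>();
for (int i = 0; i < 4000; i++)
{
    var n = i.ToString();
    tareas.Add(Task.Factory.StartNew(() =>
    {
        var random = new Random();
        Thread.Sleep(random.Next(200, 500));
        Console.WriteLine("Task completed: " + n);
    }));
}
// Blocks until every task in the list has completed.
Task.WaitAll(tareas.ToArray());
Console.WriteLine("Process end");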

tareas will contain a reference to the last task you created, so tareas.Wait() will wait only for that last task to finish; the other tasks may still run even after the last one has finished.

Related

Asynchronous tasks c#

I am trying to understand asynchronous programming in C#. I created an example of an operation that only one caller should perform at a time, for example reading from a big file. I want to create a queue, and requests will wait in the queue until they can safely read the file.
AsyncWork aw = new AsyncWork();
Task.Run(() => {
    for (int count = 0; count < 10; count++) {
        aw.DoAsyncWork(count);
        Console.WriteLine("request added");
    }
});
Console.ReadKey();
and AsyncWork
public class AsyncWork {
    int Requests = 0;
    int Workers = 0;

    public AsyncWork() { }

    public async Task DoAsyncWork(int count) {
        Requests++;
        while (Workers > 0) ;
        Workers++;
        Requests--;
        if (Workers > 1) Console.WriteLine("MORE WORKERS AT ONCE");
        Console.WriteLine(count.ToString() + " Started task");
        //reading from file
        await Task.Delay(1000);
        Console.WriteLine(count.ToString() + " Ended task");
        Workers--;
    }
}
I expected that I would have 10 requests and that they would then be done one by one. But the output is like:
0 Started task
request added
0 Ended task
1 Started task
request added
1 Ended task
2 Started task
request added
2 Ended task
3 Started task
request added
3 Ended task
4 Started task
request added
4 Ended task
5 Started task
request added
5 Ended task
6 Started task
request added
6 Ended task
7 Started task
request added
7 Ended task
8 Started task
request added
8 Ended task
9 Started task
request added
9 Ended task
Why is it running synchronously?
When you run the code your first worker will start working, then the next will get stuck in the busy loop that you have, while (Workers > 0) ; and won't move on until the previous worker finishes. Then when it starts the next worker will be stuck there, and so on, for each iteration of the loop in which you start the workers.
So you only ever have at most one worker doing work, and one worker pegging an entire CPU sitting there waiting for it to finish.
A proper way to synchronize access when writing asynchronous code is to use a SemaphoreSlim and use WaitAsync, which will asynchronously wait until the semaphore can be acquired, rather than synchronously blocking the thread (and pegging the CPU while you're at it) for the other workers to finish. It also has the advantage of being safe to access from multiple threads, unlike an integer which is not safe to change from multiple threads.
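A minimal sketch of the SemaphoreSlim approach (this is not the poster's original class; the field name is illustrative):
public class AsyncWork {
    // One permit: only one caller may do the "file work" at a time.
    private readonly SemaphoreSlim gate = new SemaphoreSlim(1, 1);

    public async Task DoAsyncWork(int count) {
        // Asynchronously wait for the gate instead of spinning on a flag.
        await gate.WaitAsync();
        try {
            Console.WriteLine(count + " Started task");
            // reading from file
            await Task.Delay(1000);
            Console.WriteLine(count + " Ended task");
        }
        finally {
            gate.Release();
        }
    }
}
With this version each call returns to the caller at the first await, so all of the "request added" lines appear quickly while the actual work still runs one request at a time.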

Task does not wait up to the wait time

I have created a task and provided a wait time to the Task.Wait() method, but the task does not wait up to the provided time and returns before the wait time with a completed status of false.
using System;
using System.Threading;
using System.Threading.Tasks;

class Test
{
    static void Main(string[] args)
    {
        for (int i = 0; i < 10; i++)
        {
            int localValue = i;
            Task.Factory.StartNew(() => ProcessTask(localValue));
        }
        Console.ReadKey();
    }

    private static void ProcessTask(int thread)
    {
        var task = Task<int>.Factory.StartNew(() => GetSomeValue());
        task.Wait(2000);
        if (task.IsCompleted)
        {
            Console.WriteLine("Completed Thread: " + thread);
        }
        else
        {
            Console.WriteLine("Not Completed Thread " + thread);
        }
    }

    private static int GetSomeValue()
    {
        Thread.Sleep(400);
        return 5;
    }
}
Update:
I have updated the code. When I ran it, only two of the 10 tasks completed, so I want to know what the issue is with this code.
Note: I am running this code on .NET Framework 4.5.2.
The problem isn't that Task.Wait isn't waiting long enough here - it's that you're assuming that as soon as you call Task.Factory.StartNew() (which you should almost never do, btw - use Task.Run instead), the task is started. That's not the case. Task scheduling is a complicated topic, and I don't claim to be an expert, but when you start a lot of tasks at the same time, the thread pool will wait a little while before creating a new thread, to see if it can reuse an existing one.
You can see this if you add more logging to your code. I added logging before and after the Wait call, and before and after the Sleep call, identifying which original value of i was involved. (I followed your convention of calling that the thread, although that's not quite the case.) The log uses DateTime.UtcNow with a pattern of MM:ss.FFF to show a timestamp down to a millisecond.
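The exact logging code isn't reproduced here, but a rough reconstruction of the idea (the helper name and message wording are assumptions) looks something like this:
private static void Log(string message)
{
    // "MM:ss.FFF" is the pattern mentioned above.
    Console.WriteLine("{0}: {1}", DateTime.UtcNow.ToString("MM:ss.FFF"), message);
}

private static void ProcessTask(int thread)
{
    var task = Task<int>.Factory.StartNew(() => GetSomeValue(thread));
    Log("Before Wait in thread " + thread);
    task.Wait(2000);
    Log("After Wait in thread " + thread);
}

private static int GetSomeValue(int thread)
{
    Log("Task for thread " + thread + " started");
    Thread.Sleep(400);
    Log("Task for thread " + thread + " completing");
    return 5;
}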
Here's the output of the log for a single task that completed:
12:01.657: Before Wait in thread 7
12:03.219: Task for thread 7 started
12:03.623: Task for thread 7 completing
12:03.625: After Wait in thread 7
Here the Wait call returns after less than 2 seconds, but that's fine because the task has completed.
And here's the output of the log for a single task that didn't complete:
12:01.644: Before Wait in thread 6
12:03.412: Task for thread 6 started
12:03.649: After Wait in thread 6
12:03.836: Task for thread 6 completing
Here Wait really has waited for 2 seconds, but the task still hasn't completed, because it only properly started just before the Wait time was up.
If you need to wait for the task to complete, you can use the Result property. Reading Result blocks the calling thread until the task finishes.
var task = Task<int>.Factory.StartNew(() => GetSomeValue());
int res = task.Result;

Creating a Thread Queue

I need to handle user requests one by one (similar to a job queue).
This is what I have:
Thread checkQueue;
Boolean IsComplete = true;

protected void Start()
{
    checkQueue = new Thread(CheckQueueState);
    checkQueue.Start();
}

private void CheckQueueState()
{
    while (true)
    {
        if (IsComplete)
        {
            ContinueDoSomething();
            checkQueue.Abort();
            break;
        }
        System.Threading.Thread.Sleep(1000);
    }
}

protected void ContinueDoSomething()
{
    IsComplete = false;
    ...
    ...
    IsComplete = true; //when done, set it to true
}
Every time there is a new request from a user, the system calls the Start() function and checks whether the previous job is complete; if so, it proceeds to the next job.
But I am not sure whether it is correct to do it this way.
Is there any improvement, or a better way to do this?
I like usr's suggestion regarding using TPL Dataflow. If you have the ability to add external dependencies to your project (TPL Dataflow is not distributed as part of the .NET framework), then it provides a clean solution to your problem.
If, however, you're stuck with what the framework has to offer, you should have a look at BlockingCollection<T>, which works nicely with the producer-consumer pattern that you're trying to implement.
I've thrown together a quick .NET 4.0 example to illustrate how it can be used in your scenario. It is not very lean because it has a lot of calls to Console.WriteLine(). However, if you take out all the clutter it's extremely simple.
At the center of it is a BlockingCollection<Action>, which gets Action delegates added to it from any thread, and a thread specifically dedicated to dequeuing and executing those Actions sequentially in the exact order in which they were added.
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

namespace SimpleProducerConsumer
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Main thread id is {0}.", Thread.CurrentThread.ManagedThreadId);
            using (var blockingCollection = new BlockingCollection<Action>())
            {
                // Start our processing loop.
                var actionLoop = new Thread(() =>
                {
                    Console.WriteLine(
                        "Starting action loop on thread {0} (dedicated action loop thread).",
                        Thread.CurrentThread.ManagedThreadId,
                        Thread.CurrentThread.IsThreadPoolThread);

                    // Dequeue actions as they become available.
                    foreach (var action in blockingCollection.GetConsumingEnumerable())
                    {
                        // Invoke the action synchronously
                        // on the "actionLoop" thread.
                        action();
                    }

                    Console.WriteLine("Action loop terminating.");
                });

                actionLoop.Start();

                // Enqueue some work.
                Console.WriteLine("Enqueueing action 1 from thread {0} (main thread).", Thread.CurrentThread.ManagedThreadId);
                blockingCollection.Add(() => SimulateWork(1));

                Console.WriteLine("Enqueueing action 2 from thread {0} (main thread).", Thread.CurrentThread.ManagedThreadId);
                blockingCollection.Add(() => SimulateWork(2));

                // Let's enqueue it from another thread just for fun.
                var enqueueTask = Task.Factory.StartNew(() =>
                {
                    Console.WriteLine(
                        "Enqueueing action 3 from thread {0} (task executing on a thread pool thread).",
                        Thread.CurrentThread.ManagedThreadId);
                    blockingCollection.Add(() => SimulateWork(3));
                });

                // We have to wait for the task to complete
                // because otherwise we'll end up calling
                // CompleteAdding before our background task
                // has had the chance to enqueue action #3.
                enqueueTask.Wait();

                // Tell our loop (and, consequently, the "actionLoop" thread)
                // to terminate when it's done processing pending actions.
                blockingCollection.CompleteAdding();

                Console.WriteLine("Done enqueueing work. Waiting for the loop to complete.");

                // Block until the "actionLoop" thread terminates.
                actionLoop.Join();

                Console.WriteLine("Done. Press Enter to quit.");
                Console.ReadLine();
            }
        }

        private static void SimulateWork(int actionNo)
        {
            Thread.Sleep(500);
            Console.WriteLine("Finished processing action {0} on thread {1} (dedicated action loop thread).", actionNo, Thread.CurrentThread.ManagedThreadId);
        }
    }
}
And the output is:
0.016s: Main thread id is 10.
0.025s: Enqueueing action 1 from thread 10 (main thread).
0.026s: Enqueueing action 2 from thread 10 (main thread).
0.027s: Starting action loop on thread 11 (dedicated action loop thread).
0.028s: Enqueueing action 3 from thread 6 (task executing on a thread pool thread).
0.028s: Done enqueueing work. Waiting for the loop to complete.
0.527s: Finished processing action 1 on thread 11 (dedicated action loop thread).
1.028s: Finished processing action 2 on thread 11 (dedicated action loop thread).
1.529s: Finished processing action 3 on thread 11 (dedicated action loop thread).
1.530s: Action loop terminating.
1.532s: Done. Press Enter to quit.
Use an ActionBlock<T> from the TPL Dataflow library. Set its MaxDegreeOfParallelism to 1 and you're done.
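A minimal sketch of that approach (it requires the TPL Dataflow package, namespace System.Threading.Tasks.Dataflow; the work items here are illustrative):
var queue = new ActionBlock<Action>(
    action => action(),
    new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 1 });

// Post work from any thread; items run one at a time, in the order posted.
queue.Post(() => Console.WriteLine("Job 1"));
queue.Post(() => Console.WriteLine("Job 2"));

// When no more work is coming, complete the block and wait for it to drain.
queue.Complete();
queue.Completion.Wait();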
Note that ASP.NET worker processes can recycle at any time (e.g. due to a scheduled recycle, a memory limit, a server reboot or a deployment), so the queued work might suddenly be lost without notice. I recommend you look into an external queueing solution like MSMQ (or others) for reliable queues.
Take a look at the Reactive Extensions from Microsoft. This library contains a set of schedulers that follow the semantics you require.
The best fit for your needs is the EventLoopScheduler. It queues up actions and performs them one after another. If it completes an action and there are more items in the queue, it processes them sequentially on the same thread until the queue is empty, and then it disposes of the thread. When a new action is queued, it creates a new thread. This is what makes it very efficient.
The code is super simple and looks like this:
var scheduler = new System.Reactive.Concurrency.EventLoopScheduler();
scheduler.Schedule(() => { /* action here */ });
If you need to have every queued action performed on a new thread then use it like this:
var scheduler = new System.Reactive.Concurrency.NewThreadScheduler();
scheduler.Schedule(() => { /* action here */ });
Very simple.

Why does the Task Parallel Library have a 'hidden' 1 second timeout for scheduling tasks under certain conditions?

My laptop has 2 logical processors, and I stumbled upon a scenario where, if I schedule 2 tasks that take longer than 1 second without designating them long-running, subsequent tasks are not started until 1 second has elapsed. Is it possible to change this timeout?
I know normal tasks should be short-running, much shorter than a second if possible; I'm just wondering whether I am seeing hard-coded TPL behavior or whether I can influence it in any way other than designating tasks as long-running.
This Console app method should demonstrate the behavior for a machine with any number of processors:
static void Main(string[] args)
{
    var timer = new Stopwatch();
    timer.Start();
    int numberOfTasks = Environment.ProcessorCount;
    var rudeTasks = new List<Task>();
    var shortTasks = new List<Task>();

    for (int index = 0; index < numberOfTasks; index++)
    {
        int capturedIndex = index;
        rudeTasks.Add(Task.Factory.StartNew(() =>
        {
            Console.WriteLine("Starting rude task {0} at {1}ms", capturedIndex, timer.ElapsedMilliseconds);
            Thread.Sleep(5000);
        }));
    }

    for (int index = 0; index < numberOfTasks; index++)
    {
        int capturedIndex = index;
        shortTasks.Add(Task.Factory.StartNew(() =>
        {
            Console.WriteLine("Short-running task {0} running at {1}ms", capturedIndex, timer.ElapsedMilliseconds);
        }));
    }

    Task.WaitAll(shortTasks.ToArray());
    Console.WriteLine("Finished waiting for short tasks at {0}ms", timer.ElapsedMilliseconds);

    Task.WaitAll(rudeTasks.ToArray());
    Console.WriteLine("Finished waiting for rude tasks at {0}ms", timer.ElapsedMilliseconds);

    Console.ReadLine();
}
Here is the app's output on my 2 proc laptop:
Starting rude task 0 at 2ms
Starting rude task 1 at 2ms
Short-running task 0 running at 1002ms
Short-running task 1 running at 1002ms
Finished waiting for short tasks at 1002ms
Finished waiting for rude tasks at 5004ms
Press any key to continue . . .
The lines:
Short-running task 0 running at 1002ms
Short-running task 1 running at 1002ms
indicate that there is a 1 second timeout or something of that nature allowing the shorter-running tasks to get scheduled over the 'rude' tasks. That's what I'm inquiring about.
The behavior that you are seeing is not specific to the TPL, it's specific to the TPL's default scheduler. The scheduler is attempting to increase the number of threads so that those two that are running don't "hog" the CPU and choke out the others. It's also helpful in avoiding deadlock situations if the two that are running start and wait on Tasks themselves.
If you want to change the scheduling behavior, you might want to look into implementing your own TaskScheduler.
This is standard behavior for the thread pool scheduler. It tries to keep the number of active threads equal to the number of cores, but it can't do the job very well when your tasks spend a lot of time blocking instead of running; sleeping, in your case. Twice a second it allows another thread to run to try to work down the backlog. It seems you have a dual-core CPU.
The proper workaround is to use TaskCreationOptions.LongRunning so the scheduler uses a regular Thread instead of a thread pool thread. An improper workaround is to use ThreadPool.SetMinThreads. But you should perhaps focus on doing real work in your tasks; Sleep() is not a very good simulation of that.
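For completeness, a tiny sketch of the two workarounds just mentioned (the thread counts are illustrative, not recommendations):
// Improper workaround: keep more pool threads ready so blocked tasks
// don't delay the queued ones.
ThreadPool.SetMinThreads(workerThreads: 16, completionPortThreads: 16);

// Proper workaround: give blocking work a dedicated thread.
Task.Factory.StartNew(() => Thread.Sleep(5000), TaskCreationOptions.LongRunning);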
The problem is it takes a while for the scheduler to start the new tasks as it tries to determine if a task is long-running. You can tell the TPL that a task is long running as a parameter of the task:
for (int index = 0; index < numberOfTasks; index++)
{
    int capturedIndex = index;
    rudeTasks.Add(Task.Factory.StartNew(() =>
    {
        Console.WriteLine("Starting rude task {0} at {1}ms", capturedIndex, timer.ElapsedMilliseconds);
        Thread.Sleep(3000);
    }, TaskCreationOptions.LongRunning));
}
Resulting in:
Starting rude task 0 at 11ms
Starting rude task 1 at 13ms
Starting rude task 2 at 15ms
Starting rude task 3 at 19ms
Short-running task 0 running at 45ms
Short-running task 1 running at 45ms
Short-running task 2 running at 45ms
Short-running task 3 running at 45ms
Finished waiting for short tasks at 46ms
Finished waiting for rude tasks at 3019ms

Managing threads efficiently in C#

I have an application in which the user chooses a number of tasks to run along with the maximum number of threads. Each task should run on a separate thread. Here is what I am looking for:
If the user specifies n less than t, where n is the maximum number of threads and t is the number of tasks, the program should run n threads, and after they finish it should be notified in some way and repeat the loop until all tasks are done.
My question is:
How do I know that all running threads have finished their jobs, so that I can repeat the loop?
I recommend using the ThreadPool for your task. Its algorithm will generally be more efficient than something you can roll by hand.
Now the fun part is getting notified when all of your threads complete. Unless you have really specific needs which make this solution unsuitable, it should be easy enough to implement with the CountdownEvent class, which is a special kind of wait handle that waits until it has been signaled n times. Here's an example:
using System;
using System.Linq;
using System.Threading;
using System.Diagnostics;

namespace CSharpSandbox
{
    class Program
    {
        static void SomeTask(int sleepInterval, CountdownEvent countDown)
        {
            try
            {
                // pretend this did something more profound
                Thread.Sleep(sleepInterval);
            }
            finally
            {
                // need to signal in a finally block, otherwise an exception may occur and prevent
                // this from being signaled
                countDown.Signal();
            }
        }

        static CountdownEvent StartTasks(int count)
        {
            Random rnd = new Random();
            CountdownEvent countDown = new CountdownEvent(count);
            for (int i = 0; i < count; i++)
            {
                ThreadPool.QueueUserWorkItem(_ => SomeTask(rnd.Next(100), countDown));
            }
            return countDown;
        }

        public static void Main(string[] args)
        {
            Console.WriteLine("Starting. . .");
            var stopWatch = Stopwatch.StartNew();
            using (CountdownEvent countdownEvent = StartTasks(100))
            {
                countdownEvent.Wait();
                // waits until the countdownEvent is signalled 100 times
            }
            stopWatch.Stop();
            Console.WriteLine("Done! Elapsed time: {0} milliseconds", stopWatch.Elapsed.TotalMilliseconds);
        }
    }
}
You probably want to use a Thread Pool for this. You (can) specify the number of threads in the pool, and give it tasks to do. When a thread in the pool is idle, it automatically looks for another task to carry out.
If you want to do this without the thread pool, you can use Thread.Join to wait for the threads to complete. That is:
Thread t1 = new Thread(...);
Thread t2 = new Thread(...);
t1.Start();
t2.Start();
// Wait for threads to finish
t1.Join();
t2.Join();
// At this point, all threads are done.
Of course, if this is an interactive application you'd want that to happen in a thread itself. And if you wanted to get fancy, the waiting thread could do the work of one of the threads (i.e. you'd start thread 1 and then the main thread would do the work of the second thread).
If this is an interactive application, then you probably want to make use of BackgroundWorker (which uses the thread pool). If you attach an event handler to the RunWorkerCompleted event, then you will be notified when the worker has completed its task. If you have multiple workers, have a single RunWorkerCompleted event handler and keep track of which workers have signaled. When they've all signaled, your program can go ahead and do whatever else it needs to do.
The example at http://msdn.microsoft.com/en-us/library/system.componentmodel.backgroundworker.aspx should give you a good start.
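A rough sketch of that bookkeeping (the worker count and the simulated work are illustrative):
int remaining = 5;
for (int i = 0; i < 5; i++)
{
    var worker = new System.ComponentModel.BackgroundWorker();
    worker.DoWork += (s, e) => Thread.Sleep(500); // pretend this is real work
    worker.RunWorkerCompleted += (s, e) =>
    {
        // Interlocked keeps the countdown correct if workers finish concurrently.
        if (Interlocked.Decrement(ref remaining) == 0)
            Console.WriteLine("All workers have completed.");
    };
    worker.RunWorkerAsync();
}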
Could you check the IsAlive value for each thread? If all values equal false, then you would know that all your threads have ended. Additionally, there is a way to have your delegate return its own status.
http://msdn.microsoft.com/en-us/library/system.threading.thread.isalive(v=VS.90).aspx
