Bad SemaphoreSlim perfomance using a lot of semaphores

Bad SemaphoreSlim perfomance using a lot of semaphores - c#

When I run a lot of operations in parallel using SemaphoreSlim for each, their invocations are not so quick as expected.
Here is the code
var sw = new Stopwatch();
sw.Start();
for (int i = 0; i < 50; i++) {
int localI = i;
Task.Run(async () => {
var semaphore = new SemaphoreSlim(1, 1);
await semaphore.WaitAsync();
Thread.Sleep(1000);
counter++;
semaphore.Release();
Debug.WriteLine($"{localI} - {sw.ElapsedMilliseconds}");
});
}
Thread.Sleep(5000);
And here is the output:
2 - 1015
0 - 1015
1 - 1015
3 - 2053
4 - 2053
5 - 2053
6 - 2120
7 - 3009
8 - 3064
9 - 3066
10 - 3068
11 - 3134
12 - 4011
13 - 4016
14 - 4070
15 - 4071
16 - 4073
17 - 4140
Can somebody explain why they were not invoked approximately in 1 second?

What you are seeing is the limited thread pool injection rate. It has nothing to do with SemaphoreSlim or even async, as all the code posted is actually synchronous.
On your machine, three threads are able to run immediately. The thread pool sees that it has other work to do (47 other items already queued). So it waits for a bit and then injects another thread. The next group of work uses four threads. The thread pool is still "behind", so it waits for a bit and then injects another thread, etc.
The "wait for a bit" part of the description above is the limited thread pool injection rate. The thread pool has to wait for a bit, or else whenever it gets more work, it would immediately create a bunch of threads, which would then be disposed of when the work is done. So to be more efficient and prevent this "thread thrashing", the thread pool waits for a bit before creating new threads.

Related

How does asynchronous programming work with threads when using Thread.Sleep()?

Presumptions/Prelude:
In previous questions, we note that Thread.Sleep blocks threads see: When to use Task.Delay, when to use Thread.Sleep?.
We also note that console apps have three threads: The main thread, the GC thread & the finalizer thread IIRC. All other threads are debugger threads.
We know that async does not spin up new threads, and it instead runs on the synchronization context, "uses time on the thread only when the method is active". https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/async/task-asynchronous-programming-model
Setup:
In a sample console app, we can see that neither the sibling nor the parent code are affected by a call to Thread.Sleep, at least until the await is called (unknown if further).
var sw = new Stopwatch();
sw.Start();
Console.WriteLine($"{sw.Elapsed}");
var asyncTests = new AsyncTests();
var go1 = asyncTests.WriteWithSleep();
var go2 = asyncTests.WriteWithoutSleep();
await go1;
await go2;
sw.Stop();
Console.WriteLine($"{sw.Elapsed}");
Stopwatch sw1 = new Stopwatch();
public async Task WriteWithSleep()
{
sw1.Start();
await Task.Delay(1000);
Console.WriteLine("Delayed 1 seconds");
Console.WriteLine($"{sw1.Elapsed}");
Thread.Sleep(9000);
Console.WriteLine("Delayed 10 seconds");
Console.WriteLine($"{sw1.Elapsed}");
sw1.Stop();
}
public async Task WriteWithoutSleep()
{
await Task.Delay(3000);
Console.WriteLine("Delayed 3 second.");
Console.WriteLine($"{sw1.Elapsed}");
await Task.Delay(6000);
Console.WriteLine("Delayed 9 seconds.");
Console.WriteLine($"{sw1.Elapsed}");
}
Question:
If the thread is blocked from execution during Thread.Sleep, how is it that it continues to process the parent and sibling? Some answer that it is background threads, but I see no evidence of multithreading background threads. What am I missing?

I see no evidence of multithreading background threads. What am I missing?
Possibly you are looking in the wrong place, or using the wrong tools. There's a handy property that might be of use to you, in the form of Thread.CurrentThread.ManagedThreadId. According to the docs,
A thread's ManagedThreadId property value serves to uniquely identify that thread within its process.
The value of the ManagedThreadId property does not vary over time
This means that all code running on the same thread will always see the same ManagedThreadId value. If you sprinkle some extra WriteLines into your code, you'll be able to see that your tasks may run on several different threads during their lifetimes. It is even entirely possible for some async applications to have all their tasks run on the same thread, though you probably won't see that behaviour in your code under normal circumstances.
Here's some example output from my machine, not guaranteed to be the same on yours, nor is it necessarily going to be the same output on successive runs of the same application.
00:00:00.0000030
* WriteWithSleep on thread 1 before await
* WriteWithoutSleep on thread 1 before first await
* WriteWithSleep on thread 4 after await
Delayed 1 seconds
00:00:01.0203244
* WriteWithoutSleep on thread 5 after first await
Delayed 3 second.
00:00:03.0310891
* WriteWithoutSleep on thread 6 after second await
Delayed 9 seconds.
00:00:09.0609263
Delayed 10 seconds
00:00:10.0257838
00:00:10.0898976
The business of running tasks on threads is handled by a TaskScheduler. You could write one that forces code to be single threaded, but that's not often a useful thing to do. The default scheduler uses a threadpool, and as such tasks can be run on a number of different threads.

The Task.Delay method is implemented basically like this (simplified¹):
public static Task Delay(int millisecondsDelay)
{
var tcs = new TaskCompletionSource();
_ = new Timer(_ => tcs.SetResult(), null, millisecondsDelay, -1);
return tcs.Task;
}
The Task is completed on the callback of a System.Threading.Timer component, and according to the documentation this callback is invoked on a ThreadPool thread:
The method does not execute on the thread that created the timer; it executes on a ThreadPool thread supplied by the system.
So when you await the task returned by the Task.Delay method, the continuation after the await runs on the ThreadPool. The ThreadPool typically has more than one threads available immediately on demand, so it's not difficult to introduce concurrency and parallelism if you create 2 tasks at once, like you do in your example. The main thread of a console application is not equipped with a SynchronizationContext by default, so there is no mechanism in place to prevent the observed concurrency.
¹ For demonstration purposes only. The Timer reference is not stored anywhere, so it might be garbage collected before the callback is invoked, resulting in the Task never completing.

I am not accepting my own answer, I will accept someone else's answer because they helped me figure this out. First, in the context of my question, I was using async Main. It was very hard to choose between Theodor's & Rook's answer. However, Rook's answer provided me with one thing that helped me fish: Thread.CurrentThread.ManagedThreadId
These are the results of my running code:
1 00:00:00.0000767
Not Delayed.
1 00:00:00.2988809
Delayed 1 second.
4 00:00:01.3392148
Delayed 3 second.
5 00:00:03.3716776
Delayed 9 seconds.
5 00:00:09.3838139
Delayed 10 seconds
4 00:00:10.3411050
4 00:00:10.5313519
I notice that there are 3 threads here, The initial thread (1) provides for the first calling method and part of the WriteWithSleep() until Task.Delay is initialized and later awaited. At the point that Task.Delay is brought back into Thread 1, everything is run on Thread 4 instead of Thread 1 for the main and the remainder of WriteWithSleep.
WriteWithoutSleep uses its own Thread(5).
So my error was believing that there were only 3 threads. I believed the answer to this question: https://stackoverflow.com/questions/3476642/why-does-this-simple-net-console-app-have-so-many-threads#:~:text=You%20should%20only%20see%20three,see%20are%20debugger%2Drelated%20threads.
However, that question may not have been async, or may not have considered these additional worker threads from the threadpool.
Thank you all for your assistance in figuring out this question.

Does usage of Thread.Sleep(n) causes performance issues?

I am using Thread.Sleep(n) for my project. I heard that Thread.Sleep can cause performance issues but not sure about it.
My requirement is:
Wait for 5 minute increments, up to 30 minutes (6 times the 5 minute
delay). After this, begin to increment by 1 hour and do this 5 times
(additional 5 hours).
Below I have provided my sample code which uses Thread.Sleep(n) in different scenarios:
Thread.Sleep(1000 * 60 * 5); //-------waiting for 5 minutes
var isDownloaded = false;
try
{
var attempt = 0;
while (attempt < 11)
{
isDownloaded = TryDownloading(strPathToDownload, strFileToDownload);
if (isDownloaded)
break;
attempt++;
if (attempt < 6)
Thread.Sleep(1000 * 60 * 5); //--------waiting for 5 minutes
else
{
if (attempt < 11)
Thread.Sleep(1000 * 60 * 60); //-------waiting for 1 hour
else
break;
}
}
}
On the above code I am trying to download a file with a maximum of 11 download attempts. Initially it waits for 5 minutes and then try to download the file which is the first attempt and if failed then it tries for the next 5 attempts with an interval of 5 minutes each. If they failed for the first six attempts then it go for the next 5 attempts with an interval for 1 hour each.
So we decided to use Thread.Sleep for those time delay in our console app.
Does this causes any problem or performance issues?
If Thread.Sleep(n) causes performance issues, then which would be an better alternative way instead of using Thread.Sleep(n)?
Also finally, Does MSDN suggested that Thread.Sleep(n) is harmful or it shouldn't be used?

This is absolutely fine. Here are the costs of sleeping:
You keep a thread occupied. This takes a little memory.
Setting up the wait inside of the OS. That is O(1). The duration of the sleep does not matter. It is a small, constant cost.
That's all.
What is bad, though, is busy waiting or doing polling loops because that causes actual CPU usage. Just spending time in a sleep or wait does not create CPU usage.
TL;DR: Use sleep for delays, do not use sleep for polling.
I must say that the aversion against sleeping is sometimes just a trained reflex. Be sure to analyze the concrete use case before you condemn sleeping.

Do not use ASP.NET worker process to run long running tasks!
The app pool may be recycled any time and you will lose your sleeping threads.
Consider using Windows Service instead.
You can communicate between web site and windows service using database or messaging.

You'd better use a Timer to perform intermittent action like that because Thread.Sleep is a blocking call and keeps the process allocated and the application freeze.
like this:
in the caller object
Timer t = new Timer(1000 * 60 * 5);
t.Tick += t_Tick;
t.Start();
than implement the event
//event timer elapsed implementation
int count = 0;
private void t_Tick(object sender, EventArgs e)
{
if(count >=5)
t.Stop();
else{
//your code that do the work here
}

Parallel.ForEach keeps spawning new threads

While I was using Parallel.ForEach in my program, I found that some threads never seemed to finish. In fact, it kept spawning new threads over and over, a behaviour that I wasn't expecting and definitely don't want.
I was able to reproduce this behaviour with the following code which, just like my 'real' program, both uses processor and memory a lot (.NET 4.0 code):
public class Node
{
public Node Previous { get; private set; }
public Node(Node previous)
{
Previous = previous;
}
}
public class Program
{
public static void Main(string[] args)
{
DateTime startMoment = DateTime.Now;
int concurrentThreads = 0;
var jobs = Enumerable.Range(0, 2000);
Parallel.ForEach(jobs, delegate(int jobNr)
{
Interlocked.Increment(ref concurrentThreads);
int heavyness = jobNr % 9;
//Give the processor and the garbage collector something to do...
List<Node> nodes = new List<Node>();
Node current = null;
for (int y = 0; y < 1024 * 1024 * heavyness; y++)
{
current = new Node(current);
nodes.Add(current);
}
TimeSpan elapsed = DateTime.Now - startMoment;
int threadsRemaining = Interlocked.Decrement(ref concurrentThreads);
Console.WriteLine("[{0:mm\\:ss}] Job {1,4} complete. {2} threads remaining.",
elapsed, jobNr, threadsRemaining);
});
}
}
When run on my quad-core, it initially starts of with 4 concurrent threads, just as you would expect. However, over time more and more threads are being created. Eventually, this program then throws an OutOfMemoryException:
[00:00] Job 0 complete. 3 threads remaining.
[00:01] Job 1 complete. 4 threads remaining.
[00:01] Job 2 complete. 4 threads remaining.
[00:02] Job 3 complete. 4 threads remaining.
[00:05] Job 9 complete. 5 threads remaining.
[00:05] Job 4 complete. 5 threads remaining.
[00:05] Job 5 complete. 5 threads remaining.
[00:05] Job 10 complete. 5 threads remaining.
[00:08] Job 11 complete. 5 threads remaining.
[00:08] Job 6 complete. 5 threads remaining.
...
[00:55] Job 67 complete. 7 threads remaining.
[00:56] Job 81 complete. 8 threads remaining.
...
[01:54] Job 107 complete. 11 threads remaining.
[02:00] Job 121 complete. 12 threads remaining.
..
[02:55] Job 115 complete. 19 threads remaining.
[03:02] Job 166 complete. 21 threads remaining.
...
[03:41] Job 113 complete. 28 threads remaining.
<OutOfMemoryException>
The memory usage graph for the experiment above is as follows:
(The screenshot is in Dutch; the top part represents processor usage, the bottom part memory usage.) As you can see, it looks like a new thread is being spawned almost every time the garbage collector gets in the way (as can be seen in the dips of memory usage).
Can anyone explain why this is happening, and what I can do about it? I just want .NET to stop spawning new threads, and finish the existing threads first...

You can limit the maximum number of threads that get created by specifying a ParallelOptions instance with the MaxDegreeOfParallelism property set:
var jobs = Enumerable.Range(0, 2000);
ParallelOptions po = new ParallelOptions
{
MaxDegreeOfParallelism = Environment.ProcessorCount
};
Parallel.ForEach(jobs, po, jobNr =>
{
// ...
});
As to why you're getting the behaviour you're observing: The TPL (which underlies PLINQ) is, by default, at liberty to guess the optimal number of threads to use. Whenever a parallel task blocks, the task scheduler may create a new thread in order to maintain progress. In your case, the blocking might be happening implicitly; for example, through the Console.WriteLine call, or (as you observed) during garbage collection.
From Concurrency Levels Tuning with Task Parallel Library (How Many Threads to Use?):
Since the TPL default policy is to use one thread per processor, we can conclude that TPL initially assumes that the workload of a task is ~100% working and 0% waiting, and if the initial assumption fails and the task enters a waiting state (i.e. starts blocking) - TPL with take the liberty to add threads as appropriate.

You should probably read a bit about the how the task scheduler works.
Parallel Programming with Microsoft .NET - Parallel Tasks
(latter half of the page)
"The .NET thread pool automatically manages the number of worker
threads in the pool. It adds and removes threads according to built-in
heuristics. The .NET thread pool has two main mechanisms for injecting
threads: a starvation-avoidance mechanism that adds worker threads if
it sees no progress being made on queued items and a hill-climbing
heuristic that tries to maximize throughput while using as few threads
as possible.
The goal of starvation avoidance is to prevent deadlock. This kind of
deadlock can occur when a worker thread waits for a synchronization
event that can only be satisfied by a work item that is still pending
in the thread pool's global or local queues. If there were a fixed
number of worker threads, and all of those threads were similarly
blocked, the system would be unable to ever make further progress.
Adding a new worker thread resolves the problem.
A goal of the hill-climbing heuristic is to improve the utilization of
cores when threads are blocked by I/O or other wait conditions that
stall the processor. By default, the managed thread pool has one
worker thread per core. If one of these worker threads becomes
blocked, there's a chance that a core might be underutilized,
depending on the computer's overall workload. The thread injection
logic doesn't distinguish between a thread that's blocked and a thread
that's performing a lengthy, processor-intensive operation. Therefore,
whenever the thread pool's global or local queues contain pending work
items, active work items that take a long time to run (more than a
half second) can trigger the creation of new thread pool worker
threads."
You can mark a task as LongRunning but this has the side effect of allocating a thread for it from outside the thread pool which means that the task cannot be inlined.
Remember that the ParallelFor treats the work it is given as blocks so even if the work in one loop is fairly small the overall work done by the task invoked by the look may appear longer to the scheduler.
Most calls to the GC in and of them selves aren't blocking (it runs on a separate thread) but if you wait for GC to complete then this does block. Remember also that the GC is rearranging memory so this may have some side effects (and blocking) if you are trying to allocate memory while running GC. I don't have specifics here but I know the PPL has some memory allocation features specifically for concurrent memory management for this reason.
Looking at your code's output it seems that things are running for many seconds. So I'm not surprised that you are seeing thread injection. However I seem to remember that the default thread pool size is roughly 30 threads (probably depending on the number of cores on your system). A thread takes up roughly a MB of memory before your code allocates any more so I'm not clear why you could get an out of memory exception here.

I've posted the follow-up question "How to count the amount of concurrent threads in .NET application?"
If to count the threads directly, their number in Parallel.For() mostly ((very rarely and insignificantly decreasing) only increases and is not releleased after loop completion.
Checked this in both Release and Debug mode, with
ParallelOptions po = new ParallelOptions
{
MaxDegreeOfParallelism = Environment.ProcessorCount
};
and without
The digits vary but conclusions are the same.
Here is the ready code I was using, if someone wants to play with:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
namespace Edit4Posting
{
public class Node
{
public Node Previous { get; private set; }
public Node(Node previous)
{
Previous = previous;
}
}
public class Edit4Posting
{
public static void Main(string[] args)
{
int concurrentThreads = 0;
int directThreadsCount = 0;
int diagThreadCount = 0;
var jobs = Enumerable.Range(0, 160);
ParallelOptions po = new ParallelOptions
{
MaxDegreeOfParallelism = Environment.ProcessorCount
};
Parallel.ForEach(jobs, po, delegate(int jobNr)
//Parallel.ForEach(jobs, delegate(int jobNr)
{
int threadsRemaining = Interlocked.Increment(ref concurrentThreads);
int heavyness = jobNr % 9;
//Give the processor and the garbage collector something to do...
List<Node> nodes = new List<Node>();
Node current = null;
//for (int y = 0; y < 1024 * 1024 * heavyness; y++)
for (int y = 0; y < 1024 * 24 * heavyness; y++)
{
current = new Node(current);
nodes.Add(current);
}
//*******************************
directThreadsCount = Process.GetCurrentProcess().Threads.Count;
//*******************************
threadsRemaining = Interlocked.Decrement(ref concurrentThreads);
Console.WriteLine("[Job {0} complete. {1} threads remaining but directThreadsCount == {2}",
jobNr, threadsRemaining, directThreadsCount);
});
Console.WriteLine("FINISHED");
Console.ReadLine();
}
}
}

Thread count growth when using Task Parallel Library

I'm using C# TPL and I'm having a problem with a producer/consumer code... for some reason, TPL doesn't reuse threads and keeps creating new ones without stopping
I made a simple example to demonstrate this behavior:
class Program
{
static BlockingCollection<int> m_Buffer = new BlockingCollection<int>(1);
static CancellationTokenSource m_Cts = new CancellationTokenSource();
static void Producer()
{
try
{
while (!m_Cts.IsCancellationRequested)
{
Console.WriteLine("Enqueuing job");
m_Buffer.Add(0);
Thread.Sleep(1000);
}
}
finally
{
m_Buffer.CompleteAdding();
}
}
static void Consumer()
{
Parallel.ForEach(m_Buffer.GetConsumingEnumerable(), Run);
}
static void Run(int i)
{
Console.WriteLine
("Job Processed\tThread: {0}\tProcess Thread Count: {1}",
Thread.CurrentThread.ManagedThreadId,
Process.GetCurrentProcess().Threads.Count);
}
static void Main(string[] args)
{
Task producer = new Task(Producer);
Task consumer = new Task(Consumer);
producer.Start();
consumer.Start();
Console.ReadKey();
m_Cts.Cancel();
Task.WaitAll(producer, consumer);
}
}
This code creates 2 tasks, producer and consumer. Produces adds 1 work item every second, and Consumer only prints out a string with information. I would assume that 1 consumer thread is enough in this situation, because tasks are processed much faster than they are being added to the queue, but what actually happens is that every second number of threads in the process grows by 1... as if TPL is creating new thread for every item
after trying to understand what's happening I also noticed another thing: even though BlockingCollection size is 1, after a while Consumer starts getting called in bursts, for example, this is how it starts:
Enqueuing job
Job Processed Thread: 4 Process Thread Count: 9
Enqueuing job
Job Processed Thread: 6 Process Thread Count: 9
Enqueuing job
Job Processed Thread: 5 Process Thread Count: 10
Enqueuing job
Job Processed Thread: 4 Process Thread Count: 10
Enqueuing job
Job Processed Thread: 6 Process Thread Count: 11
and this is how it's processing items less than a minute later:
Enqueuing job
Job Processed Thread: 25 Process Thread Count: 52
Enqueuing job
Enqueuing job
Job Processed Thread: 5 Process Thread Count: 54
Job Processed Thread: 5 Process Thread Count: 54
and because threads get disposed after finishing Parallel.ForEach loop (I don't show it in this example, but it was in the real project) I assumed that it has something to do with ForEach specifically... I found this artice http://reedcopsey.com/2010/01/26/parallelism-in-net-part-5-partitioning-of-work/, and I thought that my problem was caused by this default partitioner, so I took custom partitioner from TPL Examples that is feeding Consumer threads item one by one, and although it fixed the order of execution (got rid of delay)...
Enqueuing job
Job Processed Thread: 71 Process Thread Count: 140
Enqueuing job
Job Processed Thread: 12 Process Thread Count: 141
Enqueuing job
Job Processed Thread: 72 Process Thread Count: 142
Enqueuing job
Job Processed Thread: 38 Process Thread Count: 143
Enqueuing job
Job Processed Thread: 73 Process Thread Count: 143
Enqueuing job
Job Processed Thread: 21 Process Thread Count: 144
Enqueuing job
Job Processed Thread: 74 Process Thread Count: 145
...it didn't stop threads from growing
I know about ParallelOptions.MaxDegreeOfParallelism, but I still want to understand what's happening with TPL and why it creates hundreds of threads for no reason
in my project I a code that has to run for hours and read new data from database, put it into a BlockingCollections and have has data processed by other code, there's 1 new item about every 5 seconds and it takes from several milliseconds to almost a minute to process it, and after running for about 10 minutes, thread count reached over a 1000 threads

There are two things that together cause this behavior:
ThreadPool tries to use the optimal number of threads for your situation. But if one of the threads in the pool blocks, the pool sees this as if that thread wasn't doing any useful work and so it tends to create another thread soon after that. What this means is that if you have a lot of blocking, ThreadPool is really bad at guessing the optimal number of threads and it tends to create new threads until it reaches the limit.
Parallel.ForEach() trusts the ThreadPool to guess the correct number of threads, unless you set the maximum number of threads explicitly. Parallel.ForEach() was also primarily meant for bounded collections, not streams of data.
When you combine these two things with GetConsumingEnumerable(), what you get is that Parallel.ForEach() creates threads that are almost always blocked. The ThreadPool sees this, and, to try to keep the CPU utilized, creates more and more threads.
The correct solution here is to set MaxDegreeOfParallelism. If your computations are CPU-bound, the best value is most likely Environment.ProcessorCount. If they are IO-bound, you will have to find out the best value experimentally.
Another option, if you can use .Net 4.5, is to use TPL Dataflow. This library was made specifically to process streams of data, like you have, so it doesn't have the problems your code has. It's actually even better than that and doesn't use any threads at all when it's not processing anything currently.
Note: There is also a good reason why is a new thread created for each new item, but explaining that would require me to explain how Parallel.ForEach() works in more detail and I feel that's not necessary here.

Why does the Task Parallel Library have a 'hidden' 1 second timeout for scheduling tasks under certain conditions?

My laptop has 2 logical processors and I stumbled upon the scenario where if I schedule 2 tasks that take longer than 1 second without designating them long-running, subsequent tasks are started after 1 second has elapsed. It is possible to change this timeout?
I know normal tasks should be short-running - much shorter than a second if possible - I'm just wondering I am seeing hard-coded TPL behavior or if I can influence this behavior in any way other than designating tasks long-running.
This Console app method should demonstrate the behavior for a machine with any number of processors:
static void Main(string[] args)
{
var timer = new Stopwatch();
timer.Start();
int numberOfTasks = Environment.ProcessorCount;
var rudeTasks = new List<Task>();
var shortTasks = new List<Task>();
for (int index = 0; index < numberOfTasks; index++)
{
int capturedIndex = index;
rudeTasks.Add(Task.Factory.StartNew(() =>
{
Console.WriteLine("Starting rude task {0} at {1}ms", capturedIndex, timer.ElapsedMilliseconds);
Thread.Sleep(5000);
}));
}
for (int index = 0; index < numberOfTasks; index++)
{
int capturedIndex = index;
shortTasks.Add(Task.Factory.StartNew(() =>
{
Console.WriteLine("Short-running task {0} running at {1}ms", capturedIndex, timer.ElapsedMilliseconds);
}));
}
Task.WaitAll(shortTasks.ToArray());
Console.WriteLine("Finished waiting for short tasks at {0}ms", timer.ElapsedMilliseconds);
Task.WaitAll(rudeTasks.ToArray());
Console.WriteLine("Finished waiting for rude tasks at {0}ms", timer.ElapsedMilliseconds);
Console.ReadLine();
}
Here is the app's output on my 2 proc laptop:
Starting rude task 0 at 2ms
Starting rude task 1 at 2ms
Short-running task 0 running at 1002ms
Short-running task 1 running at 1002ms
Finished waiting for short tasks at 1002ms
Finished waiting for rude tasks at 5004ms
Press any key to continue . . .
The lines:
Short-running task 0 running at 1002ms
Short-running task 1 running at 1002ms
indicate that there is a 1 second timeout or something of that nature allowing the shorter-running tasks to get scheduled over the 'rude' tasks. That's what I'm inquiring about.

The behavior that you are seeing is not specific to the TPL, it's specific to the TPL's default scheduler. The scheduler is attempting to increase the number of threads so that those two that are running don't "hog" the CPU and choke out the others. It's also helpful in avoiding deadlock situations if the two that are running start and wait on Tasks themselves.
If you want to change the scheduling behavior, you might want to look into implementing your own TaskScheduler.

This is standard behavior for the threadpool scheduler. It tries to keep the number of active threads equal to the number of cores. But can't do the job really well when your tasks do a lot of blocking instead of running. Sleeping in your case. Twice a second it allows another thread to run to try to work down the backlog. Seems like you have a dual-core cpu.
The proper workaround is to use TaskCreationOptions.LongRunning so the scheduler uses a regular Thread instead of a threadpool thread. An improper workaround is to use ThreadPool.SetMinThreads. But you should perhaps focus on doing real work in your tasks, Sleep() is not a very good simulation of that.

The problem is it takes a while for the scheduler to start the new tasks as it tries to determine if a task is long-running. You can tell the TPL that a task is long running as a parameter of the task:
for (int index = 0; index < numberOfTasks; index++)
{
int capturedIndex = index;
rudeTasks.Add(Task.Factory.StartNew(() =>
{
Console.WriteLine("Starting rude task {0} at {1}ms", capturedIndex, timer.ElapsedMilliseconds);
Thread.Sleep(3000);
}, TaskCreationOptions.LongRunning));
}
Resulting in:
Starting rude task 0 at 11ms
Starting rude task 1 at 13ms
Starting rude task 2 at 15ms
Starting rude task 3 at 19ms
Short-running task 0 running at 45ms
Short-running task 1 running at 45ms
Short-running task 2 running at 45ms
Short-running task 3 running at 45ms
Finished waiting for short tasks at 46ms
Finished waiting for rude tasks at 3019ms

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.