How do Tasks in the Task Parallel Library affect ActivityID? - c#

Before using the Task Parallel Library, I have often used CorrelationManager.ActivityId to keep track of tracing/error reporting with multiple threads.
ActivityId is stored in Thread Local Storage, so each thread get's its own copy. The idea is that when you fire up a thread (activity), you assign a new ActivityId. The ActivityId will be written to the logs with any other trace information, making it possible to single out the trace information for a single 'Activity'. This is really useful with WCF as the ActivityId can be carried over to the service component.
Here is an example of what I'm talking about:
static void Main(string[] args)
{
ThreadPool.QueueUserWorkItem(new WaitCallback((o) =>
{
DoWork();
}));
}
static void DoWork()
{
try
{
Trace.CorrelationManager.ActivityId = Guid.NewGuid();
//The functions below contain tracing which logs the ActivityID.
CallFunction1();
CallFunction2();
CallFunction3();
}
catch (Exception ex)
{
Trace.Write(Trace.CorrelationManager.ActivityId + " " + ex.ToString());
}
}
Now, with the TPL, my understanding is that multiple Tasks share Threads. Does this mean that ActivityId is prone to being reinitialized mid-task (by another task)? Is there a new mechanism to deal with activity tracing?

I ran some experiments and it turns out the assumption in my question is incorrect - multiple tasks created with the TPL do not run on the same thread at the same time.
ThreadLocalStorage is safe to use with TPL in .NET 4.0, since a thread can only be used by one task at a time.
The assumption that tasks can share threads concurrently was based on an interview I heard about c# 5.0 on DotNetRocks (sorry, I can't remember which show it was) - so my question may (or may not) become relevant soon.
My experiment starts a number of tasks, and records how many tasks ran, how long they took, and how many threads were consumed. The code is below if anyone would like to repeat it.
class Program
{
static void Main(string[] args)
{
int totalThreads = 100;
TaskCreationOptions taskCreationOpt = TaskCreationOptions.None;
Task task = null;
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
Task[] allTasks = new Task[totalThreads];
for (int i = 0; i < totalThreads; i++)
{
task = Task.Factory.StartNew(() =>
{
DoLongRunningWork();
}, taskCreationOpt);
allTasks[i] = task;
}
Task.WaitAll(allTasks);
stopwatch.Stop();
Console.WriteLine(String.Format("Completed {0} tasks in {1} milliseconds", totalThreads, stopwatch.ElapsedMilliseconds));
Console.WriteLine(String.Format("Used {0} threads", threadIds.Count));
Console.ReadKey();
}
private static List<int> threadIds = new List<int>();
private static object locker = new object();
private static void DoLongRunningWork()
{
lock (locker)
{
//Keep a record of the managed thread used.
if (!threadIds.Contains(Thread.CurrentThread.ManagedThreadId))
threadIds.Add(Thread.CurrentThread.ManagedThreadId);
}
Guid g1 = Guid.NewGuid();
Trace.CorrelationManager.ActivityId = g1;
Thread.Sleep(3000);
Guid g2 = Trace.CorrelationManager.ActivityId;
Debug.Assert(g1.Equals(g2));
}
}
The output (of course this will depend on the machine) was:
Completed 100 tasks in 23097 milliseconds
Used 23 threads
Changing taskCreationOpt to TaskCreationOptions.LongRunning gave different results:
Completed 100 tasks in 3458 milliseconds
Used 100 threads

Please forgive my posting this as an answer as it is not really answer to your question, however, it is related to your question since it deals with CorrelationManager behavior and threads/tasks/etc. I have been looking at using the CorrelationManager's LogicalOperationStack (and StartLogicalOperation/StopLogicalOperation methods) to provide additional context in multithreading scenarios.
I took your example and modified it slightly to add the ability to perform work in parallel using Parallel.For. Also, I use StartLogicalOperation/StopLogicalOperation to bracket (internally) DoLongRunningWork. Conceptually, DoLongRunningWork does something like this each time it is executed:
DoLongRunningWork
StartLogicalOperation
Thread.Sleep(3000)
StopLogicalOperation
I have found that if I add these logical operations to your code (more or less as is), all of the logical operatins remain in sync (always the expected number of operations on stack and the values of the operations on the stack are always as expected).
In some of my own testing I found that this was not always the case. The logical operation stack was getting "corrupted". The best explanation I could come up with is that the "merging" back of the CallContext information into the "parent" thread context when the "child" thread exits was causing the "old" child thread context information (logical operation) to be "inherited" by another sibling child thread.
The problem might also be related to the fact that Parallel.For apparently uses the main thread (at least in the example code, as written) as one of the "worker threads" (or whatever they should be called in the parallel domain). Whenever DoLongRunningWork is executed, a new logical operation is started (at the beginning) and stopped (at the end) (that is, pushed onto the LogicalOperationStack and popped back off of it). If the main thread already has a logical operation in effect and if DoLongRunningWork executes ON THE MAIN THREAD, then a new logical operation is started so the main thread's LogicalOperationStack now has TWO operations. Any subsequent executions of DoLongRunningWork (as long as this "iteration" of DoLongRunningWork is executing on the main thread) will (apparently) inherit the main thread's LogicalOperationStack (which now has two operations on it, rather than just the one expected operation).
It took me a long time to figure out why the behavior of the LogicalOperationStack was different in my example than in my modified version of your example. Finally I saw that in my code I had bracketed the entire program in a logical operation, whereas in my modified version of your test program I did not. The implication is that in my test program, each time my "work" was performed (analogous to DoLongRunningWork), there was already a logical operation in effect. In my modified version of your test program, I had not bracketed the entire program in a logical operation.
So, when I modified your test program to bracket the entire program in a logical operation AND if I am using Parallel.For, I ran into exactly the same problem.
Using the conceptual model above, this will run successfully:
Parallel.For
DoLongRunningWork
StartLogicalOperation
Sleep(3000)
StopLogicalOperation
While this will eventually assert due to an apparently out of sync LogicalOperationStack:
StartLogicalOperation
Parallel.For
DoLongRunningWork
StartLogicalOperation
Sleep(3000)
StopLogicalOperation
StopLogicalOperation
Here is my sample program. It is similar to yours in that it has a DoLongRunningWork method that manipulates the ActivityId as well as the LogicalOperationStack. I also have two flavors of kicking of DoLongRunningWork. One flavor uses Tasks one uses Parallel.For. Each flavor can also be executed such that the whole parallelized operation is enclosed in a logical operation or not. So, there are a total of 4 ways to execute the parallel operation. To try each one, simply uncomment the desired "Use..." method, recompile, and run. UseTasks, UseTasks(true), and UseParallelFor should all run to completion. UseParallelFor(true) will assert at some point because the LogicalOperationStack does not have the expected number of entries.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
namespace CorrelationManagerParallelTest
{
class Program
{
static void Main(string[] args)
{
//UseParallelFor(true) will assert because LogicalOperationStack will not have expected
//number of entries, all others will run to completion.
UseTasks(); //Equivalent to original test program with only the parallelized
//operation bracketed in logical operation.
////UseTasks(true); //Bracket entire UseTasks method in logical operation
////UseParallelFor(); //Equivalent to original test program, but use Parallel.For
//rather than Tasks. Bracket only the parallelized
//operation in logical operation.
////UseParallelFor(true); //Bracket entire UseParallelFor method in logical operation
}
private static List<int> threadIds = new List<int>();
private static object locker = new object();
private static int mainThreadId = Thread.CurrentThread.ManagedThreadId;
private static int mainThreadUsedInDelegate = 0;
// baseCount is the expected number of entries in the LogicalOperationStack
// at the time that DoLongRunningWork starts. If the entire operation is bracketed
// externally by Start/StopLogicalOperation, then baseCount will be 1. Otherwise,
// it will be 0.
private static void DoLongRunningWork(int baseCount)
{
lock (locker)
{
//Keep a record of the managed thread used.
if (!threadIds.Contains(Thread.CurrentThread.ManagedThreadId))
threadIds.Add(Thread.CurrentThread.ManagedThreadId);
if (Thread.CurrentThread.ManagedThreadId == mainThreadId)
{
mainThreadUsedInDelegate++;
}
}
Guid lo1 = Guid.NewGuid();
Trace.CorrelationManager.StartLogicalOperation(lo1);
Guid g1 = Guid.NewGuid();
Trace.CorrelationManager.ActivityId = g1;
Thread.Sleep(3000);
Guid g2 = Trace.CorrelationManager.ActivityId;
Debug.Assert(g1.Equals(g2));
//This assert, LogicalOperation.Count, will eventually fail if there is a logical operation
//in effect when the Parallel.For operation was started.
Debug.Assert(Trace.CorrelationManager.LogicalOperationStack.Count == baseCount + 1, string.Format("MainThread = {0}, Thread = {1}, Count = {2}, ExpectedCount = {3}", mainThreadId, Thread.CurrentThread.ManagedThreadId, Trace.CorrelationManager.LogicalOperationStack.Count, baseCount + 1));
Debug.Assert(Trace.CorrelationManager.LogicalOperationStack.Peek().Equals(lo1), string.Format("MainThread = {0}, Thread = {1}, Count = {2}, ExpectedCount = {3}", mainThreadId, Thread.CurrentThread.ManagedThreadId, Trace.CorrelationManager.LogicalOperationStack.Peek(), lo1));
Trace.CorrelationManager.StopLogicalOperation();
}
private static void UseTasks(bool encloseInLogicalOperation = false)
{
int totalThreads = 100;
TaskCreationOptions taskCreationOpt = TaskCreationOptions.None;
Task task = null;
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
if (encloseInLogicalOperation)
{
Trace.CorrelationManager.StartLogicalOperation();
}
Task[] allTasks = new Task[totalThreads];
for (int i = 0; i < totalThreads; i++)
{
task = Task.Factory.StartNew(() =>
{
DoLongRunningWork(encloseInLogicalOperation ? 1 : 0);
}, taskCreationOpt);
allTasks[i] = task;
}
Task.WaitAll(allTasks);
if (encloseInLogicalOperation)
{
Trace.CorrelationManager.StopLogicalOperation();
}
stopwatch.Stop();
Console.WriteLine(String.Format("Completed {0} tasks in {1} milliseconds", totalThreads, stopwatch.ElapsedMilliseconds));
Console.WriteLine(String.Format("Used {0} threads", threadIds.Count));
Console.WriteLine(String.Format("Main thread used in delegate {0} times", mainThreadUsedInDelegate));
Console.ReadKey();
}
private static void UseParallelFor(bool encloseInLogicalOperation = false)
{
int totalThreads = 100;
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
if (encloseInLogicalOperation)
{
Trace.CorrelationManager.StartLogicalOperation();
}
Parallel.For(0, totalThreads, i =>
{
DoLongRunningWork(encloseInLogicalOperation ? 1 : 0);
});
if (encloseInLogicalOperation)
{
Trace.CorrelationManager.StopLogicalOperation();
}
stopwatch.Stop();
Console.WriteLine(String.Format("Completed {0} tasks in {1} milliseconds", totalThreads, stopwatch.ElapsedMilliseconds));
Console.WriteLine(String.Format("Used {0} threads", threadIds.Count));
Console.WriteLine(String.Format("Main thread used in delegate {0} times", mainThreadUsedInDelegate));
Console.ReadKey();
}
}
}
This whole issue of if LogicalOperationStack can be used with Parallel.For (and/or other threading/Task constructs) or how it can be used probably merits its own question. Maybe I will post a question. In the meantime, I wonder if you have any thoughts on this (or, I wonder if you had considered using LogicalOperationStack since ActivityId appears to be safe).
[EDIT]
See my answer to this question for more information about using LogicalOperationStack and/or CallContext.LogicalSetData with some of the various Thread/ThreadPool/Task/Parallel contstructs.
See also my question here on SO about LogicalOperationStack and Parallel extensions:
Is CorrelationManager.LogicalOperationStack compatible with Parallel.For, Tasks, Threads, etc
Finally, see also my question here on Microsoft's Parallel Extensions forum:
http://social.msdn.microsoft.com/Forums/en-US/parallelextensions/thread/7c5c3051-133b-4814-9db0-fc0039b4f9d9
In my testing it looks like Trace.CorrelationManager.LogicalOperationStack can become corrupted when using Parallel.For or Parallel.Invoke IF you start a logical operation in the main thread and then start/stop logical operations in the delegate. In my tests (see either of the two links above) the LogicalOperationStack should always have exactly 2 entries when DoLongRunningWork is executing (if I start a logical operation in the main thread before kicking of DoLongRunningWork using various techniques). So, by "corrupted" I mean that the LogicalOperationStack will eventually have many more than 2 entries.
From what I can tell, this is probably because Parallel.For and Parallel.Invoke use the main thread as one of the "worker" threads to perform the DoLongRunningWork action.
Using a stack stored in CallContext.LogicalSetData to mimic the behavior of the LogicalOperationStack (similar to log4net's LogicalThreadContext.Stacks which is stored via CallContext.SetData) yields even worse results. If I am using such a stack to maintain context, it becomes corrupted (i.e. does not have the expected number of entries) in almost all of the scenarios where I have a "logical operation" in the main thread and a logical operation in each iteration/execution of the DoLongRunningWork delegate.

Related

Why do I seem to have so few threads

I am trying to understand some code (for performance reasons) that is processing tasks from a queue. The code is C# .NET Framework 4.8 (And I didn't write this stuff)
I have this code creating a timer that from what I can tell should use a new thread every 10 seconds
_myTimer = new Timer(new TimerCallback(OnTimerGo), null, 0, 10000 );
Inside the onTimerGo it calls DoTask() inside of DoTask() it grabs a task off a queue and then does this
System.Threading.Tasks.Task.Factory.StartNew(ProcessTask, task).ContinueWith(c => DoTask());
My reading of this is that a new thread should start running OnTimerGo every 10 seconds, and that thread should in parralel run ProcessTask on tasks as fast as it can get them from the queue.
I inserted some code to call ThreadPool.GetMaxThreads and ThreadPool.GetAvailableThreads to figure out how many threads were in use. Then I queued up 10,000 things for it to do and let it loose.
I never see more then 4 threads in use at a time. This is running on a c4.4xlarge ec2 instance... so 16 vCPU 30 gb mem. The get max and available return over 2k. So I would expect more threads. By looking at the logging I can see that a total of 50ish different threads (by thread id) end up doing the work over the course of 20 minutes. Since the timer is set to every 10 seconds, I would expect 100 threads to be doing the work (or for it to finish sooner).
Looking at the code, the only time a running thread should stop is if it asks for a task from the queue and doesn't get one. Some other logging shows that there are never more than 2 tasks running in a thread. This is probably because they work is pretty fast. So the threads shouldn't be exiting, and I can even see from the logs that many of them end up doing as many as 500 tasks over the 20 minutes.
so... what am I missing here. Are the ThreadPool.GetMaxThreads and ThreadPool.GetAvailableThreads not accurate if run from inside a thread? Is something shutting down some of the threads while letting others keep going?
EDIT: adding more code
public static void StartScheduler()
{
lock (TimerLock)
{
if (_timerShutdown == false)
{
_myTimer = new Timer(new TimerCallback(OnTimerGo), null, 0, 10 );
const int numberOfSecondsPerMinute = 60;
const int margin = 1;
var pollEventsPerMinute = (numberOfSecondsPerMinute/SystemPreferences.TaskPollingIntervalSeconds);
_numberOfTimerCallsForHeartbeat = pollEventsPerMinute - margin;
}
}
}
private static void OnTimerGo(object state)
{
try
{
_lastTimer = DateTime.UtcNow;
var currentTickCount = Interlocked.Increment(ref _timerCallCount);
if (currentTickCount == _numberOfTimerCallsForHeartbeat)
{
Interlocked.Exchange(ref _timerCallCount, 0);
MonitoringTools.SendHeartbeatMetric(Heartbeat);
}
CheckForTasks();
}
catch (Exception e)
{
Log.Warn("Scheduler: OnTimerGo exception", e);
}
}
public static void CheckForTasks()
{
try
{
if (DoTask())
_lastStart = DateTime.UtcNow;
_lastStartOrCheck = DateTime.UtcNow;
}
catch (Exception e)
{
Log.Error("Unexpected exception checking for tasks", e);
}
}
private static bool DoTask()
{
Func<DataContext, bool> a = db =>
{
var mtid = Thread.CurrentThread.ManagedThreadId;
int totalThreads = Process.GetCurrentProcess().Threads.Count;
int maxWorkerThreads;
int maxPortThreads;
ThreadPool.GetMaxThreads(out maxWorkerThreads, out maxPortThreads);
int AvailableWorkerThreads;
int AvailablePortThreads;
ThreadPool.GetAvailableThreads(out AvailableWorkerThreads, out AvailablePortThreads);
int usedWorkerThreads = maxWorkerThreads - AvailableWorkerThreads;
string usedThreadMessage = $"Thread {mtid}: Threads in Use count: {usedWorkerThreads}";
Log.Info(usedThreadMessage);
var taskTypeAndTasks = GetTaskListTypeAndTasks();
var task = GetNextTask(db, taskTypeAndTasks.Key, taskTypeAndTasks.Value);
if (_timerShutdown)
{
Log.Debug("Task processing stopped.");
return false;
}
if (task == null)
{
Log.DebugFormat("DoTask: Idle in thread {0} ({1} tasks running)", mtid, _processingTaskLock);
return false;
}
Log.DebugFormat("DoTask: starting task {2}:{0} on thread {1}", task.Id, mtid, task.Class);
System.Threading.Tasks.Task.Factory.StartNew(ProcessTask, task).ContinueWith(c => DoTask());
Log.DebugFormat("DoTask: done ({0})", mtid);
return true;
};
return DbExtensions.WithDbWrite(ctx => a(ctx));
}
The Task.Factory.StartNew by default doesn't create a new thread. It borrows a thread from the ThreadPool instead.
The ThreadPool is intended as a small pool of reusable threads, to help amortize the cost of running frequent and lightweight operations like callbacks, continuations, event handers etc. Depleting the ThreadPool from available workers by scheduling too much work on it, results in a situation that is called saturation or starvation. And as you've already figured out, it's not a happy situation to be.
You can prevent the saturation of the ThreadPool by running your long-running work on dedicated threads instead of ThreadPool threads. This can be done by passing the TaskCreationOptions.LongRunning as argument to the Task.Factory.StartNew:
_ = Task.Factory.StartNew(ProcessTask, task, CancellationToken.None,
TaskCreationOptions.LongRunning,
TaskScheduler.Default).ContinueWith(t => DoTask(), CancellationToken.None,
TaskContinuationOptions.ExecuteSynchronously,
TaskScheduler.Default);
The above code schedules the ProcessTask(task) on a new thread, and after the invocation is completed either successfully or unsuccessfully, the DoTask will be invoked on the same thread. Finally the thread will be terminated. The discard _ signifies that the continuation Task (the task returned by the ContinueWith) is fire-and-forget. Which, to put it mildly, is architecturally suspicious. šŸ˜ƒ
In case you are wondering why I pass the TaskScheduler.Default explicitly as argument to StartNew and ContinueWith, check out this link.
My reading of this is that a new thread should start running OnTimerGo every 10 seconds, and that thread should in parralel run ProcessTask on tasks as fast as it can get them from the queue.
Well, that is definitely not what's happening. It's a lot of uncertainty about your code, but it's clear that another DoTask is starting AFTER ProcessTask completes. And that is not parallel execution. Your line of code is this
System.Threading.Tasks.Task.Factory.StartNew(ProcessTask, task).ContinueWith(c => DoTask());
I suggest you to start another DoTask right there like this:
System.Threading.Tasks.Task.Factory.StartNew(ProcessTask, task);
DoTask();
Make sure your code is ready for parallel execution, though.

Limiting the number of parallel task with SemaphoreSlim - why does it work?

in MS Docu you can read about SemaphoreSlim:
ā€žRepresents a lightweight alternative to Semaphore that limits the number of threads that can access a resource or pool of resources concurrently.ā€œ
https://learn.microsoft.com/en-us/dotnet/api/system.threading.semaphoreslim?view=net-5.0
In my understanding a Task is different from Thread. Task is higher level than Thread. Different tasks can run on the same thread. Or a task can be continued on another thread than it was started on.
(Compare: "server-side applications in .NET using asynchrony will use very few threads without limiting themselves to that. If everything really can be served by a single thread, it may well be - if you never have more than one thing to do in terms of physical processing, then that's fine."
from in C# how to run method async in the same thread)
IMO if you put this information together, the conclusion is that you canā€™t limit the number of Tasks running in parallel with the use of a semaphore slim, butā€¦
there are other texts that give this kind of advice (How to limit the amount of concurrent async I/O operations?, see ā€œYou can definitely do thisā€¦ā€)
if Iā€™m executing this code on my machine it seems it IS possible. If I work with different numbers for _MaxDegreeOfParallelism and different ranges of numbers, _RunningTasksCount doesnā€™t exceed the limit that is given by MaxDegreeOfParallelism.
Can somebody provide me some information to clearify?
class Program
{
static void Main(string[] args)
{
Console.WriteLine("Hello World!");
IRunner runner = new RunnerSemaphore();
runner.Run();
Console.WriteLine("Hit any key to close...");
Console.ReadLine();
}
}
public class RunnerSemaphore : IRunner
{
private readonly SemaphoreSlim _ConcurrencySemaphore;
private List<int> _Numbers;
private int _MaxDegreeOfParallelism = 3;
private object _RunningTasksLock = new object();
private int _RunningTasksCount = 0;
public RunnerSemaphore()
{
_ConcurrencySemaphore = new SemaphoreSlim(_MaxDegreeOfParallelism);
_Numbers = _Numbers = Enumerable.Range(1, 100).ToList();
}
public void Run()
{
RunAsync().Wait();
}
private async Task RunAsync()
{
List<Task> allTasks = new List<Task>();
foreach (int number in _Numbers)
{
var task = Task.Run
(async () =>
{
await _ConcurrencySemaphore.WaitAsync();
bool isFast = number != 1;
int delay = isFast ? 200 : 10000;
Console.WriteLine($"Start Work {number}\tManagedThreadId {Thread.CurrentThread.ManagedThreadId}\tRunning {IncreaseTaskCount()} tasks");
await Task.Delay(delay).ConfigureAwait(false);
Console.WriteLine($"End Work {number}\tManagedThreadId {Thread.CurrentThread.ManagedThreadId}\tRunning {DecreaseTaskCount()} tasks");
})
.ContinueWith((t) =>
{
_ConcurrencySemaphore.Release();
});
allTasks.Add(task);
}
await Task.WhenAll(allTasks.ToArray());
}
private int IncreaseTaskCount()
{
int taskCount;
lock (_RunningTasksLock)
{
taskCount = ++ _RunningTasksCount;
}
return taskCount;
}
private int DecreaseTaskCount()
{
int taskCount;
lock (_RunningTasksLock)
{
taskCount = -- _RunningTasksCount;
}
return taskCount;
}
}
Represents a lightweight alternative to Semaphore that limits the number of threads that can access a resource or pool of resources concurrently.
Well, that was a perfectly fine description when SemaphoreSlim was first introduced - it was just a lightweight Semaphore. Since that time, it has gotten new methods (i.e., WaitAsync) that enable it to act like an asynchronous synchronization primitive.
In my understanding a Task is different from Thread. Task is higher level than Thread. Different tasks can run on the same thread. Or a task can be continued on another thread than it was started on.
This is true for what I call "Delegate Tasks". There's also a completely different kind of Task that I call "Promise Tasks". Promise tasks are similar to promises (or "futures") in other languages (e.g., JavaScript), and they just represent the completion of some event. Promise tasks do not "run" anywhere; they just complete based on some future event (usually via a callback).
async methods always return promise tasks. The code in an asynchronous method is not actually run as part of the task; the task itself only represents the completion of the async method. I recommend my async intro for more information about async and how the code portions are scheduled.
if you put this information together, the conclusion is that you canā€™t limit the number of Tasks running in parallel with the use of a semaphore slim
This is personal preference, but I try to be very careful about terminology, precisely to avoid problems like this question. Delegate tasks may run in parallel, e.g., Parallel. Promise tasks do not "run", and they don't run in "parallel", but you can have multiple concurrent promise tasks that are all in progress. And SemaphoreSlim's WaitAsync is a perfect match for limiting that kind of concurrency.
You may wish to read about Stephen Toub's AsyncSemaphore (and other articles in that series). It's not the same implementation as SemaphoreSlim, but behaves essentially the same as far as promise tasks are concerned.

Thread Local Storage working principle

This is an example about Thread Local Storage (TLS) from Apress parallel programming book. I know that if we have 4 cores computer 4 thread can run parallel in same time. In this example we create 10 task and we suppose that have 4 cores computer. Each Thread local storage live in on thread so when start 10 task parallel only 4 thread perform. And We have 4 TLS so 10 task try to change 4 Thread local storage object. i want to ask how Tls prevent data race problem when thread count < Task count ??
using System;
using System.Threading;
using System.Threading.Tasks;
namespace Listing_04
{
class BankAccount
{
public int Balance
{
get;
set;
}
}
class Listing_04
{
static void Main(string[] args)
{
// create the bank account instance
BankAccount account = new BankAccount();
// create an array of tasks
Task<int>[] tasks = new Task<int>[10];
// create the thread local storage
ThreadLocal<int> tls = new ThreadLocal<int>();
for (int i = 0; i < 10; i++)
{
// create a new task
tasks[i] = new Task<int>((stateObject) =>
{
// get the state object and use it
// to set the TLS data
tls.Value = (int)stateObject;
// enter a loop for 1000 balance updates
for (int j = 0; j < 1000; j++)
{
// update the TLS balance
tls.Value++;
}
// return the updated balance
return tls.Value;
}, account.Balance);
// start the new task
tasks[i].Start();
}
// get the result from each task and add it to
// the balance
for (int i = 0; i < 10; i++)
{
account.Balance += tasks[i].Result;
}
// write out the counter value
Console.WriteLine("Expected value {0}, Balance: {1}",
10000, account.Balance);
// wait for input before exiting
Console.WriteLine("Press enter to finish");
Console.ReadLine();
}
}
}
We have 4 TLS so 10 task try to change 4 Thread local storage object
In your example, you could have anywhere between 1 and 10 TLS slots. This is because a) you are not managing your threads explicitly and so the tasks are executed using the thread pool, and b) the thread pool creates and destroys threads over time according to demand.
A loop of only 1000 iterations will completely almost instantaneously. So it's likely all ten of your tasks will get through the thread pool before the thread pool decides a work item has been waiting long enough to justify adding any new threads. But there is no guarantee of this.
Some important parts of the documentation include these statements:
By default, the minimum number of threads is set to the number of processors on a system
and
When demand is low, the actual number of thread pool threads can fall below the minimum values.
In other words, on your four-core system, the default minimum number of threads is four, but the actual number of threads active in the thread pool could in fact be less than that. And if the tasks take long enough to execute, the number of active threads could rise above that.
The biggest thing to keep in mind here is that using TLS in the context of a thread pool is almost certainly the wrong thing to do.
You use TLS when you have control over the threads, and you want a thread to be able to maintain some data private or unique to that thread. That's the opposite of what happens when you are using the thread pool. Even in the simplest case, multiple tasks can use the same thread, and so would wind up sharing TLS. And in more complicated scenarios, such as when using await, a single task could wind up executed in different threads, and so that one task could wind up using different TLS values depending on what thread is assigned to that task at that moment.
how Tls prevent data race problem when thread count < Task count ??
That depends on what "data race problem" you're talking about.
The fact is, the code you posted is filled with problems that are at the very least odd, if not outright wrong. For example, you are passing account.Balance as the initial value for each task. But why? This value is evaluated when you create the task, before it could ever be modified later, so what's the point of passing it?
And if you thought you were passing whatever the current value is when the task starts, that seems like that would be wrong too. Why would it be valid to make the starting value for a given task vary according to how many tasks had already completed and been accounted for in your later loop? (To be clear: that's not what's happeningā€¦but even if it were, it'd be a strange thing to do.)
Beyond all that, it's not clear what you thought using TLS here would accomplish anyway. When each task starts, you reinitialize the TLS value to 0 (i.e. the value of account.Balance that you've passed to the Task<int> constructor). So no thread involved ever sees a value other than 0 during the context of executing any given task. A local variable would accomplish exactly the same thing, without the overhead of TLS and without confusing anyone who reads the code and tries to figure out why TLS was used when it adds no value to the code.
So, does TLS solve some sort of "data race problem"? Not in this example, it doesn't appear to. So asking how it does that is impossible to answer. It doesn't do that, so there is no "how".
For what it's worth, I modified your example slightly so that it would report the individual threads that were assigned to the tasks. I found that on my machine, the number of threads used varied between two and eight. This is consistent with my eight-core machine, with the variation due to how much the first thread in the pool can get done before the pool has initialized additional threads and assigned tasks to them. Most commonly, I would see the first thread completing between three and five of the tasks, with the remaining tasks handled by remaining individual threads.
In each case, the thread pool created eight threads as soon as the tasks were started. But most of the time, at least one of those threads wound up unused, because the other threads were able to complete the tasks before the pool was saturated. That is, there is overhead in the thread pool just managing the tasks, and in your example the tasks are so inexpensive that this overhead allows one or more thread pool threads to finish one task before the thread pool needs that thread for another.
I've copied that version below. Note that I also added a delay between trial iterations, to allow the thread pool to terminate the threads it created (on my machine, this took 20 seconds, hence the delay time hard-codedā€¦you can see the threads being terminated in the debugger output).
static void Main(string[] args)
{
while (_PromptContinue())
{
// create the bank account instance
BankAccount account = new BankAccount();
// create an array of tasks
Task<int>[] tasks = new Task<int>[10];
// create the thread local storage
ThreadLocal<int> tlsBalance = new ThreadLocal<int>();
ThreadLocal<(int Id, int Count)> tlsIds = new ThreadLocal<(int, int)>(
() => (Thread.CurrentThread.ManagedThreadId, 0), true);
for (int i = 0; i < 10; i++)
{
int k = i;
// create a new task
tasks[i] = new Task<int>((stateObject) =>
{
// get the state object and use it
// to set the TLS data
tlsBalance.Value = (int)stateObject;
(int id, int count) = tlsIds.Value;
tlsIds.Value = (id, count + 1);
Console.WriteLine($"task {k}: thread {id}, initial value {tlsBalance.Value}");
// enter a loop for 1000 balance updates
for (int j = 0; j < 1000; j++)
{
// update the TLS balance
tlsBalance.Value++;
}
// return the updated balance
return tlsBalance.Value;
}, account.Balance);
// start the new task
tasks[i].Start();
}
// Make sure this thread isn't busy at all while the thread pool threads are working
Task.WaitAll(tasks);
// get the result from each task and add it to
// the balance
for (int i = 0; i < 10; i++)
{
account.Balance += tasks[i].Result;
}
// write out the counter value
Console.WriteLine("Expected value {0}, Balance: {1}", 10000, account.Balance);
Console.WriteLine("{0} thread ids used: {1}",
tlsIds.Values.Count,
string.Join(", ", tlsIds.Values.Select(t => $"{t.Id} ({t.Count})")));
System.Diagnostics.Debug.WriteLine("done!");
_Countdown(TimeSpan.FromSeconds(20));
}
}
private static void _Countdown(TimeSpan delay)
{
System.Diagnostics.Stopwatch sw = System.Diagnostics.Stopwatch.StartNew();
TimeSpan remaining = delay - sw.Elapsed,
sleepMax = TimeSpan.FromMilliseconds(250);
int cchMax = $"{delay.TotalSeconds,2:0}".Length;
string format = $"\r{{0,{cchMax}:0}}", previousText = null;
while (remaining > TimeSpan.Zero)
{
string nextText = string.Format(format, remaining.TotalSeconds);
if (previousText != nextText)
{
Console.Write(format, remaining.TotalSeconds);
previousText = nextText;
}
Thread.Sleep(remaining > sleepMax ? sleepMax : remaining);
remaining = delay - sw.Elapsed;
}
Console.Write(new string(' ', cchMax));
Console.Write('\r');
}
private static bool _PromptContinue()
{
Console.Write("Press Esc to exit, any other key to proceed: ");
try
{
return Console.ReadKey(true).Key != ConsoleKey.Escape;
}
finally
{
Console.WriteLine();
}
}

Thread synchronization in the Task Parallel Library (TPL)

I am learning TPL and stuck with a doubt. It is only for learning purpose and I hope people will guide me in the correct direction.
I want only one thread to access the variable sum at one time so that it does not get overwritten.
The code I have is below.
using System;
using System.Threading.Tasks;
class ThreadTest
{
private Object thisLock = new Object();
static int sum = 0;
public void RunMe()
{
lock (thisLock)
{
sum = sum + 1;
}
}
static void Main()
{
ThreadTest b = new ThreadTest();
Task t1 = new Task(()=>b.RunMe());
Task t2= new Task(() => b.RunMe());
t1.Start();
t2.Start();
Task.WaitAll(t1, t2);
Console.WriteLine(sum.ToString());
Console.ReadLine();
}
}
Question -Am i right in this code ?
Question-Can I do it without lock because I read it somewhere that it should be avoided as it does not allow task to communicate with each other.I have seen some examples with async and await but I am using .Net 4.0 .
Thanks
Am i right in this code
Implementation wise Yes, but understanding wise No, as you have mixed up the new and old world while trying to implement the logic, let me try to explain in detail.
Task t1 = new Task(()=>b.RunMe()); doesn't mean as expected as in case of Thread API a new thread every time
Task API will invoke a thread pool thread, so chances are two Task objects t1,t2, gets executed on same thread most of the times for a short running logic and there's never a race condition, which needs an explicit lock, while trying to update the shared object
Better way to prevent race condition for Sum object would be Interlocked.Increment(ref sum), which is a thread safe mechanism to do basic operations on primitive types
For the kind of operation you are doing a better API would be Parallel.For, instead of creating a separate Task, the benefit would be you can run any number of such increment operations with minimal effort, instead of creating a Separate Task and it automatically blocks the Main thread, so your code shall look like:
using System;
using System.Threading.Tasks;
class ThreadTest
{
public static int sum;
}
static void Main()
{
Parallel.For(0, 1, i =>
{
// Some thread instrumentation
Console.WriteLine("i = {0}, thread = {1}", i,
Thread.CurrentThread.ManagedThreadId);
Interlocked.Increment(ref ThreadTest.sum);
});
Console.WriteLine(ThreadTest.sum.ToString());
Console.ReadLine();
}
}
While using the Thread instrumentation you will find that chances are that for two loops, 0,1, managed thread id is same, thus obviating the need for thread safety as suggested earlier
Answer 1:
This is threadsafe for the code that you posted.
However, as Sam has pointed out, this is not currently threadsafe in the general case because the field being incremented is static, but the locking object is not static.
This means that two separate instances of ThreadTest could be created on two separate threads, and then RunMe() could be called from those threads and because each instance has a separate locking object, the locking wouldn't work.
The solution here is to make the locking object static too.
Answer 2:
You can do this without explicit locking using Interlocked.Increment():
public void RunMe()
{
Interlocked.Increment(ref sum);
}
Now to the point, as I was adding some unhappy comments about downvotes, that have no reasons:).
Your scenario is working and is classic multithreading usage.
Classic because of using system lock, that is lock are actually WinAPI locks of the OS, so in order to synchronize, the code has to manage to ring down to the OS and back and of course lose some time with switching threads as some contention may happen especially if you would access RunMe multiple times in each thread running given task or created even more concurrent tasks thank 2.
Please try look on atomic operations.
For your scenario it would work very well, Interlocked.Increment(ref sum).
From there you have to restrain yourself from directly accessing the sum, but that is not a problem, because the Increment method is returning latest result.
Another option is to use SpinLock, that is IF YOU OPERATION IS REALLY FAST.
NEVER ON something ,like Console.WriteLine or any other system operations or long running calculations etc.
Here Inrelocked examples:
using System;
using System.Threading;
using System.Threading.Tasks;
class ThreadTest
{
/// <summary> DO NOT TOUCH ME DIRECTLY </summary>
private static int sum/* = 0 zero is default*/;
private static int Add(int add) => Interlocked.Add(ref sum, add);
private static int Increment() => Interlocked.Increment(ref sum);
private static int Latest() => Interlocked.Add(ref sum, 0);
private static void RunMe() => Increment();
static void Main()
{
Task t1 = new Task(RunMe);
Task t2 = new Task(RunMe);
t1.Start();
t2.Start();
Task.WaitAll(t1, t2);
Console.WriteLine(Latest().ToString());
Console.ReadLine();
}
}

Why does await Task take considerably longer than return Task

I have made an interesting observation which I would like to fully understand.
The easiest way to explain this is by capturing it with this little sample console application:
namespace AsyncAwaitTestApp
{
using System;
using System.Diagnostics;
using System.Threading.Tasks;
class Program
{
static void Main(string[] args)
{
var timer = new Stopwatch();
// First Run:
// -------------------------
Console.WriteLine("Running GiveMeATask in parallel...");
timer.Start();
Parallel.For(0, 500, (i) =>
{
SillyClass.GiveMeATask().Wait();
});
timer.Stop();
Console.WriteLine($"GiveMeATask run completed. Total time: {timer.Elapsed.TotalSeconds} seconds.");
// Second Run:
// -------------------------
Console.WriteLine("-------------------------------");
Console.WriteLine("Running AwaitATask in parallel...");
timer.Restart();
Parallel.For(0, 500, (i) =>
{
SillyClass.AwaitATask().Wait();
});
timer.Stop();
Console.WriteLine($"AwaitATask run completed. Total time: {timer.Elapsed.TotalSeconds} seconds.");
Console.ReadLine();
}
}
public class SillyClass
{
public static TimeSpan SomeTime = TimeSpan.FromSeconds(3);
public static Task GiveMeATask()
{
return Task.Delay(SomeTime);
}
public static async Task AwaitATask()
{
await Task.Delay(SomeTime);
}
}
}
It is also available as a Gist.
SillyClass has two methods which both return a Task. Inside both methods I have a Task.Delay of 3 seconds to simulate an expensive network or IO bound operation.
The first method GiveMeATask simply returns the task from Task.Delay. The second method AwaitATask awaits the Task.Delay with the async/await keywords.
Both methods get called from Parallel.For to simulate concurrent calls.
What I found interesting is that the first run always takes about double the time as the second run, no matter how many times I repeat it:
As far as I understand the async/await pattern has to capture a synchronisation context, which adds a little bit of overhead, but I would be surprised if that is the reason?
I also understand that in certain applications such as a WPF application or an ASP.NET application where only 1 thread can access the UI/Request context an synchronously blocking .Wait() can cause deadlocks, but this is not the case for console applications and evidently I don't have a deadlock here, because both runs complete just fine.
So I am clearly not understanding the full picture of the above and would appreciate someone who can shed more light on this?

Categories