Differences between Multithreading and Async

Note: Please read to the end before marking as duplicate. I've read the other answers, and they don't seem to answer my question.
I've seen various pictures and people point out and say that multithreading is different from asynchronous programming, by giving various analogies to restaurant workers and the like. But I've yet to see the difference with an actual example.
I tried this in C#:
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
namespace AsyncTest
{
class Program
{
static void RunSeconds(double seconds)
{
int ms = (int)(seconds * 1000);
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
Console.WriteLine($"Thread started to run for {seconds} seconds");
Thread.Sleep(ms);
stopwatch.Stop();
Console.WriteLine($"Stopwatch passed {stopwatch.ElapsedMilliseconds} ms.");
}
static async Task RunSecondsAsync(double seconds)
{
int ms = (int)(seconds * 1000);
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
Console.WriteLine($"Thread started to run for {seconds} seconds");
await Task.Run(() => Thread.Sleep(ms));
stopwatch.Stop();
Console.WriteLine($"Stopwatch passed {stopwatch.ElapsedMilliseconds} ms.");
}
static void RunSecondsThreaded(double seconds)
{
Thread th = new Thread(() => RunSeconds(seconds));
th.Start();
}
static async Task Main()
{
Console.WriteLine("Synchronous:");
RunSeconds(2.5); RunSeconds(2);
Console.WriteLine("\nAsynchronous:");
Task t1 = RunSecondsAsync(2.5); Task t2 = RunSecondsAsync(2);
await t1; await t2;
Console.WriteLine("\nMultithreading:");
RunSecondsThreaded(2.5); RunSecondsThreaded(2);
}
}
}
Results:
Synchronous:
Thread started to run for 2.5 seconds
Stopwatch passed 2507 ms.
Thread started to run for 2 seconds
Stopwatch passed 2001 ms.
Asynchronous:
Thread started to run for 2.5 seconds
Thread started to run for 2 seconds
Stopwatch passed 2002 ms.
Stopwatch passed 2554 ms.
Multithreading:
Thread started to run for 2.5 seconds
Thread started to run for 2 seconds
Stopwatch passed 2000 ms.
Stopwatch passed 2501 ms.
They yielded essentially the same results, behaviour-wise. So when and what exactly would I find different in the behaviour of a multithreaded program vs an asynchronous one?
I have various other issues to resolve:
In this image, for example:
What I don't get is that when you run an asynchronous program, it behaves practically identically to a multithreaded one, in that it seems to spend a similar amount of time. By the image above, it's addressing the asynchronous task in "breaks". If it does this, shouldn't it take longer for the asynchronous task to complete?
Say I run a task asynchronously that would normally take 3 seconds to complete synchronously (blocking everything else in the meantime). Shouldn't I expect it to take much longer than 3 seconds, given that the program does other tasks on the side while taking breaks from my original task?
So why does it often take a similar amount of time asynchronously (i.e. the usual 3 seconds)? And why does the program become "responsive": if the task is not being done on a separate thread, why does working on the task while working on other tasks on the side take only the expected 3 seconds?
The problem I have with the examples using workers in a restaurant (see top answer), is that in a restaurant, the cooking is done by the oven. In a computer, this analogy doesn't make much sense, as it's not clear why the oven isn't being treated as a separate "thread" but the people/workers are.
Furthermore, does a multithreaded application use more memory? And if it does, is it possible to create a simple application (ideally as similar to the one above) proving that it does?
Bit of a lengthy question, but the differences between multithreading and asynchronous programming are far from clear to me.

You shouldn't use Thread.Sleep in async code; use
await Task.Delay(1000);
instead.
Async code uses the thread pool: any time the program awaits some I/O, the thread is returned to the pool to do other work. Once the I/O completes, the async method resumes at the line where it yielded the thread back to the thread pool and continues from there.
When you manipulate the Thread directly, you block, and your code is no longer async; you also risk starving the thread pool, since it is limited in the number of threads available.
Also, throughout the lifetime of an async method, you are not guaranteed that every line will be executed on the same thread. Generally, the thread may change after every await.
You never want to touch the Thread class in an async method.
By doing:
await Task.Run(() => Thread.Sleep(ms));
you force the TPL to take a thread out of the pool just to block it, which starves the pool.
By doing
await Task.Run(async () => await Task.Delay(ms));
you will essentially run on one or two threads from a pool even if you start it many times.
Running Task.Run() on synchronous code is mostly used for legacy calls that do not support async internally and the TPL just wraps the sync call in a pooled thread. To get the full advantages of async code you need to await a call that itself runs only async code internally.
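As a minimal sketch of the point above (the class and method names `DelayDemo` and `RunDelaysAsync` are made up for illustration), the following console program starts many `Task.Delay` calls at once and measures the total time. Because no thread is held while a delay is pending, the whole batch should finish in roughly the time of a single delay, not the sum of all of them.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class DelayDemo
{
    // Starts `count` timer-based delays at once and returns the total elapsed milliseconds.
    public static async Task<long> RunDelaysAsync(int count, int delayMs)
    {
        var sw = Stopwatch.StartNew();
        // Task.Delay does not block a thread while waiting; each task is completed by a timer.
        Task[] delays = Enumerable.Range(0, count).Select(_ => Task.Delay(delayMs)).ToArray();
        await Task.WhenAll(delays);
        return sw.ElapsedMilliseconds;
    }

    public static async Task Main()
    {
        long elapsed = await RunDelaysAsync(50, 200);
        Console.WriteLine($"50 delays of 200 ms finished in about {elapsed} ms");
    }
}
```

Replacing `Task.Delay(delayMs)` with `Task.Run(() => Thread.Sleep(delayMs))` would still complete, but each pending sleep would then occupy a pool thread for its full duration.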

Let me try to correlate your program with a real world example and then explain it.
Consider your program to be an IT office of which you are the boss. The boss is the main thread, which starts the program's execution. The console can be considered your diary.
Program execution starts:
static async Task Main()
{
Process process = Process.GetCurrentProcess();
Console.WriteLine("Synchronous:");
You enter into the office from the main door and log "Synchronous:" into your diary.
Synchronous:
Calling method 'RunSeconds()'
RunSeconds(2.5); RunSeconds(2);
Let us assume 'RunSeconds()' is equivalent to a call from one of your project's clients; however, there is no one to attend the calls, so you attend both of them. The thing to remember is that you attend the calls one after the other, as you are one person, and the total time spent is close to 4.5 seconds.
Meanwhile you get a call from home, but you cannot take it because you are busy attending the client calls. Now for the logging of the calls: when you get a call, you log it. Once it is completed, you log the amount of time spent on the call. You do this twice, once for each call.
Thread started to run for 2.5 seconds
Stopwatch passed 2507 ms.
Thread started to run for 2 seconds
Stopwatch passed 2001 ms.
Console.WriteLine("\nAsynchronous:");
Then you log "Asynchronous:" into the diary
Calling method 'RunSecondsAsync()'
Task t1 = RunSecondsAsync(2.5); Task t2 = RunSecondsAsync(2);
await t1; await t2;
Let us assume 'RunSecondsAsync()' is again equivalent to a call from one of your project's clients; however, this time you have a manager with a team of 10 call attendants who take the calls. Here the manager is equivalent to the Task, and each call attendant is a thread; collectively they are known as the thread pool. Remember that the manager does not take any calls himself; he is just there to delegate calls to the call attendants and manage them.
When the first call 'RunSecondsAsync(2.5)' comes in, the manager immediately assigns it to one of the call attendants and lets you know that the call is being handled by returning a Task object. You immediately get a second call, 'RunSecondsAsync(2)', which the manager assigns to another call attendant, and both calls are handled simultaneously.
However, you want to log the amount of time spent on the phone calls, so you wait for those calls to be completed with the help of the await keyword. The key difference in waiting this time is that you are still free to do whatever you want, because the phone calls are being attended by the call attendants. So if you get a call from home this time around, you will be able to take it (analogous to the application being responsive).
Once the calls are done, the manager lets you know that they are completed, and you go ahead and log in your diary. As for the logging of the calls, you first log both calls as they come in, and once they are completed you log the total time spent on each call. The total time you spend in this case is close to 2.5 seconds, the maximum of the two calls, because the calls are handled in parallel, plus some overhead for communicating with the manager.
Thread started to run for 2.5 seconds
Thread started to run for 2 seconds
Stopwatch passed 2002 ms.
Stopwatch passed 2554 ms.
Console.WriteLine("\nMultithreading:");
Then you log "Multithreading:" into the diary
Calling method 'RunSecondsThreaded()'
RunSecondsThreaded(2.5); RunSecondsThreaded(2);
And finally, you and your manager have a small fight and he leaves the organization. However, you do not want to take the calls yourself because you have other important tasks to take care of. So you hire a call attendant whenever a phone call comes in and have the work done for you. You do this twice because two calls have come in. Meanwhile, you are again free to do other things; for example, if you get a phone call from home, you can take it.
Now for the logging of the calls. You do not log the calls in the diary yourself this time; the call attendants do it on your behalf. The only work you do is hiring the call attendants. Since the calls came in at almost the same time, the total time spent is 2.5 seconds plus some additional time for hiring.
Thread started to run for 2.5 seconds
Thread started to run for 2 seconds
Stopwatch passed 2000 ms.
Stopwatch passed 2501 ms.
Hope it helps in resolving your confusion

Related

How does asynchronous programming work with threads when using Thread.Sleep()?

Presumptions/Prelude:
In previous questions, we note that Thread.Sleep blocks threads; see: When to use Task.Delay, when to use Thread.Sleep?.
We also note that console apps have three threads: The main thread, the GC thread & the finalizer thread IIRC. All other threads are debugger threads.
We know that async does not spin up new threads, and it instead runs on the synchronization context, "uses time on the thread only when the method is active". https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/async/task-asynchronous-programming-model
Setup:
In a sample console app, we can see that neither the sibling nor the parent code are affected by a call to Thread.Sleep, at least until the await is called (unknown if further).
var sw = new Stopwatch();
sw.Start();
Console.WriteLine($"{sw.Elapsed}");
var asyncTests = new AsyncTests();
var go1 = asyncTests.WriteWithSleep();
var go2 = asyncTests.WriteWithoutSleep();
await go1;
await go2;
sw.Stop();
Console.WriteLine($"{sw.Elapsed}");
Stopwatch sw1 = new Stopwatch();
public async Task WriteWithSleep()
{
sw1.Start();
await Task.Delay(1000);
Console.WriteLine("Delayed 1 seconds");
Console.WriteLine($"{sw1.Elapsed}");
Thread.Sleep(9000);
Console.WriteLine("Delayed 10 seconds");
Console.WriteLine($"{sw1.Elapsed}");
sw1.Stop();
}
public async Task WriteWithoutSleep()
{
await Task.Delay(3000);
Console.WriteLine("Delayed 3 second.");
Console.WriteLine($"{sw1.Elapsed}");
await Task.Delay(6000);
Console.WriteLine("Delayed 9 seconds.");
Console.WriteLine($"{sw1.Elapsed}");
}
Question:
If the thread is blocked from execution during Thread.Sleep, how is it that it continues to process the parent and sibling? Some answer that it is background threads, but I see no evidence of multithreading background threads. What am I missing?

I see no evidence of multithreading background threads. What am I missing?
Possibly you are looking in the wrong place, or using the wrong tools. There's a handy property that might be of use to you, in the form of Thread.CurrentThread.ManagedThreadId. According to the docs,
A thread's ManagedThreadId property value serves to uniquely identify that thread within its process.
The value of the ManagedThreadId property does not vary over time
This means that all code running on the same thread will always see the same ManagedThreadId value. If you sprinkle some extra WriteLines into your code, you'll be able to see that your tasks may run on several different threads during their lifetimes. It is even entirely possible for some async applications to have all their tasks run on the same thread, though you probably won't see that behaviour in your code under normal circumstances.
Here's some example output from my machine, not guaranteed to be the same on yours, nor is it necessarily going to be the same output on successive runs of the same application.
00:00:00.0000030
* WriteWithSleep on thread 1 before await
* WriteWithoutSleep on thread 1 before first await
* WriteWithSleep on thread 4 after await
Delayed 1 seconds
00:00:01.0203244
* WriteWithoutSleep on thread 5 after first await
Delayed 3 second.
00:00:03.0310891
* WriteWithoutSleep on thread 6 after second await
Delayed 9 seconds.
00:00:09.0609263
Delayed 10 seconds
00:00:10.0257838
00:00:10.0898976
The business of running tasks on threads is handled by a TaskScheduler. You could write one that forces code to be single threaded, but that's not often a useful thing to do. The default scheduler uses a threadpool, and as such tasks can be run on a number of different threads.
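A minimal sketch of the "sprinkle some WriteLines" idea, assuming a console app with no SynchronizationContext (the names `ThreadIdDemo` and `ObserveAsync` are made up for illustration). It records `Thread.CurrentThread.ManagedThreadId` before and after an await so the two values can be compared. The ids may or may not differ on any given run.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadIdDemo
{
    // Returns the managed thread ids observed before and after an await.
    public static async Task<(int Before, int After)> ObserveAsync()
    {
        int before = Thread.CurrentThread.ManagedThreadId;
        await Task.Delay(100); // the continuation is scheduled when the delay completes
        int after = Thread.CurrentThread.ManagedThreadId;
        return (before, after);
    }

    public static async Task Main()
    {
        var (before, after) = await ObserveAsync();
        Console.WriteLine($"before await: thread {before}, after await: thread {after}");
    }
}
```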

The Task.Delay method is implemented basically like this (simplified¹):
public static Task Delay(int millisecondsDelay)
{
var tcs = new TaskCompletionSource();
_ = new Timer(_ => tcs.SetResult(), null, millisecondsDelay, -1);
return tcs.Task;
}
The Task is completed on the callback of a System.Threading.Timer component, and according to the documentation this callback is invoked on a ThreadPool thread:
The method does not execute on the thread that created the timer; it executes on a ThreadPool thread supplied by the system.
So when you await the task returned by the Task.Delay method, the continuation after the await runs on the ThreadPool. The ThreadPool typically has more than one thread available immediately on demand, so it's not difficult to introduce concurrency and parallelism if you create 2 tasks at once, like you do in your example. The main thread of a console application is not equipped with a SynchronizationContext by default, so there is no mechanism in place to prevent the observed concurrency.
¹ For demonstration purposes only. The Timer reference is not stored anywhere, so it might be garbage collected before the callback is invoked, resulting in the Task never completing.
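One way to avoid the garbage-collection caveat in the footnote is to keep the timer reachable until its callback has run. The sketch below does that with a static dictionary. This is an illustration only, not how the framework implements Task.Delay, and the names `RootedDelay`, `DelayAsync` and `Pending` are made up.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class RootedDelay
{
    // Holds every timer that has not fired yet, so it stays reachable for the GC.
    private static readonly ConcurrentDictionary<Timer, bool> Pending =
        new ConcurrentDictionary<Timer, bool>();

    public static Task DelayAsync(int millisecondsDelay)
    {
        var tcs = new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);
        Timer timer = null;
        // Created with an infinite due time so it cannot fire before it is registered below.
        timer = new Timer(state =>
        {
            Pending.TryRemove(timer, out _);
            timer.Dispose();
            tcs.TrySetResult(true);
        }, null, Timeout.Infinite, Timeout.Infinite);
        Pending[timer] = true;
        timer.Change(millisecondsDelay, Timeout.Infinite); // arm the one-shot timer
        return tcs.Task;
    }

    public static async Task Main()
    {
        await DelayAsync(300);
        Console.WriteLine("Delay completed");
    }
}
```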

I am not accepting my own answer, I will accept someone else's answer because they helped me figure this out. First, in the context of my question, I was using async Main. It was very hard to choose between Theodor's & Rook's answer. However, Rook's answer provided me with one thing that helped me fish: Thread.CurrentThread.ManagedThreadId
These are the results of my running code:
1 00:00:00.0000767
Not Delayed.
1 00:00:00.2988809
Delayed 1 second.
4 00:00:01.3392148
Delayed 3 second.
5 00:00:03.3716776
Delayed 9 seconds.
5 00:00:09.3838139
Delayed 10 seconds
4 00:00:10.3411050
4 00:00:10.5313519
I notice that there are 3 threads here. The initial thread (1) runs the first calling method and the part of WriteWithSleep() up to the point where Task.Delay is started and later awaited. At the point where execution would have come back to Thread 1 after Task.Delay, everything runs on Thread 4 instead, both Main and the remainder of WriteWithSleep.
WriteWithoutSleep uses its own Thread(5).
So my error was believing that there were only 3 threads. I believed the answer to this question: https://stackoverflow.com/questions/3476642/why-does-this-simple-net-console-app-have-so-many-threads#:~:text=You%20should%20only%20see%20three,see%20are%20debugger%2Drelated%20threads.
However, that question may not have been async, or may not have considered these additional worker threads from the threadpool.
Thank you all for your assistance in figuring out this question.

Log data into cassandra using c#

I am trying to log data into Cassandra using C#. My aim is to log as many data points as I can in 200 ms.
I am trying to save the time, a random key and a value within 200 ms. Please see the code for reference. The problem is: how can I execute the session after the while loop?
Cluster cluster = Cluster.Builder()
.AddContactPoint("127.0.0.1")
.Build();
ISession session = cluster.Connect("log"); //keyspace to connect with
var ps = session.Prepare("Insert into logcassandra(nanodate, key, value) values (?,?,?)");
stop.Start();
while(stop.ElapsedMilliseconds <= 200)
{
i++;
var statement = ps.Bind(nanoTime(),"key"+i,"value"+i);
session.ExecuteAsync(statement);
}

Please prefer System.Threading.Timer with a TimerCallback over Stopwatch.
EDIT: (reply to the comment)
Hi, I'm not sure what you want to achieve, but here are some general concepts about async calls and parallel execution. In the .NET world, async is mainly used for non-blocking I/O operations, which means your caller thread will not wait for the response of the I/O driver. In other words, you initiate an I/O operation and dispatch the work to a "thing" outside the .NET ecosystem, which gives you back a future (a Task). The driver acknowledges that it received the request and promises to process it once it has free capacity.
That Task represents asynchronous work that either succeeds or fails. But because you are calling it asynchronously, you are not waiting for its result (not blocking the caller thread to wait for external work) but rather moving on to the next statement. Eventually the operation will finish, and at that point the driver will notify the Task that the requested operation has completed. (The Task can be seen as the primary communication channel between the caller and the callee.)
In your case you are using a fire-and-forget style of async call. That means you are firing off a lot of I/O operations asynchronously and never processing their results. You don't know whether any of them failed. But you have asked Cassandra to do a lot of stuff. Your time measurement only covers firing off the jobs, which means you have no idea how many of those jobs have finished.
If you chose to await your async calls, your while loop would execute serially. You would fire off a job and could not move on to the next iteration because you are awaiting it, so your caller thread moves one level up its call stack and checks whether it can proceed with something there. If there is an await there as well, it moves one level higher, and so on...
while(stop.ElapsedMilliseconds <= 200)
{
await session.ExecuteAsync(statement);
}
If you want parallel rather than serial execution, you can create as many jobs as you need and await them as a whole. That's where Task.WhenAll comes into play. You fire off a lot of jobs and await the single task that tracks all the others.
var cassandraCalls = new List<Task>();
cassandraCalls.AddRange(Enumerable.Range(0, 100).Select(_ => session.ExecuteAsync(statement)));
await Task.WhenAll(cassandraCalls);
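A minimal sketch of awaiting a batch as a whole and then checking which calls failed, so results are not simply forgotten. It does not use the Cassandra driver: `FakeInsertAsync` is a hypothetical stand-in that waits briefly and fails for every tenth item, and `BatchDemo` and `RunBatchAsync` are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class BatchDemo
{
    // Hypothetical stand-in for session.ExecuteAsync(statement): fails for every tenth item.
    private static async Task FakeInsertAsync(int i)
    {
        await Task.Delay(10);
        if (i % 10 == 0) throw new InvalidOperationException($"insert {i} failed");
    }

    // Starts all calls, waits for every one of them, then counts successes and failures.
    public static async Task<(int Succeeded, int Failed)> RunBatchAsync(int count)
    {
        List<Task> calls = Enumerable.Range(0, count).Select(i => FakeInsertAsync(i)).ToList();
        try
        {
            await Task.WhenAll(calls);
        }
        catch (InvalidOperationException)
        {
            // await rethrows only the first failure; the individual tasks are inspected below.
        }
        return (calls.Count(t => t.Status == TaskStatus.RanToCompletion),
                calls.Count(t => t.IsFaulted));
    }

    public static async Task Main()
    {
        var (ok, failed) = await RunBatchAsync(100);
        Console.WriteLine($"succeeded: {ok}, failed: {failed}"); // succeeded: 90, failed: 10
    }
}
```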
But this code will run until all of the jobs are finished. If you want to constrain the total execution time, you should use some cancellation mechanism. Task.WhenAll does not support CancellationToken, but you can overcome this limitation in several ways. The simplest solution is a combination of Task.Delay and Task.WhenAny: Task.Delay serves as the timeout, and Task.WhenAny awaits either your Cassandra calls or the timeout, whichever completes first.
var cassandraCalls = new List<Task>();
cassandraCalls.AddRange(Enumerable.Range(0, 100).Select(_ => session.ExecuteAsync(statement)));
await Task.WhenAny(Task.WhenAll(cassandraCalls), Task.Delay(1000));
In this way, you have fired off as many jobs as you wanted, and depending on your driver they may be executed in parallel or concurrently. You are awaiting either the completion of all of them or the elapse of a certain amount of time. When the WhenAny task finishes, you can examine the result of the jobs by simply iterating over cassandraCalls:
foreach (var call in cassandraCalls)
{
Console.WriteLine(call.IsCompleted);
}
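The timeout pattern above can be tried without a database. In this sketch the "calls" are plain `Task.Delay` tasks with different durations (an assumption standing in for driver calls), and the made-up method `CountCompletedWithinAsync` reports how many had finished when either all of them or the timeout completed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class TimeoutDemo
{
    // Starts one simulated call per duration and returns how many finished before the timeout.
    public static async Task<int> CountCompletedWithinAsync(int[] durationsMs, int timeoutMs)
    {
        List<Task> calls = durationsMs.Select(d => Task.Delay(d)).ToList();
        // Completes when all calls are done or the timeout elapses, whichever comes first.
        await Task.WhenAny(Task.WhenAll(calls), Task.Delay(timeoutMs));
        // Calls that are still pending are NOT cancelled; they keep running in the background.
        return calls.Count(t => t.IsCompleted);
    }

    public static async Task Main()
    {
        int done = await CountCompletedWithinAsync(new[] { 10, 20, 2000 }, 500);
        Console.WriteLine($"{done} of 3 calls completed within the timeout");
    }
}
```

Note that the timeout only stops the waiting, not the work: actually aborting the pending calls would need a cancellation mechanism that the called API supports.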
I hope this explanation helped you a bit.

System.Threading.Timer does not work correctly

I have noticed that the timer is not accurate.
This is very simple C# code: it prints the current date/time every minute.
My expected result is: if I start it at 3:30 PM, I will get 3:31 PM, 3:32 PM, 3:33 PM, ...
But sometimes I don't get that result; sometimes it is 3:31 PM, 3:32 PM, 3:34 PM, ...
So it loses one row.
Could anyone point out what the problem is?
class Program
{
static Timer m_Timer;
static int countDown;
static void Main(string[] args)
{
countDown = 60;
m_Timer = new Timer(TimerCallback, null, 0, 1000);
while (true) { System.Threading.Thread.Sleep(10); };
}
static void TimerCallback(Object o)
{
countDown -= 1;
if (countDown <= 0)
{
Console.WriteLine(" ===>>>>>" + System.DateTime.Now.ToString());
countDown = 60;
}
System.Threading.Thread.Sleep(10000); //long running code demo
}
}

System.Threading.Timer runs its callbacks on threads from the thread pool. Your callback is invoked on a pool thread every 1 s and blocks that thread for 10 s using sleep. Depending on how many threads the pool has, at some points they may all be blocked and callbacks have to wait, or .NET has to allocate new threads for you, up to the pool's maximum.
Extended answer, based on the comments:
Each function is independent and it does not wait until another processing finish. A simple task is: call a function to do something every 1 minutes. "do something" in my case is saving local variables into SQL server. This process is fast not slow. I use 1 timer for many functions because each function is schedule in different cycle. For example, function 1 is triggered every 1 minute, function 2 is triggered every 10 seconds ... That why I use the timer 1 second.
Your use case seems to be more complex than I understood from the initial question. You have different tasks and are trying to implement a sort of scheduler. Each particular task may be fast, but taken together some runs may take longer and block. I'm not sure how well this logic was implemented, but there could be a lot of edge cases, e.g. some run being missed.
How would I approach it?
I would not try to implement it on my own if the scheduler can get more complex. I would pick a ready-made solution, e.g. Quartz.NET. It handles the edge cases, and helps with scaling out to a cluster when needed and with configuration.
In any case, I would refactor the big schedule into smaller tasks, each running on its own schedule based on configuration (custom implementation or Quartz).
I would first scale your "queue" of tasks locally by introducing an actual queue, for example ConcurrentQueue, BlockingCollection, or any producer-consumer structure, to limit the number of threads; if the performance of that is not good enough, scale out to a cluster. By doing so you can at least guarantee that N tasks can be scheduled and executed locally and everything beyond that is queued. Priorities for tasks may also help, because some executions can be missed while others must run on schedule.
I doubt it is a good idea to start other threads or tasks from the timer thread when you most likely already have threading problems.
Your problem is not with System.Threading.Timer; it does its job well. Your use case is more complex.
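A minimal sketch of the local producer-consumer idea, using `BlockingCollection<int>`. The timer callback only enqueues a job id, and a single consumer runs the jobs one at a time, so a slow job delays later ones instead of piling up blocked pool threads. The names `QueueSchedulerDemo` and `Run`, and the simulated job, are assumptions for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class QueueSchedulerDemo
{
    // Enqueues `ticks` jobs from a timer, runs them on one consumer, returns them in execution order.
    public static List<int> Run(int ticks, int periodMs)
    {
        var executed = new List<int>();
        var gate = new object();
        int enqueued = 0;
        using (var queue = new BlockingCollection<int>())
        {
            // Single consumer: jobs never overlap, however long one of them takes.
            Task consumer = Task.Run(() =>
            {
                foreach (int job in queue.GetConsumingEnumerable())
                {
                    Thread.Sleep(periodMs * 2); // simulated job that is slower than the timer period
                    executed.Add(job);
                }
            });

            // The callback stays cheap: it only enqueues. The lock keeps callbacks from interleaving.
            using (var timer = new Timer(_ =>
            {
                lock (gate)
                {
                    if (enqueued >= ticks) return;
                    queue.Add(++enqueued);
                    if (enqueued == ticks) queue.CompleteAdding();
                }
            }, null, 0, periodMs))
            {
                consumer.Wait(); // returns once the queue is completed and drained
            }
        }
        return executed;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run(5, 20))); // 1, 2, 3, 4, 5
    }
}
```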

Windows is not a real-time operating system. So if you expect the timer to wait exactly 1 second, that expectation is wrong. There are many reasons why a timer can wait longer, such as timer resolution or other high-load operations.

If you like the newer .NET TPL syntax, you can write it like this:
using System;
using System.Threading.Tasks;
namespace ConsoleApp1
{
internal class Program
{
private static void Main(string[] args)
{
Repeat(TimeSpan.FromSeconds(10));
Console.ReadKey();
}
private static void Repeat(TimeSpan period)
{
Task.Delay(period)
.ContinueWith(
t =>
{
// Do your stuff here
Console.WriteLine($"Time:{DateTime.Now}");
Repeat(period);
});
}
}
}
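The same repetition can also be written as an async loop with cancellation instead of a recursive ContinueWith. This is a sketch under the assumption that a `CancellationToken` is an acceptable way to stop. `RepeatDemo` and `RepeatAsync` are made-up names.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class RepeatDemo
{
    // Runs `action` after every `period` until the token is cancelled; returns how often it ran.
    public static async Task<int> RepeatAsync(TimeSpan period, Action action, CancellationToken token)
    {
        int runs = 0;
        try
        {
            while (true)
            {
                await Task.Delay(period, token); // throws when the token is cancelled
                action();
                runs++;
            }
        }
        catch (OperationCanceledException)
        {
            // Cancellation is the normal way out of the loop.
        }
        return runs;
    }

    public static async Task Main()
    {
        using (var cts = new CancellationTokenSource(TimeSpan.FromSeconds(1)))
        {
            int runs = await RepeatAsync(TimeSpan.FromMilliseconds(200),
                () => Console.WriteLine($"Time:{DateTime.Now}"), cts.Token);
            Console.WriteLine($"Ran {runs} times before cancellation");
        }
    }
}
```

Like the ContinueWith version, the next delay starts only after the action returns, so the interval between runs is the period plus the action's own duration.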

The code in the question starts a 10-second "demo" (sleep) every second. You will end up running 10 worker threads simultaneously.
Are you sure this is what you are trying to achieve?
To see what really happens in your app, simply add:
Console.WriteLine($"Time:{DateTime.Now.ToString("hh:mm:ss.fff tt")},Thread:{Thread.CurrentThread.ManagedThreadId},countDown:{countDown}");
at the beginning of TimerCallback. You will notice that the time span between consecutive callbacks is not exactly 1000 ms (usually it is a little more). This is perfectly normal on a non-real-time OS and, in most cases, not a problem. Just keep in mind that Timer is not exact.
Moreover, if you use Timer that way and count ticks, these small errors accumulate over the following ticks.
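One way to keep such errors from accumulating is to derive the schedule from a fixed start time instead of counting callbacks. The helper below is a sketch with made-up names (`DriftFreeSchedule`, `ElapsedPeriods`, `DueTime`). It is pure arithmetic and does not depend on how precisely the timer fires.

```csharp
using System;

public static class DriftFreeSchedule
{
    // Number of whole periods that have elapsed between `start` and `now`.
    public static long ElapsedPeriods(DateTime start, TimeSpan period, DateTime now)
        => now <= start ? 0 : (now - start).Ticks / period.Ticks;

    // Due time of tick number `tickIndex` (1-based), always computed from the fixed start.
    public static DateTime DueTime(DateTime start, TimeSpan period, int tickIndex)
        => start + TimeSpan.FromTicks(period.Ticks * tickIndex);

    public static void Main()
    {
        var start = new DateTime(2024, 1, 1, 15, 30, 0);
        var period = TimeSpan.FromMinutes(1);
        var now = start.AddMinutes(3).AddMilliseconds(250); // a callback that fired 250 ms late
        Console.WriteLine(ElapsedPeriods(start, period, now)); // 3
        Console.WriteLine(DueTime(start, period, 4));
    }
}
```

A timer callback could then print a line whenever `ElapsedPeriods` exceeds the number of lines already printed, so a late callback delays one line slightly but does not shift all later ones.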

I'm posting what I found here for people who have the same problem as me.
I found the answer in another thread.
I use "HighResolutionTimer.cs" and it works perfectly:
https://gist.github.com/DraTeots/436019368d32007284f8a12f1ba0f545

Task starts with delay

I create and start task in following way:
Task task = new Task(() => controller.Play());
task.Start();
For some reason, the task sometimes starts with a delay of around 7-10 seconds.
I use 6 tasks in parallel; the maximum number of tasks is 32767 and 32759 are available,
which is what I log before I create the task, so it can't be that the maximum number of tasks has been reached. I write a log entry on the first line of the controller.Play() method that the task should execute, so there is no lock or anything else that could make the task wait.

Long running tasks, like your deserialization of 100MB that takes 10 seconds, should be, hm, well, run as long-running tasks :-)
Long-running tasks are, as per the current implementation, always run on a dedicated thread and they do not put pressure on the thread-pool.
In your case, you perhaps have only two tasks: the deserialization and the player. The TaskScheduler works under the assumption that tasks are short-lived, and in this case it evidently schedules the "player" task to run after the "deserialization" one.
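A minimal sketch of starting work as a long-running task, using `TaskCreationOptions.LongRunning` with the default scheduler. The made-up helper `RanOnPoolThreadAsync` reports whether the delegate ran on a thread-pool thread. On current implementations the long-running case is expected to report False, but that is an implementation detail and not a documented guarantee.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LongRunningDemo
{
    // Runs a small delegate with the given options and reports whether it ran on a pool thread.
    public static Task<bool> RanOnPoolThreadAsync(TaskCreationOptions options)
        => Task.Factory.StartNew(
               () => Thread.CurrentThread.IsThreadPoolThread,
               CancellationToken.None,
               options,
               TaskScheduler.Default);

    public static async Task Main()
    {
        bool longRunning = await RanOnPoolThreadAsync(TaskCreationOptions.LongRunning);
        bool normal = await RanOnPoolThreadAsync(TaskCreationOptions.None);
        Console.WriteLine($"LongRunning ran on a pool thread: {longRunning}");
        Console.WriteLine($"Default ran on a pool thread:     {normal}");
    }
}
```

In the question's terms, the 10-second deserialization would be the candidate for `TaskCreationOptions.LongRunning`, leaving pool threads free for short tasks such as the player.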

If I run a stored procedure that takes 5 minutes to complete, will my code sequence continue before the stored procedure is complete?

If I have a program that runs a stored procedure that takes about 5 minutes to run, for example:
Database.RunJob(); // takes 5 mins to complete
MessageBox.Show("Hi");
Question - is the message box going to show "Hi" BEFORE Database.RunJob() is complete?
If it does, how do I make sure a query is complete before proceeding to the next line of code?

Insufficient information to give meaningful answer.
The best answer we can give is: it depends on whether Database.RunJob() is synchronous or asynchronous.
If .RunJob is synchronous (that is, run on the current thread), then
MessageBox.Show will not run until the .RunJob finishes.
If it's asynchronous, spawning other thread(s) to do the work, then
MessageBox.Show will execute immediately, before the job is finished.
Since you didn't show what RunJob actually does, we can't answer any more precisely.
Let's suppose .RunJob is asynchronous. If the designer of that method was worth his salt, he would return something from that method; ideally a Task<T>.
If he returns a Task, you could write your C# 5 code like this:
async void RunJobAndShowSuccessMessage()
{
await Database.RunJob();
MessageBox.Show("Hi, the job is done!");
}
Let's suppose .RunJob is synchronous, running only on the current thread. This means you probably don't want to run it on the UI thread; doing so will cause your application UI to freeze for 5 minutes.
To avoid that scenario, let's run the database job on a different thread ourselves, so that the application UI remains responsive:
async void RunJobAndShowSuccessMessage()
{
// Spawn a task to run in the background.
await Task.Run(() => Database.RunJob());
MessageBox.Show("Hi, the job is done!");
}
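The second scenario can be tried in a console app without a database or UI. In this sketch `RunJobBlocking` is a hypothetical stand-in for a synchronous Database.RunJob() (shortened to half a second), and the returned log shows that the caller keeps going while the job runs and only reports completion after the await. `JobDemo` and `DemoAsync` are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class JobDemo
{
    // Hypothetical stand-in for a synchronous Database.RunJob().
    public static void RunJobBlocking() => Thread.Sleep(500);

    // Starts the blocking job on a pool thread and records the order of events.
    public static async Task<List<string>> DemoAsync()
    {
        var log = new List<string>();
        Task job = Task.Run(() => RunJobBlocking()); // the job starts in the background
        log.Add("caller is free while the job runs");
        await job;                                   // resumes here only once the job is done
        log.Add("job is done");
        return log;
    }

    public static async Task Main()
    {
        foreach (string line in await DemoAsync())
        {
            Console.WriteLine(line);
        }
    }
}
```

The method returns `Task` instead of using `async void`, so that callers can await it and observe exceptions; `async void` is usually reserved for event handlers.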
