This is the code that I'm running:
[TestMethod]
public async Task HelloTest()
{
List<int> hello = new List<int>();
//await Task.WhenAll(Enumerable.Range(0, 2).Select(async x => await Say(hello)));
await Say(hello);
await Say(hello);
}
private static async Task Say(List<int> hello)
{
await Task.Delay(100);
var rep = new Random().Next();
hello.Add(rep);
}
Why does running this code as-is work as intended and produce two different random numbers, while using the commented-out line instead always produces the exact same number twice?
So you have several issues here.
First, why are you seeing the same value twice? That's the easy one. When you create a Random instance, it is seeded with the current time, but the precision of the time source it uses is fairly low. If you create two Random instances within roughly 16 milliseconds of each other (which is a really long time for a computer), they will produce the same sequence of values. That's what's happening for you.
Normally the fix is just to share a single Random instance, but the problem there is that the shared instance may be accessed from different threads (potentially, assuming you don't have a SynchronizationContext specified), and Random isn't thread safe. You can use something like this to get your random numbers instead:
public static class MyRandom
{
private static object key = new object();
private static Random random = new Random();
public static int Next()
{
lock (key)
{
return random.Next();
}
}
//TODO add other methods for other `Random` methods as needed
}
Use that and it will resolve the immediate issue.
The other problem you have, although it doesn't seem to be biting you currently, is that you're modifying your List from two different tasks, possibly executing on different threads. You shouldn't do that. It's bad practice even in a single-threaded environment (you're relying on side effects to do your work), but in a multithreaded environment it is very problematic, for more than just conceptual reasons. Instead, have each task return a value and pull all of those values into a collection on the caller's side, like so:
public async Task HelloTest()
{
var data = await Task.WhenAll(Say(), Say());
}
private static async Task<int> Say()
{
await Task.Delay(100);
return MyRandom.Next();
}
As to why the two Say calls run in parallel rather than sequentially: in your second code snippet you aren't actually waiting for one task to complete before starting the next.
The lambda that you pass to Select is what spins up each task, and Select doesn't wait for one task to finish before starting the next. The code that you have here:
await Task.WhenAll(Enumerable.Range(0, 2).Select(async x => await Say(hello)));
Is no different than simply having:
await Task.WhenAll(Enumerable.Range(0, 2).Select(x => Say(hello)));
Having an async method that does nothing but await one method call is really no different from just making that one method call. What's happening here is that Select calls Say, which starts the first task, then continues on and starts the next task; WhenAll then waits (asynchronously) for both tasks to finish before continuing.
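To make the difference concrete, here is a minimal side-by-side sketch, reusing Say and hello from the question:
// Sequential: each call is awaited before the next one starts.
foreach (var x in Enumerable.Range(0, 2))
    await Say(hello);

// Concurrent: Select starts both tasks up front; WhenAll awaits them together.
await Task.WhenAll(Enumerable.Range(0, 2).Select(x => Say(hello)));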
Note: This is an answer to the original question (the question has since been changed).
They operate identically. I ran the console application below and got the results 0 1 for both versions:
class Program
{
static int m_nextNumber = 0;
static void Main(string[] args)
{
var t1 = Version1();
m_nextNumber = 0;
var t2 = Version2();
Task.WaitAll(t1, t2);
Console.ReadKey();
}
static async Task Version1()
{
List<int> hello = new List<int>();
await Say(hello);
await Say(hello);
PrintHello(hello);
}
static async Task Version2()
{
List<int> hello = new List<int>();
await Task.WhenAll(Enumerable.Range(0, 2).Select(async x => await Say(hello)));
PrintHello(hello);
}
static void PrintHello(List<int> hello)
{
foreach (var i in hello)
Console.WriteLine(i);
}
static int GotANumber()
{
return m_nextNumber++;
}
static async Task Say(List<int> hello)
{
var rep = GotANumber();
hello.Add(rep);
}
}
Related
Say I had a list of algorithms, each containing an async call somewhere in its body. The order in which I execute the algorithms is the order in which I want to receive the results; that is, I want the AlgorithmResult list to look like {Algorithm1Result, Algorithm2Result, Algorithm3Result} after all the algorithms have executed. Would I be right in saying that if Algorithm1 and 3 finished before 2, my results would actually be in the order {Algorithm1Result, Algorithm3Result, Algorithm2Result}?
var algorithms = new List<Algorithm>(){Algorithm1, Algorithm2, Algorithm3};
var algorithmResults = new List<AlgorithmResults>();
foreach (var algorithm in algorithms)
{
algorithmResults.Add(await algorithm.Execute());
}
No, the results would be in the same order you added them to the list, since each operation is awaited separately.
class Program
{
public static async Task<int> GetResult(int timing)
{
return await Task.Run(() =>
{
Thread.Sleep(timing * 1000);
return timing;
});
}
public static async Task<List<int>> GetAll()
{
List<int> tasks = new List<int>();
tasks.Add(await GetResult(3));
tasks.Add(await GetResult(2));
tasks.Add(await GetResult(1));
return tasks;
}
static void Main(string[] args)
{
var res = GetAll().Result;
}
}
res still contains the list in the order the items were added; also note that this is not parallel execution.
Since you don't add the tasks themselves but rather the results of the tasks after awaiting them, the results will be in the order you want.
Even if you did not await each result before adding the next task, you could still get the order you want.
In small steps, with explicit types:
List<Task<AlgorithmResult>> tasks = new List<Task<AlgorithmResult>>();
foreach (Algorithm algorithm in algorithms)
{
Task<AlgorithmResult> task = algorithm.Execute();
// don't wait until task Completes, just remember the Task
// and continue executing the next Algorithm
tasks.Add(task);
}
Now some Tasks may be running, some may already have completed. Let's wait until they are all complete, and fetch the results:
await Task.WhenAll(tasks);
List<AlgorithmResult> results = tasks.Select(task => task.Result).ToList();
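As a side note, awaiting Task.WhenAll directly also hands back the results, already in the order of the input tasks, so the Select over task.Result can be skipped; a minimal sketch (assuming an async context):
// WhenAll's result array preserves the order of the input tasks.
AlgorithmResult[] results = await Task.WhenAll(tasks);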
First, this is from something much bigger and yes, I could completely avoid this using await under normal/other circumstances. For anyone interested, I'll explain below.
To track how many tasks I still have left before my program continues, I've built the following:
A counter:
private static int counter;
Some method:
public static void Test()
{
List<Task> tasks = new List<Task>();
for (int i = 0; i < 10000; i++)
{
TaskCompletionSource<object> tcs = new TaskCompletionSource<object>();
var task = DoTaskWork();
task.ContinueWith(t => // After DoTaskWork
{
// [...] Use t's Result
counter--; // Decrease counter
tcs.SetResult(null); // Finish the task that the UI or whatever is waiting for
});
tasks.Add(tcs.Task); // Store tasks to wait for
}
Task.WaitAll(tasks.ToArray()); // Wait for all tasks that actually only finish in the ContinueWith
Console.WriteLine(counter);
}
My super heavy work to do:
private static Task DoTaskWork()
{
counter++; // Increase counter
return Task.Delay(500);
}
Now, interestingly I do not receive the number 0 at the end when looking at counter. Instead, the number varies with each execution. Why is this? I tried various tests, but cannot find the reason for the irregularity. With the TaskCompletionSource I believed this to be reliable. Thanks.
Now, for anyone that is interested in why I do this:
I need to create loads of tasks without starting them. For this I need to use the Task constructor (one of its rare use cases). Its disadvantage compared to Task.Run() is that it cannot handle anything with await and that it needs a result type on the Task to run properly (hence the null result). Therefore, I need a way around that. Other ideas welcome...
Well, I am stupid. Just 5 minutes in, I realized that.
I just did the same thing, but locking a helper object before changing the counter in any way, and now it works:
private static int counter;
private static object locker = new object();
// [...]
task.ContinueWith(t =>
{
lock(locker)
counter--;
tcs.SetResult(null);
});
// [...]
private static Task DoTaskWork()
{
lock (locker)
counter++;
return Task.Delay(500);
}
I need to create loads of tasks without starting them ... Therefore, I need a way around that. Other ideas welcome...
So, if I read it correctly, you want to build a list of tasks without actually running them on creation. You can do that by building a list of Func<Task> objects that you invoke when required:
async Task Main()
{
// Create list of work to do later
var tasks = new List<Func<Task>>();
// Schedule some work
tasks.Add(() => DoTaskWork(1));
tasks.Add(() => DoTaskWork(2));
// Wait for user input before doing work to demonstrate they are not started right away
Console.ReadLine();
// Execute and wait for the completion of the work to be done
await Task.WhenAll(tasks.Select(t => t.Invoke()));
Console.WriteLine("Ready");
}
public async Task DoTaskWork(int taskNr)
{
await Task.Delay(100);
Console.WriteLine(taskNr);
}
This will work, even if you use Task.Run like this:
public Task DoTaskWork(int taskNr)
{
return Task.Run(() =>
{
Thread.Sleep(100); Console.WriteLine(taskNr);
});
}
If this is not what you want, can you elaborate more on the tasks you want to create?
I am currently writing a service which will create a Task for each request it finds marked as "Waiting for process" in the database.
The process can be long, and I want the service to check on each loop iteration whether the task must be cancelled; if so, I want to cancel the task with a token. So I need to store the ID of the request linked to the task.
I was thinking about having a static class with a dictionary, as follows:
public static Dictionary<Int32, Task<Int32>> _tasks = new Dictionary<int, Task<int>>();
I don't know if it's the best solution out there, but I think it's a workable one.
Now I want to do a Task.WhenAny(..) to know when one of them has ended. The problem is that Task.WhenAny(..) accepts an array, not a Dictionary. I didn't see anything about passing a dictionary to WhenAny, and before I start working on the entire process, which is very long, I wanted a solution for each key point of my workflow. I could get a list of the dictionary's values, but then I would probably lose the link to the id. So I don't really know what to do.
Is there a solution for that? I didn't want to recreate my "own" WhenAny, and I don't even know if that's possible, but I assume I could just poll the status of every entry. However, if that's the only option, I will.
I'm also open to the possibility that storing the request id this way isn't a good approach, in which case I'm open to any other suggestion.
EDIT: CODE ACCORDING TO ANSWERS
I ended up using this code, which seems to be working. Now I'll test with more complicated tasks rather than just writing to a file! :)
public static class Worker
{
public static List<Task<Int32>> m_tasks = new List<Task<Int32>>();
public static Dictionary<Int32, CancellationTokenSource> m_cancellationTokenSources = new Dictionary<int, CancellationTokenSource>();
public static Int32 _testId = 1;
public static void run()
{
//Clean
Cleaner.CleanUploads();
Cleaner.CleanDownloads();
#region thread watching
if (m_tasks.Count > 0)
{
#region thread must be cancel
//Cancel thread
List<Task<Int32>> _removeTemp = new List<Task<Int32>>();
foreach (Task<Int32> _task in m_tasks)
{
if (DbWorker.mustBeCancel((Int32)_task.AsyncState))
{
m_cancellationTokenSources[(Int32)_task.AsyncState].Cancel();
//Cancellation actions
//task must be remove
_removeTemp.Add(_task);
}
}
foreach( Task<Int32> _taskToRemove in _removeTemp)
{
m_tasks.Remove(_taskToRemove);
}
#endregion
#region Conversion lookup
// Get conversion if any
// Create task
CancellationTokenSource _srcCancel = new CancellationTokenSource();
m_cancellationTokenSources.Add(_testId, _srcCancel);
m_tasks.Add(Task.Factory.StartNew(_testId => testRunner<Int32>((Int32)_testId), _testId, _srcCancel.Token));
_testId++;
// Attach task
#endregion
}
#endregion
else
{
CancellationTokenSource _srcCancel = new CancellationTokenSource();
m_cancellationTokenSources.Add(_testId, _srcCancel);
m_tasks.Add(Task.Factory.StartNew(_testId => testRunner<Int32>((Int32)_testId), _testId, _srcCancel.Token));
_testId++;
}
}
internal static void WaitAll()
{
Task.WaitAll(m_tasks.ToArray());
}
public static Int32 testRunner<T>(T _id)
{
for (Int32 i = 0; i <= 1000000; i++)
{
File.AppendAllText(@"C:\TestTemp\" + _id, i.ToString());
}
return 2;
}
}
Task.WhenAny's return value is:
A task that represents the completion of one of the supplied tasks.
The return task's Result is the task that completed.
From the docs.
So basically you can pass it the dictionary values, and by awaiting it you will get the task that finished; from there it is easy to look up that task's id using some LINQ:
var task = await Task.WhenAny(_tasks.Values);
var id = _tasks.Single(pair => pair.Value == task).Key;
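As a usage sketch (assuming _tasks is the dictionary from the question), you can keep calling WhenAny in a loop, removing each task as it finishes, until none are left:
while (_tasks.Count > 0)
{
    var finished = await Task.WhenAny(_tasks.Values);
    var id = _tasks.Single(pair => pair.Value == finished).Key;
    // handle finished.Result for request `id` here
    _tasks.Remove(id);
}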
There is a Task.WhenAny overload that takes an IEnumerable<Task>, and one that takes IEnumerable<Task<T>>, so you should just be able to use:
var winner = await Task.WhenAny(theDictionary.Values);
I have an instance of a class that is accessed from several threads. This class takes these calls and adds a tuple to a database. This needs to be done in a serial manner because, due to some db constraints, parallel threads could result in an inconsistent database.
As I am new to parallelism and concurrency in C#, I did this:
private BlockingCollection<Task> _tasks = new BlockingCollection<Task>();
public void AddDData(string info)
{
Task t = new Task(() => { InsertDataIntoBase(info); });
_tasks.Add(t);
}
private void InsertWorker()
{
Task.Factory.StartNew(() =>
{
while (!_tasks.IsCompleted)
{
Task t;
if (_tasks.TryTake(out t))
{
t.Start();
t.Wait();
}
}
});
}
AddDData is the method called by multiple threads, and InsertDataIntoBase is a very simple insert that should take a few milliseconds.
The problem is that, for some reason my lack of knowledge doesn't let me figure out, sometimes a task is executed twice! It always goes like this:
T1
T2
T3
T1 <- PK error.
T4
...
Did I understand .Take() completely wrong, am I missing something, or is my producer/consumer implementation really bad?
Best Regards,
Rafael
UPDATE:
As suggested, I made a quick sandbox test with this architecture, and as I suspected, it does not guarantee that a task won't be fired before the previous one finishes.
So the question remains: how to properly queue tasks and fire them sequentially?
UPDATE 2:
I simplified the code:
private BlockingCollection<Data> _tasks = new BlockingCollection<Data>();
public void AddDData(Data info)
{
_tasks.Add(info);
}
private void InsertWorker()
{
Task.Factory.StartNew(() =>
{
while (!_tasks.IsCompleted)
{
Data info;
if (_tasks.TryTake(out info))
{
InsertIntoDB(info);
}
}
});
}
Note that I got rid of the Tasks, as I'm relying on the synchronous InsertIntoDB call (it is inside a loop), but still no luck... The generation is fine and I'm absolutely sure that only unique instances are going into the queue. But no matter what I try, sometimes the same object is used twice.
I think this should work:
private static BlockingCollection<string> _itemsToProcess = new BlockingCollection<string>();
static void Main(string[] args)
{
InsertWorker();
GenerateItems(10, 1000);
_itemsToProcess.CompleteAdding();
}
private static void InsertWorker()
{
Task.Factory.StartNew(() =>
{
while (!_itemsToProcess.IsCompleted)
{
string t;
if (_itemsToProcess.TryTake(out t))
{
// Do whatever needs doing here
// Order should be guaranteed since BlockingCollection
// uses a ConcurrentQueue as a backing store by default.
// http://msdn.microsoft.com/en-us/library/dd287184.aspx#remarksToggle
Console.WriteLine(t);
}
}
});
}
private static void GenerateItems(int count, int maxDelayInMs)
{
Random r = new Random();
string[] items = new string[count];
for (int i = 0; i < count; i++)
{
items[i] = i.ToString();
}
// Simulate many threads adding items to the collection
items
.AsParallel()
.WithDegreeOfParallelism(4)
.WithExecutionMode(ParallelExecutionMode.ForceParallelism)
.Select((x) =>
{
Thread.Sleep(r.Next(maxDelayInMs));
_itemsToProcess.Add(x);
return x;
}).ToList();
}
This does mean that the consumer is single threaded, but allows for multiple producer threads.
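As a side note, BlockingCollection<T> also exposes GetConsumingEnumerable(), which blocks until an item is available and ends once CompleteAdding() has been called and the collection is drained, so the consumer can be written without the TryTake polling loop; a minimal sketch of the same consumer:
private static void InsertWorker()
{
    Task.Factory.StartNew(() =>
    {
        // Blocks while the collection is empty; the loop exits after
        // CompleteAdding() is called and every item has been consumed.
        foreach (string t in _itemsToProcess.GetConsumingEnumerable())
        {
            Console.WriteLine(t);
        }
    });
}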
From your comment
"I simplified the code shown here, as the data is not a string"
I assume that the info parameter passed into AddDData is a mutable reference type. Make sure the caller is not reusing the same info instance for multiple calls, since that reference is captured in the Task lambda.
Based on the trace that you provided the only logical possibility is that you have called InsertWorker twice (or more). There are thus two background threads waiting for items to appear in the collection and occasionally they both manage to grab an item and begin executing it.
I have an abstract class called VehicleInfoFetcher which returns information asynchronously from a WebClient via this method:
public override async Task<DTOrealtimeinfo> getVehicleInfo(string stopID);
I'd like to combine the results of two separate instances of this class, running each in parallel before combining the results. This is done within a third class, CombinedVehicleInfoFetcher (also itself a subclass of VehicleInfoFetcher)
Here's my code - but I'm not quite convinced that it's running the tasks in parallel; am I doing it right? Could it be optimized?
public class CombinedVehicleInfoFetcher : VehicleInfoFetcher
{
public HashSet<VehicleInfoFetcher> VehicleInfoFetchers { get; set; }
public override async Task<DTOrealtimeinfo> getVehicleInfo(string stopID)
{
// Create a list of parallel tasks to run
var resultTasks = new List<Task<DTOrealtimeinfo>>();
foreach (VehicleInfoFetcher fetcher in VehicleInfoFetchers)
resultTasks.Add(fetcher.getVehicleInfo(stopID));
// run each task
foreach (var task in resultTasks)
await task;
// Wait for all the results to come in
await Task.WhenAll(resultTasks.ToArray());
// combine the results
var allRealtimeResults = new List<DTOrealtimeinfo>( resultTasks.Select(t => t.Result) );
return combineTaskResults(allRealtimeResults);
}
DTOrealtimeinfo combineTaskResults(List<DTOrealtimeinfo> realtimeResults)
{
// ...
return rtInfoOutput;
}
}
Edit
Some very helpful answers, here is a re-written example to aid discussion with usr below:
public override async Task<object> combineResults()
{
// Create a list of parallel tasks to run
var resultTasks = new List<Task<object>>();
foreach (AnotherClass cls in this.OtherClasses)
resultTasks.Add(cls.getResults());
// Point A - have the cls.getResults() methods been called yet?
// Wait for all the results to come in
await Task.WhenAll(resultTasks.ToArray());
// combine the results
return new List<object>(resultTasks.Select(t => t.Result));
}
Almost all tasks start out already started. Probably, whatever fetcher.getVehicleInfo returns is already started. So you can remove:
// run each task
foreach (var task in resultTasks)
await task;
Task.WhenAll is faster and has better error behavior (you want all exceptions to be propagated, not just the first you happen to stumble upon).
Also, await does not start a task. It waits for completion. You have to arrange for the tasks to be started separately, but as I said, almost all tasks are already started when you get them. This is best-practice as well.
To help our discussion in the comments:
Task Test1() { return new Task(() => {}); }
Task Test2() { return Task.Factory.StartNew(() => {}); }
Task Test3() { return new FileStream("").ReadAsync(...); }
Task Test4() { return new TaskCompletionSource<object>().Task; }
1. Does not "run" when returned from the method. It must be started explicitly. Bad practice.
2. Runs when returned. It does not matter what you do with it; it is already running. It is not necessary to add it to a list or store it somewhere.
3. Already running, just like (2).
4. The notion of "running" does not make sense here: this task will never complete (nothing ever sets a result), and it cannot be explicitly started either.
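Putting that together, the original getVehicleInfo override can be reduced to starting every fetch and awaiting WhenAll once; a minimal sketch, assuming each fetcher returns an already-started task as discussed above:
public override async Task<DTOrealtimeinfo> getVehicleInfo(string stopID)
{
    // Each call starts its fetch immediately; WhenAll awaits them all in parallel.
    var results = await Task.WhenAll(
        VehicleInfoFetchers.Select(fetcher => fetcher.getVehicleInfo(stopID)));
    return combineTaskResults(results.ToList());
}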