Transform IEnumerable<Task<T>> asynchronously by awaiting each task - c#

Today I was wondering how to transform a list of Tasks by awaiting each of them.
Consider the following example:
private static void Main(string[] args)
{
try
{
Run(args).GetAwaiter().GetResult(); // block until Run completes so its exceptions reach the catch block
Console.ReadLine();
}
catch (Exception ex)
{
Console.WriteLine(ex.ToString());
Console.ReadLine();
}
}
static async Task Run(string[] args)
{
//Version 1: does compile, but ugly and List<T> overhead
var tasks1 = GetTasks();
List<string> gainStrings1 = new List<string>();
foreach (Task<string> task in tasks1)
{
gainStrings1.Add(await task);
}
Console.WriteLine(string.Join("", gainStrings1));
//Version 2: does not compile
var tasks2 = GetTasks();
IEnumerable<string> gainStrings2 = tasks2.Select(async t => await t);
Console.WriteLine(string.Join("", gainStrings2));
}
static IEnumerable<Task<string>> GetTasks()
{
string[] messages = new[] { "Hello", " ", "async", " ", "World" };
for (int i = 0; i < messages.Length; i++)
{
TaskCompletionSource<string> tcs = new TaskCompletionSource<string>();
tcs.SetResult(messages[i]);
yield return tcs.Task;
}
}
I'd like to transform my list of Tasks without the foreach, but neither the anonymous function syntax nor the usual function syntax allows me to do what my foreach does.
Do I have to rely on my foreach and the List<T>, or is there a way to get it to work with IEnumerable<T> and all its advantages?

What about this:
await Task.WhenAll(tasks1);
var gainStrings = tasks1.Select(t => t.Result).ToList();
Wait for all tasks to finish, then extract the results. This is ideal if you don't care about the order in which they finish.
EDIT2:
Even better way:
var gainStrings = await Task.WhenAll(tasks1);
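A minimal sketch of the same idea as a reusable helper, assuming the input may be a lazy sequence such as the GetTasks() iterator from the question. The names AwaitEachSketch and AwaitAllInOrderAsync are hypothetical. Task.WhenAll returns the results in the order of the input tasks, not in completion order, and the sequence is materialized once so that a lazy iterator is not enumerated a second time.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class AwaitEachSketch
{
    // Hypothetical helper: awaits every task and returns the results in input order.
    public static async Task<T[]> AwaitAllInOrderAsync<T>(IEnumerable<Task<T>> tasks)
    {
        // Materialize first: enumerating a lazy sequence again could create new tasks.
        Task<T>[] started = tasks.ToArray();
        return await Task.WhenAll(started);
    }
}
```

With the question's code this would be used as: var gainStrings = await AwaitEachSketch.AwaitAllInOrderAsync(GetTasks());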

Related

How do I run a method both parallel and sequentially in C#?

I have a C# console app. In this app, I have a method that I will call DoWorkAsync. For the context of this question, this method looks like this:
private async Task<string> DoWorkAsync()
{
System.Threading.Thread.Sleep(5000);
var random = new Random();
var chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
var length = random.Next(10, 101);
await Task.CompletedTask;
return new string(Enumerable.Repeat(chars, length)
.Select(s => s[random.Next(s.Length)]).ToArray());
}
I call DoWorkAsync from another method that determines a) how many times it will be run and b) whether the calls will be run in parallel or sequentially. That method looks like this:
private async Task<Task<string>[]> DoWork(int iterations, bool runInParallel)
{
var tasks = new List<Task<string>>();
for (var i=0; i<iterations; i++)
{
if (runInParallel)
{
var task = Task.Run(() => DoWorkAsync());
tasks.Add(task);
}
else
{
await DoWorkAsync();
}
}
return tasks.ToArray();
}
After all of the tasks are completed, I want to display the results. To do this, I have code that looks like this:
var random = new Random();
var tasks = await DoWork(random.Next(10, 101), runInParallel: true);
Task.WaitAll(tasks);
foreach (var task in tasks)
{
Console.WriteLine(task.Result);
}
This code works as expected if the code runs in parallel (i.e. runInParallel is true). However, when runInParallel is false (i.e. I want to run the Tasks sequentially) the Task array doesn't get populated. So, the caller doesn't have any results to work with. I don't know how to fix it though. I'm not sure how to add the method call as a Task that will run sequentially. I understand that the idea behind Tasks is to run in parallel. However, I have this need to toggle between parallel and sequential.
Thank you!
"the Task array doesn't get populated."
So populate it:
else
{
var task = DoWorkAsync();
tasks.Add(task);
await task;
}
P.S.
Also, your DoWorkAsync looks wrong to me: why Thread.Sleep and not await Task.Delay? Task.Delay is the more correct way to simulate asynchronous work, and with it you no longer need await Task.CompletedTask. And if you expect DoWorkAsync to be CPU-bound, just write it like this:
private Task<string> DoWorkAsync()
{
return Task.Run(() =>
{
// your cpu bound work
return "string";
});
}
After that you can do something like this (for both async/cpu bound work):
private async Task<string[]> DoWork(int iterations, bool runInParallel)
{
if(runInParallel)
{
var tasks = Enumerable.Range(0, iterations)
.Select(i => DoWorkAsync());
return await Task.WhenAll(tasks);
}
else
{
var result = new string[iterations];
for (var i = 0; i < iterations; i++)
{
result[i] = await DoWorkAsync();
}
return result;
}
}
Why is DoWorkAsync an async method?
It isn't currently doing anything asynchronous.
It seems that you are trying to utilise multiple threads to improve the performance of expensive CPU-bound work, so you would be better to make use of Parallel.For, which is designed for this purpose:
private string DoWork()
{
System.Threading.Thread.Sleep(5000);
var random = new Random();
var chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
var length = random.Next(10, 101);
return new string(Enumerable.Repeat(chars, length)
.Select(s => s[random.Next(s.Length)]).ToArray());
}
private string[] DoWork(int iterations, bool runInParallel = false)
{
var results = new string[iterations];
if (runInParallel)
{
// The upper bound of Parallel.For is exclusive, so pass iterations, not iterations - 1.
Parallel.For(0, iterations, i => results[i] = DoWork());
}
else
{
for (int i = 0; i < iterations; i++) results[i] = DoWork();
}
return results;
}
Then:
var random = new Random();
var serial = DoWork(random.Next(10, 101));
var parallel = DoWork(random.Next(10, 101), true);
I think you'd be better off doing the following:
Create a function that creates a (cold) list of tasks (or an array Task<string>[] for instance). No need to run them. Let's call this GetTasks()
var jobs = GetTasks();
Then, if you want to run them "sequentially", just do
var results = new List<string>();
foreach (var job in jobs)
{
job.Start(); // a cold task never completes unless it is started
var result = await job;
results.Add(result);
}
return results;
If you want to run them in parallel:
foreach (var job in jobs)
{
job.Start();
}
var results = await Task.WhenAll(jobs);
Another note,
All this in itself should be a Task<string[]>, the Task<Task<... smells like a problem.
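One way to sketch the "create the jobs first, decide how to run them later" idea without cold tasks is to pass task factories (Func<Task<T>>) instead of Task objects. This is an illustration under assumptions, not the answer's own code: RunModeSketch and RunAsync are hypothetical names, and the parallel branch only overlaps the jobs to the extent that each factory returns before its work has finished.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class RunModeSketch
{
    // Hypothetical helper: runs the jobs either all at once or one after another.
    public static async Task<T[]> RunAsync<T>(
        IReadOnlyList<Func<Task<T>>> jobs, bool runInParallel)
    {
        if (runInParallel)
        {
            // Invoke every factory up front so all operations are in flight together.
            return await Task.WhenAll(jobs.Select(job => job()));
        }

        var results = new T[jobs.Count];
        for (var i = 0; i < jobs.Count; i++)
        {
            // The next job is not even created until the previous one has finished.
            results[i] = await jobs[i]();
        }
        return results;
    }
}
```

Either way the caller receives a plain array of results, which avoids the nested Task<Task<string>[]> return type.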

How to await multiple IAsyncEnumerable

We have code like this:
var intList = new List<int>{1,2,3};
var asyncEnumerables = intList.Select(Foo);
private async IAsyncEnumerable<int> Foo(int a)
{
while (true)
{
await Task.Delay(5000);
yield return a;
}
}
I need to start an await foreach for every entry of asyncEnumerables. The loops should wait for each other on every iteration, and when all of them have completed an iteration I need to collect the data from that iteration and process it with another method.
Can I somehow achieve that with the TPL? If not, could you give me some ideas?
What works for me is the Zip function in this repo (line 81).
I'm using it like this
var intList = new List<int> { 1, 2, 3 };
var asyncEnumerables = intList.Select(RunAsyncIterations);
var enumerableToIterate = async_enumerable_dotnet.AsyncEnumerable.Zip(s => s, asyncEnumerables.ToArray());
await foreach (int[] enumerablesConcatenation in enumerableToIterate)
{
Console.WriteLine(enumerablesConcatenation.Sum()); //Sum returns 6
await Task.Delay(2000);
}
static async IAsyncEnumerable<int> RunAsyncIterations(int i)
{
while (true)
yield return i;
}
Here is a generic method Zip you could use, implemented as an iterator. The cancellationToken is decorated with the EnumeratorCancellation attribute, so that the resulting IAsyncEnumerable is WithCancellation friendly.
using System.Runtime.CompilerServices;
public static async IAsyncEnumerable<TSource[]> Zip<TSource>(
IEnumerable<IAsyncEnumerable<TSource>> sources,
[EnumeratorCancellation]CancellationToken cancellationToken = default)
{
var enumerators = sources
.Select(x => x.GetAsyncEnumerator(cancellationToken))
.ToArray();
try
{
while (true)
{
var array = new TSource[enumerators.Length];
for (int i = 0; i < enumerators.Length; i++)
{
if (!await enumerators[i].MoveNextAsync()) yield break;
array[i] = enumerators[i].Current;
}
yield return array;
}
}
finally
{
foreach (var enumerator in enumerators)
{
await enumerator.DisposeAsync();
}
}
}
Usage example:
await foreach (int[] result in Zip(asyncEnumerables))
{
Console.WriteLine($"Result: {String.Join(", ", result)}");
}
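The Zip above awaits each enumerator's MoveNextAsync in turn, so with sources like the question's Foo (a 5-second delay per item) the delays of one round add up. A possible variation, sketched here under the assumption that the sources are independent of each other, starts MoveNextAsync on every enumerator first and then awaits them together. AsyncZipSketch and ZipConcurrent are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncZipSketch
{
    public static async IAsyncEnumerable<TSource[]> ZipConcurrent<TSource>(
        IEnumerable<IAsyncEnumerable<TSource>> sources,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        var enumerators = sources
            .Select(s => s.GetAsyncEnumerator(cancellationToken))
            .ToArray();
        if (enumerators.Length == 0) yield break; // nothing to zip
        try
        {
            while (true)
            {
                // Start MoveNextAsync on every enumerator before awaiting any of them.
                var moves = enumerators
                    .Select(e => e.MoveNextAsync().AsTask())
                    .ToArray();
                bool[] moved = await Task.WhenAll(moves);

                // Stop as soon as any source is exhausted, like a regular Zip.
                if (moved.Any(m => !m)) yield break;

                yield return enumerators.Select(e => e.Current).ToArray();
            }
        }
        finally
        {
            foreach (var enumerator in enumerators)
            {
                await enumerator.DisposeAsync();
            }
        }
    }
}
```

Each enumerator still has at most one MoveNextAsync call pending at a time; only calls on different enumerators overlap.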

Restricting the enumerations of LINQ queries to One Only

I have a LINQ query that should NOT be enumerated more than once, and I want to avoid enumerating it twice by mistake. Is there any extension method I can use to ensure that I am protected from such a mistake? I am thinking about something like this:
var numbers = Enumerable.Range(1, 10).OnlyOnce();
Console.WriteLine(numbers.Count()); // shows 10
Console.WriteLine(numbers.Count()); // throws InvalidOperationException: The query cannot be enumerated more than once.
The reason I want this functionality is that I have an enumerable of tasks that is intended to instantiate and run the tasks progressively, while it is enumerated slowly under control. I already made the mistake of running the tasks twice because I forgot that it's a deferred enumerable and not an array.
var tasks = Enumerable.Range(1, 10).Select(n => Task.Run(() => Console.WriteLine(n)));
Task.WaitAll(tasks.ToArray()); // Let's wait for the tasks to finish...
Console.WriteLine(String.Join(", ", tasks.Select(t => t.Id))); // Let's see the completed task IDs...
// Oops! A new set of tasks started running!
"I want to avoid enumerating it twice by mistake."
You can wrap the collection in a collection that throws if it is enumerated twice.
E.g.:
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
namespace ConsoleApp8
{
public static class EnumExtension
{
class OnceEnumerable<T> : IEnumerable<T>
{
IEnumerable<T> col;
bool hasBeenEnumerated = false;
public OnceEnumerable(IEnumerable<T> col)
{
this.col = col;
}
public IEnumerator<T> GetEnumerator()
{
if (hasBeenEnumerated)
{
throw new InvalidOperationException("This collection has already been enumerated.");
}
this.hasBeenEnumerated = true;
return col.GetEnumerator();
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
public static IEnumerable<T> OnlyOnce<T>(this IEnumerable<T> col)
{
return new OnceEnumerable<T>(col);
}
}
class Program
{
static void Main(string[] args)
{
var col = Enumerable.Range(1, 10).OnlyOnce();
var colCount = col.Count(); //first enumeration
foreach (var c in col) //second enumeration
{
Console.WriteLine(c);
}
}
}
}
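The wrapper above uses a plain bool, so two threads calling GetEnumerator at the same moment could both pass the check. If that matters, one option is to flip the flag atomically with Interlocked.Exchange. The following is a sketch of that variation only; OnlyOnceSketch and OnlyOnceThreadSafe are hypothetical names.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public static class OnlyOnceSketch
{
    private sealed class OnceEnumerable<T> : IEnumerable<T>
    {
        private readonly IEnumerable<T> _source;
        private int _enumerated; // 0 = not yet enumerated, 1 = already enumerated

        public OnceEnumerable(IEnumerable<T> source) => _source = source;

        public IEnumerator<T> GetEnumerator()
        {
            // Atomically set the flag; only the first caller sees the old value 0.
            if (Interlocked.Exchange(ref _enumerated, 1) == 1)
            {
                throw new InvalidOperationException(
                    "The query cannot be enumerated more than once.");
            }
            return _source.GetEnumerator();
        }

        IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
    }

    public static IEnumerable<T> OnlyOnceThreadSafe<T>(this IEnumerable<T> source)
        => new OnceEnumerable<T>(source ?? throw new ArgumentNullException(nameof(source)));
}
```

Usage mirrors the question: Enumerable.Range(1, 10).OnlyOnceThreadSafe() can be counted once, and a second Count() throws InvalidOperationException.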
Enumerables enumerate, end of story. You just need to call ToList, or ToArray
// this will enumerate and start the tasks
var tasks = Enumerable.Range(1, 10)
.Select(n => Task.Run(() => Console.WriteLine(n)))
.ToList();
// wait for them all to finish
Task.WaitAll(tasks.ToArray());
Console.WriteLine(String.Join(", ", tasks.Select(t => t.Id)));
Or, if you want parallelism:
Parallel.For(0, 100, index => Console.WriteLine(index) );
Or, if you are using the async/await pattern:
public static async Task DoWorkLoads(IEnumerable <Something> results)
{
var options = new ExecutionDataflowBlockOptions
{
MaxDegreeOfParallelism = 50
};
var block = new ActionBlock<Something>(MyMethodAsync, options);
foreach (var result in results)
block.Post(result);
block.Complete();
await block.Completion;
}
...
public async Task MyMethodAsync(Something result)
{
await SomethingAsync(result);
}
Update: since you are after a way to control the maximum degree of concurrency, you could use this:
public static async Task<IEnumerable<Task>> ExecuteInParallel<T>(this IEnumerable<T> collection,Func<T, Task> callback,int degreeOfParallelism)
{
var queue = new ConcurrentQueue<T>(collection);
var tasks = Enumerable.Range(0, degreeOfParallelism)
.Select(async _ =>
{
while (queue.TryDequeue(out var item))
await callback(item);
})
.ToArray();
await Task.WhenAll(tasks);
return tasks;
}
Rx certainly is an option to control parallelism.
var query =
Observable
.Range(1, 10)
.Select(n => Observable.FromAsync(() => Task.Run(() => new { Id = n })));
var tasks = query.Merge(maxConcurrent: 3).ToArray().Wait();
Console.WriteLine(String.Join(", ", tasks.Select(t => t.Id)));

Wait for all Threads

I have a little problem with threads in this code.
I just want to run a lot of tasks together, and continue when all of them have finished.
while (true)
{
// Run tasks together:
foreach (object T in objectsList)
{
if (T.something>0)
{
var task = Task.Factory.StartNew(() => T.RunObject());
task.ContinueWith(delegate { ChangeObject(T, 1); }, TaskContinuationOptions.NotOnFaulted);
}
}
// <-- Here I want to wait for all the tasks to finish.
// I know it's task.Wait(), but how do I WaitAll()?
System.Threading.Thread.Sleep(this.GetNextTime());
var RefreshObjects = new Task(loadObjectsList); RefreshObjects .Start(); RefreshObjects.Wait();
}
I don't know how many objects will be in objectsList and I don't know if T.something will be > 0.
so I can't just use:
Task[] Tasks = new Task[objectsList.Count()];
for (int T=0; T<objectsList.Count(); ++T)
{
if (objectsList[T].something>0)
{
var task = Task.Factory.StartNew(() => objectsList[T].RunObject());
task.ContinueWith(delegate { ChangeObject(objectsList[T], 1); }, ...);
Tasks[T] = task;
}
}
Task.WaitAll(Tasks);
Because Tasks will contain nulls whenever objectsList[T].something is not > 0...
Thanks for any advice!
Just switch the condition and create a list of tasks only for the objects that match your criteria.
var tasks = objectsList
.Where(x => x.Something() > 0)
.Select(x => {
var task = Task.Factory.StartNew(() => x.RunObject());
task.ContinueWith(t => ChangeObject(....));
return task;
})
.ToArray();
Task.WaitAll(tasks);
Your code sample just waits for RunObject() to complete! If that is what you want, skip the rest of my answer. If you also want to wait for the continuation to complete, you can use this:
var tasks = objectsList
.Where(x => x.Something() > 0)
.Select(x => Task.Factory.StartNew(() => x.RunObject()).ContinueWith(t => ChangeObject(....)))
.ToArray();
Task.WaitAll(tasks);
because ContinueWith generates a new Task.
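With async/await the same "filter, run, follow up, wait for everything" flow can be written without ContinueWith. This is a sketch under assumptions rather than a drop-in replacement: WaitAllSketch and RunFilteredAsync are hypothetical names, and the filter, work and onSuccess delegates stand in for the question's something > 0 check, RunObject and ChangeObject.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class WaitAllSketch
{
    // Runs `work` for every item that passes `filter`, then `onSuccess` for that item.
    // The returned task completes once all items, including the follow-up step, are done.
    public static Task RunFilteredAsync<T>(
        IEnumerable<T> items, Func<T, bool> filter, Action<T> work, Action<T> onSuccess)
    {
        var tasks = items
            .Where(filter)
            .Select(async item =>
            {
                await Task.Run(() => work(item));
                onSuccess(item); // only reached if work did not throw
            })
            .ToArray(); // materialize so that every task is started exactly once

        return Task.WhenAll(tasks);
    }
}
```

Because onSuccess runs after the await, it is skipped when work throws, which is similar in spirit to TaskContinuationOptions.NotOnFaulted in the question.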
If objectsList implements IEnumerable (as an array does) and there are fewer than 64 objects in the list (WaitHandle.WaitAll accepts at most 64 handles), you can use this:
public delegate void SyncDelegatesInParallelDelegate<in T>(T item);
public static class ParallelLinqExtensions
{
public static void SyncDelegatesInParallel<T>(
this IEnumerable<T> list,
SyncDelegatesInParallelDelegate<T> action)
{
var foundCriticalException = false;
Exception exception = null;
var waitHndls = new List<WaitHandle>();
foreach (var item in list)
{
// Temp copy of session for modified closure
var localItem = item;
var txEvnt = new ManualResetEvent(false);
// Temp copy of session for closure
ThreadPool.QueueUserWorkItem(
depTx =>
{
try { if (!foundCriticalException) action(localItem); }
catch (Exception gX)
{ exception = gX; foundCriticalException = true; }
finally { txEvnt.Set(); }
}, null);
waitHndls.Add(txEvnt);
}
if (waitHndls.Count > 0) WaitHandle.WaitAll(waitHndls.ToArray());
if (exception != null) throw exception;
}
}
You would call it like this:
objectsList.SyncDelegatesInParallel(item => ChangeObject(item, 1));

Waiting on multiple background threads

I want to know when all threads have finished in a multithreaded program,
without something like polling:
while(!allThreadFinished){
Thread.Sleep(100);
}
The solution should probably use Monitor, but I don't know how to verify that it is correct.
Since SomeMethod in the following code uses the network, it takes time.
public object SomeMethod(string input);
public object[] MultiThreadMethod(string[] inputs) {
var result = new object[inputs.Count()];
int i = 0;
foreach (var item in inputs) {
BackgroundWorker work = new BackgroundWorker();
work.DoWork += (sender, doWorkEventArgs) => { doWorkEventArgs.Result = SomeMethod(item); };
work.RunWorkerCompleted += (sender, runWorkerCompletedEventArgs) => {
result[i] = runWorkerCompletedEventArgs.Result;
};
i++;
work.RunWorkerAsync();
}
/////////////////////////////////////////////////////////////
//**wait while all thread has been completed**
/////////////////////////////////////////////////////////////
return result;
}
Try using the TPL http://msdn.microsoft.com/en-us/library/dd460717.aspx.
List<Task> tasks = new List<Task>();
Task t1 = new Task(() =>
{
// Do something here...
});
t1.Start();
tasks.Add(t1);
Task t2 = new Task(() =>
{
// Do something here...
});
t2.Start();
tasks.Add(t2);
Task.WaitAll(tasks.ToArray());
You can use TPL to do the same, you will avoid using Thread.Sleep(), and it will be much clearer. Check this out: http://msdn.microsoft.com/en-us/library/dd537610.aspx
Your example with TPL would look like this (untested code):
private ConcurrentBag<object> _results;
public object[] MultiThreadMethod(string[] inputs)
{
_results = new ConcurrentBag<object>();
var tasks = new Task[inputs.Length];
for (int i = 0; i < inputs.Length; i++)
{
var input = inputs[i]; // copy: the lambda must not capture the loop variable i
tasks[i] = Task.Factory.StartNew(() => DoWork(input));
}
Task.WaitAll(tasks);
return _results.ToArray();
}
private void DoWork(string item)
{
_results.Add(SomeMethod(item));
}
EDIT: Without ConcurrentBag:
public object[] MultiThreadMethod(string[] inputs)
{
var tasks = new Task<object>[inputs.Length];
for (int i = 0; i < inputs.Length; i++)
{
var input = inputs[i]; // copy: the lambda must not capture the loop variable i
tasks[i] = Task<object>.Factory.StartNew(() => DoWork(input));
}
Task.WaitAll(tasks);
return tasks.Select(task => task.Result).ToArray();
}
private object DoWork(string item)
{
return SomeMethod(item);
}
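If the caller can be asynchronous, the blocking Task.WaitAll can be replaced by awaiting Task.WhenAll. The sketch below assumes a synchronous, possibly slow function like the question's SomeMethod, passed in as a delegate. MultiInputSketch and RunAllAsync are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class MultiInputSketch
{
    // Runs someMethod for every input on the thread pool and returns
    // the results in the same order as the inputs.
    public static async Task<TResult[]> RunAllAsync<TInput, TResult>(
        IReadOnlyList<TInput> inputs, Func<TInput, TResult> someMethod)
    {
        // Select gives each lambda its own `input`, so no loop variable is captured.
        Task<TResult>[] tasks = inputs
            .Select(input => Task.Run(() => someMethod(input)))
            .ToArray();

        return await Task.WhenAll(tasks);
    }
}
```

A call could look like: object[] results = await MultiInputSketch.RunAllAsync(inputs, SomeMethod);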
Hook the RunWorkerCompleted event on the BackgroundWorker. It will fire when the work is done.
A complete example of how to use the BackgroundWorker properly can be found here.
http://msdn.microsoft.com/en-us/library/dd537608.aspx
// Sequential version
foreach (var item in sourceCollection)
Process(item);
// Parallel equivalent
Parallel.ForEach(sourceCollection, item => Process(item));
