Enumerable foreach extend - c#

I created an extension method on IEnumerable that tries to execute an action as quickly as possible. Inside the method I loop over the items, and as soon as one of them executes the action within a given timeout, I return.
Now I want to make the output generic, because the return type of the action will differ. Any advice on how to do this?
The IEnumerable holds processes. It works like load balancing: if the first one does not respond, the second one should. I want to return the output of the input action.
public static class EnumerableExtensions
{
public static void ForEach<T>(this IEnumerable<T> source, Action action, int timeOut)
{
foreach (T element in source)
{
lock (source)
{
// Loop for all connections and get the fastest responsive proxy
foreach (var mxAccessProxy in source)
{
try
{
// check for the health
Task executionTask = Task.Run(action);
if (executionTask.Wait(timeOut))
{
return ;
}
}
catch
{
//ignore
}
}
}
}
}
}
This code is called like this:
_proxies.ForEach(certainaction, timeOut);

This will enhance the performance and code readability.
No, it definitely won't :) Moreover, this code introduces further problems, such as redundant locking and exception swallowing, and it doesn't actually execute anything in parallel.
It seems like you want to get the fastest possible call for your Action using some sort of proxy objects. For that you need to run the Tasks concurrently, not one after another with .Wait().
Something like this could be helpful for you:
public static class TaskExtensions
{
public static TReturn ParallelSelectReturnFastest<TPoolObject, TReturn>(this TPoolObject[] pool,
Func<TPoolObject, CancellationToken, TReturn> func,
int? timeout = null)
{
var ctx = new CancellationTokenSource();
// for every object in pool schedule a task
Task<TReturn>[] tasks = pool
.Select(poolObject =>
{
ctx.Token.ThrowIfCancellationRequested();
return Task.Factory.StartNew(() => func(poolObject, ctx.Token), ctx.Token);
})
.ToArray();
// not sure if Cast is actually needed,
// just to get rid of co-variant array conversion
int firstCompletedIndex = timeout.HasValue
? Task.WaitAny(tasks.Cast<Task>().ToArray(), timeout.Value, ctx.Token)
: Task.WaitAny(tasks.Cast<Task>().ToArray(), ctx.Token);
// we need to cancel token to avoid unnecessary work to be done
ctx.Cancel();
if (firstCompletedIndex == -1) // no objects in pool managed to complete action in time
throw new NotImplementedException(); // custom exception goes here
return tasks[firstCompletedIndex].Result;
}
}
Now, you can use this extension method to call a specific action on any pool of objects and get the first executed result:
var pool = new[] { 1, 2, 3, 4, 5 };
var result = pool.ParallelSelectReturnFastest((x, token) => {
Thread.Sleep(x * 200);
token.ThrowIfCancellationRequested();
Console.WriteLine("calculate");
return x * x;
}, 1000); // the timeout must be longer than the fastest task (200 ms), or WaitAny returns -1
Console.WriteLine(result);
It outputs:
calculate
1
This is because the first task completes its work in 200 ms and its result is returned, while all other tasks are cancelled through the cancellation token.
In your case it will be something like:
var actionResponse = proxiesList.ParallelSelectReturnFastest((proxy, token) => {
token.ThrowIfCancellationRequested();
return proxy.SomeAction();
});
Some things to mention:
Make sure that your actions are safe to run more than once. You can't rely on how many of them will actually get to execute your action. If the action is CreateItem, you can end up with many items created through different proxies.
It cannot guarantee that all of these actions will run in parallel, because it is up to the TPL to choose the optimal number of running tasks.
I have implemented this in the old-fashioned TPL way, because your original question used it. If possible, you should switch to async/await. In that case your Func will return tasks, and you need to use await Task.WhenAny(tasks) instead of Task.WaitAny().
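The async/await variant mentioned in the last point could look roughly like the sketch below. It is only an illustration under assumptions: the names AsyncPoolExtensions, SelectFastestAsync and FastestDemo are made up here, the timeout is modelled with Task.Delay, and the first task to complete wins even if it completed by failing (awaiting it then rethrows its exception).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncPoolExtensions
{
    // Hypothetical helper: returns the result of whichever call completes first.
    public static async Task<TReturn> SelectFastestAsync<TPoolObject, TReturn>(
        this IEnumerable<TPoolObject> pool,
        Func<TPoolObject, CancellationToken, Task<TReturn>> func,
        TimeSpan timeout)
    {
        var cts = new CancellationTokenSource();
        try
        {
            // Start one asynchronous call per pool object.
            List<Task<TReturn>> calls = pool.Select(p => func(p, cts.Token)).ToList();
            Task timeoutTask = Task.Delay(timeout, cts.Token);

            // WhenAny completes as soon as any call, or the timeout, completes.
            Task first = await Task.WhenAny(calls.Cast<Task>().Append(timeoutTask));
            if (first == timeoutTask)
                throw new TimeoutException("No pool object answered in time.");

            // Awaiting the winner rethrows its exception if it failed.
            return await (Task<TReturn>)first;
        }
        finally
        {
            cts.Cancel(); // ask the remaining calls (and the timer) to stop
        }
    }
}

public static class FastestDemo
{
    // Three simulated proxies with different latencies; the 50 ms one should win.
    public static Task<int> RunAsync() =>
        new[] { 300, 50, 200 }.SelectFastestAsync(
            async (delayMs, token) =>
            {
                await Task.Delay(delayMs, token); // simulated proxy latency
                return delayMs;
            },
            TimeSpan.FromSeconds(5));
}
```

No thread is blocked while the calls are pending, provided func is genuinely asynchronous. If func does synchronous work before its first await, that part still runs one pool object after another.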

Related

How do you set up some function to run in a background thread but keep the normal return type instead of the Task<int>?

I have a function in one service that gets the count of all files inside a directory. Another service takes that int and does some work with it.
public int GetNumberOfAvatarsInFile()
{
try
{
var path = GetAvatarsFilePath();
var numberOfAvatars = Directory.GetFiles(path, "*.*", SearchOption.TopDirectoryOnly).Length;
return numberOfAvatars;
}
catch (Exception exception)
{
var message = $"Error While Getting Total Numbers Of Avatars at {DateTime.Now}\n\t" +
$"Error: {JsonConvert.SerializeObject(exception)}";
sentryClient.CaptureMessage(message, SentryLevel.Error);
return 1;
}
}
private string GetAvatarsFilePath()
{
var webRootPath = webHostEnvironment.WebRootPath;
var path = Path.Combine(webRootPath, "path");
return path;
}
The other service will use this function like this
private int GetMaximumAvatarId() => avatarService.GetNumberOfAvatarsInFile();
How do I set this up so that all of the file-listing logic and the path combining runs on a background thread, using Task.Run or something similar?
When I try to implement GetNumberOfAvatarsInFile() with await Task.Run(async () => LOGIC+Return int);
I have to return a Task rather than an int, and so does the other service that calls it. That is not desirable, since that is other people's code and I should not change it. Also, as far as I know, none of the Path.Combine and Directory functions are awaitable.
Is there a way to implement this?
As mentioned in the comments, the best practice is to provide async methods to the caller and use async all the way (see this article). However there are 2 things that can already be done:
1. Make your I/O method run asynchronously in a separate thread.
2. Have callers call your method asynchronously even if the implementation is synchronous.
The implementations on client side and on service side are independent. Here is a commented example that I hope shows how to do this. Most of the code below is unnecessary and is there only to illustrate what happens when multiple callers call your method and what is executed when. You may change the Thread.Sleep() values to simulate different execution time.
I also added a side note regarding the value you return in the Exception, that does not look ok to me.
public class Program
{
public static void Main()
{
// These simulate 3 callers calling your service at different times.
var t1 = Task.Run(() => GetMaximumAvatarId(1));
Thread.Sleep(100);
var t2 = Task.Run(() => GetMaximumAvatarId(2));
Thread.Sleep(2000);
var t3 = Task.Run(() => GetMaximumAvatarId(3));
// Example purposes.
Task.WaitAll(t1, t2, t3);
Console.WriteLine("MAIN: Done.");
Console.ReadKey();
}
// This is a synchronous call on the client side. This could very well be implemented
// as an asynchronous call, even if the service method is synchronous, by using a
// Task and having the caller await for it (GetMaximumAvatarIdAsync).
public static int GetMaximumAvatarId(int callerId)
{
Console.WriteLine($"CALLER {callerId}: Calling...");
var i = GetNumberOfAvatarsInFile(callerId);
Console.WriteLine($"CALLER {callerId}: Done -> there are {i} files.");
return i;
}
// This method has the same signature as yours. It's synchronous in the sense that it
// does not return an awaitable. However it now uses `Task.Run` in order to execute
// `Directory.GetFiles` in a threadpool thread, which allows to run other code in
// parallel (in this example `Sleep` calls, in real life useful code). It finally
// blocks waiting for the result of the task, then returns it to the caller as an int.
// The "callerId" is for the example only, you may remove this everywhere.
public static int GetNumberOfAvatarsInFile(int callerId)
{
Console.WriteLine($" SERVICE: Called by {callerId}...");
var path = GetAvatarsFilePath();
var t = Task.Run(() => Directory.GetFiles(path, "*.*", SearchOption.TopDirectoryOnly).Length);
// Simulate long work for a caller, showing the caller.
Console.WriteLine($" SERVICE: Working for {callerId}...");
Thread.Sleep(500);
Console.WriteLine($" SERVICE: Working for {callerId}...");
Thread.Sleep(500);
Console.WriteLine($" SERVICE: Working for {callerId}...");
Thread.Sleep(500);
Console.WriteLine($" SERVICE: Blocking for {callerId} until task completes.");
return t.Result; // Returns an int.
// --------------------------------------------------------
// Side note: you should return `-1` in the `Exception`.
// Otherwise it is impossible for the caller to know if there was an error or
// if there is 1 avatar in the file.
// --------------------------------------------------------
}
// Unchanged.
private string GetAvatarsFilePath()
{
var webRootPath = webHostEnvironment.WebRootPath;
var path = Path.Combine(webRootPath, "path");
return path;
}
}
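The comments in the example mention two ideas that it does not spell out: a caller-side GetMaximumAvatarIdAsync, and returning -1 on error. A minimal sketch of both follows. The class name AvatarCounter and the path parameter are assumptions made so the sketch stands alone; the Sentry logging from the question is left out.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class AvatarCounter
{
    // Synchronous service method, same idea as in the question.
    // The path is passed in (an assumption) instead of being read from webHostEnvironment.
    public static int GetNumberOfAvatarsInFile(string path)
    {
        try
        {
            return Directory.GetFiles(path, "*.*", SearchOption.TopDirectoryOnly).Length;
        }
        catch (Exception)
        {
            // -1 cannot be confused with a real count, unlike the 1 in the question.
            return -1;
        }
    }

    // Caller-side wrapper: the service stays synchronous, but a caller that is
    // allowed to be async can await this instead of blocking its own thread.
    public static Task<int> GetMaximumAvatarIdAsync(string path) =>
        Task.Run(() => GetNumberOfAvatarsInFile(path));
}
```

Callers that must keep the int signature can continue to call GetNumberOfAvatarsInFile directly; only callers that can change gain anything from the wrapper.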

Using cancellation token properly in c#

I was recently exposed to C# language and was working on getting data out of cassandra so I was working with below code which gets data from Cassandra and it works fine.
Only problem I have is in my ProcessCassQuery method - I am passing CancellationToken.None to my requestExecuter Function which might not be the right thing to do. What should be the right way to handle that case and what should I do to handle it correctly?
/**
*
* Below method does multiple async calls on each table for their corresponding id's by limiting it down using Semaphore.
*
*/
private async Task<List<T>> ProcessCassQueries<T>(IList<int> ids, Func<CancellationToken, int, Task<T>> mapperFunc, string msg) where T : class
{
var tasks = ids.Select(async id =>
{
await semaphore.WaitAsync();
try
{
return await ProcessCassQuery(ct => mapperFunc(ct, id), msg);
}
finally
{
semaphore.Release();
}
});
return (await Task.WhenAll(tasks)).Where(e => e != null).ToList();
}
// this might not be good idea to do it. how can I improve below method?
private Task<T> ProcessCassQuery<T>(Func<CancellationToken, Task<T>> requestExecuter, string msg) where T : class
{
return requestExecuter(CancellationToken.None);
}
As said in the official documentation, the cancellation token allows propagating a cancellation signal. This can be useful for example, to cancel long-running operations that for some reason do not make sense anymore or that are simply taking too long.
The CancellationTokenSource lets you obtain a token that you can pass to requestExecuter. It also provides the means to cancel a running Task.
private CancellationTokenSource cts = new CancellationTokenSource();
// ...
private Task<T> ProcessCassQuery<T>(Func<CancellationToken, Task<T>> requestExecuter, string msg) where T : class
{
return requestExecuter(cts.Token);
}
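An alternative to a field-level source is to let the caller supply the token and pass it all the way down, including to the semaphore wait. The sketch below assumes a class name (CassQueryRunner) and a semaphore limit of 2 purely for illustration, and it drops the msg parameter; it is not the asker's actual class.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public class CassQueryRunner
{
    // Assumed concurrency limit, for illustration only.
    private readonly SemaphoreSlim semaphore = new SemaphoreSlim(2);

    public async Task<List<T>> ProcessCassQueries<T>(
        IList<int> ids,
        Func<CancellationToken, int, Task<T>> mapperFunc,
        CancellationToken token) where T : class
    {
        var tasks = ids.Select(async id =>
        {
            // Stop waiting for a free slot as soon as the caller cancels.
            await semaphore.WaitAsync(token);
            try
            {
                // The same token reaches the code that actually talks to the database.
                return await mapperFunc(token, id);
            }
            finally
            {
                semaphore.Release();
            }
        });
        return (await Task.WhenAll(tasks)).Where(e => e != null).ToList();
    }
}
```

With this shape the decision about when to cancel (a timeout, a user action, an aborted request) stays with the caller, which creates the CancellationTokenSource and passes cts.Token in.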
Example
Let's take a look at a different minimal/dummy example so we can look at the inside of it.
Consider the following method, GetSomethingAsync, which yields an incrementing integer every second.
The call to token.ThrowIfCancellationRequested makes sure an OperationCanceledException is thrown if the process has already been cancelled by an outside action; Task.Delay(1000, token) throws the derived TaskCanceledException when the token is cancelled during the delay. Other approaches can be taken, for example checking whether token.IsCancellationRequested is true and doing something about it.
private static async IAsyncEnumerable<int> GetSomethingAsync(CancellationToken token)
{
Console.WriteLine("starting to get something");
token.ThrowIfCancellationRequested();
for (var i = 0; i < 100; i++)
{
await Task.Delay(1000, token);
yield return i;
}
Console.WriteLine("finished getting something");
}
Now let's build the main method to call the above method.
public static async Task Main()
{
var cts = new CancellationTokenSource();
// cancel it after 3 seconds, just for demo purposes
cts.CancelAfter(3000);
// or: Task.Delay(3000).ContinueWith(_ => { cts.Cancel(); });
await foreach (var i in GetSomethingAsync(cts.Token))
{
Console.WriteLine(i);
}
}
If we run this, we will get an output that should look like:
starting to get something
0
1
Unhandled exception. System.Threading.Tasks.TaskCanceledException: A task was canceled.
Of course, this is just a dummy example. The cancellation could be triggered by a user action or by some other event; it does not have to be a timer.
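If the unhandled exception in the output above is not wanted, the consumer can catch OperationCanceledException around the loop. The sketch below is a shortened variant of the example, not the original code: the names CancellationDemo, CountAsync and RunAsync are made up, and the delays are reduced to 100 ms and 350 ms so it finishes quickly.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

public static class CancellationDemo
{
    // Same idea as GetSomethingAsync, with a shorter delay.
    private static async IAsyncEnumerable<int> CountAsync(
        [EnumeratorCancellation] CancellationToken token = default)
    {
        for (var i = 0; i < 100; i++)
        {
            await Task.Delay(100, token); // throws when the token is cancelled
            yield return i;
        }
    }

    // Returns how many items arrived before the cancellation took effect.
    public static async Task<int> RunAsync()
    {
        using var cts = new CancellationTokenSource();
        cts.CancelAfter(350);
        var received = 0;
        try
        {
            await foreach (var i in CountAsync(cts.Token))
                received++;
        }
        catch (OperationCanceledException)
        {
            // TaskCanceledException derives from OperationCanceledException,
            // so this handler covers both.
            Console.WriteLine($"cancelled after {received} items");
        }
        return received;
    }
}
```

Catching the base type is the usual choice here, since it handles both the exception thrown by ThrowIfCancellationRequested and the one thrown by Task.Delay.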

How can I asynchronously transform one IEnumerable to another, just like LINQ's Select(), but using await on every transformed item?

Consider this situation:
class Product { }
interface IWorker
{
Task<Product> CreateProductAsync();
}
I am now given an IEnumerable<IWorker> workers and am supposed to create an IEnumerable<Product> from it that I have to pass to some other function that I cannot alter:
void CheckProducts(IEnumerable<Product> products);
This method needs to have access to the entire IEnumerable<Product>. It is not possible to subdivide it and call CheckProducts on multiple subsets.
One obvious solution is this:
CheckProducts(workers.Select(worker => worker.CreateProductAsync().Result));
But this is blocking, of course, and hence it would only be my last resort.
Syntactically, I need precisely this, just without blocking.
I cannot use await inside of the function I'm passing to Select() as I would have to mark it as async and that would require it to return a Task itself and I would have gained nothing. In the end I need an IEnumerable<Product> and not an IEnumerable<Task<Product>>.
It is important to know that the order of the workers creating their products does matter, their work must not overlap. Otherwise, I would do this:
async Task<IEnumerable<Product>> CreateProductsAsync(IEnumerable<IWorker> workers)
{
var tasks = workers.Select(worker => worker.CreateProductAsync());
return await Task.WhenAll(tasks);
}
But unfortunately, Task.WhenAll() executes some tasks in parallel while I need them executed sequentially.
Here is one possibility to implement it if I had an IReadOnlyList<IWorker> instead of an IEnumerable<IWorker>:
async Task<IEnumerable<Product>> CreateProductsAsync(IReadOnlyList<IWorker> workers)
{
var resultList = new Product[workers.Count];
for (int i = 0; i < resultList.Length; ++i)
resultList[i] = await workers[i].CreateProductAsync();
return resultList;
}
But I must deal with an IEnumerable and, even worse, it is usually quite huge, sometimes it is even unlimited, yielding workers forever. If I knew that its size was decent, I would just call ToArray() on it and use the method above.
The ultimate solution would be this:
async Task<IEnumerable<Product>> CreateProductsAsync(IEnumerable<IWorker> workers)
{
foreach (var worker in workers)
yield return await worker.CreateProductAsync();
}
But yield and await are incompatible as described in this answer. Looking at that answer, would that hypothetical IAsyncEnumerator help me here? Does something similar meanwhile exist in C#?
A summary of the issues I'm facing:
I have a potentially endless IEnumerable<IWorker>
I want to asynchronously call CreateProductAsync() on each of them in the same order as they are coming in
In the end I need an IEnumerable<Product>
A summary of what I already tried, but doesn't work:
I cannot use Task.WhenAll() because it executes tasks in parallel.
I cannot use ToArray() and process that array manually in a loop because my sequence is sometimes endless.
I cannot use yield return because it's incompatible with await.
Does anybody have a solution or workaround for me?
Otherwise I will have to use that blocking code...
IEnumerator<T> is a synchronous interface, so blocking is unavoidable if CheckProducts enumerates the next product before the next worker has finished creating the product.
Nevertheless, you can achieve parallelism by creating products on another thread, adding them to a BlockingCollection<T>, and yielding them on the main thread:
static IEnumerable<Product> CreateProducts(IEnumerable<IWorker> workers)
{
var products = new BlockingCollection<Product>(3);
Task.Run(async () => // On the thread pool...
{
try
{
foreach (IWorker worker in workers)
{
Product product = await worker.CreateProductAsync(); // Create products serially.
products.Add(product); // Enqueue the product, blocking if the queue is full.
}
}
finally
{
// Always notify GetConsumingEnumerable that we're done; otherwise an exception
// in a worker would leave the consumer blocked forever.
products.CompleteAdding();
}
});
return products.GetConsumingEnumerable();
}
To avoid unbounded memory consumption, you can optionally specify the capacity of the queue as a constructor argument to BlockingCollection<T>. I used 3 in the code above.
The Situation:
Here you're saying you need to do this synchronously, because IEnumerable doesn't support async and the requirements are you need an IEnumerable<Product>.
I am now given an IEnumerable workers and am supposed to
create an IEnumerable from it that I have to pass to some
other function that I cannot alter:
Here you say the entire product set needs to be processed at the same time, presumably making a single call to void CheckProducts(IEnumerable<Product> products).
This methods needs to check the entire Product set as a whole. It is
not possible to subdivide the result.
And here you say the enumerable can yield an indefinite number of items
But I must deal with an IEnumerable and, even worse, it is usually
quite huge, sometimes it is even unlimited, yielding workers forever.
If I knew that its size was decent, I would just call ToArray() on it
and use the method above.
So let's put these together. You need to do asynchronous processing of an indefinite number of items within a synchronous environment and then evaluate the entire set as a whole... synchronously.
The Underlying Problems:
1: To evaluate a set as a whole, it must be completely enumerated. To completely enumerate a set, it must be finite. Therefore it is impossible to evaluate an infinite set as a whole.
2: Switching back and forth between sync and async forces the async code to run synchronously. That might be OK from a requirements perspective, but from a technical perspective it can cause deadlocks (maybe unavoidable, I don't know. Look that up. I'm not the expert).
Possible Solutions to Problem 1:
1: Force the source to be an ICollection<T> instead of IEnumerable<T>. This enforces finiteness.
2: Alter the CheckProducts algorithm to process iteratively, potentially yielding intermediary results while still maintaining an ongoing aggregation internally.
Possible Solutions to Problem 2:
1: Make the CheckProducts method asynchronous.
2: Make the CreateProduct... method synchronous.
Bottom Line
You can't do what you're asking how you're asking, and it sounds like someone else is dictating your requirements. They need to change some of the requirements, because what they're asking for is (and I really hate using this word) impossible. Is it possible you have misinterpreted some of the requirements?
Two ideas for you OP
Multiple call solution
If you are allowed to call CheckProducts more than once, you could simply do this:
foreach (var worker in workers)
{
var product = await worker.CreateProductAsync();
CheckProducts(new [] { product } );
}
If it adds value, I'm pretty sure you could work out a way to do it in batches of, say, 100 at a time, too.
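A batched variant of the loop above could look roughly like this. It is a sketch under assumptions: BatchedChecker, CheckInBatchesAsync and StubWorker are invented names, CheckProducts is passed in as a delegate so the sketch is self-contained, and Product and IWorker are redeclared from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public class Product { }

public interface IWorker
{
    Task<Product> CreateProductAsync();
}

// Hypothetical worker used only to exercise the sketch.
public class StubWorker : IWorker
{
    public Task<Product> CreateProductAsync() => Task.FromResult(new Product());
}

public static class BatchedChecker
{
    // Creates products one at a time, in order, and hands them over in batches.
    public static async Task CheckInBatchesAsync(
        IEnumerable<IWorker> workers,
        Action<IEnumerable<Product>> checkProducts,
        int batchSize = 100)
    {
        var batch = new List<Product>(batchSize);
        foreach (var worker in workers)
        {
            batch.Add(await worker.CreateProductAsync());
            if (batch.Count == batchSize)
            {
                checkProducts(batch.ToArray()); // pass a copy, then reuse the buffer
                batch.Clear();
            }
        }
        if (batch.Count > 0)
            checkProducts(batch.ToArray()); // final partial batch
    }
}
```

Because the workers are enumerated lazily and at most batchSize products are held at a time, this also copes with a very long sequence, as long as calling the check once per batch is acceptable.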
Thread pool solution
If you are not allowed to call CheckProducts more than once, and not allowed to modify CheckProducts, there is no way to force it to yield control and allow other continuations to run. So no matter what you do, you cannot force asynchronousness into the IEnumerable that you pass to it, not just because of the compiler checking, but because it would probably deadlock.
So here is a thread pool solution. The idea is to create one separate thread to process the products in series; the processor is async, so a call to CreateProductAsync() will still yield control to anything else that has been posted to the synchronization context, as needed. However it can't magically force CheckProducts to give up control, so there is still some possibility that it will block occasionally if it is able to check products faster than they are created. In my example I'm using Monitor.Wait() so the O/S won't schedule the thread until there is something waiting for it. You'll still be using up a thread resource while it blocks, but at least you won't be wasting CPU time in a busy-wait loop.
public static IEnumerable<Product> CreateProducts(IEnumerable<Worker> workers)
{
var queue = new ConcurrentQueue<Product>();
var task = Task.Run(() => ConvertProducts(workers.GetEnumerator(), queue));
while (true)
{
while (queue.Count > 0)
{
Product product;
var ok = queue.TryDequeue(out product);
if (ok) yield return product;
}
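if (task.IsCompleted && queue.Count == 0) yield break;
// Monitor.Wait must be called while holding the lock on the same object,
// otherwise it throws a SynchronizationLockException.
lock (queue)
{
if (queue.Count == 0 && !task.IsCompleted) Monitor.Wait(queue, 1000);
}
}
}
private static async Task ConvertProducts(IEnumerator<Worker> input, ConcurrentQueue<Product> output)
{
while (input.MoveNext())
{
var current = input.Current;
var product = await current.CreateProductAsync();
output.Enqueue(product);
// Monitor.Pulse also requires the lock; there is no await inside the lock block.
lock (output)
{
Monitor.Pulse(output);
}
}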
}
From your requirements I can put together the following:
1) Workers processed in order
2) Open to receive new Workers at any time
So we can use the fact that a dataflow TransformBlock has a built-in queue and processes items in order. That way we can accept Workers from the producer at any time.
Next we make the result of the TransformBlock observable so that the consumer can consume Products on demand.
Made some quick changes and started the consumer portion. This simply takes the observable produced by the Transformer and maps it to an enumerable that yields each product. For background, here is the description of ToEnumerable().
The ToEnumerator operator returns an enumerator from an observable sequence. The enumerator will yield each item in the sequence as it is produced
Source
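using System;
using System.Collections.Generic;
using System.Reactive.Linq; // ToEnumerable() comes from the System.Reactive package
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;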
namespace ClassLibrary1
{
public class WorkerProducer
{
public async Task ProduceWorker()
{
//await ProductTransformer_Transformer.SendAsync(new Worker())
}
}
public class ProductTransformer
{
public IObservable<Product> Products { get; private set; }
public TransformBlock<Worker, Product> Transformer { get; private set; }
private Task<Product> CreateProductAsync(Worker worker) => Task.FromResult(new Product());
public ProductTransformer()
{
Transformer = new TransformBlock<Worker, Product>(wrk => CreateProductAsync(wrk));
Products = Transformer.AsObservable();
}
}
public class ProductConsumer
{
private ThirdParty ThirdParty { get; set; } = new ThirdParty();
private ProductTransformer Transformer { get; set; } = new ProductTransformer();
public ProductConsumer()
{
ThirdParty.CheckProducts(Transformer.Products.ToEnumerable());
}
}
public class Worker { }
public class Product { }
public class ThirdParty
{
public void CheckProducts(IEnumerable<Product> products)
{
}
}
}
Unless I misunderstood something, I don't see why you don't simply do it like this:
var productList = new List<Product>(workers.Count());
foreach(var worker in workers)
{
productList.Add(await worker.CreateProductAsync());
}
CheckProducts(productList);
What about if you simply keep clearing a List of size 1?
var productList = new List<Product>(1);
var checkTask = Task.CompletedTask;
foreach(var worker in workers)
{
await checkTask;
productList.Clear();
productList.Add(await worker.CreateProductAsync());
checkTask = Task.Run(() => CheckProducts(productList));
}
await checkTask;
You can use Task.WhenAll, but instead of returning result of Task.WhenAll, return collection of tasks transformed to the collection of results.
async Task<IEnumerable<Product>> CreateProductsAsync(IEnumerable<IWorker> workers)
{
var tasks = workers.Select(worker => worker.CreateProductAsync()).ToList();
await Task.WhenAll(tasks);
return tasks.Select(task => task.Result);
}
The order of the tasks will be preserved.
And it seems it should be OK to go with just return await Task.WhenAll().
From the docs of the Task.WhenAll<TResult> Method (IEnumerable<Task<TResult>>):
The Task.Result property of the returned task will be set to
an array containing all of the results of the supplied tasks in the
same order as they were provided...
If the workers need to be executed one by one, in the order they were created, and another function needs the whole set of worker results:
async Task<IEnumerable<Product>> CreateProductsAsync(IEnumerable<IWorker> workers)
{
var products = new List<Product>();
foreach (var worker in workers)
{
var product = await worker.CreateProductAsync();
products.Add(product);
}
return products;
}
You can do this now with async, IEnumerable and LINQ but every method in the chain after the async would be a Task<T>, and you need to use something like await Task.WhenAll at the end. You can use async lambdas in the LINQ methods, which return Task<T>. You don't need to wait synchronously in these.
The Select will start your tasks sequentially i.e. they won't even exist as tasks until the select enumerates each one, and won't keep going after you stop enumerating. You could also run your own foreach over the enumerable of tasks if you want to await them all individually.
You can break out of this like any other foreach without it starting all of them, so this will also work on an infinite enumerable.
public async Task Main()
{
// This async method call could also be an async lambda
foreach (var task in GetTasks())
{
var result = await task;
Console.WriteLine($"Result is {result}");
if (result > 5) break;
}
}
private IEnumerable<Task<int>> GetTasks()
{
return GetNumbers().Select(WaitAndDoubleAsync);
}
private async Task<int> WaitAndDoubleAsync(int i)
{
Console.WriteLine($"Waiting {i} seconds asynchronously");
await Task.Delay(TimeSpan.FromSeconds(i));
return i * 2;
}
/// Keeps yielding numbers
private IEnumerable<int> GetNumbers()
{
var i = 0;
while (true) yield return i++;
}
This outputs the following, then stops:
Waiting 0 seconds asynchronously
Result is 0
Waiting 1 seconds asynchronously
Result is 2
Waiting 2 seconds asynchronously
Result is 4
Waiting 3 seconds asynchronously
Result is 6
The important thing is that you can't mix yield and await in the same method, but you can yield Tasks returned from a method that uses await absolutely fine, so you can use them together just by splitting them into separate methods. Select is already a method that uses yield, so you may not need to write your own method for this.
In your post you were looking for a Task<IEnumerable<Product>>, but what you can actually use is a IEnumerable<Task<Product>>.
You can go even further with this e.g. if you had something like a REST API where one resource can have links to other resources, like if you just wanted to get a list of users of a group, but stop when you found the user you were interested in:
public async Task<IEnumerable<Task<User>>> GetUserTasksAsync(int groupId)
{
var group = await GetGroupAsync(groupId);
return group.UserIds.Select(GetUserAsync);
}
foreach (var task in await GetUserTasksAsync(1))
{
var user = await task;
...
}
There is no solution to your problem. You can't transform a deferred IEnumerable<Task<Product>> to a deferred IEnumerable<Product>, such that the consuming thread will not get blocked while enumerating the IEnumerable<Product>. The IEnumerable<T> is a synchronous interface. It returns an enumerator with a synchronous MoveNext method. The MoveNext returns bool, which is not an awaitable type. An asynchronous interface IAsyncEnumerable<T> exists, whose enumerator has an asynchronous MoveNextAsync method, with a return type of ValueTask<bool>. But you have explicitly said that you can't change the consuming method, so you are stuck with the IEnumerable<T> interface. No solution then.
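For completeness, this is roughly what the IAsyncEnumerable<T> route would look like if the consuming method could be changed after all. It does not solve the question as posed; it only shows that yield and await can be combined in an async iterator (C# 8 and later). AsyncProducts, CountProductsAsync and StubWorker are invented names, and Product and IWorker are redeclared from the question.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public class Product { }

public interface IWorker
{
    Task<Product> CreateProductAsync();
}

// Hypothetical worker used only to exercise the sketch.
public class StubWorker : IWorker
{
    public Task<Product> CreateProductAsync() => Task.FromResult(new Product());
}

public static class AsyncProducts
{
    // An async iterator: one worker at a time, in order, without blocking a thread.
    public static async IAsyncEnumerable<Product> CreateProductsAsync(IEnumerable<IWorker> workers)
    {
        foreach (var worker in workers)
            yield return await worker.CreateProductAsync();
    }

    // Hypothetical asynchronous consumer standing in for a changed CheckProducts.
    public static async Task<int> CountProductsAsync(IAsyncEnumerable<Product> products)
    {
        var count = 0;
        await foreach (var product in products)
            count++;
        return count;
    }
}
```

The producer is lazy, so an endless worker sequence is fine as long as the consumer stops enumerating at some point.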
try
workers.ForEach(async wrkr =>
{
var prdlist = await wrkr.CreateProductAsync();
//Remaining tasks....
});

Unenviable duplication of code in C#

I have the following simple method in C#:
private static void ExtendTaskInternal<U>(
ref U task_to_update, U replace, Action a) where U : Task
{
var current = Interlocked.Exchange(ref task_to_update, replace);
if (current == null)
Task.Run(a);
else
current.AppendAction(a);
}
This is used for the following methods:
//A Task can only run once. But sometimes we wanted to have a reference to some
//tasks that can be restarted. Of course, in this case "restarting" a task means
//replace the reference with a new one. To safely do so we have to ensure a
//lot of things:
//
// * Would the referee be null?
// * Is it still running?
// * The replacement of the task must be atomic
//
//This method can help solving the above issues. If `task_to_update` is null,
//a new Task will be created to replace it. If it is already there, a new Task
//will be created as its continuation, which will only run when the previous
//one finishes.
//
//This looks like an async mutex, since if you assume `ExtendTask` is the only
//function in your code that updates `task_to_update`, the delegates you pass to
//it runs sequentially. But the difference is that since you have a reference to
//a Task, you can attach continuations that receive notification of lock
//releases.
public static Task<T> ExtendTask<T>(ref Task<T> task_to_update, Func<T> func)
{
var next_ts = new TaskCompletionSource<T>();
ExtendTaskInternal(ref task_to_update, next_ts.Task,
() => next_ts.SetResult(func()));
return next_ts.Task;
}
If you want to do something, but only after something else has already been done, this is useful.
Now, this version can only be used to replace a Task<T>, not a Task, since ref variables are invariant. So if you want it to work for Task as well, you have to duplicate the code:
public static Task<T> ExtendTask<T>(ref Task task_to_update, Func<T> func)
{
var next_ts = new TaskCompletionSource<T>();
ExtendTaskInternal(ref task_to_update, next_ts.Task,
() => next_ts.SetResult(func()));
return next_ts.Task;
}
And so you can implement another version that works on Actions.
public static Task ExtendTask(ref Task task_to_update, Action a)
{
return ExtendTask(ref task_to_update, () =>
{
a();
return true;
});
}
So far so good. But I don't like the first and the second version of ExtendTask, since their bodies look exactly the same.
Is there any way to eliminate the duplication?
Background
People ask why not use ContinueWith.
First, notice that AppendAction is just a wrapper function (from Microsoft.VisualStudio.Threading) of ContinueWith so this code is already using it indirectly.
Second, what I did differently here is that I have a reference to update, so this is another wrapper function around ContinueWith. The purpose of these functions is to make it easier to use in some scenarios.
I provide the following concrete example (untested) to illustrate the usage of those methods.
public class Cat {
private Task miuTask = null;
//you have to finish a miu to start another...
private void DoMiu(){
//... do whatever is required to "miu".
}
public Task MiuAsync(){
return MyTaskExtension.ExtendTask(ref miuTask, DoMiu);
}
public void RegisterMiuListener(Action whenMiued){
var current = miuTask;
if(current==null) current = TplExtensions.CompletedTask();
current.AppendAction(whenMiued);
}
}

async / await - am I correctly running these methods in parallel?

I have an abstract class called VehicleInfoFetcher which returns information asynchronously from a WebClient via this method:
public override async Task<DTOrealtimeinfo> getVehicleInfo(string stopID);
I'd like to combine the results of two separate instances of this class, running each in parallel before combining the results. This is done within a third class, CombinedVehicleInfoFetcher (also itself a subclass of VehicleInfoFetcher)
Here's my code - but I'm not quite convinced that it's running the tasks in parallel; am I doing it right? Could it be optimized?
public class CombinedVehicleInfoFetcher : VehicleInfoFetcher
{
public HashSet<VehicleInfoFetcher> VehicleInfoFetchers { get; set; }
public override async Task<DTOrealtimeinfo> getVehicleInfo(string stopID)
{
// Create a list of parallel tasks to run
var resultTasks = new List<Task<DTOrealtimeinfo>>();
foreach (VehicleInfoFetcher fetcher in VehicleInfoFetchers)
resultTasks.Add(fetcher.getVehicleInfo(stopID, stopID2, timePointLocal));
// run each task
foreach (var task in resultTasks)
await task;
// Wait for all the results to come in
await Task.WhenAll(resultTasks.ToArray());
// combine the results
var allRealtimeResults = new List<DTOrealtimeinfo>( resultTasks.Select(t => t.Result) );
return combineTaskResults(allRealtimeResults);
}
DTOrealtimeinfo combineTaskResults(List<DTOrealtimeinfo> realtimeResults)
{
// ...
return rtInfoOutput;
}
}
Edit
Some very helpful answers, here is a re-written example to aid discussion with usr below:
public override async Task<object> combineResults()
{
// Create a list of parallel tasks to run
var resultTasks= new List<object>();
foreach (AnotherClass cls in this.OtherClasses)
resultTasks.Add(cls.getResults() );
// Point A - have the cls.getResults() methods been called yet?
// Wait for all the results to come in
await Task.WhenAll(resultTasks.ToArray());
// combine the results
return new List<object>( resultTasks.Select(t => t.Result) );
}
}
Almost all tasks start out already started. Probably, whatever fetcher.getVehicleInfo returns is already started. So you can remove:
// run each task
foreach (var task in resultTasks)
await task;
Task.WhenAll is faster and has better error behavior (you want all exceptions to be propagated, not just the first you happen to stumble upon).
Also, await does not start a task. It waits for completion. You have to arrange for the tasks to be started separately, but as I said, almost all tasks are already started when you get them. This is best-practice as well.
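The point that the work begins when the method is called, not when it is awaited, can be observed with a small counter. This is only an illustrative sketch: FetchAsync stands in for fetcher.getVehicleInfo, and StartedTasksDemo, inFlight and peak are names made up here.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class StartedTasksDemo
{
    private static int inFlight; // operations started but not yet finished
    private static int peak;     // highest value of inFlight seen so far

    // Stand-in for a fetcher call: the work begins as soon as the method is called.
    private static async Task<int> FetchAsync(int id)
    {
        // This part runs synchronously on the caller's thread, up to the first await.
        peak = Math.Max(peak, Interlocked.Increment(ref inFlight));
        await Task.Delay(200); // simulated network latency
        Interlocked.Decrement(ref inFlight);
        return id;
    }

    public static async Task<int> RunAsync()
    {
        // Both calls are made before anything is awaited, so both are in flight.
        Task<int> first = FetchAsync(1);
        Task<int> second = FetchAsync(2);

        // A single WhenAll is enough; no per-task await loop is needed.
        int[] results = await Task.WhenAll(first, second);
        Console.WriteLine($"results: {string.Join(",", results)}; peak in flight: {peak}");
        return peak;
    }
}
```

RunAsync is expected to report a peak of 2, because the second call starts while the first one is still waiting on its delay.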
To help our discussion in the comments:
Task Test1() { return new Task(() => {}); }
Task Test2() { return Task.Factory.StartNew(() => {}); }
Task Test3() { return new FileStream("").ReadAsync(...); }
Task Test4() { return new TaskCompletionSource<object>().Task; }
1. Does not "run" when returned from the method. Must be started. Bad practice.
2. Runs when returned. Does not matter what you do with it, it is already running. Not necessary to add it to a list or store it somewhere.
3. Already runs like (2).
4. The notion of running does not make sense here. This task will never complete although it cannot be explicitly started.
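Three of the four cases above can be checked by looking at Task.Status right after the task is obtained. The class and method names below are made up for this sketch, and the FileStream case is left out because it needs a real file.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskStatusDemo
{
    // A task built with the constructor has not been scheduled yet: status is Created.
    public static TaskStatus ColdStatus() => new Task(() => { }).Status;

    // Task.Run schedules the task before returning, so it is never still Created.
    public static bool HotIsStarted() => Task.Run(() => { }).Status != TaskStatus.Created;

    // A TaskCompletionSource task waits for SetResult and cannot be started by the caller.
    public static TaskStatus TcsStatus() => new TaskCompletionSource<object>().Task.Status;
}
```

ColdStatus returns Created and TcsStatus returns WaitingForActivation; the hot task may be in any later state, depending on how quickly the thread pool picks it up.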
