Task.WaitAll is blocking - c#

I want to run two tasks simultaneously, with one having a Task.Delay() in it.
i.e. one runs continuously and one runs every 15 minutes.
Here's what I have so far:
public class ContinousAndAggregatedCheckRunner<T, T2>
{
private readonly int _aggregationInterval;
private readonly List<T> _collectedData;
private readonly Func<IEnumerable<T>, Task<T2>> _aggregator;
private readonly Func<Task<IEnumerable<T>>> _collector;
private CancellationToken _aggregationToken = default(CancellationToken);
private CancellationToken _collectionToken = default(CancellationToken);
public ContinousAndAggregatedCheckRunner(Func<IEnumerable<T>, Task<T2>> aggregator,
int aggregationInterval,
Func<Task<IEnumerable<T>>> collector)
{
_aggregator = aggregator;
_aggregationInterval = aggregationInterval;
_collector = collector;
_collectedData = new List<T>();
}
public async Task Run()
{
Task.WaitAll(Collect(), Aggregate());
}
private async Task Collect()
{
while (!_collectionToken.IsCancellationRequested)
{
Console.WriteLine($"Collecting {DateTime.Now.ToLongDateString()} {DateTime.Now.ToLongTimeString()}");
try
{
var results = await _collector();
_collectedData.AddRange(results);
}
catch (TaskCanceledException)
{
break;
}
}
}
private async Task Aggregate()
{
while (!_aggregationToken.IsCancellationRequested)
{
Console.WriteLine("Aggregating");
try
{
var aggregate = await _aggregator(_collectedData);
var taskFactory = new TaskFactory();
await taskFactory.StartNew(() => Send(aggregate), _aggregationToken);
_collectedData.Clear();
await Task.Delay(TimeSpan.FromMinutes(_aggregationInterval), _aggregationToken);
}
catch (TaskCanceledException)
{
break;
}
}
}
The problem is that it collects for a bit, then enters Aggregate() and does nothing for the duration of the Task.Delay(), and then it calls Send(). Then it does nothing again.
By "does nothing", I mean Collect() stops executing.
I assume it is blocking at some point.
Is there a pattern here I'm missing? I want to run two tasks indefinitely and allow one of them to pause for a specified amount of time without affecting the other.

There are a couple of things that could be fixed.
As Ben Voigt said, Task.WaitAll really does block the calling thread; you'd be better off combining the tasks with Task.WhenAll and awaiting the result.
It may be worth using Task.Factory.StartNew instead of instantiating a new TaskFactory.
StartNew may not be the best choice at all; see the details here: https://blog.stephencleary.com/2013/08/startnew-is-dangerous.html
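A minimal sketch of the WhenAll suggestion, under some assumptions: CollectAsync and AggregateAsync are hypothetical stand-ins for the question's Collect() and Aggregate(), the 10 ms delay only simulates work, and a single CancellationTokenSource stops both loops. Neither loop blocks a thread while it waits, so the long delay in one does not stall the other.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TwoLoops
{
    // Hypothetical stand-in for Collect(): loops until cancellation is requested.
    public static async Task<int> CollectAsync(CancellationToken token)
    {
        var iterations = 0;
        while (!token.IsCancellationRequested)
        {
            iterations++;
            await Task.Delay(10); // simulated collection work
        }
        return iterations;
    }

    // Hypothetical stand-in for Aggregate(): one pass, then a long asynchronous pause.
    public static async Task<int> AggregateAsync(TimeSpan interval, CancellationToken token)
    {
        var passes = 0;
        try
        {
            while (!token.IsCancellationRequested)
            {
                passes++;
                await Task.Delay(interval, token); // does not block a thread
            }
        }
        catch (OperationCanceledException)
        {
            // Cancellation during the delay ends the loop.
        }
        return passes;
    }

    // Starts both loops, then awaits them together instead of calling Task.WaitAll.
    public static async Task<(int Collected, int Aggregated)> RunAsync(TimeSpan runFor)
    {
        using var cts = new CancellationTokenSource(runFor);
        var collect = CollectAsync(cts.Token);
        var aggregate = AggregateAsync(TimeSpan.FromMinutes(15), cts.Token);
        await Task.WhenAll(collect, aggregate);
        return (collect.Result, aggregate.Result);
    }
}
```

Calling `await TwoLoops.RunAsync(TimeSpan.FromSeconds(1))` should report many collection iterations but only one aggregation pass, because the aggregation loop spends the whole run inside its delay.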

Related

Different HTTP calls, await same Task

I have a Task that starts a Windows process, which generates a file if it has not been created yet and returns it. The problem is that the action is called more than once. To be more precise, it is the src attribute of a <track> element.
I have a ConcurrentDictionary<Guid, Task<string>>
which keeps track of the Ids for which a process is currently running
public async Task<string> GenerateVTTFile(Guid id)
{
if (_currentGenerators.TryGetValue(id, out Task<string> task))
{
return await task; // problem is here?
}
var t = Run(); // Task
_currentGenerators.TryAdd(id, t);
return await t;
}
In the action method of the controller
var path = await _svc.GetSomePath();
if (string.IsNullOrEmpty(path))
{
path = await _svc.GenerateVTTFile(id);
return PhysicalFile(path, "text/vtt");
}
return PhysicalFile(path, "text/vtt");
The Run() method just starts a Process and waits for it:
process.WaitForExit();
What I want to achieve is to return the result of the same task for the same Id. It seems that if the Id already exists in the dictionary and I await it, it starts another process (calls the Run method again).
Is there a way to achieve that?
You can make the method atomic to protect the "dangerzone":
private SemaphoreSlim _sem = new SemaphoreSlim(1);
public Task<string> GenerateVTTFile(Guid Id)
{
_sem.Wait();
try
{
if (_currentGenerators.TryGetValue(Id, out Task<string> task))
{
return task;
}
var t = Run(); // Task
_currentGenerators.TryAdd(Id, t); // While Thread 1 is here,
// Thread 2 can already be past the check above ...
// unless we make the method atomic like here.
return t;
}
finally
{
_sem.Release();
}
}
The drawback here is that calls with different ids also have to wait, which makes for a bottleneck. Of course, you could make an effort to avoid that, but hey: the dotnet guys did it for you:
Preferably, you can use GetOrAdd to do the same with only ConcurrentDictionary's methods:
public Task<string> GenerateVTTFile(Guid Id)
{
// EDIT: This overload vv is actually NOT atomic!
// DO NOT USE:
//return _currentGenerators.GetOrAdd(Id, _ => Run());
// Instead (the dictionary's value type must then be Lazy<Task<string>>):
return _currentGenerators.GetOrAdd(Id,
_ => new Lazy<Task<string>>(() => Run(Id))).Value;
// Fix "stolen" from Theodore Zoulias' Answer. Link to his answer below.
// If you find this helped you, please upvote HIS answer.
}
Yes, it's really a "one-liner".
Please see this answer: https://stackoverflow.com/a/61372518/982149 from which I took the fix for my flawed answer.
As pointed out already by João Reis, using simply the GetOrAdd method is not enough to ensure that a Task will be created only once per key. From the documentation:
If you call GetOrAdd simultaneously on different threads, valueFactory may be called multiple times, but only one key/value pair will be added to the dictionary.
The quick and lazy way to deal with this problem is to use the Lazy class. Instead of storing Task objects in the dictionary, you could store Lazy<Task> wrappers. This way even if a wrapper is created multiple times per key, all extraneous wrappers will be discarded without their Value property requested, and so without duplicate tasks created.
private ConcurrentDictionary<Guid, Lazy<Task<string>>> _currentGenerators;
public Task<string> GenerateVTTFileAsync(Guid id)
{
return _currentGenerators.GetOrAdd(id,
_ => new Lazy<Task<string>>(() => Run(id))).Value;
}
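To see the effect of the Lazy wrapper, here is a small self-contained sketch. VttGeneratorCache, RunAsync and RunCount are hypothetical names introduced only for this illustration, and the 50 ms delay stands in for the real process. Many concurrent callers ask for the same id, and the counter shows how often the underlying work actually started.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public class VttGeneratorCache
{
    private readonly ConcurrentDictionary<Guid, Lazy<Task<string>>> _generators =
        new ConcurrentDictionary<Guid, Lazy<Task<string>>>();

    private int _runCount;

    // Number of times the underlying work was started (for the demonstration only).
    public int RunCount => Volatile.Read(ref _runCount);

    public Task<string> GenerateAsync(Guid id)
    {
        // Several Lazy wrappers may be created under contention, but only the one
        // stored in the dictionary ever has its Value requested.
        return _generators.GetOrAdd(id,
            key => new Lazy<Task<string>>(() => RunAsync(key))).Value;
    }

    // Hypothetical stand-in for the real Run(): pretends to generate a file.
    private async Task<string> RunAsync(Guid id)
    {
        Interlocked.Increment(ref _runCount);
        await Task.Delay(50);
        return $"{id}.vtt";
    }
}
```

One caveat with this design: the dictionary keeps the task forever, so a faulted task is handed to every later caller for that id unless the entry is removed.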
In order to have multiple concurrent calls of that method but only one for each id, you need to use ConcurrentDictionary.GetOrAdd with SemaphoreSlim.
GetOrAdd is not enough because the factory parameter might be executed more than once, see "Remarks" here https://learn.microsoft.com/en-us/dotnet/api/system.collections.concurrent.concurrentdictionary-2.getoradd?view=netframework-4.8
Here is an example:
private ConcurrentDictionary<Guid, Generator> _currentGenerators =
new ConcurrentDictionary<Guid, Generator>();
public async Task<string> GenerateVTTFile(Guid id)
{
var generator = _currentGenerators.GetOrAdd(id, _ => new Generator());
return await generator.RunGenerator().ConfigureAwait(false);
}
public class Generator
{
private int _started = 0;
private Task<string> _task;
private readonly SemaphoreSlim _semaphore = new SemaphoreSlim(1);
public async Task<string> RunGenerator()
{
if (!IsInitialized())
{
await Initialize().ConfigureAwait(false);
}
return await Interlocked.CompareExchange(ref _task, null, null).ConfigureAwait(false);
}
private async Task Initialize()
{
await _semaphore.WaitAsync().ConfigureAwait(false);
try
{
// check again after acquiring the lock
if (IsInitialized())
{
return;
}
var task = Run();
_ = Interlocked.Exchange(ref _task, task);
Interlocked.Exchange(ref _started, 1);
}
finally
{
_semaphore.Release();
}
}
private bool IsInitialized()
{
return Interlocked.CompareExchange(ref _started, 0, 0) == 1;
}
private async Task<string> Run()
{
// your implementation here
}
}

Async wait for multiple threads to finish

I have a code block that is eventually accessed by multiple threads. I am looking for an up-to-date async mechanism to continue executing once all threads have passed through.
Currently I do the following with a CountdownEvent, which works just fine (without async support).
public class Watcher
{
private static readonly Logger Log = LogManager.GetCurrentClassLogger();
private readonly CountdownEvent _isUpdating = new CountdownEvent(1);
private readonly IActivity _activity;
public Watcher([NotNull] IActivity activity)
{
_activity = activity ?? throw new ArgumentNullException(nameof(activity));
_activity.Received += OnReceived;
}
private void OnReceived(IReadOnlyCollection<Summary> summaries)
{
_isUpdating.AddCount();
try
{
// Threads processing
}
finally
{
_isUpdating.Signal();
}
}
private void Disable()
{
_activity.Received -= OnReceived;
_isUpdating.Signal();
/* await */ _isUpdating.Wait();
}
}
Do I need to use any of those AsyncCountdownEvent implementations or is there any other built-in mechanism? I already thought about using a BufferBlock because it has async functionality but I think it's a bit overkill.
In addition to the comments:
IActivity is a web service call (but that shouldn't affect the implementation on top, or vice versa)
public async Task Start(bool alwayRetry = true, CancellationToken cancellationToken = new CancellationToken())
{
var milliseconds = ReloadSeconds * 1000;
do
{
try
{
var summaries = await PublicAPI.GetSummariesAsync(cancellationToken).ConfigureAwait(false);
OnSummariesReceived(summaries);
}
catch (Exception ex)
{
Log.Error(ex.Message);
OnErrorOccurred(ex);
}
await Task.Delay(milliseconds, cancellationToken).ConfigureAwait(false);
// ReSharper disable once LoopVariableIsNeverChangedInsideLoop
} while (alwayRetry);
}
The IActivity signatures aren't clear, but you can wait for a range of tasks to complete:
class MultiAsyncTest {
Task SomeAsync1() { return Task.Delay(1000); }
Task SomeAsync2() { return Task.Delay(2000);}
Task EntryPointAsync() {
var tasks = new List<Task>();
tasks.Add(SomeAsync1());
tasks.Add(SomeAsync2());
return Task.WhenAll(tasks);
}
}
What's IActivity's signature? Does it support Task, or are you using Thread? More explanation would help produce a more specific answer.
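If the handlers stay synchronous, as in the Watcher class, one possible alternative is a small awaitable countdown built on TaskCompletionSource. This is only a sketch: AsyncCountdown is a hypothetical name, not a built-in type, and unlike CountdownEvent it does not guard against AddCount being called after the count has already reached zero.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class AsyncCountdown
{
    private int _count;

    // RunContinuationsAsynchronously keeps awaiting code from running inline
    // on the thread that sends the final signal.
    private readonly TaskCompletionSource<bool> _completion =
        new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);

    public AsyncCountdown(int initialCount)
    {
        if (initialCount < 1)
            throw new ArgumentOutOfRangeException(nameof(initialCount));
        _count = initialCount;
    }

    // Registers one more participant that must signal before waiters continue.
    public void AddCount() => Interlocked.Increment(ref _count);

    // Marks one participant as finished; the last signal completes the task.
    public void Signal()
    {
        if (Interlocked.Decrement(ref _count) == 0)
            _completion.TrySetResult(true);
    }

    // Completes once the count has dropped to zero.
    public Task WaitAsync() => _completion.Task;
}
```

In the Watcher class this would replace the CountdownEvent field, and Disable() could become an async method ending in `await _isUpdating.WaitAsync();`.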

Array of ManualResetEvent objects

Here's my story: I have a WCF service. It receives requests with work to do. Each task is inserted into a blocking queue. The server takes items from this queue periodically and does the work (fully asynchronously, on a different thread). In my "Do" service I need to know when "my" task is done. Like this:
public bool Do(int input)
{
// 1. Add task to the BlockingCollection queue
// 2. Block this thread from returning and observe/wait til my task is finished
return true;
}
Here's my suggestion/solution:
public bool Do(int input)
{
// 1. Create a ManualResetEvent object
// 2. Add this object to task
// 3. Add task to the BlockingCollection queue
// 4. Block this thread from returning - wait for ManualResetEvent object
return true;
}
So there will be as many ManualResetEvent objects as there are tasks to do. I will literally have an array of sync objects. Is this a good solution to my problem?
Or is there better synchronization class to use in my case? Like Wait and Pulse?
Thanks for help,
I'm sorry for the title. I didn't know how to ask this question in the title.
Your plan is good; however, I would suggest not tying up a dedicated thread waiting for the work to be done. Switching from a new ManualResetEvent(false) to a new SemaphoreSlim(0,1) lets you use WaitAsync(), which allows async/await in your Do method and frees up the thread to do other work. (UPDATE: This really should be a TaskCompletionSource instead of a SemaphoreSlim, but I will not update this example; see the second part below.)
public async Task<bool> Do(int input)
{
using(var completion = new SemaphoreSlim(0, 1))
{
var job = new JobTask(input, completion);
_workQueue.Add(job);
await completion.WaitAsync().ConfigureAwait(false);
return job.ResultData;
}
}
private void ProcessingLoop()
{
foreach(var job in _workQueue.GetConsumingEnumerable())
{
job.PerformWork(); //Inside PerformWork there is a _completion.Release(); call.
}
}
To make everything self contained you can change the SemaphoreSlim / TaskCompletionSource and put it inside the job then just return the job itself.
public JobTask Do(int input)
{
var job = new JobTask(input);
_workQueue.Add(job);
return job;
}
public class JobTask
{
private readonly int _input;
private readonly TaskCompletionSource<bool> _completionSource;
public JobTask(int input)
{
_input = input;
_completionSource = new TaskCompletionSource<bool>();
}
public void PerformWork()
{
try
{
// Do stuff here with _input.
_completionSource.TrySetResult(true);
}
catch(Exception ex)
{
_completionSource.TrySetException(ex);
}
}
public Task<bool> Work { get { return _completionSource.Task; } }
}
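Putting the pieces above together, a self-contained sketch might look like the following. SquareJob and SquareService are hypothetical names, squaring the input is a placeholder for the real work, and the RunContinuationsAsynchronously option is an addition of mine so that code awaiting the job does not resume on the worker thread.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public sealed class SquareJob
{
    private readonly int _input;
    private readonly TaskCompletionSource<int> _completion =
        new TaskCompletionSource<int>(TaskCreationOptions.RunContinuationsAsynchronously);

    public SquareJob(int input) { _input = input; }

    // The caller awaits this task; it completes when the worker has run the job.
    public Task<int> Work => _completion.Task;

    public void PerformWork()
    {
        try
        {
            // Placeholder work; checked() makes overflow surface as an exception.
            _completion.TrySetResult(checked(_input * _input));
        }
        catch (Exception ex)
        {
            _completion.TrySetException(ex);
        }
    }
}

public sealed class SquareService : IDisposable
{
    private readonly BlockingCollection<SquareJob> _workQueue =
        new BlockingCollection<SquareJob>();
    private readonly Task _worker;

    public SquareService()
    {
        // One dedicated consumer, so jobs run one at a time in queue order.
        _worker = Task.Factory.StartNew(ProcessingLoop, TaskCreationOptions.LongRunning);
    }

    public Task<int> Do(int input)
    {
        var job = new SquareJob(input);
        _workQueue.Add(job);
        return job.Work;
    }

    private void ProcessingLoop()
    {
        foreach (var job in _workQueue.GetConsumingEnumerable())
        {
            job.PerformWork();
        }
    }

    public void Dispose()
    {
        _workQueue.CompleteAdding(); // lets the consumer loop finish
        _worker.Wait();
        _workQueue.Dispose();
    }
}
```

A caller would then write `int result = await service.Do(7);` and no thread is blocked while the job sits in the queue. A failure inside the job surfaces as an exception at the await.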

await not blocking until Task finishes

I am trying to block RequestHandler.ParseAll() with await ConsumerTask;, but when I set a breakpoint there, I ALWAYS get the "Done..." output first, and then Parse2() fails with a NullReferenceException. (My guess: the GC starts cleaning up because _handler has gone out of scope.)
Anyway, I can't figure out why that happens.
class MainClass
{
public async void DoWork()
{
RequestHandler _handler = new RequestHandler();
string[] mUrls;
/* fill mUrls here with values */
await Task.Run(() => _handler.ParseSpecific(mUrls));
Console.WriteLine("Done...");
}
}
static class Parser
{
public static async Task<IEnumerable<string>> QueryWebPage(string url) { /*Query the url*/ }
public static async Task Parse1(Query query)
{
Parallel.ForEach(/*Process data here*/);
}
public static async Task Parse2(Query query)
{
foreach(string line in query.WebPage)
/* Here i get a NullReference exception because query.WebPage == null */
}
}
sealed class RequestHandler
{
private BlockingCollection<Query> Queue;
private Task ConsumerTask = Task.Run(() => /* call consume() for each elem in the queue*/);
private async void Consume(Query obj)
{
await (obj.BoolField ? Parser.Parse1(obj) : Parser.Parse2(obj));
}
public async void ParseSpecific(string[] urls)
{
foreach(string v in urls)
Queue.Add(new Query(await QueryWebPage(v), BoolField: false));
Queue.CompleteAdding();
await ConsumerTask;
await ParseAll(true);
}
private async Task ParseAll(bool onlySome)
{
ReInit();
Parallel.ForEach(mCollection, v => Queue.Add(new Query(url, BoolField:false)));
Queue.CompleteAdding();
await ConsumerTask;
/* Process stuff further */
}
}
struct Query
{
public readonly string[] WebPage;
public readonly bool BoolField;
public Query(uint e, IEnumerable<string> page, bool b) : this()
{
WebPage = page.ToArray();
BoolField = b;
}
}
CodesInChaos has spotted the problem in comments. It stems from having async methods returning void, which you should almost never do - it means you've got no way to track them.
Instead, if your async methods don't have any actual value to return, you should just make them return Task.
What's happening is that ParseSpecific is only running synchronously until the first await QueryWebPage(v) that doesn't complete immediately. It's then returning... so the task started here:
await Task.Run(() => _handler.ParseSpecific(mUrls));
... completes immediately, and "Done" gets printed.
Once you've made all your async methods return Task, you can await them. You also won't need Task.Run at all. So you'd have:
public async void DoWork()
{
RequestHandler _handler = new RequestHandler();
string[] mUrls;
await _handler.ParseSpecific(mUrls);
Console.WriteLine("Done...");
}
...
public async Task ParseSpecific(string[] urls)
{
foreach(string v in urls)
{
// Refactored for readability, although I'm not sure it really
// makes sense now that it's clearer! Are you sure this is what
// you want?
var page = await QueryWebPage(v);
Queue.Add(new Query(page, false));
}
Queue.CompleteAdding();
await ConsumerTask;
await ParseAll(true);
}
Your ReInit method also needs changing: currently the ConsumerTask will complete almost immediately, because Consume is another async method returning void and so returns immediately.
To be honest, what you've got looks very complex, without a proper understanding of async/await. I would read up more on async/await and then probably start from scratch. I strongly suspect you can make this much, much simpler. You might also want to read up on TPL Dataflow which is designed to make producer/consumer scenarios simpler.
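The difference between async void and async Task can be reproduced in a few lines. The following is a sketch with hypothetical names (AsyncVoidDemo, WorkVoid, WorkTask); the 200 ms delay stands in for the web query. Task.Run picks the Action overload for the void method, so the task it returns completes as soon as the method hits its first await.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncVoidDemo
{
    public static volatile bool VoidFinished;
    public static volatile bool TaskFinished;

    // async void: the caller gets nothing back to await.
    public static async void WorkVoid()
    {
        await Task.Delay(200);
        VoidFinished = true;
    }

    // async Task: the caller can await completion.
    public static async Task WorkTask()
    {
        await Task.Delay(200);
        TaskFinished = true;
    }

    public static async Task<(bool AfterVoid, bool AfterTask)> CompareAsync()
    {
        // Binds to Task.Run(Action): completes when WorkVoid returns at its first await.
        await Task.Run(() => WorkVoid());
        var afterVoid = VoidFinished; // the 200 ms delay has normally not elapsed yet

        // Binds to Task.Run(Func<Task>): completes only when WorkTask has finished.
        await Task.Run(() => WorkTask());
        var afterTask = TaskFinished;

        return (afterVoid, afterTask);
    }
}
```

Awaiting `AsyncVoidDemo.CompareAsync()` is expected to yield false for the async void case and true for the async Task case, which matches the "Done..." message appearing too early in the question.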

TPL Queue Processing

I'm currently working on a project and I need to queue some jobs for processing. Here are the requirements:
Jobs must be processed one at a time
A queued item must be able to be waited on
So I want something akin to:
Task<result> QueueJob(params here)
{
/// Queue the job and somehow return a waitable task that will wait until the queued job has been executed and return the result.
}
I've tried having a background running task that just pulls items off a queue and processes the job, but the difficulty is getting from a background task to the method.
If need be I could go the route of just requesting a completion callback in the QueueJob method, but it'd be great if I could get a transparent Task back that allows you to wait on the job to be processed (even if there are jobs before it in the queue).
You might find TaskCompletionSource<T> useful, it can be used to create a Task that completes exactly when you want it to. If you combine it with BlockingCollection<T>, you will get your queue:
class JobProcessor<TInput, TOutput> : IDisposable
{
private readonly Func<TInput, TOutput> m_transform;
// or a custom type instead of Tuple
private readonly
BlockingCollection<Tuple<TInput, TaskCompletionSource<TOutput>>>
m_queue =
new BlockingCollection<Tuple<TInput, TaskCompletionSource<TOutput>>>();
public JobProcessor(Func<TInput, TOutput> transform)
{
m_transform = transform;
Task.Factory.StartNew(ProcessQueue, TaskCreationOptions.LongRunning);
}
private void ProcessQueue()
{
Tuple<TInput, TaskCompletionSource<TOutput>> tuple;
while (m_queue.TryTake(out tuple, Timeout.Infinite))
{
var input = tuple.Item1;
var tcs = tuple.Item2;
try
{
tcs.SetResult(m_transform(input));
}
catch (Exception ex)
{
tcs.SetException(ex);
}
}
}
public Task<TOutput> QueueJob(TInput input)
{
var tcs = new TaskCompletionSource<TOutput>();
m_queue.Add(Tuple.Create(input, tcs));
return tcs.Task;
}
public void Dispose()
{
m_queue.CompleteAdding();
}
}
I would go for something like this:
class TaskProcessor<TResult>
{
// TODO: Error handling!
readonly BlockingCollection<Task<TResult>> blockingCollection = new BlockingCollection<Task<TResult>>(new ConcurrentQueue<Task<TResult>>());
public Task<TResult> AddTask(Func<TResult> work)
{
var task = new Task<TResult>(work);
blockingCollection.Add(task);
return task; // give the task back to the caller so they can wait on it
}
public void CompleteAddingTasks()
{
blockingCollection.CompleteAdding();
}
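public TaskProcessor()
{
// Consume on a dedicated background thread; calling ProcessQueue() directly
// here would find the queue empty and return before any task is added.
Task.Factory.StartNew(ProcessQueue, TaskCreationOptions.LongRunning);
}
void ProcessQueue()
{
// Blocks until an item is available; ends after CompleteAddingTasks().
foreach (var task in blockingCollection.GetConsumingEnumerable())
{
task.Start();
task.Wait(); // ensure this task finishes before we start a new one...
}
}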
}
Depending on the type of app that is using it, you could switch out the BlockingCollection/ConcurrentQueue for something simpler (eg just a plain queue). You can also adjust the signature of the "AddTask" method depending on what sort of methods/parameters you will be queueing up...
Func<T> takes no parameters and returns a value of type T. The jobs are run one by one and you can wait on the returned task to get the result.
public class TaskQueue
{
private readonly Queue<Task> InnerTaskQueue = new Queue<Task>();
private bool IsJobRunning;
public void Start()
{
Task.Factory.StartNew(() =>
{
while (true)
{
if (InnerTaskQueue.Count > 0 && !IsJobRunning)
{
var task = InnerTaskQueue.Dequeue();
IsJobRunning = true; // set before starting so a fast task cannot reset the flag first
task.ContinueWith(t => IsJobRunning = false);
task.Start();
}
else
{
Thread.Sleep(1000);
}
}
});
}
public Task<T> QueueJob<T>(Func<T> job)
{
var task = new Task<T>(() => job());
InnerTaskQueue.Enqueue(task);
return task;
}
}
