Select with asynchronous method

Select with asynchronous method - c#

I've found a question that was really useful for me, but I still can't realize what equivalent in LINQ-world has further construction:
public async Task<List<ObjectInfo>> GetObjectsInfo(string[] objectIds)
{
var result = new List<ObjectInfo>(objectIds.Length);
foreach (var id in objectIds)
{
result.Add(await GetObjectInfo(id));
}
return result;
}
If I wrote instead
var result = await Task.WhenAll(objectIds.Select(id => GetObjectInfo(id)));
wouldn't these tasks be started simultaneously?
In my case it would be better to run them in series.
Edit 1: Answering to the comment of Theodor Zoulias.
Of course, I forgot Async suffix in methods' names!
The method GetObjectInfoAsync makes http request to external service. Additionally, this service has restriction for requests' frequency, so I use following construction.
using (var throttler = new Throttler(clientId))
{
while (!throttler.IsCallAllowed(out var waitTime))
{
await Task.Delay(waitTime);
}
var response = await client.PerformHttpRequestAsync(request);
return response.Content.FromJson<TResponse>(serializerSettings);
}
Throttler knowns last request's time for each client.

You should take into account the following considerations when you are using whenAll
Quotes taken from MS documentation
If any of the supplied tasks completes in a faulted state, the returned task will also complete in a Faulted state, where its exceptions will contain the aggregation of the set of unwrapped exceptions from each of the supplied tasks.
If none of the supplied tasks faulted but at least one of them was canceled, the returned task will end in the Canceled state.
If none of the tasks faulted and none of the tasks were canceled, the resulting task will end in the RanToCompletion state.
If the supplied array/enumerable contains no tasks, the returned task will immediately transition to a RanToCompletion state before it's returned to the caller.
That means that while the items will run maybe simultaneously (you can't know how as it depends on the resources of the machine and you can't know the order), the above can't be avoided.
If you want to have specific error handling per case and a specific order, then the first solution is the way to go. It kind of beats the point of the async/await but not in it's entirety. Even with sequential execution of the async items, your thread at least will not go to sleep and can still be used until the awaitable is ready.

Related

Log data into cassandra using c#

I trying to log data into Cassandra using c#. So my aim is to log as much data points as I can in 200ms.
I am trying to save time, random key and value in 200ms. Please see code for refrence. the problem how can I execute session after while loop.
Cluster cluster = Cluster.Builder()
.AddContactPoint("127.0.0.1")
.Build();
ISession session = cluster.Connect("log"); //keyspace to connect with
var ps = session.Prepare("Insert into logcassandra(nanodate, key, value) values (?,?,?)");
stopwatch.Start();
while(stop.ElapsedMilliseconds <= 200)
{
i++;
var statement = ps.Bind(nanoTime(),"key"+i,"value"+i);
session.ExecuteAsync(statement);
}

Please prefer System.Threading.Timer with a TimerCallback over Stopwatch.
EDIT: (reply to the comment)
Hi, I'm not sure what you want to achieve, but here are some general concepts about async calls and parallel execution. In .NET world the async is mainly used for Non-blocking I/O operations, which means your caller thread will not wait for the response of the I/O driver. In other words, you instantiate an I/O operation and dispatch this work to a "thing" which is outside of the .NET ecosystem and that will gives you back a future (a Task). The driver acknowledges back that it received the request and it promises that it will process it once it has free capacity.
That Task represents an async work that either succeeded or fail. But because you are calling it asynchronously you are not awaiting its result (not blocking the caller thread to wait for external work) rather move on to the next statement. Eventually this operation will be finished and at that time the driver will notify that Task that a request operation has been finished. (The Task can be seen as the primary communication channel between the caller and the callee)
In your case you are using a fire and forget style async call. That means you are firing off a lot of I/O operations in async and you forget to process the result of them. You don't know either any of them failed or not. But you have called the Casandra to do a lot of staff. Your time measurement is used only for firing off jobs, which means you have no idea how much of these jobs has been finished.
If you would choose to use await against your async calls, that would mean that your while loop would be serially executed. You would firing off a job and you can't move on to the next iteration because you are awaiting it, so your caller thread will move one level higher in its call stack and examines if it can processed with something. If there is an await as well, then it moves one level higher and so on...
while(stop.ElapsedMilliseconds <= 200)
{
await session.ExecuteAsync(statement);
}
If you don't want serial execution rather parallel, you can create as many jobs as you need and await them as a whole. That's where Task.WhenAll comes into the play. You will fire off a lot of jobs and you will await that single job that will track all of other jobs.
var cassandraCalls = new List<Task>();
cassandraCalls.AddRange(Enumerable.Range(0, 100).Select(_ => session.ExecuteAsync(statement)));
await Task.WhenAll(cassandraCalls);
But this code will run until all of the jobs are finished. If you want to constrain the whole execution time then you should use some cancellation mechanism. Task.WhenAll does not support CancellationToken. But you can overcome of this limitation in several way. The simplest solution is a combination of the Task.Delay and the Task.WhenAny. Task.Delay will be used for the timeout, and Task.WhenAny will be used to await either the your cassandra calls or the timeout to complete.
var cassandraCalls = new List<Task>();
cassandraCalls.AddRange(Enumerable.Range(0, 100).Select(_ => ExecuteAsync()));
await Task.WhenAny(Task.WhenAll(cassandraCalls), Task.Delay(1000));
In this way, you have fired off as many jobs as you wanted and depending on your driver they may be executed in parallel or concurrently. You are awaiting either to finish all or elapse a certain amount of time. When the WhenAny job finishes then you can examine the result of the jobs, but simply iterating over the cassandraCalls
foreach (var call in cassandraCalls)
{
Console.WriteLine(call.IsCompleted);
}
I hope this explanation helped you a bit.

Throttling with SemaphoreSlim -- "Task.Run()" vs "new Func<Task>()"

This might not be specific to SemaphoreSlim exclusively, but basically my question is about whether there is a difference between the below two methods of throttling a collection of long running tasks, and if so, what that difference is (and when if ever to use either).
In the example below, let's say that each tracked task involves loading data from a Url (totally made up example, but is a common one that I've found for SemaphoreSlim examples).
The main difference comes down to how the individual tasks are added to the list of tracked tasks. In the first example, we call Task.Run() with a lambda, whereas in the second, we new up a Func(<Task<Result>>()) with a lambda and then immediately call that func and add the result to the tracked task list.
Examples:
Using Task.Run():
SemaphoreSlim ss = new SemaphoreSlim(_concurrentTasks);
List<string> urls = ImportUrlsFromSource();
List<Task<Result>> trackedTasks = new List<Task<Result>>();
foreach (var item in urls)
{
await ss.WaitAsync().ConfigureAwait(false);
trackedTasks.Add(Task.Run(async () =>
{
try
{
return await ProcessUrl(item);
}
catch (Exception e)
{
_log.Error($"logging some stuff");
throw;
}
finally
{
ss.Release();
}
}));
}
var results = await Task.WhenAll(trackedTasks);
Using a new Func:
SemaphoreSlim ss = new SemaphoreSlim(_concurrentTasks);
List<string> urls = ImportUrlsFromSource();
List<Task<Result>> trackedTasks = new List<Task<Result>>();
foreach (var item in urls)
{
trackedTasks.Add(new Func<Task<Result>>(async () =>
{
await ss.WaitAsync().ConfigureAwait(false);
try
{
return await ProcessUrl(item);
}
catch (Exception e)
{
_log.Error($"logging some stuff");
throw;
}
finally
{
ss.Release();
}
})());
}
var results = await Task.WhenAll(trackedTasks);

There are two differences:
Task.Run does error handling
First off all, when you call the lambda, it runs. On the other hand, Task.Run would call it. This is relevant because Task.Run does a bit of work behind the scenes. The main work it does is handling a faulted task...
If you call a lambda, and the lambda throws, it would throw before you add the Task to the list...
However, in your case, because your lambda is async, the compiler would create the Task for it (you are not making it by hand), and it will correctly handle the exception and make it available via the returned Task. Therefore this point is moot.
Task.Run prevents task attachment
Task.Run sets DenyChildAttach. This means that the tasks created inside the Task.Run run independently from (are not synchronized with) the returned Task.
For example, this code:
List<Task<int>> trackedTasks = new List<Task<int>>();
var numbers = new int[]{0, 1, 2, 3, 4};
foreach (var item in numbers)
{
trackedTasks.Add(Task.Run(async () =>
{
var x = 0;
(new Func<Task<int>>(async () =>{x = item; return x;}))().Wait();
Console.WriteLine(x);
return x;
}));
}
var results = await Task.WhenAll(trackedTasks);
Will output the numbers from 0 to 4, in unknown order. However the following code:
List<Task<int>> trackedTasks = new List<Task<int>>();
var numbers = new int[]{0, 1, 2, 3, 4};
foreach (var item in numbers)
{
trackedTasks.Add(new Func<Task<int>>(async () =>
{
var x = 0;
(new Func<Task<int>>(async () =>{x = item; return x;}))().Wait();
Console.WriteLine(x);
return x;
})());
}
var results = await Task.WhenAll(trackedTasks);
Will output the numbers from 0 to 4, in order, every time. This is odd, right? What happens is that the inner task is attached to outer one, and executed right away in the same thread. But if you use Task.Run, the inner task is not attached and scheduled independently.
This remain true even if you use await, as long as the task you await does not go to an external system...
What happens with external system? Well, for example, if your task is reading from an URL - as in your example - the system would create a TaskCompletionSource, get the Task from it, set a response handler that writes the result to the TaskCompletionSource, make the request, and return the Task. This Task is not scheduled, it running on the same thread as a parent task makes no sense. And thus, it can break the order.
Since, you are using await to wait on an external system, this point is moot too.
Conclusion
I must conclude that these are equivalent.
If you want to be safe, and make sure it works as expected, even if - in a future version - some of the above points stops being moot, then keep Task.Run. On the other hand, if you really want to optimize, use the lambda and avoid the Task.Run (very small) overhead. However, that probably won't be a bottleneck.
Addendum
When I talk about a task that goes to an external system, I refer to something that runs outside of .NET. There a bit of code that will run in .NET to interface with the external system, but the bulk of the code will not run in .NET, and thus will not be in a managed thread at all.
The consumer of the API specify nothing for this to happen. The task would be a promise task, but that is not exposed, for the consumer there is nothing special about it.
In fact, a task that goes to an external system may barely run in the CPU at all. Futhermore, it might just be waiting on something exterior to the computer (it could be the network or user input).
The pattern is as follows:
The library creates a TaskCompletionSource.
The library sets a means to recieve a notification. It can be a callback, event, message loop, hook, listening to a socket, a pipe line, waiting on a global mutex... whatever is necesary.
The library sets code to react to the notification that will call SetResult, or SetException on the TaskCompletionSource as appropiate for the notification recieved.
The library does the actual call to the external system.
The library returns TaskCompletionSource.Task.
Note: with extra care of optimization not reordering things where it should not, and with care of handling errors during the setup phase. Also, if a CancellationToken is involved, it has to be taken into account (and call SetCancelled on the TaskCompletionSource when appropiate). Also, there could be tear down necesary in the reaction to the notification (or on cancellation). Ah, do not forget to validate your parameters.
Then the external system goes and does whatever it does. Then when it finishes, or something goes wrong, gives the library the notification, and your Task is sudendtly completed, faulted... (or if cancellation happened, your Task is now cancelled) and .NET will schedule the continuations of the task as needed.
Note: async/await uses continuations behind the scenes, that is how execution resumes.
Incidentally, if you wanted to implement SempahoreSlim yourself, you would have to do something very similar to what I describe above. You can see it in my backport of SemaphoreSlim.
Let us see a couple of examples of promise tasks...
Task.Delay: when we are waiting with Task.Delay, the CPU is not spinning. This is not running in a thread. In this case the notification mechanism will be an OS timer. When the OS sees that the time of the timer has elapsed, it will call into the CLR, and then the CLR will mark the task as completed. What thread was waiting? none.
FileStream.ReadSync: when we are reading from storage with FileStream.ReadSync the actual work is done by the device. The CRL has to declare a custom event, then pass the event, the file handle and the buffer to the OS... the OS calls the device driver, the device driver interfaces with the device. As the storage device recovers the information, it will write to memory (directly on the specified buffer) via DMA technology. And when it is done, it will set an interruption, that is handled by the driver, that notifies the OS, that calls the custom event, that marks the task as completed. What thread did read the data from storage? none.
A similar pattern will be used to download from a web page, except, this time the device goes to the network. How to make an HTTP request and how the system waits for a response is beyond the scope of this answer.
It is also possible that the external system is another program, in which case it would run on a thread. But it won't be a managed thread on your process.
Your take away is that these task do not run on any of your threads. And their timing might depend on external factors. Thus, it makes no sense to think of them as running in the same thread, or that we can predict their timing (well, except of course, in the case of the timer).

Both are not very good because they create the tasks immediately. The func version is a little less overhead since it saves the Task.Run route over the thread pool just to immediately end the thread pool work and suspend on the semaphore. You don't need an async Func, you could simplify this by using an async method (possibly a local function).
But you should not do this at all. Instead, use a helper method that implements a parallel async foreach.
public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
{
return Task.WhenAll(
from partition in Partitioner.Create(source).GetPartitions(dop)
select Task.Run(async delegate {
using (partition)
while (partition.MoveNext())
await body(partition.Current);
}));
}
Then you just go urls.ForEachAsync(myDop, async input => await ProcessAsync(input));
Here, the tasks are created on demand. You can even make the input stream lazy.

Why an additional async operation is making my code faster than when the operation is not taking place at all?

I'm working on a SMS-based game (Value Added Service), in which a question must be sent to each subscriber on a daily basis. There are over 500,000 subscribers and therefore performance is a key factor. Since each subscriber can be a difference state of the competition with different variables, database must be queried separately for each subscriber before sending a text message. To achieve the best performance I'm using .Net Task Parallel Library (TPL) to spawn parallel threadpool threads and do as much async operations as possible in each thread to finally send texts asap.
Before describing the actual problem there are some more information necessary to give about the code.
At first there was no async operation in the code. I just scheduled some 500,000 tasks with the default task scheduler into the Threadpool and each task would work through the routines, blocking on all EF (Entity Framework) queries and sequentially finishing its job. It was good, but not fast enough. Then I changed all EF queries to Async, the outcome was superb in speed but there has been so many deadlocks and timeouts in SQL server that about a third of the subscribers never received a text! After trying different solutions, I decided not to do too many Async Database operations while I have over 500,000 tasks running on a 24 core server (with at least 24 concurrent threadpool threads)!
I rolled back all the changes (the Asycn ones) expect for one web service call in each task which remained Async.
Now the weird case:
In my code, I have a boolean variable named "isCrossSellActive". When the variable is set some more DB operations take place and an asycn webservice call will happen on which the thread awaits. When this variable is false, none of these operations will happen including the async webservice call. Awkwardly when the variable is set the code runs so much faster than when it's not! It seems like for some reason the awaited async code (the cooperative thread) is making the code faster.
Here is the code:
public async Task AutoSendMessages(...)
{
//Get list of subscriptions plus some initialization
LimitedConcurrencyLevelTaskScheduler lcts = new LimitedConcurrencyLevelTaskScheduler(numberOfThreads);
TaskFactory taskFactory = new TaskFactory(lcts);
List<Task> tasks = new List<Task>();
//....
foreach (var sub in subscriptions)
{
AutoSendData data = new AutoSendData
{
ServiceId = serviceId,
MSISDN = sub.subscriber,
IsCrossSellActive = bolCrossSellHeader
};
tasks.Add(await taskFactory.StartNew(async (x) =>
{
await SendQuestion(x);
}, data));
}
GC.Collect();
try
{
Task.WaitAll(tasks.ToArray());
}
catch (AggregateException ae)
{
ae.Handle((ex) =>
{
_logRepo.LogException(1, "", ex);
return true;
});
}
await _autoSendRepo.SetAutoSendingStatusEnd(statusId);
}
public async Task SendQuestion(object data)
{
//extract variables from input parameter
try
{
if (isCrossSellActive)
{
int pieceCount = subscriptionRepo.GetSubscriberCarPieces(curSubscription.service, curSubscription.subscriber).Count(c => c.isConfirmed);
foreach (var rule in csRules)
{
if (rule.Applies)
{
if (await HttpClientHelper.GetJsonAsync<bool>(url, rule.TargetServiceBaseAddress))
{
int noOfAddedPieces = SomeCalculations();
if (noOfAddedPieces > 0)
{
crossSellRepo.SetPromissedPieces(curSubscription.subscriber, curSubscription.service,
rule.TargetShortCode, noOfAddedPieces, 0, rule.ExpirationLimitDays);
}
}
}
}
}
// The rest of the code. (Some db CRUD)
await SmsClient.SendSoapMessage(subscriber, smsBody);
}
catch (Exception ex){//...}
}

Ok, thanks to #usr and the clue he gave me, the problem is finally solved!
His comment drew my attention to the awaited taskFactory.StartNew(...) line which sequentially adds new tasks to the "tasks" list which is then awaited on by Task.WaitAll(tasks);
At first I removed the await keyword before the taskFactory.StartNew() and it led the code towards a horrible state of malfunction! I then returned the await keyword to before taskFactory.StartNew() and debugged the code using breakpoints and amazingly saw that the threads are ran one after another and sequentially before the first thread reaches the first await inside the "SendQuestion" routine. When the "isCrossSellActive" flag was set despite the more jobs a thread should do the first await keyword is reached earlier thus enabling the next scheduled task to run. But when its not set the only await keyword is the last line of the routine so its most likely to run sequentially to the end.
usr's suggestion to remove the await keyword in the for loop seemed to be correct but the problem was the Task.WaitAll() line would wait on the wrong list of Task<Task<void>> instead of Task<void>. I finally used Task.Run instead of TaskFactory.StartNew and everything changed. Now the service is working well. The final code inside the for loop is:
tasks.Add(Task.Run(async () =>
{
await SendQuestion(data);
}));
and the problem was solved.
Thank you all.
P.S. Read this article on Task.Run and why TaskFactory.StartNew is dangerous: http://blog.stephencleary.com/2013/08/startnew-is-dangerous.html

It's extremly hard to tell unless you add some profiling that tell you which code is taking longer now.
Without seeing more numbers my best guess would be that the SMS service doesn't like when you send too many requests in a short time and chokes. When you add the extra DB calls the extra delay make the sms service work better.
A few other small details:
await Task.WhenAll is usually a bit better than Task.WaitAll. WaitAll means the thread will sit around waiting. Making a deadlock slightly more likely.
Instead of:
tasks.Add(await taskFactory.StartNew(async (x) =>
{
await SendQuestion(x);
}, data));
You should be able to do
tasks.Add(SendQuestion(data));

Async Task.WhenAll with timeout - issue with completed tasks accumulating

I have created the following in order to execute multiple async tasks with a timeout. I was looking for something that will allow extracting results from the tasks - taking only those that beat the timeout, regardless if the rest of tasks failed to do so (simplified):
TimeSpan timeout = TimeSpan.FromSeconds(5.0);
Task<Task>[] tasksOfTasks =
{
Task.WhenAny(SomeTaskAsync("a"), Task.Delay(timeout)),
Task.WhenAny(SomeTaskAsync("b"), Task.Delay(timeout)),
Task.WhenAny(SomeTaskAsync("c"), Task.Delay(timeout))
};
Task[] completedTasks = await Task.WhenAll(tasksOfTasks);
List<MyResult> = completedTasks.OfType<Task<MyResult>>().Select(task => task.Result).ToList();
I have implemented this in a non-static class in the (Web API) server.
This worked well on the first call, however additional calls caused completedTasks to strangely accumulate tasks from previous calls to the server (as shown by the debugger). On the second call there were 6 completed tasks, on the third call 9 and so on.
My questions:
Any idea why is that?
I assume it's because the previous tasks weren't cancelled however this code is in a new instance of a class!
Any idea how to avoid this accumulation?
PS: See my answer to this question.

I couldn't use my psychic debugging to understand why your code "caused completedTasks to strangely accumulate tasks from previous calls" but it does probably expose some of your misunderstandings.
Here's a working example based on your code (using string instead of MyResult):
Task<string> timeoutTask =
Task.Delay(TimeSpan.FromSeconds(5)).ContinueWith(_ => string.Empty);
Task<Task<string>>[] tasksOfTasks =
{
Task.WhenAny(SomeTaskAsync("a"), timeoutTask),
Task.WhenAny(SomeTaskAsync("b"), timeoutTask),
Task.WhenAny(SomeTaskAsync("c"), timeoutTask)
};
Task<string>[] completedTasks = await Task.WhenAll(tasksOfTasks);
List<string> results = completedTasks.Where(task => task != timeoutTask).
Select(task => task.Result).ToList();
So, what's different:
I'm using the same timeout task for all WhenAny calls. There's no need to use more, and they could complete in slightly different times.
I make the timeout task return a value, so it's actually a Task<string> and not a Task.
That makes each WhenAny call also return a Task<string> (and tasksOfTasks be Task<Task<string>>[]) which would make it possible to actually return a result out of these tasks.
After awaiting we need to filter out WhenAny calls that returned our timeout task, because there would be no result there (only string.Empty) using completedTasks.Where(task => task != timeoutTask).
P.S : I've also answered that question and I would (surprisingly) recommend you use my solution.
Note: Using the Task.Result property isn't advisable. You should await it instead (even when you know it's already completed)

Preventing task from running on certain thread

I have been struggling a bit with some async await stuff. I am using RabbitMQ for sending/receiving messages between some programs.
As a bit of background, the RabbitMQ client uses 3 or so threads that I can see: A connection thread and two heartbeat threads. Whenever a message is received via TCP, the connection thread handles it and calls a callback which I have supplied via an interface. The documentation says that it is best to avoid doing lots of work during this call since its done on the same thread as the connection and things need to continue on. They supply a QueueingBasicConsumer which has a blocking 'Dequeue' method which is used to wait for a message to be received.
I wanted my consumers to be able to actually release their thread context during this waiting time so somebody else could do some work, so I decided to use async/await tasks. I wrote an AwaitableBasicConsumer class which uses TaskCompletionSources in the following fashion:
I have an awaitable Dequeue method:
public Task<RabbitMQ.Client.Events.BasicDeliverEventArgs> DequeueAsync(CancellationToken cancellationToken)
{
//we are enqueueing a TCS. This is a "read"
rwLock.EnterReadLock();
try
{
TaskCompletionSource<RabbitMQ.Client.Events.BasicDeliverEventArgs> tcs = new TaskCompletionSource<RabbitMQ.Client.Events.BasicDeliverEventArgs>();
//if we are cancelled before we finish, this will cause the tcs to become cancelled
cancellationToken.Register(() =>
{
tcs.TrySetCanceled();
});
//if there is something in the undelivered queue, the task will be immediately completed
//otherwise, we queue the task into deliveryTCS
if (!TryDeliverUndelivered(tcs))
deliveryTCS.Enqueue(tcs);
}
return tcs.Task;
}
finally
{
rwLock.ExitReadLock();
}
}
The callback which the rabbitmq client calls fulfills the tasks: This is called from the context of the AMQP Connection thread
public void HandleBasicDeliver(string consumerTag, ulong deliveryTag, bool redelivered, string exchange, string routingKey, RabbitMQ.Client.IBasicProperties properties, byte[] body)
{
//we want nothing added while we remove. We also block until everybody is done.
rwLock.EnterWriteLock();
try
{
RabbitMQ.Client.Events.BasicDeliverEventArgs e = new RabbitMQ.Client.Events.BasicDeliverEventArgs(consumerTag, deliveryTag, redelivered, exchange, routingKey, properties, body);
bool sent = false;
TaskCompletionSource<RabbitMQ.Client.Events.BasicDeliverEventArgs> tcs;
while (deliveryTCS.TryDequeue(out tcs))
{
//once we manage to actually set somebody's result, we are done with handling this
if (tcs.TrySetResult(e))
{
sent = true;
break;
}
}
//if nothing was sent, we queue up what we got so that somebody can get it later.
/**
* Without the rwlock, this logic would cause concurrency problems in the case where after the while block completes without sending, somebody enqueues themselves. They would get the
* next message and the person who enqueues after them would get the message received now. Locking prevents that from happening since nobody can add to the queue while we are
* doing our thing here.
*/
if (!sent)
{
undelivered.Enqueue(e);
}
}
finally
{
rwLock.ExitWriteLock();
}
}
rwLock is a ReaderWriterLockSlim. The two queues (deliveryTCS and undelivered) are ConcurrentQueues.
The problem:
Every once in a while, the method that awaits the dequeue method throws an exception. This would not normally be an issue since that method is also async and so it enters the "Exception" completion state that tasks enter. The problem comes in the situation where the task that calls DequeueAsync is resumed after the await on the AMQP Connection thread that the RabbitMQ client creates. Normally I have seen tasks resume onto the main thread or one of the worker threads floating around. However, when it resumes onto the AMQP thread and an exception is thrown, everything stalls. The task does not enter its "Exception state" and the AMQP Connection thread is left saying that it is executing the method that had the exception occur.
My main confusion here is why this doesn't work:
var task = c.RunAsync(); //<-- This method awaits the DequeueAsync and throws an exception afterwards
ConsumerTaskState state = new ConsumerTaskState()
{
Connection = connection,
CancellationToken = cancellationToken
};
//if there is a problem, we execute our faulted method
//PROBLEM: If task fails when its resumed onto the AMQP thread, this method is never called
task.ContinueWith(this.OnFaulted, state, TaskContinuationOptions.OnlyOnFaulted);
Here is the RunAsync method, set up for the test:
public async Task RunAsync()
{
using (var channel = this.Connection.CreateModel())
{
...
AwaitableBasicConsumer consumer = new AwaitableBasicConsumer(channel);
var result = consumer.DequeueAsync(this.CancellationToken);
//wait until we find something to eat
await result;
throw new NotImplementeException(); //<-- the test exception. Normally this causes OnFaulted to be called, but sometimes, it stalls
...
} //<-- This is where the debugger says the thread is sitting at when I find it in the stalled state
}
Reading what I have written, I see that I may not have explained my problem very well. If clarification is needed, just ask.
My solutions that I have come up with are as follows:
Remove all Async/Await code and just use straight up threads and block. Performance will be decreased, but at least it won't stall sometimes
Somehow exempt the AMQP threads from being used for resuming tasks. I assume that they were sleeping or something and then the default TaskScheduler decided to use them. If I could find a way to tell the task scheduler that those threads are off limits, that would be great.
Does anyone have an explanation for why this is happening or any suggestions to solving this? Right now I am removing the async code just so that the program is reliable, but I really want to understand what is going on here.

I first recommend that you read my async intro, which explains in precise terms how await will capture a context and use that to resume execution. In short, it will capture the current SynchronizationContext (or the current TaskScheduler if SynchronizationContext.Current is null).
The other important detail is that async continuations are scheduled with TaskContinuationOptions.ExecuteSynchronously (as #svick pointed out in a comment). I have a blog post about this but AFAIK it is not officially documented anywhere. This detail does make writing an async producer/consumer queue difficult.
The reason await isn't "switching back to the original context" is (probably) because the RabbitMQ threads don't have a SynchronizationContext or TaskScheduler - thus, the continuation is executed directly when you call TrySetResult because those threads look just like regular thread pool threads.
BTW, reading through your code, I suspect your use of a reader/writer lock and concurrent queues are incorrect. I can't be sure without seeing the whole code, but that's my impression.
I strongly recommend you use an existing async queue and build a consumer around that (in other words, let someone else do the hard part :). The BufferBlock<T> type in TPL Dataflow can act as an async queue; that would be my first recommendation if you have Dataflow available on your platform. Otherwise, I have an AsyncProducerConsumerQueue type in my AsyncEx library, or you could write your own (as I describe on my blog).
Here's an example using BufferBlock<T>:
private readonly BufferBlock<RabbitMQ.Client.Events.BasicDeliverEventArgs> _queue = new BufferBlock<RabbitMQ.Client.Events.BasicDeliverEventArgs>();
public void HandleBasicDeliver(string consumerTag, ulong deliveryTag, bool redelivered, string exchange, string routingKey, RabbitMQ.Client.IBasicProperties properties, byte[] body)
{
RabbitMQ.Client.Events.BasicDeliverEventArgs e = new RabbitMQ.Client.Events.BasicDeliverEventArgs(consumerTag, deliveryTag, redelivered, exchange, routingKey, properties, body);
_queue.Post(e);
}
public Task<RabbitMQ.Client.Events.BasicDeliverEventArgs> DequeueAsync(CancellationToken cancellationToken)
{
return _queue.ReceiveAsync(cancellationToken);
}
In this example, I'm keeping your DequeueAsync API. However, once you start using TPL Dataflow, consider using it elsewhere as well. When you need a queue like this, it's common to find other parts of your code that would also benefit from a dataflow approach. E.g., instead of having a bunch of methods calling DequeueAsync, you could link your BufferBlock to an ActionBlock.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Select with asynchronous method - c#

Related

Log data into cassandra using c#

Throttling with SemaphoreSlim -- "Task.Run()" vs "new Func<Task>()"

Why an additional async operation is making my code faster than when the operation is not taking place at all?

Async Task.WhenAll with timeout - issue with completed tasks accumulating

Preventing task from running on certain thread

Categories

Resources