Only Fan-out (and forget) in Durable Functions - c#

I have an existing Function App with 2 Functions and a storage queue. F1 is triggered by a message in a service bus topic. For each msg received, F1 calculates some sub-tasks (T1,T2,...) which have to be executed with varying amount of delay. Ex - T1 to be fired after 3 mins, T2 after 5min etc. F1 posts messages to a storage queue with appropriate visibility timeouts (to simulate the delay) and F2 is triggered whenever a message is visible in the queue. All works fine.
I now want to migrate this app to use 'Durable Functions'. F1 now only starts the orchestrator. The orchestrator code is something as follows -
public static async Task Orchestrator([OrchestrationTrigger] DurableOrchestrationContext context, TraceWriter log)
{
var results = await context.CallActivityAsync<List<TaskInfo>>("CalculateTasks", "someinput");
List<Task> tasks = new List<Task>();
foreach (var value in results)
{
var pnTask = context.CallActivityAsync("PerformSubTask", value);
tasks.Add(pnTask);
}
//dont't await as we want to fire and forget. No fan-in!
//await Task.WhenAll(tasks);
}
[FunctionName("PerformSubTask")]
public async static Task Run([ActivityTrigger]TaskInfo info, TraceWriter log)
{
TimeSpan timeDifference = DateTime.UtcNow - info.Origin.ToUniversalTime();
TimeSpan delay = TimeSpan.FromSeconds(info.DelayInSeconds);
var actualDelay = timeDifference > delay ? TimeSpan.Zero : delay - timeDifference;
//will still keep the activity function running and incur costs??
await Task.Delay(actualDelay);
//perform subtask work after delay!
}
I would only like to fan-out (no fan-in to collect the results) and start the subtasks. The orchestrator starts all the tasks and avoids call 'await Task.WhenAll'. The activity function calls 'Task.Delay' to wait for the specified amount of time and then does its work.
My questions
Does it make sense to use Durable Functions for this workflow?
Is this the right approach to orchestrate 'Fan-out' workflow?
I do not like the fact that the activity function is running for specified amount of time (3 or 5 mins) doing nothing. It will incurs costs,or?
Also if a delay of more than 10 minutes is required there is no way for an activity function to succeed with this approach!
My earlier attempt to avoid this was to use 'CreateTimer' in the orchestrator and then add the activity as a continuation, but I see only timer entries in the 'History' table. The continuation does not fire! Am I violating the constraint for orchestrator code - 'Orchestrator code must never initiate any async operation' ?
foreach (var value in results)
{
//calculate time to start
var timeToStart = ;
var pnTask = context.CreateTimer(timeToStart , CancellationToken.None).ContinueWith(t => context.CallActivityAsync("PerformSubTask", value));
tasks.Add(pnTask);
}
UPDATE : using approach suggested by Chris
Activity that calculates subtasks and delays
[FunctionName("CalculateTasks")]
public static List<TaskInfo> CalculateTasks([ActivityTrigger]string input,TraceWriter log)
{
//in reality time is obtained by calling an endpoint
DateTime currentTime = DateTime.UtcNow;
return new List<TaskInfo> {
new TaskInfo{ DelayInSeconds = 10, Origin = currentTime },
new TaskInfo{ DelayInSeconds = 20, Origin = currentTime },
new TaskInfo{ DelayInSeconds = 30, Origin = currentTime },
};
}
public static async Task Orchestrator([OrchestrationTrigger] DurableOrchestrationContext context, TraceWriter log)
{
var results = await context.CallActivityAsync<List<TaskInfo>>("CalculateTasks", "someinput");
var currentTime = context.CurrentUtcDateTime;
List<Task> tasks = new List<Task>();
foreach (var value in results)
{
TimeSpan timeDifference = currentTime - value.Origin;
TimeSpan delay = TimeSpan.FromSeconds(value.DelayInSeconds);
var actualDelay = timeDifference > delay ? TimeSpan.Zero : delay - timeDifference;
var timeToStart = currentTime.Add(actualDelay);
Task delayedActivityCall = context
.CreateTimer(timeToStart, CancellationToken.None)
.ContinueWith(t => context.CallActivityAsync("PerformSubtask", value));
tasks.Add(delayedActivityCall);
}
await Task.WhenAll(tasks);
}
Simply scheduling tasks from within the orchestrator seems to work.In my case I am calculating the tasks and the delays in another activity (CalculateTasks) before the loop. I want the delays to be calculated using the 'current time' when the activity was run. I am using DateTime.UtcNow in the activity. This somehow does not play well when used in the orchestrator. The activities specified by 'ContinueWith' just don't run and the orchestrator is always in 'Running' state.
Can I not use the time recorded by an activity from within the orchestrator?
UPDATE 2
So the workaround suggested by Chris works!
Since I do not want to collect the results of the activities I avoid calling 'await Tasks.WhenAll(tasks)' after scheduling all activities. I do this in order to reduce the contention on the control queue i.e. be able to start another orchestration if reqd. Nevertheless the status of the 'orchestrator' is still 'Running' till all the activities finish running. I guess it moves to 'Complete' only after the last activity posts a 'done' message to the control queue.
Am I right? Is there any way to free the orchestrator earlier i.e right after scheduling all activities?

The ContinueWith approach worked fine for me. I was able to simulate a version of your scenario using the following orchestrator code:
[FunctionName("Orchestrator")]
public static async Task Orchestrator(
[OrchestrationTrigger] DurableOrchestrationContext context,
TraceWriter log)
{
var tasks = new List<Task>(10);
for (int i = 0; i < 10; i++)
{
int j = i;
DateTime timeToStart = context.CurrentUtcDateTime.AddSeconds(10 * j);
Task delayedActivityCall = context
.CreateTimer(timeToStart, CancellationToken.None)
.ContinueWith(t => context.CallActivityAsync("PerformSubtask", j));
tasks.Add(delayedActivityCall);
}
await Task.WhenAll(tasks);
}
And for what it's worth, here is the activity function code.
[FunctionName("PerformSubtask")]
public static void Activity([ActivityTrigger] int j, TraceWriter log)
{
log.Warning($"{DateTime.Now:o}: {j:00}");
}
From the log output, I saw that all activity invocations ran 10 seconds apart from each other.
Another approach would be to fan out to multiple sub-orchestrations (like #jeffhollan suggested) which are simple a short sequence of a durable timer delay and your activity call.
UPDATE
I tried using your updated sample and was able to reproduce your problem! If you run locally in Visual Studio and configure the exception settings to always break on exceptions, then you should see the following:
System.InvalidOperationException: 'Multithreaded execution was detected. This can happen if the orchestrator function code awaits on a task that was not created by a DurableOrchestrationContext method. More details can be found in this article https://learn.microsoft.com/en-us/azure/azure-functions/durable-functions-checkpointing-and-replay#orchestrator-code-constraints.'
This means the thread which called context.CallActivityAsync("PerformSubtask", j) was not the same as the thread which called the orchestrator function. I don't know why my initial example didn't hit this, or why your version did. It has something to do with how the TPL decides which thread to use to run your ContinueWith delegate - something I need to look more into.
The good news is that there is a simple workaround, which is to specify TaskContinuationOptions.ExecuteSynchronously, like this:
Task delayedActivityCall = context
.CreateTimer(timeToStart, CancellationToken.None)
.ContinueWith(
t => context.CallActivityAsync("PerformSubtask", j),
TaskContinuationOptions.ExecuteSynchronously);
Please try that and let me know if that fixes the issue you're observing.
Ideally you wouldn't need to do this workaround when using Task.ContinueWith. I've opened an issue in GitHub to track this: https://github.com/Azure/azure-functions-durable-extension/issues/317
Since I do not want to collect the results of the activities I avoid calling await Tasks.WhenAll(tasks) after scheduling all activities. I do this in order to reduce the contention on the control queue i.e. be able to start another orchestration if reqd. Nevertheless the status of the 'orchestrator' is still 'Running' till all the activities finish running. I guess it moves to 'Complete' only after the last activity posts a 'done' message to the control queue.
This is expected. Orchestrator functions never actually complete until all outstanding durable tasks have completed. There isn't any way to work around this. Note that you can still start other orchestrator instances, there just might be some contention if they happen to land on the same partition (there are 4 partitions by default).

await Task.Delay is definitely not the best option: you will pay for this time while your function won't do anything useful. The max delay is also bound to 10 minutes on Consumption plan.
In my opinion, raw Queue messages are the best option for fire-and-forget scenarios. Set the proper visibility timeouts, and your scenario will work reliably and efficiently.
The killer feature of Durable Functions are awaits, which do their magic of pausing and resuming while keeping the scope. Thus, it's a great way to implement fan-in, but you don't need that.

I think durable definitely makes sense for this workflow. I do think the best option would be to leverage the delay / timer feature as you said, but based on the synchronous nature of execution I don't think I would add everything to a task list which is really expecting a .WhenAll() or .WhenAny() which you aren't aiming for. I think I personally would just do a sequential foreach loop with timer delays for each task. So pseudocode of:
for(int x = 0; x < results.Length; x++) {
await context.CreateTimer(TimeSpan.FromMinutes(1), ...);
await context.CallActivityAsync("PerformTaskAsync", results[x]);
}
You need those awaits in there regardless, so just avoiding the await Task.WhenAll(...) is likely causing some issues in code sample above. Hope that helps

You should be able to use the IDurableOrchestrationContext.StartNewOrchestration() method that's been added in 2019 to suport this scenario. See https://github.com/Azure/azure-functions-durable-extension/issues/715 for context

Related

Log data into cassandra using c#

I trying to log data into Cassandra using c#. So my aim is to log as much data points as I can in 200ms.
I am trying to save time, random key and value in 200ms. Please see code for refrence. the problem how can I execute session after while loop.
Cluster cluster = Cluster.Builder()
.AddContactPoint("127.0.0.1")
.Build();
ISession session = cluster.Connect("log"); //keyspace to connect with
var ps = session.Prepare("Insert into logcassandra(nanodate, key, value) values (?,?,?)");
stopwatch.Start();
while(stop.ElapsedMilliseconds <= 200)
{
i++;
var statement = ps.Bind(nanoTime(),"key"+i,"value"+i);
session.ExecuteAsync(statement);
}
Please prefer System.Threading.Timer with a TimerCallback over Stopwatch.
EDIT: (reply to the comment)
Hi, I'm not sure what you want to achieve, but here are some general concepts about async calls and parallel execution. In .NET world the async is mainly used for Non-blocking I/O operations, which means your caller thread will not wait for the response of the I/O driver. In other words, you instantiate an I/O operation and dispatch this work to a "thing" which is outside of the .NET ecosystem and that will gives you back a future (a Task). The driver acknowledges back that it received the request and it promises that it will process it once it has free capacity.
That Task represents an async work that either succeeded or fail. But because you are calling it asynchronously you are not awaiting its result (not blocking the caller thread to wait for external work) rather move on to the next statement. Eventually this operation will be finished and at that time the driver will notify that Task that a request operation has been finished. (The Task can be seen as the primary communication channel between the caller and the callee)
In your case you are using a fire and forget style async call. That means you are firing off a lot of I/O operations in async and you forget to process the result of them. You don't know either any of them failed or not. But you have called the Casandra to do a lot of staff. Your time measurement is used only for firing off jobs, which means you have no idea how much of these jobs has been finished.
If you would choose to use await against your async calls, that would mean that your while loop would be serially executed. You would firing off a job and you can't move on to the next iteration because you are awaiting it, so your caller thread will move one level higher in its call stack and examines if it can processed with something. If there is an await as well, then it moves one level higher and so on...
while(stop.ElapsedMilliseconds <= 200)
{
await session.ExecuteAsync(statement);
}
If you don't want serial execution rather parallel, you can create as many jobs as you need and await them as a whole. That's where Task.WhenAll comes into the play. You will fire off a lot of jobs and you will await that single job that will track all of other jobs.
var cassandraCalls = new List<Task>();
cassandraCalls.AddRange(Enumerable.Range(0, 100).Select(_ => session.ExecuteAsync(statement)));
await Task.WhenAll(cassandraCalls);
But this code will run until all of the jobs are finished. If you want to constrain the whole execution time then you should use some cancellation mechanism. Task.WhenAll does not support CancellationToken. But you can overcome of this limitation in several way. The simplest solution is a combination of the Task.Delay and the Task.WhenAny. Task.Delay will be used for the timeout, and Task.WhenAny will be used to await either the your cassandra calls or the timeout to complete.
var cassandraCalls = new List<Task>();
cassandraCalls.AddRange(Enumerable.Range(0, 100).Select(_ => ExecuteAsync()));
await Task.WhenAny(Task.WhenAll(cassandraCalls), Task.Delay(1000));
In this way, you have fired off as many jobs as you wanted and depending on your driver they may be executed in parallel or concurrently. You are awaiting either to finish all or elapse a certain amount of time. When the WhenAny job finishes then you can examine the result of the jobs, but simply iterating over the cassandraCalls
foreach (var call in cassandraCalls)
{
Console.WriteLine(call.IsCompleted);
}
I hope this explanation helped you a bit.

Guaranteeing serial task execution with Rx when using Observable.FromAsync and the Switch operator?

I'm trying to use Observable.FromAsync() in combination with Switch() to execute a task until a new notification comes in, something a bit like this:
public static async void Main(string[] args)
{
var pump = Observable.Range(0, 3);
var sub = pump.Select(_ => Observable.FromAsync(ct => Run(ct, _))).Switch();
await sub.ToTask();
}
private static async Task Run(CancellationToken ct, int i)
{
Console.WriteLine($"Started {i}");
while (true) await Task.Delay(TimeSpan.FromSeconds(1)); // note not using ct on purpose
}
This prints:
Started 0
Started 1
Started 2
But this is not quite what I want. I would like the tasks produced by the FromAsync() call to only ever be executed serially and exclusively, so in the case where the task does not support cancellation (or the cancellation is very slow) I want the output to be just:
Started 0
In effect, the observable should now become blocked until the previous task has completed.
I know Concat() can be used instead of Switch() to remove parallelism in this situation, but the semantic of this is different; it won't unsubscribe from the current observable and so the cancellation token will not get revoked and the task cannot gracefully stop. It's important that we try to cancel the token so I can wind down the task, not just replace it.
Note: If we amend Run() to pass the cancellation token to the Task.Delay() call we see it outputs the same three lines, which is desired.
My workaround at the moment involves using a semaphore in the Run() method, this is simple but not ideal. I feel like I should be able to solve this at the Rx layer.

running multiple tasks asynchronously and adding new ones to the task list as tasks finish

Apologies for the bad title, I'm not sure how to succinctly describe this.
I'm building a service in .net core to process files. The general idea would be to (in a loop) check for new tasks to run on a file by using a DB call, and if found, fire them off async. There would be a throttler to limit the number of running tasks. If it hit the limit, it would just wait until a task is finished before firing off a new one. If the DB call returns nothing, then it would just sleep for a bit and try again later.
Fairly basic I think, but I'm running into issues because everything I've read says you should always await a task and not "fire and forget". But if I do that, then I'm left either running files synchronously or batching them. I'm missing something I'm sure.
Currently it's implemented as a IHostedService (BackgroundService specifically). Here us the main ExecuteAsync method that does the work. Implemented as fire and forget. (Visual studio gives the "call not awaited" warning on Task.Run)
protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
var throttleTimeout = 10000;
//set up semaphore
var taskThrottler = new SemaphoreSlim(10, 10);
while (!stoppingToken.IsCancellationRequested)
{
//wait for free slot
await taskThrottler.WaitAsync(throttleTimeout,stoppingToken);
//get next file to process
var currentFile = await _fileService.GetNextFileForProcessing();
// create task
Task.Run( async
() =>
{
try
{
//do work with file tracker
var result = await _fileProcessorController.ProcessFile(currentFile);
}
finally
{
taskThrottler.Release();
}
}
, stoppingToken);
}
}
Now if I change that and tack on an await before Task.Run, then I'm (essentially) running synchronously because each task will be awaited before the next is fired..
If I add the tasks to a list I can use "await Task.WhenAll(tasks);". But then I'm batching.
I need to be able to gracefully handle failed processing jobs, so an await in some form is a must I think.
All the examples I've found that tackle a similar problem usually have the list of tasks (or files) to do in advance, and then just iterate through. But I'm not sure what the best method is when I need to add tasks to the list while processing others.
Perhaps something that uses a task list and WhenAny? And then can I add new tasks to the task list on each loop if any are found?

Throttling with SemaphoreSlim -- "Task.Run()" vs "new Func<Task>()"

This might not be specific to SemaphoreSlim exclusively, but basically my question is about whether there is a difference between the below two methods of throttling a collection of long running tasks, and if so, what that difference is (and when if ever to use either).
In the example below, let's say that each tracked task involves loading data from a Url (totally made up example, but is a common one that I've found for SemaphoreSlim examples).
The main difference comes down to how the individual tasks are added to the list of tracked tasks. In the first example, we call Task.Run() with a lambda, whereas in the second, we new up a Func(<Task<Result>>()) with a lambda and then immediately call that func and add the result to the tracked task list.
Examples:
Using Task.Run():
SemaphoreSlim ss = new SemaphoreSlim(_concurrentTasks);
List<string> urls = ImportUrlsFromSource();
List<Task<Result>> trackedTasks = new List<Task<Result>>();
foreach (var item in urls)
{
await ss.WaitAsync().ConfigureAwait(false);
trackedTasks.Add(Task.Run(async () =>
{
try
{
return await ProcessUrl(item);
}
catch (Exception e)
{
_log.Error($"logging some stuff");
throw;
}
finally
{
ss.Release();
}
}));
}
var results = await Task.WhenAll(trackedTasks);
Using a new Func:
SemaphoreSlim ss = new SemaphoreSlim(_concurrentTasks);
List<string> urls = ImportUrlsFromSource();
List<Task<Result>> trackedTasks = new List<Task<Result>>();
foreach (var item in urls)
{
trackedTasks.Add(new Func<Task<Result>>(async () =>
{
await ss.WaitAsync().ConfigureAwait(false);
try
{
return await ProcessUrl(item);
}
catch (Exception e)
{
_log.Error($"logging some stuff");
throw;
}
finally
{
ss.Release();
}
})());
}
var results = await Task.WhenAll(trackedTasks);
There are two differences:
Task.Run does error handling
First off all, when you call the lambda, it runs. On the other hand, Task.Run would call it. This is relevant because Task.Run does a bit of work behind the scenes. The main work it does is handling a faulted task...
If you call a lambda, and the lambda throws, it would throw before you add the Task to the list...
However, in your case, because your lambda is async, the compiler would create the Task for it (you are not making it by hand), and it will correctly handle the exception and make it available via the returned Task. Therefore this point is moot.
Task.Run prevents task attachment
Task.Run sets DenyChildAttach. This means that the tasks created inside the Task.Run run independently from (are not synchronized with) the returned Task.
For example, this code:
List<Task<int>> trackedTasks = new List<Task<int>>();
var numbers = new int[]{0, 1, 2, 3, 4};
foreach (var item in numbers)
{
trackedTasks.Add(Task.Run(async () =>
{
var x = 0;
(new Func<Task<int>>(async () =>{x = item; return x;}))().Wait();
Console.WriteLine(x);
return x;
}));
}
var results = await Task.WhenAll(trackedTasks);
Will output the numbers from 0 to 4, in unknown order. However the following code:
List<Task<int>> trackedTasks = new List<Task<int>>();
var numbers = new int[]{0, 1, 2, 3, 4};
foreach (var item in numbers)
{
trackedTasks.Add(new Func<Task<int>>(async () =>
{
var x = 0;
(new Func<Task<int>>(async () =>{x = item; return x;}))().Wait();
Console.WriteLine(x);
return x;
})());
}
var results = await Task.WhenAll(trackedTasks);
Will output the numbers from 0 to 4, in order, every time. This is odd, right? What happens is that the inner task is attached to outer one, and executed right away in the same thread. But if you use Task.Run, the inner task is not attached and scheduled independently.
This remain true even if you use await, as long as the task you await does not go to an external system...
What happens with external system? Well, for example, if your task is reading from an URL - as in your example - the system would create a TaskCompletionSource, get the Task from it, set a response handler that writes the result to the TaskCompletionSource, make the request, and return the Task. This Task is not scheduled, it running on the same thread as a parent task makes no sense. And thus, it can break the order.
Since, you are using await to wait on an external system, this point is moot too.
Conclusion
I must conclude that these are equivalent.
If you want to be safe, and make sure it works as expected, even if - in a future version - some of the above points stops being moot, then keep Task.Run. On the other hand, if you really want to optimize, use the lambda and avoid the Task.Run (very small) overhead. However, that probably won't be a bottleneck.
Addendum
When I talk about a task that goes to an external system, I refer to something that runs outside of .NET. There a bit of code that will run in .NET to interface with the external system, but the bulk of the code will not run in .NET, and thus will not be in a managed thread at all.
The consumer of the API specify nothing for this to happen. The task would be a promise task, but that is not exposed, for the consumer there is nothing special about it.
In fact, a task that goes to an external system may barely run in the CPU at all. Futhermore, it might just be waiting on something exterior to the computer (it could be the network or user input).
The pattern is as follows:
The library creates a TaskCompletionSource.
The library sets a means to recieve a notification. It can be a callback, event, message loop, hook, listening to a socket, a pipe line, waiting on a global mutex... whatever is necesary.
The library sets code to react to the notification that will call SetResult, or SetException on the TaskCompletionSource as appropiate for the notification recieved.
The library does the actual call to the external system.
The library returns TaskCompletionSource.Task.
Note: with extra care of optimization not reordering things where it should not, and with care of handling errors during the setup phase. Also, if a CancellationToken is involved, it has to be taken into account (and call SetCancelled on the TaskCompletionSource when appropiate). Also, there could be tear down necesary in the reaction to the notification (or on cancellation). Ah, do not forget to validate your parameters.
Then the external system goes and does whatever it does. Then when it finishes, or something goes wrong, gives the library the notification, and your Task is sudendtly completed, faulted... (or if cancellation happened, your Task is now cancelled) and .NET will schedule the continuations of the task as needed.
Note: async/await uses continuations behind the scenes, that is how execution resumes.
Incidentally, if you wanted to implement SempahoreSlim yourself, you would have to do something very similar to what I describe above. You can see it in my backport of SemaphoreSlim.
Let us see a couple of examples of promise tasks...
Task.Delay: when we are waiting with Task.Delay, the CPU is not spinning. This is not running in a thread. In this case the notification mechanism will be an OS timer. When the OS sees that the time of the timer has elapsed, it will call into the CLR, and then the CLR will mark the task as completed. What thread was waiting? none.
FileStream.ReadSync: when we are reading from storage with FileStream.ReadSync the actual work is done by the device. The CRL has to declare a custom event, then pass the event, the file handle and the buffer to the OS... the OS calls the device driver, the device driver interfaces with the device. As the storage device recovers the information, it will write to memory (directly on the specified buffer) via DMA technology. And when it is done, it will set an interruption, that is handled by the driver, that notifies the OS, that calls the custom event, that marks the task as completed. What thread did read the data from storage? none.
A similar pattern will be used to download from a web page, except, this time the device goes to the network. How to make an HTTP request and how the system waits for a response is beyond the scope of this answer.
It is also possible that the external system is another program, in which case it would run on a thread. But it won't be a managed thread on your process.
Your take away is that these task do not run on any of your threads. And their timing might depend on external factors. Thus, it makes no sense to think of them as running in the same thread, or that we can predict their timing (well, except of course, in the case of the timer).
Both are not very good because they create the tasks immediately. The func version is a little less overhead since it saves the Task.Run route over the thread pool just to immediately end the thread pool work and suspend on the semaphore. You don't need an async Func, you could simplify this by using an async method (possibly a local function).
But you should not do this at all. Instead, use a helper method that implements a parallel async foreach.
public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
{
return Task.WhenAll(
from partition in Partitioner.Create(source).GetPartitions(dop)
select Task.Run(async delegate {
using (partition)
while (partition.MoveNext())
await body(partition.Current);
}));
}
Then you just go urls.ForEachAsync(myDop, async input => await ProcessAsync(input));
Here, the tasks are created on demand. You can even make the input stream lazy.

Why an additional async operation is making my code faster than when the operation is not taking place at all?

I'm working on a SMS-based game (Value Added Service), in which a question must be sent to each subscriber on a daily basis. There are over 500,000 subscribers and therefore performance is a key factor. Since each subscriber can be a difference state of the competition with different variables, database must be queried separately for each subscriber before sending a text message. To achieve the best performance I'm using .Net Task Parallel Library (TPL) to spawn parallel threadpool threads and do as much async operations as possible in each thread to finally send texts asap.
Before describing the actual problem there are some more information necessary to give about the code.
At first there was no async operation in the code. I just scheduled some 500,000 tasks with the default task scheduler into the Threadpool and each task would work through the routines, blocking on all EF (Entity Framework) queries and sequentially finishing its job. It was good, but not fast enough. Then I changed all EF queries to Async, the outcome was superb in speed but there has been so many deadlocks and timeouts in SQL server that about a third of the subscribers never received a text! After trying different solutions, I decided not to do too many Async Database operations while I have over 500,000 tasks running on a 24 core server (with at least 24 concurrent threadpool threads)!
I rolled back all the changes (the Asycn ones) expect for one web service call in each task which remained Async.
Now the weird case:
In my code, I have a boolean variable named "isCrossSellActive". When the variable is set some more DB operations take place and an asycn webservice call will happen on which the thread awaits. When this variable is false, none of these operations will happen including the async webservice call. Awkwardly when the variable is set the code runs so much faster than when it's not! It seems like for some reason the awaited async code (the cooperative thread) is making the code faster.
Here is the code:
public async Task AutoSendMessages(...)
{
//Get list of subscriptions plus some initialization
LimitedConcurrencyLevelTaskScheduler lcts = new LimitedConcurrencyLevelTaskScheduler(numberOfThreads);
TaskFactory taskFactory = new TaskFactory(lcts);
List<Task> tasks = new List<Task>();
//....
foreach (var sub in subscriptions)
{
AutoSendData data = new AutoSendData
{
ServiceId = serviceId,
MSISDN = sub.subscriber,
IsCrossSellActive = bolCrossSellHeader
};
tasks.Add(await taskFactory.StartNew(async (x) =>
{
await SendQuestion(x);
}, data));
}
GC.Collect();
try
{
Task.WaitAll(tasks.ToArray());
}
catch (AggregateException ae)
{
ae.Handle((ex) =>
{
_logRepo.LogException(1, "", ex);
return true;
});
}
await _autoSendRepo.SetAutoSendingStatusEnd(statusId);
}
public async Task SendQuestion(object data)
{
//extract variables from input parameter
try
{
if (isCrossSellActive)
{
int pieceCount = subscriptionRepo.GetSubscriberCarPieces(curSubscription.service, curSubscription.subscriber).Count(c => c.isConfirmed);
foreach (var rule in csRules)
{
if (rule.Applies)
{
if (await HttpClientHelper.GetJsonAsync<bool>(url, rule.TargetServiceBaseAddress))
{
int noOfAddedPieces = SomeCalculations();
if (noOfAddedPieces > 0)
{
crossSellRepo.SetPromissedPieces(curSubscription.subscriber, curSubscription.service,
rule.TargetShortCode, noOfAddedPieces, 0, rule.ExpirationLimitDays);
}
}
}
}
}
// The rest of the code. (Some db CRUD)
await SmsClient.SendSoapMessage(subscriber, smsBody);
}
catch (Exception ex){//...}
}
Ok, thanks to #usr and the clue he gave me, the problem is finally solved!
His comment drew my attention to the awaited taskFactory.StartNew(...) line which sequentially adds new tasks to the "tasks" list which is then awaited on by Task.WaitAll(tasks);
At first I removed the await keyword before the taskFactory.StartNew() and it led the code towards a horrible state of malfunction! I then returned the await keyword to before taskFactory.StartNew() and debugged the code using breakpoints and amazingly saw that the threads are ran one after another and sequentially before the first thread reaches the first await inside the "SendQuestion" routine. When the "isCrossSellActive" flag was set despite the more jobs a thread should do the first await keyword is reached earlier thus enabling the next scheduled task to run. But when its not set the only await keyword is the last line of the routine so its most likely to run sequentially to the end.
usr's suggestion to remove the await keyword in the for loop seemed to be correct but the problem was the Task.WaitAll() line would wait on the wrong list of Task<Task<void>> instead of Task<void>. I finally used Task.Run instead of TaskFactory.StartNew and everything changed. Now the service is working well. The final code inside the for loop is:
tasks.Add(Task.Run(async () =>
{
await SendQuestion(data);
}));
and the problem was solved.
Thank you all.
P.S. Read this article on Task.Run and why TaskFactory.StartNew is dangerous: http://blog.stephencleary.com/2013/08/startnew-is-dangerous.html
It's extremly hard to tell unless you add some profiling that tell you which code is taking longer now.
Without seeing more numbers my best guess would be that the SMS service doesn't like when you send too many requests in a short time and chokes. When you add the extra DB calls the extra delay make the sms service work better.
A few other small details:
await Task.WhenAll is usually a bit better than Task.WaitAll. WaitAll means the thread will sit around waiting. Making a deadlock slightly more likely.
Instead of:
tasks.Add(await taskFactory.StartNew(async (x) =>
{
await SendQuestion(x);
}, data));
You should be able to do
tasks.Add(SendQuestion(data));

Categories