In an AWS Lambda function, I would like to be able to call a component to create a RDS DB Snapshot. There is an async method on the client named CreateDBSnapshotAsync. But, because this is AWS Lambda, I only have 5 minutes to complete the task. So, if I await it, the AWS Lambda function will timeout. And, apparently when it times out, the call is cancelled and then the snapshot is not completed.
Is there some way I can make the call in a COMPLETELY asynchronously way so that once I invoke it, it will complete no matter if my Lambda function times out or not?
In other words, I don't care about the result, I just want to invoke the process and move on, a "set it and forget it" mentality.
My call (without the await, obviously) is as below
using (var rdsClient = new AmazonRDSClient())
{
Task<CreateDBSnapshotResponse> response = rdsClient.CreateDBSnapshotAsync(new CreateDBSnapshotRequest($"MySnapShot", instanceId));
}
As requested, here's the full method:
public async Task<CloudFormationResponse> MigrateDatabase(CloudFormationRequest request, ILambdaContext context)
{
LambdaLogger.Log($"{nameof(MigrateDatabase)} invoked: " + JsonConvert.SerializeObject(request));
if (request.RequestType != "Delete")
{
try
{
var migrations = this.Context.Database.GetPendingMigrations().OrderBy(b=>b).ToList();
for (int i = 0; i < migrations.Count(); i++)
{
string thisMigration = migrations [i];
this.ApplyMigrationInternal(thisMigration);
}
this.TakeSnapshotAsync(context,migrations.Last());
return await CloudFormationResponse.CompleteCloudFormationResponse(null, request, context);
}
catch (Exception e)
{
LambdaLogger.Log(e.ToString());
if (e.InnerException != null) LambdaLogger.Log(e.InnerException.ToString());
return await CloudFormationResponse.CompleteCloudFormationResponse(e, request, context);
}
}
return await CloudFormationResponse.CompleteCloudFormationResponse(null, request, context);
}
internal void TakeSnapshotAsync(ILambdaContext context, string migration)
{
var instanceId = this.GetEnvironmentVariable(nameof(DBInstance));
using (var rdsClient = new AmazonRDSClient())
{
Task<CreateDBSnapshotResponse> response = rdsClient.CreateDBSnapshotAsync(new CreateDBSnapshotRequest($"{instanceId}{migration.Replace('_','-')}", instanceId));
while (context.RemainingTime > TimeSpan.FromSeconds(15))
{
Thread.Sleep(15000);
}
}
}
First refactor that sub function to use proper async syntax along with the use of Task.WhenAny.
internal async Task TakeSnapshotAsync(ILambdaContext context, string migration) {
var instanceId = this.GetEnvironmentVariable(nameof(DBInstance));
//don't wrap in using block or it will be disposed before you are done with it.
var rdsClient = new AmazonRDSClient();
var request = new CreateDBSnapshotRequest($"{instanceId}{migration.Replace('_','-')}", instanceId);
//don't await this long running task
Task<CreateDBSnapshotResponse> response = rdsClient.CreateDBSnapshotAsync(request);
Task delay = Task.Run(async () => {
while (context.RemainingTime > TimeSpan.FromSeconds(15)) {
await Task.Delay(15000); //Don't mix Thread.Sleep. use Task.Delay and await it.
}
}
// The call returns as soon as the first operation completes,
// even if the others are still running.
await Task.WhenAny(response, delay);
}
So if the RemainingTime runs out, it will break out of the call even if the snap shot task is still running so that the request does not time out.
Now you should be able to await the snapshot while there is still time available in the context
public async Task<CloudFormationResponse> MigrateDatabase(CloudFormationRequest request, ILambdaContext context) {
LambdaLogger.Log($"{nameof(MigrateDatabase)} invoked: " + JsonConvert.SerializeObject(request));
if (request.RequestType != "Delete") {
try {
var migrations = this.Context.Database.GetPendingMigrations().OrderBy(b=>b).ToList();
for (int i = 0; i < migrations.Count(); i++) {
string thisMigration = migrations [i];
this.ApplyMigrationInternal(thisMigration);
}
await this.TakeSnapshotAsync(context, migrations.Last());
return await CloudFormationResponse.CompleteCloudFormationResponse(null, request, context);
} catch (Exception e) {
LambdaLogger.Log(e.ToString());
if (e.InnerException != null) LambdaLogger.Log(e.InnerException.ToString());
return await CloudFormationResponse.CompleteCloudFormationResponse(e, request, context);
}
}
return await CloudFormationResponse.CompleteCloudFormationResponse(null, request, context);
}
This should also allow for any exceptions thrown by the RDS client to be caught by the currently executing thread. Which should help with troubleshooting any exception messages.
Some interesting information from documentation.
Using Async in C# Functions with AWS Lambda
If you know your Lambda function will require a long-running process, such as uploading large files to Amazon S3 or reading a large stream of records from DynamoDB, you can take advantage of the async/await pattern. When you use this signature, Lambda executes the function synchronously and waits for the function to return a response or for execution to time out.
From docs about timeouts
Function Settings
...
Timeout – The amount of time that Lambda allows a function to run before stopping it. The default is 3 seconds. The maximum allowed value is 900 seconds.
If getting a HTTP timeout then shorten the delay but leave the long running task. You still use the Task.WhenAny to give the long running task an opportunity to finish first even if that is not the expectation.
internal async Task TakeSnapshotAsync(ILambdaContext context, string migration) {
var instanceId = this.GetEnvironmentVariable(nameof(DBInstance));
//don't wrap in using block or it will be disposed before you are done with it.
var rdsClient = new AmazonRDSClient();
var request = new CreateDBSnapshotRequest($"{instanceId}{migration.Replace('_','-')}", instanceId);
//don't await this long running task
Task<CreateDBSnapshotResponse> response = rdsClient.CreateDBSnapshotAsync(request);
Task delay = Task.Delay(TimeSpan.FromSeconds(2.5));
// The call returns as soon as the first operation completes,
// even if the others are still running.
await Task.WhenAny(response, delay);
}
Related
I was recently exposed to C# language and was working on getting data out of cassandra so I was working with below code which gets data from Cassandra and it works fine.
Only problem I have is in my ProcessCassQuery method - I am passing CancellationToken.None to my requestExecuter Function which might not be the right thing to do. What should be the right way to handle that case and what should I do to handle it correctly?
/**
*
* Below method does multiple async calls on each table for their corresponding id's by limiting it down using Semaphore.
*
*/
private async Task<List<T>> ProcessCassQueries<T>(IList<int> ids, Func<CancellationToken, int, Task<T>> mapperFunc, string msg) where T : class
{
var tasks = ids.Select(async id =>
{
await semaphore.WaitAsync();
try
{
ProcessCassQuery(ct => mapperFunc(ct, id), msg);
}
finally
{
semaphore.Release();
}
});
return (await Task.WhenAll(tasks)).Where(e => e != null).ToList();
}
// this might not be good idea to do it. how can I improve below method?
private Task<T> ProcessCassQuery<T>(Func<CancellationToken, Task<T>> requestExecuter, string msg) where T : class
{
return requestExecuter(CancellationToken.None);
}
As said in the official documentation, the cancellation token allows propagating a cancellation signal. This can be useful for example, to cancel long-running operations that for some reason do not make sense anymore or that are simply taking too long.
The CancelationTokenSource will allow you to get a custom token that you can pass to the requestExecutor. It will also provide the means for cancelling a running Task.
private CancellationTokenSource cts = new CancellationTokenSource();
// ...
private Task<T> ProcessCassQuery<T>(Func<CancellationToken, Task<T>> requestExecuter, string msg) where T : class
{
return requestExecuter(cts.Token);
}
Example
Let's take a look at a different minimal/dummy example so we can look at the inside of it.
Consider the following method, GetSomethingAsync that will yield return an incrementing integer every second.
The call to token.ThrowIfCancellationRequested will make sure a TaskCanceledException is thrown if this process is cancelled by an outside action. Other approaches can be taken, for example, check if token.IsCancellationRequested is true and do something about it.
private static async IAsyncEnumerable<int> GetSomethingAsync(CancellationToken token)
{
Console.WriteLine("starting to get something");
token.ThrowIfCancellationRequested();
for (var i = 0; i < 100; i++)
{
await Task.Delay(1000, token);
yield return i;
}
Console.WriteLine("finished getting something");
}
Now let's build the main method to call the above method.
public static async Task Main()
{
var cts = new CancellationTokenSource();
// cancel it after 3 seconds, just for demo purposes
cts.CancelAfter(3000);
// or: Task.Delay(3000).ContinueWith(_ => { cts.Cancel(); });
await foreach (var i in GetSomethingAsync(cts.Token))
{
Console.WriteLine(i);
}
}
If we run this, we will get an output that should look like:
starting to get something
0
1
Unhandled exception. System.Threading.Tasks.TaskCanceledException: A task was canceled.
Of course, this is just a dummy example, the cancellation could be triggered by a user action, or some event that happens, it does not have to be a timer.
I started to have HUGE doubts regarding my code and I need some advice from more experienced programmers.
In my application on the button click, the application runs a command, that is calling a ScrapJockeys method:
if (UpdateJockeysPl) await ScrapJockeys(JPlFrom, JPlTo + 1, "jockeysPl"); //1 - 1049
ScrapJockeys is triggering a for loop, repeating code block between 20K - 150K times (depends on the case). Inside the loop, I need to call a service method, where the execution of the method takes a lot of time. Also, I wanted to have the ability of cancellation of the loop and everything that is going on inside of the loop/method.
Right now I am with a method with a list of tasks, and inside of the loop is triggered a Task.Run. Inside of each task, I am calling an awaited service method, which reduces execution time of everything to 1/4 comparing to synchronous code. Also, each task has assigned a cancellation token, like in the example GitHub link:
public async Task ScrapJockeys(int startIndex, int stopIndex, string dataType)
{
//init values and controls in here
List<Task> tasks = new List<Task>();
for (int i = startIndex; i < stopIndex; i++)
{
int j = i;
Task task = Task.Run(async () =>
{
LoadedJockey jockey = new LoadedJockey();
CancellationToken.ThrowIfCancellationRequested();
if (dataType == "jockeysPl") jockey = await _scrapServices.ScrapSingleJockeyPlAsync(j);
if (dataType == "jockeysCz") jockey = await _scrapServices.ScrapSingleJockeyCzAsync(j);
//doing some stuff with results in here
}, TokenSource.Token);
tasks.Add(task);
}
try
{
await Task.WhenAll(tasks);
}
catch (OperationCanceledException)
{
//
}
finally
{
await _dataServices.SaveAllJockeysAsync(Jockeys.ToList()); //saves everything to JSON file
//soing some stuff with UI props in here
}
}
So about my question, is there everything fine with my code? According to this article:
Many async newbies start off by trying to treat asynchronous tasks the
same as parallel (TPL) tasks and this is a major misstep.
What should I use then?
And according to this article:
On a busy server, this kind of implementation can kill scalability.
So how am I supposed to do it?
Please be noted, that the service interface method signature is Task<LoadedJockey> ScrapSingleJockeyPlAsync(int index);
And also I am not 100% sure that I am using Task.Run correctly within my service class. The methods inside are wrapping the code inside await Task.Run(() =>, like in the example GitHub link:
public async Task<LoadedJockey> ScrapSingleJockeyPlAsync(int index)
{
LoadedJockey jockey = new LoadedJockey();
await Task.Run(() =>
{
//do some time consuming things
});
return jockey;
}
As far as I understand from the articles, this is a kind of anti-pattern. But I am confused a bit. Based on this SO reply, it should be fine...? If not, how to replace it?
On the UI side, you should be using Task.Run when you have CPU-bound code that is long enough that you need to move it off the UI thread. This is completely different than the server side, where using Task.Run at all is an anti-pattern.
In your case, all your code seems to be I/O-based, so I don't see a need for Task.Run at all.
There is a statement in your question that conflicts with the provided code:
I am calling an awaited service method
public async Task<LoadedJockey> ScrapSingleJockeyPlAsync(int index)
{
await Task.Run(() =>
{
//do some time consuming things
});
}
The lambda passed to Task.Run is not async, so the service method cannot possibly be awaited. And indeed it is not.
A better solution would be to load the HTML asynchronously (e.g., using HttpClient.GetStringAsync), and then call HtmlDocument.LoadHtml, something like this:
public async Task<LoadedJockey> ScrapSingleJockeyPlAsync(int index)
{
LoadedJockey jockey = new LoadedJockey();
...
string link = sb.ToString();
var html = await httpClient.GetStringAsync(link).ConfigureAwait(false);
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);
if (jockey.Name == null)
...
return jockey;
}
And also remove the Task.Run from your for loop:
private async Task ScrapJockey(string dataType)
{
LoadedJockey jockey = new LoadedJockey();
CancellationToken.ThrowIfCancellationRequested();
if (dataType == "jockeysPl") jockey = await _scrapServices.ScrapSingleJockeyPlAsync(j).ConfigureAwait(false);
if (dataType == "jockeysCz") jockey = await _scrapServices.ScrapSingleJockeyCzAsync(j).ConfigureAwait(false);
//doing some stuff with results in here
}
public async Task ScrapJockeys(int startIndex, int stopIndex, string dataType)
{
//init values and controls in here
List<Task> tasks = new List<Task>();
for (int i = startIndex; i < stopIndex; i++)
{
tasks.Add(ScrapJockey(dataType));
}
try
{
await Task.WhenAll(tasks);
}
catch (OperationCanceledException)
{
//
}
finally
{
await _dataServices.SaveAllJockeysAsync(Jockeys.ToList()); //saves everything to JSON file
//soing some stuff with UI props in here
}
}
As far as I understand from the articles, this is a kind of anti-pattern.
It is an anti-pattern. But if can't modify the service implementation, you should at least be able to execute the tasks in parallel. Something like this:
public async Task ScrapJockeys(int startIndex, int stopIndex, string dataType)
{
ConcurrentBag<Task> tasks = new ConcurrentBag<Task>();
ParallelOptions parallelLoopOptions = new ParallelOptions() { CancellationToken = CancellationToken };
Parallel.For(startIndex, stopIndex, parallelLoopOptions, i =>
{
int j = i;
switch (dataType)
{
case "jockeysPl":
tasks.Add(_scrapServices.ScrapSingleJockeyPlAsync(j));
break;
case "jockeysCz":
tasks.Add(_scrapServices.ScrapSingleJockeyCzAsync(j));
break;
}
});
try
{
await Task.WhenAll(tasks);
}
catch (OperationCanceledException)
{
//
}
finally
{
await _dataServices.SaveAllJockeysAsync(Jockeys.ToList()); //saves everything to JSON file
//soing some stuff with UI props in here
}
}
I have a console program which sends async HTTP requests to an external web API. (HttpClient.GetAsync());)
These tasks can take several minutes to complete - during which I'd like to be able to show to the user that the app is still running - for example by sending Console.WriteLine("I ain't dead - yet") every 10 seconds.
I am not sure how to do it right, without the risk of hiding exceptions, introducing deadlocks etc.
I am aware of the IProgress<T>, however I don't know whether I can introduce it in this case. I am await a single async call which does not report progress. (It's essentially an SDK which calls httpClient GetAsync() method
Also:
I cannot set the GUI to 'InProgress', because there is no GUI, its a console app - and it seems to the user as if it stopped working if I don't send an update message every now and then.
Current idea:
try
{
var task = httpClient.GetAsync(uri); //actually this is an SDK method call (which I cannot control and which does not report progress itself)
while (!task.IsCompleted)
{
await Task.Delay(1000 * 10);
this.Logger.Log(Verbosity.Verbose, "Waiting for reply...");
}
onSuccessCallback(task.Result);
}
catch (Exception ex)
{
if (onErrorCallback == null)
{
throw this.Logger.Error(this.GetProperException(ex, caller));
}
this.Logger.Log(Verbosity.Error, $"An error when executing command [{action?.Command}] on {typeof(T).Name}", ex);
onErrorCallback(this.GetProperException(ex, caller));
}
Let me tidy this code up a bit for you
async Task Main()
{
var reporter = new ConsoleProgress();
var result = await WeatherWaxProgressWrapper(() => GetAsync("foo"), reporter);
Console.WriteLine(result);
}
public async Task<int> GetAsync(string uri)
{
await Task.Delay(TimeSpan.FromSeconds(10));
return 1;
}
public async Task<T> WeatherWaxProgressWrapper<T>(Func<Task<T>> method, System.IProgress<string> progress)
{
var task = method();
while(!task.IsCompleted && !task.IsCanceled && !task.IsFaulted)
{
await Task.WhenAny(task, Task.Delay(1000));
progress.Report("I ain't dead");
}
return await task;
}
public class ConsoleProgress : System.IProgress<string>
{
public void Report(string value)
{
Console.WriteLine(value);
}
}
You could have a never-ending Task as a beacon that signals every 10 sec, and cancel it after the completion of the long running I/O operation:
var beaconCts = new CancellationTokenSource();
var beaconTask = Task.Run(async () =>
{
while (true)
{
await Task.Delay(TimeSpan.FromSeconds(10), beaconCts.Token);
Console.WriteLine("Still going...");
}
});
await LongRunningOperationAsync();
beaconCts.Cancel();
You are looking for System.Progress<T>, a wonderful implementation of IProgress.
https://learn.microsoft.com/en-us/dotnet/api/system.progress-1
You create an object of this class on the "UI thread" or the main thread in your case, and it captures the SynchronizationContext for you. Pass it to your worker thread and every call to Report will be executed on the captured thread, you don't have to worry about anything.
Very useful in WPF or WinForms applications.
I'm kinda new at async programming so don't uderstand many things. Please, I need a little bit more help. I did as was recommened in my other question.
public async Task<TResponse> SendRequestAsync<TResponse>(Func<Task<TResponse>> sendAsync)
{
int timeout = 15;
if (await Task.WhenAny(sendAsync, Task.Delay(timeout) == sendAsync))
{
return await sendAsync();
}
else
{
throw new Exception("time out!!!");
}
}
But I need to get a result of sendAsync() and return it. So have I questions:
1) What the best way to do that and how to use Task.Delay with Func<Task<TResponse>>(or may be something instead of it)? I can't figure out how convert(or something) Func to Task.
2) It seems that return await sendAsync() inside if permorms request once more. It is not great. Can I get result of my Func<Task<..>> inside if somehow?
Since you are new in async programming - it's better to not put too much stuff in one statement and better split that:
public async Task<TResponse> SendRequestAsync<TResponse>(Func<Task<TResponse>> sendAsync) {
int timeout = 15;
// here you create Task which represents ongoing request
var sendTask = sendAsync();
// Task which will complete after specified amount of time, in milliseconds
// which means your timeout should be 15000 (for 15 seconds), not 15
var delay = Task.Delay(timeout);
// wait for any of those tasks to complete, returns task that completed first
var taskThatCompletedFirst = await Task.WhenAny(sendTask, delay);
if (taskThatCompletedFirst == sendTask) {
// if that's our task and not "delay" task - we are fine
// await it so that all exceptions if any are thrown here
// this will _not_ cause it to execute once again
return await sendTask;
}
else {
// "delay" task completed first, which means 15 seconds has passed
// but our request has not been completed
throw new Exception("time out!!!");
}
}
Request is sent twice because sendAsync is a Func returning Task, different on each call. You call it first under Task.WhenAny() and repeat in operator return await sendAsync().
To avoid this duplicated call you should save a task to variable and pass that task to both calls:
public async Task<TResponse> SendRequestAsync<TResponse>(Func<Task<TResponse>> sendAsync)
{
int timeout = 15;
var task = sendAsync();
if (await Task.WhenAny(task, Task.Delay(timeout) == task))
{
return await task;
}
else
{
throw new Exception("time out!!!");
}
}
await on completed task will just return its result without rerunning the task.
i have a function that makes a few http requests, and runs in a task.
that function can be interrupted in the middle, since i cant abort the task i added some boolean conditions in the function.
example:
public int foo(ref bool cancel)
{
if(cancel)
{
return null
}
//do some work...
if(cancel)
{
return null
}
//http webrequest
if(cancel)
{
return null
}
}
thisworked pretty good, although this is quite some ugly code.
another problem is when i already executed the web request, and it takes time for me to get the response than the function cncelation takes a lot of time (till i get a response).
is there a better way for me to check this? or mybe i should use threads instead of task?
edit
i added a cancelation token: declared a cancelationTokenSource, and passed its token to the task
CancellationTokenSource cncelToken = new CancellationTokenSource();
Task t = new Task(() => {foo()},cancelToken.token);
when i do cancelToken.Cancel();
i still wait for the response, and the tsk isnt cancelling.
Tasks support cancellation - see here.
Here's a quick snippet.
class Program
{
static void Main(string[] args)
{
var token = new CancellationTokenSource();
var t = Task.Factory.StartNew(
o =>
{
while (true)
Console.WriteLine("{0}: Processing", DateTime.Now);
}, token);
token.CancelAfter(1000);
t.Wait(token.Token);
}
}
Remember to wait for the task using the provided cancellation token. You ought to receive an OperationCanceledException.