I have several asynchronous tasks/jobs that I need to run on a schedule and it seems that I could do this nicely using Observables. When a job fetches the data, an exception could occur (eg 404), and when the resultant data is processed, an error could also occur.
I have seen this answer by Enigmativity which seems like the perfect solution to wrap the IObservable<> so that if an error occurs (when I fetch the data) I can trap it and continue (ultimately skipping the processing for that particular fetch).
I understand that when an Observable errors it is meant to terminate, but given the answer I mentioned above, it seems that there are ways around this, which would make for a decent job scheduling system. Alternative approaches are welcome, but I would like to understand how to do this with Observables.
I would also like to provide some feedback/logging about the state of the job.
Currently, I have the below method, which won't compile!
job is the object that contains information about the job (eg a list of job runs and their outcomes/success/failure, run frequency, status, errors, boolean flag indicating if the job should proceed, etc)
interval(job) returns the frequency in milliseconds that the job should run at
runSelect(job) is a boolean method that signals if a job should proceed (I think this would be better replaced with an observable? And of course there is the option of using a CancellationToken, but again I'm not sure how to integrate that!)
select(job) is the method that fetches the data
subscribe(job) is the method that processes the data
public static IDisposable BuildObservable<TJob, TSelect>(TJob job, Func<TJob, int> interval, Func<TJob, bool> runSelect, Func<TJob, Task<TSelect>> select,
Func<TSelect, Task> subscribe)
where TJob : Job
where TSelect : class
{
return Observable.Timer(TimeSpan.Zero, TimeSpan.FromMilliseconds(interval(job)))
.SelectMany(x => Observable.FromAsync(async () =>
{
JobRunDetail jobRunDetail = job.StartNewRun();
if (runSelect(job))
{
jobRunDetail.SetRunningSelect();
return new { Result = await select(job), JobRunDetail = jobRunDetail };
}
else
{
jobRunDetail.SetAbandonedSelect();
return new { Result = (TSelect)null!, JobRunDetail = jobRunDetail };
}
}).ToExceptional())
.Subscribe(async (resultAndJobRunDetail) =>
{
//none of the resultAndJobRunDetail.Value.JobDetail or resultAndJobRunDetail.Value.Result statements will compile
resultAndJobRunDetail.Value.JobRunDetail.SetRunningSubscribe();
try
{
if (resultAndJobRunDetail.Value.Result!= null)
await subscribe(resultAndJobRunDetail.Value.Result);
resultAndJobRunDetail.Value.JobRunDetail.SetCompleted();
}
catch (Exception ee)
{
resultAndJobRunDetail.Value.JobRunDetail.SetErrorSubscribe(ee);
}
});
}
As noted, none of the resultAndJobRunDetail.Value.JobDetail or resultAndJobRunDetail.Value.Result statements will compile because resultAndJobRunDetail.Value is still an Observable<>, but when I remove the .ToExceptional() call, the value returned is no longer an Observable.
Clearly I'm missing something.
I have seen different answers on SO that use Do() rather than Subscribe() so I'm not sure which is appropriate. I have also seen answers that suggest using Retry() or one of the "observable error handling methods" but I'm not sure how these would work if I just want my job to keep repeating ad infinitum?
Ultimately, I'm still learning how the whole Observable infrastructure fits together, so I could well be completely off track!
It's worth noting that searching Google for "Schedule Job using Observable" is pretty fruitless, as Observables use schedulers!
I'm not sure if this helps or not, but your .ToExceptional() call was in the wrong place:
public static IDisposable BuildObservable<TJob, TSelect>(TJob job, Func<TJob, int> interval, Func<TJob, bool> runSelect, Func<TJob, Task<TSelect>> select,
Func<TSelect, Task> subscribe)
where TJob : Job
where TSelect : class
{
return Observable.Timer(TimeSpan.Zero, TimeSpan.FromMilliseconds(interval(job)))
.SelectMany(x => Observable.FromAsync(async () =>
{
JobRunDetail jobRunDetail = job.StartNewRun();
if (runSelect(job))
{
jobRunDetail.SetRunningSelect();
return new { Result = await select(job), JobRunDetail = jobRunDetail }.ToExceptional();
}
else
{
jobRunDetail.SetAbandonedSelect();
return new { Result = (TSelect)null!, JobRunDetail = jobRunDetail }.ToExceptional();
}
}))
.Subscribe(async (resultAndJobRunDetail) =>
{
//none of the resultAndJobRunDetail.Value.JobDetail or resultAndJobRunDetail.Value.Result statements will compile
resultAndJobRunDetail.Value.JobRunDetail.SetRunningSubscribe();
try
{
if (resultAndJobRunDetail.Value.Result != null)
await subscribe(resultAndJobRunDetail.Value.Result);
resultAndJobRunDetail.Value.JobRunDetail.SetCompleted();
}
catch (Exception ee)
{
resultAndJobRunDetail.Value.JobRunDetail.SetErrorSubscribe(ee);
}
});
}
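For comparison, here is a minimal sketch of the same "trap the error and keep going" idea using only built-in Rx operators instead of the ToExceptional helper. It assumes the System.Reactive package; JobObservable and Poll are hypothetical names, not part of the question's code. Each fetch gets its own inner observable, so a failure is converted into a value by Catch and the outer timer is never terminated.

```csharp
using System;
using System.Reactive.Linq;
using System.Threading.Tasks;

public static class JobObservable
{
    // Hypothetical helper: calls fetch on a timer and reports every run as a
    // (Value, Error) pair, so a failed fetch does not end the sequence.
    public static IObservable<(T? Value, Exception? Error)> Poll<T>(
        TimeSpan period, Func<Task<T>> fetch) where T : class
    {
        return Observable
            .Timer(TimeSpan.Zero, period)
            .SelectMany(_ => Observable
                .FromAsync(fetch)
                // Successful fetch: carry the value, no error.
                .Select(value => (Value: (T?)value, Error: (Exception?)null))
                // Failed fetch: replace the error with a single value that
                // carries the exception; only this inner observable ends.
                .Catch((Exception ex) =>
                    Observable.Return((Value: (T?)null, Error: (Exception?)ex))));
    }
}
```

A subscriber can then check run.Error to log the failure and skip processing for that run. If the processing step is also asynchronous, wrapping it in another SelectMany with Observable.FromAsync (rather than an async lambda passed to Subscribe) lets its exceptions be captured the same way.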
In my project, I reference types and interfaces from a dynamic link library.
The very first thing I have to do when using this specific library is to create an instance of EA.Repository, which is defined within the library and serves as kind of an entry point for further usage.
The instantiation EA.Repository repository = new EA.Repository() performs some complex stuff in the background, and I find myself confronted with three possible outcomes:
Instantiation takes some time but finishes successfully in the end
An exception is thrown (either immediately or after some time)
The instantiation blocks forever (in which case I'd like to cancel and inform the user)
I was able to come up with an asynchronous approach using Task:
public static void Connect()
{
// Do the lengthy instantiation asynchronously
Task<EA.Repository> task = Task.Run(() => { return new EA.Repository(); });
bool isCompletedInTime;
try
{
// Timeout after 5.0 seconds
isCompletedInTime = task.Wait(5000);
}
catch (Exception)
{
// If the instantiation fails (in time), throw a custom exception
throw new ConnectionException();
}
if (isCompletedInTime)
{
// If the instantiation finishes in time, store the object for later
EapManager.Repository = task.Result;
}
else
{
// If the instantiation did not finish in time, throw a custom exception
throw new TimeoutException();
}
}
(I know, you can probably already spot a lot of issues here. Please be patient with me... Recommendations would be appreciated!)
This approach works so far - I can simulate both the "exception" and the "timeout" scenario and I obtain the desired behavior.
However, I have identified another edge case: Let's assume the instantiation task takes long enough that the timeout expires and then throws an exception. In this case, I sometimes end up with an AggregateException saying that the task has not been observed.
I'm struggling to find a feasible solution to this. I can't really cancel the task when the timeout expires, because the blocking instantiation obviously prevents me from using the CancellationToken approach.
The only thing I could come up with is to start observing the task asynchronously (i.e. start another task) right before throwing my custom TimeoutException:
Task observerTask = Task.Run(() => {
try { task.Wait(); }
catch (Exception) { }
});
throw new TimeoutException();
Of course, if the instantiation really blocks forever, I already had the first task never finish. With the observer task, now I even have two!
I'm quite insecure about this whole approach, so any advice would be welcome!
Thank you very much in advance!
I'm not sure if I fully understood what you're trying to achieve, but what if you do something like this -
public static void Connect()
{
Task<EA.Repository> _realWork = Task.Run(() => { return new EA.Repository(); });
Task _timeoutTask = Task.Delay(5000);
Task.WaitAny(new Task[]{_realWork, _timeoutTask});
if (_timeoutTask.IsCompleted)
{
// timed out
}
else
{
// all good, access _realWork.Result
}
}
or you can even go a bit shorter -
public static void Connect()
{
Task<EA.Repository> _realWork = Task.Run(() => { return new EA.Repository(); });
var completedTaskIndex = Task.WaitAny(new Task[]{_realWork}, 5000);
if (completedTaskIndex == -1)
{
// timed out
}
else
{
// all good, access _realWork.Result
}
}
You can also always call Task.Run with a CancellationToken that will time out, but that will raise an exception - the above solutions give you control of the behaviour without an exception being thrown (even though you can always try/catch)
Here is an extension method that you could use to explicitly observe the tasks that may fail while unobserved:
public static Task<T> AsObserved<T>(this Task<T> task)
{
task.ContinueWith(t => t.Exception);
return task;
}
Usage example:
var task = Task.Run(() => new EA.Repository()).AsObserved();
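The two ideas above (a timeout that does not block, and observing a fault that arrives after the timeout) can be combined. The following is a minimal sketch under the assumption that the caller can use async/await; TimeoutHelper and WithTimeout are hypothetical names. It cannot stop an instantiation that blocks forever, it only stops waiting for it.

```csharp
using System;
using System.Threading.Tasks;

public static class TimeoutHelper
{
    // Hypothetical helper: waits for work up to the given timeout. If the
    // timeout wins, a continuation observes any later fault so that it is
    // not reported as an unobserved task exception.
    public static async Task<T> WithTimeout<T>(Task<T> work, TimeSpan timeout)
    {
        Task winner = await Task.WhenAny(work, Task.Delay(timeout));
        if (winner == work)
        {
            // Completed in time: returns the result, or rethrows the
            // original exception if the work faulted.
            return await work;
        }

        // Runs only if the work fails after the timeout; reading Exception
        // marks the fault as observed.
        _ = work.ContinueWith(
            t => { _ = t.Exception; },
            TaskContinuationOptions.OnlyOnFaulted);

        throw new TimeoutException();
    }
}
```

In the Connect method this could be used as EapManager.Repository = await TimeoutHelper.WithTimeout(Task.Run(() => new EA.Repository()), TimeSpan.FromSeconds(5)); with Connect changed to return a Task.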
I created an extension method on IEnumerable to execute an action as fast as possible: I loop over the items, and if one of them executes the action within a certain timeout, I return.
Now I want to make the output generic, because the output of the action will differ. Any advice on what to do?
The IEnumerable is a list of processes, and the idea is similar to load balancing: if the first one does not respond, the second should. I want to return the output of the input Action.
public static class EnumerableExtensions
{
public static void ForEach<T>(this IEnumerable<T> source, Action action, int timeOut)
{
foreach (T element in source)
{
lock (source)
{
// Loop for all connections and get the fastest responsive proxy
foreach (var mxAccessProxy in source)
{
try
{
// check for the health
Task executionTask = Task.Run(action);
if (executionTask.Wait(timeOut))
{
return ;
}
}
catch
{
//ignore
}
}
}
}
}
}
This code is called like this:
_proxies.ForEach(certainaction, timeOut);
I expect this to improve performance and code readability.
No, it definitely won't :) Moreover, you bring some more problems with this code like redundant locking or exception swallowing, but don't actually execute code in parallel.
It seems like you want to get the fastest possible call for your Action using some sort of proxy objects. You need to run the Tasks concurrently, not consecutively with .Wait().
Something like this could be helpful for you:
public static class TaskExtensions
{
public static TReturn ParallelSelectReturnFastest<TPoolObject, TReturn>(this TPoolObject[] pool,
Func<TPoolObject, CancellationToken, TReturn> func,
int? timeout = null)
{
var ctx = new CancellationTokenSource();
// for every object in pool schedule a task
Task<TReturn>[] tasks = pool
.Select(poolObject =>
{
ctx.Token.ThrowIfCancellationRequested();
return Task.Factory.StartNew(() => func(poolObject, ctx.Token), ctx.Token);
})
.ToArray();
// not sure if Cast is actually needed,
// just to get rid of co-variant array conversion
int firstCompletedIndex = timeout.HasValue
? Task.WaitAny(tasks.Cast<Task>().ToArray(), timeout.Value, ctx.Token)
: Task.WaitAny(tasks.Cast<Task>().ToArray(), ctx.Token);
// we need to cancel token to avoid unnecessary work to be done
ctx.Cancel();
if (firstCompletedIndex == -1) // no objects in pool managed to complete action in time
throw new NotImplementedException(); // custom exception goes here
return tasks[firstCompletedIndex].Result;
}
}
Now, you can use this extension method to call a specific action on any pool of objects and get the first executed result:
var pool = new[] { 1, 2, 3, 4, 5 };
var result = pool.ParallelSelectReturnFastest((x, token) => {
Thread.Sleep(x * 200);
token.ThrowIfCancellationRequested();
Console.WriteLine("calculate");
return x * x;
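}, 1000); // the timeout must exceed the fastest task's 200 ms, otherwise WaitAny returns -1 and the method throws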
Console.WriteLine(result);
It outputs:
calculate
1
This is because the first task completes its work in 200 ms and its result is returned; all other tasks are then cancelled through the cancellation token.
In your case it will be something like:
var actionResponse = proxiesList.ParallelSelectReturnFastest((proxy, token) => {
token.ThrowIfCancellationRequested();
return proxy.SomeAction();
});
Some things to mention:
Make sure that your actions are safe to run more than once. You can't rely on how many of them will actually get as far as executing your action. If the action is CreateItem, you can end up with many items being created through different proxies.
It cannot guarantee that all of these actions will run in parallel, because it is up to the TPL to choose the optimal number of running tasks.
I have implemented this in the old-fashioned TPL way, because your original question used it. If possible, you should switch to async/await; in that case your Func will return tasks and you need to use await Task.WhenAny(tasks) instead of Task.WaitAny().
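The async/await variant mentioned in the last point could look roughly like the sketch below. AsyncPoolExtensions and SelectFastestAsync are hypothetical names, and the sketch assumes that every pool object exposes a naturally asynchronous call that honours the cancellation token. Note that it returns the first call to finish, so if that call failed, its exception is rethrown.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncPoolExtensions
{
    // Hypothetical async counterpart of ParallelSelectReturnFastest.
    public static async Task<TReturn> SelectFastestAsync<TPoolObject, TReturn>(
        this IEnumerable<TPoolObject> pool,
        Func<TPoolObject, CancellationToken, Task<TReturn>> func,
        TimeSpan timeout)
    {
        var cts = new CancellationTokenSource();

        // Start one asynchronous call per pool object.
        List<Task<TReturn>> tasks = pool.Select(p => func(p, cts.Token)).ToList();

        // The timeout is just another task competing in WhenAny.
        Task timeoutTask = Task.Delay(timeout, cts.Token);
        var all = new List<Task>(tasks) { timeoutTask };

        Task first = await Task.WhenAny(all);

        // Ask the slower calls (and the timer) to stop.
        cts.Cancel();

        if (first == timeoutTask)
            throw new TimeoutException("No pool object answered in time.");

        // Returns the fastest result, or rethrows if the fastest call failed.
        return await (Task<TReturn>)first;
    }
}
```

A call would then read var response = await proxiesList.SelectFastestAsync((proxy, token) => proxy.SomeActionAsync(token), TimeSpan.FromSeconds(5)); where SomeActionAsync is an assumed asynchronous version of SomeAction.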
I would like to generate an observable of files, such that the discovery of the files names could be cancelled in any moment. For the sake of this example, the cancellation takes place in 1 second automatically.
Here is my current code:
class Program
{
static void Main()
{
try
{
RunAsync(@"\\abc\xyz").GetAwaiter().GetResult();
}
catch (Exception exc)
{
Console.Error.WriteLine(exc);
}
Console.Write("Press Enter to exit");
Console.ReadLine();
}
private static async Task RunAsync(string path)
{
var cts = new CancellationTokenSource(TimeSpan.FromSeconds(1));
await GetFileSource(path, cts);
}
private static IObservable<string> GetFileSource(string path, CancellationTokenSource cts)
{
return Observable.Create<string>(obs => Task.Run(async () =>
{
Console.WriteLine("Inside Before");
foreach (var file in Directory.EnumerateFiles(path, "*", SearchOption.AllDirectories).Take(50))
{
cts.Token.ThrowIfCancellationRequested();
obs.OnNext(file);
await Task.Delay(100);
}
Console.WriteLine("Inside After");
obs.OnCompleted();
return Disposable.Empty;
}, cts.Token))
.Do(Console.WriteLine);
}
}
I do not like two aspects of my implementation (if there are more - please feel free to point out):
I have an enumerable of files, yet I iterate over each manually. Could I use the ToObservable extension somehow?
I could not figure out how to make use of the cts.Token passed to Task.Run. Had to use the cts captured from the outer context (GetFileSource parameter). Seems ugly to me.
Is this how it should be done? Must be a better way.
I'm still not convinced this is really a Reactive Problem, you are asking for backpressure on the producer which is really against how Reactive is supposed to work.
That being said, if you are going to do it this way you should realize that very fine-grained time manipulation should almost always be delegated to a Scheduler rather than trying to do coordination with Tasks and CancellationTokens. So I would refactor to look like this:
public static IObservable<string> GetFileSource(string path, Func<string, Task<string>> processor, IScheduler scheduler = null) {
scheduler = scheduler ?? Scheduler.Default;
return Observable.Create<string>(obs =>
{
//Grab the enumerator as our iteration state.
var enumerator = Directory.EnumerateFiles(path, "*", SearchOption.AllDirectories)
.GetEnumerator();
return scheduler.Schedule(enumerator, async (e, recurse) =>
{
if (!e.MoveNext())
{
obs.OnCompleted();
return;
}
//Wait here until processing is done before moving on
obs.OnNext(await processor(e.Current));
//Recursively schedule
recurse(e);
});
});
}
Then, instead of passing in a cancellation token, use TakeUntil:
var source = GetFileSource(path, x => { /* Do some async task here */ return Task.FromResult(x); })
.TakeUntil(Observable.Timer(TimeSpan.FromSeconds(1)));
You can also see a more advanced example for an implementation of an async Generate method.
I would recommend that you avoid Observable.Create when you can use the other operators.
Also, when you do a return Disposable.Empty; within Observable.Create you are creating an observable that cannot be stopped by the normal Rx subscription disposable. This can lead to memory leaks and unnecessary processing.
Finally, throwing exceptions to end normal computation is a bad bad idea.
There is a good clean solution that seems to do what you want:
private static IObservable<string> GetFileSource(string path, CancellationTokenSource cts)
{
return
Directory
.EnumerateFiles(path, "*", SearchOption.AllDirectories)
.ToObservable()
.Take(50)
.TakeWhile(f => !cts.IsCancellationRequested);
}
The only thing that I didn't include was the Task.Delay(100);. Why are you doing that?
I have the following code:
var tasks = await taskSeedSource
.Select(taskSeed => GetPendingOrRunningTask(taskSeed, createTask, onFailed, onSuccess, sem))
.ToList()
.ToTask();
if (tasks.Count == 0)
{
return;
}
if (tasks.Contains(null))
{
tasks = tasks.Where(t => t != null).ToArray();
if (tasks.Count == 0)
{
return;
}
}
await Task.WhenAll(tasks);
Where taskSeedSource is a Reactive Observable. This code may well have many problems, but I see at least two:
I am collecting tasks whereas I could do without it.
Somehow, the returned tasks list may contain nulls, even though GetPendingOrRunningTask is an async method and hence never returns null. I failed to understand why it happens, so I had to defend against it without understanding the cause of the problem.
I would like to use the AsyncCountdownEvent from the AsyncEx framework instead of collecting the tasks and then awaiting on them.
So, I can pass the countdown event to GetPendingOrRunningTask which will increment it immediately and signal before returning after awaiting for the completion of its internal logic. However, I do not understand how to integrate the countdown event into the monad (that is the Reactive jargon, isn't it?).
What is the right way to do it?
EDIT
Guys, let us forget about the mysterious nulls in the returned list. Suppose everything is green and the code is
var tasks = await taskSeedSource
.Select(taskSeed => GetPendingOrRunningTask(taskSeed, ...))
.ToList()
.ToTask();
await Task.WhenAll(tasks);
Now the question is how do I do it with the countdown event? So, suppose I have:
var c = new AsyncCountdownEvent(1);
and
async Task GetPendingOrRunningTask<T>(AsyncCountdownEvent c, T taskSeed, ...)
{
c.AddCount();
try
{
await ....
}
catch (Exception exc)
{
// The exception is handled
}
c.Signal();
}
My problem is that I no longer need the returned task. These tasks were collected and awaited to find the moment when all the work items are finished, but now the countdown event can be used to indicate when the work is over.
My problem is that I am not sure how to integrate it into the Reactive chain. Essentially, the GetPendingOrRunningTask can be async void. And here I am stuck.
EDIT 2
Strange appearance of a null entry in the list of tasks
@Servy is correct that you need to solve the null Task problem at the source. Nobody wants to answer a question about how to work around a problem that violates the contract of a method that you've defined yourself and yet haven't provided the source for.
As for the issue about collecting tasks, it's easy to avoid with Merge if your method returns a generic Task<T>:
await taskSeedSource
.Select(taskSeed => GetPendingOrRunningTask(taskSeed, createTask, onFailed, onSuccess, sem))
.Where(task => task != null) // According to you, this shouldn't be necessary.
.Merge();
However, unfortunately there's no official Merge overload for the non-generic Task but that's easy enough to define:
public static IObservable<Unit> Merge(this IObservable<Task> sources)
{
return sources.Select(async source =>
{
await source.ConfigureAwait(false);
return Unit.Default;
})
.Merge();
}
I am new to using Task.Run() along with async and await to make the UI more responsive, so I have likely implemented something incorrectly.
I have reviewed the great articles from Stephen Cleary about using AsyncCommands and have used his code from Patterns for Asynchronous MVVM Applications: Commands as a basis for a responsive UI, but when I run the code it still seems to freeze up (I am not able to move the window or interact with other buttons until the function has fully finished).
I am trying to perform a search which usually takes 5-10 seconds to return. Below is the code that creates the AsyncCommand, along with what the function does.
Code:
public ICommand SearchCommand
{
get
{
if (_SearchCommand == null)
{
_SearchCommand = AsyncCommand.Create(() => Search());
}
return _SearchCommand;
}
}
private async Task Search()
{
IEnumerable<PIPoint> points = await SearchAsync(_CurrentPIServer, NameSearch, PointSourceSearch).ConfigureAwait(false);
SearchResults.Clear();
foreach (PIPoint point in points)
{
SearchResults.Add(point.Name);
}
}
private async Task<IEnumerable<PIPoint>> SearchAsync(string Server, string NameSearch, string PointSourceSearch)
{
{
PIServers KnownServers = new PIServers();
PIServer server = KnownServers[Server];
server.Connect();
return await Task.Run<IEnumerable<PIPoint>>(()=>PIPoint.FindPIPoints(server, NameSearch, PointSourceSearch)).ConfigureAwait(false);
}
}
I am thinking that either the issue is somewhere in how I am pushing the long-running function onto a thread, so that it is not getting off the UI thread, or my understanding of how Tasks and async/await work is completely off.
EDIT 1:
Following Stephen's answer I updated the functions, but I am not seeing any change in the UI responsiveness. I created a second command that performs the same actions and I get the same response from UI in either case. The code now looks like the following
CODE:
public ICommand SearchCommand
{
get
{
if (_SearchCommand == null)
{
_SearchCommand = AsyncCommand.Create(async () =>
{
var results = await Task.Run(()=>Search(_CurrentPIServer, NameSearch, PointSourceSearch));
SearchResults = new ObservableCollection<string>(results.Select(x => x.Name));
});
}
return _SearchCommand;
}
}
public ICommand SearchCommand2
{
get
{
if (_SearchCommand2 == null)
{
_SearchCommand2 = new RelayCommand(() =>
{
var results = Search(_CurrentPIServer, NameSearch, PointSourceSearch);
SearchResults = new ObservableCollection<string>(results.Select(x => x.Name));
}
,()=> true);
}
return _SearchCommand2;
}
}
private IEnumerable<PIPoint> Search(string Server, string NameSearch, string PointSourceSearch)
{
PIServers KnownServers = new PIServers();
PIServer server = KnownServers[Server];
server.Connect();
return PIPoint.FindPIPoints(server, NameSearch, PointSourceSearch);
}
I must be missing something but I am not sure what at this point.
EDIT 2:
After more investigation into what was taking so long, it turns out that iterating the list after the results were found is what was hanging the process. Simply changing what the Search function returns, so that it has already iterated over the list of objects, allows the UI to remain responsive. I marked Stephen's answer as correct because it handled my main problem of properly moving work off the UI thread; I just didn't move the actual time-consuming work off.
My first guess is that the work queued to Task.Run is quite fast, and the delay is caused by other code (e.g., PIServer.Connect).
Another thing of note is that you are using ConfigureAwait(false) in Search which updates SearchResults - which I suspect is wrong. If SearchResults is bound to the UI, then you should be in the UI context when updating it, so ConfigureAwait(false) should not be used.
That said, there's a Task.Run principle that's good to keep in mind: push Task.Run as far up your call stack as possible. I explain this in more detail on my blog. The general idea is that Task.Run should be used to invoke synchronous methods; it shouldn't be used in the implementation of an asynchronous method (at least, not one that is intended to be reused).
As a final note, async is functional in nature. So it's more natural to return results than update collections as a side effect.
Combining these recommendations, the resulting code would look like:
private IEnumerable<PIPoint> Search(string Server, string NameSearch, string PointSourceSearch)
{
PIServers KnownServers = new PIServers();
PIServer server = KnownServers[Server];
// TODO: If "Connect" or "FindPIPoints" are naturally asynchronous,
// then this method should be converted back to an asynchronous method.
server.Connect();
return PIPoint.FindPIPoints(server, NameSearch, PointSourceSearch);
}
public ICommand SearchCommand
{
get
{
if (_SearchCommand == null)
{
_SearchCommand = AsyncCommand.Create(async () =>
{
var results = await Task.Run(() =>
Search(_CurrentPIServer, NameSearch, PointSourceSearch));
SearchResults = new ObservableCollection<string>(
results.Select(x => x.Name));
});
}
return _SearchCommand;
}
}
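As EDIT 2 of the question found, a lazily evaluated result is only enumerated when it is iterated, which in the code above happens back on the UI thread. A minimal sketch of forcing that enumeration inside Task.Run follows; BackgroundSearch and RunAsync are hypothetical names, and the search delegate stands in for the Search method above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class BackgroundSearch
{
    // Runs a synchronous, possibly lazy search on the thread pool and forces
    // the enumeration there, so the caller receives a fully built list.
    public static Task<List<string>> RunAsync<T>(
        Func<IEnumerable<T>> search, Func<T, string> getName)
    {
        // ToList() executes inside Task.Run, not on the calling thread.
        return Task.Run(() => search().Select(getName).ToList());
    }
}
```

Inside the command this would be used as var names = await BackgroundSearch.RunAsync(() => Search(_CurrentPIServer, NameSearch, PointSourceSearch), x => x.Name); followed by SearchResults = new ObservableCollection<string>(names); so only the cheap collection assignment runs in the UI context.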