Properly using multiple resources with Rx - c#

I need to use multiple disposable resources with Rx. This is how I have nested the Observable.Using statements (the inner source is just for testing).
var obs = Observable.Using(
() => new FileStream("file.txt", FileMode.Open),
fs => Observable.Using(
() => new StreamReader(fs),
sr => Observable.Create<string>(o =>
TaskPoolScheduler.Default.ScheduleAsync(async (sch, ct) =>
{
while (!ct.IsCancellationRequested)
{
var s = await sr.ReadLineAsync().ConfigureAwait(false);
if (s is null) break;
o.OnNext(s);
}
o.OnCompleted();
}))));
obs.Subscribe(Console.WriteLine);
Is there a more concise way to use multiple disposable resources?

I can't think of a general way to use an unlimited number of resources, but at least you could make helper methods for the common cases of 2-3 resources. Here is an implementation for two:
public static IObservable<TResult> Using<TResult, TResource1, TResource2>(
Func<TResource1> resourceFactory1,
Func<TResource1, TResource2> resourceFactory2,
Func<TResource1, TResource2, IObservable<TResult>> observableFactory)
where TResource1 : IDisposable
where TResource2 : IDisposable
{
return Observable.Using(resourceFactory1, resource1 => Observable.Using(
() => resourceFactory2(resource1),
resource2 => observableFactory(resource1, resource2)));
}
Usage example:
var obs = Using(
() => new FileStream("file.txt", FileMode.Open),
(fs) => new StreamReader(fs),
(fs, sr) => Observable.Create<string>(o =>
TaskPoolScheduler.Default.ScheduleAsync(async (sch, ct) =>
{
while (!ct.IsCancellationRequested)
{
var s = await sr.ReadLineAsync().ConfigureAwait(false);
if (s is null) break;
o.OnNext(s);
}
o.OnCompleted();
})));
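For an arbitrary number of resources, one possible generalisation is to let a single CompositeDisposable be the resource that Observable.Using owns, and have the factory register each disposable it creates. This is only a sketch that assumes the System.Reactive package; UsingMany and ReadLines are hypothetical names, not part of Rx or of the answer above.

```csharp
using System;
using System.IO;
using System.Reactive.Disposables;
using System.Reactive.Linq;

public static class ObservableResources
{
    // Hypothetical helper, not part of Rx: one CompositeDisposable is the
    // resource that Observable.Using owns, and the factory adds to it.
    public static IObservable<TResult> UsingMany<TResult>(
        Func<CompositeDisposable, IObservable<TResult>> observableFactory)
    {
        return Observable.Using(() => new CompositeDisposable(), observableFactory);
    }

    // Example: read a file line by line with two registered resources.
    public static IObservable<string> ReadLines(string path)
    {
        return UsingMany<string>(resources =>
        {
            var fs = new FileStream(path, FileMode.Open);
            resources.Add(fs);
            var sr = new StreamReader(fs);
            resources.Add(sr);
            // ReadLineAsync returns null at end of file, which ends the sequence.
            return Observable
                .Defer(() => Observable.FromAsync(() => sr.ReadLineAsync()))
                .Repeat()
                .TakeWhile(line => line != null);
        });
    }
}
```

Everything added to the CompositeDisposable is disposed together when the subscription terminates or is disposed, so the nesting depth no longer grows with the number of resources. The trade-off is that each resource has to be registered by hand.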

Usually when I have a set of resources to be used in a cold observable, I stick with the create pattern which is much easier to read.
var obs = Observable.Create<string>(observer =>
{
var fs = new FileStream("file.txt", FileMode.Open);
var sr = new StreamReader(fs);
var task = TaskPoolScheduler.Default.ScheduleAsync(async (_, ct) =>
{
while (!ct.IsCancellationRequested)
{
var s = await sr.ReadLineAsync().ConfigureAwait(false);
if (s is null) break;
observer.OnNext(s);
}
observer.OnCompleted();
});
return new CompositeDisposable(task, fs, sr);
});

I'd just continue to use the double Using pattern, but I'd clean up the internals of how you're reading the lines. Try this:
var obs =
Observable.Using(() => new FileStream("file.txt", FileMode.Open),
fs => Observable.Using(() => new StreamReader(fs),
sr =>
Observable
.Defer(() => Observable.FromAsync(() => sr.ReadLineAsync()))
.Repeat()
.TakeWhile(x => x != null)));
You should try to avoid Observable.Create where possible.
Cleaner observable queries means that the Using operators don't look so out-of-place.

Related

C# and Rx: testing a timeout

I've implemented an observer for a FileSystemWatcher. The idea is to track copies into a folder and wait until the copy is finished. Once it is done, I do something with the copied files.
I've tried my implementation in a small program and it works. What I want to do now is make it more formal using a unit test:
[TestMethod]
public void TestIsFileWritingFinished()
{
try
{
var dirName = Path.GetTempPath()+Path.DirectorySeparatorChar+DateTime.Now.ToString("MMddyyyy");
if (Directory.Exists(dirName))
{
Directory.Delete(dirName, true);
}
var dir = Directory.CreateDirectory(dirName);
var observer = new FileSystemObserver(dirName, "*.*", true)
.ChangedFiles
.Where(x => (new FileInfo(x.FullPath)).Length > 0)
.Select(x => x.Name);
var timeout = observer.Timeout(/*DateTimeOffset.UtcNow.AddSeconds(1)*/TimeSpan.FromSeconds(1));
var filesChanged = new List<string>();
var terminated = false;
timeout.Subscribe(Console.WriteLine, Console.WriteLine, ()=>terminated=true);
Thread.Sleep(100);
var origin = @"C:\Users\David\ResultA";
CopyDirectory(origin, dirName, true);
Thread.Sleep(100);
Console.WriteLine("nap for 5s");
Thread.Sleep(5000);
//Directory.Delete(dirName, true);
Assert.IsTrue(terminated);
}
catch(Exception ex)
{
Assert.Fail(ex.Message);
}
}
So, when the timeout happens, I expect the boolean to be true, but it looks like it isn't.
Any idea about what's wrong with my test?
Thanks in advance, your suggestions will be appreciated,
Kind regards,
Even though there is a Timeout operator, I find it is a bit of an anti-pattern. It's like programming with exceptions. I find the following pattern more useful:
IObservable<bool> query =
Observable.Amb(
source.LastAsync().Select(x => true),
Observable.Timer(TimeSpan.FromSeconds(seconds)).Select(x => false));
This effectively runs source to completion and then returns true, but if the Timer fires first it returns false.
So, in your code, I'd try something like this:
var dirName = Path.GetTempPath() + Path.DirectorySeparatorChar + DateTime.Now.ToString("MMddyyyy");
if (Directory.Exists(dirName))
{
Directory.Delete(dirName, true);
}
var dir = Directory.CreateDirectory(dirName);
var observer =
new FileSystemObserver(dirName, "*.*", true)
.ChangedFiles
.Where(x => (new FileInfo(x.FullPath)).Length > 0)
.Select(x => x.Name);
var result =
Observable.Amb(
observer.ToArray().Select(x => true),
Observable.Timer(TimeSpan.FromSeconds(1.0)).Select(x => false));
var query =
result.Zip(Observable.Start(() =>
{
var origin = @"C:\Users\David\ResultA";
CopyDirectory(origin, dirName, true);
}), (r, s) => r);
var terminated = query.Wait();
Assert.IsTrue(terminated);
That totally avoids any pesky sleeps.

Parallel.ForEach blocking calling method

I am having a problem with Parallel.ForEach. I have written a simple application that adds the names of files to be downloaded to a queue, then iterates through the queue with a while loop and downloads one file at a time. When a file has been downloaded, another async method is called to create objects from the downloaded MemoryStream. Returned result of this method is not awaited, it is discarded, so the next download starts immediately. Everything works fine if I use a simple foreach for the object creation: objects are created while the download continues. But if I try to speed up object creation with Parallel.ForEach, the download process stops until the objects are created. UI is fully responsive, but it just won't download the next object. I don't understand why this is happening: Parallel.ForEach is inside await Task.Run(), and to my limited knowledge of asynchronous programming this should do the trick. Can anyone help me understand why it is blocking the first method and how to avoid it?
Here is a small sample:
public async Task DownloadFromCloud(List<string> constructNames)
{
_downloadDataQueue = new Queue<string>();
var _gcsClient = StorageClient.Create();
foreach (var item in constructNames)
{
_downloadDataQueue.Enqueue(item);
}
while (_downloadDataQueue.Count > 0)
{
var memoryStream = new MemoryStream();
await _gcsClient.DownloadObjectAsync("companyprojects",
_downloadDataQueue.Peek(), memoryStream);
memoryStream.Position = 0;
_ = ReadFileXml(memoryStream);
_downloadDataQueue.Dequeue();
}
}
private async Task ReadFileXml(MemoryStream memoryStream)
{
var reader = new XmlReader();
var properties = reader.ReadXmlTest(memoryStream);
await Task.Run(() =>
{
var entityList = new List<Entity>();
foreach (var item in properties)
{
entityList.Add(CreateObjectsFromDownloadedProperties(item));
}
//Parallel.ForEach(properties, item =>
//{
// entityList.Add(CreateObjectsFromDownloadedProperties(item));
//});
});
}
EDIT
This is simplified object creation method:
public Entity CreateObjectsFromDownloadedProperties(RebarProperties properties)
{
var path = new LinearPath(properties.Path);
var section = new Region(properties.Region);
var sweep = section.SweepAsMesh(path, 1);
return sweep;
}
"Returned result of this method is not awaited, it is discarded, so the next download starts immediately."
This is also dangerous. "Fire and forget" means "I don't care when this operation completes, or if it completes. Just discard all exceptions because I don't care." So fire-and-forget should be extremely rare in practice. It's not appropriate here.
"UI is fully responsive, but it just won't download the next object."
I have no idea why it would block the downloads, but there's a definite problem in switching to Parallel.ForEach: List<T>.Add is not threadsafe.
private async Task ReadFileXml(MemoryStream memoryStream)
{
var reader = new XmlReader();
var properties = reader.ReadXmlTest(memoryStream);
await Task.Run(() =>
{
var entityList = new List<Entity>();
Parallel.ForEach(properties, item =>
{
var itemToAdd = CreateObjectsFromDownloadedProperties(item);
lock (entityList) { entityList.Add(itemToAdd); }
});
});
}
One tip: if you have a result value, PLINQ is often cleaner than Parallel:
private async Task ReadFileXml(MemoryStream memoryStream)
{
var reader = new XmlReader();
var properties = reader.ReadXmlTest(memoryStream);
await Task.Run(() =>
{
var entityList = properties
.AsParallel()
.Select(CreateObjectsFromDownloadedProperties)
.ToList();
});
}
However, the code still suffers from the fire-and-forget problem.
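A small step away from fire-and-forget, short of a full pipeline, is to keep each processing task and await them all once the downloads are done. The sketch below uses only the BCL; DownloadAndProcessAsync, downloadAsync and processAsync are hypothetical stand-ins for the download call and ReadFileXml in the question, not code from it.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class DownloadSketch
{
    // downloadAsync and processAsync are hypothetical stand-ins for the
    // download call and ReadFileXml from the question.
    public static async Task DownloadAndProcessAsync<TData>(
        IEnumerable<string> names,
        Func<string, Task<TData>> downloadAsync,
        Func<TData, Task> processAsync)
    {
        var processingTasks = new List<Task>();
        foreach (var name in names)
        {
            // Downloads still run one at a time.
            var data = await downloadAsync(name);
            // Start processing without awaiting it yet, but keep the task.
            processingTasks.Add(processAsync(data));
        }
        // Wait for every processing task; a failure surfaces here
        // instead of being silently discarded.
        await Task.WhenAll(processingTasks);
    }
}
```

Processing still overlaps with the next download, but the caller now learns when all of the work has finished and whether any of it failed.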
For a better fix, I'd recommend taking a step back and using something more suited to "pipeline"-style processing. E.g., TPL Dataflow:
public async Task DownloadFromCloud(List<string> constructNames)
{
// Set up the pipeline.
var gcsClient = StorageClient.Create();
var downloadBlock = new TransformBlock<string, MemoryStream>(async constructName =>
{
var memoryStream = new MemoryStream();
await gcsClient.DownloadObjectAsync("companyprojects", constructName, memoryStream);
memoryStream.Position = 0;
return memoryStream;
});
var processBlock = new TransformBlock<MemoryStream, List<Entity>>(memoryStream =>
{
var reader = new XmlReader();
var properties = reader.ReadXmlTest(memoryStream);
return properties
.AsParallel()
.Select(CreateObjectsFromDownloadedProperties)
.ToList();
});
var resultsBlock = new ActionBlock<List<Entity>>(entities => { /* TODO */ });
downloadBlock.LinkTo(processBlock, new DataflowLinkOptions { PropagateCompletion = true });
processBlock.LinkTo(resultsBlock, new DataflowLinkOptions { PropagateCompletion = true });
// Push data into the pipeline.
foreach (var constructName in constructNames)
await downloadBlock.SendAsync(constructName);
downloadBlock.Complete();
// Wait for pipeline to complete.
await resultsBlock.Completion;
}

How can I change an iteration to be Reactive (Rx)

I would like to change the following code to be Observable-based:
// 'assets' is a IReadOnly list of StorageFile (approx. 10-20 files)
foreach (var file in assets)
{
img.Source = new BitmapImage(new Uri(file.Path));
img.ImageOpened += async (sender, e) =>
{
// Do some work (can contain Task-based code)
};
}
But when I try to change it, I end up with some design problems:
assets
.ToObservable()
.Select(file =>
{
img.Source = new BitmapImage(new Uri(file.Path));
return img.Events().ImageOpened;
})
.Switch()
.Select(evt => // 'event' is a reserved keyword in C#, so it cannot be a parameter name
{
// Now I'm stuck, I don't have the file...
})
.Subscribe(
_ =>
{
},
ex => System.Diagnostics.Debug.WriteLine("Error on subscribing to ImageOpened"))
.DisposeWith(_subscriptions);
I feel I'm going about this the wrong way...
In the end I needed to change my logic due to another limitation with my Image control, but my solution was in the following direction using Zip:
Observable
.Zip(
assets
.ToObservable()
.Do(file => imageControl.Source = new BitmapImage(new Uri(file.Path))),
imageControl
.Events() // Extension class for my events
.ImageOpened,
(asset, _) =>
{
// Do some work ...
})
.Subscribe(
_ => { },
ex => System.Diagnostics.Debug.WriteLine("Error on subscribing to Zip"))
.DisposeWith(_subscriptions);

How to implement Task.WhenAny() with a predicate

I want to execute several asynchronous tasks concurrently. Each task will run an HTTP request that can either complete successfully or throw an exception. I need to await until the first task completes successfully, or until all the tasks have failed.
How can I implement an overload of the Task.WhenAny method that accepts a predicate, so that I can exclude the non-successfully completed tasks?
Wait for any task and return the task if the condition is met. Otherwise wait again for the other tasks until there is no more task to wait for.
public static async Task<Task> WhenAny( IEnumerable<Task> tasks, Predicate<Task> condition )
{
var tasklist = tasks.ToList();
while ( tasklist.Count > 0 )
{
var task = await Task.WhenAny( tasklist );
if ( condition( task ) )
return task;
tasklist.Remove( task );
}
return null;
}
A simple check for that:
var tasks = new List<Task> {
Task.FromException( new Exception() ),
Task.FromException( new Exception() ),
Task.FromException( new Exception() ),
Task.CompletedTask, };
var completedTask = WhenAny( tasks, t => t.Status == TaskStatus.RanToCompletion ).Result;
if ( tasks.IndexOf( completedTask ) != 3 )
throw new Exception( "not expected" );
public static Task<T> GetFirstResult<T>(
ICollection<Func<CancellationToken, Task<T>>> taskFactories,
Predicate<T> predicate) where T : class
{
var tcs = new TaskCompletionSource<T>();
var cts = new CancellationTokenSource();
int completedCount = 0;
// in case you have a lot of tasks you might need to throttle them
//(e.g. so you don't try to send 99999999 requests at the same time)
// see: http://stackoverflow.com/a/25877042/67824
foreach (var taskFactory in taskFactories)
{
taskFactory(cts.Token).ContinueWith(t =>
{
if (t.Exception != null)
{
Console.WriteLine($"Task completed with exception: {t.Exception}");
}
else if (predicate(t.Result))
{
cts.Cancel();
tcs.TrySetResult(t.Result);
}
if (Interlocked.Increment(ref completedCount) == taskFactories.Count)
{
// TrySetException: the task may already hold a result, and SetException would throw then.
tcs.TrySetException(new InvalidOperationException("All tasks failed"));
}
}, cts.Token);
}
return tcs.Task;
}
Sample usage:
using System.Net.Http;
var client = new HttpClient();
var response = await GetFirstResult(
new Func<CancellationToken, Task<HttpResponseMessage>>[]
{
ct => client.GetAsync("http://microsoft123456.com", ct),
ct => client.GetAsync("http://microsoft123456.com", ct),
ct => client.GetAsync("http://microsoft123456.com", ct),
ct => client.GetAsync("http://microsoft123456.com", ct),
ct => client.GetAsync("http://microsoft123456.com", ct),
ct => client.GetAsync("http://microsoft123456.com", ct),
ct => client.GetAsync("http://microsoft123456.com", ct),
ct => client.GetAsync("http://microsoft.com", ct),
ct => client.GetAsync("http://microsoft123456.com", ct),
ct => client.GetAsync("http://microsoft123456.com", ct),
},
rm => rm.IsSuccessStatusCode);
Console.WriteLine($"Successful response: {response}");
public static Task<Task<T>> WhenFirst<T>(IEnumerable<Task<T>> tasks, Func<Task<T>, bool> predicate)
{
if (tasks == null) throw new ArgumentNullException(nameof(tasks));
if (predicate == null) throw new ArgumentNullException(nameof(predicate));
var tasksArray = (tasks as IReadOnlyList<Task<T>>) ?? tasks.ToArray();
if (tasksArray.Count == 0) throw new ArgumentException("Empty task list", nameof(tasks));
if (tasksArray.Any(t => t == null)) throw new ArgumentException("Tasks contains a null reference", nameof(tasks));
var tcs = new TaskCompletionSource<Task<T>>();
var count = tasksArray.Count;
Action<Task<T>> continuation = t =>
{
if (predicate(t))
{
tcs.TrySetResult(t);
}
if (Interlocked.Decrement(ref count) == 0)
{
tcs.TrySetResult(null);
}
};
foreach (var task in tasksArray)
{
task.ContinueWith(continuation);
}
return tcs.Task;
}
Sample usage:
var task = await WhenFirst(tasks, t => t.Status == TaskStatus.RanToCompletion);
if (task != null)
{
var value = await task;
}
Note that this doesn't propagate exceptions of failed tasks (just as WhenAny doesn't).
You can also create a version of this for the non-generic Task.
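A non-generic variant could look roughly like the following. It is a sketch that mirrors the generic method above rather than a tested replacement; the class name TaskSketches is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class TaskSketches
{
    // Non-generic counterpart of WhenFirst: the result is the first task
    // accepted by the predicate, or null when no task is accepted.
    public static Task<Task> WhenFirst(IEnumerable<Task> tasks, Func<Task, bool> predicate)
    {
        if (tasks == null) throw new ArgumentNullException(nameof(tasks));
        if (predicate == null) throw new ArgumentNullException(nameof(predicate));
        var tasksArray = tasks.ToArray();
        if (tasksArray.Length == 0) throw new ArgumentException("Empty task list", nameof(tasks));
        if (tasksArray.Any(t => t == null)) throw new ArgumentException("Tasks contains a null reference", nameof(tasks));

        var tcs = new TaskCompletionSource<Task>();
        var count = tasksArray.Length;
        foreach (var task in tasksArray)
        {
            task.ContinueWith(t =>
            {
                if (predicate(t)) tcs.TrySetResult(t);
                // The last continuation to run reports "no match" if nothing was accepted.
                if (Interlocked.Decrement(ref count) == 0) tcs.TrySetResult(null);
            }, TaskScheduler.Default);
        }
        return tcs.Task;
    }
}
```

As with the generic version, exceptions from failed tasks are not propagated; the caller inspects or awaits the returned task.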
Here is an attempted improvement of Eli Arbel's excellent answer. These are the improved points:
An exception in the predicate is propagated as a fault of the returned task.
The predicate is not called after a task has been accepted as the result.
The predicate is executed in the original SynchronizationContext. This makes it possible to access UI elements (if the WhenFirst method is called from a UI thread)
The source IEnumerable<Task<T>> is enumerated directly, without being converted to an array first.
public static Task<Task<T>> WhenFirst<T>(IEnumerable<Task<T>> tasks,
Func<Task<T>, bool> predicate)
{
if (tasks == null) throw new ArgumentNullException(nameof(tasks));
if (predicate == null) throw new ArgumentNullException(nameof(predicate));
var tcs = new TaskCompletionSource<Task<T>>(
TaskCreationOptions.RunContinuationsAsynchronously);
var pendingCount = 1; // The initial 1 represents the enumeration itself
foreach (var task in tasks)
{
if (task == null) throw new ArgumentException($"The {nameof(tasks)}" +
" argument included a null value.", nameof(tasks));
Interlocked.Increment(ref pendingCount);
HandleTaskCompletion(task);
}
if (Interlocked.Decrement(ref pendingCount) == 0) tcs.TrySetResult(null);
return tcs.Task;
async void HandleTaskCompletion(Task<T> task)
{
try
{
await task; // Continue on the captured context
}
catch { } // Ignore exception
if (tcs.Task.IsCompleted) return;
try
{
if (predicate(task))
tcs.TrySetResult(task);
else
if (Interlocked.Decrement(ref pendingCount) == 0)
tcs.TrySetResult(null);
}
catch (Exception ex)
{
tcs.TrySetException(ex);
}
}
}
Another way of doing this, very similar to Sir Rufo's answer, but using AsyncEnumerable and Ix.NET
Implement a little helper method to stream any task as soon as it's completed:
static IAsyncEnumerable<Task<T>> WhenCompleted<T>(IEnumerable<Task<T>> source) =>
AsyncEnumerable.Create(_ =>
{
var tasks = source.ToList();
Task<T> current = null;
return AsyncEnumerator.Create(
async () => tasks.Any() && tasks.Remove(current = await Task.WhenAny(tasks)),
() => current,
async () => { });
});
One can then process the tasks in completion order, e.g. returning the first matching one as requested:
await WhenCompleted(tasks).FirstOrDefaultAsync(t => t.Status == TaskStatus.RanToCompletion)
Just wanted to add to the answers by @Peebo and @SirRufo that use List.Remove (I can't comment yet).
I would consider using:
var tasks = source.ToHashSet();
instead of:
var tasks = source.ToList();
so removing would be more efficient
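Applied to the loop-based WhenAny from the earlier answer, the suggestion might look like this. It is a sketch using only the BCL; WhenAnyMatching is a hypothetical name chosen to avoid clashing with Task.WhenAny.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class WhenAnySketch
{
    // Returns the first completed task that satisfies the condition,
    // or null when every task has completed without satisfying it.
    public static async Task<Task> WhenAnyMatching(
        IEnumerable<Task> tasks, Predicate<Task> condition)
    {
        // A HashSet makes each Remove O(1); duplicate task references collapse into one entry.
        var pending = new HashSet<Task>(tasks);
        while (pending.Count > 0)
        {
            var task = await Task.WhenAny(pending);
            if (condition(task))
                return task;
            pending.Remove(task);
        }
        return null;
    }
}
```

Note that Task.WhenAny itself still walks the whole collection on every call, so the loop as a whole stays quadratic in the number of tasks; the HashSet only removes the cost of the removal step.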

How to Multiple C# Task Factory

This is the C# Code
app.UseCors(CorsOptions.AllowAll);
var hubConfiguration = new HubConfiguration();
hubConfiguration.EnableDetailedErrors = true;
app.MapSignalR(hubConfiguration);
CpuEngine cpuEngine = new CpuEngine(1500);
MemoryEngine memoryEngine = new MemoryEngine(1500);
// Task.Factory.StartNew(async () => await cpuEngine.StartCpuCheck());
// Task.Factory.StartNew(async () => await memoryEngine.StartCheckMemory());
Only the first one is running. How can I run both of them?
1) Use Task.Run instead.
2) Remove the keywords async and await in the lambda.
3) Use Task.WhenAll and pass in the two tasks.
public async Task InvokeAsync()
{
var cpuEngine = new CpuEngine(1500);
var memoryEngine = new MemoryEngine(1500);
await Task.WhenAll(
Task.Run(() => cpuEngine.StartCpuCheck()),
Task.Run(() => memoryEngine.StartCheckMemory()));
}
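The reason behind points 1 and 2 is that Task.Factory.StartNew does not understand async lambdas: it returns a Task<Task> whose outer task finishes as soon as the lambda hands back its inner task, whereas Task.Run returns a task that covers the whole asynchronous operation. The sketch below illustrates the difference; WorkAsync is a hypothetical stand-in for a method such as StartCpuCheck.

```csharp
using System;
using System.Threading.Tasks;

public static class StartNewSketch
{
    // Hypothetical stand-in for an engine method such as StartCpuCheck.
    private static async Task WorkAsync()
    {
        await Task.Delay(200);
    }

    // Returns whether the inner work had already finished at the moment
    // the task returned by StartNew completed.
    public static async Task<bool> InnerFinishedWithOuterAsync()
    {
        // With an async lambda, StartNew yields Task<Task>.
        Task<Task> outer = Task.Factory.StartNew(async () => await WorkAsync());
        // Awaiting the outer task only waits until the lambda returns its inner task.
        Task inner = await outer;
        bool innerFinished = inner.IsCompleted; // normally false: the delay is still pending
        await inner; // now the actual work is awaited

        // Task.Run unwraps the inner task, so this await covers the whole work.
        await Task.Run(() => WorkAsync());
        return innerFinished;
    }
}
```

If StartNew has to be used for some reason, calling Unwrap() on its result gives a task equivalent to the one Task.Run returns.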
