How can I change an iteration to be Reactive (Rx) - C#

I would like to change the following code to be Observable-based:
// 'assets' is an IReadOnlyList of StorageFile (approx. 10-20 files)
foreach (var file in assets)
{
    img.Source = new BitmapImage(new Uri(file.Path));
    img.ImageOpened += async (sender, e) =>
    {
        // Do some work (can contain Task-based code)
    };
}
But when I try to change it, I end up with some design problems:
assets
    .ToObservable()
    .Select(file =>
    {
        img.Source = new BitmapImage(new Uri(file.Path));
        return img.Events().ImageOpened;
    })
    .Switch()
    .Select(evt =>
    {
        // Now I'm stuck, I don't have the file...
    })
    .Subscribe(
        _ =>
        {
        },
        ex => System.Diagnostics.Debug.WriteLine("Error on subscribing to ImageOpened"))
    .DisposeWith(_subscriptions);
I feel I'm going about this the wrong way...

In the end I needed to change my logic due to another limitation with my Image control, but my solution was in the following direction using Zip:
Observable
    .Zip(
        assets
            .ToObservable()
            .Do(file => imageControl.Source = new BitmapImage(new Uri(file.Path))),
        imageControl
            .Events() // Extension class for my events
            .ImageOpened,
        (asset, _) =>
        {
            // Do some work ...
        })
    .Subscribe(
        _ => { },
        ex => System.Diagnostics.Debug.WriteLine("Error on subscribing to Zip"))
    .DisposeWith(_subscriptions);
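Another direction that keeps the file in scope (a sketch, reusing the Events() extension and the single img control from the question) is to pair each file with its own ImageOpened event and process the images one at a time:
assets
    .ToObservable()
    .Select(file => Observable.Defer(() =>
    {
        // The Source assignment runs only when Concat subscribes to this inner sequence.
        img.Source = new BitmapImage(new Uri(file.Path));
        return img.Events().ImageOpened
            .Take(1)
            .Select(_ => file); // carry the file along with its event
    }))
    .Concat() // one image at a time, in order
    .Subscribe(
        file =>
        {
            // Do some work with 'file' ...
        },
        ex => System.Diagnostics.Debug.WriteLine("Error on subscribing to ImageOpened"))
    .DisposeWith(_subscriptions);
The Defer is what makes this work: it postpones the Source assignment until Concat actually subscribes to that inner sequence, so the images load one after another instead of all at once.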


C# and Rx: testing a timeout

I've implemented an observer for a FileSystemWatcher. The idea is to track copies into a folder and wait until the copy is finished. Once done, do something with the copied files.
I've tried my implementation in a small program and it works. What I want to do now is make it more formal with a unit test:
[TestMethod]
public void TestIsFileWritingFinished()
{
    try
    {
        var dirName = Path.GetTempPath() + Path.DirectorySeparatorChar + DateTime.Now.ToString("MMddyyyy");
        if (Directory.Exists(dirName))
        {
            Directory.Delete(dirName, true);
        }
        var dir = Directory.CreateDirectory(dirName);
        var observer = new FileSystemObserver(dirName, "*.*", true)
            .ChangedFiles
            .Where(x => (new FileInfo(x.FullPath)).Length > 0)
            .Select(x => x.Name);
        var timeout = observer.Timeout(/*DateTimeOffset.UtcNow.AddSeconds(1)*/TimeSpan.FromSeconds(1));
        var filesChanged = new List<string>();
        var terminated = false;
        timeout.Subscribe(Console.WriteLine, Console.WriteLine, () => terminated = true);
        Thread.Sleep(100);
        var origin = @"C:\Users\David\ResultA";
        CopyDirectory(origin, dirName, true);
        Thread.Sleep(100);
        Console.WriteLine("nap for 5s");
        Thread.Sleep(5000);
        //Directory.Delete(dirName, true);
        Assert.IsTrue(terminated);
    }
    catch (Exception ex)
    {
        Assert.Fail(ex.Message);
    }
}
So, when the timeout happens, I expect the boolean to be true, but it looks like it isn't.
Any idea what's wrong with my test?
Even though there is a Timeout operator, I find it a bit of an anti-pattern. It's like programming with exceptions. I find the following pattern more useful:
IObservable<bool> query =
    Observable.Amb(
        source.LastAsync().Select(x => true),
        Observable.Timer(TimeSpan.FromSeconds(seconds)).Select(x => false));
This effectively runs source to completion and then returns true, but if the Timer completes first it returns false.
So, in your code, I'd try something like this:
var dirName = Path.GetTempPath() + Path.DirectorySeparatorChar + DateTime.Now.ToString("MMddyyyy");
if (Directory.Exists(dirName))
{
    Directory.Delete(dirName, true);
}
var dir = Directory.CreateDirectory(dirName);
var observer =
    new FileSystemObserver(dirName, "*.*", true)
        .ChangedFiles
        .Where(x => (new FileInfo(x.FullPath)).Length > 0)
        .Select(x => x.Name);
var result =
    Observable.Amb(
        observer.ToArray().Select(x => true),
        Observable.Timer(TimeSpan.FromSeconds(1.0)).Select(x => false));
var query =
    result.Zip(Observable.Start(() =>
    {
        var origin = @"C:\Users\David\ResultA";
        CopyDirectory(origin, dirName, true);
    }), (r, s) => r);
var terminated = query.Wait();
Assert.IsTrue(terminated);
That totally avoids any pesky sleeps.
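If this pattern comes up in more than one test, it can be wrapped in a small extension method (a sketch; CompletedWithin is a made-up name, and LastOrDefaultAsync is used so an empty source does not throw):
public static class ObservableTestExtensions
{
    // True if the source completes before the timeout elapses, false otherwise.
    public static IObservable<bool> CompletedWithin<T>(this IObservable<T> source, TimeSpan timeout)
    {
        return Observable.Amb(
            source.LastOrDefaultAsync().Select(_ => true),
            Observable.Timer(timeout).Select(_ => false));
    }
}
The result line above would then read var result = observer.CompletedWithin(TimeSpan.FromSeconds(1)); with the Zip and Wait left unchanged.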

Parallel.ForEach blocking calling method

I am having a problem with Parallel.ForEach. I have written a simple application that adds file names to a download queue, then iterates through the queue with a while loop, downloading one file at a time; when a file has been downloaded, another async method is called to create objects from the downloaded MemoryStream. The returned result of this method is not awaited; it is discarded, so the next download starts immediately. Everything works fine if I use a simple foreach for the object creation: objects are created while the download continues. But if I try to speed up object creation with Parallel.ForEach, the download stops until the objects have been created. The UI is fully responsive, but it just won't download the next object. I don't understand why this is happening: the Parallel.ForEach is inside an await Task.Run(), and to my limited knowledge of asynchronous programming that should do the trick. Can anyone help me understand why it blocks the first method and how to avoid it?
Here is a small sample:
public async Task DownloadFromCloud(List<string> constructNames)
{
    _downloadDataQueue = new Queue<string>();
    var _gcsClient = StorageClient.Create();
    foreach (var item in constructNames)
    {
        _downloadDataQueue.Enqueue(item);
    }
    while (_downloadDataQueue.Count > 0)
    {
        var memoryStream = new MemoryStream();
        await _gcsClient.DownloadObjectAsync("companyprojects",
            _downloadDataQueue.Peek(), memoryStream);
        memoryStream.Position = 0;
        _ = ReadFileXml(memoryStream);
        _downloadDataQueue.Dequeue();
    }
}
private async Task ReadFileXml(MemoryStream memoryStream)
{
    var reader = new XmlReader();
    var properties = reader.ReadXmlTest(memoryStream);
    await Task.Run(() =>
    {
        var entityList = new List<Entity>();
        foreach (var item in properties)
        {
            entityList.Add(CreateObjectsFromDownloadedProperties(item));
        }
        //Parallel.ForEach(properties, item =>
        //{
        //    entityList.Add(CreateObjectsFromDownloadedProperties(item));
        //});
    });
}
EDIT
This is the simplified object-creation method:
public Entity CreateObjectsFromDownloadedProperties(RebarProperties properties)
{
    var path = new LinearPath(properties.Path);
    var section = new Region(properties.Region);
    var sweep = section.SweepAsMesh(path, 1);
    return sweep;
}
The returned result of this method is not awaited; it is discarded, so the next download starts immediately.
This is also dangerous. "Fire and forget" means "I don't care when this operation completes, or if it completes. Just discard all exceptions because I don't care." So fire-and-forget should be extremely rare in practice. It's not appropriate here.
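If the intent is to keep downloading while earlier files are processed, a minimal alternative (a sketch, keeping the posted method shapes; the Dataflow pipeline further down is the more thorough fix) is to collect the returned tasks and await them before the method finishes, so their exceptions are observed:
public async Task DownloadFromCloud(List<string> constructNames)
{
    var gcsClient = StorageClient.Create();
    var readTasks = new List<Task>();
    foreach (var constructName in constructNames)
    {
        var memoryStream = new MemoryStream();
        await gcsClient.DownloadObjectAsync("companyprojects", constructName, memoryStream);
        memoryStream.Position = 0;
        readTasks.Add(ReadFileXml(memoryStream)); // keep the task instead of discarding it
    }
    await Task.WhenAll(readTasks); // any exception from ReadFileXml surfaces here
}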
The UI is fully responsive, but it just won't download the next object.
I have no idea why it would block the downloads, but there's a definite problem in switching to Parallel.ForEach: List<T>.Add is not threadsafe.
private async Task ReadFileXml(MemoryStream memoryStream)
{
    var reader = new XmlReader();
    var properties = reader.ReadXmlTest(memoryStream);
    await Task.Run(() =>
    {
        var entityList = new List<Entity>();
        Parallel.ForEach(properties, item =>
        {
            var itemToAdd = CreateObjectsFromDownloadedProperties(item);
            lock (entityList) { entityList.Add(itemToAdd); }
        });
    });
}
One tip: if you have a result value, PLINQ is often cleaner than Parallel:
private async Task ReadFileXml(MemoryStream memoryStream)
{
    var reader = new XmlReader();
    var properties = reader.ReadXmlTest(memoryStream);
    await Task.Run(() =>
    {
        var entityList = properties
            .AsParallel()
            .Select(CreateObjectsFromDownloadedProperties)
            .ToList();
    });
}
However, the code still suffers from the fire-and-forget problem.
For a better fix, I'd recommend taking a step back and using something more suited to "pipeline"-style processing. E.g., TPL Dataflow:
public async Task DownloadFromCloud(List<string> constructNames)
{
    // Set up the pipeline.
    var gcsClient = StorageClient.Create();
    var downloadBlock = new TransformBlock<string, MemoryStream>(async constructName =>
    {
        var memoryStream = new MemoryStream();
        await gcsClient.DownloadObjectAsync("companyprojects", constructName, memoryStream);
        memoryStream.Position = 0;
        return memoryStream;
    });
    var processBlock = new TransformBlock<MemoryStream, List<Entity>>(memoryStream =>
    {
        var reader = new XmlReader();
        var properties = reader.ReadXmlTest(memoryStream);
        return properties
            .AsParallel()
            .Select(CreateObjectsFromDownloadedProperties)
            .ToList();
    });
    var resultsBlock = new ActionBlock<List<Entity>>(entities => { /* TODO */ });
    downloadBlock.LinkTo(processBlock, new DataflowLinkOptions { PropagateCompletion = true });
    processBlock.LinkTo(resultsBlock, new DataflowLinkOptions { PropagateCompletion = true });

    // Push data into the pipeline.
    foreach (var constructName in constructNames)
        await downloadBlock.SendAsync(constructName);
    downloadBlock.Complete();

    // Wait for the pipeline to complete.
    await resultsBlock.Completion;
}
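If the downloads themselves should overlap rather than run one at a time, the blocks also accept execution options (a sketch; the degree of parallelism is an arbitrary value to tune):
var downloadBlock = new TransformBlock<string, MemoryStream>(async constructName =>
{
    var memoryStream = new MemoryStream();
    await gcsClient.DownloadObjectAsync("companyprojects", constructName, memoryStream);
    memoryStream.Position = 0;
    return memoryStream;
}, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4 }); // up to 4 concurrent downloads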

Properly using multiple resources with Rx

I need to use multiple disposable resources with Rx. This is how I have nested the Observable.Using statements (the inner source is just for testing).
var obs = Observable.Using(
    () => new FileStream("file.txt", FileMode.Open),
    fs => Observable.Using(
        () => new StreamReader(fs),
        sr => Observable.Create<string>(o =>
            TaskPoolScheduler.Default.ScheduleAsync(async (sch, ct) =>
            {
                while (!ct.IsCancellationRequested)
                {
                    var s = await sr.ReadLineAsync().ConfigureAwait(false);
                    if (s is null) break;
                    o.OnNext(s);
                }
                o.OnCompleted();
            }))));

obs.Subscribe(Console.WriteLine);
Is there a more concise way to use multiple disposable resources?
I can't think of a general way to use an unlimited number of resources, but you could at least make helper methods for the common cases of two or three resources. Here is an implementation for two:
public static IObservable<TResult> Using<TResult, TResource1, TResource2>(
    Func<TResource1> resourceFactory1,
    Func<TResource1, TResource2> resourceFactory2,
    Func<TResource1, TResource2, IObservable<TResult>> observableFactory)
    where TResource1 : IDisposable
    where TResource2 : IDisposable
{
    return Observable.Using(resourceFactory1, resource1 => Observable.Using(
        () => resourceFactory2(resource1),
        resource2 => observableFactory(resource1, resource2)));
}
Usage example:
var obs = Using(
    () => new FileStream("file.txt", FileMode.Open),
    (fs) => new StreamReader(fs),
    (fs, sr) => Observable.Create<string>(o =>
        TaskPoolScheduler.Default.ScheduleAsync(async (sch, ct) =>
        {
            while (!ct.IsCancellationRequested)
            {
                var s = await sr.ReadLineAsync().ConfigureAwait(false);
                if (s is null) break;
                o.OnNext(s);
            }
            o.OnCompleted();
        })));
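The same nesting extends naturally to three resources (a sketch along the same lines, not tested):
public static IObservable<TResult> Using<TResult, TResource1, TResource2, TResource3>(
    Func<TResource1> resourceFactory1,
    Func<TResource1, TResource2> resourceFactory2,
    Func<TResource1, TResource2, TResource3> resourceFactory3,
    Func<TResource1, TResource2, TResource3, IObservable<TResult>> observableFactory)
    where TResource1 : IDisposable
    where TResource2 : IDisposable
    where TResource3 : IDisposable
{
    return Observable.Using(resourceFactory1, resource1 => Observable.Using(
        () => resourceFactory2(resource1), resource2 => Observable.Using(
            () => resourceFactory3(resource1, resource2),
            resource3 => observableFactory(resource1, resource2, resource3))));
}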
Usually, when I have a set of resources to be used in a cold observable, I stick with the Observable.Create pattern, which is much easier to read.
var obs = Observable.Create<string>(observer =>
{
    var fs = new FileStream("file.txt", FileMode.Open);
    var sr = new StreamReader(fs);
    var task = TaskPoolScheduler.Default.ScheduleAsync(async (_, ct) =>
    {
        while (!ct.IsCancellationRequested)
        {
            var s = await sr.ReadLineAsync().ConfigureAwait(false);
            if (s is null) break;
            observer.OnNext(s);
        }
        observer.OnCompleted();
    });
    return new CompositeDisposable(task, fs, sr);
});
I'd just continue to use the double Using pattern, but I'd clean up the internals of how you're reading the lines. Try this:
var obs =
    Observable.Using(() => new FileStream("file.txt", FileMode.Open),
        fs => Observable.Using(() => new StreamReader(fs),
            sr =>
                Observable
                    .Defer(() => Observable.FromAsync(() => sr.ReadLineAsync()))
                    .Repeat()
                    .TakeWhile(x => x != null)));
You should try to avoid Observable.Create where possible.
Cleaner observable queries mean that the Using operators don't look so out-of-place.
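Consuming the query stays the same as in the question (a sketch; disposing the subscription tears down both Using resources):
using (obs.Subscribe(Console.WriteLine, ex => Console.WriteLine(ex)))
{
    Console.ReadLine(); // keep the subscription alive while the file is read
}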

Execution Timeout Expired errors using Entity Framework

My code below starts out great and seems to run quickly, but after a while I start getting this error message. I realize that Entity Framework's DbContext is not thread safe, and that is probably causing the issue. Is there a way to change this code so that it keeps the same performance without the threading issues that are probably causing the problem, or is there another way to speed up this process? I have over 9,000 symbols to download and insert into a database. I tried basic for loops with await, but it was extremely slow and took more than ten times longer to achieve the same results.
public static async Task startInitialMarketSymbolsDownload(string market)
{
    try
    {
        List<string> symbolList = new List<string>();
        symbolList = getStockSymbols(market);
        var historicalGroups = symbolList.Select((x, i) => new { x, i })
            .GroupBy(x => x.i / 50)
            .Select(g => g.Select(x => x.x).ToArray());
        await Task.WhenAll(historicalGroups.Select(g => Task.Run(() => getLocalHistoricalStockData(g, market))));
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.Message);
    }
}
public static async Task getLocalHistoricalStockData(string[] symbols, string market)
{
    // download data for list of symbols and then upload to db tables
    string symbolInfo = null;
    try
    {
        using (financeEntities context = new financeEntities())
        {
            foreach (string symbol in symbols)
            {
                symbolInfo = symbol;
                List<HistoryPrice> hList = Get(symbol, new DateTime(1900, 1, 1), DateTime.UtcNow);
                var backDates = context.DailyAmexDatas.Where(c => c.Symbol == symbol).Select(d => d.Date).ToList();
                List<HistoryPrice> newHList = hList.Where(c => backDates.Contains(c.Date) == false).ToList<HistoryPrice>();
                foreach (HistoryPrice h in newHList)
                {
                    DailyAmexData amexData = new DailyAmexData();
                    // set the data then add it
                    amexData.Symbol = symbol;
                    amexData.Open = h.Open;
                    amexData.High = h.High;
                    amexData.Low = h.Low;
                    amexData.Close = h.Close;
                    amexData.Volume = h.Volume;
                    amexData.AdjustedClose = h.AdjClose;
                    amexData.Date = h.Date;
                    context.DailyAmexDatas.Add(amexData);
                }
                // now save everything
                await context.SaveChangesAsync();
                Console.WriteLine(symbol + " added to the " + market + " database!");
            }
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.InnerException.Message);
    }
}
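One direction to experiment with (a sketch, not a verified fix) is to keep the grouping but bound how many groups hit the database at once with a SemaphoreSlim, so the server is less likely to be saturated into a command timeout:
public static async Task startInitialMarketSymbolsDownload(string market)
{
    var symbolList = getStockSymbols(market);
    var historicalGroups = symbolList.Select((x, i) => new { x, i })
        .GroupBy(x => x.i / 50)
        .Select(g => g.Select(x => x.x).ToArray());

    // Allow at most 4 groups to run concurrently; the number is arbitrary and worth tuning.
    var throttle = new SemaphoreSlim(4);
    var tasks = historicalGroups.Select(async g =>
    {
        await throttle.WaitAsync();
        try
        {
            await getLocalHistoricalStockData(g, market); // each group still gets its own DbContext
        }
        finally
        {
            throttle.Release();
        }
    });
    await Task.WhenAll(tasks);
}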

How to observe dependent events in Reactive Extensions (Rx)?

What is the best way to handle dependent events such as these?
There is an object for which I need to test whether the connection succeeded or failed.
But the object first needs to pass the initialization step, which I test for success or failure before continuing to the connection step.
If initialization fails, the result is that the connection failed.
If initialization succeeds, the result is the result of the connection step.
My code is below. Is there a better way to handle these dependent events? Right now I'm subscribing to the connection inside the initialization subscription.
If I have more dependent events like this, will I keep nesting the subscriptions?
public static void Test()
{
    const int maxValue = 501;
    var random = new Random(BitConverter.ToInt32(Guid.NewGuid().ToByteArray(), 0));
    var initOk = Observable.Interval(TimeSpan.FromMilliseconds(random.Next(maxValue))).Select(i => true);
    var initKo = Observable.Interval(TimeSpan.FromMilliseconds(random.Next(maxValue))).Select(i => false);
    var connectOk = Observable.Interval(TimeSpan.FromMilliseconds(random.Next(maxValue))).Select(i => true);
    var connectKo = Observable.Interval(TimeSpan.FromMilliseconds(random.Next(maxValue))).Select(i => false);
    var initResult = initOk.Amb(initKo).Take(1);
    var connectResult = connectOk.Amb(connectKo).Take(1);
    var id =
        initResult.Subscribe(ir =>
        {
            if (ir)
            {
                var cd =
                    connectResult.Subscribe(cr =>
                    {
                        Console.WriteLine(cr
                            ? "Connection succeeded."
                            : "Connection failed.");
                    });
            }
            else
            {
                Console.WriteLine("Initialization failed thus connection failed.");
            }
        });
}
You can normally avoid nesting by using a variety of the Rx operators to chain calls together.
Your example could be tidied up using:
initResult.SelectMany(ir =>
    {
        if (ir)
        {
            return connectResult;
        }
        Console.WriteLine("Initialization failed thus connection failed.");
        return Observable.Throw<bool>(new Exception("Some Exception"));
    })
    .Subscribe(cr =>
    {
        Console.WriteLine(cr
            ? "Connection succeeded."
            : "Connection failed.");
    });
You could use this:
var finalResult =
    initResult
        .Select(ir =>
            Observable.If(() => ir, connectResult, Observable.Return(false)))
        .Merge();
To get your messages out, you could change it like this:
var initResultText =
    initResult
        .Select(ir =>
            ir ? (string)null : "Initialization failed thus connection failed.");
var connectResultText =
    connectResult
        .Select(cr =>
            String.Format("Connection {0}.", cr ? "succeeded" : "failed"));
var finalResult =
    initResultText
        .Select(irt =>
            Observable.If(() => irt == null, connectResultText, Observable.Return(irt)))
        .Merge();
If you need to nest further than this, consider writing an extension method that wraps up the complexity so that composition stays easy.
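For example, a hypothetical extension along those lines (a sketch; the name ThenIf is made up) that runs the next step only when the previous one succeeded:
public static class DependentStepExtensions
{
    // Runs 'next' only when 'previous' reports success; otherwise short-circuits
    // with the given fallback value.
    public static IObservable<bool> ThenIf(
        this IObservable<bool> previous,
        IObservable<bool> next,
        bool fallback = false)
    {
        return previous
            .Select(ok => ok ? next : Observable.Return(fallback))
            .Switch();
    }
}
With that in place, the chain reads as initResult.ThenIf(connectResult) rather than another level of nesting, and a further hypothetical step would just be another .ThenIf(...) call.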
