Parallel.Foreach + yield return?

Parallel.Foreach + yield return? - c#

I want to process something using parallel loop like this :
public void FillLogs(IEnumerable<IComputer> computers)
{
Parallel.ForEach(computers, cpt=>
{
cpt.Logs = cpt.GetRawLogs().ToList();
});
}
Ok, it works fine. But How to do if I want the FillLogs method return an IEnumerable ?
public IEnumerable<IComputer> FillLogs(IEnumerable<IComputer> computers)
{
Parallel.ForEach(computers, cpt=>
{
cpt.Logs = cpt.GetRawLogs().ToList();
yield return cpt // KO, don't work
});
}
EDIT
It seems not to be possible... but I use something like this :
public IEnumerable<IComputer> FillLogs(IEnumerable<IComputer> computers)
{
return computers.AsParallel().Select(cpt => cpt);
}
But where I put the cpt.Logs = cpt.GetRawLogs().ToList(); instruction

Short version - no, that isn't possible via an iterator block; the longer version probably involves synchronized queue/dequeue between the caller's iterator thread (doing the dequeue) and the parallel workers (doing the enqueue); but as a side note - logs are usually IO-bound, and parallelising things that are IO-bound often doesn't work very well.
If the caller is going to take some time to consume each, then there may be some merit to an approach that only processes one log at a time, but can do that while the caller is consuming the previous log; i.e. it begins a Task for the next item before the yield, and waits for completion after the yield... but that is again, pretty complex. As a simplified example:
static void Main()
{
foreach(string s in Get())
{
Console.WriteLine(s);
}
}
static IEnumerable<string> Get() {
var source = new[] {1, 2, 3, 4, 5};
Task<string> outstandingItem = null;
Func<object, string> transform = x => ProcessItem((int) x);
foreach(var item in source)
{
var tmp = outstandingItem;
// note: passed in as "state", not captured, so not a foreach/capture bug
outstandingItem = new Task<string>(transform, item);
outstandingItem.Start();
if (tmp != null) yield return tmp.Result;
}
if (outstandingItem != null) yield return outstandingItem.Result;
}
static string ProcessItem(int i)
{
return i.ToString();
}

I don't want to be offensive, but maybe there is a lack of understanding. Parallel.ForEach means that the TPL will run the foreach according to the available hardware in several threads. But that means, that ii is possible to do that work in parallel! yield return gives you the opportunity to get some values out of a list (or what-so-ever) and give them back one-by-one as they are needed. It prevents of the need to first find all items matching the condition and then iterate over them. That is indeed a performance advantage, but can't be done in parallel.

Although the question is old I've managed to do something just for fun.
class Program
{
static void Main(string[] args)
{
foreach (var message in GetMessages())
{
Console.WriteLine(message);
}
}
// Parallel yield
private static IEnumerable<string> GetMessages()
{
int total = 0;
bool completed = false;
var batches = Enumerable.Range(1, 100).Select(i => new Computer() { Id = i });
var qu = new ConcurrentQueue<Computer>();
Task.Run(() =>
{
try
{
Parallel.ForEach(batches,
() => 0,
(item, loop, subtotal) =>
{
Thread.Sleep(1000);
qu.Enqueue(item);
return subtotal + 1;
},
result => Interlocked.Add(ref total, result));
}
finally
{
completed = true;
}
});
int current = 0;
while (current < total || !completed)
{
SpinWait.SpinUntil(() => current < total || completed);
if (current == total) yield break;
current++;
qu.TryDequeue(out Computer computer);
yield return $"Completed {computer.Id}";
}
}
}
public class Computer
{
public int Id { get; set; }
}
Compared to Koray's answer this one really uses all the CPU cores.

You can use the following extension method
public static class ParallelExtensions
{
public static IEnumerable<T1> OrderedParallel<T, T1>(this IEnumerable<T> list, Func<T, T1> action)
{
var unorderedResult = new ConcurrentBag<(long, T1)>();
Parallel.ForEach(list, (o, state, i) =>
{
unorderedResult.Add((i, action.Invoke(o)));
});
var ordered = unorderedResult.OrderBy(o => o.Item1);
return ordered.Select(o => o.Item2);
}
}
use like:
public void FillLogs(IEnumerable<IComputer> computers)
{
cpt.Logs = computers.OrderedParallel(o => o.GetRawLogs()).ToList();
}
Hope this will save you some time.

How about
Queue<string> qu = new Queue<string>();
bool finished = false;
Task.Factory.StartNew(() =>
{
Parallel.ForEach(get_list(), (item) =>
{
string itemToReturn = heavyWorkOnItem(item);
lock (qu)
qu.Enqueue(itemToReturn );
});
finished = true;
});
while (!finished)
{
lock (qu)
while (qu.Count > 0)
yield return qu.Dequeue();
//maybe a thread sleep here?
}
Edit:
I think this is better:
public static IEnumerable<TOutput> ParallelYieldReturn<TSource, TOutput>(this IEnumerable<TSource> source, Func<TSource, TOutput> func)
{
ConcurrentQueue<TOutput> qu = new ConcurrentQueue<TOutput>();
bool finished = false;
AutoResetEvent re = new AutoResetEvent(false);
Task.Factory.StartNew(() =>
{
Parallel.ForEach(source, (item) =>
{
qu.Enqueue(func(item));
re.Set();
});
finished = true;
re.Set();
});
while (!finished)
{
re.WaitOne();
while (qu.Count > 0)
{
TOutput res;
if (qu.TryDequeue(out res))
yield return res;
}
}
}
Edit2: I agree with the short No answer. This code is useless; you cannot break the yield loop.

Related

How to await multiple IAsyncEnumerable

We have code like this:
var intList = new List<int>{1,2,3};
var asyncEnumerables = intList.Select(Foo);
private async IAsyncEnumerable<int> Foo(int a)
{
while (true)
{
await Task.Delay(5000);
yield return a;
}
}
I need to start await foreach for every asyncEnumerable's entry. Every loop iteration should wait each other, and when every iteration is done i need to collect every iteration's data and process that by another method.
Can i somehow achieve that by TPL? Otherwise, couldn't you give me some ideas?

What works for me is the Zip function in this repo (81 line)
I'm using it like this
var intList = new List<int> { 1, 2, 3 };
var asyncEnumerables = intList.Select(RunAsyncIterations);
var enumerableToIterate = async_enumerable_dotnet.AsyncEnumerable.Zip(s => s, asyncEnumerables.ToArray());
await foreach (int[] enumerablesConcatenation in enumerableToIterate)
{
Console.WriteLine(enumerablesConcatenation.Sum()); //Sum returns 6
await Task.Delay(2000);
}
static async IAsyncEnumerable<int> RunAsyncIterations(int i)
{
while (true)
yield return i;
}

Here is a generic method Zip you could use, implemented as an iterator. The cancellationToken is decorated with the EnumeratorCancellation attribute, so that the resulting IAsyncEnumerable is WithCancellation friendly.
using System.Runtime.CompilerServices;
public static async IAsyncEnumerable<TSource[]> Zip<TSource>(
IEnumerable<IAsyncEnumerable<TSource>> sources,
[EnumeratorCancellation]CancellationToken cancellationToken = default)
{
var enumerators = sources
.Select(x => x.GetAsyncEnumerator(cancellationToken))
.ToArray();
try
{
while (true)
{
var array = new TSource[enumerators.Length];
for (int i = 0; i < enumerators.Length; i++)
{
if (!await enumerators[i].MoveNextAsync()) yield break;
array[i] = enumerators[i].Current;
}
yield return array;
}
}
finally
{
foreach (var enumerator in enumerators)
{
await enumerator.DisposeAsync();
}
}
}
Usage example:
await foreach (int[] result in Zip(asyncEnumerables))
{
Console.WriteLine($"Result: {String.Join(", ", result)}");
}

Concatenate two infinite C# IEnumerables together in no particular order

I have two methods that each return infinite IEnumerable that never ends. I want to concatenate them so whenever any of the IEnumerables return a value, I can instantly get and process it.
static void Main(string[] args)
{
var streamOfBoth = get1().Concat(get2());
foreach(var item in streamOfBoth)
{
Console.WriteLine(item);
// I'd expect mixed numbers 1 and 2
// Instead I receive only 1s
}
}
static IEnumerable<int> get1()
{
while (true)
{
System.Threading.Thread.Sleep(1000);
yield return 1;
}
}
static IEnumerable<int> get2()
{
while (true)
{
System.Threading.Thread.Sleep(200);
yield return 2;
}
}
Is there a way to do this with IEnumerables without having to use threads?

This is fairly easily achieved with System.Reactive
static void Main()
{
get1().ToObservable(TaskPoolScheduler.Default).Subscribe(Print);
get2().ToObservable(TaskPoolScheduler.Default).Subscribe(Print);
}
static void Print(int i)
{
Console.WriteLine(i);
}
static IEnumerable<int> get1()
{
while (true)
{
System.Threading.Thread.Sleep(1000);
yield return 1;
}
}
static IEnumerable<int> get2()
{
while (true)
{
System.Threading.Thread.Sleep(200);
yield return 2;
}
}
This produces the following output on my machine:
2 2 2 2 1 2 2 2 2 2 1 2 2 2 2 2 1 2 2 2 2 2 1 2 2 2 2 2 1 2 2 2 2 2 1 ...
Note that ToObservable is called with the argument TaskPoolScheduler.Default; just calling ToObservable without it will result in synchronous execution, meaning it will keep enumerating the first sequence forever and never get to the second one.

You may want to interleave get1 and get2 (take item from get1, then from get2, then again from get1 and then from get2 etc.). Generalized (IEnumerable<T> are not necessary infinite and of the same size) Interleave<T> extension method can be like this:
public static partial class EnumerableExtensions {
public static IEnumerable<T> Interleave<T>(this IEnumerable<T> source,
params IEnumerable<T>[] others) {
if (null == source)
throw new ArgumentNullException(nameof(source));
else if (null == others)
throw new ArgumentNullException(nameof(others));
IEnumerator<T>[] enums = new IEnumerator<T>[] { source.GetEnumerator() }
.Concat(others
.Where(item => item != null)
.Select(item => item.GetEnumerator()))
.ToArray();
try {
bool hasValue = true;
while (hasValue) {
hasValue = false;
for (int i = 0; i < enums.Length; ++i) {
if (enums[i] != null && enums[i].MoveNext()) {
hasValue = true;
yield return enums[i].Current;
}
else {
enums[i].Dispose();
enums[i] = null;
}
}
}
}
finally {
for (int i = enums.Length - 1; i >= 0; --i)
if (enums[i] != null)
enums[i].Dispose();
}
}
}
Then use it:
var streamOfBoth = get1().Interleave(get2());
foreach(var item in streamOfBoth)
{
Console.WriteLine(item);
}
Edit: if
"whenever any ... return a value, I can instantly get and process"
is crucial phrase in your question you can try BlockingCollection and implement producer-consumer pattern:
static BlockingCollection<int> streamOfBoth = new BlockingCollection<int>();
// Producer #1
static void get1() {
while (true) {
System.Threading.Thread.Sleep(1000);
streamOfBoth.Add(1); // value (1) is ready and pushed into streamOfBoth
}
}
// Producer #2
static void get2() {
while (true) {
System.Threading.Thread.Sleep(200);
streamOfBoth.Add(2); // value (2) is ready and pushed into streamOfBoth
}
}
...
Task.Run(() => get1()); // Start producer #1
Task.Run(() => get2()); // Start producer #2
...
// Cosumer: when either Producer #1 or Producer #2 create a value
// consumer can starts process it
foreach(var item in streamOfBoth.GetConsumingEnumerable()) {
Console.WriteLine(item);
}

Here is a generic Merge method that merges IEnumerables. Each IEnumerable is enumerated in a dedicated thread.
using System.Reactive.Linq;
using System.Reactive.Concurrency;
public static IEnumerable<T> Merge<T>(params IEnumerable<T>[] sources)
{
IEnumerable<IObservable<T>> observables = sources
.Select(source => source.ToObservable(NewThreadScheduler.Default));
IObservable<T> merged = Observable.Merge(observables);
return merged.ToEnumerable();
}
Usage example:
var streamOfBoth = Merge(get1(), get2());
Enumerating the resulting IEnumerable will block the current thread until the enumeration is finished.
This implementation depends on the System.Reactive and System.Interactive.Async packages.

Observable wait until no more changes for time x and then notify [duplicate]

I'm using reactive extensions to collate data into buffers of 100ms:
this.subscription = this.dataService
.Where(x => !string.Equals("FOO", x.Key.Source))
.Buffer(TimeSpan.FromMilliseconds(100))
.ObserveOn(this.dispatcherService)
.Where(x => x.Count != 0)
.Subscribe(this.OnBufferReceived);
This works fine. However, I want slightly different behavior than that provided by the Buffer operation. Essentially, I want to reset the timer if another data item is received. Only when no data has been received for the entire 100ms do I want to handle it. This opens up the possibility of never handling the data, so I should also be able to specify a maximum count. I would imagine something along the lines of:
.SlidingBuffer(TimeSpan.FromMilliseconds(100), 10000)
I've had a look around and haven't been able to find anything like this in Rx? Can anyone confirm/deny this?

This is possible by combining the built-in Window and Throttle methods of Observable. First, let's solve the simpler problem where we ignore the maximum count condition:
public static IObservable<IList<T>> BufferUntilInactive<T>(this IObservable<T> stream, TimeSpan delay)
{
var closes = stream.Throttle(delay);
return stream.Window(() => closes).SelectMany(window => window.ToList());
}
The powerful Window method did the heavy lifting. Now it's easy enough to see how to add a maximum count:
public static IObservable<IList<T>> BufferUntilInactive<T>(this IObservable<T> stream, TimeSpan delay, Int32? max=null)
{
var closes = stream.Throttle(delay);
if (max != null)
{
var overflows = stream.Where((x,index) => index+1>=max);
closes = closes.Merge(overflows);
}
return stream.Window(() => closes).SelectMany(window => window.ToList());
}
I'll write a post explaining this on my blog. https://gist.github.com/2244036
Documentation for the Window method:
http://leecampbell.blogspot.co.uk/2011/03/rx-part-9join-window-buffer-and-group.html
http://enumeratethis.com/2011/07/26/financial-charts-reactive-extensions/

I wrote an extension to do most of what you're after - BufferWithInactivity.
Here it is:
public static IObservable<IEnumerable<T>> BufferWithInactivity<T>(
this IObservable<T> source,
TimeSpan inactivity,
int maximumBufferSize)
{
return Observable.Create<IEnumerable<T>>(o =>
{
var gate = new object();
var buffer = new List<T>();
var mutable = new SerialDisposable();
var subscription = (IDisposable)null;
var scheduler = Scheduler.ThreadPool;
Action dump = () =>
{
var bts = buffer.ToArray();
buffer = new List<T>();
if (o != null)
{
o.OnNext(bts);
}
};
Action dispose = () =>
{
if (subscription != null)
{
subscription.Dispose();
}
mutable.Dispose();
};
Action<Action<IObserver<IEnumerable<T>>>> onErrorOrCompleted =
onAction =>
{
lock (gate)
{
dispose();
dump();
if (o != null)
{
onAction(o);
}
}
};
Action<Exception> onError = ex =>
onErrorOrCompleted(x => x.OnError(ex));
Action onCompleted = () => onErrorOrCompleted(x => x.OnCompleted());
Action<T> onNext = t =>
{
lock (gate)
{
buffer.Add(t);
if (buffer.Count == maximumBufferSize)
{
dump();
mutable.Disposable = Disposable.Empty;
}
else
{
mutable.Disposable = scheduler.Schedule(inactivity, () =>
{
lock (gate)
{
dump();
}
});
}
}
};
subscription =
source
.ObserveOn(scheduler)
.Subscribe(onNext, onError, onCompleted);
return () =>
{
lock (gate)
{
o = null;
dispose();
}
};
});
}

With Rx Extensions 2.0, your can answer both requirements with a new Buffer overload accepting a timeout and a size:
this.subscription = this.dataService
.Where(x => !string.Equals("FOO", x.Key.Source))
.Buffer(TimeSpan.FromMilliseconds(100), 1)
.ObserveOn(this.dispatcherService)
.Where(x => x.Count != 0)
.Subscribe(this.OnBufferReceived);
See https://msdn.microsoft.com/en-us/library/hh229200(v=vs.103).aspx for the documentation.

I guess this can be implemented on top of Buffer method as shown below:
public static IObservable<IList<T>> SlidingBuffer<T>(this IObservable<T> obs, TimeSpan span, int max)
{
return Observable.CreateWithDisposable<IList<T>>(cl =>
{
var acc = new List<T>();
return obs.Buffer(span)
.Subscribe(next =>
{
if (next.Count == 0) //no activity in time span
{
cl.OnNext(acc);
acc.Clear();
}
else
{
acc.AddRange(next);
if (acc.Count >= max) //max items collected
{
cl.OnNext(acc);
acc.Clear();
}
}
}, err => cl.OnError(err), () => { cl.OnNext(acc); cl.OnCompleted(); });
});
}
NOTE: I haven't tested it, but I hope it gives you the idea.

Colonel Panic's solution is almost perfect. The only thing that is missing is a Publish component, in order to make the solution work with cold sequences too.
/// <summary>
/// Projects each element of an observable sequence into a buffer that's sent out
/// when either a given inactivity timespan has elapsed, or it's full,
/// using the specified scheduler to run timers.
/// </summary>
public static IObservable<IList<T>> BufferUntilInactive<T>(
this IObservable<T> source, TimeSpan dueTime, int maxCount,
IScheduler scheduler = default)
{
if (maxCount < 1) throw new ArgumentOutOfRangeException(nameof(maxCount));
scheduler ??= Scheduler.Default;
return source.Publish(published =>
{
var combinedBoundaries = Observable.Merge
(
published.Throttle(dueTime, scheduler),
published.Skip(maxCount - 1)
);
return published
.Window(() => combinedBoundaries)
.SelectMany(window => window.ToList());
});
}
Beyond adding the Publish, I've also replaced the original .Where((_, index) => index + 1 >= maxCount) with the equivalent but shorter .Skip(maxCount - 1). For completeness there is also an IScheduler parameter, which configures the scheduler where the timer is run.

Make using statement usable for multiple disposable objects

I have a bunch of text files in a folder, and all of them should have identical headers. In other words the first 100 lines of all files should be identical. So I wrote a function to check this condition:
private static bool CheckHeaders(string folderPath, int headersCount)
{
var enumerators = Directory.EnumerateFiles(folderPath)
.Select(f => File.ReadLines(f).GetEnumerator())
.ToArray();
//using (enumerators)
//{
for (int i = 0; i < headersCount; i++)
{
foreach (var e in enumerators)
{
if (!e.MoveNext()) return false;
}
var values = enumerators.Select(e => e.Current);
if (values.Distinct().Count() > 1) return false;
}
return true;
//}
}
The reason I am using enumerators is memory efficiency. Instead of loading all file contents in memory I enumerate the files concurrently line-by-line until a mismatch is found, or all headers have been examined.
My problem is evident by the commented lines of code. I would like to utilize a using block to safely dispose all the enumerators, but unfortunately using (enumerators) doesn't compile. Apparently using can handle only a single disposable object. I know that I can dispose the enumerators manually, by wrapping the whole thing in a try-finally block, and running the disposing logic in a loop inside finally, but is seems awkward. Is there any mechanism I could employ to make the using statement a viable option in this case?
Update
I just realized that my function has a serious flaw. The construction of the enumerators is not robust. A locked file can cause an exception, while some enumerators have already been created. These enumerators will not be disposed. This is something I want to fix. I am thinking about something like this:
var enumerators = Directory.EnumerateFiles(folderPath)
.ToDisposables(f => File.ReadLines(f).GetEnumerator());
The extension method ToDisposables should ensure that in case of an exception no disposables are left undisposed.

You can create a disposable-wrapper over your enumerators:
class DisposableEnumerable : IDisposable
{
private IEnumerable<IDisposable> items;
public event UnhandledExceptionEventHandler DisposalFailed;
public DisposableEnumerable(IEnumerable<IDisposable> items) => this.items = items;
public void Dispose()
{
foreach (var item in items)
{
try
{
item.Dispose();
}
catch (Exception e)
{
var tmp = DisposalFailed;
tmp?.Invoke(this, new UnhandledExceptionEventArgs(e, false));
}
}
}
}
and use it with the lowest impact to your code:
private static bool CheckHeaders(string folderPath, int headersCount)
{
var enumerators = Directory.EnumerateFiles(folderPath)
.Select(f => File.ReadLines(f).GetEnumerator())
.ToArray();
using (var disposable = new DisposableEnumerable(enumerators))
{
for (int i = 0; i < headersCount; i++)
{
foreach (var e in enumerators)
{
if (!e.MoveNext()) return false;
}
var values = enumerators.Select(e => e.Current);
if (values.Distinct().Count() > 1) return false;
}
return true;
}
}
The thing is you have to dispose those objects separately one by one anyway. But it's up to you where to encapsulate that logic. And the code I've suggested has no manual try-finally,)

To the second part of the question. If I get you right this should be sufficient:
static class DisposableHelper
{
public static IEnumerable<TResult> ToDisposable<TSource, TResult>(this IEnumerable<TSource> source,
Func<TSource, TResult> selector) where TResult : IDisposable
{
var exceptions = new List<Exception>();
var result = new List<TResult>();
foreach (var i in source)
{
try { result.Add(selector(i)); }
catch (Exception e) { exceptions.Add(e); }
}
if (exceptions.Count == 0)
return result;
foreach (var i in result)
{
try { i.Dispose(); }
catch (Exception e) { exceptions.Add(e); }
}
throw new AggregateException(exceptions);
}
}
Usage:
private static bool CheckHeaders(string folderPath, int headersCount)
{
var enumerators = Directory.EnumerateFiles(folderPath)
.ToDisposable(f => File.ReadLines(f).GetEnumerator())
.ToArray();
using (new DisposableEnumerable(enumerators))
{
for (int i = 0; i < headersCount; i++)
{
foreach (var e in enumerators)
{
if (!e.MoveNext()) return false;
}
var values = enumerators.Select(e => e.Current);
if (values.Distinct().Count() > 1) return false;
}
return true;
}
}
and
try
{
CheckHeaders(folderPath, headersCount);
}
catch(AggregateException e)
{
// Prompt to fix errors and try again
}

I'm going to suggest an approach that uses recursive calls to Zip to allow parallel enumeration of a normal IEnumerable<string> without the need to resort to using IEnumerator<string>.
bool Zipper(IEnumerable<IEnumerable<string>> sources, int take)
{
IEnumerable<string> ZipperImpl(IEnumerable<IEnumerable<string>> ss)
=> (!ss.Skip(1).Any())
? ss.First().Take(take)
: ss.First().Take(take).Zip(
ZipperImpl(ss.Skip(1)),
(x, y) => (x == null || y == null || x != y) ? null : x);
var matching_lines = ZipperImpl(sources).TakeWhile(x => x != null).ToArray();
return matching_lines.Length == take;
}
Now build up your enumerables:
IEnumerable<string>[] enumerables =
Directory
.EnumerateFiles(folderPath)
.Select(f => File.ReadLines(f))
.ToArray();
Now it's simple to call:
bool headers_match = Zipper(enumerables, 100);
Here's a trace of running this code against three files with more than 4 lines:
Ben Petering at 5:28 PM ACST
Ben Petering at 5:28 PM ACST
Ben Petering at 5:28 PM ACST
From a call 2019-05-23, James mentioned he’d like the ability to edit the current shipping price rules (eg in shipping_rules.xml) via the admin.
From a call 2019-05-23, James mentioned he’d like the ability to edit the current shipping price rules (eg in shipping_rules.xml) via the admin.
From a call 2019-05-23, James mentioned he’d like the ability to edit the current shipping price rules (eg in shipping_rules.xml) via the admin.
He also mentioned he’d like to be able to set different shipping price rules for a given time window, e.g. Jan 1 to Jan 30.
He also mentioned he’d like to be able to set different shipping price rules for a given time window, e.g. Jan 1 to Jan 30.
He also mentioned he’d like to be able to set different shipping price rules for a given time window, e.g. Jan 1 to Jan 30.
These storyishes should be considered when choosing the appropriate module to use.
These storyishes should be considered when choosing the appropriate module to use.X
These storyishes should be considered when choosing the appropriate module to use.
Note that the enumerations stop when they encountered a mismatch header in the 4th line on the second file. All enumerations then stopped.

Creating an IDisposable wrapper as #Alex suggested is correct. It needs just a logic to dispose already opened files if some of them is locked and probably some logic for error states. Maybe something like this (error state logic is very simple):
public class HeaderChecker : IDisposable
{
private readonly string _folderPath;
private readonly int _headersCount;
private string _lockedFile;
private readonly List<IEnumerator<string>> _files = new List<IEnumerator<string>>();
public HeaderChecker(string folderPath, int headersCount)
{
_folderPath = folderPath;
_headersCount = headersCount;
}
public string LockedFile => _lockedFile;
public bool CheckFiles()
{
_lockedFile = null;
if (!TryOpenFiles())
{
return false;
}
if (_files.Count == 0)
{
return true; // Not sure what to return here.
}
for (int i = 0; i < _headersCount; i++)
{
if (!_files[0].MoveNext()) return false;
string currentLine = _files[0].Current;
for (int fileIndex = 1; fileIndex < _files.Count; fileIndex++)
{
if (!_files[fileIndex].MoveNext()) return false;
if (_files[fileIndex].Current != currentLine) return false;
}
}
return true;
}
private bool TryOpenFiles()
{
bool result = true;
foreach (string file in Directory.EnumerateFiles(_folderPath))
{
try
{
_files.Add(File.ReadLines(file).GetEnumerator());
}
catch
{
_lockedFile = file;
result = false;
break;
}
}
if (!result)
{
DisposeCore(); // Close already opened files.
}
return result;
}
private void DisposeCore()
{
foreach (var item in _files)
{
try
{
item.Dispose();
}
catch
{
}
}
_files.Clear();
}
public void Dispose()
{
DisposeCore();
}
}
// Usage
using (var checker = new HeaderChecker(folderPath, headersCount))
{
if (!checker.CheckFiles())
{
if (checker.LockedFile is null)
{
// Error while opening files.
}
else
{
// Headers do not match.
}
}
}
I also removed .Select() and .Distinct() when checking the lines. The first just iterates over the enumerators array - the same as foreach above it, so you are enumerating this array twice. Then creates a new list of lines and .Distinct() enumerates over it.

Pause and Resume Subscription on cold IObservable

Using Rx, I desire pause and resume functionality in the following code:
How to implement Pause() and Resume() ?
static IDisposable _subscription;
static void Main(string[] args)
{
Subscribe();
Thread.Sleep(500);
// Second value should not be shown after two seconds:
Pause();
Thread.Sleep(5000);
// Continue and show second value and beyond now:
Resume();
}
static void Subscribe()
{
var list = new List<int> { 1, 2, 3, 4, 5 };
var obs = list.ToObservable();
_subscription = obs.SubscribeOn(Scheduler.NewThread).Subscribe(p =>
{
Console.WriteLine(p.ToString());
Thread.Sleep(2000);
},
err => Console.WriteLine("Error"),
() => Console.WriteLine("Sequence Completed")
);
}
static void Pause()
{
// Pseudocode:
//_subscription.Pause();
}
static void Resume()
{
// Pseudocode:
//_subscription.Resume();
}
Rx Solution?
I believe I could make it work with some kind of Boolean field gating combined with thread locking (Monitor.Wait and Monitor.Pulse)
But is there an Rx operator or some other reactive shorthand to achieve the same aim?

Here's a reasonably simple Rx way to do what you want. I've created an extension method called Pausable that takes a source observable and a second observable of boolean that pauses or resumes the observable.
public static IObservable<T> Pausable<T>(
this IObservable<T> source,
IObservable<bool> pauser)
{
return Observable.Create<T>(o =>
{
var paused = new SerialDisposable();
var subscription = Observable.Publish(source, ps =>
{
var values = new ReplaySubject<T>();
Func<bool, IObservable<T>> switcher = b =>
{
if (b)
{
values.Dispose();
values = new ReplaySubject<T>();
paused.Disposable = ps.Subscribe(values);
return Observable.Empty<T>();
}
else
{
return values.Concat(ps);
}
};
return pauser.StartWith(false).DistinctUntilChanged()
.Select(p => switcher(p))
.Switch();
}).Subscribe(o);
return new CompositeDisposable(subscription, paused);
});
}
It can be used like this:
var xs = Observable.Generate(
0,
x => x < 100,
x => x + 1,
x => x,
x => TimeSpan.FromSeconds(0.1));
var bs = new Subject<bool>();
var pxs = xs.Pausable(bs);
pxs.Subscribe(x => { /* Do stuff */ });
Thread.Sleep(500);
bs.OnNext(true);
Thread.Sleep(5000);
bs.OnNext(false);
Thread.Sleep(500);
bs.OnNext(true);
Thread.Sleep(5000);
bs.OnNext(false);
It should be fairly easy for you to put this in your code with the Pause & Resume methods.

Here it is as an application of IConnectableObservable that I corrected slightly for the newer api (original here):
public static class ObservableHelper {
public static IConnectableObservable<TSource> WhileResumable<TSource>(Func<bool> condition, IObservable<TSource> source) {
var buffer = new Queue<TSource>();
var subscriptionsCount = 0;
var isRunning = System.Reactive.Disposables.Disposable.Create(() => {
lock (buffer)
{
subscriptionsCount--;
}
});
var raw = Observable.Create<TSource>(subscriber => {
lock (buffer)
{
subscriptionsCount++;
if (subscriptionsCount == 1)
{
while (buffer.Count > 0) {
subscriber.OnNext(buffer.Dequeue());
}
Observable.While(() => subscriptionsCount > 0 && condition(), source)
.Subscribe(
v => { if (subscriptionsCount == 0) buffer.Enqueue(v); else subscriber.OnNext(v); },
e => subscriber.OnError(e),
() => { if (subscriptionsCount > 0) subscriber.OnCompleted(); }
);
}
}
return isRunning;
});
return raw.Publish();
}
}

Here is my answer. I believe there may be a race condition around pause resume, however this can be mitigated by serializing all activity onto a scheduler. (favor Serializing over synchronizing).
using System;
using System.Reactive.Concurrency;
using System.Reactive.Disposables;
using System.Reactive.Linq;
using System.Reactive.Subjects;
using Microsoft.Reactive.Testing;
using NUnit.Framework;
namespace StackOverflow.Tests.Q7620182_PauseResume
{
[TestFixture]
public class PauseAndResumeTests
{
[Test]
public void Should_pause_and_resume()
{
//Arrange
var scheduler = new TestScheduler();
var isRunningTrigger = new BehaviorSubject<bool>(true);
Action pause = () => isRunningTrigger.OnNext(false);
Action resume = () => isRunningTrigger.OnNext(true);
var source = scheduler.CreateHotObservable(
ReactiveTest.OnNext(0.1.Seconds(), 1),
ReactiveTest.OnNext(2.0.Seconds(), 2),
ReactiveTest.OnNext(4.0.Seconds(), 3),
ReactiveTest.OnNext(6.0.Seconds(), 4),
ReactiveTest.OnNext(8.0.Seconds(), 5));
scheduler.Schedule(TimeSpan.FromSeconds(0.5), () => { pause(); });
scheduler.Schedule(TimeSpan.FromSeconds(5.0), () => { resume(); });
//Act
var sut = Observable.Create<IObservable<int>>(o =>
{
var current = source.Replay();
var connection = new SerialDisposable();
connection.Disposable = current.Connect();
return isRunningTrigger
.DistinctUntilChanged()
.Select(isRunning =>
{
if (isRunning)
{
//Return the current replayed values.
return current;
}
else
{
//Disconnect and replace current.
current = source.Replay();
connection.Disposable = current.Connect();
//yield silence until the next time we resume.
return Observable.Never<int>();
}
})
.Subscribe(o);
}).Switch();
var observer = scheduler.CreateObserver<int>();
using (sut.Subscribe(observer))
{
scheduler.Start();
}
//Assert
var expected = new[]
{
ReactiveTest.OnNext(0.1.Seconds(), 1),
ReactiveTest.OnNext(5.0.Seconds(), 2),
ReactiveTest.OnNext(5.0.Seconds(), 3),
ReactiveTest.OnNext(6.0.Seconds(), 4),
ReactiveTest.OnNext(8.0.Seconds(), 5)
};
CollectionAssert.AreEqual(expected, observer.Messages);
}
}
}

It just works:
class SimpleWaitPulse
{
static readonly object _locker = new object();
static bool _go;
static void Main()
{ // The new thread will block
new Thread (Work).Start(); // because _go==false.
Console.ReadLine(); // Wait for user to hit Enter
lock (_locker) // Let's now wake up the thread by
{ // setting _go=true and pulsing.
_go = true;
Monitor.Pulse (_locker);
}
}
static void Work()
{
lock (_locker)
while (!_go)
Monitor.Wait (_locker); // Lock is released while we’re waiting
Console.WriteLine ("Woken!!!");
}
}
Please, see How to Use Wait and Pulse for more details

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Parallel.Foreach + yield return? - c#

Related

How to await multiple IAsyncEnumerable

Concatenate two infinite C# IEnumerables together in no particular order

Observable wait until no more changes for time x and then notify [duplicate]

Make using statement usable for multiple disposable objects

Pause and Resume Subscription on cold IObservable

Categories

Resources