Rx on WinRT: Iterate collection over time

I want to iterate a collection by moving from element to element at specific time intervals. So, for example, this method works fine:
var a = new List<int> { 1, 2, 3, 4}.ToObservable();
var b = Observable.Interval(TimeSpan.FromSeconds(1));
var c = a.Zip(b, (l, r) => l);
c.Subscribe(x => Debug.WriteLine(x));
But I would like to use the value of each element in the list as the interval, so I am using this code:
var a = new List<int> { 1, 2, 3, 4}.ToObservable();
var b = a.Delay(x => Observable.Timer(TimeSpan.FromSeconds(x)));
b.Subscribe(x => Debug.WriteLine(x));
As stated here http://blogs.msdn.com/b/rxteam/archive/2012/03/12/reactive-extensions-v2-0-beta-available-now.aspx, "the new overloads to Delay allow one to specify (optionally) the delay for the subscription as well as the delay for each element based on a selector function". But running the code does not work as expected. It just spits out the elements of the list at 1-second intervals, like so:
...(1 sec)
1
...(1 sec)
2
...(1 sec)
3
...(1 sec)
4
Instead of
...(1 sec)
1
...(2 sec)
2
...(3 sec)
3
...(4 sec)
4
Am I missing something?

How about this:
new[] { 1, 2, 3, 4, 5 }.ToObservable()
    .Select(x => Observable.Return(x).Delay(TimeSpan.FromSeconds(x)))
    .Concat();
Right? You want to use the number as both the value and the amount of time to delay? This works because Concat subscribes to each inner observable only after the previous one has completed, so the delays accumulate. Delay on the source sequence, by contrast, starts a timer for each element as soon as it arrives - and ToObservable emits all elements immediately, which is why you saw emissions at 1, 2, 3 and 4 seconds, i.e. fixed 1-second gaps.
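A quick way to check the cumulative delays (a sketch; the Timestamp operator is only there to make the emission times visible in the debug output):
new[] { 1, 2, 3, 4 }.ToObservable()
    .Select(x => Observable.Return(x).Delay(TimeSpan.FromSeconds(x)))
    .Concat()
    .Timestamp()
    .Subscribe(x => Debug.WriteLine($"{x.Value} at {x.Timestamp:HH:mm:ss.fff}"));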

Here is a solution that advances the enumerator of the source only one step at a time before waiting again (it works with infinite source sequences):
static IObservable<T> ToObservableDelay<T>(
    IEnumerable<T> source,
    Func<T, TimeSpan> delaySelector,
    IScheduler scheduler)
{
    return Observable.Create<T>(o =>
        scheduler.ScheduleAsync(async (s, c) =>
        {
            try
            {
                foreach (var x in source)
                {
                    // Wait first, then emit; the enumerator is only advanced
                    // after the previous delay has elapsed.
                    await Task.Delay(delaySelector(x), c);
                    o.OnNext(x);
                }
                o.OnCompleted();
            }
            catch (TaskCanceledException) { /* unsubscribed - ignore */ }
            catch (Exception e)
            {
                o.OnError(e);
            }
        }));
}
And here is a little demo with an infinite generator:
static IEnumerable<double> Uniform(Random rng)
{
    while (true)
        yield return rng.NextDouble();
}
static void Main(string[] args)
{
    var source = Uniform(new Random());
    Console.WriteLine("press any key to quit");
    using (var subscription =
        ToObservableDelay(source, TimeSpan.FromSeconds, Scheduler.Default)
            .Subscribe(Console.WriteLine))
    {
        Console.ReadKey();
    }
}
Simple demo code that runs in a WinRT app:
var source = Uniform(new Random());
ToObservableDelay(source, TimeSpan.FromSeconds, Scheduler.Default)
    .Take(10)
    .Subscribe(x => Debug.WriteLine(x));

I found this question while trying to solve the same issue. My solution was to use Observable.Generate together with the IEnumerable's enumerator, like so:
public static IObservable<T> ToObservable<T>(this IEnumerable<T> source, TimeSpan time)
{
    var enumerator = source.GetEnumerator();
    var observable = Observable.Generate(
        enumerator.Current,           // initial state (unused - every lambda closes over the enumerator instead)
        _ => enumerator.MoveNext(),   // condition: advance the enumerator; stop when the source is exhausted
        _ => enumerator.Current,      // iterate
        _ => enumerator.Current,      // result selector
        _ => time);                   // time selector: the fixed delay before each element
    return observable;
}
This extension method is then simply called like this:
var observable = Enumerable.Range(1, 10)
    .ToObservable(TimeSpan.FromSeconds(1));
using (observable.Subscribe(Console.WriteLine))
{
    Console.WriteLine("Press the Any key to abort");
    Console.ReadKey();
}


Why is my multi-threaded code not faster?

I'm running this in a console application:
public void ForEachParallel(Action<TElement> action)
{
    var elements = new Queue<TElement>(_set);
    var tasks = Enumerable.Range(0, _threadCount)
        .Where(index => elements.Any())
        .Select(index => elements.Dequeue())
        .Select(element => Task.Run(() => action(element)))
        .ToList();
    while (tasks.Any())
    {
        var index = Task.WaitAny(tasks.ToArray());
        tasks.RemoveAt(index);
        if (elements.Any())
        {
            var element = elements.Dequeue();
            tasks.Add(Task.Run(() => action(element)));
        }
    }
}
I have an equivalent ForEach method that does all of this serially. I'm using 10 threads, but ForEachParallel takes just as long as ForEach. I have an i7 with 6 cores. Either this has a whole lot of overhead, or it is somehow running these tasks on a single thread.
Each action is an independent read, process, and write.
Here's my test code:
void Main()
{
    Action<int> action = n =>
    {
        Console.Write($" +{n} ");
        Thread.Sleep(TimeSpan.FromSeconds(n + 1));
        Console.Write($" {n}- ");
    };
    ForEachParallel(Enumerable.Range(0, 6), 4, action);
}
public void ForEachParallel<TElement>(IEnumerable<TElement> source, int threadCount, Action<TElement> action)
{
    var elements = new Queue<TElement>(source);
    var tasks = source
        .Take(threadCount)
        .Where(index => elements.Any())
        .Select(index => elements.Dequeue())
        .Select(element => Task.Run(() => action(element)))
        .ToList();
    while (tasks.Any())
    {
        var index = Task.WaitAny(tasks.ToArray());
        tasks.RemoveAt(index);
        if (elements.Any())
        {
            var element = elements.Dequeue();
            tasks.Add(Task.Run(() => action(element)));
        }
    }
}
It's effectively the same as your ForEachParallel, but I've made it more generic.
When I change the threadCount I get differing execution lengths. This is clearly running as expected.
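As an aside, if the manual queue isn't required, a similar bounded fan-out can be had with Parallel.ForEach and a capped MaxDegreeOfParallelism (a sketch, not a drop-in replacement, since Parallel.ForEach blocks the calling thread instead of queuing Task.Run work items):
Parallel.ForEach(
    Enumerable.Range(0, 6),
    new ParallelOptions { MaxDegreeOfParallelism = 4 },
    n => action(n));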

Replace buffered value with latest in TPL Dataflow

I need help with making a TPL Dataflow pipeline update an input buffer with the latest value.
I am subscribed to a live stream of elements, which are posted one by one onto a dataflow pipeline. Each element takes some time to process - significantly longer than it takes to produce one (i.e. fast producer, slow consumer).
However, if there are multiple elements on the input queue with the same identity, only the most recent one needs processing. The intermediate ones can be discarded. This is the part I am having trouble figuring out.
Here is an example of what I am trying to achieve:
public record Bid(int Id, int Value);

async Task Main()
{
    // This block is just here to log that an input is received.
    var startBlock = new TransformBlock<Bid, Bid>(d =>
    {
        Console.WriteLine("Input: {0} ({1})", d.Id, d.Value);
        return d;
    });
    //TODO: Check for duplicate identity (Bid.Id) and replace the
    // current element with the most recent one.
    var updateWithMostRecentBlock = new TransformBlock<Bid, Bid>(d => d);
    var processBlock = new TransformBlock<Bid, Bid>(async d =>
    {
        Console.WriteLine("Processing: {0} ({1})", d.Id, d.Value);
        await Task.Delay(1000);
        return d;
    });
    var finishBlock = new ActionBlock<Bid>(d =>
    {
        Console.WriteLine("Done: {0} ({1})", d.Id, d.Value);
    });
    var propagateCompletion = new DataflowLinkOptions { PropagateCompletion = true };
    startBlock.LinkTo(updateWithMostRecentBlock, propagateCompletion);
    updateWithMostRecentBlock.LinkTo(processBlock, propagateCompletion);
    processBlock.LinkTo(finishBlock, propagateCompletion);
    var data = new[]
    {
        new Bid(1, 0), // Processed immediately
        new Bid(1, 1), // Replaced with (1,2)
        new Bid(2, 0), // Replaced with (2,1)
        new Bid(1, 2), // Queued
        new Bid(2, 1)  // Queued
    };
    foreach (var d in data)
        startBlock.Post(d);
    startBlock.Complete();
    await finishBlock.Completion;
}
When processBlock is ready to receive the next element, I want updateWithMostRecentBlock to provide only the most relevant element.
Actual output:
Input: 1 (0)
Input: 1 (1)
Input: 2 (0)
Input: 1 (2)
Input: 2 (1)
Processing: 1 (0)
Processing: 1 (1)
Done: 1 (0)
Processing: 2 (0)
Done: 1 (1)
Processing: 1 (2)
Done: 2 (0)
Processing: 2 (1)
Done: 1 (2)
Done: 2 (1)
Expected output:
Input: 1 (0) // Immediately processed
Input: 1 (1) // Replaced by (1,2)
Input: 2 (0) // Replaced by (2,1)
Input: 1 (2) // Queued
Input: 2 (1) // Queued
Processing: 1 (0)
Done: 1 (0)
Processing: 1 (2)
Done: 1 (2)
Processing: 2 (1)
Done: 2 (1)
Hint:
Stephen Toub has an elegant solution to the exact opposite of what I'm trying to achieve. His solution rejects all incoming elements and retains the oldest one.
I'm sorry for answering my own question, but @TheGeneral put me on the right track with his hint about bounded capacity.
I had to configure the processBlock with a bounded capacity of 1:
var processBlock = new TransformBlock<Bid, Bid>(
    async d =>
    {
        Console.WriteLine("Processing: {0} ({1})", d.Id, d.Value);
        await Task.Delay(1000);
        return d;
    },
    new ExecutionDataflowBlockOptions
    {
        BoundedCapacity = 1
    });
Then I replaced the updateWithMostRecentBlock with a custom block that has this implementation:
public class DiscardAndReplaceDuplicatesBlock<TValue, TKey> : IPropagatorBlock<TValue, TValue>
    where TKey : IEquatable<TKey>
{
    private readonly ITargetBlock<TValue> _target;
    private readonly IReceivableSourceBlock<TValue> _source;

    public DiscardAndReplaceDuplicatesBlock(Func<TValue, TKey> keyAccessor)
    {
        var buffer = new ConcurrentDictionary<TKey, (TValue Value, Task Task, CancellationTokenSource Token)>();
        var outgoing = new BufferBlock<TValue>(new ExecutionDataflowBlockOptions
        {
            BoundedCapacity = 1,
            MaxMessagesPerTask = 1
        });
        var incoming = new ActionBlock<TValue>(value =>
        {
            var key = keyAccessor(value);
            var cts = new CancellationTokenSource();
            var isQueued = buffer.TryGetValue(key, out var previous);
            if (isQueued)
            {
                buffer.TryRemove(key, out var current);
                Console.WriteLine("Remove: {0}", current.Value);
                if (!previous.Task.IsCompleted)
                {
                    previous.Token.Cancel();
                    previous.Token.Dispose();
                    Console.WriteLine("Cancel: {0}", current.Value);
                }
            }
            var task = outgoing.SendAsync(value, cts.Token);
            if (task.IsCompleted)
            {
                cts.Dispose();
                Console.WriteLine("Sent: {0}", value);
                return;
            }
            buffer.AddOrUpdate(key, (value, task, cts), (k, t) => (value, task, cts));
            Console.WriteLine("Buffered: {0}", value);
        });
        incoming.Completion.ContinueWith(
            async t =>
            {
                if (t.IsFaulted)
                {
                    ((ITargetBlock<TValue>)outgoing).Fault(t.Exception.InnerException);
                }
                else
                {
                    await WaitForBufferToCompleteAsync().ConfigureAwait(false);
                    outgoing.Complete();
                }
            },
            default,
            TaskContinuationOptions.ExecuteSynchronously,
            TaskScheduler.Default);

        Task WaitForBufferToCompleteAsync()
        {
            if (!buffer.Any())
                return Task.CompletedTask;
            var buffered = buffer.Where(kvp => !kvp.Value.Task.IsCompleted);
            var tasks = buffered.Select(b => b.Value.Task);
            return Task.WhenAll(tasks);
        }

        _target = incoming;
        _source = outgoing;
    }

    public Task Completion =>
        _source.Completion;

    public void Complete() =>
        _target.Complete();

    public void Fault(Exception exception) =>
        _target.Fault(exception);

    public IDisposable LinkTo(ITargetBlock<TValue> target, DataflowLinkOptions linkOptions) =>
        _source.LinkTo(target, linkOptions);

    public TValue ConsumeMessage(DataflowMessageHeader messageHeader, ITargetBlock<TValue> target, out bool messageConsumed) =>
        _source.ConsumeMessage(messageHeader, target, out messageConsumed);

    public DataflowMessageStatus OfferMessage(DataflowMessageHeader messageHeader, TValue messageValue, ISourceBlock<TValue>? source, bool consumeToAccept) =>
        _target.OfferMessage(messageHeader, messageValue, source, consumeToAccept);

    public bool ReserveMessage(DataflowMessageHeader messageHeader, ITargetBlock<TValue> target) =>
        _source.ReserveMessage(messageHeader, target);

    public void ReleaseReservation(DataflowMessageHeader messageHeader, ITargetBlock<TValue> target) =>
        _source.ReleaseReservation(messageHeader, target);
}
It is not very pretty, and it is not production tested, but it seems to work. In order to actually replace an already dispatched element, I had to retain the cancellation token used so I could cancel an outdated but unprocessed element. I'm not sure this is the best idea so any critique is welcome!
One note, though: this will also process element (1,1) because, after (1,0) has been dispatched to the processBlock, element (1,1) is successfully sent to the custom block's output buffer. I don't think this can be avoided.
Here is my take on this problem:
public static IPropagatorBlock<TInput, TOutput>
    CreateTransformBlockDropOldestByKey<TInput, TOutput, TKey>(
        Func<TInput, Task<TOutput>> transform,
        Func<TInput, TKey> keySelector,
        ExecutionDataflowBlockOptions dataflowBlockOptions = null,
        IEqualityComparer<TKey> keyComparer = null,
        IProgress<TInput> droppedItems = null)
{
    if (transform == null) throw new ArgumentNullException(nameof(transform));
    if (keySelector == null) throw new ArgumentNullException(nameof(keySelector));
    dataflowBlockOptions = dataflowBlockOptions ?? new ExecutionDataflowBlockOptions();
    keyComparer = keyComparer ?? EqualityComparer<TKey>.Default;

    var dictionary = new Dictionary<TKey, TInput>(keyComparer);

    var outputBlock = new TransformManyBlock<TKey, TOutput>(async key =>
    {
        bool removed; TInput removedItem;
        lock (dictionary) removed = dictionary.Remove(key, out removedItem);
        if (!removed) return Enumerable.Empty<TOutput>();
        return new[] { await transform(removedItem).ConfigureAwait(false) };
    }, dataflowBlockOptions);

    var inputBlock = new ActionBlock<TInput>(item =>
    {
        var key = keySelector(item);
        bool dropped; TInput droppedItem;
        lock (dictionary)
        {
            dropped = dictionary.TryGetValue(key, out droppedItem);
            dictionary[key] = item;
        }
        if (dropped) droppedItems?.Report(droppedItem);
        return outputBlock.SendAsync(key);
    }, new ExecutionDataflowBlockOptions()
    {
        BoundedCapacity = 1,
        CancellationToken = dataflowBlockOptions.CancellationToken,
        TaskScheduler = dataflowBlockOptions.TaskScheduler,
    });

    PropagateCompletion(inputBlock, outputBlock);
    PropagateFailure(outputBlock, inputBlock);
    return DataflowBlock.Encapsulate(inputBlock, outputBlock);

    async void PropagateCompletion(IDataflowBlock source, IDataflowBlock target)
    {
        try { await source.Completion.ConfigureAwait(false); } catch { }
        var ex = source.Completion.IsFaulted ? source.Completion.Exception : null;
        if (ex != null) target.Fault(ex); else target.Complete();
    }

    async void PropagateFailure(IDataflowBlock source, IDataflowBlock target)
    {
        try { await source.Completion.ConfigureAwait(false); } catch { }
        if (source.Completion.IsFaulted) target.Fault(source.Completion.Exception);
    }
}
Usage example:
var droppedItems = new Progress<Bid>(b =>
{
    Console.WriteLine($"Dropped: {b.Id} ({b.Value})");
});
var processBlock = CreateTransformBlockDropOldestByKey<Bid, Bid, int>(async b =>
{
    Console.WriteLine($"Processing: {b.Id} ({b.Value})");
    await Task.Delay(1000);
    return b;
}, b => b.Id, droppedItems: droppedItems);
The reason that the two internal blocks, the inputBlock and the outputBlock, are not linked together directly is that a fault in the outputBlock could otherwise leave the inputBlock hanging in an incomplete state forever. It is important that if one of the two blocks fails, the other fails too, so that any pending SendAsync operation towards the inputBlock is canceled. The blocks are therefore linked together indirectly, using the PropagateCompletion and PropagateFailure methods.
When configuring the processBlock with a BoundedCapacity, take into account that its input queue may contain keys whose items have already been dropped, so setting this option to a slightly higher value than you would otherwise choose is advised.
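For example, a configuration sketch along those lines (the BoundedCapacity of 4 is an arbitrary illustration, not a recommendation):
var processBlock = CreateTransformBlockDropOldestByKey<Bid, Bid, int>(async b =>
{
    Console.WriteLine($"Processing: {b.Id} ({b.Value})");
    await Task.Delay(1000);
    return b;
}, b => b.Id, new ExecutionDataflowBlockOptions { BoundedCapacity = 4 }, droppedItems: droppedItems);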

Observable wait until no more changes for time x and then notify [duplicate]

I'm using reactive extensions to collate data into buffers of 100ms:
this.subscription = this.dataService
    .Where(x => !string.Equals("FOO", x.Key.Source))
    .Buffer(TimeSpan.FromMilliseconds(100))
    .ObserveOn(this.dispatcherService)
    .Where(x => x.Count != 0)
    .Subscribe(this.OnBufferReceived);
This works fine. However, I want slightly different behavior than the Buffer operation provides. Essentially, I want to reset the timer whenever another data item is received; only when no data has been received for the entire 100ms do I want to handle it. This opens up the possibility of never handling the data, so I should also be able to specify a maximum count. I would imagine something along the lines of:
.SlidingBuffer(TimeSpan.FromMilliseconds(100), 10000)
I've had a look around and haven't been able to find anything like this in Rx. Can anyone confirm/deny this?
This is possible by combining the built-in Window and Throttle methods of Observable. First, let's solve the simpler problem where we ignore the maximum count condition:
public static IObservable<IList<T>> BufferUntilInactive<T>(this IObservable<T> stream, TimeSpan delay)
{
    var closes = stream.Throttle(delay);
    return stream.Window(() => closes).SelectMany(window => window.ToList());
}
The powerful Window method did the heavy lifting. Now it's easy enough to see how to add a maximum count:
public static IObservable<IList<T>> BufferUntilInactive<T>(this IObservable<T> stream, TimeSpan delay, Int32? max = null)
{
    var closes = stream.Throttle(delay);
    if (max != null)
    {
        var overflows = stream.Where((x, index) => index + 1 >= max);
        closes = closes.Merge(overflows);
    }
    return stream.Window(() => closes).SelectMany(window => window.ToList());
}
I'll write a post explaining this on my blog. https://gist.github.com/2244036
Documentation for the Window method:
http://leecampbell.blogspot.co.uk/2011/03/rx-part-9join-window-buffer-and-group.html
http://enumeratethis.com/2011/07/26/financial-charts-reactive-extensions/
I wrote an extension to do most of what you're after - BufferWithInactivity.
Here it is:
public static IObservable<IEnumerable<T>> BufferWithInactivity<T>(
    this IObservable<T> source,
    TimeSpan inactivity,
    int maximumBufferSize)
{
    return Observable.Create<IEnumerable<T>>(o =>
    {
        var gate = new object();
        var buffer = new List<T>();
        var mutable = new SerialDisposable();
        var subscription = (IDisposable)null;
        var scheduler = Scheduler.ThreadPool;

        Action dump = () =>
        {
            var bts = buffer.ToArray();
            buffer = new List<T>();
            if (o != null)
            {
                o.OnNext(bts);
            }
        };

        Action dispose = () =>
        {
            if (subscription != null)
            {
                subscription.Dispose();
            }
            mutable.Dispose();
        };

        Action<Action<IObserver<IEnumerable<T>>>> onErrorOrCompleted =
            onAction =>
            {
                lock (gate)
                {
                    dispose();
                    dump();
                    if (o != null)
                    {
                        onAction(o);
                    }
                }
            };

        Action<Exception> onError = ex =>
            onErrorOrCompleted(x => x.OnError(ex));

        Action onCompleted = () => onErrorOrCompleted(x => x.OnCompleted());

        Action<T> onNext = t =>
        {
            lock (gate)
            {
                buffer.Add(t);
                if (buffer.Count == maximumBufferSize)
                {
                    dump();
                    mutable.Disposable = Disposable.Empty;
                }
                else
                {
                    mutable.Disposable = scheduler.Schedule(inactivity, () =>
                    {
                        lock (gate)
                        {
                            dump();
                        }
                    });
                }
            }
        };

        subscription =
            source
                .ObserveOn(scheduler)
                .Subscribe(onNext, onError, onCompleted);

        return () =>
        {
            lock (gate)
            {
                o = null;
                dispose();
            }
        };
    });
}
With Rx Extensions 2.0, you can answer both requirements with a new Buffer overload accepting a timeout and a size:
this.subscription = this.dataService
    .Where(x => !string.Equals("FOO", x.Key.Source))
    .Buffer(TimeSpan.FromMilliseconds(100), 1)
    .ObserveOn(this.dispatcherService)
    .Where(x => x.Count != 0)
    .Subscribe(this.OnBufferReceived);
See https://msdn.microsoft.com/en-us/library/hh229200(v=vs.103).aspx for the documentation.
I guess this can be implemented on top of the Buffer method, as shown below:
public static IObservable<IList<T>> SlidingBuffer<T>(this IObservable<T> obs, TimeSpan span, int max)
{
    return Observable.CreateWithDisposable<IList<T>>(cl =>
    {
        var acc = new List<T>();
        return obs.Buffer(span)
            .Subscribe(next =>
            {
                if (next.Count == 0) // no activity in the time span
                {
                    if (acc.Count > 0)
                    {
                        cl.OnNext(acc.ToList()); // emit a copy, then reset
                        acc.Clear();
                    }
                }
                else
                {
                    acc.AddRange(next);
                    if (acc.Count >= max) // max items collected
                    {
                        cl.OnNext(acc.ToList());
                        acc.Clear();
                    }
                }
            },
            err => cl.OnError(err),
            () => { cl.OnNext(acc.ToList()); cl.OnCompleted(); });
    });
}
NOTE: I haven't tested it, but I hope it gives you the idea.
Colonel Panic's solution is almost perfect. The only thing that is missing is a Publish component, in order to make the solution work with cold sequences too.
/// <summary>
/// Projects each element of an observable sequence into a buffer that's sent out
/// when either a given inactivity timespan has elapsed, or it's full,
/// using the specified scheduler to run timers.
/// </summary>
public static IObservable<IList<T>> BufferUntilInactive<T>(
    this IObservable<T> source, TimeSpan dueTime, int maxCount,
    IScheduler scheduler = default)
{
    if (maxCount < 1) throw new ArgumentOutOfRangeException(nameof(maxCount));
    scheduler ??= Scheduler.Default;
    return source.Publish(published =>
    {
        var combinedBoundaries = Observable.Merge
        (
            published.Throttle(dueTime, scheduler),
            published.Skip(maxCount - 1)
        );
        return published
            .Window(() => combinedBoundaries)
            .SelectMany(window => window.ToList());
    });
}
Beyond adding the Publish, I've also replaced the original .Where((_, index) => index + 1 >= maxCount) with the equivalent but shorter .Skip(maxCount - 1). For completeness there is also an IScheduler parameter, which configures the scheduler where the timer is run.
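Wiring this into the question's original pipeline might look like the following sketch (this.dataService, this.dispatcherService and this.OnBufferReceived are the asker's members):
this.subscription = this.dataService
    .Where(x => !string.Equals("FOO", x.Key.Source))
    .BufferUntilInactive(TimeSpan.FromMilliseconds(100), maxCount: 10000)
    .ObserveOn(this.dispatcherService)
    .Where(x => x.Count != 0)
    .Subscribe(this.OnBufferReceived);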

Restricting the enumerations of LINQ queries to One Only

I have a LINQ query that should NOT be enumerated more than once, and I want to avoid enumerating it twice by mistake. Is there any extension method I can use to ensure that I am protected from such a mistake? I am thinking about something like this:
var numbers = Enumerable.Range(1, 10).OnlyOnce();
Console.WriteLine(numbers.Count()); // shows 10
Console.WriteLine(numbers.Count()); // throws InvalidOperationException: The query cannot be enumerated more than once.
The reason I want this functionality is that I have an enumerable of tasks that is intended to instantiate and run the tasks progressively, while it is enumerated slowly under control. I already made the mistake of running the tasks twice because I forgot that it's a deferred enumerable and not an array.
var tasks = Enumerable.Range(1, 10).Select(n => Task.Run(() => Console.WriteLine(n)));
Task.WaitAll(tasks.ToArray()); // Let's wait for the tasks to finish...
Console.WriteLine(String.Join(", ", tasks.Select(t => t.Id))); // Let's see the completed task IDs...
// Oops! A new set of tasks started running!
I want to avoid enumerating it twice by mistake.
You can wrap the collection in one that throws if it's enumerated twice, e.g.:
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace ConsoleApp8
{
    public static class EnumExtension
    {
        class OnceEnumerable<T> : IEnumerable<T>
        {
            IEnumerable<T> col;
            bool hasBeenEnumerated = false;

            public OnceEnumerable(IEnumerable<T> col)
            {
                this.col = col;
            }

            public IEnumerator<T> GetEnumerator()
            {
                if (hasBeenEnumerated)
                {
                    throw new InvalidOperationException("This collection has already been enumerated.");
                }
                this.hasBeenEnumerated = true;
                return col.GetEnumerator();
            }

            IEnumerator IEnumerable.GetEnumerator()
            {
                return GetEnumerator();
            }
        }

        public static IEnumerable<T> OnlyOnce<T>(this IEnumerable<T> col)
        {
            return new OnceEnumerable<T>(col);
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            var col = Enumerable.Range(1, 10).OnlyOnce();
            var colCount = col.Count(); // first enumeration
            foreach (var c in col)      // second enumeration - throws
            {
                Console.WriteLine(c);
            }
        }
    }
}
Enumerables enumerate, end of story. You just need to call ToList or ToArray:
// this will enumerate and start the tasks
var tasks = Enumerable.Range(1, 10)
    .Select(n => Task.Run(() => Console.WriteLine(n)))
    .ToList();
// wait for them all to finish
Task.WaitAll(tasks.ToArray());
Console.WriteLine(String.Join(", ", tasks.Select(t => t.Id)));
Hmm, if you want parallelism:
Parallel.For(0, 100, index => Console.WriteLine(index));
or if you are using the async and await pattern:
public static async Task DoWorkLoads(IEnumerable<Something> results)
{
    var options = new ExecutionDataflowBlockOptions
    {
        MaxDegreeOfParallelism = 50
    };
    var block = new ActionBlock<Something>(MyMethodAsync, options);
    foreach (var result in results)
        block.Post(result);
    block.Complete();
    await block.Completion;
}
...
public async Task MyMethodAsync(Something result)
{
    await SomethingAsync(result);
}
Update: since you are after a way to control the max degree of concurrency, you could use this:
public static async Task<IEnumerable<Task>> ExecuteInParallel<T>(this IEnumerable<T> collection, Func<T, Task> callback, int degreeOfParallelism)
{
    var queue = new ConcurrentQueue<T>(collection);
    var tasks = Enumerable.Range(0, degreeOfParallelism)
        .Select(async _ =>
        {
            while (queue.TryDequeue(out var item))
                await callback(item);
        })
        .ToArray();
    await Task.WhenAll(tasks);
    return tasks;
}
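A hypothetical usage of this helper (the delay and the degree of parallelism are arbitrary stand-ins for real work):
await Enumerable.Range(1, 10)
    .ExecuteInParallel(async n =>
    {
        await Task.Delay(100);
        Console.WriteLine(n);
    }, degreeOfParallelism: 3);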
Rx certainly is an option to control parallelism.
var query =
    Observable
        .Range(1, 10)
        .Select(n => Observable.FromAsync(() => Task.Run(() => new { Id = n })));
var tasks = query.Merge(maxConcurrent: 3).ToArray().Wait();
Console.WriteLine(String.Join(", ", tasks.Select(t => t.Id)));

How to create a range of Observable<int> and complete when condition is true

The code below runs for all values in the range; I want the Observable to stop raising events once the condition is true.
private IObservable<string> EnsureIndexWithCounter(string index)
{
    return Observable.Range(0, 5)
        .SelectMany(p => IncrementCounterIfIndexExistsAsync(index, p)
            .ToObservable()
            .RepeatUntil(x => !x.Item1, 5))
        .TakeWhile(p => !p.Item1)
        .Select(p => p.Item2);
}
// Will be invoked 4 times; it should only be invoked as long as Item1 of the returned tuple is true
private async Task<Tuple<bool, string>> IncrementCounterIfIndexExistsAsync(string index, int counter)
{
    var existsResponse = await Client.IndexExistsAsync(new IndexExistsRequest(index)).ConfigureAwait(false);
    var newCounter = existsResponse.Exists ? ++counter : counter;
    return Tuple.Create(existsResponse.Exists, $"{index}_{newCounter}");
}
