Concatenate two infinite C# IEnumerables together in no particular order

I have two methods that each return an infinite IEnumerable. I want to combine them so that whenever either IEnumerable returns a value, I can get and process it immediately.
static void Main(string[] args)
{
var streamOfBoth = get1().Concat(get2());
foreach(var item in streamOfBoth)
{
Console.WriteLine(item);
// I'd expect mixed numbers 1 and 2
// Instead I receive only 1s
}
}
static IEnumerable<int> get1()
{
while (true)
{
System.Threading.Thread.Sleep(1000);
yield return 1;
}
}
static IEnumerable<int> get2()
{
while (true)
{
System.Threading.Thread.Sleep(200);
yield return 2;
}
}
Is there a way to do this with IEnumerables without having to use threads?

This is fairly easily achieved with System.Reactive
static void Main()
{
get1().ToObservable(TaskPoolScheduler.Default).Subscribe(Print);
get2().ToObservable(TaskPoolScheduler.Default).Subscribe(Print);
}
static void Print(int i)
{
Console.WriteLine(i);
}
static IEnumerable<int> get1()
{
while (true)
{
System.Threading.Thread.Sleep(1000);
yield return 1;
}
}
static IEnumerable<int> get2()
{
while (true)
{
System.Threading.Thread.Sleep(200);
yield return 2;
}
}
This produces the following output on my machine:
2 2 2 2 1 2 2 2 2 2 1 2 2 2 2 2 1 2 2 2 2 2 1 2 2 2 2 2 1 2 2 2 2 2 1 ...
Note that ToObservable is called with the argument TaskPoolScheduler.Default; just calling ToObservable without it will result in synchronous execution, meaning it will keep enumerating the first sequence forever and never get to the second one.

You may want to interleave get1 and get2 (take an item from get1, then one from get2, then again from get1, and so on). A generalized Interleave<T> extension method (the IEnumerable<T> sequences need not be infinite or of the same length) could look like this:
public static partial class EnumerableExtensions {
public static IEnumerable<T> Interleave<T>(this IEnumerable<T> source,
params IEnumerable<T>[] others) {
if (null == source)
throw new ArgumentNullException(nameof(source));
else if (null == others)
throw new ArgumentNullException(nameof(others));
IEnumerator<T>[] enums = new IEnumerator<T>[] { source.GetEnumerator() }
.Concat(others
.Where(item => item != null)
.Select(item => item.GetEnumerator()))
.ToArray();
try {
bool hasValue = true;
while (hasValue) {
hasValue = false;
for (int i = 0; i < enums.Length; ++i) {
if (enums[i] == null)
continue; // this sequence is already exhausted and disposed
if (enums[i].MoveNext()) {
hasValue = true;
yield return enums[i].Current;
}
else {
enums[i].Dispose();
enums[i] = null;
}
}
}
}
finally {
for (int i = enums.Length - 1; i >= 0; --i)
if (enums[i] != null)
enums[i].Dispose();
}
}
}
Then use it:
var streamOfBoth = get1().Interleave(get2());
foreach(var item in streamOfBoth)
{
Console.WriteLine(item);
}
Edit: if the phrase
"whenever any ... return a value, I can instantly get and process"
is crucial to your question, you can try BlockingCollection and implement the producer-consumer pattern:
static BlockingCollection<int> streamOfBoth = new BlockingCollection<int>();
// Producer #1
static void get1() {
while (true) {
System.Threading.Thread.Sleep(1000);
streamOfBoth.Add(1); // value (1) is ready and pushed into streamOfBoth
}
}
// Producer #2
static void get2() {
while (true) {
System.Threading.Thread.Sleep(200);
streamOfBoth.Add(2); // value (2) is ready and pushed into streamOfBoth
}
}
...
Task.Run(() => get1()); // Start producer #1
Task.Run(() => get2()); // Start producer #2
...
// Consumer: as soon as either Producer #1 or Producer #2 creates a value,
// the consumer can start processing it
foreach(var item in streamOfBoth.GetConsumingEnumerable()) {
Console.WriteLine(item);
}
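The producer-consumer idea above can also be wrapped so that the original get1()/get2() iterators stay unchanged. The following is only a minimal sketch under stated assumptions: MergeUnordered and Repeat are hypothetical names that do not appear in the answer, Repeat is a finite stand-in for get1/get2 so that the demo terminates, and worker tasks are used behind the scenes, so this does not avoid threads.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class MergeSketch
{
    // Hypothetical helper, not part of the answer above: one producer task per
    // source pushes into a shared BlockingCollection that the caller consumes.
    public static IEnumerable<T> MergeUnordered<T>(params IEnumerable<T>[] sources)
    {
        var queue = new BlockingCollection<T>();
        Task[] producers = sources
            .Select(source => Task.Run(() =>
            {
                foreach (T item in source)
                    queue.Add(item); // item is ready, hand it to the consumer
            }))
            .ToArray();

        // When every producer has finished (or failed), end the consuming enumeration.
        Task.WhenAll(producers).ContinueWith(_ => queue.CompleteAdding());

        return queue.GetConsumingEnumerable();
    }

    // Finite stand-in for get1/get2 (assumption made so that the demo ends).
    public static IEnumerable<int> Repeat(int value, int delayMs, int count)
    {
        for (int i = 0; i < count; i++)
        {
            Thread.Sleep(delayMs);
            yield return value;
        }
    }

    public static void Main()
    {
        // Values arrive in whatever order the two sources produce them.
        foreach (int item in MergeUnordered(Repeat(1, 1000, 3), Repeat(2, 200, 10)))
            Console.WriteLine(item);
    }
}
```

Two caveats of this sketch: the producers start as soon as MergeUnordered is called, and with truly infinite sources they keep running even if the consumer stops enumerating, so a real implementation would probably want a cancellation mechanism. An exception thrown by a source ends that producer silently here.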

Here is a generic Merge method that merges IEnumerables. Each IEnumerable is enumerated in a dedicated thread.
using System.Reactive.Linq;
using System.Reactive.Concurrency;
public static IEnumerable<T> Merge<T>(params IEnumerable<T>[] sources)
{
IEnumerable<IObservable<T>> observables = sources
.Select(source => source.ToObservable(NewThreadScheduler.Default));
IObservable<T> merged = Observable.Merge(observables);
return merged.ToEnumerable();
}
Usage example:
var streamOfBoth = Merge(get1(), get2());
Enumerating the resulting IEnumerable will block the current thread until the enumeration is finished.
This implementation depends on the System.Reactive and System.Interactive.Async packages.

Related

Make using statement usable for multiple disposable objects

I have a bunch of text files in a folder, and all of them should have identical headers. In other words the first 100 lines of all files should be identical. So I wrote a function to check this condition:
private static bool CheckHeaders(string folderPath, int headersCount)
{
var enumerators = Directory.EnumerateFiles(folderPath)
.Select(f => File.ReadLines(f).GetEnumerator())
.ToArray();
//using (enumerators)
//{
for (int i = 0; i < headersCount; i++)
{
foreach (var e in enumerators)
{
if (!e.MoveNext()) return false;
}
var values = enumerators.Select(e => e.Current);
if (values.Distinct().Count() > 1) return false;
}
return true;
//}
}
The reason I am using enumerators is memory efficiency. Instead of loading all file contents in memory I enumerate the files concurrently line-by-line until a mismatch is found, or all headers have been examined.
My problem is evident from the commented lines of code. I would like to use a using block to safely dispose all the enumerators, but unfortunately using (enumerators) doesn't compile. Apparently using can handle only a single disposable object. I know that I can dispose the enumerators manually, by wrapping the whole thing in a try-finally block and running the disposing logic in a loop inside finally, but it seems awkward. Is there any mechanism I could employ to make the using statement a viable option in this case?
Update
I just realized that my function has a serious flaw. The construction of the enumerators is not robust. A locked file can cause an exception, while some enumerators have already been created. These enumerators will not be disposed. This is something I want to fix. I am thinking about something like this:
var enumerators = Directory.EnumerateFiles(folderPath)
.ToDisposables(f => File.ReadLines(f).GetEnumerator());
The extension method ToDisposables should ensure that in case of an exception no disposables are left undisposed.
You can create a disposable-wrapper over your enumerators:
class DisposableEnumerable : IDisposable
{
private IEnumerable<IDisposable> items;
public event UnhandledExceptionEventHandler DisposalFailed;
public DisposableEnumerable(IEnumerable<IDisposable> items) => this.items = items;
public void Dispose()
{
foreach (var item in items)
{
try
{
item.Dispose();
}
catch (Exception e)
{
var tmp = DisposalFailed;
tmp?.Invoke(this, new UnhandledExceptionEventArgs(e, false));
}
}
}
}
and use it with the lowest impact to your code:
private static bool CheckHeaders(string folderPath, int headersCount)
{
var enumerators = Directory.EnumerateFiles(folderPath)
.Select(f => File.ReadLines(f).GetEnumerator())
.ToArray();
using (var disposable = new DisposableEnumerable(enumerators))
{
for (int i = 0; i < headersCount; i++)
{
foreach (var e in enumerators)
{
if (!e.MoveNext()) return false;
}
var values = enumerators.Select(e => e.Current);
if (values.Distinct().Count() > 1) return false;
}
return true;
}
}
The thing is, you have to dispose those objects one by one anyway; it is only a question of where you encapsulate that logic. And the code I've suggested needs no manual try-finally at the call site.
To the second part of the question. If I get you right this should be sufficient:
static class DisposableHelper
{
public static IEnumerable<TResult> ToDisposable<TSource, TResult>(this IEnumerable<TSource> source,
Func<TSource, TResult> selector) where TResult : IDisposable
{
var exceptions = new List<Exception>();
var result = new List<TResult>();
foreach (var i in source)
{
try { result.Add(selector(i)); }
catch (Exception e) { exceptions.Add(e); }
}
if (exceptions.Count == 0)
return result;
foreach (var i in result)
{
try { i.Dispose(); }
catch (Exception e) { exceptions.Add(e); }
}
throw new AggregateException(exceptions);
}
}
Usage:
private static bool CheckHeaders(string folderPath, int headersCount)
{
var enumerators = Directory.EnumerateFiles(folderPath)
.ToDisposable(f => File.ReadLines(f).GetEnumerator())
.ToArray();
using (new DisposableEnumerable(enumerators))
{
for (int i = 0; i < headersCount; i++)
{
foreach (var e in enumerators)
{
if (!e.MoveNext()) return false;
}
var values = enumerators.Select(e => e.Current);
if (values.Distinct().Count() > 1) return false;
}
return true;
}
}
and
try
{
CheckHeaders(folderPath, headersCount);
}
catch(AggregateException e)
{
// Prompt to fix errors and try again
}
I'm going to suggest an approach that uses recursive calls to Zip to allow parallel enumeration of a normal IEnumerable<string> without the need to resort to using IEnumerator<string>.
bool Zipper(IEnumerable<IEnumerable<string>> sources, int take)
{
IEnumerable<string> ZipperImpl(IEnumerable<IEnumerable<string>> ss)
=> (!ss.Skip(1).Any())
? ss.First().Take(take)
: ss.First().Take(take).Zip(
ZipperImpl(ss.Skip(1)),
(x, y) => (x == null || y == null || x != y) ? null : x);
var matching_lines = ZipperImpl(sources).TakeWhile(x => x != null).ToArray();
return matching_lines.Length == take;
}
Now build up your enumerables:
IEnumerable<string>[] enumerables =
Directory
.EnumerateFiles(folderPath)
.Select(f => File.ReadLines(f))
.ToArray();
Now it's simple to call:
bool headers_match = Zipper(enumerables, 100);
Here's a trace of running this code against three files with more than 4 lines:
Ben Petering at 5:28 PM ACST
Ben Petering at 5:28 PM ACST
Ben Petering at 5:28 PM ACST
From a call 2019-05-23, James mentioned he’d like the ability to edit the current shipping price rules (eg in shipping_rules.xml) via the admin.
From a call 2019-05-23, James mentioned he’d like the ability to edit the current shipping price rules (eg in shipping_rules.xml) via the admin.
From a call 2019-05-23, James mentioned he’d like the ability to edit the current shipping price rules (eg in shipping_rules.xml) via the admin.
He also mentioned he’d like to be able to set different shipping price rules for a given time window, e.g. Jan 1 to Jan 30.
He also mentioned he’d like to be able to set different shipping price rules for a given time window, e.g. Jan 1 to Jan 30.
He also mentioned he’d like to be able to set different shipping price rules for a given time window, e.g. Jan 1 to Jan 30.
These storyishes should be considered when choosing the appropriate module to use.
These storyishes should be considered when choosing the appropriate module to use.X
These storyishes should be considered when choosing the appropriate module to use.
Note that the enumerations stopped once the mismatched header on the 4th line of the second file was encountered; no further lines were read from any file.
Creating an IDisposable wrapper as @Alex suggested is correct. It just needs logic to dispose the already opened files if one of them is locked, and probably some logic for error states. Maybe something like this (the error state logic is very simple):
public class HeaderChecker : IDisposable
{
private readonly string _folderPath;
private readonly int _headersCount;
private string _lockedFile;
private readonly List<IEnumerator<string>> _files = new List<IEnumerator<string>>();
public HeaderChecker(string folderPath, int headersCount)
{
_folderPath = folderPath;
_headersCount = headersCount;
}
public string LockedFile => _lockedFile;
public bool CheckFiles()
{
_lockedFile = null;
if (!TryOpenFiles())
{
return false;
}
if (_files.Count == 0)
{
return true; // Not sure what to return here.
}
for (int i = 0; i < _headersCount; i++)
{
if (!_files[0].MoveNext()) return false;
string currentLine = _files[0].Current;
for (int fileIndex = 1; fileIndex < _files.Count; fileIndex++)
{
if (!_files[fileIndex].MoveNext()) return false;
if (_files[fileIndex].Current != currentLine) return false;
}
}
return true;
}
private bool TryOpenFiles()
{
bool result = true;
foreach (string file in Directory.EnumerateFiles(_folderPath))
{
try
{
_files.Add(File.ReadLines(file).GetEnumerator());
}
catch
{
_lockedFile = file;
result = false;
break;
}
}
if (!result)
{
DisposeCore(); // Close already opened files.
}
return result;
}
private void DisposeCore()
{
foreach (var item in _files)
{
try
{
item.Dispose();
}
catch
{
}
}
_files.Clear();
}
public void Dispose()
{
DisposeCore();
}
}
// Usage
using (var checker = new HeaderChecker(folderPath, headersCount))
{
if (!checker.CheckFiles())
{
if (checker.LockedFile is null)
{
// Headers do not match.
}
else
{
// Error while opening files; checker.LockedFile is the file that could not be opened.
}
}
}
I also removed .Select() and .Distinct() when checking the lines. The Select iterates over the enumerators array again, just like the foreach above it, so the array is enumerated twice per line; Distinct() then builds and enumerates yet another collection of lines. Comparing each line directly against the first file's line avoids both.

C# Multi-Threaded Tree traversal

I am trying to write a C# system that traverses a tree structure using multiple threads. Another way to look at it: the consumer of the BlockingCollection is also its producer.
The problem I am having is telling when everything is finished.
The test I really need is whether all the threads are blocked on TryTake.
If they are, then everything has finished, but I cannot find a way to test for this or to wrap it in anything that would help achieve it.
The code below is a very simple example of this code as far as I have it, but there is a condition in which this code can fail. If the first thread just passed the test.TryTake(out v,-1) and has not yet executed the s.Release(); and it just pulled the last item from the collection, and the second thread just performed the if(s.CurrentCount == 0 && test.Count ==0) this could return true, and incorrectly start finishing things up.
But then the first thread would continue on and try and add more to the collection.
If I could make the lines:
if (!test.TryTake(out v, -1))
break;
s.Release();
atomic then I believe this code would work. (Which is obviously not possible.)
But I cannot figure out how to fix this flaw.
class Program
{
private static BlockingCollection<int> test;
static void Main(string[] args)
{
test = new BlockingCollection<int>();
WorkClass.s = new SemaphoreSlim(2);
WorkClass w0 = new WorkClass("A");
WorkClass w1 = new WorkClass("B");
Thread t0 = new Thread(w0.WorkFunction);
Thread t1 = new Thread(w1.WorkFunction);
test.Add(10);
t0.Start();
t1.Start();
t0.Join();
t1.Join();
Console.WriteLine("Done");
Console.ReadLine();
}
class WorkClass
{
public static SemaphoreSlim s;
private readonly string _name;
public WorkClass(string name)
{
_name = name;
}
public void WorkFunction(object t)
{
while (true)
{
int v;
s.Wait();
if (s.CurrentCount == 0 && test.Count == 0)
test.CompleteAdding();
if (!test.TryTake(out v, -1))
break;
s.Release();
Console.WriteLine(_name + " = " + v);
Thread.Sleep(5);
for (int i = 0; i < v; i++)
test.Add(i);
}
Console.WriteLine("Done " + _name);
}
}
}
This can be parallelized using task parallelism. Every node in the tree is considered to be a task which may spawn sub-tasks. See Dynamic Task Parallelism for a more detailed description.
For a binary tree with 5 levels that writes each node to console and waits for 5 milliseconds as in your example, the ParallelWalk method would then look for example as follows:
class Program
{
internal class TreeNode
{
internal TreeNode(int level)
{
Level = level;
}
internal int Level { get; }
}
static void Main(string[] args)
{
ParallelWalk(new TreeNode(0));
Console.Read();
}
static void ParallelWalk(TreeNode node)
{
if (node == null) return;
Console.WriteLine(node.Level);
Thread.Sleep(5);
if(node.Level > 4) return;
int nextLevel = node.Level + 1;
var t1 = Task.Factory.StartNew(
() => ParallelWalk(new TreeNode(nextLevel)));
var t2 = Task.Factory.StartNew(
() => ParallelWalk(new TreeNode(nextLevel)));
Task.WaitAll(t1, t2);
}
}
The central lines are where the tasks t1 and t2 are spawned.
By this decomposition in tasks, the scheduling is done by the Task Parallel Library and you don't have to manage a shared set of nodes anymore.
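If the BlockingCollection design from the question has to be kept, one common way to detect completion is to count outstanding work items instead of inspecting the semaphore and the collection separately. The sketch below is an illustration under assumptions, not the answerer's method: the Traversal class and its member names are hypothetical, and the Thread.Sleep(5) from the question is omitted. Each item is counted before it is added, and a parent is only "released" after all of its children have been counted, so the counter can reach zero only when nothing is queued or in flight.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public sealed class Traversal
{
    private readonly BlockingCollection<int> _work = new BlockingCollection<int>();
    private int _pending;   // items added but not yet fully processed
    private int _processed; // total number of items handled

    private void Enqueue(int value)
    {
        Interlocked.Increment(ref _pending); // count the item before it becomes visible
        _work.Add(value);
    }

    private void Worker()
    {
        foreach (int v in _work.GetConsumingEnumerable())
        {
            Interlocked.Increment(ref _processed);
            for (int i = 0; i < v; i++)
                Enqueue(i); // children are counted before the parent is released

            // The parent is done. Zero means nothing is queued and nobody is
            // still producing, so the whole traversal is complete.
            if (Interlocked.Decrement(ref _pending) == 0)
                _work.CompleteAdding();
        }
    }

    public int Run(int seed, int workerCount)
    {
        Enqueue(seed);
        var threads = new Thread[workerCount];
        for (int i = 0; i < workerCount; i++)
        {
            threads[i] = new Thread(Worker);
            threads[i].Start();
        }
        foreach (Thread t in threads)
            t.Join();
        return _processed;
    }
}

public static class Program
{
    public static void Main()
    {
        // Same seed and fan-out rule as in the question: item v spawns 0..v-1.
        Console.WriteLine("Processed " + new Traversal().Run(10, 2) + " items");
    }
}
```

With the question's rule that item v adds the items 0 to v-1, a seed of v produces 2^v items overall, so the seed 10 should report 1024 processed items regardless of the number of workers.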

c# Wait did not seem to reacquire lock

I am not sure why this piece of code is not safe. I have a test case to prove it is not safe.
List<T> _l = new List<T>();
public void Add(T t)
{
lock (_l)
{
_l.Add(t);
Monitor.PulseAll(_l);
}
}
public T[] RemoveToArray(TimeSpan timeout)
{
lock (_l)
{
if (_l.Count == 0)
{
bool timedout = !Monitor.Wait(_l, timeout);
// no lock
if (timedout)
{
return new T[0];
}
}
// with lock
T[] items = _l.ToArray();
_l.Clear();
return items;
}
}
The function is supposed to wait for some amount of time (after which it times out) for new items to arrive. When the time is up, it returns an empty array; otherwise it drains all the elements of the internal list into an array. In the test code, I created two Tasks, one to add and another to remove.
times = 3;
Task[] tasks = new Task[2];
tasks[0] = Task.Factory.StartNew(() =>
{
firstRemoveItems = _l.RemoveToArray(_infinite);
Util.Delay(TimeSpan.FromMilliseconds(10)).Wait();
secondRemoveItems = _l.RemoveToArray(_infinite);
});
tasks[1] = Task.Factory.StartNew(() =>
{
_l.Add(new object());
Util.Delay(TimeSpan.FromMilliseconds(5)).Wait();
for (int i = 1; i < times; i++)
_l.Add(new object());
});
Three items are added in total; however, the total number of items removed has come out as 2, 3, or 4. I am not quite sure what the issue is. I assume the lock is reacquired after Wait times out?
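On the closing question: Monitor.Wait releases the lock while the thread is blocked and reacquires it before returning, whether the thread was pulsed or the timeout elapsed, so the code after Wait always runs with the lock held. A commonly recommended defensive variant re-checks the condition in a loop. The sketch below shows that pattern only; DrainableList is a hypothetical name, and the sketch does not claim to explain the counts seen in the test above, which may originate in the test code itself.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

public sealed class DrainableList<T>
{
    private readonly List<T> _items = new List<T>();

    public void Add(T item)
    {
        lock (_items)
        {
            _items.Add(item);
            Monitor.PulseAll(_items);
        }
    }

    public T[] RemoveToArray(TimeSpan timeout)
    {
        Stopwatch clock = Stopwatch.StartNew();
        lock (_items)
        {
            // Re-check in a loop: another consumer may have drained the list
            // between the pulse and this thread getting the lock back.
            while (_items.Count == 0)
            {
                if (timeout == Timeout.InfiniteTimeSpan)
                {
                    Monitor.Wait(_items); // wait without a deadline
                    continue;
                }
                TimeSpan remaining = timeout - clock.Elapsed;
                if (remaining <= TimeSpan.Zero)
                    return new T[0]; // timed out, lock is still held here
                Monitor.Wait(_items, remaining);
            }
            T[] items = _items.ToArray();
            _items.Clear();
            return items;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var list = new DrainableList<int>();
        var consumer = new Thread(() =>
            Console.WriteLine("Drained " + list.RemoveToArray(Timeout.InfiniteTimeSpan).Length + " item(s)"));
        consumer.Start();
        list.Add(42);
        consumer.Join();
        Console.WriteLine("Then got " + list.RemoveToArray(TimeSpan.FromMilliseconds(50)).Length + " item(s)");
    }
}
```

In this demo the consumer drains the single item whether it starts waiting before or after Add runs, and the second call times out with an empty array.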

List<T> Get Chunk Number being executed

I am breaking a list into chunks and processing it as below:
foreach (var partialist in breaklistinchunks(chunksize))
{
try
{
// do something
}
catch
{
// print error
}
}
public static class IEnumerableExtensions
{
public static IEnumerable<List<T>> BreakListinChunks<T>(this IEnumerable<T> sourceList, int chunkSize)
{
List<T> chunkReturn = new List<T>(chunkSize);
foreach (var item in sourceList)
{
chunkReturn.Add(item);
if (chunkReturn.Count == chunkSize)
{
yield return chunkReturn;
chunkReturn = new List<T>(chunkSize);
}
}
if (chunkReturn.Any())
{
yield return chunkReturn;
}
}
}
If there is an error, I wish to run the chunk again. Is it possible to find the particular chunk number where we received the error and run that again?
The batches have to be executed in sequential order. So if batch #2 generates an error, I need to be able to run batch #2 again; if it fails again, I just need to get out of the loop for good.
List<Chunk> failedChunks = new List<Chunk>();
foreach (var partialist in breaklistinchunks(chunksize))
{
try
{
//do something
}
catch
{
//print error
failedChunks.Add(partialist); // remember the chunk so it can be retried later
}
}
// attempt to re-process failed chunks here
I propose this answer based on your comment to Aaron's answer.
The batches have to be executed in sequential order .So if 2 is a problem , then I need to be able to run 2 again, if it fails again. I just need to get out of the loop for good.
foreach (var partialist in breaklistinchunks(chunksize))
{
int fails = 0;
bool success = false;
do
{
try
{
// do your action
success = true; // should be on the last line before the 'catch'
}
catch
{
fails += 1;
// do something about error before running again
}
}while (!success && fails < 2);
// exit the iteration if not successful and fails is 2
if (!success && fails >= 2)
break;
}
I made a possible solution for you if you don't mind switching from Enumerable to Queue, which kind of fits given the requirements...
void Main()
{
var list = new Queue<int>();
list.Enqueue(1);
list.Enqueue(2);
list.Enqueue(3);
list.Enqueue(4);
list.Enqueue(5);
var random = new Random();
int chunksize = 2;
foreach (var chunk in list.BreakListinChunks(chunksize))
{
foreach (var item in chunk)
{
try
{
if(random.Next(0, 3) == 0) // 1 in 3 chance of error
throw new Exception(item + " is a problem");
else
Console.WriteLine (item + " is OK");
}
catch (Exception ex)
{
Console.WriteLine (ex.Message);
list.Enqueue(item);
}
}
}
}
public static class IEnumerableExtensions
{
public static IEnumerable<List<T>> BreakListinChunks<T>(this Queue<T> sourceList, int chunkSize)
{
List<T> chunkReturn = new List<T>(chunkSize);
while(sourceList.Count > 0)
{
chunkReturn.Add(sourceList.Dequeue());
if (chunkReturn.Count == chunkSize || sourceList.Count == 0)
{
yield return chunkReturn;
chunkReturn = new List<T>(chunkSize);
}
}
}
}
Outputs
1 is a problem
2 is OK
3 is a problem
4 is a problem
5 is a problem
1 is a problem
3 is OK
4 is OK
5 is OK
1 is a problem
1 is OK
One possibility would be to use a for loop instead of a foreach loop and use the counter as a means to determine where an error occurred. Then you could continue from where you left off.
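A rough sketch of that idea, under assumptions: the chunks are materialized into a list first so that they can be addressed by index, and ToChunks, ProcessAll and maxAttempts are hypothetical names introduced only for this illustration. The loop counter is the chunk number, and the method reports the index of the first chunk that kept failing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkRetrySketch
{
    // Splits the source into lists of at most chunkSize items, in order.
    private static List<List<T>> ToChunks<T>(IEnumerable<T> source, int chunkSize)
    {
        return source
            .Select((item, index) => new { item, index })
            .GroupBy(x => x.index / chunkSize, x => x.item)
            .Select(g => g.ToList())
            .ToList();
    }

    // Returns the index of the first chunk that failed maxAttempts times,
    // or -1 if every chunk was processed successfully.
    public static int ProcessAll<T>(IEnumerable<T> source, int chunkSize,
        int maxAttempts, Action<List<T>> process)
    {
        List<List<T>> chunks = ToChunks(source, chunkSize);
        for (int i = 0; i < chunks.Count; i++)
        {
            int attempt = 0;
            while (true)
            {
                try
                {
                    process(chunks[i]);
                    break; // chunk i succeeded, move on to the next one
                }
                catch (Exception ex)
                {
                    attempt++;
                    Console.WriteLine("Chunk " + i + " failed (attempt " + attempt + "): " + ex.Message);
                    if (attempt >= maxAttempts)
                        return i; // give up for good, as the question requires
                }
            }
        }
        return -1;
    }

    public static void Main()
    {
        int calls = 0;
        int failedAt = ProcessAll(Enumerable.Range(1, 5), 2, 2, chunk =>
        {
            calls++;
            // Simulated transient error: only the second call overall fails.
            if (calls == 2)
                throw new InvalidOperationException("transient error");
            Console.WriteLine("Processed " + string.Join(",", chunk));
        });
        Console.WriteLine(failedAt == -1 ? "All chunks done" : "Stopped at chunk " + failedAt);
    }
}
```

In the demo the second chunk fails once, is retried successfully, and processing continues in order. Materializing all chunks up front trades memory for index access; a streaming variant could keep a counter next to the foreach instead.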
You can use break to exit out of the loop as soon as a chunk fails twice:
foreach (var partialList in breaklistinchunks(chunksize))
{
if(!TryOperation(partialList) && !TryOperation(partialList))
{
break;
}
}
private bool TryOperation<T>(List<T> list)
{
try
{
// do something
}
catch
{
// print error
return false;
}
return true;
}
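You could even make the loop into a one-liner with LINQ, but it is generally bad practice to combine LINQ with side effects, and it's not very readable. Note that TakeWhile is lazy, so something such as Count() has to enumerate the query for the operations to run at all:
breaklistinchunks(chunksize).TakeWhile(x => TryOperation(x) || TryOperation(x)).Count();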

Parallel.Foreach + yield return?

I want to process something using parallel loop like this :
public void FillLogs(IEnumerable<IComputer> computers)
{
Parallel.ForEach(computers, cpt=>
{
cpt.Logs = cpt.GetRawLogs().ToList();
});
}
OK, it works fine. But what do I do if I want the FillLogs method to return an IEnumerable?
public IEnumerable<IComputer> FillLogs(IEnumerable<IComputer> computers)
{
Parallel.ForEach(computers, cpt=>
{
cpt.Logs = cpt.GetRawLogs().ToList();
yield return cpt; // KO, doesn't compile: yield is not allowed inside a lambda
});
}
EDIT
It seems this is not possible, but I could use something like this:
public IEnumerable<IComputer> FillLogs(IEnumerable<IComputer> computers)
{
return computers.AsParallel().Select(cpt => cpt);
}
But where do I put the cpt.Logs = cpt.GetRawLogs().ToList(); instruction?
Short version - no, that isn't possible via an iterator block; the longer version probably involves synchronized queue/dequeue between the caller's iterator thread (doing the dequeue) and the parallel workers (doing the enqueue); but as a side note - logs are usually IO-bound, and parallelising things that are IO-bound often doesn't work very well.
If the caller is going to take some time to consume each, then there may be some merit to an approach that only processes one log at a time, but can do that while the caller is consuming the previous log; i.e. it begins a Task for the next item before the yield, and waits for completion after the yield... but that is again, pretty complex. As a simplified example:
static void Main()
{
foreach(string s in Get())
{
Console.WriteLine(s);
}
}
static IEnumerable<string> Get() {
var source = new[] {1, 2, 3, 4, 5};
Task<string> outstandingItem = null;
Func<object, string> transform = x => ProcessItem((int) x);
foreach(var item in source)
{
var tmp = outstandingItem;
// note: passed in as "state", not captured, so not a foreach/capture bug
outstandingItem = new Task<string>(transform, item);
outstandingItem.Start();
if (tmp != null) yield return tmp.Result;
}
if (outstandingItem != null) yield return outstandingItem.Result;
}
static string ProcessItem(int i)
{
return i.ToString();
}
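For the PLINQ variant from the question's edit, one option is to put the assignment inside the Select lambda. This is a hedged sketch rather than a recommendation: the Computer class below is a hypothetical stand-in for the question's IComputer, the lambda has a side effect (which PLINQ tolerates here only because each computer is touched by a single worker), and the results may come back in a different order than the input unless AsOrdered() is added.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's IComputer.
public class Computer
{
    public int Id { get; set; }
    public List<string> Logs { get; set; }

    public IEnumerable<string> GetRawLogs()
    {
        yield return "log line of computer " + Id;
    }
}

public static class PlinqSketch
{
    public static IEnumerable<Computer> FillLogs(IEnumerable<Computer> computers)
    {
        // The work runs on PLINQ worker threads once the result is enumerated.
        return computers.AsParallel().Select(cpt =>
        {
            cpt.Logs = cpt.GetRawLogs().ToList();
            return cpt;
        });
    }

    public static void Main()
    {
        List<Computer> computers = Enumerable.Range(1, 5)
            .Select(i => new Computer { Id = i })
            .ToList();

        foreach (Computer cpt in FillLogs(computers))
            Console.WriteLine(cpt.Id + ": " + cpt.Logs.Count + " log line(s)");
    }
}
```

The query is deferred, so nothing is processed until the caller starts enumerating, and the IO-bound caveat mentioned above still applies.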
I don't want to be offensive, but maybe there is a lack of understanding. Parallel.ForEach means that the TPL will run the loop body on several threads, according to the available hardware. But that presupposes that it is possible to do the work in parallel! yield return lets you take values out of a list (or whatever) and hand them back one by one as they are needed. It avoids the need to first find all items matching the condition and then iterate over them. That is indeed a performance advantage, but it can't be done in parallel.
Although the question is old I've managed to do something just for fun.
class Program
{
static void Main(string[] args)
{
foreach (var message in GetMessages())
{
Console.WriteLine(message);
}
}
// Parallel yield
private static IEnumerable<string> GetMessages()
{
int total = 0;
bool completed = false;
var batches = Enumerable.Range(1, 100).Select(i => new Computer() { Id = i });
var qu = new ConcurrentQueue<Computer>();
Task.Run(() =>
{
try
{
Parallel.ForEach(batches,
() => 0,
(item, loop, subtotal) =>
{
Thread.Sleep(1000);
qu.Enqueue(item);
return subtotal + 1;
},
result => Interlocked.Add(ref total, result));
}
finally
{
completed = true;
}
});
int current = 0;
while (current < total || !completed)
{
SpinWait.SpinUntil(() => current < total || completed);
if (current == total) yield break;
current++;
qu.TryDequeue(out Computer computer);
yield return $"Completed {computer.Id}";
}
}
}
public class Computer
{
public int Id { get; set; }
}
Compared to Koray's answer this one really uses all the CPU cores.
You can use the following extension method
public static class ParallelExtensions
{
public static IEnumerable<T1> OrderedParallel<T, T1>(this IEnumerable<T> list, Func<T, T1> action)
{
var unorderedResult = new ConcurrentBag<(long, T1)>();
Parallel.ForEach(list, (o, state, i) =>
{
unorderedResult.Add((i, action.Invoke(o)));
});
var ordered = unorderedResult.OrderBy(o => o.Item1);
return ordered.Select(o => o.Item2);
}
}
use like:
public IEnumerable<IComputer> FillLogs(IEnumerable<IComputer> computers)
{
return computers.OrderedParallel(cpt =>
{
cpt.Logs = cpt.GetRawLogs().ToList();
return cpt;
});
}
Hope this will save you some time.
How about
Queue<string> qu = new Queue<string>();
bool finished = false;
Task.Factory.StartNew(() =>
{
Parallel.ForEach(get_list(), (item) =>
{
string itemToReturn = heavyWorkOnItem(item);
lock (qu)
qu.Enqueue(itemToReturn );
});
finished = true;
});
while (!finished)
{
lock (qu)
while (qu.Count > 0)
yield return qu.Dequeue();
//maybe a thread sleep here?
}
Edit:
I think this is better:
public static IEnumerable<TOutput> ParallelYieldReturn<TSource, TOutput>(this IEnumerable<TSource> source, Func<TSource, TOutput> func)
{
ConcurrentQueue<TOutput> qu = new ConcurrentQueue<TOutput>();
bool finished = false;
AutoResetEvent re = new AutoResetEvent(false);
Task.Factory.StartNew(() =>
{
Parallel.ForEach(source, (item) =>
{
qu.Enqueue(func(item));
re.Set();
});
finished = true;
re.Set();
});
while (!finished)
{
re.WaitOne();
while (qu.Count > 0)
{
TOutput res;
if (qu.TryDequeue(out res))
yield return res;
}
}
}
Edit2: I agree with the short No answer. This code is useless; you cannot break the yield loop.
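For comparison, the queue-based attempts above can be built on BlockingCollection, which takes care of the "finished" signalling through CompleteAdding and GetConsumingEnumerable. The following is a sketch under assumptions rather than a definitive implementation: SelectAsCompleted is a hypothetical name, results are yielded in completion order, and a CancellationTokenSource is used so that breaking out of the consuming loop stops new work from being started.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelYieldSketch
{
    public static IEnumerable<TOutput> SelectAsCompleted<TSource, TOutput>(
        this IEnumerable<TSource> source, Func<TSource, TOutput> func)
    {
        using (var results = new BlockingCollection<TOutput>())
        using (var cts = new CancellationTokenSource())
        {
            var options = new ParallelOptions { CancellationToken = cts.Token };
            Task producer = Task.Run(() =>
            {
                try
                {
                    Parallel.ForEach(source, options, item => results.Add(func(item)));
                }
                catch (OperationCanceledException)
                {
                    // The consumer stopped early; nothing more to produce.
                }
                finally
                {
                    results.CompleteAdding(); // always unblock the consumer
                }
            });

            try
            {
                // Blocks until an item is available or adding is complete.
                foreach (TOutput result in results.GetConsumingEnumerable())
                    yield return result;
            }
            finally
            {
                cts.Cancel();    // no effect if the loop already finished
                producer.Wait(); // also surfaces exceptions thrown by func
            }
        }
    }

    public static void Main()
    {
        IEnumerable<int> squares = Enumerable.Range(1, 8).SelectAsCompleted(i =>
        {
            Thread.Sleep(100 * (i % 3)); // simulated uneven work
            return i * i;
        });

        foreach (int square in squares)
            Console.WriteLine(square); // printed in completion order
    }
}
```

Cancellation only prevents further iterations from starting; iterations that are already running finish first, so breaking out of the loop can still block briefly in the finally block.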
