I ran into a weird issue and I'm wondering what I should do about it.
I have this class that return a IEnumerable<MyClass> and it is a deferred execution. Right now, there are two possible consumers. One of them sorts the result.
See the following example :
public class SomeClass
{
public IEnumerable<MyClass> GetMyStuff(Param givenParam)
{
double culmulativeSum = 0;
return myStuff.Where(...)
.OrderBy(...)
.TakeWhile( o =>
{
bool returnValue = culmulativeSum < givenParam.Maximum;
culmulativeSum += o.SomeNumericValue;
return returnValue;
};
}
}
Consumers call the deferred execution only once, but if they were to call it more than that, the result would be wrong as the culmulativeSum wouldn't be reset. I've found the issue by inadvertence with unit testing.
The easiest way for me to fix the issue would be to just add .ToArray() and get rid of the deferred execution at the cost of a little bit of overhead.
I could also add unit test in consumers class to ensure they call it only once, but that wouldn't prevent any new consumer coded in the future from this potential issue.
Another thing that came to my mind was to make subsequent execution throw.
Something like
return myStuff.Where(...)
.OrderBy(...)
.TakeWhile(...)
.ThrowIfExecutedMoreThan(1);
Obviously this doesn't exist.
Would it be a good idea to implement such thing and how would you do it?
Otherwise, if there is a big pink elephant that I don't see, pointing it out will be appreciated. (I feel there is one because this question is about a very basic scenario :| )
EDIT :
Here is a bad consumer usage example :
public class ConsumerClass
{
public void WhatEverMethod()
{
SomeClass some = new SomeClass();
var stuffs = some.GetMyStuff(param);
var nb = stuffs.Count(); //first deferred execution
var firstOne = stuff.First(); //second deferred execution with the culmulativeSum not reset
}
}
You can solve the incorrect result issue by simply turning your method into iterator:
double culmulativeSum = 0;
var query = myStuff.Where(...)
.OrderBy(...)
.TakeWhile(...);
foreach (var item in query) yield return item;
It can be encapsulated in a simple extension method:
public static class Iterators
{
public static IEnumerable<T> Lazy<T>(Func<IEnumerable<T>> source)
{
foreach (var item in source())
yield return item;
}
}
Then all you need to do in such scenarios is to surround the original method body with Iterators.Lazy call, e.g.:
return Iterators.Lazy(() =>
{
double culmulativeSum = 0;
return myStuff.Where(...)
.OrderBy(...)
.TakeWhile(...);
});
You can use the following class:
public class JustOnceOrElseEnumerable<T> : IEnumerable<T>
{
private readonly IEnumerable<T> decorated;
public JustOnceOrElseEnumerable(IEnumerable<T> decorated)
{
this.decorated = decorated;
}
private bool CalledAlready;
public IEnumerator<T> GetEnumerator()
{
if (CalledAlready)
throw new Exception("Enumerated already");
CalledAlready = true;
return decorated.GetEnumerator();
}
IEnumerator IEnumerable.GetEnumerator()
{
if (CalledAlready)
throw new Exception("Enumerated already");
CalledAlready = true;
return decorated.GetEnumerator();
}
}
to decorate an enumerable so that it can only be enumerated once. After that it would throw an exception.
You can use this class like this:
return new JustOnceOrElseEnumerable(
myStuff.Where(...)
...
);
Please note that I do not recommend this approach because it violates the contract of the IEnumerable interface and thus the Liskov Substitution Principle. It is legal for consumers of this contract to assume that they can enumerate the enumerable as many times as they like.
Instead, you can use a cached enumerable that caches the result of enumeration. This ensures that the enumerable is only enumerated once and that all subsequent enumeration attempts would read from the cache. See this answer here for more information.
Ivan's answer is very fitting for the underlying issue in OP's example - but for the general case, I have approached this in the past using an extension method similar to the one below. This ensures that the Enumerable has a single evaluation but is also deferred:
public static IMemoizedEnumerable<T> Memoize<T>(this IEnumerable<T> source)
{
return new MemoizedEnumerable<T>(source);
}
private class MemoizedEnumerable<T> : IMemoizedEnumerable<T>, IDisposable
{
private readonly IEnumerator<T> _sourceEnumerator;
private readonly List<T> _cache = new List<T>();
public MemoizedEnumerable(IEnumerable<T> source)
{
_sourceEnumerator = source.GetEnumerator();
}
public IEnumerator<T> GetEnumerator()
{
return IsMaterialized ? _cache.GetEnumerator() : Enumerate();
}
private IEnumerator<T> Enumerate()
{
foreach (var value in _cache)
{
yield return value;
}
while (_sourceEnumerator.MoveNext())
{
_cache.Add(_sourceEnumerator.Current);
yield return _sourceEnumerator.Current;
}
_sourceEnumerator.Dispose();
IsMaterialized = true;
}
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
public List<T> Materialize()
{
if (IsMaterialized)
return _cache;
while (_sourceEnumerator.MoveNext())
{
_cache.Add(_sourceEnumerator.Current);
}
_sourceEnumerator.Dispose();
IsMaterialized = true;
return _cache;
}
public bool IsMaterialized { get; private set; }
void IDisposable.Dispose()
{
if(!IsMaterialized)
_sourceEnumerator.Dispose();
}
}
public interface IMemoizedEnumerable<T> : IEnumerable<T>
{
List<T> Materialize();
bool IsMaterialized { get; }
}
Example Usage:
void Consumer()
{
//var results = GetValuesComplex();
//var results = GetValuesComplex().ToList();
var results = GetValuesComplex().Memoize();
if(results.Any(i => i == 3))
{
Console.WriteLine("\nFirst Iteration");
//return; //Potential for early exit.
}
var last = results.Last(); // Causes multiple enumeration in naive case.
Console.WriteLine("\nSecond Iteration");
}
IEnumerable<int> GetValuesComplex()
{
for (int i = 0; i < 5; i++)
{
//... complex operations ...
Console.Write(i + ", ");
yield return i;
}
}
Naive: ✔ Deferred, ✘ Single enumeration.
ToList: ✘ Deferred, ✔ Single enumeration.
Memoize: ✔ Deferred, ✔ Single enumeration.
.
Edited to use the proper terminology and flesh out the implementation.
Related
Background:
I am modifying existing code using the Harmony Library. The existing C# code follows this structure:
public class ToModify
{
public override void Update()
{
foreach (StatusItemGroup.Entry entry in collection)
{
// I am trying to alter an operation at the end of this loop.
}
}
}
public class StatusItemGroup
{
public IEnumerator<Entry> GetEnumerator()
{
return items.GetEnumerator();
}
private List<Entry> items = new List<Entry>();
public struct Entry { }
}
Due to the situation, I must modify the IL code that is being generated, to do so I must obtain the MethodInfo of my target operand. This is the target:
IL_12B6: callvirt instance bool [mscorlib]System.Collections.IEnumerator::MoveNext()
Question:
How do I obtain the MethodInfo for the MoveNext method of an enumerator?
What I've tried:
Everything I can think of has yielded null results. This is my most basic attempt:
MethodInfo targetMethod = typeof(IEnumerator<StatusItemGroup.Entry>).GetMethod("MoveNext");
I don't understand why this doesn't work, and I don't know what I need to do to correctly obtain the MethodInfo.
MoveNext is not defined on IEnumerator<T>, but on the non-generic IEnumerator which is inherited by IEnumerator<T>.
Interface inheritance is a little weird in combination with reflection, so you need to obtain the method info directly from the base interface where it's defined:
MethodInfo targetMethod = typeof(System.Collections.IEnumerator).GetMethod("MoveNext");
Using the free LinqPad, I create this with Harmony 2.0 RC2. As you can see did I use a pass-through postfix to change the enumerator and wrap it. There are other ways and I suspect that you actually have an IEnumeration somewhere instead. That would be way easier to patch by using the pass-through postfix directly on the original method that returns the IEnumeration. No need to wrap the enumerator in that case.
But I don't know your full use case, so for now, this is the working example:
void Main()
{
var harmony = new Harmony("test");
harmony.PatchAll();
var group = new StatusItemGroup();
var items = new List<StatusItemGroup.Entry>() { StatusItemGroup.Entry.Make("A"), StatusItemGroup.Entry.Make("B") };
Traverse.Create(group).Field("items").SetValue(items);
var enumerator = group.GetEnumerator();
while(enumerator.MoveNext())
Console.WriteLine(enumerator.Current.id);
}
[HarmonyPatch]
class Patch
{
public class ProxyEnumerator<T> : IEnumerable<T>
{
public IEnumerator<T> enumerator;
public Func<T, T> transformer;
IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
public IEnumerator<T> GetEnumerator()
{
while(enumerator.MoveNext())
yield return transformer(enumerator.Current);
}
}
[HarmonyPatch(typeof(StatusItemGroup), "GetEnumerator")]
static IEnumerator<StatusItemGroup.Entry> Postfix(IEnumerator<StatusItemGroup.Entry> enumerator)
{
StatusItemGroup.Entry Transform(StatusItemGroup.Entry entry)
{
entry.id += "+";
return entry;
}
var myEnumerator = new ProxyEnumerator<StatusItemGroup.Entry>()
{
enumerator = enumerator,
transformer = Transform
};
return myEnumerator.GetEnumerator();
}
}
public class StatusItemGroup
{
public IEnumerator<Entry> GetEnumerator()
{
return items.GetEnumerator();
}
private List<Entry> items = new List<Entry>();
public struct Entry
{
public string id;
public static Entry Make(string id) { return new Entry() { id = id }; }
}
}
Consider the following possible interface for an immutable generic enumerator:
interface IImmutableEnumerator<T>
{
(bool Succesful, IImmutableEnumerator<T> NewEnumerator) MoveNext();
T Current { get; }
}
How would you implement this in a reasonably performant way in c#? I'm a little out of ideas, because the IEnumerator infrastructure in .NET is inherently mutable and I can't see a way around it.
A naive implementation would be to simply create a new enumerator on every MoveNext() handing down a new inner mutable enumerator with current.Skip(1).GetEnumerator() but that is horribly inefficient.
I'm implementing a parser that needs to be able to look ahead; using an immutable enumerator would make things cleaner and easier to follow so I'm curious if there is an easy way to do this that I might be missing.
The input is an IEnumerable<T> and I can't change that. I can always materialize the enumerable with ToList() of course (with an IList in hand, looking ahead is trivial), but the data can be pretty large and I'd like to avoid it, if possible.
This is it:
public class ImmutableEnumerator<T> : IImmutableEnumerator<T>, IDisposable
{
public static (bool Succesful, IImmutableEnumerator<T> NewEnumerator) Create(IEnumerable<T> source)
{
var enumerator = source.GetEnumerator();
var successful = enumerator.MoveNext();
return (successful, new ImmutableEnumerator<T>(successful, enumerator));
}
private IEnumerator<T> _enumerator;
private (bool Succesful, IImmutableEnumerator<T> NewEnumerator) _runOnce = (false, null);
private ImmutableEnumerator(bool successful, IEnumerator<T> enumerator)
{
_enumerator = enumerator;
this.Current = successful ? _enumerator.Current : default(T);
if (!successful)
{
_enumerator.Dispose();
}
}
public (bool Succesful, IImmutableEnumerator<T> NewEnumerator) MoveNext()
{
if (_runOnce.NewEnumerator == null)
{
var successful = _enumerator.MoveNext();
_runOnce = (successful, new ImmutableEnumerator<T>(successful, _enumerator));
}
return _runOnce;
}
public T Current { get; private set; }
public void Dispose()
{
_enumerator.Dispose();
}
}
My test code succeeds nicely:
var xs = new[] { 1, 2, 3 };
var ie = ImmutableEnumerator<int>.Create(xs);
if (ie.Succesful)
{
Console.WriteLine(ie.NewEnumerator.Current);
var ie1 = ie.NewEnumerator.MoveNext();
if (ie1.Succesful)
{
Console.WriteLine(ie1.NewEnumerator.Current);
var ie2 = ie1.NewEnumerator.MoveNext();
if (ie2.Succesful)
{
Console.WriteLine(ie2.NewEnumerator.Current);
var ie3 = ie2.NewEnumerator.MoveNext();
if (ie3.Succesful)
{
Console.WriteLine(ie3.NewEnumerator.Current);
var ie4 = ie3.NewEnumerator.MoveNext();
}
}
}
}
This outputs:
1
2
3
It's immutable and it's efficient.
Here's a version using Lazy<(bool, IImmutableEnumerator<T>)> as per a request in the comments:
public class ImmutableEnumerator<T> : IImmutableEnumerator<T>, IDisposable
{
public static (bool Succesful, IImmutableEnumerator<T> NewEnumerator) Create(IEnumerable<T> source)
{
var enumerator = source.GetEnumerator();
var successful = enumerator.MoveNext();
return (successful, new ImmutableEnumerator<T>(successful, enumerator));
}
private IEnumerator<T> _enumerator;
private Lazy<(bool, IImmutableEnumerator<T>)> _runOnce;
private ImmutableEnumerator(bool successful, IEnumerator<T> enumerator)
{
_enumerator = enumerator;
this.Current = successful ? _enumerator.Current : default(T);
if (!successful)
{
_enumerator.Dispose();
}
_runOnce = new Lazy<(bool, IImmutableEnumerator<T>)>(() =>
{
var s = _enumerator.MoveNext();
return (s, new ImmutableEnumerator<T>(s, _enumerator));
});
}
public (bool Succesful, IImmutableEnumerator<T> NewEnumerator) MoveNext()
{
return _runOnce.Value;
}
public T Current { get; private set; }
public void Dispose()
{
_enumerator.Dispose();
}
}
You can achieve pseudo immutability suitable in this particular scenario by utilising a singly linked list. It allows for infinite look-ahead (limited only by your heap size) without the ability to look at previously processed nodes (unless you happen to store a reference to a previously processed node - which you shouldn't).
This solution addresses the requirements as stated (except for not conforming to your exact interface, with all of its functionality nevertheless intact).
The usage of such a linked list might look like this:
IEnumerable<int> numbersFromZeroToNine = Enumerable.Range(0, 10);
using (IEnumerator<int> enumerator = numbersFromZeroToNine.GetEnumerator())
{
var node = LazySinglyLinkedListNode<int>.CreateListHead(enumerator);
while (node != null)
{
Console.WriteLine($"Current value: {node.Value}.");
if (node.Next != null)
{
// Single-element look-ahead. Technically you could do node.Next.Next...Next.
// You can also nest another while loop here, and look ahead as much as needed.
Console.WriteLine($"Next value: {node.Next.Value}.");
}
else
{
Console.WriteLine("End of collection reached. There is no next value.");
}
node = node.Next;
// At this point the object which used to be referenced by the "node" local
// becomes eligible for collection, preventing unbounded memory growth.
}
}
Output:
Current value: 0.
Next value: 1.
Current value: 1.
Next value: 2.
Current value: 2.
Next value: 3.
Current value: 3.
Next value: 4.
Current value: 4.
Next value: 5.
Current value: 5.
Next value: 6.
Current value: 6.
Next value: 7.
Current value: 7.
Next value: 8.
Current value: 8.
Next value: 9.
Current value: 9.
End of collection reached. There is no next value.
The implementation is as follows:
sealed class LazySinglyLinkedListNode<T>
{
public static LazySinglyLinkedListNode<T> CreateListHead(IEnumerator<T> enumerator)
{
return enumerator.MoveNext() ? new LazySinglyLinkedListNode<T>(enumerator) : null;
}
public T Value { get; }
private IEnumerator<T> Enumerator;
private LazySinglyLinkedListNode<T> _next;
public LazySinglyLinkedListNode<T> Next
{
get
{
if (_next == null && Enumerator != null)
{
if (Enumerator.MoveNext())
{
_next = new LazySinglyLinkedListNode<T>(Enumerator);
}
else
{
Enumerator = null; // We've reached the end.
}
}
return _next;
}
}
private LazySinglyLinkedListNode(IEnumerator<T> enumerator)
{
Value = enumerator.Current;
Enumerator = enumerator;
}
}
An important thing to note here is that the source collection is only enumerated once, lazily, with MoveNext being called at most once per each node's lifetime regardless of how many times you access Next.
Using a doubly-linked list would allow look-behind, but would cause infinite memory growth and require periodic pruning, which is not trivial. Singly-linked list avoids this issue as long as you are not storing node references outside of your main loop. In the example above you could replace numbersFromZeroToNine with an IEnumerable<int> generator which infinitely yields integers, and the loop will run forever without running out of memory.
I am trying to understand how to use the IEnumerator interface and what it is used for. I have a class which implements the IEnumerator interface. A string array is passed to the constructor method.
The problem is when I execute the code then the array is not listed properly. It should be doing it in the order "ali", "veli", "hatca" but it’s listed at the console in this order "veli", "hatca" and -1. I am so confused. What am I doing wrong here? Can you please help?
static void Main(string[] args)
{
ogr o = new ogr();
while (o.MoveNext())
{
Console.WriteLine(o.Current.ToString());
}
}
public class ogr: IEnumerator
{
ArrayList array_ = new ArrayList();
string[] names = new string[] {
"ali", "veli", "hatca"
};
public ogr()
{
array_.AddRange(names);
}
public void addOgr(string name)
{
array_.Add(name);
}
int position;
public object Current
{
get
{
if (position >= 0 && position < array_.Count)
{
return array_[position];
}
else
{
return -1;
}
}
}
public bool MoveNext()
{
if (position < array_.Count && position >= 0)
{
position++;
return true;
}
else
{
return false;
}
}
public void Reset()
{
position = 0;
}
}
IEnumerator is quite difficult to grasp at first, but luckily it's an interface you hardly ever use in itself. Instead, you should probably implement IEnumerable<T>.
However, the source of your confusion comes from this line from the IEnumerator documentation:
Initially, the enumerator is positioned before the first element in
the collection. The Reset method also brings the enumerator back to
this position. After an enumerator is created or the Reset method is
called, you must call the MoveNext method to advance the enumerator to
the first element of the collection before reading the value of
Current; otherwise, Current is undefined.
Your implementation has its current position at 0 initially, instead of -1, causing the strange behavior. Your enumerator begins with Current on the first element instead of being before it.
It is pretty rare for people to use that API directly. More commonly, it is simply used via the foreach statement, i.e.
foreach(var value in someEnumerable) { ... }
where someEnumerable implements IEnumerable, IEnumerable<T> or just the duck-typed pattern. Your class ogr certainly isn't an IEnumerator, and shouldn't be made to try to act like one.
If the intend is for ogr to be enumerable, then:
public ogr : IEnumerable {
IEnumerator IEnumerable.GetEnumerator() {
return array_.GetEnumerator();
}
}
I suspect it would be better to be IEnumerable<string>, though, using List<string> as the backing list:
public SomeType : IEnumerable<string> {
private readonly List<string> someField = new List<string>();
public IEnumerator<string> GetEnumerator()
{ return someField.GetEnumerator(); }
IEnumerator IEnumerable.GetEnumerator()
{ return someField.GetEnumerator(); }
}
Provided items is the result of a LINQ expression:
var items = from item in ItemsSource.RetrieveItems()
where ...
Suppose generation of each item takes some non-negligeble time.
Two modes of operation are possible:
Using foreach would allow to start working with items in the beginning of the collection much sooner than whose in the end become available. However if we wanted to later process the same collection again, we'll have to copy save it:
var storedItems = new List<Item>();
foreach(var item in items)
{
Process(item);
storedItems.Add(item);
}
// Later
foreach(var item in storedItems)
{
ProcessMore(item);
}
Because if we'd just made foreach(... in items) then ItemsSource.RetrieveItems() would get called again.
We could use .ToList() right upfront, but that would force us wait for the last item to be retrieved before we could start processing the first one.
Question: Is there an IEnumerable implementation that would iterate first time like regular LINQ query result, but would materialize in process so that second foreach would iterate over stored values?
A fun challenge so I have to provide my own solution. So fun in fact that my solution now is in version 3. Version 2 was a simplification I made based on feedback from Servy. I then realized that my solution had huge drawback. If the first enumeration of the cached enumerable didn't complete no caching would be done. Many LINQ extensions like First and Take will only enumerate enough of the enumerable to get the job done and I had to update to version 3 to make this work with caching.
The question is about subsequent enumerations of the enumerable which does not involve concurrent access. Nevertheless I have decided to make my solution thread safe. It adds some complexity and a bit of overhead but should allow the solution to be used in all scenarios.
public static class EnumerableExtensions {
public static IEnumerable<T> Cached<T>(this IEnumerable<T> source) {
if (source == null)
throw new ArgumentNullException("source");
return new CachedEnumerable<T>(source);
}
}
class CachedEnumerable<T> : IEnumerable<T> {
readonly Object gate = new Object();
readonly IEnumerable<T> source;
readonly List<T> cache = new List<T>();
IEnumerator<T> enumerator;
bool isCacheComplete;
public CachedEnumerable(IEnumerable<T> source) {
this.source = source;
}
public IEnumerator<T> GetEnumerator() {
lock (this.gate) {
if (this.isCacheComplete)
return this.cache.GetEnumerator();
if (this.enumerator == null)
this.enumerator = source.GetEnumerator();
}
return GetCacheBuildingEnumerator();
}
public IEnumerator<T> GetCacheBuildingEnumerator() {
var index = 0;
T item;
while (TryGetItem(index, out item)) {
yield return item;
index += 1;
}
}
bool TryGetItem(Int32 index, out T item) {
lock (this.gate) {
if (!IsItemInCache(index)) {
// The iteration may have completed while waiting for the lock.
if (this.isCacheComplete) {
item = default(T);
return false;
}
if (!this.enumerator.MoveNext()) {
item = default(T);
this.isCacheComplete = true;
this.enumerator.Dispose();
return false;
}
this.cache.Add(this.enumerator.Current);
}
item = this.cache[index];
return true;
}
}
bool IsItemInCache(Int32 index) {
return index < this.cache.Count;
}
IEnumerator IEnumerable.GetEnumerator() {
return GetEnumerator();
}
}
The extension is used like this (sequence is an IEnumerable<T>):
var cachedSequence = sequence.Cached();
// Pulling 2 items from the sequence.
foreach (var item in cachedSequence.Take(2))
// ...
// Pulling 2 items from the cache and the rest from the source.
foreach (var item in cachedSequence)
// ...
// Pulling all items from the cache.
foreach (var item in cachedSequence)
// ...
There is slight leak if only part of the enumerable is enumerated (e.g. cachedSequence.Take(2).ToList(). The enumerator that is used by ToList will be disposed but the underlying source enumerator is not disposed. This is because the first 2 items are cached and the source enumerator is kept alive should requests for subsequent items be made. In that case the source enumerator is only cleaned up when eligigble for garbage Collection (which will be the same time as the possibly large cache).
Take a look at the Reactive Extentsions library - there is a MemoizeAll() extension which will cache the items in your IEnumerable once they're accessed, and store them for future accesses.
See this blog post by Bart De Smet for a good read on MemoizeAll and other Rx methods.
Edit: This is actually found in the separate Interactive Extensions package now - available from NuGet or Microsoft Download.
public static IEnumerable<T> SingleEnumeration<T>(this IEnumerable<T> source)
{
return new SingleEnumerator<T>(source);
}
private class SingleEnumerator<T> : IEnumerable<T>
{
private CacheEntry<T> cacheEntry;
public SingleEnumerator(IEnumerable<T> sequence)
{
cacheEntry = new CacheEntry<T>(sequence.GetEnumerator());
}
public IEnumerator<T> GetEnumerator()
{
if (cacheEntry.FullyPopulated)
{
return cacheEntry.CachedValues.GetEnumerator();
}
else
{
return iterateSequence<T>(cacheEntry).GetEnumerator();
}
}
IEnumerator IEnumerable.GetEnumerator()
{
return this.GetEnumerator();
}
}
private static IEnumerable<T> iterateSequence<T>(CacheEntry<T> entry)
{
using (var iterator = entry.CachedValues.GetEnumerator())
{
int i = 0;
while (entry.ensureItemAt(i) && iterator.MoveNext())
{
yield return iterator.Current;
i++;
}
}
}
private class CacheEntry<T>
{
public bool FullyPopulated { get; private set; }
public ConcurrentQueue<T> CachedValues { get; private set; }
private static object key = new object();
private IEnumerator<T> sequence;
public CacheEntry(IEnumerator<T> sequence)
{
this.sequence = sequence;
CachedValues = new ConcurrentQueue<T>();
}
/// <summary>
/// Ensure that the cache has an item a the provided index. If not, take an item from the
/// input sequence and move to the cache.
///
/// The method is thread safe.
/// </summary>
/// <returns>True if the cache already had enough items or
/// an item was moved to the cache,
/// false if there were no more items in the sequence.</returns>
public bool ensureItemAt(int index)
{
//if the cache already has the items we don't need to lock to know we
//can get it
if (index < CachedValues.Count)
return true;
//if we're done there's no race conditions hwere either
if (FullyPopulated)
return false;
lock (key)
{
//re-check the early-exit conditions in case they changed while we were
//waiting on the lock.
//we already have the cached item
if (index < CachedValues.Count)
return true;
//we don't have the cached item and there are no uncached items
if (FullyPopulated)
return false;
//we actually need to get the next item from the sequence.
if (sequence.MoveNext())
{
CachedValues.Enqueue(sequence.Current);
return true;
}
else
{
FullyPopulated = true;
return false;
}
}
}
}
So this has been edited (substantially) to support multithreaded access. Several threads can ask for items, and on an item by item basis, they will be cached. It doesn't need to wait for the entire sequence to be iterated for it to return cached values. Below is a sample program that demonstrates this:
private static IEnumerable<int> interestingIntGenertionMethod(int maxValue)
{
for (int i = 0; i < maxValue; i++)
{
Thread.Sleep(1000);
Console.WriteLine("actually generating value: {0}", i);
yield return i;
}
}
public static void Main(string[] args)
{
IEnumerable<int> sequence = interestingIntGenertionMethod(10)
.SingleEnumeration();
int numThreads = 3;
for (int i = 0; i < numThreads; i++)
{
int taskID = i;
Task.Factory.StartNew(() =>
{
foreach (int value in sequence)
{
Console.WriteLine("Task: {0} Value:{1}",
taskID, value);
}
});
}
Console.WriteLine("Press any key to exit...");
Console.ReadKey(true);
}
You really need to see it run to understand the power here. As soon as a single thread forces the next actual values to be generated all of the remaining threads can immediately print that generated value, but they will all be waiting if there are no uncached values for that thread to print. (Obviously thread/threadpool scheduling may result in one task taking longer to print it's value than needed.)
There have already been posted thread-safe implementations of the Cached/SingleEnumeration operator by Martin Liversage and Servy respectively, and the thread-safe Memoise operator from the System.Interactive package is also available. In case thread-safety is not a requirement, and paying the cost of thread-synchronization is undesirable, there are answers offering unsynchronized ToCachedEnumerable implementations in this question. All these implementations have in common that they are based on custom types. My challenge was to write a similar not-synchronized operator in a single self-contained extension method (no strings attached). Here is my implementation:
public static IEnumerable<T> MemoiseNotSynchronized<T>(this IEnumerable<T> source)
{
// Argument validation omitted
IEnumerator<T> enumerator = null;
List<T> buffer = null;
return Implementation();
IEnumerable<T> Implementation()
{
if (buffer != null && enumerator == null)
{
// The source has been fully enumerated
foreach (var item in buffer) yield return item;
yield break;
}
enumerator ??= source.GetEnumerator();
buffer ??= new();
for (int i = 0; ; i = checked(i + 1))
{
if (i < buffer.Count)
{
yield return buffer[i];
}
else if (enumerator.MoveNext())
{
Debug.Assert(buffer.Count == i);
var current = enumerator.Current;
buffer.Add(current);
yield return current;
}
else
{
enumerator.Dispose(); enumerator = null;
yield break;
}
}
}
}
Usage example:
IEnumerable<Point> points = GetPointsFromDB().MemoiseNotSynchronized();
// Enumerate the 'points' any number of times, on a single thread.
// The data will be fetched from the DB only once.
// The connection with the DB will open when the 'points' is enumerated
// for the first time, partially or fully.
// The connection will stay open until the 'points' is enumerated fully
// for the first time.
Testing the MemoiseNotSynchronized operator on Fiddle.
I have an interface that, among other things, implements a "public IEnumerator GetEnumerator()" method, so I can use the interface in a foreach statement.
I implement this interface in several classes and in one of them, I want to return an empty IEnumerator. Right now I do this the following way:
public IEnumerator GetEnumerator()
{
ArrayList arr = new ArrayList();
return arr.GetEnumerator();
}
However I consider this an ugly hack, and I can't help but think that there is a better way of returning an empty IEnumerator. Is there?
This is simple in C# 2:
public IEnumerator GetEnumerator()
{
yield break;
}
You need the yield break statement to force the compiler to treat it as an iterator block.
This will be less efficient than a "custom" empty iterator, but it's simpler code...
There is an extra function in the framework:
public static class Enumerable
{
public static IEnumerable<TResult> Empty<TResult>();
}
Using this you can write:
var emptyEnumerable = Enumerable.Empty<int>();
var emptyEnumerator = Enumerable.Empty<int>().GetEnumerator();
You could implement a dummy class that implements IEnumerator, and return an instance of it:
class DummyEnumerator : IEnumerator
{
public object Current
{
get
{
throw new InvalidOperationException();
}
}
public bool MoveNext()
{
return false;
}
public void Reset()
{
}
}
I was curious and went a bit further. I made a test that checks how efficient the methods are comparing yield break, Enumerable.Emtpy and custom class.
You can check it out on dotnetfiddle https://dotnetfiddle.net/p5ZkUN or use the code below.
The result of one of the many dotnetfiddle runs using 190 000 iterations was:
Yield break: 00:00:00.0012208
Enumerable.Empty(): 00:00:00.0007815
EmptyEnumerator instance: 00:00:00.0010226
using System;
using System.Diagnostics;
using System.Collections;
using System.Linq;
public class Program
{
private const int Iterations = 190000;
public static void Main()
{
var sw = new Stopwatch();
IEnumerator enumerator1 = YieldBreak();
sw.Start();
for (int i = 0; i < Iterations; i++)
{
while(enumerator1.MoveNext())
{
throw new InvalidOperationException("Should not occur");
}
}
sw.Stop();
Console.WriteLine("Yield break: {0}", sw.Elapsed);
GC.Collect();
IEnumerator enumerator2 = Enumerable.Empty<object>().GetEnumerator();
sw.Restart();
for (int i = 0; i < Iterations; i++)
{
while(enumerator2.MoveNext())
{
throw new InvalidOperationException("Should not occur");
}
}
sw.Stop();
Console.WriteLine("Enumerable.Empty<T>(): {0}", sw.Elapsed);
GC.Collect();
var enumerator3 = new EmptyEnumerator();
sw.Restart();
for (int i = 0; i < Iterations; i++)
{
while(enumerator3.MoveNext())
{
throw new InvalidOperationException("Should not occur");
}
}
sw.Stop();
Console.WriteLine("EmptyEnumerator instance: {0}", sw.Elapsed);
}
public static IEnumerator YieldBreak()
{
yield break;
}
private class EmptyEnumerator : IEnumerator
{
//public static readonly EmptyEnumerator Instance = new EmptyEnumerator();
public bool MoveNext()
{
return false;
}
public void Reset()
{
}
public object Current { get { return null; } }
}
}
The way I use is to use the enumerator of an empty array:
public IEnumerator GetEnumerator() {
return new object[0].GetEnumerator();
}
It can also be used for generic IEnumerator or IEnumerable (use an array of the appropriate type)
You can implement IEnumerator interface and IEnumerable, and return false from MoveNext function of IEnumerable interfase
private class EmptyEnumerator : IEnumerator
{
public EmptyEnumerator()
{
}
#region IEnumerator Members
public void Reset() { }
public object Current
{
get
{
throw new InvalidOperationException();
}
}
public bool MoveNext()
{ return false; }
}
public class EmptyEnumerable : IEnumerable
{
public IEnumerator GetEnumerator()
{
return new EmptyEnumerator();
}
}
I wrote it like this:
public IEnumerator<T> GetEnumerator()
{
return this.source?.GetEnumerator() ??
Enumerable.Empty<T>().GetEnumerator();
}
You can make a NullEnumerator which implements the IEnumerator interface. You can just pass an instance off the NullEnumerator.
here is an example of an EmptyEnumerator
Found this question looking for the simplest way to get an empty enumerator. After seeing the answer comparing performance I decided to use the empty enumerator class solution, but mine is more compact than the other examples, and is a generic type, and also provides a default instance so you don't have to create new instances all the time, which should even further improve performance.
class EmptyEnumerator<T> : IEnumerator<T>
{
public readonly static EmptyEnumerator<T> value = new EmptyEnumerator<T>();
public T Current => throw new InvalidOperationException();
object IEnumerator.Current => throw new InvalidOperationException();
public void Dispose() { }
public bool MoveNext() => false;
public void Reset() { }
}