Manually increment an enumerator inside foreach loop - c#

I have a nested while loop inside a foreach loop where I would like to advance the enumerator indefinitely while a certain condition is met. To do this, I tried casting the enumerator to IEnumerator<T> (which it must be if it is in a foreach loop) and then calling MoveNext() on the cast object, but I get an error saying it cannot be converted.
Cannot convert type 'System.DateTime' to System.Collections.Generic.IEnumerator via a reference conversion, boxing conversion, unboxing conversion, wrapping conversion, or null type conversion.
foreach (DateTime time in times)
{
while (condition)
{
// perform action
// move to next item
(time as IEnumerator<DateTime>).MoveNext(); // will not let me do this
}
// code to execute after while condition is met
}
What is the best way to manually increment the IEnumerator inside of the foreach loop?
EDIT:
Edited to show there is code after the while loop that I would like executed once the condition is met which is why I wanted to manually increment inside the while then break out of it as opposed to continue which would put me back at the top. If this isn't possible I believe the best thing is to redesign how I am doing it.

Many of the other answers recommend using continue, which may very well help you do what you need to do. However, in the interests of showing manually moving the enumerator, first you must have the enumerator, and that means writing your loop as a while.
using (var enumerator = times.GetEnumerator())
{
DateTime time;
while (enumerator.MoveNext())
{
time = enumerator.Current;
// pre-condition code
while (condition)
{
if (enumerator.MoveNext())
{
time = enumerator.Current;
// condition code
}
else
{
condition = false;
}
}
// post-condition code
}
}
From your comments:
How can the foreach loop advance it if it doesn't implement the IEnumerator interface?
In your loop, time is a DateTime. It is not the object that needs to implement an interface or pattern to work in the loop. times is a sequence of DateTime values, and it is the one that must implement the enumerable pattern. This is generally fulfilled by implementing the IEnumerable<T> and IEnumerable interfaces, which simply require IEnumerator<T> GetEnumerator() and IEnumerator GetEnumerator() methods. Those methods return an object implementing IEnumerator<T> and IEnumerator, which define a bool MoveNext() method and a T or object Current property. But time cannot be cast to IEnumerator, because it is no such thing, and neither is the times sequence: times is enumerable, and the enumerator is the separate object that GetEnumerator() returns.
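As a rough sketch of what a foreach over an IEnumerable<DateTime> amounts to (the real compiler expansion has more special cases, and the ForeachExpansion and Collect names are invented here for illustration), the loop is approximately equivalent to fetching the enumerator once and calling MoveNext() until it returns false:

```csharp
using System;
using System.Collections.Generic;

public static class ForeachExpansion
{
    // Hypothetical helper: walks the sequence the way foreach would
    // and returns every item it saw, in order.
    public static List<DateTime> Collect(IEnumerable<DateTime> times)
    {
        var seen = new List<DateTime>();
        // The enumerator is obtained from the sequence, not from an item.
        using (IEnumerator<DateTime> enumerator = times.GetEnumerator())
        {
            while (enumerator.MoveNext())
            {
                // This local plays the role of the foreach loop variable.
                DateTime time = enumerator.Current;
                seen.Add(time);
            }
        }
        return seen;
    }
}
```

The loop variable is just a copy of Current, which is why it carries no reference back to the enumerator.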

You cannot access or advance the enumerator from inside the foreach loop; the language does not expose it. You need to use the continue statement in order to advance to the next iteration of a loop.
However, I'm not convinced that your loop even needs a continue. Read on.
In the context of your code you would need to convert the while to an if in order to make the continue refer to the foreach block.
foreach (DateTime time in times)
{
if (condition)
{
// perform action
continue;
}
// code to execute if condition is not met
}
But written like this it is clear that the following equivalent variant is simpler still
foreach (DateTime time in times)
{
if (condition)
{
// perform action
}
else
{
// code to execute if condition is not met
}
}
This is equivalent to your pseudo-code because the part marked code to execute after while condition is met is executed for each item for which condition is false.
My assumption in all of this is that condition is evaluated for each item in the list.

Perhaps you can use continue?

You would use the continue statement:
continue;

This is just a guess, but it sounds like what you're trying to do is take a list of datetimes and move past all of them which meet a certain criterion, then perform an action on the rest of the list. If that's what you're trying to do, you probably want something like SkipWhile() from System.Linq. For example, the following code takes a series of datetimes and skips past all of them which are on or before the cutoff date; then it prints out the remaining datetimes:
var times = new List<DateTime>()
{
DateTime.Now.AddDays(1), DateTime.Now.AddDays(2), DateTime.Now.AddDays(3), DateTime.Now.AddDays(4)
};
var cutoff = DateTime.Now.AddDays(2);
var timesAfterCutoff = times.SkipWhile(datetime => datetime.CompareTo(cutoff) < 1)
.Select(datetime => datetime);
foreach (var dateTime in timesAfterCutoff)
{
Console.WriteLine(dateTime);
}
Console.ReadLine();
Is that the sort of thing you're trying to do?
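One detail worth knowing is that SkipWhile() only skips the leading run of matching items; once the predicate returns false, everything after that point is kept, even items that would match again. A minimal sketch, using ints instead of dates so the result is deterministic (the SkipWhileDemo class and AfterLeadingRun method are made-up names):

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SkipWhileDemo
{
    // Hypothetical helper: drops leading values that are <= cutoff,
    // then returns the remainder of the sequence unchanged.
    public static List<int> AfterLeadingRun(IEnumerable<int> values, int cutoff)
    {
        return values.SkipWhile(v => v <= cutoff).ToList();
    }
}
```

For the input 1, 2, 5, 1, 6 with a cutoff of 2, this returns 5, 1, 6: the second 1 is kept because skipping stopped at 5.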

I definitely do not condone what I am about to suggest, but you can create a wrapper around the original IEnumerable to transform it into something that returns items which can be used to navigate the underlying enumerator. The end result might look like the following.
public static void Main(string[] args)
{
IEnumerable<DateTime> times = GetTimes();
foreach (var step in times.StepWise())
{
while (condition)
{
step.MoveNext();
}
Console.WriteLine(step.Current);
}
}
Then we need to create our StepWise extension method.
public static class EnumerableExtension
{
public static IEnumerable<Step<T>> StepWise<T>(this IEnumerable<T> instance)
{
using (IEnumerator<T> enumerator = instance.GetEnumerator())
{
while (enumerator.MoveNext())
{
yield return new Step<T>(enumerator);
}
}
}
public struct Step<T>
{
private IEnumerator<T> enumerator;
public Step(IEnumerator<T> enumerator)
{
this.enumerator = enumerator;
}
public bool MoveNext()
{
return enumerator.MoveNext();
}
public T Current
{
get { return enumerator.Current; }
}
}
}

You could use a func as your iterator and keep the state that you are changing in that delegate to be evaluated each iteration.
public static IEnumerable<T> FunkyIEnumerable<T>(this Func<Tuple<bool, T>> nextOrNot)
{
while(true)
{
var result = nextOrNot();
if(result.Item1)
yield return result.Item2;
else
break;
}
yield break;
}
Func<Tuple<bool, int>> nextNumber = () =>
Tuple.Create(SomeRemoteService.CanIContinueToSendNumbers(), 1);
foreach(var justGonnaBeOne in nextNumber.FunkyIEnumerable())
Console.WriteLine(justGonnaBeOne.ToString());

One alternative not yet mentioned is to have an enumerator return a wrapper object which allows access to itself in addition to the data element being enumerated. For example:
struct ControllableEnumeratorItem<T>
{
private ControllableEnumerator<T> parent;
public T Value {get {return parent.Value;}}
public bool MoveNext() {return parent.MoveNext();}
public ControllableEnumeratorItem(ControllableEnumerator<T> newParent)
{parent = newParent;}
}
This approach could also be used by data structures that want to allow collections to be modified in controlled fashion during enumeration (e.g. by including "DeleteCurrentItem", "AddBeforeCurrentItem", and "AddAfterCurrentItem" methods).

Related

Does IEnumerable<T> store a function to be called later?

I recently came across some code that does not behave how I would have expected.
1: int[] numbers = { 1, 2, 3, 4, 5, 6, 7, 8 };
2: IEnumerable<int> result = numbers.Select(n => n % 2 == 0 ? n : 0);
3:
4: int a = result.ElementAt(0);
5: numbers[0] = 10;
6: int b = result.ElementAt(0);
When I stepped through this code with Visual Studio, I was surprised to see that the yellow highlighting jumped from line 4 back to the lambda expression on line 2, then again from line 6 to the lambda on line 2.
Moreover, the value of a after running this code is 0 and the value of b is 10.
The original code that made me realize that this could/would happen involved a method call within the Select(), and accessing any property or specific element of the IEnumerable resulted in the method within Select() being called again and again.
// The following code prints out:
// Doing something... 1
// Doing something... 5
// Doing something... 1
// Doing something... 2
// Doing something... 3
// Doing something... 4
// Doing something... 5
using System;
using System.Linq;
using System.Collections.Generic;
class Program
{
static void Main(string[] args)
{
int[] numbers = { 1, 2, 3, 4, 5 };
IEnumerable<int> result = numbers.Select(DoSomething);
int a = result.ElementAt(0);
int b = result.ElementAt(4);
int c = result.Count();
}
static int DoSomething(int x)
{
Console.WriteLine("Doing something... " + x);
return x;
}
}
I feel like I now understand how the code will behave (and I've found other questions online that are the result of this behavior). However, what exactly causes the code within the Select() to be called from later lines?
You have a reference to a LINQ query, which is evaluated each time you iterate over it.
From the docs (you can see this is called Deferred Execution):
As stated previously, the query variable itself only stores the query commands. The actual execution of the query is deferred until you iterate over the query variable in a foreach statement. This concept is referred to as deferred execution
...
Because the query variable itself never holds the query results, you can execute it as often as you like. For example, you may have a database that is being updated continually by a separate application. In your application, you could create one query that retrieves the latest data, and you could execute it repeatedly at some interval to retrieve different results every time.
So, when you have
IEnumerable<int> result = numbers.Select(DoSomething);
You have a reference to a query that will transform each element in numbers to the result of DoSomething.
So, you could say that the following:
int a = result.ElementAt(0);
iterates result up to the first element. The same happens for ElementAt(4), but this time it goes to the fifth element. Notice that only Doing something... 5 is printed for that call: in the run that produced this output, the selector was evaluated only for the element actually requested, not for the four elements before it.
The call would fail if the query, at that moment, couldn't produce 5 items.
The .Count() call again iterates the result query and returns the number of elements at that moment.
If instead of keeping the reference to the query, you kept a reference to the results, i.e:
IEnumerable<int> result = numbers.Select(DoSomething).ToArray();
// or
IEnumerable<int> result = numbers.Select(DoSomething).ToList();
You would only see this output:
// Doing something... 1
// Doing something... 2
// Doing something... 3
// Doing something... 4
// Doing something... 5
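The difference can be made measurable by counting selector invocations. The following is only a sketch (the DeferredExecutionDemo class and CountSelectorCalls method are names invented for this example): it iterates the result twice, once with the query kept deferred and once after materializing it with ToList().

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Hypothetical helper: returns how many times the selector ran
    // after the result was iterated twice.
    public static int CountSelectorCalls(bool materialize)
    {
        int calls = 0;
        int[] numbers = { 1, 2, 3, 4, 5 };

        // Nothing runs here; the query only remembers the selector.
        IEnumerable<int> result = numbers.Select(value => { calls++; return value; });

        if (materialize)
        {
            // Runs the selector once per element and keeps the results.
            result = result.ToList();
        }

        foreach (int item in result) { }
        foreach (int item in result) { }
        return calls;
    }
}
```

With materialize set to false the selector runs once per element on every pass (10 calls for two passes over five elements); with true it runs only during ToList() (5 calls).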
Let's break this down piece by piece. Trust me; take your time reading this, and it should make enumerable types much clearer and answer your question.
Look at the IEnumerable interface which is the base of IEnumerable<T>. It contains one method; IEnumerator GetEnumerator();.
Enumerables are a tricky beast because they can do whatever they want. All that really matters is the call to the GetEnumerator() that happens automatically in a foreach loop; or you can do it manually.
What does GetEnumerator() do? It returns another interface, IEnumerator.
This is the magic. The IEnumerator has 1 property and 2 methods.
object Current { get; }
bool MoveNext();
void Reset();
Let's break down the magic.
First let me explain what they are typically, and I say typically because like I mentioned it can be a tricky beast. You're allowed to implement this however you choose... Some types don't follow the standards.
object Current { get; } is obvious. It gets the current object in the IEnumerator; by default this might be null.
bool MoveNext(); This returns true if there is another object in the IEnumerator and it should set the Current value to that new object.
void Reset(); tells the type to start over from the beginning.
Now let's implement this. Please take the time to review this IEnumerator type so that you understand it. Realize that when you reference an IEnumerable type you are not referencing the IEnumerator itself (this class); you're referencing a type that returns this IEnumerator via GetEnumerator().
Note: Be careful not to confuse the names. IEnumerator is different from IEnumerable.
IEnumerator
public class MyEnumerator : IEnumerator
{
private string First => nameof(First);
private string Second => nameof(Second);
private string Third => nameof(Third);
private int counter = 0;
public object Current { get; private set; }
public bool MoveNext()
{
if (counter > 2) return false;
counter++;
switch (counter)
{
case 1:
Current = First;
break;
case 2:
Current = Second;
break;
case 3:
Current = Third;
break;
}
return true;
}
public void Reset()
{
counter = 0;
}
}
Now, let's make an IEnumerable type and use this IEnumerator.
IEnumerable
public class MyEnumerable : IEnumerable
{
public IEnumerator GetEnumerator() => new MyEnumerator();
}
This is something to soak in... When you make a call like numbers.Select(n => n % 2 == 0 ? n : 0) you aren't iterating any items... you're returning a type much like the one above. .Select(…) returns IEnumerable<int>. Well looky above... IEnumerable is nothing but an interface that exposes GetEnumerator(). That method gets called whenever you enter a looping situation, or it can be called manually. So, with that in mind you can already see the iteration never starts until you call GetEnumerator(), and even then it never starts until you call the MoveNext() method of the result of GetEnumerator(), which is the IEnumerator type.
So...
In other words, you just have a reference to an IEnumerable<T> in your call and nothing more. No iterations have taken place. This is why the debugger jumps back up in your code: it finally does iterate in the ElementAt method, and it's then evaluating the lambda expression. Stay with me and I'll later update an example to take this lesson full circle, but for now let's continue our simple example:
Let's now make a simple console app to test our new types.
Console App
class Program
{
static void Main(string[] args)
{
var myEnumerable = new MyEnumerable();
foreach (var item in myEnumerable)
Console.WriteLine(item);
Console.ReadKey();
}
// OUTPUT
// First
// Second
// Third
}
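For comparison, an iterator method using yield return produces the same three values without a hand-written enumerator class; the compiler generates the MoveNext()/Current state machine for you. This is only a sketch, and the SequenceSketch and FirstSecondThird names are invented for illustration:

```csharp
using System.Collections.Generic;

public static class SequenceSketch
{
    // Each yield return corresponds to one successful MoveNext() call
    // on the enumerator the compiler generates for this method.
    public static IEnumerable<string> FirstSecondThird()
    {
        yield return "First";
        yield return "Second";
        yield return "Third";
    }
}
```

A foreach over SequenceSketch.FirstSecondThird() visits the same items as the loop over MyEnumerable, in the same order.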
Now let's do the same thing but make it generic. I won't write as much but monitor the code closely for changes and you'll get it.
I'm going to copy and paste it all in one.
Entire Console App
using System;
using System.Collections;
using System.Collections.Generic;
namespace Question_Answer_Console_App
{
class Program
{
static void Main(string[] args)
{
var myEnumerable = new MyEnumerable<Person>();
foreach (var person in myEnumerable)
Console.WriteLine(person.Name);
Console.ReadKey();
}
// OUTPUT
// Test 0
// Test 1
// Test 2
}
public class Person
{
static int personCounter = 0;
public string Name { get; } = "Test " + personCounter++;
}
public class MyEnumerator<T> : IEnumerator<T>
{
private T First { get; set; }
private T Second { get; set; }
private T Third { get; set; }
private int counter = 0;
object IEnumerator.Current => Current;
public T Current { get; private set; }
public bool MoveNext()
{
if (counter > 2) return false;
counter++;
switch (counter)
{
case 1:
First = Activator.CreateInstance<T>();
Current = First;
break;
case 2:
Second = Activator.CreateInstance<T>();
Current = Second;
break;
case 3:
Third = Activator.CreateInstance<T>();
Current = Third;
break;
}
return true;
}
public void Reset()
{
counter = 0;
First = default;
Second = default;
Third = default;
}
public void Dispose() => Reset();
}
public class MyEnumerable<T> : IEnumerable<T>
{
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
public IEnumerator<T> GetEnumerator() => new MyEnumerator<T>();
}
}
So let's recap... IEnumerable<T> is a type that has a method that returns an IEnumerator<T> type. The IEnumerator<T> type has the T Current { get; } property as well as the IEnumerator methods.
Let's break this down one more time in code and call out the pieces manually so that you can see it clearer. This will be only the console part of the app because everything else stays the same.
Console App
class Program
{
static void Main(string[] args)
{
IEnumerable<Person> enumerable = new MyEnumerable<Person>();
IEnumerator<Person> enumerator = enumerable.GetEnumerator();
while (enumerator.MoveNext())
Console.WriteLine(enumerator.Current.Name);
Console.ReadKey();
}
// OUTPUT
// Test 0
// Test 1
// Test 2
}
FYI: One thing to point out about the answer above is that there are two flavors of LINQ. LINQ in EF or LINQ-to-SQL uses different extension methods than typical LINQ (LINQ to Objects). The main difference is that a query expression against a database returns IQueryable<T>, which implements the IQueryable interface and builds an expression that is translated to SQL and run against the database. In other words... something like a .Where(…) clause doesn't fetch the entire table and then iterate over it. It turns that expression into a SQL expression. That's why things like .Equals() may not work in those specific lambda expressions.
Does IEnumerable<T> store a function to be called later?
Yes. An IEnumerable is exactly what it says it is. It's something which can be enumerated through at some future point. You can think of it like setting up a pipeline of operations.
It's not until it is actually enumerated (i.e. by foreach, .ElementAt(), ToList(), etc.) that any of those operations are actually invoked. This is called deferred execution.
what exactly causes the code within the Select() to be called from later lines?
When you call SomeEnumerable.Select(SomeOperation), the result is an IEnumerable which is an object representing that "pipeline" which you have set up. The implementation of that IEnumerable does store the function which you passed to it. The actual source for this (for .net core) is here. You can see that SelectEnumerableIterator, SelectListIterator, and SelectArrayIterator all have a Func<TSource, TResult> as a private field. This is where it stores the function you specified for later use. The array and list iterators just provide some shortcuts if you know you're iterating through a finite collection.
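To make the idea concrete, here is a heavily simplified sketch of such a type. It is not the framework's implementation; SelectSketch is a made-up class that only shows the shape: the source and the delegate are stored in fields, and the delegate is invoked only while someone enumerates.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for what Select() returns.
public sealed class SelectSketch<TSource, TResult> : IEnumerable<TResult>
{
    private readonly IEnumerable<TSource> _source;
    private readonly Func<TSource, TResult> _selector; // stored for later use

    public SelectSketch(IEnumerable<TSource> source, Func<TSource, TResult> selector)
    {
        _source = source ?? throw new ArgumentNullException(nameof(source));
        _selector = selector ?? throw new ArgumentNullException(nameof(selector));
    }

    public IEnumerator<TResult> GetEnumerator()
    {
        // The selector runs here, once per item, each time the
        // sequence is enumerated.
        foreach (TSource item in _source)
        {
            yield return _selector(item);
        }
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

Because the selector runs at enumeration time, a change made to the source array after the SelectSketch is constructed is visible the next time it is enumerated, which mirrors the a/b behaviour in the question.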

C# yield return performance

How much space is reserved for the underlying collection behind a method using yield return syntax WHEN I PERFORM a ToList() on it? Is there a chance it will reallocate, and thus perform worse than the standard approach where I create a list with a predefined capacity?
The two scenarios:
public IEnumerable<T> GetList1()
{
foreach( var item in collection )
yield return item.Property;
}
public IEnumerable<T> GetList2()
{
List<T> outputList = new List<T>( collection.Count() );
foreach( var item in collection )
outputList.Add( item.Property );
return outputList;
}
yield return does not create an array that has to be resized, like what List does; instead, it creates an IEnumerable with a state machine.
For instance, let's take this method:
public static IEnumerable<int> Foo()
{
Console.WriteLine("Returning 1");
yield return 1;
Console.WriteLine("Returning 2");
yield return 2;
Console.WriteLine("Returning 3");
yield return 3;
}
Now let's call it and assign that enumerable to a variable:
var elems = Foo();
None of the code in Foo has executed yet. Nothing will be printed on the console. But if we iterate over it, like this:
foreach(var elem in elems)
{
Console.WriteLine( "Got " + elem );
}
On the first iteration of the foreach loop, the Foo method will be executed until the first yield return. Then, on the second iteration, the method will "resume" from where it left off (right after the yield return 1), and execute until the next yield return. Same for all subsequent elements.
At the end of the loop, the console will look like this:
Returning 1
Got 1
Returning 2
Got 2
Returning 3
Got 3
This means you can write methods like this:
public static IEnumerable<int> GetAnswers()
{
while( true )
{
yield return 42;
}
}
You can call the GetAnswers method, and every time you request an element, it'll give you 42; the sequence never ends. You couldn't do this with a List, because lists have to have a finite size.
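An endless sequence like this is only safe to consume if the caller limits how much it takes. As a small sketch (the InfiniteSequenceDemo class and FirstThree method are names chosen for this example), Take() stops requesting elements after the given count, so the while (true) loop never runs away:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class InfiniteSequenceDemo
{
    public static IEnumerable<int> GetAnswers()
    {
        while (true)
        {
            yield return 42;
        }
    }

    // Only three elements are ever produced, because Take(3) stops
    // asking the iterator for more after the third one.
    public static List<int> FirstThree()
    {
        return GetAnswers().Take(3).ToList();
    }
}
```

Calling GetAnswers().ToList() without a limit, by contrast, would never finish.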
How much space is reserved to the underlying collection behind a method using yield return syntax?
There's no underlying collection.
There's an object, but it isn't a collection. Just how much space it will take up depends on what it needs to keep track of.
There's a chance it will reallocate
No.
And thus decrease performance if compared to the standard approach where i create a list with predefined capacity?
It will almost certainly take up less memory than creating a list with a predefined capacity.
Let's try a manual example. Say we had the following code:
public static IEnumerable<int> CountToTen()
{
for(var i = 1; i != 11; ++i)
yield return i;
}
To foreach through this will iterate through the numbers 1 to 10 inclusive.
Now let's do this the way we would have to if yield did not exist. We'd do something like:
private class CountToTenEnumerator : IEnumerator<int>
{
private int _current;
public int Current
{
get
{
if(_current == 0)
throw new InvalidOperationException();
return _current;
}
}
object IEnumerator.Current
{
get { return Current; }
}
public bool MoveNext()
{
if(_current == 10)
return false;
_current++;
return true;
}
public void Reset()
{
throw new NotSupportedException();
// We *could* just set _current back, but the object produced by
// yield won't do that, so we'll match that.
}
public void Dispose()
{
}
}
private class CountToTenEnumerable : IEnumerable<int>
{
public IEnumerator<int> GetEnumerator()
{
return new CountToTenEnumerator();
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
public static IEnumerable<int> CountToTen()
{
return new CountToTenEnumerable();
}
Now, for a variety of reasons this is quite different to the code you're likely to get from the version using yield, but the basic principle is the same. As you can see there are two allocations involved of objects (same number as if we had a collection and then did a foreach on that) and the storage of a single int. In practice we can expect yield to store a few more bytes than that, but not a lot.
Edit: yield actually does a trick where the first GetEnumerator() call on the same thread that obtained the object returns that same object, doing double service for both cases. Since this covers over 99% of use cases yield actually does one allocation rather than two.
Now let's look at:
public IEnumerable<T> GetList1()
{
foreach( var item in collection )
yield return item.Property;
}
While this would result in more memory used than just return collection, it won't result in a lot more; the only thing the enumerator produced really needs to keep track of is the enumerator produced by calling GetEnumerator() on collection and then wrapping that.
This is going to be massively less memory than that of the wasteful second approach you mention, and much faster to get going.
Edit:
You've changed your question to include "syntax WHEN I PERFORM a ToList() on it", which is worth considering.
Now, here we need to add a third possibility: Knowledge of the collection's size.
Here, there is the possibility that using new List<T>(capacity) will prevent reallocations of the list being built. That can indeed be a considerable saving.
If the object that has ToList called on it implements ICollection<T> then ToList will end up first doing a single allocation of an internal array of T and then calling ICollection<T>.CopyTo().
This would mean that your GetList2 would result in a faster ToList() than your GetList1.
However, your GetList2 has already wasted time and memory doing what ToList() will do with the results of GetList1 anyway!
What it should have done here was just return new List<T>(collection); and be done with it.
If, though, we need to actually do something inside GetList1 or GetList2 (e.g. convert elements, filter elements, track averages, and so on) then GetList1 is going to be faster and lighter on memory. Much lighter if we never call ToList() on it, and slightly lighter if we do call ToList() because again, the faster and lighter ToList() is offset by GetList2 being slower and heavier in the first place by exactly the same amount.

yield return in recursion

I am attempting to create an
IEnumerable<PropertyInfo>
I've got a method called Dissasemble which iterates recursively through a given object and all the child objects of its properties.
Please do not concern yourself with the inner wrapper objects of type INameValueWrapper.
The problem below is that when I encounter a property which is a class, I want to call Dissasemble on it as well
and add its results to the same iteration of the IEnumerable. Dissasemble does not run again when it is called
where I put the comment:
// The problem is here.
public static IEnumerable<T> Dissasemble<T>(this object sourceObj) where T : INameValueWrapper
{
var properties = sourceObj.GetType().GetProperties();
foreach (var prop in properties)
{
var wrapper = (T)prop.WrapPropertyInfo(sourceObj);
yield return wrapper;
if (wrapper is CollectionPropertyInfoWrapper)
{
var colWrapper = wrapper as CollectionPropertyInfoWrapper;
var collection = (IList)colWrapper.Value;
int index = 0;
foreach (var item in collection)
{
yield return (T)item.WrapItem(collection, index + 1);
index++;
}
}
else
{
var propWrapper = wrapper as PropertyInfoWrapper;
if (!propWrapper.IsPrimitive)
{
var childObject = prop.GetValue(sourceObj);
childObject.Dissasemble<T>(); // here is the problem
}
}
}
yield break;
}
1) Why does it not get called and added to the iteration?
2) What is the workaround for this issue?
I could call childObject.Dissasemble<T>().ToList()
and then iterate that collection, calling yield return on its items,
but that seems like redoing something I already did.
Thanks in advance.
You're calling the method, but then ignoring the results. You may want something like:
foreach (var item in childObject.Dissasemble<T>())
{
yield return item;
}
I think you're a bit confused about what yield return does - it only yields a value in the sequence returned by the currently-executing method. It doesn't add a value to some global sequence. If you ignore the value returned by the recursive Disassemble call, the code won't even execute as iterator blocks are lazy. (They only execute code when they're asked for another value.)
Also, you don't need the yield break; at the end of your method.
Note that if the recursion goes deep, this use of iterator blocks can be inefficient. That may well not be a problem for you, but it's something to think about. See posts by Wes Dyer and Eric Lippert about this.
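One common way to avoid nesting iterators is to keep the pending work on an explicit stack, so a single iterator yields every item no matter how deep the structure is. The sketch below uses a made-up Node type rather than the wrapper types from the question, so it illustrates the technique only and is not a drop-in replacement:

```csharp
using System.Collections.Generic;

// Hypothetical tree node used only for this illustration.
public sealed class Node
{
    public string Name { get; }
    public List<Node> Children { get; } = new List<Node>();

    public Node(string name)
    {
        Name = name;
    }
}

public static class TreeWalker
{
    // Depth-first, parent-before-children traversal without recursion.
    public static IEnumerable<Node> DepthFirst(Node root)
    {
        var pending = new Stack<Node>();
        pending.Push(root);

        while (pending.Count > 0)
        {
            Node current = pending.Pop();
            yield return current;

            // Push children in reverse so the first child is popped first.
            for (int i = current.Children.Count - 1; i >= 0; i--)
            {
                pending.Push(current.Children[i]);
            }
        }
    }
}
```

Each item is yielded exactly once from one iterator, instead of being passed up through one foreach per level of recursion.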
Instead of
childObject.Dissasemble<T>(); // here is the problem
try
foreach (var a in childObject.Dissasemble<T>())
{
yield return a;
}

consuming sequence generated by IEnumerable

I would like to use an IEnumerable to generate a sequence of values -- specifically, a list of Excel-like column headers.
private IEnumerable<string> EnumerateSymbolNames()
{
foreach (var sym in _symbols)
{
yield return sym;
}
foreach (var sym1 in _symbols)
{
foreach (var sym2 in _symbols)
{
yield return sym1 + sym2;
}
}
yield break;
}
private readonly string[] _symbols = new string[] { "A", "B", "C", ...};
This works fine if I fetch the values from a foreach loop. But what I want is to use the iterator block as a state machine and fetch the next available column header in response to a user action. And this -- consuming the generated values -- is where I've run into trouble.
So far I've tried
return EnumerateSymbolNames().Take(1).FirstOrDefault();
return EnumerateSymbolNames().Take(1).SingleOrDefault();
return EnumerateSymbolNames().FirstOrDefault();
var enumerator = EnumerateSymbolNames().GetEnumerator();
enumerator.MoveNext();
return enumerator.Current;
... but none of these have worked. (All repeatedly return "A".)
Based on the responses to this question, I'm wondering what I want is even possible -- although several of the responses to that post suggest techniques similar to my last one.
And no, this is not a homework assignment :)
When you use GetEnumerator, you need to use the same enumerator for each iteration. If you call GetEnumerator a second time, it will start over at the beginning of the collection.
If you want to use Take, you must first Skip the number of records that have already been processed.
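A minimal sketch of the Skip-based approach, assuming a three-letter alphabet and a made-up SkipBasedGenerator class that remembers how many values it has already handed out:

```csharp
using System.Collections.Generic;
using System.Linq;

public sealed class SkipBasedGenerator
{
    // Assumption for the example: only three symbols.
    private readonly string[] _symbols = { "A", "B", "C" };
    private int _consumed; // number of values already returned

    public string GetNextSymbol()
    {
        // Restarts the sequence each call, skips what was already
        // returned, and takes the next value (null when exhausted).
        return EnumerateSymbolNames().Skip(_consumed++).FirstOrDefault();
    }

    private IEnumerable<string> EnumerateSymbolNames()
    {
        foreach (var sym in _symbols)
        {
            yield return sym;
        }
        foreach (var sym1 in _symbols)
        {
            foreach (var sym2 in _symbols)
            {
                yield return sym1 + sym2;
            }
        }
    }
}
```

This works, but every call re-runs the iterator from the start, so the cost of each call grows with the number of values already consumed. Holding on to a single enumerator avoids that.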
This code worked for me...
var states = EnumerateSymbolNames();
var stateMachine = states.GetEnumerator();
while (stateMachine.MoveNext())
{
// use stateMachine.Current
}
When I printed the results within that loop, it successfully produced the following output:
A
B
C
...
AA
AB
AC
...
BA
BB
...
Which is what I think you intended...
As both @cadrell0's answer and @Mr Steak's comment point out, what I needed to do was retain a reference to the enumerator returned by EnumerateSymbolNames().GetEnumerator().
When you're in a foreach loop, this is done implicitly for you: the iterator variable wraps an enumerator, which (I'm assuming) is scoped locally to the loop. So -- and this is the key piece -- when the iterator block does (the equivalent of)
enumerator.MoveNext();
return enumerator.Current;
... you're always using the same enumerator. Whereas what I was doing was creating / obtaining a different (new) enumerator every time. And predictably, it always started out at the first position in the sequence. This was probably obvious to everyone but me; it seems obvious to me as well in hindsight. (I was thinking of the enumerator as sort of a singleton property of the sequence, assuming that I'd be getting the same enumerator back every time.)
The following does what I want:
public class SymbolGenerator
{
private readonly string[] _symbols = { "A", "B", "C", ... };
private readonly IEnumerator<string> _enumerator;
public SymbolGenerator()
{
_enumerator = EnumerateSymbols().GetEnumerator();
}
public string GetNextSymbol()
{
_enumerator.MoveNext();
return _enumerator.Current;
}
private IEnumerable<string> EnumerateSymbols()
{
// (unchanged)
}
}

How to properly check IEnumerable for existing results

What's the best practice to check if a collection has items?
Here's an example of what I have:
var terminalsToSync = TerminalAction.GetAllTerminals();
if(terminalsToSync.Any())
SyncTerminals(terminalsToSync);
else
GatewayLogAction.WriteLogInfo(Messages.NoTerminalsForSync);
The GetAllTerminals() method will execute a stored procedure and, if we return a result, (Any() is true), SyncTerminals() will loop through the elements; thus enumerating it again and executing the stored procedure for the second time.
What's the best way to avoid this?
I'd like a good solution that can be used in other cases too; possibly without converting it to List.
Thanks in advance.
I would probably use a ToArray call, and then check Length; you're going to enumerate all the results anyway so why not do it early? However, since you've said you want to avoid early realisation of the enumerable...
I'm guessing that SyncTerminals has a foreach, in which case you can write it something like this:
bool any = false;
foreach(var terminal in terminalsToSync)
{
if(!any)any = true;
//....
}
if(!any)
GatewayLogAction.WriteLogInfo(Messages.NoTerminalsForSync);
Okay, there's a redundant if after the first loop, but I'm guessing the cost of an extra few CPU cycles isn't going to matter much.
Equally, you could do the iteration the old way and use a do...while loop and GetEnumerator; taking the first iteration out of the loop; that way there are literally no wasted operations:
using (var enumerator = terminalsToSync.GetEnumerator())
{
if(enumerator.MoveNext())
{
do
{
//sync enumerator.Current
} while(enumerator.MoveNext());
}
else
GatewayLogAction.WriteLogInfo(Messages.NoTerminalsForSync);
}
How about this, which still defers execution, but buffers it once executed:
var terminalsToSync = TerminalAction.GetAllTerminals().Lazily();
with:
public static class LazyEnumerable {
public static IEnumerable<T> Lazily<T>(this IEnumerable<T> source) {
if (source is LazyWrapper<T>) return source;
return new LazyWrapper<T>(source);
}
class LazyWrapper<T> : IEnumerable<T> {
private IEnumerable<T> source;
private bool executed;
public LazyWrapper(IEnumerable<T> source) {
if (source == null) throw new ArgumentNullException("source");
this.source = source;
}
IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
public IEnumerator<T> GetEnumerator() {
if (!executed) {
executed = true;
source = source.ToList();
}
return source.GetEnumerator();
}
}
}
Personally I wouldn't use Any() here; foreach will simply not loop through any items if the collection is empty, so I would just do it like that. However, I would recommend that you check for null.
If you do want to pre-enumerate the set, use .ToArray(), e.g. this will only enumerate once:
var terminalsToSync = TerminalAction.GetAllTerminals().ToArray();
if(terminalsToSync.Any())
SyncTerminals(terminalsToSync);
var terminalsToSync = TerminalAction.GetAllTerminals().ToList();
if(terminalsToSync.Any())
SyncTerminals(terminalsToSync);
else
GatewayLogAction.WriteLogInfo(Messages.NoTerminalsForSync);
.Length or .Count is faster since it doesn't need to go through the GetEnumerator()/MoveNext()/Dispose() required by Any()
Here's another way of approaching this problem:
int count = SyncTerminals(terminalsToSync);
if(count == 0) GatewayLogAction.WriteLogInfo(Messages.NoTerminalsForSync);
where you change SyncTerminals to do:
int count = 0;
foreach(var obj in terminalsToSync) {
count++;
// some code
}
return count;
Nice and simple.
All the caching solutions here cache all items when the first item is retrieved. It is only really lazy if you cache each single item as the items of the list are iterated.
The difference can be seen in this example:
public class LazyListTest
{
private int _count = 0;
public void Test()
{
var numbers = Enumerable.Range(1, 40);
var numbersQuery = numbers.Select(GetElement).ToLazyList(); // Cache lazy
var total = numbersQuery.Take(3)
.Concat(numbersQuery.Take(10))
.Concat(numbersQuery.Take(3))
.Sum();
Console.WriteLine(_count);
}
private int GetElement(int value)
{
_count++;
// Some slow stuff here...
return value * 100;
}
}
If you run the Test() method, the _count is only 10. Without caching it would be 16 and with .ToList() it would be 40!
An example of the implementation of LazyList can be found here.
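As a rough idea of what such a type could look like, here is a minimal sketch written for this explanation; it is not the linked implementation, it is not thread-safe, and the LazyList<T> and ToLazyList names are simply chosen to match the test above. Items are pulled from the source one at a time and appended to a cache, and every enumeration replays the cache before pulling anything new.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical per-item caching wrapper (single-threaded use only).
public sealed class LazyList<T> : IEnumerable<T>, IDisposable
{
    private readonly List<T> _cache = new List<T>();
    private IEnumerator<T> _source; // set to null once exhausted

    public LazyList(IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        _source = source.GetEnumerator();
    }

    public IEnumerator<T> GetEnumerator()
    {
        int index = 0;
        while (true)
        {
            if (index < _cache.Count)
            {
                // Already fetched earlier: serve it from the cache.
                yield return _cache[index++];
            }
            else if (_source != null && _source.MoveNext())
            {
                // Fetch exactly one new item; it is yielded on the next pass.
                _cache.Add(_source.Current);
            }
            else
            {
                Dispose();
                yield break;
            }
        }
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    public void Dispose()
    {
        if (_source != null)
        {
            _source.Dispose();
            _source = null;
        }
    }
}

public static class LazyListExtensions
{
    public static LazyList<T> ToLazyList<T>(this IEnumerable<T> source)
    {
        return new LazyList<T>(source);
    }
}
```

With a wrapper like this, a Take(3) followed by a Take(10) only pulls ten items from the source in total, because the second pass reuses the three that were already cached.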
If you're seeing two procedure calls for the evaluation of whatever GetAllTerminals() returns, this means that the procedure's result isn't being cached. Without knowing what data-access strategy you're using, this is quite hard to fix in a general way.
The simplest solution, as you've alluded, is to copy the result of the call before you perform any other operations. If you wanted to, you could neatly wrap this behaviour up in an IEnumerable<T> which executes the inner enumerable call just once:
public class CachedEnumerable<T> : IEnumerable<T>
{
public CachedEnumerable(IEnumerable<T> enumerable)
{
// ToList() runs at most once, the first time the result is enumerated.
result = new Lazy<List<T>>(() => enumerable.ToList());
}
private readonly Lazy<List<T>> result;
public IEnumerator<T> GetEnumerator()
{
return this.result.Value.GetEnumerator();
}
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return this.GetEnumerator();
}
}
Wrap the result in an instance of this type and it will not evaluate the inner enumerable multiple times.
