How can I cancel an IEnumerable? - c#

In a method returning IEnumerable<>, I'm opening and looping over a resource (e.g. a database row reader). Once the loop has finished, the resource is closed again.
However, it may happen that the caller decides not to finish the enumeration. This leaves the resource open.
Example:
IEnumerable<Foo> Bar ()
{
using (var r = OpenResource()) {
while (r.Read ()) {
yield return r;
}
}
}
// OK - this closes the resource again
foreach (var foo in Bar()) {
Console.WriteLine (foo);
}
// Not OK - resource stays open!
Console.WriteLine (Bar().First());
How would I solve this? Can I easily cancel an enumeration, i.e. tell it to skip over the rest of the loop, or dispose it (putting the cleanup code in Dispose)?
I considered returning a Func<Result, bool> so the user can have it return false if he's done with iterating. Similarly, some kind of cancel token could be used, too. But both approaches seem cumbersome to me.

Normally it is the IEnumerator<> that implements IDisposable. If you look at the definition of IEnumerator<>, you'll see:
public interface IEnumerator<out T> : IDisposable, IEnumerator
The foreach statement correctly calls Dispose() on the IEnumerator<> that it receives from the IEnumerable<>, so that with:
IEnumerable<SomeClass> res = SomeQuery();
foreach (SomeClass sc in res)
{
if (something)
break;
}
upon exiting the foreach in any way (the break, an exception, or res finishing naturally), the Dispose() of the IEnumerator<> is called. See https://msdn.microsoft.com/en-us/library/aa664754(v=vs.71).aspx for an example of how foreach is implemented (a try... finally... with a Dispose() inside the finally). LINQ operators such as First() dispose the enumerator they create in the same way, so Bar().First() closes the resource too.
Note that the C# compiler produces "correct" code for a using statement inside an iterator (yield) method. See for example here: http://goo.gl/Igzmiz
public IEnumerable<Foo> Bar()
{
using (var r = OpenResource())
{
while (r.Read ())
{
yield return new Foo();
}
}
}
is converted to something that contains:
void IDisposable.Dispose()
{
int num = this.<>1__state;
if (num == -3 || num == 1)
{
try
{
}
finally
{
this.<>m__Finally1();
}
}
}
The Dispose() method of the IEnumerator<> calls an m__Finally1 method, which in turn calls (IDisposable)this.<r>5__1.Dispose(); (where <r>5__1 is the r returned from OpenResource()). m__Finally1 is also called if the code simply "exits" the while (r.Read ()) loop:
if (!this.<r>5__1.Read())
{
this.<>m__Finally1();
and/or if there is an exception.
catch
{
this.System.IDisposable.Dispose();
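To check the behaviour described above for the Bar().First() case from the question, here is a minimal sketch. TrackingReader is a hypothetical stand-in for whatever OpenResource() returns, and the reader is passed into Bar() only so that it can be inspected afterwards; neither is part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the resource returned by OpenResource().
public sealed class TrackingReader : IDisposable
{
    private int _remaining = 3;
    public bool IsDisposed { get; private set; }

    // Pretends to read three rows, then reports end of data.
    public bool Read() => _remaining-- > 0;

    public void Dispose() => IsDisposed = true;
}

public static class FirstDisposesDemo
{
    // Same shape as Bar() above: a using block wrapped around a yield loop.
    public static IEnumerable<int> Bar(TrackingReader r)
    {
        using (r)
        {
            int row = 0;
            while (r.Read())
            {
                yield return row++;
            }
        }
    }

    public static bool ReaderIsDisposedAfterFirst()
    {
        var reader = new TrackingReader();
        // First() takes one element and then disposes the enumerator it created,
        // which runs the pending using cleanup inside Bar().
        _ = Bar(reader).First();
        return reader.IsDisposed;
    }
}
```

Calling FirstDisposesDemo.ReaderIsDisposedAfterFirst() should report that the reader was disposed even though only one of the three rows was consumed.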

Related

Is dispose called down to the bottom via yield return?

I give silly examples for simplicity.
IEnumerable<T> Silly<T>(this IEnumerable<T> source)
{
foreach(var x in source) yield return x;
}
I know that this will be compiled into a state machine, but it's also similar to
IEnumerable<T> Silly<T>(this IEnumerable<T> source)
{
using(var sillier = source.GetEnumerator())
{
while(sillier.MoveNext()) yield return sillier.Current;
}
}
Now consider this usage
list.Silly().Take(2).ToArray();
Here you can see that the Silly enumerable may not be fully consumed, but Take(2) itself will be fully consumed.
Question: when Dispose is called on the Take enumerator, will it also call Dispose on the Silly enumerator and, more specifically, on the sillier enumerator?
My guess is that the compiler can handle this simple use case because of foreach, but what about not-so-simple use cases?
IEnumerable<T> Silly<T>(this IEnumerable<T> source)
{
using(var sillier = source.GetEnumerator())
{
// move next can be called on different stages.
}
}
Will this ever be a problem? Most enumerators don't use unmanaged resources, but if one does, this could cause resource leaks.
If Dispose is not called, how do I make a disposable enumerable?
An idea: there could be an if (disposed) yield break; after every yield return. The Dispose method of the Silly enumerator would then just have to set disposed = true and move the enumerator once to dispose of everything required.
The C# compiler takes care of a lot for you when it turns your iterator into the real code. For instance, here's the MoveNext which contains the implementation of your second example [1]:
private bool MoveNext()
{
try
{
switch (this.<>1__state)
{
case 0:
this.<>1__state = -1;
this.<sillier>5__1 = this.source.GetEnumerator();
this.<>1__state = -3;
while (this.<sillier>5__1.MoveNext())
{
this.<>2__current = this.<sillier>5__1.Current;
this.<>1__state = 1;
return true;
Label_005A:
this.<>1__state = -3;
}
this.<>m__Finally1();
this.<sillier>5__1 = null;
return false;
case 1:
goto Label_005A;
}
return false;
}
fault
{
this.System.IDisposable.Dispose();
}
}
So, you'll notice that the finally clause from your using isn't there at all, and it's a state machine [2] that relies on being in certain good (>= 0) states in order to make further progress forwards. (It's also illegal C#, but hey ho.)
Now let's look at its Dispose:
[DebuggerHidden]
void IDisposable.Dispose()
{
switch (this.<>1__state)
{
case -3:
case 1:
try
{
}
finally
{
this.<>m__Finally1();
}
break;
}
}
So we can see that <>m__Finally1 is called here (as well as when the while loop in MoveNext exits).
And <>m__Finally1:
private void <>m__Finally1()
{
this.<>1__state = -1;
if (this.<sillier>5__1 != null)
{
this.<sillier>5__1.Dispose();
}
}
So, we can see that sillier was disposed and we moved into a negative state, which means that MoveNext doesn't have to do any special work to handle the "we've already been disposed" state.
So,
An idea: there could be an if (disposed) yield break; after every yield return. The Dispose method of the Silly enumerator would then just have to set disposed = true and move the enumerator once to dispose of everything required.
is completely unnecessary. Trust the compiler to transform the code so that it does all of the logical things it should: it runs its finally clause exactly once, either when the iterator logic is exhausted or when the enumerator is explicitly disposed.
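The chain from the question, list.Silly().Take(2).ToArray(), can be checked with a small sketch. TrackedSource is a hypothetical source written for this illustration; its enumerator sets a flag in a finally block so that disposal can be observed. Silly is the second version from the question.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical source whose enumerator records whether it was disposed.
public sealed class TrackedSource : IEnumerable<int>
{
    public bool EnumeratorDisposed { get; private set; }

    public IEnumerator<int> GetEnumerator()
    {
        try
        {
            for (int i = 0; i < 10; i++) yield return i;
        }
        finally
        {
            // Runs when the sequence is exhausted or the enumerator is disposed.
            EnumeratorDisposed = true;
        }
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class SillyDemo
{
    public static IEnumerable<T> Silly<T>(this IEnumerable<T> source)
    {
        using (var sillier = source.GetEnumerator())
        {
            while (sillier.MoveNext()) yield return sillier.Current;
        }
    }

    public static bool InnerDisposedAfterTake()
    {
        var source = new TrackedSource();
        // Only two of the ten items are consumed.
        int[] firstTwo = source.Silly().Take(2).ToArray();
        return firstTwo.Length == 2 && source.EnumeratorDisposed;
    }
}
```

SillyDemo.InnerDisposedAfterTake() should report that the innermost enumerator was disposed: disposing the Take enumerator disposes the Silly enumerator, whose using block in turn disposes sillier.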
[1] All code samples were produced by .NET Reflector. It's almost too good at decompiling these constructs these days, though, so if you go and look at the Silly method itself:
[IteratorStateMachine(typeof(<Silly>d__1)), Extension]
private static IEnumerable<T> Silly<T>(this IEnumerable<T> source)
{
IEnumerator<T> <sillier>5__1;
using (<sillier>5__1 = source.GetEnumerator())
{
while (<sillier>5__1.MoveNext())
{
yield return <sillier>5__1.Current;
}
}
<sillier>5__1 = null;
}
It's managed to hide most details about that state machine away again. You need to chase the type referenced by the IteratorStateMachine attribute to see all of the gritty bits shown above.
[2] Please also note that the compiler is under no obligation to produce a state machine to make iterators work. It's an implementation detail of the current C# compilers. The C# Specification places no restriction on how the compiler transforms the iterator, just on what the effects should be.

Threads and access to a shared list

I'm encountering (I hope) a deadlocking issue with a WCF service I'm trying to write.
I have the following lock on a function that "locates" a particular item in the list:
CIPRecipe FindRecipe_ByUniqueID(string uniqueID)
{
lock (_locks[LOCK_RECIPES])
{
foreach (var r in _recipes.Keys)
{
if (_recipes[r].UniqueID == uniqueID)
{
return _recipes[r];
}
}
}
return null;
}
However, various functions iterate through this list and always apply the same lock, for example:
lock (_locks[LOCK_RECIPES_NO_ADD_OR_REMOVE])
{
foreach (var r in _recipes)
{
r.Value.UpdateSummary();
summaries.Add((RecipeSummary)r.Value.Summary);
}
}
What I suspect is, an item in _recipes in the above example has suddenly called a function which ultimately calls the first function - "CIPRecipe FindRecipe_ByUniqueID(string uniqueID)" and this is causing a deadlock when it is reached in the iteration.
I need to stop this list changing whilst I'm iterating through it. Can someone advise me on the best practice?
Thanks
What you want is a ReaderWriterLockSlim. It lets any number of concurrent readers through but only a single writer, and it blocks all readers while the writer is writing.
This assumes _locks has been changed from an object[] to a ReaderWriterLockSlim[].
//On Read
CIPRecipe FindRecipe_ByUniqueID(string uniqueID)
{
var lockObj = _locks[LOCK_RECIPES];
lockObj.EnterReadLock();
try
{
foreach (var r in _recipes.Keys)
{
if (_recipes[r].UniqueID == uniqueID)
{
return _recipes[r];
}
}
}
finally
{
lockObj.ExitReadLock();
}
return null;
}
//On write
var lockObj = _locks[LOCK_RECIPES]; // Note: this now uses the same lock object as the other method.
lockObj.EnterWriteLock();
try
{
foreach (var r in _recipes)
{
r.Value.UpdateSummary();
summaries.Add((RecipeSummary)r.Value.Summary);
}
}
finally
{
lockObj.ExitWriteLock();
}
I don't know if it will solve your deadlock issue. If the problem is caused by allowing reads during a write, it may.
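One caveat is worth checking before adopting this. A ReaderWriterLockSlim created with the default constructor uses LockRecursionPolicy.NoRecursion, so if an item's UpdateSummary() really does call back into the find method on the same thread, the nested EnterReadLock() throws a LockRecursionException rather than blocking. The sketch below shows this with a hypothetical RecipeStore; the class, its members and the Dictionary<string, string> payload are simplifications invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical, simplified stand-in for the recipe list in the question.
public sealed class RecipeStore
{
    // Default constructor: LockRecursionPolicy.NoRecursion.
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
    private readonly Dictionary<string, string> _recipes = new Dictionary<string, string>();

    public void Add(string uniqueId, string name)
    {
        _lock.EnterWriteLock();
        try { _recipes[uniqueId] = name; }
        finally { _lock.ExitWriteLock(); }
    }

    public string Find(string uniqueId)
    {
        _lock.EnterReadLock();
        try
        {
            return _recipes.TryGetValue(uniqueId, out var name) ? name : null;
        }
        finally { _lock.ExitReadLock(); }
    }

    // Returns true if a read nested inside a write on the same thread is rejected.
    public bool NestedReadIsRejected()
    {
        _lock.EnterWriteLock();
        try
        {
            Find("anything"); // re-enters the same lock on the same thread
            return false;
        }
        catch (LockRecursionException)
        {
            return true;
        }
        finally { _lock.ExitWriteLock(); }
    }
}
```

If re-entrant calls are genuinely needed, the lock can be constructed with LockRecursionPolicy.SupportsRecursion, although restructuring the code so that callbacks never run while the lock is held is usually easier to reason about.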
Perhaps a ConcurrentDictionary is called for here?

Adding logic to method using yield

I am trying to use the yield keyword to update some methods, but I am running into an issue that I don't understand. There is some logic in this method (checking for a null type); if the type is null, I write to a log and yield break. That does exactly what I want, but my unit test says that the log function was never called. I am OK with not logging in this situation, but I want to know why I can't, or whether I am doing something wrong.
Here is the code:
public IEnumerable<Ixxx> GetTypes(Type type)
{
if (type == null)
{
log.WriteRecord("log error", "LogName", true);
yield break;
}
lock (blockingObject)
{
foreach (Ixxx item in aDictionary.Values)
{
if (item.Type.Name == type.Name)
{
yield return item;
}
}
}
}
The unit test that is failing is claiming log.WriteRecord was never called. Here is that unit test:
[TestMethod]
public void TestMethod()
{
// Arrange
mockLog.Setup(a => a.WriteRecord(It.IsAny<string>(), It.IsAny<string>(), true)).Returns(true);
// Act
sut.GetTypes(null);
// Assert
mockLog.Verify(a => a.WriteRecord(It.IsAny<string>(), It.IsAny<string>(), true), Times.Once());
}
When I was making a local copy (List), this test passed. Now that I am using yield, it appears I cannot make any function calls within this method? Thanks for any help!
The line "sut.GetTypes(null)" just returns an IEnumerable that you are throwing away. Since you never iterate over the enumerable, none of the code in GetTypes ever executes.
Try this instead:
foreach (var x in sut.GetTypes(null)) {}
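If the null check and the logging should happen as soon as the method is called, a common pattern is to split the method in two: a plain wrapper that validates eagerly, and a private iterator that does the deferred work. The sketch below illustrates the difference; DeferredDemo, LogCalls and both method names are hypothetical, with the counter standing in for log.WriteRecord.

```csharp
using System;
using System.Collections.Generic;

public static class DeferredDemo
{
    // Hypothetical stand-in for log.WriteRecord: counts how often we "logged".
    public static int LogCalls;

    // Iterator method: none of this body runs until the first MoveNext().
    public static IEnumerable<string> GetTypesLazy(Type type)
    {
        if (type == null)
        {
            LogCalls++;
            yield break;
        }
        yield return type.Name;
    }

    // Ordinary method: the null check runs immediately; only the loop is deferred.
    public static IEnumerable<string> GetTypesEager(Type type)
    {
        if (type == null)
        {
            LogCalls++;
            return new string[0];
        }
        return GetTypesLazy(type);
    }
}
```

With this split, calling GetTypesEager(null) increments the counter straight away, whereas GetTypesLazy(null) increments it only once something enumerates the result.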

iterating through disposable objects

Is the following a safe way of iterating through disposable objects, or will it result in undisposed objects? What if I used explicit Dispose() calls instead of the nested using statements?
public static void Main()
{
foreach (ChildObject oChild in SafelyGetNextObjInWebApp(webApp))
{
//Oh Noes! Unexpected Error!
}
}
public static IEnumerable<ChildObject> SafelyGetNextObjInWebApp(WebApplication webApp)
{
foreach (ParentObject oParent in webApp.Parents)
{
using (oParent)
{
foreach (ChildObject oChild in oParent.Children)
{
using (oChild)
{
yield return oChild;
}
}
}
}
}
Your method is only safe if the enumerator it produces is always either run to completion or disposed. Consider what happens in the following statement:
ChildObject o = SafelyGetNextObjInWebApp(arg).First();
In this case, execution of SafelyGetNextObjInWebApp() halts at the yield statement and never resumes. First() disposes the enumerator, which runs the pending using cleanup, so o has already been disposed by the time the caller receives it. A caller that drives the enumerator by hand and abandons it without calling Dispose() leaves the objects undisposed instead.
If you want to use an iterator to return web-service-created objects one at a time, you should make sure the call is exception safe and require that the caller of the iterator dispose of each object. To illustrate:
public IEnumerable<ParentObject> GetParents(WebApplication webApp)
{
// assumes webApp.Parents uses deferred execution.
return webApp.Parents;
}
public void ProcessParent(WebApplication webApp)
{
foreach (ParentObject p in GetParents(webApp))
{
// Assumes p.Dispose() calls ChildObject.Dispose() for all p.ChildObjects.
using(p)
{
foreach (ChildObject o in p.ChildObjects)
{
// do something with o
}
}
}
}
The "safe" method may not be much safer after all. What if the iteration breaks (or fails) before all parents and children objects are iterated? The remaining objects won't be disposed (at least, not in that specific method).
It seems that the iterations and disposals should be kept separate. You'll have cleaner code and more control over what the program is doing.
And there's more...
The C# iterator pattern will make the "safe" method fail in a subtle way. When the caller asks for the next item (or disposes the enumerator), the iterator resumes after the yield and exits the using {...} block, disposing the child it has just handed out. Any caller that holds on to a child beyond its own loop iteration ends up with an unusable object.
What could be done
Take the using statements out of SafelyGetNextObjInWebApp(). Encapsulate the yielded children objects in a "Unit of Work" class that "knows" when to dispose the child.
At the end of a using block, Dispose() is called for you, so the result is the same whether you use using or call Dispose() yourself at the end of the function (except that using also disposes when an exception is thrown).
Your code may be OK, but it is likely to cause problems in normal use: you are returning objects that will be disposed on the next iteration. So if your caller's code looks like
foreach(var i in SafelyGetNextObjInWebApp())
{
if (IsInteresting(i))
{
interestingItems.Add(i);
}
}
// here interestingItems contains disposed items you can't use.
Inverting the code, by providing a method that iterates over all items and takes an Action<T> as an argument, may highlight the fact that processing of each item must be finished inside the action.
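A minimal sketch of that inversion follows. Child is a hypothetical stand-in for ChildObject, and ForEachChild is an illustrative name rather than anything from the original code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical disposable item standing in for ChildObject.
public sealed class Child : IDisposable
{
    public string Name { get; }
    public bool IsDisposed { get; private set; }

    public Child(string name) { Name = name; }

    public void Dispose() => IsDisposed = true;
}

public static class ChildProcessor
{
    // Each child is valid only for the duration of the callback;
    // it is disposed as soon as the callback returns or throws.
    public static void ForEachChild(IEnumerable<Child> children, Action<Child> action)
    {
        foreach (Child child in children)
        {
            using (child)
            {
                action(child);
            }
        }
    }
}
```

A caller would write something like ChildProcessor.ForEachChild(children, c => Console.WriteLine(c.Name)). The signature makes it harder to keep a reference to an item by accident, although nothing stops a determined caller from capturing one inside the lambda.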

Algorithm for implementing C# yield statement

I'd love to figure it out myself but I was wondering roughly what's the algorithm for converting a function with yield statements into a state machine for an enumerator? For example how does C# turn this:
IEnumerator<string> strings(IEnumerable<string> args)
{ IEnumerator<string> enumerator2 = getAnotherEnumerator();
foreach(var arg in args)
{ enumerator2.MoveNext();
yield return arg+enumerator2.Current;
}
}
into this:
bool MoveNext()
{ switch (this.state)
{
case 0:
this.state = -1;
this.enumerator2 = getAnotherEnumerator();
this.argsEnumerator = this.args.GetEnumerator();
this.state = 1;
while (this.argsEnumerator.MoveNext())
{
this.arg = this.argsEnumerator.Current;
this.enumerator2.MoveNext();
this.current = this.arg + this.enumerator2.Current;
this.state = 2;
return true;
state1:
this.state = 1;
}
this.state = -1;
if (this.argsEnumerator != null) this.argsEnumerator.Dispose();
break;
case 2:
goto state1;
}
return false;
}
Of course the result can be completely different depending on the original code.
The particular code sample you are looking at involves a series of transformations.
Please note that this is an approximate description of the algorithm. The actual names used by the compiler and the exact code it generates may be different. The idea is the same, however.
The first transformation is the "foreach" transformation, which transforms this code:
foreach (var x in y)
{
//body
}
into this code:
var enumerator = y.GetEnumerator();
while (enumerator.MoveNext())
{
var x = enumerator.Current;
//body
}
if (enumerator != null)
{
enumerator.Dispose();
}
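The expansion above is simplified. In the real expansion the loop sits inside a try block and the Dispose() call inside a finally block, which is why the enumerator is disposed on break, return and exceptions as well as on normal completion. A hand-written equivalent might look like the sketch below; ForEachExpanded is an illustrative helper written for this example, not something the compiler emits under that name.

```csharp
using System;
using System.Collections.Generic;

public static class ForeachExpansion
{
    // Hand-written equivalent of: foreach (var x in y) { body(x); }
    public static void ForEachExpanded<T>(IEnumerable<T> y, Action<T> body)
    {
        IEnumerator<T> enumerator = y.GetEnumerator();
        try
        {
            while (enumerator.MoveNext())
            {
                T x = enumerator.Current;
                body(x);
            }
        }
        finally
        {
            // Runs on normal completion, on early exit, and when body throws.
            if (enumerator != null) enumerator.Dispose();
        }
    }
}
```

For example, ForeachExpansion.ForEachExpanded(new List<int> { 1, 2, 3 }, x => Console.WriteLine(x)) visits the same elements as the corresponding foreach loop.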
The second transformation finds all the yield return statements in the function body, assigns a number to each (a state value), and creates a "goto label" right after the yield.
The third transformation lifts all the local variables and function arguments in the method body into an object called a closure.
Given the code in your example, that would look similar to this:
class ClosureEnumerable : IEnumerable<string>
{
private IEnumerable<string> args;
private ClassType originalThis;
public ClosureEnumerable(ClassType origThis, IEnumerable<string> args)
{
this.args = args;
this.originalThis = origThis;
}
public IEnumerator<string> GetEnumerator()
{
return new Closure(originalThis, args);
}
}
class Closure : IEnumerator<string>
{
public Closure(ClassType originalThis, IEnumerable<string> args)
{
state = 0;
this.args = args;
this.originalThis = originalThis;
}
private IEnumerable<string> args;
private IEnumerator<string> enumerator2;
private IEnumerator<string> argEnumerator;
//- Here ClassType is the type of the object that contained the method
// This may be optimized away if the method does not access any
// class members
private ClassType originalThis;
//This holds the state value.
private int state;
//The current value to return
private string currentValue;
public string Current
{
get
{
return currentValue;
}
}
}
The method body is then moved from the original method to a method inside "Closure" called MoveNext, which returns a bool and implements IEnumerator.MoveNext.
Any access to any locals is routed through "this", and any access to any class members is routed through this.originalThis.
Any "yield return expr" is translated into:
currentValue = expr;
state = //the state number of the yield statement;
return true;
Any yield break statement is translated into:
state = -1;
return false;
There is an "implicit" yield break statement at the end of the function.
A switch statement is then introduced at the beginning of the procedure that looks at the state number and jumps to the associated label.
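The original method is then translated into something like this. Because the method in your example returns IEnumerator<string>, it returns the Closure directly; a method declared as returning IEnumerable<string> would return new ClosureEnumerable(this, args) instead:
IEnumerator<string> strings(IEnumerable<string> args)
{
return new Closure(this, args);
}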
The fact that the state of the method is all pushed into an object and that the MoveNext method uses a switch statement / state variable is what allows the iterator to behave as if control is being passed back to the point immediately after the last "yield return" statement the next time "MoveNext" is called.
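To make the steps concrete, here is a hand-written enumerator for a much smaller iterator than the one in the question. It is a sketch of the idea only: the class name, field names and state numbers are invented for illustration, and the code the compiler actually generates looks different.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hand-written equivalent of:
//   IEnumerator<int> CountTo(int limit)
//   {
//       for (int i = 1; i <= limit; i++) yield return i;
//   }
public sealed class CountToEnumerator : IEnumerator<int>
{
    private readonly int _limit; // lifted parameter
    private int _i;              // lifted local
    private int _state;          // 0 = not started, 1 = suspended at the yield, -1 = finished
    private int _current;        // value handed out by the last yield return

    public CountToEnumerator(int limit) { _limit = limit; }

    public int Current => _current;
    object IEnumerator.Current => _current;

    public bool MoveNext()
    {
        switch (_state)
        {
            case 0:
                _i = 1;   // loop initialiser
                break;
            case 1:
                _i++;     // resume just after the yield: run the loop increment
                break;
            default:
                return false; // finished or disposed
        }

        if (_i <= _limit)
        {
            _current = _i; // "yield return i"
            _state = 1;
            return true;
        }

        _state = -1; // implicit yield break at the end of the method
        return false;
    }

    public void Reset() => throw new NotSupportedException();

    public void Dispose() { _state = -1; }
}
```

Driving it with while (e.MoveNext()) on a new CountToEnumerator(3) should produce 1, 2 and 3; each call re-enters MoveNext, and the switch sends control to the point just after the previous yield.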
It is important to point out, however, that the transformation used by the C# compiler is not the best way to do this. It suffers from poor performance when trying to use "yield" with recursive algorithms. There is a good paper that outlines a better way to do this here:
http://research.microsoft.com/en-us/projects/specsharp/iterators.pdf
It's worth a read if you haven't read it yet.
Just spotted this question - I wrote an article on it recently. I'll have to add the other links mentioned here to the article though...
Raymond Chen answers this here.
