How do you deal with sequences of IDisposable using LINQ? - c#

What's the best approach to call Dispose() on the elements of a sequence?
Suppose there's something like:
IEnumerable<string> locations = ...
var streams = locations.Select ( a => new FileStream ( a , FileMode.Open ) );
var notEmptyStreams = streams.Where ( a => a.Length > 0 );
//from this point on only `notEmptyStreams` will be used/visible
var firstBytes = notEmptyStreams.Select ( a => a.ReadByte () );
var average = firstBytes.Average ();
How do you dispose FileStream instances (as soon as they're no longer needed) while maintaining concise code?
To clarify: this is not an actual piece of code; those lines are spread across methods in a set of classes, and the FileStream type is just an example.
Is doing something along the lines of:
public static IEnumerable<TSource> Where<TSource> (
    this IEnumerable<TSource> source ,
    Func<TSource , bool> predicate
)
    where TSource : IDisposable {
    foreach ( var item in source ) {
        if ( predicate ( item ) ) {
            yield return item;
        }
        else {
            item.Dispose ();
        }
    }
}
might be a good idea?
Alternatively: do you always solve the very specific scenario at hand with regard to IEnumerable<IDisposable>, without trying to generalize? Is that because having one is an atypical situation? Do you design around having it in the first place? If so, how?

I would write a method, say AsDisposableCollection, that returns a wrapped IEnumerable which also implements IDisposable, so that you can use the usual using pattern. This is a bit more work up front (implementing the method), but you only need to do it once and can then use the method as often as you need:
using(var streams = locations.Select(a => new FileStream(a, FileMode.Open))
.AsDisposableCollection()) {
// ...
}
The implementation would look roughly like this (it is not complete - just to show the idea):
class DisposableCollection<T> : IDisposable, IEnumerable<T>
        where T : IDisposable {
    IEnumerable<T> en;  // Wrapped enumerable
    List<T> garbage;    // To keep generated objects

    public DisposableCollection(IEnumerable<T> en) {
        this.en = en;
        this.garbage = new List<T>();
    }

    // Enumerates over all the elements and stores generated
    // elements in a list of garbage (to be disposed)
    public IEnumerator<T> GetEnumerator() {
        foreach (var o in en) {
            garbage.Add(o);
            yield return o;
        }
    }

    // Required by IEnumerable<T>; delegates to the generic enumerator
    IEnumerator IEnumerable.GetEnumerator() {
        return GetEnumerator();
    }

    // Dispose all elements that were generated so far...
    public void Dispose() {
        foreach (var o in garbage) o.Dispose();
    }
}

I suggest you turn the streams variable into an array or a List, because enumerating it a second time will re-run the Select projection and construct new FileStream instances.
var streams = locations.Select(a => new FileStream(a, FileMode.Open)).ToList();

// dispose right away of those you won't need
foreach (FileStream stream in streams.Where(a => a.Length == 0))
    stream.Dispose();

var notEmptyStreams = streams.Where(a => a.Length > 0);
// the rest of your code here
foreach (FileStream stream in notEmptyStreams)
    stream.Dispose();
EDIT: For these constraints, maybe LINQ isn't the best tool around. Maybe you could get away with a simple foreach loop?
var streams = locations.Select(a => new FileStream(a, FileMode.Open));
int count = 0;
int sum = 0;
foreach (FileStream stream in streams) using (stream)
{
    if (stream.Length == 0) continue;
    count++;
    sum += stream.ReadByte();
}
int average = sum / count;
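One caveat to hedge here (an editorial note): sum / count is integer division, and it throws a DivideByZeroException when every stream turns out to be empty. A guard along these lines covers both:
// Guard the no-streams case and keep fractional precision (sketch).
double average = count > 0 ? (double)sum / count : 0;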

A simple solution is the following:
List<Stream> streams = locations
    .Select(a => (Stream)new FileStream(a, FileMode.Open)) // cast so ToList() yields List<Stream>
    .ToList();
try
{
    // Use the streams.
}
finally
{
    foreach (IDisposable stream in streams)
        stream.Dispose();
}
Note that even with this you could, in theory, still fail to close a stream: if one of the FileStream constructors fails after others have already been constructed, the earlier streams leak. To fix that you need to be more careful constructing the initial list:
List<Stream> streams = new List<Stream>();
try
{
    foreach (string location in locations)
    {
        streams.Add(new FileStream(location, FileMode.Open));
    }
    // Use the streams.
}
finally { /* same as before */ }
It's a lot of code, and it's not as concise as you wanted, but if you want to be sure that all your streams are closed even when exceptions are thrown, this is what it takes.
If you want something more LINQ-like you might want to read this article by Marc Gravell:
SelectMany; combining IDisposable and LINQ

Description
I've come up with a general solution to the problem :)
One thing that was important to me is that everything gets disposed correctly, even if I don't iterate the whole enumeration, which is the case when I use methods like FirstOrDefault (which I do quite often).
So I came up with a custom enumerator which handles all the disposing. All you have to do is call AsDisposeableEnumerable, which does all the magic for you.
GetMy.Disposeables()
    .AsDisposeableEnumerable() // <-- all the magic is injected here
    .Skip(5)
    .Where(i => i > 1024)
    .Select(i => new { myNumber = i })
    .FirstOrDefault()
Please note that this does not work for infinite enumerations, since Dispose() drains the rest of the sequence in order to dispose the remaining elements.
The Code
My custom IEnumerable
public class DisposeableEnumerable<T> : IEnumerable<T> where T : System.IDisposable
{
    private readonly IEnumerable<T> _enumerable;

    public DisposeableEnumerable(IEnumerable<T> enumerable)
    {
        _enumerable = enumerable;
    }

    public IEnumerator<T> GetEnumerator()
    {
        return new DisposeableEnumerator<T>(_enumerable.GetEnumerator());
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
My custom IEnumerator
public class DisposeableEnumerator<T> : IEnumerator<T> where T : System.IDisposable
{
    readonly List<T> toBeDisposed = new List<T>();
    private readonly IEnumerator<T> _enumerator;

    public DisposeableEnumerator(IEnumerator<T> enumerator)
    {
        _enumerator = enumerator;
    }

    public void Dispose()
    {
        // dispose the remaining disposeables
        while (_enumerator.MoveNext()) {
            T current = _enumerator.Current;
            current.Dispose();
        }

        // dispose the provided disposeables
        foreach (T disposeable in toBeDisposed) {
            disposeable.Dispose();
        }

        // dispose the internal enumerator
        _enumerator.Dispose();
    }

    public bool MoveNext()
    {
        bool result = _enumerator.MoveNext();
        if (result) {
            toBeDisposed.Add(_enumerator.Current);
        }
        return result;
    }

    public void Reset()
    {
        _enumerator.Reset();
    }

    public T Current
    {
        get
        {
            return _enumerator.Current;
        }
    }

    object IEnumerator.Current
    {
        get { return Current; }
    }
}
A fancy extension method to make things look good
public static class IDisposeableEnumerableExtensions
{
    /// <summary>
    /// Wraps the given IEnumerable into a DisposeableEnumerable which ensures that all the disposeables are disposed correctly
    /// </summary>
    /// <typeparam name="T">The IDisposable type</typeparam>
    /// <param name="enumerable">The enumerable whose elements should be disposed</param>
    /// <returns>A DisposeableEnumerable wrapping the given sequence</returns>
    public static DisposeableEnumerable<T> AsDisposeableEnumerable<T>(this IEnumerable<T> enumerable) where T : System.IDisposable
    {
        return new DisposeableEnumerable<T>(enumerable);
    }
}

Using code from https://lostechies.com/keithdahlby/2009/07/23/using-idisposables-with-linq/, you can turn your query into the following:
(
    from location in locations
    from stream in new FileStream(location, FileMode.Open).Use()
    where stream.Length > 0
    select stream.ReadByte()
).Average()
You will need the following extension method:
public static IEnumerable<T> Use<T>(this T obj) where T : IDisposable
{
    try
    {
        yield return obj;
    }
    finally
    {
        if (obj != null)
            obj.Dispose();
    }
}
This will properly dispose all the streams that you create, whether or not they are empty.
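For reference (an addition, not part of the original answer): the query syntax above desugars to SelectMany, so a method-syntax equivalent would look roughly like this, assuming the same locations sequence and the Use() extension above:
// Method-syntax sketch of the same query.
var average = locations
    .SelectMany(location => new FileStream(location, FileMode.Open).Use())
    .Where(stream => stream.Length > 0)
    .Select(stream => stream.ReadByte())
    .Average();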

Here is a simple wrapper that allows you to dispose any IEnumerable with using. (To retain the concrete collection type rather than exposing it as IEnumerable, we would need nested generic parameter types, which C# does not seem to support.)
public static class DisposableEnumerableExtensions {
    public static DisposableEnumerable<T> AsDisposable<T>(this IEnumerable<T> enumerable) where T : IDisposable {
        return new DisposableEnumerable<T>(enumerable);
    }
}

public class DisposableEnumerable<T> : IDisposable where T : IDisposable {
    public IEnumerable<T> Enumerable { get; }

    public DisposableEnumerable(IEnumerable<T> enumerable) {
        this.Enumerable = enumerable;
    }

    public void Dispose() {
        foreach (var o in this.Enumerable) o.Dispose();
    }
}
Usage:
using (var processes = System.Diagnostics.Process.GetProcesses().AsDisposable()) {
    foreach (var p in processes.Enumerable) {
        Console.Write(p.Id);
    }
}
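One caveat worth adding (an editorial note, not from the original answer): if the wrapped sequence is lazily generated, Dispose() will re-enumerate it and dispose freshly created objects rather than the instances the using block actually consumed. Materializing the sequence first avoids that:
// Materialize a lazy sequence first so the loop and Dispose() see the same
// instances (a sketch reusing the 'locations' example from the question).
using (var streams = locations
        .Select(a => new FileStream(a, FileMode.Open))
        .ToList()
        .AsDisposable()) {
    foreach (var s in streams.Enumerable) {
        Console.WriteLine(s.Length);
    }
}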

Related

C# How to partition parallel foreach loop to iterate my list

I am new to the programming world. I am doing my graduation and also learning .NET.
I want to iterate my list in a parallel foreach, but I want to use a partition there. I have a lack of knowledge, so my code is not compiling.
This is the way I did it first, which is working:
Parallel.ForEach(MyBroker, broker =>
{
    mybrow = new WeightageRowNumber();
    mybrow.RowNumber = Interlocked.Increment(ref rowNumber);
    lock (_lock)
    {
        Mylist.Add(mybrow);
    }
});
Now I want to use a partition, so I changed my code this way, but now my code does not compile. Here is the code:
Parallel.ForEach(MyBroker, broker,
    (j, loop, subtotal) =>
    {
        mybrow = new WeightageRowNumber();
        mybrow.RowNumber = Interlocked.Increment(ref rowNumber);
        lock (_lock)
        {
            Mylist.Add(mybrow);
        }
        return brokerRowWeightageRowNumber.RowNumber;
    },
    (finalResult) =>
        var rownum = Interlocked.Increment(ref finalResult);
        console.writeline(rownum);
);
Please see my second set of code and show me how to restructure it to use a partition with a parallel foreach to iterate my list.
Please guide me. Thanks.
The Parallel.ForEach method has 20 overloads - perhaps try a different overload?
Without your dependencies included, I can't give a one-to-one example for your implementation, but here is an in-depth example (reformatted from here) that you can copy into your IDE and set debug breakpoints in (if that's useful). Unfortunately, building an instantiable overload of OrderablePartitioner appears non-trivial, so sorry for all the boilerplate code:
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading;
using System.Collections.Concurrent;
using System.Collections;
using System.Linq;

// Simple partitioner that will extract one (index,item) pair at a time,
// in a thread-safe fashion, from the underlying collection.
class SingleElementOrderablePartitioner<T> : OrderablePartitioner<T>
{
    // The collection being wrapped by this Partitioner
    IEnumerable<T> m_referenceEnumerable;

    // Class used to wrap m_index for the purpose of sharing access to it
    // between an InternalEnumerable and multiple InternalEnumerators
    private class Shared<U>
    {
        internal U Value;

        public Shared(U item)
        {
            Value = item;
        }
    }

    // Internal class that serves as a shared enumerable for the
    // underlying collection.
    private class InternalEnumerable : IEnumerable<KeyValuePair<long, T>>, IDisposable
    {
        IEnumerator<T> m_reader;
        bool m_disposed = false;
        Shared<long> m_index = null;

        // These two are used to implement Dispose() when static partitioning is being performed
        int m_activeEnumerators;
        bool m_downcountEnumerators;

        // "downcountEnumerators" will be true for static partitioning, false for
        // dynamic partitioning.
        public InternalEnumerable(IEnumerator<T> reader, bool downcountEnumerators)
        {
            m_reader = reader;
            m_index = new Shared<long>(0);
            m_activeEnumerators = 0;
            m_downcountEnumerators = downcountEnumerators;
        }

        public IEnumerator<KeyValuePair<long, T>> GetEnumerator()
        {
            if (m_disposed)
                throw new ObjectDisposedException("InternalEnumerable: Can't call GetEnumerator() after disposing");
            // For static partitioning, keep track of the number of active enumerators.
            if (m_downcountEnumerators) Interlocked.Increment(ref m_activeEnumerators);
            return new InternalEnumerator(m_reader, this, m_index);
        }

        IEnumerator<KeyValuePair<long, T>> IEnumerable<KeyValuePair<long, T>>.GetEnumerator()
        {
            return this.GetEnumerator();
        }

        public void Dispose()
        {
            if (!m_disposed)
            {
                // Only dispose the source enumerator if you are doing dynamic partitioning
                if (!m_downcountEnumerators)
                {
                    m_reader.Dispose();
                }
                m_disposed = true;
            }
        }

        // Called from Dispose() method of spawned InternalEnumerator. During
        // static partitioning, the source enumerator will be automatically
        // disposed once all requested InternalEnumerators have been disposed.
        public void DisposeEnumerator()
        {
            if (m_downcountEnumerators)
            {
                if (Interlocked.Decrement(ref m_activeEnumerators) == 0)
                {
                    m_reader.Dispose();
                }
            }
        }

        IEnumerator IEnumerable.GetEnumerator()
        {
            throw new NotImplementedException();
        }
    }

    // Internal class that serves as a shared enumerator for
    // the underlying collection.
    private class InternalEnumerator : IEnumerator<KeyValuePair<long, T>>
    {
        KeyValuePair<long, T> m_current;
        IEnumerator<T> m_source;
        InternalEnumerable m_controllingEnumerable;
        Shared<long> m_index = null;
        bool m_disposed = false;

        public InternalEnumerator(IEnumerator<T> source, InternalEnumerable controllingEnumerable, Shared<long> index)
        {
            m_source = source;
            m_current = default(KeyValuePair<long, T>);
            m_controllingEnumerable = controllingEnumerable;
            m_index = index;
        }

        object IEnumerator.Current
        {
            get { return m_current; }
        }

        KeyValuePair<long, T> IEnumerator<KeyValuePair<long, T>>.Current
        {
            get { return m_current; }
        }

        void IEnumerator.Reset()
        {
            throw new NotSupportedException("Reset() not supported");
        }

        // This method is the crux of this class. Under lock, it calls
        // MoveNext() on the underlying enumerator, grabs Current and index,
        // and increments the index.
        bool IEnumerator.MoveNext()
        {
            bool rval = false;
            lock (m_source)
            {
                rval = m_source.MoveNext();
                if (rval)
                {
                    m_current = new KeyValuePair<long, T>(m_index.Value, m_source.Current);
                    m_index.Value = m_index.Value + 1;
                }
                else m_current = default(KeyValuePair<long, T>);
            }
            return rval;
        }

        void IDisposable.Dispose()
        {
            if (!m_disposed)
            {
                // Delegate to parent enumerable's DisposeEnumerator() method
                m_controllingEnumerable.DisposeEnumerator();
                m_disposed = true;
            }
        }
    }

    // Constructor just grabs the collection to wrap
    public SingleElementOrderablePartitioner(IEnumerable<T> enumerable)
        : base(true, true, true)
    {
        // Verify that the source IEnumerable is not null
        if (enumerable == null)
            throw new ArgumentNullException("enumerable");
        m_referenceEnumerable = enumerable;
    }

    // Produces a list of "numPartitions" IEnumerators that can each be
    // used to traverse the underlying collection in a thread-safe manner.
    // This will return a static number of enumerators, as opposed to
    // GetOrderableDynamicPartitions(), the result of which can be used to produce
    // any number of enumerators.
    public override IList<IEnumerator<KeyValuePair<long, T>>> GetOrderablePartitions(int numPartitions)
    {
        if (numPartitions < 1)
            throw new ArgumentOutOfRangeException("NumPartitions");
        List<IEnumerator<KeyValuePair<long, T>>> list = new List<IEnumerator<KeyValuePair<long, T>>>(numPartitions);
        // Since we are doing static partitioning, create an InternalEnumerable with reference
        // counting of spawned InternalEnumerators turned on. Once all of the spawned enumerators
        // are disposed, dynamicPartitions will be disposed.
        var dynamicPartitions = new InternalEnumerable(m_referenceEnumerable.GetEnumerator(), true);
        for (int i = 0; i < numPartitions; i++)
            list.Add(dynamicPartitions.GetEnumerator());
        return list;
    }

    // Returns an instance of our internal Enumerable class. GetEnumerator()
    // can then be called on that (multiple times) to produce shared enumerators.
    public override IEnumerable<KeyValuePair<long, T>> GetOrderableDynamicPartitions()
    {
        // Since we are doing dynamic partitioning, create an InternalEnumerable with reference
        // counting of spawned InternalEnumerators turned off. This returned InternalEnumerable
        // will need to be explicitly disposed.
        return new InternalEnumerable(m_referenceEnumerable.GetEnumerator(), false);
    }

    // Must be set to true if GetDynamicPartitions() is supported.
    public override bool SupportsDynamicPartitions
    {
        get { return true; }
    }
}
Here are examples of how to structure Parallel.ForEach using the above OrderablePartitioner. See how you can refactor your finally block entirely out of the ForEach implementation?
public class Program
{
    static void Main(string[] args)
    {
        //
        // First a fairly simple visual test
        //
        var someCollection = new string[] { "four", "score", "and", "twenty", "years", "ago" };
        var someOrderablePartitioner = new SingleElementOrderablePartitioner<string>(someCollection);
        Parallel.ForEach(someOrderablePartitioner, (item, state, index) =>
        {
            Console.WriteLine("ForEach: item = {0}, index = {1}, thread id = {2}", item, index, Thread.CurrentThread.ManagedThreadId);
        });

        //
        // Now a more rigorous test of dynamic partitioning (used by Parallel.ForEach)
        //
        List<int> src = Enumerable.Range(0, 100000).ToList();
        SingleElementOrderablePartitioner<int> myOP = new SingleElementOrderablePartitioner<int>(src);
        int counter = 0;
        bool mismatch = false;
        Parallel.ForEach(myOP, (item, state, index) =>
        {
            if (item != index) mismatch = true;
            Interlocked.Increment(ref counter);
        });
        if (mismatch) Console.WriteLine("OrderablePartitioner Test: index mismatch detected");
        Console.WriteLine("OrderablePartitioner test: counter = {0}, should be 100000", counter);
    }
}
Also this link might be useful ("Write a simple parallel.ForEach Loop")
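Closer to what you attempted (a sketch under assumptions, reusing your names MyBroker, rowNumber, Mylist, _lock and WeightageRowNumber): the overload with thread-local state takes a localInit delegate, a body receiving (item, loopState, localValue), and a localFinally delegate that receives each thread's final local value. Something along these lines should compile:
Parallel.ForEach(
    MyBroker,                          // source
    () => 0L,                          // localInit: per-thread counter
    (broker, loopState, localCount) => // body: runs for each item
    {
        var row = new WeightageRowNumber();
        row.RowNumber = Interlocked.Increment(ref rowNumber);
        lock (_lock)
        {
            Mylist.Add(row);
        }
        return localCount + 1;         // carry the per-thread state forward
    },
    localCount =>                      // localFinally: once per thread
        Console.WriteLine("rows added by this thread: " + localCount));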

Ensure deferred execution will be executed only once or else

I ran into a weird issue and I'm wondering what I should do about it.
I have this class that returns an IEnumerable<MyClass> using deferred execution. Right now there are two possible consumers, one of which sorts the result.
See the following example :
public class SomeClass
{
    public IEnumerable<MyClass> GetMyStuff(Param givenParam)
    {
        double culmulativeSum = 0;
        return myStuff.Where(...)
            .OrderBy(...)
            .TakeWhile(o =>
            {
                bool returnValue = culmulativeSum < givenParam.Maximum;
                culmulativeSum += o.SomeNumericValue;
                return returnValue;
            });
    }
}
Consumers call the deferred execution only once, but if they were to call it more than once, the result would be wrong, as culmulativeSum wouldn't be reset. I found the issue by accident while unit testing.
The easiest way for me to fix the issue would be to just add .ToArray() and get rid of the deferred execution at the cost of a little bit of overhead.
I could also add unit test in consumers class to ensure they call it only once, but that wouldn't prevent any new consumer coded in the future from this potential issue.
Another thing that came to my mind was to make subsequent execution throw.
Something like
return myStuff.Where(...)
    .OrderBy(...)
    .TakeWhile(...)
    .ThrowIfExecutedMoreThan(1);
Obviously this doesn't exist.
Would it be a good idea to implement such thing and how would you do it?
Otherwise, if there is a big pink elephant that I don't see, pointing it out will be appreciated. (I feel there is one because this question is about a very basic scenario :| )
EDIT :
Here is a bad consumer usage example :
public class ConsumerClass
{
    public void WhatEverMethod()
    {
        SomeClass some = new SomeClass();
        var stuffs = some.GetMyStuff(param);
        var nb = stuffs.Count();        // first deferred execution
        var firstOne = stuffs.First();  // second deferred execution with the culmulativeSum not reset
    }
}
You can solve the incorrect result issue by simply turning your method into an iterator:
double culmulativeSum = 0;
var query = myStuff.Where(...)
    .OrderBy(...)
    .TakeWhile(...);
foreach (var item in query) yield return item;
It can be encapsulated in a simple extension method:
public static class Iterators
{
    public static IEnumerable<T> Lazy<T>(Func<IEnumerable<T>> source)
    {
        foreach (var item in source())
            yield return item;
    }
}
Then all you need to do in such scenarios is to surround the original method body with Iterators.Lazy call, e.g.:
return Iterators.Lazy(() =>
{
    double culmulativeSum = 0;
    return myStuff.Where(...)
        .OrderBy(...)
        .TakeWhile(...);
});
You can use the following class:
public class JustOnceOrElseEnumerable<T> : IEnumerable<T>
{
    private readonly IEnumerable<T> decorated;

    public JustOnceOrElseEnumerable(IEnumerable<T> decorated)
    {
        this.decorated = decorated;
    }

    private bool CalledAlready;

    public IEnumerator<T> GetEnumerator()
    {
        if (CalledAlready)
            throw new Exception("Enumerated already");
        CalledAlready = true;
        return decorated.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        if (CalledAlready)
            throw new Exception("Enumerated already");
        CalledAlready = true;
        return decorated.GetEnumerator();
    }
}
to decorate an enumerable so that it can only be enumerated once. After that it would throw an exception.
You can use this class like this:
return new JustOnceOrElseEnumerable<MyClass>(
    myStuff.Where(...)
    ...
);
Please note that I do not recommend this approach because it violates the contract of the IEnumerable interface and thus the Liskov Substitution Principle. It is legal for consumers of this contract to assume that they can enumerate the enumerable as many times as they like.
Instead, you can use a cached enumerable that caches the result of enumeration. This ensures that the enumerable is only enumerated once and that all subsequent enumeration attempts would read from the cache. See this answer here for more information.
Ivan's answer is very fitting for the underlying issue in OP's example - but for the general case, I have approached this in the past using an extension method similar to the one below. This ensures that the Enumerable has a single evaluation but is also deferred:
public static IMemoizedEnumerable<T> Memoize<T>(this IEnumerable<T> source)
{
    return new MemoizedEnumerable<T>(source);
}

private class MemoizedEnumerable<T> : IMemoizedEnumerable<T>, IDisposable
{
    private readonly IEnumerator<T> _sourceEnumerator;
    private readonly List<T> _cache = new List<T>();

    public MemoizedEnumerable(IEnumerable<T> source)
    {
        _sourceEnumerator = source.GetEnumerator();
    }

    public IEnumerator<T> GetEnumerator()
    {
        return IsMaterialized ? _cache.GetEnumerator() : Enumerate();
    }

    private IEnumerator<T> Enumerate()
    {
        foreach (var value in _cache)
        {
            yield return value;
        }
        while (_sourceEnumerator.MoveNext())
        {
            _cache.Add(_sourceEnumerator.Current);
            yield return _sourceEnumerator.Current;
        }
        _sourceEnumerator.Dispose();
        IsMaterialized = true;
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    public List<T> Materialize()
    {
        if (IsMaterialized)
            return _cache;
        while (_sourceEnumerator.MoveNext())
        {
            _cache.Add(_sourceEnumerator.Current);
        }
        _sourceEnumerator.Dispose();
        IsMaterialized = true;
        return _cache;
    }

    public bool IsMaterialized { get; private set; }

    void IDisposable.Dispose()
    {
        if (!IsMaterialized)
            _sourceEnumerator.Dispose();
    }
}

public interface IMemoizedEnumerable<T> : IEnumerable<T>
{
    List<T> Materialize();
    bool IsMaterialized { get; }
}
Example Usage:
void Consumer()
{
    //var results = GetValuesComplex();
    //var results = GetValuesComplex().ToList();
    var results = GetValuesComplex().Memoize();

    if (results.Any(i => i == 3))
    {
        Console.WriteLine("\nFirst Iteration");
        //return; //Potential for early exit.
    }

    var last = results.Last(); // Causes multiple enumeration in naive case.
    Console.WriteLine("\nSecond Iteration");
}

IEnumerable<int> GetValuesComplex()
{
    for (int i = 0; i < 5; i++)
    {
        //... complex operations ...
        Console.Write(i + ", ");
        yield return i;
    }
}
Naive: ✔ Deferred, ✘ Single enumeration.
ToList: ✘ Deferred, ✔ Single enumeration.
Memoize: ✔ Deferred, ✔ Single enumeration.
Edited to use the proper terminology and flesh out the implementation.

Confused about IEnumerator interface

I am trying to understand how to use the IEnumerator interface and what it is used for. I have a class which implements the IEnumerator interface. A string array is passed to the constructor method.
The problem is that when I execute the code, the array is not listed properly. It should come out in the order "ali", "veli", "hatca", but the console lists "veli", "hatca" and -1. I am so confused. What am I doing wrong here? Can you please help?
static void Main(string[] args)
{
    ogr o = new ogr();
    while (o.MoveNext())
    {
        Console.WriteLine(o.Current.ToString());
    }
}

public class ogr : IEnumerator
{
    ArrayList array_ = new ArrayList();
    string[] names = new string[] {
        "ali", "veli", "hatca"
    };

    public ogr()
    {
        array_.AddRange(names);
    }

    public void addOgr(string name)
    {
        array_.Add(name);
    }

    int position;

    public object Current
    {
        get
        {
            if (position >= 0 && position < array_.Count)
            {
                return array_[position];
            }
            else
            {
                return -1;
            }
        }
    }

    public bool MoveNext()
    {
        if (position < array_.Count && position >= 0)
        {
            position++;
            return true;
        }
        else
        {
            return false;
        }
    }

    public void Reset()
    {
        position = 0;
    }
}
IEnumerator is quite difficult to grasp at first, but luckily it's an interface you hardly ever use on its own. Instead, you should probably implement IEnumerable<T>.
However, the source of your confusion comes from this line in the IEnumerator documentation:
Initially, the enumerator is positioned before the first element in the collection. The Reset method also brings the enumerator back to this position. After an enumerator is created or the Reset method is called, you must call the MoveNext method to advance the enumerator to the first element of the collection before reading the value of Current; otherwise, Current is undefined.
Your implementation has its current position at 0 initially, instead of -1, causing the strange behavior. Your enumerator begins with Current on the first element instead of being before it.
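A minimal fix along those lines (a sketch, not part of the original answer) is to start position before the first element and advance before testing:
// Start before the first element, as the documentation requires.
int position = -1;

public bool MoveNext()
{
    // Advance first; report whether we landed on a valid element.
    if (position < array_.Count - 1)
    {
        position++;
        return true;
    }
    return false;
}

public void Reset()
{
    position = -1; // back to 'before the first element'
}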
It is pretty rare for people to use that API directly. More commonly, it is simply used via the foreach statement, i.e.
foreach(var value in someEnumerable) { ... }
where someEnumerable implements IEnumerable, IEnumerable<T> or just the duck-typed pattern. Your class ogr certainly isn't an IEnumerator, and shouldn't be made to try to act like one.
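For reference, the foreach statement expands to roughly the following (a sketch of the compiler's pattern; the exact shape depends on the types involved), which is why Current is only read after a first successful MoveNext():
IEnumerator enumerator = someEnumerable.GetEnumerator();
try
{
    while (enumerator.MoveNext())
    {
        var value = enumerator.Current;
        // ... loop body ...
    }
}
finally
{
    // Dispose only if the enumerator supports it.
    IDisposable disposable = enumerator as IDisposable;
    if (disposable != null) disposable.Dispose();
}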
If the intent is for ogr to be enumerable, then:
public class ogr : IEnumerable {
    IEnumerator IEnumerable.GetEnumerator() {
        return array_.GetEnumerator();
    }
}
I suspect it would be better to be IEnumerable<string>, though, using List<string> as the backing list:
public class SomeType : IEnumerable<string> {
    private readonly List<string> someField = new List<string>();

    public IEnumerator<string> GetEnumerator()
    { return someField.GetEnumerator(); }

    IEnumerator IEnumerable.GetEnumerator()
    { return someField.GetEnumerator(); }
}

Is there an IEnumerable implementation that only iterates over its source (e.g. LINQ) once?

Provided items is the result of a LINQ expression:
var items = from item in ItemsSource.RetrieveItems()
            where ...
Suppose generation of each item takes a non-negligible amount of time.
Two modes of operation are possible:
Using foreach would allow us to start working with items at the beginning of the collection much sooner than those at the end become available. However, if we wanted to process the same collection again later, we would have to save a copy:
var storedItems = new List<Item>();
foreach (var item in items)
{
    Process(item);
    storedItems.Add(item);
}
// Later
foreach (var item in storedItems)
{
    ProcessMore(item);
}
Because if we just did foreach(... in items) again, ItemsSource.RetrieveItems() would get called again.
We could use .ToList() right up front, but that would force us to wait for the last item to be retrieved before we could start processing the first one.
Question: Is there an IEnumerable implementation that would iterate the first time like a regular LINQ query result, but would materialize in the process, so that a second foreach iterates over the stored values?
A fun challenge, so I have to provide my own solution. So fun, in fact, that my solution is now at version 3. Version 2 was a simplification I made based on feedback from Servy. I then realized that my solution had a huge drawback: if the first enumeration of the cached enumerable didn't complete, no caching would be done. Many LINQ extensions like First and Take only enumerate enough of the enumerable to get the job done, and I had to update to version 3 to make caching work in those cases.
The question is about subsequent enumerations of the enumerable, which does not involve concurrent access. Nevertheless, I have decided to make my solution thread-safe. It adds some complexity and a bit of overhead, but should allow the solution to be used in all scenarios.
public static class EnumerableExtensions {
    public static IEnumerable<T> Cached<T>(this IEnumerable<T> source) {
        if (source == null)
            throw new ArgumentNullException("source");
        return new CachedEnumerable<T>(source);
    }
}

class CachedEnumerable<T> : IEnumerable<T> {
    readonly Object gate = new Object();
    readonly IEnumerable<T> source;
    readonly List<T> cache = new List<T>();
    IEnumerator<T> enumerator;
    bool isCacheComplete;

    public CachedEnumerable(IEnumerable<T> source) {
        this.source = source;
    }

    public IEnumerator<T> GetEnumerator() {
        lock (this.gate) {
            if (this.isCacheComplete)
                return this.cache.GetEnumerator();
            if (this.enumerator == null)
                this.enumerator = source.GetEnumerator();
        }
        return GetCacheBuildingEnumerator();
    }

    public IEnumerator<T> GetCacheBuildingEnumerator() {
        var index = 0;
        T item;
        while (TryGetItem(index, out item)) {
            yield return item;
            index += 1;
        }
    }

    bool TryGetItem(Int32 index, out T item) {
        lock (this.gate) {
            if (!IsItemInCache(index)) {
                // The iteration may have completed while waiting for the lock.
                if (this.isCacheComplete) {
                    item = default(T);
                    return false;
                }
                if (!this.enumerator.MoveNext()) {
                    item = default(T);
                    this.isCacheComplete = true;
                    this.enumerator.Dispose();
                    return false;
                }
                this.cache.Add(this.enumerator.Current);
            }
            item = this.cache[index];
            return true;
        }
    }

    bool IsItemInCache(Int32 index) {
        return index < this.cache.Count;
    }

    IEnumerator IEnumerable.GetEnumerator() {
        return GetEnumerator();
    }
}
The extension is used like this (sequence is an IEnumerable<T>):
var cachedSequence = sequence.Cached();

// Pulling 2 items from the sequence.
foreach (var item in cachedSequence.Take(2))
    // ...

// Pulling 2 items from the cache and the rest from the source.
foreach (var item in cachedSequence)
    // ...

// Pulling all items from the cache.
foreach (var item in cachedSequence)
    // ...
There is a slight leak if only part of the enumerable is enumerated (e.g. cachedSequence.Take(2).ToList()). The enumerator that is used by ToList will be disposed, but the underlying source enumerator is not. This is because the first 2 items are cached and the source enumerator is kept alive should requests for subsequent items be made. In that case the source enumerator is only cleaned up when it becomes eligible for garbage collection (which will be at the same time as the possibly large cache).
Take a look at the Reactive Extensions library - there is a MemoizeAll() extension which will cache the items in your IEnumerable once they're accessed and store them for future accesses.
See this blog post by Bart De Smet for a good read on MemoizeAll and other Rx methods.
Edit: This is actually found in the separate Interactive Extensions package now - available from NuGet or Microsoft Download.
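A minimal usage sketch (an addition for illustration; assumes the System.Interactive NuGet package, where the operator is exposed as Memoize in current versions, and reuses the question's names):
using System.Linq; // EnumerableEx extensions from System.Interactive

// Items are produced once and replayed from an internal buffer afterwards.
var items = ItemsSource.RetrieveItems().Memoize();
foreach (var item in items) Process(item);     // first pass pulls from the source
foreach (var item in items) ProcessMore(item); // second pass replays the buffer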
public static IEnumerable<T> SingleEnumeration<T>(this IEnumerable<T> source)
{
    return new SingleEnumerator<T>(source);
}

private class SingleEnumerator<T> : IEnumerable<T>
{
    private CacheEntry<T> cacheEntry;

    public SingleEnumerator(IEnumerable<T> sequence)
    {
        cacheEntry = new CacheEntry<T>(sequence.GetEnumerator());
    }

    public IEnumerator<T> GetEnumerator()
    {
        if (cacheEntry.FullyPopulated)
        {
            return cacheEntry.CachedValues.GetEnumerator();
        }
        else
        {
            return iterateSequence<T>(cacheEntry).GetEnumerator();
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return this.GetEnumerator();
    }
}

private static IEnumerable<T> iterateSequence<T>(CacheEntry<T> entry)
{
    using (var iterator = entry.CachedValues.GetEnumerator())
    {
        int i = 0;
        while (entry.ensureItemAt(i) && iterator.MoveNext())
        {
            yield return iterator.Current;
            i++;
        }
    }
}

private class CacheEntry<T>
{
    public bool FullyPopulated { get; private set; }
    public ConcurrentQueue<T> CachedValues { get; private set; }

    private static object key = new object();
    private IEnumerator<T> sequence;

    public CacheEntry(IEnumerator<T> sequence)
    {
        this.sequence = sequence;
        CachedValues = new ConcurrentQueue<T>();
    }

    /// <summary>
    /// Ensure that the cache has an item at the provided index. If not, take an item from the
    /// input sequence and move it to the cache.
    ///
    /// The method is thread safe.
    /// </summary>
    /// <returns>True if the cache already had enough items or
    /// an item was moved to the cache,
    /// false if there were no more items in the sequence.</returns>
    public bool ensureItemAt(int index)
    {
        // if the cache already has the item we don't need to lock to know we
        // can get it
        if (index < CachedValues.Count)
            return true;
        // if we're done there are no race conditions here either
        if (FullyPopulated)
            return false;

        lock (key)
        {
            // re-check the early-exit conditions in case they changed while we were
            // waiting on the lock.

            // we already have the cached item
            if (index < CachedValues.Count)
                return true;
            // we don't have the cached item and there are no uncached items
            if (FullyPopulated)
                return false;

            // we actually need to get the next item from the sequence.
            if (sequence.MoveNext())
            {
                CachedValues.Enqueue(sequence.Current);
                return true;
            }
            else
            {
                FullyPopulated = true;
                return false;
            }
        }
    }
}
So this has been edited (substantially) to support multithreaded access. Several threads can ask for items, and on an item by item basis, they will be cached. It doesn't need to wait for the entire sequence to be iterated for it to return cached values. Below is a sample program that demonstrates this:
private static IEnumerable<int> interestingIntGenerationMethod(int maxValue)
{
    for (int i = 0; i < maxValue; i++)
    {
        Thread.Sleep(1000);
        Console.WriteLine("actually generating value: {0}", i);
        yield return i;
    }
}

public static void Main(string[] args)
{
    IEnumerable<int> sequence = interestingIntGenerationMethod(10)
        .SingleEnumeration();
    int numThreads = 3;
    for (int i = 0; i < numThreads; i++)
    {
        int taskID = i;
        Task.Factory.StartNew(() =>
        {
            foreach (int value in sequence)
            {
                Console.WriteLine("Task: {0} Value:{1}",
                    taskID, value);
            }
        });
    }
    Console.WriteLine("Press any key to exit...");
    Console.ReadKey(true);
}
You really need to see it run to understand the power here. As soon as a single thread forces the next actual value to be generated, all of the remaining threads can immediately print that generated value, but they will all be waiting if there are no uncached values for that thread to print. (Obviously thread/threadpool scheduling may result in one task taking longer than needed to print its value.)
There have already been posted thread-safe implementations of the Cached/SingleEnumeration operator by Martin Liversage and Servy respectively, and the thread-safe Memoise operator from the System.Interactive package is also available. In case thread-safety is not a requirement, and paying the cost of thread-synchronization is undesirable, there are answers offering unsynchronized ToCachedEnumerable implementations in this question. All these implementations have in common that they are based on custom types. My challenge was to write a similar not-synchronized operator in a single self-contained extension method (no strings attached). Here is my implementation:
public static IEnumerable<T> MemoiseNotSynchronized<T>(this IEnumerable<T> source)
{
    // Argument validation omitted
    IEnumerator<T> enumerator = null;
    List<T> buffer = null;
    return Implementation();

    IEnumerable<T> Implementation()
    {
        if (buffer != null && enumerator == null)
        {
            // The source has been fully enumerated
            foreach (var item in buffer) yield return item;
            yield break;
        }

        enumerator ??= source.GetEnumerator();
        buffer ??= new();
        for (int i = 0; ; i = checked(i + 1))
        {
            if (i < buffer.Count)
            {
                yield return buffer[i];
            }
            else if (enumerator.MoveNext())
            {
                Debug.Assert(buffer.Count == i);
                var current = enumerator.Current;
                buffer.Add(current);
                yield return current;
            }
            else
            {
                enumerator.Dispose(); enumerator = null;
                yield break;
            }
        }
    }
}
Usage example:
IEnumerable<Point> points = GetPointsFromDB().MemoiseNotSynchronized();
// Enumerate the 'points' any number of times, on a single thread.
// The data will be fetched from the DB only once.
// The connection with the DB will open when the 'points' is enumerated
// for the first time, partially or fully.
// The connection will stay open until the 'points' is enumerated fully
// for the first time.
Testing the MemoiseNotSynchronized operator on Fiddle.

Look if a method is called inside a method using reflection

I'm working with reflection and currently have a MethodBody. How do I check if a specific method is called inside the MethodBody?
Assembly assembly = Assembly.Load("Module1");
Type type = assembly.GetType("Module1.ModuleInit");
MethodInfo mi = type.GetMethod("Initialize");
MethodBody mb = mi.GetMethodBody();
Use Mono.Cecil. It is a single standalone assembly that will work on Microsoft .NET as well as Mono. (I think I used version 0.6 or thereabouts back when I wrote the code below)
Say you have a number of assemblies
IEnumerable<AssemblyDefinition> assemblies;
Get these using AssemblyFactory (load one?)
The following snippet would enumerate all usages of methods in all types of these assemblies
methodUsages = assemblies
    .SelectMany(assembly => assembly.MainModule.Types.Cast<TypeDefinition>())
    .SelectMany(type => type.Methods.Cast<MethodDefinition>())
    .Where(method => null != method.Body) // allow abstracts and generics
    .SelectMany(method => method.Body.Instructions.Cast<Instruction>())
    .Select(instr => instr.Operand)
    .OfType<MethodReference>();
This will return all references to methods (including uses in reflection, or to construct expressions which may or may not be executed). As such, this is probably not very useful, except to show what can be done with the Cecil API without too much effort :)
Note that this sample assumes a somewhat older version of Cecil (the one in mainstream Mono versions). Newer versions are more succinct (using strongly typed generic collections) and faster.
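For comparison (a sketch, not verified against a specific Cecil release), the same query against the newer strongly typed API would drop the casts:
// Same query with newer Mono.Cecil APIs; the collections are strongly typed,
// so the Cast<...>() calls disappear.
var methodUsages = assemblies
    .SelectMany(assembly => assembly.MainModule.Types)
    .SelectMany(type => type.Methods)
    .Where(method => method.HasBody)
    .SelectMany(method => method.Body.Instructions)
    .Select(instr => instr.Operand)
    .OfType<MethodReference>();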
Of course in your case you could have a single method reference as starting point. Say you want to detect when 'mytargetmethod' can actually be called directly inside 'startingpoint':
MethodReference startingpoint;  // get it somewhere using Cecil
MethodReference mytargetmethod; // what you are looking for

bool isCalled = startingpoint
    .GetOriginalMethod() // jump to original (for generics e.g.)
    .Resolve()           // get the definition from the IL image
    .Body.Instructions.Cast<Instruction>()
    .Any(i => i.OpCode == OpCodes.Callvirt && i.Operand == mytargetmethod);
Call Tree Search
Here is a working snippet that allows you to recursively search to (selected) methods that call each other (indirectly).
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using Mono.Cecil;
using Mono.Cecil.Cil;

namespace StackOverflow
{
    /*
     * breadth-first lazy search across a subset of the call tree rooting in startingPoint
     *
     * methodSelect selects the methods to recurse into
     * resultGen generates the result objects to be returned by the enumerator
     *
     */
    class CallTreeSearch<T> : BaseCodeVisitor, IEnumerable<T> where T : class
    {
        private readonly Func<MethodReference, bool> _methodSelect;
        private readonly Func<Instruction, Stack<MethodReference>, T> _transform;
        private readonly IEnumerable<MethodDefinition> _startingPoints;
        private readonly IDictionary<MethodDefinition, Stack<MethodReference>> _chain = new Dictionary<MethodDefinition, Stack<MethodReference>>();
        private readonly ICollection<MethodDefinition> _seen = new HashSet<MethodDefinition>(new CompareMembers<MethodDefinition>());
        private readonly ICollection<T> _results = new HashSet<T>();
        private Stack<MethodReference> _currentStack;
        private const int InfiniteRecursion = -1;
        private readonly int _maxrecursiondepth;
        private bool _busy;

        public CallTreeSearch(IEnumerable<MethodDefinition> startingPoints,
                              Func<MethodReference, bool> methodSelect,
                              Func<Instruction, Stack<MethodReference>, T> resultGen)
            : this(startingPoints, methodSelect, resultGen, InfiniteRecursion)
        {
        }

        public CallTreeSearch(IEnumerable<MethodDefinition> startingPoints,
                              Func<MethodReference, bool> methodSelect,
                              Func<Instruction, Stack<MethodReference>, T> resultGen,
                              int maxrecursiondepth)
        {
            _startingPoints = startingPoints.ToList();
            _methodSelect = methodSelect;
            _maxrecursiondepth = maxrecursiondepth;
            _transform = resultGen;
        }

        public override void VisitMethodBody(MethodBody body)
        {
            _seen.Add(body.Method); // avoid infinite recursion
            base.VisitMethodBody(body);
        }

        public override void VisitInstructionCollection(InstructionCollection instructions)
        {
            foreach (Instruction instr in instructions)
                VisitInstruction(instr);
            base.VisitInstructionCollection(instructions);
        }

        public override void VisitInstruction(Instruction instr)
        {
            T result = _transform(instr, _currentStack);
            if (result != null)
                _results.Add(result);

            var methodRef = instr.Operand as MethodReference; // TODO select calls only?
            if (methodRef != null && _methodSelect(methodRef))
            {
                var resolve = methodRef.Resolve();
                if (null != resolve && !(_chain.ContainsKey(resolve) || _seen.Contains(resolve)))
                    _chain.Add(resolve, new Stack<MethodReference>(_currentStack.Reverse()));
            }
            base.VisitInstruction(instr);
        }

        public IEnumerator<T> GetEnumerator()
        {
            lock (this) // not multithread safe
            {
                if (_busy)
                    throw new InvalidOperationException("CallTreeSearch enumerator is not reentrant");
                _busy = true;

                try
                {
                    int recursionLevel = 0;
                    ResetToStartingPoints();
                    while (_chain.Count > 0 &&
                           ((InfiniteRecursion == _maxrecursiondepth) || recursionLevel++ <= _maxrecursiondepth))
                    {
                        // swap out the collection because the Visitor will modify it
                        var clone = new Dictionary<MethodDefinition, Stack<MethodReference>>(_chain);
                        _chain.Clear();

                        foreach (var call in clone.Where(call => HasBody(call.Key)))
                        {
                            // Console.Error.Write("\rCallTreeSearch: level #{0}, scanning {1,-20}\r", recursionLevel, call.Key.Name + new string(' ', 21));
                            _currentStack = call.Value;
                            _currentStack.Push(call.Key);
                            try
                            {
                                _results.Clear();
                                call.Key.Body.Accept(this); // grows _chain and _results
                            }
                            finally
                            {
                                _currentStack.Pop();
                            }
                            _currentStack = null;

                            foreach (var result in _results)
                                yield return result;
                        }
                    }
                }
                finally
                {
                    _busy = false;
                }
            }
        }

        private void ResetToStartingPoints()
        {
            _chain.Clear();
            _seen.Clear();
            foreach (var startingPoint in _startingPoints)
            {
                _chain.Add(startingPoint, new Stack<MethodReference>());
                _seen.Add(startingPoint);
            }
        }

        private static bool HasBody(MethodDefinition methodDefinition)
        {
            return !(methodDefinition.IsAbstract || methodDefinition.Body == null);
        }

        IEnumerator IEnumerable.GetEnumerator()
        {
            return GetEnumerator();
        }
    }

    internal class CompareMembers<T> : IComparer<T>, IEqualityComparer<T>
        where T : class, IMemberReference
    {
        public int Compare(T x, T y)
        { return StringComparer.InvariantCultureIgnoreCase.Compare(KeyFor(x), KeyFor(y)); }

        public bool Equals(T x, T y)
        { return KeyFor(x).Equals(KeyFor(y)); }

        private static string KeyFor(T mr)
        { return null == mr ? "" : String.Format("{0}::{1}", mr.DeclaringType.FullName, mr.Name); }

        public int GetHashCode(T obj)
        { return KeyFor(obj).GetHashCode(); }
    }
}
Notes:
do some error handling around Resolve() (I have an extension method TryResolve() for the purpose)
optionally select usages of MethodReferences in call operations (call, calli, callvirt ...) only (see the //TODO; a sketch follows below)
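A sketch of that //TODO filter (an addition, with newobj included as a judgment call; note that calli's operand is a call-site signature rather than a MethodReference, so it is omitted here):
// Only treat genuine invocation opcodes as method calls.
static bool IsCall(Instruction instr)
{
    return instr.OpCode == OpCodes.Call
        || instr.OpCode == OpCodes.Callvirt
        || instr.OpCode == OpCodes.Newobj; // constructor invocations
}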
Typical usage:
public static IEnumerable<T> SearchCallTree<T>(this TypeDefinition startingClass,
                                               Func<MethodReference, bool> methodSelect,
                                               Func<Instruction, Stack<MethodReference>, T> resultFunc,
                                               int maxdepth)
    where T : class
{
    return new CallTreeSearch<T>(startingClass.Methods.Cast<MethodDefinition>(), methodSelect, resultFunc, maxdepth);
}

public static IEnumerable<T> SearchCallTree<T>(this MethodDefinition startingMethod,
                                               Func<MethodReference, bool> methodSelect,
                                               Func<Instruction, Stack<MethodReference>, T> resultFunc,
                                               int maxdepth)
    where T : class
{
    return new CallTreeSearch<T>(new[] { startingMethod }, methodSelect, resultFunc, maxdepth);
}

// Actual usage:
private static IEnumerable<TypeUsage> SearchMessages(TypeDefinition uiType, bool onlyConstructions)
{
    return uiType.SearchCallTree(IsBusinessCall,
        (instruction, stack) => DetectRequestUsage(instruction, stack, onlyConstructions));
}
Note that completing a function like DetectRequestUsage to suit your needs is entirely up to you (edit: but see here). You can do whatever you want, and don't forget: you'll have the complete statically analyzed call stack at your disposal, so you can do pretty neat things with all that information!
Before it generates code, it must check if it already exists
There are a few cases where catching an exception is way cheaper than preventing it from being generated. This is a prime example. You can get the IL for the method body but Reflection is not a disassembler. Nor is a disassembler a real fix, you'd have the disassemble the entire call tree to implement your desired behavior. After all, a method call in the body could itself call a method, etcetera. It is just much simpler to catch the exception that the jitter will throw when it compiles the IL.
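To illustrate that approach (a sketch; RuntimeHelpers.PrepareMethod is the documented way to force JIT compilation, applied here to the question's mi variable):
// Force the jitter to compile the method body now; member tokens in the body
// are resolved at this point, so a missing callee surfaces as an exception
// here rather than at call time.
try
{
    System.Runtime.CompilerServices.RuntimeHelpers.PrepareMethod(mi.MethodHandle);
    Console.WriteLine("Method compiled; its direct callees resolved.");
}
catch (Exception ex)
{
    Console.WriteLine("JIT compilation failed: " + ex.Message);
}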
One can use the StackTrace class:
System.Diagnostics.StackTrace st = new System.Diagnostics.StackTrace();
System.Diagnostics.StackFrame sf = st.GetFrame(1);
Console.Out.Write(sf.GetMethod().ReflectedType.Name + "." + sf.GetMethod().Name);
The 1 can be adjusted and determines the frame you are interested in (frame 0 being the current method, 1 its caller, and so on).
