Extending LINQ to accept nullable enumerables - c#

While working with Linq extensions it's normal to see code like this:
IEnumerable<int> enumerable = GetEnumerable();
int sum = 0;
if (enumerable != null)
{
sum = enumerable.Sum();
}
In order to enhance the code quality, I wrote the following extension method that checks for nullable enumerables and breaks the linq execution.
public static IEnumerable<T> IgnoreIfEmpty<T>(this IEnumerable<T> enumerable)
{
if (enumerable == null) yield break;
foreach (var item in enumerable)
{
yield return item;
}
}
So, I can refactor the code to be like this:
var sum = GetEnumerable().IgnoreIfEmpty().Sum();
My questions now:
What penalties are associated with my extension method at runtime?
Is it's a good practice to extend linq that way?
Update:
My target framework is: 3.5

What penalties are associated with my extension method at runtime?
Your extension method is transformed into a state-machine, so there's the minimal overhead of that, but that shouldn't be noticeable.
Is it's a good practice to extend linq that way?
In your question you state:
While working with Linq extensions it's normal to see code like this (insert enumerable null check here)
And I beg to differ. The common practice says don't return null where an IEnumerable<T> is expected. Most cases should return an empty collection (or IEnumerable), leaving null to the exceptional, because null is not empty. This would make your method entirely redundant. Use Enumerable.Empty<T> where needed.

You're going to have a method call overhead, it will be negligible unless you are running it in a tight loop or a performance criticial scenario. It's but a shadow in comparison to something like a database call or writing to a file system. Note that the method is probably not going to be inlined, since it's an enumerator.
It's all about readability / maintainability. What do I expect to happen when I see GetEnumerable().IgnoreIfEmpty().Sum();? In this case, it makes sense.
Note that with C# 6 we can use the following syntax: GetEnumerable()?.Sum() which returns an int?. You could write GetEnumerable()?.Sum() ?? 0 or GetEnumerable()?.Sum().GetValueOrDefault() to get a non-null integer that will default to zero.
If you are truly concerned with performance, you could also slightly refactor your method so that it's not an enumerator. This may increase the chance of inlining, although I have no idea of the 'arcane' logic of the JIT compiler:
public static IEnumerable<T> IgnoreIfEmpty<T>(this IEnumerable<T> enumerable)
{
if (enumerable == null) return Enumerable.Empty<T>();
return enumerable;
}
More generally about extending Linq, I think it is perfectly fine as long as the code makes sense. MSDN even has an article about it. If you look at the standard Where, Select methods in Linq, and forget about the performance optimizations they have in there, the methods are all mostly one-liner methods.

You can skip the additional extension method and use null coalescing operator - this is what it's for, and a one-time check for nullability should be a lot more efficient than another state machine:
IEnumerable<int> enumerable = GetEnumerable();
int sum = 0;
sum = (enumerable ?? Enumerable.Empty<int>()).Sum();

Most of the times we write a lot of code just because we are enchanted by the beauty of our creation - not because we really need it - and then we call it abstraction, reusability, extensibility, etc..
Is this raw piece less readable or less extensible or less reuseable or slower :
var sum = GetEnumerable().Where(a => a != null).Sum();
The less code you write - the less code you test - keep it simple.
BTW - it is good to write extension methods if you can justify it.

Related

Is there a convenient way to filter a sequence of C# 8.0 nullable references, retaining only non-nulls?

I have code like this:
IEnumerable<string?> items = new [] { "test", null, "this" };
var nonNullItems = items.Where(item => item != null); //inferred as IEnumerable<string?>
var lengths = nonNullItems.Select(item => item.Length); //nullability warning here
Console.WriteLine(lengths.Max());
How can I write this code in a convenient way such that:
There is no nullability warning, because the type nonNullItems is inferred as IEnumerable<string>.
I don't need to add unchecked non-nullability assertions like item! (because I want to benefit from the compilers sanity checking, and not rely on me being an error-free coder)
I don't add runtime checked non-nullability assertions (because that's pointless overhead both in code-size and at runtime, and in case of human error that fails later than ideal).
The solution or coding pattern can apply more generally to other sequences of items of nullable-reference type.
I'm aware of this solution, which leverages the flow-sensitive typing in the C# 8.0 compiler, but it's.... not so pretty, mostly because it's so long and noisy:
var notNullItems = items.SelectMany(item =>
item != null ? new[] { item } : Array.Empty<string>())
);
Is there a better alternative?
I think you'll have to help the compiler in either one way or another. Calling .Where() is never safe of returning not-null. Maybe Microsoft could add some logic to determine basic scenarios like yours, but AFAIK that's not the situation right now.
However, you could write a simple extension method like that:
public static class Extension
{
public static IEnumerable<T> WhereNotNull<T>(this IEnumerable<T?> o) where T:class
{
return o.Where(x => x != null)!;
}
}
Unfortunately you will have to tell the compiler that you know more about the situation than it does.
One reason would be that the Where method has not been annotated in a way that lets the compiler understand the guarantee for non-nullability, nor is it actually possible to annotate it. There might be a case for having additional heuristics added to the compiler to understand some basic cases, like this one, but currently we do not have it.
As such, one option would be to use the null forgiving operator, colloquially known as the "dammit operator". You touch upon this yourself, however, instead of sprinkling exclamation marks all over the code where you use the collection, you can instead tuck on an additional step on producing the collection which, at least to me, makes it more palatable:
var nonNullItems = items.Where(item => item != null).Select(s => s!);
This will flag nonNullItems as IEnumerable<string> instead of IEnumerable<string?>, and thus be handled correctly in the rest of your code.
I don't know if this answer meets the criteria for your 3rd bullet point, but then your .Where() filter does not either, so...
Replace
var nonNullItems = items.Where(item => item != null)
with
var nonNullItems = items.OfType<string>()
This will yield an inferred type of IEnumerable<string> for nonNullItems, and this technique can be applied to any nullable reference type.
FWIW special support is being considered for C# 10: https://github.com/dotnet/csharplang/issues/3951

How to make [example] extension method more generic/functional/efficient?

I needed a double[] split into groups of x elements by stride y returning a List. Pretty basic...a loop and/or some linq and your all set. However, I have not been spending much time on extension methods and this looked like a good candidate for some practice. The naive version returns what I am looking for in my current application....
(A)
public static IList<T[]> Split<T>(this IEnumerable<T> source, int every, int take)
{
/*... throw E if X is insane ...*/
var result = source
.Where ((t, i) => i % every == 0)
.Select((t, i) => source.Skip(i * every).Take(take).ToArray())
.ToList();
return result;
}
...the return type is sort of generic...depending on your definition of generic.
I would think...
(B)
public static IEnumerable<IEnumerable<T>> Split<T>
(this IEnumerable<T> source,int every, int take){/*...*/}
...is a better solution...maybe.
Question(s):
Is (B) preferred ?...Why ?
How would you cast (B) as IList <T[]> ?
Any benefit in refactoring ? possibly
two methods that might be chained or the like.
Is the approach sound ?...or have I
missed something basic.
Comments, opinions and harsh language are always appreciated.
Usage Context: C# .Net 4.0
B is probably the better option. Really the major change is that the consumer of the code has the option to make it a list using ToList() on the end of your method, instead of being forced to deal with a List (an IList, actually, which cannot be iterated).
This has a LOT of advantages in method chaining and general use. It's easy to ToList() an enumerable, but hard to go the other way. So, you can call Select().Split().OrderBy() on a list and use the results in a foreach statement without having to have Linq iterate through the whole thing at once.
Refactoring to yield return single values MIGHT get you a performance bonus, but since you're basically just returning the iterator that the Select gave you (which will yield one item at a time itself) I don't think you'll get much benefit in yielding through it yourself.
I would prefer (B) as it looks more flexible. One way of casting the output of the (B) method to an IList<T[]> is as simple as chaining .Select(x => x.ToArray()).ToList() to it, e.g.,
var foo =
bar.Split(someEvery, someTake).Select(x => x.ToArray()).ToList();
In .Net 4, you can just change the return type to IEnumerable<IEnumerable<T>> and it will work.
Before .Net 4, you would have to cast the internal lists to IEnumerable first, by just calling .Cast<IEnumerable<T>>() on your result before returning.

In C#, why can't an anonymous method contain a yield statement?

I thought it would be nice to do something like this (with the lambda doing a yield return):
public IList<T> Find<T>(Expression<Func<T, bool>> expression) where T : class, new()
{
IList<T> list = GetList<T>();
var fun = expression.Compile();
var items = () => {
foreach (var item in list)
if (fun.Invoke(item))
yield return item; // This is not allowed by C#
}
return items.ToList();
}
However, I found out that I can't use yield in anonymous method. I'm wondering why. The yield docs just say it is not allowed.
Since it wasn't allowed I just created List and added the items to it.
Eric Lippert recently wrote a series of blog posts about why yield is not allowed in some cases.
Part 1
Part 2
Part 3
Part 4
Part 5
Part 6
EDIT2:
Part 7 (this one was posted later and specifically addresses this question)
You will probably find the answer there...
EDIT1: this is explained in the comments of Part 5, in Eric's answer to Abhijeet Patel's comment:
Q :
Eric,
Can you also provide some insight into
why "yields" are not allowed inside an
anonymous method or lambda expression
A :
Good question. I would love to have
anonymous iterator blocks. It would be
totally awesome to be able to build
yourself a little sequence generator
in-place that closed over local
variables. The reason why not is
straightforward: the benefits don't
outweigh the costs. The awesomeness of
making sequence generators in-place is
actually pretty small in the grand
scheme of things and nominal methods
do the job well enough in most
scenarios. So the benefits are not
that compelling.
The costs are large. Iterator
rewriting is the most complicated
transformation in the compiler, and
anonymous method rewriting is the
second most complicated. Anonymous
methods can be inside other anonymous
methods, and anonymous methods can be
inside iterator blocks. Therefore,
what we do is first we rewrite all
anonymous methods so that they become
methods of a closure class. This is
the second-last thing the compiler
does before emitting IL for a method.
Once that step is done, the iterator
rewriter can assume that there are no
anonymous methods in the iterator
block; they've all be rewritten
already. Therefore the iterator
rewriter can just concentrate on
rewriting the iterator, without
worrying that there might be an
unrealized anonymous method in there.
Also, iterator blocks never "nest",
unlike anonymous methods. The iterator
rewriter can assume that all iterator
blocks are "top level".
If anonymous methods are allowed to
contain iterator blocks, then both
those assumptions go out the window.
You can have an iterator block that
contains an anonymous method that
contains an anonymous method that
contains an iterator block that
contains an anonymous method, and...
yuck. Now we have to write a rewriting
pass that can handle nested iterator
blocks and nested anonymous methods at
the same time, merging our two most
complicated algorithms into one far
more complicated algorithm. It would
be really hard to design, implement,
and test. We are smart enough to do
so, I'm sure. We've got a smart team
here. But we don't want to take on
that large burden for a "nice to have
but not necessary" feature. -- Eric
Eric Lippert has written an excellent series of articles on the limitations (and design decisions influencing those choices) on iterator blocks
In particular iterator blocks are implemented by some sophisticated compiler code transformations. These transformations would impact with the transformations which happen inside anonymous functions or lambdas such that in certain circumstances they would both try to 'convert' the code into some other construct which was incompatible with the other.
As a result they are forbidden from interaction.
How iterator blocks work under the hood is dealt with well here.
As a simple example of an incompatibility:
public IList<T> GreaterThan<T>(T t)
{
IList<T> list = GetList<T>();
var items = () => {
foreach (var item in list)
if (fun.Invoke(item))
yield return item; // This is not allowed by C#
}
return items.ToList();
}
The compiler is simultaneously wanting to convert this to something like:
// inner class
private class Magic
{
private T t;
private IList<T> list;
private Magic(List<T> list, T t) { this.list = list; this.t = t;}
public IEnumerable<T> DoIt()
{
var items = () => {
foreach (var item in list)
if (fun.Invoke(item))
yield return item;
}
}
}
public IList<T> GreaterThan<T>(T t)
{
var magic = new Magic(GetList<T>(), t)
var items = magic.DoIt();
return items.ToList();
}
and at the same time the iterator aspect is trying to do it's work to make a little state machine. Certain simple examples might work with a fair amount of sanity checking (first dealing with the (possibly arbitrarily) nested closures) then seeing if the very bottom level resulting classes could be transformed into iterator state machines.
However this would be
Quite a lot of work.
Couldn't possibly work in all cases without at the very least the iterator block aspect being able to prevent the closure aspect from applying certain transformations for efficiency (like promoting local variables to instance variables rather than a fully fledged closure class).
If there was even a slight chance of overlap where it was impossible or sufficiently hard to not be implemented then the number of support issues resulting would likely be high since the subtle breaking change would be lost on many users.
It can be very easily worked around.
In your example like so:
public IList<T> Find<T>(Expression<Func<T, bool>> expression)
where T : class, new()
{
return FindInner(expression).ToList();
}
private IEnumerable<T> FindInner<T>(Expression<Func<T, bool>> expression)
where T : class, new()
{
IList<T> list = GetList<T>();
var fun = expression.Compile();
foreach (var item in list)
if (fun.Invoke(item))
yield return item;
}
Unfortunately I don't know why they didn't allow this, since of course it's entirely possible to do envision how this would work.
However, anonymous methods are already a piece of "compiler magic" in the sense that the method will be extracted either to a method in the existing class, or even to a whole new class, depending on whether it deals with local variables or not.
Additionally, iterator methods using yield is also implemented using compiler magic.
My guess is that one of these two makes the code un-identifiable to the other piece of magic, and that it was decided to not spend time on making this work for the current versions of the C# compiler. Of course, it might not be a concious choice at all, and that it just doesn't work because nobody thought to implement it.
For a 100% accurate question I would suggest you use the Microsoft Connect site and report a question, I'm sure you'll get something usable in return.
I would do this:
IList<T> list = GetList<T>();
var fun = expression.Compile();
return list.Where(item => fun.Invoke(item)).ToList();
Of course you need the System.Core.dll referenced from .NET 3.5 for the Linq method. And include:
using System.Linq;
Cheers,
Sly
Maybe its just a syntax limitation. In Visual Basic .NET, which is very similar to C#, it is perfectly possible while awkward to write
Sub Main()
Console.Write("x: ")
Dim x = CInt(Console.ReadLine())
For Each elem In Iterator Function()
Dim i = x
Do
Yield i
i += 1
x -= 1
Loop Until i = x + 20
End Function() ' here
Console.WriteLine($"{elem} to {x}")
Next
Console.ReadKey()
End Sub
Also note the parentheses ' here; the lambda function Iterator Function...End Function returns an IEnumerable(Of Integer) but is not such an object by itself. It must be called to get that object, and that’s what the () after End Function does.
The converted code by [1] raises errors in C# 7.3 (CS0149):
static void Main()
{
Console.Write("x: ");
var x = System.Convert.ToInt32(Console.ReadLine());
// ERROR: CS0149 - Method name expected
foreach (var elem in () =>
{
var i = x;
do
{
yield return i;
i += 1;
x -= 1;
}
while (i != x + 20);
}())
Console.WriteLine($"{elem} to {x}");
Console.ReadKey();
}
I strongly disagree to the reason given in the other answers that it's difficult for the compiler to handle. The Iterator Function() you see in the VB.NET example is specifically created for lambda iterators.
In VB, there is the Iterator keyword; it has no C# counterpart. IMHO, there is no real reason this is not a feature of C#.
So if you really, really want anonymous iterator functions, currently use Visual Basic or (I haven't checked it) F#, as stated in a comment of Part #7 in #Thomas Levesque's answer (do Ctrl+F for F#).

Would a conditional de-reference operator be a good thing in C#?

In functional languages there is often a Maybe monad which allows you to chain multiple calls on an object and have the entire expression return None/null if any part of the chain evaluates to nothing, rather than the typical NullReferenceException you'd get in C# by chaining calls where one object could be null.
This can be trivially implemented by writing a Maybe<T> with some extension methods to allow similar behaviour in C# using query comprehensions, which can be useful when processing XML with optional elements/attributes e.g.
var val = from foo in doc.Elements("foo").FirstOrDefault().ToMaybe()
from bar in foo.Attribute("bar").ToMaybe()
select bar.Value;
But this syntax is a bit clunky and unintuitive as people are used to dealing with sequences in Linq rather than single elements, and it leaves you with a Maybe<T> rather than a T at the end. Would a conditional de-reference operator (e.g. ..) be sufficiently useful to make it into the language? e.g.
var val = doc.Elements("foo").FirstOrDefault()..Attribute("bar")..Value;
The conditional de-reference would expand to something like:
object val;
var foo = doc.Elements("foo").FirstOrDefault();
if (foo != null)
{
var bar = foo.Attribute("bar");
if (bar != null)
{
val = bar.Value;
}
else
{
val = null;
}
}
I can see that this could potentially lead to terrible abuse like using .. everywhere to avoid a NullReferenceException, but on the other hand when used properly it could be very handy in quite a few situations. Thoughts?
Chaining multiple calls on an object makes me fear violations of the Law of Demeter. Thus, I am skeptical that this feature is a good idea, at least in terms of solving the specific problem you are using as an example.
It's an interesting idea that could be achieved with an extension method. Something like this, for example (note, just for example - I'm sure it can be refined):
public static IEnumerable<T> Maybe<T>(this IEnumerable<T> lhs, Func<IEnumerable<T>, T> rhs)
{
if (lhs != null)
{
return rhs(lhs);
}
return lhs;
}
I suspect a combination of NUllable and extension methods would allow a significant portion of this to b achieved.
This would limit T to value types of course.
(TBH I would rather see tuple support in the langauge and eliminate out parameters, as F# does.)
You code can be simplified, set val to null and you eliminate else branches.
I can see the potential usefulness, but other than a slight performance impact if the elements are often null, why not just surround the code block with a try..catch block for a NullReferenceException instead?

Why is there no ForEach extension method on IEnumerable?

Inspired by another question asking about the missing Zip function:
Why is there no ForEach extension method on the IEnumerable interface? Or anywhere? The only class that gets a ForEach method is List<>. Is there a reason why it's missing, maybe performance?
There is already a foreach statement included in the language that does the job most of the time.
I'd hate to see the following:
list.ForEach( item =>
{
item.DoSomething();
} );
Instead of:
foreach(Item item in list)
{
item.DoSomething();
}
The latter is clearer and easier to read in most situations, although maybe a bit longer to type.
However, I must admit I changed my stance on that issue; a ForEach() extension method would indeed be useful in some situations.
Here are the major differences between the statement and the method:
Type checking: foreach is done at runtime, ForEach() is at compile time (Big Plus!)
The syntax to call a delegate is indeed much simpler: objects.ForEach(DoSomething);
ForEach() could be chained: although evilness/usefulness of such a feature is open to discussion.
Those are all great points made by many people here and I can see why people are missing the function. I wouldn't mind Microsoft adding a standard ForEach method in the next framework iteration.
ForEach method was added before LINQ. If you add ForEach extension, it will never be called for List instances because of extension methods constraints. I think the reason it was not added is to not interference with existing one.
However, if you really miss this little nice function, you can roll out your own version
public static void ForEach<T>(
this IEnumerable<T> source,
Action<T> action)
{
foreach (T element in source)
action(element);
}
You could write this extension method:
// Possibly call this "Do"
IEnumerable<T> Apply<T> (this IEnumerable<T> source, Action<T> action)
{
foreach (var e in source)
{
action(e);
yield return e;
}
}
Pros
Allows chaining:
MySequence
.Apply(...)
.Apply(...)
.Apply(...);
Cons
It won't actually do anything until you do something to force iteration. For that reason, it shouldn't be called .ForEach(). You could write .ToList() at the end, or you could write this extension method, too:
// possibly call this "Realize"
IEnumerable<T> Done<T> (this IEnumerable<T> source)
{
foreach (var e in source)
{
// do nothing
;
}
return source;
}
This may be too significant a departure from the shipping C# libraries; readers who are not familiar with your extension methods won't know what to make of your code.
The discussion here gives the answer:
Actually, the specific discussion I witnessed did in fact hinge over functional purity. In an expression, there are frequently assumptions made about not having side-effects. Having ForEach is specifically inviting side-effects rather than just putting up with them. -- Keith Farmer (Partner)
Basically the decision was made to keep the extension methods functionally "pure". A ForEach would encourage side-effects when using the Enumerable extension methods, which was not the intent.
While I agree that it's better to use the built-in foreach construct in most cases, I find the use of this variation on the ForEach<> extension to be a little nicer than having to manage the index in a regular foreach myself:
public static int ForEach<T>(this IEnumerable<T> list, Action<int, T> action)
{
if (action == null) throw new ArgumentNullException("action");
var index = 0;
foreach (var elem in list)
action(index++, elem);
return index;
}
Example
var people = new[] { "Moe", "Curly", "Larry" };
people.ForEach((i, p) => Console.WriteLine("Person #{0} is {1}", i, p));
Would give you:
Person #0 is Moe
Person #1 is Curly
Person #2 is Larry
One workaround is to write .ToList().ForEach(x => ...).
pros
Easy to understand - reader only needs to know what ships with C#, not any additional extension methods.
Syntactic noise is very mild (only adds a little extranious code).
Doesn't usually cost extra memory, since a native .ForEach() would have to realize the whole collection, anyway.
cons
Order of operations isn't ideal. I'd rather realize one element, then act on it, then repeat. This code realizes all elements first, then acts on them each in sequence.
If realizing the list throws an exception, you never get to act on a single element.
If the enumeration is infinite (like the natural numbers), you're out of luck.
I've always wondered that myself, that is why that I always carry this with me:
public static void ForEach<T>(this IEnumerable<T> col, Action<T> action)
{
if (action == null)
{
throw new ArgumentNullException("action");
}
foreach (var item in col)
{
action(item);
}
}
Nice little extension method.
So there has been a lot of comments about the fact that a ForEach extension method isn't appropriate because it doesn't return a value like the LINQ extension methods. While this is a factual statement, it isn't entirely true.
The LINQ extension methods do all return a value so they can be chained together:
collection.Where(i => i.Name = "hello").Select(i => i.FullName);
However, just because LINQ is implemented using extension methods does not mean that extension methods must be used in the same way and return a value. Writing an extension method to expose common functionality that does not return a value is a perfectly valid use.
The specific arguement about ForEach is that, based on the constraints on extension methods (namely that an extension method will never override an inherited method with the same signature), there may be a situation where the custom extension method is available on all classes that impelement IEnumerable<T> except List<T>. This can cause confusion when the methods start to behave differently depending on whether or not the extension method or the inherit method is being called.
You could use the (chainable, but lazily evaluated) Select, first doing your operation, and then returning identity (or something else if you prefer)
IEnumerable<string> people = new List<string>(){"alica", "bob", "john", "pete"};
people.Select(p => { Console.WriteLine(p); return p; });
You will need to make sure it is still evaluated, either with Count() (the cheapest operation to enumerate afaik) or another operation you needed anyway.
I would love to see it brought in to the standard library though:
static IEnumerable<T> WithLazySideEffect(this IEnumerable<T> src, Action<T> action) {
return src.Select(i => { action(i); return i; } );
}
The above code then becomes people.WithLazySideEffect(p => Console.WriteLine(p)) which is effectively equivalent to foreach, but lazy and chainable.
Note that the MoreLINQ NuGet provides the ForEach extension method you're looking for (as well as a Pipe method which executes the delegate and yields its result). See:
https://www.nuget.org/packages/morelinq
https://code.google.com/p/morelinq/wiki/OperatorsOverview
#Coincoin
The real power of the foreach extension method involves reusability of the Action<> without adding unnecessary methods to your code. Say that you have 10 lists and you want to perform the same logic on them, and a corresponding function doesn't fit into your class and is not reused. Instead of having ten for loops, or a generic function that is obviously a helper that doesn't belong, you can keep all of your logic in one place (the Action<>. So, dozens of lines get replaced with
Action<blah,blah> f = { foo };
List1.ForEach(p => f(p))
List2.ForEach(p => f(p))
etc...
The logic is in one place and you haven't polluted your class.
Most of the LINQ extension methods return results. ForEach does not fit into this pattern as it returns nothing.
If you have F# (which will be in the next version of .NET), you can use
Seq.iter doSomething myIEnumerable
Partially it's because the language designers disagree with it from a philosophical perspective.
Not having (and testing...) a feature is less work than having a feature.
It's not really shorter (there's some passing function cases where it is, but that wouldn't be the primary use).
It's purpose is to have side effects, which isn't what linq is about.
Why have another way to do the same thing as a feature we've already got? (foreach keyword)
https://blogs.msdn.microsoft.com/ericlippert/2009/05/18/foreach-vs-foreach/
You can use select when you want to return something.
If you don't, you can use ToList first, because you probably don't want to modify anything in the collection.
I wrote a blog post about it:
http://blogs.msdn.com/kirillosenkov/archive/2009/01/31/foreach.aspx
You can vote here if you'd like to see this method in .NET 4.0:
http://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=279093
In 3.5, all the extension methods added to IEnumerable are there for LINQ support (notice that they are defined in the System.Linq.Enumerable class). In this post, I explain why foreach doesn't belong in LINQ:
Existing LINQ extension method similar to Parallel.For?
Is it me or is the List<T>.Foreach pretty much been made obsolete by Linq.
Originally there was
foreach(X x in Y)
where Y simply had to be IEnumerable (Pre 2.0), and implement a GetEnumerator().
If you look at the MSIL generated you can see that it is exactly the same as
IEnumerator<int> enumerator = list.GetEnumerator();
while (enumerator.MoveNext())
{
int i = enumerator.Current;
Console.WriteLine(i);
}
(See http://alski.net/post/0a-for-foreach-forFirst-forLast0a-0a-.aspx for the MSIL)
Then in DotNet2.0 Generics came along and the List. Foreach has always felt to me to be an implementation of the Vistor pattern, (see Design Patterns by Gamma, Helm, Johnson, Vlissides).
Now of course in 3.5 we can instead use a Lambda to the same effect, for an example try
http://dotnet-developments.blogs.techtarget.com/2008/09/02/iterators-lambda-and-linq-oh-my/
I would like to expand on Aku's answer.
If you want to call a method for the sole purpose of it's side-effect without iterating the whole enumerable first you can use this:
private static IEnumerable<T> ForEach<T>(IEnumerable<T> xs, Action<T> f) {
foreach (var x in xs) {
f(x); yield return x;
}
}
My version an extension method which would allow you to use ForEach on IEnumerable of T
public static class EnumerableExtension
{
public static void ForEach<T>(this IEnumerable<T> source, Action<T> action)
{
source.All(x =>
{
action.Invoke(x);
return true;
});
}
}
No one has yet pointed out that ForEach<T> results in compile time type checking where the foreach keyword is runtime checked.
Having done some refactoring where both methods were used in the code, I favor .ForEach, as I had to hunt down test failures / runtime failures to find the foreach problems.

Categories