Why Standard Extension Method on IEnumerables - c#

When I use a standard extension method on a List, such as
Where(...)
the result is always an IEnumerable, and when
you decide to do a list operation such as ForEach()
you need to cast (not pretty) or use the ToList() extension method, which
(maybe) creates a new List that consumes more memory (is that right?):
List<string> myList = new List<string>() { /* some data */ };
myList.Where(p => p.Length > 5).ToList().ForEach(...);
or
(myList.Where(p => p.Length > 5) as List<string>).ForEach(...);
(Edit: this cast won't work.)
Which is better code or is there a third way?
Edit:
ForEach is just an example; replace it with BinarySearch:
myList.Where(p => p.Length > 5).ToList().BinarySearch(...);

The as is definitely not a good approach, and I'd be surprised if it works.
In terms of what is "best", I would propose foreach instead of ForEach:
foreach(var item in myList.Where(p=>p.Length>5)) {
... // do something with item
}
If you desperately want to use list methods, perhaps:
myList.FindAll(p=>p.Length>5).ForEach(...);
or indeed
var result = myList.FindAll(p=>p.Length>5).BinarySearch(...);
but note that this does (unlike the first) require an additional copy of the data, which could be a pain if there are 100,000 items in myList with length above 5.
The reason that LINQ returns IEnumerable<T> is that this (LINQ-to-Objects) is designed to be composable and streaming, which is not possible if you go to a list. For example, a combination of a few where / select etc should not strictly need to create lots of intermediate lists (and indeed, LINQ doesn't).
This is even more important when you consider that not all sequences are bounded; there are infinite sequences, for example:
static IEnumerable<int> GetForever() {
while(true) yield return 42;
}
var thisWorks = GetForever().Take(10).ToList();
as until the ToList it is composing iterators, not generating an intermediate list. There are a few buffered operations, though, like OrderBy, which need to read all the data first. Most LINQ operations are streaming.

One of the design goals for LINQ is to allow composable queries on any supported data type, which is achieved by having return-types specified using generic interfaces rather than concrete classes (such as IEnumerable<T> as you noted). This allows the nuts and bolts to be implemented as needed, either as a concrete class (e.g. WhereEnumerableIterator<T> or hoisted into a SQL query) or using the convenient yield keyword.
Additionally, another design philosophy of LINQ is one of deferred execution. Basically, until you actually use the query, no real work has been done. This allows potentially expensive (or infinite as Mark notes) operations to be completed only exactly as needed.
If List<T>.Where returned another List<T> it would potentially limit composition and would certainly hinder deferred execution (not to mention generate excess memory).
So, looking back at your example, the best way to use the result of the Where operator depends on what you want to do with it!
// This assumes myList has 20,000 entries
// if .Where returned a new list we'd potentially double our memory!
var largeStrings = myList.Where(ss => ss.Length > 100);
foreach (var item in largeStrings)
{
someContainer.Add(item);
}
// or, if someContainer supports adding an IEnumerable<T> directly
someContainer.AddRange(myList.Where(ss => ss.Length > 100));
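The deferred execution described in this answer can be observed directly. The following is only a small sketch, not part of the original answer: the class name DeferredDemo, the sample strings, and the call counter are made up for illustration. It counts how often the Where predicate runs before and after the query is enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; names and data are illustrative only.
public static class DeferredDemo
{
    // Returns { predicate calls before enumeration, calls after, number of matches }.
    public static int[] CountPredicateCalls()
    {
        var myList = new List<string> { "short", "a much longer string", "tiny", "another long one" };
        int calls = 0;

        // Building the query does not run the predicate.
        IEnumerable<string> query = myList.Where(s => { calls++; return s.Length > 5; });
        int beforeEnumeration = calls;

        // Enumerating (here via ToList) runs the predicate once per source item.
        List<string> materialized = query.ToList();
        int afterEnumeration = calls;

        return new[] { beforeEnumeration, afterEnumeration, materialized.Count };
    }

    public static void Main()
    {
        int[] r = CountPredicateCalls();
        // prints "calls before: 0, calls after: 4, matches: 2"
        Console.WriteLine("calls before: {0}, calls after: {1}, matches: {2}", r[0], r[1], r[2]);
    }
}
```

Until ToList is called, no filtering work has happened and no second list exists.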

If you want to make a simple foreach over a list, you can do it like this:
foreach (var item in myList.Where([Where clause]))
{
// Do something with each item.
}

You can't convert the result of Where to List<string> with a cast: the object that Where returns is not a List<string>, so as simply yields null. An IEnumerable evaluates its items only when you access them. Invoking ToList() enumerates all items and returns a new List, which costs extra memory and is unnecessary here. If you want to use a ForEach extension method on any collection, it's better to write a new ForEach extension method that works on any IEnumerable<T>:
public static void ForEach<T>(this IEnumerable<T> enumerableList, Action<T> action)
{
foreach(T item in enumerableList)
{
action(item);
}
}
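To make the point about the as cast concrete, here is a minimal sketch (the class name CastDemo and the sample data are assumptions for illustration). It shows that the object returned by Where is not a List<string>, so the as conversion produces null and a following .ForEach(...) call would throw a NullReferenceException.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; not from the original answers.
public static class CastDemo
{
    public static bool AsCastIsNull()
    {
        var myList = new List<string> { "alpha", "beta", "gamma-delta" };
        IEnumerable<string> filtered = myList.Where(p => p.Length > 5);

        // 'as' returns null when the runtime type is not List<string>.
        List<string> asList = filtered as List<string>;
        return asList == null;
    }

    public static void Main()
    {
        Console.WriteLine(AsCastIsNull()); // prints "True"
    }
}
```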

Related

Does Linq's IEnumerable.Select return a reference to the original IEnumerable?

I was trying to clone a List in my code, because I needed to output that List to some other code, but the original reference was going to be cleared later on. So I had the idea of using the Select extension method to create a new reference to an IEnumerable of the same elements, for example:
List<int> ogList = new List<int> {1, 2, 3};
IEnumerable<int> enumerable = ogList.Select(s => s);
Now after doing ogList.Clear(), I was surprised to see that my new enumerable was also empty.
So I started fiddling around in LINQPad, and saw that even if my Select returned different objects entirely, the behaviour was the same.
List<int> ogList = new List<int> {1, 2, 3};
IEnumerable<int> enumerable = ogList.Select(s => 5); // Doesn't return the original int
enumerable.Count().Dump(); // Count is 3
ogList.Clear();
enumerable.Count().Dump(); // Count is 0!
Note that in LINQPad, the Dump()s are equivalent to Console.WriteLine().
Now probably my need to clone the list in the first place was due to bad design, and even if I didn't want to rethink the design I could easily clone it properly. But this got me thinking about what the Select extension method actually does.
According to the documentation for Select:
This method is implemented by using deferred execution. The immediate return value is an object that stores all the information that is required to perform the action. The query represented by this method is not executed until the object is enumerated either by calling its GetEnumerator method directly or by using foreach in Visual C# or For Each in Visual Basic.
So then I tried adding this code before clearing:
foreach (int i in enumerable)
{
i.Dump();
}
The result was still the same.
Finally, I tried one last thing to figure out if the reference in my new enumerable was the same as the old one. Instead of clearing the original List, I did:
ogList.Add(4);
Then I printed out the contents of my enumerable (the "cloned" one), expecting to see '4' appended to the end of it. Instead, I got:
5
5
5
5 // Huh?
Now I have no choice but to admit that I have no idea how the Select extension method works behind the scenes. What's going on?
List/List<T> are for all intents and purposes fancy resizable arrays. They own and hold the data for value types such as your ints or references to the data for reference types in memory and they always know how many items they have.
IEnumerable/IEnumerable<T> are different beasts. They provide a different service/contract. An IEnumerable is fictional, it does not exist. It can create data out of thin air, with no physical backing. Their only promise is that they have a public method called GetEnumerator() that returns an IEnumerator/IEnumerator<T>. The promise that an IEnumerator makes is simple:
some item could be available or not at a time when you decide you need it. This is achieved through a simple method that the IEnumerator interface has: bool MoveNext() - which returns false when the enumeration is completed or true if there was in fact a new item that needed to be returned. You can read the data through a property that the IEnumerator interface has, conveniently called Current.
To get back to your observations/question: as far as the IEnumerable in your example is concerned, it does not even think about the data unless your code tells it to fetch some data.
When you are writing:
List<int> ogList = new List<int> {1, 2, 3};
IEnumerable<int> enumerable = ogList.Select(s => s);
You are saying: Listen here IEnumerable, I might come to you asking for some items at some point in the future. I'll tell you when I will need them, for now sit still and do nothing. With Select(s => s) you are conceptually defining an identity projection of int to int.
A very rough, simplified, non-real-life implementation of the Select you've written is something like:
static IEnumerable<T> Select<T>(this IEnumerable<int> source, Func<int, T> transformer)
{
foreach (var i in source) // creates an enumerator for source and starts enumerating
{
yield return transformer(i); // yield here == return an item and wait for orders
}
}
(This explains why you got a 5 when expecting a 4: your transform was s => 5.)
For value types, such as the ints in your case: if you want to clone the list (or part of it) for future enumeration, materialize the result of the enumeration into a List. This creates a list that is a clone of the original, entirely detached from it:
IEnumerable<int> cloneOfEnumerable = ogList.Select(s => s).ToList();
Later edit: Of course ogList.Select(s => s) is equivalent to ogList. I'm leaving the projection here, as it was in the question.
What you are creating here is: a list from the result of an enumerable, further consumed through the IEnumerable<int> interface. Considering what I've said above about the nature of IList vs IEnumerable, I would prefer to write/read:
IList<int> cloneOfEnumerable = ogList.ToList();
CAUTION: Be careful with reference types. IList/List make no promise of keeping the objects "safe", they can mutate to null for all IList cares. Keyword if you ever need it: deep cloning.
CAUTION: Beware of infinite or non-rewindable IEnumerables
Provided answers explain why you are not obtaining a cloned list (due to deferred execution of some LINQ extension methods).
However, keep in mind that list.Select(e => e).ToList() will get a real clone only when dealing with value types such as int.
If you have a list of reference types you will receive a cloned list of references to existent objects. In this case you should consider one of the solutions provided here for deep-cloning or my favorite from here (which might be limited by object inner structure).
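The difference between a cloned list of references and a deep clone can be sketched as follows. This is only an illustration under stated assumptions: the Person class, its Name field, and the copy-by-hand projection are hypothetical and stand in for whichever deep-cloning approach you pick.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical reference type used only for this illustration.
public sealed class Person
{
    public string Name;
}

public static class CloneDemo
{
    // Returns { name seen through the shallow copy, name seen through the deep copy }.
    public static string[] Run()
    {
        var original = new List<Person> { new Person { Name = "Ann" } };

        // New list, but it holds references to the same Person objects.
        List<Person> shallow = original.ToList();

        // New list holding new Person objects (a hand-written deep copy).
        List<Person> deep = original.Select(p => new Person { Name = p.Name }).ToList();

        original[0].Name = "Changed";
        return new[] { shallow[0].Name, deep[0].Name };
    }

    public static void Main()
    {
        string[] r = Run();
        // prints "shallow: Changed, deep: Ann"
        Console.WriteLine("shallow: {0}, deep: {1}", r[0], r[1]);
    }
}
```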
You have to be aware that an object that implements IEnumerable does not have to be a collection itself. It is an object that makes it possible to get an object that implements IEnumerator. Once you have the enumerator you can ask for the first element and for the next element until there are no more next elements.
The result of a LINQ function that returns an IEnumerable is not the sequence itself; it only enables you to ask for the enumerator. If you want a materialized sequence, you'll have to use ToList.
There are several other LINQ functions that do not return an IEnumerable but, for instance, a Dictionary or only one element (FirstOrDefault(), Max(), Single(), Any()). These functions get the enumerator from the IEnumerable and start enumerating until they have the result. Any only has to check whether you can start enumerating. Max enumerates over all elements and remembers the largest one, and so on.
You'll have to be aware: as long as your LINQ statement is an IEnumerable of something, your source sequence is not accessed yet. If you change your source sequence before you start enumerating, the enumeration is over your changed source sequence.
If you don't want this, you'll have to do the enumeration before you change your source. Usually this will be ToList, but it can be any of the non-deferred functions: Max(), Any(), FirstOrDefault(), etc.
List<TSource> sourceItems = ...
var myEnumerable = sourceItems
.Where(sourceItem => ...)
.GroupBy(sourceItem => ...)
.Select(group => ...);
// note: myEnumerable is an IEnumerable, it is not a sequence yet.
var list1 = myEnumerable.ToList(); // enumerates over the sequence
var first1 = myEnumerable.FirstOrDefault(); // enumerates and stops after the first element
// now change the source, and do the same things again
sourceItems.Clear();
var list2 = myEnumerable.ToList(); // returns an empty list
var first2 = myEnumerable.FirstOrDefault(); // returns the default value (null for reference types): there is no first element
So every LINQ function that does not return IEnumerable, will start enumerating over sourceItems as the sequence is at the moment that you start enumerating. The IEnumerable is not the sequence itself.
This is an enumerable.
var enumerable = ogList.Select(s => s);
If you iterate through this enumerable, LINQ will in turn iterate over the original resultset. Each and every time. If you do anything to the original enumerable, the results will also be reflected in your LINQ calls.
If you need to freeze the data, store it in a list instead:
var enumerable = ogList.Select(s => s).ToList();
Now you've made a copy. Iterating over this list will not touch the original enumerable.
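The behaviour from the question can be reproduced side by side with the ToList fix. The sketch below reuses ogList from the question; the class name SnapshotDemo and the variable names live and snapshot are assumptions added for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; ogList mirrors the question's example.
public static class SnapshotDemo
{
    // Returns { live count after Add, snapshot count after Add,
    //           live count after Clear, snapshot count after Clear }.
    public static int[] Run()
    {
        var ogList = new List<int> { 1, 2, 3 };

        IEnumerable<int> live = ogList.Select(s => s);        // a query over ogList, not a copy
        List<int> snapshot = ogList.Select(s => s).ToList();  // a copy made right now

        ogList.Add(4);
        int liveAfterAdd = live.Count();
        int snapshotAfterAdd = snapshot.Count;

        ogList.Clear();
        int liveAfterClear = live.Count();
        int snapshotAfterClear = snapshot.Count;

        return new[] { liveAfterAdd, snapshotAfterAdd, liveAfterClear, snapshotAfterClear };
    }

    public static void Main()
    {
        // prints "4, 3, 0, 3"
        Console.WriteLine(string.Join(", ", Run()));
    }
}
```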

Is there any method like ForEach for IList? [duplicate]

Possible Duplicate:
LINQ equivalent of foreach for IEnumerable<T>
List<T> has a method called ForEach which executes the passed action on each element of it.
var names = new List<String>{ "Bruce", "Alfred", "Tim", "Richard" };
names.ForEach(p => { Console.WriteLine(p); });
But what if names is not a List<T> but an IList<T>? IList<T> doesn't have a method like ForEach.
Is there some alternative?
Use a foreach loop:
foreach (var p in names) {
Console.WriteLine(p);
}
There is no reason to use delegates and extension methods all over the place if that doesn't actually improve readability; a foreach loop is not any less explicitly telling readers what's being done than a ForEach method.
If your IList<T> is an array (T[]), then you can use the Array.ForEach method on it, which is similar to ForEach on List<T>. Otherwise you can create an extension method for your custom IList<T> or IEnumerable<T> or whatever you prefer.
public static void ForEach<T>(this IList<T> list, Action<T> action)
{
foreach (T t in list)
action(t);
}
Just be aware that the action may modify the objects in the original collection, but I guess the naming does imply that.
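For the array case mentioned in this answer, a short sketch of how Array.ForEach might be used when all you hold is an IList<T> reference. The class name ArrayForEachDemo and the Visit helper are hypothetical; Array.ForEach itself is the standard static method on System.Array.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only.
public static class ArrayForEachDemo
{
    // Collects every name, using Array.ForEach when the IList<T> is really an array.
    public static List<string> Visit(IList<string> names)
    {
        var seen = new List<string>();
        string[] array = names as string[];
        if (array != null)
        {
            Array.ForEach(array, n => seen.Add(n));
        }
        else
        {
            // Fallback for any other IList<T> implementation.
            foreach (string n in names) seen.Add(n);
        }
        return seen;
    }

    public static void Main()
    {
        IList<string> names = new[] { "Bruce", "Alfred", "Tim", "Richard" };
        // prints "Bruce, Alfred, Tim, Richard"
        Console.WriteLine(string.Join(", ", Visit(names)));
    }
}
```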
------------------------------------------------------------------------------------------------------------------------------------
If you prefer to call:
people.Where(p => p.Tenure > 5)
.Select(p => p.Nationality)
.ForEach(n => AssignCitizenShip(n));
rather than
foreach (var n in people.Where(p => p.Tenure > 5).Select(p => p.Nationality))
{
AssignCitizenShip(n);
}
If so, you can create the extension method on IEnumerable. Mind you, the terminating ForEach call executes the LINQ query. If you do not want that, you can defer it too by using a yield statement and returning an IEnumerable<T>:
public static IEnumerable<T> ForEach<T>(this IEnumerable<T> list, Action<T> action)
{
foreach (T t in list)
{
action(t);
yield return t;
}
}
That solves the side-effect issue, but I personally like a method named ForEach to finally execute the call.
-----------------------------------------------------------------------------------------------------------------------------------
To address the opposing views on preferences, here is a better link from Eric Lippert than this. To quote him:
"The first reason is that doing so violates the functional programming
principles that all the other sequence operators are based upon.
Clearly the sole purpose of a call to this method is to cause side
effects. The purpose of an expression is to compute a value, not to
cause a side effect. The purpose of a statement is to cause a side
effect. The call site of this thing would look an awful lot like an
expression (though, admittedly, since the method is void-returning,
the expression could only be used in a “statement expression”
context.) It does not sit well with me to make the one and only
sequence operator that is only useful for its side effects.
The second reason is that doing so adds zero new representational
power to the language".
Eric's not saying it's a bad thing to do, just giving the philosophical reasons behind the decision not to include the construct in LINQ by default. If you believe a function on an IEnumerable shouldn't act on the contents, then don't do it. Personally I don't mind it since I'm well aware of what it does. I treat it like any other method that causes side effects on a collection class. I can step into the function and debug it too if I want. Here is a similar method from LINQ itself (ForAll, from Parallel LINQ):
people.Where(p => p.Tenure > 5)
.Select(p => p.Nationality)
.AsParallel()
.ForAll(n => AssignCitizenShip(n));
As I said, there is nothing bad about these. It's just personal preference. I wouldn't use this for nested foreach loops or if it takes more than one line of code inside the loop body, since that's plain unreadable. But for the simple example I posted, I like it. It looks clean and concise.
Edit: See a performance link btw: Why is List<T>.ForEach faster than standard foreach?
You could make an extension method and use most of the implementation of void List<T>.ForEach(Action<T> action). You can download the source code at the Shared Source Initiative site.
Basically you will end up with something like this:
public static void ForEach<T>(this IList<T> list, Action<T> action)
{
if (list == null) throw new ArgumentNullException("list");
if (action == null) throw new ArgumentNullException("action");
for (int i = 0; i < list.Count; i++)
{
action(list[i]);
}
}
It is slightly better than the other implementations that use the foreach statement since it takes advantage of the fact that IList includes an indexer.
Although I agree with the answer of O. R. Mapper, sometimes in big projects with many developers it is hard to convince everybody that a foreach statement is clearer. Even worse, if your API is based on interfaces (IList) instead of concrete types (List), then developers who are used to the List<T>.ForEach method might start calling ToList on your IList references! I know because it happened in my previous project. I was using the collection interfaces everywhere in our public APIs, following the Framework Design Guidelines. It took me a while to notice that many developers were not used to this, and calls to ToList started appearing at an alarming rate. Finally I added this extension method to a common assembly that everybody was using and made sure that all unnecessary calls to ToList were removed from the codebase.
Add this code to a static class (call it Extensions, for example):
public static void ForEach<T>(this IList<T> list, Action<T> action) {
foreach(var item in list) {
action.Invoke(item);
}
}

c# Find an item in 2 / multiple lists

I have the presumably common problem of having elements that I wish to place in 2 (or more) lists. However sometimes I want to find an element that could be in one of the lists. Now there is more than one way of doing this eg using linq or appending, but all seem to involve the unnecessary creation of an extra list containing all the elements of the separate lists and hence waste processing time.
So I was considering creating my own generic FindInLists class, which would take 2 lists as its constructor parameters and would provide Find() and Exists() methods. The Find and Exists methods would only need to search the second or subsequent lists if the item was not found in the first list. The FindInLists class could be instantiated in the getter of a property (with no setter). A second constructor for the FindInLists class could take an array of lists as its parameter.
Is this useful or is there already a way to search multiple lists without incurring the wasteful overhead of the creation of a super list?
You could use the LINQ Concat function.
var query = list1.Concat(list2).Where(x => x.Category=="my category");
Linq already has this functionality by virtue of the FirstOrDefault method. It uses deferred execution so will stream from any input and will short circuit the return when a matching element is found.
var matched = list1.Concat(list2).FirstOrDefault(e => element.Equals(e));
Update
BaseType matched = list1.Concat(list2).Concat(list3).FirstOrDefault(e => element.Equals(e));
I believe IEnumerable<T>.Concat() is what you need. It doesn't create an extra list; it only iterates through the given pair of collections when queried.
Concat() uses deferred execution, so at the time it's called it only creates an iterator which stores the reference to both concatenated IEnumerables. At the time the resulting collection is enumerated, it iterates through first and then through the second.
Here's the decompiled code for the iterator - no rocket science going on there:
private static IEnumerable<TSource> ConcatIterator<TSource>(IEnumerable<TSource> first, IEnumerable<TSource> second)
{
foreach (TSource iteratorVariable0 in first)
{
yield return iteratorVariable0;
}
foreach (TSource iteratorVariable1 in second)
{
yield return iteratorVariable1;
}
}
When looking to the docs for Concat(), I've stumbled across another alternative I didn't know - SelectMany. Given a collection of collections it allows you to work with the children of all parent collections at once like this:
IEnumerable<string> concatenated = new[] { firstColl, secondColl }
.SelectMany(item => item);
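The claim that Concat streams and that FirstOrDefault stops at the first match can be checked with a small sketch. The Tracked helper, the class name ConcatDemo, and the sample lists are assumptions added for illustration; they simply record which elements were actually pulled from each list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; not part of the original answers.
public static class ConcatDemo
{
    // Wraps a sequence and logs every element that is actually read from it.
    private static IEnumerable<int> Tracked(IEnumerable<int> source, List<int> log)
    {
        foreach (int x in source)
        {
            log.Add(x);
            yield return x;
        }
    }

    // Returns the elements that were read while searching for target.
    public static List<int> ElementsVisited(int target)
    {
        var list1 = new List<int> { 1, 2, 3 };
        var list2 = new List<int> { 4, 5, 6 };
        var visited = new List<int>();

        int match = Tracked(list1, visited)
            .Concat(Tracked(list2, visited))
            .FirstOrDefault(x => x == target);

        return visited;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", ElementsVisited(2))); // prints "1, 2"
        Console.WriteLine(string.Join(", ", ElementsVisited(5))); // prints "1, 2, 3, 4, 5"
    }
}
```

When the item is found in the first list, the second list is never read, and no combined list is built in either case.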
you can do something like this:
var list1 = new List<int>{1,2,3,4,5,6,7};
var list2 = new List<int>{0,-3,-4,2};
int elementToPush = 4;//value to find among available lists
var exist = list1.Exists(i=>i==elementToPush) || list2.Exists(j=>j==elementToPush);
If the required element exists in at least one collection, the result is true; otherwise it's false.
One line and no external storage creation.
Hope this helps.
You could probably just create a List of lists and then use linq on that list. It is still creating a new List but it is a list of references rather than duplicating the contents of all the lists.
List<string> a = new List<string>{"apple", "aardvark"};
List<string> b = new List<string>{"banana", "bananananana", "bat"};
List<string> c = new List<string>{"cat", "canary"};
List<string> d = new List<string>{"dog", "decision"};
List<List<string>> super = new List<List<string>> {a,b,c,d};
super.Any(x=>x.Contains("apple"));
The Any call should return as soon as one list returns true, so, as requested, it will not process later lists if it finds the item in an earlier one.
Edit: Having written this I prefer the answers using Concat but I leave this here as an alternative if you want something that might be more aesthetically pleasing. ;-)

How to make [example] extension method more generic/functional/efficient?

I needed a double[] split into groups of x elements by stride y, returning a List. Pretty basic...a loop and/or some LINQ and you're all set. However, I have not been spending much time on extension methods and this looked like a good candidate for some practice. The naive version returns what I am looking for in my current application....
(A)
public static IList<T[]> Split<T>(this IEnumerable<T> source, int every, int take)
{
/*... throw E if X is insane ...*/
var result = source
.Where ((t, i) => i % every == 0)
.Select((t, i) => source.Skip(i * every).Take(take).ToArray())
.ToList();
return result;
}
...the return type is sort of generic...depending on your definition of generic.
I would think...
(B)
public static IEnumerable<IEnumerable<T>> Split<T>
(this IEnumerable<T> source,int every, int take){/*...*/}
...is a better solution...maybe.
Question(s):
Is (B) preferred ?...Why ?
How would you cast (B) as IList <T[]> ?
Any benefit in refactoring ? possibly
two methods that might be chained or the like.
Is the approach sound ?...or have I
missed something basic.
Comments, opinions and harsh language are always appreciated.
Usage Context: C# .Net 4.0
B is probably the better option. Really the major change is that the consumer of the code has the option to make it a list by calling ToList() on the result of your method, instead of being forced to deal with a List (an IList, actually, which is always fully built before it is returned).
This has a LOT of advantages in method chaining and general use. It's easy to ToList() an enumerable, but hard to go the other way. So, you can call Select().Split().OrderBy() on a list and use the results in a foreach statement without having to have Linq iterate through the whole thing at once.
Refactoring to yield return single values MIGHT get you a performance bonus, but since you're basically just returning the iterator that the Select gave you (which will yield one item at a time itself) I don't think you'll get much benefit in yielding through it yourself.
I would prefer (B) as it looks more flexible. One way of casting the output of the (B) method to an IList<T[]> is as simple as chaining .Select(x => x.ToArray()).ToList() to it, e.g.,
var foo =
bar.Split(someEvery, someTake).Select(x => x.ToArray()).ToList();
In .Net 4, you can just change the return type to IEnumerable<IEnumerable<T>> and it will work.
Before .Net 4, you would have to cast the internal lists to IEnumerable first, by just calling .Cast<IEnumerable<T>>() on your result before returning.
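One possible shape for a lazy variant is sketched below. It is not the asker's code and not a definitive implementation: the class name SplitExtensions, the SplitIterator helper, and the decision to buffer a non-list source once are all assumptions. It keeps the semantics of (A) (a group starts at every multiple of every and holds up to take elements) but reads the source only once and yields each group on demand.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch of a lazily evaluated Split; names are illustrative.
public static class SplitExtensions
{
    public static IEnumerable<T[]> Split<T>(this IEnumerable<T> source, int every, int take)
    {
        // Validate eagerly, before the deferred iterator is created.
        if (source == null) throw new ArgumentNullException("source");
        if (every <= 0) throw new ArgumentOutOfRangeException("every");
        if (take <= 0) throw new ArgumentOutOfRangeException("take");
        return SplitIterator(source, every, take);
    }

    private static IEnumerable<T[]> SplitIterator<T>(IEnumerable<T> source, int every, int take)
    {
        // Arrays and lists are indexed directly; other sources are buffered once.
        IList<T> buffer = source as IList<T> ?? source.ToList();
        for (int start = 0; start < buffer.Count; start += every)
        {
            int length = Math.Min(take, buffer.Count - start);
            var group = new T[length];
            for (int k = 0; k < length; k++) group[k] = buffer[start + k];
            yield return group; // one group at a time, only when asked for
        }
    }
}

public static class SplitDemo
{
    public static void Main()
    {
        double[] data = { 0, 1, 2, 3, 4, 5, 6 };
        // The caller decides whether to materialize the result.
        IList<double[]> groups = data.Split(3, 2).ToList();
        // prints "0,1 | 3,4 | 6"
        Console.WriteLine(string.Join(" | ", groups.Select(g => string.Join(",", g))));
    }
}
```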

Why is there no ForEach extension method on IEnumerable?

Inspired by another question asking about the missing Zip function:
Why is there no ForEach extension method on the IEnumerable interface? Or anywhere? The only class that gets a ForEach method is List<>. Is there a reason why it's missing, maybe performance?
There is already a foreach statement included in the language that does the job most of the time.
I'd hate to see the following:
list.ForEach( item =>
{
item.DoSomething();
} );
Instead of:
foreach(Item item in list)
{
item.DoSomething();
}
The latter is clearer and easier to read in most situations, although maybe a bit longer to type.
However, I must admit I changed my stance on that issue; a ForEach() extension method would indeed be useful in some situations.
Here are the major differences between the statement and the method:
Type checking: with an explicitly typed loop variable, foreach inserts a cast that is only checked at runtime, whereas ForEach() is checked at compile time (Big Plus!)
The syntax to call a delegate is indeed much simpler: objects.ForEach(DoSomething);
ForEach() could be chained: although evilness/usefulness of such a feature is open to discussion.
Those are all great points made by many people here and I can see why people are missing the function. I wouldn't mind Microsoft adding a standard ForEach method in the next framework iteration.
List<T>.ForEach was added before LINQ. If you add a ForEach extension method, it will never be called for List instances because of the way extension methods are resolved (instance methods always win). I think the reason it was not added is to avoid interfering with the existing one.
However, if you really miss this little nice function, you can roll out your own version
public static void ForEach<T>(
this IEnumerable<T> source,
Action<T> action)
{
foreach (T element in source)
action(element);
}
You could write this extension method:
// Possibly call this "Do"
public static IEnumerable<T> Apply<T>(this IEnumerable<T> source, Action<T> action)
{
foreach (var e in source)
{
action(e);
yield return e;
}
}
Pros
Allows chaining:
MySequence
.Apply(...)
.Apply(...)
.Apply(...);
Cons
It won't actually do anything until you do something to force iteration. For that reason, it shouldn't be called .ForEach(). You could write .ToList() at the end, or you could write this extension method, too:
// possibly call this "Realize"
public static IEnumerable<T> Done<T>(this IEnumerable<T> source)
{
foreach (var e in source)
{
// do nothing
;
}
return source;
}
This may be too significant a departure from the shipping C# libraries; readers who are not familiar with your extension methods won't know what to make of your code.
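The "it won't actually do anything until you force iteration" point can be seen in a short sketch. The Apply method below follows the one in this answer; the class names ApplyExtensions and ApplyDemo and the side-effect counter are assumptions added for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ApplyExtensions
{
    // Lazy, chainable variant: runs the action as each element is pulled through.
    public static IEnumerable<T> Apply<T>(this IEnumerable<T> source, Action<T> action)
    {
        foreach (var e in source)
        {
            action(e);
            yield return e;
        }
    }
}

// Hypothetical demo class.
public static class ApplyDemo
{
    // Returns { side effects before forcing iteration, side effects after }.
    public static int[] Run()
    {
        int sideEffects = 0;
        IEnumerable<int> query = new[] { 1, 2, 3 }.Apply(x => sideEffects++);
        int before = sideEffects;   // nothing has run yet

        query.ToList();             // forces the iteration
        int after = sideEffects;

        return new[] { before, after };
    }

    public static void Main()
    {
        int[] r = Run();
        // prints "before: 0, after: 3"
        Console.WriteLine("before: {0}, after: {1}", r[0], r[1]);
    }
}
```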
The discussion here gives the answer:
Actually, the specific discussion I witnessed did in fact hinge over functional purity. In an expression, there are frequently assumptions made about not having side-effects. Having ForEach is specifically inviting side-effects rather than just putting up with them. -- Keith Farmer (Partner)
Basically the decision was made to keep the extension methods functionally "pure". A ForEach would encourage side-effects when using the Enumerable extension methods, which was not the intent.
While I agree that it's better to use the built-in foreach construct in most cases, I find the use of this variation on the ForEach<> extension to be a little nicer than having to manage the index in a regular foreach myself:
public static int ForEach<T>(this IEnumerable<T> list, Action<int, T> action)
{
if (action == null) throw new ArgumentNullException("action");
var index = 0;
foreach (var elem in list)
action(index++, elem);
return index;
}
Example
var people = new[] { "Moe", "Curly", "Larry" };
people.ForEach((i, p) => Console.WriteLine("Person #{0} is {1}", i, p));
Would give you:
Person #0 is Moe
Person #1 is Curly
Person #2 is Larry
One workaround is to write .ToList().ForEach(x => ...).
pros
Easy to understand - reader only needs to know what ships with C#, not any additional extension methods.
Syntactic noise is very mild (only adds a little extraneous code).
Doesn't usually cost extra memory, since a native .ForEach() would have to realize the whole collection, anyway.
cons
Order of operations isn't ideal. I'd rather realize one element, then act on it, then repeat. This code realizes all elements first, then acts on them each in sequence.
If realizing the list throws an exception, you never get to act on a single element.
If the enumeration is infinite (like the natural numbers), you're out of luck.
I've always wondered that myself, which is why I always carry this with me:
public static void ForEach<T>(this IEnumerable<T> col, Action<T> action)
{
if (action == null)
{
throw new ArgumentNullException("action");
}
foreach (var item in col)
{
action(item);
}
}
Nice little extension method.
So there have been a lot of comments about the fact that a ForEach extension method isn't appropriate because it doesn't return a value like the LINQ extension methods. While this is a factual statement, it isn't the whole story.
The LINQ extension methods do all return a value so they can be chained together:
collection.Where(i => i.Name == "hello").Select(i => i.FullName);
However, just because LINQ is implemented using extension methods does not mean that extension methods must be used in the same way and return a value. Writing an extension method to expose common functionality that does not return a value is a perfectly valid use.
The specific argument about ForEach is that, based on the constraints on extension methods (namely that an extension method will never override an inherited method with the same signature), there may be a situation where the custom extension method is available on all classes that implement IEnumerable<T> except List<T>. This can cause confusion when the methods start to behave differently depending on whether the extension method or the inherited method is being called.
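The resolution rule described above can be demonstrated with a small sketch. The class names MyExtensions and ResolutionDemo and the call counter are hypothetical; the point is only which ForEach the compiler binds to for each static type.

```csharp
using System;
using System.Collections.Generic;

public static class MyExtensions
{
    // Counts how often the extension method (not List<T>.ForEach) was invoked.
    public static int ExtensionCalls;

    public static void ForEach<T>(this IEnumerable<T> source, Action<T> action)
    {
        ExtensionCalls++;
        foreach (T item in source) action(item);
    }
}

// Hypothetical demo class.
public static class ResolutionDemo
{
    // Returns { extension calls after list.ForEach, extension calls after sequence.ForEach }.
    public static int[] Run()
    {
        MyExtensions.ExtensionCalls = 0;
        var list = new List<int> { 1, 2, 3 };
        IEnumerable<int> sequence = list;

        list.ForEach(x => { });       // binds to the List<T>.ForEach instance method
        int afterList = MyExtensions.ExtensionCalls;

        sequence.ForEach(x => { });   // binds to the extension method
        int afterSequence = MyExtensions.ExtensionCalls;

        return new[] { afterList, afterSequence };
    }

    public static void Main()
    {
        int[] r = Run();
        // prints "0, 1"
        Console.WriteLine("{0}, {1}", r[0], r[1]);
    }
}
```

The same object is iterated both times; only the static type of the variable decides which method runs.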
You could use the (chainable, but lazily evaluated) Select, first doing your operation, and then returning identity (or something else if you prefer)
IEnumerable<string> people = new List<string>(){"alica", "bob", "john", "pete"};
people.Select(p => { Console.WriteLine(p); return p; });
You will need to make sure it is still evaluated, either with Count() (the cheapest operation to enumerate afaik) or another operation you needed anyway.
I would love to see it brought in to the standard library though:
static IEnumerable<T> WithLazySideEffect<T>(this IEnumerable<T> src, Action<T> action) {
return src.Select(i => { action(i); return i; } );
}
The above code then becomes people.WithLazySideEffect(p => Console.WriteLine(p)) which is effectively equivalent to foreach, but lazy and chainable.
Note that the MoreLINQ NuGet provides the ForEach extension method you're looking for (as well as a Pipe method which executes the delegate and yields its result). See:
https://www.nuget.org/packages/morelinq
https://code.google.com/p/morelinq/wiki/OperatorsOverview
@Coincoin
The real power of the ForEach extension method lies in the reusability of the Action<> without adding unnecessary methods to your code. Say that you have 10 lists and you want to perform the same logic on them, and a corresponding function doesn't fit into your class and is not reused. Instead of having ten for loops, or a generic function that is obviously a helper that doesn't belong, you can keep all of your logic in one place (the Action<>). So, dozens of lines get replaced with
Action<Blah> f = p => { /* foo */ };
List1.ForEach(f);
List2.ForEach(f);
etc...
The logic is in one place and you haven't polluted your class.
Most of the LINQ extension methods return results. ForEach does not fit into this pattern as it returns nothing.
If you have F# (which will be in the next version of .NET), you can use
Seq.iter doSomething myIEnumerable
Partially it's because the language designers disagree with it from a philosophical perspective.
Not having (and testing...) a feature is less work than having a feature.
It's not really shorter (there's some passing function cases where it is, but that wouldn't be the primary use).
Its purpose is to have side effects, which isn't what LINQ is about.
Why have another way to do the same thing as a feature we've already got? (foreach keyword)
https://blogs.msdn.microsoft.com/ericlippert/2009/05/18/foreach-vs-foreach/
You can use select when you want to return something.
If you don't, you can use ToList first, because you probably don't want to modify anything in the collection.
I wrote a blog post about it:
http://blogs.msdn.com/kirillosenkov/archive/2009/01/31/foreach.aspx
You can vote here if you'd like to see this method in .NET 4.0:
http://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=279093
In 3.5, all the extension methods added to IEnumerable are there for LINQ support (notice that they are defined in the System.Linq.Enumerable class). In this post, I explain why foreach doesn't belong in LINQ:
Existing LINQ extension method similar to Parallel.For?
Is it just me, or has List<T>.ForEach pretty much been made obsolete by LINQ?
Originally there was
foreach(X x in Y)
where Y simply had to be IEnumerable (Pre 2.0), and implement a GetEnumerator().
If you look at the MSIL generated you can see that it is exactly the same as
IEnumerator<int> enumerator = list.GetEnumerator();
while (enumerator.MoveNext())
{
int i = enumerator.Current;
Console.WriteLine(i);
}
(See http://alski.net/post/0a-for-foreach-forFirst-forLast0a-0a-.aspx for the MSIL)
Then in .NET 2.0 generics came along, and with them List<T>. ForEach has always felt to me like an implementation of the Visitor pattern (see Design Patterns by Gamma, Helm, Johnson, Vlissides).
Now of course in 3.5 we can instead use a Lambda to the same effect, for an example try
http://dotnet-developments.blogs.techtarget.com/2008/09/02/iterators-lambda-and-linq-oh-my/
I would like to expand on Aku's answer.
If you want to call a method for the sole purpose of its side effect without iterating the whole enumerable first, you can use this:
private static IEnumerable<T> ForEach<T>(IEnumerable<T> xs, Action<T> f) {
foreach (var x in xs) {
f(x); yield return x;
}
}
My version is an extension method that allows you to use ForEach on an IEnumerable of T:
public static class EnumerableExtension
{
public static void ForEach<T>(this IEnumerable<T> source, Action<T> action)
{
source.All(x =>
{
action.Invoke(x);
return true;
});
}
}
No one has yet pointed out that ForEach<T> results in compile time type checking where the foreach keyword is runtime checked.
Having done some refactoring where both methods were used in the code, I favor .ForEach, as I had to hunt down test failures / runtime failures to find the foreach problems.
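The runtime-versus-compile-time difference can be illustrated with a short sketch (the class name TypeCheckDemo and the sample data are assumptions). A foreach with an explicitly typed loop variable compiles even when the element type is broader, and only fails when a mismatching element is reached.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class.
public static class TypeCheckDemo
{
    public static bool ForeachThrowsAtRuntime()
    {
        var items = new List<object> { "text", 42 };
        int totalLength = 0;
        try
        {
            // Compiles: foreach inserts an explicit cast from object to string.
            foreach (string s in items)
            {
                totalLength += s.Length;
            }
            return false;
        }
        catch (InvalidCastException)
        {
            // Reached when the loop hits the boxed int 42.
            return true;
        }

        // By contrast, the following would be rejected by the compiler,
        // because the lambda is not convertible to Action<object>:
        //   items.ForEach((string s) => Console.WriteLine(s.Length));
    }

    public static void Main()
    {
        Console.WriteLine(ForeachThrowsAtRuntime()); // prints "True"
    }
}
```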
