I have two methods with these signatures:
void Method<T>(T data)
{
}
void Method<T>(IEnumerable<T> data)
{
}
It's an overload of the same method to take either a single object or a list of them. If I try to pass a List<T> to it, it resolves to the first method, when obviously I want the second. I have to use list.AsEnumerable() to get it to resolve to the second. Is there any way to make it resolve to the second regardless of whether the list is of type T[], IList<T>, List<T>, Collection<T>, IEnumerable<T>, etc.?
The best solution: do not go there in the first place. This is a bad design. You'll note that none of the framework classes do this. A list has two methods: Add, and AddRange. The first adds a single item, the second adds a sequence of items.
It's a bad design because you are writing a device for automatically producing bugs. Consider again the example of a mutable list:
List<object> myqueries = new List<object>();
myqueries.Add(from c in customers select c.Name);
myqueries.Add(from o in orders where o.Amount > 10000.00m select o);
You'd expect that to add two queries to the list of queries; if Add were overloaded to take a sequence then this would add the query results to the list, not the queries. You need to be able to distinguish between a query and its results; they are logically completely different.
The best thing to do is to make the methods have two different names.
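To make the query-versus-results distinction concrete, here is a minimal sketch using the real List<T>.Add and List<T>.AddRange. The class name AddVersusAddRange and the sample data are made up for illustration; they stand in for the customers source above.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class AddVersusAddRange
{
    // Hypothetical sample data standing in for the 'customers' source.
    static readonly string[] CustomerNames = { "Ann", "Bob", "Cy" };

    // Add stores the query object itself: the list grows by exactly one element.
    public static int AddQuery()
    {
        var items = new List<object>();
        IEnumerable<string> query = CustomerNames.Where(n => n.Length == 3);
        items.Add(query);
        return items.Count;
    }

    // AddRange enumerates the query and stores each result separately.
    public static int AddQueryResults()
    {
        var items = new List<object>();
        IEnumerable<string> query = CustomerNames.Where(n => n.Length == 3);
        items.AddRange(query);
        return items.Count;
    }
}
```

AddQuery() returns 1 and AddQueryResults() returns 2, because two names have three letters. With a single overloaded Add, the caller could not say which of the two was meant.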
If you're hell bent on this bad design, then if it hurts when you do that, don't do that. Overload resolution is designed to find the best possible match. Don't try to hammer on overload resolution so that it does something worse. Again, that is confusing and bug prone. Solve the problem using some other mechanism. For example:
// requires: using System.Collections;
static void Frob(IEnumerable ts) // not generic!
{
    foreach (object t in ts) Frob<object>(t);
}
static void Frob<T>(T t)
{
    if (t is IEnumerable)
        Frob((IEnumerable) t);
    else
    {
        // otherwise, frob a single T.
    }
}
Now no matter what the user gives you - an array of T, an array of lists of T, whatever, you end up only frobbing single Ts.
But again, this is almost certainly a bad idea. Don't do it. Two methods that have different semantics should have different names.
Depends if you are doing this:
var list = new List<Something>();
Method<List<Something>>(list); // Will resolve to the first overload.
Or:
var list = new List<Something>();
Method<Something>(list); // Will resolve to the second overload.
This happens because the compiler selects the most specific method it can. When type inference is applied to your generic Method<T>(T data), the call is bound as Method<List<Something>>(List<Something> data), which is an exact match for the argument and therefore better than a conversion to IEnumerable<Something>.
The overload resolution will attempt to find the best matching overload.
To get the IEnumerable<T> overload, you do need to convert the argument explicitly, or pass something whose compile-time type is already IEnumerable<T>; only then is that overload the best match.
Otherwise, the simple generic overload will be considered a better match.
For a lot more detail, read the "overload resolution" blog entries of Eric Lippert.
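As a rough illustration of the above, the following sketch has both overloads return a label so you can see which one the compiler picked. The class and method names are invented for the example; only the two Method signatures come from the question.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class OverloadDemo
{
    public static string Method<T>(T data) => "single";
    public static string Method<T>(IEnumerable<T> data) => "sequence";

    // T is inferred as List<int>, an exact match, so the first overload wins.
    public static string CallWithList()
    {
        var list = new List<int> { 1, 2, 3 };
        return Method(list);
    }

    // The compile-time type is now IEnumerable<int>, so the second overload wins.
    public static string CallWithAsEnumerable()
    {
        var list = new List<int> { 1, 2, 3 };
        return Method(list.AsEnumerable());
    }

    // With T given as int, Method<int>(int) is not applicable to a List<int>.
    public static string CallWithExplicitTypeArgument()
    {
        var list = new List<int> { 1, 2, 3 };
        return Method<int>(list);
    }
}
```

CallWithList() returns "single", while the other two return "sequence".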
It's somewhat unusual to have overloads such that a method can take either a single thing, or a collection of things, isn't it? Why not just have
void MethodSingle<T>(T data)
{
Method(new T[] { data });
}
void Method<T>(IEnumerable<T> data)
{
}
Clearer to both the compiler and the reader, I'd suggest.
Q:
Is there any way to make it resolve to the second regardless of whether the list is of type T[], IList<T>, List<T>, Collection<T>, IEnumerable<T>, etc.?
T[] is an array rather than a list, so probably not.
I'm not sure that this will be helpful, but you can try to create a method with constraints:
public void Method<U,T> (U data) where U : IList<T> { /* ... */ }
Why not expand the method this way:
void Method<T>(T data)
{
var enumerable = data as IEnumerable<T>;
if(enumerable != null)
{
Method(enumerable);
return;
}
...
}
I have two methods with the following signatures:
public void DoSomething<T>(T value) { ... }
public void DoSomething<T>(IEnumerable<T> value) { ... }
I'm trying to call the second one like this:
DoSomething(new[] { "Pizza", "Chicken", "Cheese" });
It still jumps into the first one.
How can I enforce that it will enter the second method instead? For example by using a where clause for the generic parameter?
Edit:
To be more precise: why does it not work even if I am more specific and change the overloads to something like this, which results in a compile error when calling the method as shown above:
public void DoSomething<T>(T value) where T : IComparable { ... }
public void DoSomething<T>(IEnumerable<T> value) where T : IComparable { ... }
You can cast the array to IEnumerable<T> before passing it to the method, so that the call resolves to the IEnumerable<T> overload rather than the T overload:
DoSomething(new[] { "Pizza", "Chicken", "Cheese" } as IEnumerable<string>);
or this can also be done:
IEnumerable<string> foods = new[] { "Pizza", "Chicken", "Cheese" };
DoSomething(foods);
The problem is that when you do not cast it to IEnumerable<T>, its type at compile time is String[], so the call resolves to the overload which takes T as its input parameter.
So at the call site, your code is actually compiled as:
DoSomething<String[]>(foods);
So it calls the first overload instead of the second one.
Another solution is to specify the generic type argument explicitly instead of letting the compiler infer it:
DoSomething<String>(foods);
Refer to the following demo fiddle I created to explain it:
Demo Fiddle
Edit:
As some people have suggested, the better option would be to rename the methods so that it is obvious what kinds of types each one handles, although the workaround I mentioned does work.
Hope it Helps!
When there are multiple applicable overloads of a method there are several different factors that are used to determine which one is "better". One of those factors is how "close" each of the parameters are to the types in the method signature. string[] is an IEnumerable<string>, so it's valid in that parameter slot, but string[] is closer to string[] (an exact match is as "close" as it gets) than it is to IEnumerable<string>, so the first overload is "closer" than the second.
So if you change the compile-time type of your argument to exactly IEnumerable<string>, then both overloads will be exactly the same on the "closeness" scale. Since that's the same, it goes on to another "betterness" tiebreaker, which is that a declared parameter type of IEnumerable<T> is more specific than a bare type parameter T, so you end up calling the overload that you want.
As far as how to change the compile time type of that expression to IEnumerable<string>, the most convenient way would be AsEnumerable:
DoSomething(new[] { "Pizza", "Chicken", "Cheese" }.AsEnumerable());
As explained in @Servy's excellent answer, the compiler will choose the best matching type for overloaded methods.
By changing the generic function to use object instead, you cause anything that is an IEnumerable<> to match that function best.
public void DoSomething(object value) { ... }
public void DoSomething<T>(IEnumerable<T> value) { ... }
I think it may be theoretically possible that converting value to object could cause a performance issue versus the generic method (value types get boxed), but since DoSomething doesn't return the value or take another parameter of matching type T, the use of object is restricted to the body of DoSomething and will allow all uses that are legal for T.
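Here is a small sketch of how the object-based pair of overloads binds for a few argument types. The wrapper class and helper names are hypothetical; the two DoSomething signatures are the ones shown above, changed only to return a label.

```csharp
using System.Collections.Generic;

public static class ObjectOverloadDemo
{
    public static string DoSomething(object value) => "single";
    public static string DoSomething<T>(IEnumerable<T> value) => "sequence";

    // string[] converts to IEnumerable<string>, which is a better target than object.
    public static string WithArray() => DoSomething(new[] { "Pizza", "Chicken", "Cheese" });

    // List<int> implements IEnumerable<int>, so the generic overload wins again.
    public static string WithList() => DoSomething(new List<int> { 1, 2 });

    // int is not a sequence, so only the object overload is applicable.
    public static string WithInt() => DoSomething(42);

    // Caveat: string implements IEnumerable<char>, so this binds with T = char.
    public static string WithString() => DoSomething("Pizza");
}
```

WithArray() and WithList() return "sequence" and WithInt() returns "single". WithString() also returns "sequence", which is worth keeping in mind if single strings are expected to be treated as single values.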
Because T accepts every type, and string[] is not itself IEnumerable<string>; it merely implements it, so a conversion is needed.
DoSomething(new[] { "Pizza", "Chicken", "Cheese" }.AsEnumerable());
This must be a duplicate, but I haven't found it. I've found this question, which is related since it answers why it's recommended to use a method group instead of a lambda.
But how do I use an existing method group instead of a lambda if the method is not in the current class and the method is not static?
Say I have a list of ints which I want to convert to strings. I can use List.ConvertAll, but I need to pass a Converter<int, string> to it:
List<int> ints = new List<int> { 1 };
List<string> strings = ints.ConvertAll<string>(i => i.ToString());
This works, but it creates an unnecessary anonymous method with the lambda. If Int32.ToString were static and took an int, I could write:
List<string> strings = ints.ConvertAll<string>(Int32.ToString);
But that doesn't compile, of course. So how can I use a method group anyway?
If I created an instance method like this
string FooInt(int foo)
{
return foo.ToString();
}
I could use strings = ints.ConvertAll<string>(FooInt);, but that is not what I want. I don't want to create a new method just to be able to use an existing one.
There is a static method in the framework that can be used to convert any built-in data type into a string, namely Convert.ToString:
List<int> ints = new List<int> { 1 };
List<string> strings = ints.ConvertAll<string>(Convert.ToString);
Since the signature of Convert.ToString is also known, you can even eliminate the explicit target type parameter:
var strings = ints.ConvertAll(Convert.ToString);
This works. However, I'd also prefer the lambda-expression, even if ReSharper tells you something different. ReSharper sometimes optimizes too much imho. It prevents developers from thinking about their code, especially in the aspect of readability.
Update
Based on Tim's comment, I will try to explain the difference between lambda and static method group calls in this particular case. Therefore, I first took a look into the mscorlib disassembly to figure out how int-to-string conversion works exactly. The Int32.ToString method calls an external method within the Number class of the System namespace:
[__DynamicallyInvokable, TargetedPatchingOptOut("Performance critical to inline across NGen image boundaries"), SecuritySafeCritical]
public string ToString(IFormatProvider provider)
{
return Number.FormatInt32(this, null, NumberFormatInfo.GetInstance(provider));
}
The static Convert.ToString member does nothing else than calling ToString on the parameter:
[__DynamicallyInvokable]
public static string ToString(int value)
{
return value.ToString(CultureInfo.CurrentCulture);
}
Technically there would be no difference, if you'd write your own static member or extension, like you did in your question. So what's the difference between those two lines?
ints.ConvertAll<string>(i => i.ToString());
ints.ConvertAll(Convert.ToString);
Also - technically - there is no difference. The first example creates an anonymous method that returns a string and accepts an integer. Using the integer instance, it calls its member ToString. The second one does the same, with the exception that the method is not anonymous, but a built-in member of the framework.
The only difference is that the second line is shorter and saves the compiler a few operations.
But why can't you call the non-static ToString directly?
Let's take a look into the ConvertAll-method of List:
public List<TOutput> ConvertAll<TOutput>(Converter<T, TOutput> converter)
{
if (converter == null)
{
ThrowHelper.ThrowArgumentNullException(ExceptionArgument.converter);
}
List<TOutput> list = new List<TOutput>(this._size);
for (int i = 0; i < this._size; i++)
{
list._items[i] = converter(this._items[i]);
}
list._size = this._size;
return list;
}
The list iterates over each item, calls the converter with the item as an argument, and copies the result into a new list, which it returns at the end.
So the only relation here is your converter, which gets called explicitly. If you could pass Int32.ToString to the method, the compiler would have to decide to call this._items[i].ToString() within the loop. In this specific case it would work, but that's "too much intelligence" for the compiler. The type system does not support such code conversions. Instead the converter is an object, describing a method that can be called from the scope of the callee. Either this is an existing static method, like Convert.ToString, or an anonymous expression, like your lambda.
What causes the differences in your benchmark results?
That's hard to guess. I can imagine two factors:
Evaluating lambdas may result in runtime-overhead.
Framework calls may be optimized.
The last point especially means that the JIT compiler is able to inline the call, which results in better performance. However, those are just assumptions of mine. If anyone could clarify this, I'd appreciate it! :)
You hit the nail on the head yourself:
This works, but it creates an unnecessary anonymous method with the lambda.
You can't do what you're asking for because there is no appropriate method group that you can use so the anonymous method is necessary. It works in that other case because the implicit range variable is passed to the delegate created by the method group. In your case, you need the method to be called on the range variable. It's a completely different scenario.
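A short sketch of the three situations may help. A method group works when the list element is the argument of the method, whether the method is static (Convert.ToString) or bound to one particular instance (HashSet<int>.Add on a specific set). When the element has to be the receiver of the call, as with i.ToString(), a lambda is needed. The class and method names below are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class MethodGroupDemo
{
    // Static method group: each int is passed as the argument to Convert.ToString(int).
    public static List<string> WithStaticMethodGroup(List<int> ints)
        => ints.ConvertAll<string>(Convert.ToString);

    // Instance method group bound to 'seen': each int is passed as the argument
    // to seen.Add, which returns true only the first time a value is added.
    public static List<int> WithBoundInstanceMethodGroup(List<int> ints)
    {
        var seen = new HashSet<int>();
        return ints.FindAll(seen.Add);
    }

    // The element itself is the receiver of ToString(), so a lambda is required.
    public static List<string> WithLambda(List<int> ints)
        => ints.ConvertAll(i => i.ToString());
}
```

For the input 1, 2, 1, 3 the second method returns 1, 2, 3, since the bound seen.Add acts as a "first occurrence" predicate.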
I have an overload method - the first implementation always returns a single object, the second implementation always returns an enumeration.
I'd like to make the methods generic and overloaded, and restrict the compiler from attempting to bind to the non-enumeration method when the generic type is enumerable...
class Cache
{
T GetOrAdd<T> (string cachekey, Func<T> fnGetItem)
where T : {is not IEnumerable}
{
}
T[] GetOrAdd<T> (string cachekey, Func<IEnumerable<T>> fnGetItem)
{
}
}
To be used with...
{
// The compiler should choose the 1st overload
var customer = Cache.GetOrAdd("FirstCustomer", () => context.Customers.First());
// The compiler should choose the 2nd overload
var customers = Cache.GetOrAdd("AllCustomers", () => context.Customers.ToArray());
}
Is this just plain bad code-smell that I'm infringing on here, or is it possible to disambiguate the above methods so that the compiler will always get the calling code right?
Up votes for anyone who can produce any answer other than "rename one of the methods".
Rename one of the methods. You'll notice that List<T> has an Add and an AddRange method; follow that pattern. Doing something to an item and doing something to a sequence of items are logically different tasks, so make the methods have different names.
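As a minimal sketch of what the renamed pair could look like, assuming a simple dictionary-backed cache: the class name NamedCache, the method name GetOrAddRange, and the storage strategy are all assumptions for illustration, not the asker's actual implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class NamedCache
{
    // Hypothetical backing store; a real cache would also handle expiry and threading.
    private readonly Dictionary<string, object> _items = new Dictionary<string, object>();

    // Caches and returns a single item.
    public T GetOrAdd<T>(string cacheKey, Func<T> getItem)
    {
        if (!_items.TryGetValue(cacheKey, out object cached))
        {
            cached = getItem();
            _items[cacheKey] = cached;
        }
        return (T)cached;
    }

    // Caches and returns a sequence, realized once as an array.
    public T[] GetOrAddRange<T>(string cacheKey, Func<IEnumerable<T>> getItems)
    {
        if (!_items.TryGetValue(cacheKey, out object cached))
        {
            cached = getItems().ToArray();
            _items[cacheKey] = cached;
        }
        return (T[])cached;
    }
}
```

With distinct names, a call such as cache.GetOrAddRange("AllCustomers", () => customers.ToArray()) says at the call site which behavior is intended, and overload resolution never has to choose.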
This is a difficult use case to support because of how the C# compiler performs overload resolution and how it decides which method to bind to.
The first issue is that constraints are not part of the signature of a method and won't be considered for overload resolution.
The second problem you've got to overcome is that the compiler chooses the best match from the available signatures - which, when dealing with generics, generally means that SomeMethod<T>(T) will be considered a better match than SomeMethod<T>( IEnumerable<T> ) ... particularly when you've got parameters like T[] or List<T>.
But more fundamentally, you have to consider whether operating on a single value vs. a collection of values is really the same operation. If they are logically different, then you probably want to use different names just for clarity. Perhaps there are some use cases where you could argue that the semantic differences between single objects and collections of objects are not meaningful ... but in that case, why implement two different methods at all? It's unclear that method overloading is the best way to express the differences. Let's look at an example that lends to the confusion:
Cache.GetOrAdd("abc", () => context.Customers.Frobble() );
First, note that in the example above we are choosing to ignore the return value. Second, notice that we call some method Frobble() on the Customers collection. Now can you tell me which overload of GetOrAdd() will be called? Clearly without knowing the type that Frobble() returns it's not possible. Personally I believe that code whose semantics can't be readily inferred from the syntax should be avoided when possible. If we choose better names, this issue is alleviated:
Cache.Add( "abc", () => context.Customers.Frobble() );
Cache.AddRange( "xyz", () => context.Customers.Frobble() );
Ultimately, there are only three options to disambiguate the methods in your example:
Change the name of one of the methods.
Cast to IEnumerable<T> wherever you call the second overload.
Change the signature of one of the methods in a way that the compiler can differentiate.
Option 1 is self-evident, so I'll say no more about it.
Option 2 is also easy to understand:
var customers = Cache.GetOrAdd("All",
() => (IEnumerable<Customer>)context.Customers.ToArray());
Option 3 is more complicated. Let's look at ways we can achieve it.
One approach is to change the signature of the Func<> delegate, for instance:
T GetOrAdd<T> (string cachekey, Func<object,T> fnGetItem)
T[] GetOrAdd<T> (string cachekey, Func<IEnumerable<T>> fnGetItem)
// now we can do:
var customer = Cache.GetOrAdd("First", _ => context.Customers.First());
var customers = Cache.GetOrAdd("All", () => context.Customers.ToArray());
Personally, I find this option terribly ugly, unintuitive, and confusing. Introducing an unused parameter is terrible ... but, sadly it will work.
An alternative way of changing the signature (which is somewhat less terrible) is to make the return value an out parameter:
void GetOrAdd<T> (string cachekey, Func<object,T> fnGetItem, out T item);
void GetOrAdd<T> (string cachekey, Func<IEnumerable<T>> fnGetItem, out T[] items);
// now we can write:
Customer customer;
Cache.GetOrAdd("First", _ => context.Customers.First(), out customer);
Customer[] customers;
Cache.GetOrAdd("All",
    () => context.Customers.ToArray(), out customers);
But is this really better? It prevents us from using these methods as parameters of other method calls. It also makes the code less clear and less understandable, IMO.
A final alternative I'll present is to add another generic parameter to the methods which identifies the type of the return value:
T GetOrAdd<T> (string cachekey, Func<T> fnGetItem);
R[] GetOrAdd<T,R> (string cachekey, Func<IEnumerable<T>> fnGetItem);
// now we can do:
var customer = Cache.GetOrAdd("First", () => context.Customers.First());
var customers = Cache.GetOrAdd<Customer,Customer>("All", () => context.Customers.ToArray());
So we can use hints to help the compiler choose an overload for us ... sure. But look at all of the extra work we have to do as the developer to get there (not to mention the introduced ugliness and opportunity for mistakes). Is it really worth the effort? Particularly when an easy and reliable technique (naming the methods differently) already exists to help us?
Use only one method and have it detect the IEnumerable<T> case dynamically rather than attempting the impossible via generic constraints. It would be "code smell" to have to deal with two different cache methods depending on if the object to store/retrieve is something enumerable or not. Also, just because it implements IEnumerable<T> does not mean it is necessarily a collection.
Constraints don't support exclusion, which may seem frustrating at first, but is consistent and makes sense (consider, for example, that interfaces don't dictate what implementations can't do).
That being said, you could play around with the constraints of your IEnumerable overload...maybe change your method to have two generic typings <X, T> with a constraint like "where X : IEnumerable<T>" ?
ETA the following code sample:
T[] GetOrAdd<X,T> (string cachekey, Func<X> fnGetItem)
where X : IEnumerable<T>
{
}
I'm learning a bit about functional programming, and I'm wondering:
1) If my ForEach extension method is pure? The way I'm calling it seems to violate the "don't mess with the object getting passed in", right?
public static void ForEach<T>(this IEnumerable<T> source, Action<T> action)
{
foreach ( var item in source )
action(item);
}
static void Main(string[] args)
{
List<Cat> cats = new List<Cat>()
{
new Cat{ Purring=true,Name="Marcus",Age=10},
new Cat{ Purring=false, Name="Fuzzbucket",Age=25 },
new Cat{ Purring=false, Name="Beanhead",Age=9 },
new Cat{Purring=true,Name="Doofus",Age=3}
};
cats.Where(x=>x.Purring==true).ForEach(x =>
{
Console.WriteLine("{0} is a purring cat... purr!", x.Name);
});
// *************************************************
// Does this code make the extension method impure?
// *************************************************
cats.Where(x => x.Purring == false).ForEach(x =>
{
x.Purring = true; // purr,baby
});
// all the cats now purr
cats.Where(x=>x.Purring==true).ForEach(x =>
{
Console.WriteLine("{0} is a purring cat... purr!", x.Name);
});
}
public class Cat {
public bool Purring;
public string Name;
public int Age;
}
2) If it is impure, is it bad code? I personally think it makes cleaner looking code than the old foreach ( var item in items) { blah; }, but I worry that since it might be impure, it could make a mess.
3) Would it be bad code if it returned IEnumerable<T> instead of void? I'd say as long as it is impure, yes it would be very bad code as it would encourage chaining something that would modify the chain. For example, is this bad code?
// possibly bad extension
public static IEnumerable<T> ForEach<T>(this IEnumerable<T> source, Action<T> action)
{
foreach ( var item in source )
action(item);
return source;
}
Impurity doesn't necessarily mean bad code. Many people find it easy and useful to use side effects to solve a problem. The key is first knowing how to do it in a pure way, so you'll know when impurity is appropriate :).
.NET doesn't have the concept of purity in the type system, so a "pure" method that takes in arbitrary delegates can always be impure, depending on how it's called. For instance, "Where", aka "filter", would usually be considered a pure function, since it doesn't modify its arguments or modify global state.
But, there's nothing stopping you from putting such code inside the argument to Where. For example:
things.Where(x => { Console.WriteLine("um?");
return true; })
.Count();
So that's definitely an impure usage of Where. Enumerables can do whatever they want as they iterate.
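One consequence worth seeing in code: because Where is deferred, a side effect inside the predicate runs every time the query is enumerated, not when the query is built. The sketch below counts predicate calls; the class name and the counter are invented for the example.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ImpureWhereDemo
{
    // Returns the number of predicate calls observed after each step.
    public static int[] CountPredicateCalls()
    {
        int calls = 0;
        var things = new List<int> { 1, 2, 3 };

        // Building the query runs nothing yet: Where is deferred.
        IEnumerable<int> query = things.Where(x => { calls++; return true; });
        int afterBuild = calls;

        query.Count();                  // first enumeration runs the predicate 3 times
        int afterFirstCount = calls;

        query.Count();                  // second enumeration runs it 3 more times
        int afterSecondCount = calls;

        return new[] { afterBuild, afterFirstCount, afterSecondCount };
    }
}
```

The returned counts are 0, 3 and 6, which is why side effects hidden inside a query are easy to trigger more often (or less often) than intended.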
Is your code bad? No. Using a foreach loop is just as "impure" -- you're still modifying the source objects. I write code like that all the time. Chain together some selects, filters, etc., then execute a ForEach on it to invoke some work. You're right, it's cleaner and easier.
Example: ObservableCollection. It has no AddRange method for some reason. So, if I want to add a bunch of things to it, what do I do?
foreach(var x in things.Where(y => y.Foo > 0)) { collection.Add(x); }
or
things.Where(x => x.Foo > 0).ForEach(collection.Add);
I prefer the second one. At a minimum, I don't see how it can be construed as being worse than the first way.
When is it bad code? When it does side effecting code in a place that's not expected. This is the case for my first example using Where. And even then, there are times when the scope is very limited and the usage is clear.
Chaining ForEach
I've written code that does things like that. To avoid confusion, I would give it another name. The main confusion is "is this immediately evaluated or lazy?". ForEach implies that it'll go execute a loop right away. But something returning an IEnumerable implies that the items will be processed as needed. So I'd suggest giving it another name ("Process", "ModifySeq", "OnEach"... something like that), and making it lazy:
public static IEnumerable<T> OnEach<T>(this IEnumerable<T> src, Action<T> f) {
foreach(var x in src) {
f(x);
yield return x;
}
}
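To show what "lazy" means for a method like this, here is a sketch that restates OnEach inside a static class and records when the action actually runs. The class name LazySideEffects and the Demo method are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazySideEffects
{
    // Runs f on each element as it is pulled through the sequence.
    public static IEnumerable<T> OnEach<T>(this IEnumerable<T> src, Action<T> f)
    {
        foreach (var x in src)
        {
            f(x);
            yield return x;
        }
    }

    // Returns how many elements were processed before and after enumeration.
    public static int[] Demo()
    {
        var log = new List<int>();
        var numbers = new[] { 1, 2, 3 };

        var pipeline = numbers.OnEach(log.Add);
        int beforeEnumeration = log.Count;   // nothing has run yet

        pipeline.Take(2).ToList();           // pulls only two elements
        int afterTakeTwo = log.Count;

        return new[] { beforeEnumeration, afterTakeTwo };
    }
}
```

Demo() returns 0 and 2: the action does not run when the pipeline is built, and it runs only for the elements that a consumer actually pulls.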
It is not pure, as it can call impure methods. I think by typical definitions, purity is a transitive closure - a function is pure only if all the functions it calls (directly or indirectly) are also pure, or if the effects of those functions are encapsulated (e.g. they only mutate a non-escaping local variable).
Yes, it's not pure, but that's kind of a moot point as it's not even a function.
As the method doesn't return anything, the only way for it to do anything at all is to either affect the objects that you are sending in, or affect something unrelated (like writing to the console window).
Edit:
To answer your third question; yes, that is bad code, as it seems to be doing something that it doesn't. The method returns a collection so it seems to be pure, but as it just returns the collection that was sent in, it's actually not any more pure than the first version. To make any sense the method should take a Func<T,T> delegate to use as conversion, and return a collection of the converted items:
public static IEnumerable<T> ForEach<T>(this IEnumerable<T> source, Func<T,T> converter) {
foreach (T item in source) {
yield return converter(item);
}
}
It's of course still up to the converter function if the extension call is pure. If it doesn't make a copy of the input item but just changes it and returns it, the call is still not pure.
Indeed, because your lambda expression contains an assignment, the function is now by definition impure. Whether the assignment is related to one of the arguments or another object defined outside the current function is irrelevant... A function must have no side-effects whatsoever in order to be called pure. See Wikipedia for a more precise (though quite straightforward) definition, which details the two conditions a function must satisfy to be deemed pure (having no side-effects is one of them). I believe lambda expressions are typically meant to be used as pure functions (at least I would imagine they were originally studied as such from the mathematical perspective), though clearly C# isn't stringent about this, where purely functional languages are. So it's probably not bad practice, though it's definitely worth being aware that such a function is impure.
Inspired by another question asking about the missing Zip function:
Why is there no ForEach extension method on the IEnumerable interface? Or anywhere? The only class that gets a ForEach method is List<>. Is there a reason why it's missing, maybe performance?
There is already a foreach statement included in the language that does the job most of the time.
I'd hate to see the following:
list.ForEach( item =>
{
item.DoSomething();
} );
Instead of:
foreach(Item item in list)
{
item.DoSomething();
}
The latter is clearer and easier to read in most situations, although maybe a bit longer to type.
However, I must admit I changed my stance on that issue; a ForEach() extension method would indeed be useful in some situations.
Here are the major differences between the statement and the method:
Type checking: foreach is done at runtime, ForEach() is at compile time (Big Plus!)
The syntax to call a delegate is indeed much simpler: objects.ForEach(DoSomething);
ForEach() could be chained: although evilness/usefulness of such a feature is open to discussion.
Those are all great points made by many people here and I can see why people are missing the function. I wouldn't mind Microsoft adding a standard ForEach method in the next framework iteration.
List<T>.ForEach was added before LINQ. If you add a ForEach extension method, it will never be called for List<T> instances, because instance methods take precedence over extension methods. I think it was not added so as not to interfere with the existing one.
However, if you really miss this little nice function, you can roll out your own version
public static void ForEach<T>(
this IEnumerable<T> source,
Action<T> action)
{
foreach (T element in source)
action(element);
}
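The claim that the extension is never called for List<T> instances can be checked with a small sketch. The counter field and the class names below are hypothetical additions used only to observe which method the compiler bound to.

```csharp
using System;
using System.Collections.Generic;

public static class SequenceExtensions
{
    // Hypothetical counter, used only to observe when the extension is called.
    public static int ExtensionCalls;

    public static void ForEach<T>(this IEnumerable<T> source, Action<T> action)
    {
        ExtensionCalls++;
        foreach (T element in source)
            action(element);
    }
}

public static class ForEachBindingDemo
{
    public static int[] Demo()
    {
        SequenceExtensions.ExtensionCalls = 0;
        var list = new List<int> { 1, 2, 3 };

        list.ForEach(x => { });          // binds to the instance method List<T>.ForEach
        int afterListCall = SequenceExtensions.ExtensionCalls;

        IEnumerable<int> sequence = list;
        sequence.ForEach(x => { });      // binds to the extension method
        int afterSequenceCall = SequenceExtensions.ExtensionCalls;

        return new[] { afterListCall, afterSequenceCall };
    }
}
```

Demo() returns 0 and 1: the same object reaches the extension only when its compile-time type is IEnumerable<int>.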
You could write this extension method:
// Possibly call this "Do"
public static IEnumerable<T> Apply<T> (this IEnumerable<T> source, Action<T> action)
{
foreach (var e in source)
{
action(e);
yield return e;
}
}
Pros
Allows chaining:
MySequence
.Apply(...)
.Apply(...)
.Apply(...);
Cons
It won't actually do anything until you do something to force iteration. For that reason, it shouldn't be called .ForEach(). You could write .ToList() at the end, or you could write this extension method, too:
// possibly call this "Realize"
public static IEnumerable<T> Done<T> (this IEnumerable<T> source)
{
foreach (var e in source)
{
// do nothing
;
}
return source;
}
This may be too significant a departure from the shipping C# libraries; readers who are not familiar with your extension methods won't know what to make of your code.
The discussion here gives the answer:
Actually, the specific discussion I witnessed did in fact hinge over functional purity. In an expression, there are frequently assumptions made about not having side-effects. Having ForEach is specifically inviting side-effects rather than just putting up with them. -- Keith Farmer (Partner)
Basically the decision was made to keep the extension methods functionally "pure". A ForEach would encourage side-effects when using the Enumerable extension methods, which was not the intent.
While I agree that it's better to use the built-in foreach construct in most cases, I find the use of this variation on the ForEach<> extension to be a little nicer than having to manage the index in a regular foreach myself:
public static int ForEach<T>(this IEnumerable<T> list, Action<int, T> action)
{
if (action == null) throw new ArgumentNullException("action");
var index = 0;
foreach (var elem in list)
action(index++, elem);
return index;
}
Example
var people = new[] { "Moe", "Curly", "Larry" };
people.ForEach((i, p) => Console.WriteLine("Person #{0} is {1}", i, p));
Would give you:
Person #0 is Moe
Person #1 is Curly
Person #2 is Larry
One workaround is to write .ToList().ForEach(x => ...).
pros
Easy to understand - reader only needs to know what ships with C#, not any additional extension methods.
Syntactic noise is very mild (only adds a little extraneous code).
Doesn't usually cost extra memory, since a native .ForEach() would have to realize the whole collection, anyway.
cons
Order of operations isn't ideal. I'd rather realize one element, then act on it, then repeat. This code realizes all elements first, then acts on them each in sequence.
If realizing the list throws an exception, you never get to act on a single element.
If the enumeration is infinite (like the natural numbers), you're out of luck.
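The order-of-operations point can be made visible with a sketch that logs when each element is produced and when it is consumed. Produce, the log format, and the class name are made up for the illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class RealizationOrderDemo
{
    // Hypothetical source that records when each element is produced.
    static IEnumerable<int> Produce(List<string> log)
    {
        for (int i = 1; i <= 2; i++)
        {
            log.Add("produce " + i);
            yield return i;
        }
    }

    // ToList() realizes every element before ForEach acts on any of them.
    public static string WithToListForEach()
    {
        var log = new List<string>();
        Produce(log).ToList().ForEach(i => log.Add("consume " + i));
        return string.Join(", ", log);
    }

    // The foreach statement produces one element, acts on it, then repeats.
    public static string WithForeachStatement()
    {
        var log = new List<string>();
        foreach (int i in Produce(log))
            log.Add("consume " + i);
        return string.Join(", ", log);
    }
}
```

WithToListForEach() yields "produce 1, produce 2, consume 1, consume 2", whereas WithForeachStatement() yields "produce 1, consume 1, produce 2, consume 2".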
I've always wondered that myself, which is why I always carry this with me:
public static void ForEach<T>(this IEnumerable<T> col, Action<T> action)
{
if (action == null)
{
throw new ArgumentNullException("action");
}
foreach (var item in col)
{
action(item);
}
}
Nice little extension method.
There have been a lot of comments about the fact that a ForEach extension method isn't appropriate because it doesn't return a value like the LINQ extension methods. While this is a factual statement, it isn't the whole picture.
The LINQ extension methods do all return a value so they can be chained together:
collection.Where(i => i.Name == "hello").Select(i => i.FullName);
However, just because LINQ is implemented using extension methods does not mean that extension methods must be used in the same way and return a value. Writing an extension method to expose common functionality that does not return a value is a perfectly valid use.
The specific argument about ForEach is that, based on the constraints on extension methods (namely that an extension method will never override an instance method with the same signature), there may be a situation where the custom extension method is available on all classes that implement IEnumerable<T> except List<T>. This can cause confusion when the methods start to behave differently depending on whether the extension method or the instance method is being called.
You could use the (chainable, but lazily evaluated) Select, first doing your operation, and then returning identity (or something else if you prefer)
IEnumerable<string> people = new List<string>(){"alica", "bob", "john", "pete"};
people.Select(p => { Console.WriteLine(p); return p; });
You will need to make sure it is still evaluated, either with Count() (the cheapest operation to enumerate afaik) or another operation you needed anyway.
I would love to see it brought in to the standard library though:
public static IEnumerable<T> WithLazySideEffect<T>(this IEnumerable<T> src, Action<T> action) {
return src.Select(i => { action(i); return i; } );
}
The above code then becomes people.WithLazySideEffect(p => Console.WriteLine(p)) which is effectively equivalent to foreach, but lazy and chainable.
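To make the "lazy" part concrete, here is a small sketch. It repeats the helper inside a hypothetical LazyExtensions class so it compiles on its own, and shows that nothing runs until the sequence is actually enumerated:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyExtensions
{
    // Same shape as the helper above; the class name is an assumption.
    public static IEnumerable<T> WithLazySideEffect<T>(this IEnumerable<T> src, Action<T> action)
    {
        return src.Select(i => { action(i); return i; });
    }
}

public static class LazyDemo
{
    public static void Main()
    {
        var seen = new List<string>();
        IEnumerable<string> people = new List<string> { "alica", "bob", "john", "pete" };

        // Building the query does not invoke the action.
        var query = people.WithLazySideEffect(p => seen.Add(p));
        Console.WriteLine(seen.Count); // 0

        // Enumerating the query is what triggers the side effect.
        foreach (var p in query) { }
        Console.WriteLine(seen.Count); // 4
    }
}
```

Forgetting to enumerate the result is the usual way this pattern goes wrong: the code compiles, runs, and does nothing.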
Note that the MoreLINQ NuGet package provides the ForEach extension method you're looking for (as well as a Pipe method, which executes the delegate on each element and then yields that element). See:
https://www.nuget.org/packages/morelinq
https://code.google.com/p/morelinq/wiki/OperatorsOverview
#Coincoin
The real power of the ForEach extension method is that it lets you reuse the Action<> without adding unnecessary methods to your code. Say that you have 10 lists and you want to perform the same logic on them, and a corresponding function doesn't fit into your class and is not reused anywhere else. Instead of having ten for loops, or a generic function that is obviously a helper that doesn't belong, you can keep all of your logic in one place (the Action<>). So, dozens of lines get replaced with
Action<Blah> f = p => { /* foo */ };
List1.ForEach(f);
List2.ForEach(f);
etc...
The logic is in one place and you haven't polluted your class.
Most of the LINQ extension methods return results. ForEach does not fit into this pattern as it returns nothing.
If you have F# (which will be in the next version of .NET), you can use
Seq.iter doSomething myIEnumerable
Partially it's because the language designers disagree with it from a philosophical perspective.
Not having (and testing...) a feature is less work than having a feature.
It's not really shorter (there's some passing function cases where it is, but that wouldn't be the primary use).
Its purpose is to have side effects, which isn't what LINQ is about.
Why have another way to do the same thing as a feature we've already got? (foreach keyword)
https://blogs.msdn.microsoft.com/ericlippert/2009/05/18/foreach-vs-foreach/
You can use Select when you want to return something.
If you don't, you can use ToList first, because you probably don't want to modify anything in the collection.
I wrote a blog post about it:
http://blogs.msdn.com/kirillosenkov/archive/2009/01/31/foreach.aspx
You can vote here if you'd like to see this method in .NET 4.0:
http://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=279093
In 3.5, all the extension methods added to IEnumerable are there for LINQ support (notice that they are defined in the System.Linq.Enumerable class). In this post, I explain why foreach doesn't belong in LINQ:
Existing LINQ extension method similar to Parallel.For?
Is it just me, or has List<T>.ForEach pretty much been made obsolete by LINQ?
Originally there was
foreach(X x in Y)
where Y simply had to be IEnumerable (Pre 2.0), and implement a GetEnumerator().
If you look at the MSIL generated you can see that it is exactly the same as
IEnumerator<int> enumerator = list.GetEnumerator();
while (enumerator.MoveNext())
{
int i = enumerator.Current;
Console.WriteLine(i);
}
(See http://alski.net/post/0a-for-foreach-forFirst-forLast0a-0a-.aspx for the MSIL)
Then in .NET 2.0 generics came along, and with them List<T>. ForEach has always felt to me like an implementation of the Visitor pattern (see Design Patterns by Gamma, Helm, Johnson, Vlissides).
Now of course in 3.5 we can use a lambda to the same effect instead. For an example, try
http://dotnet-developments.blogs.techtarget.com/2008/09/02/iterators-lambda-and-linq-oh-my/
I would like to expand on Aku's answer.
If you want to call a method solely for its side effect, without iterating the whole enumerable first, you can use this:
private static IEnumerable<T> ForEach<T>(IEnumerable<T> xs, Action<T> f) {
foreach (var x in xs) {
f(x); yield return x;
}
}
My version is an extension method that lets you use ForEach on IEnumerable<T>:
public static class EnumerableExtension
{
public static void ForEach<T>(this IEnumerable<T> source, Action<T> action)
{
source.All(x =>
{
action.Invoke(x);
return true;
});
}
}
No one has yet pointed out that ForEach<T> gives you compile-time type checking, whereas the element type in a foreach statement is only checked at run time.
Having done some refactoring where both methods were used in the code, I favor .ForEach, as I had to hunt down test failures / runtime failures to find the foreach problems.
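A small sketch of that difference, using a hypothetical CastDemo class. A foreach with an explicitly typed loop variable compiles even when the element type doesn't match, because the compiler inserts a cast that is only checked while iterating:

```csharp
using System;
using System.Collections.Generic;

public static class CastDemo
{
    // Returns true if iterating the items as strings fails at run time.
    public static bool ThrowsOnForeach(IEnumerable<object> items)
    {
        try
        {
            // Compiles: foreach casts each element to string as it goes.
            foreach (string s in items)
            {
                Console.WriteLine(s.Length);
            }
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        var items = new List<object> { "a", 42 };

        // Prints 1 for "a", then the cast of 42 to string fails.
        Console.WriteLine(ThrowsOnForeach(items)); // True

        // The equivalent ForEach call is rejected by the compiler, because a
        // lambda taking a string cannot be converted to Action<object>:
        // items.ForEach((string s) => Console.WriteLine(s.Length));
    }
}
```

Declaring the loop variable with var avoids the hidden cast, at which point foreach is checked at compile time as well.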