C# Paradigms: Side effects on Lists - c#

I am trying to evolve my understanding of side effects and how they should be controlled and applied.
In the following list of flights, I want to set a property of each flight satisfying a condition:
IEnumerable<FlightResults> fResults = getResultsFromProvider();
//Set all non-stop flights description
fResults.Where(flight => flight.NonStop)
.Select(flight => flight.Description = "Fly Direct!");
In this expression, I have a side effect on my list. From my limited knowledge I know, for example, that "LINQ is used for queries only", that "there are only a few operations on lists, and assigning or setting values is not one of them", and that "lists should be immutable".
What is wrong with my LINQ statement above and how should it be changed?
Where can I get more information on the fundamental paradigms on the scenario I have described above?

You have two ways of achieving it the LINQ way:
explicit foreach loop
foreach(var f in fResults.Where(flight => flight.NonStop))
f.Description = "Fly Direct!";
with a ForEach operator, made for the side effects:
fResults.Where(flight => flight.NonStop)
.ForEach(flight => flight.Description = "Fly Direct!");
The first way is quite heavy for such a simple task; the second way should only be used with very short bodies.
Now, you might ask yourself why there isn't a ForEach operator in the LINQ stack. It's quite simple - LINQ is supposed to be a functional way of expressing query operations, which especially means that none of the operators are supposed to have side effects. The design team decided against adding a ForEach operator to the stack because the only usage is its side effect.
A usual implementation of the ForEach operator would be like this:
public static class EnumerableExtension
{
public static void ForEach<T> (this IEnumerable<T> source, Action<T> action)
{
if(source == null)
throw new ArgumentNullException("source");
foreach(T obj in source)
action(obj);
}
}

One problem with that approach is that it won't work at all. The query is lazy, which means that it won't execute the code in the Select until you actually read something from the query, and you never do that.
You could get around that by adding .ToList() at the end of the query, but the code would still be using side effects and throwing away the actual result. You should use the result to do the update instead:
//Set all non-stop flights description
foreach (var flight in fResults.Where(flight => flight.NonStop)) {
flight.Description = "Fly Direct!";
}
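To make the laziness point concrete, here is a minimal sketch. The FlightResults class below is a hypothetical stand-in for the type in the question (only NonStop and Description are assumed), and DeferredExecutionDemo is a name made up for this example. It counts how many flights carry the new description after merely building the Select query, and again after a plain foreach.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's FlightResults type.
public class FlightResults
{
    public bool NonStop { get; set; }
    public string Description { get; set; } = "";
}

public static class DeferredExecutionDemo
{
    // Returns how many flights have the new description after each approach.
    public static (int afterSelect, int afterForeach) Run()
    {
        var flights = new List<FlightResults>
        {
            new FlightResults { NonStop = true },
            new FlightResults { NonStop = false },
            new FlightResults { NonStop = true },
        };

        // This only builds a query object; the lambda inside Select has not run,
        // because nothing ever enumerates the query.
        var query = flights.Where(f => f.NonStop)
                           .Select(f => f.Description = "Fly Direct!");

        int afterSelect = flights.Count(f => f.Description == "Fly Direct!");

        // Enumerating the filtered sequence is what actually performs the update.
        foreach (var flight in flights.Where(f => f.NonStop))
        {
            flight.Description = "Fly Direct!";
        }

        int afterForeach = flights.Count(f => f.Description == "Fly Direct!");
        return (afterSelect, afterForeach);
    }

    public static void Main()
    {
        var (afterSelect, afterForeach) = Run();
        // Prints "after Select: 0, after foreach: 2"
        Console.WriteLine($"after Select: {afterSelect}, after foreach: {afterForeach}");
    }
}
```

The unenumerated Select changes nothing, while the foreach updates both non-stop flights.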

Your LINQ code does not "directly" violate the guidelines you mention, because you are not modifying the list itself; you are just modifying some property on the contents of the list.
However, the main objection that drives these guidelines remains: you should not be modifying data with LINQ (also, you are abusing Select to perform your side effects).
Not modifying any data can be justified pretty easily. Consider this snippet:
fResults.Where(flight => flight.NonStop)
Do you see where this is modifying the flight properties? Neither will many maintenance programmers, since they will stop reading after the Where -- the code that follows is obviously free of side effects since this is a query, right?
[Nitpick: Certainly, seeing a query whose return value is not retained is a dead giveaway that the query does have side effects or that the code should have been removed; in any case, that "something is wrong". But it's so much easier to say that when there are only 2 lines of code to look at instead of pages upon pages.]
As a correct solution, I would recommend this:
foreach (var x in fResults.Where(flight => flight.NonStop))
{
x.Description = "Fly Direct!";
}
Pretty easy to both write and read.

There is nothing wrong with it per se, except that you need to iterate it somehow, for example by calling Count() on it.
From a 'style' perspective it is not good. One would not expect an iterator to mutate a list value/property.
IMO the following would be better:
foreach (var x in fResults.Where(flight => flight.NonStop))
{
x.Description = "Fly Direct!";
}
The intent is much clearer to the reader or maintainer of the code.

You should break that up into two blocks of code, one for the retrieval and one for setting the value:
var nonStopFlights = fResults.Where(f => f.NonStop);
foreach(var flight in nonStopFlights)
flight.Description = "Fly Direct!";
Or, if you really hate the look of foreach you could try:
var nonStopFlights = fResults.Where(f => f.NonStop).ToList();
// ForEach is a method on List that is acceptable to make modifications inside.
nonStopFlights.ForEach(f => f.Description = "Fly Direct!");

I like using foreach when I'm actually changing something. Something like
foreach (var flight in fResults.Where(f => f.NonStop))
{
flight.Description = "Fly Direct!";
}
and so does Eric Lippert in his article about why LINQ does not have a ForEach helper method.
But we can go a bit deeper here. I am philosophically opposed to providing such a method, for two reasons.
The first reason is that doing so violates the functional programming principles that all the other sequence operators are based upon. Clearly the sole purpose of a call to this method is to cause side effects.

Related

Difference between using where/lambda and for each

I'm new to C# and came across this function performed on a dictionary.
_objDictionary.Keys.Where(a => (a is fooObject)).ToList().ForEach(a => ((fooObject)a).LaunchMissles());
My understanding is that this essentially puts every key that is a fooObject into a list, then calls the LaunchMissles function on each. How is that different from using a foreach loop like this?
foreach(var entry in _objDictionary.Keys)
{
if (entry is fooObject)
{
entry.LaunchMissles();
}
}
EDIT: The resounding opinion appears to be that there is no functional difference.
This is a good example of abusing LINQ: the statement did not become more readable or better in any other way, but some people just like to put LINQ everywhere. In this case, though, you can take the best of both worlds by doing:
foreach(var entry in _objDictionary.Keys.OfType<FooObject>())
{
entry.LaunchMissles();
}
Note that in your foreach example you are missing a cast to FooObject to invoke LaunchMissles.
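As a rough illustration of why OfType removes the need for the cast, here is a small self-contained sketch. The FooObject class, its Launched flag and the dictionary contents are assumptions made up for the example; only OfType<T> itself comes from LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical key type; the Launched flag exists only so the effect can be observed.
public class FooObject
{
    public bool Launched { get; private set; }
    public void LaunchMissles() => Launched = true;
}

public static class OfTypeDemo
{
    // Returns the number of FooObject keys that were launched.
    public static int Run()
    {
        var dictionary = new Dictionary<object, string>
        {
            [new FooObject()] = "a",
            ["not a foo"] = "b",
            [new FooObject()] = "c",
        };

        // OfType<T> keeps only the keys whose runtime type is T and yields them
        // already typed as T, so no "is" check or explicit cast is needed.
        foreach (var foo in dictionary.Keys.OfType<FooObject>())
        {
            foo.LaunchMissles();
        }

        return dictionary.Keys.OfType<FooObject>().Count(f => f.Launched);
    }

    public static void Main()
    {
        // Prints "2": the string key is skipped, both FooObject keys are launched.
        Console.WriteLine(Run());
    }
}
```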
In general, LINQ is not voodoo magic; under the hood it does the same work you would have to write yourself if you weren't using it. LINQ just makes things easier to write, but it won't beat regular code performance-wise (if the two really are equivalent).
In your case, your "old-school" approach is perfectly fine and, in my opinion, preferable:
foreach(var entry in _objDictionary.Keys)
{
fooObject foo = entry as fooObject;
if (foo != null)
{
foo.LaunchMissles();
}
}
Regarding the LINQ approach:
Materializing the sequence to a List just to call a method on it that does the same as the code above only wastes resources and makes the code less readable.
In your example it doesn't make a difference, but if the source weren't a collection (like Dictionary.Keys is) but an IEnumerable that is truly evaluated lazily, the impact could be huge.
Lazy evaluation is designed to yield items when needed; calling ToList in between would first gather all items before actually executing the ForEach.
The plain foreach approach, by contrast, would get one item, process it, then get the next, and so on.
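The difference in ordering can be sketched with a lazy source that logs each item as it is produced. StreamingDemo and its Produce iterator are hypothetical names invented for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StreamingDemo
{
    // A lazily evaluated source that records each item as it is produced.
    private static IEnumerable<int> Produce(List<string> log)
    {
        for (int i = 1; i <= 3; i++)
        {
            log.Add($"produce {i}");
            yield return i;
        }
    }

    // ToList drains the whole source first; ForEach only runs afterwards.
    public static string WithToList()
    {
        var log = new List<string>();
        Produce(log).ToList().ForEach(i => log.Add($"process {i}"));
        return string.Join(", ", log);
    }

    // A plain foreach pulls one item, processes it, then pulls the next.
    public static string WithForeach()
    {
        var log = new List<string>();
        foreach (int i in Produce(log))
        {
            log.Add($"process {i}");
        }
        return string.Join(", ", log);
    }

    public static void Main()
    {
        // produce 1, produce 2, produce 3, process 1, process 2, process 3
        Console.WriteLine(WithToList());
        // produce 1, process 1, produce 2, process 2, produce 3, process 3
        Console.WriteLine(WithForeach());
    }
}
```

With three items the difference is cosmetic, but with a large or slow source the ToList version holds everything in memory before the first item is processed.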
If you really want to use a "LINQ ForEach", then don't use the List implementation but roll your own extension method (as mentioned in the comments below your question):
public static class EnumerableExtensionMethods
{
public static void ForEach<T>(this IEnumerable<T> sequence, Action<T> action)
{
foreach(T item in sequence)
action(item);
}
}
Even then, a regular foreach should still be preferred, unless you put the foreach body into a separate method:
sequence.ForEach(_methodThatDoesThejob);
That is the only way of using this that is acceptable to me.

Replacing a foreach with LINQ expression

So I have a code like this:
foreach (var optionValues in productOption.ProductOptionValues)
{
if (optionValues.ProductOptionValueID > 0)
{
unitOfWork.ProductContext.Entry(optionValues).State = EntityState.Modified;
}
else
{
unitOfWork.ProductContext.Entry(optionValues).State = EntityState.Added;
}
}
The code review for this was that I should look at using LINQ to do this.
Can someone please point me to a resource that can explain using LINQ to change the object properties?
You don't. Simple as that.
The code review for this was that I should look at using LINQ to do this and avoid the foreach.
Tell the code reviewer he is wrong. LINQ is for querying data. You are updating data. Stay with your foreach loop; it's fine.
LINQ is for querying. You are modifying values, so foreach is perfectly fine here.
The best you could do is this:
var query =
from optionValues in productOption.ProductOptionValues
select new
{
entry = unitOfWork.ProductContext.Entry(optionValues),
value = optionValues.ProductOptionValueID > 0
? EntityState.Modified
: EntityState.Added
};
foreach (var x in query)
{
x.entry.State = x.value;
}
But I don't think that this really gives you much in terms of readability.
The only reasonable use of LINQ here (and it depends on the type of ProductOptionValues) is to filter the results using Where, which essentially replaces your if statement, but it's not better than your current code:
foreach (var option in productOption.ProductOptionValues.Where(x => x.ProductOptionValueID > 0))
unitOfWork.ProductContext.Entry(option).State = EntityState.Modified;
foreach (var option in productOption.ProductOptionValues.Where(x => x.ProductOptionValueID <= 0))
unitOfWork.ProductContext.Entry(option).State = EntityState.Added;
You should not use a LINQ-style ForEach extension. Let me explain why:
A ForEach method violates the functional programming principles that all the other sequence operators are based upon.
Clearly the sole purpose of a call to this method is to cause side effects. The purpose of an expression is to compute a value, not to cause a side effect.
The purpose of a statement is to cause a side effect. The call site of this thing would look an awful lot like an expression.
The second reason is that using it adds zero representational value to your code. Doing this lets you rewrite this perfectly clear code:
foreach(Foo foo in foos){ statement involving foo; }
into this code:
foos.ForEach((Foo foo)=>{ statement involving foo; });
which uses almost exactly the same characters in slightly different order. And yet the second version is harder to understand, harder to debug, and introduces closure semantics, thereby potentially changing object lifetimes in subtle ways.
The above is in part a summary of a blog post by Eric Lippert. Read the full post here.
What is more, the List<T>.ForEach method has been removed by the BCL team in Windows 8:
List.ForEach has been removed in Metro style apps. While the method seems simple it has a number of potential problems when the list gets mutated by the method passed to ForEach.
Instead it is recommended that you simply use a foreach loop.
Assuming productOption.ProductOptionValues is a List<T> (if it isn't, you would need to call .ToList() before the .ForEach, since ForEach is defined on List<T> and not on IList<T>), it would be something like this:
productOption.ProductOptionValues.ForEach(x =>
unitOfWork.ProductContext.Entry(x).State = (
(x.ProductOptionValueID > 0) ? EntityState.Modified : EntityState.Added)
);
... but I don't think that's really an improvement. Quite the opposite, in fact.
Really, don't do this.
Just for fun: it is possible in several tricky ways, such as:
var sum = productOption.ProductOptionValues.Select(
optionValues => (int)(unitOfWork.ProductContext.Entry(optionValues).State = (optionValues.ProductOptionValueID > 0 ? EntityState.Modified : EntityState.Added))).Sum();

Linq ForEach vs All Performance review

Most of the time I use All (returning true) instead of ForEach. Is it good practice to use All instead of ForEach all the time (in the case of an IEnumerable)? I understand that All can be run on any IEnumerable, whereas ForEach runs only on a List.
var wells = GlobalDataModel.WellList.Where(u => u.RefProjectName == project.OldProjectName);
if (wells.Any())
{
wells.All(u =>
{
u.RefProjectName = project.ProjectName;
return true;
});
}
var wellsList = GlobalDataModel.WellList.Where(u => u.RefProjectName == project.OldProjectName).ToList();
wellsList.ForEach(u => u.RefProjectName = project.ProjectName);
Nope, you're abusing the All method. Take a look at the documentation:
Determines whether all elements of a sequence satisfy a condition.
It should be used to determine whether all elements satisfy some condition. It is not meant to be used to produce side effects.
List.ForEach is meant to be used for side effects. You may use it if you already have a List<T> up front. Calling ToList and creating a new List just for the sake of List.ForEach is not worth it; it adds another O(n) operation.
In short, don't use All for side effects; List.ForEach is barely acceptable when you already have a list. The recommended way is to use the loop of your choice; nothing can be better than that.
Eric Lippert has something to say about ForEach. Note that it has been removed in Modern UI apps and may be removed in the desktop version of .NET too.
If you're checking whether all elements satisfy some condition, use All.
But if you need to perform some operation on each element, don't use All or any other LINQ operator (Where, Select, Any, etc.), as they are meant to be used in a purely functional way. You can iterate over the elements using a foreach loop or, if you prefer, the List<T>.ForEach method. However, as you mention, it is part of List<T>, which makes your code slightly harder to change (e.g. from a list to another enumerable).
See here for a discussion about the "right way" to use LINQ.
For example, you can write your code like this:
foreach (var u in GlobalDataModel.WellList
.Where(u => u.RefProjectName == project.OldProjectName))
{
u.RefProjectName = project.ProjectName;
}
It's more obvious that side effects are being done. Also, this will only iterate once over the sequence, skipping the elements that don't satisfy the condition.

Change foreach to lambda

I need help simplifying this statement. How can I change the foreach to a lambda?
var r = mp.Call(c => c.GetDataset()); // returns IEnumerable of dataset
foreach (DatasetUserAppsUsage item in r)
{
datasetUserAppsUsage.Merge(item.AppsUsageSummary);
}
Lambdas and loops are orthogonal. It is inappropriate to try to brute-force one into the other. That code is fine. Leave it.
You can get .ForEach implementations, but it isn't going to make the code better (in fact, it will be harder to follow, i.e. worse), and it won't be more efficient (in fact, it will be marginally slower, i.e. worse).
You can do the following
r.ToList().ForEach(item => datasetUserAppsUsage.Merge(item.AppsUsageSummary));
Personally, I don't think I would merge this into a single lambda. You could do:
mp.Call(c => c.GetDataset()).ToList().ForEach(item => datasetUserAppsUsage.Merge(item.AppsUsageSummary));
However, I would avoid it, as it's purposefully causing side effects, which really violates the expectations of LINQ, and is not very clear in its intent.
I agree that the purpose of lambdas is different, but sometimes I use this trick:
mp.Call(c => c.GetDataset())
.All(a => { datasetUserAppsUsage.Merge(a.AppsUsageSummary); return true; });
The trick is to use All() and return true to avoid break.
And do not change the underlying collection when inside enumerator of course :)
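To see why the lambda has to return true, here is a small sketch of how All short-circuits. AllShortCircuitDemo and its method are hypothetical names for this example; the behaviour shown is that of Enumerable.All on an in-memory array.

```csharp
using System;
using System.Linq;

public static class AllShortCircuitDemo
{
    // Counts how many elements the lambda is invoked for when it always returns `result`.
    public static int VisitedWhenLambdaReturns(bool result)
    {
        var items = new[] { 1, 2, 3, 4 };
        int visited = 0;

        // All stops at the first element for which the predicate is false,
        // so a side effect placed inside it silently skips the remaining elements.
        items.All(i => { visited++; return result; });

        return visited;
    }

    public static void Main()
    {
        Console.WriteLine(VisitedWhenLambdaReturns(true));  // prints 4
        Console.WriteLine(VisitedWhenLambdaReturns(false)); // prints 1
    }
}
```

A single code path that returns false by accident is enough to leave the rest of the sequence untouched, which is one more reason to prefer a plain foreach for side effects.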

Should I use a simple foreach or Linq when collecting data out of a collection

For a simple case, where class foo has a member i, and I have a collection of foos, say IEnumerable<Foo> foos, and I want to end up with a collection of foo's member i, say List<TypeOfi> result.
Question: is it preferable to use a foreach (Option 1 below), some form of LINQ (Option 2 below), or some other method? Or is it perhaps not even worth concerning myself with (just choose my personal preference)?
Option 1:
foreach (Foo foo in foos)
result.Add(foo.i);
Option 2:
result.AddRange(foos.Select(foo => foo.i));
To me, Option 2 looks cleaner, but I'm wondering if LINQ is too heavy-handed for something that can be achieved with such a simple foreach loop.
Looking for all opinions and suggestions.
I prefer the second option over the first. However, unless there is a reason to pre-create the List<T> and use AddRange, I would avoid it. Personally, I would use:
List<TypeOfi> results = foos.Select(f => f.i).ToList();
In addition, I would not necessarily even use ToList() unless you actually need a true List<T>, or need to force the execution to be immediate instead of deferred. If you just need the collection of "i" values to iterate, I would simply use:
var results = foos.Select(f => f.i);
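The practical difference between the deferred query and the ToList() version can be sketched as follows. The Foo class and DeferredVsSnapshotDemo are hypothetical names for this illustration; the member i is taken from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type with the member "i" from the question.
public class Foo
{
    public int i { get; set; }
}

public static class DeferredVsSnapshotDemo
{
    public static (int deferredCount, int snapshotCount) Run()
    {
        var foos = new List<Foo> { new Foo { i = 1 }, new Foo { i = 2 } };

        // Deferred: nothing is read from foos until the query is enumerated.
        IEnumerable<int> deferred = foos.Select(f => f.i);

        // Immediate: ToList enumerates now and stores a snapshot of the values.
        List<int> snapshot = foos.Select(f => f.i).ToList();

        // Change the source after both were created.
        foos.Add(new Foo { i = 3 });

        // The deferred query sees the new element; the snapshot does not.
        return (deferred.Count(), snapshot.Count);
    }

    public static void Main()
    {
        var (deferredCount, snapshotCount) = Run();
        // Prints "deferred: 3, snapshot: 2"
        Console.WriteLine($"deferred: {deferredCount}, snapshot: {snapshotCount}");
    }
}
```

Note that the deferred query also re-runs the projection every time it is enumerated, so ToList() is the safer choice when the results are read more than once.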
I definitely prefer the second. It is far more declarative and easier to understand (to me, at least).
LINQ is here to make our lives more declarative so I would hardly consider it heavy handed even in cases as seemingly "trivial" as this.
As Reed said, though, you could improve the quality by using:
var result = foos.Select(f => f.i).ToList();
As long as there is no data already in the result collection.
LINQ isn't heavy-handed in any way. Both the foreach and the LINQ code do about the same thing; in the second case the foreach is just hidden away.
It really is just a matter of preference, at least for LINQ to Objects. If your source collection is a LINQ to Entities query or something similar, it is a completely different case: the second option would push the query to the database, which is much more efficient. In this simple case the difference probably won't be that large, but if you add a Where operator or others and make the query non-trivial, the LINQ query will most likely perform better.
I think you could also just do
foos.Select(foo => foo.i).ToList<TypeOfi>();
