Calling Select inside of List<> extended method in C# - c#

Just wondering why a Select call won't execute if it's called inside of an extension method?
Or is it maybe that I'm thinking Select does one thing, while its purpose is something different?
Code Example:
var someList = new List<SomeObject>();
int triggerOn = 5;
/* list gets populated*/
someList.MutateList(triggerOn, "Add something", true);
MutateList method declaration:
public static class ListExtension
{
public static IEnumerable<SomeObject> MutateList(this IEnumerable<SomeObject> objects, int triggerOn, string attachment, bool shouldSkip = false)
{
return objects.Select(obj =>
{
if (obj.ID == triggerOn)
{
if (shouldSkip) shouldSkip = false;
else obj.Name += $" {attachment}";
}
return obj;
});
}
}
The solution without Select works. I'm just doing a foreach instead.
I know that the Select method has a summary saying: "Projects each element of a sequence into a new form." But if that were true, then wouldn't my code example be showing errors?
Solution that I used (Inside of the MutateList method):
foreach(SomeObject obj in objects)
{
if (obj.ID == triggerOn)
{
if (shouldSkip) shouldSkip = false;
else obj.Name += $" {attachment}";
}
}
return objects;

Select uses deferred execution, meaning that it does not actually execute until you iterate over the results, for example with a foreach loop or with LINQ methods that need the actual results, such as ToList or Sum.
Also, it returns an iterator rather than modifying the items in place, and your calling code does not capture the return value.
For those reasons, I would recommend not using Select to mutate the objects in the list. You're just wrapping a foreach loop in a less clean way. I would just use foreach within the method.
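To make the deferred behaviour concrete, here is a minimal sketch. It assumes a stripped-down SomeObject with only ID and Name, and the DeferredSelectDemo class is a hypothetical name used purely for illustration. It records the name before and after the query is enumerated:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the SomeObject type from the question.
public class SomeObject
{
    public int ID { get; set; }
    public string Name { get; set; } = "";
}

public static class DeferredSelectDemo
{
    // Returns the name observed before and after the query is enumerated.
    public static (string Before, string After) Run()
    {
        var someList = new List<SomeObject> { new SomeObject { ID = 5, Name = "Item" } };

        // Building the query does not run the lambda.
        IEnumerable<SomeObject> query = someList.Select(obj =>
        {
            obj.Name += " Add something";
            return obj;
        });

        string before = someList[0].Name; // the lambda has not run yet

        // Enumerating the query (here via ToList) runs the lambda once per element.
        query.ToList();
        string after = someList[0].Name;

        return (before, after);
    }

    public static void Main()
    {
        var (before, after) = Run();
        Console.WriteLine($"before: '{before}', after: '{after}'");
    }
}
```

The mutation only shows up after ToList forces enumeration, which is why calling the extension method and discarding its result appears to do nothing.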

Related

Variable assignment to first item in IEnumerable does not work

I am attempting to assign the value true to a field in my collection of objects. I am using the First() method to retrieve the first object, and assign to it. In this example, I am assigning the value true to the Show variable. However, immediately after the assignment, it appears that Show variable is still false:
public class CallerItem
{
public int IndexId;
public string PhoneNumber;
public bool ToInd;
public bool Show;
}
public void myFunc() {
var callers = dbCallerRecs.Select(x => new CallerItem() { IndexId = x.IndexId, PhoneNumber = x.PhoneNumber, ToInd = x.ToInd });
var toCallers = callers.Where(x => x.ToInd);
if (toCallers.Any())
{
toCallers.First().Show = true;
Console.WriteLine(toCallers.First().Show); //THIS LOGS 'false'. HOWEVER, IT SHOULD LOG 'true'
}
}
Is there something I am missing? Perhaps my understanding of the references returned from the Where clause is not right?
if (toCallers.Any())
{
toCallers.First().Show = true;
Console.WriteLine(toCallers.First().Show); //THIS LOGS 'false'. HOWEVER, IT SHOULD LOG 'true'
}
Every time you call .First() you are getting the first item. For some enumerables (e.g. IQueryable) it will return a different object every time.
The below code will call the method only once and thus avoid the issue. Note also that I have used FirstOrDefault rather than Any then First - since the former will result in fewer DB queries (i.e. be faster).
var caller = toCallers.FirstOrDefault();
if (caller != null)
{
caller.Show = true;
Console.WriteLine(caller.Show);
}
var callers = dbCallerRecs.Select(x => new CallerItem() { IndexId = x.IndexId, PhoneNumber = x.PhoneNumber, ToInd = x.ToInd });
var toCallers = callers.Where(x => x.ToInd);
defines a query which is evaluated whenever elements of the resulting IEnumerable<CallerItem> (or IQueryable<CallerItem>, which implements IEnumerable<CallerItem>) are iterated. This happens three times in your code: when calling Any, and both times you call First (assuming .Any() returns true).
The reason you see this behaviour is that the two calls to First cause the query to be re-evaluated and a new object to be created for each call, so you're modifying a different object from the one you end up logging.
One solution would be to eagerly evaluate the query:
var toCallers = callers.Where(x => x.ToInd).ToList();
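The difference can be reproduced without a database. The following is only a sketch: the bool array is a hypothetical stand-in for dbCallerRecs, CallerItem is trimmed to two fields, and ReevaluationDemo is an illustrative name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class CallerItem
{
    public bool ToInd;
    public bool Show;
}

public static class ReevaluationDemo
{
    public static (bool Deferred, bool Materialized) Run()
    {
        // Hypothetical stand-in for dbCallerRecs.
        var source = new[] { true, false };

        // Deferred query: every enumeration runs the Select again
        // and therefore creates fresh CallerItem objects.
        IEnumerable<CallerItem> deferred = source
            .Select(toInd => new CallerItem { ToInd = toInd })
            .Where(x => x.ToInd);

        deferred.First().Show = true;                 // mutates a temporary object
        bool deferredResult = deferred.First().Show;  // reads a brand-new object

        // Materialized list: the objects are created once and then reused.
        List<CallerItem> materialized = deferred.ToList();
        materialized.First().Show = true;
        bool materializedResult = materialized.First().Show;

        return (deferredResult, materializedResult);
    }

    public static void Main()
    {
        var (deferred, materialized) = Run();
        Console.WriteLine($"deferred: {deferred}, materialized: {materialized}");
    }
}
```

With the deferred query the flag is lost; after ToList it sticks, because both First calls see the same stored object.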

Combine two expressions using LINQ?

var EXPEarners =
from victor in ins.BattleParticipants
where victor.GetComponent<TotalEXP>() != null
select victor;
foreach (GameObject victor in EXPEarners)
{
victor.GetComponent<TotalEXP>().value += EXPGain;
}
I'm new to LINQ and I would like some help. Is there a way to combine these two blocks of code so I don't have to call GetComponent() twice? (I'm using Unity.) Perhaps introduce a temporary variable and use a foreach loop instead? But the whole purpose of using LINQ was to avoid the foreach.
Also, is there a way to inject methods in between the LINQ statements, like call a void method before I select the final result, in case I want to do something "in between?"
There are a number of ways you could do this, but one small alteration to your query would get you to a single call:
First, get rid of the null check and simply return a map of victor and component:
var EXPEarners =
from victor in ins.BattleParticipants
select new {
victor,
component = victor.GetComponent<TotalEXP>()
};
Then, loop over each pair, adding the experience points if the component isn't null:
foreach (var participant in EXPEarners)
{
// can do something with participant.victor here
if (participant.component != null)
participant.component.value += EXPGain;
}
You could of course shorten this code up quite a bit, but if you do need to do something in between, you have the opportunity.
You could try this alternative:
// Do something for each item in the list. Note that All stops at the first item for which Reward returns false.
ins.BattleParticipants.All(gameObject => Reward(gameObject, EXPGain));
Then you write a method to perform "Reward", which can be as complex as you like:
static bool Reward(GameObject gameObject, int EXPGain)
{
TotalEXP exp = gameObject.GetComponent<TotalEXP>();
if (exp != null)
{
exp.value += EXPGain;
return true;
}
return false;
}
And if you want, you can chain these, so for example you can also call a "Bonus" for all those you rewarded (where Reward returned true)
// Reward all EXPGain in the list then give them a Bonus
ins.BattleParticipants.Where(gameObject => Reward(gameObject, EXPGain)).All(gameObject => Bonus(gameObject, BONGain));
Then you write a method to perform "Bonus"
static bool Bonus(GameObject gameObject, int BONGain)
{
SomeOther soc = gameObject.GetComponent<SomeOther>();
if (soc != null)
{
soc.value += BONGain;
return true;
}
return false;
}
If you only want to increment the TotalEXP value and you don't use the retrieved GameObject anywhere else, you can use let and retrieve the collection of TotalEXP components:
var TotalEXPs =
from victor in ins.BattleParticipants
let component = victor.GetComponent<TotalEXP>()
where component != null
select component;
foreach (TotalEXP exp in TotalEXPs)
{
exp.value += EXPGain;
}
Otherwise, see @Cᴏʀʏ's answer, which shows how to retrieve both the GameObject and its TotalEXP.
Try searching for the "let" clause in LINQ. Maybe it can help you.

ToList method in Linq

If I am not wrong, the ToList() method iterates over each element of the provided collection, adds them to a new instance of List and returns this instance. Consider an example:
//using linq
list = Students.Where(s => s.Name == "ABC").ToList();
//traditional way
foreach (var student in Students)
{
if (student.Name == "ABC")
list.Add(student);
}
I think the traditional way is faster, as it loops only once, whereas the LINQ version above iterates twice: once for the Where method and then again for the ToList() method.
The project I am working on now makes extensive use of Lists all over, and I see a lot of this kind of use of ToList() and other methods. These could be improved as above if I declare the list variable as IEnumerable, remove .ToList() and use it further as IEnumerable.
Do these things make any impact on performance?
Do these things make any impact on performance?
That depends on your code. Most of the time, using LINQ does cause a small performance hit. In some cases, this hit can be significant for you, but you should avoid LINQ only when you know that it is too slow for you (i.e. if profiling your code showed that LINQ is the reason why your code is slow).
But you're right that using ToList() too often can cause significant performance problems. You should call ToList() only when you have to. Be aware that there are also cases where adding ToList() can improve performance a lot (e.g. when the collection is loaded from database every time it's iterated).
Regarding the number of iterations: it depends on what exactly you mean by “iterates twice”. If you count the number of times MoveNext() is called on some collection, then yes, using Where() this way leads to iterating twice. The sequence of operations goes like this (to simplify, I'm going to assume that all items match the condition):
Where() is called, no iteration for now, Where() returns a special enumerable.
ToList() is called, calling MoveNext() on the enumerable returned from Where().
Where() now calls MoveNext() on the original collection and gets the value.
Where() calls your predicate, which returns true.
MoveNext() called from ToList() returns, ToList() gets the value and adds it to the list.
…
What this means is that if all n items in the original collection match the condition, MoveNext() will be called 2n times, n times from Where() and n times from ToList().
var list = Students.Where(s=>s.Name == "ABC");
This only creates a query; it does not loop over the elements until the query is used. Calling ToList() executes the query, and thus loops over your elements only once.
List<Student> studentList = new List<Student>();
var list = Students.Where(s=>s.Name == "ABC");
foreach(Student s in list)
{
studentList.Add(s);
}
This example will also only iterate once, because the query is only used once. Keep in mind that list will iterate over all students every time it's enumerated, not just those whose names are ABC, since it's a query.
And for the later discussion I've made a test example. Perhaps it's not the very best implementation of IEnumerable, but it does what it's supposed to do.
First we have our list
public class TestList<T> : IEnumerable<T>
{
private TestEnumerator<T> _Enumerator;
public TestList()
{
_Enumerator = new TestEnumerator<T>();
}
public IEnumerator<T> GetEnumerator()
{
return _Enumerator;
}
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
throw new NotImplementedException();
}
internal void Add(T p)
{
_Enumerator.Add(p);
}
}
And since we want to count how many times MoveNext is called, we have to implement our own custom enumerator as well. Note that MoveNext increments a counter that is a static field in our program.
public class TestEnumerator<T> : IEnumerator<T>
{
public Item<T> FirstItem = null;
public Item<T> CurrentItem = null;
public TestEnumerator()
{
}
public T Current
{
get { return CurrentItem.Value; }
}
public void Dispose()
{
}
object System.Collections.IEnumerator.Current
{
get { throw new NotImplementedException(); }
}
public bool MoveNext()
{
Program.Counter++;
if (CurrentItem == null)
{
CurrentItem = FirstItem;
return true;
}
if (CurrentItem != null && CurrentItem.NextItem != null)
{
CurrentItem = CurrentItem.NextItem;
return true;
}
return false;
}
public void Reset()
{
CurrentItem = null;
}
internal void Add(T p)
{
if (FirstItem == null)
{
FirstItem = new Item<T>(p);
return;
}
Item<T> lastItem = FirstItem;
while (lastItem.NextItem != null)
{
lastItem = lastItem.NextItem;
}
lastItem.NextItem = new Item<T>(p);
}
}
And then we have a custom item that just wraps our value
public class Item<T>
{
public Item(T item)
{
Value = item;
}
public T Value;
public Item<T> NextItem;
}
To use the actual code we create a "list" with 3 entries.
public static int Counter = 0;
static void Main(string[] args)
{
TestList<int> list = new TestList<int>();
list.Add(1);
list.Add(2);
list.Add(3);
var v = list.Where(c => c == 2).ToList(); //will use MoveNext 4 times
var v2 = list.Where(c => true).ToList(); //will also use MoveNext 4 times
List<int> tmpList = new List<int>(); //And the loop in OP question
foreach(var i in list)
{
tmpList.Add(i);
} //Also 4 times.
}
And the conclusion? How does it affect performance?
MoveNext is called n+1 times in this case, regardless of how many items we have.
The Where clause does not matter either: MoveNext still runs 4 times, because we always run our query on our initial list.
The only performance hit we take is from the LINQ framework itself and its calls. The actual loops made will be the same.
And before anyone asks why it's n+1 times and not n times: it's because MoveNext returns false the last time, when it is out of elements. That makes it the number of elements plus one for the end of the list.
To answer this completely, it depends on the implementation. If you are talking about LINQ to SQL/EF, there will be only one iteration in this case when .ToList is called, which internally calls .GetEnumerator. The query expression is then parsed into TSQL and passed to the database. The resulting rows are then iterated over (once) and added to the list.
In the case of LINQ to Objects, there is only one pass through the data as well. The use of yield return in the where clause sets up a state machine internally which keeps track of where the process is in the iteration. Where does NOT do a full iteration creating a temporary list and then passing those results to the rest of the query. It just determines if an item meets a criteria and only passes on those that match.
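The single-pass, streaming behaviour described above can be sketched with a hand-written filter. MyWhere is a hypothetical, simplified stand-in written with yield return; it is not the actual framework source, and the log parameter exists only to record the order of events.

```csharp
using System;
using System.Collections.Generic;

public static class StreamingWhereDemo
{
    // Simplified, hypothetical stand-in for Where: an iterator block that
    // tests one item at a time and passes matches straight to the consumer.
    public static IEnumerable<T> MyWhere<T>(
        IEnumerable<T> source, Func<T, bool> predicate, List<string> log)
    {
        foreach (T item in source)
        {
            log.Add($"test {item}");
            if (predicate(item))
                yield return item;
        }
    }

    // Returns the order in which items were tested and consumed.
    public static List<string> Run()
    {
        var log = new List<string>();
        var numbers = new List<int> { 1, 2, 3 };

        foreach (int n in MyWhere(numbers, x => x != 2, log))
            log.Add($"consume {n}");

        return log;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run()));
    }
}
```

The log interleaves "test" and "consume" entries, showing that each matching item reaches the consumer before the next source item is examined. No intermediate list is built.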
First of all, Why are you even asking me? Measure for yourself and see.
That said, Where, Select, OrderBy and the other LINQ IEnumerable extension methods, in general, are implemented as lazy as possible (the yield keyword is used often). That means that they do not work on the data unless they have to. From your example:
var list = Students.Where(s => s.Name == "ABC");
won't execute anything. This will return immediately even if Students is a list of 10 million objects. The predicate won't be called at all until the result is actually requested somewhere, and that is practically what ToList() does: It says "Yes, the results - all of them - are required immediately".
There is however, some initial overhead in calling of the LINQ methods, so the traditional way will, in general, be faster, but composability and the ease-of-use of the LINQ methods, IMHO, more than compensate for that.
If you like to take a look at how these methods are implemented, they are available for reference from Microsoft Reference Sources.

Looking for non-type-specific method of handling Generic Collections in c#

My situation is this. I need to run some validation and massage type code on multiple different types of objects, but for cleanliness (and code reuse), I'd like to make all the calls to this validation look basically the same regardless of object. I am attempting to solve this through overloading, which works fine until I get to Generic Collection objects.
The following example should clarify what I'm talking about here:
private string DoStuff(string tmp) { ... }
private ObjectA DoStuff(ObjectA tmp) { ... }
private ObjectB DoStuff(ObjectB tmp) { ... }
...
private Collection<ObjectA> DoStuff(Collection<ObjectA> tmp) {
foreach (ObjectA obj in tmp) if (DoStuff(obj) == null) tmp.Remove(obj);
if (tmp.Count == 0) return null;
return tmp;
}
private Collection<ObjectB> DoStuff(Collection<ObjectB> tmp) {
foreach (ObjectB obj in tmp) if (DoStuff(obj) == null) tmp.Remove(obj);
if (tmp.Count == 0) return null;
return tmp;
}
...
This seems like a real waste, as I have to duplicate the exact same code for every different Collection<T> type. I would like to make a single instance of DoStuff that handles any Collection<T>, rather than make a separate one for each.
I have tried using ICollection, but this has two problems: first, ICollection does not expose the .Remove method, and I can't write the foreach loop because I don't know the type of the objects in the list. Using something more generic, like object, does not work because I don't have a method DoStuff that accepts an object - I need it to call the appropriate one for the actual object. Writing a DoStuff method which takes an object and does some kind of huge list of if statements to pick the right method and cast appropriately kind of defeats the whole idea of getting rid of redundant code - I might as well just copy and paste all those Collection<T> methods.
I have tried using a generic DoStuff<T> method, but this has the same problem in the foreach loop. Because I don't know the object type at design time, the compiler won't let me call DoStuff(obj).
Technically, the compiler should be able to tell which call needs to be made at compile time, since these are all private methods, and the specific types of the objects being passed in the calls are all known at the point the method is being called. That knowledge just doesn't seem to bubble up to the later methods being called by this method.
I really don't want to use reflection here, as that makes the code even more complicated than just copying and pasting all the Collection<T> methods, and it creates a performance slowdown. Any ideas?
---EDIT 1---
I realized that my generic method references were not displaying correctly, because I had not used the html codes for the angle brackets. This should be fixed now.
---EDIT 2---
Based on a response below, I have altered my Collection<T> method to look like this:
private Collection<T> DoStuff<T>(Collection<T> tmp) {
for (int i = tmp.Count - 1; i >= 0; i--) if (DoStuff(tmp[i]) == null) tmp.RemoveAt(i);
if (tmp.Count == 0) return null;
return tmp;
}
This still does not work, however, as the compiler cannot figure out which overloaded method to call when I call DoStuff(tmp[i]).
You need to pass the method you want to call into the generic method as a parameter. That way the overload resolution happens at a point where the compiler knows what types to expect.
Alternatively, you need to make the per-item DoStuff method generic (or object) to support any possible item in the collection.
(I also separated the RemoveItem call from the first loop, so that it isn't trying to remove an item from the same list being iterated.)
private Collection<T> DoStuff<T>(Collection<T> tmp, Func<T, T> stuffDoer)
{
// Collect the items whose processed result is null (not the null results themselves).
var removeList = tmp
.Where(v => stuffDoer(v) == null)
.ToList();
foreach (var removeItem in removeList) tmp.Remove(removeItem);
if (tmp.Count == 0) return null;
return tmp;
}
private class ObjectA { }
private class ObjectB { }
private string DoStuff(string tmp) { return tmp; }
private ObjectA DoStuff(ObjectA tmp) { return tmp; }
private ObjectB DoStuff(ObjectB tmp) { return tmp; }
Call using this code:
var x = new Collection<ObjectA>
{
new ObjectA(),
new ObjectA(),
null
};
var result = DoStuff(x, DoStuff);
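As a variation on the same idea, here is a sketch that combines the delegate parameter with the backward index loop from the question's second edit. The Validator class, the Valid field and the bodies of the per-item overloads are assumptions made only so the example can run.

```csharp
using System;
using System.Collections.ObjectModel;

// Hypothetical item type; Valid is an assumed field used for illustration.
public class ObjectA
{
    public bool Valid = true;
}

public static class Validator
{
    // Per-item overloads: return null to signal "remove this item".
    public static string DoStuff(string tmp) =>
        string.IsNullOrWhiteSpace(tmp) ? null : tmp.Trim();

    public static ObjectA DoStuff(ObjectA tmp) =>
        tmp != null && tmp.Valid ? tmp : null;

    // Generic collection handler. The caller supplies the per-item method,
    // so the overload is chosen at the call site where T is already known.
    public static Collection<T> DoStuff<T>(Collection<T> tmp, Func<T, T> stuffDoer)
        where T : class
    {
        // Loop backward so RemoveAt never shifts an index we still need.
        for (int i = tmp.Count - 1; i >= 0; i--)
            if (stuffDoer(tmp[i]) == null) tmp.RemoveAt(i);

        return tmp.Count == 0 ? null : tmp;
    }

    public static void Main()
    {
        var items = new Collection<ObjectA>
        {
            new ObjectA(),
            new ObjectA { Valid = false },
            null
        };
        Collection<ObjectA> result = DoStuff(items, DoStuff);
        Console.WriteLine(result.Count);
    }
}
```

Because T is inferred from the collection argument, the method group DoStuff is converted to Func<ObjectA, ObjectA> and binds to the ObjectA overload without any reflection or type switching.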
Something like this?:
private Collection DoStuff<T>(Collection tmp)
{
// This will probably throw, as you are modifying a collection while looping over it.
foreach (T obj in tmp) if (DoStuff(obj) == null) tmp.Remove(obj);
if (tmp.Count == 0) return null;
return tmp;
}
Where T is the type of the object in the collection.
Please note that you have a line that will most likely throw. So:
private Collection DoStuff<T>(Collection tmp)
{
// foreach doesn't work if you are modifying the collection.
// Looping backward with an index, so we never encounter an invalid index.
for (int i = tmp.Count - 1; i >= 0; i--) if (DoStuff(tmp[i]) == null) tmp.Remove(tmp[i]);
if (tmp.Count == 0) return null;
return tmp;
}
But at this point... Why make it generic, since you are not using T anymore?
private Collection DoStuff(Collection tmp)
{
// DoStuff can be generic, but you shouldn't need to explicitly pass it a type...
for (int i = tmp.Count - 1; i >= 0; i--) if (DoStuff(tmp[i]) == null) tmp.Remove(tmp[i]);
if (tmp.Count == 0) return null;
return tmp;
}

How does LINQ defer execution when in a using statement

Imagine I have the following:
private IEnumerable MyFunc(parameter a)
{
using(MyDataContext dc = new MyDataContext())
{
return dc.tablename.Select(row => row.parameter == a);
}
}
private void UsingFunc()
{
var result = MyFunc(new a());
foreach(var row in result)
{
//Do something
}
}
According to the documentation, the LINQ execution is deferred until I actually enumerate the result, which occurs at the foreach line. However, the using statement should force the object to be disposed reliably at the end of the call to MyFunc().
What actually happens? When will the disposer run, and when will the query run?
The only thing I can think of is that the deferred execution is computed at compile time, so the actual call is moved by the compiler to the first line of the foreach, causing the using to perform correctly but not run until the foreach line?
Is there a guru out there who can help?
EDIT: NOTE: This code does work, I just don't understand how.
I did some reading and I realised in my code that I had called the ToList() extension method which of course enumerates the result. The ticked answer's behaviour is perfectly correct for the actual question answered.
Sorry for any confusion.
I would expect that to simply not work; the Select is deferred, so no data has been consumed at this point. However, since you have disposed the data-context (before leaving MyFunc), it will never be able to get data. A better option is to pass the data-context into the method, so that the consumer can choose the lifetime. Also, I would recommend returning IQueryable<T> so that the consumer can "compose" the result (i.e. add OrderBy / Skip / Take / Where etc, and have it impact the final query):
// this could also be an instance method on the data-context
internal static IQueryable<SomeType> MyFunc(
this MyDataContext dc, parameter a)
{
return dc.tablename.Where(row => row.parameter == a);
}
private void UsingFunc()
{
using(MyDataContext dc = new MyDataContext()) {
var result = dc.MyFunc(new a());
foreach(var row in result)
{
//Do something
}
}
}
Update: if you (comments) don't want to defer execution (i.e. you don't want the caller dealing with the data-context), then you need to evaluate the results. You can do this by calling .ToList() or .ToArray() on the result to buffer the values.
private IEnumerable<SomeType> MyFunc(parameter a)
{
using(MyDataContext dc = new MyDataContext())
{
// or ToList() etc
return dc.tablename.Where(row => row.parameter == a).ToArray();
}
}
If you want to keep it deferred in this case, then you need to use an "iterator block":
private IEnumerable<SomeType> MyFunc(parameter a)
{
using(MyDataContext dc = new MyDataContext())
{
foreach(SomeType row in dc
.tablename.Where(row => row.parameter == a))
{
yield return row;
}
}
}
This is now deferred without passing the data-context around.
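The timing of the iterator-block version can be observed with a fake context. This is only a sketch: FakeContext, its Rows method and the log are hypothetical stand-ins for a real data-context, used to record when the resource is opened and disposed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a data-context; it logs when it is opened and disposed.
public sealed class FakeContext : IDisposable
{
    private readonly List<string> _log;

    public FakeContext(List<string> log)
    {
        _log = log;
        _log.Add("open");
    }

    public int[] Rows() => new[] { 1, 2 };

    public void Dispose() => _log.Add("dispose");
}

public static class IteratorUsingDemo
{
    // Iterator block: nothing in here runs until the caller starts enumerating.
    public static IEnumerable<int> MyFunc(List<string> log)
    {
        using (var dc = new FakeContext(log))
        {
            foreach (int row in dc.Rows())
                yield return row;
        }
    }

    public static List<string> Run()
    {
        var log = new List<string>();

        IEnumerable<int> result = MyFunc(log);
        log.Add("after call");   // the context has not been created yet

        foreach (int row in result)
            log.Add($"row {row}");

        log.Add("done");
        return log;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run()));
    }
}
```

The context is opened only when the foreach starts and is disposed once enumeration finishes, so it stays alive for exactly as long as rows are being read.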
I just posted another deferred-execution solution to this problem here, including this sample code:
IQueryable<MyType> MyFunc(string myValue)
{
return from dc in new MyDataContext().Use()
from row in dc.MyTable
where row.MyField == myValue
select row;
}
void UsingFunc()
{
var result = MyFunc("MyValue").OrderBy(row => row.SortOrder);
foreach(var row in result)
{
//Do something
}
}
The Use() extension method essentially acts like a deferred using block.
