Going Through A Foreach When It Can Get Modified? [duplicate] - c#

I want to do a foreach loop while taking out members of that foreach loop, but that's throwing errors. My only idea is to create another list inside of this loop to find which Slices to remove, and loop through the new list to remove items from Pizza.
foreach(var Slice in Pizza)
{
if(Slice.Flavor == "Sausage")
{
Me.Eat(Slice); //This removes an item from the list: "Pizza"
}
}

You can do this, by far the simplest way I have found (like to think I invented it, sure that's not true though ;))
foreach (var Slice in Pizza.ToArray())
{
if (Slice.Flavor == "Sausage") // each to their own.. would have gone for BBQ
{
Me.Eat(Slice);
}
}
..because it's iterating over a fixed copy of the collection. It will visit every item that was in Pizza when the copy was made, even if some of them have since been removed.
Handy isn't it!
(By the way guys, this is a handy way of iterating through a copy of a collection, with thread safety, and removing the time an object is locked: Lock, get the ToArray() copy, release the lock, then iterate)
Hope that helps!
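The lock, copy, release, iterate idea from the parenthetical above could be sketched roughly as follows. This is only an illustration under assumptions: `SliceBox`, `_gate`, `EatAll`, and the minimal `Slice` type are hypothetical names invented here, and the list is assumed to be a `List<Slice>` shared between threads.

```csharp
using System.Collections.Generic;

public class Slice
{
    public string Flavor { get; set; }
}

// Hypothetical wrapper around a shared list; not part of the original question.
public class SliceBox
{
    private readonly object _gate = new object();
    private readonly List<Slice> _pizza = new List<Slice>();

    public void Add(Slice slice)
    {
        lock (_gate) { _pizza.Add(slice); }
    }

    public int Count
    {
        get { lock (_gate) { return _pizza.Count; } }
    }

    // Removes every slice of the given flavor and returns how many were removed.
    public int EatAll(string flavor)
    {
        Slice[] snapshot;
        lock (_gate)
        {
            snapshot = _pizza.ToArray();   // copy while holding the lock
        }

        int eaten = 0;
        foreach (Slice slice in snapshot)  // iterate the copy without holding the lock
        {
            if (slice.Flavor != flavor) continue;
            lock (_gate)
            {
                // Take the lock again only for the short mutation.
                if (_pizza.Remove(slice)) eaten++;
            }
        }
        return eaten;
    }
}
```

The lock is held only for the copy and for each individual removal, so other threads are not blocked for the duration of the whole loop. The trade-off is that the snapshot may be slightly stale by the time an item is processed, which is why the return value of `Remove` is checked.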

If you have to iterate through a list and need to remove items, iterate backwards using a for loop:
// taken from Preet Sangha's answer and modified
for(int i = Pizza.Count-1; i >= 0; i--)
{
var Slice = Pizza[i];
if(Slice.Flavor == "Sausage")
{
Me.Eat(Slice); //This removes an item from the list: "Pizza"
}
}
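The reason to iterate backwards is that removing an element shifts every later element down by one. Going forwards, you would either skip the element that moved into the current index or, if you cached the count, end up accessing Pizza[5] on a Pizza that has only 5 elements because the sixth one was removed (List<T> throws an ArgumentOutOfRangeException in that case). Going backwards, only elements you have already visited get shifted.
The reason to use a for loop is that the index variable i has no relation to the Pizza, so you can modify the Pizza without the enumerator "breaking".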
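As a small self-contained sketch of the backward loop, here is the same idea on a plain `List<string>` of flavors, using `RemoveAt` directly instead of `Me.Eat`. The class and method names are hypothetical and exist only for this example.

```csharp
using System.Collections.Generic;

public static class BackwardRemoval
{
    // Removes every occurrence of 'unwanted' in place, walking from the end.
    public static void RemoveMatches(List<string> flavors, string unwanted)
    {
        for (int i = flavors.Count - 1; i >= 0; i--)
        {
            if (flavors[i] == unwanted)
            {
                // Only elements at indexes greater than i shift down,
                // and those have already been visited.
                flavors.RemoveAt(i);
            }
        }
    }
}
```

Adjacent matches are the case that trips up a naive forward loop; walking backwards handles them without any index adjustment.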

Use a for loop, not a foreach:
for(int i = 0; i < Pizza.Count; ++i)
{
var Slice = Pizza[i];
if(Slice.Flavor == "Sausage")
{
Me.Eat(Slice); //This removes an item from the list: "Pizza"
i--; //The next slice has shifted into index i, so check that index again
}
}

Probably the clearest way to approach this would be to build a list of slices to eat, then process it, avoiding altering the original enumeration in-loop. I've never been a fan of using indexed loops for this, as it can be error-prone.
List<Slice> slicesToEat=new List<Slice>();
foreach(var Slice in Pizza)
{
if(Slice.Flavor == "Sausage")
{
slicesToEat.Add(Slice);
}
}
foreach(var slice in slicesToEat)
{
Me.Eat(slice);
}

Perhaps change your Me.Eat() signature to take an IEnumerable<Slice>
Me.Eat(Pizza.Where(s=>s.Flavor=="Sausage").ToList());
This lets you perform the task in 1 line of code.
Then your Eat() could be like:
public void Eat(IEnumerable<Slice> remove)
{
foreach (Slice r in remove)
{
Pizza.Remove(r);
}
}

The VB6-styled "Collection" object allows for modification during enumeration, and seems to work sensibly when such modifications occur. Too bad it has other limitations (key type is limited to case-insensitive strings) and doesn't support generics, since none of the other collection types allow modification.
Frankly, I'm not clear why Microsoft's IEnumerable contract requires that an exception be thrown if a collection is modified. I would understand a requirement that an exception be thrown if a change to a collection would make it impossible for an enumeration to continue without wackiness (skipping or duplicating values that did not change during enumeration, crashing, etc.) but see no reason not to allow a collection that could enumerate sensibly to do so.

Where can you order pizza where slices have separate toppings? Anyway...
Using Linq:
// Was "Me.Eat()" supposed to be "this.Eat()"?
Pizza
.Where(slice => slice.Flavor == "Sausage")
.ToList()
.ForEach(sausageSlice => Me.Eat(sausageSlice));
The first three lines create a new list with only sausage slices. The fourth takes that new subset and passes each slice to Me.Eat(). The ToList() call matters for two reasons: ForEach is a method of List<T>, not of IEnumerable<T>, and Where on its own is lazy, so without the copy Me.Eat() would be modifying Pizza while it is still being enumerated. This is not the most efficient method because it first makes a copy (as do many other approaches that have been given), but it is certainly clean and readable.
BTW, This is only for posterity as the best answer was already given - iterate backward by index.

What kind of collection is Pizza? If it's a List<T> then you can call the RemoveAll method:
Pizza.RemoveAll(slice => string.Equals(slice.Flavor, "Sausage"));
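For a quick feel of how `List<T>.RemoveAll` behaves, here is a minimal sketch on a list of flavor strings. `RemoveAllDemo` and `EatSausage` are hypothetical names for this example; `RemoveAll` itself takes a predicate and returns the number of elements it removed.

```csharp
using System.Collections.Generic;

public static class RemoveAllDemo
{
    // Removes every "Sausage" entry in place and reports how many were removed.
    public static int EatSausage(List<string> flavors)
    {
        return flavors.RemoveAll(flavor => string.Equals(flavor, "Sausage"));
    }
}
```

This only helps when removing from the list is all that needs to happen; if Me.Eat() has other side effects, one of the loop-based answers is probably a better fit.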

Related

Finding 2-Tuple Combinations of IEnumerable<T> collection, C#

I would like to implement a method, that takes a collection of an unknown Type as a parameter and returns a Collection of 2-tuples which contains all possible distinct combinations from these elements (with no repetition). My Code:
public static IEnumerable<Tuple<T, T>> Get2Combinations<T>(this
IEnumerable<T> col)
{
/*foreach (var item1 in col)
{
col.GetEnumerator().MoveNext();
foreach (var item2 in col)
{
yield return new Tuple<T, T>(item1, item2);
}
}*/
for (int i = 0; i < col.Count(); i++)
{
for (int j = i + 1; j < col.Count(); j++)
{
yield return new Tuple<T, T>(col.ElementAt(i),
col.ElementAt(j));
}
}
}
What I'm doing is taking the first element and pairing it with every other one. Then, using the inner for loop, I loop through all the remaining ones. The problem I see is the method col.ElementAt(i). If we look into the source code, we see that if 'col' is of type IList, it gets the value at the given index directly, but for any other collection it would be very slow.
I attempted to deal with this using foreach loops (the commented section), which are efficient when using IEnumerable, but that part just doesn't work, because the enumerator is common for both inner and outer loop and therefore this produces set of all 2-tuples, where some of them are repeated.
Would anyone give me some suggestions, how to improve this code?
The problem is that IEnumerable is designed to describe a class that you can iterate through (like a stream). It's not intended to support efficient random access (like an array).
Where you use Count() you are forcing the IEnumerable to iterate itself to its end, so in the case of a stream this will wait until the entire stream is read. Of course a stream might not support efficient direct access, or even buffer its content in memory (remember, it just promises to support enumeration), so subsequently calling ElementAt() could force it to re-read from the beginning to the position indicated.
The best way to solve this is to swap from IEnumerable to IList. This means it does support random access; clearly it could still perform poorly, but that's not the responsibility of your function.
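One way to act on that advice while keeping the `IEnumerable<T>` signature is to materialize the input once and index into the resulting list. The sketch below is an assumption about how that could look, not the answerer's own code: the private `Pairs` helper is a hypothetical name, and the choice to copy with `ToList()` when the input is not already an `IList<T>` is a design decision made here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CombinationExtensions
{
    public static IEnumerable<Tuple<T, T>> Get2Combinations<T>(this IEnumerable<T> col)
    {
        if (col == null) throw new ArgumentNullException(nameof(col));
        // Reuse the caller's list when there is one; otherwise enumerate once into a copy.
        IList<T> items = col as IList<T> ?? col.ToList();
        return Pairs(items);
    }

    // Hypothetical helper: yields each unordered pair exactly once.
    private static IEnumerable<Tuple<T, T>> Pairs<T>(IList<T> items)
    {
        for (int i = 0; i < items.Count; i++)
        {
            for (int j = i + 1; j < items.Count; j++)
            {
                yield return Tuple.Create(items[i], items[j]);
            }
        }
    }
}
```

The source is enumerated at most once, and every access after that is a constant-time index lookup. The cost is holding a copy in memory when the caller passes something that is not already a list.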

Is there a way to handle any type of collection, instead of solely relying on Array, List, etc?

This example is for a method called "WriteLines", which takes an array of strings and adds them to an asynchronous file writer. It works, but I am curious if there is an interesting way to support -any- collection of strings, rather than relying on the programmer to convert to an array.
I came up with something like:
public void AddLines(IEnumerable<string> lines)
{
// grab the queue
lock (_queue)
{
// loop through the collection and enqueue each line
for (int i = 0, count = lines.Count(); i < count; i++)
{
_queue.Enqueue(lines.ElementAt(i));
}
}
// notify the thread it has work to do.
_hasNewItems.Set();
}
This appears to work but I have no idea of any performance implications it has, or any logic implications either (What happens to the order? I assume this will allow even unordered collections to work, e.g. HashSet).
Is there a more accepted way to achieve this?
You've been passed an IEnumerable<string> - that means you can iterate over it. Heck, there's even a language feature specifically for it - foreach:
foreach (string line in lines)
{
_queue.Enqueue(line);
}
Unlike your existing approach, this will only iterate over the sequence once. Your current code will behave differently based on the underlying implementation - in some cases Count() and ElementAt are optimized, but in some cases they aren't. You can see this really easily if you use an iterator block and log:
public IEnumerable<string> GetItems()
{
Console.WriteLine("yielding a");
yield return "a";
Console.WriteLine("yielding b");
yield return "b";
Console.WriteLine("yielding c");
yield return "c";
}
Try calling AddLines(GetItems()) with your current implementation, and look at the console...
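To make the difference measurable rather than just visible on the console, the same experiment can be sketched with a counter in place of the log lines. Everything here is a hypothetical illustration: `EnumerationCounter`, `Yielded`, and the two helper methods are invented names, and the static counter is for demonstration only (it is not thread-safe).

```csharp
using System.Collections.Generic;
using System.Linq;

public static class EnumerationCounter
{
    // Counts how many items the iterator has produced (demo only).
    public static int Yielded;

    public static IEnumerable<string> GetItems()
    {
        Yielded++; yield return "a";
        Yielded++; yield return "b";
        Yielded++; yield return "c";
    }

    // Mirrors the question's loop: Count() once, then ElementAt(i) per index.
    public static int YieldsUsingIndexing()
    {
        Yielded = 0;
        IEnumerable<string> lines = GetItems();
        var queue = new Queue<string>();
        for (int i = 0, count = lines.Count(); i < count; i++)
        {
            queue.Enqueue(lines.ElementAt(i));
        }
        return Yielded;
    }

    // Mirrors the suggested fix: a single foreach pass.
    public static int YieldsUsingForeach()
    {
        Yielded = 0;
        var queue = new Queue<string>();
        foreach (string line in GetItems())
        {
            queue.Enqueue(line);
        }
        return Yielded;
    }
}
```

For this three-item iterator the indexing version produces 9 items in total (3 for Count(), then 1, 2, and 3 for the successive ElementAt calls), while the foreach version produces 3.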
Adding this answer as well: since you are using threads, consider a BlockingCollection<string> instead (by default it wraps a ConcurrentQueue), like so:
// the provider method
// _queue = new BlockingCollection<string>()
public void AddLines(IEnumerable<string> lines)
{
foreach (var line in lines)
{
_queue.Add(line);
}
}
No locks required, and allows for multiple consumers and providers since we flag for each element added.
The consumer basically only has to do var workitem = _queue.Take();
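Putting the producer and consumer sides together, a minimal sketch might look like the following. `LineWriter`, `Complete`, and `Drain` are hypothetical names for this example; the calls on `BlockingCollection<T>` (`Add`, `CompleteAdding`, `GetConsumingEnumerable`) are the standard ones. Collecting the lines into a list stands in for whatever the real consumer would do, such as writing to a file.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;

public class LineWriter
{
    private readonly BlockingCollection<string> _queue = new BlockingCollection<string>();

    // Producer side: safe to call from several threads.
    public void AddLines(IEnumerable<string> lines)
    {
        foreach (string line in lines)
        {
            _queue.Add(line);
        }
    }

    // Signals that no more lines will arrive, which lets consumers finish.
    public void Complete()
    {
        _queue.CompleteAdding();
    }

    // Consumer side: blocks while the queue is empty and ends after Complete().
    public List<string> Drain()
    {
        var written = new List<string>();
        foreach (string line in _queue.GetConsumingEnumerable())
        {
            written.Add(line);
        }
        return written;
    }
}
```

A consumer would typically run `Drain()` on its own thread or task; `GetConsumingEnumerable()` replaces the manual event signalling from the original code.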

Does for loop count elements added into itself during the loop?

My question is, that when I loop through a list with for loop, and add elements to it during this, does it count the elements added while looping?
Simple code example:
for (int i = 0; i < listOfIds.Count(); i++) // Does loop counts the items added below?
{
foreach (var post in this.Collection)
{
if (post.ResponsePostID == listOfIds.ElementAt(i))
{
listOfIds.Add(post.PostId); // I add new item to list in here
}
}
}
I hope my explanation is good enough for you to understand what my question is.
Yes, it usually does. But changing a collection at the same time you're iterating over it can lead to weird behavior and hard-to-find bugs. It isn't recommended at all.
If you want the loop to run only over the items that were in the list before the loop started, capture the count first:
int nLstCount = listOfIds.Count();
for (int i = 0; i < nLstCount ; i++)
{
foreach (var post in this.Collection)
{
if (post.ResponsePostID == listOfIds.ElementAt(i))
{
listOfIds.Add(post.PostId);
}
}
}
Yes, it will. The inner foreach loop adds elements to the outer list and thus increments its count:
listOfIds.Count=2 //iteration 1
listOfIds.Add(//element)
when it come to the for loop again
listOfIds.Count=3 //iteration 2
Here is a slightly abridged explanation of the for loop. You're essentially defining the following:
for (initializer; condition; iterator)
body
Your initializer will establish your initial conditions, and will only happen once (effectively outside the loop).
Your condition will be evaluated every time to determine whether your loop should run again, or simply exit.
Your iterator defines an action that will occur after each iteration in your loop.
So in your case, your loop will reevaluate listOfIds.Count(); each time, to decide if it should execute; that may or may not be your desired behaviour.
As Dennis points out, you can get yourself into a bit of a mess (your loop may run infinitely) if you aren't careful.
A much more detailed/better written explanation can be found on msdn: http://msdn.microsoft.com/en-us/library/ch45axte.aspx
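The effect of re-evaluating the condition can be seen in a small sketch that counts iterations. The names `LoopBounds`, `IterationsWithLiveCount`, and `IterationsWithCachedCount` are hypothetical, and the cap of five elements is only there so the first loop terminates.

```csharp
using System.Collections.Generic;

public static class LoopBounds
{
    // The condition reads ids.Count on every pass, so added items are visited too.
    public static int IterationsWithLiveCount()
    {
        var ids = new List<int> { 1, 2 };
        int iterations = 0;
        for (int i = 0; i < ids.Count; i++)
        {
            iterations++;
            if (ids.Count < 5) ids.Add(ids.Count + 1);   // grow until there are 5 items
        }
        return iterations;
    }

    // The count is captured once, so only the original items are visited.
    public static int IterationsWithCachedCount()
    {
        var ids = new List<int> { 1, 2 };
        int initialCount = ids.Count;
        int iterations = 0;
        for (int i = 0; i < initialCount; i++)
        {
            iterations++;
            if (ids.Count < 5) ids.Add(ids.Count + 1);
        }
        return iterations;
    }
}
```

Starting from two items, the live-count loop runs 5 times and the cached-count loop runs 2 times. Without the cap, the first loop would keep growing the list it is walking and never finish.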

Why do we need iterators in c#?

Can somebody provide a real life example regarding use of iterators. I tried searching google but was not satisfied with the answers.
You've probably heard of arrays and containers - objects that store a list of other objects.
But in order for an object to represent a list, it doesn't actually have to "store" the list. All it has to do is provide you with methods or properties that allow you to obtain the items of the list.
In the .NET framework, the interface IEnumerable is all an object has to support to be considered a "list" in that sense.
To simplify it a little (leaving out some historical baggage):
public interface IEnumerable<T>
{
IEnumerator<T> GetEnumerator();
}
So you can get an enumerator from it. That interface (again, simplifying slightly to remove distracting noise):
public interface IEnumerator<T>
{
bool MoveNext();
T Current { get; }
}
So to loop through a list, you'd do this:
var e = list.GetEnumerator();
while (e.MoveNext())
{
var item = e.Current;
// blah
}
This pattern is captured neatly by the foreach keyword:
foreach (var item in list)
// blah
But what about creating a new kind of list? Yes, we can just use List<T> and fill it up with items. But what if we want to discover the items "on the fly" as they are requested? There is an advantage to this, which is that the client can abandon the iteration after the first three items, and they don't have to "pay the cost" of generating the whole list.
To implement this kind of lazy list by hand would be troublesome. We would have to write two classes, one to represent the list by implementing IEnumerable<T>, and the other to represent an active enumeration operation by implementing IEnumerator<T>.
Iterator methods do all the hard work for us. We just write:
IEnumerable<int> GetNumbers(int stop)
{
for (int n = 0; n < stop; n++)
yield return n;
}
And the compiler converts this into two classes for us. Calling the method is equivalent to constructing an object of the class that represents the list.
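The "pay only for what you consume" point can be checked with a counter. This is a sketch under assumptions: `LazyNumbers`, `Generated`, and `FirstThree` are hypothetical names, and the static counter is for demonstration only.

```csharp
using System.Collections.Generic;

public static class LazyNumbers
{
    // Counts how many values the iterator has actually produced (demo only).
    public static int Generated;

    public static IEnumerable<int> GetNumbers(int stop)
    {
        for (int n = 0; n < stop; n++)
        {
            Generated++;
            yield return n;
        }
    }

    // Abandons the iteration after three items.
    public static List<int> FirstThree(int stop)
    {
        var result = new List<int>();
        foreach (int n in GetNumbers(stop))
        {
            result.Add(n);
            if (result.Count == 3) break;
        }
        return result;
    }
}
```

Calling `FirstThree(1000000)` returns 0, 1, 2 and leaves `Generated` at 3: the remaining values are never computed because the loop body of the iterator only runs when the consumer asks for the next item.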
Iterators are an abstraction that decouples the concept of position in a collection from the collection itself. The iterator is a separate object storing the necessary state to locate an item in the collection and move to the next item in the collection. I have seen collections that kept that state inside the collection (i.e. a current position), but it is often better to move that state to an external object. Among other things it enables you to have multiple iterators iterating the same collection.
Simple example : a function that generates a sequence of integers :
static IEnumerable<int> GetSequence(int fromValue, int toValue)
{
if (toValue >= fromValue)
{
for (int i = fromValue; i <= toValue; i++)
{
yield return i;
}
}
else
{
for (int i = fromValue; i >= toValue; i--)
{
yield return i;
}
}
}
To do it without an iterator, you would need to create an array then enumerate it...
Iterate through the students in a class
The Iterator design pattern provides us with a common method of enumerating a list of items or array, while hiding the details of the list's implementation. This provides a cleaner use of the array object and hides unnecessary information from the client, ultimately leading to better code-reuse, enhanced maintainability, and fewer bugs. The iterator pattern can enumerate the list of items regardless of their actual storage type.
Iterate through a set of homework questions.
But seriously, Iterators can provide a unified way to traverse the items in a collection regardless of the underlying data structure.
Read the first two paragraphs here for a little more info.
A couple of things they're great for:
a) For 'perceived performance' while maintaining code tidiness - the iteration of something separated from other processing logic.
b) When the number of items you're going to iterate through is not known.
Although both can be done through other means, with iterators the code can be made nicer and tidier, as someone calling the iterator doesn't need to worry about how it finds the stuff to iterate through...
Real life example: enumerating directories and files, and finding the first [n] that fulfill some criteria, e.g. a file containing a certain string or sequence etc...
Beside everything else, to iterate through lazy-type sequences - IEnumerators. Each next element of such sequence may be evaluated/initialized upon iteration step which makes it possible to iterate through infinite sequences using finite amount of resources...
The canonical and simplest example is that it makes infinite sequences possible without the complexity of having to write the class to do that yourself:
// generate every prime number
public IEnumerator<int> GetPrimeEnumerator()
{
yield return 2;
var primes = new List<int>();
primes.Add(2);
Func<int, bool> IsPrime = n => primes.TakeWhile(
p => p <= (int)Math.Sqrt(n)).FirstOrDefault(p => n % p == 0) == 0;
for (int i = 3; true; i += 2)
{
if (IsPrime(i))
{
yield return i;
primes.Add(i);
}
}
}
Obviously this would not be truly infinite unless you used a BigInteger instead of int, but it gives you the idea.
Writing this code (or similar) for each generated sequence would be tedious and error prone. Iterators do that for you. If the above example seems too complex for you, consider:
// generate every power of a number from start^0 to start^n
public IEnumerator<int> GetPowersEnumerator(int start)
{
yield return 1; // anything ^0 is 1
var x = start;
while(true)
{
yield return x;
x *= start;
}
}
They come at a cost though. Their lazy behaviour means you cannot spot common errors (null parameters and the like) until the generator is first consumed rather than created without writing wrapping functions to check first. The current implementation is also incredibly bad(1) if used recursively.
Writing enumerations over complex structures like trees and object graphs is much easier, as the state maintenance is largely done for you; you simply write code to visit each item and don't worry about getting back to it.
(1) I don't use this word lightly: an O(n) iteration can become O(n^2).
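The "wrapping function" workaround for late error detection usually means splitting the method in two: a plain method that validates immediately, and a private iterator that does the lazy work. A minimal sketch, with hypothetical names (`SafeIterators`, `Repeat`, `RepeatIterator`):

```csharp
using System;
using System.Collections.Generic;

public static class SafeIterators
{
    // Not an iterator method, so the null check runs as soon as it is called.
    public static IEnumerable<string> Repeat(string text, int times)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        return RepeatIterator(text, times);
    }

    // The actual iterator; its body does not run until enumeration begins.
    private static IEnumerable<string> RepeatIterator(string text, int times)
    {
        for (int i = 0; i < times; i++)
        {
            yield return text;
        }
    }
}
```

Had the check lived inside the iterator body, `Repeat(null, 2)` would return normally and the exception would only surface when someone enumerated the result, possibly far from the call that caused it.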
An iterator is an easy way of implementing the IEnumerator interface. Instead of making a class that has the methods and properties required for the interface, you just make a method that returns the values one by one and the compiler creates a class with the methods and properties needed to implement the interface.
If you for example have a large list of numbers, and you want to return a collection where each number is multiplied by two, you can make an iterator that returns the numbers instead of creating a copy of the list in memory:
public IEnumerable<int> GetDouble() {
foreach (int n in originalList) yield return n * 2;
}
In C# 3 you can do something quite similar using extension methods and lambda expressions:
originalList.Select(n => n * 2)
Or using LINQ:
from n in originalList select n * 2
IEnumerator<Question> myIterator = listOfStackOverFlowQuestions.GetEnumerator();
while (myIterator.MoveNext())
{
Question q;
q = myIterator.Current;
if (q.Pertinent == true)
PublishQuestion(q);
else
SendMessage(q.Author.EmailAddress, "Your question has been rejected");
}
foreach (Question q in listOfStackOverFlowQuestions)
{
if (q.Pertinent == true)
PublishQuestion(q);
else
SendMessage(q.Author.EmailAddress, "Your question has been rejected");
}

C# (.Net 2.0) Micro-Optimization Part 2: Finding Contiguous Groups within a grid

I have a very simple function which takes in a matching bitfield, a grid, and a square. It used to use a delegate but I did a lot of recoding and ended up with a bitfield & operation to avoid the delegate while still being able to perform matching within reason. Basically, the challenge is to find all contiguous elements within a grid which match the match bitfield, starting from a specific "leader" square.
Square is a somewhat small (but not tiny) class. Any tips on how to push this to be even faster? Note that the grid itself is pretty small (500 elements in this test).
Edit: It's worth noting that this function is called over 200,000 times per second. In truth, in the long run my goal will be to call it less often, but that's really tough, considering that my end goal is to make the grouping system be handled with scripts rather than being hardcoded. That said, this function is always going to be called more than any other function.
Edit: To clarify, the function does not check if leader matches the bitfield, by design. The intention is that the leader is not required to match the bitfield (though in some cases it will).
Things tried unsuccessfully:
Initializing the dictionary and stack with a capacity.
Casting the int to an enum to avoid a cast.
Moving the dictionary and stack outside the function and clearing them each time they are needed. This makes things slower!
Things tried successfully:
Writing a hashcode function instead of using the default: Hashcodes are precomputed and are equal to x + y * parent.Width. Thanks for the reminder, Jim Mischel.
mquander's Technique: See GetGroupMquander below.
Further Optimization: Once I switched to HashSets, I got rid of the Contains test and replaced it with an Add test. Both Contains and Add are forced to seek a key, so just checking whether an Add succeeds is more efficient than adding only after a Contains check fails. That is, if (RetVal.Add(s)) curStack.Push(s);
public static List<Square> GetGroup(int match, Model grid, Square leader)
{
Stack<Square> curStack = new Stack<Square>();
Dictionary<Square, bool> Retval = new Dictionary<Square, bool>();
curStack.Push(leader);
while (curStack.Count != 0)
{
Square curItem = curStack.Pop();
if (Retval.ContainsKey(curItem)) continue;
Retval.Add(curItem, true);
foreach (Square s in curItem.Neighbors)
{
if (0 != ((int)(s.RoomType) & match))
{
curStack.Push(s);
}
}
}
return new List<Square>(Retval.Keys);
}
=====
public static List<Square> GetGroupMquander(int match, Model grid, Square leader)
{
Stack<Square> curStack = new Stack<Square>();
Dictionary<Square, bool> Retval = new Dictionary<Square, bool>();
Retval.Add(leader, true);
curStack.Push(leader);
while (curStack.Count != 0)
{
Square curItem = curStack.Pop();
foreach (Square s in curItem.Neighbors)
{
if (0 != ((int)(s.RoomType) & match))
{
if (!Retval.ContainsKey(s))
{
curStack.Push(s);
Retval.Add(s, true);
}
}
}
}
return new List<Square>(Retval.Keys);
}
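The HashSet variant described under "Further Optimization" is not shown in the question, so here is a rough sketch of what it might look like. The `Square` class below is a hypothetical stand-in with just the two members the algorithm touches (the real one uses an enum for RoomType), the `Model grid` parameter is omitted because the traversal never reads it, and `HashSet<T>` requires .NET 3.5 or later.

```csharp
using System.Collections.Generic;

// Hypothetical minimal stand-in for the question's Square class.
public class Square
{
    public int RoomType;
    public List<Square> Neighbors = new List<Square>();
}

public static class Grouping
{
    public static List<Square> GetGroupHashSet(int match, Square leader)
    {
        var pending = new Stack<Square>();
        var group = new HashSet<Square>();

        // The leader is included without being checked against the bitfield.
        group.Add(leader);
        pending.Push(leader);

        while (pending.Count != 0)
        {
            Square current = pending.Pop();
            foreach (Square s in current.Neighbors)
            {
                // Add returns false for squares already in the set,
                // so each matching square is pushed at most once.
                if ((s.RoomType & match) != 0 && group.Add(s))
                {
                    pending.Push(s);
                }
            }
        }
        return new List<Square>(group);
    }
}
```

Because membership is recorded at push time rather than at pop time, no square enters the stack twice, and a single hash lookup per neighbor replaces the separate Contains and Add calls.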
The code you posted assumes that the leader square matches the bitfield. Is that by design?
I assume your Square class has implemented a GetHashCode method that's quick and provides a good distribution.
You did say micro-optimization . . .
If you have a good idea how many items you're expecting, you'll save a little bit of time by pre-allocating the dictionary. That is, if you know you won't have more than 100 items that match, you can write:
Dictionary<Square, bool> Retval = new Dictionary<Square, bool>(100);
That will avoid having to grow the dictionary and re-hash everything. You can also do the same thing with your stack: pre-allocate it to some reasonable maximum size to avoid resizing later.
Since you say that the grid is pretty small it seems reasonable to just allocate the stack and the dictionary to the grid size, if that's easy to determine. You're only talking grid_size references each, so memory isn't a concern unless your grid becomes very large.
Adding a check to see if an item is in the dictionary before you do the push might speed it up a little. It depends on the relative speed of a dictionary lookup as opposed to the overhead of having a duplicate item in the stack. Might be worth it to give this a try, although I'd be surprised if it made a big difference.
if (0 != ((int)(s.RoomType) & match))
{
if (!Retval.ContainsKey(s))
curStack.Push(s);
}
I'm really stretching on this last one. You have that cast in your inner loop. I know that the C# compiler sometimes generates a surprising amount of code for a seemingly simple cast, and I don't know if that gets optimized away by the JIT compiler. You could remove that cast from your inner loop by creating a local variable of the enum type and assigning it the value of match:
RoomEnumType matchType = (RoomEnumType)match;
Then your inner loop comparison becomes:
if (0 != (s.RoomType & matchType))
No cast, which might shave some cycles.
Edit: Micro-optimization aside, you'll probably get better performance by modifying your algorithm slightly to avoid processing any item more than once. As it stands, items that do match can end up in the stack multiple times, and items that don't match can be processed multiple times. Since you're already using a dictionary to keep track of items that do match, you can keep track of the non-matching items by giving them a value of false. Then at the end you simply create a List of those items that have a true value.
public static List<Square> GetGroup(int match, Model grid, Square leader)
{
Stack<Square> curStack = new Stack<Square>();
Dictionary<Square, bool> Retval = new Dictionary<Square, bool>();
curStack.Push(leader);
Retval.Add(leader, true);
int numMatch = 1;
while (curStack.Count != 0)
{
Square curItem = curStack.Pop();
foreach (Square s in curItem.Neighbors)
{
if (Retval.ContainsKey(s))
continue;
if (0 != ((int)(s.RoomType) & match))
{
curStack.Push(s);
Retval.Add(s, true);
++numMatch;
}
else
{
Retval.Add(s, false);
}
}
}
// LINQ makes this easier, but since you're using .NET 2.0...
List<Square> matches = new List<Square>(numMatch);
foreach (KeyValuePair<Square, bool> kvp in Retval)
{
if (kvp.Value == true)
{
matches.Add(kvp.Key);
}
}
return matches;
}
Here are a couple of suggestions -
If you're using .NET 3.5, you could change RetVal to a HashSet<Square> instead of a Dictionary<Square,bool>, since you're never using the values (only the keys) in the Dictionary. This would be a small improvement.
Also, if you changed the return to IEnumerable, you could just return the HashSet's enumerator directly. Depending on the usage of the results, it could potentially be faster in certain areas (and you can always use ToList() on the results if you really need a list).
However, there is a BIG optimization that could be added here -
Right now, you're always adding in every neighbor, even if that neighbor has already been processed. For example, when leader is processed, it adds in leader+1y, then when leader+1y is processed, it puts BACK in leader (even though you've already handled that Square), and next time leader is popped off the stack, you continue. This is a lot of extra processing.
Try adding:
foreach (Square s in curItem.Neighbors)
{
if ((0 != ((int)(s.RoomType) & match)) && (!Retval.ContainsKey(s)))
{
curStack.Push(s);
}
}
This way, if you've already processed the square of your neighbor, it doesn't get re-added to the stack, just to be skipped when it's popped later.
