Identifying last loop when using for each [duplicate] - c#

I want to do something different with the last loop iteration when performing 'foreach' on an object. I'm using Ruby but the same goes for C#, Java etc.
list = ['A','B','C']
list.each{|i|
puts "Looping: "+i # if not last loop iteration
puts "Last one: "+i # if last loop iteration
}
The output desired is equivalent to:
Looping: 'A'
Looping: 'B'
Last one: 'C'
The obvious workaround is to migrate the code to a for loop using 'for i in 1..list.length', but the for each solution feels more graceful. What is the most graceful way to code a special case during a loop? Can it be done with foreach?

I see a lot of complex, hardly readable code here... why not keep it simple:
var count = list.Length;
foreach(var item in list)
if (--count > 0)
Console.WriteLine("Looping: " + item);
else
Console.WriteLine("Lastone: " + item);
It's only one extra statement!
Another common situation is that you want to do something extra or less with the last item, like putting a separator between the items:
var count = list.Length;
foreach(var item in list)
{
Console.Write(item);
if (--count > 0)
Console.Write(",");
}

The foreach construct (in Java definitely, probably also in other languages) is intended to represent the most general kind of iteration, which includes iteration over collections that have no meaningful iteration order. For example, a hash-based set does not have an ordering, and therefore there is no "last element". The last iteration may yield a different element each time you iterate.
Basically: no, the foreach construct is not meant to be used that way.

How about obtaining a reference to the last item first and then using it for comparison inside the foreach loop? I am not saying that you should do this, as I myself would use the index-based loop mentioned by KlauseMeier. And sorry, I don't know Ruby, so the following sample is in C#! Hope you don't mind :-)
string lastItem = list[list.Count - 1];
foreach (string item in list) {
if (item != lastItem)
Console.WriteLine("Looping: " + item);
else Console.WriteLine("Lastone: " + item);
}
I revised the code to compare by reference rather than by value (this only works for reference types, not value types). The following code supports multiple objects containing the same string (but not the same string object), since MattChurcy's example did not specify that the strings must be distinct. I also used the LINQ Last method instead of calculating the index.
string lastItem = list.Last();
foreach (string item in list) {
if (!object.ReferenceEquals(item, lastItem))
Console.WriteLine("Looping: " + item);
else Console.WriteLine("Lastone: " + item);
}
Limitations of the above code: (1) It only works for strings or other reference types, not value types. (2) The same object can appear only once in the list. You can have different objects with the same content, but the same string literal cannot be used repeatedly, because C# interns string literals, so identical literals refer to the same object.
And i no stupid. I know an index based loop is the one to use. I already said so when i first posted the initial answer. I provided the best answer I can in the context of the question. I am too tired to keep explaining this so can you all just vote to delete my answer. I'll be so happy if this one goes away. thanks

Is this elegant enough? It assumes a non-empty list.
list[0,list.length-1].each{|i|
puts "Looping:"+i # if not last loop iteration
}
puts "Last one:" + list[list.length-1]

In Ruby I'd use each_with_index in this situation
list = ['A','B','C']
last = list.length-1
list.each_with_index{|i,index|
if index == last
puts "Last one: "+i
else
puts "Looping: "+i # if not last loop iteration
end
}

You can define an eachwithlast method in your class to do the same as each on all elements but the last, but something else for the last:
class MyColl
def eachwithlast
for i in 0...(size-1)
yield(self[i], false)
end
yield(self[size-1], true)
end
end
Then you could call it like this (foo being an instance of MyColl or a subclass thereof):
foo.eachwithlast do |value, last|
if last
puts "Last one: "+value
else
puts "Looping: "+value
end
end
Edit: Following molf's suggestion:
class MyColl
def eachwithlast (defaultAction, lastAction)
for i in 0...(size-1)
defaultAction.call(self[i])
end
lastAction.call(self[size-1])
end
end
foo.eachwithlast(
lambda { |x| puts "looping "+x },
lambda { |x| puts "last "+x } )

C# 3.0 or newer
Firstly, I would write an extension method:
public static void ForEachEx<T>(this IEnumerable<T> s, Action<T, bool> act)
{
IEnumerator<T> curr = s.GetEnumerator();
if (curr.MoveNext())
{
bool last;
while (true)
{
T item = curr.Current;
last = !curr.MoveNext();
act(item, last);
if (last)
break;
}
}
}
Then using the new foreach is very simple:
int[] lData = new int[] { 1, 2, 3, 5, -1};
void Run()
{
lData.ForEachEx((el, last) =>
{
if (last)
Console.Write("last one: ");
Console.WriteLine(el);
});
}

You should use foreach only if you handle every item the same way; otherwise use index-based iteration. Alternatively you must wrap the items in a different structure that lets you tell the normal items from the last one inside the foreach call (for background, see Google's MapReduce paper: http://labs.google.com/papers/mapreduce.html; map == foreach, reduce == e.g. sum or filter).
Map has no knowledge of the structure (in particular, the position of an item); it transforms one item at a time, and nothing learned from one item can be used to transform another. Reduce, however, can carry state, for example to count positions and handle the last item.
A common trick is to reverse the list, handle the first item (which now has the known index 0), and then reverse again. (Which is elegant but not fast ;) )

Foreach is elegant in that it has no concern for the number of items in a list and treats each element equally. I think your only solution is a for loop that either stops at itemcount-1, with the last item handled outside the loop, or contains a conditional for that specific case, i.e. (with a zero-based i) if (i == itemcount-1) { ... } else { ... }

You could do something like that (C#) :
string previous = null;
foreach(string item in list)
{
if (previous != null)
Console.WriteLine("Looping : {0}", previous);
previous = item;
}
if (previous != null)
Console.WriteLine("Last one : {0}", previous);

Ruby also has each_index method:
list = ['A','B','C']
list.each_index{|i|
if i < list.size - 1
puts "Looping:"+list[i]
else
puts "Last one:"+list[i]
end
}
EDIT:
Or using each (corrected TomatoGG and Kirschstein solution):
list = ['A', 'B', 'C', 'A']
list.each { |i|
if (i.object_id != list.last.object_id)
puts "Looping:#{i}"
else
puts "Last one:#{i}"
end
}
Looping:A
Looping:B
Looping:C
Last one:A
Or
list = ['A', 'B', 'C', 'A']
list.each {|i|
puts(i.object_id != list.last.object_id ? "Looping:#{i}" : "Last one:#{i}")
}

What you are trying to do seems just a little too advanced for the foreach-loop. However, you can use Iterators explicitly. For example, in Java, I would write this:
Collection<String> ss = Arrays.asList("A","B","C");
Iterator<String> it = ss.iterator();
while (it.hasNext()) {
String s = it.next();
if(it.hasNext())
System.out.println("Looping: " + s);
else
System.out.println("Last one: " + s);
}

If you're using a collection that exposes a Count property - an assumption made by many of the other answers, so I'll make it too - then you can do something like this using C# and LINQ:
foreach (var item in list.Select((x, i) => new { Val = x, Pos = i }))
{
Console.Write(item.Pos == (list.Count - 1) ? "Last one: " : "Looping: ");
Console.WriteLine(item.Val);
}
If we additionally assume that the items in the collection can be accessed directly by index - the currently accepted answer assumes this - then a plain for loop will be more elegant/readable than a foreach:
for (int i = 0; i < list.Count; i++)
{
Console.Write(i == (list.Count - 1) ? "Last one: " : "Looping: ");
Console.WriteLine(list[i]);
}
If the collection doesn't expose a Count property and can't be accessed by index then there isn't really any elegant way to do this, at least not in C#. A bug-fixed variation of Thomas Levesque's answer is probably as close as you'll get.
Here's the bug-fixed version of Thomas's answer:
string previous = null;
bool isFirst = true;
foreach (var item in list)
{
if (!isFirst)
{
Console.WriteLine("Looping: " + previous);
}
previous = item;
isFirst = false;
}
if (!isFirst)
{
Console.WriteLine("Last one: " + previous);
}
And here's how I would do it in C# if the collection doesn't expose a Count property and the items aren't directly accessible by index. (Notice that there's no foreach and the code isn't particularly succinct, but it will give decent performance over pretty much any enumerable collection.)
// i'm assuming the non-generic IEnumerable in this code
// wrap the enumerator in a "using" block if dealing with IEnumerable<T>
var e = list.GetEnumerator();
if (e.MoveNext())
{
var item = e.Current;
while (e.MoveNext())
{
Console.WriteLine("Looping: " + item);
item = e.Current;
}
Console.WriteLine("Last one: " + item);
}
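The one-item delay used above is not specific to C#. As a rough sketch in Ruby (the language of the question), a helper can hold each element back until the next one arrives, so it needs neither a count nor index access. The name each_with_last_flag is made up for illustration, and a separate boolean is used so that nil elements are handled too.

```ruby
# Yields every element together with a flag that is true only for the final one.
# Works for any object that responds to #each; no size or index is required.
def each_with_last_flag(enumerable)
  pending = nil
  has_pending = false            # separate flag so that nil elements still work

  enumerable.each do |item|
    yield pending, false if has_pending   # the held element was not the last
    pending = item
    has_pending = true
  end

  yield pending, true if has_pending      # whatever is still held is the last
end

each_with_last_flag(['A', 'B', 'C']) do |item, last|
  puts((last ? "Last one: " : "Looping: ") + item)
end
```

An empty collection yields nothing, which is usually the behaviour you want.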

At least in C# that's not possible without a regular for loop.
The enumerator of the collection decides whether a next element exists (MoveNext method); the loop doesn't know about this.

I think I prefer kgiannakakis's solution, however you could always do something like this;
list = ['A','B','C']
list.each { |i|
if (i != list.last)
puts "Looping:#{i}"
else
puts "Last one:#{i}"
end
}

I notice a number of suggestions assume that you can find the last item in the list before beginning the loop, and then compare every item to this item. If you can do this efficiently, then the underlying data structure is likely a simple array. If that's the case, why bother with the foreach at all? Just write:
for (int x=0;x<list.size()-1;++x)
{
System.out.println("Looping: "+list.get(x));
}
System.out.println("Last one: "+list.get(list.size()-1));
If you cannot retrieve an item from an arbitrary position efficiently -- for example, if the underlying structure is a linked list -- then getting the last item probably involves a sequential search of the entire list. Depending on the size of the list, that may be a performance issue. If this is a frequently-executed function, you might want to consider using an array or ArrayList or comparable structure so you can do it this way.
Sounds to me like you're asking, "What's the best way to put a screw in using a hammer?", when of course the better question to ask is, "What's the correct tool to use to put in a screw?"

Would it be a viable solution for your case to just take the first/last elements out of your array before doing the "general" each run?
Like this:
list = ['A','B','C','D']
first = list.shift
last = list.pop
puts "First one: #{first}"
list.each{|i|
puts "Looping: "+i
}
puts "Last one: #{last}"

This problem can be solved in an elegant way using pattern matching in a functional programming language such as F#:
let rec printList (ls:string list) =
match ls with
| [last] -> "Last " + last
| head::rest -> "Looping " + head + "\n" + printList (rest)
| [] -> ""

I don't know how for-each loops work in languages other than Java.
In Java, for-each uses the Iterable interface to obtain an Iterator and loops with it. Iterator has a hasNext method that you could use if you could see the iterator inside the loop.
You can actually do the trick by wrapping an already obtained Iterator in an Iterable object, so the for loop gets what it needs and you can call hasNext inside the loop.
List<X> list = ...
final Iterator<X> it = list.iterator();
Iterable<X> itw = new Iterable<X>(){
public Iterator<X> iterator () {
return it;
}
};
for (X x: itw) {
doSomething(x);
if (!it.hasNext()) {
doSomethingElse(x);
}
}
You can create a class that wraps all this iterable and iterator stuff so the code looks like this:
IterableIterator<X> itit = new IterableIterator<X>(list);
for (X x: itit) {
doSomething(x);
if (!itit.hasNext()) {
doSomethingElse(x);
}
}

Similar to kgiannakakis's answer:
list.first(list.size - 1).each { |i| puts "Looping: " + i }
puts "Last one: " + list.last

How about this one? just learnt a little Ruby. hehehe
list.collect {|x|(x!=list.last ? "Looping:"+x:"Lastone:"+x) }.each{|i|puts i}

Remove the last one from the list and retain its value.
Spec last = specs.Find(s => s.Value == 'C');
if (last != null)
{
specs.Remove(last);
}
foreach (Spec spec in specs)
{
}

Another pattern that works, without having to rewrite the foreach loop:
object delayed = null; // 'var' cannot infer a type from null
foreach (var X in collection)
{
if (delayed != null)
{
puts("Looping");
// Use delayed
}
delayed = X;
}
puts("Last one");
// Use delayed
This way the compiler keeps the loop straight, iterators (including those without counts) work as expected, and the last one is separated out from the others.
I also use this pattern when I want something to happen in between iterations, but not after the last one. In that case, X is used normally, delayed refers to something else, and the usage of delayed is only at the loop beginning and nothing needs to be done after the loop ends.

Use join whenever possible.
Most often the delimiter is the same between all elements, just join them together with the corresponding function in your language.
Ruby example,
puts array.join(", ")
This should cover 99% of all cases, and if not split the array into head and tail.
Ruby example,
*head, tail = array
head.each { |each| puts "looping: " << each }
puts "last element: " << tail

<hello> what are we thinking here?
public static void main(String[] args) {
// TODO Auto-generated method stub
String str = readIndex();
String comp[] = str.split("}");
StringBuffer sb = new StringBuffer();
for (String s : comp) {
sb.append(s);
sb.append("}\n");
}
System.out.println (sb.toString());
}

Related

How to combine items in List<string> to make new items efficiently

I have a case where I have the name of an object, and a bunch of file names. I need to match the correct file name with the object. The file name can contain numbers and words, separated by either hyphen(-) or underscore(_). I have no control of either file name or object name. For example:
10-11-12_001_002_003_13001_13002_this_is_an_example.svg
The object name in this case is just a string, representing a number
10001
I need to return true or false if the file name is a match for the object name. The different segments of the file name can match on their own, or any combination of two segments. In the example above, it should be true for the following cases (not every true case, just examples):
10001
10002
10003
11001
11002
11003
12001
12002
12003
13001
13002
And, we should return false for this case (among others):
13003
What I've come up with so far is this:
public bool IsMatch(string filename, string objectname)
{
var namesegments = GetNameSegments(filename);
var match = namesegments.Contains(objectname);
return match;
}
public static List<string> GetNameSegments(string filename)
{
var segments = filename.Split('_', '-').ToList();
var newSegments = new List<string>();
foreach (var segment in segments)
{
foreach (var segment2 in segments)
{
if (segment == segment2)
continue;
var newToken = segment + segment2;
newSegments.Add(newToken);
}
}
return segments.Concat(newSegments).ToList();
}
One or two segments combined can make a match, and that is enough. Three or more segments combined should not be considered.
This does work so far, but is there a better way to do it, perhaps without nesting foreach loops?
First: don't change debugged, working, sufficiently efficient code for no reason. Your solution looks good.
However, we can make some improvements to your solution.
public static List<string> GetNameSegments(string filename)
Making the output a list puts restrictions on the implementation that are not required by the caller. It should be IEnumerable<String>. Particularly since the caller in this case only cares about the first match.
var segments = filename.Split('_', '-').ToList();
Why ToList? A list is array-backed. You've already got an array in hand. Just use the array.
Since there is no longer a need to build up a list, we can transform your two-loop solution into an iterator block:
public static IEnumerable<string> GetNameSegments(string filename)
{
var segments = filename.Split('_', '-');
foreach (var segment in segments)
yield return segment;
foreach (var s1 in segments)
foreach (var s2 in segments)
if (s1 != s2)
yield return s1 + s2;
}
Much nicer. Alternatively we could notice that this has the structure of a query and simply return the query:
public static IEnumerable<string> GetNameSegments(string filename)
{
var q1= filename.Split('_', '-');
var q2 = from s1 in q1
from s2 in q1
where s1 != s2
select s1 + s2;
return q1.Concat(q2);
}
Again, much nicer in this form.
Now let's talk about efficiency. As is often the case, we can achieve greater efficiency at a cost of increased complication. This code looks like it should be plenty fast enough. Your example has nine segments. Let's suppose that nine or ten is typical. Our solutions thus far consider the ten or so singletons first, and then the hundred or so combinations. That's nothing; this code is probably fine. But what if we had thousands of segments and were considering millions of possibilities?
In that case we should restructure the algorithm. One possibility would be this general solution:
public bool IsMatch(HashSet<string> segments, string name)
{
if (segments.Contains(name))
return true;
var q = from s1 in segments
where name.StartsWith(s1)
let s2 = name.Substring(s1.Length)
where s1 != s2
where segments.Contains(s2)
select 1; // Dummy. All we care about is if there is one.
return q.Any();
}
Your original solution is quadratic in the number of segments. This one is linear; we rely on the constant order contains operation. (This assumes of course that string operations are constant time because strings are short. If that's not true then we have a whole other kettle of fish to fry.)
How else could we extract wins in the asymptotic case?
If we happened to have the property that the collection was not a hash set but rather a sorted list then we could do even better; we could binary search the list to find the start and end of the range of possible prefix matches, and then pour the list into a hashset to do the suffix matches. That's still linear, but could have a smaller constant factor.
If we happened to know that the target string was small compared to the number of segments, we could attack the problem from the other end. Generate all possible combinations of partitions of the target string and check if both halves are in the segment set. The problem with this solution is that it is quadratic in memory usage in the size of the string. So what we'd want to do there is construct a special hash on character sequences and use that to populate the hash table, rather than the standard string hash. I'm sure you can see how the solution would go from there; I shan't spell out the details.
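The "attack from the other end" idea can be sketched naively in Ruby as follows. This is only an illustration under stated assumptions: match_by_partition? is a made-up name, it uses the standard string hash rather than the special hash on character sequences suggested above, and it keeps the rule from the earlier solutions that a segment is not combined with an equal segment.

```ruby
require 'set'

# Returns true if `name` equals one segment, or is the concatenation of two
# different segments. Tries every cut position in `name` instead of every
# pair of segments, so the work grows with the length of `name`.
def match_by_partition?(segments, name)
  known = segments.to_set
  return true if known.include?(name)

  (1...name.length).any? do |cut|
    prefix = name[0, cut]
    suffix = name[cut..-1]
    prefix != suffix && known.include?(prefix) && known.include?(suffix)
  end
end

segments = "10-11-12_001_002_003_13001_13002_this_is_an_example.svg".split(/[-_]/)
puts match_by_partition?(segments, "10001")   # "10" + "001"
puts match_by_partition?(segments, "13003")   # no single segment or pair matches
```

Each cut allocates two substrings, which is the memory cost the paragraph above warns about for long target strings.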
Efficiency is very much dependent on the business problem that you're attempting to solve. Without knowing the full context/usage it's difficult to define the most efficient solution. What works for one situation won't always work for others.
I would always advocate to write working code and then solve any performance issues later down the line (or throw more tin at the problem as it's usually cheaper!) If you're having specific performance issues then please do tell us more...
I'm going to go out on a limb here and say (hope) that you're only going to be matching the filename against the object name once per execution. If that's the case I reckon this approach will be just about the fastest. In a circumstance where you're matching a single filename against multiple object names then the obvious choice is to build up an index of sorts and match against that as you were already doing, although I'd consider different types of collection depending on your expected execution/usage.
public static bool IsMatch(string filename, string objectName)
{
var segments = filename.Split('-', '_');
for (int i = 0; i < segments.Length; i++)
{
if (string.Equals(segments[i], objectName)) return true;
for (int ii = 0; ii < segments.Length; ii++)
{
if (ii == i) continue;
if (string.Equals($"{segments[i]}{segments[ii]}", objectName)) return true;
}
}
return false;
}
If you are willing to use the MoreLINQ NuGet package then this may be worth considering:
public static HashSet<string> GetNameSegments(string filename)
{
var segments = filename.Split(new char[] {'_', '-'}, StringSplitOptions.RemoveEmptyEntries).ToList();
var matches = segments
.Cartesian(segments, (x, y) => x == y ? null : x + y)
.Where(z => z != null)
.Concat(segments);
return new HashSet<string>(matches);
}
StringSplitOptions.RemoveEmptyEntries handles adjacent separators (e.g. --). Cartesian is roughly equivalent to your existing nested for loops. The Where is to remove null entries (i.e. if x == y). Concat is the same as your existing Concat. The use of HashSet allows for your Contains calls (in IsMatch) to be faster.

Skip first and last in IEnumerable, deferring execution

I have this huge json file neatly formated starting with the characters "[\r\n" and ending with "]". I have this piece of code:
foreach (var line in File.ReadLines(@"d:\wikipedia\wikipedia.json").Skip(1))
{
if (line[0] == ']') break;
// Do stuff
}
I'm wondering, what would be best performance-wise, what machine code would be the most optimal in regards to how many clock cycles and memory is consumed if I were to compare the above code to one where I have replaced "break" with "continue", or would both of those pieces of code compile to the same MSIL and machine code? If you know the answer, please explain exactly how you reached your conclusion? I'd really like to know.
EDIT: Before you close this as nonsensical, consider that this code is equivalent to the above code and consider that the c# compiler optimizes when the code path is flat and does not fork in a lot of ways, would all of the following examples generate the same amount of work for the CPU?
IEnumerable<char> text = new[] {'[', 'a', 'b', 'c', ']'};
foreach (var c in text.Skip(1))
{
if (c == ']') break;
// Do stuff
}
foreach (var c in text.Skip(1))
{
if (c == ']') continue;
// Do stuff
}
foreach (var c in text.Skip(1))
{
if (c != ']')
{
// Do stuff
}
}
foreach (var c in text.Skip(1))
{
if (c != ']')
{
// Do stuff
}
}
foreach (var c in text.Skip(1))
{
if (c != ']')
{
// Do stuff
}
else
{
break;
}
}
EDIT2: Here's another way of putting it: what's the prettiest way to skip the first and last item in an IEnumerable while still deferring execution until //Do stuff?
Q: Different MSIL for break or continue in loop?
Yes, that's because it works like this:
foreach (var item in foo)
{
// more code...
if (...) { continue; } // jump to #1
if (...) { break; } // jump to #2
// more code...
// #1 -- just before the '}'
}
// #2 -- after the exit of the loop.
Q: What will give you the most performance?
Branches are branches for the compiler. If you have a goto, a continue or a break, it will eventually be compiled as a branch (opcode br), which will be analyzed as such. In other words: it doesn't make a difference.
What does make a difference is having predictable patterns of both data and code flow in the code. Branching breaks code flow, so if you want performance, you should avoid irregular branches.
In other words, prefer:
for (int i=0; i<10 && someCondition; ++i)
to:
for (int i=0; i<10; ++i)
{
// some code
if (someCondition) { ... }
// some code
}
As always with performance, the best thing to do is to run benchmarks. There's no surrogate.
Q: What will give you the most performance? (#2)
You're doing a lot with IEnumerable's. If you want raw performance and have the option, it's best to use an array or a string. There's no better alternative in terms of raw performance for sequential access of elements.
If an array isn't an option (for example because it doesn't match the access pattern), it's best to use a data structure that best suits the access pattern. Learn about the characteristics of hash tables (Dictionary), red black trees (SortedDictionary) and how List works. Knowledge about how stuff really works is the thing you need. If unsure, test, test and test again.
Q: What will give you the most performance? (#3)
I'd also try JSON libraries if your intent is to parse that. These people probably already invented the wheel for you - if not, it'll give you a baseline "to beat".
Q: [...] what's the prettiest way to skip the first and last item [...]
If the underlying data structure is a string, List or array, I'd simply do this:
for (int i=1; i<str.Length-1; ++i)
{ ... }
To be frank, other data structures don't really make sense here IMO. That said, people sometimes like to put Linq code everywhere, so...
Using an enumerator
You can easily write a method that returns all but the first and last element. In my book, enumerators should always be consumed through constructs like foreach, to ensure that Dispose is called correctly.
public static IEnumerable<T> GetAllButFirstAndLast<T>(IEnumerable<T> myEnum)
{
T jtem = default(T);
bool first = true;
foreach (T item in myEnum.Skip(1))
{
if (first) { first = false; } else { yield return jtem; }
jtem = item;
}
}
Note that this has little to do with "getting the best performance out of your code". One look at the IL tells you all you need to know.
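For comparison, the same hold-back idea can be sketched in Ruby, where an Enumerator defers all work until something iterates it. The method name all_but_first_and_last is made up for illustration; this is a sketch of the technique, not a translation of the C# method above.

```ruby
# Returns a lazy enumerator over `source` without its first and last element.
# Nothing is read from `source` until the result is iterated.
def all_but_first_and_last(source)
  Enumerator.new do |out|
    held = nil
    seen = 0
    source.each do |item|
      seen += 1
      next if seen == 1          # drop the first element
      out << held if seen > 2    # emit the element held back one step
      held = item
    end
    # the final element is still in `held` here and is never emitted
  end.lazy
end

text = ['[', 'a', 'b', 'c', ']']
all_but_first_and_last(text).each { |c| puts c }
```

Because every element is emitted one step late, the consumer can stop early without the source ever being read to the end.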

Using LINQ in a string array to improve efficient C#

I have an equation string, and when I split it with my pattern I get the following string array.
string[] equationList = {"code1","+","code2","-","code3"};
Then from this I create a list which only contains the codes.
List<string> codeList = new List<string> {"code1","code2","code3"};
The existing code then loops through codeList, retrieves the value of each code, and replaces the code in equationList with that value, using the code below.
foreach (var code in codeList ){
var codeVal = GetCodeValue(code);
for (var i = 0; i < equationList.Length; i++){
if (!equationList[i].Equals(code,StringComparison.InvariantCultureIgnoreCase)) continue;
equationList[i] = codeVal;
break;
}
}
I am trying to improve the efficiency and I believe I can get rid of the for loop within the foreach by using linq.
My question is would it be any better if I do in terms of speeding up the process?
If yes then can you please help with the linq statement?
Before jumping to LINQ... which doesn't solve any problems you've described, let's look at the logic you have here.
We split a string with a 'pattern'. How?
We then create a new list of codes. How?
We then loop through those codes and decode them. How?
But since we forgot to keep track of where those codes came from, we now loop through the equationList (which is an array, not a List<T>) to substitute the results.
Seems a little convoluted to me.
Maybe a simpler solution would be:
Take in a string, and return IEnumerable<string> of words (similar to what you do now).
Take in a IEnumerable<string> of words, and return a IEnumerable<?> of values.
That is to say with this second step iterate over the strings, and simply return the value you want to return - rather than trying to extract certain values out, parsing them, and then inserting them back into a collection.
//Ideally we return something more specific eg, IEnumerable<Tokens>
public IEnumerable<string> ParseEquation(IEnumerable<string> words)
{
foreach (var word in words)
{
if (IsOperator(word)) yield return ToOperator(word);
else if (IsCode(word)) yield return ToCode(word);
else ...;
}
}
This is quite similar to the LINQ Select Statement... if one insisted I would suggest writing something like so:
var tokens = equationList.Select(ToToken);
...
public Token ToToken(string word)
{
if (IsOperator(word)) return ToOperator(word);
else if (IsCode(word)) return ToCode(word);
else ...;
}
If GetCodeValue(code) doesn't already, I suggest it probably could use some sort of caching/dictionary in its implementation - though the specifics dictate this.
The benefits of this approach is that it is flexible (we can easily add more processing steps), simple to follow (we put in these values and get these as a result, no mutating state) and easy to write. It also breaks the problem down into nice little chunks that solve their own task, which will help immensely when trying to refactor, or find niggly bugs/performance issues.
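A minimal sketch of this single-pass transformation, written in Ruby for brevity. Everything here is an assumption made for illustration: the operator set, the method name substitute_codes, and the code_values hash, which stands in for GetCodeValue and is keyed by lower-cased code to mimic the case-insensitive comparison in the question.

```ruby
OPERATORS = %w[+ - * /].freeze   # assumed operator set

# Maps each word of the equation to its replacement in one pass:
# operators pass through unchanged, codes are looked up in `code_values`.
# An unknown code raises KeyError instead of being silently skipped.
def substitute_codes(words, code_values)
  words.map do |word|
    next word if OPERATORS.include?(word)
    code_values.fetch(word.downcase)
  end
end

equation = ["code1", "+", "code2", "-", "code3"]
values = { "code1" => "1.5", "code2" => "2", "code3" => "3" }   # hypothetical data
puts substitute_codes(equation, values).join(" ")
```

The input array is left untouched and a new array is returned, which is the "no mutating state" property described above.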
If your array always alternates code then operator, this LINQ should do what you want:
string[] equationList = { "code1", "+", "code2", "-", "code3" };
var processedList = equationList.Select((s,j) => (j % 2 == 1) ? s :GetCodeValue(s)).ToArray();
You will need to check if it is faster
I think the fastest solution will be this:
var codeCache = new Dictionary<string, string>();
for (var i = equationList.Length - 1; i >= 0; --i)
{
var item = equationList[i];
if (! < item is valid >) // you know this because you created the codeList
continue;
string codeVal;
if (!codeCache.TryGetValue(item, out codeVal))
{
codeVal = GetCodeValue(item);
codeCache.Add(item, codeVal);
}
equationList[i] = codeVal;
}
You don't need a codeList. If every code is unique you can remove the codeCache.

Why do we need iterators in c#?

Can somebody provide a real life example regarding use of iterators. I tried searching google but was not satisfied with the answers.
You've probably heard of arrays and containers - objects that store a list of other objects.
But in order for an object to represent a list, it doesn't actually have to "store" the list. All it has to do is provide you with methods or properties that allow you to obtain the items of the list.
In the .NET framework, the interface IEnumerable is all an object has to support to be considered a "list" in that sense.
To simplify it a little (leaving out some historical baggage):
public interface IEnumerable<T>
{
IEnumerator<T> GetEnumerator();
}
So you can get an enumerator from it. That interface (again, simplifying slightly to remove distracting noise):
public interface IEnumerator<T>
{
bool MoveNext();
T Current { get; }
}
So to loop through a list, you'd do this:
var e = list.GetEnumerator();
while (e.MoveNext())
{
var item = e.Current;
// blah
}
This pattern is captured neatly by the foreach keyword:
foreach (var item in list)
// blah
But what about creating a new kind of list? Yes, we can just use List<T> and fill it up with items. But what if we want to discover the items "on the fly" as they are requested? There is an advantage to this, which is that the client can abandon the iteration after the first three items, and they don't have to "pay the cost" of generating the whole list.
To implement this kind of lazy list by hand would be troublesome. We would have to write two classes, one to represent the list by implementing IEnumerable<T>, and the other to represent an active enumeration operation by implementing IEnumerator<T>.
Iterator methods do all the hard work for us. We just write:
IEnumerable<int> GetNumbers(int stop)
{
for (int n = 0; n < stop; n++)
yield return n;
}
And the compiler converts this into two classes for us. Calling the method is equivalent to constructing an object of the class that represents the list.
Iterators are an abstraction that decouples the concept of position in a collection from the collection itself. The iterator is a separate object storing the necessary state to locate an item in the collection and move to the next item in the collection. I have seen collections that kept that state inside the collection (i.e. a current position), but it is often better to move that state to an external object. Among other things it enables you to have multiple iterators iterating the same collection.
Simple example: a function that generates a sequence of integers:
static IEnumerable<int> GetSequence(int fromValue, int toValue)
{
if (toValue >= fromValue)
{
for (int i = fromValue; i <= toValue; i++)
{
yield return i;
}
}
else
{
for (int i = fromValue; i >= toValue; i--)
{
yield return i;
}
}
}
To do it without an iterator, you would need to create an array then enumerate it...
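The difference shows up when the caller stops early. The sketch below uses a simplified counting iterator (LazyDemo.Count, a made-up name) with a counter added purely so the effect can be observed. Only the values that are actually consumed get generated, whereas an array would have to be filled completely first.

```csharp
using System;
using System.Collections.Generic;

public static class LazyDemo
{
    // Counts how many values the iterator has really produced (demo only).
    public static int Produced;

    public static IEnumerable<int> Count(int fromValue, int toValue)
    {
        for (int i = fromValue; i <= toValue; i++)
        {
            Produced++;
            yield return i;
        }
    }

    // Sums the first three values of a very long sequence, then stops.
    public static int SumFirstThree()
    {
        Produced = 0;
        int sum = 0, taken = 0;
        foreach (int n in Count(1, 1000000))
        {
            sum += n;
            if (++taken == 3) break; // abandon the rest of the sequence
        }
        return sum;
    }

    public static void Main()
    {
        int sum = SumFirstThree();
        Console.WriteLine(sum + " " + Produced); // prints "6 3"
    }
}
```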
Iterate through the students in a class
The Iterator design pattern provides us with a common method of enumerating a list of items or array, while hiding the details of the list's implementation. This provides a cleaner use of the array object and hides unnecessary information from the client, ultimately leading to better code-reuse, enhanced maintainability, and fewer bugs. The iterator pattern can enumerate the list of items regardless of their actual storage type.
Iterate through a set of homework questions.
But seriously, Iterators can provide a unified way to traverse the items in a collection regardless of the underlying data structure.
Read the first two paragraphs here for a little more info.
A couple of things they're great for:
a) For 'perceived performance' while maintaining code tidiness - the iteration of something separated from other processing logic.
b) When the number of items you're going to iterate through is not known.
Although both can be done through other means, with iterators the code can be made nicer and tidier, as someone calling the iterator doesn't need to worry about how it finds the stuff to iterate through...
Real life example: enumerating directories and files, and finding the first [n] that fulfill some criteria, e.g. a file containing a certain string or sequence etc...
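A minimal sketch of that "first [n] that fulfill some criteria" idea. TakeMatching is a hypothetical helper, not a framework method, and the file names are invented. In real code the source could be something lazy such as Directory.EnumerateFiles (available from .NET 4), so the scan stops as soon as enough matches have been found.

```csharp
using System;
using System.Collections.Generic;

public static class Search
{
    // Yields at most 'limit' items that satisfy 'criteria', then stops
    // pulling from 'source' altogether.
    public static IEnumerable<T> TakeMatching<T>(
        IEnumerable<T> source, Predicate<T> criteria, int limit)
    {
        if (limit <= 0) yield break;
        int found = 0;
        foreach (T item in source)
        {
            if (!criteria(item)) continue;
            yield return item;
            if (++found == limit) yield break;
        }
    }

    public static void Main()
    {
        // Invented sample data standing in for a directory listing.
        var names = new List<string> { "a.txt", "b.log", "c.txt", "d.txt" };
        foreach (string name in TakeMatching(names, n => n.EndsWith(".txt"), 2))
            Console.WriteLine(name); // prints a.txt, then c.txt
    }
}
```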
Besides everything else, they let you iterate through lazy sequences (IEnumerators). Each next element of such a sequence can be evaluated/initialized at the iteration step, which makes it possible to iterate through infinite sequences using a finite amount of resources...
The canonical and simplest example is that it makes infinite sequences possible without the complexity of having to write the class to do that yourself:
// generate every prime number
// (needs: using System; using System.Collections.Generic; using System.Linq;)
public IEnumerator<int> GetPrimeEnumerator()
{
yield return 2;
var primes = new List<int>();
primes.Add(2);
Func<int, bool> IsPrime = n => primes.TakeWhile(
p => p <= (int)Math.Sqrt(n)).FirstOrDefault(p => n % p == 0) == 0;
for (int i = 3; true; i += 2)
{
if (IsPrime(i))
{
yield return i;
primes.Add(i);
}
}
}
Obviously this would not be truly infinite unless you used a BigInteger instead of int, but it gives you the idea.
Writing this code (or similar) by hand for each generated sequence would be tedious and error-prone; iterators do that for you. If the above example seems too complex, consider:
// generate every power of a number: start^0, start^1, start^2, ...
public IEnumerator<int> GetPowersEnumerator(int start)
{
yield return 1; // anything ^0 is 1
var x = start;
while(true)
{
yield return x;
x *= start;
}
}
They come at a cost though. Their lazy behaviour means you cannot spot common errors (null parameters and the like) until the generator is first consumed, rather than when it is created, unless you write wrapping functions that check first. The current implementation is also incredibly bad(1) if used recursively.
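A sketch of the wrapping-function approach, assuming a made-up helper named Upper. The public method is an ordinary method, so its null check runs at call time; the actual iterator lives in a private method.

```csharp
using System;
using System.Collections.Generic;

public static class SequenceHelpers
{
    // Not an iterator: the argument check runs as soon as Upper is called.
    public static IEnumerable<string> Upper(IEnumerable<string> source)
    {
        if (source == null) throw new ArgumentNullException("source");
        return UpperIterator(source);
    }

    // The real iterator: nothing in here runs until the first MoveNext.
    private static IEnumerable<string> UpperIterator(IEnumerable<string> source)
    {
        foreach (string s in source)
            yield return s.ToUpperInvariant();
    }

    public static void Main()
    {
        try
        {
            Upper(null); // throws here, before any enumeration happens
        }
        catch (ArgumentNullException)
        {
            Console.WriteLine("caught at call time");
        }
    }
}
```

If the check were written directly inside an iterator method, the exception would only surface when the caller started the foreach, possibly far away from the faulty call.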
Writing enumerations over complex structures like trees and object graphs is much easier, as the state maintenance is largely done for you; you simply write code to visit each item and do not worry about getting back to it.
(1) I don't use this word lightly: an O(n) iteration can become O(n^2).
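One common way to keep tree enumeration easy to write while avoiding the recursive cost is a single iterator with an explicit stack. The sketch below assumes a minimal Node type invented for the example. A recursive version would create one nested iterator per node, and each value would be passed up through every level above it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical tree node, only for this example.
public sealed class Node
{
    public readonly int Value;
    public readonly List<Node> Children = new List<Node>();
    public Node(int value) { Value = value; }
}

public static class TreeWalker
{
    // Pre-order traversal using one iterator and an explicit stack.
    public static IEnumerable<int> PreOrder(Node root)
    {
        if (root == null) yield break;
        var pending = new Stack<Node>();
        pending.Push(root);
        while (pending.Count > 0)
        {
            Node current = pending.Pop();
            yield return current.Value;
            // Push children in reverse so the first child is visited first.
            for (int i = current.Children.Count - 1; i >= 0; i--)
                pending.Push(current.Children[i]);
        }
    }

    public static void Main()
    {
        var root = new Node(1);
        var left = new Node(2);
        left.Children.Add(new Node(4));
        root.Children.Add(left);
        root.Children.Add(new Node(3));
        foreach (int v in PreOrder(root))
            Console.WriteLine(v); // prints 1, 2, 4, 3 on separate lines
    }
}
```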
An iterator is an easy way of implementing the IEnumerator interface. Instead of making a class that has the methods and properties required for the interface, you just make a method that returns the values one by one and the compiler creates a class with the methods and properties needed to implement the interface.
If you for example have a large list of numbers, and you want to return a collection where each number is multiplied by two, you can make an iterator that returns the numbers instead of creating a copy of the list in memory:
public IEnumerable<int> GetDouble() {
foreach (int n in originalList) yield return n * 2;
}
In C# 3 you can do something quite similar using extension methods and lambda expressions:
originalList.Select(n => n * 2)
Or using LINQ:
from n in originalList select n * 2
IEnumerator<Question> myIterator = listOfStackOverFlowQuestions.GetEnumerator();
while (myIterator.MoveNext())
{
Question q;
q = myIterator.Current;
if (q.Pertinent == true)
PublishQuestion(q);
else
SendMessage(q.Author.EmailAddress, "Your question has been rejected");
}
foreach (Question q in listOfStackOverFlowQuestions)
{
if (q.Pertinent == true)
PublishQuestion(q);
else
SendMessage(q.Author.EmailAddress, "Your question has been rejected");
}

C# (.Net 2.0) Micro-Optimization Part 2: Finding Contiguous Groups within a grid

I have a very simple function which takes in a matching bitfield, a grid, and a square. It used to use a delegate but I did a lot of recoding and ended up with a bitfield & operation to avoid the delegate while still being able to perform matching within reason. Basically, the challenge is to find all contiguous elements within a grid which match the match bitfield, starting from a specific "leader" square.
Square is a somewhat small (but not tiny) class. Any tips on how to push this to be even faster? Note that the grid itself is pretty small (500 elements in this test).
Edit: It's worth noting that this function is called over 200,000 times per second. In truth, in the long run my goal will be to call it less often, but that's really tough, considering that my end goal is to make the grouping system be handled with scripts rather than being hardcoded. That said, this function is always going to be called more than any other function.
Edit: To clarify, the function does not check if leader matches the bitfield, by design. The intention is that the leader is not required to match the bitfield (though in some cases it will).
Things tried unsuccessfully:
Initializing the dictionary and stack with a capacity.
Casting the int to an enum to avoid a cast.
Moving the dictionary and stack outside the function and clearing them each time they are needed. This makes things slower!
Things tried successfully:
Writing a hashcode function instead of using the default: Hashcodes are precomputed and are equal to x + y * parent.Width. Thanks for the reminder, Jim Mischel.
mquander's Technique: See GetGroupMquander below.
Further Optimization: Once I switched to HashSets, I got rid of the Contains test and replaced it with an Add test. Both Contains and Add are forced to seek a key, so just checking whether an Add succeeds is more efficient than adding only if a Contains check fails. That is, if (RetVal.Add(s)) curStack.Push(s);
public static List<Square> GetGroup(int match, Model grid, Square leader)
{
Stack<Square> curStack = new Stack<Square>();
Dictionary<Square, bool> Retval = new Dictionary<Square, bool>();
curStack.Push(leader);
while (curStack.Count != 0)
{
Square curItem = curStack.Pop();
if (Retval.ContainsKey(curItem)) continue;
Retval.Add(curItem, true);
foreach (Square s in curItem.Neighbors)
{
if (0 != ((int)(s.RoomType) & match))
{
curStack.Push(s);
}
}
}
return new List<Square>(Retval.Keys);
}
=====
public static List<Square> GetGroupMquander(int match, Model grid, Square leader)
{
Stack<Square> curStack = new Stack<Square>();
Dictionary<Square, bool> Retval = new Dictionary<Square, bool>();
Retval.Add(leader, true);
curStack.Push(leader);
while (curStack.Count != 0)
{
Square curItem = curStack.Pop();
foreach (Square s in curItem.Neighbors)
{
if (0 != ((int)(s.RoomType) & match))
{
if (!Retval.ContainsKey(s))
{
curStack.Push(s);
Retval.Add(s, true);
}
}
}
}
return new List<Square>(Retval.Keys);
}
The code you posted assumes that the leader square matches the bitfield. Is that by design?
I assume your Square class has implemented a GetHashCode method that's quick and provides a good distribution.
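For reference, a hash function of that kind might look like the sketch below. It uses the x + y * width formula mentioned in the question; the fields and constructor are assumptions, since the real Square class isn't shown.

```csharp
using System;

// Hypothetical, stripped-down Square: only what hashing and equality need.
public sealed class Square
{
    public readonly int X;
    public readonly int Y;
    private readonly int _hash; // computed once, not on every lookup

    public Square(int x, int y, int gridWidth)
    {
        X = x;
        Y = y;
        // Distinct for every cell as long as 0 <= x < gridWidth.
        _hash = x + y * gridWidth;
    }

    public override int GetHashCode() { return _hash; }

    public override bool Equals(object obj)
    {
        Square other = obj as Square;
        return other != null && other.X == X && other.Y == Y;
    }
}

public static class Program
{
    public static void Main()
    {
        var a = new Square(2, 3, 10);
        Console.WriteLine(a.GetHashCode()); // prints 32
    }
}
```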
You did say micro-optimization . . .
If you have a good idea how many items you're expecting, you'll save a little bit of time by pre-allocating the dictionary. That is, if you know you won't have more than 100 items that match, you can write:
Dictionary<Square, bool> Retval = new Dictionary<Square, bool>(100);
That will avoid having to grow the dictionary and re-hash everything. You can also do the same thing with your stack: pre-allocate it to some reasonable maximum size to avoid resizing later.
Since you say that the grid is pretty small it seems reasonable to just allocate the stack and the dictionary to the grid size, if that's easy to determine. You're only talking grid_size references each, so memory isn't a concern unless your grid becomes very large.
Adding a check to see if an item is in the dictionary before you do the push might speed it up a little. It depends on the relative speed of a dictionary lookup as opposed to the overhead of having a duplicate item in the stack. Might be worth it to give this a try, although I'd be surprised if it made a big difference.
if (0 != ((int)(s.RoomType) & match))
{
if (!Retval.ContainsKey(s))
curStack.Push(s);
}
I'm really stretching on this last one. You have that cast in your inner loop. I know that the C# compiler sometimes generates a surprising amount of code for a seemingly simple cast, and I don't know if that gets optimized away by the JIT compiler. You could remove that cast from your inner loop by creating a local variable of the enum type and assigning it the value of match:
RoomEnumType matchType = (RoomEnumType)match;
Then your inner loop comparison becomes:
if (0 != (s.RoomType & matchType))
No cast, which might shave some cycles.
Edit: Micro-optimization aside, you'll probably get better performance by modifying your algorithm slightly to avoid processing any item more than once. As it stands, items that do match can end up in the stack multiple times, and items that don't match can be processed multiple times. Since you're already using a dictionary to keep track of items that do match, you can keep track of the non-matching items by giving them a value of false. Then at the end you simply create a List of those items that have a true value.
public static List<Square> GetGroup(int match, Model grid, Square leader)
{
Stack<Square> curStack = new Stack<Square>();
Dictionary<Square, bool> Retval = new Dictionary<Square, bool>();
curStack.Push(leader);
Retval.Add(leader, true);
int numMatch = 1;
while (curStack.Count != 0)
{
Square curItem = curStack.Pop();
foreach (Square s in curItem.Neighbors)
{
if (Retval.ContainsKey(s))
continue;
if (0 != ((int)(s.RoomType) & match))
{
curStack.Push(s);
Retval.Add(s, true);
++numMatch;
}
else
{
Retval.Add(s, false);
}
}
}
// LINQ makes this easier, but since you're using .NET 2.0...
List<Square> matches = new List<Square>(numMatch);
foreach (KeyValuePair<Square, bool> kvp in Retval)
{
if (kvp.Value == true)
{
matches.Add(kvp.Key);
}
}
return matches;
}
Here are a couple of suggestions -
If you're using .NET 3.5, you could change RetVal to a HashSet<Square> instead of a Dictionary<Square,bool>, since you're never using the values (only the keys) in the Dictionary. This would be a small improvement.
Also, if you changed the return type to IEnumerable<Square>, you could just return the HashSet directly. Depending on the usage of the results, it could potentially be faster in certain areas (and you can always use ToList() on the results if you really need a list).
However, there is a BIG optimization that could be added here -
Right now, you're always adding in every neighbor, even if that neighbor has already been processed. For example, when leader is processed, it adds in leader+1y, then when leader+1y is processed, it puts BACK in leader (even though you've already handled that Square), and next time leader is popped off the stack, you continue. This is a lot of extra processing.
Try adding:
foreach (Square s in curItem.Neighbors)
{
if ((0 != ((int)(s.RoomType) & match)) && (!Retval.ContainsKey(s)))
{
curStack.Push(s);
}
}
This way, if you've already processed the square of your neighbor, it doesn't get re-added to the stack, just to be skipped when it's popped later.
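Putting the HashSet suggestion and the "don't re-add processed squares" check together could look roughly like the sketch below. It needs .NET 3.5 for HashSet<T>. The Square class is a minimal stand-in with an int RoomType, and the unused grid parameter is dropped; both are assumptions made to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;

// Minimal stand-in for the question's Square class.
public sealed class Square
{
    public readonly int RoomType; // bit flags
    public readonly List<Square> Neighbors = new List<Square>();
    public Square(int roomType) { RoomType = roomType; }
}

public static class Grouping
{
    public static HashSet<Square> GetGroup(int match, Square leader)
    {
        var group = new HashSet<Square>();
        var pending = new Stack<Square>();
        group.Add(leader); // the leader is included without being tested
        pending.Push(leader);
        while (pending.Count != 0)
        {
            Square current = pending.Pop();
            foreach (Square s in current.Neighbors)
            {
                // Add returns false when s is already in the set, so a
                // square is pushed at most once.
                if ((s.RoomType & match) != 0 && group.Add(s))
                    pending.Push(s);
            }
        }
        return group;
    }

    // Helper for the demo: links two squares in both directions.
    public static void Link(Square a, Square b)
    {
        a.Neighbors.Add(b);
        b.Neighbors.Add(a);
    }

    public static void Main()
    {
        // A row of squares: 1 - 2 - 2 - 4 - 2
        var row = new[] { new Square(1), new Square(2), new Square(2),
                          new Square(4), new Square(2) };
        for (int i = 0; i + 1 < row.Length; i++) Link(row[i], row[i + 1]);
        Console.WriteLine(GetGroup(2, row[0]).Count); // prints 3
    }
}
```

The last square matches the bitfield but is cut off by the non-matching one before it, so the group is the leader plus its two contiguous matches.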
