C# LINQ: different values printed than shown in debug mode

var counter = 0;
var array = new int[] { 0, 1, 2, 3, 4 };
var test = array.Select(a => counter++);
foreach (var item in test)
{
    Console.WriteLine(item);
}
Console.ReadLine();
When I run the code above, the console prints 0, 1, 2, 3, 4.
However, when I expand test in the debugger I see the numbers 10 to 14. Why?
Also, why doesn't the console print 1, 2, 3, 4, 5, given that the lambda should return the incremented counter?

The output keeps changing because test isn't evaluated until you enumerate it. Expanding it in the debug view enumerates it, and every enumeration runs the lambda again, so counter keeps increasing. You can get some funny results by running the foreach loop several times or by printing test.First() several times.
You can prevent this by forcing the enumerable to materialise into a list:
var test = array.Select(a => counter++).ToList();
//                                     ^^^^^^^^^
As for why it starts at zero: ++ here is the post-increment operator, which returns the current value and then increments the variable. If you want it to start at 1, use the pre-increment form instead:
var test = array.Select(a => ++counter).ToList();
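To make the re-evaluation visible without a debugger, here is a minimal sketch that enumerates the same query twice and then repeats the experiment with ToList(). The class and method names are made up for illustration; only Select and ToList come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Enumerates the same lazy query twice and returns both passes.
    public static (List<int> First, List<int> Second) EnumerateTwice()
    {
        var counter = 0;
        var array = new int[] { 0, 1, 2, 3, 4 };
        IEnumerable<int> lazy = array.Select(a => counter++);

        var first = new List<int>();
        foreach (var item in lazy) first.Add(item);   // runs the lambda 5 times

        var second = new List<int>();
        foreach (var item in lazy) second.Add(item);  // runs the lambda 5 more times

        return (first, second);
    }

    // Materialises once with ToList(); later reads never run the lambda again.
    public static (List<int> First, List<int> Second) EnumerateMaterialised()
    {
        var counter = 0;
        var array = new int[] { 0, 1, 2, 3, 4 };
        List<int> snapshot = array.Select(a => counter++).ToList();
        return (snapshot.ToList(), snapshot.ToList());
    }

    public static void Main()
    {
        var lazy = EnumerateTwice();
        Console.WriteLine(string.Join(",", lazy.First));    // 0,1,2,3,4
        Console.WriteLine(string.Join(",", lazy.Second));   // 5,6,7,8,9

        var materialised = EnumerateMaterialised();
        Console.WriteLine(string.Join(",", materialised.First));   // 0,1,2,3,4
        Console.WriteLine(string.Join(",", materialised.Second));  // 0,1,2,3,4
    }
}
```

A third enumeration of the lazy query would yield 10 to 14, which matches what the debugger showed in the question if it was the third thing to enumerate test.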

That's normal. Select on its own gives you a lazily evaluated sequence, meaning it is evaluated each time you enumerate it. Here you enumerate it twice, once in the foreach and once when you inspect it in the debugger, and each time your Select lambda runs again and increments counter.
If you replace it with
var test = array.Select(a => counter++).ToList();
it won't be lazy anymore and will be executed exactly once, when you call ToList(). Staying lazy can still be useful, especially if you want to add conditions later, for example by appending Where clauses: you wouldn't want the query to execute before you have finished building it.
Your counter starts at zero because counter++ returns the current value first and only then increments it. If you want to start at one, either initialize counter to 1 or replace counter++ with ++counter, which increments first and then returns the new value.
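As a rough illustration of why staying lazy can help, the sketch below stacks a Where on top of the Select and only executes the pipeline once at the end. The SelectorCalls field and the odd-number filter are assumptions added purely to make the behaviour observable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyCompositionDemo
{
    // Counts how many times the Select lambda actually ran (illustration only).
    public static int SelectorCalls;

    public static List<int> BuildAndRun()
    {
        SelectorCalls = 0;
        var counter = 0;
        var array = new int[] { 0, 1, 2, 3, 4 };

        // Nothing runs here: Select only records what to do later.
        IEnumerable<int> query = array.Select(a => { SelectorCalls++; return ++counter; });

        // Still nothing runs: Where is stacked on top of the pending Select.
        query = query.Where(n => n % 2 == 1);

        // ToList() is the single point where the whole pipeline executes.
        return query.ToList();
    }

    public static void Main()
    {
        List<int> result = BuildAndRun();
        Console.WriteLine(string.Join(",", result));  // 1,3,5
        Console.WriteLine(SelectorCalls);             // 5
    }
}
```

Because ++counter is used, the values start at 1, and the lambda runs exactly five times even though two operators were chained.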

Related

Need to understand how recursion is working in finding word permutation code sample

I have a working code sample for finding possible word permutations for mistyped words. For example, someone types the word "gi", so the suggested words would be "go" or "hi". The get_nearby_letters() and isWord() functions are given.
I need to understand how sub_words gets its values. And since the function is called recursively, how is the statement char[] nearby_letters = get_nearby_letters(letters[index-1]); ever reached?
I seem to be having trouble understanding how recursive functions work.
public List<string> nearby_words(string word)
{
List<string> possible_words;
char[] letters = word.ToCharArray();
possible_words = get_nearby_permutations(letters, 0);
possible_words = possible_words.Where(x => isWord(x)).ToList();
return possible_words;
}
public List<string> get_nearby_permutations(char[] letters, int index)
{
List<string> permutations = new List<string>();
if (index >= letters.Count())
{
permutations = new List<string> { "" };
return permutations;
}
List<string> sub_words = get_nearby_permutations(letters, ++index);
char[] nearby_letters = get_nearby_letters(letters[index-1]);
foreach (var sub_word in sub_words)
{
foreach (var letter in nearby_letters)
{
permutations.Add(letter+sub_word);
}
}
return permutations;
}
The function get_nearby_permutations() is recursive because it calls itself inside the function. Now you are wondering how the part after the recursive call can ever be reached.
Have a look at the parameter index, which is counted up each time. At the start, get_nearby_permutations() is called with index = 0. Inside the function there is a recursive call with ++index, which means index is incremented by one.
This goes on until the condition index >= letters.Count() is true. This time there is no recursive call, and a List containing one empty string is returned. In the calling function, this List gets stored in the local variable sub_words.
Now everything unwinds: in each caller, the lines after the recursive call are reached and permutations is populated.
ProTip: Use debugging and breakpoints to check what your code is doing.
Edit: Example of recursive call for letters.Count()==2:
Function 1
  index = 0
  Recursive call of Function 2
    index = 1
    Recursive call of Function 3
      index = 2
      index >= letters.Count() == true
      return
    continue with f2
    return permutations
  continue with f1
  return permutations
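The trace above can be reproduced with a small self-contained sketch. GetNearbyLetters here is a hypothetical stand-in, since the real get_nearby_letters() is not shown in the question, and the Log list exists only to record the order of events. The sketch passes index + 1 rather than ++index, which behaves the same for this purpose.

```csharp
using System;
using System.Collections.Generic;

public static class RecursionTraceDemo
{
    // Records the order in which calls are entered and resumed (illustration only).
    public static readonly List<string> Log = new List<string>();

    // Hypothetical stand-in for get_nearby_letters(): the real
    // keyboard-neighbour table is not shown in the question.
    static char[] GetNearbyLetters(char c)
    {
        switch (c)
        {
            case 'g': return new[] { 'g', 'h' };
            case 'i': return new[] { 'i', 'o' };
            default:  return new[] { c };
        }
    }

    public static List<string> GetNearbyPermutations(char[] letters, int index)
    {
        string indent = new string(' ', index * 2);
        Log.Add(indent + "enter index=" + index);

        if (index >= letters.Length)
        {
            Log.Add(indent + "base case, return list with one empty string");
            return new List<string> { "" };
        }

        // The call below must finish completely before the next line runs.
        List<string> subWords = GetNearbyPermutations(letters, index + 1);
        Log.Add(indent + "back in index=" + index + " with " + subWords.Count + " sub-word(s)");

        var permutations = new List<string>();
        foreach (var subWord in subWords)
            foreach (var letter in GetNearbyLetters(letters[index]))
                permutations.Add(letter + subWord);
        return permutations;
    }

    public static void Main()
    {
        List<string> result = GetNearbyPermutations("gi".ToCharArray(), 0);
        foreach (var line in Log) Console.WriteLine(line);
        Console.WriteLine(string.Join(",", result));  // gi,hi,go,ho
    }
}
```

With the made-up neighbour table, the innermost call returns the list with the empty string, the index=1 call turns that into "i" and "o", and the index=0 call prefixes each of those with "g" and "h".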
how is sub_words is getting values
The local variable sub_words receives the return value of the method call to the get_nearby_permutations() method.
how is the char[] nearby_letters = get_nearby_letters(letters[index-1]); program statement being reached?
That program statement is executed after the previous call to get_nearby_permutations() returns. It works just like any program statement that follows a method call.
Stack Overflow isn't really the best place to seek help understanding recursion. It's a broad topic and typically requires some hand-holding with the student to walk them through the specifics. You should read articles such as the Wikipedia article Recursion (computer science) and the In plain English, what is recursion? Q&A on programmers.stackexchange.com.
At its core, recursion is two things:
A method (function) that calls itself, and
One or more termination cases, i.e. a reason for the method to not call itself
In your example, the method calls itself to obtain the results of the operation on the input after the current index. IMHO, it should have been written like this:
List<string> sub_words = get_nearby_permutations(letters, index + 1);
char[] nearby_letters = get_nearby_letters(letters[index]);
That would make it clearer that you don't really want a new value for index in the current call frame of the method; you only want the next call to use the incremented value. Incrementing the variable and then subtracting one when it is used later in the same call frame is just confusing and inefficient.
So, you have the first part, clearly. The second part, a reason to not call itself, happens because each time it calls itself, the index value increases by one. Eventually, the index value is large enough that there are no more characters to process, and the list containing the empty string is returned instead of the method calling itself.
That is fundamentally how your recursive method works. Of course, there is a bit more to it than that. After all, the method does real work as well. But that's just regular algorithmic stuff. I.e. given the results of the recursive call, now the method will create different combinations of the current letter with the various strings returned by the recursive call.
The first time the method returns, it simply returns a list containing the empty string, so at that level all of the "combinations" are just the letters near the current letter. Those letters are then returned as the sub_words values to the previous call of the method, which combines them with the letters near the previous letter.
In this way, the method works its way back, creating different permutations of possible words by trying all the different combinations of letters with each of the previously-determined, shorter letter combinations.
With all that in mind, your next step should be simply to step into the method using the debugger. You will find that with each call into the method, the index value increases by one, until eventually the method returns from the termination clause (i.e. the list containing the empty string), and then from there, each time you return a list, you proceed to generate a longer list based on the current letter and the previous list.
The debugger can be very informative in understanding this code. I recommend it be one of the very next things you try. :)

Count is zero after assigning to List

I have the following code in my program:
List<_Transaction> transactionListing = collectionRun.AttachedTransactions;
When I debug, AttachedTransactions has a count of 3 (it's also a List<_Transaction>). But the assignment does not seem to work, because transactionListing has a count of zero.
I'm perplexed.
EDIT:
On the right-hand side, the count of AttachedTransactions is 3. But on the left-hand side, the count of transactionListing remains zero after the assignment.
Try this:
List<_Transaction> transactionListing = collectionRun.AttachedTransactions.ToList();
Or:
List<_Transaction> transactionListing = new List<_Transaction>(collectionRun.AttachedTransactions);
This will make a copy of the AttachedTransactions list at the time of assignment.
At the moment you're getting a reference to collectionRun.AttachedTransactions, and I'm guessing that something else alters that list afterwards, which makes it look as if the assignment didn't work.
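The difference between holding a reference and taking a copy can be sketched as follows. The CollectionRun class and the use of string instead of _Transaction are assumptions for illustration, and the Clear() call merely simulates whatever else might be modifying the list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ListAliasDemo
{
    // Hypothetical stand-in for the asker's collectionRun object.
    public class CollectionRun
    {
        public List<string> AttachedTransactions { get; } =
            new List<string> { "t1", "t2", "t3" };
    }

    public static (int AliasCount, int CopyCount) Run()
    {
        var collectionRun = new CollectionRun();

        List<string> alias = collectionRun.AttachedTransactions;          // same list object
        List<string> copy = collectionRun.AttachedTransactions.ToList();  // independent snapshot

        // Simulates "something else" emptying the original list later on.
        collectionRun.AttachedTransactions.Clear();

        return (alias.Count, copy.Count);
    }

    public static void Main()
    {
        var counts = Run();
        Console.WriteLine(counts.AliasCount);  // 0
        Console.WriteLine(counts.CopyCount);   // 3
    }
}
```

If the copy keeps its three items while the plain assignment shows zero, the next step is to find the code that clears or replaces the original list.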

Does list.count physically iterate through the list to count it, or does it keep a pointer

I am stepping through a large list of objects to do some work on each of them.
During my iteration, I will remove some objects from the list depending on certain criteria.
Once all is done, I need to update the UI with the number of objects left in my list (a List(Of T)).
QUESTION:
When I call list.Count, does .NET actually iterate through the list to count it, or does it store the count as a property/variable?
If .NET physically re-iterates through the list, I may as well keep a counter of my own while iterating through the list and save the overhead?
Thanks
It simply keeps an internal int to track the number of items. So no iteration.
The documentation says retrieving Count is an O(1) operation:
http://msdn.microsoft.com/en-us/library/27b47ht3%28v=vs.110%29.aspx
You can see for yourself:
http://referencesource.microsoft.com/#mscorlib/system/collections/generic/list.cs
List is implemented as an array list, and it keeps track of its own size, so invoking the .Count property doesn't require any iteration.
If you call the LINQ .Count() extension method, it checks whether the underlying IEnumerable<T> implements ICollection<T> (which List<T> does) and uses the .Count property of that interface if possible. So this won't cause any iteration to occur either.
Incidentally, there are other problems you're going to encounter if you attempt to remove items from your list while iterating through it. It's not really clear how iteration should behave when you are removing elements out from under the iterator, so List<>s will avoid this issue entirely by throwing an exception if the list has been modified since its enumerator was created.
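A minimal sketch of that pitfall and one common way around it: removing inside a foreach makes the enumerator throw, whereas List<T>.RemoveAll filters in a single call, after which Count is just a field read. The sample numbers and the "remove even values" criterion are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class RemoveWhileIteratingDemo
{
    // Removing inside foreach invalidates the enumerator on the next MoveNext.
    public static bool ForeachRemovalThrows()
    {
        var list = new List<int> { 1, 2, 3, 4, 5, 6 };
        try
        {
            foreach (var n in list)
                if (n % 2 == 0) list.Remove(n);
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    // RemoveAll does the filtering for us; Count afterwards needs no iteration.
    public static int RemoveEvens()
    {
        var list = new List<int> { 1, 2, 3, 4, 5, 6 };
        list.RemoveAll(n => n % 2 == 0);
        return list.Count;
    }

    public static void Main()
    {
        Console.WriteLine(ForeachRemovalThrows());  // True
        Console.WriteLine(RemoveEvens());           // 3
    }
}
```

Iterating backwards with an index-based for loop and calling RemoveAt is another option when the removal logic does not fit in a single predicate.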
You can use a decompiler, such as the freely-available ILSpy, to answer these questions. If you're referring to the List<T> type, then the Count getter simply involves reading a field:
public int Count
{
get { return this._size; }
}
As stated here under the remarks tab
http://msdn.microsoft.com/en-us/library/27b47ht3(v=vs.110).aspx
Retrieving the value of this property is an O(1) operation.
Which means no iteration is occurring.
You tagged your question with both vb.net and c#, so in reply to "If .net physically re-iterates through the list, I may just as well keep a counter on my own iteration through the list, and save the overhead?"
If your iteration is with a For i = first To last then VB.NET will evaluate first and last when it enters the loop:
Dim first As Integer = 1
Dim last As Integer = 3
For i = first To last
Console.Write(i.ToString() & " ")
last = -99
Next
outputs: 1 2 3
If you do the equivalent in C#, the loop condition is re-evaluated on every iteration, so the change to last takes effect immediately:
int first = 1;
int last = 3;
for (int i = first; i <= last; i++)
{
    Console.Write(i.ToString() + " ");
    last = -99;
}
outputs: 1
If your .Count() function/property is expensive to evaluate and/or you don't want it to be re-evaluated on each iteration (for some other reason), then in C# you could assign it to a temporary variable.
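Here is a small sketch of the temporary-variable approach. GetLimit is a hypothetical stand-in for an expensive bound; it only counts how often it is called so that the difference between the two loops can be observed.

```csharp
using System;

public static class LoopBoundDemo
{
    // Number of times the bound was evaluated (illustration only).
    public static int LimitCalls;

    // Hypothetical "expensive" bound.
    static int GetLimit()
    {
        LimitCalls++;
        return 3;
    }

    // The condition i < GetLimit() is re-evaluated before every iteration.
    public static int CallsWhenInCondition()
    {
        LimitCalls = 0;
        for (int i = 0; i < GetLimit(); i++) { }
        return LimitCalls;
    }

    // The bound is evaluated once and cached in a local variable.
    public static int CallsWhenCached()
    {
        LimitCalls = 0;
        int limit = GetLimit();
        for (int i = 0; i < limit; i++) { }
        return LimitCalls;
    }

    public static void Main()
    {
        Console.WriteLine(CallsWhenInCondition());  // 4
        Console.WriteLine(CallsWhenCached());       // 1
    }
}
```

For List<T>.Count itself the caching is unlikely to matter, since the property only reads a field; it matters mainly when the bound is genuinely costly or must not change during the loop.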

Parallel.ForEach - Access To Modified Closure Applies?

I've read a number of other questions about Access to Modified closure so I understand the basic principle. Still, I couldn't tell - does Parallel.ForEach have the same issues?
Take the following snippet where I recompute the usage stats for users for the last week as an example:
var startTime = DateTime.Now;
var endTime = DateTime.Now.AddHours(6);
for (var i = 0; i < 7; i++)
{
// this next line gives me "Access To Modified Closure"
Parallel.ForEach(allUsers, user => UpdateUsageStats(user, startTime, endTime));
// move back a day and continue the process
startTime = startTime.AddDays(-1);
endTime = endTime.AddDays(-1);
}
From what I know of this code the foreach should run my UpdateUsageStats routine right away and start/end time variables won't be updated till the next time around the loop. Is that correct or should I use local variables to make sure there aren't issues?
You are accessing a modified closure, so it does apply. But, you are not changing its value while you are using it, so assuming you are not changing the values inside UpdateUsageStats you don't have a problem here.
Parallel.ForEach blocks until all iterations have finished, and only then do you change the values of startTime and endTime.
"Access to modified closure" only leads to problems if the capture scope leaves the loop in which the capture takes place and is used elsewhere. For example,
var list = new List<Action>();
for (var i = 0; i < 7; i++)
{
list.Add(() => Console.WriteLine(i));
}
list.ForEach(a => a()); // prints "7" 7 times, because every lambda captured the same variable `i`
In your case the lambda doing the capture doesn't leave the loop (the Parallel.ForEach call is executed completely within the loop, each time around).
You still get the warning because the analysis that raises it doesn't know whether or not Parallel.ForEach causes the lambda to be stored for later invocation. Since we know more than the tool does, we can safely ignore the warning.
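For cases where the lambda does outlive the loop, the usual remedy is to copy the loop variable into a local declared inside the loop body. The sketch below contrasts the two; it returns the values instead of printing them, and the method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ClosureCaptureDemo
{
    // All lambdas share the single loop variable i, which is read after the loop ends.
    public static List<int> SharedVariable()
    {
        var funcs = new List<Func<int>>();
        for (var i = 0; i < 7; i++)
            funcs.Add(() => i);
        return funcs.Select(f => f()).ToList();
    }

    // Each iteration declares its own copy, so each lambda sees a distinct value.
    public static List<int> LocalCopy()
    {
        var funcs = new List<Func<int>>();
        for (var i = 0; i < 7; i++)
        {
            var copy = i;
            funcs.Add(() => copy);
        }
        return funcs.Select(f => f()).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", SharedVariable()));  // 7,7,7,7,7,7,7
        Console.WriteLine(string.Join(",", LocalCopy()));       // 0,1,2,3,4,5,6
    }
}
```

Applying the same local-copy pattern to startTime and endTime in the question would also silence the warning, although it is not required there for correctness.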

Using AsSequential in order to preserve order

I am looking at this code
var numbers = Enumerable.Range(0, 20);
var parallelResult = numbers.AsParallel().AsOrdered()
.Where(i => i % 2 == 0).AsSequential();
foreach (int i in parallelResult.Take(5))
Console.WriteLine(i);
AsSequential() is supposed to make the resulting sequence sorted. It is indeed sorted after execution, but if I remove the call to AsSequential(), it is still sorted (since AsOrdered() is called).
What is the difference between the two?
AsSequential is just meant to stop any further parallel execution - hence the name. I'm not sure where you got the idea that it's "supposed to make the resulting array sorted". The documentation is pretty clear:
Converts a ParallelQuery into an IEnumerable to force sequential evaluation of the query.
As you say, AsOrdered ensures ordering (for that particular sequence).
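A short sketch of the distinction, using the query from the question: with only AsOrdered the query is still a ParallelQuery<int> and Take is the PLINQ operator, whereas after AsSequential it is a plain IEnumerable<int> and Take is ordinary LINQ to Objects. The method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PlinqOrderingDemo
{
    // Order is preserved by AsOrdered; everything, including Take, may run in parallel.
    public static List<int> OrderedOnly()
    {
        ParallelQuery<int> query = Enumerable.Range(0, 20)
            .AsParallel().AsOrdered()
            .Where(i => i % 2 == 0);
        return query.Take(5).ToList();
    }

    // AsSequential changes the static type, so Take binds to the sequential operator.
    public static List<int> OrderedThenSequential()
    {
        IEnumerable<int> query = Enumerable.Range(0, 20)
            .AsParallel().AsOrdered()
            .Where(i => i % 2 == 0)
            .AsSequential();
        return query.Take(5).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", OrderedOnly()));            // 0,2,4,6,8
        Console.WriteLine(string.Join(",", OrderedThenSequential()));  // 0,2,4,6,8
    }
}
```

Both variants produce the same ordered result; what differs is where the remaining operators execute, not the ordering.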
I know this was asked over a year ago, but here are my two cents.
In the example shown, I think AsSequential is used so that the next query operator (in this case Take) is executed sequentially.
However, the Take operator prevents a query from being parallelized unless the source elements are in their original indexing position, which is why the result is still sorted even when you remove the AsSequential call.
