Double-use of C# iterator works unexpectedly - c#

This is my first go at coding in C# - I have a background in C/Python/Javascript/Haskell.
Why does the program below work? I would expect this to work in Haskell, as lists are immutable, but I am struggling with how I can use the same iterator nums twice, without error.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace HelloWorld
{
class Program
{
static void Main(string[] args)
{
var nums = new List<int?>() { 0, 0, 2, 3, 3, 3, 4 };
var lastNums = new List<int?>() { null } .Concat(nums);
var changesAndNulls = lastNums.Zip(nums,
(last, curr) => (last == null || last != curr) ? curr : null
);
var changes = from changeOrNull in changesAndNulls where changeOrNull != null select changeOrNull;
foreach (var change in changes) {
Console.WriteLine("change: " + change);
}
}
}
}

In your code, nums is not an IEnumerator<T> (iterator); it is an IEnumerable<T>, and IEnumerable<T> has a GetEnumerator() method which can be invoked as many times as required:
IEnumerable<int?> nums = new List<int?>() { 0, 0, 2, 3, 3, 3, 4 };
// Linq gets enumerator to do Concat
using (var iterator1 = nums.GetEnumerator()) {
while (iterator1.MoveNext()) {
...
}
}
...
// Linq gets (fresh!) enumerator to do Zip
using (var iterator2 = nums.GetEnumerator()) {
while (iterator2.MoveNext()) {
...
}
}
So IEnumerable<T> is a factory producing IEnumerator<T> instances (and it is the IEnumerator<T> that can't be re-used).
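To make the factory idea concrete, here is a minimal sketch. The class and method names (EnumeratorDemo, CountWithFreshEnumerator, CountAfterExhaustion) are made up for illustration and are not part of the question's code. It shows that each GetEnumerator() call starts again from the beginning, while an enumerator that has already reached the end yields nothing more.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, for illustration only.
public static class EnumeratorDemo
{
    // Counts the elements by walking a fresh enumerator obtained from the source.
    public static int CountWithFreshEnumerator(IEnumerable<int> source)
    {
        int count = 0;
        using (IEnumerator<int> e = source.GetEnumerator())
        {
            while (e.MoveNext()) count++;
        }
        return count;
    }

    // Walks one enumerator to the end, then keeps calling MoveNext() on it.
    // Returns how many elements the second attempt sees.
    public static int CountAfterExhaustion(IEnumerable<int> source)
    {
        using (IEnumerator<int> e = source.GetEnumerator())
        {
            while (e.MoveNext()) { }        // first pass consumes everything
            int second = 0;
            while (e.MoveNext()) second++;  // already at the end, so this body never runs
            return second;
        }
    }
}
```

Calling CountWithFreshEnumerator twice on the same list gives the same count both times, which is what Concat and Zip rely on in the question.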

A List<int?> implements the interface IEnumerable<int?>. That means it has a method called GetEnumerator().
This method returns a new enumerator object (an IEnumerator<int?>) used to iterate over all the items.
You are calling GetEnumerator() (in the background) when you use it inside a foreach loop or when you call one of the many extension methods like Concat() (which call GetEnumerator() themselves).
I suggest you take a tutorial on C#. It is a very different language from Haskell and has some distinctive features of its own.
Just for starters: http://en.wikipedia.org/wiki/C_Sharp_syntax

Related

Permutation algorithm Optimization

I have this permutation code working perfectly, but it does not generate the output fast enough. I need help optimizing it to run faster; it is important that the result remains the same. I have seen other algorithms, but they don't take into consideration the output length and repetition of the same character, both of which are valid in my output. If this could be converted into a for loop over 28 alphanumeric characters, that would be awesome. Below is the current code I am looking to optimize.
namespace CSharpPermutations
{
public interface IPermutable<T>
{
ISet<T> GetRange();
}
public class Digits : IPermutable<int>
{
public ISet<int> GetRange()
{
ISet<int> set = new HashSet<int>();
for (int i = 0; i < 10; ++i)
set.Add(i);
return set;
}
}
public class AlphaNumeric : IPermutable<char>
{
public ISet<char> GetRange()
{
ISet<char> set = new HashSet<char>();
set.Add('0');
set.Add('1');
set.Add('2');
set.Add('3');
set.Add('4');
set.Add('5');
set.Add('6');
set.Add('7');
set.Add('8');
set.Add('9');
set.Add('a');
set.Add('b');
return set;
}
}
public class PermutationGenerator<T,P> : IEnumerable<string>
where P : IPermutable<T>, new()
{
public PermutationGenerator(int number)
{
this.number = number;
this.range = new P().GetRange();
}
public IEnumerator<string> GetEnumerator()
{
foreach (var item in Permutations(0,0))
{
yield return item.ToString();
}
}
IEnumerator IEnumerable.GetEnumerator()
{
foreach (var item in Permutations(0,0))
{
yield return item;
}
}
private IEnumerable<StringBuilder> Permutations(int n, int k)
{
if (n == number)
yield return new StringBuilder();
foreach (var element in range.Skip(k))
{
foreach (var result in Permutations(n + 1, k + 1))
{
yield return new StringBuilder().Append(element).Append(result);
}
}
}
private int number;
private ISet<T> range;
}
class MainClass
{
public static void Main(string[] args)
{
foreach (var element in new PermutationGenerator<char, AlphaNumeric>(2))
{
Console.WriteLine(element);
}
}
}
}
Thanks for your effort in advance.
What you're outputting there is the cartesian product of two sets; the first set is the characters "0123456789ab" and the second set is the characters "123456789ab".
Eric Lippert wrote a well-known article demonstrating how to use Linq to solve this.
We can apply this to your problem like so:
using System;
using System.Collections.Generic;
using System.Linq;
namespace Demo;
static class Program
{
static void Main(string[] args)
{
char[][] source = new char[2][];
source[0] = "0123456789ab".ToCharArray();
source[1] = "0123456789ab".ToCharArray();
foreach (var perm in Combine(source))
{
Console.WriteLine(string.Concat(perm));
}
}
public static IEnumerable<IEnumerable<T>> Combine<T>(IEnumerable<IEnumerable<T>> sequences)
{
IEnumerable<IEnumerable<T>> emptyProduct = new[] { Enumerable.Empty<T>() };
return sequences.Aggregate(
emptyProduct,
(accumulator, sequence) =>
from accseq in accumulator
from item in sequence
select accseq.Concat(new[] { item }));
}
}
You can extend this to 28 characters by modifying the source data:
source[0] = "0123456789abcdefghijklmnopqr".ToCharArray();
source[1] = "0123456789abcdefghijklmnopqr".ToCharArray();
If you want to know how this works, read Eric Lippert's excellent article, which I linked above.
Consider
foreach (var result in Permutations(n + 1, k + 1))
{
yield return new StringBuilder().Append(element).Append(result);
}
Permutations is a recursive function that implements an iterator. So each time the outer .MoveNext() advances one step of the loop, it calls MoveNext() on the nested iterator, which calls MoveNext() on its own nested iterator, and so on, resulting in N calls to MoveNext(), new StringBuilder(), Append() etc. for every result. This is quite inefficient.
I also cannot see that StringBuilder gives any advantage here. It is a benefit if you concatenate many strings, but as far as I can see you only add two strings together.
The first thing you should do is add code to measure the performance, or even better, use a profiler. That way you can tell whether any change actually improves the situation or not.
The second change I would try is to rewrite the recursion as an iterative implementation. This probably means that you need to keep track of an explicit stack of the numbers to process. Or, if this is too difficult, stop using iterator blocks and let the recursive method take a list that it adds results to.
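As a rough sketch of what an iterative version could look like, the code below replaces the recursion with an array of counters that is advanced like an odometer. It assumes the goal is every string of a fixed length over an alphabet, with repeated characters allowed; that is not identical to the original code, whose Skip(k) calls drop the first character from the second position. The names IterativeProduct and AllStrings are invented for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, for illustration only.
public static class IterativeProduct
{
    // Yields every string of the given length over the alphabet, in odometer order.
    public static IEnumerable<string> AllStrings(string alphabet, int length)
    {
        if (length < 0) throw new ArgumentOutOfRangeException(nameof(length));
        if (alphabet.Length == 0 && length > 0) yield break;

        var indices = new int[length];   // one counter per output position
        var buffer = new char[length];   // reused for every result

        while (true)
        {
            for (int i = 0; i < length; i++) buffer[i] = alphabet[indices[i]];
            yield return new string(buffer);

            // Advance the rightmost counter and carry to the left on overflow.
            int pos = length - 1;
            while (pos >= 0 && ++indices[pos] == alphabet.Length)
            {
                indices[pos] = 0;
                pos--;
            }
            if (pos < 0) yield break;    // every counter wrapped around: done
        }
    }
}
```

With the alphabet "0123456789ab" and length 2 this produces 144 strings with a single iterator and one string allocation per result. Whether it is actually faster for a given workload should still be checked by measuring, as suggested above.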

Linq Query fail to execute method

While learning Linq, I wrote the code below. The problem is that the "PrintResults()" method is never executed, and I don't understand why.
Is what I'm trying to do possible?
Thank you.
using System;
using System.Collections.Generic;
using System.Linq;
namespace Linq
{
class Program
{
static void Main(string[] args)
{
int[] scores = new int[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
//IEnumerable<int> query =
// from score in scores
// where score % 2 == 0
// select score;
// Console.WriteLine(score);
IEnumerable<int> queryResults = scores.Where(x => x % 2 == 0).ToList().Take(2);
PrintResults(queryResults);
}
static IEnumerable<int> PrintResults(IEnumerable<int> input)
{
foreach (var score in input)
{
Console.WriteLine(score);
yield return score;
}
}
}
}
When a method contains a yield return statement, it becomes an "iterator block". It will be evaluated lazily. This means that the code will not execute until some client enumerates over the IEnumerable<int> that is returned.
To see the results, invoke it like this:
var results = PrintResults(queryResults);
foreach (var result in results)
{
// do something
}
Another way to "collapse" the iterator is just to call .ToList() on the return value. That will cause it to be enumerated just like a foreach loop does:
var results = PrintResults(queryResults).ToList();
Jon Skeet describes iterator blocks in more detail here.
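The lazy behaviour can be observed directly with a small sketch. The names here (LazyDemo, Trace, BodyRuns) are invented for illustration; the counter is only there so that a caller can see when the body of the iterator block actually runs.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, for illustration only.
public static class LazyDemo
{
    // Incremented inside the iterator block so callers can observe when its body runs.
    public static int BodyRuns = 0;

    // An iterator block: calling Trace() only creates the sequence object.
    public static IEnumerable<int> Trace(IEnumerable<int> input)
    {
        foreach (var value in input)
        {
            BodyRuns++;
            yield return value;
        }
    }
}
```

After var pending = LazyDemo.Trace(new[] { 2, 4 }); the counter is still 0. It only becomes 2 once something enumerates pending, for example a foreach loop or new List<int>(pending).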

MongoDB C# Driver - Fastest way to perform an "IN" query on _id

I'm trying to get values from a collection, based on items whose IDs are in a certain collection of IDs.
My current code to build the filter is:
IEnumerable<string> IDList;
using (var enumerator = IDList.GetEnumerator())
{
if (enumerator.MoveNext() == false) return null; // empty collection
// take the first key
var key = enumerator.Current;
filter = Builders<MyClass>.Filter.Eq(p => p.Key, key);
// take all the other keys
while (enumerator.MoveNext())
{
var innerKey = enumerator.Current;
filter = filter | Builders<MyClass>.Filter.Eq(p => p.Key, innerKey);
}
}
and then my code to get the items is:
List<MyClass> values = new List<MyClass>();
using (var cursor = await MyCollection.FindAsync(filter))
{
while (await cursor.MoveNextAsync())
{
values.AddRange(cursor.Current);
}
}
This code's performance seems pretty subpar, and I'm sure there has to be a faster way since MongoDB should have very good performance... Not to mention I'm querying an indexed field, which should make the query very fast. What can I do to speed this up, both in an async way and a sync way? From some Googling I've seen that there are many ways to query a collection, and I'm not sure which way would be the best for my particular case.
Running this query in RoboMongo takes 0.02 seconds, while running it in C# MongoDb.Driver takes a full second, sometimes even longer and I'm not sure why.
Thanks in advance.
How about a simple "$in" query?
using MongoDB.Bson;
using MongoDB.Driver;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
namespace ConsoleApp1
{
public class MyClass
{
public ObjectId Id;
public string Key;
}
public class Program
{
static void Main(string[] args)
{
IEnumerable<string> ids = new [] { "a", "b", "c" };
var collection = new MongoClient().GetDatabase("test").GetCollection<MyClass>("test");
foreach (var id in ids)
{
collection.InsertOne(new MyClass { Key = id });
}
// here comes the "$in" query
var filter = Builders<MyClass>.Filter.In(myClass => myClass.Key, ids);
// sync
List<MyClass> values = collection.Find(filter).ToList();
// async
var queryTask = collection.FindAsync(filter);
values = GetValues(queryTask).Result;
Console.ReadLine();
}
private static async Task<List<MyClass>> GetValues(System.Threading.Tasks.Task<IAsyncCursor<MyClass>> queryTask)
{
var cursor = await queryTask;
return await cursor.ToListAsync<MyClass>();
}
}
}

C# Equivalent of Python's itertools.chain

What (if any) is the C# equivalent of Python's itertools.chain method?
Python Example:
l1 = [1, 2]
l2 = [3, 4]
for v in itertools.chain(l1, l2):
print(v)
Results:
1
2
3
4
Note that I'm not interested in making a new list that combines my first two and then processing that. I want the memory/time savings that itertools.chain provides by not instantiating this combined list.
You could do the same in C# by using the Concat extension method from LINQ:
l1.Concat(l2)
LINQ uses a deferred execution model, so this won't create a new list.
Enumerable.Concat (MSDN)
var l1 = new List<int>() { 1, 2 };
var l2 = new List<int>() { 3, 4 };
foreach(var item in Enumerable.Concat(l1, l2))
{
Console.WriteLine(item.ToString());
}
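One way to see that Concat does not build a combined list is to change a source list after calling Concat and before enumerating. The sketch below uses made-up names (ConcatDemo, ChainAfterMutation); it relies only on Enumerable.Concat and Enumerable.ToList.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, for illustration only.
public static class ConcatDemo
{
    public static List<int> ChainAfterMutation()
    {
        var l1 = new List<int> { 1, 2 };
        var l2 = new List<int> { 3, 4 };

        // Nothing is copied here; the result only remembers the two sources.
        IEnumerable<int> chained = l1.Concat(l2);

        // Added after Concat was called, but before anything is enumerated.
        l2.Add(5);

        // Enumeration happens now, so the late addition is included.
        return chained.ToList();
    }
}
```

The returned list contains 1, 2, 3, 4, 5, which would not be the case if Concat had copied the elements when it was called.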
I've started building my own C# equivalent of itertools. Here is what I came up with for chain. Nice and succinct, and it keeps some of the Python charm. Here are two implementations of Chain. It seems the compiler can't distinguish between the call signatures, so use the one that best suits you.
public static IEnumerable<T> Chain<T>(IEnumerable<IEnumerable<T>> collection)
{
foreach (var innerCollection in collection)
{
foreach (var item in innerCollection)
{
yield return item;
}
}
}
public static IEnumerable<T> Chain<T>(params IEnumerable<T>[] collection)
{
foreach (var innerCollection in collection)
{
foreach (var item in innerCollection)
{
yield return item;
}
}
}

How to iterate through two collections of the same length using a single foreach

I know this question has been asked many times before but I tried out the answers and they don't seem to work.
I have two lists of the same length but not the same type, and I want to iterate through both of them at the same time as list1[i] is connected to list2[i].
Eg:
Assuming that i have list1 (as List<string>) and list2 (as List<int>)
I want to do something like
foreach( var listitem1, listitem2 in list1, list2)
{
// do stuff
}
Is this possible?
This is possible using the .NET 4 LINQ Zip() operator, or using the open source MoreLINQ library, which provides a Zip() operator as well so you can use it in earlier .NET versions.
Example from MSDN:
int[] numbers = { 1, 2, 3, 4 };
string[] words = { "one", "two", "three" };
// The following example concatenates corresponding elements of the
// two input sequences.
var numbersAndWords = numbers.Zip(words, (first, second) => first + " " + second);
foreach (var item in numbersAndWords)
{
Console.WriteLine(item);
}
// OUTPUT:
// 1 one
// 2 two
// 3 three
Useful links:
Source code of the MoreLINQ Zip() implementation: MoreLINQ Zip.cs
Edit - Iterating whilst positioning at the same index in both collections
If the requirement is to move through both collections in a 'synchronized' fashion, i.e. to use the 1st element of the first collection with the 1st element of the second collection, then 2nd with 2nd, and so on, without needing to perform any side-effecting code, then see @sll's answer and use .Zip() to project out pairs of elements at the same index, until one of the collections runs out of elements.
More Generally
Instead of the foreach, you can obtain an IEnumerator from the IEnumerable of each collection using the GetEnumerator() method, and then call MoveNext() on that enumerator when you need to move on to the next element in its collection. This technique is common when processing two or more ordered streams, without needing to materialize the streams.
var stream1Enumerator = stream1.GetEnumerator();
var stream2Enumerator = stream2.GetEnumerator();
var currentGroupId = -1; // Initial value
// i.e. Until stream1Enumerator runs out of
while (stream1Enumerator.MoveNext())
{
// Now you can iterate the collections independently
if (stream1Enumerator.Current.Id != currentGroupId)
{
stream2Enumerator.MoveNext();
currentGroupId = stream2Enumerator.Current.Id;
}
// Do something with stream1Enumerator.Current and stream2Enumerator.Current
}
As others have pointed out, if the collections are materialized and support indexing, for example through the IList<T> interface, you can also use the subscript [] operator, although this feels rather clumsy nowadays:
var smallestUpperBound = Math.Min(collection1.Count, collection2.Count);
for (var index = 0; index < smallestUpperBound; index++)
{
// Do something with collection1[index] and collection2[index]
}
Finally, there is also an overload of Linq's .Select() which provides the index ordinal of the element returned, which could also be useful.
e.g. the code below pairs up the elements of collection1 alternately with the first two elements of collection2:
var alternatePairs = collection1.Select(
(item1, index1) => new
{
Item1 = item1,
Item2 = collection2[index1 % 2]
});
The short answer is no, you can't.
The longer answer is that foreach is syntactic sugar: it gets an enumerator from the collection and calls MoveNext() on it. It cannot do this for two collections at the same time.
If you just want to have a single loop, you can use a for loop and use the same index value for both collections.
for(int i = 0; i < collectionsLength; i++)
{
var item1 = list1[i];
var item2 = list2[i];
}
An alternative is to merge both collections into one using the LINQ Zip operator (new to .NET 4.0) and iterate over the result.
foreach(var tup in list1.Zip(list2, (i1, i2) => Tuple.Create(i1, i2)))
{
var listItem1 = tup.Item1;
var listItem2 = tup.Item2;
/* The "do stuff" from your question goes here */
}
Much of your "do stuff" may be able to go in the lambda that creates the tuple here, which would be even better.
If the collections support access by index, though, a for() loop is probably simpler still.
Update: Now, with the built-in support for ValueTuple in C# 7.0, we can use:
foreach ((var listitem1, var listitem2) in list1.Zip(list2, (i1, i2) => (i1, i2)))
{
/* The "do stuff" from your question goes here */
}
You can wrap the two IEnumerable<> in a helper class:
var nums = new []{1, 2, 3};
var strings = new []{"a", "b", "c"};
ForEach(nums, strings).Do((n, s) =>
{
Console.WriteLine(n + " " + s);
});
//-----------------------------
public static TwoForEach<A, B> ForEach<A, B>(IEnumerable<A> a, IEnumerable<B> b)
{
return new TwoForEach<A, B>(a, b);
}
public class TwoForEach<A, B>
{
private IEnumerator<A> a;
private IEnumerator<B> b;
public TwoForEach(IEnumerable<A> a, IEnumerable<B> b)
{
this.a = a.GetEnumerator();
this.b = b.GetEnumerator();
}
public void Do(Action<A, B> action)
{
while (a.MoveNext() && b.MoveNext())
{
action.Invoke(a.Current, b.Current);
}
}
}
Instead of a foreach, why not use a for()? For example:
int length = list1.Count;
for(int i = 0; i < length; i++)
{
// do stuff with list1[i] and list2[i] here.
}
