LINQ query fails to execute method - C#

While learning LINQ, I wrote the code below. The problem is that the PrintResults() method is never executed, and I don't understand why.
Is what I'm trying to do possible?
Thank you.
using System;
using System.Collections.Generic;
using System.Linq;
namespace Linq
{
class Program
{
static void Main(string[] args)
{
int[] scores = new int[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
//IEnumerable<int> query =
// from score in scores
// where score % 2 == 0
// select score;
// Console.WriteLine(score);
IEnumerable<int> queryResults = scores.Where(x => x % 2 == 0).ToList().Take(2);
PrintResults(queryResults);
}
static IEnumerable<int> PrintResults(IEnumerable<int> input)
{
foreach (var score in input)
{
Console.WriteLine(score);
yield return score;
}
}
}
}

When a method contains a yield return statement, it becomes an "iterator block". It will be evaluated lazily. This means that the code will not execute until some client enumerates over the IEnumerable<int> that is returned.
To see the results, invoke it like this:
var results = PrintResults(queryResults);
foreach (var result in results)
{
// do something
}
Another way to "collapse" the iterator is just to call .ToList() on the return value. That will cause it to be enumerated just like a foreach loop does:
var results = PrintResults(queryResults).ToList();
Jon Skeet describes iterator blocks in more detail here.
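The deferred behaviour described above can be observed directly. The following is a minimal sketch, not the asker's code: the DeferredDemo class, the Numbers method, and the Produced counter are names made up for illustration. It counts how many times the iterator body has run before and after the sequence is enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DeferredDemo
{
    // Hypothetical counter: how many values the iterator body has produced so far.
    public static int Produced;

    // An iterator block: the body does not run when the method is called,
    // only when the returned sequence is enumerated.
    public static IEnumerable<int> Numbers()
    {
        foreach (var n in new[] { 2, 4 })
        {
            Produced++;
            yield return n;
        }
    }

    static void Main()
    {
        var sequence = Numbers();        // builds the iterator, runs none of the body
        Console.WriteLine(Produced);     // prints 0

        var list = sequence.ToList();    // enumerates, so the body runs now
        Console.WriteLine(Produced);     // prints 2
    }
}
```

If a method exists only to print, declaring it as void and dropping the yield return statement avoids the laziness altogether.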

Related

Permutation algorithm Optimization

I have this permutation code working perfectly, but it does not generate the results fast enough. I need help optimizing the code to run faster. It is important that the result remains the same. I have seen other algorithms, but they don't take into consideration the output length and repeated characters, which are all valid output. If this could be converted into a for loop over 28 alphanumeric characters, that would be awesome. Below is the current code I am looking to optimize.
namespace CSharpPermutations
{
public interface IPermutable<T>
{
ISet<T> GetRange();
}
public class Digits : IPermutable<int>
{
public ISet<int> GetRange()
{
ISet<int> set = new HashSet<int>();
for (int i = 0; i < 10; ++i)
set.Add(i);
return set;
}
}
public class AlphaNumeric : IPermutable<char>
{
public ISet<char> GetRange()
{
ISet<char> set = new HashSet<char>();
set.Add('0');
set.Add('1');
set.Add('2');
set.Add('3');
set.Add('4');
set.Add('5');
set.Add('6');
set.Add('7');
set.Add('8');
set.Add('9');
set.Add('a');
set.Add('b');
return set;
}
}
public class PermutationGenerator<T,P> : IEnumerable<string>
where P : IPermutable<T>, new()
{
public PermutationGenerator(int number)
{
this.number = number;
this.range = new P().GetRange();
}
public IEnumerator<string> GetEnumerator()
{
foreach (var item in Permutations(0,0))
{
yield return item.ToString();
}
}
IEnumerator IEnumerable.GetEnumerator()
{
foreach (var item in Permutations(0,0))
{
yield return item;
}
}
private IEnumerable<StringBuilder> Permutations(int n, int k)
{
if (n == number)
yield return new StringBuilder();
foreach (var element in range.Skip(k))
{
foreach (var result in Permutations(n + 1, k + 1))
{
yield return new StringBuilder().Append(element).Append(result);
}
}
}
private int number;
private ISet<T> range;
}
class MainClass
{
public static void Main(string[] args)
{
foreach (var element in new PermutationGenerator<char, AlphaNumeric>(2))
{
Console.WriteLine(element);
}
}
}
}
Thanks for your effort in advance.
What you're outputting there is the cartesian product of two sets; the first set is the characters "0123456789ab" and the second set is the characters "123456789ab".
Eric Lippert wrote a well-known article demonstrating how to use Linq to solve this.
We can apply this to your problem like so:
using System;
using System.Collections.Generic;
using System.Linq;
namespace Demo;
static class Program
{
static void Main(string[] args)
{
char[][] source = new char[2][];
source[0] = "0123456789ab".ToCharArray();
source[1] = "0123456789ab".ToCharArray();
foreach (var perm in Combine(source))
{
Console.WriteLine(string.Concat(perm));
}
}
public static IEnumerable<IEnumerable<T>> Combine<T>(IEnumerable<IEnumerable<T>> sequences)
{
IEnumerable<IEnumerable<T>> emptyProduct = new[] { Enumerable.Empty<T>() };
return sequences.Aggregate(
emptyProduct,
(accumulator, sequence) =>
from accseq in accumulator
from item in sequence
select accseq.Concat(new[] { item }));
}
}
You can extend this to 28 characters by modifying the source data:
source[0] = "0123456789abcdefghijklmnopqr".ToCharArray();
source[1] = "0123456789abcdefghijklmnopqr".ToCharArray();
If you want to know how this works, read Eric Lippert's excellent article, which I linked above.
Consider
foreach (var result in Permutations(n + 1, k + 1))
{
yield return new StringBuilder().Append(element).Append(result);
}
Permutations is a recursive function that implements an iterator. So each time the .MoveNext() method is called, it advances one step of the loop, which calls MoveNext() on the nested iterator in turn, and so on, resulting in N calls to MoveNext(), new StringBuilder(), Append() etc. This is quite inefficient.
I also cannot see that StringBuilder gives any advantage here. It is a benefit if you concatenate many strings, but as far as I can see you only add two strings together.
The first thing you should do is add code to measure the performance, or even better, use a profiler. That way you can tell whether any change actually improves the situation or not.
The second change I would try is to rewrite the recursion as an iterative implementation. This probably means that you need to keep track of an explicit stack of the numbers to process. Or, if this is too difficult, stop using iterator blocks and let the recursive method take a list that it adds results to.
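One possible shape for such an iterative version is sketched below. It assumes that every position draws from the same alphabet (a plain cartesian product with repetition), which is not exactly what the question's Skip(k) logic produces, so treat it as a starting point and measure it against the original. The FixedLengthStrings class and Generate method are hypothetical names. Instead of recursion it keeps one index per position and advances them like an odometer, reusing a single char buffer.

```csharp
using System;
using System.Collections.Generic;

static class FixedLengthStrings
{
    // Yields every string of the given length over the alphabet,
    // advancing the rightmost position first (odometer order).
    public static IEnumerable<string> Generate(string alphabet, int length)
    {
        if (string.IsNullOrEmpty(alphabet))
            throw new ArgumentException("Alphabet must not be empty.", nameof(alphabet));
        if (length < 0)
            throw new ArgumentOutOfRangeException(nameof(length));

        var indices = new int[length];   // current alphabet index for each position
        var buffer = new char[length];   // current characters, reused between results
        for (int i = 0; i < length; i++)
            buffer[i] = alphabet[0];

        while (true)
        {
            yield return new string(buffer);

            // Find the rightmost position that can still be advanced,
            // resetting every exhausted position on the way.
            int pos = length - 1;
            while (pos >= 0 && indices[pos] == alphabet.Length - 1)
            {
                indices[pos] = 0;
                buffer[pos] = alphabet[0];
                pos--;
            }
            if (pos < 0)
                yield break;             // every position wrapped around: done

            indices[pos]++;
            buffer[pos] = alphabet[indices[pos]];
        }
    }

    static void Main()
    {
        foreach (var s in Generate("0123456789ab", 2))
            Console.WriteLine(s);
    }
}
```

Each result costs one string allocation and no nested iterator calls, so the work per item does not grow with recursion depth.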

MongoDB C# Driver - Fastest way to perform an "IN" query on _id

I'm trying to get values from a collection, based on items whose IDs are in a certain collection of IDs.
My current code to build the filter is:
IEnumerable<string> IDList;
using (var enumerator = IDList.GetEnumerator())
{
if (enumerator.MoveNext() == false) return null; // empty collection
// take the first key
var key = enumerator.Current;
filter = Builders<MyClass>.Filter.Eq(p => p.Key, key);
// take all the other keys
while (enumerator.MoveNext())
{
var innerKey = enumerator.Current;
filter = filter | Builders<MyClass>.Filter.Eq(p => p.Key, innerKey);
}
}
and then my code to get the items is:
List<MyClass> values = new List<MyClass>();
using (var cursor = await MyCollection.FindAsync(filter))
{
while (await cursor.MoveNextAsync())
{
values.AddRange(cursor.Current);
}
}
This code's performance seems pretty subpar, and I'm sure there has to be a faster way since MongoDB should have very good performance... Not to mention I'm querying an indexed field, which should make the query very fast. What can I do to speed this up, both in an async way and a sync way? From some Googling I've seen that there are many ways to query a collection, and I'm not sure which way would be the best for my particular case.
Running this query in RoboMongo takes 0.02 seconds, while running it in C# MongoDb.Driver takes a full second, sometimes even longer and I'm not sure why.
Thanks in advance.
How about a simple "$in" query?
using MongoDB.Bson;
using MongoDB.Driver;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
namespace ConsoleApp1
{
public class MyClass
{
public ObjectId Id;
public string Key;
}
public class Program
{
static void Main(string[] args)
{
IEnumerable<string> ids = new [] { "a", "b", "c" };
var collection = new MongoClient().GetDatabase("test").GetCollection<MyClass>("test");
foreach (var id in ids)
{
collection.InsertOne(new MyClass { Key = id });
}
// here comes the "$in" query
var filter = Builders<MyClass>.Filter.In(myClass => myClass.Key, ids);
// sync
List<MyClass> values = collection.Find(filter).ToList();
// async
var queryTask = collection.FindAsync(filter);
values = GetValues(queryTask).Result;
Console.ReadLine();
}
private static async Task<List<MyClass>> GetValues(System.Threading.Tasks.Task<IAsyncCursor<MyClass>> queryTask)
{
var cursor = await queryTask;
return await cursor.ToListAsync<MyClass>();
}
}
}

Double-use of C# iterator works unexpectedly

This is my first go at coding in C# - I have a background in C/Python/Javascript/Haskell.
Why does the program below work? I would expect this to work in Haskell, as lists are immutable, but I am struggling with how I can use the same iterator nums twice, without error.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace HelloWorld
{
class Program
{
static void Main(string[] args)
{
var nums = new List<int?>() { 0, 0, 2, 3, 3, 3, 4 };
var lastNums = new List<int?>() { null } .Concat(nums);
var changesAndNulls = lastNums.Zip(nums,
(last, curr) => (last == null || last != curr) ? curr : null
);
var changes = from changeOrNull in changesAndNulls where changeOrNull != null select changeOrNull;
foreach (var change in changes) {
Console.WriteLine("change: " + change);
}
}
}
}
In your code, nums is not an IEnumerator<T> (iterator); it's an IEnumerable<T>, and IEnumerable<T> has a GetEnumerator() method which can be invoked as many times as required:
IEnumerable<int?> nums = new List<int?>() { 0, 0, 2, 3, 3, 3, 4 };
// Linq gets enumerator to do Concat
using (var iterator1 = nums.GetEnumerator()) {
while (iterator1.MoveNext()) {
...
}
}
...
// Linq gets (fresh!) enumerator to do Zip
using (var iterator2 = nums.GetEnumerator()) {
while (iterator2.MoveNext()) {
...
}
}
So IEnumerable<T> is a factory producing IEnumerator<T> instances (and it is an IEnumerator<T> that can't be re-used).
A List<int?> implements the interface IEnumerable<int?>. That means it has a method called GetEnumerator().
This method returns a new enumerator object (a List<int?>.Enumerator) used to iterate over all the items.
You are calling GetEnumerator() (in the background) when you use it inside a foreach loop or when you call one of the many extension methods like Concat() (which call GetEnumerator() themselves).
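The difference between the two interfaces can be checked with a small experiment. This is only an illustrative sketch; the EnumeratorDemo class and CountWith helper are made-up names. It asks the same list for two enumerators and shows that each one keeps its own position, while a single enumerator is spent after one pass.

```csharp
using System;
using System.Collections.Generic;

static class EnumeratorDemo
{
    // Counts how many items the given enumerator can still produce.
    public static int CountWith(IEnumerator<int> enumerator)
    {
        int count = 0;
        while (enumerator.MoveNext())
            count++;
        return count;
    }

    static void Main()
    {
        IEnumerable<int> nums = new List<int> { 1, 2, 3 };

        using (var first = nums.GetEnumerator())
        using (var second = nums.GetEnumerator())
        {
            Console.WriteLine(CountWith(first));   // prints 3
            Console.WriteLine(CountWith(first));   // prints 0: this enumerator is exhausted
            Console.WriteLine(CountWith(second));  // prints 3: a fresh, independent enumerator
        }
    }
}
```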
I suggest you take a tutorial on C#. It is a very different language from Haskell and it has some unique features.
Just for starters: http://en.wikipedia.org/wiki/C_Sharp_syntax

Test if all values in a list are unique

I have a small list of bytes and I want to test that they're all different values.
For instance, I have this:
List<byte> theList = new List<byte> { 1,4,3,6,1 };
What's the best way to check if all values are distinct or not?
bool isUnique = theList.Distinct().Count() == theList.Count();
Here's another approach which is more efficient than Enumerable.Distinct + Enumerable.Count (all the more if the sequence is not a collection type). It uses a HashSet<T> which eliminates duplicates, is very efficient in lookups and has a count-property:
var distinctBytes = new HashSet<byte>(theList);
bool allDifferent = distinctBytes.Count == theList.Count;
or another - more subtle and efficient - approach:
var diffChecker = new HashSet<byte>();
bool allDifferent = theList.All(diffChecker.Add);
HashSet<T>.Add returns false if the element could not be added since it was already in the HashSet. Enumerable.All stops on the first "false".
Okay, here is the most efficient method I can think of using standard .NET:
using System;
using System.Collections.Generic;
public static class Extension
{
public static bool HasDuplicate<T>(
this IEnumerable<T> source,
out T firstDuplicate)
{
if (source == null)
{
throw new ArgumentNullException(nameof(source));
}
var checkBuffer = new HashSet<T>();
foreach (var t in source)
{
if (checkBuffer.Add(t))
{
continue;
}
firstDuplicate = t;
return true;
}
firstDuplicate = default(T);
return false;
}
}
Essentially, what is the point of enumerating the whole sequence twice if all you want to do is find the first duplicate?
I could optimise this more by special-casing empty and single-element sequences, but that would detract from readability/maintainability with minimal gain.
The similar logic to Distinct using GroupBy:
var isUnique = theList.GroupBy(i => i).Count() == theList.Count;
I check if an IEnumerable (array, list, etc.) is unique like this:
var isUnique = someObjectsEnum.GroupBy(o => o.SomeProperty).Max(g => g.Count()) == 1;
One can also use a HashSet:
var uniqueIds = new HashSet<long>(originalList.Select(item => item.Id));
if (uniqueIds.Count != originalList.Count)
{
}
There are many solutions.
And no doubt there are more beautiful ones using LINQ, as "juergen d" and "Tim Schmelter" mentioned.
But if you care about complexity and speed, the best solution is to implement it yourself.
One solution is to create an array of size N (for byte, that's 256).
Then loop over the list and, on every iteration, check the array cell at the index equal to the current value. If the cell is already 1, that value has been seen before and therefore the list isn't distinct; otherwise, set the cell to 1 and continue checking.
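A minimal sketch of that lookup-table idea follows. The ByteUniqueness class and AllDistinct method are hypothetical names, and a bool array stands in for the 0/1 counters described above.

```csharp
using System;
using System.Collections.Generic;

static class ByteUniqueness
{
    // Returns true when no byte value occurs more than once.
    public static bool AllDistinct(IEnumerable<byte> values)
    {
        var seen = new bool[256];        // one flag per possible byte value
        foreach (byte b in values)
        {
            if (seen[b])
                return false;            // value already met: stop at the first duplicate
            seen[b] = true;
        }
        return true;
    }

    static void Main()
    {
        Console.WriteLine(AllDistinct(new List<byte> { 1, 4, 3, 6, 1 }));  // prints False
        Console.WriteLine(AllDistinct(new List<byte> { 1, 4, 3, 6 }));     // prints True
    }
}
```

The table is a fixed 256 entries regardless of the list length, and the loop stops at the first repeated value.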
And another solution, if you want to find duplicated values.
var values = new [] { 9, 7, 2, 6, 7, 3, 8, 2 };
var sorted = values.ToList();
sorted.Sort();
for (var index = 1; index < sorted.Count; index++)
{
var previous = sorted[index - 1];
var current = sorted[index];
if (current == previous)
Console.WriteLine(string.Format("duplicated value: {0}", current));
}
Output:
duplicated value: 2
duplicated value: 7
http://rextester.com/SIDG48202

C# Equivalent of Python's itertools.chain

What (if any) is the C# equivalent of Python's itertools.chain method?
Python Example:
l1 = [1, 2]
l2 = [3, 4]
for v in itertools.chain(l1, l2):
print(v)
Results:
1
2
3
4
Note that I'm not interested in making a new list that combines my first two and then processing that. I want the memory/time savings that itertools.chain provides by not instantiating this combined list.
You could do the same in C# by using the Concat extension method from LINQ:
l1.Concat(l2)
LINQ uses a deferred execution model, so this won't create a new list.
Enumerable.Concat (MSDN)
var l1 = new List<int>() { 1, 2 };
var l2 = new List<int>() { 3, 4 };
foreach(var item in Enumerable.Concat(l1, l2))
{
Console.WriteLine(item.ToString());
}
I've started building my own C# equivalent of itertools. Here is what I came up with for chain. Nice and succinct and keeps some of the Python charm. Here are two implementations of Chain. It seems the compiler can't distinguish between the call signatures, so use the one that best suits you.
public static IEnumerable<T> Chain<T>(IEnumerable<IEnumerable<T>> collection)
{
foreach (var innerCollection in collection)
{
foreach (var item in innerCollection)
{
yield return item;
}
}
}
public static IEnumerable<T> Chain<T>(params IEnumerable<T>[] collection)
{
foreach (var innerCollection in collection)
{
foreach (var item in innerCollection)
{
yield return item;
}
}
}
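For comparison, the same flattening can be expressed with LINQ's built-in SelectMany, which is also lazily evaluated. The sketch below is an alternative to the hand-written loops rather than the answer author's code; the ChainDemo class name is made up, and the type argument is written out explicitly at the call site to keep overload resolution out of the picture.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ChainDemo
{
    // Lazily yields the items of each sequence in turn, without building a combined list.
    public static IEnumerable<T> Chain<T>(params IEnumerable<T>[] sequences)
    {
        return sequences.SelectMany(sequence => sequence);
    }

    static void Main()
    {
        var l1 = new List<int> { 1, 2 };
        var l2 = new List<int> { 3, 4 };

        foreach (var v in Chain<int>(l1, l2))
            Console.WriteLine(v);        // prints 1, 2, 3, 4 on separate lines
    }
}
```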
