I need help understanding LINQ. Where is my understanding going wrong?

I need help understanding LINQ. Where is my understanding going wrong? - c#

I'll give you my best attempt at understanding it and you let me know where I'm going wrong.
For simplicity, let's assume that we live in a word that only has
numbers 1, 2, 3, 4, 5
operators %, > with their usual precdence
I want to dissamble what happens when I do
List<int> All = new List<int> { 1, 2, 3, 4, 5 };
IEnumerable<int> Filtered = from i in All
where i % 2 == 1
orderby i descending
select i;
foreach ( var i in Filtererd )
{
Console.WriteLine(i);
}
What I understand first of all is that the query itself does not create a Ienumerable<int>; it creates and Expression Tree associated with the query. The elements returned by the query are yielded in an invisible function created by the compiler like
public static IEnumerable<int> MyInvisibleFunction ( List<int> Source )
{
foreach ( int i in Source.Reverse() )
{
if ( i % 2 == 1 )
{
yield return i;
}
}
}
(Of course that's kind of a weird example because Source.Reverse() is itself a query, but anyways ...)
Now I'm confused where expression tress come into play here. When I think of expression trees, I think of trees like
(3 % 1 > 0)
/ \
/ \
(3 % 1) > 0
/ \
3 % 1
in the small world I've created. But where does a tree like that come in to play in my LINQ query
from i in All
where i % 2 == 1
orderby i descending
select i
??? That's what I don't understand. I'm looking at the Expression class and I see how it could create the example tree I showed, but I don't see where it would come into play in my query.

I'll give you my best attempt at understanding it and you let me know where I'm going wrong.
OK.
What I understand first of all is that the query itself does not create a Ienumerable<int>;
This statement is completely wrong.
it creates and Expression Tree associated with the query.
This statement is also completely wrong.
The elements returned by the query are yielded in an invisible function created by the compiler
This statement is also completely wrong.
where does a tree like that come in to play in my LINQ query
It doesn't. Your query uses no expression trees.
I'm looking at the Expression class and I see how it could create the example tree I showed, but I don't see where it would come into play
It doesn't.
want to dissamble what happens when I do
List<int> All = new List<int> { 1, 2, 3, 4, 5 };
IEnumerable<int> Filtered = from i in All
where i % 2 == 1
orderby i descending
select i;
foreach ( var i in Filtererd )
Console.WriteLine(i);
Let's break it down. First the compiler turns that into
List<int> All = new List<int> { 1, 2, 3, 4, 5 };
IEnumerable<int> Filtered = All.Where(i => i % 2 == 1).OrderBy(i => i);
foreach ( var i in Filtererd )
Console.WriteLine(i);
Next the compiler does overload resolution and evaluates extension methods
List<int> All = new List<int> { 1, 2, 3, 4, 5 };
IEnumerable<int> Filtered =
Enumerable.OrderBy<int>(
Enumerable.Where<int>(All, i => i % 2 == 1)),
i => i));
foreach ( var i in Filtererd )
Console.WriteLine(i);
Next lambdas are desugared:
static bool A1(int i) { return i % 2 == 1; )
static int A2(int i) { return i }
...
List<int> All = new List<int> { 1, 2, 3, 4, 5 };
IEnumerable<int> Filtered =
Enumerable.OrderBy<int>(
Enumerable.Where<int>(All, new Func<int, bool>(A1))),
new Func<int, int>(A2)));
foreach (var i in Filtererd )
Console.WriteLine(i);
That is actually not how the lambdas are desugared exactly; they are also cached, but let's ignore that detail.
I assume that you do not want the foreach desugared. See the C# specification for details.
If you want to know what Where and OrderBy do, read the source code.

Expression trees don't come into play in your query, because your source is a regular in-memory list. – Theodoros Chatzigiannakis
This is true.
There is no invisible iterator function being generated. Your query translates to:
List<int> All = new List<int> { 1, 2, 3, 4, 5 };
IEnumerable<int> Filtered =
All
.Where(i => i % 2 == 0)
.OrderByDescending(i => i);
There is no need for custom iterators. The language just calls the existing library functions.
This is the same for IQueryable except that the lambda arguments are not passed as delegates but as expression trees.
You can see this in action by commenting in the AsQueryable() call here.

Related

How to create new list from list of list where elements are in new list are in alternative order? [duplicate]

This question already has answers here:
Interleaving multiple (more than 2) irregular lists using LINQ
(5 answers)
Closed 5 years ago.
Suppose I have list of list. I want to create new list from given list of list such that elements are in order of example given below.
Inputs:-
List<List<int>> l = new List<List<int>>();
List<int> a = new List<int>();
a.Add(1);
a.Add(2);
a.Add(3);
a.Add(4);
List<int> b = new List<int>();
b.Add(11);
b.Add(12);
b.Add(13);
b.Add(14);
b.Add(15);
b.Add(16);
b.Add(17);
b.Add(18);
l.Add(a);
l.Add(b);
Output(list):-
1
11
2
12
3
13
4
14
15
16
And output list must not contain more than 10 elements.
I am currently doing this using foreach inside while but I want to know how can I do this using LINQ.
int loopCounter = 0,index=0;
List<int> o=new List<int>();
while(o.Count<10)
{
foreach(List<int> x in l)
{
if(o.Count<10)
o.Add(x[index]);
}
index++;
}
Thanks.

Use the SelectMany and Select overloads that receive the item's index. That will be used to apply the desired ordering. The use of the SelectMany is to flatten the nested collections level. Last, apply Take to retrieve only the desired number of items:
var result = l.SelectMany((nested, index) =>
nested.Select((item, nestedIndex) => (index, nestedIndex, item)))
.OrderBy(i => i.nestedIndex)
.ThenBy(i => i.index)
.Select(i => i.item)
.Take(10);
Or in query syntax:
var result = (from c in l.Select((nestedCollection, index) => (nestedCollection, index))
from i in c.nestedCollection.Select((item, index) => (item, index))
orderby i.index, c.index
select i.item).Take(10);
If using a C# 6.0 and prior project an anonymous type instead:
var result = l.SelectMany((nested, index) =>
nested.Select((item, nestedIndex) => new {index, nestedIndex, item}))
.OrderBy(i => i.nestedIndex)
.ThenBy(i => i.index)
.Select(i => i.item)
.Take(10);
To explain why Zip alone is not enough: zip is equivalent to performing a join operation on the second collection to the first, where the
attribute to join by is the index. Therefore Only items that exist in the first collection, if they have a match in the second, will appear in the result.
The next option is to think about left join which will return all items of the first collection with a match (if exists) in the second. In the case described OP is looking for the functionality of a full outer join - get all items of both collection and match when possible.

I know you asked for LINQ, but I do often feel that LINQ is a hammer and as soon as a developer finds it, every problem is a nail. I wouldn't have done this one with LINQ, for a readability/maintainability point of view because I think something like this is simpler and easier to understand/more self documenting:
List<int> r = new List<int>(10);
for(int i = 0; i < 10; i++){
if(i < a.Count)
r.Add(a[i]);
if(i < b.Count)
r.Add(b[i]);
}
You don't need to stop the loop early if a and b collectively only have eg 8 items, but you could by extending the test of the for loop
I also think this case may be more performant than LINQ because it's doing a lot less
If your mandate to use LINQ is academic (this is a homework that must use LINQ) then go ahead, but if it's a normal everyday system that some other poor sucker will have to maintain one day, I implore you to consider whether this is a good application for LINQ

This will handle 2 or more internal List<List<int>>'s - it returns an IEnumerable<int> via yield so you have to call .ToList() on it to make it a list. Linq.Any is used for the break criteria.
Will throw on any list being null. Add checks to your liking.
static IEnumerable<int> FlattenZip (List<List<int>> ienum, int maxLength = int.MaxValue)
{
int done = 0;
int index = 0;
int yielded = 0;
while (yielded <= maxLength && ienum.Any (list => index < list.Count))
foreach (var l in ienum)
{
done++;
if (index < l.Count)
{
// this list is big enough, we will take one out
yielded++;
yield return l[index];
}
if (yielded > maxLength)
break; // we are done
if (done % (ienum.Count) == 0)
index += 1; // checked all lists, advancing index
}
}
public static void Main ()
{
// other testcases to consider:
// in total too few elememts
// one list empty (but not null)
// too many lists (11 for 10 elements)
var l1 = new List<int> { 1, 2, 3, 4 };
var l2 = new List<int> { 11, 12, 13, 14, 15, 16 };
var l3 = new List<int> { 21, 22, 23, 24, 25, 26 };
var l = new List<List<int>> { l1, l2, l3 };
var zipped = FlattenZip (l, 10);
Console.WriteLine (string.Join (", ", zipped));
Console.ReadLine ();
}

Apply lambda expression for specific indices in list c#

I am new to lambda expression and I have problem to convert parts of code which involves indices of list inside the loop to equivalent lambda expression.
Example 1: Working with different indices inside one list
List<double> newList = new List<double>();
for (int i = 0; i < list.Count - 1; ++i)
{
newList.Add(list[i] / list[i + 1]);
}
Example 2: Working with indices from two lists
double result = 0;
for (int i = 0; i < list1.Count; ++i)
{
result += f(list1[i]) * g(list2[i]);
}
How to write equivalent lambda expressions?

A lambda expression is something that looks like {params} => {body}, where the characteristic symbol is the => "maps to." What you are asking for are typically referred to as LINQ query expressions, which come in two styles:
The functional style of query is typically a sequence of chained calls to LINQ extension methods such as Select, Where, Take, or ToList. This is the style I have used in the examples below and is also the much-more prevalent style (in my experience).
The "language integrated" style (*) uses built-in C# keywords that the compiler will turn into the functional style for you. For example:
var query = from employee in employeeList
where employee.ManagerId == 17
select employee.Name;
| compiler
v rewrite
var query = employeeList
.Where(employee => employee.ManagerId == 17)
.Select(employee => employee.Name);
Example 1:
var newList = Enumerable.Range(0, list.Count - 1)
.Select(i => list[i] / list[i + 1])
.ToList();
Example 2:
var result = Enumerable.Zip(list1.Select(f), list2.Select(g), (a, b) => a * b).Sum();
(*) I'm not actually sure this is the official name for it. Please correct me with the proper name if you know it.

For your first example you can try using Zip and Skip:
var result = list.Zip(list.Skip(1), (x, y) => x / y);
How does it work
If your initial list is {1, 2, 3}, then Skip(1) would result in {2, 3}; then Zip takes {1, 2, 3} and {2, 3} as inputs, and tells that, for each pair of numbers, it shall divide them. Repeat until there are no more elements in at least one list, and return the result as IEnumerable.
For the second example,
var result = list1.Zip(list2, (x, y) => f(x) * f(y)).Sum();

How do I use LINQ to find 5 elements in a row that match one predicate, but where the sixth element doesn't?

I'm trying to learn LINQ and it seems that finding a series of 'n' elements that match a predicate should be possible but I can't seem to figure out how to approach the problem.
My solution actually needs a second, different predicate to test the 'end' of the sequence but finding the first element that doesn't past a test, after a sequence of at least 5 elements that do pass the test would also be interesting.
Here is my naive non-LINQ approach....
int numPassed = 0;
for (int i = 0; i < array.Count - 1; i++ )
{
if (FirstTest(array[i]))
{
numPassed++;
}
else
{
numPassed = 0;
}
if ((numPassed > 5) && SecondTest(array[i + 1]))
{
foundindex = i;
break;
}
}

A performant LINQ solution is possible but frankly quite ugly. The idea is to isolate subsequences that match the description (a series of N items matching a predicate that ends when an item is found that matches a second predicate) and then select the first of these that has a minimum length.
Let's say that the parameters are:
var data = new[] { 0, 1, 1, 1, 0, 0, 2, 2, 2, 2, 2 };
Func<int, bool> acceptPredicate = i => i != 0;
// The reverse of acceptPredicate, but could be otherwise
Func<int, bool> rejectPredicate = i => i == 0;
Isolating subsequences is possible with GroupBy and a bunch of ugly stateful code (here's the inherent awkwardness -- you have to keep non-trivial state). The idea is to group by an artificial and arbitrary "group number", choosing a different number whenever we move from a subsequence that might be acceptable to one that definitely is not acceptable and when the reverse happens as well:
var acceptMode = false;
var groupCount = 0;
var groups = data.GroupBy(i => {
if (acceptMode && rejectPredicate(i)) {
acceptMode = false;
++groupCount;
}
else if (!acceptMode && acceptPredicate(i)) {
acceptMode = true;
++groupCount;
}
return groupCount;
});
The last step (finding the first group of acceptable length) is easy, but there is one last pitfall: making sure that you don't select one of the groups that do not satisfy the stated condition:
var result = groups.Where(g => !rejectPredicate(g.First()))
.FirstOrDefault(g => g.Count() >= 5);
All of the above is achieved with a single pass over the source sequence.
Note that this code will accept a sequence of items that also ends the source sequence (i.e. it does not terminate because we found an item that satisfies rejectPredicate but because we ran out of data). If you don't want this a slight modification will be required.
See it in action.

Not elegant, but this will work:
var indexList = array
.Select((x, i) => new
{ Item = x, Index = i })
.Where(item =>
item.Index + 5 < array.Length &&
FirstTest(array[item.Index]) &&
FirstTest(array[item.Index+1]) &&
FirstTest(array[item.Index+2]) &&
FirstTest(array[item.Index+3]) &&
FirstTest(array[item.Index+4]) &&
SecondTest(array[item.Index+5]))
.Select(item => item.Index);

Instead of trying to combine existing extension methods, it is much more cleaner to use an Enumerator.
Example:
IEnumerable<T> MatchThis<T>(IEnumerable<T> source,
Func<T, bool> first_predicate,
Int32 times_match,
Func<T, bool> second_predicate)
{
var found = new List<T>();
using (var en = source.GetEnumerator())
{
while(en.MoveNext() && found.Count < times_match)
if (first_predicate((T)en.Current))
found.Add((T)en.Current);
else
found.Clear();
if (found.Count < times_match && !en.MoveNext() || !second_predicate((T)en.Current))
return Enumerable.Empty<T>();
found.Add((T)en.Current);
return found;
}
}
Usage:
var valid_seq = new Int32[] {800, 3423, 423423, 1, 2, 3, 4, 5, 200, 433, 32};
var result = MatchThis(valid_seq, e => e<100, 5, e => e>100);
Result:

var result = array.GetSixth(FirstTest).FirstOrDefault(SecondTest);
internal static class MyExtensions
{
internal static IEnumerable<T> GetSixth<T>(this IEnumerable<T> source, Func<T, bool> predicate)
{
var counter=0;
foreach (var item in source)
{
if (counter==5) yield return item;
counter = predicate(item) ? counter + 1 : 0;
}
}
}

It looks like you want continuous 6 elements, the first 5 of which match predicate1, and the last (the 6th) matches predicate2. Your non-linq version works fine, using linq in this case is a little reluctant. And trying to resolve the problem in one linq query makes the issue harder, here is a (maybe) cleaner linq solution:
int continuous = 5;
var temp = array.Select(n => FirstTest(n) ? 1 : 0);
var result = array.Where((n, index) =>
index >= continuous
&& SecondTest(n)
&& temp.Skip(index - continuous).Take(continuous).Sum() == continuous)
.FirstOrDefault();
Things will be easier if you have Morelinq.Batch method.

Like others have mentioned, LINQ is not the ideal solution to this kind of pattern matching need. But still, it is possible, and it doesn't have to be ugly:
Func<int, bool> isBody = n => n == 8;
Func<int, bool> isEnd = n => n == 2;
var requiredBodyLength = 5;
// Index: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
int[] sequence = { 6, 8, 8, 9, 2, 1, 8, 8, 8, 8, 8, 8, 8, 2, 5 };
// ^-----------^ ^
// Body End
// First we stick an index on each element, since that's the desired output.
var indexedSequence = sequence.Select((n, i) => new { Index = i, Number = n }).ToArray();
// Scroll to the right to see comments
var patternMatchIndexes = indexedSequence
.Select(x => indexedSequence.Skip(x.Index).TakeWhile(x2 => isBody(x2.Number))) // Skip all the elements we already processed and try to match the body
.Where(body => body.Count() == requiredBodyLength) // Filter out any body sequences of incorrect length
.Select(body => new { BodyIndex = body.First().Index, EndIndex = body.Last().Index + 1 }) // Prepare the index of the first body element and the index of the end element
.Where(x => x.EndIndex < sequence.Length && isEnd(sequence[x.EndIndex])) // Make sure there is at least one element after the body and that it's an end element
.Select(x => x.BodyIndex) // There may be more than one matching pattern, get all their indices
.ToArray();
//patternMatchIndexes.Dump(); // Uncomment in LINQPad to see results
Note that this implementation is not performant at all, it is only meant as a teaching aid to show how something can be done in LINQ despite the unsuitability of solving it that way.

Find the number of differences between two lists

I want to compare two lists with the same number of elements, and find the number of differences between them. Right now, I have this code (which works):
public static int CountDifferences<T> (this IList<T> list1, IList<T> list2)
{
if (list1.Count != list2.Count)
throw new ArgumentException ("Lists must have the same number of elements", "list2");
int count = 0;
for (int i = 0; i < list1.Count; i++) {
if (!EqualityComparer<T>.Default.Equals (list1[i], list2[i]))
count++;
}
return count;
}
This feels messy to me, and it seems like there must be a more elegant way to achieve it. Is there a way, perhaps, to combine the two lists into a single list of tuples, then simple examine each element of the new list to see if both elements are equal?

Since order in the list does count this would be my approach:
public static int CountDifferences<T>(this IList<T> list1, IList<T> list2)
{
if (list1.Count != list2.Count)
throw new ArgumentException("Lists must have the same number of elements", "list2");
int count = list1.Zip(list2, (a, b) => a.Equals(b) ? 0 : 1).Sum();
return count;
}
Simply merging the lists using Enumerable.Zip() then summing up the differences, still O(n) but this just enumerates the lists once.
Also this approach would work on any two IEnumerable of the same type since we do not use the list indexer (besides obviously in your count comparison in the guard check).

I think your approach is fine, but you could use LINQ to simplify your function:
public static int CountDifferences<T>(this IList<T> list1, IList<T> list2)
{
if(list1.Count != list2.Count)
throw new ArgumentException("Lists must have same # elements", "list2");
return list1.Where((t, i) => !Equals(t, list2[i])).Count();
}
The way you have it written in the question, I don't think Intersect does what you're looking for. For example, say you have:
var list1 = new List<int> { 1, 2, 3, 4, 6, 8 };
var list2 = new List<int> { 1, 2, 4, 5, 6, 8 };
If you run list1.CountDifferences(list2), I'm assuming that you want to get back 2 since elements 2 and 3 are different. Intersect in this case will return 5 since the lists have 5 elements in common. So, if you're looking for 5 then Intersect is the way to go. If you're looking to return 2 then you could use the LINQ statement above.

Try something like this:
var result = list1.Intersect(list2);
var differences = list1.Count - result.Count();
If order counts:
var result = a.Where((x,i) => x !=b[i]);
var differences = result.Count();

You want the Intersect extension method of Enumerable.
public static int CountDifferences<T> (this IList<T> list1, IList<T> list2)
{
if (list1.Count != list2.Count)
throw new ArgumentException ("Lists must have the same number of elements", "list2");
return list1.Count - list1.Intersect(list2).Count();
}

You can use the extension method Zip of List.
List<int> lst1 = new List<int> { 1, 2, 3, 4, 5 };
List<int> lst2 = new List<int> { 6, 2, 9, 4, 5 };
int cntDiff = lst1.Zip(lst2, (a, b) => a != b).Count(a => a);
// Output is 2

LINQ: Determine if two sequences contains exactly the same elements

I need to determine whether or not two sets contains exactly the same elements. The ordering does not matter.
For instance, these two arrays should be considered equal:
IEnumerable<int> data = new []{3, 5, 6, 9};
IEnumerable<int> otherData = new []{6, 5, 9, 3}
One set cannot contain any elements, that are not in the other.
Can this be done using the built-in query operators? And what would be the most efficient way to implement it, considering that the number of elements could range from a few to hundreds?

If you want to treat the arrays as "sets" and ignore order and duplicate items, you can use HashSet<T>.SetEquals method:
var isEqual = new HashSet<int>(first).SetEquals(second);
Otherwise, your best bet is probably sorting both sequences in the same way and using SequenceEqual to compare them.

I suggest sorting both, and doing an element-by-element comparison.
data.OrderBy(x => x).SequenceEqual(otherData.OrderBy(x => x))
I'm not sure how fast the implementation of OrderBy is, but if it's a O(n log n) sort like you'd expect the total algorithm is O(n log n) as well.
For some cases of data, you can improve on this by using a custom implementation of OrderBy that for example uses a counting sort, for O(n+k), with k the size of the range wherein the values lie.

If you might have duplicates (or if you want a solution which performs better for longer lists), I'd try something like this:
static bool IsSame<T>(IEnumerable<T> set1, IEnumerable<T> set2)
{
if (set1 == null && set2 == null)
return true;
if (set1 == null || set2 == null)
return false;
List<T> list1 = set1.ToList();
List<T> list2 = set2.ToList();
if (list1.Count != list2.Count)
return false;
list1.Sort();
list2.Sort();
return list1.SequenceEqual(list2);
}
UPDATE: oops, you guys are right-- the Except() solution below needs to look both ways before crossing the street. And it has lousy perf for longer lists. Ignore the suggestion below! :-)
Here's one easy way to do it. Note that this assumes the lists have no duplicates.
bool same = data.Except (otherData).Count() == 0;

Here is another way to do it:
IEnumerable<int> data = new[] { 3, 5, 6, 9 };
IEnumerable<int> otherData = new[] { 6, 5, 9, 3 };
data = data.OrderBy(d => d);
otherData = otherData.OrderBy(d => d);
data.Zip(otherData, (x, y) => Tuple.Create(x, y)).All(d => d.Item1 == d.Item2);

First, check the length. If they are different, the sets are different.
you can do data.Intersect(otherData);, and check the length is identical.
OR, simplt sort the sets, and iterate through them.

First check if both data collections have the same number of elements and the check if all the elements in one collection are presented in the other
IEnumerable<int> data = new[] { 3, 5, 6, 9 };
IEnumerable<int> otherData = new[] { 6, 5, 9, 3 };
bool equals = data.Count() == otherData.Count() && data.All(x => otherData.Contains(x));

This should help:
IEnumerable<int> data = new []{ 3,5,6,9 };
IEnumerable<int> otherData = new[] {6, 5, 9, 3};
if(data.All(x => otherData.Contains(x)))
{
//Code Goes Here
}

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

I need help understanding LINQ. Where is my understanding going wrong? - c#

Related

How to create new list from list of list where elements are in new list are in alternative order? [duplicate]

Apply lambda expression for specific indices in list c#

How do I use LINQ to find 5 elements in a row that match one predicate, but where the sixth element doesn't?

Find the number of differences between two lists

LINQ: Determine if two sequences contains exactly the same elements

Categories

Resources