LINQ: Determine if two sequences contain exactly the same elements - c#

I need to determine whether two sets contain exactly the same elements. The ordering does not matter.
For instance, these two arrays should be considered equal:
IEnumerable<int> data = new []{3, 5, 6, 9};
IEnumerable<int> otherData = new []{6, 5, 9, 3};
Neither set may contain any elements that are not in the other.
Can this be done using the built-in query operators? And what would be the most efficient way to implement it, considering that the number of elements could range from a few to hundreds?

If you want to treat the arrays as "sets" and ignore order and duplicate items, you can use the HashSet<T>.SetEquals method:
var isEqual = new HashSet<int>(first).SetEquals(second);
Otherwise, your best bet is probably sorting both sequences in the same way and using SequenceEqual to compare them.
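To make the difference between the two options concrete, here is a minimal sketch. The class and method names (SetVsBagComparison, EqualAsSets, EqualAsBags) are made up for illustration and are not part of any library.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SetVsBagComparison
{
    // Ignores order AND duplicates: {3, 5, 6, 9, 9} is equal to {6, 5, 9, 3}.
    public static bool EqualAsSets(IEnumerable<int> first, IEnumerable<int> second)
    {
        return new HashSet<int>(first).SetEquals(second);
    }

    // Ignores order only: every value must occur the same number of times.
    public static bool EqualAsBags(IEnumerable<int> first, IEnumerable<int> second)
    {
        return first.OrderBy(x => x).SequenceEqual(second.OrderBy(x => x));
    }
}
```

For {3, 5, 6, 9, 9} and {6, 5, 9, 3} the first method returns true and the second returns false, so the choice depends on whether duplicates should count.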

I suggest sorting both, and doing an element-by-element comparison.
data.OrderBy(x => x).SequenceEqual(otherData.OrderBy(x => x))
I'm not sure how fast the implementation of OrderBy is, but if it's an O(n log n) sort, as you'd expect, the total algorithm is O(n log n) as well.
For some cases of data, you can improve on this by using a custom implementation of OrderBy that for example uses a counting sort, for O(n+k), with k the size of the range wherein the values lie.
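As a rough sketch of the counting idea, assuming the values are integers known to lie in a small range [min, max] (the class name, method name and the min/max parameters are assumptions made for this example): instead of producing sorted output, this variant compares the per-value counts directly, which is the same O(n+k) histogram step a counting sort starts with.

```csharp
using System;
using System.Collections.Generic;

public static class CountingComparison
{
    // Assumes every value lies in [min, max] and that the range is small
    // enough to allocate one counter per possible value.
    public static bool SameElements(IEnumerable<int> first, IEnumerable<int> second, int min, int max)
    {
        var counts = new int[max - min + 1];

        foreach (int value in first)
        {
            if (value < min || value > max) throw new ArgumentOutOfRangeException(nameof(first));
            counts[value - min]++; // one more occurrence seen in the first sequence
        }

        foreach (int value in second)
        {
            if (value < min || value > max) throw new ArgumentOutOfRangeException(nameof(second));
            counts[value - min]--; // cancel it against the second sequence
        }

        // The sequences hold the same elements exactly when every counter is back at zero.
        foreach (int count in counts)
        {
            if (count != 0) return false;
        }
        return true;
    }
}
```

This treats duplicates as significant, like the sorting approach does.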

If you might have duplicates (or if you want a solution which performs better for longer lists), I'd try something like this:
static bool IsSame<T>(IEnumerable<T> set1, IEnumerable<T> set2)
{
if (set1 == null && set2 == null)
return true;
if (set1 == null || set2 == null)
return false;
List<T> list1 = set1.ToList();
List<T> list2 = set2.ToList();
if (list1.Count != list2.Count)
return false;
list1.Sort();
list2.Sort();
return list1.SequenceEqual(list2);
}
UPDATE: oops, you guys are right-- the Except() solution below needs to look both ways before crossing the street. And it has lousy perf for longer lists. Ignore the suggestion below! :-)
Here's one easy way to do it. Note that this assumes the lists have no duplicates.
bool same = data.Except (otherData).Count() == 0;

Here is another way to do it:
IEnumerable<int> data = new[] { 3, 5, 6, 9 };
IEnumerable<int> otherData = new[] { 6, 5, 9, 3 };
data = data.OrderBy(d => d);
otherData = otherData.OrderBy(d => d);
data.Zip(otherData, (x, y) => Tuple.Create(x, y)).All(d => d.Item1 == d.Item2);

First, check the length. If they are different, the sets are different.
Then you can compute data.Intersect(otherData) and check that its length is identical to that of the inputs.
Or simply sort the sets and iterate through them.

First check that both collections have the same number of elements, then check that all the elements in one collection are present in the other:
IEnumerable<int> data = new[] { 3, 5, 6, 9 };
IEnumerable<int> otherData = new[] { 6, 5, 9, 3 };
bool equals = data.Count() == otherData.Count() && data.All(x => otherData.Contains(x));

This should help:
IEnumerable<int> data = new []{ 3,5,6,9 };
IEnumerable<int> otherData = new[] {6, 5, 9, 3};
if(data.All(x => otherData.Contains(x)))
{
//Code Goes Here
}

Related

How to create a new list from a list of lists where the elements of the new list are in alternating order? [duplicate]

This question already has answers here:
Interleaving multiple (more than 2) irregular lists using LINQ
Suppose I have a list of lists. I want to create a new list from it such that the elements are in the order shown in the example below.
Inputs:-
List<List<int>> l = new List<List<int>>();
List<int> a = new List<int>();
a.Add(1);
a.Add(2);
a.Add(3);
a.Add(4);
List<int> b = new List<int>();
b.Add(11);
b.Add(12);
b.Add(13);
b.Add(14);
b.Add(15);
b.Add(16);
b.Add(17);
b.Add(18);
l.Add(a);
l.Add(b);
Output(list):-
1
11
2
12
3
13
4
14
15
16
And output list must not contain more than 10 elements.
I am currently doing this using foreach inside while but I want to know how can I do this using LINQ.
int index = 0;
List<int> o = new List<int>();
while (o.Count < 10)
{
    foreach (List<int> x in l)
    {
        // skip lists that have no element left at this index
        if (o.Count < 10 && index < x.Count)
            o.Add(x[index]);
    }
    index++;
}
Thanks.
Use the SelectMany and Select overloads that receive the item's index; the indexes are then used to apply the desired ordering. SelectMany flattens the nested collections into a single sequence. Finally, apply Take to retrieve only the desired number of items:
var result = l.SelectMany((nested, index) =>
nested.Select((item, nestedIndex) => (index, nestedIndex, item)))
.OrderBy(i => i.nestedIndex)
.ThenBy(i => i.index)
.Select(i => i.item)
.Take(10);
Or in query syntax:
var result = (from c in l.Select((nestedCollection, index) => (nestedCollection, index))
from i in c.nestedCollection.Select((item, index) => (item, index))
orderby i.index, c.index
select i.item).Take(10);
If you are using C# 6.0 or earlier, project to an anonymous type instead:
var result = l.SelectMany((nested, index) =>
nested.Select((item, nestedIndex) => new {index, nestedIndex, item}))
.OrderBy(i => i.nestedIndex)
.ThenBy(i => i.index)
.Select(i => i.item)
.Take(10);
To explain why Zip alone is not enough: Zip is equivalent to performing a join of the second collection to the first, where the attribute to join by is the index. Therefore only items that exist in the first collection and have a match in the second will appear in the result.
The next option is to think about a left join, which returns all items of the first collection, each with its match (if one exists) in the second. In the case described, the OP is looking for the functionality of a full outer join: get all items of both collections and match them when possible.
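The truncation is easy to see in a small sketch. The class and method names here are invented for the example; only Zip, SelectMany and ToList are real LINQ calls.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ZipTruncationDemo
{
    // Interleaves two sequences using Zip only. Zip stops as soon as the
    // shorter input is exhausted, so the tail of the longer one is lost.
    public static List<int> InterleaveWithZip(IEnumerable<int> first, IEnumerable<int> second)
    {
        return first.Zip(second, (x, y) => new[] { x, y })
                    .SelectMany(pair => pair)
                    .ToList();
    }
}
```

With a = {1, 2, 3, 4} and b = {11, ..., 18} from the question, this yields 1, 11, 2, 12, 3, 13, 4, 14 and silently drops 15 through 18, which is why the index-based ordering is needed instead.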
I know you asked for LINQ, but I do often feel that LINQ is a hammer and as soon as a developer finds it, every problem is a nail. I wouldn't have done this one with LINQ, from a readability/maintainability point of view, because I think something like this is simpler, easier to understand and more self-documenting:
List<int> r = new List<int>(10);
for(int i = 0; i < 10 && r.Count < 10; i++){
    if(i < a.Count)
        r.Add(a[i]);
    if(i < b.Count && r.Count < 10) // stop once 10 items have been collected
        r.Add(b[i]);
}
You don't need to stop the loop early if a and b collectively only have e.g. 8 items, but you could by extending the test of the for loop.
I also think this may be more performant than LINQ because it's doing a lot less.
If your mandate to use LINQ is academic (this is a homework that must use LINQ) then go ahead, but if it's a normal everyday system that some other poor sucker will have to maintain one day, I implore you to consider whether this is a good application for LINQ
This will handle a List<List<int>> with 2 or more inner lists. It returns an IEnumerable<int> via yield, so you have to call .ToList() on it to make it a list. Enumerable.Any is used for the break criterion.
Will throw on any list being null. Add checks to your liking.
static IEnumerable<int> FlattenZip (List<List<int>> ienum, int maxLength = int.MaxValue)
{
    int index = 0;
    int yielded = 0;
    // continue while more items are wanted and some list still has an element at this index
    while (yielded < maxLength && ienum.Any (list => index < list.Count))
    {
        foreach (var l in ienum)
        {
            if (index < l.Count)
            {
                // this list is big enough, we will take one out
                yielded++;
                yield return l[index];
            }
            if (yielded >= maxLength)
                break; // we are done
        }
        index += 1; // checked all lists, advancing index
    }
}
public static void Main ()
{
// other testcases to consider:
// in total too few elements
// one list empty (but not null)
// too many lists (11 for 10 elements)
var l1 = new List<int> { 1, 2, 3, 4 };
var l2 = new List<int> { 11, 12, 13, 14, 15, 16 };
var l3 = new List<int> { 21, 22, 23, 24, 25, 26 };
var l = new List<List<int>> { l1, l2, l3 };
var zipped = FlattenZip (l, 10);
Console.WriteLine (string.Join (", ", zipped));
Console.ReadLine ();
}

Linq GroupBy converted to a conditional?

If I have the following List
List<int> Values = new List<int>() { 1, 5, 6, 2, 5 };
and I want to check for duplicates, so I use GroupBy:
Values.GroupBy(x => x).Where(g => g.Count() > 1).Select(g => g.Key)
How do I make this work in a While conditional/convert to boolean so that, if and only if there are any duplicates present in Values, evaluate to true, otherwise, evaluate false using Linq? I suspect I need to use .Any somehow, but I'm not sure where to fit it in.
Simply check before adding randomly generated number in the list
do
{
var randomNumber = GenerateNumber(); // placeholder: generate the number here
if(!Values.Contains(randomNumber))
{
Values.Add(randomNumber);
break;
}
}while(true);
The root problem here is that you are using a Select call. The Select call you are using will return an IEnumerable of ints, not a boolean.
The only way code like that would compile as part of a boolean (be it as a bool variable or as a check done inside an If, While, etc), is if you remove both the Select and the Where calls, and replace it with an Any:
List<int> Values = new List<int>() { 1, 5, 6, 2, 5 };
Values.GroupBy(x => x).Any(g => g.Count() > 1);
If you then need to find out what the duplicates actually are, the best way to do this would be to store the results of the GroupBy call in a variable, use that variable to call Any in your boolean check, and grab the items from the variable when you want to report on them:
List<int> Values = new List<int>() { 1, 5, 6, 2, 5 };
var duplicateValues = Values.GroupBy(x => x).Where(g => g.Count() > 1);
bool anyDuplicates = duplicateValues.Any();
var duplicateKeys = duplicateValues.Select(x => x.Key).ToList();
You can use a combination of a foreach loop and the return value of the HashSet<int>.Add method.
With this approach you don't need to loop over all values if a duplicate is found early, whereas GroupBy always has to loop over all values.
public static bool HaveDuplicates(this IEnumerable<int> values)
{
var set = new HashSet<int>();
foreach (var value in values)
{
if (set.Add(value) == false)
{
return true;
}
}
return false;
}
Use it as extension method
var values = new List<int>() { 1, 5, 6, 1, 2, 5, 6, 7, 8, 9 };
if (values.HaveDuplicates())
{
// have duplicates
}
I'm trying to use it in a do while loop. Inside the do, I generate a
random number from 1 to 9, and add it to the Values list. if the
number that gets generated is one already in the list, then run the
loop over again. If not, escape the loop
Use HashSet<T> for saving the generated numbers. HashSet<T> checks for duplicates in O(1), whereas List<T>.Contains is O(n).
var values = new HashSet<int>();
do
{
    var generatedValue = GenerateNumber(); // your generation logic
    if (values.Add(generatedValue))
    {
        break; // Add returned true: the number was not in the set yet
    }
}
while (yourCondition);
You can iterate over a HashSet like any other collection, or convert it to a List if you need to:
var numbers = values.ToList();
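Putting the pieces together, a minimal sketch of collecting distinct random numbers could look like the following. The helper name GenerateUnique and its parameters are hypothetical; it relies only on Random.Next(minValue, maxValue), whose upper bound is exclusive, and on HashSet<T>.Add returning false for a value that is already present.

```csharp
using System;
using System.Collections.Generic;

public static class UniqueRandomNumbers
{
    // Returns `count` distinct numbers from [min, max] in the order they were drawn.
    public static List<int> GenerateUnique(int count, int min, int max, Random rng)
    {
        if (count > max - min + 1)
            throw new ArgumentException("Range is too small for the requested count.", nameof(count));

        var seen = new HashSet<int>();
        var result = new List<int>();
        while (result.Count < count)
        {
            int candidate = rng.Next(min, max + 1); // upper bound is exclusive, hence + 1
            if (seen.Add(candidate))                // false means duplicate: draw again
                result.Add(candidate);
        }
        return result;
    }
}
```

For example, GenerateUnique(9, 1, 9, new Random()) returns the numbers 1 to 9 in a random order. When count is close to the size of the range, shuffling the full range is usually the better fit, because late draws are mostly duplicates.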

I need help understanding LINQ. Where is my understanding going wrong?

I'll give you my best attempt at understanding it and you let me know where I'm going wrong.
For simplicity, let's assume that we live in a world that only has
numbers 1, 2, 3, 4, 5
operators %, > with their usual precedence
I want to disassemble what happens when I do
List<int> All = new List<int> { 1, 2, 3, 4, 5 };
IEnumerable<int> Filtered = from i in All
where i % 2 == 1
orderby i descending
select i;
foreach ( var i in Filtered )
{
Console.WriteLine(i);
}
What I understand first of all is that the query itself does not create an IEnumerable<int>; it creates an Expression Tree associated with the query. The elements returned by the query are yielded in an invisible function created by the compiler like
public static IEnumerable<int> MyInvisibleFunction ( List<int> Source )
{
foreach ( int i in Source.Reverse() )
{
if ( i % 2 == 1 )
{
yield return i;
}
}
}
(Of course that's kind of a weird example because Source.Reverse() is itself a query, but anyways ...)
Now I'm confused about where expression trees come into play here. When I think of expression trees, I think of trees like
     (3 % 1 > 0)
       /     \
      /       \
  (3 % 1)  >   0
   /   \
  3  %  1
in the small world I've created. But where does a tree like that come in to play in my LINQ query
from i in All
where i % 2 == 1
orderby i descending
select i
??? That's what I don't understand. I'm looking at the Expression class and I see how it could create the example tree I showed, but I don't see where it would come into play in my query.
I'll give you my best attempt at understanding it and you let me know where I'm going wrong.
OK.
What I understand first of all is that the query itself does not create an IEnumerable<int>;
This statement is completely wrong.
it creates an Expression Tree associated with the query.
This statement is also completely wrong.
The elements returned by the query are yielded in an invisible function created by the compiler
This statement is also completely wrong.
where does a tree like that come in to play in my LINQ query
It doesn't. Your query uses no expression trees.
I'm looking at the Expression class and I see how it could create the example tree I showed, but I don't see where it would come into play
It doesn't.
I want to disassemble what happens when I do
List<int> All = new List<int> { 1, 2, 3, 4, 5 };
IEnumerable<int> Filtered = from i in All
where i % 2 == 1
orderby i descending
select i;
foreach ( var i in Filtered )
Console.WriteLine(i);
Let's break it down. First the compiler turns that into
List<int> All = new List<int> { 1, 2, 3, 4, 5 };
IEnumerable<int> Filtered = All.Where(i => i % 2 == 1).OrderByDescending(i => i);
foreach ( var i in Filtered )
Console.WriteLine(i);
Next the compiler does overload resolution and evaluates extension methods
List<int> All = new List<int> { 1, 2, 3, 4, 5 };
IEnumerable<int> Filtered =
    Enumerable.OrderByDescending<int, int>(
        Enumerable.Where<int>(All, i => i % 2 == 1),
        i => i);
foreach ( var i in Filtered )
Console.WriteLine(i);
Next lambdas are desugared:
static bool A1(int i) { return i % 2 == 1; }
static int A2(int i) { return i; }
...
List<int> All = new List<int> { 1, 2, 3, 4, 5 };
IEnumerable<int> Filtered =
    Enumerable.OrderByDescending<int, int>(
        Enumerable.Where<int>(All, new Func<int, bool>(A1)),
        new Func<int, int>(A2));
foreach (var i in Filtered )
Console.WriteLine(i);
That is actually not how the lambdas are desugared exactly; they are also cached, but let's ignore that detail.
I assume that you do not want the foreach desugared. See the C# specification for details.
If you want to know what Where and OrderBy do, read the source code.
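For orientation, a Where-style operator can be pictured roughly as below. This is a simplified sketch under a made-up name (MyWhere), not the actual library source, which also validates its arguments and has optimized paths for specific collection types.

```csharp
using System;
using System.Collections.Generic;

public static class SimplifiedOperators
{
    // An ordinary iterator method: the body does not start running until the
    // returned sequence is enumerated (deferred execution).
    public static IEnumerable<T> MyWhere<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (T item in source)
        {
            if (predicate(item))
                yield return item;
        }
    }
}
```

The iterator lives in the library method, not in code generated for the query: the compiler only emits a call to it and passes the lambda as a delegate.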
Expression trees don't come into play in your query, because your source is a regular in-memory list. – Theodoros Chatzigiannakis
This is true.
There is no invisible iterator function being generated. Your query translates to:
List<int> All = new List<int> { 1, 2, 3, 4, 5 };
IEnumerable<int> Filtered =
All
.Where(i => i % 2 == 1)
.OrderByDescending(i => i);
There is no need for custom iterators. The language just calls the existing library functions.
This is the same for IQueryable except that the lambda arguments are not passed as delegates but as expression trees.
You can see this in action by commenting in the AsQueryable() call here.
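The delegate versus expression tree distinction can be sketched as follows. The class and member names are illustrative only; Func<,>, Expression<>, AsQueryable and the LINQ operators are the real APIs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class DelegateVsExpressionTree
{
    // Compiled code: it can only be invoked.
    public static readonly Func<int, bool> AsDelegate = i => i % 2 == 1;

    // The same lambda as a data structure: it can be inspected or translated.
    public static readonly Expression<Func<int, bool>> AsTree = i => i % 2 == 1;

    public static List<int> FilterAsQueryable(List<int> all)
    {
        // On IQueryable<int>, Where and OrderByDescending bind to the Queryable
        // overloads, so the lambdas below are passed as expression trees.
        return all.AsQueryable()
                  .Where(i => i % 2 == 1)
                  .OrderByDescending(i => i)
                  .ToList();
    }
}
```

AsTree.Body is a binary node of type ExpressionType.Equal whose left operand is the modulo operation, which is the kind of tree drawn in the question. For an in-memory list the result is the same as without AsQueryable: 5, 3, 1.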

Find the number of differences between two lists

I want to compare two lists with the same number of elements, and find the number of differences between them. Right now, I have this code (which works):
public static int CountDifferences<T> (this IList<T> list1, IList<T> list2)
{
if (list1.Count != list2.Count)
throw new ArgumentException ("Lists must have the same number of elements", "list2");
int count = 0;
for (int i = 0; i < list1.Count; i++) {
if (!EqualityComparer<T>.Default.Equals (list1[i], list2[i]))
count++;
}
return count;
}
This feels messy to me, and it seems like there must be a more elegant way to achieve it. Is there a way, perhaps, to combine the two lists into a single list of tuples, then simply examine each element of the new list to see if both elements are equal?
Since order in the list does count this would be my approach:
public static int CountDifferences<T>(this IList<T> list1, IList<T> list2)
{
if (list1.Count != list2.Count)
throw new ArgumentException("Lists must have the same number of elements", "list2");
int count = list1.Zip(list2, (a, b) => a.Equals(b) ? 0 : 1).Sum();
return count;
}
This simply merges the lists using Enumerable.Zip() and then sums up the differences. It is still O(n), but it enumerates each list only once.
Also this approach would work on any two IEnumerable of the same type since we do not use the list indexer (besides obviously in your count comparison in the guard check).
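A sketch of that generalization to any two sequences is shown below; the class name is invented for the example. It uses EqualityComparer<T>.Default, as the question's code does, so a null element does not cause a NullReferenceException the way a.Equals(b) would. Note that Zip stops at the end of the shorter sequence, so without a length check any extra trailing elements are simply not counted.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SequenceDifferences
{
    // Counts the positions at which the two sequences hold different elements.
    // Only positions present in BOTH sequences are compared (Zip semantics).
    public static int CountDifferences<T>(IEnumerable<T> first, IEnumerable<T> second)
    {
        var comparer = EqualityComparer<T>.Default; // null-safe comparison
        return first.Zip(second, (a, b) => comparer.Equals(a, b) ? 0 : 1).Sum();
    }
}
```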
I think your approach is fine, but you could use LINQ to simplify your function:
public static int CountDifferences<T>(this IList<T> list1, IList<T> list2)
{
if(list1.Count != list2.Count)
throw new ArgumentException("Lists must have same # elements", "list2");
return list1.Where((t, i) => !Equals(t, list2[i])).Count();
}
The way you have it written in the question, I don't think Intersect does what you're looking for. For example, say you have:
var list1 = new List<int> { 1, 2, 3, 4, 6, 8 };
var list2 = new List<int> { 1, 2, 4, 5, 6, 8 };
If you run list1.CountDifferences(list2), I'm assuming that you want to get back 2, since the elements at indexes 2 and 3 are different. Intersect in this case will return 5, since the lists have 5 elements in common. So, if you're looking for 5, then Intersect is the way to go. If you're looking to return 2, then you could use the LINQ statement above.
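The two numbers for this example can be checked with a short sketch (class and method names are made up for illustration):

```csharp
using System.Collections.Generic;
using System.Linq;

public static class PositionalVsIntersect
{
    // Position-by-position comparison; assumes both lists have the same length.
    public static int PositionalDifferences(IList<int> list1, IList<int> list2)
    {
        return list1.Where((value, index) => value != list2[index]).Count();
    }

    // Number of distinct values that occur in both lists, regardless of position.
    public static int CommonDistinctValues(IList<int> list1, IList<int> list2)
    {
        return list1.Intersect(list2).Count();
    }
}
```

For the two lists above, PositionalDifferences gives 2 (3 vs 4 and 4 vs 5), while CommonDistinctValues gives 5 (the values 1, 2, 4, 6 and 8).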
Try something like this:
var result = list1.Intersect(list2);
var differences = list1.Count - result.Count();
If order counts:
var result = a.Where((x,i) => x !=b[i]);
var differences = result.Count();
You want the Intersect extension method of Enumerable.
public static int CountDifferences<T> (this IList<T> list1, IList<T> list2)
{
if (list1.Count != list2.Count)
throw new ArgumentException ("Lists must have the same number of elements", "list2");
return list1.Count - list1.Intersect(list2).Count();
}
You can use the Zip extension method (Enumerable.Zip):
List<int> lst1 = new List<int> { 1, 2, 3, 4, 5 };
List<int> lst2 = new List<int> { 6, 2, 9, 4, 5 };
int cntDiff = lst1.Zip(lst2, (a, b) => a != b).Count(a => a);
// Output is 2

Intersection of 6 List<int> objects

As I mentioned in the title, I've got 6 List<int> objects. I want to find their intersection, ignoring the ones that have no items.
intersectionResultSet =
list1.
Intersect(list2).
Intersect(list3).
Intersect(list4).
Intersect(list5).
Intersect(list6).ToList();
When one of them has no items, I normally get an empty set as a result. So I want to exclude the ones that have no items from the intersection operation. What's the best way to do that?
Thanks in advance,
You could use something like this:
// Your handful of lists
IEnumerable<IEnumerable<int>> lists = new[]
{
new List<int> { 1, 2, 3 },
new List<int>(),
null,
new List<int> { 2, 3, 4 }
};
List<int> intersection = lists
.Where(c => c != null && c.Any())
.Aggregate(Enumerable.Intersect)
.ToList();
foreach (int value in intersection)
{
Console.WriteLine(value);
}
This has been tested and produces the following output:
2
3
With thanks to @Matajon for pointing out a cleaner (and more performant) use of Enumerable.Intersect in the Aggregate function.
Simply, using LINQ too.
var lists = new List<IEnumerable<int>>() { list1, list2, list3, list4, list5, list6 };
var result = lists
.Where(x => x.Any())
.Aggregate(Enumerable.Intersect)
.ToList();
You could use LINQ to get all the lists that have more than 0 items, and then pass them to the operation you've described.
Another option:
Write your own extension method in place of Intersect that intersects with a list only if it's not empty, and call it instead of Intersect.
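One way such a helper could look is sketched here. The name IntersectNonEmpty is hypothetical. The sketch also covers the case where every list is empty or null: Aggregate without a seed throws InvalidOperationException on an empty sequence, so that case is handled before calling it.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class IntersectExtensions
{
    // Intersects all lists that are non-null and contain at least one item.
    // Returns an empty list when no such list exists.
    public static List<T> IntersectNonEmpty<T>(this IEnumerable<IEnumerable<T>> lists)
    {
        var nonEmpty = lists.Where(list => list != null && list.Any()).ToList();
        if (nonEmpty.Count == 0)
            return new List<T>();

        return nonEmpty.Aggregate((accumulated, next) => accumulated.Intersect(next))
                       .ToList();
    }
}
```

Called on { {1, 2, 3}, {}, null, {2, 3, 4} }, it returns 2 and 3.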
