Intersection of 6 List<int> objects - c#

As mentioned in the title, I have six List<int> objects. I want to find their intersection, ignoring any lists that have no items.
intersectionResultSet =
list1.
Intersect(list2).
Intersect(list3).
Intersect(list4).
Intersect(list5).
Intersect(list6).ToList();
When one of them has no items, I get an empty set as the result. So I want to exclude the lists that have no items from the intersection operation. What's the best way to do that?
Thanks in advance,

You could use something like this:
// Your handful of lists
IEnumerable<IEnumerable<int>> lists = new[]
{
new List<int> { 1, 2, 3 },
new List<int>(),
null,
new List<int> { 2, 3, 4 }
};
List<int> intersection = lists
.Where(c => c != null && c.Any())
.Aggregate(Enumerable.Intersect)
.ToList();
foreach (int value in intersection)
{
Console.WriteLine(value);
}
This has been tested and produces the following output:
2
3
With thanks to @Matajon for pointing out a cleaner (and more performant) use of Enumerable.Intersect in the Aggregate function.

Simply, using LINQ too.
var lists = new List<IEnumerable<int>>() { list1, list2, list3, list4, list5, list6 };
var result = lists
.Where(x => x.Any())
.Aggregate(Enumerable.Intersect)
.ToList();

You could use LINQ to select all the lists that contain at least one item, and then pass them to the function you've described.
Another option:
Write your own extension method that wraps Intersect and only intersects with a list if it is not empty, and call it instead of Intersect.
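The second option could be sketched roughly as follows. IntersectNonEmpty is a hypothetical name, not part of LINQ, and the sketch assumes that null lists should be skipped in the same way as empty ones.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class IntersectExtensions
{
    // Hypothetical helper (not part of LINQ): intersects only the
    // sequences that are non-null and contain at least one element.
    public static List<T> IntersectNonEmpty<T>(this IEnumerable<IEnumerable<T>> sequences)
    {
        HashSet<T> result = null;
        foreach (IEnumerable<T> sequence in sequences)
        {
            if (sequence == null) continue;           // skip missing lists
            var items = new HashSet<T>(sequence);
            if (items.Count == 0) continue;           // skip empty lists
            if (result == null) result = items;       // first usable list seeds the result
            else result.IntersectWith(items);         // narrow it down with each later list
        }
        // If every list was empty or null, return an empty list.
        return result == null ? new List<T>() : result.ToList();
    }
}
```

It would be called as new[] { list1, list2, list3 }.IntersectNonEmpty(). One difference from the Aggregate-based answers: Aggregate without a seed throws InvalidOperationException when no list survives the filter, whereas this sketch returns an empty list.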

Related

Filter a list of address objects by a list of string postcodes [duplicate]

I have a list of parameters like this:
public class parameter
{
public string name {get; set;}
public string paramtype {get; set;}
public string source {get; set;}
}
IEnumerable<Parameter> parameters;
And an array of strings I want to check it against.
string[] myStrings = new string[] { "one", "two"};
I want to iterate over the parameter list and check whether the source property is equal to any of the strings in the myStrings array. I can do this with nested foreach loops, but I would like to learn a nicer way. I have been playing around with LINQ and like the extension methods on Enumerable such as Where, so nested foreach loops just feel wrong. Is there a more elegant, preferred LINQ/lambda/delegate way to do this?
Thanks
You could use a nested Any() for this check which is available on any Enumerable:
bool hasMatch = myStrings.Any(x => parameters.Any(y => y.source == x));
For larger collections it is faster to project parameters to source and then use Intersect, which internally uses a hash set. The first approach is the equivalent of two nested loops, O(n*m); with Intersect you can do the check in O(n+m):
bool hasMatch = parameters.Select(x => x.source)
.Intersect(myStrings)
.Any();
Also as a side comment you should capitalize your class names and property names to conform with the C# style guidelines.
Here is a sample that checks whether any element of one list also appears in another list:
List<int> nums1 = new List<int> { 2, 4, 6, 8, 10 };
List<int> nums2 = new List<int> { 1, 3, 6, 9, 12};
if (nums1.Any(x => nums2.Any(y => y == x)))
{
Console.WriteLine("There are equal elements");
}
else
{
Console.WriteLine("No Match Found!");
}
If both lists are large, the nested lambda approach will take a long time, because it compares every pair of elements. In that case it is better to use a LINQ join to fetch the matching parameters:
var items = (from x in parameters
join y in myStrings on x.Source equals y
select x)
.ToList();
list1.Select(l1 => l1.Id).Intersect(list2.Select(l2 => l2.Id)).ToList();
var list1 = await _service1.GetAll();
var list2 = await _service2.GetAll();
// Create a list of Ids from list1
var list1_Ids = list1.Select(l => l.Id).ToList();
// Filter list2 according to list1 Ids (list2 is already declared, so the result needs a new name)
var filteredList2 = list2.Where(l => list1_Ids.Contains(l.Id)).ToList();
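List<T>.Contains scans the list on every call, so the filter above does work proportional to the product of the two list sizes. A minimal sketch of the same filter using a HashSet for the lookups is shown here; IdFilter and WhereKeyIn are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IdFilter
{
    // Hypothetical helper: keeps the candidates whose key appears in allowedKeys.
    public static List<T> WhereKeyIn<T, TKey>(
        IEnumerable<T> candidates,
        IEnumerable<TKey> allowedKeys,
        Func<T, TKey> keySelector)
    {
        // Build the set once; each Contains call is then a hash lookup.
        var lookup = new HashSet<TKey>(allowedKeys);
        return candidates.Where(c => lookup.Contains(keySelector(c))).ToList();
    }
}
```

With the lists above, this would be called as IdFilter.WhereKeyIn(list2, list1.Select(l => l.Id), l => l.Id).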

Find the number of differences between two lists

I want to compare two lists with the same number of elements, and find the number of differences between them. Right now, I have this code (which works):
public static int CountDifferences<T> (this IList<T> list1, IList<T> list2)
{
if (list1.Count != list2.Count)
throw new ArgumentException ("Lists must have the same number of elements", "list2");
int count = 0;
for (int i = 0; i < list1.Count; i++) {
if (!EqualityComparer<T>.Default.Equals (list1[i], list2[i]))
count++;
}
return count;
}
This feels messy to me, and it seems like there must be a more elegant way to achieve it. Is there a way, perhaps, to combine the two lists into a single list of tuples, then simply examine each element of the new list to see if both elements are equal?
Since order in the list does count this would be my approach:
public static int CountDifferences<T>(this IList<T> list1, IList<T> list2)
{
if (list1.Count != list2.Count)
throw new ArgumentException("Lists must have the same number of elements", "list2");
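// EqualityComparer<T>.Default handles null elements; a.Equals(b) would throw if a were null
int count = list1.Zip(list2, (a, b) => EqualityComparer<T>.Default.Equals(a, b) ? 0 : 1).Sum();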
return count;
}
Simply merging the lists using Enumerable.Zip() then summing up the differences, still O(n) but this just enumerates the lists once.
Also this approach would work on any two IEnumerable of the same type since we do not use the list indexer (besides obviously in your count comparison in the guard check).
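As a rough sketch of that generalization: Zip stops silently at the end of the shorter sequence, so a version that accepts any IEnumerable<T> and still rejects mismatched lengths can walk both enumerators by hand. The class name SequenceDiff is an assumption made for this example.

```csharp
using System;
using System.Collections.Generic;

public static class SequenceDiff
{
    // Counts the positions at which the two sequences differ.
    // Throws if one sequence ends before the other.
    public static int CountDifferences<T>(this IEnumerable<T> first, IEnumerable<T> second)
    {
        var comparer = EqualityComparer<T>.Default;
        int count = 0;
        using (IEnumerator<T> e1 = first.GetEnumerator())
        using (IEnumerator<T> e2 = second.GetEnumerator())
        {
            while (true)
            {
                bool has1 = e1.MoveNext();
                bool has2 = e2.MoveNext();
                if (has1 != has2)
                    throw new ArgumentException("Sequences must have the same number of elements", nameof(second));
                if (!has1)
                    return count;                     // both sequences ended together
                if (!comparer.Equals(e1.Current, e2.Current))
                    count++;
            }
        }
    }
}
```

Each sequence is enumerated exactly once, and no Count property or indexer is needed.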
I think your approach is fine, but you could use LINQ to simplify your function:
public static int CountDifferences<T>(this IList<T> list1, IList<T> list2)
{
if(list1.Count != list2.Count)
throw new ArgumentException("Lists must have same # elements", "list2");
return list1.Where((t, i) => !Equals(t, list2[i])).Count();
}
The way you have it written in the question, I don't think Intersect does what you're looking for. For example, say you have:
var list1 = new List<int> { 1, 2, 3, 4, 6, 8 };
var list2 = new List<int> { 1, 2, 4, 5, 6, 8 };
If you run list1.CountDifferences(list2), I'm assuming that you want to get back 2, since the elements at indexes 2 and 3 are different. Intersect in this case will return 5 elements, since the lists have 5 elements in common. So if you're looking for 5, then Intersect is the way to go. If you're looking to return 2, then you could use the LINQ statement above.
Try something like this:
var result = list1.Intersect(list2);
var differences = list1.Count - result.Count();
If order counts:
var result = a.Where((x,i) => x !=b[i]);
var differences = result.Count();
You want the Intersect extension method of Enumerable.
public static int CountDifferences<T> (this IList<T> list1, IList<T> list2)
{
if (list1.Count != list2.Count)
throw new ArgumentException ("Lists must have the same number of elements", "list2");
return list1.Count - list1.Intersect(list2).Count();
}
You can use the Zip extension method (defined on Enumerable, so it works on any IEnumerable<T>, including List<T>).
List<int> lst1 = new List<int> { 1, 2, 3, 4, 5 };
List<int> lst2 = new List<int> { 6, 2, 9, 4, 5 };
int cntDiff = lst1.Zip(lst2, (a, b) => a != b).Count(a => a);
// Output is 2

LINQ: Determine if two sequences contains exactly the same elements

I need to determine whether or not two sets contain exactly the same elements. The ordering does not matter.
For instance, these two arrays should be considered equal:
IEnumerable<int> data = new []{3, 5, 6, 9};
IEnumerable<int> otherData = new []{6, 5, 9, 3};
One set cannot contain any elements, that are not in the other.
Can this be done using the built-in query operators? And what would be the most efficient way to implement it, considering that the number of elements could range from a few to hundreds?
If you want to treat the arrays as "sets" and ignore order and duplicate items, you can use HashSet<T>.SetEquals method:
var isEqual = new HashSet<int>(first).SetEquals(second);
Otherwise, your best bet is probably sorting both sequences in the same way and using SequenceEqual to compare them.
I suggest sorting both, and doing an element-by-element comparison.
data.OrderBy(x => x).SequenceEqual(otherData.OrderBy(x => x))
I'm not sure how fast the implementation of OrderBy is, but if it's an O(n log n) sort, as you'd expect, the total algorithm is O(n log n) as well.
For some cases of data, you can improve on this by using a custom implementation of OrderBy that for example uses a counting sort, for O(n+k), with k the size of the range wherein the values lie.
If you might have duplicates (or if you want a solution which performs better for longer lists), I'd try something like this:
static bool IsSame<T>(IEnumerable<T> set1, IEnumerable<T> set2)
{
if (set1 == null && set2 == null)
return true;
if (set1 == null || set2 == null)
return false;
List<T> list1 = set1.ToList();
List<T> list2 = set2.ToList();
if (list1.Count != list2.Count)
return false;
list1.Sort();
list2.Sort();
return list1.SequenceEqual(list2);
}
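When duplicates matter and the elements cannot be sorted (or sorting is too slow), counting occurrences in a dictionary is another option. The following is only a sketch: MultisetComparer and HasSameElements are hypothetical names, and it assumes the elements are non-null and have sensible Equals/GetHashCode implementations.

```csharp
using System.Collections.Generic;

public static class MultisetComparer
{
    // Returns true when both sequences contain the same elements with the
    // same multiplicities, regardless of order. Assumes non-null elements.
    public static bool HasSameElements<T>(IEnumerable<T> first, IEnumerable<T> second)
    {
        var counts = new Dictionary<T, int>();
        foreach (T item in first)
        {
            counts.TryGetValue(item, out int seen);   // seen is 0 when the key is absent
            counts[item] = seen + 1;
        }
        foreach (T item in second)
        {
            // An unknown element, or one that occurs too often, means the sequences differ.
            if (!counts.TryGetValue(item, out int left) || left == 0)
                return false;
            counts[item] = left - 1;
        }
        foreach (int remaining in counts.Values)
        {
            if (remaining != 0)
                return false;                          // first had elements that second lacks
        }
        return true;
    }
}
```

This runs in roughly linear time and does not require the elements to be comparable for sorting.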
UPDATE: oops, you guys are right-- the Except() solution below needs to look both ways before crossing the street. And it has lousy perf for longer lists. Ignore the suggestion below! :-)
Here's one easy way to do it. Note that this assumes the lists have no duplicates.
bool same = data.Except (otherData).Count() == 0;
Here is another way to do it:
IEnumerable<int> data = new[] { 3, 5, 6, 9 };
IEnumerable<int> otherData = new[] { 6, 5, 9, 3 };
data = data.OrderBy(d => d);
otherData = otherData.OrderBy(d => d);
data.Zip(otherData, (x, y) => Tuple.Create(x, y)).All(d => d.Item1 == d.Item2);
First, check the length. If the lengths are different, the sets are different.
Then you can do data.Intersect(otherData) and check that its length is identical to the original length.
Or simply sort the sets and iterate through them.
First check whether both data collections have the same number of elements, and then check whether all the elements in one collection are present in the other:
IEnumerable<int> data = new[] { 3, 5, 6, 9 };
IEnumerable<int> otherData = new[] { 6, 5, 9, 3 };
bool equals = data.Count() == otherData.Count() && data.All(x => otherData.Contains(x));
This should help:
IEnumerable<int> data = new []{ 3,5,6,9 };
IEnumerable<int> otherData = new[] {6, 5, 9, 3};
if(data.All(x => otherData.Contains(x)))
{
//Code Goes Here
}

Overlay/Join two collections with Linq

I have the following scenario:
List 1 has 20 items of type TItem, List 2 has 5 items of the same type. List 1 already contains the items from List 2 but in a different state. I want to overwrite the 5 items in List 1 with the items from List 2.
I thought a join might work, but I want to overwrite the items in List 1, not join them together and have duplicates.
There is a unique key that can be used to find which items to overwrite in List 1 the key is of type int
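You could use the built-in LINQ .Except(), but it wants an IEqualityComparer, so use a version of .Except() that takes a key selector instead. That overload is not part of the framework; it has to be written as an extension method, as does the ForEach used on the result below.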
Assuming an object with an integer key as you indicated:
public class Item
{
public int Key { get; set; }
public int Value { get; set; }
public override string ToString()
{
return String.Format("{{{0}:{1}}}", Key, Value);
}
}
The original list of objects can be merged with the changed one as follows:
IEnumerable<Item> original = new[] { 1, 2, 3, 4, 5 }.Select(x => new Item
{
Key = x,
Value = x
});
IEnumerable<Item> changed = new[] { 2, 3, 5 }.Select(x => new Item
{
Key = x,
Value = x * x
});
IEnumerable<Item> result = original.Except(changed, x => x.Key).Concat(changed);
result.ForEach(Console.WriteLine);
output:
{1:1}
{4:4}
{2:4}
{3:9}
{5:25}
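The Except call above passes a key selector, which Enumerable.Except does not accept. A minimal sketch of such an extension is given below; it is named ExceptByKey here to keep it distinct from the built-in method, and both the name and the class are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeyedSetExtensions
{
    // Hypothetical helper: returns the items of source whose key
    // does not occur among the keys of other.
    public static IEnumerable<T> ExceptByKey<T, TKey>(
        this IEnumerable<T> source,
        IEnumerable<T> other,
        Func<T, TKey> keySelector)
    {
        // Collect the keys to exclude once, then filter lazily.
        var excluded = new HashSet<TKey>(other.Select(keySelector));
        return source.Where(item => !excluded.Contains(keySelector(item)));
    }
}
```

With this in place the merge would read original.ExceptByKey(changed, x => x.Key).Concat(changed). Unlike Enumerable.Except, this sketch does not remove duplicates from source.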
LINQ isn't used to perform actual modifications to the underlying data sources; it's strictly a query language. You could, of course, do an outer join on List2 from List1 and select List2's entity if it's not null and List1's entity if it is, but that is going to give you an IEnumerable<> of the results; it won't actually modify the collection. You could do a ToList() on the result and assign it to List1, but that would change the reference; I don't know if that would affect the rest of your application.
Taking your question literally, in that you want to REPLACE the items in List1 with those from List2 if they exist, then you'll have to do that manually in a for loop over List1, checking for the existence of a corresponding entry in List2 and replacing the List1 entry by index with that from List2.
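A minimal sketch of that manual loop follows. ListOverlay and OverwriteByKey are hypothetical names, and a dictionary is used for the lookup so that each item of List1 is matched in constant time rather than by scanning List2.

```csharp
using System;
using System.Collections.Generic;

public static class ListOverlay
{
    // Hypothetical helper: overwrites, in place, every item of target
    // that has a replacement with the same key.
    public static void OverwriteByKey<T, TKey>(
        IList<T> target,
        IEnumerable<T> replacements,
        Func<T, TKey> keySelector)
    {
        var byKey = new Dictionary<TKey, T>();
        foreach (T replacement in replacements)
            byKey[keySelector(replacement)] = replacement;   // the last replacement per key wins

        for (int i = 0; i < target.Count; i++)
        {
            if (byKey.TryGetValue(keySelector(target[i]), out T updated))
                target[i] = updated;                          // same index, same list reference
        }
    }
}
```

Because the list is modified through its indexer, any other code holding a reference to List1 sees the updated items, and the order of List1 is preserved.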
As Adam says, LINQ is about querying. However, you can create a new collection in the right way using Enumerable.Union. You'd need to create an appropriate IEqualityComparer though - it would be nice to have UnionBy. (Another one for MoreLINQ perhaps?)
Basically:
var list3 = list2.Union(list1, keyComparer);
Where keyComparer would be an implementation to compare the two keys. MiscUtil contains a ProjectionEqualityComparer which would make this slightly easier.
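If you would rather not pull in MiscUtil, a comparer of that kind can be sketched in a few lines. KeyEqualityComparer is a hypothetical class written for this example, not the MiscUtil type, and it assumes the items themselves are never null.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer: two items are equal when their projected keys are equal.
public sealed class KeyEqualityComparer<T, TKey> : IEqualityComparer<T>
{
    private readonly Func<T, TKey> _keySelector;
    private readonly IEqualityComparer<TKey> _keyComparer = EqualityComparer<TKey>.Default;

    public KeyEqualityComparer(Func<T, TKey> keySelector)
    {
        _keySelector = keySelector ?? throw new ArgumentNullException(nameof(keySelector));
    }

    public bool Equals(T x, T y)
    {
        return _keyComparer.Equals(_keySelector(x), _keySelector(y));
    }

    // The hash must be derived from the same key that Equals compares.
    public int GetHashCode(T obj)
    {
        return _keyComparer.GetHashCode(_keySelector(obj));
    }
}
```

Assuming the Item class with an int Key from the earlier answer, the call would become list2.Union(list1, new KeyEqualityComparer<Item, int>(item => item.Key)). Union yields the items of list2 first, so on a key clash the list2 version is the one that is kept.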
Alternatively, you could use DistinctBy from MoreLINQ after concatenation:
var list3 = list2.Concat(list1).DistinctBy(item => item.Key);
Here's a solution with GroupJoin.
List<string> source = new List<string>() { "1", "22", "333" };
List<string> modifications = new List<string>() { "4", "555"};
//alternate implementation
//List<string> result = source.GroupJoin(
// modifications,
// s => s.Length,
// m => m.Length,
// (s, g) => g.Any() ? g.First() : s
//).ToList();
List<string> result =
(
from s in source
join m in modifications
on s.Length equals m.Length into g
select g.Any() ? g.First() : s
).ToList();
foreach (string s in result)
Console.WriteLine(s);
Hmm, how about a re-usable extension method while I'm at it:
public static IEnumerable<T> UnionBy<T, U>
(
this IEnumerable<T> source,
IEnumerable<T> otherSource,
Func<T, U> selector
)
{
return source.GroupJoin(
otherSource,
selector,
selector,
(s, g) => g.Any() ? g.First() : s
);
}
Which is called by:
List<string> result = source
.UnionBy(modifications, s => s.Length)
.ToList();

Using lambda expressions to get a subset where array elements are equal

I have an interesting problem, and I can't seem to figure out the lambda expression to make this work.
I have the following code:
List<string[]> list = GetSomeData(); // Returns large number of string[]'s
List<string[]> list2 = GetSomeData2(); // similar data, but smaller subset
List<string[]> newList = list.FindAll(predicate(string[] line){
return (???);
});
I want to return only those records in list in which element 0 of each string[] is equal to one of the element 0's in list2.
list contains data like this:
"000", "Data", "more data", "etc..."
list2 contains data like this:
"000", "different data", "even more different data"
Fundamentally, i could write this code like this:
List<string[]> newList = new List<string[]>();
foreach(var e in list)
{
foreach(var e2 in list2)
{
if (e[0] == e2[0])
newList.Add(e);
}
}
return newList;
But, i'm trying to use generics and lambda's more, so i'm looking for a nice clean solution. This one is frustrating me though.. maybe a Find inside of a Find?
EDIT:
Marc's answer below led me to experiment with a variation that looks like this:
var z = list.Where(x => list2.Select(y => y[0]).Contains(x[0])).ToList();
I'm not sure how efficient this is, but it works and is sufficiently succinct. Does anyone else have any suggestions?
You could join? I'd use two steps myself, though:
var keys = new HashSet<string>(list2.Select(x => x[0]));
var data = list.Where(x => keys.Contains(x[0]));
If you only have .NET 2.0, then either install LINQBridge and use the above (or similar with a Dictionary<> if LINQBridge doesn't include HashSet<>), or perhaps use nested Find:
var data = list.FindAll(arr => list2.Find(arr2 => arr2[0] == arr[0]) != null);
Note, though, that the Find approach is O(n*m), whereas the HashSet<> approach is O(n+m).
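For completeness, the join mentioned at the start of this answer might look roughly like the sketch below. RowFilter and MatchFirstColumn are hypothetical names. The keys are made distinct first, because a join would otherwise return a row once for every matching key in list2.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class RowFilter
{
    // Hypothetical helper: returns the rows whose first element equals
    // the first element of at least one lookup row.
    public static List<string[]> MatchFirstColumn(
        IEnumerable<string[]> rows,
        IEnumerable<string[]> lookup)
    {
        // Distinct keys, so that each matching row is returned only once.
        IEnumerable<string> keys = lookup.Select(r => r[0]).Distinct();
        return rows.Join(keys, row => row[0], key => key, (row, key) => row).ToList();
    }
}
```

It would be called as RowFilter.MatchFirstColumn(list, list2). Join builds a hash-based lookup internally, so this has the same kind of scaling as the HashSet<> version.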
You could use the Intersect extension method in System.Linq, but you would need to provide an IEqualityComparer to do the work.
static void Main(string[] args)
{
List<string[]> data1 = new List<string[]>();
List<string[]> data2 = new List<string[]>();
var result = data1.Intersect(data2, new Comparer());
}
class Comparer : IEqualityComparer<string[]>
{
#region IEqualityComparer<string[]> Members
bool IEqualityComparer<string[]>.Equals(string[] x, string[] y)
{
return x[0] == y[0];
}
int IEqualityComparer<string[]>.GetHashCode(string[] obj)
{
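// Hash the same key that Equals compares; hashing the array itself would mean no two arrays ever match
return obj[0].GetHashCode();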
}
#endregion
}
Intersect may work for you.
Intersect finds all the items that are in both lists.
Ok re-read the question. Intersect doesn't take the order into account.
I have written a slightly more complex linq expression that will return a list of items that are in the same position (index) with the same value.
List<String> list1 = new List<String>() {"000","33", "22", "11", "111"};
List<String> list2 = new List<String>() {"000", "22", "33", "11"};
List<String> subList = list1.Select ((value, index) => new { Value = value, Index = index})
.Where(w => list2.Skip(w.Index).FirstOrDefault() == w.Value )
.Select (s => s.Value).ToList();
Result: {"000", "11"}
Explanation of the query:
Select a set of values and position of that value.
Filter that set where the item in the same position in the second list has the same value.
Select just the value (not the index as well).
Note I used:
list2.Skip(w.Index).FirstOrDefault()
//instead of
list2[w.Index]
So that it will handle lists of different lengths.
If you know the lists will be the same length, or that list1 will always be shorter, then list2[w.Index] would probably be a bit faster.
