Find all groups of pairs with intersections in C#

Given a list of pairs such as
List<int> pair1 = new List<int>() { 1, 3};
List<int> pair2 = new List<int>() { 1, 2 };
List<int> pair3 = new List<int>() { 5, 3 };
List<int> pair4 = new List<int>() { 7, 8 };
List<int> pair5 = new List<int>() { 8, 11 };
List<int> pair6 = new List<int>() { 6, 9 };
List<int> pair7 = new List<int>() { 2, 13 };
List<int> pair8 = new List<int>() { 13, 16 };
How can I find all of the unions where the pairs intersect?
Output should be something like the following:
1,2,3,5,13,16
7,8,11
6,9
// create lists of pairs to sort through
static void links2XML(SQLiteConnection m_dbConnection)
{
List<int> pair1 = new List<int>() { 1, 3};
List<int> pair2 = new List<int>() { 1, 2 };
List<int> pair3 = new List<int>() { 5, 3 };
List<int> pair4 = new List<int>() { 7, 8 };
List<int> pair5 = new List<int>() { 8, 11 };
List<int> pair6 = new List<int>() { 6, 9 };
List<int> pair7 = new List<int>() { 2, 13 };
List<int> pair8 = new List<int>() { 13, 16 };
var pairs = new List<List<int>>();
pairs.Add(pair1);
pairs.Add(pair2);
pairs.Add(pair3);
pairs.Add(pair4);
pairs.Add(pair5);
pairs.Add(pair6);
pairs.Add(pair7);
pairs.Add(pair8);
var output = new List<int>();
foreach (var pair in pairs)
{
foreach (int i in followLinks(pair, pairs))
{
Console.Write(i + ",");
}
Console.WriteLine();
}
}
// loop through pairs to find intersections and recursively call function to
//build full list of all such ints
static List<int> followLinks(List<int> listA, List<List<int>> listB)
{
var links = listA;
var listC = listB.ToList();
bool added = false;
foreach (var l in listB)
{
var result = listA.Intersect(l);
if (result.Count<int>() > 0)
{
links = links.Union<int>(l).ToList();
listC.Remove(l); // remove pair for recursion after adding
added = true;
}
}
if (added)
{
followLinks(links, listC); //recursively call function with updated
//list of pairs and truncated list of lists of pairs
return links;
}
else return links;
}
Code should query the lists of pairs and output groups. I've tried this a few different ways, and this seems to have gotten the closest. I'm sure it requires a recursive loop, but figuring out the structure of it is just not making sense to me at the moment.
To clarify, in response to some of the questions: the number pairs are arbitrary values that I chose for this question. The actual data set will be far larger and pulled from a database. That part is irrelevant to my question, though, and it's already solved anyway. It's really just this grouping that is giving me trouble.
To further clarify, the output will find a list of all of the integers from each pair that had an intersection... given pairs 1,2 and 1,3 the output would be 1,2,3. Given pairs 1,2 and 3,5, the output would be 1,2 for one list and 3,5 for the other. Hopefully that makes it clearer what I'm trying to find.

I used this function to return the full hash set of all links, then looped through all of the initial pairs and fed each one to this function. I filter the results to remove duplicates, and that solves the issue. The suggestion came from user Sven.
// Follow links to build hashset of all linked tags
static HashSet<int> followLinks(HashSet<int> testHs, List<HashSet<int>> pairs)
{
while (true)
{
var tester = new HashSet<int>(testHs);
bool keepGoing = false;
foreach (var p in pairs)
{
if (testHs.Overlaps(p))
{
testHs.UnionWith(p);
keepGoing = true;
}
}
for (int i = pairs.Count - 1; i >= 0; i--) // reverse pass; the original condition "i == 0" almost never ran
{
if (testHs.Overlaps(pairs[i]))
{
testHs.UnionWith(pairs[i]);
keepGoing = true;
}
}
if (!keepGoing) break;
if (testHs.SetEquals(tester)) break;
}
return testHs;
}
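The driver loop and duplicate filtering described above are not shown, so here is a minimal sketch of one way to get the final groups directly. It is not the code from the answer: it swaps the repeated followLinks calls for a single pass that merges each pair into any existing groups it touches. The names PairGrouping and GroupLinkedPairs are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

static class PairGrouping
{
    // Hypothetical helper: merges pairs that share a value into groups.
    public static List<SortedSet<int>> GroupLinkedPairs(IEnumerable<int[]> pairs)
    {
        var groups = new List<SortedSet<int>>();
        foreach (var pair in pairs)
        {
            // Snapshot every existing group that shares a value with this pair.
            var touching = groups.Where(g => g.Overlaps(pair)).ToList();

            // Fold the pair and all touching groups into one merged set.
            var merged = new SortedSet<int>(pair);
            foreach (var g in touching)
            {
                merged.UnionWith(g);
                groups.Remove(g); // List.Remove matches the set by reference here
            }
            groups.Add(merged);
        }
        return groups;
    }
}
```

Each returned set is one group of linked values; the order of the groups depends on the order in which the pairs are merged, so sort them afterwards if a particular order matters.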

Related

How to add List<T> items dynamically to IEnumerable<T>

Code
public static void Main()
{
List<int> list1 = new List<int> {1, 2, 3, 4, 5, 6 };
List<int> list2 = new List<int> {1, 2, 3 };
List<int> list3 = new List<int> {1, 2 };
var lists = new IEnumerable<int>[] { list1, list2, list3 };
var commons = GetCommonItems(lists);
Console.WriteLine("Common integers:");
foreach (var c in commons)
Console.WriteLine(c);
}
static IEnumerable<T> GetCommonItems<T>(IEnumerable<T>[] lists)
{
HashSet<T> hs = new HashSet<T>(lists.First());
for (int i = 1; i < lists.Length; i++)
hs.IntersectWith(lists[i]);
return hs;
}
The sample shows only "list1", "list2" and "list3", but I may have more than 50 lists, each generated inside a foreach loop. How can I programmatically add each list to the IEnumerable lists so that I can compare the data in all of them?
I tried many approaches, such as converting to a list, Add, Append and Concat, but nothing worked.
Is there a better way to compare N lists?
The output of Code: 1 2
You can create a list of lists and add lists to that list dynamically. Something like this:
var lists = new List<List<int>>();
lists.Add(new List<int> {1, 2, 3, 4, 5, 6 });
lists.Add(new List<int> {1, 2, 3 });
lists.Add(new List<int> {1, 2 });
// listSources stands for whatever collection your own loop produces
foreach (var list in listSources)
lists.Add(list);
var commons = GetCommonItems(lists);
To find intersections you can use this solution for example: Intersection of multiple lists with IEnumerable.Intersect() (actually looks like that's what you are using already).
Also make sure to change the signature of the GetCommonItems method:
static IEnumerable<T> GetCommonItems<T>(List<List<T>> lists)
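For completeness, a minimal sketch of what the method body could look like with that signature. It keeps the HashSet approach from the question; the wrapper class name CommonItems and the empty-input guard are additions for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

static class CommonItems
{
    public static IEnumerable<T> GetCommonItems<T>(List<List<T>> lists)
    {
        // Guard added for this sketch: no lists means no common items.
        if (lists.Count == 0)
            return Enumerable.Empty<T>();

        // Start from the first list and keep only what every later list also has.
        var hs = new HashSet<T>(lists[0]);
        for (int i = 1; i < lists.Count; i++)
            hs.IntersectWith(lists[i]);
        return hs;
    }
}
```

Because the parameter is a List<List<T>>, the caller can keep calling lists.Add(...) inside its own loop and pass the finished list of lists in one go.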
What you could do is let the GetCommonItems method accept a variable number of arguments using the params keyword. This way, you avoid needing to create a new collection of lists.
It goes without saying, however, that if the number of lists in your source is variable as well, this could be trickier to use.
I've also amended the GetCommonItems method to work like the code from https://stackoverflow.com/a/1676684/9945524
public static void Main()
{
List<int> list1 = new List<int> { 1, 2, 3, 4, 5, 6 };
List<int> list2 = new List<int> { 1, 2, 3 };
List<int> list3 = new List<int> { 1, 2 };
var commons = GetCommonItems(list1, list2, list3); // pass the lists here
Console.WriteLine("Common integers:");
foreach (var c in commons)
Console.WriteLine(c);
}
static IEnumerable<T> GetCommonItems<T>(params List<T>[] lists)
{
return lists.Skip(1).Aggregate(
new HashSet<T>(lists.First()),
(hs, lst) =>
{
hs.IntersectWith(lst);
return hs;
}
);
}
Alternate solution using your existing Main method.
EDIT: changed the type of lists to IEnumerable<IEnumerable<T>> as per comment in this answer.
public static void Main()
{
List<int> list1 = new List<int> { 1, 2, 3, 4, 5, 6 };
List<int> list2 = new List<int> { 1, 2, 3 };
List<int> list3 = new List<int> { 1, 2 };
var lists = new List<List<int>> { list1, list2, list3 };
var commons = GetCommonItems(lists);
Console.WriteLine("Common integers:");
foreach (var c in commons)
Console.WriteLine(c);
}
static IEnumerable<T> GetCommonItems<T>(IEnumerable<IEnumerable<T>> enumerables)
{
return enumerables.Skip(1).Aggregate(
new HashSet<T>(enumerables.First()),
(hs, lst) =>
{
hs.IntersectWith(lst);
return hs;
}
);
}
IEnumerable<T> is a read-only view with no Add method, so you cannot append to it; build a concrete collection such as List<T> and return that, choosing the implementation that fits your needs.
If I understand correctly you want to get common items of N lists. I would use LINQ for this.
My proposition:
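1. Make one list that contains all of the items:
var allElements = new List<int>();
var lists = new List<List<int>>();
foreach (var list in lists)
allElements.AddRange(list);
2. Take the items that occur more than once:
var repeated = allElements.GroupBy(x => x).Where(g => g.Count() > 1).Select(g => g.Key).ToList();
If an item has to appear in every list, and no list contains internal duplicates, compare against lists.Count instead: g.Count() == lists.Count.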

Combinations without repetitions with must included in the combos

I have 2 lists of ints and I need a list of all possible combinations, without repetitions, of 5 numbers taken from the first list. Each combination must also include all the ints from the second list.
Example:
var takeFrom = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
var mustInclude = new List<int> { 1, 3, 5 };
I have been using KwCombinatorics but it takes ages to finish. And almost 80% of the result is useless because it doesn't contain the ints from the mustInclude list.
Example of output:
var result = new List<List<int>>
{
new List<int> { 1, 3, 5, 9, 10 },
new List<int> { 1, 3, 5, 8, 7 },
new List<int> { 1, 3, 5, 6, 9 },
};
It doesn't have to be in this order, as long as it doesn't contain repetitions.
Borrowing GetAllCombos from this question, and using the idea from @juharr, I believe the following code gives you the results you are looking for.
List<int> takeFrom = new List<int> { 2, 4, 6, 7, 8, 9, 10 };
List<int> mustInclude = new List<int> { 1, 3, 5 };
protected void Page_Load(object sender, EventArgs e)
{
List<List<int>> FinalList = new List<List<int>>();
FinalList = GetAllCombos(takeFrom);
FinalList = AddListToEachList(FinalList, mustInclude);
gvCombos.DataSource = FinalList;
gvCombos.DataBind();
}
// Recursive
private static List<List<T>> GetAllCombos<T>(List<T> list)
{
List<List<T>> result = new List<List<T>>();
// head
result.Add(new List<T>());
result.Last().Add(list[0]);
if (list.Count == 1)
return result;
// tail
List<List<T>> tailCombos = GetAllCombos(list.Skip(1).ToList());
tailCombos.ForEach(combo =>
{
result.Add(new List<T>(combo));
combo.Add(list[0]);
result.Add(new List<T>(combo));
});
return result;
}
private static List<List<int>> AddListToEachList(List<List<int>> listOfLists, List<int> mustInclude)
{
List<List<int>> newListOfLists = new List<List<int>>();
//Go through each List
foreach (List<int> l in listOfLists)
{
List<int> newList = l.ToList();
//Add each item that should be in all lists
foreach(int i in mustInclude)
newList.Add(i);
newListOfLists.Add(newList);
}
return newListOfLists;
}
protected void gvCombos_RowDataBound(object sender, GridViewRowEventArgs e)
{
if (e.Row.RowType == DataControlRowType.DataRow)
{
List<int> drv = (List<int>)e.Row.DataItem;
Label lblCombo = (Label)e.Row.FindControl("lblCombo");
foreach (int i in drv)
lblCombo.Text += string.Format($"{i} ");
}
}
GetAllCombos gives you all the combinations without the numbers required by all Lists, and then the second AddListToEachList method will add the required numbers to each List.
As already suggested in the comments, you can remove the three required numbers from the list and generate the combinations of two instead of five.
Something like this:
takeFrom = takeFrom.Except(mustInclude).ToList();
listOfPairs = KwCombinatorics(takeFrom, 2);
result = listOfPairs.Select(pair => mustInclude.Concat(pair).ToList()).ToList();
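The snippet above is pseudocode (KwCombinatorics is a library, not a callable function), so here is a minimal sketch of the same idea with a hand-written combination generator in place of the library. The names Combos, Choose and WithRequired are hypothetical and not part of KwCombinatorics.

```csharp
using System.Collections.Generic;
using System.Linq;

static class Combos
{
    // Yields every k-element combination of items, in input order.
    public static IEnumerable<List<T>> Choose<T>(IReadOnlyList<T> items, int k, int start = 0)
    {
        if (k < 0) yield break;                      // nothing to choose
        if (k == 0) { yield return new List<T>(); yield break; }
        for (int i = start; i <= items.Count - k; i++)
        {
            // Fix items[i] as the first element, then choose the rest after it.
            foreach (var tail in Choose(items, k - 1, i + 1))
            {
                tail.Insert(0, items[i]);
                yield return tail;
            }
        }
    }

    // Builds combinations of the given size that always contain mustInclude.
    public static List<List<int>> WithRequired(List<int> takeFrom, List<int> mustInclude, int size)
    {
        var rest = takeFrom.Except(mustInclude).ToList();
        return Choose(rest, size - mustInclude.Count)
            .Select(c => mustInclude.Concat(c).ToList())
            .ToList();
    }
}
```

For the example in the question, Combos.WithRequired(takeFrom, mustInclude, 5) only has to enumerate the pairs drawn from the 7 remaining numbers, rather than every 5-number combination of all 10.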

splitting list of numbers on each sequence of numbers

I have been working on this problem for a couple days now and I can not seem to get the exact result I am looking for. I have simplified my question to just number based so that I can be as clear as possible on what exactly I am doing and what I want in return.
I start off with a big List<List<double>> where each sub List in the larger List contains 3 numbers. For example, the List looks something like this:
[0] 1,2,3
[1] 1,2,3
[2] 1,2,3
[3] 4,5,6
[4] 4,5,6
[5] 4,5,6
[6] 7,8,9
[7] 7,8,9
[8] 7,8,9
where each item in the list is a sequence. What I am trying to accomplish is to separate the List into a group of smaller lists, where all the items in each smaller list are the same sequence. So for the given example:
list1:
[0] 1,2,3
[1] 1,2,3
[2] 1,2,3
list 2:
[0] 4,5,6
[1] 4,5,6
[2] 4,5,6
list 3:
[0] 7,8,9
[1] 7,8,9
[2] 7,8,9
So, to solve my problem I have created a function that recursively searches through the List, pulls out the sequences that are similar and adds them to separate lists. Not only is my function not working, but my code is very long and complicated, and I feel like there should be a simpler solution than what I am trying to do. Any suggestions or advice to get me going in the right direction will be appreciated.
I think this will do it for you. It works with your 'out of order' requirement -- that is, {1,2,3} equals {3,2,1} equals {2,3,1}.
static void Main(string[] args)
{
List<List<double>> list = new List<List<double>>()
{
new List<double>() { 1,2,3 },
new List<double>() { 4,5,6 },
new List<double>() { 7,8,9 },
new List<double>() { 2,3,1 },
new List<double>() { 5,6,4 },
new List<double>() { 8,9,7 },
new List<double>() { 3,1,2 },
new List<double>() { 6,4,5 },
new List<double>() { 9,7,8 },
};
// Pick a method, they both work
//var q2 = DictionaryMethod(list);
var q2 = LinqAggregateMethod(list);
foreach (var item in q2)
{
Console.WriteLine("List:");
foreach (var item2 in item)
Console.WriteLine($"\t{item2[0]}, {item2[1]}, {item2[2]}");
}
}
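static bool ListsAreEqual(List<double> x, List<double> y)
{
// Different lengths can never match; without this check {1,2,3} would equal {1,2,3,4}
if (x.Count != y.Count)
return false;
foreach (var d in x.Distinct())
{
if (x.Count(i => i == d) != y.Count(i => i == d))
return false;
}
return true;
}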
static IEnumerable<IEnumerable<List<double>>> LinqAggregateMethod(List<List<double>> list)
{
var q = list.Aggregate(new List<List<double>>() /* accumulator(ret) initial value */, (ret, dlist) =>
{
// ret = accumulator
// dlist = one of the List<double> from list
// If accumulator doesn't already contain dlist (or it's equal), add it
if (!ret.Any(dlistRet => ListsAreEqual(dlist, dlistRet)))
ret.Add(dlist);
return ret;
});
// At this point, q contains one 'version' of each list.
// foreach item in q, select all the items in list where the lists are equal
var q2 = q.Select(dlist => list.Where(item => ListsAreEqual(dlist, item)));
return q2;
}
static IEnumerable<IEnumerable<List<double>>> DictionaryMethod(List<List<double>> list)
{
var list2 = new Dictionary<List<double>, List<List<double>>>();
// Loop over each List<double> in list
foreach (var item in list)
{
// Does the dictionary have a key that is equal to this item?
var key = list2.Keys.FirstOrDefault(k => ListsAreEqual(k, item));
if (key == null)
{
// No key found, add it
list2[item] = new List<List<double>>();
}
else
{
// Key was found, add item to its value
list2[key].Add(item);
}
}
var q2 = new List<List<List<double>>>();
foreach (var key in list2.Keys)
{
var a = new List<List<double>>();
a.Add(key); // Add the key
a.AddRange(list2[key]); // Add the other lists
q2.Add(a);
}
return q2;
}
Here is another approach to this problem. I would divide it into 2 Steps.
// Sample input:
List<List<double>> lists = new List<List<double>>();
lists.Add(new List<double> { 1, 1, 3 });
lists.Add(new List<double> { 1, 3, 1 });
lists.Add(new List<double> { 3, 1, 1 });
lists.Add(new List<double> { 4, 5, 6 });
lists.Add(new List<double> { 4, 5, 6 });
lists.Add(new List<double> { 6, 5, 4 });
lists.Add(new List<double> { 7, 8, 9 });
lists.Add(new List<double> { 8, 7, 9 });
lists.Add(new List<double> { 9, 8, 7 });
1) Get all unique lists from your collection. You can order them temporarily with OrderBy. This will allow a comparison using SequenceEqual:
List<List<double>> uniqueOrdered = new List<List<double>>();
foreach (var element in lists.Select(x => x.OrderBy(y => y).ToList()))
{
if (!uniqueOrdered.Any(x=> x.SequenceEqual(element)))
{
uniqueOrdered.Add(element);
}
}
2) Now you have one representative for each of your groups. Run through the representatives and, for each one, get all lists whose elements match it. Again you can order the lists temporarily for the sake of comparison with SequenceEqual:
List<List<List<double>>> result = new List<List<List<double>>>();
foreach (var element in uniqueOrdered)
{
result.Add(lists.FindAll(x=> x.OrderBy(t=>t).SequenceEqual(element)));
}
The lists in the resulting groups will maintain their original order!
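As a variation on the two steps above, the same grouping can be sketched as a single GroupBy over a canonical key. This swaps the representative list for a string key built from the sorted values; the names ListGrouping and GroupByContent and the "|" separator are choices made for this example only.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

static class ListGrouping
{
    // Groups lists that hold the same values regardless of element order.
    public static List<List<List<double>>> GroupByContent(List<List<double>> lists)
    {
        return lists
            // Sort a copy of each list and join it into a culture-independent key.
            .GroupBy(l => string.Join("|",
                l.OrderBy(v => v).Select(v => v.ToString("R", CultureInfo.InvariantCulture))))
            // Each group keeps the original, unsorted lists in input order.
            .Select(g => g.ToList())
            .ToList();
    }
}
```

GroupBy hashes the key, so each list is sorted once and placed in its group without being compared against every representative.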
I tried my best to shorten the code needed to accomplish what I assume you want. As you'll see, I put the resulting lists into a list.
The following sample is only here to declare your list and fill it with arbitrary values:
List<List<int>> ContainerList = new List<List<int>>()
{
new List<int>()
{
0, 1, 2
},
new List<int>()
{
3, 4, 6
},
new List<int>()
{
0, 1, 2
},
new List<int>()
{
7, 8, 9
},
};
Now begins the payload:
List<List<List<int>>> result = new List<List<List<int>>>();
foreach (var cont in ContainerList)
{
// skip lists that already belong to a group; Distinct() cannot do this afterwards,
// because it compares the inner lists by reference
if (!result.Any(g => g[0].SequenceEqual(cont)))
result.Add(ContainerList.FindAll(x => x.SequenceEqual(cont)));
}
So now you can get your sublists as:
[0][0] 012
[0][1] 012
[1][0] 346
....
EXPLANATION:
ContainerList.FindAll(x => x.SequenceEqual(cont))
This snippet uses a predicate; the x here is one element of your list.
Since it is a list of lists, x is itself a list.
SequenceEqual means that FindAll compares lists by VALUE and not by REFERENCE.
FindAll on the first element of ContainerList returns a list containing that element and all of its duplicates.
Because cont then moves on through the list, FindAll would run once for every member of the same group. In the example above you would end up with 2 groups of 2 012 lists, which is why the loop skips any list whose group is already in result.
I hope it is understandable. My English is terrible.
This is the code that I would use for the example that you gave.
static void Main(string[] args)
{
List<List<double>> lists = new List<List<double>>();
lists.Add(new List<double> { 1, 2, 3 });
lists.Add(new List<double> { 1, 2, 3 });
lists.Add(new List<double> { 1, 2, 3 });
lists.Add(new List<double> { 4, 5, 6 });
lists.Add(new List<double> { 4, 5, 6 });
lists.Add(new List<double> { 4, 5, 6 });
lists.Add(new List<double> { 7, 8, 9 });
lists.Add(new List<double> { 7, 8, 9 });
lists.Add(new List<double> { 7, 8, 9 });
lists.Add(new List<double> { 7, 8, 9 });
List<List<List<double>>> sortedLists = new List<List<List<double>>>();
for (int i = 0; i < lists.Count; i++)
{
bool found = false;
if (!(sortedLists.Count == 0))
{
for (int j = 0; j < sortedLists.Count; j++)
{
if (lists[i][0] == sortedLists[j][0][0] && lists[i][1] == sortedLists[j][0][1] && lists[i][2] == sortedLists[j][0][2])
{
found = true;
sortedLists[j].Add(lists[i]);
break;
}
}
}
if (!found)
{
sortedLists.Add(new List<List<double>> { lists[i] });
}
}
}
The only catch is that the inner if statement is written specifically for this example:
if (lists[i][0] == sortedLists[j][0][0] && lists[i][1] == sortedLists[j][0][1] && lists[i][2] == sortedLists[j][0][2])
It would have to be changed if your lists held anything other than exactly 3 doubles.
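One way to lift that restriction, sketched here under the assumption that two lists belong together when they hold the same values in the same order, is to replace the hard-coded three-element comparison with SequenceEqual. The names ExactGrouping and GroupIdentical are made up for this example.

```csharp
using System.Collections.Generic;
using System.Linq;

static class ExactGrouping
{
    // Groups lists that are element-for-element identical, for any list length.
    public static List<List<List<double>>> GroupIdentical(List<List<double>> lists)
    {
        var groups = new List<List<List<double>>>();
        foreach (var list in lists)
        {
            // The first list of each group acts as its representative.
            var match = groups.FirstOrDefault(g => g[0].SequenceEqual(list));
            if (match == null)
                groups.Add(new List<List<double>> { list });
            else
                match.Add(list);
        }
        return groups;
    }
}
```

SequenceEqual also returns false when the lengths differ, so lists of mixed sizes simply end up in separate groups.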

Split Array By Values In Sequence [duplicate]

This question already has answers here:
LINQ to find series of consecutive numbers
(6 answers)
Closed 5 years ago.
Is there an easy (linq?) way to split an int array into new arrays based off unbroken numerical sequences? For example given this pseudo code:
[Fact]
public void ArraySpike()
{
var source = new[] {1, 2, 3, 7, 8, 9, 12, 13, 24};
var results = SplitArray(source);
Assert.True(results[0] == new[] {1, 2, 3});
Assert.True(results[1] == new[] {7, 8, 9});
Assert.True(results[2] == new[] {12, 13});
Assert.True(results[3] == new[] {24});
}
public int[][] SplitArray(int[] source)
{
return source.???
}
This can be done with the LINQ extension method Aggregate. My seeding is not very elegant, but that is easy enough to change. The results variable will contain the array of arrays; the inner collections are actually of type List<T>, because a list can grow inside the function whereas an array [] always has a fixed size.
This also assumes the source is already ordered and unique; if that is not the case, add .OrderBy(x => x).Distinct()
var source = new[] { 1, 2, 3, 7, 8, 9, 12, 13, 24 };
var results = new List<List<int>>{new List<int>()};
var temp = source.Aggregate(results[0], (b, c) =>
{
if (b.Count > 0 && b.Last() != c - 1)
{
b = new List<int>();
results.Add(b);
}
b.Add(c);
return b;
});
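If the int[][] return type from the question is wanted, the same idea can be sketched as a plain loop that converts the lists to arrays at the end. This is a rewrite for illustration, not the answer's code; the class name Runs is made up, and the input is still assumed to be ordered and unique.

```csharp
using System.Collections.Generic;
using System.Linq;

static class Runs
{
    // Splits an ordered, duplicate-free array into runs of consecutive values.
    public static int[][] SplitArray(int[] source)
    {
        var results = new List<List<int>>();
        foreach (int value in source)
        {
            // Start a new run at the beginning or when the sequence is broken.
            if (results.Count == 0 || results[results.Count - 1].Last() != value - 1)
                results.Add(new List<int>());
            results[results.Count - 1].Add(value);
        }
        return results.Select(r => r.ToArray()).ToArray();
    }
}
```

Note that the assertions in the question compare arrays with ==, which tests reference equality; comparing each run with SequenceEqual checks the contents instead.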
I dug up this extension method from my personal collection:
public static IEnumerable<IEnumerable<T>> GroupConnected<T>(this IEnumerable<T> list, Func<T,T,bool> connectionCondition)
{
if (list == null)
{
yield break;
}
using (var enumerator = list.GetEnumerator())
{
T prev = default(T);
// explicit flag: comparing prev to default(T) misfires on a real 0 and throws on null references
bool hasPrev = false;
var temp = new List<T>();
while (enumerator.MoveNext())
{
T curr = enumerator.Current;
if (hasPrev && !connectionCondition(prev, curr))
{
yield return temp;
temp = new List<T>();
}
temp.Add(curr);
prev = curr;
hasPrev = true;
}
yield return temp;
}
}
It solves the problem in a more general sense: split up a sequence in subsequences of elements that are "connected" somehow. It traverses the sequence and collects each element in a temporary list until the next item isn't "connected". It then returns the temporary list and begins a new one.
Your array elements are connected when they have a difference of 1:
var results = source.GroupConnected((a,b) => b - a == 1);

The union of the intersects of the 2 set combinations of a sequence of sequences

How can I find the set of items that occur in 2 or more sequences in a sequence of sequences?
In other words, I want the distinct values that occur in at least 2 of the passed in sequences.
Note:
This is not the intersect of all sequences but rather, the union of the intersect of all pairs of sequences.
Note 2:
This does not include the pair, or 2-combination, of a sequence with itself. That would be silly.
I have made an attempt myself,
public static IEnumerable<T> UnionOfIntersects<T>(
this IEnumerable<IEnumerable<T>> source)
{
var pairs =
from s1 in source
from s2 in source
select new { s1 , s2 };
var intersects = pairs
.Where(p => p.s1 != p.s2)
.Select(p => p.s1.Intersect(p.s2));
return intersects.SelectMany(i => i).Distinct();
}
but I'm concerned that this might be sub-optimal. I think it includes the intersects of both pair A, B and pair B, A, which seems inefficient. I also think there might be a more efficient way to compound the sets as they are iterated.
I include some example input and output below:
{ { 1, 1, 2, 3, 4, 5, 7 }, { 5, 6, 7 }, { 2, 6, 7, 9 } , { 4 } }
returns
{ 2, 4, 5, 6, 7 }
and
{ { 1, 2, 3} } or { {} } or { }
returns
{ }
I'm looking for the best combination of readability and potential performance.
EDIT
I've performed some initial testing of the current answers, my code is here. Output below.
Original valid:True
DoomerOneLine valid:True
DoomerSqlLike valid:True
Svinja valid:True
Adricadar valid:True
Schmelter valid:True
Original 100000 iterations in 82ms
DoomerOneLine 100000 iterations in 58ms
DoomerSqlLike 100000 iterations in 82ms
Svinja 100000 iterations in 1039ms
Adricadar 100000 iterations in 879ms
Schmelter 100000 iterations in 9ms
At the moment, it looks as if Tim Schmelter's answer performs better by at least an order of magnitude.
// init sequences
var sequences = new int[][]
{
new int[] { 1, 2, 3, 4, 5, 7 },
new int[] { 5, 6, 7 },
new int[] { 2, 6, 7, 9 },
new int[] { 4 }
};
One-line way:
var result = sequences
.SelectMany(e => e.Distinct())
.GroupBy(e => e)
.Where(e => e.Count() > 1)
.Select(e => e.Key);
// result is { 2 4 5 7 6 }
Sql-like way (with ordering):
var result = (
from e in sequences.SelectMany(e => e.Distinct())
group e by e into g
where g.Count() > 1
orderby g.Key
select g.Key);
// result is { 2 4 5 6 7 }
Possibly the fastest code (though less readable), with O(N) complexity in the total number of elements:
var dic = new Dictionary<int, int>();
var subHash = new HashSet<int>();
int length = sequences.Length;
for (int i = 0; i < length; i++)
{
subHash.Clear();
int subLength = sequences[i].Length;
for (int j = 0; j < subLength; j++)
{
int n = sequences[i][j];
// Add returns false when n was already seen in this sub-array,
// so duplicates inside one sub-array are counted only once
if (subHash.Add(n))
{
int counter;
if (dic.TryGetValue(n, out counter))
{
// seen in an earlier sub-array
dic[n] = counter + 1;
}
else
{
// first occurrence
dic[n] = 1;
}
}
}
}
// values that occur in at least two sub-arrays
var result = dic.Where(kv => kv.Value > 1).Select(kv => kv.Key);
This should be very close to optimal - how "readable" it is depends on your taste. In my opinion it is also the most readable solution.
var seenElements = new HashSet<T>();
var repeatedElements = new HashSet<T>();
foreach (var list in source)
{
foreach (var element in list.Distinct())
{
if (seenElements.Contains(element))
{
repeatedElements.Add(element);
}
else
{
seenElements.Add(element);
}
}
}
return repeatedElements;
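The fragment above has no enclosing method, so here is a minimal sketch of how it might be wrapped as the extension method from the question. The class name SequenceExtensions is an assumption, and the sketch uses the return value of HashSet.Add in place of the Contains check.

```csharp
using System.Collections.Generic;

static class SequenceExtensions
{
    // Returns the distinct values that occur in at least two of the sequences.
    public static IEnumerable<T> UnionOfIntersects<T>(this IEnumerable<IEnumerable<T>> source)
    {
        var seenElements = new HashSet<T>();
        var repeatedElements = new HashSet<T>();
        foreach (var sequence in source)
        {
            // A per-sequence set keeps duplicates inside one sequence from counting twice.
            foreach (var element in new HashSet<T>(sequence))
            {
                // Add returns false when an earlier sequence already contained the element.
                if (!seenElements.Add(element))
                    repeatedElements.Add(element);
            }
        }
        return repeatedElements;
    }
}
```

Every element is visited once and each set operation is a hash lookup, so the work grows with the total number of elements rather than with the number of sequence pairs.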
You can skip sequences that have already been intersected; this way it will be a little faster.
public static IEnumerable<T> UnionOfIntersects<T>(this IEnumerable<IEnumerable<T>> source)
{
var result = new List<T>();
var sequences = source.ToList();
for (int sequenceIdx = 0; sequenceIdx < sequences.Count(); sequenceIdx++)
{
var sequence = sequences[sequenceIdx];
for (int targetSequenceIdx = sequenceIdx + 1; targetSequenceIdx < sequences.Count; targetSequenceIdx++)
{
var targetSequence = sequences[targetSequenceIdx];
var intersections = sequence.Intersect(targetSequence);
result.AddRange(intersections);
}
}
return result.Distinct();
}
How does it work?
Input: {/*0*/ { 1, 2, 3, 4, 5, 7 } ,/*1*/ { 5, 6, 7 },/*2*/ { 2, 6, 7, 9 } , /*3*/{ 4 } }
Step 0: Intersect 0 with 1..3
Step 1: Intersect 1 with 2..3 (0 with 1 has already been intersected)
Step 2: Intersect 2 with 3 (0 with 2 and 1 with 2 have already been intersected)
Return: Distinct elements.
Result: { 2, 4, 5, 6, 7 }
You can test it with the below code
var lists = new List<List<int>>
{
new List<int> {1, 2, 3, 4, 5, 7},
new List<int> {5, 6, 7},
new List<int> {2, 6, 7, 9},
new List<int> {4 }
};
var result = lists.UnionOfIntersects();
You can try this approach. It might be more efficient, and it also lets you specify the minimum intersection count and the comparer used:
public static IEnumerable<T> UnionOfIntersects<T>(this IEnumerable<IEnumerable<T>> source
, int minIntersectionCount
, IEqualityComparer<T> comparer = null)
{
if (comparer == null) comparer = EqualityComparer<T>.Default;
foreach (T item in source.SelectMany(s => s).Distinct(comparer))
{
int containedInHowManySequences = 0;
foreach (IEnumerable<T> seq in source)
{
bool contained = seq.Contains(item, comparer);
if (contained) containedInHowManySequences++;
if (containedInHowManySequences == minIntersectionCount)
{
yield return item;
break;
}
}
}
}
Some explaining words:
It enumerates all unique items in all sequences. Since Distinct uses a set internally, this should be pretty efficient, and it helps when the sequences contain many duplicates.
The inner loop just checks, for every sequence, whether the unique item is contained. It uses Enumerable.Contains, which stops as soon as the item is found (so duplicates are no issue).
If the intersection count reaches the minimum intersection count, the item is yielded and the next (unique) item is checked.
That should nail it:
int[][] test = { new int[] { 1, 2, 3, 4, 5, 7 }, new int[] { 5, 6, 7 }, new int[] { 2, 6, 7, 9 }, new int[] { 4 } };
var result = test.SelectMany(a => a.Distinct()).GroupBy(x => x).Where(g => g.Count() > 1).Select(y => y.Key).ToList();
First you make sure there are no duplicates within each sequence. Then you join all sequences into a single sequence and look for duplicates, as e.g. here.
