Compare lists Unions - c#

List<String> A = new List<String>();
List<String> B = new List<String>();
List<String> itemsremoved = ((A∩B)^c)-A;
List<String> itemsadded = ((A∩B)^c)-B;
I want to compute the complement of the intersection of A and B, minus the elements of one of the lists. Is there a function to do this?

LINQ provides extension methods for working with sets.
For example, the set difference of a and b (the elements of a that are not in b) is:
var a = new List<int> { 1, 2, 3, 6 };
var b = new List<int> { 3, 4, 5, 6 };
var comp = a.Except(b);
comp will contain elements [1, 2].
Do a Google search for C# and LINQ set operations and you're sure to find plenty of examples.
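Applied to the question's own expressions, and assuming the complement is taken within A ∪ B (the question does not say which universe is meant), ((A∩B)^c)-A reduces to the elements found only in B, and ((A∩B)^c)-B to the elements found only in A. A minimal sketch of that reading; the class and method names are made up for illustration:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ListDiff
{
    // ((A∩B)^c) - A, with A ∪ B as the universe: elements of b that are not in a.
    public static List<string> OnlyInB(List<string> a, List<string> b)
    {
        return b.Except(a).ToList();
    }

    // ((A∩B)^c) - B, with A ∪ B as the universe: elements of a that are not in b.
    public static List<string> OnlyInA(List<string> a, List<string> b)
    {
        return a.Except(b).ToList();
    }

    // (A∩B)^c itself under the same assumption: elements in exactly one of the lists.
    public static HashSet<string> InExactlyOne(List<string> a, List<string> b)
    {
        var set = new HashSet<string>(a);
        set.SymmetricExceptWith(b);
        return set;
    }
}
```

For A = { "x", "y", "z" } and B = { "y", "z", "w" }, OnlyInB returns { "w" }, OnlyInA returns { "x" }, and InExactlyOne contains "x" and "w". Which of the two differences counts as "added" and which as "removed" depends on which list is the older one.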

Related

How to add List<T> items dynamically to IEnumerable<T>

Code
public static void Main()
{
List<int> list1 = new List<int> {1, 2, 3, 4, 5, 6 };
List<int> list2 = new List<int> {1, 2, 3 };
List<int> list3 = new List<int> {1, 2 };
var lists = new IEnumerable<int>[] { list1, list2, list3 };
var commons = GetCommonItems(lists);
Console.WriteLine("Common integers:");
foreach (var c in commons)
Console.WriteLine(c);
}
static IEnumerable<T> GetCommonItems<T>(IEnumerable<T>[] lists)
{
HashSet<T> hs = new HashSet<T>(lists.First());
for (int i = 1; i < lists.Length; i++)
hs.IntersectWith(lists[i]);
return hs;
}
In the sample I showed only "list1", "list2" and "list3", but I may have more than 50 lists, each generated inside a foreach loop. How can I add each list programmatically to the IEnumerable lists so that I can compare the data of every list?
I tried many approaches, such as conversion to a list, Add, Append and Concat, but nothing worked.
Is there a better way to compare N lists?
The output of Code: 1 2
You can create a list of lists and add lists to that list dynamically. Something like this:
var lists = new List<List<int>>();
lists.Add(new List<int> {1, 2, 3, 4, 5, 6 });
lists.Add(new List<int> {1, 2, 3 });
lists.Add(new List<int> {1, 2 });
// listSources stands for whatever collection of lists your loop produces
foreach (var list in listSources)
lists.Add(list);
var commons = GetCommonItems(lists);
To find intersections you can use this solution for example: Intersection of multiple lists with IEnumerable.Intersect() (actually looks like that's what you are using already).
Also make sure to change the signature of the GetCommonItems method (and, if you keep the original body, replace lists.Length with lists.Count):
static IEnumerable<T> GetCommonItems<T>(List<List<T>> lists)
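Put together, the list-of-lists approach might look like the sketch below. BuildLists is a hypothetical stand-in for whatever loop generates your lists (here, list k simply holds the integers 1..count-k); the empty-input guard is an addition that the original method does not have.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CommonItems
{
    // Hypothetical generator: stands in for the loop that produces the real lists.
    public static List<List<int>> BuildLists(int count)
    {
        var lists = new List<List<int>>();
        for (int k = 0; k < count; k++)
            lists.Add(Enumerable.Range(1, count - k).ToList());
        return lists;
    }

    // Same idea as the original GetCommonItems, adapted to List<List<T>>.
    public static IEnumerable<T> GetCommonItems<T>(List<List<T>> lists)
    {
        if (lists.Count == 0)
            return Enumerable.Empty<T>();

        // Start from the first list and keep only what every other list also contains.
        var hs = new HashSet<T>(lists[0]);
        for (int i = 1; i < lists.Count; i++)
            hs.IntersectWith(lists[i]);
        return hs;
    }
}
```

With 50 generated lists, CommonItems.GetCommonItems(CommonItems.BuildLists(50)) yields only 1, because the last generated list contains nothing else.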
What you could do is allow the GetCommonItems method to accept a variable number of arguments using the params keyword. This way, you avoid needing to create a new collection of lists.
It goes without saying, however, that if the number of lists in your source is variable as well, this could be trickier to use.
I've also amended the GetCommonItems method to work like the code from https://stackoverflow.com/a/1676684/9945524
public static void Main()
{
List<int> list1 = new List<int> { 1, 2, 3, 4, 5, 6 };
List<int> list2 = new List<int> { 1, 2, 3 };
List<int> list3 = new List<int> { 1, 2 };
var commons = GetCommonItems(list1, list2, list3); // pass the lists here
Console.WriteLine("Common integers:");
foreach (var c in commons)
Console.WriteLine(c);
}
static IEnumerable<T> GetCommonItems<T>(params List<T>[] lists)
{
return lists.Skip(1).Aggregate(
new HashSet<T>(lists.First()),
(hs, lst) =>
{
hs.IntersectWith(lst);
return hs;
}
);
}
Alternate solution using your existing Main method.
EDIT: changed the type of lists to IEnumerable<IEnumerable<T>> as per comment in this answer.
public static void Main()
{
List<int> list1 = new List<int> { 1, 2, 3, 4, 5, 6 };
List<int> list2 = new List<int> { 1, 2, 3 };
List<int> list3 = new List<int> { 1, 2 };
var lists = new List<List<int>> { list1, list2, list3 };
var commons = GetCommonItems(lists);
Console.WriteLine("Common integers:");
foreach (var c in commons)
Console.WriteLine(c);
}
static IEnumerable<T> GetCommonItems<T>(IEnumerable<IEnumerable<T>> enumerables)
{
return enumerables.Skip(1).Aggregate(
new HashSet<T>(enumerables.First()),
(hs, lst) =>
{
hs.IntersectWith(lst);
return hs;
}
);
}
IEnumerable<T> is a read-only view with no Add method, so you cannot add items to it directly; build a concrete collection such as List<T> and return that, depending on your needs.
If I understand correctly you want to get common items of N lists. I would use LINQ for this.
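My proposal:
1. Make one list that contains the items of every list (taking each list's distinct items, so that duplicates within a single list are not counted twice):
var lists = new List<List<int>>(); // the lists to compare
var allElements = new List<int>();
foreach (var list in lists)
allElements.AddRange(list.Distinct());
2. Keep the items that occur as many times as there are lists (Count() > 1 would instead keep items found in at least two lists):
var commons = allElements.GroupBy(x => x).Where(g => g.Count() == lists.Count).Select(g => g.Key).ToList();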

Merging arrays with common element

I want to merge arrays with common element. I have list of arrays like this:
List<int[]> arrList = new List<int[]>
{
new int[] { 1, 2 },
new int[] { 3, 4, 5 },
new int[] { 2, 7 },
new int[] { 8, 9 },
new int[] { 10, 11, 12 },
new int[] { 3, 9, 13 }
};
and I would like to merge these arrays like this:
List<int[]> arrList2 = new List<int[]>
{
new int[] { 1, 2, 7 },
new int[] { 10, 11, 12 },
new int[] { 3, 4, 5, 8, 9, 13 } //order of elements doesn't matter
};
How to do it?
Let each number be a vertex in the labelled graph. For each array connect vertices pointed by the numbers in the given array. E.g. given array (1, 5, 3) create two edges (1, 5) and (5, 3). Then find all the connected components in the graph (see: http://en.wikipedia.org/wiki/Connected_component_(graph_theory))
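One way to sketch the graph idea is a breadth-first search over an adjacency list. The class and method names are illustrative, and BFS is just one of several ways to find connected components:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class GraphMerge
{
    public static List<int[]> MergeByComponents(IEnumerable<int[]> arrays)
    {
        // Build the graph: consecutive elements of each array share an edge.
        var adjacency = new Dictionary<int, List<int>>();
        foreach (var arr in arrays)
        {
            for (int i = 0; i < arr.Length; i++)
            {
                if (!adjacency.ContainsKey(arr[i]))
                    adjacency[arr[i]] = new List<int>();
                if (i > 0)
                {
                    adjacency[arr[i - 1]].Add(arr[i]);
                    adjacency[arr[i]].Add(arr[i - 1]);
                }
            }
        }

        // Every unvisited vertex starts a new component; BFS collects its members.
        var visited = new HashSet<int>();
        var components = new List<int[]>();
        foreach (int start in adjacency.Keys)
        {
            if (!visited.Add(start))
                continue;

            var component = new List<int>();
            var queue = new Queue<int>();
            queue.Enqueue(start);
            while (queue.Count > 0)
            {
                int v = queue.Dequeue();
                component.Add(v);
                foreach (int w in adjacency[v])
                    if (visited.Add(w))
                        queue.Enqueue(w);
            }
            components.Add(component.ToArray());
        }
        return components;
    }
}
```

For the arrList from the question this produces three arrays holding { 1, 2, 7 }, { 3, 4, 5, 8, 9, 13 } and { 10, 11, 12 }; the order of the arrays and of the elements inside them is not specified.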
I'm pretty sure this is neither the best nor the fastest solution, but it works.
static List<List<int>> Merge(List<List<int>> source)
{
var merged = 0;
do
{
merged = 0;
var results = new List<List<int>>();
foreach (var l in source)
{
var i = results.FirstOrDefault(x => x.Intersect(l).Any());
if (i != null)
{
i.AddRange(l);
merged++;
}
else
{
results.Add(l.ToList());
}
}
source = results.Select(x => x.Distinct().ToList()).ToList();
}
while (merged > 0);
return source;
}
I've used List<List<int>> instead of List<int[]> so that the AddRange method is available.
Usage:
var results = Merge(arrList.Select(x => x.ToList()).ToList());
// to get List<int[]> instead of List<List<int>>
var array = results.Select(x => x.ToArray()).ToList();
Use Disjoint-Set Forest data structure. The data structure supports three operations:
MakeSet(item) - creates a new set with a single item
Find(item) - Given an item, look up a set.
Union(item1, item2) - Given two items, connects together the sets to which they belong.
You can go through each array and call Union on its first element and each element after it. Once you are done with all the arrays in the list, you can retrieve the individual sets by going through all the numbers again and calling Find(item) on each. Numbers for which Find returns the same set belong in the same array.
With union by rank and path compression, each operation takes O(α(n)) amortized time, so the whole merge takes O(n α(n)) for n numbers (α grows very slowly, so for all practical purposes it can be considered a small constant).
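.NET has no built-in disjoint-set type, so the structure has to be written by hand. The following is a minimal sketch using union by rank and path compression; the DisjointSet and DisjointSetMerge names are illustrative, and the items are assumed to be ints as in the question:

```csharp
using System.Collections.Generic;
using System.Linq;

public sealed class DisjointSet
{
    private readonly Dictionary<int, int> parent = new Dictionary<int, int>();
    private readonly Dictionary<int, int> rank = new Dictionary<int, int>();

    // Creates a singleton set for item if it has not been seen before.
    public void MakeSet(int item)
    {
        if (!parent.ContainsKey(item))
        {
            parent[item] = item;
            rank[item] = 0;
        }
    }

    // Returns the representative of item's set, compressing the path on the way.
    public int Find(int item)
    {
        int root = item;
        while (parent[root] != root)
            root = parent[root];

        while (parent[item] != root)
        {
            int next = parent[item];
            parent[item] = root;
            item = next;
        }
        return root;
    }

    // Joins the sets containing a and b, attaching the shallower tree to the deeper one.
    public void Union(int a, int b)
    {
        int ra = Find(a), rb = Find(b);
        if (ra == rb)
            return;
        if (rank[ra] < rank[rb])
        {
            int tmp = ra; ra = rb; rb = tmp;
        }
        parent[rb] = ra;
        if (rank[ra] == rank[rb])
            rank[ra]++;
    }
}

public static class DisjointSetMerge
{
    public static List<int[]> Merge(IEnumerable<int[]> arrays)
    {
        var list = arrays.ToList();
        var sets = new DisjointSet();
        foreach (var arr in list)
        {
            foreach (int x in arr)
                sets.MakeSet(x);
            // Union the first element with every later element of the same array.
            for (int i = 1; i < arr.Length; i++)
                sets.Union(arr[0], arr[i]);
        }

        // Numbers with the same representative end up in the same output array.
        return list.SelectMany(a => a).Distinct()
                   .GroupBy(x => sets.Find(x))
                   .Select(g => g.ToArray())
                   .ToList();
    }
}
```

Calling DisjointSetMerge.Merge(arrList) on the question's data gives three arrays containing { 1, 2, 7 }, { 3, 4, 5, 8, 9, 13 } and { 10, 11, 12 }, in no particular order.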

comparing two lists and removing missing numbers with C#

There are two lists:
List<int> list2 = new List<int>(new[] { 1, 2, 3, 5, 6 }); // missing: 0 and 4
List<int> list1 = new List<int>(new[] { 0, 1, 2, 3, 4, 5, 6 });
How do you compare the two lists, find the numbers in list1 that are missing from list2, and remove them from list1? To be more precise, I need a way to specify the starting and ending positions for the comparison.
I imagine that the process should be very similar to this:
Step 1.
int start_num = 3; // we know that comparisons starts at number 3
int start = list2.IndexOf(start_num); // we get index of Number (3)
int end = start + 2; // get ending position
int end_num = list2[end]; // get ending number (6)
Now we have the positions of the numbers (and the numbers themselves) for comparison in list2: (3,5,6)
Step 2. To get positions of numbers in List1 for comparison - we can do the following:
int startlist1 = list1.IndexOf(start_num); // starting position
int endlist1 = list1.IndexOf(end_num); // ending position
The range is the following: (3,4,5,6)
Step 3. Comparison. The tricky part starts here, and I need help with it.
Basically, we now need to compare list2 at (3,5,6) with list1 at (3,4,5,6). The missing number is "4".
// I have trouble with this step, but the result will be:
int remove_it = 4; // or int []
Step 4. Removing the extra number.
int remove_it = 4;
list1 = list1.Where(a => a != remove_it).ToList();
This works great, but what happens if we have 2 missing numbers? i.e.
int remove_it = 4 // becomes int[] remove_it = {4, 0}
Result: as you have guessed, the result is a new list1 without the number 4 in it.
richTextBox1.Text = "" + string.Join(",", list1.ToArray()); // output: 0,1,2,3,5,6
textBox1.Text = "" + start + " " + start_num; // output: 2 3
textBox3.Text = "" + end + " " + end_num; // output: 4 6
textBox2.Text = "" + startlist1; // output: 3
textBox4.Text = "" + endlist1; // output: 6
Can you help me out with Step 3 or point me in the right direction?
Also, can you say what will happen if the starting number (start_num) is the last number, but I need to get the next two numbers? In the example above the numbers were 3,5,6, but that should be no different from 5,6,0 or 6,0,1 or 0,1,2.
Just answering the first part:
var list3 = list1.Intersect(list2);
This will set list3 to { 0, 1, 2, 3, 4, 5, 6 } - { 0, 4 } = { 1, 2, 3, 5, 6 }
And a reaction to step 1:
int start_num = 3; // we know that comparisons starts at number 3
int start = list2.IndexOf(start_num); // we get index of Number (3)
int end = start + 2; // get ending position
Where do you get all those magic numbers (3, + 2) from?
I think you are over-thinking this, a lot.
var result = list1.Intersect(list2);
You can add a .ToList() on the end if you really need the result to be a list.
List<int> list2 = new List<int>(new[] { 1, 2, 3, 5, 6 }); // missing: 0 and 4
List<int> list1 = new List<int>(new[] { 0, 1, 2, 3, 4, 5, 6 });
// find items in list1 that are not in list2
var exceptions = list1.Except(list2);
// or do you really want a union? (the distinct numbers found in either list)
var uniquenumberlist = list1.Union(list2);
// or do you want the numbers common to both lists?
var commonnumberslist = list1.Intersect(list2);
Maybe you should work with a sorted collection such as SortedSet<T> instead of List<T> (.NET has no type named OrderedList).
Something like this:
list1.RemoveAll(l=> !list2.Contains(l));
To get the numbers that exist in list1 but not in list2, you use the Except extension method:
IEnumerable<int> missing = list1.Except(list2);
To loop through this result and remove the items from list1, you have to materialize the result first (with ToList); otherwise Except reads from list1 lazily while you are changing it, and you get an exception:
List<int> missing = list1.Except(list2).ToList();
Now you can just remove them:
foreach (int number in missing) {
list1.Remove(number);
}
I'm not sure I understand your issue, but I hope this solution works for you.
You have 2 lists:
List<int> list2 = new List<int>(new[] { 1, 2, 3, 5, 6 }); // missing: 0 and 4
List<int> list1 = new List<int>(new[] { 0, 1, 2, 3, 4, 5, 6 });
To remove from list1 all the numbers that are missing from list2, I suggest this solution:
Build a new list for the missing numbers:
List<int> diff = new List<int>();
Then put all the numbers you need to remove in this list. Now the removal should be simple: take each element you added to diff and remove it from list1.
Did I understand correctly that the algorithm is:
1) take the first number in list2 and find that number in list1,
2) then remove everything from list1 until you find the second number from list2 (5),
3) repeat step 2) for the next number in list2?
You can use Intersect in conjunction with Skip and Take to get the intersection logic combined with a range (here we ignore the fact 0 is missing as we skip it):
static void Main(string[] args)
{
var list1 = new List<int> { 1, 2, 3, 4, 5 };
var list2 = new List<int> { 0, 1, 2, 3, 5, 6 };
foreach (var i in list2.Skip(3).Take(3).Intersect(list1))
Console.WriteLine(i); // Outputs 3 then 5.
Console.Read();
}
Though if I'm being really honest, I'm not sure what is being asked - the only thing I'm certain on is the intersect part:
var list1 = new List<int> { 1, 2, 3, 4, 5 };
var list2 = new List<int> { 0, 1, 2, 3, 5, 6 };
foreach (var i in list2.Intersect(list1))
Console.WriteLine(i); // Outputs 1, 2, 3, 5.
OK, it seems I hadn't explained the problem well enough, sorry about that. Anyone interested can see what I meant by looking at this code:
List<int> list2 = new List<int>() { 1, 2, 3, 5, 6 }; // missing: 0 and 4
List<int> list1 = new List<int>() { 0, 1, 2, 3, 4, 5, 6 };
int number = 3; // starting position
int indexer = list2.BinarySearch(number);
if (indexer < 0)
{
list2.Insert(~indexer, number); // don't look at this part
}
// get indexes of "starting position"
int index1 = list1.Select((item, i) => new { Item = item, Index = i }).First(x => x.Item == number).Index;
int index2 = list2.Select((item, i) => new { Item = item, Index = i }).First(x => x.Item == number).Index;
// reorder lists starting at "starting position"
List<int> reorderedList1 = list1.Skip(index1).Concat(list1.Take(index1)).ToList(); //main big
List<int> reorderedList2 = list2.Skip(index2).Concat(list2.Take(index2)).ToList(); // main small
int end = 2; // get ending position: 2 numbers to the right
int end_num = reorderedList2[end]; // get ending number
int endlist1 = reorderedList1.IndexOf(end_num); // ending position
//get lists for comparison
reorderedList2 = reorderedList2.Take(end + 1).ToList();
reorderedList1 = reorderedList1.Take(endlist1 + 1).ToList();
//compare lists
var list3 = reorderedList1.Except(reorderedList2).ToList();
if (list3.Count != 0)
{
foreach (int item in list3)
{
list1 = list1.Where(x => x != item).ToList(); // remove from list
}
}
// list1 is the result that I wanted to see
If there are any ways to optimize this code, please let me know. Cheers.

remove a value from an int array c#

I have an array of int values, int[] ids.
I have a DataTable, DataTable dt.
I want to keep only those values in the array that are present in the DataTable column ids.
Say int[] ids contains [2,3,4,5]
and dt contains [2,3,4,3,4] (ids here may repeat),
so the output ids will have only [2,3,4].
Please suggest ways to do this with lambdas or LINQ.
I tried the crude way using two foreach loops.
Use:
int[] myIDs = (from d in dt.AsEnumerable() select d.Field<int>("id")).Intersect(ids).ToArray();
For reference see:
http://msdn.microsoft.com/en-us/library/bb360891.aspx
http://msdn.microsoft.com/en-us/library/system.data.datatableextensions.asenumerable.aspx
http://msdn.microsoft.com/en-us/library/bb460136.aspx
http://msdn.microsoft.com/en-us/library/x303t819.aspx
http://msdn.microsoft.com/en-us/vcsharp/aa336746
http://msdn.microsoft.com/en-us/vcsharp/aa336761.aspx#intersect1
You need to create a new array.
Arrays have a fixed size.
If you want a data structure that supports removing elements, you need a List<T>.
Note that removing from a List<T> has a worst-case complexity of O(n).
For your particular problem, however, I would write something like this:
public int[] MyFunc(DataTable dt, int[] inputArray)
{
HashSet<int> allowedIds = new HashSet<int>();
// Fill allowedIds with the ids you want to keep (for example, the values in dt)
int[] newArray = new int[inputArray.Length];
int newArrayCount = 0;
for (int i = 0; i < inputArray.Length; ++i)
{
// Copy only the ids that are allowed, preserving their order
if (allowedIds.Contains(inputArray[i]))
{
newArray[newArrayCount++] = inputArray[i];
}
}
// Trim the unused tail of the array
Array.Resize(ref newArray, newArrayCount);
return newArray;
}
You need the intersection of the 2 collections. LINQ has an Intersect method for that.
From the Linq 101 samples:
public void Linq50()
{
int[] numbersA = { 0, 2, 4, 5, 6, 8, 9 };
int[] numbersB = { 1, 3, 5, 7, 8 };
var commonNumbers = numbersA.Intersect(numbersB);
Console.WriteLine("Common numbers shared by both arrays:");
foreach (var n in commonNumbers)
{
Console.WriteLine(n);
}
}
You can find more examples here in Linq 101 Samples.
Use the Intersect function:
var ids = new[] {2, 3, 4, 5};
var dt = new[] {2, 3, 4, 3, 4};
foreach (var id in ids.Intersect(dt))
{
// id occurs in both ids and dt: 2, 3, 4
}
You could create List<int> fromDB and (cycling over dataset) fill it with ids column values.
Then you could use:
List<int> result = ids.Intersect(fromDB).ToList();
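If you would rather not loop over the rows by hand, the column can be read with AsEnumerable and Field<T>. This is a sketch under a few assumptions: the column is of type int, its name is passed in as columnName, and the IdFilter class name is made up. On .NET Framework it also needs a reference to System.Data.DataSetExtensions.

```csharp
using System.Data;
using System.Linq;

public static class IdFilter
{
    // Keeps only the ids that also occur in the given int column of the table.
    public static int[] KeepIdsPresentInTable(int[] ids, DataTable dt, string columnName)
    {
        var fromDb = dt.AsEnumerable().Select(row => row.Field<int>(columnName));

        // Intersect returns each common value once, in the order it appears in ids.
        return ids.Intersect(fromDb).ToArray();
    }
}
```

With ids = [2,3,4,5] and a column holding [2,3,4,3,4], the result is [2,3,4].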

find common items across multiple lists in C#

I have two generic lists:
List<string> TestList1 = new List<string>();
List<string> TestList2 = new List<string>();
TestList1.Add("1");
TestList1.Add("2");
TestList1.Add("3");
TestList2.Add("3");
TestList2.Add("4");
TestList2.Add("5");
What is the fastest way to find common items across these lists?
Assuming you use a version of .NET that has LINQ, you can use the Intersect extension method:
var CommonList = TestList1.Intersect(TestList2);
If you have lists of objects and want to get the values of some property that are common to both lists, use:
var commons = TestList1.Select(s1 => s1.SomeProperty).Intersect(TestList2.Select(s2 => s2.SomeProperty)).ToList();
Note: SomeProperty stands for whatever property you want to compare on.
Assuming you have LINQ available, I don't know if it's the fastest, but a clean way would be something like:
var distinctStrings = TestList1.Union(TestList2); // Union already removes duplicates, so no Distinct() is needed
Update: never mind my answer; Union returns the items found in either list, not the common ones, and I've just learnt about Intersect as well!
According to an update in the comments, Union applies a distinct, which makes sense now that I think about it.
You can do this by counting occurrences of all items in all lists - those items whose occurrence count is equal to the number of lists, are common to all lists:
static List<T> FindCommon<T>(IEnumerable<List<T>> lists)
{
Dictionary<T, int> map = new Dictionary<T, int>();
int listCount = 0; // number of lists
foreach (IEnumerable<T> list in lists)
{
listCount++;
foreach (T item in list)
{
// Item encountered, increment count
int currCount;
if (!map.TryGetValue(item, out currCount))
currCount = 0;
currCount++;
map[item] = currCount;
}
}
List<T> result= new List<T>();
foreach (KeyValuePair<T,int> kvp in map)
{
// Items whose occurrence count is equal to the number of lists are common to all the lists
if (kvp.Value == listCount)
result.Add(kvp.Key);
}
return result;
}
Sort both arrays, then walk through them together from the start, comparing the current items: equal items are common, otherwise advance past the smaller one.
Using a hash is even faster: put the first array into a hash set, then check for each item of the second array whether it is already in the set.
I don't know how Intersect and Union are implemented. Measure their running time if you care about performance. Of course, they are the better choice if you want clean code.
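The sort-and-compare idea can be sketched as a two-index walk over sorted copies of the lists. The SortedIntersect name and the choice of ordinal string comparison are assumptions for illustration:

```csharp
using System;
using System.Collections.Generic;

public static class SortedIntersect
{
    public static List<string> Intersect(List<string> first, List<string> second)
    {
        // Sort copies so that the callers' lists are left untouched.
        var a = new List<string>(first);
        var b = new List<string>(second);
        a.Sort(StringComparer.Ordinal);
        b.Sort(StringComparer.Ordinal);

        var result = new List<string>();
        int i = 0, j = 0;
        while (i < a.Count && j < b.Count)
        {
            int cmp = string.CompareOrdinal(a[i], b[j]);
            if (cmp < 0)
                i++;            // a[i] is smaller, so it cannot be in b
            else if (cmp > 0)
                j++;            // b[j] is smaller, so it cannot be in a
            else
            {
                // Common item; skip it if it was already reported.
                if (result.Count == 0 || result[result.Count - 1] != a[i])
                    result.Add(a[i]);
                i++;
                j++;
            }
        }
        return result;
    }
}
```

For TestList1 and TestList2 from the question this returns a list containing only "3". The sorting makes it O(n log n), whereas the hash-based variant avoids sorting altogether.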
Use the Intersect method:
IEnumerable<string> result = TestList1.Intersect(TestList2);
Using HashSet for fast lookup. Here is the solution:
using System;
using System.Linq;
using System.Collections.Generic;
public class Program
{
public static void Main()
{
List<int> list1 = new List<int> {1, 2, 3, 4, 5, 6 };
List<int> list2 = new List<int> {1, 2, 3 };
List<int> list3 = new List<int> {1, 2 };
var lists = new IEnumerable<int>[] {list1, list2, list3 };
var commons = GetCommonItems(lists);
Console.WriteLine("Common integers:");
foreach (var c in commons)
Console.WriteLine(c);
}
static IEnumerable<T> GetCommonItems<T>(IEnumerable<T>[] lists)
{
HashSet<T> hs = new HashSet<T>(lists.First());
for (int i = 1; i < lists.Length; i++)
hs.IntersectWith(lists[i]);
return hs;
}
}
Following the lead of @logicnp on counting the number of lists containing each member, once you have your list of lists, it's pretty much one line of code:
List<int> l1, l2, l3, cmn;
List<List<int>> all;
l1 = new List<int>() { 1, 2, 3, 4, 5 };
l2 = new List<int>() { 1, 2, 3, 4 };
l3 = new List<int>() { 1, 2, 3 };
all = new List<List<int>>() { l1, l2, l3 };
cmn = all.SelectMany(x => x).Distinct()
.Where(x => all.Select(y => (y.Contains(x) ? 1 : 0))
.Sum() == all.Count).ToList();
Or, if you prefer:
public static List<T> FindCommon<T>(IEnumerable<List<T>> Lists)
{
return Lists.SelectMany(x => x).Distinct()
.Where(x => Lists.Select(y => (y.Contains(x) ? 1 : 0))
.Sum() == Lists.Count()).ToList();
}
