C# - How to get complement of two lists without identifiers

I have two lists of the same type. That type does not have an identifier or any other guaranteed way to programmatically distinguish one instance from another.
List A: {1, 2, 2, 3, 5, 8, 8, 8}
List B: {1, 3, 5, 8}
I want the items from A that are not in B.
Desired Result: {2, 2, 8, 8}
If the types had identifiers, I could use a statement like the following...
var result = listA
.Where(a => listB.Where(b => b.Id == a.Id).Count() == 0)
.ToList();
So far, the only way I have found is a loop that adds each item once for every occurrence in list A beyond its number of occurrences in list B.
foreach (var val in listB.Select(b => b.val).Distinct())
{
var countA = listA.Where(a => a.val == val).Count();
var countB = listB.Where(b => b.val == val).Count();
var item = listA.Where(a => a.val == val).FirstOrDefault();
for (int i=0; i<countA-countB; i++)
result.Add(item);
}
Is there a cleaner way to achieve this?
EDIT:
Here is a simplified version of the object in the lists. It's coming from a Web service that's hitting another system.
public class myObject
{
public DateTime SomeDate { get; set; }
public decimal SomeNumber { get; set; }
public bool IsSomething { get; set; }
public string SomeString { get; set; }
}
The data I am receiving has the same values for SomeDate/SomeString and repeated values for SomeNumber and IsSomething. Two objects might have equal properties, but I need to treat them as distinct objects.

Try this:
var listA = new List<Int32> {1, 2, 2, 3, 5, 8, 8, 8};
var listB = new List<Int32> {1, 3, 5, 8};
var listResult = new List<Int32>(listA);
foreach(var itemB in listB)
{
listResult.Remove(itemB);
}
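For large lists, or for lists of objects rather than ints, a counting approach avoids the repeated linear scans that Remove performs. The following is only a minimal sketch and not part of the answer above: the names MultisetExtensions and ExceptWithCounts are made up for illustration, and it assumes the items are non-null and that the supplied comparer (or the type's default equality) decides what counts as "the same item".

```csharp
using System.Collections.Generic;

public static class MultisetExtensions
{
    // Hypothetical helper: returns the items of source that are left after
    // cancelling one occurrence for every item in toRemove.
    public static List<T> ExceptWithCounts<T>(
        IEnumerable<T> source,
        IEnumerable<T> toRemove,
        IEqualityComparer<T> comparer = null)
    {
        // How many occurrences of each item still have to be cancelled.
        var pending = new Dictionary<T, int>(comparer ?? EqualityComparer<T>.Default);
        foreach (var item in toRemove)
        {
            pending.TryGetValue(item, out var n); // n stays 0 when the key is new
            pending[item] = n + 1;
        }

        var result = new List<T>();
        foreach (var item in source)
        {
            if (pending.TryGetValue(item, out var n) && n > 0)
                pending[item] = n - 1; // cancel one occurrence
            else
                result.Add(item);      // nothing left to cancel against
        }
        return result;
    }
}
```

Calling MultisetExtensions.ExceptWithCounts(listA, listB) on the lists from the question gives {2, 2, 8, 8}. Both lists are walked once, so the cost grows with the combined length rather than with the product of the two lengths.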

What am I missing?
class Program
{
static void Main(string[] args)
{
List<int> a = new List<int>();
a.Add(1);
a.Add(2);
a.Add(2);
a.Add(3);
a.Add(5);
a.Add(8);
a.Add(8);
a.Add(8);
List<int> b = new List<int>();
b.Add(1);
b.Add(3);
b.Add(5);
b.Add(8);
foreach (int x in b)
a.Remove(x);
foreach (int x in a)
Console.WriteLine(x);
Console.ReadKey(false);
}
}

Are the objects the same instances in both lists? If so, you can use .Where(a => listB.Where(b => b == a).Count() == 0)
Or
.Where(a => !listB.Any(b => b == a))

You could sort both lists and then iterate through them both at the same time.
public IEnumerable<int> GetComplement(IEnumerable<int> a, IEnumerable<int> b)
{
var listA = a.ToList();
listA.Sort();
var listB = b.ToList();
listB.Sort();
int i=0,j=0;
while( i < listA.Count && j < listB.Count )
{
if(listA[i] > listB[j]) {j++;} // listB[j] has no match in A, so skip it
else if (listA[i] < listB[j]) {yield return listA[i]; i++; } // listA[i] has no match in B
else {i++;j++;} // equal values cancel one another
}
// whatever is left in A has no counterpart in B
while(i < listA.Count)
{
yield return listA[i];
i++;
}
}
I don't know if this is "cleaner", but it should be more performant on large sets of data.

This is a bit nasty but it does what you want. Not sure about performance though.
var a = new List<int> { 1, 2, 2, 3, 5, 8, 8, 8 };
var b = new List<int> { 1, 3, 5, 8 };
var c = from x in a.Distinct()
let a_count = a.Count(el => el == x)
let b_count = b.Count(el => el == x)
from val in Enumerable.Repeat(x, Math.Max(0, a_count - b_count)) // Repeat throws on a negative count
select val;

Why don't you implement your own equality comparer for your myObject:
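public class YourTypeEqualityComparer : IEqualityComparer<myObject>
{
// One possible definition: two objects are equal when all of their properties match.
public bool Equals(myObject x, myObject y)
{
return x.SomeDate == y.SomeDate && x.SomeNumber == y.SomeNumber
&& x.IsSomething == y.IsSomething && x.SomeString == y.SomeString;
}
public int GetHashCode(myObject obj)
{
return obj.SomeDate.GetHashCode() ^ obj.SomeNumber.GetHashCode();
}
}
and then use it like this:
var list1 = new List<myObject>();
var list2 = new List<myObject>();
var comparer = new YourTypeEqualityComparer();
list1.RemoveAll(i => list2.Contains(i, comparer));
Now list1 contains the result. Note that RemoveAll drops every item that has a match in list2, so repeated items are not cancelled one for one.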

Related

How to remove duplicate pairs in a List

I have a List with pairs of integers. How do I remove pairs if they're duplicates? Distinct won't work because the pair could be (2, 1) instead of (1, 2).
My list looks like this:
1, 2
2, 3
3, 1
3, 2
2, 4
4, 3
... I don't need (2, 3) and (3, 2)
I made a public struct FaceLine with public int A and B, then var faceline = new List<FaceLine>();.
I'm new to C# and lost.
You could use a custom IEqualityComparer<FaceLine>:
public class UnorderedFacelineComparer : IEqualityComparer<FaceLine>
{
public bool Equals(FaceLine x, FaceLine y)
{
int x1 = Math.Min(x.A, x.B);
int x2 = Math.Max(x.A, x.B);
int y1 = Math.Min(y.A, y.B);
int y2 = Math.Max(y.A, y.B);
return x1 == y1 && x2 == y2;
}
public int GetHashCode(FaceLine obj)
{
return obj.A ^ obj.B;
}
}
Then the query is very simple:
var comparer = new UnorderedFacelineComparer();
List<FaceLine> nonDupList = faceline
.GroupBy(fl => fl, comparer)
.Where(g => g.Count() == 1)
.Select(g => g.First())
.ToList();
If you want to keep one of the duplicates, just remove the Where:
List<FaceLine> nonDupList = faceline
.GroupBy(fl => fl, comparer)
.Select(g => g.First())
.ToList();
If you're happy using the common DistinctBy Linq extension (available via NuGet) you can do this fairly simply like so:
var result = list.DistinctBy(x => (x.A > x.B) ? (x.A, x.B) : (x.B, x.A));
Sample console app:
using System;
using System.Collections.Generic;
using MoreLinq;
namespace Demo
{
class Test
{
public Test(int a, int b)
{
A = a;
B = b;
}
public readonly int A;
public readonly int B;
public override string ToString()
{
return $"A={A}, B={B}";
}
}
class Program
{
static void Main()
{
var list = new List<Test>
{
new Test(1, 2),
new Test(2, 3),
new Test(3, 1),
new Test(3, 2),
new Test(2, 4),
new Test(4, 3)
};
var result = list.DistinctBy(x => (x.A > x.B) ? (x.A, x.B) : (x.B, x.A));
foreach (var item in result)
Console.WriteLine(item);
}
}
}
Using Linq :
List<List<int>> data = new List<List<int>>() {
new List<int>() {1, 2},
new List<int>() {2, 3},
new List<int>() {3, 1},
new List<int>() {3, 2},
new List<int>() {2, 4},
new List<int>() {4, 3}
};
List<List<int>> results =
data.Select(x => (x.First() < x.Last())
? new { first = x.First(), last = x.Last() }
: new { first = x.Last(), last = x.First() })
.GroupBy(x => x)
.Select(x => new List<int>() { x.First().first, x.First().last }).ToList();
Form a set of sets and you get the functionality for free (each smaller set contains exactly two integers).
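The set-of-sets idea can be sketched as follows. This is only an illustration under some assumptions: the pairs arrive as value tuples, and the names PairSets and ToSetOfSets are invented here. It relies on HashSet<int>.CreateSetComparer(), which compares the inner sets by their contents rather than by reference.

```csharp
using System.Collections.Generic;

public static class PairSets
{
    // Hypothetical helper: stores every pair as a small set, so (2, 3) and (3, 2)
    // become the same element of the outer set.
    public static HashSet<HashSet<int>> ToSetOfSets(IEnumerable<(int A, int B)> pairs)
    {
        // The set comparer makes the outer set compare inner sets by content.
        var sets = new HashSet<HashSet<int>>(HashSet<int>.CreateSetComparer());
        foreach (var (a, b) in pairs)
            sets.Add(new HashSet<int> { a, b });
        return sets;
    }
}
```

For the six pairs in the question this leaves five sets, because (2, 3) and (3, 2) collapse into one. A pair such as (2, 2) would become a one-element set, so this only fits if both numbers in a pair always differ.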

How to count list items grouped by an embedded list

I have a class like
class MyClass
{
public DateTime Date { get; set; }
public List<int> IdList { get; set; }
public MyClass(DateTime initDate)
{
Date = initDate;
IdList = new List<int>();
}
}
and need to count the number of entries in a List<MyClass>, grouped by each int in IdList.
I have experimented with various Linq constructs, but I cannot get anything to work. Here is what I have so far:
List<MyClass> myc = new List<MyClass>();
myc.Add(new MyClass(new DateTime(2016, 1, 1)) { IdList = new List<int> { 1, 2 } });
myc.Add(new MyClass(new DateTime(2016, 1, 2)) { IdList = new List<int> { 1, 3 } });
myc.Add(new MyClass(new DateTime(2016, 1, 3)) { IdList = new List<int> { 1, 4 } });
myc.Add(new MyClass(new DateTime(2016, 1, 4)) { IdList = new List<int> { 5, 6 } });
myc.Add(new MyClass(new DateTime(2016, 1, 5)) { IdList = new List<int> { 2, 3 } });
var grouped = from p in myc
group p by p.IdList into g
select new { Id = g.Key, Count = g.Count() };
foreach (var x in grouped)
{
Console.WriteLine("ID: {0}, Count: {1}", x.Id, x.Count);
}
// Expecting output like:
// ID: 1, Count: 3
// ID: 2, Count : 2
// etc.
If there was a single int Id property in MyClass, it would be straightforward, but I cannot work out how to use the List<int>. Is there any alternative to writing nested loops and populating a Dictionary? Thanks for any help.
You can use SelectMany
var grouped = myc.SelectMany(x => x.IdList).GroupBy(x => x);
foreach (var i in grouped)
{
Console.WriteLine(string.Format("Id: {0}, Count: {1}", i.Key,i.Count()));
}
This should give you the output you're looking for.
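If the counts are needed as a lookup table rather than printed directly, the same SelectMany and GroupBy chain can end in ToDictionary. The sketch below is an assumption-laden variant, not the answer's own code: IdCounting and CountEntriesPerId are made-up names, and the Distinct call is added so that an ID repeated inside a single IdList still counts that entry only once.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class IdCounting
{
    // Hypothetical helper: maps each ID to the number of lists that contain it.
    public static Dictionary<int, int> CountEntriesPerId(IEnumerable<IEnumerable<int>> idLists)
    {
        return idLists
            .SelectMany(ids => ids.Distinct()) // flatten, counting each ID once per list
            .GroupBy(id => id)
            .ToDictionary(g => g.Key, g => g.Count());
    }
}
```

With the data from the question it could be called as IdCounting.CountEntriesPerId(myc.Select(m => m.IdList)), which maps 1 to 3, 2 and 3 to 2, and 4, 5 and 6 to 1.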
I don't know if I've understood your requirement correctly, but try this and let me know:
var groupedIds = myc.SelectMany(x => x.IdList.Select(i => i))
.GroupBy(x => x)
.ToList();
A full fiddle was linked here, along with the SelectMany documentation, which explains what this code does.
Hope this helps!

linq ordered subset of another list

There are lots and lots of questions on SO about finding if one list is the subset of another list.
i.e. bool isSubset = !t2.Except(t1).Any();
I can't seem to find one that accounts for order
as in given a sequence:
1,1,2,5,8,1,9,1,2
The subsequences...
2,5,8,1,9 true
1,2,5,8,1 true
5,2,1 false
1,2,5,1,8 false
1,1,2 true
1,1,1,2 false
A list in which the order is significant is a generalisation of the concept of string. Therefore you want to use a substring-finding algorithm.
There are several possibilities, but Knuth–Morris–Pratt is a good choice. It has some initial Θ(m) overhead where m is the length of the sublist sought, and then finds in Θ(n) where n is the distance to the sublist sought, or the length of the whole list if it isn't there. This beats the simple item-by-item compare which is Θ((n-m+1) m):
public static class ListSearching
{
public static bool Contains<T>(this IList<T> haystack, IList<T> needle)
{
return Contains(haystack, needle, null);
}
public static bool Contains<T>(this IList<T> haystack, IList<T> needle, IEqualityComparer<T> cmp)
{
return haystack.IndexOf(needle, cmp) != -1;
}
public static int IndexOf<T>(this IList<T> haystack, IList<T> needle)
{
return IndexOf(haystack, needle, null);
}
public static int IndexOf<T>(this IList<T> haystack, IList<T> needle, IEqualityComparer<T> cmp)
{
if(haystack == null || needle == null)
throw new ArgumentNullException();
int needleCount = needle.Count;
if(needleCount == 0)
return 0;//empty lists are everywhere!
if(cmp == null)
cmp = EqualityComparer<T>.Default;
int count = haystack.Count;
if(needleCount == 1)//can't beat just spinning through for it
{
T item = needle[0];
for(int idx = 0; idx != count; ++idx)
if(cmp.Equals(haystack[idx], item))
return idx;
return -1;
}
int m = 0;
int i = 0;
int[] table = KMPTable(needle, cmp);
while(m + i < count)
{
if(cmp.Equals(needle[i], haystack[m + i]))
{
if(i == needleCount - 1)
return m;//match starts at index m (-1, returned when nothing is found, is the .NET convention)
++i;
}
else
{
m = m + i - table[i];
i = table[i] > -1 ? table[i] : 0;
}
}
return -1;
}
private static int[] KMPTable<T>(IList<T> sought, IEqualityComparer<T> cmp)
{
int[] table = new int[sought.Count];
int pos = 2;
int cnd = 0;
table[0] = -1;
table[1] = 0;
while(pos < table.Length)
if(cmp.Equals(sought[pos - 1], sought[cnd]))
table[pos++] = ++cnd;
else if(cnd > 0)
cnd = table[cnd];
else
table[pos++] = 0;
return table;
}
}
Testing this:
var list = new[]{ 1, 1, 2, 5, 8, 1, 9, 1, 2 };
Console.WriteLine(list.Contains(new[]{2,5,8,1,9})); // True
Console.WriteLine(list.Contains(new[]{1,2,5,8,1})); // True
Console.WriteLine(list.Contains(new[]{5,2,1})); // False
Console.WriteLine(list.Contains(new[]{1,2,5,1,8})); // False
Console.WriteLine(list.Contains(new[]{1,1,2})); // True
Console.WriteLine(list.Contains(new[]{1,1,1,2})); // False
Unfortunately there is no such function in .NET. You need the Knuth–Morris–Pratt algorithm for it. Someone has already implemented it as a LINQ extension: https://code.google.com/p/linq-extensions/
This works for me:
var source = new [] { 1,1,2,5,8,1,9,1,2 };
Func<int[], int[], bool> contains =
(xs, ys) =>
Enumerable
.Range(0, xs.Length)
.Where(n => xs.Skip(n).Take(ys.Length).SequenceEqual(ys))
.Any();
Console.WriteLine(contains(source, new [] { 2,5,8,1,9 })); // true
Console.WriteLine(contains(source, new [] { 1,2,5,8,1 })); // true
Console.WriteLine(contains(source, new [] { 5,2,1 })); // false
Console.WriteLine(contains(source, new [] { 1,2,5,1,8 })); // false
Console.WriteLine(contains(source, new [] { 1,1,2 })); // true
Console.WriteLine(contains(source, new [] { 1,1,1,2 })); // false
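There is a workaround to the limitation: you can join each enumerable into a string and then use the string Contains method. Be aware that a plain text match can give false positives: {11, 2} joins to "11,2", which contains "1,2".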
var t1 = new List<int> {1, 1, 2, 5, 8, 1, 9, 1, 2};
var t2 = new List<int> {2,5,8,1,9};
var t3 = new List<int> {5,2,1};
var t1Str = String.Join(",", t1);
t1Str.Contains(String.Join(",", t2));//true
t1Str.Contains(String.Join(",", t3));//false
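One way to make the string workaround stricter is to put a delimiter at both ends of each joined string, so that a match can only start and end on element boundaries. This is a rough sketch, not the answer's code: JoinedSearch and ContainsRun are invented names, it is written for ints only, and it does not treat an empty needle specially.

```csharp
using System.Collections.Generic;

public static class JoinedSearch
{
    // Hypothetical helper: true when needle occurs in haystack as a contiguous run.
    public static bool ContainsRun(IEnumerable<int> haystack, IEnumerable<int> needle)
    {
        // Leading and trailing commas stop "1,2" from matching inside "11,2".
        string h = "," + string.Join(",", haystack) + ",";
        string n = "," + string.Join(",", needle) + ",";
        return h.Contains(n);
    }
}
```

With this, JoinedSearch.ContainsRun(t1, t2) is true and JoinedSearch.ContainsRun(t1, t3) is false for the lists above, while {11, 2} is no longer reported as containing {1, 2}.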
You can build your own extension, I wrote a simple IsSubset method:
Console App for testing:
class Program
{
static void Main(string[] args)
{
var list = new List<int> { 1, 3, 5, 2, 4, 6 };
var subList = new List<int> { 3, 5};
var subList2 = new List<int> { 1, 4 };
bool isSublist1 = subList.IsSubset(list);
bool isSublist2 = subList2.IsSubset(list);
Console.WriteLine(isSublist1 + "; " + isSublist2);
/* True; False */
Console.ReadKey();
}
}
IEnumerable Extension:
public static class IEnumerableExtensions
{
public static bool IsSubset<T>(this IEnumerable<T> subsetEnumerable, IEnumerable<T> enumerable)
{
var found = false;
var list = enumerable as IList<T> ?? enumerable.ToList();
var listCount = list.Count();
var subsetList = subsetEnumerable as IList<T> ?? subsetEnumerable.ToList();
var posListCount = subsetList.Count();
/* If the SubList is bigger, it can't be a sublist */
if (listCount < posListCount) {
return false;
}
/* find all indexes of the first item of the sublist in the list */
var firstElement = subsetList.First();
var indexes = new List<int>();
var index = 0;
foreach (var elem in list)
{
if (elem.Equals(firstElement))
{
indexes.Add(index);
}
index++;
}
/* check all first item founds for the subsequence */
foreach (var i in indexes)
{
int x;
for (x = 0; x < posListCount && (i + x) < listCount; x++)
{
if (!Equals(subsetList[x], list[(i + x)]))
break;
}
/* a full match ends the search, so a later failed attempt cannot overwrite it */
found = x == posListCount;
if (found)
break;
}
return found;
}
}
Maybe a join can get you what you want. A join returns the matching records: if the record count is greater than 0 then there is a match, otherwise there is none.
Below I have explained through a sample code:
class Program
{
static void Main(string[] args)
{
List<Employee> empList = new List<Employee>
{
new Employee{EmpID = 1},
new Employee{EmpID = 1},
new Employee{EmpID = 2},
new Employee{EmpID = 5},
new Employee{EmpID = 8},
new Employee{EmpID = 1},
new Employee{EmpID = 9},
new Employee{EmpID = 1},
new Employee{EmpID = 2}
};
List<Manager> mgrList = new List<Manager>
{
new Manager{ManagerID = 7},
new Manager{ManagerID = 3},
new Manager{ManagerID = 6}
};
var result = (from emp in empList
join mgr in mgrList on emp.EmpID equals mgr.ManagerID
select new { emp.EmpID}).Count();
Console.WriteLine(result);
Console.ReadKey();
}
}
public class Employee
{
public int EmpID { get; set; }
}
public class Manager
{
public int ManagerID { get; set; }
}

Is there a more concise way to express this foreach loop?

I have 2 lists that I need to consolidate. List 1 has only the dates, and List 2 may have the time element as well:
var List1 = new[] {
new ListType{ val = new DateTime(2012, 1, 1)},
new ListType{ val = new DateTime(2012, 1, 2)}
};
List2 = new[] { new ListType{ val = new DateTime(2012, 1, 1, 5, 0, 0)} };
FinalList = new[] {
new ListType{ val = new DateTime(2012, 1, 1, 5, 0, 0)},
new ListType{ val = new DateTime(2012, 1, 2)}
};
The way I'm going about this is:
foreach (var l in List1) {
var match = List2.FirstOrDefault(q => q.val.Date == l.val);
if (match == null) continue;
l.val = match.val;
}
Is there a better way than iterating through List1, using FirstOrDefault and then reassigning the val? It works, so this is just more a curiosity if Linq has a more elegant way (i.e. I am missing something obvious).
Thanks
You can use Enumerable.Union with a custom IEqualityComparer<ListType>:
class ListType
{
public DateTime val { get; set; }
public class DateComparer : IEqualityComparer<ListType>
{
public bool Equals(ListType x, ListType y)
{
if (ReferenceEquals(x, y))
return true;
else if (x == null || y == null)
return false;
return x.val.Date == y.val.Date;
}
public int GetHashCode(ListType obj)
{
return obj.val.Date.GetHashCode();
}
}
}
and then ...
var finalList = List2.Union(List1, new ListType.DateComparer());
I wouldn't get rid of the loop, but for efficiency I'd build up a dictionary mapping dates to the first matching time:
var dateToTime = List2
.GroupBy(d => d.val.Date)
.ToDictionary(g => g.Key, g => g.First().val);
foreach (var l in List1)
{
DateTime match;
if (dateToTime.TryGetValue(l.val, out match))
l.val = match;
}
LINQ is made for querying items rather than updating items - if you need to update items, use something non-LINQ like a foreach loop. That said, if you want to generate a new list from the items in the first list, the following is the equivalent of your code:
var newList = List1.Select(l => new ListType { val =
dateToTime.ContainsKey(l.val) ? dateToTime[l.val] : l.val }).ToList();

Selecting unique elements from a List in C#

How do I select the unique elements from the list {0, 1, 2, 2, 2, 3, 4, 4, 5} so that I get {0, 1, 3, 5}, effectively removing all instances of the repeated elements {2, 4}?
var numbers = new[] { 0, 1, 2, 2, 2, 3, 4, 4, 5 };
var uniqueNumbers =
from n in numbers
group n by n into nGroup
where nGroup.Count() == 1
select nGroup.Key;
// { 0, 1, 3, 5 }
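var nums = new[] { 0, 1, 2, 2, 2, 3, 4, 4, 5 };
var distinct = nums.Distinct();
Make sure you're using LINQ, which requires .NET Framework 3.5 or later. Note that Distinct keeps one copy of each value ({0, 1, 2, 3, 4, 5}) rather than dropping the repeated ones.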
With lambda..
var all = new[] {0,1,1,2,3,4,4,4,5,6,7,8,8}.ToList();
var unique = all.GroupBy(i => i).Where(i => i.Count() == 1).Select(i=>i.Key);
C# 2.0 solution:
static IEnumerable<T> GetUniques<T>(IEnumerable<T> things)
{
Dictionary<T, int> counts = new Dictionary<T, int>();
foreach (T item in things)
{
int count;
if (counts.TryGetValue(item, out count))
counts[item] = ++count;
else
counts.Add(item, 1);
}
foreach (KeyValuePair<T, int> kvp in counts)
{
if (kvp.Value == 1)
yield return kvp.Key;
}
}
Here is another way that works if you have complex type objects in your List and want to get the unique values of a property:
var uniqueValues= myItems.Select(k => k.MyProperty)
.GroupBy(g => g)
.Where(c => c.Count() == 1)
.Select(k => k.Key)
.ToList();
Or to get distinct values:
var distinctValues = myItems.Select(p => p.MyProperty)
.Distinct()
.ToList();
If your property is also a complex type you can pass a custom comparer to Distinct(), such as Distinct(new OrderComparer()), where OrderComparer could look like:
public class OrderComparer : IEqualityComparer<Order>
{
public bool Equals(Order o1, Order o2)
{
return o1.OrderID == o2.OrderID;
}
public int GetHashCode(Order obj)
{
return obj.OrderID.GetHashCode();
}
}
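To show how such a comparer plugs into Distinct, here is a small self-contained sketch. The Order class and its Customer property are assumptions made up for the example (the answer does not show the type), and OrderDemo.DistinctOrders is a hypothetical helper name.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed shape of the Order type; only OrderID matters to the comparer.
public class Order
{
    public int OrderID { get; set; }
    public string Customer { get; set; }
}

public class OrderComparer : IEqualityComparer<Order>
{
    public bool Equals(Order o1, Order o2)
    {
        return o1.OrderID == o2.OrderID;
    }

    public int GetHashCode(Order obj)
    {
        return obj.OrderID.GetHashCode();
    }
}

public static class OrderDemo
{
    // Keeps one order per OrderID, as decided by OrderComparer.
    public static List<Order> DistinctOrders(IEnumerable<Order> orders)
    {
        return orders.Distinct(new OrderComparer()).ToList();
    }
}
```

Two orders with the same OrderID but different Customer values are treated as duplicates here, so the comparer should only look at the properties that really define identity for your data.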
If LINQ isn't available to you because you have to support legacy code that can't be upgraded, then declare a Dictionary<int, int>, where the first int is the number and the second int is the number of occurrences. Loop through your List, loading up your Dictionary. When you're done, loop through your Dictionary selecting only those elements where the number of occurrences is 1.
I believe Matt meant to say:
static IEnumerable<T> GetUniques<T>(IEnumerable<T> things)
{
Dictionary<T, bool> uniques = new Dictionary<T, bool>();
foreach (T item in things)
{
if (!(uniques.ContainsKey(item)))
{
uniques.Add(item, true);
}
}
return uniques.Keys;
}
There are many ways to skin a cat, but HashSet seems made for the task here.
var numbers = new[] { 0, 1, 2, 2, 2, 3, 4, 4, 5 };
HashSet<int> r = new HashSet<int>(numbers);
foreach( int i in r ) {
Console.Write( "{0} ", i );
}
The output:
0 1 2 3 4 5
Here's a solution with no LINQ:
var numbers = new[] { 0, 1, 2, 2, 2, 3, 4, 4, 5 };
// This assumes the numbers are sorted
var noRepeats = new List<int>();
int temp = numbers[0]; // Or .First() if using IEnumerable
var count = 1;
for(int i = 1; i < numbers.Length; i++) // Or foreach (var n in numbers.Skip(1)) if using IEnumerable
{
if (numbers[i] == temp) count++;
else
{
if(count == 1) noRepeats.Add(temp);
temp = numbers[i];
count = 1;
}
}
if(count == 1) noRepeats.Add(temp);
Console.WriteLine($"[{string.Join(separator: ",", values: numbers)}] -> [{string.Join(separator: ",", values: noRepeats)}]");
This prints:
[0,1,2,2,2,3,4,4,5] -> [0,1,3,5]
In .NET 2.0 I'm pretty sure about this solution:
public IEnumerable<T> Distinct<T>(IEnumerable<T> source)
{
List<T> uniques = new List<T>();
foreach (T item in source)
{
if (!uniques.Contains(item)) uniques.Add(item);
}
return uniques;
}
