Selecting unique elements from a List in C# - c#

How do I select the unique elements from the list {0, 1, 2, 2, 2, 3, 4, 4, 5} so that I get {0, 1, 3, 5}, effectively removing all instances of the repeated elements {2, 4}?

var numbers = new[] { 0, 1, 2, 2, 2, 3, 4, 4, 5 };
var uniqueNumbers =
from n in numbers
group n by n into nGroup
where nGroup.Count() == 1
select nGroup.Key;
// { 0, 1, 3, 5 }

var nums = new int{ 0...4,4,5};
var distinct = nums.Distinct();
make sure you're using Linq and .NET framework 3.5.

With lambda..
var all = new[] {0,1,1,2,3,4,4,4,5,6,7,8,8}.ToList();
var unique = all.GroupBy(i => i).Where(i => i.Count() == 1).Select(i=>i.Key);

C# 2.0 solution:
static IEnumerable<T> GetUniques<T>(IEnumerable<T> things)
{
Dictionary<T, int> counts = new Dictionary<T, int>();
foreach (T item in things)
{
int count;
if (counts.TryGetValue(item, out count))
counts[item] = ++count;
else
counts.Add(item, 1);
}
foreach (KeyValuePair<T, int> kvp in counts)
{
if (kvp.Value == 1)
yield return kvp.Key;
}
}

Here is another way that works if you have complex type objects in your List and want to get the unique values of a property:
var uniqueValues= myItems.Select(k => k.MyProperty)
.GroupBy(g => g)
.Where(c => c.Count() == 1)
.Select(k => k.Key)
.ToList();
Or to get distinct values:
var distinctValues = myItems.Select(p => p.MyProperty)
.Distinct()
.ToList();
If your property is also a complex type you can create a custom comparer for the Distinct(), such as Distinct(OrderComparer), where OrderComparer could look like:
public class OrderComparer : IEqualityComparer<Order>
{
public bool Equals(Order o1, Order o2)
{
return o1.OrderID == o2.OrderID;
}
public int GetHashCode(Order obj)
{
return obj.OrderID.GetHashCode();
}
}

If Linq isn't available to you because you have to support legacy code that can't be upgraded, then declare a Dictionary, where the first int is the number and the second int is the number of occurences. Loop through your List, loading up your Dictionary. When you're done, loop through your Dictionary selecting only those elements where the number of occurences is 1.

I believe Matt meant to say:
static IEnumerable<T> GetUniques<T>(IEnumerable<T> things)
{
Dictionary<T, bool> uniques = new Dictionary<T, bool>();
foreach (T item in things)
{
if (!(uniques.ContainsKey(item)))
{
uniques.Add(item, true);
}
}
return uniques.Keys;
}

There are many ways to skin a cat, but HashSet seems made for the task here.
var numbers = new[] { 0, 1, 2, 2, 2, 3, 4, 4, 5 };
HashSet<int> r = new HashSet<int>(numbers);
foreach( int i in r ) {
Console.Write( "{0} ", i );
}
The output:
0 1 2 3 4 5

Here's a solution with no LINQ:
var numbers = new[] { 0, 1, 2, 2, 2, 3, 4, 4, 5 };
// This assumes the numbers are sorted
var noRepeats = new List<int>();
int temp = numbers[0]; // Or .First() if using IEnumerable
var count = 1;
for(int i = 1; i < numbers.Length; i++) // Or foreach (var n in numbers.Skip(1)) if using IEnumerable
{
if (numbers[i] == temp) count++;
else
{
if(count == 1) noRepeats.Add(temp);
temp = numbers[i];
count = 1;
}
}
if(count == 1) noRepeats.Add(temp);
Console.WriteLine($"[{string.Join(separator: ",", values: numbers)}] -> [{string.Join(separator: ",", values: noRepeats)}]");
This prints:
[0,1,2,2,2,3,4,4,5] -> [0,1,3,5]

In .Net 2.0 I`m pretty sure about this solution:
public IEnumerable<T> Distinct<T>(IEnumerable<T> source)
{
List<T> uniques = new List<T>();
foreach (T item in source)
{
if (!uniques.Contains(item)) uniques.Add(item);
}
return uniques;
}

Related

Move list elements meeting condition to the top of the list

I want to move specific number to the top of this list.
int numberToBeMovedOnTop = 4;
List<int> lst = new List<int>(){1, 2, 3, 4, 5, 5, 4, 7, 9, 4, 2, 1};
List<int> lstOdd = lst.FindAll(l => l == numberToBeMovedOnTop);
lstOdd.AddRange(lst.FindAll(l => l != numberToBeMovedOnTop));
Where numberToBeMovedOnTop is a variable.
This gives me the desired result but is a better solution for this? I can iterate the list once and swap first occurence of numberToBeMovedOnTop with first element, second occurence with numberToBeMovedOnTop with second element and so on. But can this be done with some built-in C# function without iterating the list twice?
You could use LINQ:
List<int> lstOdd = lst.OrderByDescending(i => i == numberToBeMovedOnTop).ToList();
Why OrderByDescending? Because the comparison returns a bool and true is higher than false. You could also use:
List<int> lstOdd = lst.OrderBy(i => i == numberToBeMovedOnTop ? 0 : 1).ToList();
Note that this works because OrderBy and OrderByDescending are performing a stable sort. That means that the original order remains for all equal items.
For what it's worth, here is an extension method that works with any type and predicate and is a little bit more efficient:
public static List<T> PrependAll<T>(this List<T> list, Func<T, bool> predicate)
{
var returnList = new List<T>();
var listNonMatch = new List<T>();
foreach (T item in list)
{
if (predicate(item))
returnList.Add(item);
else
listNonMatch.Add(item);
}
returnList.AddRange(listNonMatch);
return returnList;
}
Usage: List<int> lstOdd = lst.PrependAll(i => i == numberToBeMovedOnTop);
Aside from using linq, it might be just as efficient/understandable to do this without linq
var listToAdd = new List<int>();
var listOdd = new List<int>();
for(int i = 0; i < lst.Count; i++)
{
if(lst[i] == numberToBeMovedOnTop)
{
listToAdd.Add(numberToBeMovedOnTop);
}
else
{
listOdd.Add(lst[i]);
}
}
listOdd.AddRange(listToAdd);
Keep track of those that you've removed, then add them on afterwards
Group by the predicate, then union?
var nums = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
var grp = nums.GroupBy(x => x % 2 == 0).ToList();
var changed = grp[0].Union(grp[1]).ToList();

Display match for only 3 consecutive numbers

How can I display only 3 consecutive numbers for example in my code below I only want it to return 4, as that appears 3 times.
9 is 4 times so do not want that and 7 is twice so not what want that.
The code I currently have display 9
int[] intArray = { 9, 9, 9, 9, 6, 4, 4, 4, 7, 7 };
var adjacentDuplicateNumbers = intArray
.Skip(1)
.Where((value, index) => value == intArray[index])
.Distinct();
var enumerable = adjacentDuplicateNumbers as int[] ?? adjacentDuplicateNumbers.ToArray();
if (enumerable.Any())
{
Console.WriteLine("{0} is a consecutive number and is repeated 3 times.", enumerable.First());
}
else
{
Console.WriteLine("no consecutive number found.");
}
Using the extension method of this post: LINQ to find series of consecutive numbers
public static IEnumerable<IEnumerable<T>> GroupWhile<T>(this IEnumerable<T> seq, Func<T, T, bool> condition)
{
T prev = seq.First();
List<T> list = new List<T>() { prev };
foreach (T item in seq.Skip(1))
{
if (condition(prev, item) == false)
{
yield return list;
list = new List<T>();
}
list.Add(item);
prev = item;
}
yield return list;
}
Usage:
var res = intArray.GroupWhile((a, b) => a == b).
Where(x => x.Count() == 3).Select(x => x.First());
Sometimes a simple foor loop is enough (and should be faster than linq)
int[] intArray = { 9, 9, 9, 9, 6, 4, 4, 4, 7, 7 };
var minus2 = intArray[0];
var minus1 = intArray[1];
var result = new List<int>();
for(int i = 2; i < intArray.Length; i++)
{
var current = intArray[i];
if(minus2 == minus1 && minus1 == current)
{
result.Add(current);
}
minus2 = minus1;
minus1 = current;
}
var results = intArray.Distinct()
.ToDictionary(k => k, v => intArray.Count(x => x == v))
.Where(x => x.Value == 3)
.Select(x => x.Key);
Take the district elements in the array. Use these as keys in a dictionary that map to the number of occurrences of this key in the original array. Use Where to only select pairs that match the required count (3). Use Select to return the resulting keys - in this example only 4.

The union of the intersects of the 2 set combinations of a sequence of sequences

How can I find the set of items that occur in 2 or more sequences in a sequence of sequences?
In other words, I want the distinct values that occur in at least 2 of the passed in sequences.
Note:
This is not the intersect of all sequences but rather, the union of the intersect of all pairs of sequences.
Note 2:
The does not include the pair, or 2 combination, of a sequence with itself. That would be silly.
I have made an attempt myself,
public static IEnumerable<T> UnionOfIntersects<T>(
this IEnumerable<IEnumerable<T>> source)
{
var pairs =
from s1 in source
from s2 in source
select new { s1 , s2 };
var intersects = pairs
.Where(p => p.s1 != p.s2)
.Select(p => p.s1.Intersect(p.s2));
return intersects.SelectMany(i => i).Distinct();
}
but I'm concerned that this might be sub-optimal, I think it includes intersects of pair A, B and pair B, A which seems inefficient. I also think there might be a more efficient way to compound the sets as they are iterated.
I include some example input and output below:
{ { 1, 1, 2, 3, 4, 5, 7 }, { 5, 6, 7 }, { 2, 6, 7, 9 } , { 4 } }
returns
{ 2, 4, 5, 6, 7 }
and
{ { 1, 2, 3} } or { {} } or { }
returns
{ }
I'm looking for the best combination of readability and potential performance.
EDIT
I've performed some initial testing of the current answers, my code is here. Output below.
Original valid:True
DoomerOneLine valid:True
DoomerSqlLike valid:True
Svinja valid:True
Adricadar valid:True
Schmelter valid:True
Original 100000 iterations in 82ms
DoomerOneLine 100000 iterations in 58ms
DoomerSqlLike 100000 iterations in 82ms
Svinja 100000 iterations in 1039ms
Adricadar 100000 iterations in 879ms
Schmelter 100000 iterations in 9ms
At the moment, it looks as if Tim Schmelter's answer performs better by at least an order of magnitude.
// init sequences
var sequences = new int[][]
{
new int[] { 1, 2, 3, 4, 5, 7 },
new int[] { 5, 6, 7 },
new int[] { 2, 6, 7, 9 },
new int[] { 4 }
};
One-line way:
var result = sequences
.SelectMany(e => e.Distinct())
.GroupBy(e => e)
.Where(e => e.Count() > 1)
.Select(e => e.Key);
// result is { 2 4 5 7 6 }
Sql-like way (with ordering):
var result = (
from e in sequences.SelectMany(e => e.Distinct())
group e by e into g
where g.Count() > 1
orderby g.Key
select g.Key);
// result is { 2 4 5 6 7 }
May be fastest code (but not readable), complexity O(N):
var dic = new Dictionary<int, int>();
var subHash = new HashSet<int>();
int length = array.Length;
for (int i = 0; i < length; i++)
{
subHash.Clear();
int subLength = array[i].Length;
for (int j = 0; j < subLength; j++)
{
int n = array[i][j];
if (!subHash.Contains(n))
{
int counter;
if (dic.TryGetValue(n, out counter))
{
// duplicate
dic[n] = counter + 1;
}
else
{
// first occurance
dic[n] = 1;
}
}
else
{
// exclude duplucate in sub array
subHash.Add(n);
}
}
}
This should be very close to optimal - how "readable" it is depends on your taste. In my opinion it is also the most readable solution.
var seenElements = new HashSet<T>();
var repeatedElements = new HashSet<T>();
foreach (var list in source)
{
foreach (var element in list.Distinct())
{
if (seenElements.Contains(element))
{
repeatedElements.Add(element);
}
else
{
seenElements.Add(element);
}
}
}
return repeatedElements;
You can skip already Intesected sequences, this way will be a little faster.
public static IEnumerable<T> UnionOfIntersects<T>(this IEnumerable<IEnumerable<T>> source)
{
var result = new List<T>();
var sequences = source.ToList();
for (int sequenceIdx = 0; sequenceIdx < sequences.Count(); sequenceIdx++)
{
var sequence = sequences[sequenceIdx];
for (int targetSequenceIdx = sequenceIdx + 1; targetSequenceIdx < sequences.Count; targetSequenceIdx++)
{
var targetSequence = sequences[targetSequenceIdx];
var intersections = sequence.Intersect(targetSequence);
result.AddRange(intersections);
}
}
return result.Distinct();
}
How it works?
Input: {/*0*/ { 1, 2, 3, 4, 5, 7 } ,/*1*/ { 5, 6, 7 },/*2*/ { 2, 6, 7, 9 } , /*3*/{ 4 } }
Step 0: Intersect 0 with 1..3
Step 1: Intersect 1 with 2..3 (0 with 1 already has been intersected)
Step 2: Intersect 2 with 3 (0 with 2 and 1 with 2 already has been intersected)
Return: Distinct elements.
Result: { 2, 4, 5, 6, 7 }
You can test it with the below code
var lists = new List<List<int>>
{
new List<int> {1, 2, 3, 4, 5, 7},
new List<int> {5, 6, 7},
new List<int> {2, 6, 7, 9},
new List<int> {4 }
};
var result = lists.UnionOfIntersects();
You can try this approach, it might be more efficient and also allows to specify the minimum intersection-count and the comparer used:
public static IEnumerable<T> UnionOfIntersects<T>(this IEnumerable<IEnumerable<T>> source
, int minIntersectionCount
, IEqualityComparer<T> comparer = null)
{
if (comparer == null) comparer = EqualityComparer<T>.Default;
foreach (T item in source.SelectMany(s => s).Distinct(comparer))
{
int containedInHowManySequences = 0;
foreach (IEnumerable<T> seq in source)
{
bool contained = seq.Contains(item, comparer);
if (contained) containedInHowManySequences++;
if (containedInHowManySequences == minIntersectionCount)
{
yield return item;
break;
}
}
}
}
Some explaining words:
It enumerates all unique items in all sequences. Since Distinct is using a set this should be pretty efficient. That can help to speed up in case of many duplicates in all sequences.
The inner loop just looks into every sequence if the unique item is contained. Thefore it uses Enumerable.Contains which stops execution as soon as one item was found(so duplicates are no issue).
If the intersection-count reaches the minum intersection count this item is yielded and the next (unique) item is checked.
That should nail it:
int[][] test = { new int[] { 1, 2, 3, 4, 5, 7 }, new int[] { 5, 6, 7 }, new int[] { 2, 6, 7, 9 }, new int[] { 4 } };
var result = test.SelectMany(a => a.Distinct()).GroupBy(x => x).Where(g => g.Count() > 1).Select(y => y.Key).ToList();
First you make sure, there are no duplicates in each sequence. Then you join all sequences to a single sequence and look for duplicates as e.g. here.

How do I group by List<int> in a Dictionary<string, List<int>> where the lists are identical?

I want to take a Dictionary<string, List<int>> and then create groups for all duplicate Lists in the dictionary.
Dictionary<string, List<int>> AllLists = new Dictionary<string, List<int>>()
{
{"one", new List<int>() {1, 2, 3, 4, 5}},
{"two", new List<int>() {1, 2, 3, 4, 5}},
{"three", new List<int>() {1, 1, 1, 1, 1}}
};
var ListGroups = AllLists.GroupBy(p => p.Value);
This should group the dictionary indexes with matching lists into their own groups, but it just creates a group for every index in the dictionary. What am I doing wrong?
That's going to use a reference comparison on your List<int> objects. Since both List<int> containing [1, 2, 3, 4, 5] are separately instantiated, they will have distinct references.
Try the following, for example:
var ListGroups = AllLists.GroupBy(p => string.Join(",", p.Value));
This will group by string representations of your lists. Note that this is probably not what you want to do and is purely demonstrative.
You can use this overload of the GroupBy method to pass in a custom IEqualityComparer<List<int>> that actually looks at the contents of the lists using Enumerable.SequenceEqual.
Here's the IEqualityComparer<IEnumerable<T>>:
class IEnumerableEqualityComparer<T> : IEqualityComparer<IEnumerable<T>>
{
public bool Equals(IEnumerable<T> a, IEnumerable<T> b)
{
return Enumerable.SequenceEqual(a, b);
}
public int GetHashCode(IEnumerable<T> source)
{
if (source == null)
{
return 0;
}
int shift = 0;
int result = 1;
foreach (var item in source)
{
int hash = item != null ? item.GetHashCode() : 17;
result ^= (hash << shift)
| (hash >> (32 - shift))
& (1 << shift - 1);
shift = (shift + 1) % 32;
}
return result;
}
}
And here's how you use it:
var ListGroups = AllLists.GroupBy(p => p.Value,
new IEnumerableEqualityComparer<int>());
Note that because IEqualityComparer<T> is contravariant in T you can use the above one for List<int> because that implements IEnumerable<int>.

Split array with LINQ

Assuming I have a list
var listOfInt = new List<int> {1, 2, 3, 4, 7, 8, 12, 13, 14}
How can I use LINQ to obtain a list of lists as follows:
{{1, 2, 3, 4}, {7, 8}, {12, 13, 14}}
So, i have to take the consecutive values and group them into lists.
You can create extension method (I omitted source check here) which will iterate source and create groups of consecutive items. If next item in source is not consecutive, then current group is yielded:
public static IEnumerable<List<int>> ToConsecutiveGroups(
this IEnumerable<int> source)
{
using (var iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
{
yield break;
}
else
{
int current = iterator.Current;
List<int> group = new List<int> { current };
while (iterator.MoveNext())
{
int next = iterator.Current;
if (next < current || current + 1 < next)
{
yield return group;
group = new List<int>();
}
current = next;
group.Add(current);
}
if (group.Any())
yield return group;
}
}
}
Usage is simple:
var listOfInt = new List<int> { 1, 2, 3, 4, 7, 8, 12, 13, 14 };
var groups = listOfInt.ToConsecutiveGroups();
Result:
[
[ 1, 2, 3, 4 ],
[ 7, 8 ],
[ 12, 13, 14 ]
]
UPDATE: Here is generic version of this extension method, which accepts predicate for verifying if two values should be considered consecutive:
public static IEnumerable<List<T>> ToConsecutiveGroups<T>(
this IEnumerable<T> source, Func<T,T, bool> isConsequtive)
{
using (var iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
{
yield break;
}
else
{
T current = iterator.Current;
List<T> group = new List<T> { current };
while (iterator.MoveNext())
{
T next = iterator.Current;
if (!isConsequtive(current, next))
{
yield return group;
group = new List<T>();
}
current = next;
group.Add(current);
}
if (group.Any())
yield return group;
}
}
}
Usage is simple:
var result = listOfInt.ToConsecutiveGroups((x,y) => (x == y) || (x == y - 1));
This works for both sorted and unsorted lists:
var listOfInt = new List<int> { 1, 2, 3, 4, 7, 8, 12, 13 };
int index = 0;
var result = listOfInt.Zip(listOfInt
.Concat(listOfInt.Reverse<int>().Take(1))
.Skip(1),
(v1, v2) =>
new
{
V = v1,
G = (v2 - v1) != 1 ? index++ : index
})
.GroupBy(x => x.G, x => x.V, (k, l) => l.ToList())
.ToList();
External index is building an index of consecutive groups that have value difference of 1. Then you can simply GroupBy with respect to this index.
To clarify solution, here is how this collection looks without grouping (GroupBy commented):
Assuming your input is in order, the following will work:
var grouped = input.Select((n, i) => new { n, d = n - i }).GroupBy(p => p.d, p => p.n);
It won't work if your input is e.g. { 1, 2, 3, 999, 5, 6, 7 }.
You'd get { { 1, 2, 3, 5, 6, 7 }, { 999 } }.
This works:
var results =
listOfInt
.Skip(1)
.Aggregate(
new List<List<int>>(new [] { listOfInt.Take(1).ToList() }),
(a, x) =>
{
if (a.Last().Last() + 1 == x)
{
a.Last().Add(x);
}
else
{
a.Add(new List<int>(new [] { x }));
}
return a;
});
I get this result:

Categories