C# method to remove duplicates from a List<T> - c#

I need a C# method to remove duplicates from a List<T> using a custom comparison operation. In .NET 4. Is there one or do I have to write it myself?

Assuming your comparison operation is IEqualityComparer<T> or can be converted to it, you're fine with LINQ:
var newList = oldList.Distinct(customComparer).ToList();
Obviously that creates a new list rather than removing elements from the old one, but in most cases that's okay. You could always completely replace the contents of the old list with the new list afterwards if not...
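Replacing the contents of the old list, as mentioned above, could look roughly like the sketch below. The helper name ReplaceWithDistinct and the sample data are made up for illustration; only Distinct, ToList, Clear and AddRange come from the framework.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctInPlaceExample
{
    // Hypothetical helper: replaces the contents of 'list' with its
    // distinct elements according to 'comparer'.
    public static void ReplaceWithDistinct<T>(List<T> list, IEqualityComparer<T> comparer)
    {
        // Materialize the query first: Distinct is lazy, so the list must
        // not be cleared while the query is still reading from it.
        List<T> distinct = list.Distinct(comparer).ToList();
        list.Clear();
        list.AddRange(distinct);
    }

    public static void Main()
    {
        var names = new List<string> { "Ann", "bob", "ANN", "Bob" };
        ReplaceWithDistinct(names, StringComparer.OrdinalIgnoreCase);
        Console.WriteLine(names.Count); // prints 2
    }
}
```

Any other reference to the same list object sees the de-duplicated contents afterwards, which is the main reason to prefer this over simply reassigning the variable.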

You could go with Jon's answer, or if you really want to remove duplicates from an existing list, something like this would work:
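// Extension method: must be declared inside a non-generic static class.
public static void RemoveDuplicates<T>(this IList<T> list, IEqualityComparer<T> comparer = null)
{
    comparer = comparer ?? EqualityComparer<T>.Default;
    var uniques = new HashSet<T>(comparer);
    // Walk backwards so RemoveAt does not shift the indexes still to be visited.
    // Note: this keeps the last occurrence of each duplicate, not the first.
    for (int i = list.Count - 1; i >= 0; --i)
    {
        if (!uniques.Add(list[i]))
        {
            list.RemoveAt(i);
        }
    }
}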
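Because the loop above runs from the end of the list, it keeps the last occurrence of each duplicate, whereas a forward pass keeps the first. If that difference matters, a forward variant might look like the sketch below. RemoveDuplicatesKeepFirst is a hypothetical helper, not a framework method. It compacts the unique items to the front and then trims the tail, so it never removes from the middle of the list.

```csharp
using System;
using System.Collections.Generic;

public static class KeepFirstExample
{
    // Hypothetical helper, not part of the framework.
    public static void RemoveDuplicatesKeepFirst<T>(IList<T> list, IEqualityComparer<T> comparer)
    {
        if (list == null) throw new ArgumentNullException("list");
        var seen = new HashSet<T>(comparer ?? EqualityComparer<T>.Default);
        int write = 0;
        for (int read = 0; read < list.Count; read++)
        {
            // Add returns false when an equal item has been seen already.
            if (seen.Add(list[read]))
            {
                list[write++] = list[read];
            }
        }
        // Everything from 'write' onwards is leftover; trim from the end.
        while (list.Count > write)
        {
            list.RemoveAt(list.Count - 1);
        }
    }

    public static void Main()
    {
        var numbers = new List<int> { 3, 1, 3, 2, 1 };
        RemoveDuplicatesKeepFirst(numbers, null);
        Console.WriteLine(string.Join(", ", numbers)); // prints "3, 1, 2"
    }
}
```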

Related

Efficient sorting when you only want the best

I have a list of 5000 items that are sorted using a custom algorithm.
I only need the best one, i.e. list[0] after the sort has completed.
So I need an algorithm that takes the first item of the list, compares it to the second item, then compares the better of these two with the third item, and so on. Just one loop through the whole list (order n).
Which sorting method in C# should I use for this rather common scenario?
I believe the Sort(..) algorithm that I currently use is very inefficient for this purpose.
You can use MoreLINQ's MaxBy() for this.
Depending on how you define "best", the selector parameter you specify might be different.
Example: You have a list of strings and the "best value" is the longest string.
string longestString = listOfStrings.MaxBy(x => x.Length);
As you can see from the linked implementation, this is O(n). This is the best that is possible for an unsorted set.
I found the MaxBy of MoreLinq with selection, projection, source and key somewhat too complicated and just used my own code as an extension method:
public static T GetBest<T>(this List<T> list, IComparer<T> comparer)
{
if (list == null) throw new ArgumentNullException("list");
if (comparer == null) throw new ArgumentNullException("comparer");
if (list.Count > 0)
{
T best = list[0];
for (int i = 1; i < list.Count; i++)
{
if (comparer.Compare(best, list[i]) > 0)
{
best = list[i];
}
}
return best;
}
return default(T);
}

What is the most performant way to delete elements from a List<T> in C# while doing a foreach?

I want to know the best way to delete elements from a List in C# while doing a foreach.
Here is a code sample. First I create a list with some elements and then delete one:
List<int> foo = new List<int>();
foo.Add(1);
foo.Add(2);
foo.Add(3);
foreach (int i in foo)
{
if (i==2)
{
foo.Remove(i);
}
}
When I run this, I get an InvalidOperationException. How can I solve this in a performant way?
If you must remove entries while enumerating, walk the list in backward direction, and remove items that you need to remove.
for (var i = foo.Count-1 ; i >= 0 ; i--) {
if (MustBeRemoved(foo[i])) {
foo.RemoveAt(i);
}
}
Note that this is not required in case of your post, where you know the values that need to be removed.
foo.RemoveAll(x => x == 2);
In case you decide not to use for and foreach ;)
I assume your actual use case is more complicated than what you have laid out. So let's assume you actually have some condition at play that applies to each element and that multiple elements can satisfy. We'll call that a predicate.
List<T> exposes a RemoveAll method that allows you to supply a predicate. Any item that matches that predicate is then removed. For example
Func<int, bool> isEven = i => i % 2 == 0;
List<int> ints = ...
ints.RemoveAll(item => isEven(item));
// ints will only contain odd numbers
Other approaches to consider would be walking over the list backwards in a for loop and removing by index, building a second list containing the items to delete, and then in a second loop over the second list, remove items from the first. Or you could just write a query to construct a new sequence containing the items that you wish to keep.
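The last option, a query that builds a new sequence of the items to keep, could be sketched as follows. The class and method names are illustrative only. Unlike RemoveAll, this leaves the original list untouched.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeepByQueryExample
{
    // Hypothetical helper: returns a new list holding only the odd numbers.
    public static List<int> KeepOdd(List<int> source)
    {
        // Where filters lazily; ToList materializes the result into a new list.
        return source.Where(i => i % 2 != 0).ToList();
    }

    public static void Main()
    {
        var ints = new List<int> { 1, 2, 3, 4, 5 };
        List<int> odds = KeepOdd(ints);
        Console.WriteLine(string.Join(", ", odds)); // prints "1, 3, 5"
        Console.WriteLine(ints.Count);              // prints 5, the source is unchanged
    }
}
```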
Why do you need a loop?
foo.Remove(2);
I think the best way is to iterate backwards using a simple for loop.
for(int i = foo.Count-1; i>=0; i--)
if(foo[i]==2) foo.RemoveAt(i);
Change your foreach as below, so that it iterates over a copy of the list:
foreach (int i in new List<int>(foo))
If you want to remove arbitrary elements, and not just one, you can use RemoveAll and specify a predicate:
foo.RemoveAll(element => (element == 2));
You can add the items that you want to remove to a temporary list, then remove them after the loop:
List<int> foo = new List<int>();
foo.Add(1);
foo.Add(2);
foo.Add(3);
List<int> remove = new List<int>();
foreach (int i in foo) {
if (i==2) {
remove.Add(i);
}
}
foreach (int i in remove) {
foo.Remove(i);
}
You can't edit a list while you're iterating it.
Consider:
List<int> foo;
int[] bar = foo.ToArray();
foreach(int i in bar)
{
if (i == 2)
{
foo.Remove(i);
}
}
Because this removes by value (Remove), the iteration order does not matter here. But beware if you remove by index (RemoveAt) instead: you should then walk the list backwards, because removing an item from the foo list will mean the bar array no longer aligns with it. (If you don't walk backwards, you'll have to keep track of the count of removals and adjust the index passed to the remove call!)

Removing element from list with predicate

I have a list from the .NET collections library and I want to remove a single element. Sadly, I cannot find it by comparing directly with another object.
I fear that using FindIndex and RemoveAt will cause multiple traversals of the list.
I don't know how to use Enumerators to remove elements, otherwise that could have worked.
RemoveAll does what I need, but will not stop after one element is found.
Ideas?
List<T> has a FindIndex method that accepts a predicate
int index = words.FindIndex(s => s.StartsWith("x"));
if (index >= 0)
{
words.RemoveAt(index);
}
Removes first word starting with "x". words is assumed to be a List<string> in this example.
If you want to remove only the first element that matches a predicate you can use the following (example):
List<int> list = new List<int>();
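list.Remove(list.FirstOrDefault(x => x == 10));
where (x => x == 10) is your predicate for matching the objects. Note that if nothing matches, FirstOrDefault returns default(T) (0 for int), and Remove will then remove that value if it happens to be in the list.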
EDIT: Now the OP has changed to use a LinkedList<T>, it's easy to give an answer which only iterates as far as it has to:
public static void RemoveFirst<T>(LinkedList<T> list, Predicate<T> predicate)
{
var node = list.First;
while (node != null)
{
if (predicate(node.Value))
{
list.Remove(node);
return;
}
node = node.Next;
}
}
In case someone needs the same thing, but for IList<T>
(inspired by Strillo's answer, but more efficient):
public static bool Remove<T>(this IList<T> list, Predicate<T> predicate)
{
for(int i = 0; i < list.Count; i++)
{
if(predicate(list[i]))
{
list.RemoveAt(i);
return true;
}
}
return false;
}

'Cropping' a list in c#

Given a generic IList of some type, which contains a number of items, is there any way of 'cropping' this list, so that only the first x items are preserved, and the rest discarded?
If you can use Linq, it's just a matter of doing
// Extract the first 5 items in myList to newList
var newList = myList.Take(5).ToList();
// You can combine with .Skip() to extract items from the middle
var middleItems = myList.Skip(2).Take(5).ToList();
Note that the above will create new lists with the 5 elements. If you just want to iterate over the first 5 elements, you don't have to create a new list:
foreach (var oneOfTheFirstFive in myList.Take(5))
// do stuff
The existing answers create a new list containing a subset of items from the original list.
If you need to truncate the original list in-place then these are your options:
// if your list is a concrete List<T>
if (yourList.Count > newSize)
{
yourList.RemoveRange(newSize, yourList.Count - newSize);
}
// or, if your list is an IList<T> or IList but *not* a concrete List<T>
while (yourList.Count > newSize)
{
yourList.RemoveAt(yourList.Count - 1);
}
You have a very simple way to do it:
IList<T> list = [...]; //initialize
IList<T> newList = new List<T>(max);
for (int i = 0; i < max; i++) newList.Add(list[i]);
Note: max MUST be less than or equal to the list's length (otherwise the indexer throws, e.g. an ArgumentOutOfRangeException for a List<T>).
If you need to do it just with the IList<T> interface, then something like this is the solution:
for (int i = list.Count - 1; i >= numberOfElementsToKeep; --i) {
list.RemoveAt(i);
}
Working backwards from the end of the list here, in order to avoid moving around data which will be deleted in subsequent loop iterations.
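The in-place options above can be combined into one helper that uses RemoveRange when the list happens to be a concrete List<T> and falls back to RemoveAt otherwise. This is only a sketch; the name Truncate and the argument check are assumptions, not something the framework provides.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public static class TruncateExample
{
    // Hypothetical helper: keeps the first 'newSize' items and drops the rest.
    public static void Truncate<T>(IList<T> list, int newSize)
    {
        if (list == null) throw new ArgumentNullException("list");
        if (newSize < 0) throw new ArgumentOutOfRangeException("newSize");

        var concrete = list as List<T>;
        if (concrete != null)
        {
            // List<T> can drop the whole tail in a single call.
            if (concrete.Count > newSize)
            {
                concrete.RemoveRange(newSize, concrete.Count - newSize);
            }
            return;
        }

        // Generic fallback: remove from the end so nothing has to be shifted.
        while (list.Count > newSize)
        {
            list.RemoveAt(list.Count - 1);
        }
    }

    public static void Main()
    {
        var a = new List<int> { 1, 2, 3, 4, 5 };
        Truncate(a, 2);
        Console.WriteLine(string.Join(", ", a)); // prints "1, 2"

        var b = new Collection<int> { 1, 2, 3, 4 };
        Truncate(b, 3);
        Console.WriteLine(b.Count); // prints 3
    }
}
```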

How to remove elements from an array

Hi I'm working on some legacy code that goes something along the lines of
for(int i = results.Count-1; i >= 0; i--)
{
if(someCondition)
{
results.Remove(results[i]);
}
}
To me it seems like bad practice to be removing the elements while still iterating through the loop because you'll be modifying the indexes.
Is this a correct assumption?
Is there a better way of doing this? I would like to use LINQ but I'm in 2.0 Framework
The removal is actually OK since you are going downwards to zero: only the indexes that you have already passed are modified. Just make sure the loop starts at results.Count - 1 and not at results.Count, since indexes start at 0.
for(int i = results.Count-1; i >= 0; i--)
{
if(someCondition)
{
results.RemoveAt(i);
}
}
Edit:
As was pointed out - you actually must be dealing with a List of some sort in your pseudo-code. In this case they are conceptually the same (since Lists use an Array internally) but if you use an array you have a Length property (instead of a Count property) and you can not add or remove items.
Using a list the solution above is certainly concise but might not be easy to understand for someone that has to maintain the code (i.e. especially iterating through the list backwards) - an alternative solution could be to first identify the items to remove, then in a second pass removing those items.
Just substitute MyType with the actual type you are dealing with:
List<MyType> removeItems = new List<MyType>();
foreach(MyType item in results)
{
if(someCondition)
{
removeItems.Add(item);
}
}
foreach (MyType item in removeItems)
results.Remove(item);
It doesn't seem like the Remove should work at all. The IList implementation should fail if we're dealing with a fixed-size array, see here.
That being said, if you're dealing with a resizable list (e.g. List<T>), why call Remove instead of RemoveAt? Since you're already navigating the indices in reverse, you don't need to "re-find" the item.
May I suggest a somewhat more functional alternative to your current code:
Instead of modifying the existing array one item at a time, you could derive a new one from it and then replace the whole array as an "atomic" operation once you're done:
The easy way (no LINQ, but very similar):
Predicate<T> filter = delegate(T item) { return !someCondition; };
results = Array.FindAll(results, filter);
// with LINQ, you'd have written: results = results.Where(item => !someCondition).ToArray();
where T is the type of the items in your results array.
A somewhat more explicit alternative:
var newResults = new List<T>();
foreach (T item in results)
{
if (!someCondition)
{
newResults.Add(item);
}
}
results = newResults.ToArray();
Usually you wouldn't remove elements as such, you would create a new array from the old without the unwanted elements.
If you do go the route of removing elements from an array/list your loop should count down rather than up. (as yours does)
A couple of options:
List<int> indexesToRemove = new List<int>();
for (int i = results.Count - 1; i >= 0; i--)
{
    if (someCondition)
    {
        indexesToRemove.Add(i);
    }
}
// The indexes were collected in descending order, so removing in that
// order never shifts an index that is still waiting to be removed.
foreach (int i in indexesToRemove)
{
    results.RemoveAt(i);
}
Or alternatively, you could make a copy of the existing list, iterate over the copy, and remove from the original list.
//temp is a copy of results
for (int i = temp.Count - 1; i >= 0; i--)
{
    if (someCondition) // evaluated against temp[i]
    {
        results.RemoveAt(i);
    }
}
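If results is a List<T>, its RemoveAll method takes a Predicate<T> and needs no LINQ, so it is available on the 2.0 Framework as well. A minimal sketch, where IsNegative is a made-up stand-in for someCondition and the syntax is kept to what a C# 2.0 compiler accepts:

```csharp
using System;
using System.Collections.Generic;

public static class RemoveAllExample
{
    // Hypothetical condition, standing in for someCondition.
    private static bool IsNegative(int value)
    {
        return value < 0;
    }

    public static int RemoveNegatives(List<int> results)
    {
        // RemoveAll removes every match and returns how many were removed.
        return results.RemoveAll(IsNegative);
    }

    public static void Main()
    {
        List<int> results = new List<int>();
        results.Add(5);
        results.Add(-1);
        results.Add(7);
        results.Add(-3);

        int removed = RemoveNegatives(results);
        Console.WriteLine(removed);       // prints 2
        Console.WriteLine(results.Count); // prints 2
    }
}
```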
