Efficient sorting when you only want the best - c#

I have a list of 5000 items that are sorted using a custom algorithm.
I only need the best one, i.e. list[0] after the sort has completed.
So I need an algorithm that takes the first item of the list, compares it to the second item, then compares the better of these two with the third item, and so on. Just one loop through the whole list (order n).
Which sorting method in C# should I use for this rather common scenario?
I believe the Sort(..) algorithm that I currently use is very inefficient for this purpose.

You can use MoreLINQ's MaxBy() for this.
Depending on how you define "best", the selector parameter you specify might be different.
Example: You have a list of strings and the "best value" is the longest string.
string longestString = listOfStrings.MaxBy(x => x.Length);
As you can see from the linked implementation, this is O(n). This is the best that is possible for an unsorted set.
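If you would rather not take a dependency on MoreLINQ, the same single pass can be sketched with the standard LINQ Aggregate operator. This is only an illustration: the class and method names are made up for the example, and "best" is assumed to mean the longest string, as above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BestItemSketch
{
    // Hypothetical helper: returns the longest string in a single pass.
    // Aggregate without a seed throws InvalidOperationException
    // when the sequence is empty.
    public static string Longest(IEnumerable<string> items)
    {
        // Keep the current best unless the next item is strictly longer,
        // so the first of several equally long strings wins.
        return items.Aggregate((best, next) => next.Length > best.Length ? next : best);
    }
}
```

Each element is compared exactly once against the running best, which is the "one loop through the whole list" the question asks for.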

I found the MaxBy of MoreLinq with selection, projection, source and key somewhat too complicated and just used my own code as an extension method:
public static T GetBest<T>(this List<T> list, IComparer<T> comparer)
{
if (list == null) throw new ArgumentNullException("list");
if (comparer == null) throw new ArgumentNullException("comparer");
if (list.Count > 0)
{
T best = list[0];
for (int i = 1; i < list.Count; i++)
{
if (comparer.Compare(best, list[i]) > 0)
{
best = list[i];
}
}
return best;
}
return default(T);
}

Related

what is the most performant way to delete elements from a List<T> in c# while doing a foreach?

I want to know the best way to delete elements from a List in C# while doing a foreach.
Here is a code sample. First I create a list with some elements and then delete one:
List<int> foo = new List<int>();
foo.Add(1);
foo.Add(2);
foo.Add(3);
foreach (int i in foo)
{
if (i==2)
{
foo.Remove(i);
}
}
When I run this, I get an InvalidOperationException. How can I solve this in a performant way?
If you must remove entries while enumerating, walk the list backwards by index and remove the items that need to go.
for (var i = foo.Count-1 ; i >= 0 ; i--) {
if (MustBeRemoved(foo[i])) {
foo.RemoveAt(i);
}
}
Note that this is not required in case of your post, where you know the values that need to be removed.
foo.RemoveAll(x => x == 2);
In case you decide not to use for and foreach ;)
I assume your actual use case is more complicated than what you have laid out. So let's assume you actually have some condition at play that applies to each element and that multiple elements can satisfy. We'll call that a predicate.
List<T> exposes a RemoveAll method that allows you to supply a predicate. Any item that matches that predicate is then removed. For example
Func<int, bool> isEven = i => i % 2 == 0;
List<int> ints = ...
ints.RemoveAll(item => isEven(item));
// ints will only contain odd numbers
Other approaches to consider would be walking over the list backwards in a for loop and removing by index, building a second list containing the items to delete, and then in a second loop over the second list, remove items from the first. Or you could just write a query to construct a new sequence containing the items that you wish to keep.
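The last option, building a new sequence of the items you want to keep, might look like the sketch below. The helper name and the "remove one specific value" condition are assumptions for illustration; any predicate would work in the Where call.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class KeepFilterSketch
{
    // Hypothetical helper: returns a new list holding only the items to keep.
    // The source list is not modified, so nothing is mutated while iterating.
    public static List<int> WithoutValue(List<int> source, int unwanted)
    {
        return source.Where(x => x != unwanted).ToList();
    }
}
```

Usage would be something like foo = KeepFilterSketch.WithoutValue(foo, 2);. Note that this replaces the reference rather than editing the original list, which matters if other code holds on to the old list.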
Why do you need a loop?
foo.Remove(2);
I think the best way is to iterate backwards using a simple for loop.
for(int i = foo.Count-1; i>=0; i--)
if(foo[i]==2) foo.RemoveAt(i);
change your foreach as below
foreach (int i in new List<int>(foo))
If you want to remove arbitrary elements, and not just one, you can use RemoveAll and specify a predicate:
foo.RemoveAll(element => (element == 2));
You can add the items that you want to remove to a temporary list, then remove them after the loop:
List<int> foo = new List<int>();
foo.Add(1);
foo.Add(2);
foo.Add(3);
List<int> remove = new List<int>();
foreach (int i in foo) {
if (i==2) {
remove.Add(i);
}
}
foreach (int i in remove) {
foo.Remove(i);
}
You can't edit a list while you're iterating it.
Consider:
List<int> foo;
int[] bar = foo.ToArray();
foreach(int i in bar)
{
if (i == 2)
{
foo.Remove(i);
}
}
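But beware: if you remove by index (RemoveAt) rather than by value, you should walk the list backwards, because removing an item from foo means the indexes in bar no longer align with it. (If you don't walk backwards, you'll have to keep track of the count of removals and adjust the index passed to the remove call!) With foo.Remove(i) as shown above, the item is looked up by value, so the direction of the walk does not matter.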

Removing element from list with predicate

I have a list from the .NET collections library and I want to remove a single element. Sadly, I cannot find it by comparing directly with another object.
I fear that using FindIndex and RemoveAt will cause multiple traversals of the list.
I don't know how to use Enumerators to remove elements, otherwise that could have worked.
RemoveAll does what I need, but will not stop after one element is found.
Ideas?
List<T> has a FindIndex method that accepts a predicate
int index = words.FindIndex(s => s.StartsWith("x"));
if (index >= 0)
{
words.RemoveAt(index);
}
Removes first word starting with "x". words is assumed to be a List<string> in this example.
If you want to remove only the first element that matches a predicate you can use the following (example):
List<int> list = new List<int>();
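list.Remove(list.FirstOrDefault(x => x == 10));
where (x => x == 10) is your predicate for matching the objects. Note that FirstOrDefault returns default(T) when nothing matches, so for a List<int> with no match this would remove the first 0 in the list, if there is one.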
EDIT: Now the OP has changed to use a LinkedList<T>, it's easy to give an answer which only iterates as far as it has to:
public static void RemoveFirst<T>(LinkedList<T> list, Predicate<T> predicate)
{
var node = list.First;
while (node != null)
{
if (predicate(node.Value))
{
list.Remove(node);
return;
}
node = node.Next;
}
}
In case someone needs the same thing, but for IList<T>
(inspired by Strillo's answer, but more efficient):
public static bool Remove<T>(this IList<T> list, Predicate<T> predicate)
{
for(int i = 0; i < list.Count; i++)
{
if(predicate(list[i]))
{
list.RemoveAt(i);
return true;
}
}
return false;
}

C# method to remove duplicates from a List<T>

I need a C# method to remove duplicates from a List<T> using a custom comparison operation. In .NET 4. Is there one or do I have to write it myself?
Assuming your comparison operation is IEqualityComparer<T> or can be converted to it, you're fine with LINQ:
var newList = oldList.Distinct(customComparer).ToList();
Obviously that creates a new list rather than removing elements from the old one, but in most cases that's okay. You could always completely replace the contents of the old list with the new list afterwards if not...
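Replacing the contents of the old list could be sketched as follows. The helper name is hypothetical; the only calls used are Enumerable.Distinct, List<T>.Clear and List<T>.AddRange.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctSketch
{
    // Hypothetical helper: deduplicates by rebuilding the contents of the same list.
    public static void DistinctInPlace<T>(List<T> list, IEqualityComparer<T> comparer)
    {
        // ToList() materializes the result first, so the lazy Distinct query
        // is not still reading the list when Clear() modifies it.
        List<T> unique = list.Distinct(comparer).ToList();
        list.Clear();
        list.AddRange(unique);
    }
}
```

For example, with a List<string> and StringComparer.OrdinalIgnoreCase as the comparer, "a" and "A" count as duplicates. The documentation describes the result of Distinct as unordered, so do not rely on which of the two survives or on the resulting order if that matters to you.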
You could go with Jon's answer, or if you really want to remove duplicates from an existing list, something like this would work:
public static void RemoveDuplicates<T>(this IList<T> list, IEqualityComparer<T> comparer = null)
{
comparer = comparer ?? EqualityComparer<T>.Default;
var uniques = new HashSet<T>(comparer);
for (int i = list.Count - 1; i >= 0; --i)
{
if (!uniques.Add(list[i]))
{
list.RemoveAt(i);
}
}
}
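One thing to be aware of: because the loop above walks from the end, it is the last occurrence of each duplicate that survives. If you need to keep the first occurrence instead, a forward-walking variant along these lines should work. This is a sketch, not part of the original answer, and the method name is made up.

```csharp
using System.Collections.Generic;

public static class DuplicateRemovalSketch
{
    // Hypothetical variant: removes duplicates in place, keeping the FIRST occurrence.
    public static void RemoveDuplicatesKeepFirst<T>(this IList<T> list, IEqualityComparer<T> comparer = null)
    {
        var seen = new HashSet<T>(comparer ?? EqualityComparer<T>.Default);
        int i = 0;
        while (i < list.Count)
        {
            if (seen.Add(list[i]))
            {
                i++;                // first time we see this value: keep it, move on
            }
            else
            {
                list.RemoveAt(i);   // duplicate: remove it and stay at the same index
            }
        }
    }
}
```

Like the original, this calls RemoveAt once per duplicate, so on a List<T> with many duplicates the cost can approach O(n²).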

What's the fastest way to convert List<string> to List<int> in C# assuming int.Parse will work for every item?

By fastest I mean what is the most performant means of converting each item in List to type int using C# assuming int.Parse will work for every item?
You won't get around iterating over all elements. Using LINQ:
var ints = strings.Select(s => int.Parse(s));
This has the added bonus that it only converts when you iterate over it, and only as many elements as you request.
If you really need a list, use the ToList method. However, you have to be aware that the performance bonus mentioned above won't be available then.
If you're really trying to eke out the last bit of performance you could try doing something with pointers like this, but personally I'd go with the simple LINQ implementation that others have mentioned.
unsafe static int ParseUnsafe(string value)
{
int result = 0;
fixed (char* v = value)
{
char* str = v;
while (*str != '\0')
{
result = 10 * result + (*str - 48);
str++;
}
}
return result;
}
var parsed = input.Select(i=>ParseUnsafe(i));//optionally .ToList() if you really need list
There is likely to be very little difference between any of the obvious ways to do this: therefore go for readability (one of the LINQ-style methods posted in other answers).
You may gain some performance for very large lists by initializing the output list to its required capacity, but it's unlikely you'd notice the difference, and readability will suffer:
List<string> input = ..
List<int> output = new List<int>(input.Count);
... Parse in a loop ...
The slight performance gain will come from the fact that the output list won't need to be repeatedly reallocated as it grows.
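Filled in, the preallocating loop could look like this. The class and method names are hypothetical, and the use of the invariant culture is my assumption so that parsing does not depend on the machine's regional settings.

```csharp
using System.Collections.Generic;
using System.Globalization;

public static class ParseSketch
{
    // Hypothetical helper: parses every string into a list sized up front.
    public static List<int> ParseAll(List<string> input)
    {
        // Capacity is known in advance, so the backing array never has to grow.
        var output = new List<int>(input.Count);
        foreach (string s in input)
        {
            // Throws FormatException if an item is not a valid integer;
            // the question assumes that never happens.
            output.Add(int.Parse(s, CultureInfo.InvariantCulture));
        }
        return output;
    }
}
```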
I don't know what the performance implications are, but there is a List<T>.ConvertAll<TOutput> method for converting the elements in the current List to another type, returning a list containing the converted elements.
List.ConvertAll Method
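A minimal sketch of the ConvertAll approach, with a made-up wrapper name, might be:

```csharp
using System.Collections.Generic;

public static class ConvertAllSketch
{
    // Hypothetical wrapper around List<T>.ConvertAll, which takes a
    // Converter<TInput, TOutput> delegate and returns a new List<TOutput>.
    public static List<int> ToInts(List<string> strings)
    {
        return strings.ConvertAll(s => int.Parse(s));
    }
}
```

As far as I know, ConvertAll creates the result list with the source list's count as its capacity, so it should avoid the repeated reallocation mentioned in another answer. Measure before relying on that.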
var myListOfInts = myListString.Select(x => int.Parse(x)).ToList()
Side note: If you call ToList() on an ICollection<T>, the .NET Framework automatically preallocates a List of the needed size, so it doesn't have to reallocate as items are added to the list.
Unfortunately LINQ Select doesn't return an ICollection (as Joe pointed out in comments).
From ILSpy:
// System.Linq.Enumerable
public static List<TSource> ToList<TSource>(this IEnumerable<TSource> source)
{
if (source == null)
{
throw Error.ArgumentNull("source");
}
return new List<TSource>(source);
}
// System.Collections.Generic.List<T>
public List(IEnumerable<T> collection)
{
if (collection == null)
{
ThrowHelper.ThrowArgumentNullException(ExceptionArgument.collection);
}
ICollection<T> collection2 = collection as ICollection<T>;
if (collection2 != null)
{
int count = collection2.Count;
this._items = new T[count];
collection2.CopyTo(this._items, 0);
this._size = count;
return;
}
this._size = 0;
this._items = new T[4];
using (IEnumerator<T> enumerator = collection.GetEnumerator())
{
while (enumerator.MoveNext())
{
this.Add(enumerator.Current);
}
}
}
So, ToList() just calls List constructor and passes in an IEnumerable.
The List constructor is smart enough that, if the source is an ICollection<T>, it uses the most efficient way of filling the new List instance.

How to remove elements from an array

Hi I'm working on some legacy code that goes something along the lines of
for(int i = results.Count-1; i >= 0; i--)
{
if(someCondition)
{
results.Remove(results[i]);
}
}
To me it seems like bad practice to be removing the elements while still iterating through the loop because you'll be modifying the indexes.
Is this a correct assumption?
Is there a better way of doing this? I would like to use LINQ but I'm in 2.0 Framework
The removal is actually ok since you are going downwards to zero, only the indexes that you already passed will be modified. This code actually would break for another reason: It starts with results.Count, but should start at results.Count -1 since array indexes start at 0.
for(int i = results.Count-1; i >= 0; i--)
{
if(someCondition)
{
results.RemoveAt(i);
}
}
Edit:
As was pointed out - you actually must be dealing with a List of some sort in your pseudo-code. In this case they are conceptually the same (since Lists use an Array internally) but if you use an array you have a Length property (instead of a Count property) and you can not add or remove items.
Using a list the solution above is certainly concise but might not be easy to understand for someone that has to maintain the code (i.e. especially iterating through the list backwards) - an alternative solution could be to first identify the items to remove, then in a second pass removing those items.
Just substitute MyType with the actual type you are dealing with:
List<MyType> removeItems = new List<MyType>();
foreach(MyType item in results)
{
if(someCondition)
{
removeItems.Add(item);
}
}
foreach (MyType item in removeItems)
results.Remove(item);
It doesn't seem like the Remove should work at all. The IList implementation should fail if we're dealing with a fixed-size array, see here.
That being said, if you're dealing with a resizable list (e.g. List<T>), why call Remove instead of RemoveAt? Since you're already navigating the indices in reverse, you don't need to "re-find" the item.
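Since the question targets .NET 2.0, where LINQ is not available, it may be worth noting that List<T>.RemoveAll already exists there and accepts an anonymous delegate. The sketch below assumes results is a List<int> and uses "item is negative" as a stand-in for someCondition; both are assumptions for illustration.

```csharp
using System.Collections.Generic;

public static class RemoveAllSketch
{
    // Hypothetical helper: removes every negative number and returns how many were removed.
    public static int RemoveNegatives(List<int> results)
    {
        // C# 2.0 anonymous delegate syntax; no lambdas or LINQ required.
        return results.RemoveAll(delegate(int item) { return item < 0; });
    }
}
```

This does the whole removal in one call, without an explicit index loop to get wrong.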
May I suggest a somewhat more functional alternative to your current code:
Instead of modifying the existing array one item at a time, you could derive a new one from it and then replace the whole array as an "atomic" operation once you're done:
The easy way (no LINQ, but very similar):
Predicate<T> filter = delegate(T item) { return !someCondition; };
results = Array.FindAll(results, filter);
// with LINQ, you'd have written: results = results.Where(item => !someCondition).ToArray();
where T is the type of the items in your results array.
A somewhat more explicit alternative:
var newResults = new List<T>();
foreach (T item in results)
{
if (!someCondition)
{
newResults.Add(item);
}
}
results = newResults.ToArray();
Usually you wouldn't remove elements as such, you would create a new array from the old without the unwanted elements.
If you do go the route of removing elements from an array/list your loop should count down rather than up. (as yours does)
a couple of options:
List<int> indexesToRemove = new List<int>();
for(int i = results.Count-1; i >= 0; i--)
{
if(someCondition)
{
//results.Remove(results[i]);
indexesToRemove.Add(i);
}
}
foreach(int i in indexesToRemove) {
results.RemoveAt(i); // indexes were collected in descending order, so removing by index is safe
}
Or alternatively, you could iterate over a copy of the existing list and remove from the original list.
//temp is a copy of results
for(int i = temp.Count-1; i >= 0; i--)
{
if(someCondition)
{
results.Remove(results[i]);
}
}
