Remove N items from IList where match predicate - c#

I would like to remove N items from an IList collection. Here's what I've got:
public void RemoveSubcomponentsByTemplate(int templateID, int countToRemove)
{
// TaskDeviceSubcomponents is an IList
var subcomponents = TaskDeviceSubcomponents.Where(tds => tds.TemplateID == templateID).ToList();
if (subcomponents.Count < countToRemove)
{
string message = string.Format("Attempted to remove more subcomponents than found. Found: {0}, attempted: {1}", subcomponents.Count, countToRemove);
throw new ApplicationException(message);
}
subcomponents.RemoveRange(0, countToRemove);
}
Unfortunately, this code does not work as advertised. TaskDeviceSubcomponents is an IList, so it doesn't have the RemoveRange method. So, I call .ToList() to instantiate an actual List, but this gives me a duplicate collection with references to the same collection items. This is no good because calling RemoveRange on subcomponents does not affect TaskDeviceSubcomponents.
Is there a simple way to achieve this? I'm just not seeing it.

Unfortunately, I think you need to remove each item individually. I would change your code to this:
public void RemoveSubcomponentsByTemplate(int templateID, int countToRemove)
{
// TaskDeviceSubcomponents is an IList
var subcomponents = TaskDeviceSubcomponents
.Where(tds => tds.TemplateID == templateID)
.Take(countToRemove)
.ToList();
foreach (var item in subcomponents)
{
TaskDeviceSubcomponents.Remove(item);
}
}
Note that it is important to use ToList here so you are not iterating TaskDeviceSubcomponents while removing some of its items. This is because LINQ uses lazy evaluation, so it doesn't iterate over TaskDeviceSubcomponents until you iterate over subcomponents.
Edit: I neglected to only remove the number of items contained in countToRemove, so I added a Take call after the Where.
Edit 2: Specification for the Take()-Method: http://msdn.microsoft.com/en-us/library/bb503062(v=vs.110).aspx
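A minimal sketch of the same idea as a reusable helper, assuming the collection is an IList<T>. The names ListRemovalExample and RemoveFirstMatches and the sample data are made up for illustration. It keeps the count check from the question and removes by index from the highest match down, so the remaining indexes stay valid.

```csharp
using System;
using System.Collections.Generic;

public static class ListRemovalExample
{
    // Hypothetical helper: removes the first `count` items that satisfy `match`.
    public static void RemoveFirstMatches<T>(IList<T> list, Func<T, bool> match, int count)
    {
        // Collect the indexes of the first `count` matches.
        var indexes = new List<int>();
        for (int i = 0; i < list.Count && indexes.Count < count; i++)
        {
            if (match(list[i])) indexes.Add(i);
        }

        if (indexes.Count < count)
        {
            throw new InvalidOperationException(
                string.Format("Found {0} matching items, asked to remove {1}.", indexes.Count, count));
        }

        // Remove from the highest index down so the lower indexes stay valid.
        for (int i = indexes.Count - 1; i >= 0; i--)
        {
            list.RemoveAt(indexes[i]);
        }
    }

    public static void Main()
    {
        IList<int> templateIds = new List<int> { 7, 3, 7, 7, 5 };
        RemoveFirstMatches(templateIds, id => id == 7, 2);
        Console.WriteLine(string.Join(",", templateIds)); // 3,7,5
    }
}
```

Removing by index also sidesteps the repeated linear search that IList.Remove(item) performs for each item.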

Related

Get last duplicate element in a list

I have a list that contains duplicate items.
List<string> filterList = new List<string>()
{
"postpone", "access", "success", "postpone", "success"
};
I get the output postpone, access, success by using
List<string> filter = filterList.Distinct().ToList();
string a = string.Join(",", filter.Select(s => s).ToArray());
Console.WriteLine(a);
I have seen other examples that use GroupBy to get the latest element, but those items have other properties such as an ID. I only have strings. How can I keep the last occurrence of each item instead, so that the result is access, postpone, success? Any suggestions?
One way to do this would be to use the index of each item in the original collection along with GroupBy. For example,
var lastDistinct = filterList.Select((x,index)=> new {Value=x,Index=index})
.GroupBy(x=>x.Value)
.Select(x=> x.Last())
.OrderBy(x=>x.Index)
.Select(x=>x.Value);
var result = string.Join(",",lastDistinct);
Output
access,postpone,success
An OrderedDictionary does this. All you have to do is add your items to it with the logic "if it's in the dictionary, remove it; then add it". OrderedDictionary preserves insertion order, so removing an earlier entry and re-adding it moves it to the end of the dictionary.
var d = new OrderedDictionary();
filterList.ForEach(x => { if(d.Contains(x)) d.Remove(x); d[x] = null; });
Your d.Keys now contains the strings
access
postpone
success
OrderedDictionary is in the System.Collections.Specialized namespace.
If you want the keys as a CSV, you can use Cast to turn them from object to string:
var s = string.Join(",", d.Keys.Cast<string>());
Your input list is only of type string, so using GroupBy doesn't really add anything. If you consider your code, your first line gives you the distinct list; you only lose the individual items because you did a string.Join on line 2. All you need to do is add a line before you join:
List<string> filter = filterList.Distinct().ToList();
string last = filter.LastOrDefault();
string a = string.Join(",", filter.Select(s => s).ToArray());
Console.WriteLine(a);
I suppose you could make your code more terse, because you need neither .Select(s => s) nor .ToArray() in your call to string.Join.
GroupBy would be used if you had a list of class/struct/record/tuple items, where you might want to group by a specific key (or keys) rather than using Distinct() on the whole thing. GroupBy is very useful and you should learn that, and also the ToDictionary and ToLookup LINQ helper functionality.
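To illustrate where GroupBy becomes useful, here is a small sketch with made-up (Id, Status) tuples: it groups by Id and keeps the last status seen for each one. The type and method names are hypothetical, and the sample data is invented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupByExample
{
    // Keeps the last status seen for each id; ids come out in first-seen order.
    public static List<string> LastStatusPerId(IEnumerable<(int Id, string Status)> events)
    {
        return events
            .GroupBy(e => e.Id)
            .Select(g => g.Key + ":" + g.Last().Status)
            .ToList();
    }

    public static void Main()
    {
        var events = new List<(int Id, string Status)>
        {
            (1, "postpone"), (2, "access"), (1, "success")
        };
        Console.WriteLine(string.Join(",", LastStatusPerId(events))); // 1:success,2:access
    }
}
```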
So why shouldn't you return the first occurrence of "postpone"? Because later in the sequence you see the same word "postpone" again. Why would you return the first occurrence of "access"? Because later in the sequence you don't see this word anymore.
So: return a word if the rest of the sequence does not have this word.
This would be easy in LINQ, with recursion, but it is not very efficient: for every word you would have to check the rest of the sequence to see if the word is in the rest.
It would be way more efficient to remember the highest index on which you found a word.
The following does this as an extension method. If you are not familiar with extension methods, see extension methods demystified.
// both overloads must live in a static class to be usable as extension methods
public static IEnumerable<T> FindLastOccurrences<T>(this IEnumerable<T> source)
{
return FindLastOccurrences<T>(source, null);
}
public static IEnumerable<T> FindLastOccurrences<T>(this IEnumerable<T> source,
IEqualityComparer<T> comparer)
{
// TODO: check source not null
if (comparer == null) comparer = EqualityComparer<T>.Default;
Dictionary<T, int> dictionary = new Dictionary<T, int>(comparer);
int index = 0;
foreach (T item in source)
{
// did we already see this T? = is this in the dictionary
if (dictionary.TryGetValue(item, out int highestIndex))
{
// we already saw it at index highestIndex.
dictionary[item] = index;
}
else
{
// it is not in the dictionary, we never saw this item.
dictionary.Add(item, index);
}
++index;
}
// return the keys after sorting by value (which contains the highest index)
return dictionary.OrderBy(keyValuePair => keyValuePair.Value)
.Select(keyValuePair => keyValuePair.Key);
}
So for every item in the source sequence, we check if it is in the dictionary. If not, we add the item as key to the dictionary. The value is the index.
If it is already in the dictionary, then the value was the highest index of where we found this item before. Apparently the current index is higher, so we replace the value in the dictionary.
Finally we order the key value pairs in the dictionary by ascending value, and return only the keys.
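A condensed, self-contained variant of the same idea with a usage example, in case it helps to see it run. The names LastOccurrenceExample and LastOccurrences are made up, and this version relies on the dictionary indexer to add or overwrite in one step instead of the explicit TryGetValue branch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LastOccurrenceExample
{
    // Remembers the highest index seen per item, then orders the items by that index.
    public static IEnumerable<T> LastOccurrences<T>(this IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException("source");

        var highestIndex = new Dictionary<T, int>();
        int index = 0;
        foreach (T item in source)
        {
            highestIndex[item] = index++; // adds the key or overwrites its index
        }
        return highestIndex.OrderBy(pair => pair.Value).Select(pair => pair.Key);
    }

    public static void Main()
    {
        var filterList = new List<string> { "postpone", "access", "success", "postpone", "success" };
        Console.WriteLine(string.Join(",", filterList.LastOccurrences())); // access,postpone,success
    }
}
```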

Performing filtering and sorting on the original collection

I am stuck in a scenario where I have a custom collection class that implements the ICollection interface, and I have a code segment like the following:
myCustomCollectionObject.Where(obj=>obj.isValid).ToList().Sort(mycustomerComparer);
The above code filters the original collection and then sorts the result.
In this scenario, the sorting is performed on a new list rather than on the original collection.
So, is there any way or workaround to filter and then sort the original collection itself?
If you can't use the immutable/functional goodness of Linq, then you have to go old-skool:
//Remove unwanted items (assumes the collection exposes Count and an indexer)
for (int i = myCustomCollectionObject.Count - 1; i >= 0; i--)
{
if (!myCustomCollectionObject[i].isValid)
myCustomCollectionObject.Remove(myCustomCollectionObject[i]);
}
myCustomCollectionObject.Sort(mycustomerComparer);
Just happened to learn myCustomCollectionObject isn't List<T>, hence a complete rewrite.
Approach 1:
Have a Sort method in your class
List<T> backingStructure; //assuming this is what you have.
public void Sort(IComparer<T> comparer)
{
backingStructure = backingStructure.Where(obj => obj.isValid).ToList();
backingStructure.Sort(comparer);
}
and call Sort on the internal backing structure. I assume it is a List<T> or an array, both of which can be sorted in place (List<T>.Sort and Array.Sort). I have added the filtering logic inside your Sort method.
Approach 2:
If you don't want that, ie you want your filtering logic to be external to class, then have a method to repopulate your backing structure from an IEnumerable<T>. Like:
List<T> backingStructure; //assuming this is what you have.
//return type chosen to make method name meaningful, up to you to have void
public UndoRedoObservableCollection<T> From(IEnumerable<T> list)
{
// copy first: list may be a lazy query over this same collection
var items = new List<T>(list);
backingStructure.Clear();
backingStructure.AddRange(items);
return this;
}
Call it like
myCustomCollectionObject = myCustomCollectionObject.From
(
myCustomCollectionObject.Where(obj => obj.isValid)
.OrderBy(x => x.Key)
);
But you will need a key to specify ordering.
Approach 3 (the best of all):
Have a RemoveInvalid method
List<T> backingStructure; //assuming this is what you have.
public void RemoveInvalid()
{
//you can go for non-Linq (for loop) removal approach as well.
backingStructure = backingStructure.Where(obj => obj.isValid).ToList();
}
public void Sort(IComparer<T> comparer)
{
backingStructure.Sort(comparer);
}
Call it:
myCustomCollectionObject.RemoveInvalid();
myCustomCollectionObject.Sort(mycustomerComparer);
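A rough, self-contained sketch of Approach 3, assuming the custom collection wraps a List<T>. FilterableCollection<T> and RemoveWhere are hypothetical names standing in for the real class, and the predicate is passed in rather than hard-coded to isValid so the example compiles on its own. It uses List<T>.RemoveAll, which removes in place instead of replacing the backing list.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the custom collection in the question.
public class FilterableCollection<T>
{
    private readonly List<T> backingStructure = new List<T>();

    public int Count { get { return backingStructure.Count; } }
    public T this[int index] { get { return backingStructure[index]; } }
    public void Add(T item) { backingStructure.Add(item); }

    // Removes matching items in place and returns how many were removed.
    public int RemoveWhere(Predicate<T> shouldRemove)
    {
        return backingStructure.RemoveAll(shouldRemove);
    }

    // Sorts the backing list in place.
    public void Sort(IComparer<T> comparer)
    {
        backingStructure.Sort(comparer);
    }
}

public static class FilterableCollectionDemo
{
    public static void Main()
    {
        var numbers = new FilterableCollection<int>();
        foreach (int n in new[] { 5, -1, 3, -7, 1 }) numbers.Add(n);

        numbers.RemoveWhere(n => n < 0);      // filter the original collection
        numbers.Sort(Comparer<int>.Default);  // then sort the same collection

        // Prints the remaining items in ascending order: 1, 3, 5
        for (int i = 0; i < numbers.Count; i++) Console.WriteLine(numbers[i]);
    }
}
```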

How to remove elements from an array

Hi I'm working on some legacy code that goes something along the lines of
for(int i = results.Count-1; i >= 0; i--)
{
if(someCondition)
{
results.Remove(results[i]);
}
}
To me it seems like bad practice to be removing the elements while still iterating through the loop because you'll be modifying the indexes.
Is this a correct assumption?
Is there a better way of doing this? I would like to use LINQ but I'm in 2.0 Framework
The removal is actually OK: since you are going downwards to zero, only the indexes that you have already passed are affected. Just make sure the loop starts at results.Count - 1 rather than results.Count, since indexes start at 0; starting at results.Count would index past the end of the list. You can also call RemoveAt instead of Remove, which avoids searching for the item again:
for(int i = results.Count-1; i >= 0; i--)
{
if(someCondition)
{
results.RemoveAt(i);
}
}
Edit:
As was pointed out - you actually must be dealing with a List of some sort in your pseudo-code. In this case they are conceptually the same (since Lists use an Array internally) but if you use an array you have a Length property (instead of a Count property) and you can not add or remove items.
Using a list the solution above is certainly concise but might not be easy to understand for someone that has to maintain the code (i.e. especially iterating through the list backwards) - an alternative solution could be to first identify the items to remove, then in a second pass removing those items.
Just substitute MyType with the actual type you are dealing with:
List<MyType> removeItems = new List<MyType>();
foreach(MyType item in results)
{
if(someCondition)
{
removeItems.Add(item);
}
}
foreach (MyType item in removeItems)
results.Remove(item);
It doesn't seem like the Remove should work at all. The IList implementation should fail if we're dealing with a fixed-size array, see here.
That being said, if you're dealing with a resizable list (e.g. List<T>), why call Remove instead of RemoveAt? Since you're already navigating the indices in reverse, you don't need to "re-find" the item.
May I suggest a somewhat more functional alternative to your current code:
Instead of modifying the existing array one item at a time, you could derive a new one from it and then replace the whole array as an "atomic" operation once you're done:
The easy way (no LINQ, but very similar):
Predicate<T> filter = delegate(T item) { return !someCondition; };
results = Array.FindAll(results, filter);
// with LINQ, you'd have written: results = results.Where(item => !someCondition).ToArray();
where T is the type of the items in your results array.
A somewhat more explicit alternative:
var newResults = new List<T>();
foreach (T item in results)
{
if (!someCondition)
{
newResults.Add(item);
}
}
results = newResults.ToArray();
Usually you wouldn't remove elements as such, you would create a new array from the old without the unwanted elements.
If you do go the route of removing elements from an array/list, your loop should count down rather than up (as yours does).
A couple of options:
List<int> indexesToRemove = new List<int>();
for(int i = results.Count - 1; i >= 0; i--)
{
if(someCondition)
{
indexesToRemove.Add(i);
}
}
// indexes were collected from highest to lowest, so removing in this order is safe
foreach(int i in indexesToRemove) {
results.RemoveAt(i);
}
Alternatively, you could make a copy of the existing list, iterate over the copy, and remove from the original list.
//temp is a copy of results
for(int i = temp.Count-1; i >= 0; i--)
{
if(someCondition) // evaluated against temp[i]
{
results.RemoveAt(i);
}
}
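Since the question is limited to the 2.0 Framework, it may be worth noting that List<T>.RemoveAll is available there and accepts an anonymous delegate, so no LINQ or lambda syntax is needed. A minimal sketch, assuming results is a List<int> and using "is even" as a stand-in for someCondition; the class and method names are made up.

```csharp
using System;
using System.Collections.Generic;

public static class RemoveAllExample
{
    // Removes all even numbers in place and returns how many were removed.
    public static int RemoveEvens(List<int> results)
    {
        // Anonymous delegate syntax (C# 2.0); "value % 2 == 0" stands in for someCondition.
        return results.RemoveAll(delegate(int value) { return value % 2 == 0; });
    }

    public static void Main()
    {
        List<int> results = new List<int>(new int[] { 1, 2, 3, 4, 5, 6 });
        int removed = RemoveEvens(results);

        Console.WriteLine(removed);       // 3
        Console.WriteLine(results.Count); // 3 (1, 3 and 5 remain)
    }
}
```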

Problems removing elements from a list when iterating through the list

I have a loop that iterates through elements in a list. I am required to remove elements from this list within the loop based on certain conditions. When I try to do this in C#, I get an exception. Apparently, it is not allowed to remove elements from the list that is being iterated through. The problem was observed with a foreach loop. Is there any standard way to get around this problem?
Note : One solution I could think of is to create a copy of the list solely for iteration purpose and to remove elements from the original list within the loop. I am looking for a better way of dealing with this.
When using List<T> the ToArray() method helps in this scenario vastly:
List<MyClass> items = new List<MyClass>();
foreach (MyClass item in items.ToArray())
{
if (/* condition */) items.Remove(item);
}
The alternative is to use a for loop instead of a foreach, but then you have to decrement the index variable whenever you remove an element i.e.
List<MyClass> items = new List<MyClass>();
for (int i = 0; i < items.Count; i++)
{
if (/* condition */)
{
items.RemoveAt(i);
i--;
}
}
If your list is an actual List<T> then you can use the built-in RemoveAll method to delete items based on a predicate:
int numberOfItemsRemoved = yourList.RemoveAll(x => ShouldThisItemBeDeleted(x));
You could use LINQ to replace the initial list by a new list by filtering out items:
IEnumerable<Foo> initialList = FetchList();
initialList = initialList.Where(x => SomeFilteringConditionOnElement(x));
// Now initialList will be filtered according to the condition
// The filtered elements will be subject to garbage collection
This way you don't have to worry about loops.
You can use integer indexing to remove items:
List<int> xs = new List<int> { 1, 2, 3, 4 };
for (int i = 0; i < xs.Count; ++i)
{
// Remove even numbers.
if (xs[i] % 2 == 0)
{
xs.RemoveAt(i);
--i;
}
}
This can be weird to read and tough to maintain, though, especially if the logic in the loop gets any more complex.
Another trick is to loop through the list backwards.. removing an item won't affect any of the items you are going to encounter in the rest of the loop.
I'm not recommending this or anything else though. Everything you need this for can probably be done using LINQ statements to filter the list on your requirements.
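For completeness, the backwards loop mentioned above might look like the following sketch. The class and method names are made up, and "even number" is just a placeholder condition.

```csharp
using System;
using System.Collections.Generic;

public static class ReverseLoopExample
{
    // Walks the list from the end; RemoveAt(i) only shifts items at indexes
    // above i, which the loop has already visited.
    public static void RemoveEvens(List<int> xs)
    {
        for (int i = xs.Count - 1; i >= 0; i--)
        {
            if (xs[i] % 2 == 0)
            {
                xs.RemoveAt(i);
            }
        }
    }

    public static void Main()
    {
        var xs = new List<int> { 1, 2, 3, 4 };
        RemoveEvens(xs);
        Console.WriteLine(string.Join(",", xs)); // 1,3
    }
}
```

Unlike the forward loop, no index adjustment is needed after a removal.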
You can iterate with foreach this way:
List<Customer> custList = Customer.Populate();
foreach (var cust in custList.ToList())
{
custList.Remove(cust);
}
Note the ToList call on custList: the loop iterates over the copy created by ToList but removes the items from the original list.
Hope this helps.
The recommended solution is to put all the elements you want to remove in a separate list and, after the first loop, add a second loop that iterates over the remove-list and removes those elements from the first list.
The reason you get an error is that you're using a foreach loop. If you think about how a foreach loop works, this makes sense. The foreach loop calls the GetEnumerator method on the List. The enumerator keeps track of the state of the list it was created from; if you add or remove an element while the loop is running, the enumerator detects the change on its next step and throws an InvalidOperationException ("Collection was modified") rather than risk skipping or repeating items.
If you like LINQ and lambda expressions I would recommend Darin Dimitrov's solution; otherwise I would use the solution provided by Chris Schmich.
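A short sketch of what List<T> does when it is modified inside a foreach loop: the enumerator throws an InvalidOperationException on its next step. The class and method names are made up for the demonstration.

```csharp
using System;
using System.Collections.Generic;

public static class ModifyDuringForeachExample
{
    // Returns true if removing an item inside foreach caused InvalidOperationException.
    public static bool RemovalInsideForeachThrows()
    {
        var items = new List<int> { 1, 2, 3 };
        try
        {
            foreach (int item in items)
            {
                if (item == 1) items.Remove(item); // modifies the list being enumerated
            }
            return false;
        }
        catch (InvalidOperationException)
        {
            // Thrown when the loop asks the enumerator for the next element.
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(RemovalInsideForeachThrows()); // True
    }
}
```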

using Linq to generate a collection of things to be removed from another collection

I'm familiar with the problem of modifying a collection while looping over it with a foreach loop (i.e. "System.InvalidOperationException: Collection was modified"). However, it doesn't make sense to me that when I use Linq to create a List of keys to delete from a dictionary, then loop over my new List, I get the same exception.
Code before, that threw an exception:
IEnumerable<Guid> keysToDelete = _outConnections.Where(
pair => pair.Value < timeoutPoint
).Select(pair => pair.Key);
foreach (Guid key in keysToDelete)
{
...some stuff not dealing with keysToDelete...
_outConnections.Remove(key);
}
Code after, that worked:
List<Guid> keysToDelete = _outConnections.Where(
pair => pair.Value < timeoutPoint
).Select(pair => pair.Key).ToList();
for (int i=keysToDelete.Count-1; i>=0; i--)
{
Guid key = keysToDelete[i];
...some stuff not dealing with keysToDelete...
_outConnections.Remove(key);
}
Why is this? I have the feeling that maybe my Linq queries aren't really returning a new collection, but rather some subset of the original collection, hence it accuses me of modifying the collection keysToDelete when I remove an element from _outConnections.
Update: the following fix also works, thanks to Adam Robinson:
List<Guid> keysToDelete = _outConnections.Where(
pair => pair.Value < timeoutPoint
).Select(pair => pair.Key).ToList();
foreach (Guid key in keysToDelete)
{
...some stuff not dealing with keysToDelete...
_outConnections.Remove(key);
}
You're correct. LINQ uses what's called "deferred execution". Declaring your LINQ query doesn't actually do anything other than construct a query expression. It isn't until you actually enumerate over the list that the query is evaluated, and it uses the original list as the source.
However, calling ToList() should create a brand new list that has no relation to the original. Check the call stack of your exception to ensure that it is actually being thrown by keysToDelete.
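A small sketch of the difference, using a List<int> as a made-up source: the lazy query re-reads the source each time it is enumerated, while the ToList() result is a snapshot taken at the time of the call. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionExample
{
    // Returns { lazy result, snapshot result } as comma-separated strings.
    public static string[] Run()
    {
        var source = new List<int> { 1, 2, 3 };

        IEnumerable<int> lazy = source.Where(n => n > 1);           // nothing evaluated yet
        List<int> snapshot = source.Where(n => n > 1).ToList();     // evaluated right here

        source.Add(4); // change the source after both were declared

        return new[] { string.Join(",", lazy), string.Join(",", snapshot) };
    }

    public static void Main()
    {
        string[] results = Run();
        Console.WriteLine(results[0]); // 2,3,4 (query ran after the Add)
        Console.WriteLine(results[1]); // 2,3   (snapshot taken before the Add)
    }
}
```

This is why the foreach over the IEnumerable<Guid> was still enumerating _outConnections, whereas the version with ToList() iterated over an independent copy of the keys.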
