ICollection<T> is non-index based, but TakeWhile() exists - c#

I'm trying to replace usages of T[] or List<T> as function parameters and return values with more appropriate types such as IEnumerable<T>, ICollection<T> and IList<T>.
From my understanding, ICollection<T> is preferable to IList<T> when you only need basic collection functionality (e.g. an enumerator and a count), as it provides this with the least restriction. From reading on here, I thought one of the main differentiators was that ICollection<T> doesn't require the underlying collection to be index-based, whereas IList<T> does?
In switching my List<T> references over, I needed to replace a List<T>.GetRange() call, and I was very surprised to find the ICollection<T>.TakeWhile() extension method, which has an overload supporting selection based on index?! (msdn link)
I'm confused about why this method exists on ICollection<T> when there is nothing index-based on this interface. Have I misunderstood, or how can this method actually work if the underlying collection is e.g. a HashSet<T>?

The method, like most of LINQ, is on IEnumerable<T>. Any feature that just passes the index to the consumer (such as TakeWhile) only needs to loop while incrementing a counter. Some APIs may be able to optimize using an indexer, and it is then up to them to decide whether to do that or to just use IEnumerable<T> and simply skip (etc.) unwanted data.
For example:
int i = 0;
foreach(var item in source) {
    if(!predicate(i++, item)) break;
    yield return item;
}

Indexing can be done without the collection supporting it:
int i = -1;
foreach(var item in collection)
{
    i++;
    // item is at index i;
}

TakeWhile and other extension methods from System.Linq.Enumerable class work on all the types implementing IEnumerable<T>. They all iterate over the collection (using foreach statement) and perform appropriate actions.
Here is the implementation of the TakeWhile method, with some simplifications:
private static IEnumerable<TSource> TakeWhile<TSource>(IEnumerable<TSource> source, Func<TSource, bool> predicate)
{
    foreach (TSource item in source)
    {
        if (!predicate(item))
        {
            break;
        }
        yield return item;
    }
}
As you see, it simply iterates over the collection, and evaluates the predicate. This is true for almost all other LINQ methods. The same will happen when you use any other collection, like HashSet<T>.
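To tie this back to the original question, here is a minimal sketch (the class and method names are made up for illustration) that runs the indexed TakeWhile overload over a HashSet<T> and shows one possible stand-in for List<T>.GetRange() built from Skip and Take. Unlike GetRange, Skip/Take does not throw when the requested range runs past the end of the sequence, and for a HashSet<T> the positions only reflect whatever order its enumerator happens to produce.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SequenceRangeSketch
{
    // Keeps elements while their enumeration position is below 'limit'.
    // The index is a counter kept by LINQ, not an indexer on the source.
    public static List<T> TakeByPosition<T>(IEnumerable<T> source, int limit)
    {
        return source.TakeWhile((item, index) => index < limit).ToList();
    }

    // Hypothetical replacement for List<T>.GetRange(index, count) that
    // works on any IEnumerable<T>, indexed or not.
    public static List<T> GetRange<T>(IEnumerable<T> source, int index, int count)
    {
        return source.Skip(index).Take(count).ToList();
    }

    public static void Main()
    {
        // HashSet<T> has no indexer, yet the indexed overload still works.
        var set = new HashSet<string> { "a", "b", "c", "d" };
        Console.WriteLine(TakeByPosition(set, 2).Count);

        var numbers = new[] { 10, 20, 30, 40, 50 };
        Console.WriteLine(string.Join(", ", GetRange(numbers, 1, 3)));
    }
}
```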

Related

Does LINQ to Objects keep its order

I have a List<Person> and instead want to convert them for simple processing to a List<string>, doing the following:
List<Person> persons = GetPersonsBySeatOrder();
List<string> seatNames = persons.Select(x => x.Name).ToList();
Console.WriteLine("First in line: {0}", seatNames[0]);
Is the .Select() statement on a LINQ to Objects object guaranteed to not change the order of the list members? Assuming no explicit distinct/grouping/ordering is added
Also, if an arbitrary .Where() clause is used first, is it still guaranteed to keep the relative order, or does it sometimes use non-iterative filtering?
As Fermin commented above, this is essentially a duplicate question. I failed to pick the right keywords when searching Stack Overflow:
Preserving order with LINQ
It depends on the underlying collection type more than anything. You could get inconsistent ordering from a HashSet, but a List is safe. Even if the ordering you want is provided implicitly, it's better to define an explicit ordering if you need it though. It looks like you're doing that judging by the method names.
The current .NET implementation uses the following code, but there is no guarantee that this implementation will stay the same in the future.
private static IEnumerable<TResult> SelectIterator<TSource, TResult>(IEnumerable<TSource> source, Func<TSource, int, TResult> selector)
{
    int index = -1;
    foreach (TSource source1 in source)
    {
        checked { ++index; }
        yield return selector(source1, index);
    }
}
Yes, Linq Select is guaranteed to return all its results in the order of the enumeration it is passed. Like most Linq functions, it is fully specified what it does. Barring handling of errors, this might as well be the code for Select:
IEnumerable<Y> Select<X, Y>(this IEnumerable<X> input, Func<X, Y> transform)
{
    foreach (var x in input)
        yield return transform(x);
}
But as Samantha Branham pointed out, the underlying collection might not have an intrinsic order. I've seen hashtables that rearrange themselves on read.
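As a small illustration of the point about Where() followed by Select(), the sketch below filters and projects a List<string> and gets the survivors back in their original relative order. The class name, method name, and sample data are invented for this example. It only demonstrates the behaviour for an ordered source such as List<T>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderSketch
{
    // Filters, then projects. Both operators stream the source in
    // enumeration order, so relative order is carried through.
    public static List<string> LongNamesUpper(IEnumerable<string> names, int minLength)
    {
        return names
            .Where(name => name.Length >= minLength)
            .Select(name => name.ToUpperInvariant())
            .ToList();
    }

    public static void Main()
    {
        var seats = new List<string> { "Dave", "Al", "Carol", "Bo", "Erin" };
        Console.WriteLine(string.Join(", ", LongNamesUpper(seats, 4)));
    }
}
```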

Cannot convert from an IEnumerable<T> to an ICollection<T>

I have defined the following:
public ICollection<Item> Items { get; set; }
When I run this code:
Items = _item.Get("001");
I get the following message:
Error 3
Cannot implicitly convert type
'System.Collections.Generic.IEnumerable<Storage.Models.Item>' to
'System.Collections.Generic.ICollection<Storage.Models.Item>'.
An explicit conversion exists (are you missing a cast?)
Can someone explain what I am doing wrong. I am very confused about the
difference between Enumerable, Collections and using the ToList()
Added information
Later in my code I have the following:
for (var index = 0; index < Items.Count(); index++)
Would I be okay to define Items as an IEnumerable?
ICollection<T> inherits from IEnumerable<T> so to assign the result of
IEnumerable<T> Get(string pk)
to an ICollection<T> there are two ways.
// 1. You know that the referenced object implements `ICollection<T>`,
// so you can use a cast
ICollection<T> c = (ICollection<T>)Get("pk");
// 2. The returned object can be any `IEnumerable<T>`, so you need to
// enumerate it and put it into something implementing `ICollection<T>`.
// The easiest is to use `ToList()`:
ICollection<T> c = Get("pk").ToList();
The second option is more flexible, but has a much larger performance impact. Another option is to store the result as an IEnumerable<T> unless you need the extra functionality added by the ICollection<T> interface.
Additional Performance Comment
The loop you have
for (var index = 0; index < Items.Count(); index++)
works on an IEnumerable<T> but it is inefficient; unless the underlying object happens to implement ICollection<T>, each call to Count() requires a complete enumeration of all elements. Either use a collection and the Count property (without the parentheses) or convert it into a foreach loop:
foreach(var item in Items)
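The cost of calling Count() in the loop condition can be made visible with a rough sketch like the one below. LazyItems is a hypothetical lazy source that bumps a counter each time it is enumerated; none of these names come from the question. The first loop re-enumerates the source on every condition check, while the second materializes it once and then reads the Count property.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CountSketch
{
    public static int Enumerations;

    // Hypothetical lazy source: the body runs once per enumeration.
    public static IEnumerable<int> LazyItems()
    {
        Enumerations++;
        yield return 1;
        yield return 2;
        yield return 3;
    }

    // Count() is evaluated on every loop check and walks the whole sequence.
    public static int LoopWithCountMethod()
    {
        Enumerations = 0;
        IEnumerable<int> items = LazyItems();
        for (var index = 0; index < items.Count(); index++) { }
        return Enumerations;
    }

    // ToList() enumerates once; Count is then a stored property.
    public static int LoopWithCountProperty()
    {
        Enumerations = 0;
        ICollection<int> items = LazyItems().ToList();
        for (var index = 0; index < items.Count; index++) { }
        return Enumerations;
    }

    public static void Main()
    {
        Console.WriteLine(LoopWithCountMethod());
        Console.WriteLine(LoopWithCountProperty());
    }
}
```

With three items, the first loop checks its condition four times and so enumerates the source four times; the second enumerates it once.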
You cannot implicitly convert from IEnumerable<T> to ICollection<T>. You can use the ToList() extension method on IEnumerable<T> to get a List<T>, which implements ICollection<T>:
someICollection = SomeIEnumerable.ToList();
Pending more information on the question:
please provide more information on the type of item and the signature of Get
Two things you can try are:
To cast the return value of _item.Get to (ICollection)
secondly to use _item.Get("001").ToArray() or _item.Get("001").ToList()
Please note the second will incur a performance hit for the array copy. If the signature (return type) of Get is not an ICollection then the first will not work, if it is not IEnumerable then the second will not work.
Following your clarification to question and in comments, I would personally declare the returning type of _item.Get("001") to ICollection. This means you won't have to do any casting or conversion (via ToList / ToArray) which would involve an unnecessary create/copy operation.
// Leave this the same
public ICollection<Item> Items { get; set; }
// Change function signature here:
// As you mention Item uses the same underlying type, just return an ICollection<T>
public ICollection<Item> Get(string value);
// Ideally here you want to call .Count on the collection, not .Count() on
// IEnumerable, as this will result in a new Enumerator being created
// per loop iteration
for (var index = 0; index < Items.Count(); index++)
Best regards,

Does IEnumerable always imply a collection?

Just a quick question regarding IEnumerable:
Does IEnumerable always imply a collection? Or is it legitimate/viable/okay/whatever to use on a single object?
The IEnumerable and IEnumerable<T> interfaces suggest a sequence of some kind, but that sequence doesn't need to be a concrete collection.
For example, where's the underlying concrete collection in this case?
foreach (int i in new EndlessRandomSequence().Take(5))
{
    Console.WriteLine(i);
}
// ...
public class EndlessRandomSequence : IEnumerable<int>
{
    public IEnumerator<int> GetEnumerator()
    {
        var rng = new Random();
        while (true) yield return rng.Next();
    }
    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
An IEnumerable is always implemented by a single object: that object is the holder or producer of zero or more other objects, which do not necessarily have any relation to IEnumerable themselves.
It's usual, but not mandatory, that IEnumerable represents a collection.
Enumerables can be collections, as well as generators, queries, and even computations.
Generator:
IEnumerable<int> Generate(
    int initial,
    Func<int, bool> condition,
    Func<int, int> iterator)
{
    var i = initial;
    while (true)
    {
        yield return i;
        i = iterator(i);
        if (!condition(i))
        {
            yield break;
        }
    }
}
Query:
IEnumerable<Process> GetProcessesWhereNameContains(string text)
{
    // Could be web-service or database call too
    var processes = System.Diagnostics.Process.GetProcesses();
    foreach (var process in processes)
    {
        if (process.ProcessName.Contains(text))
        {
            yield return process;
        }
    }
}
Computation:
IEnumerable<double> Average(IEnumerable<double> values)
{
    var sum = 0.0;
    var count = 0;
    foreach (var value in values)
    {
        sum += value;
        yield return sum/++count;
    }
}
LINQ is itself a series of operators that produce objects that implement IEnumerable<T> that don't have any underlying collections.
Good question, BTW!
NB: Any reference to IEnumerable also applies to IEnumerable<T> as the latter inherits the former.
Yes, IEnumerable implies a collection, or possible collection, of items.
The name is derived from enumerate, which means to:
Mention (a number of things) one by one.
Establish the number of.
According to the docs, it exposes the enumerator over a collection.
You can certainly use it on a single object, but this object will then just be exposed as an enumeration containing a single object, i.e. you could have an IEnumerable<int> with a single integer:
IEnumerable<int> items = new[] { 42 };
IEnumerable represents a collection that can be enumerated, not a single item. Look at MSDN; the interface exposes GetEnumerator(), which
...[r]eturns an enumerator that iterates through a collection.
Yes, IEnumerable always implies a collection, that is what enumerate means.
What is your use case for a single object?
I don't see a problem with using it on a single object, but why do you want to do this?
I'm not sure whether you mean a "collection" or a .NET "ICollection" but since other people have only mentioned the former I will mention the latter.
http://msdn.microsoft.com/en-us/library/92t2ye13.aspx
By that definition, all ICollections are IEnumerable, but not the other way around.
But most data structures (even Array) just implement both interfaces.
Going on this train of thought: you could have a car depot (a single object) that does not expose an internal data structure, and put IEnumerable on it. I suppose.

How to create an extension method to handle bindinglist.removeall with predicate input

myGenericList.RemoveAll(x => (x.StudentName == "bad student"));
This works great, but a BindingList does not have this method. How can I create an extension method for BindingList that takes a predicate as input and does the magic, like the built-in RemoveAll for List<T>?
Thank you.
Like I said in a comment, there is no magic in extension methods, just write the code the same way as if you wrote it normally, just put it in a static method in a static class and use the this keyword:
public static void RemoveAll<T>(this BindingList<T> list, Func<T, bool> predicate)
{
    foreach (var item in list.Where(predicate).ToArray())
        list.Remove(item);
}
You have to use ToArray() (or ToList()), because Where() is lazy and only enumerates the collection when needed, and you can't enumerate a collection while it is being changed.
However, this solution is quite slow (O(N²)), because every Remove() has to look through the collection to find the correct item to remove. We can do better:
public static void FastRemoveAll<T>(this BindingList<T> list, Func<T, bool> predicate)
{
    for (int i = list.Count - 1; i >= 0; i--)
        if (predicate(list[i]))
            list.RemoveAt(i);
}
This uses the fact that we can get to i-th item in constant time, so the whole method is O(N). The iteration is easier to write backwards, so that indexes of items we have yet to consider aren't changing.
EDIT: Actually the second solution is still O(N²), because every RemoveAt() has to move all the items after the one that was removed.
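If the quadratic cost matters, one way to get a linear pass is to collect the items to keep, clear the list, and refill it. The sketch below is only an illustration under that assumption; RemoveAllLinear is a made-up name. It uses the existing BindingList<T> members RaiseListChangedEvents and ResetBindings() so that listeners get a single reset notification instead of one event per item. That is a behavioural difference from Remove(), which reports each deletion individually, so it may not suit every binding scenario.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Linq;

public static class BindingListRemoveSketch
{
    // Rebuilds the list from the items that do NOT match the predicate.
    // Returns the number of removed items.
    public static int RemoveAllLinear<T>(this BindingList<T> list, Func<T, bool> predicate)
    {
        List<T> kept = list.Where(item => !predicate(item)).ToList();
        int removed = list.Count - kept.Count;
        if (removed == 0) return 0;

        bool raiseEvents = list.RaiseListChangedEvents;
        list.RaiseListChangedEvents = false; // suppress per-item notifications
        try
        {
            list.Clear();
            foreach (T item in kept) list.Add(item);
        }
        finally
        {
            list.RaiseListChangedEvents = raiseEvents;
        }
        list.ResetBindings(); // one Reset notification for listeners
        return removed;
    }

    public static void Main()
    {
        var students = new BindingList<string> { "good", "bad student", "ok", "bad student" };
        int removed = students.RemoveAllLinear(name => name == "bad student");
        Console.WriteLine(removed + " removed, " + students.Count + " left");
    }
}
```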
I'd say:
public static class BindingListExtensions
{
    public static void RemoveAll<T>(this BindingList<T> list, Func<T, bool> predicate)
    {
        // first check predicates -- uses System.Linq
        // could collapse into the foreach, but still must use
        // ToList() or ToArray() to avoid deferred execution
        var toRemove = list.Where(predicate).ToList();
        // then loop and remove after
        foreach (var item in toRemove)
        {
            list.Remove(item);
        }
    }
}
And for those interested in the minutiae, it seems ToList() and ToArray() are so close to the same performance (and in fact each can be faster depending on the circumstances) that the difference is negligible: I need to iterate and count. What is fastest or preferred: ToArray() or ToList()?

How can I add an item to a IEnumerable<T> collection?

My question is as in the title above. For example:
IEnumerable<T> items = new T[]{new T("msg")};
items.ToList().Add(new T("msg2"));
but after all it only has 1 item inside. Can we have a method like items.Add(item) like the List<T>?
You cannot, because IEnumerable<T> does not necessarily represent a collection to which items can be added. In fact, it does not necessarily represent a collection at all! For example:
IEnumerable<string> ReadLines()
{
    string s;
    do
    {
        s = Console.ReadLine();
        yield return s;
    } while (!string.IsNullOrEmpty(s));
}
IEnumerable<string> lines = ReadLines();
lines.Add("foo") // so what is this supposed to do??
What you can do, however, is create a new IEnumerable object (of unspecified type), which, when enumerated, will provide all items of the old one, plus some of your own. You use Enumerable.Concat for that:
items = items.Concat(new[] { "foo" });
This will not change the array object (you cannot insert items into arrays anyway). But it will create a new object that will list all items in the array, and then "foo". Furthermore, that new object will keep track of changes in the array (i.e. whenever you enumerate it, you'll see the current values of items).
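The "keeps track of changes" behaviour follows from Concat being lazily evaluated. A short sketch, with invented class and variable names, shows both halves of the claim: the array itself stays the same length, and a change made to it after the Concat call still shows up when the result is enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConcatSketch
{
    public static List<string> ChangeAfterConcat(out int arrayLength)
    {
        var array = new[] { "a", "b" };
        IEnumerable<string> items = array.Concat(new[] { "foo" });

        // Modify the array AFTER building the concatenated sequence.
        array[0] = "changed";
        arrayLength = array.Length; // the array itself did not grow

        // Enumeration happens here, so it sees the current array contents.
        return items.ToList();
    }

    public static void Main()
    {
        var result = ChangeAfterConcat(out int length);
        Console.WriteLine(length + ": " + string.Join(", ", result));
    }
}
```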
The type IEnumerable<T> does not support such operations. The purpose of the IEnumerable<T> interface is to allow a consumer to view the contents of a collection. Not to modify the values.
When you do operations like .ToList().Add() you are creating a new List<T> and adding a value to that list. It has no connection to the original list.
What you can do is use the Add extension method to create a new IEnumerable<T> with the added value.
items = items.Add("msg2");
Even in this case it won't modify the original IEnumerable<T> object. This can be verified by holding a reference to it. For example
var items = new string[]{"foo"};
var temp = items;
items = items.Add("bar");
After this set of operations the variable temp will still only reference an enumerable with a single element "foo" in the set of values while items will reference a different enumerable with values "foo" and "bar".
EDIT
I constantly forget that Add is not a typical extension method on IEnumerable<T> because it's one of the first ones that I end up defining. Here it is:
public static IEnumerable<T> Add<T>(this IEnumerable<T> e, T value) {
    foreach ( var cur in e) {
        yield return cur;
    }
    yield return value;
}
Have you considered using the ICollection<T> or IList<T> interfaces instead? They exist for the very reason that you want to have an Add method on an IEnumerable<T>.
IEnumerable<T> is used to 'mark' a type as being...well, enumerable, or just a sequence of items, without necessarily making any guarantees about whether the real underlying object supports adding/removing of items. Also remember that these interfaces inherit from IEnumerable<T>, so you get all the extension methods that you get with IEnumerable<T> as well.
In .NET Core, there is a method Enumerable.Append that does exactly that.
The source code of the method is available on GitHub. The implementation (more sophisticated than the suggestions in other answers) is worth a look :).
A couple short, sweet extension methods on IEnumerable and IEnumerable<T> do it for me:
public static IEnumerable Append(this IEnumerable first, params object[] second)
{
    return first.OfType<object>().Concat(second);
}
public static IEnumerable<T> Append<T>(this IEnumerable<T> first, params T[] second)
{
    return first.Concat(second);
}
public static IEnumerable Prepend(this IEnumerable first, params object[] second)
{
    return second.Concat(first.OfType<object>());
}
public static IEnumerable<T> Prepend<T>(this IEnumerable<T> first, params T[] second)
{
    return second.Concat(first);
}
Elegant (well, except for the non-generic versions). Too bad these methods are not in the BCL.
No, the IEnumerable doesn't support adding items to it. The alternative solution is
var myList = new List<T>(items);
myList.Add(otherItem);
To add second message you need to -
IEnumerable<T> items = new T[]{new T("msg")};
items = items.Concat(new[] {new T("msg2")});
I just came here to say that, aside from the Enumerable.Concat extension method, there seems to be another method named Enumerable.Append in .NET Core 1.1.1. The latter allows you to append a single item to an existing sequence. So Aamol's answer can also be written as
IEnumerable<T> items = new T[]{new T("msg")};
items = items.Append(new T("msg2"));
Still, please note that this function will not change the input sequence; it just returns a wrapper that puts the given sequence and the appended item together.
Not only can you not add items like you state, but if you add an item to a List<T> (or pretty much any other non-read only collection) that you have an existing enumerator for, the enumerator is invalidated (throws InvalidOperationException from then on).
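The invalidation described above is easy to reproduce. The following sketch (class and method names are illustrative only) adds to a List<T> from inside a foreach over that same list and reports whether InvalidOperationException was thrown.

```csharp
using System;
using System.Collections.Generic;

public static class EnumeratorInvalidationSketch
{
    // Returns true if modifying the list during enumeration threw.
    public static bool AddWhileEnumeratingThrows()
    {
        var list = new List<int> { 1, 2, 3 };
        try
        {
            foreach (var value in list)
            {
                // Changing the list invalidates the enumerator that
                // foreach is using; the next MoveNext() call throws.
                list.Add(value * 10);
            }
        }
        catch (InvalidOperationException)
        {
            return true;
        }
        return false;
    }

    public static void Main()
    {
        Console.WriteLine(AddWhileEnumeratingThrows());
    }
}
```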
If you are aggregating results from some type of data query, you can use the Concat extension method:
Edit: I originally used the Union extension in the example, which is not really correct. My application uses it extensively to make sure overlapping queries don't duplicate results.
IEnumerable<T> itemsA = ...;
IEnumerable<T> itemsB = ...;
IEnumerable<T> itemsC = ...;
return itemsA.Concat(itemsB).Concat(itemsC);
Others have already given great explanations regarding why you cannot (and should not!) be able to add items to an IEnumerable. I will only add that if you are looking to continue coding to an interface that represents a collection and want an Add method, you should code to ICollection or IList. As an added bonus, these interfaces inherit from IEnumerable.
You can do this:
// Create the IEnumerable
IEnumerable<T> items = new T[]{new T("msg")};
// Convert to a list
List<T> list = items.ToList();
// Add the new item to the list
list.Add(new T("msg2"));
// Assign the list back; List<T> converts to IEnumerable<T> implicitly
items = list;
The easiest way to do that is simply:
IEnumerable<T> items = new T[]{new T("msg")};
List<string> itemsList = new List<string>();
itemsList.AddRange(items.Select(y => y.ToString()));
itemsList.Add("msg2");
Then you can also return the list as an IEnumerable, because it implements the IEnumerable interface.
Instances implementing IEnumerable and IEnumerator (returned from IEnumerable) don't have any APIs that allow altering the collection; the interfaces only give read-only APIs.
There are two ways to actually alter the collection:
If the instance happens to be some collection with a write API (e.g. List), you can try casting to this type:
IList<string> list = enumerableInstance as IList<string>;
Create a list from the IEnumerable (e.g. via the LINQ extension method ToList()):
var list = enumerableInstance.ToList();
IEnumerable<T> items = Enumerable.Empty<T>();
List<T> someValues = new List<T>();
// ToList() returns a new list each time, so keep a reference to it
List<T> list = items.ToList();
list.AddRange(someValues);
items = list;
Sorry for reviving a really old question, but as it is listed among the first Google search results, I assume that some people keep landing here.
Among the many answers, some of them really valuable and well explained, I would like to add a different point of view since, to me, the problem has not been well identified.
You are declaring a variable which stores data, and you need to be able to change it by adding items to it? Then you shouldn't declare it as IEnumerable.
As proposed by #NightOwl888:
For this example, just declare IList<T> instead of IEnumerable<T>, backed by a List<T> (an array is fixed-size, so calling Add on it throws NotSupportedException): IList<T> items = new List<T> { new T("msg") }; items.Add(new T("msg2"));
Trying to bypass the declared interface limitations only shows that you made the wrong choice.
Beyond this, all the proposed methods that reimplement things that already exist elsewhere should be disregarded.
Classes and interfaces that let you add items already exist. Why always recreate things that are already done elsewhere?
This kind of consideration is the whole point of abstracting a variable's capabilities behind interfaces.
TL;DR: IMO these are the cleanest ways to do what you need:
// 1st choice: changing the declaration (backed by a List<T>, because an
// array is fixed-size and its Add throws NotSupportedException)
IList<T> variable = new List<T>();
variable.Add(new T());
// 2nd choice: changing the instantiation, letting the framework take care of the declaration
var variable = new List<T> { };
variable.Add(new T());
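One caveat worth checking when declaring a variable as IList<T>: an array also implements IList<T>, but it is fixed-size, so Add compiles and then fails at run time. The sketch below (names invented for illustration) contrasts an array-backed and a List<T>-backed IList<string>.

```csharp
using System;
using System.Collections.Generic;

public static class IListBackingSketch
{
    // An array satisfies IList<T>, but Add is not supported on it.
    public static bool ArrayAddThrows()
    {
        IList<string> items = new string[] { "msg" };
        try
        {
            items.Add("msg2");
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
    }

    // A List<T> behind the same interface grows as expected.
    public static int ListAddCount()
    {
        IList<string> items = new List<string> { "msg" };
        items.Add("msg2");
        return items.Count;
    }

    public static void Main()
    {
        Console.WriteLine(ArrayAddThrows());
        Console.WriteLine(ListAddCount());
    }
}
```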
When you need to use variable as an IEnumerable, you'll be able to. When you need to use it as an array, you'll be able to call ToArray(). It really should always be that simple: no extension method needed, casts only when really needed, the ability to use LINQ on your variable, etc.
Stop doing weird and/or complex things just because you made a mistake when declaring/instantiating.
Maybe I'm too late but I hope it helps anyone in the future.
You can use the Insert method of List<T> to add an item at a specific index:
list.Insert(0, item);
Sure, you can (I am leaving your T-business aside):
public IEnumerable<string> tryAdd(IEnumerable<string> items)
{
    List<string> list = items.ToList();
    string obj = "";
    list.Add(obj);
    return list.Select(i => i);
}
