LINQ, Lists and Add items question - c#

I have the following which is an array
package.Resources
If I use
package.Resources.ToList().Add(resource);
package.Resources doesn't actually contain the new item.
I have to use
var packageList = package.Resources.ToList();
packageList.Add(resource);
package.Resources = packageList.ToArray();
Why is that?

ToList() creates a completely new, different list based on the original array.
LINQ is read-only. The name stands for Language INtegrated Query: it only queries the data, it does not modify it. All LINQ methods produce a projection, i.e., they project the original sequence into a new one, so you're always working against that new sequence.

When you call package.Resources.ToList().Add(resource);, it's functionally the same as doing this:
var resourcesAsList = new List<WhateverTypeResourcesContains>();
foreach (var item in package.Resources)
{
    resourcesAsList.Add(item);
}
resourcesAsList.Add(resource);
What this means is that package.Resources hasn't been modified; you've created a List<T> that contains each item that package.Resources contained, and added the new item to that list only.
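A minimal sketch of the behaviour described above. The `resources` array is a hypothetical stand-in for `package.Resources` (its real element type isn't given in the question), so treat the names as assumptions:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for package.Resources: a plain string array.
string[] resources = { "a.png", "b.png" };

// ToList() copies the elements into a brand-new List<string>;
// the Add call changes only that temporary list, which is then discarded.
resources.ToList().Add("c.png");
int lengthAfterDiscardedAdd = resources.Length;

// Keep the list, add to it, then assign a new array back.
List<string> list = resources.ToList();
list.Add("c.png");
resources = list.ToArray();

Console.WriteLine($"{lengthAfterDiscardedAdd} -> {resources.Length}"); // prints "2 -> 3"
```

The array only "grows" in the second half because a different, larger array is assigned to the variable.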

Expanding upon what Rex said, when you call .ToList() on an array object, what you are saying is, "Make me a brand new List object that's a distinct reference from the original array, but use the objects from my original array as the objects in my list."
Because arrays are fixed in size, there's really no way to Add() an item to one without copying the existing array to a new array that has an increased capacity (even ReDim Preserve in the VB world copies the array behind the scenes).
Because of this, you should ask yourself why you are using an array if you don't know the total number of items in the array when you instantiate it. When you know the number of items or when you are instantiating the array from an existing List or IEnumerable, arrays can be a decent light-weight way of holding a collection of objects. However, when you don't know the capacity until runtime and the source is not an IEnumerable, a List might be a better option because it is built to grow in capacity as items are added.
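To illustrate the point about copying, here is a small sketch using `Array.Resize` from the base class library. The variable names are made up for the example:

```csharp
using System;

int[] numbers = { 1, 2, 3 };
int[] sameArray = numbers;          // second reference to the original array

// Array.Resize allocates a new array, copies the elements over,
// and points the 'numbers' variable at the new array.
Array.Resize(ref numbers, numbers.Length + 1);
numbers[3] = 4;

bool stillSameObject = object.ReferenceEquals(numbers, sameArray);
Console.WriteLine($"{sameArray.Length} vs {numbers.Length}, same object: {stillSameObject}");
// prints "3 vs 4, same object: False"
```

The original three-element array still exists, untouched, for anyone who holds a reference to it.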
I use the following collection types as properties in my classes:
IEnumerable
I use IEnumerable when I'm going to be iterating over a collection that already exists and I won't be adding to it. For example, if I query a directory for its files and I want to have those files as a property in a class, I might use public IEnumerable<FileInfo> DirectoryFiles {get; set;}.
IEnumerable allows me to query over an existing set of data, but I don't necessarily care what collection it is, as long as it implements IEnumerable.
List<T>
I use a List<T> when I need a collection that could grow or shrink dynamically, i.e. the collection doesn't already exist somewhere else and I may need to add or remove items from it.
A good example of this might be if you are allowing a user to add and remove items to a list box in the user interface of your application. Since the items in the list box might grow or shrink, a List<T> makes sense.
ObservableCollection<T>
I use this type of collection in the Silverlight/WPF world because it has built-in events for when the number of items in the collection changes. This is especially handy for data-binding scenarios when you want a UI element to automatically update when the list changes.
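As a rough sketch of the built-in change notification, outside of any UI framework: the handler below simply records each change, where a data-bound control would refresh itself instead.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Collections.Specialized;

var items = new ObservableCollection<string>();
var log = new List<NotifyCollectionChangedAction>();

// A UI binding would subscribe here; this sketch just records each change.
items.CollectionChanged += (sender, e) => log.Add(e.Action);

items.Add("first");      // raises CollectionChanged with Action = Add
items.Add("second");
items.Remove("first");   // raises CollectionChanged with Action = Remove

Console.WriteLine(string.Join(", ", log)); // prints "Add, Add, Remove"
```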
I almost never use arrays explicitly unless I'm consuming an object that already has arrays. Granted, they're a nice lightweight way to store a collection, but I can usually get by with the 3 types I listed above.

The property Resources is returning an IEnumerable. Enumerable.ToList() builds a new list from this IEnumerable. You are then adding an item to that new list, not to the original collection that Resources is accessing, hence your update has no effect.

Related

When will one prefer array, LinkedList or ArrayList over List<T>?

Is there any point in using those data types other than in legacy code? Other data types like Dictionary or Graph are understandably used because they provide extra / different functionality. But array, LinkedList and ArrayList have less functionality and sometimes worse performance than List (ArrayList is less memory efficient with value types).
Then why use them at all?
Note: this is not an opinion - based question. All I want to know is use cases for these types
Another Note: I know about Linked list's O(1) insert time. I am asking when should it be utilized over the standard List, which has O(1) access time?
When it is better to use? (and the question about ArrayList and array remains)
ArrayList? sure: don't use it, basically ever (unless you don't want to migrate some legacy code, or can't because somebody has unwisely used BinaryFormatter).
LinkedList<T>, however, is not in the same category - it is niche, but it has uses due to cheap insertion/removal/etc, unlike List<T> which would need to move data around to perform insertion/removal. In most scenarios, you probably don't need that feature, so: don't use it unless you do?
LinkedList
Here is a list of differentiators from the List implementation. You use it when the items in the list need to maintain a specific order (hence the next and previous references).
Represents a doubly linked list.
LinkedList<T> provides separate nodes of type LinkedListNode<T>, so insert and removal are O(1) operations.
You can remove nodes and reinsert them, either in the same list or in another list, which results in no additional objects allocated on the heap. Because the list also maintains an internal count, getting the Count property is an O(1) operation.
Each node in a LinkedList<T> object is of the type LinkedListNode<T>. Because the LinkedList<T> is doubly linked, each node points forward to the Next node and backward to the Previous node.
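A short sketch of the node-moving behaviour described above. The list contents are invented for the example. Note that `Find` itself is a linear search; it is the removal and re-insertion of a node you already hold that is O(1).

```csharp
using System;
using System.Collections.Generic;

var queue = new LinkedList<string>(new[] { "a", "b", "c" });
var done = new LinkedList<string>();

// Find returns the node itself (or null if the value is absent).
var node = queue.Find("b");
if (node is null) throw new InvalidOperationException("value not found");

// Removing a known node only relinks its neighbours; nothing is shifted.
queue.Remove(node);

// The same node object can be attached to another list without a new allocation.
done.AddLast(node);

Console.WriteLine(string.Join(",", queue)); // prints "a,c"
Console.WriteLine(string.Join(",", done));  // prints "b"
```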
List Vs ArrayList
ArrayList is a deprecated implementation used in the past. Prefer List<T> generic implementation in any new code.
As a generic collection, List<T> implements the generic IEnumerable<T> interface and can be used easily in LINQ
ArrayList belongs to the days that C# didn't have generics. It's deprecated in favor of List<T>. You shouldn't use ArrayList in new code that targets .NET >= 2.0 unless you have to interface with an old API that uses it.
Array vs List
Array is a fixed-size collection and it supports multiple dimensions. It is the most efficient of the three for simple indexed reads and writes and for iteration, but it cannot grow or shrink once created.

IEnumerable to List

Why doesn't IEnumerable.ToList() work in a case like this:
var _listReleases = new List<string>();
_listReleases.Add("C#");
_listReleases.Add("Javascript");
_listReleases.Add("Python");
IEnumerable sortedItems = _listReleases.OrderBy(x => x);
_listReleases.Clear();
_listReleases.AddRange(sortedItems); // won't work
_listReleases.AddRange(sortedItems.ToList()); // won't work
Note: _listReleases will be null
It doesn't work because of this line:
_listReleases.Clear();
First of all, _listReleases is not null at this point. It's merely empty, which is a completely different thing.
But to explain why this doesn't work as you expect: the IEnumerable interface type does not actually allocate or reserve storage for anything. It represents an object that you can use with a foreach loop, and nothing more. It does not actually need to store the items in the collection itself.
Sometimes, an IEnumerable reference does have those items in the same object, but it doesn't have to. That's what's going on here. The OrderBy() extension method only creates an object that knows how to look at the original list and return the items in a specific order. But this object does not have storage for those items. It still depends on its original data source.
The best solution for this situation is to not call Clear() on _listReleases at all, and from this point on just use the sortedItems variable. As long as the former is left intact and not garbage collected, the latter will do what you need. But if you really want the _listReleases variable, you can do it like this:
_listReleases = sortedItems.ToList();
Now back to IEnumerables. There are some nice benefits to this property of not requiring immediate storage of the items themselves, and merely abstracting the ability to iterate over a collection:
Lazy Evaluation - The work required to produce those items is not done until called for (and often, that means it won't need to be done at all, greatly improving performance).
Composition - An IEnumerable object can be modified during a program to incorporate new sets of rules or operations into the final result. This reduces program complexity and improves maintainability by allowing you to break apart a complex set of sorting or filtering requirements into its component parts. This also makes it much easier to build a program where these rules can be easily determined by the user at run time, instead of in advance by the programmer at compile time.
Memory Efficiency - An IEnumerable makes it possible to iterate collections of data from sources such as a database in ways that only need to keep the current record loaded into memory at any given time. This feature can also be used to create unbounded collections: sets of items that may stretch on to infinity. You can build an IEnumerable with the BigInteger type to calculate the next prime on to infinity, if asked for. Moreover, you could use that collection in a useful way without crashing or hanging the program by combining this with the composition feature, so the program will know when to stop.
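The three properties above can be sketched together with an unbounded prime sequence. This is only an illustration: the `Primes` helper is hypothetical, and trial division is used for brevity, not because it is a good primality test.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Numerics;

// Unbounded sequence: each prime is computed only when the consumer asks for it.
static IEnumerable<BigInteger> Primes()
{
    for (BigInteger candidate = 2; ; candidate++)
    {
        bool isPrime = true;
        for (BigInteger d = 2; d * d <= candidate; d++)
        {
            if (candidate % d == 0) { isPrime = false; break; }
        }
        if (isPrime) yield return candidate;
    }
}

// Composition: Where and Take are layered on without running anything yet.
IEnumerable<BigInteger> query = Primes().Where(p => p % 10 == 3).Take(4);

// Enumeration happens here and stops after the fourth match,
// which is what keeps the infinite source from hanging the program.
List<BigInteger> result = query.ToList();
Console.WriteLine(string.Join(", ", result)); // prints "3, 13, 23, 43"
```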
LINQ is lazily evaluated. When you run this line:
IEnumerable sortedItems = _listReleases.OrderBy(x => x);
You aren't actually ordering the items right then and there. Instead you're building an enumerable that will, when enumerated, return the objects that are currently in _listReleases in order. So when you Clear() the list, it no longer has any items to order.
You need to force it to evaluate before you clear _listReleases. An easy way to do this is to add a ToList() call. Also, the non-generic type IEnumerable isn't compatible with AddRange, which expects an IEnumerable<string> and won't accept it. You can just use var to implicitly type the result as List<string>, which will work because List<T> : IEnumerable<T> (it implements the interface).
var sortedItems = _listReleases.OrderBy(x => x).ToList();
_listReleases.Clear();
_listReleases.AddRange(sortedItems);
You should also note that methods like ToList() are extension methods for IEnumerable<T>, not IEnumerable, so ((IEnumerable)something).ToList() won't work. Unlike, say, Java, Something<T> and Something are completely distinct types in C#.
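The difference between the deferred query and the materialized list can be checked with a small sketch like this one (the variable names are chosen for the example):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var releases = new List<string> { "Python", "C#", "Javascript" };

// Deferred: nothing is sorted yet; 'lazy' only remembers its source list.
IEnumerable<string> lazy = releases.OrderBy(x => x);

// Materialized: ToList() runs the sort now and stores the results.
List<string> snapshot = releases.OrderBy(x => x).ToList();

releases.Clear();

int lazyCount = lazy.Count();          // enumerates the now-empty source
int snapshotCount = snapshot.Count;    // unaffected by Clear()

releases.AddRange(snapshot);
Console.WriteLine($"{lazyCount} {snapshotCount} {string.Join(",", releases)}");
```

After `Clear()`, the deferred query sees zero items, while the snapshot still holds all three and can be added back.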

Do extension methods like `ToArray` and `ToList` operate by reference or by value?

Let's say I have a private dictionary or a list in my class. I want to return a readonly enumerator so that others can iterate over the list, but not have access to modify the items.
Instead of creating a wrapper class around the original, I'd like to return copies of the original items/elements. Will something like original.ToList<Type>().GetEnumerator() return a list with references to the original items, or a list with copies of the original items?
I should note that I also need indexing (i.e. accessing items by index, still not being able to modify them).
The methods create a new instance of the collection, but the item references will still be to the old items. In other words, a consumer could not update your internal collection, but they could update the items themselves.
Assuming you have appropriate encapsulation around modifying the items, this approach will work, though it is a little memory-intensive for larger lists, since you need to allocate memory for each new item reference. That's one reason why returning a wrapper is often preferred: it reduces the extra memory required to a single instance of the wrapper class.
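For the wrapper approach, one option in the base class library is `List<T>.AsReadOnly()`, which returns a `ReadOnlyCollection<T>`. A minimal sketch, with made-up contents:

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

var inner = new List<string> { "a", "b" };

// AsReadOnly wraps the list; no elements are copied.
ReadOnlyCollection<string> view = inner.AsReadOnly();

string second = view[1];            // indexing works through the wrapper
inner.Add("c");                     // the owner can still change the list
int seenThroughView = view.Count;   // the wrapper reflects that change

// The wrapper exposes no public Add; going through IList<T> throws.
bool addRejected = false;
try { ((IList<string>)view).Add("d"); }
catch (NotSupportedException) { addRejected = true; }

Console.WriteLine($"{second} {seenThroughView} {addRejected}"); // prints "b 3 True"
```

As with the copying approach, the wrapper protects the collection, not the items: a consumer can still mutate a mutable element it reads from the view.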
ToList() and ToArray() both work by value. They simply copy the values from the original IEnumerable<> to the new container.
But the values being copied might very well be references.
The straight call to
original.ToList<MyType>()
will create a "shallow" copy: the list will be new, but the objects will be the same.
If you would prefer a "deep" copy, you can use LINQ to duplicate your items before adding them to the list:
original.Select(item => new MyType(item)).ToList()
Assuming that your class has a constructor that takes an instance of itself and produces a copy, similar to a copy constructor of C++, this would produce a list of copies of your objects.
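A sketch of the shallow/deep distinction. To keep it self-contained, `int[]` stands in for a mutable reference type such as `MyType`, and `Clone()` plays the role of the copy constructor; both substitutions are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// int[] stands in for a mutable reference type such as MyType.
var original = new List<int[]> { new[] { 1 }, new[] { 2 } };

// Shallow copy: a new list holding the same array objects.
List<int[]> shallow = original.ToList();

// Deep copy: each element is duplicated before it goes into the new list.
List<int[]> deep = original.Select(a => (int[])a.Clone()).ToList();

shallow[0][0] = 99;         // visible through 'original', because the element is shared
deep[1][0] = -1;            // not visible through 'original'
shallow.Add(new[] { 3 });   // the list itself is separate, so 'original' keeps 2 items

Console.WriteLine($"{original[0][0]} {original[1][0]} {original.Count}"); // prints "99 2 2"
```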

Why doesn't IEnumerable<T> implement Add(T)?

I just found out by chance that Add(T) is defined in ICollection<T> instead of IEnumerable<T>. And the extension methods in Enumerable.cs don't contain Add(T), which I think is really weird. Since an object is enumerable, it must "look like" a collection of items. Can anyone tell me why?
An IEnumerable<T> is just a sequence of elements; see it as a forward only cursor. Because a lot of those sequences are generating values, streams of data, or record sets from a database, it makes no sense to Add items to them.
IEnumerable is for reading, not for writing.
An enumerable is exactly that - something you can enumerate over and discover all the items. It does not imply that you can add to it.
Being able to enumerate is universal to many types of objects. For example, it is shared by arrays and collections. But you can't 'add' to an array without messing about with its structure - whereas a Collection is specifically built to be added to and removed from.
Technically you can 'add' to an enumerable, however, by using Concat<>. All this does is create an enumerator that enumerates from one enumerable to the next, giving the illusion of a single contiguous set.
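A small sketch of the Concat approach, reusing the colour names from the example further down this thread:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

string[] primary = { "Red", "Blue", "Green" };

// Concat does not touch 'primary'; it returns a new sequence that walks
// the first source and then the second.
IEnumerable<string> extended = primary.Concat(new[] { "Bright Purple" });

Console.WriteLine(primary.Length);               // prints "3"
Console.WriteLine(string.Join(", ", extended));  // prints "Red, Blue, Green, Bright Purple"
```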
Each ICollection should be IEnumerable (I think, and the .NET Framework team seems to agree with me ;-)), but the other way around does not always make sense. There is a hierarchy of "collection like objects" in this world, and your assumption that an enumerable would be a collection you can add items to does not hold true in that hierarchy.
Example: a list of primary color names would be an IEnumerable returning "Red", "Blue" and "Green". It would make no logical sense at all to be able to do a primaryColors.Add("Bright Purple") on a "collection" filled like this:
...whatever...
{
...
var primaryColors = EnumeratePrimaryColors();
...
}
private static IEnumerable<string> EnumeratePrimaryColors() {
yield return "Red";
yield return "Blue";
yield return "Green";
}
As its name says, you can enumerate (loop) over an IEnumerable, and that's about it.
When you want to be able to Add something to it, it wouldn't be just an enumerable anymore, since it has extra features.
For instance, an array is an IEnumerable, but an array has a fixed length, so you can't add new items to it.
IEnumerable is just the 'base' for all kind of collections (even readonly collections - which have obviously no Add() method).
The more functionality you'd add to such 'base interface', the more specific it would be.
The name says it all. IEnumerable is for enumerating items only. ICollection is the actual collection of items and thus supports the Add method.
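One wrinkle worth seeing in code: arrays also implement ICollection<T>, but because they are fixed in size, calling Add through the interface fails at run time. A minimal sketch:

```csharp
using System;
using System.Collections.Generic;

// A List<T> is an ICollection<T> that supports Add.
ICollection<int> list = new List<int> { 1, 2 };
list.Add(3);

// An array also implements ICollection<T>, but it is fixed-size,
// so the interface's Add throws at run time.
ICollection<int> array = new[] { 1, 2 };
bool arrayAddFailed = false;
try { array.Add(3); }
catch (NotSupportedException) { arrayAddFailed = true; }

Console.WriteLine($"{list.Count} {array.Count} {arrayAddFailed}"); // prints "3 2 True"
```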

What is the difference between IEnumerable and arrays?

Can anyone describe IEnumerable, explain the difference between IEnumerable and an array,
and say where and how to use it?
An array is a collection of objects with a set size.
int[] array = { 0, 1, 2 };
This makes it very useful in situations where you may want to access an item in a particular spot in the collection since the location in memory of each element is already known
array[1];
Also, the size of the array can be calculated quickly.
IEnumerable, on the other hand, basically says that given a start position it is possible to get the next value. One example of this may be an infinite series of numbers:
public IEnumerable<int> Infinite()
{
int i = 0;
while(true)
yield return i++;
}
Unlike an array, an enumerable collection can be any size, and it is possible to create the elements as they are required rather than upfront. This allows for powerful constructs and is used extensively by LINQ to facilitate complex queries.
//This line won't do anything until you actually enumerate the created variable
IEnumerable<int> firstTenOddNumbers = Infinite().Where(x => x % 2 == 1).Take(10);
However the only way to get a specific element is to start at the beginning and enumerate through to the one you want. This will be considerably more expensive than getting the element from a pre-generated array.
Of course you can enumerate through an array, so an array implements the IEnumerable interface.
.NET has its IEnumerable interface misnamed - it should be IIterable. Basically a System.Collections.IEnumerable or (since generics) System.Collections.Generic.IEnumerable<T> allows you to use foreach on the object implementing these interfaces.
(Side note: actually .NET is using duck typing for foreach, so you are not required to implement these interfaces - it's enough if you provide the suitable method implementations.)
An array (System.Array) is a type of a sequence (where by sequence I mean an iterable data structure, i.e. anything that implements IEnumerable), with some important differences.
For example, an IEnumerable can be - and is often - lazy-loaded. That means that until you explicitly iterate over it, the items won't be created. This can lead to strange behaviour if you're not aware of it.
As a consequence, an IEnumerable has no means of telling you how many items it contains until you actually iterate over it (which the Count extension method in System.Linq.Enumerable class does).
An array has a Length property, and with this we have arrived at the most important difference: an array is a sequence of a fixed (and known) number of items. It also provides an indexer, so you can conveniently access its items without actually iterating over it.
And just for the record, the "real" enumerations in .NET are types defined with the enum keyword. They allow you to express choices without using magic numbers or strings. They can also be used as flags, when marked with the FlagsAttribute.
I suggest you use your favourite search engine to get more details about these concepts - my brief summary clearly doesn't aim to provide a deep insight into these features.
An Array is a collection of data. It's implied that the items are stored contiguously, and are directly addressable.
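The contrast between Count() on a lazy sequence and Length on an array can be made visible with a counter. The `Numbers` generator below is a hypothetical helper written for this sketch:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int produced = 0;

// Lazy sequence: the counter shows how many items have actually been generated.
IEnumerable<int> Numbers()
{
    for (int i = 0; i < 5; i++)
    {
        produced++;
        yield return i;
    }
}

IEnumerable<int> lazy = Numbers();
int beforeCount = produced;        // nothing has run yet

int count = lazy.Count();          // walks the whole sequence to count it
int afterCount = produced;

int[] array = lazy.ToArray();      // enumerates again, then stores the items
int length = array.Length;         // read directly, no enumeration
int third = array[2];              // indexer access, no enumeration

Console.WriteLine($"{beforeCount} {count} {afterCount} {produced} {length} {third}");
// prints "0 5 5 10 5 2"
```

Each enumeration of the lazy sequence regenerates the items, which is why the counter reaches 10 after `Count()` and `ToArray()` have both run.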
IEnumerable is a description of a collection of data. They aren't collections themselves. Specifically, it means that the collection can be stepped through, one item at a time.
If you define a variable as type IEnumerable, then it can reference a collection of any type that fits that description.
Arrays are Enumerable. So are Lists, Dictionaries, Sets and other collection types. Also, things which don't appear to be collections can be Enumerable, such as a string (which is IEnumerable<char>), or the object returned by Enumerable.Range(), which generates a new item for each step without ever actually holding it anywhere.
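As a quick sketch of that point, the same interface type can refer to a string, an array, a list, and a generated range:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// One variable type, several very different sources.
IEnumerable<char> letters = "abc";                       // a string
IEnumerable<int> fromArray = new[] { 1, 2, 3 };          // an array
IEnumerable<int> fromList = new List<int> { 1, 2, 3 };   // a list
IEnumerable<int> generated = Enumerable.Range(1, 3);     // produced on demand

string reversed = string.Join("", letters.Reverse());
bool allEqual = fromArray.SequenceEqual(fromList) && fromList.SequenceEqual(generated);

Console.WriteLine($"{reversed} {allEqual}"); // prints "cba True"
```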
Arrays
A .NET array is a collection of multiple values stored consecutively in memory. Individual elements in an array can be randomly accessed by index (and doing that is quite efficient). Important members of an array are:
this[Int32 index] (indexing operator)
Length
C# has built-in support for arrays and they can be initialized directly from code:
var array = new[] { 1, 2, 3, 4 };
Arrays can also be multidimensional and implement several interfaces including IEnumerable<T> (where T is the element type of the array).
IEnumerable<T>
The IEnumerable<T> interface defines the method GetEnumerator() but that method is rarely used directly. Instead the foreach loop is used to iterate through the enumeration:
IEnumerable<T> enumerable = ...;
foreach (T element in enumerable)
...
If the enumeration is done over an array or a list, all the elements in the enumeration exist during the enumeration, but it is also possible to enumerate elements that are created on the fly. The yield return construct is very useful for this.
It is possible to create an array from an enumeration:
var array = enumerable.ToArray();
This will get all elements from the enumeration and store them consecutively in a single array.
To sum it up:
Arrays are collections of elements that can be randomly accessed by index
Enumerations are an abstraction over a collection of elements that can be accessed one after the other in a forward moving manner
One difference is that arrays allow random access to fixed-size content, whereas the IEnumerable interface provides the data sequentially: you pull items from the IEnumerable one at a time until the data source is exhausted.
