ASP.NET C# Lists Which and When? - c#

In C# There seem to be quite a few different lists. Off the top of my head I was able to come up with a couple, however I'm sure there are many more.
List<String> Types = new List<String>();
ArrayList Types2 = new ArrayList();
LinkedList<String> Types4 = new LinkedList<String>();
My question is when is it beneficial to use one over the other?
More specifically I am returning lists of unknown size from functions and I was wondering if there is a particular list that was better at this.

List<String> Types = new List<String>();
LinkedList<String> Types4 = new LinkedList<String>();
are generic lists, i.e. you define the data type that would go in there which decreased boxing and un-boxing.
for difference in list vs linklist, see this --> When should I use a List vs a LinkedList
ArrayList is a non-generic collection, which can be used to store any type of data type.

99% of the time List is what you'll want. Avoid the non-generic collections at all costs.
LinkedList is useful for adding or removing without shuffling items around, although you have to forego random access as a result. One advantage it does have is you can remove items whilst iterating through the nodes.

ArrayList is a holdover from before Generics. There's really no reason to use them ... they're slow and use more memory than List<>. In general, there's probably no reason to use LinkedList either unless you are inserting midway through VERY large lists.
The only thing you'll find in .NET faster than a List<> is a fixed array ... but the performance difference is surprisingly small.

See the article on Commonly Used Collection Types from MSDN for a list of the the various types of collections available to you, and their intended uses.

ArrayList is a .Net 1.0 list type.
List is a generic list introduced with generics in .Net 2.0.
Generic lists provide better compile time support. Generics lists are type safe. You cannot add objects of wrong type. Therefor you know which type the stored objects has. There are no typechecks and typecasts nessecary.
I dont know about performance differences.
This questions says something about the difference of List and LinkedList.

As mentioned, don't use ArrayList if at all possible.
Here's an bit on Wikipedia about the differences between arrays and linked lists.
In summary:
Arrays
Fast random access
Fast inserting/deleting at end
Good memory locality
Linked Lists
Fast inserting/deleting at beginning
Fast inserting/deleting at end
Fast inserting/deleting at middle (with enumerator)

Generally, use List. Don't use ArrayList; it's obsolete. Use LinkedList in the rare cases where you need to be able to add without resizing and don't mind the overhead and loss of random access.

ArrayList is probably smaller, memory-wise, since it is based on an array. It also has fast random-access to elements. However, adding or removing to the list will take longer. This might be sped up slightly if the object over-allocates under the assumption that you are going to keep adding. (That will, of course, reduce the memory advantage.)
The other lists will be slightly larger (4-to-8 bytes more memory per element), and will have poor random access times. However, it is very fast to add or remove objects to the ends of the list. Also, memory usage is usually spot-on for what you need.

Related

When will one prefer array, LinkedList or ArrayList over List<T>?

Is there any point of using those data types other then legacy code? Other data types like Dictionary or Graph are understandably used because they provide extra / different functionality. But array, LinkedList or ArrayList have less of a functionality and sometimes worst performance then List (ArrayList is less memory efficient in value types)
Then why use them at all?
Note: this is not an opinion - based question. All I want to know is use cases for these types
Another Note: I know about Linked list's O(1) insert time. I am asking when should it be utilized over the standard List, which has O(1) access time?
When it is better to use? (and the question about ArrayList and array remains)
ArrayList? sure: don't use it, basically ever (unless you don't want to migrate some legacy code, or can't because somebody has unwisely used BinaryFormatter).
LinkedList<T>, however, is not in the same category - it is niche, but it has uses due to cheap insertion/removal/etc, unlike List<T> which would need to move data around to perform insertion/removal. In most scenarios, you probably don't need that feature, so: don't use it unless you do?
LinkedList
Here is a list of differentiators from the List implementation. You use it when the items in the list need to maintain a specific order (hence the next and previous references).
Represents a doubly linked list.
LinkedList<T> provides separate nodes of type LinkedListNode<T>, so insert and removal are O(1) operations.
You can remove nodes and reinsert them, either in the same list or in another list, which results in no additional objects allocated on the heap. Because the list also maintains an internal count, getting the Count property is an O(1) operation.
Each node in a LinkedList<T> object is of the type LinkedListNode<T>. Because the LinkedList<T> is doubly linked, each node points forward to the Next node and backward to the Previous node.
List Vs ArrayList
ArrayList is a deprecated implementation used in the past. Prefer List<T> generic implementation in any new code.
As a generic collection, List<T> implements the generic IEnumerable<T> interface and can be used easily in LINQ
ArrayList belongs to the days that C# didn't have generics. It's deprecated in favor of List<T>. You shouldn't use ArrayList in new code that targets .NET >= 2.0 unless you have to interface with an old API that uses it.
Array vs List
Array is a fixed size collection and it supports multiple dimensions. It is the most efficient of the three for simple insert and iterations.

what is difference between string array and list of string in c#

I hear on MSDN that an array is faster than a collection.
Can you tell me how string[] is faster then List<string>.
Arrays are a lower level abstraction than collections such as lists. The CLR knows about arrays directly, so there's slightly less work involved in iterating, accessing etc.
However, this should almost never dictate which you actually use. The performance difference will be negligible in most real-world applications. I rarely find it appropriate to use arrays rather than the various generic collection classes, and indeed some consider arrays somewhat harmful. One significant downside is that there's no such thing as an immutable array (other than an empty one)... whereas you can expose read-only collections through an API relatively easily.
The article is from 2004, that means it's about .net 1.1 and there was no generics.
Array vs collection performance actually was a problem back then because collection types caused a lot of exta boxing-unboxing operations. But since .net 2.0, where generics was introduced, difference in performance almost gone.
An array is not resizable. This means that when it is created one block of memory is allocated, large enough to hold as many elements as you specify.
A List on the other hand is implicitly resizable. Each time you Add an item, the framework may need to allocate more memory to hold the item you just added. This is an expensive operation, so we end up saying "List is slower than array".
Of course this is a very simplified explanation, but hopefully enough to paint the picture.
An array is the simplest form of collection, so it's faster than other collections. A List (and many other collections) actually uses an array internally to hold its items.
An array is of course also limited by its simplicity. Most notably you can't change the size of an array. If you want a dynamic collection you would use a List.
List<string> is class with a private member that is a string[]. The MSDN documentation states this fact in several places. The List class is basically a wrapper class around an array that gives the array other functionality.
The answer of which is faster all depends on what you are trying to do with the list/array. For accessing and assigning values to elements, the array is probably negligibly faster since the List is an abstraction of the array (as Jon Skeet has said).
If you intend on having a data structure that grows over time (gets more and more elements), performance (ave. speed) wise the List will start to shine. That is because each time you resize an array to add another element it is an O(n) operation. When you add an element to a List (and the list is already at capacity) the list will double itself in size. I won't get into the nitty gritty details, but basically this means that increasing the size of a List is on average a O(log n) operation. Of course this has drawbacks too (you could have almost twice the amount of memory allocated as you really need if you only go a couple items past its last capacity).
Edit: I got a little mixed up in the paragraph above. As Eric has said below, the number of resizes for a List is O(log n), but the actual cost associated with resizing the array is amortized to O(1).

ArrayList versus an array of objects versus Collection of T

I have a class Customer (with typical customer properties) and I need to pass around, and databind, a "chunk" of Customer instances. Currently I'm using an array of Customer, but I've also used Collection of T (and List of T before I knew about Collection of T). I'd like the thinnest way to pass this chunk around using C# and .NET 3.5.
Currently, the array of Customer is working just fine for me. It data binds well and seems to be as lightweight as it gets. I don't need the stuff List of T offers and Collection of T still seems like overkill. The array does require that I know ahead of time how many Customers I'm adding to the chunk, but I always know that in advance (given rows in a page, for example).
Am I missing something fundamental or is the array of Customer OK? Is there a tradeoff I'm missing?
Also, I'm assuming that Collection of T makes the old loosely-typed ArrayList obsolete. Am I right there?
Yes, Collection<T> (or List<T> more commonly) makes ArrayList pretty much obsolete. In particular, I believe ArrayList isn't even supported in Silverlight 2.
Arrays are okay in some cases, but should be considered somewhat harmful - they have various disadvantages. (They're at the heart of the implementation of most collections, of course...) I'd go into more details, but Eric Lippert does it so much better than I ever could in the article referenced by the link. I would summarise it here, but that's quite hard to do. It really is worth just reading the whole post.
No one has mentioned the Framework Guidelines advice: Don't use List<T> in public API's:
We don’t recommend using List in
public APIs for two reasons.
List<T> is not designed to be extended. i.e. you cannot override any
members. This for example means that
an object returning List<T> from a
property won’t be able to get notified
when the collection is modified.
Collection<T> lets you overrides
SetItem protected member to get
“notified” when a new items is added
or an existing item is changed.
List has lots of members that are not relevant in many scenarios. We
say that List<T> is too “busy” for
public object models. Imagine
ListView.Items property returning
List<T> with all its richness. Now,
look at the actual ListView.Items
return type; it’s way simpler and
similar to Collection<T> or
ReadOnlyCollection<T>
Also, if your goal is two-way Databinding, have a look at BindingList<T> (with the caveat that it is not sortable 'out of the box'!)
Generally, you should 'pass around' IEnumerable<T> or ICollection<T> (depending on whether it makes sense for your consumer to add items).
If you have an immutable list of customers, that is... your list of customers will not change, it's relatively small, and you will always iterate over it first to last and you don't need to add to the list or remove from it, then an array is probably just fine.
If you're unsure, however, then your best bet is a collection of some type. What collection you choose depends on the operations you wish to perform on it. Collections are all about inserts, manipulations, lookups, and deletes. If you do frequent frequent searches for a given element, then a dictionary may be best. If you need to sort the data, then perhaps a SortedList will work better.
I wouldn't worry about "lightweight", unless you're talking a massive number of elements, and even then the advantages of O(1) lookups outweigh the costs of resources.
When you "pass around" a collection, you're only passing a reference, which is basically a pointer. So there is no performance difference between passing a collection and an array.
I'm going to put in a dissenting argument to both Jon and Eric Lippert )which means that you should be very weary of my answer, indeed!).
The heart of Eric Lippert's arguments against arrays is that the contents are immutable, while the data structure itself is not. With regards to returning them from methods, the contents of a List are just as mutable. In fact, because you can add or subtract elements from a List, I would argue that this makes the return value more mutable than an array.
The other reason I'm fond of Arrays is because sometime back I had a small section of performance critical code, so I benchmarked the performance characteristics of the two, and arrays blew Lists out of the water. Now, let me caveat this by saying it was a narrow test for how I was going to use them in a specific situation, and it goes against what I understand of both, but the numbers were wildly different.
Anyway, listen to Jon and Eric =), and I agree that List almost always makes more sense.
I agree with Alun, with one addition. If you may want to address the return value by subscript myArray[n], then use an IList.
An Array inherently supports IList (as well as IEnumerable and ICollection, for that matter). So if you pass by interface, you can still use the array as your underlying data structure. In this way, the methods that you are passing the array into don't have to "know" that the underlying datastructure is an array:
public void Test()
{
IList<Item> test = MyMethod();
}
public IList<Item> MyMethod()
{
Item[] items = new Item[] {new Item()};
return items;
}

How and when to abandon the use of arrays in C#?

I've always been told that adding an element to an array happens like this:
An empty copy of the array+1element is
created and then the data from the
original array is copied into it then
the new data for the new element is
then loaded
If this is true, then using an array within a scenario that requires a lot of element activity is contra-indicated due to memory and CPU utilization, correct?
If that is the case, shouldn't you try to avoid using an array as much as possible when you will be adding a lot of elements? Should you use iStringMap instead? If so, what happens if you need more than two dimensions AND need to add a lot of element additions. Do you just take the performance hit or is there something else that should be used?
Look at the generic List<T> as a replacement for arrays. They support most of the same things arrays do, including allocating an initial storage size if you want.
This really depends on what you mean by "add."
If you mean:
T[] array;
int i;
T value;
...
if (i >= 0 && i <= array.Length)
array[i] = value;
Then, no, this does not create a new array, and is in-fact the fastest way to alter any kind of IList in .NET.
If, however, you're using something like ArrayList, List, Collection, etc. then calling the "Add" method may create a new array -- but they are smart about it, they don't just resize by 1 element, they grow geometrically, so if you're adding lots of values only every once in a while will it have to allocate a new array. Even then, you can use the "Capacity" property to force it to grow before hand, if you know how many elements you're adding (list.Capacity += numberOfAddedElements)
In general, I prefer to avoid array usage. Just use List<T>. It uses a dynamically-sized array internally, and is fast enough for most usage. If you're using multi-dimentional arrays, use List<List<List<T>>> if you have to. It's not that much worse in terms of memory, and is much simpler to add items to.
If you're in the 0.1% of usage that requires extreme speed, make sure it's your list accesses that are really the problem before you try to optimize it.
If you're going to be adding/removing elements a lot, just use a List. If it's multidimensional, you can always use a List<List<int>> or something.
On the other hand, lists are less efficient than arrays if what you're mostly doing is traversing the list, because arrays are all in one place in your CPU cache, where objects in a list are scattered all over the place.
If you want to use an array for efficient reading but you're going to be "adding" elements frequently, you have two main options:
1) Generate it as a List (or List of Lists) and then use ToArray() to turn it into an efficient array structure.
2) Allocate the array to be larger than you need, then put the objects into the pre-allocated cells. If you end up needing even more elements than you pre-allocated, you can just reallocate the array when it fills, doubling the size each time. This gives O(log n) resizing performance instead of O(n) like it would be with a reallocate-once-per-add array. Note that this is pretty much how StringBuilder works, giving you a faster way to continually append to a string.
When to abandon the use of arrays
First and foremost, when semantics of arrays dont match with your intent - Need a dynamically growing collection? A set which doesn't allow duplicates? A collection that has to remain immutable? Avoid arrays in all that cases. That's 99% of the cases. Just stating the obvious basic point.
Secondly, when you are not coding for absolute performance criticalness - That's about 95% of the cases. Arrays perform better marginally, especially in iteration. It almost always never matter.
When you're not forced by an argument with params keyword - I just wished params accepted any IEnumerable<T> or even better a language construct itself to denote a sequence (and not a framework type).
When you are not writing legacy code, or dealing with interop
In short, its very rare that you would actually need an array. I will add as to why may one avoid it?
The biggest reason to avoid arrays imo is conceptual. Arrays are closer to implementation and farther from abstraction. Arrays conveys more how it is done than what is done which is against the spirit of high level languages. That's not surprising, considering arrays are closer to the metal, they are straight out of a special type (though internally array is a class). Not to be pedagogical, but arrays really do translate to a semantic meaning very very rarely required. The most useful and frequent semantics are that of a collections with any entries, sets with distinct items, key value maps etc with any combination of addable, readonly, immutable, order-respecting variants. Think about this, you might want an addable collection, or readonly collection with predefined items with no further modification, but how often does your logic look like "I want a dynamically addable collection but only a fixed number of them and they should be modifiable too"? Very rare I would say.
Array was designed during pre-generics era and it mimics genericity with lot of run time hacks and it will show its oddities here and there. Some of the catches I found:
Broken covariance.
string[] strings = ...
object[] objects = strings;
objects[0] = 1; //compiles, but gives a runtime exception.
Arrays can give you reference to a struct!. That's unlike anywhere else. A sample:
struct Value { public int mutable; }
var array = new[] { new Value() };
array[0].mutable = 1; //<-- compiles !
//a List<Value>[0].mutable = 1; doesnt compile since editing a copy makes no sense
print array[0].mutable // 1, expected or unexpected? confusing surely
Run time implemented methods like ICollection<T>.Contains can be different for structs and classes. It's not a big deal, but if you forget to override non generic Equals correctly for reference types expecting generic collection to look for generic Equals, you will get incorrect results.
public class Class : IEquatable<Class>
{
public bool Equals(Class other)
{
Console.WriteLine("generic");
return true;
}
public override bool Equals(object obj)
{
Console.WriteLine("non generic");
return true;
}
}
public struct Struct : IEquatable<Struct>
{
public bool Equals(Struct other)
{
Console.WriteLine("generic");
return true;
}
public override bool Equals(object obj)
{
Console.WriteLine("non generic");
return true;
}
}
class[].Contains(test); //prints "non generic"
struct[].Contains(test); //prints "generic"
The Length property and [] indexer on T[] seem to be regular properties that you can access through reflection (which should involve some magic), but when it comes to expression trees you have to spit out the exact same code the compiler does. There are ArrayLength and ArrayIndex methods to do that separately. One such question here. Another example:
Expression<Func<string>> e = () => new[] { "a" }[0];
//e.Body.NodeType == ExpressionType.ArrayIndex
Expression<Func<string>> e = () => new List<string>() { "a" }[0];
//e.Body.NodeType == ExpressionType.Call;
Yet another one. string[].IsReadOnly returns false, but if you are casting, IList<string>.IsReadOnly returns true.
Type checking gone wrong: (object)new ConsoleColor[0] is int[] returns true, whereas new ConsoleColor[0] is int[] returns false. Same is true for uint[] and int[] comparisons. No such problems if you use any other collection types.
How to abandon the use of arrays.
The most commonly used substitute is List<T> which has a cleaner API. But it is a dynamically growing structure which means you can add to a List<T> at the end or insert anywhere to any capacity. There is no substitute for the exact behaviour of an array, but people mostly use arrays as readonly collection where you can't add anything to its end. A substitute is ReadOnlyCollection<T>.
When the array is resized, a new array must be allocated, and the contents copied. If you are only modifying the contents of the array, it is just a memory assignment.
So, you should not use arrays when you don't know the size of the array, or the size is likely to change. However, if you have a fixed length array, they are an easy way of retrieving elements by index.
ArrayList and List grow the array by more than one when needed (I think it's by doubling the size, but I haven't checked the source). They are generally the best choice when you are building a dynamically sized array.
When your benchmarks indicate that array resize is seriously slowing down your application (remember - premature optimization is the root of all evil), you can evaluate writing a custom array class with tweaked resizing behavior.
Generally, if you must have the BEST indexed lookup performance it's best to build a List first and then turn it into a array thus paying a small penalty at first but avoiding any later. If the issue is that you will be continually adding new data and removing old data then you may want to use a ArrayList or List for convenience but keep in mind that they are just special case Arrays. When they "grow" they allocate a completely new array and copy everything into it which is extremely slow.
ArrayList is just an Array which grows when needed.
Add is amortized O(1), just be careful to make sure the resize won't happen at a bad time.
Insert is O(n) all items to the right must be moved over.
Remove is O(n) all items to the right must be moved over.
Also important to keep in mind that List is not a linked list. It's just a typed ArrayList. The List documentation does note that it performs better in most cases but does not say why.
The best thing to do is to pick a data structure which is appropriate to your problem. This depends one a LOT of things and so you may want to browse the System.Collections.Generic Namespace.
In this particular case I would say that if you can come up with a good key value Dictionary would be your best bet. It has insert and remove that approaches O(1). However, even with a Dictionary you have to be careful not to let it resize it's internal array (an O(n) operation). It's best to give them a lot of room by specifying a larger-then-you-expect-to-use initial capacity in the constructor.
-Rick
A standard array should be defined with a length, which reserves all of the memory that it needs in a contiguous block. Adding an item to the array would put it inside of the block of already reserved memory.
Arrays are great for few writes and many reads, particularly those of an iterative nature - for anything else, use one of the many other data structures.
You are correct an array is great for look ups. However modifications to the size of the array are costly.
You should use a container that supports incremental size adjustments in the scenario where you're modifying the size of the array. You could use an ArrayList which allows you to set the initial size, and you could continually check the size versus the capacity and then increment the capacity by a large chunk to limit the number of resizes.
Or you could just use a linked list. Then however look ups are slow...
If I think I'm going to be adding items to the collection a lot over its lifetime, than I'll use a List. If I know for sure what the size of the collection will be when its declared, then I'll use an array.
Another time I generally use an array over a List is when I need to return a collection as a property of an object - I don't want callers adding items that collection via List's Add methods, but instead want them to add items to the collection via my object's interface. In that case, I'll take the internal List and call ToArray and return an array.
If you are going to be doing a lot of adding, and you will not be doing random access (such as myArray[i]). You could consider using a linked list (LinkedList<T>), because it will never have to "grow" like the List<T> implementation. Keep in mind, though, that you can only really access items in a LinkedList<T> implementation using the IEnumerable<T> interface.
The best thing you can do is to allocate as much memory as you need upfront if possible. This will prevent .NET from having to make additional calls to get memory on the heap. Failing that then it makes sense to allocate in chunks of five or whatever number makes sense for your application.
This is a rule you can apply to anything really.

.Net Data structures: ArrayList, List, HashTable, Dictionary, SortedList, SortedDictionary -- Speed, memory, and when to use each? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 1 year ago.
Improve this question
.NET has a lot of complex data structures. Unfortunately, some of them are quite similar and I'm not always sure when to use one and when to use another. Most of my C# and VB books talk about them to a certain extent, but they never really go into any real detail.
What's the difference between Array, ArrayList, List, Hashtable, Dictionary, SortedList, and SortedDictionary?
Which ones are enumerable (IList -- can do 'foreach' loops)? Which ones use key/value pairs (IDict)?
What about memory footprint? Insertion speed? Retrieval speed?
Are there any other data structures worth mentioning?
I'm still searching for more details on memory usage and speed (Big-O notation)
Off the top of my head:
Array* - represents an old-school memory array - kind of like a alias for a normal type[] array. Can enumerate. Can't grow automatically. I would assume very fast insert and retrival speed.
ArrayList - automatically growing array. Adds more overhead. Can enum., probably slower than a normal array but still pretty fast. These are used a lot in .NET
List - one of my favs - can be used with generics, so you can have a strongly typed array, e.g. List<string>. Other than that, acts very much like ArrayList
Hashtable - plain old hashtable. O(1) to O(n) worst case. Can enumerate the value and keys properties, and do key/val pairs
Dictionary - same as above only strongly typed via generics, such as Dictionary<string, string>
SortedList - a sorted generic list. Slowed on insertion since it has to figure out where to put things. Can enum., probably the same on retrieval since it doesn't have to resort, but deletion will be slower than a plain old list.
I tend to use List and Dictionary all the time - once you start using them strongly typed with generics, its really hard to go back to the standard non-generic ones.
There are lots of other data structures too - there's KeyValuePair which you can use to do some interesting things, there's a SortedDictionary which can be useful as well.
If at all possible, use generics. This includes:
List instead of ArrayList
Dictionary instead of HashTable
First, all collections in .NET implement IEnumerable.
Second, a lot of the collections are duplicates because generics were added in version 2.0 of the framework.
So, although the generic collections likely add features, for the most part:
List is a generic implementation of ArrayList.
Dictionary<T,K> is a generic implementation of Hashtable
Arrays are a fixed size collection that you can change the value stored at a given index.
SortedDictionary is an IDictionary<T,K> that is sorted based on the keys.
SortedList is an IDictionary<T,K> that is sorted based on a required IComparer.
So, the IDictionary implementations (those supporting KeyValuePairs) are:
Hashtable
Dictionary<T,K>
SortedList<T,K>
SortedDictionary<T,K>
Another collection that was added in .NET 3.5 is the Hashset. It is a collection that supports set operations.
Also, the LinkedList is a standard linked-list implementation (the List is an array-list for faster retrieval).
Here are a few general tips for you:
You can use foreach on types that implement IEnumerable. IList is essentially an IEnumberable with Count and Item (accessing items using a zero-based index) properties. IDictionary on the other hand means you can access items by any-hashable index.
Array, ArrayList and List all implement IList.
Dictionary, SortedDictionary, and Hashtable implement IDictionary.
If you are using .NET 2.0 or higher, it is recommended that you use generic counterparts of mentioned types.
For time and space complexity of various operations on these types, you should consult their documentation.
.NET data structures are in System.Collections namespace. There are type libraries such as PowerCollections which offer additional data structures.
To get a thorough understanding of data structures, consult resources such as CLRS.
.NET data structures:
More to conversation about why ArrayList and List are actually different
Arrays
As one user states, Arrays are the "old school" collection (yes, arrays are considered a collection though not part of System.Collections). But, what is "old school" about arrays in comparison to other collections, i.e the ones you have listed in your title (here, ArrayList and List(Of T))? Let's start with the basics by looking at Arrays.
To start, Arrays in Microsoft .NET are, "mechanisms that allow you to treat several [logically-related] items as a single collection," (see linked article). What does that mean? Arrays store individual members (elements) sequentially, one after the other in memory with a starting address. By using the array, we can easily access the sequentially stored elements beginning at that address.
Beyond that and contrary to programming 101 common conceptions, Arrays really can be quite complex:
Arrays can be single dimension, multidimensional, or jadded (jagged arrays are worth reading about). Arrays themselves are not dynamic: once initialized, an array of n size reserves enough space to hold n number of objects. The number of elements in the array cannot grow or shrink. Dim _array As Int32() = New Int32(100) reserves enough space on the memory block for the array to contain 100 Int32 primitive type objects (in this case, the array is initialized to contain 0s). The address of this block is returned to _array.
According to the article, Common Language Specification (CLS) requires that all arrays be zero-based. Arrays in .NET support non-zero-based arrays; however, this is less common. As a result of the "common-ness" of zero-based arrays, Microsoft has spent a lot of time optimizing their performance; therefore, single dimension, zero-based (SZs) arrays are "special" - and really the best implementation of an array (as opposed to multidimensional, etc.) - because SZs have specific intermediary language instructions for manipulating them.
Arrays are always passed by reference (as a memory address) - an important piece of the Array puzzle to know. While they do bounds checking (will throw an error), bounds checking can also be disabled on arrays.
Again, the biggest hindrance to arrays is that they are not re-sizable. They have a "fixed" capacity. Introducing ArrayList and List(Of T) to our history:
ArrayList - non-generic list
The ArrayList (along with List(Of T) - though there are some critical differences, here, explained later) - is perhaps best thought of as the next addition to collections (in the broad sense). ArrayList inherit from the IList (a descendant of 'ICollection') interface. ArrayLists, themselves, are bulkier - requiring more overhead - than Lists.
IList does enable the implementation to treat ArrayLists as fixed-sized lists (like Arrays); however, beyond the additional functionallity added by ArrayLists, there are no real advantages to using ArrayLists that are fixed size as ArrayLists (over Arrays) in this case are markedly slower.
From my reading, ArrayLists cannot be jagged: "Using multidimensional arrays as elements... is not supported". Again, another nail in the coffin of ArrayLists. ArrayLists are also not "typed" - meaning that, underneath everything, an ArrayList is simply a dynamic Array of Objects: Object[]. This requires a lot of boxing (implicit) and unboxing (explicit) when implementing ArrayLists, again adding to their overhead.
Unsubstantiated thought: I think I remember either reading or having heard from one of my professors that ArrayLists are sort of the bastard conceptual child of the attempt to move from Arrays to List-type Collections, i.e. while once having been a great improvement to Arrays, they are no longer the best option as further development has been done with respect to collections
List(Of T): What ArrayList became (and hoped to be)
The difference in memory usage is significant enough to where a List(Of Int32) consumed 56% less memory than an ArrayList containing the same primitive type (8 MB vs. 19 MB in the above gentleman's linked demonstration: again, linked here) - though this is a result compounded by the 64-bit machine. This difference really demonstrates two things: first (1), a boxed Int32-type "object" (ArrayList) is much bigger than a pure Int32 primitive type (List); second (2), the difference is exponential as a result of the inner-workings of a 64-bit machine.
So, what's the difference and what is a List(Of T)? MSDN defines a List(Of T) as, "... a strongly typed list of objects that can be accessed by index." The importance here is the "strongly typed" bit: a List(Of T) 'recognizes' types and stores the objects as their type. So, an Int32 is stored as an Int32 and not an Object type. This eliminates the issues caused by boxing and unboxing.
MSDN specifies this difference only comes into play when storing primitive types and not reference types. Too, the difference really occurs on a large scale: over 500 elements. What's more interesting is that the MSDN documentation reads, "It is to your advantage to use the type-specific implementation of the List(Of T) class instead of using the ArrayList class...."
Essentially, List(Of T) is ArrayList, but better. It is the "generic equivalent" of ArrayList. Like ArrayList, it is not guaranteed to be sorted until sorted (go figure). List(Of T) also has some added functionality.
I found "Choose a Collection" section of Microsoft Docs on Collection and Data Structure page really useful
C# Collections and Data Structures : Choose a collection
And also the following matrix to compare some other features
I sympathise with the question - I too found (find?) the choice bewildering, so I set out scientifically to see which data structure is the fastest (I did the test using VB, but I imagine C# would be the same, since both languages do the same thing at the CLR level). You can see some benchmarking results conducted by me here (there's also some discussion of which data type is best to use in which circumstances).
They're spelled out pretty well in intellisense. Just type System.Collections. or System.Collections.Generics (preferred) and you'll get a list and short description of what's available.
Hashtables/Dictionaries are O(1) performance, meaning that performance is not a function of size. That's important to know.
EDIT: In practice, the average time complexity for Hashtable/Dictionary<> lookups is O(1).
The generic collections will perform better than their non-generic counterparts, especially when iterating through many items. This is because boxing and unboxing no longer occurs.
An important note about Hashtable vs Dictionary for high frequency systematic trading engineering: Thread Safety Issue
Hashtable is thread safe for use by multiple threads.
Dictionary public static members are thread safe, but any instance members are not guaranteed to be so.
So Hashtable remains the 'standard' choice in this regard.
There are subtle and not-so-subtle differences between generic and non-generic collections. They merely use different underlying data structures. For example, Hashtable guarantees one-writer-many-readers without sync. Dictionary does not.

Categories