Why doesn't C# ArrayList have a Resize method?

Coming from C++, it's very weird to find that C# ArrayList doesn't have a Resize(count) method. Why? Am I missing something?

There are three separate operations you might wish to perform:
Changing the capacity of the ArrayList. This is achievable through ArrayList.Capacity and List<T>.Capacity.
Changing the actual count of the list by trimming some elements. This is achievable through ArrayList.RemoveRange and List<T>.RemoveRange.
Changing the actual count of the list by adding some elements. This is achievable through ArrayList.AddRange and List<T>.AddRange. (As of .NET 3.5, you can use Enumerable.Repeat to very easily come up with a sequence of the right length.)
(I mention List<T> as unless you're really stuck on .NET 1.1, you'd be better off using the generic collections.)
If you want to perform some other operation, please specify it. Personally I'm glad that these three operations are separate. I can't think of any cases in my own experience where I've wanted to add or remove elements without knowing which I'd actually be doing.
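For readers who still want a C++-style resize, the trim and grow operations described above can be combined in a small helper. This is only a sketch: the Resize extension method and the ListResizeSketch class are hypothetical names, not part of the framework, and the fill value for new slots is an assumption about what a resize should do.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ListResizeSketch
{
    // Hypothetical helper, not a BCL method: makes list.Count equal to count.
    public static void Resize<T>(this List<T> list, int count, T fill = default(T))
    {
        if (list == null) throw new ArgumentNullException(nameof(list));
        if (count < 0) throw new ArgumentOutOfRangeException(nameof(count));

        if (count < list.Count)
        {
            // Shrinking: trim the tail with RemoveRange.
            list.RemoveRange(count, list.Count - count);
        }
        else if (count > list.Count)
        {
            // Growing: reserve capacity once, then append 'fill' values.
            if (count > list.Capacity) list.Capacity = count;
            list.AddRange(Enumerable.Repeat(fill, count - list.Count));
        }
    }
}
```

With this in scope, calling list.Resize(5, 9) on a list holding 1, 2, 3 would leave it holding 1, 2, 3, 9, 9.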

You should use the generic List<T> (System.Collections.Generic.List<T>) for this. Adding an element to it runs in amortized constant time. Alternatively, you can set ArrayList.Capacity for your purpose.

Related

Why refactor argument of List<Term> to IEnumerable<Term>?

I have a method that looks like this:
public void UpdateTermInfo(List<Term> termInfoList)
{
    foreach (Term termInfo in termInfoList)
    {
        UpdateTermInfo(termInfo);
    }
    m_xdoc.Save(FileName.FullName);
}
Resharper advises me to change the method signature to IEnumerable<Term> instead of List<Term>. What is the benefit of doing this?
The other answers point out that by choosing a "larger" type you permit a broader set of callers to call you. Which is a good enough reason in itself to make this change. However, there are other reasons. I would recommend that you make this change because when I see a method that takes a list or an array, the first thing I think is "what if that method tries to change an item in my list/array?"
You want the contents of a bucket, but you are requiring not just the bucket but also the ability to change its contents. Why would you require that if you're not going to use that ability? When you say "this method cannot take any old sequence; it has to take a mutable list that is indexed by integers" I think that you're making that requirement on the caller because you're going to take advantage of that power.
If "I'm planning on messing up your data structure" is not what you intend to communicate to the caller of the method then don't communicate that. A method that takes a sequence communicates "The most I'm going to do is read from this sequence in order".
Simply put, accepting an enumerable allows your function to be compatible with a broader scope of input arguments, such as arrays and LINQ queries.
To expound on accepting LINQ queries, one could do:
UpdateTermInfo(myTermList.Where(x => somefilter));
Additionally, specifying an interface rather than a concrete class allows others to provide their own implementation of that interface. In this way, you are being "subscriptive" rather than "proscriptive." (Yes, I did just make up a word.)
In general (with many exceptions relating to what sort of abilities you want to reserve for potential later modifications), it is a best-practice to implement functions using arguments that are the most general that they can be. This gives maximum flexibility to the consumer of your function.
As a result, if you are dead-set on using a list for this function (perhaps because at some later date you expect you might want to use properties such as Count or the index operator), I would strongly urge you to consider using IList<Term> instead of List<Term> for the reasons mentioned above.
List implements IEnumerable, so using the interface would make things more flexible. If a case came along where you didn't want to use a List and wanted to pass a different collection object, it would work without changes as long as that object implements IEnumerable.
For instance IEnumerable allows you to use Arrays and many others as opposed to always using a List.
IEnumerable is simply a sequence of items that you can iterate over (for example with foreach), unlike a List, where you can also add, remove, sort, get the Count, etc.
The main idea behind that refactor is that you make the method more general. You don't say what data structure you want, only what you need from it: that you can iterate through its elements.
So later, when you decide that O(n) search is not good enough for you, you only have to change one line and move along.
If you use List then you are confining yourself to one concrete implementation, whereas with IEnumerable you can pass in arrays, lists and other collections, as they all implement that interface.
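The answers above can be illustrated with a short sketch. The Term class and the body of UpdateTermInfo here are stand-ins invented for illustration (the question does not show them); the only point is that a parameter typed as IEnumerable<Term> accepts an array, a List<Term>, or a LINQ query without any change to the method.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the Term type from the question.
public sealed class Term
{
    public string Name { get; set; }
}

public static class TermUpdaterSketch
{
    // Only iterates, so IEnumerable<Term> is all it needs to ask for.
    // Returns the number of terms processed so the effect is observable.
    public static int UpdateTermInfo(IEnumerable<Term> terms)
    {
        int updated = 0;
        foreach (Term term in terms)
        {
            term.Name = term.Name.Trim(); // placeholder for the real update
            updated++;
        }
        return updated;
    }
}
```

The same method can then be called with new[] { term }, with a List<Term>, or with something like myTermList.Where(t => t.Name.Length > 0).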

c# string[] vs IEnumerable<string>

What should I prefer if I know the number of elements before runtime?
Resharper offers me IEnumerable<string> instead of string[]?
ReSharper suggests IEnumerable<string> if you are only using methods defined for IEnumerable. It does so with the idea that, since you clearly do not need the value to be typed as array, you might want to hide the exact type from the consumers of (i.e., the code that uses) the value because you might want to change the type in the future.
In most cases, going with the suggestion is the right thing to do. The difference will not be something that you can observe while your program is running; rather, it's in how easily you will find it to make changes to your program in the future.
From the above you can also infer that the whole suggestion/question is meaningless unless the value we are talking about is passed across method boundaries (I don't remember if R# also offers it for a local variable).
If ReSharper suggests you use IEnumerable<string> it means you are only using features of that interface and no array specific features. Go with the suggestion of ReSharper and change it.
If you are trying to provide this method as an interface to other methods, I would prefer to have the output of your method more generic, hence would go for IEnumerable<string>.
Inside a method, if you are instantiating the value and it is not being passed around to other methods, I would go for string[], unless I need deferred execution. That said, it doesn't really matter which one you use in this case.
The actual type should be string[] but depending on the user you may want to expose it as something else. e.g. IEnumerable<string> sequence = new string[5]... In particular if it's something like static readonly, then you should make it a ReadOnlyCollection so the entries can't be modified.
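A minimal sketch of that idea follows. The class and member names are made up for illustration; Array.AsReadOnly and ReadOnlyCollection<T> are the framework pieces being shown.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public static class WeekdaysSketch
{
    // The actual storage is a plain array.
    private static readonly string[] names = { "Mon", "Tue", "Wed", "Thu", "Fri" };

    // Exposed through a read-only wrapper: callers can index and enumerate,
    // but attempts to modify it throw NotSupportedException.
    public static readonly ReadOnlyCollection<string> Names = Array.AsReadOnly(names);

    // Exposed as a sequence: hides the array type from the signature, but a
    // caller could still cast the result back to string[] and modify it.
    public static IEnumerable<string> AsSequence()
    {
        return names;
    }
}
```

The wrapper costs one extra object, which is usually a reasonable price for a static readonly value that must not change.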
With string[] you can do more: you can access items by index. With IEnumerable you have to loop to reach a specific index.
It's probably suggesting this because it's looking for a better Liskov Substitution at this point in your code. Keep in mind the difference between the declared type and the implementing type. IEnumerable<> isn't an implementation, it's an interface. You can declare the variable as an IEnumerable<string> and build it with a string[] since the string array implements IEnumerable<string>.
What this does for you is allow you to pass around that string array as a more generic, more abstracted type. Anything which expects or returns an IEnumerable<string> (regardless of implementation, be it List<string> or string[] or anything else) can then use your string array, without having to worry about the specific implementation you pass it. As long as it satisfies the interface, it's polymorphic of the correct type.
Keep in mind that this isn't always the way to go. Sometimes you, as the developer, are very concerned with the implementation (perhaps for really fine-grained performance tuning, for example) and don't want to move up to an abstraction. The decision is up to you. ReSharper is merely making a suggestion to use an abstraction rather than an implementation in a variable/method declaration.
ReSharper is likely flagging it for you because you are not returning the least constrained type. If you aren't going to be using access on it by index in the future, I'd go with IEnumerable to have less constraint on the method which returns it.
It depends on your usage later on. If you need to enumerate through these elements, or sort or compare them later on, then I would recommend IEnumerable; otherwise go with an array.
I wrote this response for a similar question regarding array or IEnumerable for return values, which was then closed as duplicate before I could post it. I thought the answer might be interesting to some so I post it here.
The main advantage of IEnumerable over T[] is that IEnumerable (for return values) can be made lazy, i.e. it only computes the next element when needed.
Consider the difference between Directory.GetFiles and Directory.EnumerateFiles. GetFiles returns an array, EnumerateFiles returns an IEnumerable. This means that for a directory with two million files the array will contain two million strings. EnumerateFiles only instantiates the strings as needed, saving memory and improving response time.
However, it's not all benefits.
foreach is significantly less efficient on non-arrays (you can see this by disassembling the IL code).
An array promises more, i.e. that its length will not change.
Lazy evaluation is not always better; consider the Directory class. The GetFiles implementation will open a find file handle, iterate over all files, close the find file handle and then return the results. EnumerateFiles will do nothing until the first file is requested; then the find file handle is opened and the files are iterated, and the handle is closed when the enumerator is disposed. This means that the lifetime of the find file handle is controlled by the caller, not the callee. This can be seen as less encapsulation and can give potential runtime errors with locked file handles.
In my humble opinion, R# is overzealous in suggesting IEnumerable over arrays, especially so for return values (input parameters have fewer potential drawbacks). What I tend to do when I see a function that returns IEnumerable is call .ToArray in order to avoid potential issues with lazy evaluation, but if the collection is already an array this is inefficient.
I like the principle: promise a lot, require little. That is, don't require that the input parameters be arrays (use IEnumerable), but return an array rather than IEnumerable, as an array is a bigger promise.
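The laziness discussed in this answer can be seen in a small sketch that does not touch the file system. The Squares iterator and the Produced counter are invented for illustration; they stand in for something like EnumerateFiles, where work happens only as elements are requested.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class LazySketch
{
    // Counts how many elements have actually been generated so far.
    public static int Produced;

    // Iterator method: the body does not run until the caller starts enumerating.
    public static IEnumerable<int> Squares(int count)
    {
        for (int i = 0; i < count; i++)
        {
            Produced++;
            yield return i * i;
        }
    }

    // Eager counterpart: forces the whole sequence into an array up front.
    public static int[] SquaresArray(int count)
    {
        return Squares(count).ToArray();
    }
}
```

Calling Squares(1000000) generates nothing by itself; breaking out of a foreach after the first element generates exactly one. SquaresArray(1000000) generates all of them before returning, which is the trade-off the answer describes.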

Where to use string [] vs list <string> in C#

String[] is lightweight compared to List<string>. So if I don't have any need to manipulate my collection, should I use string[], or is it always advisable to go for List<string>?
In the case of List<string>, do we need to perform a null check, or is that not required?
Use string[] when you need to work with static arrays: you don't need to add and remove elements -> only access elements by index. If you need to modify the collection use List<string>. And if you intend to only loop through the contents and never access by index use IEnumerable<string>.
If the collection should not be modified, use string[] or even better, IEnumerable<string>. This indicates that the collection of strings should be treated as a read-only collection of strings.
Using IEnumerable<string> in your API also opens up for the underlying implementation to be changed without breaking client code. You can easily use a string array as the underlying implementation and expose it as IEnumerable<string>. If the implementation at a later stage is better suited using a list or other structure, you can change it as long as it supports IEnumerable<string>.
I'd say you've summed it up well yourself.
If the size of your list won't change, and you don't need any of the advanced List functions like sorting, then String[] is preferable because as you say it's lightweight.
But consider potential future requirements - is it possible that you might one day want to use List for something? If so, consider using List now.
You need to check for null, both in String[] and also List. Both types can have a null value.
I would say it depends what you're trying to accomplish. Generally, however, my opinion is that you have access to a great framework that does a lot of hard work for you so use it (ie. use List<> instead of array).
Have a look at the members on offer to you by a class like List<> and you'll see what I mean: in addition to not having to worry as much about array capacity and index out of bounds exceptions, List and other ICollection/IList classes give you methods like Add, Remove, Clear, Insert, Find, etc that are infinitely helpful. I also believe
myList.Add (myWidg);
is a lot nicer to read and maintain than
myArr [i] = myWidg;
I would definitely vote for List. Apart from the various member functions that a list supports, it makes the 'no elements' case natural: a newly constructed list is simply empty. So, if we adhere to the best practice of not returning null from a function, the caller can safely check the element count without doing a null check. Note that an array can be empty as well (for example new string[0]), so the same practice works for arrays; otherwise we have to check for null. Moreover, I seldom use a loop to search for an element, either in an array or a list. LINQ makes it neat, and it works on both, since arrays and lists both implement IEnumerable<T>.
This really really depends on the situation. Anything really performance related should probably be done with arrays. Anything else would go with lists.

Why is there a List<T>.BinarySearch(...)?

I'm looking at List and I see a BinarySearch method with a few overloads, and I can't help wondering if it makes sense at all to have a method like that in List?
Why would I want to do a binary search unless the list was sorted? And if the list wasn't sorted, calling the method would just be a waste of CPU time. What's the point of having that method on List?
I note in addition to the other correct answers that binary search is surprisingly hard to write correctly. There are lots of corner cases and some tricky integer arithmetic. Since binary search is obviously a common operation on sorted lists, the BCL team did the world a service by writing the binary search algorithm correctly once rather than encouraging customers to all write their own binary search algorithm; a significant number of those customer-authored algorithms would be wrong.
Sorting and searching are two very common operations on lists. It would be unfriendly to limit a developer's options by not offering binary search on a regular list.
Library design requires compromises - the .NET designers chose to offer the binary search function on both arrays and lists in C# because they likely felt (as I do) that these are useful and common operations, and programmers who choose to use them understand their prerequisites (namely that the list is ordered) before calling them.
It's easy enough to sort a List<T> using one of the Sort() overloads. If you feel that you need an invariant that guarantees sorting, you can always use SortedList<TKey,TValue> or SortedSet<T> instead.
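As a brief sketch of how Sort and BinarySearch fit together: when the value is not found, List<T>.BinarySearch returns a negative number that is the bitwise complement of the index where the value would be inserted. The InsertSorted helper below is a hypothetical name used only to show that convention, and it assumes the list is already sorted.

```csharp
using System.Collections.Generic;

public static class SortedInsertSketch
{
    // Inserts value into an already sorted list, keeping it sorted.
    // Returns the index at which the value was inserted.
    public static int InsertSorted(List<int> sorted, int value)
    {
        int index = sorted.BinarySearch(value);
        if (index < 0)
        {
            // Not found: ~index is the position of the next larger element.
            index = ~index;
        }
        sorted.Insert(index, value);
        return index;
    }
}
```

For a list built as new List<int> { 5, 1, 3 }, calling Sort() first gives 1, 3, 5; InsertSorted(list, 4) then places 4 at index 2.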
BinarySearch only makes sense on a List<T> that is sorted, just like IList<T>.Add only makes sense for an IList<T> with IsReadOnly = false. It's messy, but it's just something to deal with: sometimes functionality X depends on criterion Y. The fact that Y isn't always true doesn't make X useless.
Now, in my opinion, it's frustrating that .NET doesn't have general Sort and BinarySearch methods for any IList<T> implementation (e.g., as extension methods). If it did, we could easily sort and search for items within any non-read-only collection providing random access.
Then again, you can always write your own (or copy someone else's).
Others have pointed out that BinarySearch is quite useful on a sorted List<T>. It doesn't really belong on List<T>, though, as anyone with C++ STL experience would immediately recognize.
With recent C# language developments, it makes more sense to define the notion of a sorted list (e.g., ISortedList<T> : IList<T>) and define BinarySearch (et al.) as extension methods of that interface. This is a cleaner, more orthogonal type of design.
I've started doing just that as part of the Nito.Linq library. I expect the first stable release to be in a couple of months.
Yes, but List has a Sort() method as well, so you can call it before BinarySearch.
Searching and sorting are algorithmic primitives. It's helpful for the standard library to have fast reliable implementations. Otherwise, developers waste time reinventing the wheel.
However, in the case of the .NET Framework, it's unfortunate that the specific choice of algorithms happens to make them less useful than they might be. In some cases, their behaviour is not defined:
List<T>.BinarySearch: "If the List contains more than one element with the same value, the method returns only one of the occurrences, and it might return any one of the occurrences, not necessarily the first one."
List<T>.Sort: "This implementation performs an unstable sort; that is, if two elements are equal, their order might not be preserved. In contrast, a stable sort preserves the order of elements that are equal."
That's a shame, because there are deterministic algorithms that are just as fast, and these would be much more useful as building blocks. It's noteworthy that the binary search algorithms in Python, Ruby and Go all find the first matching element.
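A deterministic variant is not hard to sketch. The extension methods below are hypothetical (the names LowerBound and IndexOfFirst are not framework APIs); they work on any sorted IList<T>, always report the first matching element, and follow the framework's convention of returning the bitwise complement of the insertion point when nothing matches.

```csharp
using System;
using System.Collections.Generic;

public static class FirstMatchSearchSketch
{
    // Index of the first element that is not less than value
    // (equals sorted.Count if every element is smaller).
    public static int LowerBound<T>(this IList<T> sorted, T value, IComparer<T> comparer = null)
    {
        if (sorted == null) throw new ArgumentNullException(nameof(sorted));
        comparer = comparer ?? Comparer<T>.Default;

        int lo = 0, hi = sorted.Count;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2; // avoids overflow of (lo + hi)
            if (comparer.Compare(sorted[mid], value) < 0) lo = mid + 1;
            else hi = mid;
        }
        return lo;
    }

    // Index of the first occurrence of value, or ~insertionPoint if absent.
    public static int IndexOfFirst<T>(this IList<T> sorted, T value, IComparer<T> comparer = null)
    {
        comparer = comparer ?? Comparer<T>.Default;
        int i = sorted.LowerBound(value, comparer);
        bool found = i < sorted.Count && comparer.Compare(sorted[i], value) == 0;
        return found ? i : ~i;
    }
}
```

Because the methods are defined on IList<T>, they apply equally to List<T> and to arrays.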
I agree it's completely dumb to call BinarySearch on an unsorted list, but it's perfect if you know your large list is sorted.
I've used it when checking if items from a stream exist in a (more or less) static list of 100,000 items or more.
Binary searching the list is ORDERS of magnitude faster than doing a list.Find, which is many orders of magnitude faster than a database lookup.
It makes sense, and I'm glad it's there (not that it would be rocket science to implement it if it wasn't).
Perhaps another point is that an array could be equally as unsorted. So in theory, having a BinarySearch on an array could be invalid too.
However, as with all features of a higher level language, they need to be applied by someone with reason and understanding of the data, or they will fail. Sure, some cross-checks could be applied, and we could have a flag that said "IsSorted" and it would fail on binary search otherwise, but .....
Some pseudo code:
if List is sorted
    use the BinarySearch method
else if List is not sorted and you think sorting it is "waste of CPU time"
    use a different algorithm that is more suitable and efficient

ASP.NET C# Lists Which and When?

In C# there seem to be quite a few different lists. Off the top of my head I was able to come up with a couple, however I'm sure there are many more.
List<String> Types = new List<String>();
ArrayList Types2 = new ArrayList();
LinkedList<String> Types4 = new LinkedList<String>();
My question is when is it beneficial to use one over the other?
More specifically I am returning lists of unknown size from functions and I was wondering if there is a particular list that was better at this.
List<String> Types = new List<String>();
LinkedList<String> Types4 = new LinkedList<String>();
are generic lists, i.e. you define the data type that goes in there, which reduces boxing and unboxing.
For the difference between List and LinkedList, see: When should I use a List vs a LinkedList
ArrayList is a non-generic collection, which can be used to store any type of data.
99% of the time List is what you'll want. Avoid the non-generic collections at all costs.
LinkedList is useful for adding or removing without shuffling items around, although you have to forego random access as a result. One advantage it does have is you can remove items whilst iterating through the nodes.
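Removing while walking the nodes might look like the sketch below. RemoveWhere is a hypothetical helper name; LinkedList<T>.First, LinkedListNode<T>.Next and LinkedList<T>.Remove(node) are the framework members it relies on.

```csharp
using System;
using System.Collections.Generic;

public static class LinkedListSketch
{
    // Removes every element matching the predicate in a single pass,
    // without shifting the remaining elements.
    public static void RemoveWhere<T>(LinkedList<T> list, Predicate<T> match)
    {
        LinkedListNode<T> node = list.First;
        while (node != null)
        {
            // Capture the successor first: a removed node no longer links to it.
            LinkedListNode<T> next = node.Next;
            if (match(node.Value))
            {
                list.Remove(node);
            }
            node = next;
        }
    }
}
```

Note that this walks the nodes directly rather than using foreach; modifying the list inside a foreach over it would invalidate the enumerator.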
ArrayList is a holdover from before Generics. There's really no reason to use them ... they're slow and use more memory than List<>. In general, there's probably no reason to use LinkedList either unless you are inserting midway through VERY large lists.
The only thing you'll find in .NET faster than a List<> is a fixed array ... but the performance difference is surprisingly small.
See the article on Commonly Used Collection Types from MSDN for a list of the various types of collections available to you, and their intended uses.
ArrayList is a .Net 1.0 list type.
List is a generic list introduced with generics in .Net 2.0.
Generic lists provide better compile-time support. Generic lists are type safe: you cannot add objects of the wrong type. Therefore you know which type the stored objects have. No type checks or typecasts are necessary.
I don't know about performance differences.
This question says something about the difference between List and LinkedList.
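The type-safety point can be made concrete with a short sketch (the class and method names are invented for illustration). An ArrayList accepts anything and the mistake only surfaces as an InvalidCastException at run time, whereas the equivalent mistake with List<string> does not compile.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class TypeSafetySketch
{
    // Returns true if reading the ArrayList back as strings fails at run time.
    public static bool ArrayListCastFails()
    {
        var untyped = new ArrayList();
        untyped.Add("text");
        untyped.Add(42); // compiles: the int is boxed and stored as object

        try
        {
            int totalLength = 0;
            foreach (object item in untyped)
            {
                totalLength += ((string)item).Length; // cast required on every read
            }
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    // The generic version needs no casts when reading.
    public static int TotalLength(List<string> typed)
    {
        int totalLength = 0;
        foreach (string item in typed)
        {
            totalLength += item.Length;
        }
        return totalLength;
        // typed.Add(42) would be rejected by the compiler.
    }
}
```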
As mentioned, don't use ArrayList if at all possible.
Here's a bit on Wikipedia about the differences between arrays and linked lists.
In summary:
Arrays:
  Fast random access
  Fast inserting/deleting at end
  Good memory locality
Linked Lists:
  Fast inserting/deleting at beginning
  Fast inserting/deleting at end
  Fast inserting/deleting at middle (with enumerator)
Generally, use List. Don't use ArrayList; it's obsolete. Use LinkedList in the rare cases where you need to be able to add without resizing and don't mind the overhead and loss of random access.
ArrayList is probably smaller, memory-wise, since it is based on an array (as is List<T>). It also has fast random access to elements. However, adding to or removing from the list will take longer. This might be sped up slightly if the object over-allocates under the assumption that you are going to keep adding. (That will, of course, reduce the memory advantage.)
A linked list will be somewhat larger (extra memory per element for the node links), and will have poor random access times. However, it is very fast to add or remove objects at the ends of the list. Also, memory usage is usually spot-on for what you need.
