When we do .ToList() for an IEnumerable, the list can potentially reallocate while scanning the IEnumerable because it doesn't know the size upfront. If the size is known, is there a simple way to avoid the performance penalty? Something to the effect of initializing a List with the required capacity and then copying the IEnumerable into it? Ideally something as simple as .ToList(capacity) (which doesn't exist).
When the IEnumerable<T> is also an ICollection<T>, its Count is known up front, so the library allocates the list at the correct capacity.
Here is a reference implementation of List<T>(IEnumerable<T> source), which is invoked when you call ToList():
public List(IEnumerable<T> collection) {
if (collection==null)
ThrowHelper.ThrowArgumentNullException(ExceptionArgument.collection);
Contract.EndContractBlock();
ICollection<T> c = collection as ICollection<T>;
if( c != null) {
int count = c.Count;
if (count == 0) {
_items = _emptyArray;
} else {
_items = new T[count];
c.CopyTo(_items, 0);
_size = count;
}
} else {
_size = 0;
_items = _emptyArray;
// This enumerable could be empty. Let Add allocate a new array, if needed.
// Note it will also go to _defaultCapacity first, not 1, then 2, etc.
using(IEnumerator<T> en = collection.GetEnumerator()) {
while(en.MoveNext()) {
Add(en.Current);
}
}
}
}
Note how the constructor behaves when collection implements ICollection<T>: rather than iterating the content and calling Add for each item, it allocates the internal _items array, and copies the content into it without reallocations.
When the class implementing IEnumerable<T> does not expose its size but you know it anyway, you can easily define such a method yourself from standard building blocks:
public static class ToListExtension {
public static List<T> ToList<T>(this IEnumerable<T> source, int capacity)
{
var res = new List<T>(capacity);
res.AddRange(source);
return res;
}
}
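As a rough check of the idea, here is a minimal self-contained sketch. The names ToListCapacitySketch, ToListWithCapacity and EvenNumbersBelow are made up for illustration; the iterator method stands in for any sequence that is not an ICollection<T>.

```csharp
using System;
using System.Collections.Generic;

public static class ToListCapacitySketch
{
    // Same idea as the extension above, under a hypothetical name.
    public static List<T> ToListWithCapacity<T>(this IEnumerable<T> source, int capacity)
    {
        var result = new List<T>(capacity); // backing array allocated once
        result.AddRange(source);            // no growth while items fit the capacity
        return result;
    }

    // A yield-based iterator: it does not implement ICollection<int>,
    // so List<T> cannot ask it for a Count up front.
    public static IEnumerable<int> EvenNumbersBelow(int limit)
    {
        for (int i = 0; i < limit; i += 2)
            yield return i;
    }
}
```

Calling ToListCapacitySketch.EvenNumbersBelow(100).ToListWithCapacity(50) should give a list whose Count and Capacity are both 50, whereas a plain ToList() may grow its backing array several times and end up with spare capacity.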
Is there such a thing as a non-concurrent bag? I see lots of mentions of ConcurrentBag, but nothing about Bag.
Does such a collection exist?
To clarify why I'd like to use such a collection, I often find that the order of a collection becomes an important but potentially difficult to trace property of a collection.
I'm not necessarily saying that this would often happen in good, well designed code, but there are situations where I wish I could say "do not expect any order on this collection, order it specifically as and when you need to".
No, but since it is documented as being "optimized for scenarios where the same thread will be both producing and consuming data stored in the bag", you can just use ConcurrentBag in both concurrent and non-concurrent scenarios.
You could always write your own class that "hides" the order of the items.
For example:
public sealed class NonConcurrentBag<T>: IReadOnlyCollection<T>
{
public void Add(T item)
{
// When adding an item, add it to a random location to avoid callers assuming an ordering.
if (_items.Count == 0)
{
_items.Add(item);
return;
}
int index = _rng.Next(0, _items.Count);
_items.Add(_items[index]);
_items[index] = item;
}
public void Clear()
{
_items.Clear();
}
public T Take()
{
if (_items.Count == 0)
throw new InvalidOperationException("Attempting to Take() from an empty NonConcurrentBag");
var result = _items[_items.Count - 1];
_items.RemoveAt(_items.Count - 1);
return result;
}
public bool IsEmpty => _items.Count == 0;
public IEnumerator<T> GetEnumerator()
{
return _items.GetEnumerator();
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
public int Count => _items.Count;
readonly List<T> _items = new List<T>();
readonly Random _rng = new Random();
}
When you add an item, this sneakily adds it at a random index (shifting the displaced item to the end of the list).
That means that not only is the index of an item random when you add it, it can also move to a different index as you add more items. That'll foil anything that expects any particular order!
Adding and taking items are O(1) operations, except that an Add which forces the underlying list to resize is O(N).
I found this method to shuffle a list but what I would like, is for it to return the new list, and I can not seem to figure out how this is done.
Here is what I tried
public static class Lists {
private static System.Random rng = new System.Random();
public static List<T> Shuffle<T>(this IList<T> list) {
int n = list.Count;
while(n > 1) {
n--;
int k = rng.Next(n + 1);
T value = list[k];
list[k] = list[n];
list[n] = value;
}
return list;
}
}
What it is saying is:
Can not convert IList to List
What I want is for the method to return the new Shuffled list, but I can't get it to return the list. How can this be done?
An IList is not necessarily a List. Your method returns a List, but is passed an IList:
public static List<T> Shuffle<T>(this IList<T> list) {
int n = list.Count;
while(n > 1) {
n--;
int k = rng.Next(n + 1);
T value = list[k];
list[k] = list[n];
list[n] = value;
}
return list; //Still an IList!
}
The easiest/best solution would be to just return an IList<T> instead of List<T>.
public static IList<T> Shuffle<T>(this IList<T> list) {
If you really need an actual list, call ToList:
return list.ToList(); //Now it's a list
Of course, this enumerates your collection one more time than is necessary.
Can not convert IList to List
That's because you have an IList<T> but the method is declared to return a List<T>. You can cast with return (List<T>)list; but this will fail at run time if the method is called with any other IList<T> implementation.
It will also fail before that, during the shuffle itself, if the IList<T> is read-only.
However,
what I would like, is for it to return the new list
Well, there is no new list here. But we can easily have that be so:
public static List<T> Shuffle<T>(this IList<T> source) {
List<T> list = new List<T>(source);
With this change it now creates a new list, and does the shuffle on that before it returns it.
But wait, why restrict ourselves to IList<T> as input? We can create a new list from any IEnumerable<T>, so why not do that?
public static List<T> Shuffle<T>(this IEnumerable<T> source) {
List<T> list = new List<T>(source);
Still, there's a flaw: the shared static Random is not thread-safe, yet nothing stops this method from being called on several threads at once. That is also easily changed; remove the static Random and have:
public static List<T> Shuffle<T>(this IEnumerable<T> source) {
List<T> list = new List<T>(source);
Random rng = new System.Random();
Now it doesn't error, returns a new List as desired, accepts a wider range of input, and is threadsafe.
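One caveat with creating a new Random on every call: on .NET Framework the default seed is time-based, so instances created in quick succession can produce identical sequences. A possible alternative, sketched here under the hypothetical names ShuffleSketch and ShuffledCopy, keeps one Random per thread via ThreadLocal<Random>; the Guid-based seed is just one way to make the seeds differ.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class ShuffleSketch
{
    // One Random per thread: Random itself is not thread-safe.
    // Seeding from a Guid hash is an assumption made for this sketch.
    private static readonly ThreadLocal<Random> Rng =
        new ThreadLocal<Random>(() => new Random(Guid.NewGuid().GetHashCode()));

    // Returns a new shuffled list and leaves the source untouched.
    public static List<T> ShuffledCopy<T>(this IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        var list = new List<T>(source);
        Random rng = Rng.Value;

        // Fisher-Yates: swap each position with a random earlier (or same) one.
        for (int n = list.Count - 1; n > 0; n--)
        {
            int k = rng.Next(n + 1);
            T tmp = list[k];
            list[k] = list[n];
            list[n] = tmp;
        }
        return list;
    }
}
```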
First of all, any List is automatically an IList. A List must implement the methods defined on the IList interface, so it is by definition, an IList.
But your code is defined to receive, as input, an IList. That means you can potentially pass it anything that implements the IList interface. So, you could pass it a List, or an array, or a ReadOnlyCollection, or an ObservableCollection, or any other collection type that implements IList. Because you are trying to return a List when you were not necessarily passed a List, the compiler looks for an implicit conversion from IList to List and complains that there is none. Most of the objects that implement IList cannot be automatically converted into a List.
Your code is confusing because the input parameter is an IList but is named list. You should fix that. And if you want to return a List, you need to create one in your method from whatever collection type was passed in. That's what .ToList() is for.
public static class Lists {
private static System.Random rng = new System.Random();
public static List<T> Shuffle<T>(this IList<T> iList) {
var list = iList.ToList();
int n = list.Count;
while(n > 1) {
n--;
int k = rng.Next(n + 1);
T value = list[k];
list[k] = list[n];
list[n] = value;
}
return list;
}
}
But even better, you probably ought to type your input parameter as IEnumerable<T>, not IList<T>. You can execute ToList() on any enumeration, so this allows your method to be utilized on a broader range of potential types.
public static class Lists {
private static System.Random rng = new System.Random();
public static List<T> Shuffle<T>(this IEnumerable<T> tEnumeration) {
var list = tEnumeration.ToList();
int n = list.Count;
while(n > 1) {
n--;
int k = rng.Next(n + 1);
T value = list[k];
list[k] = list[n];
list[n] = value;
}
return list;
}
}
public static class Lists
{
public static IList<T> Shuffle<T>(this IList<T> list)
{
Random rng = new Random();
int n = list.Count;
while (n > 1)
{
n--;
int k = rng.Next(n + 1);
T value = list[k];
list[k] = list[n];
list[n] = value;
}
return list;
}
}
I've changed it to be IList as the return type instead
I've moved the Random instance inside the method. It's all about scope and unless another method also happens to need to get random numbers then there isn't a need to have it outside the method
Moved the opening brace onto a new line. This is not Java.
What I want is for the method to return the new Shuffled list, but I can't get it to return the list. How can this be done?
This is not what your code does. Your code shuffles the list in place; it does not create a new one.
You can return a new List like this:
public static List<T> Shuffle<T>(this IList<T> list) {
return list.OrderBy(x => rng.Next()).ToList();
}
If you do want to shuffle the list in place, you should probably use the same type for the parameter and the return type. In your case, since you're not using anything specific to List, you should probably use IList:
// Notice the I
public static IList<T> Shuffle<T>(this IList<T> list) {
// your code
}
Does the .Last() extension method take into account if it's called on an IList? I'm just wondering if there's a significant performance difference between these:
IList<int> numbers = new int[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
int lastNumber1 = numbers.Last();
int lastNumber2 = numbers[numbers.Count-1];
Intuition tells me that the first alternative is O(n) but the second is O(1). Is .Last() "smart" enough to try casting it to an IList?
Probably not a significant difference, as Last() can simply do list[list.Count - 1]
Verified by reflector:
public static TSource Last<TSource>(this IEnumerable<TSource> source)
{
if (source == null)
{
throw Error.ArgumentNull("source");
}
IList<TSource> list = source as IList<TSource>;
if (list != null)
{
int count = list.Count;
if (count > 0)
{
return list[count - 1];
}
}
...
}
This is an undocumented optimization, but the predicate-less overload of Enumerable.Last does indeed skip straight to the end.
Note that the overload with a predicate doesn't just go from the end, working backwards as you might expect - it goes forwards from the start. I believe this is to avoid inconsistency when the predicate may throw an exception (or cause other side effects).
See my blog post about implementing First/Last/Single etc for more information - and an inconsistency which is present between the overloads of Single/SingleOrDefault.
Reflector:
public static TSource Last<TSource>(this IEnumerable<TSource> source)
{
...
if (list != null)
{
int count = list.Count;
if (count > 0)
{
return list[count - 1];
}
}
else
{
using (IEnumerator<TSource> enumerator = source.GetEnumerator())
{
...
}
}
throw Error.NoElements();
}
Answer: Yes.
Here's a neat way to find out:
class MyList<T> : IList<T> {
private readonly List<T> list = new List<T>();
public T this[int index] {
get {
Console.WriteLine("Inside indexer!");
return list[index];
}
set {
list[index] = value;
}
}
public void Add(T item) {
this.list.Add(item);
}
public int Count {
get {
Console.WriteLine("Inside Count!");
return this.list.Count;
}
}
// all other IList<T> interface members throw NotImplementedException
}
Then:
MyList<int> list = new MyList<int>();
list.Add(1);
list.Add(2);
Console.WriteLine(list.Last());
Output:
Inside Count!
Inside indexer!
2
If you try this:
Console.WriteLine(list.Last(n => n % 2 == 0));
then you get an exception in GetEnumerator showing that it is trying to walk the list. If we implement GetEnumerator via
public IEnumerator<T> GetEnumerator() {
Console.WriteLine("Inside GetEnumerator!");
return this.list.GetEnumerator();
}
and try again we see
Inside GetEnumerator!
2
on the console showing that the indexer was never used.
Original poster is talking about an interface, not the implementation.
So it depends on the underlying implementation behind the IList/IList<T> in question. You don't know how its indexer is implemented. I believe the framework's List<T> is backed by an array, so a direct lookup is possible, but if all you have is a reference to an IList<T>, that is not a given by any means.
There's no Sort() function for IList. Can someone help me with this?
I want to sort my own IList.
Suppose this is my IList:
public class MyObject
{
public int number { get; set; }
public string marker { get; set; }
}
How do I sort myobj using the marker string?
public void SortObject()
{
IList<MyObject> myobj = new List<MyObject>();
}
Use OrderBy
Example
public class MyObject
{
public int number { get; set; }
public string marker { get; set; }
}
IList<MyObject> myobj = new List<MyObject>();
var orderedList = myobj.OrderBy(x => x.marker).ToList();
For a case-insensitive sort, pass an IComparer<string>:
public class CaseInsensitiveComparer : IComparer<string>
{
public int Compare(string x, string y)
{
return string.Compare(x, y, StringComparison.OrdinalIgnoreCase);
}
}
IList<MyObject> myobj = new List<MyObject>();
var orderedList = myobj.OrderBy(x => x.marker, new CaseInsensitiveComparer()).ToList();
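As an alternative to writing the comparer class by hand, the framework's built-in StringComparer.OrdinalIgnoreCase can be passed directly. A minimal sketch, where MarkedItem and CaseInsensitiveSortSketch are hypothetical stand-ins for the types in the question:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's MyObject.
public class MarkedItem
{
    public int Number { get; set; }
    public string Marker { get; set; }
}

public static class CaseInsensitiveSortSketch
{
    // Returns a new list ordered by Marker, ignoring case.
    public static List<MarkedItem> SortByMarker(IEnumerable<MarkedItem> items)
    {
        return items
            .OrderBy(x => x.Marker, StringComparer.OrdinalIgnoreCase)
            .ToList();
    }
}
```

With markers "B", "a" and "c", this should give "a", "B", "c", whereas a case-sensitive ordinal comparison would put "B" first.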
I would advise against using OrderBy with a list because it's a LINQ extension method, and therefore:
It wraps the list in an enumerable, then enumerates it and fills a new temporary list, then sorts this new list.
It wraps the sorted list inside another enumerable.
Then when you call ToList(), it iterates on it and fills another new list with the items.
In essence: it creates and fills 2 new lists and 2 enumerables in addition to the actual sorting.
In comparison, List.Sort() sorts in place and creates nothing, so it's far more efficient.
My recommendation would be:
If you know the underlying type, use List.Sort() or Array.Sort(array)
If you don't know the underlying type, copy the List to a temporary array and sort it using Array.Sort(array) and return it.
var sorted = myObj.OrderBy(x => x.marker);
OrderBy definitely gets the job done, but I personally prefer the syntax of List.Sort because you can feed it a Comparison<T> delegate instead of having to write a class that implements IComparer<T>. We can accomplish that goal with an extension method, and if that's something you're interested in, check out SortExtensions:
http://blog.velir.com/index.php/2011/02/17/ilistt-sorting-a-better-way/
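For illustration only (this is not the implementation from the linked post), an extension along these lines could accept a Comparison<T> on any IList<T>. The names IListSortSketch and SortInPlace are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class IListSortSketch
{
    // Sorts any writable IList<T> in place using a Comparison<T> delegate.
    public static void SortInPlace<T>(this IList<T> list, Comparison<T> comparison)
    {
        if (list == null) throw new ArgumentNullException(nameof(list));
        if (comparison == null) throw new ArgumentNullException(nameof(comparison));

        // Fast path: List<T> can sort itself without a copy.
        List<T> concrete = list as List<T>;
        if (concrete != null)
        {
            concrete.Sort(comparison);
            return;
        }

        // General path: copy out, sort the copy, write the items back.
        T[] buffer = new T[list.Count];
        list.CopyTo(buffer, 0);
        Array.Sort(buffer, comparison);
        for (int i = 0; i < buffer.Length; i++)
        {
            list[i] = buffer[i];
        }
    }
}
```

A call then looks like names.SortInPlace((a, b) => string.CompareOrdinal(a, b)); with no comparer class needed.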
For an explanation of why not to use OrderBy or similar, see Christophe's answer.
Here is one attempt to make fast Sort:
public static void Sort<T>(this IList<T> ilist)
{
switch(ilist)
{
case List<T> lst:
lst.Sort();
break;
case Array arr:
Array.Sort(arr);
break;
default:
throw new NotImplementedException();
// or add slow impl if you don't want this to fail!!
}
}
To sort in-place you would essentially see these two approaches:
IList<T> list = .... // your ilist
var sorted = list.ToArray();
Array.Sort(sorted);
for (int i = 0; i < list.Count; i++)
{
list[i] = sorted[i];
}
and
IList<T> list = .... // your ilist
ArrayList.Adapter((IList)list).Sort();
The second one might look simpler but won't be great for collections of value types, since it incurs boxing penalties. Furthermore, there is no guarantee that your IList<T> also implements the non-generic IList. The first one is better IMO.
You can also use the first approach to sort an ICollection<T> in place, but it is questionable whether you should expose such functionality, since the ICollection<T> contract doesn't guarantee an order (think hash structures). Anyway, here is a code example:
ICollection<T> collection = .... // your icollection
var sorted = collection.ToArray();
Array.Sort(sorted);
collection.Clear();
foreach (var i in sorted)
{
collection.Add(i);
}
A note on sort stability: .NET's Array/List sorting algorithms are unstable. For a stable sort you will have to use:
IList<T> list = .... // your ilist
var sorted = list.OrderBy(i => i).ToArray();
for (int i = 0; i < list.Count; i++)
{
list[i] = sorted[i];
}
This can't be as fast as unstable sorts.
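To make the stability point concrete, here is a small sketch that sorts by a key while keeping equal-keyed items in their original relative order. The names StableSortSketch and StableSortInPlace are hypothetical; it relies only on OrderBy being a stable sort.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StableSortSketch
{
    // Sorts the list in place by key; items with equal keys keep their order.
    public static void StableSortInPlace<T, TKey>(this IList<T> list, Func<T, TKey> keySelector)
    {
        // ToArray() materializes the ordered result before anything is written back.
        T[] sorted = list.OrderBy(keySelector).ToArray();
        for (int i = 0; i < sorted.Length; i++)
        {
            list[i] = sorted[i];
        }
    }
}
```

Sorting "b1", "a1", "b2", "a2" by first character should give "a1", "a2", "b1", "b2": within each letter the original order survives, which an unstable sort does not promise.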
Finally, for a complete answer, perhaps a composite approach taken by watbywbarif is better:
public static void Sort<T>(this IList<T> list, IComparer<T> comparer, bool stable)
{
if (stable)
{
list.StableSort(comparer);
}
else
{
list.UnstableSort(comparer);
}
}
static void StableSort<T>(this IList<T> list, IComparer<T> comparer)
{
list.OrderBy(x => x, comparer).CopyTo(list);
}
static void UnstableSort<T>(this IList<T> list, IComparer<T> comparer)
{
switch (list)
{
case List<T> l:
l.Sort(comparer);
break;
case T[] a:
Array.Sort(a, comparer);
break;
default:
T[] sortable = list.ToArray();
sortable.UnstableSort(comparer);
sortable.CopyTo(list);
break;
}
}
static void CopyTo<T>(this IEnumerable<T> source, IList<T> target)
{
int i = 0;
foreach (T item in source)
{
target[i++] = item;
}
}
That's as far as built-in approaches go. For a faster implementation you will have to roll your own, see: https://stackoverflow.com/a/19167475
My class contains a Dictionary<T, S> dict, and I want to expose a ReadOnlyCollection<T> of the keys. How can I do this without copying the Dictionary<T, S>.KeyCollection dict.Keys to an array and then exposing the array as a ReadOnlyCollection?
I want the ReadOnlyCollection to be a proper wrapper, ie. to reflect changes in the underlying Dictionary, and as I understand it copying the collection to an array will not do this (as well as seeming inefficient - I don't actually want a new collection, just to expose the underlying collection of keys...). Any ideas would be much appreciated!
Edit: I'm using C# 2.0, so don't have extension methods such as .ToList (easily) available.
If you really want to use ReadOnlyCollection<T>, the issue is that the constructor of ReadOnlyCollection<T> takes an IList<T>, while the KeyCollection of the Dictionary is only an ICollection<T>.
So if you want to wrap the KeyCollection in a ReadOnlyCollection, you'll have to create an adapter (or wrapper) type, implementing IList<T>, wrapping the KeyCollection. So it would look like:
var dictionary = ...;
var readonly_keys = new ReadOnlyCollection<T>(new CollectionListWrapper<T>(dictionary.Keys));
Not very elegant though, especially as the KeyCollection is already a readonly collection, and that you could simply pass it around as an ICollection<T> :)
DrJokepu said that it might be difficult to implement a wrapper for Keys Collection. But, in this particular case, I think the implementation is not so difficult because, as we know, this is a read-only wrapper.
This allows us to ignore some methods that would otherwise be hard to implement.
Here's a quick implementation of the wrapper for Dictionary.KeyCollection :
class MyListWrapper<T, TValue> : IList<T>
{
private Dictionary<T, TValue>.KeyCollection keys;
public MyListWrapper(Dictionary<T, TValue>.KeyCollection keys)
{
this.keys = keys;
}
#region IList<T> Members
public int IndexOf(T item)
{
// Dictionary keys are never null, so a null item cannot be present.
if (item == null)
return -1;
int i = 0;
foreach (T key in keys)
{
if (key.Equals(item))
return i;
i++;
}
// The IList<T> contract is to return -1 when the item is not found.
return -1;
}
public void Insert(int index, T item)
{
throw new NotImplementedException();
}
public void RemoveAt(int index)
{
throw new NotImplementedException();
}
public T this[int index]
{
get
{
// Valid indexes are 0 .. Count - 1.
if (index < 0 || index >= keys.Count)
throw new ArgumentOutOfRangeException("index");
IEnumerator<T> e = keys.GetEnumerator();
int i = 0;
while (e.MoveNext() && i != index)
{
i++;
}
return e.Current;
}
set
{
throw new NotImplementedException();
}
}
#endregion
#region ICollection<T> Members
public void Add(T item)
{
throw new NotImplementedException();
}
public void Clear()
{
throw new NotImplementedException();
}
public bool Contains(T item)
{
return keys.Contains(item);
}
public void CopyTo(T[] array, int arrayIndex)
{
keys.CopyTo(array, arrayIndex);
}
public int Count
{
get { return keys.Count; }
}
public bool IsReadOnly
{
get { return true; }
}
public bool Remove(T item)
{
throw new NotImplementedException();
}
#endregion
#region IEnumerable<T> Members
public IEnumerator<T> GetEnumerator()
{
return keys.GetEnumerator();
}
#endregion
#region IEnumerable Members
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return keys.GetEnumerator();
}
#endregion
}
This might not be the best implementation for these methods :) but it was just for proving that this might be done.
Assuming you are using C# 3.0 and you have:
Dictionary< T,S > d;
Then
ReadOnlyCollection< T > r = new ReadOnlyCollection< T >( d.Keys.ToList() );
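You will also need to import the System.Linq namespace. Note that ToList() copies the keys, so the resulting collection will not reflect later changes to the dictionary.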
Unfortunately, as far as I know you cannot do that directly, as KeyCollection<T> does not expose anything that would allow you to do this easily.
You could, however, subclass ReadOnlyCollection<T> so that its constructor receives the dictionary itself and override the appropriate methods so that it exposes the Dictionary's items as if they were its own items.
For the record, in .NET 4.6, the KeyCollection<T> implements IReadOnlyCollection<T>, so if you use that interface, you can still reflect changes to the dictionary, still get O(1) contains*, and because the interface is covariant, you can return IReadOnlyCollection<some base type>
*Enumerable.Contains<T> does an as cast on the IEnumerable to forward it to ICollection<T>.Contains if available. See "remarks" on Enumerable.Contains: https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.contains#system-linq-enumerable-contains-1(system-collections-generic-ienumerable((-0))-0). Dictionary.KeyCollection also implements ICollection<T>
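A minimal sketch of that approach, assuming a runtime where the key collection implements IReadOnlyCollection<T>. The class name KeyRegistrySketch and its members are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public class KeyRegistrySketch
{
    private readonly Dictionary<string, int> _dict = new Dictionary<string, int>();

    // Exposes the live key collection: no copy, and no mutating members
    // are visible through this interface.
    public IReadOnlyCollection<string> Keys
    {
        get { return _dict.Keys; }
    }

    public void Add(string key, int value)
    {
        _dict.Add(key, value);
    }
}
```

Because Keys hands out the dictionary's own key collection, a reference obtained before a later Add still sees the new key afterwards.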
It's ugly, but this will do it
Dictionary<int,string> dict = new Dictionary<int, string>();
...
ReadOnlyCollection<int> roc = new ReadOnlyCollection<int>((new List<int>((IEnumerable<int>)dict.Keys)));