I know I shouldn't be exposing a List<T> in a property, but I wonder what the proper way to do it is? For example, doing this:
public static class Class1
{
    private static readonly List<string> _list;

    public static IEnumerable<string> List
    {
        get
        {
            return _list;
            //return _list.AsEnumerable<string>(); behaves the same
        }
    }

    static Class1()
    {
        _list = new List<string>();
        _list.Add("One");
        _list.Add("Two");
        _list.Add("Three");
    }
}
would allow my caller to simply cast back to List<T>:
private void button1_Click(object sender, EventArgs e)
{
    var test = Class1.List as List<string>;
    test.Add("Four"); // This really modifies Class1._list, which is bad™
}
So if I want a really immutable List<T> would I always have to create a new list? For example, this seems to work (test is null after the cast):
public static IEnumerable<string> List
{
    get
    {
        return new ReadOnlyCollection<string>(_list);
    }
}
But is there a performance overhead, since it looks as if my list is cloned every time someone accesses the property?
Exposing a List<T> as a property isn't actually the root of all evil; especially if it allows expected usage such as foo.Items.Add(...).
You could write a cast-safe alternative to AsEnumerable():
public static IEnumerable<T> AsSafeEnumerable<T>(this IEnumerable<T> data)
{
    foreach (T item in data) yield return item;
}
But your biggest problem at the moment is thread safety. As a static member, you might have big problems here, especially if it is in something like ASP.NET. Even ReadOnlyCollection over an existing list would suffer from this:
List<int> ints = new List<int> { 1, 2, 3 };
var ro = ints.AsReadOnly();
Console.WriteLine(ro.Count); // 3
ints.Add(4);
Console.WriteLine(ro.Count); // 4
So simply wrapping with AsReadOnly is not enough to make your object thread-safe; it merely protects against the consumer adding data (but they could still be enumerating it while your other thread adds data, unless you synchronize or make copies).
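If stale-but-consistent data is acceptable, one option is to hand out a snapshot taken under a lock. A minimal sketch (the SnapshotExample class, the _sync field and the member names are illustrative, not from the question):

```csharp
using System;
using System.Collections.Generic;

public static class SnapshotExample
{
    private static readonly object _sync = new object();
    private static readonly List<string> _list =
        new List<string> { "One", "Two", "Three" };

    // Hands out an independent copy: later writers cannot affect
    // what a caller is already enumerating.
    public static IEnumerable<string> Snapshot()
    {
        lock (_sync)
        {
            return _list.ToArray();
        }
    }

    public static void Add(string item)
    {
        lock (_sync)
        {
            _list.Add(item);
        }
    }
}
```

Copying costs O(n) per call, but a caller's enumeration can never race with a writer or throw "collection was modified".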
Yes and No. Yes, there is a performance overhead, because a new object is created. No, your list is not cloned, it is wrapped by the ReadOnlyCollection.
If the class has no other purpose you could derive a collection that throws from its insertion hook; note that List<T>.Add is not virtual, so you would derive from Collection<T> and override InsertItem rather than subclassing List<T> itself.
Use AsReadOnly() - see MSDN for details
You don't need to worry about the overhead of cloning: wrapping a collection with a ReadOnlyCollection does not clone it. It just creates a wrapper; if the underlying collection changes, the readonly version changes also.
If you worry about creating fresh wrappers over and over again, you can cache it in a separate instance variable.
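A minimal sketch of that caching idea (the Catalog class and member names are made up for illustration). Because the wrapper is live, a single cached instance stays correct as the underlying list changes:

```csharp
using System.Collections.Generic;
using System.Collections.ObjectModel;

public class Catalog
{
    private readonly List<string> _items = new List<string>();
    private ReadOnlyCollection<string> _readOnlyView; // created once, reused

    public ReadOnlyCollection<string> Items
    {
        get
        {
            // The wrapper reflects later changes to _items,
            // so one cached instance is enough.
            if (_readOnlyView == null)
                _readOnlyView = new ReadOnlyCollection<string>(_items);
            return _readOnlyView;
        }
    }

    public void Add(string item)
    {
        _items.Add(item);
    }
}
```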
I asked a similar question earlier:
Difference between List and Collection (CA1002, Do not expose generic lists)
Why does DoNotExposeGenericLists recommend that I expose Collection instead of List?
Based on that I would recommend that you use the List<T> internally, and return it as a Collection<T> or IList<T>. Or, if it is only necessary to enumerate and not add or anything like that, IEnumerable<T>.
On the matter of being able to cast what you return in to other things, I would just say don't bother. If people want to use your code in a way that it was not intended, they will be able to in some way or another. I previously asked a question about this as well, and I would say the only wise thing to do is to expose what you intend, and if people use it in a different way, well, that is their problem :p Some related questions:
How should I use properties when dealing with List members (Especially this answer)
Encapsulation of for example collections
If you expose your list as IEnumerable, I wouldn't worry about callers casting back to List. You've explicitly indicated in the contract of your class that only the operations defined in IEnumerable are allowed on this list. So you have implicitly stated that the implementation of that list could change to pretty much anything that implements IEnumerable.
AsEnumerable and ReadOnlyCollection have a problem when your enumeration is midway and the collection gets modified; they are not thread safe. Returning the contents as an array, cached at the time of calling, can be a much better option.
For example,
public static String[] List
{
    get
    {
        return _List.ToArray();
    }
}

// While using ...
String[] values = Class1.List;

// Instead of calling foreach(string v in Class1.List) again and again,
// 'values' in this context is a cached snapshot: it will not be
// duplicated on each loop, and immediate changes to the source will
// not be visible, but enumerating it is thread safe.
foreach (string v in values)
{
    ...
}
foreach (string v in values)
{
    ...
}
I've been given some code from a customer that looks like this:
public class Thing
{
    // custom functionality for Thing...
}

public class Things : IEnumerable
{
    Thing[] things;
    internal int Count { get { return things.Length; } }
    public Thing this[int i] { get { return this.things[i]; } }
    public IEnumerator GetEnumerator() { return new ThingEnumerator(this); }
    // custom functionality for Things...
}

public class ThingEnumerator : IEnumerator
{
    int i;
    readonly int count;
    Things container;

    public ThingEnumerator(Things container)
    {
        i = -1;
        count = container.Count;
        this.container = container;
    }

    public object Current { get { return this.container[i]; } }
    public bool MoveNext() { return ++i < count; }
    public void Reset() { i = -1; }
}
What I'm wondering is whether it would have been better to have gotten rid of the ThingEnumerator class and replaced the Things.GetEnumerator call with an implementation that simply delegated to the array's GetEnumerator? Like so:
public IEnumerator GetEnumerator() { return things.GetEnumerator(); }
Are there any advantages to keeping the code as is? (Another thing I've noticed is that the existing code could be improved by replacing IEnumerator with IEnumerator<Thing>.)
With generics, there is really little value in implementing IEnumerable and IEnumerator yourself.
Removing these and replacing the class with a generic collection means you have far less code to maintain, and has the advantage of using code that is known to work.
In the general case, there can sometimes be a reason to implement your own enumerator. You might want some functionality that the built-in one doesn't offer - some validation, logging, raising OnAccess-type events somewhere, perhaps some logic to lock items and release them afterwards for concurrent access (I've seen code that does that last one; it's odd and I wouldn't recommend it).
Having said that, I can't see anything like that in the example you've posted, so it doesn't seem to be adding any value beyond what IEnumerable provides. As a rule, if there's built-in code that does what you want, use it. All you'll achieve by rolling your own is to create more code to maintain.
The code you have looks like code that was written for .NET 1.0/1.1, before .NET generics were available - at that time, there was value in implementing your own collection class (generally derived from System.Collections.CollectionBase) so that the indexer property could be typed to the runtime type of the collection.
However, unless you were using value types and boxing/unboxing was the limiting performance factor, I would have inherited from CollectionBase, and there would be no need to redefine GetEnumerator() or Count.
However, now, I would recommend one of these two approaches:
If you need the custom collection to have some custom functionality, then derive the collection from System.Collections.ObjectModel.Collection<Thing> - it provides all the necessary hooks for you to control insertion, replacement and deletion of items in the collection.
If you actually only need something that needs to be enumerated, I would return a standard IList<Thing> backed by a List<Thing>.
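For the first option, a small sketch of what deriving from Collection<Thing> might look like; the null check here simply stands in for whatever custom insertion logic you actually need:

```csharp
using System;
using System.Collections.ObjectModel;

public class Thing { }

// Collection<T> gives you a typed indexer plus protected hooks
// (InsertItem, SetItem, RemoveItem, ClearItems) for custom behavior.
public class Things : Collection<Thing>
{
    protected override void InsertItem(int index, Thing item)
    {
        if (item == null)
            throw new ArgumentNullException("item");
        base.InsertItem(index, item); // default behavior after validation
    }
}
```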
Unless you are doing something truly custom (such as some sort of validation) in the custom enumerator, then no, there really isn't any reason to do this.
Generally, go with what is available in the standard libraries unless there is a definite reason not to. They are likely better tested and have had more time spent on them, as individual units of code, than you can afford to spend, and why reinvent the wheel?
In cases like this, the code already exists but it may still be better to replace the code if you have time to test very well. (It's a no-brainer if there is decent unit test coverage.)
You'll be reducing your maintenance overhead, removing a potential source of obscure bugs and leaving the code cleaner than you found it. Uncle Bob would be proud.
An array enumerator does pretty much the same as your custom enumerator, so yes, you can just as well return the array's enumerator directly.
In this case, I would recommend you do it, because array enumerators also perform more error checking and, as you stated, it's just simpler.
For instance, if I have a class:
public class StuffHolder
{
    List<Stuff> myList;

    public StuffHolder()
    {
        myList = new List<Stuff>();
        myList.Add(new Stuff(myList));
        myList[0].stuffHappens();
    }
}
and a Stuff Object:
public class Stuff
{
    List<Stuff> myList;

    public Stuff(List<Stuff> myList)
    {
        this.myList = myList;
    }

    public void stuffHappens()
    {
        myList.Remove(this);
    }
}
What are the disadvantages of calling stuffHappens() rather than having stuff pass the information that it should be removed to the StuffHolder class and having the StuffHolder class remove that specific Stuff?
There's a hazard if stuffHappens() ever occurs in more than one thread at a time, as the List<T> collection is not thread-safe.
The bigger hazard is the confusion of responsibility, as it probably shouldn't be the job of Stuff to know about it being stored in a collection. This kind of design 'fuzziness' causes steadily increasing confusion as systems grow and evolve.
It will certainly work (i.e. the Stuff object will be removed from the list).
The question is why you have a StuffHolder in the first place. Usually when you wrap a collection like that you are doing it to maintain some invariants or cache some data. Using the list like this means you could violate the invariant.
Essentially the issue is that StuffHolder has no idea that an object has been removed from its list. It's up to you whether that is a problem for your particular situation.
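If that matters, one alternative is to keep removal inside StuffHolder and have Stuff merely signal its owner. A sketch with illustrative names (Create, Done) that are not from the question:

```csharp
using System;
using System.Collections.Generic;

public class Stuff
{
    // Stuff no longer holds the list; it only signals its owner.
    public event Action<Stuff> Done;

    public void StuffHappens()
    {
        if (Done != null) Done(this);
    }
}

public class StuffHolder
{
    private readonly List<Stuff> myList = new List<Stuff>();

    public int Count { get { return myList.Count; } }

    public Stuff Create()
    {
        var s = new Stuff();
        s.Done += Remove;   // the holder decides what removal means
        myList.Add(s);
        return s;
    }

    private void Remove(Stuff s)
    {
        myList.Remove(s);
    }
}
```

This keeps the invariant-maintaining code in one place: only StuffHolder ever touches the list.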
It's possible from a code perspective, so it's OK. The answer depends on what you are modeling and the design you have made. There may be scenarios where that solution is not a good one, and scenarios where it is. If you want, you can share what you are trying to achieve and we can discuss it then.
Hope it helps.
Rookie question:
I have been experiencing a minor bug in my mvc2 application. I was able to trace it back to this code:
List<Stream2FieldTypes> Stream2FieldTypes = new List<Stream2FieldTypes>();
foreach (var item in stream.Stream2FieldTypes)
{
    Stream2FieldTypes.Add(item);
}
The problem that I am experiencing is that when I instantiate the new list, it has a count of one. I'm thinking that this is probably due to my using the constructor. So I tried this:
List<Stream2FieldTypes> Stream2FieldTypes;
foreach (var item in stream.Stream2FieldTypes)
{
    Stream2FieldTypes.Add(item);
}
But, of course this will not compile because of an error on Stream2FieldTypes.Add(item);. Is there a way that I can create a List<Stream2FieldTypes> and make sure that the count is zero?
The problem that I am experiencing is that when I instantiate the new list, it has a length of one
No, that's totally impossible. Your problem is somewhere else and unrelated to the number of elements of a newly instantiated list.
List<Stream2FieldTypes> Stream2FieldTypes = new List<Stream2FieldTypes>();
Stream2FieldTypes.Count will be 0 at this point no matter what you do (assuming of course single threaded sequential access but List<T> is not thread-safe anyways so it's a safe assumption :-)).
The constructor:
List<Stream2FieldTypes> Stream2FieldTypes = new List<Stream2FieldTypes>(0);
will create a list with a default capacity of zero.
ETA: Though, looking at Reflector, it seems that the parameterless default constructor also creates the list with a default capacity of zero. So your code as it stands should create a list with no elements and no reserved capacity, and should be at least as performant as the explicit constructor.
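To make the Count-versus-Capacity distinction concrete:

```csharp
using System;
using System.Collections.Generic;

class CountVsCapacity
{
    static void Main()
    {
        var list = new List<int>();       // default constructor
        Console.WriteLine(list.Count);    // 0 - no elements
        Console.WriteLine(list.Capacity); // 0 - no storage reserved yet

        var sized = new List<int>(16);    // capacity is only a storage hint
        Console.WriteLine(sized.Count);   // still 0 - no elements added
        Console.WriteLine(sized.Capacity);// 16
    }
}
```

Capacity only controls the size of the backing array; no constructor overload of List<T> produces a non-zero Count unless you pass it an existing sequence.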
Also, if you use IEnumerable, you can do some nice tricks:
public void processTheList(List<string> someList = null)
{
    // Instead of special-casing null, make your code cleaner
    var list = someList ?? Enumerable.Empty<string>();

    // Now we can always assume list is a valid IEnumerable
    foreach (string item in list) { /* ... */ }
}
This seems like a multi-threading issue, are you sure that this is a thread safe method and another thread didn't already add an item to this list?
We need to see this method in a bigger context of your code.
It looks to me like you have your constructor set up incorrectly. I may be wrong, but instead of List<Stream2FieldTypes> Stream2FieldTypes = new List<Stream2FieldTypes>(); you should name the variable differently from the type you are using: List<Stream2FieldTypes> SomethingElse = new List<Stream2FieldTypes>();
Try that; it should work.
I have a third party api, which has a class that returns an enumerator for different items in the class.
I need to remove an item via that enumerator, so I cannot use "for each". The only option I can think of is to get the count by iterating over the enumerator and then run a normal for loop to remove the items.
Anyone know of a way to avoid the two loops?
Thanks
[update] sorry for the confusion but Andrey below in comments is right.
Here is some pseudo code out of my head that won't work, and for which I am looking for a solution that won't involve two loops, but I guess it's not possible:
for each (myProperty in MyProperty)
{
    if (checking some criteria here)
        MyProperty.Remove(myProperty)
}
MyProperty is the third party class that implements the enumerator and the remove method.
Common pattern is to do something like this:
List<Item> forDeletion = new List<Item>();
foreach (Item i in somelist)
    if (condition for deletion) forDeletion.Add(i);
foreach (Item i in forDeletion)
    somelist.Remove(i); // or however you delete items
Loop through it once and create a second array which contains the items which should not be deleted.
If you know it's a collection, you can go with a reversed for loop:
for (int i = items.Count - 1; i >= 0; i--)
{
    items.RemoveAt(i);
}
Otherwise, you'll have to do two loops.
You can create something like this:
public IEnumerable<item> GetMyList()
{
    foreach (var x in thirdParty)
    {
        if (x == ignore)
            continue;
        yield return x;
    }
}
I need to remove an item in that enumerator
As long as this is a single item that's not a problem. The rule is that you cannot continue to iterate after modifying the collection. Thus:
foreach (var item in collection) {
    if (item.Equals(toRemove)) {
        collection.Remove(toRemove);
        break; // <== stop iterating!!
    }
}
It is not possible to remove an item from an Enumerator. What you can do is to copy or filter(or both) the content of the whole enumeration sequence.
You can achieve this by using LINQ and doing something like this:
YourEnumerationReturningFunction().Where(item => yourRemovalCriteria);
Can you elaborate on the API and the API calls you are using?
If you receive an IEnumerator<T> or IEnumerable<T> you cannot remove any item from the sequence behind the enumerator because there is no method to do so. And you should of course not rely on down-casting a received object, because the implementation may change. (Actually a well designed API should not expose mutable objects holding internal state at all.)
If you receive IList<T> or something similar you can just use a normal for loop from back to front and remove the items as needed because there is no iterator which state could be corrupted. (Here the rule about exposing mutable state should apply again - modifying the returned collection should not change any state.)
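A sketch of that back-to-front loop over an IList<T> (the RemoveWhere helper and the int element type are illustrative, not from the API in question):

```csharp
using System;
using System.Collections.Generic;

static class ListPruner
{
    // Iterating back to front means RemoveAt never shifts an index
    // we haven't visited yet, so a single pass suffices and no
    // enumerator state can be invalidated.
    public static void RemoveWhere(IList<int> items, Predicate<int> criteria)
    {
        for (int i = items.Count - 1; i >= 0; i--)
        {
            if (criteria(items[i]))
                items.RemoveAt(i);
        }
    }
}
```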
Enumerable.Count() will decide at run-time what it needs to do - enumerate to count, or check whether the sequence is actually a collection and use its .Count property directly.
I like SJoerd's suggestion but I worry about how many items we may be talking about.
Why not something like ..
// you don't want 2 and 3
IEnumerable<int> fromAPI = Enumerable.Range(0, 10);
IEnumerable<int> result = fromAPI.Except(new[] { 2, 3 });
A clean, readable way to do this is as follows (I'm guessing at the third-party container's API here since you haven't specified it.)
foreach (var delItem in ThirdPartyContainer.Items
                            .Where(item => ShouldIDeleteThis(item))
                            // or: .Where(ShouldIDeleteThis)
                            .ToArray())
{
    ThirdPartyContainer.Remove(delItem);
}
The call to .ToArray() ensures that all items to be deleted have been greedily cached before the foreach iteration begins.
Behind the scenes this involves an array and an extra iteration over that, but that's generally very cheap, and the advantage of this method over the other answers to this question is that it works on plain enumerables and does not involve tricky mutable state issues that are hard to read and easy to get wrong.
By contrast, iterating in reverse, while not rocket science, is much more prone to off-by-one errors and harder to read; and it also relies on internals of the collection such as not changing order in between deletions (e.g. better not be a binary heap, say). Manually adding items that should be deleted to a temporary list is just unnecessary code - that's what .ToArray() will do just fine :-).
An enumerator typically has a private field pointing to the real collection. You can get at it via reflection and modify it. Have fun.
Here is a piece of code:
void MyFunc(List<MyObj> objects)
{
    MyFunc1(objects);
    foreach (MyObj obj in objects.Where(obj1 => obj1.Good))
    {
        // Do Action With Good Object
    }
}

void MyFunc1(List<MyObj> objects)
{
    int iGoodCount = objects.Where(obj1 => obj1.Good).Count();
    BeHappy(iGoodCount);
    // do other stuff with 'objects' collection
}
Here we see that the collection is analyzed twice, and each time the value of the 'Good' property is checked for each member: the 1st time when calculating the count of good objects, and the 2nd when iterating through all good objects.
It is desirable to have that optimized, and here is a straightforward solution:
before the call to MyFunc1, create an additional temporary collection of good objects only (goodObjects; it can be IEnumerable);
get count of these objects and pass it as an additional parameter to MyFunc1;
in the 'MyFunc' method iterate not through 'objects.Where(...)' but through the 'goodObjects' collection.
Not too bad an approach (as far as I can see), but an additional variable has to be created in the 'MyFunc' method and an additional parameter has to be passed.
Question: is there any LINQ out-of-the-box functionality that allows caching during the 1st Where().Count(), remembering the processed collection and reusing it in the next iteration?
Any thoughts are welcome.
Thanks.
No, LINQ queries are not optimized in this way (what you describe is similar to the way SQL Server reuses a query execution plan). LINQ does not (and, for practical purposes, cannot) know enough about your objects in order to optimize this way. As far as it knows, your collection has changed (or is entirely different) between the two calls.
You're obviously aware of the ability to persist your query into a new List<T>, but apart from that there's really nothing that I can recommend without knowing more about your class and where else MyFunc is used.
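For completeness, a sketch of persisting the query once (the class wrapper, the ProcessedCount counter and the simplified MyFunc1 signature are illustrative, not from the question):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyObj
{
    public bool Good { get; set; }
}

public static class GoodFilter
{
    public static int ProcessedCount;

    public static void MyFunc(List<MyObj> objects)
    {
        // Materialize the filter exactly once; the Good predicate
        // runs one time per element instead of twice.
        List<MyObj> goodObjects = objects.Where(o => o.Good).ToList();
        MyFunc1(goodObjects.Count);
        foreach (MyObj obj in goodObjects)
        {
            ProcessedCount++; // stands in for "do action with good object"
        }
    }

    static void MyFunc1(int goodCount)
    {
        // BeHappy(goodCount); elided
    }
}
```

The trade-off is memory for an extra list versus evaluating the predicate twice; for anything but huge collections the list is usually the cheaper choice.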
As long as MyFunc1 doesn't need to modify the list by adding/removing objects, this will work.
void MyFunc(List<MyObj> objects)
{
    ILookup<bool, MyObj> objLookup = objects.ToLookup(obj1 => obj1.Good);
    MyFunc1(objLookup[true]);
    foreach (MyObj obj in objLookup[true])
    {
        //..
    }
}

void MyFunc1(IEnumerable<MyObj> objects)
{
    //..
}
void MyFunc1(IEnumerable<MyObj> objects)
{
//..
}