Why do so many named collections in .NET not implement IEnumerable<T>?

Random example:
ConfigurationElementCollection
.NET has tons of these little WhateverCollection classes that don't implement IEnumerable<T>, which means I can't use LINQ to Objects with them out of the box.
Even before LINQ, you'd think they would have wanted to make use of generics (which were introduced all the way back in C# 2, I believe).
It seems I run across these annoying little collection types all the time.
Is there some technical reason?

The answer is in the question title: "named collections". That was the way you had to make collections type-safe before generics became available. There are a lot of them in code that dates back to .NET 1.x, especially Winforms. There was no reasonable way to rewrite them using generics; that would have broken too much existing code.
So the named collection type is type-safe, but the rub is System.Collections.IEnumerator.Current, a property of type Object. You can Linqify these collections by using OfType() or Cast().
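For readers who have not met one, here is a minimal sketch of what such a named collection tends to look like, and how Cast() bridges it to LINQ. Dog, DogCollection and CountByName are hypothetical names invented for illustration; the sketch assumes the collection is built on System.Collections.CollectionBase, which was a common (but not the only) way to write these in .NET 1.x.

```csharp
using System;
using System.Collections;
using System.Linq;

// Hypothetical element type, for illustration only.
public class Dog
{
    public string Name { get; set; }
}

// A .NET 1.x-style "named collection": Add and the indexer are type-safe,
// but enumeration goes through the non-generic IEnumerable from CollectionBase.
public class DogCollection : CollectionBase
{
    public void Add(Dog dog) { List.Add(dog); }

    public Dog this[int index]
    {
        get { return (Dog)List[index]; }
    }
}

public static class Program
{
    // dogs.Count(...) / dogs.Where(...) would not compile directly, because the
    // LINQ operators are defined on IEnumerable<T>. Cast<Dog>() supplies the T.
    public static int CountByName(DogCollection dogs, string name)
    {
        return dogs.Cast<Dog>().Count(d => d.Name == name);
    }

    public static void Main()
    {
        var dogs = new DogCollection();
        dogs.Add(new Dog { Name = "Fido" });
        dogs.Add(new Dog { Name = "Rex" });

        Console.WriteLine(CountByName(dogs, "Fido")); // prints 1
    }
}
```

Cast() throws InvalidCastException on an element of the wrong type, while OfType() skips it, so Cast() is usually the safer choice when the collection is supposed to be homogeneous.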

As Adam Houldsworth said in a comment already, you simply need to use the Cast<> method.
Example:
var a = new DogCollection();
var allFidos = a.Cast<Dog>().Where(d => d.Name == "Fido");

Related

Is testing generic collections for referential equality in C# a silly idea?

I'm implementing a special case of an immutable dictionary, which for convenience implements IEnumerable<KeyValuePair<Foo, Bar>>. Operations that would ordinarily modify the dictionary should instead return a new instance.
So far so good. But when I try to write a fluent-style unit test for the class, I find that neither of the two fluent assertion libraries I've tried (Should and Fluent Assertions) supports the NotBeSameAs() operation on objects that implement IEnumerable -- not unless you first cast them to Object.
When I first ran into this, with Should, I assumed that it was just a hole in the framework, but when I saw that Fluent Assertions had the same hole, it made me think that (since I'm a relative newcomer to C#) I might be missing something conceptual about C# collections -- the author of Should implied as much when I filed an issue.
Obviously there are other ways to test this -- cast to Object and use NotBeSameAs(), just use Object.ReferenceEquals, whatever -- but if there's a good reason not to, I'd like to know what that is.

An IEnumerable<T> is not necessarily a real object. IEnumerable<T> only guarantees that you can enumerate through its states. In simple cases you have a container class like a List<T> that is already materialized. Then you could compare both lists' addresses. However, your IEnumerable<T> might also point to a sequence of commands that will be executed once you enumerate. Basically a state machine:
public IEnumerable<int> GetInts()
{
yield return 10;
yield return 20;
yield return 30;
}
If you save this in a variable, you don't have a comparable object (everything is an object, so you do... but it's not meaningful):
var x = GetInts();
Your comparison is only meaningful for materialized IEnumerables (.ToList() or .ToArray()), because those state machines have been evaluated and their results saved to a collection. So yes, the libraries' behavior actually makes sense: if you know you have materialized IEnumerables, you need to make this knowledge explicit by casting them to Object and calling the desired function on that object "manually".
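A small sketch of the point above, reusing the GetInts() iterator from this answer. It assumes nothing beyond System.Linq; the variable names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Program
{
    public static IEnumerable<int> GetInts()
    {
        yield return 10;
        yield return 20;
        yield return 30;
    }

    public static void Main()
    {
        // Each call to an iterator method creates a fresh state-machine object.
        IEnumerable<int> x = GetInts();
        IEnumerable<int> y = GetInts();

        // Different objects, even though they yield the same values.
        Console.WriteLine(object.ReferenceEquals(x, y)); // prints False
        Console.WriteLine(x.SequenceEqual(y));           // prints True

        // Once materialized, a reference comparison says something concrete:
        // 'alias' and 'list' are the same List<int> instance.
        List<int> list = x.ToList();
        IEnumerable<int> alias = list;
        Console.WriteLine(object.ReferenceEquals(list, alias)); // prints True
    }
}
```

For a class that happens to implement IEnumerable<T> but is really a container, such as the immutable dictionary in the question, object.ReferenceEquals remains a perfectly valid check.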

In addition to what Jon Skeet suggested, take a look at this February 2013 MSDN article from Ted Neward:
.NET Collections, Part 2: Working with C5
Immutable (Guarded) Collections
With the rise of functional concepts and programming styles, a lot of emphasis has swung to immutable data and immutable objects, largely because immutable objects offer a lot of benefits vis-à-vis concurrency and parallel programming, but also because many developers find immutable objects easier to understand and reason about. Corollary to that concept, then, follows the concept of immutable collections—the idea that regardless of whether the objects inside the collection are immutable, the collection itself is fixed and unable to change (add or remove) the elements in the collection. (Note: You can see a preview of immutable collections released on NuGet in the MSDN Base Class Library (BCL) blog at bit.ly/12AXD78.)
It describes the use of an open source library of collection goodness called C5.
Look at http://itu.dk/research/c5/

Why is an Add method required for { } initialization?

To use initialization syntax like this:
var contacts = new ContactList
{
{ "Dan", "dan.tao@email.com" },
{ "Eric", "ceo@google.com" }
};
...my understanding is that my ContactList type would need to define an Add method that takes two string parameters:
public void Add(string name, string email);
What's a bit confusing to me about this is that the { } initializer syntax seems most useful when creating read-only or fixed-size collections. After all it is meant to mimic the initialization syntax for an array, right? (OK, so arrays are not read-only; but they are fixed size.) And naturally it can only be used when the collection's contents are known (at least the number of elements) at compile-time.
So it would almost seem that the main requirement for using this collection initializer syntax (having an Add method and therefore a mutable collection) is at odds with the typical case in which it would be most useful.
I'm sure I haven't put as much thought into this matter as the C# design team; it just seems that there could have been different rules for this syntax that would have meshed better with its typical usage scenarios.
Am I way off base here? Is the desire to use the { } syntax to initialize fixed-size collections not as common as I think? What other factors might have influenced the formulation of the requirements for this syntax that I'm simply not thinking of?

I'm sure I haven't put as much thought into this matter as the C# design team; it just seems that there could have been different rules for this syntax that would have meshed better with its typical usage scenarios.
Your analysis is very good; the key problem is the last three words in the statement above. What are the actual typical usage scenarios?
The by-design goal motivated by typical usage scenarios for collection initializers was to make initialization of existing collection types possible in an expression syntax so that collection initializers could be embedded in query comprehensions or converted to expression trees.
Every other scenario was lower priority; the feature exists at all because it helps make LINQ work.
The C# 3 compiler team was the "long pole" for that release of Visual Studio / .NET - we had the most days of work on the schedule of any team, which meant that every day we delayed, the product would be delayed. We wanted to ship a quality product on time for all of you guys, and the perfect is the enemy of the good. Yes, this feature is slightly clunky and doesn't do absolutely everything you might want it to, but it was more important to get it solid and tested for LINQ than to make it work for a bunch of immutable collection types that largely didn't even exist.
Had this feature been designed into the language from day one, while the frameworks types were still evolving, I'm sure that things would have gone differently. As we've discussed elsewhere on this site, I would dearly love to have a write-once-read-many fixed size array of values. It would be nice to define a common pattern for proffering up a bunch of state to initialize an arbitrary immutable collection. You are right that the collection initializer syntax would be ideal for such a thing.
Features like that are on the list for potential future hypothetical language versions, but not real high on the list. In other words, let's get async/await right first before we think too hard about syntactic sugars for immutable collection initialization.

It's because the initialization statement is shorthand that the C# compiler expands. When it gets compiled into IL, it calls the Add method you've defined.
So you can make the case that this initialization statement is not really a "first class" feature, because it doesn't have a counterpart in IL. But that's the case for quite a lot of what we use, the "using" statement for example.
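To make the "shorthand" concrete, here is a sketch of a collection initializer next to roughly what the compiler turns it into. ContactList is a hypothetical type modelled on the question, and the addresses are placeholders; the real expansion uses a compiler-generated temporary, but the shape is the same.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical type. A collection initializer needs two things from it:
// it must implement IEnumerable, and it must have an applicable Add method.
public class ContactList : IEnumerable<KeyValuePair<string, string>>
{
    private readonly List<KeyValuePair<string, string>> _items =
        new List<KeyValuePair<string, string>>();

    public void Add(string name, string email)
    {
        _items.Add(new KeyValuePair<string, string>(name, email));
    }

    public int Count { get { return _items.Count; } }

    public IEnumerator<KeyValuePair<string, string>> GetEnumerator()
    {
        return _items.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class Program
{
    public static ContactList WithInitializer()
    {
        return new ContactList
        {
            { "Dan", "dan@example.com" },
            { "Eric", "eric@example.com" }
        };
    }

    // Roughly equivalent hand-written form: construct, then call Add per element.
    public static ContactList Expanded()
    {
        ContactList temp = new ContactList();
        temp.Add("Dan", "dan@example.com");
        temp.Add("Eric", "eric@example.com");
        return temp;
    }

    public static void Main()
    {
        Console.WriteLine(WithInitializer().Count); // prints 2
        Console.WriteLine(Expanded().Count);        // prints 2
    }
}
```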

The reason for this is that it was retrofitted. I agree with you that using a constructor taking a collection would make vastly more sense, but not all of the existing collection classes implemented this and the change should (1) work with all existing collections, (2) not change the existing classes in any way.
It’s a compromise.

The main reason is Syntactic Sugar.
The initializer syntax only makes writing programs in C# a bit easier. It doesn't actually add any expressive power to the language.
If the initializer didn't require an Add() method, then it would be a much different feature than it is now. Basically, it's just not how C# works. There is no literal form for creating general collections.

Not an answer, strictly speaking, but if you want to know what sort of things influenced the design of collection initialisers then you'll probably find this interesting:
What is a collection? [straight from the Horse's mouth Mads Torgersen]

What should the initialization syntax use, if not an Add method? The initialization syntax is 'run' after the constructor of the collection is run, and the collection fully created. There must be some way of adding items to the collection after it's been created.
If you want to initialize a read-only collection, do it in the constructor (taking a T[] items argument or similar)
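One way the constructor approach could look is sketched below. FixedList<T> is a hypothetical type, not a framework class; using params is an assumption made here so that call sites stay close to initializer syntax.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical fixed-size, read-only collection filled once by its constructor.
public sealed class FixedList<T> : IEnumerable<T>
{
    private readonly T[] _items;

    public FixedList(params T[] items)
    {
        // Copy, so later changes to the caller's array cannot leak in.
        _items = (T[])items.Clone();
    }

    public int Count { get { return _items.Length; } }

    public T this[int index] { get { return _items[index]; } }

    public IEnumerator<T> GetEnumerator()
    {
        return ((IEnumerable<T>)_items).GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class Program
{
    public static void Main()
    {
        var primes = new FixedList<int>(2, 3, 5, 7);
        Console.WriteLine(primes.Count); // prints 4
        Console.WriteLine(primes[2]);    // prints 5

        // new FixedList<int> { 2, 3 } would not compile: there is no Add method.
    }
}
```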

As far as I understand it, the collection initializer syntax is just syntactic sugar with no special tricks in it. It was designed in part to support initializing collections inside Linq queries:
from a in somewhere
select new {
Something = a.Something,
Collection = new List<object>() {
a.Item1,
a.Item2,
...
}
}
Before, there was no way to do this inline and you'd have to do it after the fact, which was annoying.

I'd love to have the initializer syntax for immutable types (both collections and normal types). I think this could be implemented with a special constructor overload using a syntax similar to params.
For example something like this:
MyClass(initializer KeyValuePair<K,V>[] initialValues)
But unfortunately the C# team didn't implement such a thing yet :(
So we need to use a workaround like
MyClass(new KeyValuePair<K,V>[]{...})
for now

Collection initializers are expressions, so they can be used where only expressions are valid, such as a field initializer or LINQ query. This makes their existence very useful.
I also think the curly-bracketed { } kind of initialization smells more like a fixed-size collection, but it's just a syntax choice.

C# syntax sugar - new way to set object attributes?

For the hardcore C# coders here, this might seem like a completely stupid question - however, I just came across a snippet of sample code in the AWS SDK forum and was completely sideswiped by it:
RunInstancesRequest runInstance = new RunInstancesRequest()
.WithMinCount(1)
.WithMaxCount(1)
.WithImageId(GetXMLElement("ami"))
.WithInstanceType("t1.micro");
This is very reminiscent of the old VB6 With ... End With syntax, which I have long lamented the absence of in C# - I've compiled it in my VS2008 project and it works a treat, saving numerous separate lines referencing these attributes individually.
I'm sure I've read articles in the past explaining why the VB6-style With-block wasn't in C#, so my question is: has this syntax always existed in the language, or is it a recent .NET change that has enabled it? Can we coat all object instantiations followed by attribute changes in the same sugar?

Isn't this better anyway?
RunInstancesRequest runInstance = new RunInstancesRequest
{
MinCount = 1,
MaxCount = 1,
ImageId = GetXMLElement("ami"),
InstanceType = "t1.micro"
};

They implemented all those methods, each of which also returns the RunInstancesRequest object (i.e., this). It's called a Fluent Interface.

It is not syntactic sugar. Those methods just set a property and return the this object.
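A minimal sketch of how such With... methods can be written. InstanceRequest is a hypothetical stand-in, not the AWS SDK's RunInstancesRequest, whose actual implementation is not shown in this thread.

```csharp
using System;

// Hypothetical request type for illustration only.
public class InstanceRequest
{
    public int MinCount { get; set; }
    public int MaxCount { get; set; }
    public string InstanceType { get; set; }

    // Each With... method sets one property and returns the same instance,
    // which is all that is needed for calls to chain.
    public InstanceRequest WithMinCount(int value) { MinCount = value; return this; }
    public InstanceRequest WithMaxCount(int value) { MaxCount = value; return this; }
    public InstanceRequest WithInstanceType(string value) { InstanceType = value; return this; }
}

public static class Program
{
    public static void Main()
    {
        var original = new InstanceRequest();
        InstanceRequest chained = original
            .WithMinCount(1)
            .WithMaxCount(1)
            .WithInstanceType("t1.micro");

        // The chain hands back the very object it started from.
        Console.WriteLine(object.ReferenceEquals(original, chained)); // prints True
        Console.WriteLine(chained.InstanceType);                      // prints t1.micro
    }
}
```

Because the properties are settable, the object initializer form shown in the earlier answer would work on this type as well; the two styles end up with the same object state.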

RunInstancesRequest runInstance = new RunInstancesRequest()
.WithMinCount(1)
.WithMaxCount(1)
.WithImageId(GetXMLElement("ami"))
.WithInstanceType("t1.micro");
==
RunInstancesRequest runInstance = new RunInstancesRequest().WithMinCount(1).WithMaxCount(1).WithImageId(GetXMLElement("ami")).WithInstanceType("t1.micro");
I don't know if that's considered syntactic sugar, or just pure formatting.

I think this technique is different than the With... syntax in VB. I think this is an example of chaining. Each method returns an instance of itself so you can chain the method calls.
See Method-Chaining in C#

The reason this syntax works for RunInstancesRequest is that each of the method calls that you are making return the original instance. The same concept can be applied to StringBuilder for the same reason, but not all classes have methods implemented in this way.

I would prefer having a constructor that takes all of those property values as arguments and sets them within the class.

It's always existed in C# and indeed in any C-style OO language (eh, most popular C-style languages except C itself!)
It's unfair to compare it to the VB6 With...End With syntax, as it's much clearer what is going on in this case (about the only good thing I have to say about VB6's With...End With is that at least it isn't as bad as JavaScript's, since it requires prior dots).
It is as people have said, a combination of the "fluent interface" and the fact that the . operator allows for whitespace before and after it, so we can put each item on newlines.
StringBuilder is the most commonly seen case in C#, as in:
new StringBuilder("This")
.Append(' ')
.Append("will")
.Append(' ')
.Append("work")
.Append('.');
A related, but not entirely the same, pattern is where you chain the methods of an immutable object that returns a different object of the same type as in:
DateTime yearAndADay = DateTime.UtcNow.AddYears(1).AddDays(1);
Yet another is returning modified IEnumerable<T> and IQueryable<T> objects from the LINQ related methods.
These though differ in returning different objects, rather than modifying a mutable object and returning that same object.
One of the main reasons that it is more common in C++ and Java than in C# is that C# has properties. This makes the most idiomatic means of assigning different properties a call to the related setter that is syntactically the same as setting a field. It does however block much of the most common use of the fluent interface idiom.
Personally, since the fluent interface idiom is not guaranteed (there's nothing to say MyClass.setProp(32) should return this or indeed, that it shouldn't return 32 which would also be useful in some cases), and since it is not as idiomatic in C#, I prefer to avoid it apart from with StringBuilder, which is such a well-known example that it almost exists as a separate StringBuilder idiom within C#.

This syntax has always existed

Please refer to Extension Methods (C# Programming Guide)

Will Microsoft ever make all collections useable by LINQ?

I've been using LINQ for a while (and enjoy it), but it feels like I hit a speedbump when I run across .NET specialized collections (DataRowCollection, ControlCollection). Is there a way to use LINQ with these specialized collections, and if not, do you think Microsoft will address this in the next release of the framework? Or are we left to iterate over these the non-LINQ way, or pull the items out of the collection into LINQ-able collections ourselves?

The reason why collections like ControlCollection do not work with LINQ is that they are not strongly typed. Without an element type, LINQ cannot create strongly typed methods. As long as you know the type, you can use the Cast method to create a strongly typed enumeration, which can then be used with LINQ. For example:
ControlCollection col = ...
var query = col.Cast<Control>().Where(x => ...);
As to whether Microsoft will ever make these implement IEnumerable<T> by default, my guess is no. The reason is that doing so is a breaking change and can cause unexpected behavior in code. Even simply implementing IEnumerable<Control> for ControlCollection would cause changes to overload resolution that can, and almost certainly will, break user applications.

You should be able to do something like this:
myDataRowCollection.Cast<DataRow>().Where.....
and use Linq that way. If you know what the objects in the collection are, then you should be able to use that.

The reason for this is: Collections which do not implement IEnumerable<T> or IQueryable, can not be iterated in LINQ

Having some confusion with LINQ

Some background info:
LanguageResource is the base class
LanguageTranslatorResource and LanguageEditorResource inherit from LanguageResource
LanguageEditorResource defines an IsDirty property
LanguageResourceCollection is a collection of LanguageResource
LanguageResourceCollection internally holds LanguageResources in Dictionary<string, LanguageResource> _dict
LanguageResourceCollection.GetEnumerator() returns _dict.Values.GetEnumerator()
I have a LanguageResourceCollection _resources that contains only LanguageEditorResource objects and want to use LINQ to enumerate those that are dirty so I have tried the following. My specific questions are in bold.
_resources.Where(r => (r as LanguageEditorResource).IsDirty)
Neither Where nor other LINQ methods are displayed by Intellisense, but I code it anyway and am told "LanguageResourceCollection does not contain a definition for 'Where' and no extension method...".
Why does the way that LanguageResourceCollection implements IEnumerable preclude it from supporting LINQ?
If I change the query to
(_resources as IEnumerable<LanguageEditorResource>).Where(r => r.IsDirty)
Intellisense displays the LINQ methods and the solution compiles. But at runtime I get an ArgumentNullException "Value cannot be null. Parameter name: source".
Is this a problem in my LINQ code?
Is it a problem with the general design of the classes?
How can I dig into what LINQ generates to try and see what the problem is?
My aim with this question is not to get a solution for the specific problem, as I will have to solve it now using other (non LINQ) means, but rather to try and improve my understanding of LINQ and learn how I can improve the design of my classes to work better with LINQ.

It sounds like your collection implements IEnumerable, not IEnumerable<T>, hence you need:
_resources.Cast<LanguageEditorResource>().Where(r => r.IsDirty)
Note that Enumerable.Where is defined on IEnumerable<T>, not IEnumerable - if you have the non-generic type, you need to use Cast<T> (or OfType<T>) to get the right type. The difference is that Cast<T> will throw an exception if it finds something that isn't a T, whereas OfType<T> simply ignores anything that isn't a T. Since you've stated that your collection only contains LanguageEditorResource, it is reasonable to check that assumption using Cast<T>, rather than silently drop data.
Check also that you have "using System.Linq" and are referencing System.Core (.NET 3.5; else LINQBridge with .NET 2.0) to get the Where extension method(s).
Actually, it would be worth having your collection implement IEnumerable<LanguageResource> - which you could do quite simply using either the Cast<T> method, or an iterator block (yield return).
[edit]
To build on Richard Poole's note - you could write your own generic container here, presumably with T : LanguageResource (and using that T in the Dictionary<string,T>, and implementing IEnumerable<T> or ICollection<T>). Just a thought.
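A sketch of the iterator-block suggestion, using the class names from the question. The Key property and the Add and CountDirty methods are assumptions added so the example is self-contained; the question does not show them. The sketch also shows why the `as` attempt in the question produced a null source.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public class LanguageResource
{
    public string Key { get; set; } // assumed member, for illustration
}

public class LanguageEditorResource : LanguageResource
{
    public bool IsDirty { get; set; }
}

public class LanguageResourceCollection : IEnumerable<LanguageResource>
{
    private readonly Dictionary<string, LanguageResource> _dict =
        new Dictionary<string, LanguageResource>();

    public void Add(LanguageResource resource) { _dict[resource.Key] = resource; }

    // Iterator block: the compiler builds the IEnumerator<LanguageResource>.
    public IEnumerator<LanguageResource> GetEnumerator()
    {
        foreach (LanguageResource resource in _dict.Values)
            yield return resource;
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class Program
{
    public static int CountDirty(LanguageResourceCollection resources)
    {
        // Cast<T> throws if an element is not a LanguageEditorResource.
        return resources.Cast<LanguageEditorResource>().Count(r => r.IsDirty);
    }

    public static void Main()
    {
        var resources = new LanguageResourceCollection();
        resources.Add(new LanguageEditorResource { Key = "a", IsDirty = true });
        resources.Add(new LanguageEditorResource { Key = "b", IsDirty = false });

        // The collection is not an IEnumerable<LanguageEditorResource>, so 'as'
        // yields null, and passing null to Where raises ArgumentNullException.
        var wrong = resources as IEnumerable<LanguageEditorResource>;
        Console.WriteLine(wrong == null);          // prints True

        Console.WriteLine(CountDirty(resources));  // prints 1
    }
}
```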

In addition to Marc G's answer, and if you're able to do so, you might want to consider dropping your custom LanguageResourceCollection class in favour of a generic List<LanguageResource>. This will solve your current problem and get rid of that nasty .NET 1.1ish custom collection.

How can I dig into what LINQ generates to try and see what the problem is?
Linq isn't generating anything here. You can step through with the debugger.
to try and improve my understanding of LINQ and learn how I can improve the design of my classes to work better with LINQ.
System.Linq.Enumerable methods rely heavily on the IEnumerable<T> contract. You need to understand how your class can produce targets that support this contract. The type that T represents is important!
You could add this method to LanguageResourceCollection:
public IEnumerable<T> ParticularResources<T>()
{
return _dict.Values.OfType<T>();
}
and call it by:
_resources
.ParticularResources<LanguageEditorResource>()
.Where(r => r.IsDirty)
This example would make more sense if the collection class didn't implement IEnumerable<T> against that same _dict.Values. The point is to understand IEnumerable<T> and generic typing.
