How to "change" immutable objects effectively? - c#

Is there any clever design pattern or sort of generic method for "modification" of immutable objects?
Background:
Let's have set of different (not having common base) immutable objects, each having different set of {get; private set;} properties and one public constructor (accepting set of property values). These objects are and should remain immutable, but from time to time, in sort of special mode, their values are required to "change".
Such kind of modification means just creation of new object having the same values as the original one except the updated properties (like o = new c(o.a, updateB, updateC, o.d, ...)).
I can imagine calling the constructors in place, or defining an (e.g. extension) method returning new instance, accepting some parameters to identify the updated property/ies and modifying it/them via reflection, but everything seems to be very specific and not elegant to be used system wide for "any" immutable. I like the way linq handles e.g. the IEnumerables and the chain you can produce and completely re-transform the input collection. Any ideas?

One example of something similar I'm aware of is Roslyn, Microsoft's current C# compiler, which uses immutable data structures extensively. The syntax tree classes have convenience methods for every property for returning a new instance with just one property changed, e.g.
var cu = Syntax.CompilationUnit()
.AddMembers(
Syntax.NamespaceDeclaration(Syntax.IdentifierName("ACO"))
.AddMembers(
Syntax.ClassDeclaration("MainForm")
.AddBaseListTypes(Syntax.ParseTypeName("System.Windows.Forms.Form"))
.WithModifiers(Syntax.Token(SyntaxKind.PublicKeyword))
.AddMembers(
Syntax.PropertyDeclaration(Syntax.ParseTypeName("System.Windows.Forms.Timer"), "Ticker")
.AddAccessorListAccessors(
Syntax.AccessorDeclaration(SyntaxKind.GetAccessorDeclaration).WithSemicolonToken(Syntax.Token(SyntaxKind.SemicolonToken)),
Syntax.AccessorDeclaration(SyntaxKind.SetAccessorDeclaration).WithSemicolonToken(Syntax.Token(SyntaxKind.SemicolonToken))),
Syntax.MethodDeclaration(Syntax.ParseTypeName("void"), "Main")
.AddModifiers(Syntax.Token(SyntaxKind.PublicKeyword))
.AddAttributes(Syntax.AttributeDeclaration().AddAttributes(Syntax.Attribute(Syntax.IdentifierName("STAThread"))))
.WithBody(Syntax.Block())
)
)
);
or in your case:
o = o.WithB(updateB).WithC(updateC);
which I think reads quite nicely for the intended use case (updating only a few properties while keeping everything else the same). It especially beats the »always have to call the ctor« approach when there are many properties.
In Roslyn that's all auto-generated, I think. You can probably do something similar with T4, if warranted.

Related

Force function input parameters to be immutable?

I've just spent the best part of 2 days trying to track down a bug, it turns out I was accidentally mutating the values that were provided as input to a function.
IEnumerable<DataLog>
FilterIIR(
IEnumerable<DataLog> buffer
) {
double notFilter = 1.0 - FilterStrength;
var filteredVal = buffer.FirstOrDefault()?.oilTemp ?? 0.0;
foreach (var item in buffer)
{
filteredVal = (item.oilTemp * notFilter) + (filteredVal * FilterStrength);
/* Mistake here!
item.oilTemp = filteredValue;
yield return item;
*/
// Correct version!
yield return new DataLog()
{
oilTemp = (float)filteredVal,
ambTemp = item.ambTemp,
oilCond = item.oilCond,
logTime = item.logTime
};
}
}
My programming language of preference is usually C# or C++ depending on what I think suits the requirements better (this is part of a larger program that suits C# better)...
Now in C++ I would have been able to guard against such a mistake by accepting constant iterators which prevent you from being able to modify the values as you retrieve them (though I might need to build a new container for the return value). I've done a little searching and can't find any simple way to do this in C#, does anyone know different?
I was thinking I could make an IReadOnlyEnumerable<T> class which takes an IEnumerable as a constructor, but then I realized that unless it makes a copy of the values as you retrieve them it won't actually have any effect, because the underlying value can still be modified.
Is there any way I might be able to protect against such errors in future? Some wrapper class, or even if it's a small code snippet at the top of each function I want to protect, anything would be fine really.
The only sort of reasonable approach I can think of at the moment that'll work is to define a ReadOnly version of every class I need, then have a non-readonly version that inherits and overloads the properties and adds functions to provide a mutable version of the same class.
The problem is here isn't really about the IEnumerable. IEnumerables are actually immutable. You can't add or remove things from them. What's mutable is your DataLog class.
Because DataLog is a reference type, item holds a reference to the original object, instead of a copy of the object. This, plus the fact that DataLog is mutable, allows you to mutate the parameters passed in.
So on a high level, you can either:
make a copy of DataLog, or;
make DataLog immutable
or both...
What you are doing now is "making a copy of DataLog". Another way of doing this is changing DataLog from a class to a struct. This way, you'll always create a copy of it when passing it to methods (unless you mark the parameter with ref). So be careful when using this method because it might silently break existing methods that assume a pass-by-reference semantic.
You can also make DataLog immutable. This means removing all the setters. Optionally, you can add methods named WithXXX that returns a copy of the object with only one property different. If you chose to do this, your FilterIIR would look like:
yield return item.WithOilTemp(filteredVal);
The only sort of reasonable approach I can think of at the moment that'll work is to define a ReadOnly version of every class I need, then have a non-readonly version that inherits and overloads the properties and adds functions to provide a mutable version of the same class.
You don't actually need to do this. Notice how List<T> implements IReadOnlyList<T>, even though List<T> is clearly mutable. You could write an interface called IReadOnlyDataLog. This interface would only have the getters of DataLog. Then, have FilterIIR accept a IEnumerable<IReadOnlyDataLog> and DataLog implement IReadOnlyDataLog. This way, you will not accidentally mutate the DataLog objects in FilterIIR.

Best Practice List/Array/ReadOnlyCollection creation (and usage)

My code is littered with collections - not an unusual thing, I suppose. However, usage of the various collection types isn't obvious nor trivial. Generally, I'd like to use the type that's exposes the "best" API, and has the least syntactic noise. (See Best practice when returning an array of values, Using list arrays - Best practices for comparable questions). There are guidelines suggesting what types to use in an API, but these are impractical in normal (non-API) code.
For instance:
new ReadOnlyCollection<Tuple<string,int>>(
new List<Tuple<string,int>> {
Tuple.Create("abc",3),
Tuple.Create("def",37)
}
)
List's are a very common datastructure, but creating them in this fashion involves quite a bit of syntactic noise - and it can easily get even worse (e.g. dictionaries). As it turns out, many lists are never changed, or at least never extended. Of course ReadOnlyCollection introduces yet more syntactic noise, and it doesn't even convey quite what I mean; after all ReadOnlyCollection may wrap a mutating collection. Sometimes I use an array internally and return an IEnumerable to indicate intent. But most of these approaches have a very low signal-to-noise ratio; and that's absolutely critical to understanding code.
For the 99% of all code that is not a public API, it's not necessary to follow Framework Guidelines: however, I still want a comprehensible code and a type that communicates intent.
So, what's the best-practice way to deal with the bog-standard task of making small collections to pass around values? Should array be preferred over List where possible? Something else entirely? What's the best way - clean, readable, reasonably efficient - of passing around such small collections? In particular, code should be obvious to future maintainers that have not read this question and don't want to read swathes of API docs yet still understand what the intent is. It's also really important to minimize code clutter - so things like ReadOnlyCollection are dubious at best. Nothing wrong with wordy types for major API's with small surfaces, but not as a general practice inside a large codebase.
What's the best way to pass around lists of values without lots of code clutter (such as explicit type parameters) but that still communicates intent clearly?
Edit: clarified that this is about making short, clear code, not about public API's.
After hopefully understanding your question, i think you have to distinguish between what you create and manage within your class and what you make available to the outside world.
Within your class you can use whatever best fits your current task (pro/cons of List vs. Array vs. Dictionary vs. LinkedList vs. etc.). But this has maybe nothing to do about what you provide in your public properties or functions.
Within your public contract (properties and functions) you should give back the least type (or even better interface) that is needed. So just an IList, ICollection, IDictionary, IEnumerable of some public type. Thous leads that your consumer classes are just awaiting interfaces instead of concrete classes and so you can change the concrete implementation at a later stage without breaking your public contract (due to performance reasons use an List<> instead of a LinkedList<> or vice versa).
Update:
So, this isn't strictly speaking new; but this question convinced me to go ahead and announce an open source project I've had in the works for a while (still a work in progress, but there's some useful stuff in there), which includes an IArray<T> interface (and implementations, naturally) that I think captures exactly what you want here: an indexed, read-only, even covariant (bonus!) interface.
Some benefits:
It's not a concrete type like ReadOnlyCollection<T>, so it doesn't tie you down to a specific implementation.
It's not just a wrapper (like ReadOnlyCollection<T>), so it "really is" read-only.
It clears the way for some really nice extension methods. So far the Tao.NET library only has two (I know, weak), but more are on the way. And you can easily make your own, too—just derive from ArrayBase<T> (also in the library) and override the this[int] and Count properties and you're done.
If this sounds promising to you, feel free to check it out and let me know what you think.
It's not 100% clear to me where you're worried about this "syntactic noise": in your code or in calling code?
If you're tolerant of some "noise" in your own encapsulated code then I would suggest wrapping a T[] array and exposing an IList<T> which happens to be a ReadOnlyCollection<T>:
class ThingsCollection
{
ReadOnlyCollection<Thing> _things;
public ThingsCollection()
{
Thing[] things = CreateThings();
_things = Array.AsReadOnly(things);
}
public IList<Thing> Things
{
get { return _things; }
}
protected virtual Thing[] CreateThings()
{
// Whatever you want, obviously.
return new Thing[0];
}
}
Yes there is some noise on your end, but it's not bad. And the interface you expose is quite clean.
Another option is to make your own interface, something like IArray<T>, which wraps a T[] and provides a get-only indexer. Then expose that. This is basically as clean as exposing a T[] but without falsely conveying the idea that items can be set by index.
I do not pass around Listss if I can possibly help it. Generally I have something else that is managing the collection in question, which exposes the collection, for example:
public class SomeCollection
{
private List<SomeObject> m_Objects = new List<SomeObject>();
// ctor
public SomeCollection()
{
// Initialise list here, or wot-not/
} // eo ctor
public List<SomeObject> Objects { get { return m_Objects; } }
} // eo class SomeCollection
And so this would be the object passed around:
public void SomeFunction(SomeCollection _collection)
{
// work with _collection.Objects
} // eo SomeFunction
I like this approach, because:
1) I can populate my values in the ctor. They're there the momeny anyone news SomeCollection.
2) I can restrict access, if I want, to the underlying list. In my example I exposed it all, but you don't have to do this. You can make it read-only if you want, or validate additions to the list, prior to adding them.
3) It's clean. Far easier to read SomeCollection than List<SomeObject> everywhere.
4) If you suddenly realise that your collection of choice is inefficient, you can change the underlying collection type without having to go and change all the places where it got passed as a parameter (can you imagine the trouble you might have with, say, List<String>?)
I agree. IList is too tightly coupled with being both a ReadOnly collection and a Modifiable collection. IList should have inherited from an IReadOnlyList.
Casting back to IReadOnlyList wouldn't require a explicit cast. Casting forward would.
1.
Define your own class which implements IEnumerator, takes an IList in the new constructor, has a read only default item property taking an index, and does not include any properties/methods that could otherwise allow your list to me manipulated.
If you later want to allow modifying the ReadOnly wrapper like IReadOnlyCollection does, you can make another class which is a wrapper around your custom ReadOnly Collection and has the Insert/Add/Remove/RemoveAt/Clear/...implemented and cache those changes.
2.
Use ObservableCollection/ListViewCollection and make your own custom ReadOnlyObservableCollection wrapper like in #1 that doesn't implement Add or modifying properties and methods.
ObservableCollection can bind to ListViewCollection in such a way that changes to ListViewCollection do not get pushed back into ObservableCollection. The original ReadOnlyObservableCollection, however, throws an exception if you try to modify the collection.
If you need backwards/forwards compatibility, make two new classes inheriting from these. Then Implement IBindingList and handle/translate CollectionChanged Event (INotifyCollectionChanged event) to the appropriate IBindingList events.
Then you can bind it to older DataGridView and WinForm controls, as well as WPF/Silverlight controls.
Microsoft has created a Guidelines for Collections document which is a very informative list of DOs and DON'Ts that address most of your question.
It's a long list so here are the most relevant ones:
DO prefer collections over arrays.
DO NOT use ArrayList or List in public APIs. (public properties, public parameters and return types of public methods)
DO NOT use Hashtable or Dictionary in public APIs.
DO NOT use weakly typed collections in public APIs.
DO use the least-specialized type possible as a parameter type. Most members taking collections as parameters use the IEnumerable interface.
AVOID using ICollection or ICollection as a parameter just to access the Count property.
DO use ReadOnlyCollection, a subclass of ReadOnlyCollection, or in rare cases IEnumerable for properties or return values representing read-only collections.
As the last point states, you shouldn't avoid ReadOnlyCollection like you were suggesting. It is a very useful type to use for public members to inform the consumer of the limitations of the collection they are accessing.

Are there drawbacks to creating a class that encapsulates Generic Collection?

A part of my (C# 3.0 .NET 3.5) application requires several lists of strings to be maintained. I declare them, unsurprisingly, as List<string> and everything works, which is nice.
The strings in these Lists are actually (and always) Fund IDs. I'm wondering if it might be more intention-revealing to be more explicit, e.g.:
public class FundIdList : List<string> { }
... and this works as well. Are there any obvious drawbacks to this, either technically or philosophically?
I would start by going in the other direction: wrapping the string up into a class/struct called FundId. The advantage of doing so, I think, is greater than the generic list versus specialised list.
You code becomes type-safe: there is a lot less scope for you to pass a string representing something else into a method that expects a fund identifier.
You can constrain the strings that are valid in the constructor to FundId, i.e. enforce a maximum length, check that the code is in the expected format, &c.
You have a place to add methods/functions relating to that type. For example, if fund codes starting 'I' are internal funds you could add a property called IsInternal that formalises that.
As for FundIdList, the advantage to having such a class is similar to point 3 above for the FundId: you have a place to hook in methods/functions that operate on the list of FundIds (i.e. aggregate functions). Without such a place, you'll find that static helper methods start to crop up throughout the code or, in some static helper class.
List<> has no virtual or protected members - such classes should almost never be subclassed. Also, although it's possible you need the full functionality of List<string>, if you do - is there much point to making such a subclass?
Subclassing has a variety of downsides. If you declare your local type to be FundIdList, then you won't be able to assign to it by e.g. using linq and .ToList since your type is more specific. I've seen people decide they need extra functionality in such lists, and then add it to the subclassed list class. This is problematic, because the List implementation ignores such extra bits and may violate your constraints - e.g. if you demand uniqueness and declare a new Add method, anyone that simply (legally) upcasts to List<string> for instance by passing the list as a parameter typed as such will use the default list Add, not your new Add. You can only add functionality, never remove it - and there are no protected or virtual members that require subclassing to exploit.
So you can't really add any functionality you couldn't with an extension method, and your types aren't fully compatible anymore which limits what you can do with your list.
I prefer declaring a struct FundId containing a string and implementing whatever guarantees concerning that string you need there, and then working with a List<FundId> rather than a List<string>.
Finally, do you really mean List<>? I see many people use List<> for things for which IEnumerable<> or plain arrays are more suitable. Exposing your internal List in an api is particularly tricky since that means any API user can add/remove/change items. Even if you copy your list first, such a return value is still misleading, since people might expect to be able to add/remove/change items. And if you're not exposing the List in an API but merely using it for internal bookkeeping, then it's not nearly as interesting to declare and use a type that adds no functionality, only documentation.
Conclusion
Only use List<> for internals, and don't subclass it if you do. If you want some explicit type-safety, wrap string in a struct (not a class, since a struct is more efficient here and has better semantics: there's no confusion between a null FundId and a null string, and object equality and hashcode work as expected with structs but need to be manually specified for classes). Finally, expose IEnumerable<> if you need to support enumeration, or if you need indexing as well use the simple ReadOnlyCollection<> wrapper around your list rather than let the API client fiddle with internal bits. If you really need a mutatable list API, ObservableCollection<> at least lets you react to changes the client makes.
Personally I would leave it as a List<string>, or possibly create a FundId class that wraps a string and then store a List<FundId>.
The List<FundId> option would enforce type correct-ness and allow you to put some validation on FundIds.
Just leave it as a List<string>, you variable name is enough to tell others that it's storing FundIDs.
var fundIDList = new List<string>();
When do I need to inherit List<T>?
Inherit it if you have really special actions/operations to do to a fund id list.
public class FundIdList : List<string>
{
public void SpecialAction()
{
//can only do with a fund id list
//sorry I can't give an example :(
}
}
Unless I was going to want someone to do everything they could to List<string>, without any intervention on the part of FundIdList I would prefer to implement IList<string> (or an interface higher up the hierarchy if I didn't care about most of that interface's members) and delegate calls to a private List<string> when appropriate.
And if I did want someone to have that degree of control, I'd probably just given them a List<string> in the first place. Presumably you have something to make sure such strings actually are "Fund IDs", which you can't guarantee any more when you publicly use inheritance.
Actually, this sounds (and often does with List<T>) like a natural case for private inheritance. Alas, C# doesn't have private inheritance, so composition is the way to go.

Difference between Property and Method [duplicate]

This question already has answers here:
Exposing Member Objects As Properties or Methods in .NET
(7 answers)
Closed 9 years ago.
Which one is better to use when it come to return value for example
public int EmployeeAge
{
get{return intEmployeeAge};
}
And
public int EmployeeAge()
{
return intEmployeeAge;
}
Which one is better and why? And what is best programming practice to use when we have secnario like above ?
Properties are a useful way of expressing a feature of an object, allowing get/set in a common way that can be used by APIs like data-binding, reflection and serialization. So for simple values of the object, properties are handy. Properties can't take arguments, should not have significant side-effects*, and should return quickly and repeatably. Also, there is no such thing as an "extension property" (to mirror an extension method) nor a generic property.
(*=lazy loading etc isn't uncommon, however)
Methods (C# doesn't have functions) are better for expressing things that either change the state, or which have an expectation of taking some time and not necessarily being reproducible. They don't tend to work in binding / serialization etc.
Note that properties are actually just a special way of writing methods. There is little functional difference. It is all about expressing intent. The one thing you don't want to expose, however, is fields (the actual intEmployeeAge instance variable).
So I would have:
public int EmployeeAge { get{return intEmployeeAge}; }
or just (if on the Employee object):
public int Age { get{return intEmployeeAge}; }
Of course... then the question becomes "in what unit?" I assume that is years?
If all you need to do is return a value, use a property.
If you need to do something before returning a value, use a function.
Properties holds object data
Functions defines object behavior
Take a look at -> Property Usage Guidelines
Which one is better and why? And what is best programming practice to use when we have
secnario like above ?
I write in C# however I prefer to use Get/Set functions, for me it's just better way to express what I can get from object and how I can change it's state (and this methods are grouped by alphabet in Intelisense which is also nice). However, if team prefers other conventions it's not a problem but when I work on my own projects it's just easier to read API.
e.g
Obejct1 o = new Object1();
o.P1;
o.P2;
o.P3;
from looking to API you can't say what you change in a public API or what it a read only property, unless you use IDE that shows you a small icon showing actually what you can do.
Object1 o = new Object1();
o.GetP1();
o.SetP2();
o.SetP3();
One can easily find from API how data can be changed by type's clients.
A method returns values after work is completed and a value is the result of the work being done. I don't think this is what you are doing.
A property (accessor) is meant for returning variables, which seems to be what you're trying to achieve:
As per MSDN:
The accessor of a property contains
the executable statements associated
with getting (reading or computing) or
setting (writing) the property. The
accessor declarations can contain a
get accessor, a set accessor, or both.
The declarations take the following
forms:
public int EmployeeAge
{
get;
set;
}
Have a look here, as it gives a very good description on the uses of these.
Property is a way explore the internal data element of a class in a simple manner. We can implement a properties with the type-safe get and set method.Property is implicitly called using calling convention.Property works on compile and runtime.
Method is a block of code that contain a series of statements.Method is explicitly called.
Methods works on runtime.
I'm a little late to this party, but I'll just mention another surprising difference between a property and a parameterless "get" method. As #MarcGravell notes, lazy loading is a common pattern when using properties, but beware of the Heisenberg Watch Window gotcha!
I think this has a lot to do with the culture you are programming in. As I see it, the C# / .NET culture would prefer to use a property for this case.
My advice: Try to be consistent with the main libraries you are using.
BUT: Be wary of using properties (or functions that serve the same purpose, as in your example above), as they are often a sign of bad design. You want to tell your objects to do stuff, as opposed to asking them for information. Don't be religious about this, though, just be aware of this as a code smell.

Why should I avoid using Properties in C#?

In his excellent book, CLR Via C#, Jeffrey Richter said that he doesn't like properties, and recommends not to use them. He gave some reason, but I don't really understand. Can anyone explain to me why I should or should not use properties?
In C# 3.0, with automatic properties, does this change?
As a reference, I added Jeffrey Richter's opinions:
• A property may be read-only or write-only; field access is always readable and writable.
If you define a property, it is best to offer both get and set accessor methods.
• A property method may throw an exception; field access never throws an exception.
• A property cannot be passed as an out or ref parameter to a method; a field can. For
example, the following code will not compile:
using System;
public sealed class SomeType
{
private static String Name
{
get { return null; }
set {}
}
static void MethodWithOutParam(out String n) { n = null; }
public static void Main()
{
// For the line of code below, the C# compiler emits the following:
// error CS0206: A property or indexer may not
// be passed as an out or ref parameter
MethodWithOutParam(out Name);
}
}
• A property method can take a long time to execute; field access always completes immediately.
A common reason to use properties is to perform thread synchronization, which
can stop the thread forever, and therefore, a property should not be used if thread synchronization
is required. In that situation, a method is preferred. Also, if your class can be
accessed remotely (for example, your class is derived from System.MashalByRefObject),
calling the property method will be very slow, and therefore, a method is preferred to a
property. In my opinion, classes derived from MarshalByRefObject should never use
properties.
• If called multiple times in a row, a property method may return a different value each
time; a field returns the same value each time. The System.DateTime class has a readonly
Now property that returns the current date and time. Each time you query this
property, it will return a different value. This is a mistake, and Microsoft wishes that they
could fix the class by making Now a method instead of a property.
• A property method may cause observable side effects; field access never does. In other
words, a user of a type should be able to set various properties defined by a type in any
order he or she chooses without noticing any different behavior in the type.
• A property method may require additional memory or return a reference to something
that is not actually part of the object's state, so modifying the returned object has no
effect on the original object; querying a field always returns a reference to an object that
is guaranteed to be part of the original object's state. Working with a property that
returns a copy can be very confusing to developers, and this characteristic is frequently
not documented.
Jeff's reason for disliking properties is because they look like fields - so developers who don't understand the difference will treat them as if they're fields, assuming that they'll be cheap to execute etc.
Personally I disagree with him on this particular point - I find properties make the client code much simpler to read than the equivalent method calls. I agree that developers need to know that properties are basically methods in disguise - but I think that educating developers about that is better than making code harder to read using methods. (In particular, having seen Java code with several getters and setters being called in the same statement, I know that the equivalent C# code would be a lot simpler to read. The Law of Demeter is all very well in theory, but sometimes foo.Name.Length really is the right thing to use...)
(And no, automatically implemented properties don't really change any of this.)
This is slightly like the arguments against using extension methods - I can understand the reasoning, but the practical benefit (when used sparingly) outweighs the downside in my view.
Well, lets take his arguments one by one:
A property may be read-only or
write-only; field access is always
readable and writable.
This is a win for properties, since you have more fine-grained control of access.
A property method may throw an
exception; field access never throws
an exception.
While this is mostly true, you can very well call a method on a not initialized object field, and have an exception thrown.
• A property cannot be passed as an
out or ref parameter to a method; a
field can.
Fair.
• A property method can take a long
time to execute; field access always
completes immediately.
It can also take very little time.
• If called multiple times in a row, a
property method may return a different
value each time; a field returns the
same value each time.
Not true. How do you know the field's value has not changed (possibly by another thread)?
The System.DateTime class has a
readonly Now property that returns the
current date and time. Each time you
query this property, it will return a
different value. This is a mistake,
and Microsoft wishes that they could
fix the class by making Now a method
instead of a property.
If it is a mistake it's a minor one.
• A property method may cause
observable side effects; field access
never does. In other words, a user of
a type should be able to set various
properties defined by a type in any
order he or she chooses without
noticing any different behavior in the
type.
Fair.
• A property method may require
additional memory or return a
reference to something that is not
actually part of the object's state,
so modifying the returned object has
no effect on the original object;
querying a field always returns a
reference to an object that is
guaranteed to be part of the original
object's state. Working with a
property that returns a copy can be
very confusing to developers, and this
characteristic is frequently not
documented.
Most of the protestations could be said for Java's getters and setters too --and we had them for quite a while without such problems in practice.
I think most of the problems could be solved by better syntax highlighting (i.e differentiating properties from fields) so the programmer knows what to expect.
I haven't read the book, and you haven't quoted the part of it you don't understand, so I'll have to guess.
Some people dislike properties because they can make your code do surprising things.
If I type Foo.Bar, people reading it will normally expect that this is simply accessing a member field of the Foo class. It's a cheap, almost free, operation, and it's deterministic. I can call it over and over, and get the same result every time.
Instead, with properties, it might actually be a function call. It might be an infinite loop. It might open a database connection. It might return different values every time I access it.
It is a similar argument to why Linus hates C++. Your code can act surprising to the reader. He hates operator overloading: a + b doesn't necessarily mean simple addition. It may mean some hugely complicated operation, just like C# properties. It may have side effects. It may do anything.
Honestly, I think this is a weak argument. Both languages are full of things like this. (Should we avoid operator overloading in C# as well? After all, the same argument can be used there)
Properties allow abstraction. We can pretend that something is a regular field, and use it as if it was one, and not have to worry about what goes on behind the scenes.
That's usually considered a good thing, but it obviously relies on the programmer writing meaningful abstractions. Your properties should behave like fields. They shouldn't have side effects, they shouldn't perform expensive or unsafe operations. We want to be able to think of them as fields.
However, I have another reason to find them less than perfect. They can not be passed by reference to other functions.
Fields can be passed as ref, allowing a called function to access it directly. Functions can be passed as delegates, allowing a called function to access it directly.
Properties... can't.
That sucks.
But that doesn't mean properties are evil or shouldn't be used. For many purposes, they're great.
Back in 2009, this advice merely seemed like bellyaching of the Who Moved My Cheese variety. Today, it's almost laughably obsolete.
One very important point that many answers seem to tiptoe around but don't quite address head on is that these purported "dangers" of properties are an intentional part of the framework design!
Yes, properties can:
Specify different access modifiers for the getter and setter. This is an advantage over fields. A common pattern is to have a public getter and a protected or internal setter, a very useful inheritance technique which isn't achievable by fields alone.
Throw an exception. To date, this remains one of the most effective methods of validation, especially when working with UI frameworks that involve data-binding concepts. It's much more difficult to ensure that an object remains in a valid state when working with fields.
Take a long time to execute. The valid comparison here is with methods, which take equally long - not fields. No basis is given for the statement "a method is preferred" other than one author's personal preference.
Return different values from its getter on subsequent executions. This almost seems like a joke in such close proximity to the point extolling the virtues of ref/out parameters with fields, whose value of a field after a ref/out call is pretty much guaranteed to be different from its previous value, and unpredictably so.
If we're talking about the specific (and practically academic) case of single-threaded access with no afferent couplings, it's fairly well understood that it's just bad property design to have visible-state-changing side-effects, and maybe my memory is fading, but I just can't seem to recall any examples of folks using DateTime.Now and expecting the same value to come out every time. At least not any instances where they wouldn't have screwed it up just as badly with a hypothetical DateTime.Now().
Cause observable side effects - which is of course precisely the reason properties were invented as a language feature in the first place. Microsoft's own Property Design guidelines indicate that setter order shouldn't matter, as to do otherwise would imply temporal coupling. Certainly, you can't achieve temporal coupling with fields alone, but that's only because you can't cause any meaningful behaviour at all to happen with fields alone, until some method is executed.
Property accessors can actually help prevent certain types of temporal coupling by forcing the object into a valid state before any action is taken - for example, if a class has a StartDate and an EndDate, then setting the EndDate before the StartDate could force the StartDate back as well. This is true even in multi-threaded or asynchronous environments, including the obvious example of an event-driven user interface.
Other things that properties can do which fields can't include:
Lazy loading, one of the most effective ways of preventing initialization-order errors.
Change Notifications, which are pretty much the entire basis for the MVVM architecture.
Inheritance, for example defining an abstract Type or Name so derived classes can provide interesting but nevertheless constant metadata about themselves.
Interception, thanks to the above.
Indexers, which everyone who has ever had to work with COM interop and the inevitable spew of Item(i) calls will recognize as a wonderful thing.
Work with PropertyDescriptor which is essential for creating designers and for XAML frameworks in general.
Richter is clearly a prolific author and knows a lot about the CLR and C#, but I have to say, it seems like when he originally wrote this advice (I'm not sure if it's in his more recent revisions - I sincerely hope not) that he just didn't want to let go of old habits and was having trouble accepting the conventions of C# (vs. C++, for example).
What I mean by this is, his "properties considered harmful" argument essentially boils down to a single statement: Properties look like fields, but they might not act like fields. And the problem with the statement is, it isn't true, or at best it's highly misleading. Properties don't look like fields - at least, they aren't supposed to look like fields.
There are two very strong coding conventions in C# with similar conventions shared by other CLR languages, and FXCop will scream at you if you don't follow them:
Fields should always be private, never public.
Fields should be declared in camelCase. Properties are PascalCase.
Thus, there is no ambiguity over whether Foo.Bar = 42 is a property accessor or a field accessor. It's a property accessor and should be treated like any other method - it might be slow, it might throw an exception, etc. That's the nature of Abstraction - it's entirely up to the discretion of the declaring class how to react. Class designers should apply the principle of least surprise but callers should not assume anything about a property except that it does what it says on the tin. That's on purpose.
The alternative to properties is getter/setter methods everywhere. That's the Java approach, and it's been controversial since the beginning. It's fine if that's your bag, but it's just not how we roll in the .NET camp. We try, at least within the confines of a statically-typed system, to avoid what Fowler calls Syntactic Noise. We don't want extra parentheses, extra get/set warts, or extra method signatures - not if we can avoid them without any loss of clarity.
Say whatever you like, but foo.Bar.Baz = quux.Answers[42] is always going to be a lot easier to read than foo.getBar().setBaz(quux.getAnswers().getItem(42)). And when you're reading thousands of lines of this a day, it makes a difference.
(And if your natural response to the above paragraph is to say, "sure it's hard to read, but it would be easier if you split it up in multiple lines", then I'm sorry to say that you have completely missed the point.)
I don't see any reasons why you shouldn't use Properties in general.
Automatic properties in C# 3+ only simplify syntax a bit (a la syntatic sugar).
It's just one person's opinion. I've read quite a few c# books and I've yet to see anyone else saying "don't use properties".
I personally think properties are one of the best things about c#. They allow you to expose state via whatever mechanism you like. You can lazily instantiate the first time something is used and you can do validation on setting a value etc. When using and writing them, I just think of properties as setters and getters which a much nicer syntax.
As for the caveats with properties, there are a couple. One is probably a misuse of properties, the other can be subtle.
Firstly, properties are types of methods. It can be surprising if you place complicated logic in a property because most users of a class will expect the property to be fairly lightweight.
E.g.
public class WorkerUsingMethod
{
// Explicitly obvious that calculation is being done here
public int CalculateResult()
{
return ExpensiveLongRunningCalculation();
}
}
public class WorkerUsingProperty
{
// Not at all obvious. Looks like it may just be returning a cached result.
public int Result
{
get { return ExpensiveLongRunningCalculation(); }
}
}
I find that using methods for these cases helps to make a distinction.
Secondly, and more importantly, properties can have side-effects if you evaluate them while debugging.
Say you have some property like this:
public int Result
{
get
{
m_numberQueries++;
return m_result;
}
}
Now suppose you have an exception that occurs when too many queries are made. Guess what happens when you start debugging and rollover the property in the debugger? Bad things. Avoid doing this! Looking at the property changes the state of the program.
These are the only caveats I have. I think the benefits of properties far outweigh the problems.
That reason must have been given within a very specific context. It's usually the other way round - it is recomended to use properties as they give you a level of abstraction enabling you to change behaviour of a class without affecting its clients...
I can't help picking on the details of Jeffrey Richter's opinions:
A property may be read-only or write-only; field access is always readable and writable.
Wrong: Fields can marked read-only so only the object's constructor can write to them.
A property method may throw an exception; field access never throws an exception.
Wrong: The implementation of a class can change the access modifier of a field from public to private. Attempts to read private fields at runtime will always result in an exception.
I don't agree with Jeffrey Richter, but I can guess why he doesn't like properties (I haven't read his book).
Even though, properties are just like methods (implementation-wise), as a user of a class, I expect that its properties behave "more or less" like a public field, e.g:
there's no time-consuming operation going on inside the property getter/setter
the property getter has no side effects (calling it multiple times, does not change the result)
Unfortunately, I have seen properties which did not behave that way. But the problem are not the properties themselves, but the people who implemented them. So it just requires some education.
The argument assumes that properties are bad because they look like fields, but can do surprising actions. This assumption is invalidated by .NET programmers' expectactions:
Properties don't look like fields. Fields look like properties.
• A property method may throw an exception; field access never throws an exception.
So, a field is like a property that is guaranteed to never throw an exception.
• A property cannot be passed as an out or ref parameter to a method; a field can.
So, a field is like a property, but it has additional capabilities: passing to a ref/out accepting methods.
• A property method can take a long time to execute; field access always completes immediately. [...]
So, a field is like a fast property.
• If called multiple times in a row, a property method may return a different value each time; a field returns the same value each time. The System.DateTime class has a readonly Now property that returns the current date and time.
So, a field is like a property that is guaranteed to return the same value unless the field was set to a different value.
• A property method may cause observable side effects; field access never does.
Again, a field is a property that is guaranteed to not do that.
• A property method may require additional memory or return a reference to something that is not actually part of the object's state, so modifying the returned object has no effect on the original object; querying a field always returns a reference to an object that is guaranteed to be part of the original object's state. Working with a property that returns a copy can be very confusing to developers, and this characteristic is frequently not documented.
This one may be surprising, but not because it's a property that does this. Rather, it's that barely anyone returns mutable copies in properties, so that 0.1% case is surprising.
There is a time when I consider not using properties, and that is in writing .Net Compact Framework code. The CF JIT compiler does not perform the same optimisation as the desktop JIT compiler and does not optimise simple property accessors, so in this case adding a simple property causes a small amount of code bloat over using a public field. Usually this wouldn't be an issue, but almost always in the Compact Framework world you are up against tight memory constraints, so even tiny savings like this do count.
You shouldn't avoid using them but you should use them with qualification and care, for the reasons given by other contributors.
I once saw a property called something like Customers that internally opened an out-of-process call to a database and read the customer list. The client code had a 'for (int i to Customers.Count)' which was causing a separate call to the database on each iteration and for the access of the selected Customer. That's an egregious example that demonstrates the principle of keeping the property very light - rarely more than a internal field access.
One argument FOR using properties is that they allow you to validate the value being set. Another is that the value of of the property may be a derived value, not a single field, like TotalValue = amount * quantity.
Personally I only use properties when creating simple get / set methods. I stray away from it when coming to complicated data structures.
I haven't read his book but I agree with him.
It's quite a contradiction on the usual paradigm.
They look like a field but had many sideeffects. Exceptions, performance, sideeffects, read/write access difference, etc.
If you care about readability and reliability i think you'd care about this as well.
In my experience i never gained anything from properties and only felt like using them when i was lazy.
But many times i encountered people using properties as if they were fields inadvertently creating all sort of issues.
Ive worked professionally on many languages for quite a time and never needed them, not even once.
Of course other people might find them useful and that's fine too.
Depends on your user case and what you value.
I personally believe they are just 'features ' that you can choose to use or not.
Invoking methods instead of properties greatly reduces the readability of the invoking code. In J#, for example, using ADO.NET was a nightmare because Java doesn't support properties and indexers (which are essentially properties with arguments). The resulting code was extremely ugly, with empty parentheses method calls all over the place.
The support for properties and indexers is one of the basic advantages of C# over Java.

Categories