Copy constructor versus Clone() - c#

In C#, what is the preferred way to add (deep) copy functionality to a class? Should one implement the copy constructor, or rather derive from ICloneable and implement the Clone() method?
Remark: I wrote "deep" within brackets because I thought it was irrelevant. Apparently others disagree, so I asked whether a copy constructor/operator/function needs to make clear which copy variant it implements.

You should not derive from ICloneable.
The reason is that when Microsoft designed the .NET Framework, they never specified whether the Clone() method on ICloneable should perform a deep or a shallow clone. The interface is therefore semantically broken: your callers won't know whether a call will deep or shallow clone the object.
Instead, you should define your own IDeepCloneable (and IShallowCloneable) interfaces with DeepClone() (and ShallowClone()) methods.
You can define two interfaces, one with a generic parameter to support strongly typed cloning and one without to keep the weakly typed cloning ability for when you are working with collections of different types of cloneable objects:
public interface IDeepCloneable
{
object DeepClone();
}
public interface IDeepCloneable<T> : IDeepCloneable
{
new T DeepClone(); // "new" marks the intentional hiding of IDeepCloneable.DeepClone()
}
Which you would then implement like this:
public class SampleClass : IDeepCloneable<SampleClass>
{
public SampleClass DeepClone()
{
// Deep clone your object
return ...;
}
object IDeepCloneable.DeepClone()
{
return this.DeepClone();
}
}
Generally I prefer the interfaces described above to a copy constructor, because they keep the intent very clear. A copy constructor would probably be assumed to perform a deep clone, but its intent is certainly not as clear as that of an IDeepCloneable interface.
This is discussed in the .NET Framework Design Guidelines and on Brad Abrams' blog.
(I suppose if you are writing an application (as opposed to a framework/library) so you can be sure no one outside of your team will be calling your code, it doesn't matter so much and you can assign a semantic meaning of "deepclone" to the .net ICloneable interface, but you should make sure this is well documented and well understood within your team. Personally I'd stick to the framework guidelines.)
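To make the elided DeepClone() body above more concrete, here is a minimal sketch of what an implementation could look like. The Person and Address types and their members are hypothetical and exist only for illustration; the two interfaces are repeated so the snippet stands on its own.

```csharp
using System.Collections.Generic;

public interface IDeepCloneable
{
    object DeepClone();
}

public interface IDeepCloneable<T> : IDeepCloneable
{
    new T DeepClone();
}

// Hypothetical leaf type: it holds only an immutable string, so copying the value is enough.
public sealed class Address : IDeepCloneable<Address>
{
    public string City { get; set; }

    public Address DeepClone() { return new Address { City = City }; }

    object IDeepCloneable.DeepClone() { return DeepClone(); }
}

// Hypothetical type with a nested object and a collection, both of which must be copied.
public sealed class Person : IDeepCloneable<Person>
{
    private readonly List<string> tags = new List<string>();

    public string Name { get; set; }
    public Address Home { get; set; }
    public List<string> Tags { get { return tags; } }

    public Person DeepClone()
    {
        // Clone the nested object instead of sharing the reference.
        var copy = new Person { Name = Name, Home = Home == null ? null : Home.DeepClone() };
        // Copy the list contents; strings are immutable and can be shared safely.
        copy.Tags.AddRange(tags);
        return copy;
    }

    object IDeepCloneable.DeepClone() { return DeepClone(); }
}
```

After `var copy = person.DeepClone();`, changing `copy.Home.City` or adding to `copy.Tags` leaves the original person untouched.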

In C#, what is the preferred way to add (deep) copy functionality to a class?
Should one implement the copy constructor,
or rather derive from ICloneable and implement the Clone() method?
The problem with ICloneable is, as others have mentioned, that it does not specify whether it is a deep or shallow copy, which makes it practically unusable and, in practice, rarely used. It also returns object, which is a pain, since it requires a lot of casting. (And though you specifically mentioned classes in the question, implementing ICloneable on a struct requires boxing.)
A copy constructor also suffers from one of the problems with ICloneable: it isn't obvious whether a copy constructor is doing a deep or shallow copy.
Account clonedAccount = new Account(currentAccount); // Deep or shallow?
It would be best to create a DeepClone() method. This way the intent is perfectly clear.
This raises the question of whether it should be a static or instance method.
Account clonedAccount = currentAccount.DeepClone(); // instance method
or
Account clonedAccount = Account.DeepClone(currentAccount); // static method
I slightly prefer the static version sometimes, just because cloning seems like something that is being done to an object rather than something the object is doing. In either case, there are going to be issues to deal with when cloning objects that are part of an inheritance hierarchy, and how those issues are dealt with may ultimately drive the design.
class CheckingAccount : Account
{
CheckAuthorizationScheme checkAuthorizationScheme;
public override Account DeepClone()
{
CheckingAccount clone = new CheckingAccount();
DeepCloneFields(clone);
return clone;
}
protected override void DeepCloneFields(Account clone)
{
base.DeepCloneFields(clone);
((CheckingAccount)clone).checkAuthorizationScheme = this.checkAuthorizationScheme.DeepClone();
}
}
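The CheckingAccount snippet above relies on a base class that is not shown. The following is one possible sketch of the whole hierarchy, under the assumption that Account exposes a virtual DeepClone() and a protected virtual DeepCloneFields(). The Balance, Owners and RequiredSignatures members are hypothetical and only there to give the clone something to copy.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the scheme object referenced above.
public class CheckAuthorizationScheme
{
    public int RequiredSignatures;

    public CheckAuthorizationScheme DeepClone()
    {
        return new CheckAuthorizationScheme { RequiredSignatures = RequiredSignatures };
    }
}

public class Account
{
    public decimal Balance;
    public List<string> Owners = new List<string>();

    // Each concrete type creates an instance of itself, then lets the
    // DeepCloneFields chain fill in the state level by level.
    public virtual Account DeepClone()
    {
        Account clone = new Account();
        DeepCloneFields(clone);
        return clone;
    }

    protected virtual void DeepCloneFields(Account clone)
    {
        clone.Balance = Balance;
        clone.Owners = new List<string>(Owners); // new list, same immutable strings
    }
}

public class CheckingAccount : Account
{
    public CheckAuthorizationScheme Scheme = new CheckAuthorizationScheme();

    public override Account DeepClone()
    {
        CheckingAccount clone = new CheckingAccount();
        DeepCloneFields(clone);
        return clone;
    }

    protected override void DeepCloneFields(Account clone)
    {
        base.DeepCloneFields(clone);
        ((CheckingAccount)clone).Scheme = Scheme.DeepClone();
    }
}
```

The cast in DeepCloneFields is the price of this design: the method is typed against the base class so that the chain of base calls works, and each level casts back to its own type.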

I recommend using a copy constructor over a clone method, primarily because a clone method prevents you from making fields readonly that could have been readonly had you used a constructor instead.
If you require polymorphic cloning, you can then add an abstract or virtual Clone() method to your base class that you implement with a call to the copy constructor.
If you require more than one kind of copy (e.g. deep/shallow), you can specify it with a parameter in the copy constructor, although in my experience a mixture of deep and shallow copying is usually what I need.
Example:
public class BaseType {
readonly int mBaseField;
public BaseType(BaseType pSource) =>
mBaseField = pSource.mBaseField;
public virtual BaseType Clone() =>
new BaseType(this);
}
public class SubType : BaseType {
readonly int mSubField;
public SubType(SubType pSource)
: base(pSource) =>
mSubField = pSource.mSubField;
public override BaseType Clone() =>
new SubType(this);
}

There is a great argument that you should implement clone() using a protected copy constructor:
It is better to provide a protected (non-public) copy constructor and invoke that from the clone method. This gives us the ability to delegate the task of creating an object to an instance of a class itself, thus providing extensibility and also, safely creating the objects using the protected copy constructor.
So this is not a "versus" question. You may need both copy constructor(s) and a clone interface to do it right.
(Although the recommended public interface is the Clone() interface rather than Constructor-based.)
Don't get caught-up in the explicit deep or shallow argument in the other answers. In the real world it is almost always something in-between - and either way, should not be the caller's concern.
The Clone() contract is simply "won't change when I change the first one". How much of the graph you have to copy, or how you avoid infinite recursion to make that happen shouldn't concern the caller.

Implementing ICloneable's not recommended due to the fact that it's not specified whether it's a deep or shallow copy, so I'd go for the constructor, or just implement something yourself. Maybe call it DeepCopy() to make it really obvious!

You'll run into problems with copy constructors and abstract classes. Imagine you want to do the following:
abstract class A
{
public A()
{
}
public A(A ToCopy)
{
X = ToCopy.X;
}
public int X;
}
class B : A
{
public B()
{
}
public B(B ToCopy) : base(ToCopy)
{
Y = ToCopy.Y;
}
public int Y;
}
class C : A
{
public C()
{
}
public C(C ToCopy)
: base(ToCopy)
{
Z = ToCopy.Z;
}
public int Z;
}
class Program
{
static void Main(string[] args)
{
List<A> list = new List<A>();
B b = new B();
b.X = 1;
b.Y = 2;
list.Add(b);
C c = new C();
c.X = 3;
c.Z = 4;
list.Add(c);
List<A> cloneList = new List<A>();
//Won't work
//foreach (A a in list)
// cloneList.Add(new A(a)); //Not this time batman!
//Works, but is nasty for anything less contrived than this example.
foreach (A a in list)
{
if(a is B)
cloneList.Add(new B((B)a));
if (a is C)
cloneList.Add(new C((C)a));
}
}
}
Right after doing the above, you start wishing you'd either used an interface, or settled for a DeepCopy()/ICloneable.Clone() implementation.
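One way out of the type-switching loop above is to keep the copy constructors and add an abstract Clone() method that each subclass implements by calling its own copy constructor. This is only a sketch of that idea, reusing the A, B and C classes from the example; the CloneDemo helper is a hypothetical name.

```csharp
using System.Collections.Generic;

abstract class A
{
    public int X;

    protected A() { }
    protected A(A toCopy) { X = toCopy.X; }

    // Each concrete subclass knows its own type, so it can pick the right constructor.
    public abstract A Clone();
}

class B : A
{
    public int Y;

    public B() { }
    public B(B toCopy) : base(toCopy) { Y = toCopy.Y; }

    public override A Clone() { return new B(this); }
}

class C : A
{
    public int Z;

    public C() { }
    public C(C toCopy) : base(toCopy) { Z = toCopy.Z; }

    public override A Clone() { return new C(this); }
}

static class CloneDemo
{
    // No type checks needed: virtual dispatch selects the right Clone().
    public static List<A> CloneAll(IEnumerable<A> source)
    {
        var result = new List<A>();
        foreach (A a in source)
            result.Add(a.Clone());
        return result;
    }
}
```

Adding a new subclass then only requires overriding Clone() in that subclass; the loop in CloneAll does not change.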

The problem with ICloneable is both intent and consistency. It's never clear whether it performs a deep or a shallow copy. Because of that, it is probably never used consistently in one manner or the other.
I don't find a public copy constructor to be any clearer on that matter.
That said, I would introduce a set of methods that works for you and relays intent (that is, one that is somewhat self-documenting).

If the object you are trying to copy is Serializable you can clone it by serializing it and deserializing it. Then you don't need to write a copy constructor for each class.
I don't have access to the code right now but it is something like this
public object DeepCopy(object source)
{
// Copy with binary serialization if the object supports it
// If not, try copying with XML serialization
// If not, try copying with the DataContractSerializer, etc.
}
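As a rough sketch of the round-trip idea, the following uses DataContractSerializer, one of the options named in the comments above. It is not the original poster's code. BinaryFormatter is marked obsolete in recent .NET versions, which is why it is not used here. The SerializationCloner and Order names are hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;

public static class SerializationCloner
{
    // Serializes the object graph to memory and reads it back as a new graph.
    // Assumes the runtime type of source is T and is usable with DataContractSerializer.
    public static T DeepCopy<T>(T source)
    {
        var serializer = new DataContractSerializer(typeof(T));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, source);
            stream.Position = 0; // rewind before reading
            return (T)serializer.ReadObject(stream);
        }
    }
}

// Hypothetical type used to exercise the helper.
[DataContract]
public class Order
{
    [DataMember] public string Customer { get; set; }
    [DataMember] public List<string> Items { get; set; }
}
```

This trades hand-written copy code for runtime cost and for the constraints of the serializer: only members the serializer knows about are copied, and subclasses passed through a base-typed reference would need to be registered as known types.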

It depends on the copy semantics of the class in question, which you, as the developer, should define yourself. The method chosen is usually based on the intended use cases of the class. It may make sense to implement both. But both share a similar disadvantage: it is not exactly clear which kind of copying they implement. This should be clearly stated in the documentation for your class.
For me having:
// myobj is some transparent proxy object
var state = new ObjectState(myobj.State);
// do something
myobject = GetInstance();
var newState = new ObjectState(myobject.State);
if (!newState.Equals(state))
throw new Exception();
instead of:
// myobj is some transparent proxy object
var state = myobj.State.Clone();
// do something
myobject = GetInstance();
var newState = myobject.State.Clone();
if (!newState.Equals(state))
throw new Exception();
looked like a clearer statement of intent.

I think there should be a standard pattern for cloneable objects, though I'm not sure what exactly the pattern should be. With regard to cloning, it would seem there are three types of classes:
(1) Those that explicitly support deep cloning.
(2) Those where memberwise cloning will work as deep cloning, but which neither have nor need explicit support.
(3) Those which cannot be usefully deep cloned, and where memberwise cloning will yield bad results.
So far as I can tell, the only way (at least in .NET 2.0) to get a new object of the same class as an existing object is to use MemberwiseClone. A nice pattern would seem to be to have a "new"/"Shadows" function Clone which always returns the present type, whose definition is always to call MemberwiseClone and then call a protected virtual subroutine CleanupClone(originalObject). The CleanupClone routine should call base.CleanupClone to handle the base type's cloning needs and then add its own cleanup. If the cloning routine has to use the original object, it would have to be typecast, but otherwise the only typecasting would be on the MemberwiseClone call.
Unfortunately, the lowest level of class that was of type (1) above rather than type (2) would have to be coded to assume that its lower types would not need any explicit support for cloning. I don't really see any way around that.
Still, I think having a defined pattern would be better than nothing.
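The pattern described above could look roughly like the sketch below. Node and WeightedNode are hypothetical types; only MemberwiseClone and the Clone/CleanupClone split come from the description.

```csharp
using System.Collections.Generic;

public class Node
{
    public string Name;
    public List<string> Tags = new List<string>();

    // MemberwiseClone creates an object of the runtime type, so subclasses
    // are copied as themselves even when called through a Node reference.
    public Node Clone()
    {
        Node copy = (Node)MemberwiseClone();
        copy.CleanupClone(this);
        return copy;
    }

    // Runs on the new copy and replaces shared mutable references with private ones.
    protected virtual void CleanupClone(Node original)
    {
        Tags = new List<string>(original.Tags);
    }
}

public class WeightedNode : Node
{
    public double[] Weights = new double[0];

    // "new" (Shadows in VB) so that callers holding a WeightedNode get a WeightedNode back.
    public new WeightedNode Clone()
    {
        return (WeightedNode)base.Clone();
    }

    protected override void CleanupClone(Node original)
    {
        base.CleanupClone(original);
        // The original has to be typecast to reach the subclass state.
        Weights = (double[])((WeightedNode)original).Weights.Clone();
    }
}
```

A subclass that adds only value-type or immutable fields needs no code at all, which matches case (2) in the list above.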
Incidentally, if one knows that one's base type supports ICloneable, but does not know the name of the function it uses, is there any way to reference the ICloneable.Clone function of one's base type?

If you read through all the interesting answers and discussions, you might still ask yourself how exactly you copy the properties - all of them explicitly, or is there a more elegant way to do it? If that is your remaining question, take a look at this (at StackOverflow):
How can I “deeply” clone the properties of 3rd party classes using a generic extension method?
It describes how to implement an extension method CreateCopy() which creates a "deep" copy of the object including all properties (without having to copy property by property manually).

Related

C#, making public members their methods private

I have the following class:
public class Humptydump
{
public Humptydump()
{ }
public Rectangle Rectangle { get; private set; }
}
In this class, the Rectangle type comes from System.Drawing.
How do I make it so that people cannot access the methods of the Rectangle, but can still get the Rectangle itself?
In your case, it will "just work".
Since Rectangle is a struct, your property will return a copy of the Rectangle. As such, it will be impossible for anybody to modify your Rectangle directly unless you expose methods to allow this.
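The copy behavior can be seen with a small stand-in. Box is a hypothetical mutable struct used here instead of System.Drawing.Rectangle so that the sketch has no external dependency, and Holder plays the role of Humptydump.

```csharp
// Hypothetical mutable struct standing in for Rectangle.
public struct Box
{
    public int Width;
    public int Height;

    public void Inflate(int amount)
    {
        Width += amount;
        Height += amount;
    }
}

public class Holder
{
    public Holder()
    {
        Box = new Box { Width = 10, Height = 5 };
    }

    // The getter hands out a copy of the struct value on every call.
    public Box Box { get; private set; }
}
```

With `var h = new Holder(); Box copy = h.Box; copy.Inflate(3);`, only the local copy grows and `h.Box.Width` is still 10. Calling `h.Box.Inflate(3)` compiles but operates on a temporary copy, and assigning `h.Box.Width = 3` is rejected by the compiler.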
That being said, it's impossible, in general, to provide access to a type without also providing access to methods defined on the type. The methods go along with the type. The only alternative in those cases would be to create a new type that exposes the data you choose, without the data or methods you do not wish to be exposed, and provide access to that.
If Rectangle were not a struct, one possibility would be to derive from it and hide those methods:
public class DerivedClass : BaseClass
{
new private SomeReturnType SomeMethodFromBaseClass(SameParametersAsInBaseClassAndSameSignature)
{
//the intent is to hide the method from the user; note that a private "new" member
//only hides the base method inside DerivedClass, so outside callers still reach
//the base method, either directly or by casting to BaseClass
}
}
Are you talking about the Rectangle object specifically, or on a more general term and just using that as an example?
If you're talking on a more general term, this is something that comes up very often in refactoring patterns. This most commonly happens with collections on objects. If you expose, for example, a List<T> then even if the setter is private then people can still modify the collection through the getter, since they're not actually setting the collection when they do so.
To address this, consider the Law of Demeter. That is, when someone is interacting with a collection exposed by an object, should they really be interacting with the object itself? If so, then the collection shouldn't be exposed and instead the object should expose the functionality it needs to.
So, again in the case of a collection, you might end up with something like this:
class SomeObject
{
private List<AnotherObject> Things;
public void AddAnotherObject(AnotherObject obj)
{
// Add it to the list
}
public void RemoveAnotherObject(AnotherObject obj)
{
// Remove it from the list
}
}
Of course, you may also want to expose some copy of the object itself for people to read, but not modify. For a collection I might do something like this:
public IEnumerable<AnotherObject> TheObjects
{
get { return Things; }
}
That way anybody can see the current state of the objects and enumerate over them, but they can't actually modify it. Not because it doesn't have a setter, but because the IEnumerable<T> interface doesn't have options to modify the enumeration. Only to enumerate over it.
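One caveat with returning the list itself typed as IEnumerable<T> is that a caller can cast the result back to List<T> and modify it. A possible hardening, sketched here with the same hypothetical SomeObject and AnotherObject names, is to return a read-only wrapper from List<T>.AsReadOnly() instead.

```csharp
using System.Collections.Generic;

public class AnotherObject
{
    public string Name { get; set; }
}

public class SomeObject
{
    private readonly List<AnotherObject> things = new List<AnotherObject>();

    public void AddAnotherObject(AnotherObject obj)
    {
        things.Add(obj);
    }

    public bool RemoveAnotherObject(AnotherObject obj)
    {
        return things.Remove(obj);
    }

    // AsReadOnly() wraps the live list: callers see current contents,
    // but the wrapper cannot be cast back to List<T>.
    public IEnumerable<AnotherObject> TheObjects
    {
        get { return things.AsReadOnly(); }
    }
}
```

The wrapper only protects the collection. The AnotherObject instances inside it are still the originals, so callers can change their properties unless those are protected separately.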
For your case with Rectangle (or something similar which isn't already a struct that's passed by value anyway), you would do something very similar. Store a private object and provide public functionality to modify it through the class itself (since what we're talking about is that the class needs to know when its members are modified) as well as functionality to inspect it without being able to modify what's being inspected. Something like this, perhaps:
class SomeObject
{
private AnotherObject Thing;
public AnotherObject TheThing
{
get { return Thing.Copy(); }
}
public void RenameThing(string name)
{
Thing.Name = name;
}
// etc.
}
In this case, without going into too much detail about what AnotherObject is (so consider this in some ways pseudo-code), the property to inspect the inner object returns a copy of it, not the actual reference to the actual object. For value types, this is the default behavior of the language. For reference types, you may need to strike a balance between this and performance (if creating a copy is a heavy operation).
In this case you'll also want to be careful not to make the interface of your object unintuitive. Consuming code might expect to be able to modify the inner object being inspected, since it exposes functionality to modify itself. And, indeed, they can modify the copy that they have. How you address this depends heavily on the conceptual nature of the objects and how they relate to one another, which a contrived example doesn't really convey. You might create a custom DTO (even a struct) which returns only the observable properties of the inner object, making it more obvious that it's a copy and not the original. You might just say that it's a copy in the IntelliSense comments. You might make separate properties to return individual data elements of the inner object instead of a single property to return the object itself. There are plenty of options; it's up to you to determine what makes the most sense for your objects.
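As an illustration of the DTO option mentioned above, the following sketch returns a small read-only struct instead of a copy of the inner object. ThingInfo and the Size property are hypothetical.

```csharp
public class AnotherObject
{
    public string Name { get; set; }
    public int Size { get; set; }
}

// Hypothetical snapshot type: it has no setters, so it is obviously not the live object.
public struct ThingInfo
{
    private readonly string name;
    private readonly int size;

    public ThingInfo(string name, int size)
    {
        this.name = name;
        this.size = size;
    }

    public string Name { get { return name; } }
    public int Size { get { return size; } }
}

public class SomeObject
{
    private readonly AnotherObject thing = new AnotherObject { Name = "initial", Size = 1 };

    // Each read produces a fresh snapshot of the observable state.
    public ThingInfo TheThing
    {
        get { return new ThingInfo(thing.Name, thing.Size); }
    }

    public void RenameThing(string name)
    {
        thing.Name = name;
    }
}
```

A snapshot taken before a call to RenameThing keeps the old name, which makes the copy semantics visible in the type itself rather than only in documentation.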

Loose coupling via using only primitive types / delegates

I have a conceptual / theoretical question about loose coupling and interfaces.
So one way to use an interface might be to encapsulate the parameters required by a certain constructor:
class Foo
{
public Foo(IFooInterface foo)
{
// do stuff that depends on the members of IFooInterface
}
}
So as long as the object passed in implements the contract, everything will work. From my understanding the main benefit here is that it enables polymorphism, but I'm not sure whether this really has anything to do with loose coupling.
Let's say for the sake of argument that IFooInterface is as follows:
interface IFooInterface
{
string FooString { get; set; }
int FooInt { get; set; }
void DoFoo(string _str);
}
From a loose coupling standpoint, wouldn't it be much better NOT to use an IFooInterface in the above constructor, and instead set up Foo like so:
class Foo
{
public Foo(string _fooString, int _fooInt, Action<string> _doFoo)
{
// do the same operations
}
}
Because say I want to drop the functionality of Foo into another project. That means that the other project also has to reference IFooInterface, adding another dependency. But this way I can drop Foo into another project and it expresses exactly what it requires in order to work. Obviously I can just use overloaded constructors, but let's say for the sake of argument that I don't want to and/or cannot modify Foo's constructors.
The most salient downside (to me at least) is that a method with a bunch of primitive parameters gets ugly and hard to read. So I had the idea to create a sort of wrapping function that allows you to still pass in an interface rather than all the primitive types:
public static Func<T, R> Wrap<T, R>(Func<T, object[]> conversion)
{
return new Func<T, R>(t =>
{
object[] parameters = conversion(t);
Type[] args = new Type[parameters.Length];
for (int i = 0; i < parameters.Length; i++)
{
args[i] = parameters[i].GetType();
}
ConstructorInfo info = typeof(R).GetConstructor(args);
return (R)info.Invoke(parameters);
});
}
The idea here is that I can get back a function that takes an instance of some interface which conforms to the requirements of Foo, but Foo literally doesn't know anything about that interface. It could be used like so:
public Foo MakeFoo(IFooInterface foo)
{
return Wrap<IFooInterface, Foo>(f =>
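// a method group cannot be stored in an object[] directly, so convert it to a delegate first
new object[] { f.FooString, f.FooInt, (Action<string>)f.DoFoo })(foo);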
}
I've heard discussion about how interfaces are supposed to enable loose-coupling, but was wondering about this.
Wondering what some experienced programmers think.
In your initial example you're pretty close to the Parameter Object pattern, though it's more common to use a simple class (often with auto-properties) here without the extra abstraction of an interface.
Typically when you hear about passing an interface into a constructor, it's not to replace primitives but as a form of dependency injection. Instead of depending on MyFooRepository directly, one would take a dependency on IFooRepository which would remove the coupling to a specific implementation.
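A minimal sketch of that kind of dependency injection might look as follows. IFooRepository comes from the paragraph above; its Load method, FooService and InMemoryFooRepository are hypothetical names chosen for illustration.

```csharp
using System.Collections.Generic;

public interface IFooRepository
{
    string Load(int id);
}

// Depends only on the abstraction; it never names a concrete repository.
public class FooService
{
    private readonly IFooRepository repository;

    public FooService(IFooRepository repository)
    {
        this.repository = repository;
    }

    public string Describe(int id)
    {
        return "Foo: " + repository.Load(id);
    }
}

// One possible implementation, e.g. for tests. A database-backed
// MyFooRepository would implement the same interface.
public class InMemoryFooRepository : IFooRepository
{
    private readonly Dictionary<int, string> items = new Dictionary<int, string>();

    public void Save(int id, string value)
    {
        items[id] = value;
    }

    public string Load(int id)
    {
        return items[id];
    }
}
```

The point of the interface here is that FooService can be handed any implementation, which is a different goal from bundling primitive constructor parameters.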
My first thought is that you did not provide Action<string> and Action<int> for the setters of FooString and FooInt, respectively. The implementation of IFooInterface may have rules concerning those setters, and may require access to other implementation details not exposed on the interface.
In the same vein, you should accept a Func<string> and Func<int> as well: the implementation of IFooInterface may have rules about what FooString and FooInt are as time progresses. For example, DoFoo may recalculate those values; you can't assume that they are just pass-throughs to fields that never change.
Taking this even further, if the getters, setters, or DoFoo require access to common state, the functions and actions will need to close over the same set of variables when you create them. At that point, you will be doing some mental gymnastics to comprehend the variable lifetimes and the relationships between the delegates.
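The shared-state problem can be sketched like this. The delegate-based Foo below is a hypothetical variant of the class from the question that takes Func<string>, Func<int> and Action<string>, and FooFactory is an invented helper that wires the three delegates to the same captured variables.

```csharp
using System;

public class Foo
{
    private readonly Func<string> getFooString;
    private readonly Func<int> getFooInt;
    private readonly Action<string> doFoo;

    public Foo(Func<string> getFooString, Func<int> getFooInt, Action<string> doFoo)
    {
        this.getFooString = getFooString;
        this.getFooInt = getFooInt;
        this.doFoo = doFoo;
    }

    public string Run(string input)
    {
        doFoo(input); // may change what the two getters return
        return getFooString() + ":" + getFooInt();
    }
}

public static class FooFactory
{
    public static Foo Create()
    {
        // All three lambdas close over the same two locals, which now act
        // as the hidden fields of an anonymous, unnamed "class".
        string text = "";
        int count = 0;
        return new Foo(() => text, () => count, s => { text = s; count++; });
    }
}
```

The captured locals behave like private fields and the lambdas like methods, so the delegates end up rebuilding a class by hand, only without a name or a single place to read it.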
This pairing of state and behavior is exactly what a class expresses, and the hiding of implementation details is exactly what an interface provides. Breaking those concepts into their component elements is certainly achievable, but it also breaks the coherence gained by grouping the members with a type.
To put it another way, you can give me noodles, sauce, vegetables, and hamburger, but that's not spaghetti and meatballs :-)

Using type object as returning type - bad practice?

I have a method
private object SetGrid(IGrid grid)
{
grid.PagerHelper.SetPage(1, 10);
grid.SortHelper.SetSort(SortOperator.Ascending);
grid.PagerHelper.RecordsPerPage = 10;
return grid;
}
which returns an object of type object.
Then I cast the object back to the previous type.
var projectModel = new ProjectModel();
projektyModel = (ProjectModel)SetGrid(projectModel);
The gain of this is, the method SetGrid can be reused across the app.
Is this a common practice or should I avoid doing this ?
You could use a generic method instead, and constrain the type argument to your IGrid interface:
private T SetGrid<T>(T grid) where T : IGrid
{
grid.PagerHelper.SetPage(1, 10);
grid.SortHelper.SetSort(SortOperator.Ascending);
grid.PagerHelper.RecordsPerPage = 10;
return grid;
}
You should still be able to call the method in exactly the same way, just without the cast. Type inference should be capable of automagically figuring out the required generic type argument for you:
var projectModel = new ProjectModel();
projektyModel = SetGrid(projectModel);
EDIT...
As other answers have mentioned, if your IGrid objects are reference types then you don't actually need to return anything at all from your method. If you pass a reference type then your method will update the original object, not a copy of it:
var projectModel = new ProjectModel(); // assume that ProjectModel is a ref type
projektyModel = SetGrid(projectModel);
bool sameObject = object.ReferenceEquals(projectModel, projektyModel); // true
Since you are passing in an object of a class that implements IGrid you could just as well change the return type to IGrid.
Also, since it's a reference type you don't even need to return the grid again. You could just as well use this:
var projectModel = new ProjectModel();
SetGrid(projectModel);
This is better accomplished with generics. You can use a constraint on the generic typeparam to preserve your type safety!
private T SetGrid<T>(T grid) where T : IGrid
{
grid.PagerHelper.SetPage(1, 10);
grid.SortHelper.SetSort(SortOperator.Ascending);
grid.PagerHelper.RecordsPerPage = 10;
return grid;
}
and then
var projectModel = new ProjectModel();
projectModel = SetGrid(projectModel);
Here, the generic typeparam "T" is actually inferred by the compiler by the way you call the method.
It's worth noting that in the particular use-case you've demonstrated, returning grid is probably unnecessary, as your original variable reference will be appropriately modified after the method call.
In the case you illustrate above, there is no need to return grid. Assuming the IGrid implementation is a class, the method receives a reference to the same object, so the changes made in the SetGrid method are visible through your projectModel variable.
If you still want to return the argument, at least return IGrid, since it is already known that the argument is an IGrid.
In general, provide as much type information as you can when programming in a statically typed language/manner.
"Is this a common practice or should I avoid doing this ?"
This is not common practice. You should avoid doing this.
Functions that only modify the parameter passed in should not have return types. It causes a bit of confusion. In current C# you could make the modifying function an extension method for better readability.
It causes an unnecessary cast of the return type. It's a performance decrease, which may not be noticeable, but it's still needless: since you are starting from an interface, return that interface, even if the object returned is different from the parameter passed in.
Returning object is confusing to users of the function. Let's say the function created a copy and returned the copy: you would still want to return the interface passed in, so that people using the function know "hey, I'm getting an IGrid back" instead of having to figure out on their own what type is being returned. The less you make your teammates think about stuff like this the better, for you and them.
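The extension-method suggestion could be sketched as below. The question does not show IGrid, PagerHelper, SortHelper or SortOperator, so the definitions here are assumptions made only to let the example compile; SetGridDefaults is a hypothetical method name.

```csharp
public enum SortOperator { Ascending, Descending }

// Assumed shape of the helper types used in the question.
public class PagerHelper
{
    public int Page { get; private set; }
    public int RecordsPerPage { get; set; }

    public void SetPage(int page, int recordsPerPage)
    {
        Page = page;
        RecordsPerPage = recordsPerPage;
    }
}

public class SortHelper
{
    public SortOperator Sort { get; private set; }

    public void SetSort(SortOperator sort)
    {
        Sort = sort;
    }
}

public interface IGrid
{
    PagerHelper PagerHelper { get; }
    SortHelper SortHelper { get; }
}

public static class GridExtensions
{
    // Returns void: the method only mutates the grid it is called on.
    public static void SetGridDefaults(this IGrid grid)
    {
        grid.PagerHelper.SetPage(1, 10);
        grid.SortHelper.SetSort(SortOperator.Ascending);
        grid.PagerHelper.RecordsPerPage = 10;
    }
}

public class ProjectModel : IGrid
{
    private readonly PagerHelper pagerHelper = new PagerHelper();
    private readonly SortHelper sortHelper = new SortHelper();

    public PagerHelper PagerHelper { get { return pagerHelper; } }
    public SortHelper SortHelper { get { return sortHelper; } }
}
```

The call site then reads `projectModel.SetGridDefaults();`, with no cast and no return value to assign.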
This is a very weird example because SetGrid doesn't seem to do a lot of things other than setting some defaults. You are also letting the code perform manipulation on the object that could very well do that by itself. Meaning IGrid and ProjectModel could be refactored to this:
public interface IGrid {
// ...
void setDefaults();
// ...
}
public class ProjectModel : IGrid {
// ...
public void setDefaults() {
PagerHelper.SetPage(1, 10);
SortHelper.SetSort(SortOperator.Ascending);
PagerHelper.RecordsPerPage = 10;
}
// ...
}
With this refactoring, all you need to write is:
myProjectModel.setDefaults();
You could also create an abstract base class that implements IGrid, including the setDefaults() method, and let ProjectModel extend that abstract class.
What about the SOLID principles? Specifically, the Single Responsibility Principle. The class is first and foremost something like a DTO. – user137348
I'm exercising the Interface Segregation Principle out of the SOLID principles here, to hide the implementation from the client of the class. I.e. so the client doesn't have to access the internals of the class it is using or else it is a violation of Principle of Least Knowledge.
The Single Responsibility Principle (SRP) only says that a class should have just one reason to change, which is a very vague restriction, since a change can be as narrow or as broad as you want it to be.
I believe it is okay to put some configuration logic in a parameter class if it is small enough. Otherwise I'd put it all in a factory class. The reason I suggest this solution is that IGrid seems to hold references to PagerHelper and SortHelper, which seem to be mutators for IGrid.
So I find it odd that you mention the class being a DTO. A DTO from a purist sense shouldn't have logic in it other than accessors (i.e. getter methods) which makes it strange that ProjectModel itself has references to PagerHelper and SortHelper which I assume can mutate it (i.e. they're setters). If you really want SRP the "helpers" should be in a factory class that creates the IGrid/ProjectModel instance.
Your grid is an IGrid, why not return IGrid?

Use new keyword if hiding was intended

I have the following snippet of code that's generating the "Use new keyword if hiding was intended" warning in VS2008:
public double Foo(double param)
{
return base.Foo(param);
}
The Foo() function in the base class is protected and I want to expose it to a unit test by putting it in wrapper class solely for the purpose of unit testing. I.e. the wrapper class will not be used for anything else. So one question I have is: is this accepted practice?
Back to the new warning. Why would I have to new the overriding function in this scenario?
The new just makes it absolutely clear that you know you are stomping over an existing method. Since the existing code was protected, it isn't as big a deal - you can safely add the new to stop it moaning.
The difference comes when your method does something different; any variable that references the derived class and calls Foo() would do something different (even with the same object) than one that references the base class and calls Foo():
SomeDerived obj = new SomeDerived();
obj.Foo(); // runs the new code
SomeBase objBase = obj; // still the same object
objBase.Foo(); // runs the old code
This could obviously have an impact on any existing code that knows about SomeDerived and calls Foo() - i.e. it is now running a completely different method.
Also, note that you could mark it protected internal and use [InternalsVisibleTo] to provide access to your unit tests (this is the most common use of [InternalsVisibleTo]); your unit tests can then access it directly, without the derived class.
The key is that you're not overriding the method. You're hiding it. If you were overriding it, you'd need the override keyword (at which point, unless it's virtual, the compiler would complain because you can't override a non-virtual method).
You use the new keyword to tell both the compiler and anyone reading the code, "It's okay, I know this is only hiding the base method and not overriding it - that's what I meant to do."
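The difference between hiding and overriding can be shown side by side. In this sketch SomeBase and SomeDerived follow the names used earlier in the thread, while the Bar method and the string return values are additions made for illustration.

```csharp
public class SomeBase
{
    public string Foo() { return "base"; }          // non-virtual
    public virtual string Bar() { return "base"; }  // virtual
}

public class SomeDerived : SomeBase
{
    // Hides SomeBase.Foo: which method runs depends on the static type of the reference.
    public new string Foo() { return "derived"; }

    // Overrides SomeBase.Bar: which method runs depends on the runtime type of the object.
    public override string Bar() { return "derived"; }
}
```

Given `SomeDerived obj = new SomeDerived(); SomeBase objBase = obj;`, the call `objBase.Foo()` runs the base implementation while `objBase.Bar()` runs the derived one, even though both references point to the same object.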
Frankly I think it's rarely a good idea to hide methods - I'd use a different method name, like Craig suggested - but that's a different discussion.
You're changing the visibility without the name. Call your function TestFoo and it will work. Yes, IMHO it's acceptable to subclass for this reason.
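A sketch of the renamed-wrapper approach, assuming a base class with a protected Foo method. PriceCalculator, its arithmetic and the wrapper name are hypothetical.

```csharp
public class PriceCalculator
{
    // Protected member that the unit test wants to exercise.
    protected double Foo(double param)
    {
        return param * 1.25;
    }
}

// Test-only wrapper: a differently named public method forwards to the protected one,
// so nothing is hidden and no "new" keyword is needed.
public class PriceCalculatorTestWrapper : PriceCalculator
{
    public double TestFoo(double param)
    {
        return Foo(param);
    }
}
```

Because TestFoo has its own name, the compiler has nothing to warn about, and production code that uses PriceCalculator is unaffected.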
You'll always find some tricky situations where the new keyword can be used for hiding, although it can be avoided most of the time.
However, I recently really needed this keyword, mainly because the language lacks the syntax to complete an existing accessor. For instance:
If you consider an old-fashioned class like:
KeyedCollection<TKey, TItem>
You will notice that the accessor for accessing the items through their index is:
TItem this[Int32 index] { get; set; }
It has both { get; set; }, which are of course mandatory because of the inheritance from IList<T> and Collection<T>, but there is only a { get; } for accessing the items through their keys. (I have some guesses about this design, and there are plenty of reasons for it, so please note that I picked KeyedCollection<TKey, TItem> for illustration purposes only.)
Anyway, there is only a getter for access by key:
TItem this[TKey key] { get; }
But what if I want to add { set; } support? Technically speaking, that is not so unreasonable, especially if you keep in mind that a property is just a method. One way is to explicitly implement another dummy interface, but if you want the member to be implicit, you have to use the new keyword: I hide the accessor definition, keep the base get definition, and just add a set with some custom logic to make it work.
I think the keyword is perfectly applicable in this very specific scenario, particularly since nothing changes in the { get; } part.
public new TItem this[TKey key]
{
get { return base... }
set { ... }
}
That's pretty much the only trick to avoid this sort of warning, because the compiler is suggesting that you may be hiding something without realizing what you are doing.
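To fill in the elided indexer above, here is one way it might be written for a concrete collection. Person and PersonCollection are hypothetical, and the replace-or-add behavior of the setter is an assumption about what the custom logic could do, not the original poster's code.

```csharp
using System.Collections.ObjectModel;

public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public class PersonCollection : KeyedCollection<string, Person>
{
    protected override string GetKeyForItem(Person item)
    {
        return item.Name;
    }

    // Hides the get-only key indexer of KeyedCollection and adds a setter.
    public new Person this[string key]
    {
        get { return base[key]; }
        set
        {
            if (Contains(key))
            {
                // Replace through the index-based setter so the internal
                // key lookup is updated by the base class.
                int index = IndexOf(base[key]);
                base[index] = value;
            }
            else
            {
                Add(value);
            }
        }
    }
}
```

The sketch assumes the key passed to the indexer matches the key of the assigned item; a stricter version could check that and throw if they differ.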

Immutable object pattern in C# - what do you think? [closed]

I have over the course of a few projects developed a pattern for creating immutable (readonly) objects and immutable object graphs. Immutable objects carry the benefit of being 100% thread safe and can therefore be reused across threads. In my work I very often use this pattern in Web applications for configuration settings and other objects that I load and cache in memory. Cached objects should always be immutable as you want to guarantee they are not unexpectedly changed.
Now, you can of course easily design immutable objects as in the following example:
public class SampleElement
{
private Guid id;
private string name;
public SampleElement(Guid id, string name)
{
this.id = id;
this.name = name;
}
public Guid Id
{
get { return id; }
}
public string Name
{
get { return name; }
}
}
This is fine for simple classes - but for more complex classes I do not fancy the concept of passing all values through a constructor. Having setters on the properties is more desirable and your code constructing a new object gets easier to read.
So how do you create immutable objects with setters?
Well, in my pattern objects start out as being fully mutable until you freeze them with a single method call. Once an object is frozen it will stay immutable forever - it cannot be turned into a mutable object again. If you need a mutable version of the object, you simply clone it.
Ok, now on to some code. I have in the following code snippets tried to boil the pattern down to its simplest form. The IElement is the base interface that all immutable objects must ultimately implement.
public interface IElement : ICloneable
{
bool IsReadOnly { get; }
void MakeReadOnly();
}
The Element class is the default implementation of the IElement interface:
public abstract class Element : IElement
{
private bool immutable;
public bool IsReadOnly
{
get { return immutable; }
}
public virtual void MakeReadOnly()
{
immutable = true;
}
protected virtual void FailIfImmutable()
{
if (immutable) throw new ImmutableElementException(this);
}
...
}
Let's refactor the SampleElement class above to implement the immutable object pattern:
public class SampleElement : Element
{
private Guid id;
private string name;
public SampleElement() {}
public Guid Id
{
get
{
return id;
}
set
{
FailIfImmutable();
id = value;
}
}
public string Name
{
get
{
return name;
}
set
{
FailIfImmutable();
name = value;
}
}
}
You can now change the Id property and the Name property as long as the object has not been marked as immutable by calling the MakeReadOnly() method. Once it is immutable, calling a setter will yield an ImmutableElementException.
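The freeze-then-clone life cycle described above can be sketched end to end as follows. This is only a minimal illustration: the original elides both ImmutableElementException and the Clone() implementation, so the exception class and the shallow MemberwiseClone-based Clone() below are assumptions, and the IElement interface is left out for brevity.

```csharp
using System;

// Assumption: the original does not show this exception type.
public class ImmutableElementException : InvalidOperationException
{
    public ImmutableElementException(object element)
        : base(element.GetType().Name + " is read-only.") { }
}

public abstract class Element : ICloneable
{
    private bool immutable;

    public bool IsReadOnly { get { return immutable; } }

    public void MakeReadOnly() { immutable = true; }

    protected void FailIfImmutable()
    {
        if (immutable) throw new ImmutableElementException(this);
    }

    // Assumption: a shallow copy is enough here because the sample only
    // holds a Guid and a string. The copy starts out mutable again.
    public virtual object Clone()
    {
        Element copy = (Element)MemberwiseClone();
        copy.immutable = false;
        return copy;
    }
}

public class SampleElement : Element
{
    private Guid id;
    private string name;

    public Guid Id
    {
        get { return id; }
        set { FailIfImmutable(); id = value; }
    }

    public string Name
    {
        get { return name; }
        set { FailIfImmutable(); name = value; }
    }
}

public static class Program
{
    public static void Main()
    {
        var element = new SampleElement { Id = Guid.NewGuid(), Name = "first" };
        element.MakeReadOnly();

        try
        {
            element.Name = "second";   // the setter refuses once frozen
        }
        catch (ImmutableElementException ex)
        {
            Console.WriteLine("Rejected: " + ex.Message);
        }

        var thawed = (SampleElement)element.Clone();
        thawed.Name = "second";        // the clone is mutable, the original is not
        Console.WriteLine(element.Name + " / " + thawed.Name);
    }
}
```

An object graph with nested elements would need a deeper Clone() than this one.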
Final note:
The full pattern is more complex than the code snippets shown here. It also contains support for collections of immutable objects and complete graphs of immutable objects. The full pattern enables you to make an entire object graph immutable by calling the MakeReadOnly() method on the outermost object. Once you start creating larger object models using this pattern, the risk of leaky objects increases. A leaky object is an object that fails to call the FailIfImmutable() method before making a change to the object. To test for leaks I have also developed a generic leak detector class for use in unit tests. It uses reflection to test whether all properties and methods throw the ImmutableElementException in the immutable state.
In other words TDD is used here.
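The leak detector itself is not shown, so the following is only a guess at its general shape, restricted to public properties (the real one reportedly covers methods too). LeakDetector, FindLeakyProperties and LeakyElement are hypothetical names, and the exception class is a stand-in for the one the original elides.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Assumption: stand-in for the exception type elided in the original.
public class ImmutableElementException : InvalidOperationException
{
    public ImmutableElementException(object element)
        : base(element.GetType().Name + " is read-only.") { }
}

public abstract class Element
{
    private bool immutable;
    public bool IsReadOnly { get { return immutable; } }
    public void MakeReadOnly() { immutable = true; }

    protected void FailIfImmutable()
    {
        if (immutable) throw new ImmutableElementException(this);
    }
}

// Hypothetical element with one guarded and one unguarded setter.
public class LeakyElement : Element
{
    private string safe;

    public string Safe
    {
        get { return safe; }
        set { FailIfImmutable(); safe = value; }
    }

    public string Leaky { get; set; }   // forgets to call FailIfImmutable()
}

public static class LeakDetector
{
    // Returns the names of public writable properties whose setter did NOT
    // throw ImmutableElementException on an already frozen element.
    public static List<string> FindLeakyProperties(Element frozen)
    {
        var leaks = new List<string>();
        PropertyInfo[] properties = frozen.GetType()
            .GetProperties(BindingFlags.Public | BindingFlags.Instance);

        foreach (PropertyInfo property in properties)
        {
            // Skip indexers and properties without a public setter.
            if (property.GetIndexParameters().Length > 0) continue;
            if (!property.CanRead || property.GetSetMethod() == null) continue;

            object current = property.GetValue(frozen, null);
            try
            {
                // Writing back the current value keeps the probe harmless.
                property.SetValue(frozen, current, null);
                leaks.Add(property.Name);
            }
            catch (TargetInvocationException ex)
            {
                // Reflection wraps exceptions thrown by the setter.
                if (!(ex.InnerException is ImmutableElementException)) throw;
            }
        }
        return leaks;
    }
}

public static class Program
{
    public static void Main()
    {
        var element = new LeakyElement();
        element.MakeReadOnly();
        Console.WriteLine(string.Join(", ", LeakDetector.FindLeakyProperties(element)));
    }
}
```

In a unit test you would assert that the returned list is empty for every element type in the model.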
I have grown to like this pattern a lot and find great benefits in it. So what I would like to know is if any of you are using similar patterns? If yes, do you know of any good resources that document it? I am essentially looking for potential improvements and for any standards that might already exist on this topic.
For info, the second approach is called "popsicle immutability".
Eric Lippert has a series of blog entries on immutability starting here. I'm still getting to grips with the CTP (C# 4.0), but it looks interesting what optional / named parameters (to the .ctor) might do here (when mapped to readonly fields)...
[update: I've blogged on this here]
For info, I probably wouldn't make those methods virtual - we probably don't want subclasses being able to make it non-freezable. If you want them to be able to add extra code, I'd suggest something like:
public void Freeze() // or protected, depending on who should be allowed to freeze
{
if(!frozen)
{
frozen = true;
OnFrozen();
}
}
protected virtual void OnFrozen() {} // subclass can add code here.
Also - AOP (such as PostSharp) might be a viable option for adding all those ThrowIfFrozen() checks.
(apologies if I have changed terminology / method names - SO doesn't keep the original post visible when composing replies)
Another option would be to create some kind of Builder class.
For an example, in Java (and C# and many other languages) String is immutable. If you want to do multiple operations to create a String you use a StringBuilder. This is mutable, and then once you're done you have it return to you the final String object. From then on it's immutable.
You could do something similar for your other classes. You have your immutable Element, and then an ElementBuilder. All the builder would do is store the options you set, then when you finalize it it constructs and returns the immutable Element.
It's a little more code, but I think it's cleaner than having setters on a class that's supposed to be immutable.
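A minimal sketch of that idea, using the Element and ElementBuilder names from above; the Build() method name and the two sample properties are assumptions made for illustration.

```csharp
using System;

// Immutable type: readonly fields, no setters.
public sealed class Element
{
    private readonly Guid id;
    private readonly string name;

    public Element(Guid id, string name)
    {
        this.id = id;
        this.name = name;
    }

    public Guid Id { get { return id; } }
    public string Name { get { return name; } }
}

// Mutable companion that only collects values until Build() is called.
public sealed class ElementBuilder
{
    public Guid Id { get; set; }
    public string Name { get; set; }

    // Hypothetical method name; each call produces a fresh immutable Element.
    public Element Build()
    {
        return new Element(Id, Name);
    }
}

public static class Program
{
    public static void Main()
    {
        var builder = new ElementBuilder { Id = Guid.NewGuid(), Name = "config" };
        Element element = builder.Build();

        builder.Name = "changed";            // affects later builds only
        Console.WriteLine(element.Name);     // still the value captured at Build()
    }
}
```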
After my initial discomfort about the fact that I had to create a new System.Drawing.Point on each modification, I wholly embraced the concept some years ago. In fact, I now create every field as readonly by default and only change it to be mutable if there's a compelling reason – which there is surprisingly rarely.
I don't care very much about cross-threading issues, though (I rarely use code where this is relevant). I just find it much, much better because of the semantic expressiveness. Immutability is the very epitome of an interface which is hard to use incorrectly.
You are still dealing with state, and thus can still be bitten if your objects are parallelized before being made immutable.
A more functional way might be to return a new instance of the object with each setter. Or create a mutable object and pass that in to the constructor.
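As a rough sketch of the first suggestion, each "setter" becomes a method that returns a modified copy. The WithId and WithName names are assumptions, not something the original prescribes.

```csharp
using System;

public sealed class SampleElement
{
    private readonly Guid id;
    private readonly string name;

    public SampleElement(Guid id, string name)
    {
        this.id = id;
        this.name = name;
    }

    public Guid Id { get { return id; } }
    public string Name { get { return name; } }

    // Hypothetical "setters": they never modify this instance.
    public SampleElement WithId(Guid newId)
    {
        return new SampleElement(newId, name);
    }

    public SampleElement WithName(string newName)
    {
        return new SampleElement(id, newName);
    }
}

public static class Program
{
    public static void Main()
    {
        var original = new SampleElement(Guid.NewGuid(), "first");
        SampleElement renamed = original.WithName("second");

        // The original is untouched, so it can be shared across threads freely.
        Console.WriteLine(original.Name + " / " + renamed.Name);
    }
}
```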
The (relatively) new software design paradigm called Domain-Driven Design makes the distinction between entity objects and value objects.
Entity objects are defined as anything that has to map to a key-driven object in a persistent data store, like an employee, or a client, or an invoice, etc... where changing the properties of the object implies that you need to save the change to a data store somewhere, and the existence of multiple instances of a class with the same "key" implies a need to synchronize them, or coordinate their persistence to the data store so that one instance's changes do not overwrite the others. Changing the properties of an entity object implies you are changing something about the object - not changing WHICH object you are referencing...
Value objects, on the other hand, are objects that can be considered immutable, whose utility is defined strictly by their property values, and for which multiple instances do not need to be coordinated in any way... like addresses, or telephone numbers, or the wheels on a car, or the letters in a document... these things are totally defined by their properties... an uppercase 'A' object in a text editor can be interchanged transparently with any other uppercase 'A' object throughout the document; you don't need a key to distinguish it from all the other 'A's. In this sense it is immutable, because if you change it to a 'B' (just like changing the phone number string in a phone number object), you are not changing the data associated with some mutable entity, you are switching from one value to another... just as when you change the value of a string...
Expanding on the point by @Cory Foy and @Charles Bretana that there is a difference between entities and values: whereas value objects should always be immutable, I really don't think that an object should be able to freeze itself, or allow itself to be frozen arbitrarily in the codebase. It has a really bad smell to it, and I worry that it could get hard to track down where exactly an object was frozen, and why it was frozen, given that between calls to an object it could change state from thawed to frozen.
That isn't to say that you never want to give a (mutable) entity to something and ensure it isn't going to be changed.
So, instead of freezing the object itself, another possibility is to copy the semantics of ReadOnlyCollection<T>:
List<int> list = new List<int> { 1, 2, 3};
ReadOnlyCollection<int> readOnlyList = list.AsReadOnly();
Your object can be handed out as mutable where that is needed, and as a read-only wrapper where you want it to be immutable.
Note that ReadOnlyCollection<T> also implements ICollection<T>, which has an Add(T item) method in the interface. However, there is also bool IsReadOnly { get; } defined in the interface so that consumers can check before calling a method that will throw an exception.
The difference is that you can't just set IsReadOnly to false. A collection either is or isn't read only, and that never changes for the lifetime of the collection.
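Applied to the sample element, the same idea might look roughly like this. ISampleElement, ReadOnlySampleElement and AsReadOnly() are hypothetical names modelled on ReadOnlyCollection<T>; as with that class, the wrapper is a live view of the underlying object rather than a snapshot.

```csharp
using System;

// Hypothetical shared interface, analogous to ICollection<T>.
public interface ISampleElement
{
    Guid Id { get; set; }
    string Name { get; set; }
    bool IsReadOnly { get; }
}

public class SampleElement : ISampleElement
{
    public Guid Id { get; set; }
    public string Name { get; set; }
    public bool IsReadOnly { get { return false; } }

    public ISampleElement AsReadOnly()
    {
        return new ReadOnlySampleElement(this);
    }
}

// Read-only view: getters forward to the wrapped element, setters throw.
public sealed class ReadOnlySampleElement : ISampleElement
{
    private readonly ISampleElement inner;

    public ReadOnlySampleElement(ISampleElement inner)
    {
        if (inner == null) throw new ArgumentNullException("inner");
        this.inner = inner;
    }

    public Guid Id
    {
        get { return inner.Id; }
        set { throw new NotSupportedException("Element is read-only."); }
    }

    public string Name
    {
        get { return inner.Name; }
        set { throw new NotSupportedException("Element is read-only."); }
    }

    // Fixed for the lifetime of the wrapper; there is no way to unset it.
    public bool IsReadOnly { get { return true; } }
}

public static class Program
{
    public static void Main()
    {
        var element = new SampleElement { Id = Guid.NewGuid(), Name = "first" };
        ISampleElement view = element.AsReadOnly();

        element.Name = "second";            // the owner can still mutate
        Console.WriteLine(view.Name);       // the view reflects the change

        if (!view.IsReadOnly) view.Name = "third";   // consumers check first
    }
}
```

Whoever keeps the original reference can still change the object, so this protects against consumers of the wrapper, not against the owner.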
It would be nice at times to have the const-correctness that C++ gives you at compile time, but that starts to have its own set of problems and I'm glad C# doesn't go there.
ICloneable - I thought I'd just refer back to the following:
Do not implement ICloneable
Do not use ICloneable in public APIs
Brad Abrams - Design Guidelines, Managed code and the .NET Framework
System.String is a good example of an immutable class with setters and mutating methods, only that each mutating method returns a new instance.
This is an important problem, and I'd love to see more direct framework/language support to solve it. The solution you have requires a lot of boilerplate. It might be simple to automate some of the boilerplate by using code generation.
You'd generate a partial class that contains all the freezable properties. It would be fairly simple to make a reusable T4 template for this.
The template would take this for input:
namespace
class name
list of property name/type tuples
And would output a C# file, containing:
namespace declaration
partial class
each of the properties, with the corresponding types, a backing field, a getter, and a setter which invokes the FailIfFrozen method
AOP tags on freezable properties could also work, but it would require more dependencies, whereas T4 is built into newer versions of Visual Studio.
Another scenario which is very much like this is the INotifyPropertyChanged interface. Solutions for that problem are likely to be applicable to this problem.
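One trick commonly used with INotifyPropertyChanged is a generic helper in the base class that takes the backing field by ref, which shrinks each setter to one line without any code generation. A sketch of how that could carry over is below; SetField is a hypothetical name, and InvalidOperationException stands in for the question's ImmutableElementException.

```csharp
using System;

public abstract class Element
{
    private bool immutable;

    public bool IsReadOnly { get { return immutable; } }

    public void MakeReadOnly() { immutable = true; }

    // Hypothetical helper: performs the frozen check and the assignment in
    // one place, so a property setter cannot forget the check.
    protected void SetField<T>(ref T field, T value)
    {
        if (immutable)
            throw new InvalidOperationException("Element is read-only.");
        field = value;
    }
}

public class SampleElement : Element
{
    private Guid id;
    private string name;

    public Guid Id
    {
        get { return id; }
        set { SetField(ref id, value); }
    }

    public string Name
    {
        get { return name; }
        set { SetField(ref name, value); }
    }
}

public static class Program
{
    public static void Main()
    {
        var element = new SampleElement { Id = Guid.NewGuid(), Name = "first" };
        element.MakeReadOnly();

        try
        {
            element.Name = "second";
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```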
My problem with this pattern is that you're not imposing any compile-time constraints upon immutability. The coder is responsible for making sure an object is set to immutable before, for example, adding it to a cache or another non-thread-safe structure.
That's why I would extend this coding pattern with a compile-time constraint in the form of a generic class, like this:
public class Immutable<T> where T : IElement
{
private T value;
public Immutable(T mutable)
{
this.value = (T) mutable.Clone();
this.value.MakeReadOnly();
}
public T Value
{
get
{
return this.value;
}
}
public static implicit operator Immutable<T>(T mutable)
{
return new Immutable<T>(mutable);
}
public static implicit operator T(Immutable<T> immutable)
{
return immutable.value;
}
}
Here's a sample of how you would use this:
// All elements of this list are guaranteed to be immutable
List<Immutable<SampleElement>> elements =
new List<Immutable<SampleElement>>();
for (int i = 1; i < 10; i++)
{
SampleElement newElement = new SampleElement();
newElement.Id = Guid.NewGuid();
newElement.Name = "Sample" + i.ToString();
// The compiler will automatically convert to Immutable<SampleElement> for you
// because of the implicit conversion operator
elements.Add(newElement);
}
foreach (SampleElement element in elements)
Console.Out.WriteLine(element.Name);
elements[3].Value.Id = Guid.NewGuid(); // This will throw an ImmutableElementException
Just a tip to simplify the element properties: Use automatic properties with private set and avoid explicitly declaring the data field. e.g.
public class SampleElement {
public SampleElement(Guid id, string name) {
Id = id;
Name = name;
}
public Guid Id {
get; private set;
}
public string Name {
get; private set;
}
}
Here is a new video on Channel 9 in which Anders Hejlsberg, from 36:30 in the interview, starts talking about immutability in C#. He gives a very good use case for popsicle immutability and explains how this is something you are currently required to implement yourself. It was music to my ears hearing him say it is worth thinking about better support for creating immutable object graphs in future versions of C#.
Expert to Expert: Anders Hejlsberg - The Future of C#
Two other options for your particular problem that haven't been discussed:
Build your own deserializer, one that can call a private property setter. While the effort in building the deserializer at the beginning will be much more, it makes things cleaner. The compiler will keep you from even attempting to call the setters and the code in your classes will be easier to read.
Put a constructor in each class that takes an XElement (or some other flavor of XML object model) and populates itself from it. Obviously as the number of classes increases, this quickly becomes less desirable as a solution.
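The second option could look something like the sketch below. The XML layout (an id attribute and a name child element) is purely an assumption for illustration; the explicit casts are the conversion operators that System.Xml.Linq provides on XAttribute and XElement.

```csharp
using System;
using System.Xml.Linq;

public class SampleElement
{
    // Assumed XML shape: <element id="..."><name>...</name></element>
    public SampleElement(XElement xml)
    {
        if (xml == null) throw new ArgumentNullException("xml");

        Id = (Guid)xml.Attribute("id");       // throws if the attribute is missing
        Name = (string)xml.Element("name");   // null if the element is missing
    }

    public Guid Id { get; private set; }
    public string Name { get; private set; }
}

public static class Program
{
    public static void Main()
    {
        XElement xml = XElement.Parse(
            "<element id=\"6f9619ff-8b86-d011-b42d-00c04fc964ff\">" +
            "<name>Sample</name></element>");

        var element = new SampleElement(xml);
        Console.WriteLine(element.Id + " " + element.Name);
    }
}
```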
How about having an abstract class ThingBase, with subclasses MutableThing and ImmutableThing? ThingBase would contain all the data in a protected structure, providing public read-only properties for the fields and protected read-only property for its structure. It would also provide an overridable AsImmutable method which would return an ImmutableThing.
MutableThing would shadow the properties with read/write properties, and provide both a default constructor and a constructor that accepts a ThingBase.
ImmutableThing would be a sealed class that overrides AsImmutable to simply return itself. It would also provide a constructor that accepts a ThingBase.
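One possible reading of that design is sketched below. The Size and Label members are made up for illustration, and a protected field is used for the data structure (rather than the read-only property described) so that MutableThing can write to it. Copying goes through a protected base constructor because C# does not let a derived class read protected members through a reference typed as ThingBase.

```csharp
using System;

public abstract class ThingBase
{
    // All state lives in one struct, so copying a thing is one assignment.
    protected struct ThingData
    {
        public int Size;
        public string Label;
    }

    protected ThingData data;

    protected ThingBase() { }

    protected ThingBase(ThingBase source)
    {
        if (source == null) throw new ArgumentNullException("source");
        data = source.data;   // struct assignment copies every field
    }

    public int Size { get { return data.Size; } }
    public string Label { get { return data.Label; } }

    public virtual ImmutableThing AsImmutable()
    {
        return new ImmutableThing(this);
    }
}

public class MutableThing : ThingBase
{
    public MutableThing() { }
    public MutableThing(ThingBase source) : base(source) { }

    // Shadow the read-only properties with read/write versions.
    public new int Size
    {
        get { return data.Size; }
        set { data.Size = value; }
    }

    public new string Label
    {
        get { return data.Label; }
        set { data.Label = value; }
    }
}

public sealed class ImmutableThing : ThingBase
{
    public ImmutableThing(ThingBase source) : base(source) { }

    // Already immutable, so no copy is needed.
    public override ImmutableThing AsImmutable() { return this; }
}

public static class Program
{
    public static void Main()
    {
        var mutable = new MutableThing { Size = 1, Label = "draft" };
        ImmutableThing snapshot = mutable.AsImmutable();

        mutable.Size = 2;   // does not affect the snapshot
        Console.WriteLine(snapshot.Size + " / " + mutable.Size);

        var editable = new MutableThing(snapshot);   // back to a mutable copy
        editable.Label = "revised";
        Console.WriteLine(snapshot.Label + " / " + editable.Label);
    }
}
```

Code that only needs to read can accept a ThingBase, while code that must hold an unchanging value can require an ImmutableThing in its signature, which gives a compile-time guarantee.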
I don't like the idea of being able to change an object from a mutable to an immutable state; that kind of seems to defeat the point of the design to me. When do you need to do that? Only objects which represent VALUES should be immutable.
You can use optional named arguments together with nullables to make an immutable setter with very little boilerplate. If you really do want to set a property to null then you may have some more troubles.
class Foo{
...
public Foo
Set
( double? majorBar=null
, double? minorBar=null
, int? cats=null
, double? dogs=null)
{
return new Foo
( majorBar ?? MajorBar
, minorBar ?? MinorBar
, cats ?? Cats
, dogs ?? Dogs);
}
public Foo
( double R
, double r
, int l
, double e
)
{
....
}
}
You would use it like so
var f = new Foo(10,20,30,40);
var g = f.Set(cats: 99);
