How to serialize a struct that implements IEnumerable using protobuf-net and a surrogate - C#

I'm using the protobuf-net package version 3.0.101.
The following code generates a runtime exception when executing typeModel.Serialize(ms, value). The exception is:
GenericArguments[0], 'UserQuery+Option`1[System.Int32]', on 'ProtoBuf.Serializers.RepeatedSerializer`2[TCollection,T] CreateEnumerable[TCollection,T]()' violates the constraint of type 'TCollection'.
void Main()
{
    var typeModel = RuntimeTypeModel.Create();
    typeModel.SetSurrogate<Option<int>, OptionSurrogate<int>>();
    //typeModel[typeof(Option<int>)].IgnoreListHandling = true; // This doesn't help.
    var value = new Option<int>(true, 5);
    var ms = new MemoryStream();
    typeModel.Serialize(ms, value);
}
struct Option<T> : IEnumerable<T>
{
    public Option(bool hasValue, T value)
    {
        HasValue = hasValue;
        Value = value;
    }

    public readonly bool HasValue;
    public readonly T Value;

    public IEnumerator<T> GetEnumerator()
    {
        if (HasValue)
        {
            yield return Value;
        }
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
[ProtoContract]
struct OptionSurrogate<T>
{
    public OptionSurrogate(Option<T> thing)
    {
        HasValue = thing.HasValue;
        Value = thing.Value;
    }

    [ProtoMember(1)]
    public readonly bool HasValue;
    [ProtoMember(2)]
    public readonly T Value;

    public static implicit operator Option<T>(OptionSurrogate<T> surrogate) => new Option<T>(surrogate.HasValue, surrogate.Value);
    public static implicit operator OptionSurrogate<T>(Option<T> value) => new OptionSurrogate<T>(value);
}
There are two things that will fix this:
Changing struct Option<T> to class Option<T>
Removing IEnumerable<T> from struct Option<T>
However, neither of these is possible, because the struct I want to serialize is in a third-party library.
Is this a bug in protobuf-net or is there a workaround?

One possible workaround is to wrap the object in your own class, where you can control which properties get serialized and how.
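A minimal sketch of that idea (the OptionWrapper type and its members are hypothetical; conversion happens manually at the call site):

[ProtoContract]
class OptionWrapper
{
    [ProtoMember(1)] public bool HasValue;
    [ProtoMember(2)] public int Value;

    // Manual conversions; no surrogate registration needed, and since the wrapper
    // does not implement IEnumerable<T>, protobuf-net treats it as a plain contract type.
    public static OptionWrapper From(Option<int> option) =>
        new OptionWrapper { HasValue = option.HasValue, Value = option.Value };

    public Option<int> ToOption() => new Option<int>(HasValue, Value);
}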
So I pulled the protobuf-net repo and stepped through their surrogate handling for IEnumerable<T> (the Option<T>). Their implementation makes the heavy assumption that any IEnumerable<T> is an ICollection by default, which leads to this error. Despite the fact that you explicitly tell the runtime via SetSurrogate that Option<T> should not be serialized directly, their implementation does not check surrogates until after the item is packed into a more manageable enumerable type. The problem here is that they're trying to pack:
public IEnumerator<T> GetEnumerator()
{
    if (HasValue)
    {
        yield return Value;
    }
}
and that's not possible (in fact, they don't even look for this kind of IEnumerable implementation).
All that being said, I recommend just explicitly casting the objects before serialization and after deserialization; I provided an example that I got working below.
static void Main()
{
    var typeModel = RuntimeTypeModel.Create();
    var value = new Option<int>(true, 5);

    using (var writer = new FileInfo("test.txt").OpenWrite())
    {
        // Cast to the surrogate explicitly before serializing.
        typeModel.Serialize(writer, (OptionSurrogate<int>)value);
    }

    using var reader = new FileInfo("test.txt").OpenRead();
    Option<int> deserializedValue = (OptionSurrogate<int>)typeModel.Deserialize(reader, null, typeof(OptionSurrogate<int>));
}
Edit: Added a simple implementation of ICollection<T> to possibly prevent the constraint error.
Edit: Added a working workaround this time.
Editor's note: I am not an expert on protobuf-net, and my opinion of how and why this bug happens is based on some very simple debugging and stepping through their GitHub source. It should not be taken as fact, as I neither wrote protobuf-net nor have worked with its source code long enough to properly opine on the cause of this error.

Related

Exception: DataAnalysis.Reference+<>c__DisplayClass4 is not serializable

I am trying to serialize objects of class Reference at the end of my program. A serialization exception is thrown, which complains that "DataAnalysis.Reference+<>c__DisplayClass4" is not marked as serializable.
Initially I had the two delegates without the Serializable attribute, so I gave it a try, but it didn't change anything. The classes Cacheable and Operation are already marked as Serializable - and in fact the serialization of both of them worked perfectly fine before I introduced the Reference class.
I don't even know what c__DisplayClass4 means. So I am sorry, but I don't know which other parts of my 1 MB+ source code to post here to help you solve the problem, because in the end I would be posting everything.
As I said, everything worked fine before introducing the Reference class. So I am hoping the problem is somehow localized to it.
using System;
using System.Reflection;

namespace DataAnalysis
{
    /// <summary>
    /// Description of Reference.
    /// </summary>
    [Serializable]
    public class Reference
    {
        [Serializable]
        public delegate void ReferenceSetter(Operation op, Cacheable c);
        [Serializable]
        public delegate Cacheable ReferenceGetter(Operation op);

        readonly ReferenceGetter refGetter;
        readonly ReferenceSetter refSetter;

        public Reference(ReferenceGetter getter, ReferenceSetter setter)
        {
            refGetter = getter;
            refSetter = setter;
        }

        public Reference(FieldInfo operationField)
        {
            refGetter = (op => (Cacheable)operationField.GetValue(op));
            refSetter = ((op, value) => operationField.SetValue(op, value));
        }

        public Cacheable this[Operation op]
        {
            get { return refGetter(op); }
            set { refSetter(op, value); }
        }
    }
}
Edit: I have chosen taffer's first solution (avoid using the FieldInfo inside a delegate):
public class Reference
{
    public delegate void ReferenceSetter(Operation op, Cacheable c);
    public delegate Cacheable ReferenceGetter(Operation op);

    readonly FieldInfo opField;
    readonly ReferenceGetter refGetter;
    readonly ReferenceSetter refSetter;

    public Reference(ReferenceGetter getter, ReferenceSetter setter)
    {
        refGetter = getter;
        refSetter = setter;
    }

    public Reference(FieldInfo operationField)
    {
        opField = operationField;
    }

    public Cacheable this[Operation op]
    {
        get
        {
            if (opField != null) return (Cacheable)opField.GetValue(op);
            else return refGetter(op);
        }
        set
        {
            if (opField != null) opField.SetValue(op, value);
            else refSetter(op, value);
        }
    }
}
It's not polished yet; I will probably end up using an abstract Reference class with two implementations. But the principle is clear.
You get the error because of the way you initialize your fields in the second constructor:
public Reference(FieldInfo operationField)
{
    // operationField is captured in the lambdas below, which causes the compiler
    // to generate an inner class in which operationField becomes a field, so that
    // it can be accessed by the methods holding the lambda bodies
    refGetter = (op => (Cacheable)operationField.GetValue(op));
    refSetter = ((op, value) => operationField.SetValue(op, value));
}
Solution 1:
Do not capture locals and parameters of the enclosing method in the lambda. The field should rather be a parameter of the delegate.
Solution 2:
Implement ISerializable and provide a custom serialization:
void ISerializable.GetObjectData(SerializationInfo info, StreamingContext context)
{
    info.AddValue("getter", refGetter);
    info.AddValue("setter", refSetter);
}

// the special constructor needed for deserialization
private Reference(SerializationInfo info, StreamingContext context)
{
    refGetter = (ReferenceGetter)info.GetValue("getter", typeof(ReferenceGetter));
    refSetter = (ReferenceSetter)info.GetValue("setter", typeof(ReferenceSetter));
}
Please note that deserializing delegates of non-static methods can be problematic. Maybe you should check the Delegate.Method property in GetObjectData and throw an exception if the setter or getter is an instance method.
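For example, such a guard might look like this (a sketch, assuming the refGetter/refSetter fields shown above):

void ISerializable.GetObjectData(SerializationInfo info, StreamingContext context)
{
    // Delegates pointing at instance methods would drag their target objects
    // into the serialized graph, so refuse them up front.
    if (!refGetter.Method.IsStatic || !refSetter.Method.IsStatic)
        throw new SerializationException(
            "Reference can only be serialized when its getter and setter are static methods.");
    info.AddValue("getter", refGetter);
    info.AddValue("setter", refSetter);
}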

Trouble Converting Lambda Expression to Delegate Because of "some return types"

I'm writing a Linked List program in C# because I want to test how I feel about the language and I'm running into some serious difficulty. I'm trying to implement a Map method that functions like a Haskell map function (code below for both). However, I'm getting the error messages:
main.cs(43,66): error CS0029: Cannot implicitly convert type `void' to `MainClass.LinkedList<U>'
main.cs(43,33): error CS1662: Cannot convert `lambda expression' to delegate type `System.Func<MainClass.LinkedList<U>>' because some of the return types in the block are not implicitly convertible to the delegate return type
The relevant code in question:
Ideal Haskell code:
map :: [a] -> (a -> b) -> [b]
map (x:[]) f = (f x) : []
map (x:xs) f = (f x) : (map xs f)
C# code:
public class LinkedList<T> where T : class
{
    public T first;
    public LinkedList<T> rest;

    public LinkedList(T x) { this.first = x; }

    public void Join(LinkedList<T> xs)
    {
        Do(this.rest, () => this.rest.Join(xs), () => Assign(ref this.rest, xs));
    }

    public LinkedList<U> Map<U>(Func<T, U> f) where U : class
    {
        return DoR(this.rest, () => new LinkedList<U>(f(this.first)).Join(this.rest.Map(f)), () => new LinkedList<U>(f(this.first)));
    }

    public static void Assign<T>(ref T a, T b)
    {
        a = b;
    }

    public static U DoR<T, U>(T x, Func<U> f, Func<U> g)
    {
        if (x != null) { return f(); }
        else { return g(); }
    }

    public static void Do<T>(T x, Action f, Action g)
    {
        if (x != null) { f(); }
        else { g(); }
    }
}
While Assign, DoR (short for Do and Return), and Do seem like "code smell", they're what I came up with to avoid writing
if (x != null) {f();}
else {g();}
type statements (I'm used to pattern matching). If anybody has any better ideas, I'd love to know them, but mostly I'm concerned with the highlighted problem.
Starting with your immediate problem: the basic issue here is that you're mixing and matching lambda expressions that have either void return type or an actual return type. This can be addressed by changing your Join() method so that it returns the list used to call Join():
public LinkedList<T> Join(LinkedList<T> xs)
{
    Do(this.rest, () => this.rest.Join(xs), () => Assign(ref this.rest, xs));
    return this;
}
An alternative way would be to have a statement body lambda in the Map<U>() method that saves the new list to a variable and then returns that. But that adds a lot more code than just changing the Join() method, so it seems less preferable.
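For illustration, a sketch of that alternative (keeping the original void-returning Join()) could look like this:

public LinkedList<U> Map<U>(Func<T, U> f) where U : class
{
    return DoR(this.rest,
        () =>
        {
            // Statement-body lambda: both branches now return LinkedList<U>.
            var list = new LinkedList<U>(f(this.first));
            list.Join(this.rest.Map(f)); // Join returns void, so call it as a statement
            return list;
        },
        () => new LinkedList<U>(f(this.first)));
}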
That said, you seem to be abusing C# a bit here. Just as when writing code in a functional language, one should really make an effort to write real functional code, in the manner idiomatic to that language, so too should one make an effort when writing C# code to write real imperative code, in the manner idiomatic to C#.
Yes, C# has some functional-like features in it, but they don't generally have the same power as the features found in real functional languages, and they are intended to allow C# programmers to get the low-hanging fruit of functional styles of code without having to switch languages. One particular thing also to be aware of is that lambda expressions generate a lot more code than normal C# imperative code.
Sticking to more idiomatic C# code, the data structure you're implementing above can be written much more concisely, and in a manner that creates much more efficient code. That would look something like this:
class LinkedList<T>
{
    public T first;
    public LinkedList<T> rest;

    public LinkedList(T x) { first = x; }

    public void Join(LinkedList<T> xs)
    {
        if (rest != null) rest.Join(xs);
        else rest = xs;
    }

    public LinkedList<U> Map<U>(Func<T, U> f) where U : class
    {
        LinkedList<U> result = new LinkedList<U>(f(first));
        if (rest != null) result.Join(rest.Map(f));
        return result;
    }
}
(For what it's worth, I don't see the point of the generic type constraint on your Map<U>() method. Why restrict it like that?)
Now, all that said, it seems to me that if you do want a functional-style linked-list implementation in C#, it would make sense to make it an immutable list. I'm not familiar with Haskell, but from my limited use of functional languages generally, I have the impression that immutability is a common feature in functional language data types, if not enforced 100% (e.g. XSL). So if trying to reimplement functional language constructs in C#, why not follow that paradigm?
See, for example, Eric Lippert's answer in Efficient implementation of immutable (double) LinkedList. Or his excellent series of articles on immutability in C# (you can start here: Immutability in C# Part One: Kinds of Immutability), where you can get ideas for how to create various immutable collection types.
In browsing Stack Overflow for related posts, I found several that, while not directly applicable to your question, may still be of interest (I know I found them very interesting):
how can I create a truly immutable doubly linked list in C#?
Immutable or not immutable?
Doubly Linked List in a Purely Functional Programming Language
Why does the same algorithm work in Scala much slower than in C#? And how to make it faster?
Converting C# code to F# (if statement)
I like that last one mainly for the way that both the presentation of the question itself and the replies (answers and comments) illustrate why it's so important to avoid trying to just transliterate from one language to another, and instead to really become familiar with the way a language is designed to be used, and with how common data structures and algorithms are represented idiomatically in that language.
Addendum:
Inspired by Eric Lippert's rough draft of an immutable list type, I wrote a different version that includes the Join() method, as well as the ability to add elements at the front and end of the list:
abstract class ImmutableList<T> : IEnumerable<T>
{
    public static readonly ImmutableList<T> Empty = new EmptyList();

    public abstract IEnumerator<T> GetEnumerator();
    public abstract ImmutableList<T> AddLast(T t);
    public abstract ImmutableList<T> InsertFirst(T t);

    public ImmutableList<T> Join(ImmutableList<T> tail)
    {
        return new List(this, tail);
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }

    class EmptyList : ImmutableList<T>
    {
        public override ImmutableList<T> AddLast(T t)
        {
            return new LeafList(t);
        }

        public override IEnumerator<T> GetEnumerator()
        {
            yield break;
        }

        public override ImmutableList<T> InsertFirst(T t)
        {
            return AddLast(t);
        }
    }

    abstract class NonEmptyList : ImmutableList<T>
    {
        public override ImmutableList<T> AddLast(T t)
        {
            return new List(this, new LeafList(t));
        }

        public override ImmutableList<T> InsertFirst(T t)
        {
            return new List(new LeafList(t), this);
        }
    }

    class LeafList : NonEmptyList
    {
        private readonly T _value;

        public LeafList(T t)
        {
            _value = t;
        }

        public override IEnumerator<T> GetEnumerator()
        {
            yield return _value;
        }
    }

    class List : NonEmptyList
    {
        private readonly ImmutableList<T> _head;
        private readonly ImmutableList<T> _tail;

        public List(ImmutableList<T> head, ImmutableList<T> tail)
        {
            _head = head;
            _tail = tail;
        }

        public override IEnumerator<T> GetEnumerator()
        {
            return _head.Concat(_tail).GetEnumerator();
        }
    }
}
The public API is a little different from Eric's. You enumerate it to access the elements. The implementation is different as well; using a binary tree was how I enabled the Join() method.
Note that with the interface IEnumerable<T> implemented, one way to implement the Map<U>() method is to not do it at all and instead just use the built-in Enumerable.Select():
ImmutableList<T> list = ...; // whatever your list is
Func<T, U> map = ...; // whatever your projection is
IEnumerable<U> mapped = list.Select(map);
As long as the map function is relatively inexpensive, that would work fine. Any time mapped is enumerated, it will re-enumerate list, applying the map function. The mapped enumeration remains immutable, because it's based on the immutable list object.
There are probably other ways to do it (for that matter, I know of at least one other), but the above is what made the most sense to me conceptually.
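For instance, one eager alternative (just a sketch, and not necessarily the other way alluded to above) materializes the projection into a new ImmutableList<U>, so the map function runs exactly once per element:

static class ImmutableListExtensions
{
    public static ImmutableList<U> Map<T, U>(this ImmutableList<T> list, Func<T, U> map)
    {
        // Build the result with successive AddLast calls; each call returns a new list.
        ImmutableList<U> result = ImmutableList<U>.Empty;
        foreach (T item in list)
            result = result.AddLast(map(item));
        return result;
    }
}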

How can I access private List<T> members?

In general, in C# it is more convenient to use List<T> than T[]. However, there are times when the profiler shows that List<T> has significant performance penalties compared to natively-implemented bulk operations like Array.Copy and Buffer.BlockCopy. In addition, it is not possible to get pointers to List<T> elements.
This makes working with dynamic meshes in Unity somewhat painful. Some of these problems could be alleviated if we could access the backing array T[] List<T>._items. Is this possible to do without significant overhead (either CPU or garbage)?
If you know the layout of List, then you can use a dirty trick to cast managed object references. Do not use this unless you're willing to test on every target platform you run on, and re-test with every Unity upgrade.
The most dangerous thing about this is that it breaks invariants about the runtime and compiled type of the object. The compiler will generate code for an object of type TTo, but the object's RTTI field will still show an object of type TFrom.
[StructLayout(LayoutKind.Explicit)]
public struct ConvertHelper<TFrom, TTo>
    where TFrom : class
    where TTo : class
{
    [FieldOffset( 0)] public long before;
    [FieldOffset( 8)] public TFrom input;
    [FieldOffset(16)] public TTo output;

    public static TTo Convert(TFrom thing)
    {
        var helper = new ConvertHelper<TFrom, TTo> { input = thing };
        unsafe
        {
            long* dangerous = &helper.before;
            dangerous[2] = dangerous[1]; // ie, output = input
        }
        var ret = helper.output;
        helper.input = null;
        helper.output = null;
        return ret;
    }
}
class PublicList<T>
{
    public T[] _items;
}

public static T[] GetBackingArray<T>(this List<T> list)
{
    return ConvertHelper<List<T>, PublicList<T>>.Convert(list)._items;
}
Using reflection is always possible. This generates a few hundred bytes of garbage for the call to GetValue(). It is also not very fast; on the order of 40 List<T> accesses.
// Helper class for fetching and caching FieldInfo values
class FieldLookup
{
    string sm_name;
    Dictionary<Type, FieldInfo> sm_cache;

    public FieldLookup(string name)
    {
        sm_name = name;
        sm_cache = new Dictionary<Type, FieldInfo>();
    }

    public FieldInfo Get(Type t)
    {
        try
        {
            return sm_cache[t];
        }
        catch (KeyNotFoundException)
        {
            var field = sm_cache[t] = t.GetField(
                sm_name,
                System.Reflection.BindingFlags.NonPublic |
                System.Reflection.BindingFlags.GetField |
                System.Reflection.BindingFlags.Instance);
            return field;
        }
    }
}
static FieldLookup sm_items = new FieldLookup("_items");

public static T[] GetBackingArray<T>(this List<T> list)
{
    return (T[])sm_items.Get(typeof(List<T>)).GetValue(list);
}

Can I use a collection initializer for an Attribute?

Can an attribute in C# be used with a collection initializer?
For example, I'd like to do something like the following:
[DictionaryAttribute(){{"Key", "Value"}, {"Key", "Value"}}]
public class Foo { ... }
I know attributes can have named parameters, and since that seems pretty similar to object initializers, I was wondering if collection initializers were available as well.
Update: I'm sorry, I was mistaken - passing an array of a custom type is impossible :(
The types of positional and named parameters for an attribute class
are limited to the attribute parameter types, which are:
One of the following types: bool, byte, char, double, float,
int, long, short, string.
The type object.
The type System.Type.
An Enum type, provided it has public accessibility and the types
in which it is nested (if any) also have public accessibility (Section
17.2).
Single-dimensional arrays of the above types.
Source: the C# language specification, as quoted on Stack Overflow.
You CAN declare an attribute constructor that takes an array of a custom type:
class TestType
{
    public int Id { get; set; }
    public string Value { get; set; }

    public TestType(int id, string value)
    {
        Id = id;
        Value = value;
    }
}

class TestAttribute : Attribute
{
    public TestAttribute(params TestType[] array)
    {
        //
    }
}
but compilation errors occur where the attribute is applied:
[Test(new[]{new TestType(1, "1"), new TestType(2, "2"), })]
public void Test()
{
}
Section 17.1.3 of the C# 4.0 specification specifically does not allow for multidimensional arrays inside the attribute parameters, so while Foo(string[,] bar) might allow you to call Foo(new [,] {{"a", "b"}, {"key2", "val2"}}), it is unfortunately, not available for attributes.
So with that in mind, a few possibilities to approximate what you want are:
Use a single-dimensional array, with alternating key and value pairs. The obvious downside to this approach is that it's not exactly enforcing names and values.
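A rough sketch of that first approach (the attribute name and shape here are made up):

[AttributeUsage(AttributeTargets.Class)]
public class KeyValuePairsAttribute : Attribute
{
    public readonly Dictionary<string, string> Pairs = new Dictionary<string, string>();

    public KeyValuePairsAttribute(params string[] alternatingKeysAndValues)
    {
        // Even indices are keys, odd indices are values.
        if (alternatingKeysAndValues.Length % 2 != 0)
            throw new ArgumentException("Expected an even number of arguments.");
        for (int i = 0; i < alternatingKeysAndValues.Length; i += 2)
            Pairs[alternatingKeysAndValues[i]] = alternatingKeysAndValues[i + 1];
    }
}

// Usage: [KeyValuePairs("key1", "val1", "key2", "val2")]
public class Foo { ... }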
Allow your parameter to appear multiple times by tagging your attribute definition with the following attribute:
[AttributeUsage(AttributeTargets.All, AllowMultiple = true)]
In this way, you can now define:
[KeyVal("key1","val1"), KeyVal("key2","val2")]
public class Foo { ... }
This is a bit wordier than what I'm sure you were hoping for, but it makes a clear delineation between names and values.
Find a JSON package and provide an initializer for your attribute. The performance hit is inconsequential as this is done during code initialization. Using Newtonsoft.Json, for instance, you could make an attribute like so:
public class JsonAttribute : Attribute
{
    Dictionary<string, string> _nameValues = new Dictionary<string, string>();

    public JsonAttribute(string jsoninit)
    {
        dynamic obj = JsonConvert.DeserializeObject(jsoninit);
        foreach (var item in obj)
            _nameValues[item.Name] = item.Value.Value;
    }
}
Which would then allow you to instantiate an attribute like so:
[Json(#"{""key1"":""val1"", ""key2"":""val2""}")]
public class Foo { ... }
I know it's a little quote-happy, a lot more involved, but there you are. Regardless, in this crazy dynamic world, knowing how to initialize objects with JSON isn't a bad skill to have in your back pocket.
The short answer is no.
Longer answer: In order for a class to support collection initializers, it needs to implement IEnumerable and it needs to have an add method. So for example:
public class MyClass<T, U> : IEnumerable<T>
{
    public void Add(T t, U u)
    {
    }

    public IEnumerator<T> GetEnumerator()
    {
        throw new NotImplementedException();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
I can then do this:
var mc = new MyClass<int, string> {{1, ""}, {2, ""}};
So using this, let's try to make it work for an attribute. (side note, since attributes don't support generics, I'm just hardcoding it using strings for testing) :
public class CollectionInitAttribute : Attribute, IEnumerable<string>
{
    public void Add(string s1, string s2)
    {
    }

    public IEnumerator<string> GetEnumerator()
    {
        throw new NotImplementedException();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
And now to test it:
[CollectionInit{{"1","1"}}]
public class MyClass
{
}
and that doesn't compile :( I'm not sure where the limitation is exactly; I'm guessing attributes aren't newed up the same way a regular object is, and therefore this isn't supported. I'd be curious whether this could theoretically be supported by a future version of the language...

I need to implement C# deep copy constructors with inheritance. What patterns are there to choose from?

I wish to implement a deep copy of my class hierarchy in C#:
public class ParentObj : ICloneable
{
    protected int myA;

    public virtual Object Clone()
    {
        ParentObj newObj = new ParentObj();
        newObj.myA = this.myA;
        return newObj;
    }
}

public class ChildObj : ParentObj
{
    protected int myB;

    public override Object Clone()
    {
        ParentObj newObj = (ParentObj)base.Clone();
        newObj.myB = this.myB; // won't compile: base.Clone() newed up a ParentObj
        return newObj;
    }
}
This will not work: when cloning the Child, only a Parent is newed up. In my code some classes have large hierarchies.
What is the recommended way of doing this? Cloning everything at each level without calling the base class seems wrong. There must be some neat solutions to this problem; what are they?
Can I thank everyone for their answers. It was really interesting to see some of the approaches. I think it would be good if someone gave an example of a reflection answer for completeness. +1 awaiting!
The typical approach is to use the "copy constructor" pattern a la C++:
class Base : ICloneable
{
    int x;

    protected Base(Base other)
    {
        x = other.x;
    }

    public virtual object Clone()
    {
        return new Base(this);
    }
}

class Derived : Base
{
    int y;

    protected Derived(Derived other)
        : base(other)   // constructor initializers use the 'base' keyword
    {
        y = other.y;
    }

    public override object Clone()
    {
        return new Derived(this);
    }
}
The other approach is to use Object.MemberwiseClone in the implementation of Clone - this will ensure that result is always of the correct type, and will allow overrides to extend:
class Base : ICloneable
{
    List<int> xs;

    public virtual object Clone()
    {
        // MemberwiseClone returns object, so the cast is required.
        Base result = (Base)this.MemberwiseClone();
        // xs points to same List object here, but we want
        // a new List object with copy of data
        result.xs = new List<int>(xs);
        return result;
    }
}
class Derived : Base
{
    List<int> ys;

    public override object Clone()
    {
        // Cast is legal, because MemberwiseClone() will use the
        // actual type of the object to instantiate the copy.
        Derived result = (Derived)base.Clone();
        // ys points to same List object here, but we want
        // a new List object with copy of data
        result.ys = new List<int>(ys);
        return result;
    }
}
Both approaches require that all classes in the hierarchy follow the pattern. Which one to use is a matter of preference.
If you just have any random class implementing ICloneable with no guarantees on implementation (aside from following the documented semantics of ICloneable), there's no way to extend it.
Try the serialization trick:
public object Clone(object toClone)
{
    BinaryFormatter bf = new BinaryFormatter();
    MemoryStream ms = new MemoryStream();
    bf.Serialize(ms, toClone);
    ms.Flush();
    ms.Position = 0;
    return bf.Deserialize(ms);
}
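For convenience, the same trick can be wrapped generically (a sketch; it assumes the type and everything it references is marked [Serializable]):

public static T DeepClone<T>(T toClone)
{
    BinaryFormatter bf = new BinaryFormatter();
    using (MemoryStream ms = new MemoryStream())
    {
        bf.Serialize(ms, toClone);
        ms.Position = 0; // rewind before reading the copy back
        return (T)bf.Deserialize(ms);
    }
}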
WARNING:
This code should be used with a great deal of caution. Use at your own risk. This example is provided as-is and without a warranty of any kind.
There is one other way to perform a deep clone on an object graph. It is important to be aware of the following when considering using this sample:
Cons:
Any references to external classes will also be cloned unless those references are provided to the Clone(object, ...) method.
No constructors will be executed on cloned objects; they are reproduced EXACTLY as they are.
No ISerializable or serialization constructors will be executed.
There is no way to alter the behavior of this method on a specific type.
It WILL clone everything, Stream, AppDomain, Form, whatever, and those will likely break your application in horrific ways.
It could break with future runtime changes, whereas the serialization method is much more likely to continue working.
The implementation below uses recursion and can easily cause a stack overflow if your object graph is too deep.
So why would you want to use it?
Pros:
It does a complete deep-copy of all instance data with no coding required in the object.
It preserves all object graph references (even circular) in the reconstituted object.
It executes more than 20 times faster than the BinaryFormatter, with less memory consumption.
It requires nothing, no attributes, implemented interfaces, public properties, nothing.
Code Usage:
You just call it with an object:
Class1 copy = Clone(myClass1);
Or let's say you have a child object and you are subscribed to its events... Now you want to clone that child object. By providing a list of objects not to clone, you can preserve some portion of the object graph:
Class1 copy = Clone(myClass1, this);
Implementation:
Now let's get the easy stuff out of the way first... Here is the entry point:
public static T Clone<T>(T input, params object[] stableReferences)
{
    Dictionary<object, object> graph = new Dictionary<object, object>(new ReferenceComparer());
    foreach (object o in stableReferences)
        graph.Add(o, o);
    return InternalClone(input, graph);
}
Now that is simple enough, it just builds a dictionary map for the objects during the clone and populates it with any object that should not be cloned. You will note the comparer provided to the dictionary is a ReferenceComparer, let's take a look at what it does:
class ReferenceComparer : IEqualityComparer<object>
{
    bool IEqualityComparer<object>.Equals(object x, object y)
    { return Object.ReferenceEquals(x, y); }

    int IEqualityComparer<object>.GetHashCode(object obj)
    { return RuntimeHelpers.GetHashCode(obj); }
}
That was easy enough: just a comparer that forces the use of System.Object's hash code and reference equality... now comes the hard work:
private static T InternalClone<T>(T input, Dictionary<object, object> graph)
{
    if (input == null || input is string || input.GetType().IsPrimitive)
        return input;

    Type inputType = input.GetType();
    object exists;
    if (graph.TryGetValue(input, out exists))
        return (T)exists;

    if (input is Array)
    {
        Array arItems = (Array)((Array)(object)input).Clone();
        graph.Add(input, arItems);
        for (long ix = 0; ix < arItems.LongLength; ix++)
            arItems.SetValue(InternalClone(arItems.GetValue(ix), graph), ix);
        return (T)(object)arItems;
    }
    else if (input is Delegate)
    {
        Delegate original = (Delegate)(object)input;
        Delegate result = null;
        foreach (Delegate fn in original.GetInvocationList())
        {
            Delegate fnNew;
            if (graph.TryGetValue(fn, out exists))
                fnNew = (Delegate)exists;
            else
            {
                fnNew = Delegate.CreateDelegate(input.GetType(), InternalClone(original.Target, graph), original.Method, true);
                graph.Add(fn, fnNew);
            }
            result = Delegate.Combine(result, fnNew);
        }
        graph.Add(input, result);
        return (T)(object)result;
    }
    else
    {
        Object output = FormatterServices.GetUninitializedObject(inputType);
        if (!inputType.IsValueType)
            graph.Add(input, output);
        MemberInfo[] fields = inputType.GetFields(BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Instance);
        object[] values = FormatterServices.GetObjectData(input, fields);
        for (int i = 0; i < values.Length; i++)
            values[i] = InternalClone(values[i], graph);
        FormatterServices.PopulateObjectMembers(output, fields, values);
        return (T)output;
    }
}
You will notice right off the special cases for array and delegate copies. Each has its own reason: Array does not have 'members' that can be cloned, so you have to handle it by depending on the shallow Clone() member and then cloning each element. As for the delegate, it may work without the special case; however, this is far safer since it's not duplicating things like RuntimeMethodHandle and the like. If you intend to include other things from the core runtime in your hierarchy (like System.Type), I suggest you handle them explicitly in similar fashion.
The last case, and the most common, is simply to use roughly the same routines that are used by the BinaryFormatter. These allow us to pop all the instance fields (public or private) out of the original object, clone them, and stick them into an empty object. The nice thing here is that GetUninitializedObject returns a new instance that has not had the ctor run on it; running the ctor could cause issues and would slow the performance.
Whether the above works or not will highly depend upon your specific object graph and the data therein. If you control the objects in the graph and know that they are not referencing silly things like a Thread then the above code should work very well.
Testing:
Here is what I wrote to originally test this:
class Test
{
    public Test(string name, params Test[] children)
    {
        Print = (Action<StringBuilder>)Delegate.Combine(
            new Action<StringBuilder>(delegate(StringBuilder sb) { sb.AppendLine(this.Name); }),
            new Action<StringBuilder>(delegate(StringBuilder sb) { sb.AppendLine(this.Name); })
        );
        Name = name;
        Children = children;
    }

    public string Name;
    public Test[] Children;
    public Action<StringBuilder> Print;
}

static void Main(string[] args)
{
    Dictionary<string, Test> data2, data = new Dictionary<string, Test>(StringComparer.OrdinalIgnoreCase);
    Test a, b, c;
    data.Add("a", a = new Test("a", new Test("a.a")));
    a.Children[0].Children = new Test[] { a };
    data.Add("b", b = new Test("b", a));
    data.Add("c", c = new Test("c"));

    data2 = Clone(data);
    Assert.IsFalse(Object.ReferenceEquals(data, data2));
    //basic contents test & comparer
    Assert.IsTrue(data2.ContainsKey("a"));
    Assert.IsTrue(data2.ContainsKey("A"));
    Assert.IsTrue(data2.ContainsKey("B"));
    //nodes are different between data and data2
    Assert.IsFalse(Object.ReferenceEquals(data["a"], data2["a"]));
    Assert.IsFalse(Object.ReferenceEquals(data["a"].Children[0], data2["a"].Children[0]));
    Assert.IsFalse(Object.ReferenceEquals(data["B"], data2["B"]));
    Assert.IsFalse(Object.ReferenceEquals(data["B"].Children[0], data2["B"].Children[0]));
    Assert.IsFalse(Object.ReferenceEquals(data["B"].Children[0], data2["A"]));
    //graph intra-references still intact?
    Assert.IsTrue(Object.ReferenceEquals(data["B"].Children[0], data["A"]));
    Assert.IsTrue(Object.ReferenceEquals(data2["B"].Children[0], data2["A"]));
    Assert.IsTrue(Object.ReferenceEquals(data["A"].Children[0].Children[0], data["A"]));
    Assert.IsTrue(Object.ReferenceEquals(data2["A"].Children[0].Children[0], data2["A"]));
    data2["A"].Name = "anew";
    StringBuilder sb = new StringBuilder();
    data2["A"].Print(sb);
    Assert.AreEqual("anew\r\nanew\r\n", sb.ToString());
}
Final Note:
Honestly, it was a fun exercise at the time. It is generally a great thing to have deep cloning on a data model. Today's reality is that most data models are generated, which obsoletes the hackery above in favor of a generated deep-clone routine. I highly recommend generating your data model and its ability to perform deep clones rather than using the code above.
The best way is by serializing your object, then returning the deserialized copy. It will pick up everything about your object, except members marked as non-serializable, and makes inheriting serialization easy.
[Serializable]
public class ParentObj : ICloneable
{
    private int myA;

    [NonSerialized]
    private object somethingInternal;

    public virtual object Clone()
    {
        MemoryStream ms = new MemoryStream();
        BinaryFormatter formatter = new BinaryFormatter();
        formatter.Serialize(ms, this);
        ms.Position = 0; // rewind the stream before deserializing
        object clone = formatter.Deserialize(ms);
        return clone;
    }
}

[Serializable]
public class ChildObj : ParentObj
{
    private int myB;

    // No need to override Clone, as it will still serialize the current object,
    // including the new myB field
}
It is not the most performant thing, but neither is the alternative: reflection. The benefit of this option is that it works seamlessly with inheritance.
You could use reflection to loop over all fields and copy them (slow). If that is too slow for your software, you could use DynamicMethod and generate IL.
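A sketch of that reflection approach (one level deep only; recursive cloning and the DynamicMethod/IL variant are left out):

public static T ReflectionCopy<T>(T source) where T : class
{
    Type type = source.GetType();
    // Create an instance without running any constructor.
    T copy = (T)FormatterServices.GetUninitializedObject(type);
    foreach (FieldInfo field in type.GetFields(
        BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic))
    {
        // Shallow per-field copy: reference-type fields still point at the same objects.
        field.SetValue(copy, field.GetValue(source));
    }
    return copy;
}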
Serialize the object and deserialize it again.
I don't think you are implementing ICloneable correctly here; It requires a Clone() method with no parameters. What I would recommend is something like:
public class ParentObj : ICloneable
{
    protected int myA;

    public virtual Object Clone()
    {
        var obj = new ParentObj();
        CopyObject(this, obj);
        return obj;
    }

    protected virtual void CopyObject(ParentObj source, ParentObj dest)
    {
        dest.myA = source.myA;
    }
}

public class ChildObj : ParentObj
{
    protected int myB;

    public override Object Clone()
    {
        var obj = new ChildObj();
        CopyObject(this, obj);
        return obj;
    }

    protected override void CopyObject(ParentObj source, ParentObj dest)
    {
        base.CopyObject(source, dest);
        ((ChildObj)dest).myB = ((ChildObj)source).myB;
    }
}
Note that CopyObject() is basically Object.MemberwiseClone(); presumably you would be doing more than just copying values, you would also be cloning any members that are classes.
Try the following (note the keyword "new"):
public class Parent
{
    private int _X;
    public int X { set { _X = value; } get { return _X; } }

    public Parent copy()
    {
        return new Parent { X = this.X };
    }
}

public class Child : Parent
{
    private int _Y;
    public int Y { set { _Y = value; } get { return _Y; } }

    public new Child copy()
    {
        return new Child { X = this.X, Y = this.Y };
    }
}
You should use the MemberwiseClone method instead:
public class ParentObj : ICloneable
{
    protected int myA;

    public virtual Object Clone()
    {
        ParentObj newObj = this.MemberwiseClone() as ParentObj;
        newObj.myA = this.myA; // not required, as the value type (int) is automatically already duplicated
        return newObj;
    }
}

public class ChildObj : ParentObj
{
    protected int myB;

    public override Object Clone()
    {
        ChildObj newObj = base.Clone() as ChildObj;
        newObj.myB = this.myB; // not required, as the value type (int) is automatically already duplicated
        return newObj;
    }
}
