I am deserializing lots of data with the Newtonsoft Json.NET library. Performance is high-priority, so all model classes are being manually deserialized with a JsonReader. Each model class has its own static constructor method FromJson which accepts a JsonReader to do the reading.
class Example
{
public Guid? Id { get; private set; }
public DateTime? Date { get; private set; }
public decimal? Amount { get; private set; }
public static Example FromJson(JsonReader reader)
{
var example = new Example();
reader.SkipToStartObject(); // Extension method, skips to first JsonToken.StartObject
while(reader.Read() && reader.TokenType == JsonToken.PropertyName)
{
var propertyName = reader.Value.ToString();
switch(propertyName)
{
case "id":
example.Id = reader.ReadAsGuid(); // Extension method
break;
case "date":
example.Date = reader.ReadAsDateTime();
break;
case "amount":
example.Amount = reader.ReadAsDecimal();
break;
default:
break;
}
}
return example;
}
}
I would like to somehow interface this class so that I can write a generic deserializer that takes that interface and automatically calls the FromJson() method. Ideally, I would be able to cleanly deserialize a WebResponse in a manner like so.
var response = await request.GetResponseAsync();
var stream = response.GetResponseStream();
return GenericJsonDeserializer.Deserialize<Example>(stream);
The GenericJsonDeserializer would constrain the allowed types to only those with the interface, set up a JsonReader from the stream, deserialize with the FromJson method, and return the object.
One problem is that C# interfaces cannot require a particular constructor, nor (before C# 11) can they declare static methods that implementers must provide. Thus, I cannot constrain GenericJsonDeserializer.
This problem is solvable with reflection, but that brings about a new problem. Performance is critical, and I cannot afford to use reflection in this case. Creating a new instance inside of the generic method would either:
Require the use of Activator if the deserialization code was handled in a regular constructor, or
Require reflection to obtain the static FromJson function and invoke it, which is probably even slower.
In either case, compiling DynamicMethods by emitting IL would be a best bet (and probably offer the best performance), but I would like to avoid that scenario if possible.
Is there any other way I can constrain a generic method to require either a static constructor or a constructor overload that accepts a JsonReader for deserialization?
Instead of constraining the ctor, you can constrain to an initialize method:
// (the self-referencing constraint is not strictly necessary)
public interface IDeserializable<T> where T : IDeserializable<T>, new()
{
T FromJson(JsonReader reader);
}
Then modify Example to implement that interface:
public class Example : IDeserializable<Example>
{
//...
public Example FromJson(JsonReader reader)
{
// populate the object with json...
// you can create complex object like this:
// this.Example2 = new Example2().FromJson(reader);
return this;
}
}
Finally, define the Deserialize method as such:
public static class GenericJsonDeserializer
{
public static T Deserialize<T>(Stream stream) where T : IDeserializable<T>, new()
{
using (var reader = ...)
{
var result = new T();
result.FromJson(reader);
return result;
}
}
}
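For readers on C# 11 / .NET 7 or later: static abstract interface members allow the constraint the question asks for directly, without the new() workaround. The following is only a sketch under that assumption. IReadable, FromReader and GenericDeserializer are hypothetical names, and a TextReader stands in for JsonReader so that the example has no external dependencies.

```csharp
using System;
using System.IO;

// Requires C# 11 / .NET 7 or later (static abstract interface members).
public interface IReadable<TSelf> where TSelf : IReadable<TSelf>
{
    // TextReader is a stand-in for JsonReader in this sketch.
    static abstract TSelf FromReader(TextReader reader);
}

public sealed class Example : IReadable<Example>
{
    public string Line { get; private set; } = "";

    // The static factory the interface requires.
    public static Example FromReader(TextReader reader)
    {
        return new Example { Line = reader.ReadLine() ?? "" };
    }
}

public static class GenericDeserializer
{
    public static T Deserialize<T>(Stream stream) where T : IReadable<T>
    {
        using var reader = new StreamReader(stream);
        // Calls the static method on T through the constraint; no reflection involved.
        return T.FromReader(reader);
    }
}
```

With this shape the call site stays as in the question, e.g. GenericDeserializer.Deserialize<Example>(stream), and no instance has to be created before reading.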
Since you're already naming the type Example here:
GenericJsonDeserializer.Deserialize<Example>(stream);
You can just use:
Example.FromJson
Because you need to know the type anyway.
Just make a version that accepts Stream and JsonReader or whatever.
You can share the logic for creating JsonReader by some other static class if you need to.
There is also another approach. You can move / extract your FromJson method to another class / interface:
interface IMyJsonDeserializer
{
void FromJson(Stream stream, out ExampleClassA result);
void FromJson(Stream stream, out ExampleClassB result);
}
class MyJsonDeserializer : IMyJsonDeserializer
{
public void FromJson(Stream stream, out ExampleClassA result)
{
// code to deserialize
}
public void FromJson(Stream stream, out ExampleClassB result)
{
// code to deserialize
}
// .. more methods
}
Usage:
var deserializer = new MyJsonDeserializer(); // you can create it just once somewhere
ExampleClassA a;
deserializer.FromJson(stream, out a);
ExampleClassB b;
deserializer.FromJson(stream, out b);
If you have a lot of classes you can do some interface segregation and inheritance. You can now share your logic for creating JsonReader from Stream using OOP methods.
If you do care about performance, take a look at Utf8Json. It is reported to be faster than Newtonsoft.Json.
After implementing the Page Object Pattern using this tutorial, I have several
pages that derive from BasePageElementMap.
I want to handle some operations, so I have this class:
public class DownloadAttachmentsHandler
{
public DownloadAttachmentsHandler(BasePageElementMap basePageElementMap)
{
Type type = basePageElementMap.GetType();
}
}
Every page has its HTML elements located inside a class derived from BasePageElementMap, and from the page I get a Map object that contains all the HTML elements I am using.
public class YahooEmailPage: BasePage<YahooEmailPageElementMap, YahooEmailPageValidator>...
So suppose I call the constructor like this:
DownloadAttachmentsHandler att = new DownloadAttachmentsHandler(new YahooEmailPage().Map);
Inside the DownloadAttachmentsHandler constructor I want to cast this to YahooEmailPage.
Currently I only have the Type object; how can I cast it to YahooEmailPage?
If I understood correctly, you want the following:
public class DownloadAttachmentsHandler
{
public static object Cast(object obj, Type t)
{
try
{
var param = Expression.Parameter(obj.GetType());
return Expression.Lambda(Expression.Convert(param, t), param)
.Compile().DynamicInvoke(obj);
}
catch (TargetInvocationException ex)
{
throw ex.InnerException;
}
}
public DownloadAttachmentsHandler(BasePageElementMap basePageElementMap)
{
Type type = basePageElementMap.GetType();
dynamic foo = Cast(basePageElementMap, type);
}
}
Based on this answer by balage.
EDIT: For the example, let's assume that GetType() returns the type bar. You will have to create a method like this one:
public static void UseDynamic(bar input)
{
// Stuff
}
And then do
public DownloadAttachmentsHandler(BasePageElementMap basePageElementMap)
{
Type type = basePageElementMap.GetType();
dynamic foo = Cast(basePageElementMap, type);
UseDynamic(foo);
}
You can use overloads to avoid having to write many ifs or a switch. However, whichever approach you take, you will have to create a method for each possible type.
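To illustrate the overload idea: passing a dynamic argument defers overload resolution to run time, so the runtime type of the object picks the method. This is a minimal sketch; BaseMap, YahooMap, GmailMap and Handler are hypothetical stand-ins for the page map classes.

```csharp
using System;

public class BaseMap { }
public class YahooMap : BaseMap { }
public class GmailMap : BaseMap { }

public static class Handler
{
    // One overload per concrete type that needs special handling.
    public static string Use(YahooMap map) => "yahoo";
    public static string Use(GmailMap map) => "gmail";
    public static string Use(BaseMap map) => "base"; // fallback for any other map

    public static string Dispatch(BaseMap map)
    {
        // Casting to dynamic makes the binder choose the overload
        // from the runtime type of map instead of its static type.
        return Use((dynamic)map);
    }
}
```

Note that assigning the object to a dynamic variable is already enough for this kind of dispatch, so in this sketch no expression-tree cast is needed.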
I am trying to convert the following C# code into Java:
abstract class BaseProcessor<T> where T : new()
{
public T Process(HtmlDocument html)
{
T data = new T();
Type type = data.GetType();
BindingFlags flags = BindingFlags.Public | BindingFlags.Instance | BindingFlags.SetProperty;
PropertyInfo[] properties = type.GetProperties(flags);
foreach (PropertyInfo property in properties)
{
string value = "test";
type.InvokeMember(property.Name, flags, Type.DefaultBinder, data, new object[] { value });
}
return data;
}
}
So far I have this:
public class BaseProcessor<T>
{
public T Process(String m_doc)
{
T data = (T) new BaseProcessor<T>(); // this is not working
Document doc = Jsoup.parse(m_doc);
return data;
}
}
When I instantiate the data object, it does not pick up the properties of the generic type at runtime.
For example, when this code runs, it does not get the properties of the DecodeModel class:
IDocProcessor<DecodeModel> p = new DecodeThisProcessor();
return p.Process(doc);
public interface IDocProcessor<T>
{
T Process(String webresponse);
}
public class DecodeThisProcessor extends BaseProcessor<DecodeModel> implements IDocProcessor<DecodeModel>
{
public void setup();
}
So please help me: what is the right syntax to instantiate the generic object data?
You cannot instantiate generics. The reason is that the type is not available at run-time, but actually replaced with Object by the compiler. So
T data = new T(); // Not valid in Java for a generics T!
would in fact be:
Object data = new Object(); // Obviously not the desired result
Read the Java Generics Tutorial on "type erasure" for details.
You will need to employ the factory pattern.
T data = factory.make();
where
public interface Factory<T> {
T make();
}
needs to be implemented and passed to the constructor. To make this work, you need a factory that knows how to instantiate the desired class!
A (rather obvious) variant is to put the factory method into your - abstract - class.
public abstract class BaseProcessor<T>
{
protected abstract T makeProcessor();
public T Process(String m_doc)
{
T data = makeProcessor(); // this is now working!
Document doc = Jsoup.parse(m_doc);
return data;
}
}
and when extending BaseProcessor implement it for the actual final type.
Tough luck; in Java the whole of Generics is strictly a compile-time artifact and the instantiation of the type parameters doesn't exist in the runtime. The usual workaround is to pass an instance of Class as a marker, which will allow you to reflectively create an object of that type. This is fraught with many pitfalls, but is the best you can get in Java.
You can do this:
public class BaseProcessor<T>
{
private Class<T> clazz;
public BaseProcessor(Class<T> clazz)
{
this.clazz = clazz;
}
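public T Process(String m_doc) throws ReflectiveOperationException
{
// Class.newInstance() is deprecated since Java 9; go through the no-arg constructor instead
T data = clazz.getDeclaredConstructor().newInstance();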
Document doc = Jsoup.parse(m_doc);
return data;
}
}
Hint: Make sure that T has a no-arg constructor.
I asked a question yesterday regarding using either reflection or Strategy Pattern for dynamically calling methods.
However, since then I have decided to change the methods into individual classes that implement a common interface. The reason being, each class, whilst bearing some similarities also perform certain methods unique to that class.
I had been using a strategy as such:
switch (method)
{
case "Pivot":
return new Pivot(originalData);
case "GroupBy":
return new GroupBy(originalData);
case "Standard deviation":
return new StandardDeviation(originalData);
case "% phospho PRAS Protein":
return new PhosphoPRASPercentage(originalData);
case "AveragePPPperTreatment":
return new AveragePPPperTreatment(originalData);
case "AvgPPPNControl":
return new AvgPPPNControl(originalData);
case "PercentageInhibition":
return new PercentageInhibition(originalData);
default:
throw new Exception("ERROR: Method " + method + " does not exist.");
}
However, as the number of potential classes grows, I will need to keep adding new cases, thus breaking the "closed for modification" rule.
Instead, I have used a solution as such:
var test = Activator.CreateInstance(null, "MBDDXDataViews."+ _class);
ICalculation instance = (ICalculation)test.Unwrap();
return instance;
Effectively, the _class parameter is the name of the class passed in at runtime.
Is this a common way to do this, and will there be any performance issues with it?
I am fairly new to reflection, so your advice would be welcome.
When using reflection you should ask yourself a couple of questions first, because you may end up in an over-the-top complex solution that's hard to maintain:
1. Is there a way to solve the problem using genericity or class/interface inheritance?
2. Can I solve the problem using dynamic invocations (only .NET 4.0 and above)?
3. Is performance important, i.e. will my reflected method or instantiation call be called once, twice or a million times?
4. Can I combine technologies to get to a smart but workable/understandable solution?
5. Am I ok with losing compile time type safety?
Genericity / dynamic
From your description I assume you do not know the types at compile time, you only know they share the interface ICalculation. If this is correct, then number (1) and (2) above are likely not possible in your scenario.
Performance
This is an important question to ask. The overhead of using reflection can impose a more than 400-fold penalty, which slows down even a moderate number of calls.
The resolution is relatively easy: instead of using Activator.CreateInstance, use a factory method (you already have that), look up the MethodInfo, create a delegate, cache it and use the delegate from then on. This incurs a penalty only on the first invocation; subsequent invocations have near-native performance.
Combine technologies
A lot is possible here, but I'd really need to know more of your situation to assist in this direction. Often, I end up combining dynamic with generics, with cached reflection. When using information hiding (as is normal in OOP), you may end up with a fast, stable and still well-extensible solution.
Losing compile time type safety
Of the five questions, this is perhaps the most important one to worry about. It is very important to create your own exceptions that give clear information about reflection mistakes. That means: every call to a method, constructor or property based on an input string or otherwise unchecked information must be wrapped in a try/catch. Catch only specific exceptions (as always, I mean: never catch Exception itself).
Focus on TargetException (the target object is invalid, for example null for an instance method), TargetInvocationException (the method exists, but raised an exception when invoked), TargetParameterCountException, MethodAccessException (not the right privileges, happens a lot in ASP.NET), InvalidOperationException (happens with generic types). You don't always need to try to catch all of them; it depends on the expected input and expected target objects.
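As a rough illustration of that advice, the sketch below wraps one reflective call and translates the failures into clearer exceptions. SafeReflection and InvokeStatic are hypothetical names, and the sketch assumes a public static, parameterless target method; which exceptions you handle in practice depends on your inputs.

```csharp
using System;
using System.Reflection;

public static class SafeReflection
{
    // Invokes a public static, parameterless method by name and turns
    // reflection failures into exceptions with clearer messages.
    public static object InvokeStatic(Type type, string methodName)
    {
        MethodInfo method = type.GetMethod(
            methodName,
            BindingFlags.Public | BindingFlags.Static,
            null,
            Type.EmptyTypes,
            null);

        if (method == null)
            throw new MissingMethodException(type.FullName, methodName);

        try
        {
            return method.Invoke(null, null);
        }
        catch (TargetInvocationException ex)
        {
            // The method itself threw; surface the real cause as the inner exception.
            throw new InvalidOperationException(
                type.FullName + "." + methodName + " threw an exception.",
                ex.InnerException);
        }
    }
}
```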
To sum it up
Get rid of your Activator.CreateInstance and use MethodInfo to find the factory-create method, and use Delegate.CreateDelegate to create and cache the delegate. Simply store it in a static Dictionary where the key is equal to the class-string in your example code. Below is a quick but not-so-dirty way of doing this safely and without losing too much type safety.
Sample code
public class TestDynamicFactory
{
// static storage
private static Dictionary<string, Func<ICalculate>> InstanceCreateCache = new Dictionary<string, Func<ICalculate>>();
// how to invoke it
static int Main()
{
// invoke it; the first call populates the cache, later calls reuse the cached delegate
// also, no need to give the full method anymore, just the classname, as we
// use an interface for the rest. Almost full type safety!
ICalculate instanceOfCalculator = CreateCachableICalculate("RandomNumber");
int result = instanceOfCalculator.ExecuteCalculation();
Console.WriteLine(result);
return 0;
}
// searches for the class, initiates it (calls factory method) and returns the instance
// TODO: add a lot of error handling!
static ICalculate CreateCachableICalculate(string className)
{
if(!InstanceCreateCache.ContainsKey(className))
{
// get the type (several ways exist, this is an easy one);
// the "TestDynamicFactory." prefix assumes that is the namespace of the calculation classes
Type type = Type.GetType("TestDynamicFactory." + className);
// NOTE: this can be tempting, but do NOT use the following, because you cannot
// create a delegate from a ctor and will lose many performance benefits
//ConstructorInfo constructorInfo = type.GetConstructor(Type.EmptyTypes);
// works with public instance/static methods
MethodInfo mi = type.GetMethod("Create");
// the "magic", turn it into a delegate
var createInstanceDelegate = (Func<ICalculate>) Delegate.CreateDelegate(typeof (Func<ICalculate>), mi);
// store for future reference
InstanceCreateCache.Add(className, createInstanceDelegate);
}
return InstanceCreateCache[className].Invoke();
}
}
// example of your ICalculate interface
public interface ICalculate
{
void Initialize();
int ExecuteCalculation();
}
// example of an ICalculate class
public class RandomNumber : ICalculate
{
private static Random _random;
public static RandomNumber Create()
{
var random = new RandomNumber();
random.Initialize();
return random;
}
public void Initialize()
{
_random = new Random(DateTime.Now.Millisecond);
}
public int ExecuteCalculation()
{
return _random.Next();
}
}
I suggest you give your factory implementation a RegisterImplementation method. Then every new class only needs a call to that method, and you are not changing your factory's code.
UPDATE:
What I mean is something like this:
Create an interface that defines a calculation. According to your code, you already did this. For the sake of being complete, I am going to use the following interface in the rest of my answer:
public interface ICalculation
{
void Initialize(string originalData);
void DoWork();
}
Your factory will look something like this:
public class CalculationFactory
{
private readonly Dictionary<string, Func<string, ICalculation>> _calculations =
new Dictionary<string, Func<string, ICalculation>>();
public void RegisterCalculation<T>(string method)
where T : ICalculation, new()
{
_calculations.Add(method, originalData =>
{
var calculation = new T();
calculation.Initialize(originalData);
return calculation;
});
}
public ICalculation CreateInstance(string method, string originalData)
{
return _calculations[method](originalData);
}
}
This simple factory class omits error checking for the sake of simplicity.
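If you want the checks, one possible shape is sketched below. It is self-contained, so it repeats the interface; CheckedCalculationFactory and Echo are hypothetical names, and the choice of exception types is an assumption rather than a requirement.

```csharp
using System;
using System.Collections.Generic;

public interface ICalculation
{
    void Initialize(string originalData);
    void DoWork();
}

// Trivial implementation used only to exercise the factory.
public class Echo : ICalculation
{
    public string Data { get; private set; }
    public void Initialize(string originalData) { Data = originalData; }
    public void DoWork() { }
}

public class CheckedCalculationFactory
{
    private readonly Dictionary<string, Func<string, ICalculation>> _calculations =
        new Dictionary<string, Func<string, ICalculation>>();

    public void RegisterCalculation<T>(string method) where T : ICalculation, new()
    {
        if (method == null)
            throw new ArgumentNullException(nameof(method));
        if (_calculations.ContainsKey(method))
            throw new ArgumentException("Already registered: " + method, nameof(method));

        _calculations.Add(method, originalData =>
        {
            var calculation = new T();
            calculation.Initialize(originalData);
            return calculation;
        });
    }

    public ICalculation CreateInstance(string method, string originalData)
    {
        // TryGetValue avoids a second lookup and lets us report the bad key.
        if (!_calculations.TryGetValue(method, out var create))
            throw new KeyNotFoundException("Method " + method + " is not registered.");
        return create(originalData);
    }
}
```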
UPDATE 2:
You would initialize it like this somewhere in your application's initialization routine:
CalculationFactory _factory = new CalculationFactory();
public void RegisterCalculations()
{
_factory.RegisterCalculation<Pivot>("Pivot");
_factory.RegisterCalculation<GroupBy>("GroupBy");
_factory.RegisterCalculation<StandardDeviation>("Standard deviation");
_factory.RegisterCalculation<PhosphoPRASPercentage>("% phospho PRAS Protein");
_factory.RegisterCalculation<AveragePPPperTreatment>("AveragePPPperTreatment");
_factory.RegisterCalculation<AvgPPPNControl>("AvgPPPNControl");
_factory.RegisterCalculation<PercentageInhibition>("PercentageInhibition");
}
Just as an example of how to pass initialization data to the constructor:
Something similar to: Activator.CreateInstance(Type.GetType("ConsoleApplication1.Operation1"), initializationData);
but written with LINQ expressions (part of the code is taken from here):
public class Operation1
{
public Operation1(object data)
{
}
}
public class Operation2
{
public Operation2(object data)
{
}
}
public class ActivatorsStorage
{
public delegate object ObjectActivator(params object[] args);
private readonly Dictionary<string, ObjectActivator> activators = new Dictionary<string,ObjectActivator>();
private ObjectActivator CreateActivator(ConstructorInfo ctor)
{
Type type = ctor.DeclaringType;
ParameterInfo[] paramsInfo = ctor.GetParameters();
ParameterExpression param = Expression.Parameter(typeof(object[]), "args");
Expression[] argsExp = new Expression[paramsInfo.Length];
for (int i = 0; i < paramsInfo.Length; i++)
{
Expression index = Expression.Constant(i);
Type paramType = paramsInfo[i].ParameterType;
Expression paramAccessorExp = Expression.ArrayIndex(param, index);
Expression paramCastExp = Expression.Convert(paramAccessorExp, paramType);
argsExp[i] = paramCastExp;
}
NewExpression newExp = Expression.New(ctor, argsExp);
LambdaExpression lambda = Expression.Lambda(typeof(ObjectActivator), newExp, param);
return (ObjectActivator)lambda.Compile();
}
private ObjectActivator CreateActivator(string className)
{
Type type = Type.GetType(className);
if (type == null)
throw new ArgumentException("Incorrect class name", "className");
// Get constructor with one parameter
ConstructorInfo ctor = type.GetConstructors()
.SingleOrDefault(w => w.GetParameters().Length == 1
&& w.GetParameters()[0].ParameterType == typeof(object));
if (ctor == null)
throw new Exception("No constructor with a single object parameter was found.");
return CreateActivator(ctor);
}
public ObjectActivator GetActivator(string className)
{
ObjectActivator activator;
if (activators.TryGetValue(className, out activator))
{
return activator;
}
activator = CreateActivator(className);
activators[className] = activator;
return activator;
}
}
Usage is as follows:
ActivatorsStorage ast = new ActivatorsStorage();
var a = ast.GetActivator("ConsoleApplication1.Operation1")(initializationData);
var b = ast.GetActivator("ConsoleApplication1.Operation2")(initializationData);
The same can be implemented with DynamicMethods.
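For comparison, a rough sketch of the DynamicMethod variant follows. It assumes a reference type with a public constructor taking a single object parameter; DynamicActivator is a hypothetical name, and the Operation1 here is a variant that stores its argument so the result can be inspected.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public class Operation1
{
    public object Data { get; }
    public Operation1(object data) { Data = data; }
}

public static class DynamicActivator
{
    // Builds a delegate equivalent to: data => new SomeType(data).
    // Only intended for reference types in this sketch (a value type would need boxing).
    public static Func<object, object> Create(Type type)
    {
        ConstructorInfo ctor = type.GetConstructor(new[] { typeof(object) });
        if (ctor == null)
            throw new ArgumentException(
                "No public constructor taking one object parameter.", nameof(type));

        var method = new DynamicMethod(
            "Create" + type.Name,
            typeof(object),
            new[] { typeof(object) },
            type.Module,
            true);

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);      // push the constructor argument
        il.Emit(OpCodes.Newobj, ctor); // call the constructor
        il.Emit(OpCodes.Ret);          // return the new instance

        return (Func<object, object>)method.CreateDelegate(typeof(Func<object, object>));
    }
}
```

As with the expression version, the delegate should be created once per type and cached, since building it is the expensive part.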
Also, the classes are not required to be inherited from the same interface or base class.
Thanks, Vitaliy
One strategy that I use in cases like this is to flag my various implementations with a special attribute to indicate its key, and scan the active assemblies for types with that key:
[AttributeUsage(AttributeTargets.Class)]
public class OperationAttribute : System.Attribute
{
public OperationAttribute(string opKey)
{
_opKey = opKey;
}
private string _opKey;
public string OpKey {get {return _opKey;}}
}
[Operation("Standard deviation")]
public class StandardDeviation : IOperation
{
public void Initialize(object originalData)
{
//...
}
}
public interface IOperation
{
void Initialize(object originalData);
}
public class OperationFactory
{
static OperationFactory()
{
_opTypesByKey =
(from a in AppDomain.CurrentDomain.GetAssemblies()
from t in a.GetTypes()
let att = t.GetCustomAttributes(typeof(OperationAttribute), false).FirstOrDefault()
where att != null
select new { ((OperationAttribute)att).OpKey, t})
.ToDictionary(e => e.OpKey, e => e.t);
}
private static IDictionary<string, Type> _opTypesByKey;
public IOperation GetOperation(string opKey, object originalData)
{
var op = (IOperation)Activator.CreateInstance(_opTypesByKey[opKey]);
op.Initialize(originalData);
return op;
}
}
That way, just by creating a new class with a new key string, you can automatically "plug in" to the factory, without having to modify the factory code at all.
You'll also notice that rather than depending on each implementation to provide a specific constructor, I've created an Initialize method on the interface I expect the classes to implement. As long as they implement the interface, I'll be able to send the "originalData" to them without any reflection weirdness.
I'd also suggest using a dependency injection framework like Ninject instead of using Activator.CreateInstance. That way, your operation implementations can use constructor injection for their various dependencies.
Essentially, it sounds like you want the factory pattern. In this situation, you define a mapping of input to output types and then instantiate the type at runtime like you are doing.
Example:
You have X number of classes, and they all share a common interface of IDoSomething.
public interface IDoSomething
{
void DoSomething();
}
public class Foo : IDoSomething
{
public void DoSomething()
{
// Does Something specific to Foo
}
}
public class Bar : IDoSomething
{
public void DoSomething()
{
// Does something specific to Bar
}
}
public class MyClassFactory
{
private static Dictionary<string, Type> _mapping = new Dictionary<string, Type>();
static MyClassFactory()
{
_mapping.Add("Foo", typeof(Foo));
_mapping.Add("Bar", typeof(Bar));
}
public static void AddMapping(string query, Type concreteType)
{
// Omitting key checking code, etc. Basically, you can register new types at runtime as well.
_mapping.Add(query, concreteType);
}
public IDoSomething GetMySomething(string desiredThing)
{
if(!_mapping.ContainsKey(desiredThing))
throw new ApplicationException("No mapping is defined for: " + desiredThing);
return Activator.CreateInstance(_mapping[desiredThing]) as IDoSomething;
}
}
There's no error checking here. Are you absolutely sure that _class will resolve to a valid class? Are you controlling all the possible values or does this string somehow get populated by an end-user?
Reflection is generally more costly than avoiding it. Performance issues are proportionate to the number of objects you plan to instantiate this way.
Before you run off and use a dependency injection framework read the criticisms of it. =)
I'm currently trying to read some binary data using the BinaryReader. I've created a helper class to parse this data. Currently it is a static class with this kind of methods:
public static class Parser
{
public static ParseObject1 ReadObject1(BinaryReader reader){...}
public static ParseObject2 ReadObject2(BinaryReader reader){...}
}
Then I use it like this:
...
BinaryReader br = new BinaryReader(File.OpenRead(@"file.ext"));
ParseObject1 po1 = Parser.ReadObject1(br);
...
ParseObject2 po2 = Parser.ReadObject2(br);
...
But then I started thinking, I could also just initialize the class like this
Parser p = new Parser(br);
ParseObject1 po1 = p.ReadObject1();
Which would be the better implementation?
Which is faster isn't really relevant here; your concerns are more about concurrency and architecture.
In the case of a static Parser class to which you pass the BinaryReader as an argument to the ReadObject call, you're providing all of the data to the method, and (presumably, from your example) not persisting any data about the Reader in the Parser; this allows for you to instantiate multiple BinaryReader objects and to invoke the Parser on them separately, with no concurrency or collision problems. (Note that this ONLY applies if you have no persistent static data within your Parser object.)
On the other hand, if your Parser gets passed the BinaryReader object to operate upon, it's presumably persisting that BinaryReader data within itself; there's a potential complication there if you have interleaved calls to your Parser with different BinaryReader objects.
If your Parser doesn't need to maintain state between ReadObject1 and ReadObject2, I'd recommend keeping it static, and passing in the BinaryReader object reference; keeping it static in that instance is a good "descriptor" of the fact that there's no data persisted between those invocations. On the other hand, if there's data persisted about the BinaryReader within the Parser, I'd make it non-static, and pass the data in (like in your second example). Making it non-static but with class-persisted data makes it far less likely to cause problems with concurrency.
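To make the two shapes concrete, here is a small sketch of each. StaticParser, InstanceParser and ReadHeader are hypothetical names, and reading a single Int32 stands in for the real parsing logic.

```csharp
using System.IO;

// Stateless: every call receives the reader it should work on,
// so nothing is shared between calls.
public static class StaticParser
{
    public static int ReadHeader(BinaryReader reader)
    {
        return reader.ReadInt32();
    }
}

// Stateful: the reader is stored per instance, so each instance is
// tied to exactly one BinaryReader for its lifetime.
public sealed class InstanceParser
{
    private readonly BinaryReader _reader;

    public InstanceParser(BinaryReader reader)
    {
        _reader = reader;
    }

    public int ReadHeader()
    {
        return _reader.ReadInt32();
    }
}
```

In both cases the state lives in an explicit place (the argument or the instance field), which is the point made above; the risky variant would be a static class that stores the reader in a static field.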
There's probably negligible difference in performance between the two implementations. I expect reading the binary file would take > 99% of the execution time.
If you're really concerned with performance, you could wrap both implementations in separate loops and time them.
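A minimal timing helper for that kind of comparison might look like this. Timing and Measure are hypothetical names; a Stopwatch loop like this is only a rough measurement, not a substitute for a proper benchmarking tool.

```csharp
using System;
using System.Diagnostics;

public static class Timing
{
    // Runs the action once as a warm-up (so JIT compilation is not counted),
    // then times the given number of iterations.
    public static TimeSpan Measure(Action action, int iterations)
    {
        action();

        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

You would call Measure once per implementation with the same iteration count and compare the two results, ideally reading from an in-memory stream so that file I/O does not dominate the numbers.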
The performance difference between these two approaches should be negligible. Personally, I would suggest using a non-static approach due to the flexibility that it provides. If you find it helpful to have much of the parsing logic consolidated in one place, you could use a combination approach (demonstrated in my example below).
Regarding performance, if you were repeatedly creating many new instances of your Parser class over a short period of time, you might notice a small performance impact, but then you would likely be able to refactor the code to avoid repeatedly creating instances of the Parser class. Also, while calling an instance method (especially a virtual method) is technically not as fast as calling a static method, the performance difference should again be negligible.
McWafflestix brings up a good point about state. However, given that your current implementation uses static methods, I assume that your Parser class does not need to maintain state between calls to the Read methods, and therefore you should be able to reuse the same Parser instance in order to parse multiple objects from a BinaryReader stream.
Below is an example that illustrates the approach that I would probably take for this problem. Here are some features of this example:
Using polymorphism to abstract details about where the parsing logic resides for a given type of object.
Using a repository to store Parser instances so that they can be reused.
Using reflection to identify the parsing logic for a given class or struct.
Notice that I've kept the parsing logic in static methods within the ParseHelper class, and the Read instance methods on the MyObjectAParser and MyObjectBParser classes utilize those static methods on the ParseHelper class. This is just a design decision that you can make depending on what makes the most sense to you regarding how to organize your parsing logic. I'm guessing it would probably make sense to move some of the type-specific parsing logic into the individual Parser classes, but keep some of the general parsing logic in a ParseHelper class.
// define a non-generic parser interface so that we can refer to all types of parsers
public interface IParser
{
object Read(BinaryReader reader);
}
// define a generic parser interface so that we can specify a Read method specific to a particular type
public interface IParser<T> : IParser
{
new T Read(BinaryReader reader);
}
public abstract class Parser<T> : IParser<T>
{
public abstract T Read(BinaryReader reader);
object IParser.Read(BinaryReader reader)
{
return this.Read(reader);
}
}
// define a Parser attribute so that we can easily determine the correct parser for a given type
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Struct, AllowMultiple = false, Inherited = true)]
public class ParserAttribute : Attribute
{
public Type ParserType { get; private set; }
public ParserAttribute(Type parserType)
{
if (!typeof(IParser).IsAssignableFrom(parserType))
throw new ArgumentException(string.Format("The type [{0}] does not implement the IParser interface.", parserType.Name), "parserType");
this.ParserType = parserType;
}
public ParserAttribute(Type parserType, Type targetType)
{
// check that the type represented by parserType implements the IParser interface
if (!typeof(IParser).IsAssignableFrom(parserType))
throw new ArgumentException(string.Format("The type [{0}] does not implement the IParser interface.", parserType.Name), "parserType");
// check that the type represented by parserType implements the IParser<T> interface, where T is the type specified by targetType
if (!typeof(IParser<>).MakeGenericType(targetType).IsAssignableFrom(parserType))
throw new ArgumentException(string.Format("The type [{0}] does not implement the IParser<{1}> interface.", parserType.Name, targetType.Name), "parserType");
this.ParserType = parserType;
}
}
// let's define a couple of example classes for parsing
// the MyObjectA class corresponds to ParseObject1 in the original question
[Parser(typeof(MyObjectAParser))] // the parser type for MyObjectA is MyObjectAParser
class MyObjectA
{
// ...
}
// the MyObjectB class corresponds to ParseObject2 in the original question
[Parser(typeof(MyObjectBParser))] // the parser type for MyObjectB is MyObjectBParser
class MyObjectB
{
// ...
}
// a static class that contains helper functions to handle parsing logic
static class ParseHelper
{
public static MyObjectA ReadObjectA(BinaryReader reader)
{
// <code here to parse MyObjectA from BinaryReader>
throw new NotImplementedException();
}
public static MyObjectB ReadObjectB(BinaryReader reader)
{
// <code here to parse MyObjectB from BinaryReader>
throw new NotImplementedException();
}
}
// a parser class that parses objects of type MyObjectA from a BinaryReader
class MyObjectAParser : Parser<MyObjectA>
{
public override MyObjectA Read(BinaryReader reader)
{
return ParseHelper.ReadObjectA(reader);
}
}
// a parser class that parses objects of type MyObjectB from a BinaryReader
class MyObjectBParser : Parser<MyObjectB>
{
public override MyObjectB Read(BinaryReader reader)
{
return ParseHelper.ReadObjectB(reader);
}
}
// define a ParserRepository to encapsulate the logic for finding the correct parser for a given type
public class ParserRepository
{
private Dictionary<Type, IParser> _Parsers = new Dictionary<Type, IParser>();
public IParser<T> GetParser<T>()
{
// attempt to look up the correct parser for type T from the dictionary
Type targetType = typeof(T);
IParser parser;
if (!this._Parsers.TryGetValue(targetType, out parser))
{
// no parser was found, so check the target type for a Parser attribute
object[] attributes = targetType.GetCustomAttributes(typeof(ParserAttribute), true);
if (attributes != null && attributes.Length > 0)
{
ParserAttribute parserAttribute = (ParserAttribute)attributes[0];
// create an instance of the identified parser
parser = (IParser<T>)Activator.CreateInstance(parserAttribute.ParserType);
// and add it to the dictionary
this._Parsers.Add(targetType, parser);
}
else
{
throw new InvalidOperationException(string.Format("Unable to find a parser for the type [{0}].", targetType.Name));
}
}
return (IParser<T>)parser;
}
// this method can be used to set up parsers without the use of the Parser attribute
public void RegisterParser<T>(IParser<T> parser)
{
this._Parsers[typeof(T)] = parser;
}
}
Usage example:
ParserRepository parserRepository = new ParserRepository();
// ...
IParser<MyObjectA> parserForMyObjectA = parserRepository.GetParser<MyObjectA>();
IParser<MyObjectB> parserForMyObjectB = parserRepository.GetParser<MyObjectB>();
using (var fs = new FileStream(@"file.ext", FileMode.Open, FileAccess.Read, FileShare.Read))
{
BinaryReader br = new BinaryReader(fs);
MyObjectA objA = parserForMyObjectA.Read(br);
MyObjectB objB = parserForMyObjectB.Read(br);
// ...
}
// Notice that this code does not explicitly reference the MyObjectAParser or MyObjectBParser classes.
I wish to implement a deep copy of my class hierarchy in C#:
public class ParentObj : ICloneable
{
protected int myA;
public virtual object Clone()
{
ParentObj newObj = new ParentObj();
newObj.myA = this.myA;
return newObj;
}
}
public class ChildObj : ParentObj
{
protected int myB;
public override object Clone()
{
// base.Clone() creates a ParentObj, so this cast fails at run time
ChildObj newObj = (ChildObj)base.Clone();
newObj.myB = this.myB;
return newObj;
}
}
This does not work: when cloning the child, only a ParentObj is ever new-ed. In my code some classes have deep hierarchies.
What is the recommended way of doing this? Cloning everything at each level without calling the base class seems wrong. There must be some neat solutions to this problem; what are they?
Can I thank everyone for their answers. It was really interesting to see some of the approaches. I think it would be good if someone gave an example of a reflection answer for completeness. +1 awaiting!
The typical approach is to use the "copy constructor" pattern, à la C++:
class Base : ICloneable
{
int x;
protected Base(Base other)
{
x = other.x;
}
public virtual object Clone()
{
return new Base(this);
}
}
class Derived : Base
{
int y;
protected Derived(Derived other)
: base(other)
{
y = other.y;
}
public override object Clone()
{
return new Derived(this);
}
}
The other approach is to use Object.MemberwiseClone in the implementation of Clone. This ensures that the result is always of the correct type, and allows overrides to extend it:
class Base : ICloneable
{
List<int> xs;
public virtual object Clone()
{
Base result = (Base)this.MemberwiseClone();
// xs points to same List object here, but we want
// a new List object with copy of data
result.xs = new List<int>(xs);
return result;
}
}
class Derived : Base
{
List<int> ys;
public override object Clone()
{
// Cast is legal, because MemberwiseClone() will use the
// actual type of the object to instantiate the copy.
Derived result = (Derived)base.Clone();
// ys points to same List object here, but we want
// a new List object with copy of data
result.ys = new List<int>(ys);
return result;
}
}
Both approaches require that all classes in the hierarchy follow the pattern. Which one to use is a matter of preference.
If you just have any random class implementing ICloneable with no guarantees on implementation (aside from following the documented semantics of ICloneable), there's no way to extend it.
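Try the serialization trick:
// requires: using System.IO; using System.Runtime.Serialization.Formatters.Binary;
public object Clone(object toClone)
{
// toClone and everything it references must be marked [Serializable]
BinaryFormatter bf = new BinaryFormatter();
using (MemoryStream ms = new MemoryStream())
{
bf.Serialize(ms, toClone);
// rewind so that Deserialize reads from the start of the stream
ms.Position = 0;
return bf.Deserialize(ms);
}
}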
WARNING:
This code should be used with a great deal of caution. Use at your own risk. This example is provided as-is and without a warranty of any kind.
There is one other way to perform a deep clone on an object graph. It is important to be aware of the following when considering using this sample:
Cons:
Any references to external classes will also be cloned unless those references are provided to the Clone(object, ...) method.
No constructors will be executed on cloned objects; they are reproduced EXACTLY as they are.
No ISerializable or serialization constructors will be executed.
There is no way to alter the behavior of this method on a specific type.
It WILL clone everything, Stream, AppDomain, Form, whatever, and those will likely break your application in horrific ways.
It could break in cases where the serialization method is much more likely to continue working.
The implementation below uses recursion and can easily cause a stack overflow if your object graph is too deep.
So why would you want to use it?
Pros:
It does a complete deep-copy of all instance data with no coding required in the object.
It preserves all object graph references (even circular) in the reconstituted object.
It executes more than 20 times faster than the binary formatter, with less memory consumption.
It requires nothing, no attributes, implemented interfaces, public properties, nothing.
Code Usage:
You just call it with an object:
Class1 copy = Clone(myClass1);
Or let's say you have a child object and you are subscribed to its events... Now you want to clone that child object. By providing a list of objects not to clone, you can preserve some portion of the object graph:
Class1 copy = Clone(myClass1, this);
Implementation:
Now let's get the easy stuff out of the way first... Here is the entry point:
public static T Clone<T>(T input, params object[] stableReferences)
{
Dictionary<object, object> graph = new Dictionary<object, object>(new ReferenceComparer());
foreach (object o in stableReferences)
graph.Add(o, o);
return InternalClone(input, graph);
}
That is simple enough: it just builds a dictionary map for the objects during the clone and populates it with any objects that should not be cloned. You will note the comparer provided to the dictionary is a ReferenceComparer. Let's take a look at what it does:
class ReferenceComparer : IEqualityComparer<object>
{
bool IEqualityComparer<object>.Equals(object x, object y)
{ return Object.ReferenceEquals(x, y); }
int IEqualityComparer<object>.GetHashCode(object obj)
{ return RuntimeHelpers.GetHashCode(obj); }
}
That was easy enough: just a comparer that forces the use of System.Object's hash code and reference equality. Now comes the hard work:
private static T InternalClone<T>(T input, Dictionary<object, object> graph)
{
if (input == null || input is string || input.GetType().IsPrimitive)
return input;
Type inputType = input.GetType();
object exists;
if (graph.TryGetValue(input, out exists))
return (T)exists;
if (input is Array)
{
Array arItems = (Array)((Array)(object)input).Clone();
graph.Add(input, arItems);
for (long ix = 0; ix < arItems.LongLength; ix++)
arItems.SetValue(InternalClone(arItems.GetValue(ix), graph), ix);
return (T)(object)arItems;
}
else if (input is Delegate)
{
Delegate original = (Delegate)(object)input;
Delegate result = null;
foreach (Delegate fn in original.GetInvocationList())
{
Delegate fnNew;
if (graph.TryGetValue(fn, out exists))
fnNew = (Delegate)exists;
else
{
// use each entry's own target and method, not those of the combined delegate
fnNew = Delegate.CreateDelegate(input.GetType(), InternalClone(fn.Target, graph), fn.Method, true);
graph.Add(fn, fnNew);
}
result = Delegate.Combine(result, fnNew);
}
// a single-cast delegate is its own invocation-list entry, so Add would throw on the duplicate key
graph[input] = result;
return (T)(object)result;
}
else
{
Object output = FormatterServices.GetUninitializedObject(inputType);
if (!inputType.IsValueType)
graph.Add(input, output);
MemberInfo[] fields = inputType.GetFields(BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Instance);
object[] values = FormatterServices.GetObjectData(input, fields);
for (int i = 0; i < values.Length; i++)
values[i] = InternalClone(values[i], graph);
FormatterServices.PopulateObjectMembers(output, fields, values);
return (T)output;
}
}
You will notice right off the special cases for array and delegate copies. Each has its own reason. First, an Array does not have 'members' that can be cloned, so you have to handle it separately: rely on the shallow Clone() member and then clone each element. As for the delegate, it may work without the special case; however, this is far safer since it does not duplicate things like RuntimeMethodHandle and the like. If you intend to include other things from the core runtime (like System.Type) in your hierarchy, I suggest you handle them explicitly in a similar fashion.
The last case, and the most common, is simply to use roughly the same routines that are used by the BinaryFormatter. These allow us to pop all the instance fields (public or private) out of the original object, clone them, and stick them into an empty object. The nice thing here is that GetUninitializedObject returns a new instance that has not had its constructor run, which could otherwise cause issues and slow performance.
Whether the above works depends heavily on your specific object graph and the data therein. If you control the objects in the graph and know that they are not referencing silly things like a Thread, then the above code should work very well.
Testing:
Here is what I wrote to originally test this:
class Test
{
public Test(string name, params Test[] children)
{
Print = (Action<StringBuilder>)Delegate.Combine(
new Action<StringBuilder>(delegate(StringBuilder sb) { sb.AppendLine(this.Name); }),
new Action<StringBuilder>(delegate(StringBuilder sb) { sb.AppendLine(this.Name); })
);
Name = name;
Children = children;
}
public string Name;
public Test[] Children;
public Action<StringBuilder> Print;
}
static void Main(string[] args)
{
Dictionary<string, Test> data2, data = new Dictionary<string, Test>(StringComparer.OrdinalIgnoreCase);
Test a, b, c;
data.Add("a", a = new Test("a", new Test("a.a")));
a.Children[0].Children = new Test[] { a };
data.Add("b", b = new Test("b", a));
data.Add("c", c = new Test("c"));
data2 = Clone(data);
Assert.IsFalse(Object.ReferenceEquals(data, data2));
//basic contents test & comparer
Assert.IsTrue(data2.ContainsKey("a"));
Assert.IsTrue(data2.ContainsKey("A"));
Assert.IsTrue(data2.ContainsKey("B"));
//nodes are different between data and data2
Assert.IsFalse(Object.ReferenceEquals(data["a"], data2["a"]));
Assert.IsFalse(Object.ReferenceEquals(data["a"].Children[0], data2["a"].Children[0]));
Assert.IsFalse(Object.ReferenceEquals(data["B"], data2["B"]));
Assert.IsFalse(Object.ReferenceEquals(data["B"].Children[0], data2["B"].Children[0]));
Assert.IsFalse(Object.ReferenceEquals(data["B"].Children[0], data2["A"]));
//graph intra-references still intact?
Assert.IsTrue(Object.ReferenceEquals(data["B"].Children[0], data["A"]));
Assert.IsTrue(Object.ReferenceEquals(data2["B"].Children[0], data2["A"]));
Assert.IsTrue(Object.ReferenceEquals(data["A"].Children[0].Children[0], data["A"]));
Assert.IsTrue(Object.ReferenceEquals(data2["A"].Children[0].Children[0], data2["A"]));
data2["A"].Name = "anew";
StringBuilder sb = new StringBuilder();
data2["A"].Print(sb);
Assert.AreEqual("anew\r\nanew\r\n", sb.ToString());
}
Final Note:
Honestly, it was a fun exercise at the time. It is generally a great thing to have deep cloning on a data model. Today's reality is that most data models are generated, which makes the hackery above obsolete in favor of a generated deep-clone routine. I highly recommend generating your data model and its ability to perform deep clones rather than using the code above.
The best way is by serializing your object, then returning the deserialized copy. It will pick up everything about your object, except those marked as non-serializable, and makes inheriting serialization easy.
[Serializable]
public class ParentObj: ICloneable
{
private int myA;
[NonSerialized]
private object somethingInternal;
public virtual object Clone()
{
MemoryStream ms = new MemoryStream();
BinaryFormatter formatter = new BinaryFormatter();
formatter.Serialize(ms, this);
ms.Position = 0; // rewind, otherwise Deserialize starts at the end of the stream
object clone = formatter.Deserialize(ms);
return clone;
}
}
[Serializable]
public class ChildObj: ParentObj
{
private int myB;
// No need to override clone, as it will still serialize the current object, including the new myB field
}
It is not the most performant thing, but neither is the alternative: reflection. The benefit of this option is that it seamlessly inherits.
You could use reflection to loop over all fields and copy them (slow). If it's too slow for your software, you could use DynamicMethod and generate IL.
Serialize the object and deserialize it again.
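Since a reflection example was asked for, here is a minimal sketch of the "loop over all fields and copy them" idea. The names ReflectionCloner, ShallowCopy, DemoParent and DemoChild are made up for illustration. It assumes the type has a parameterless constructor (public or not), and it only copies field values, so reference-type fields end up shared between the original and the copy; a real deep copy would have to recurse into them.

```csharp
using System;
using System.Reflection;

// Hypothetical demo types, not taken from the question.
public class DemoParent
{
    private int myA;
    public int A { get { return myA; } set { myA = value; } }
}

public class DemoChild : DemoParent
{
    private int myB;
    public int B { get { return myB; } set { myB = value; } }
}

public static class ReflectionCloner
{
    // Creates an instance of the source's run-time type and copies every
    // instance field into it. This is a shallow, field-by-field copy.
    public static T ShallowCopy<T>(T source) where T : class
    {
        if (source == null) return null;
        Type type = source.GetType();
        // Assumption: the type has a parameterless constructor.
        T copy = (T)Activator.CreateInstance(type, true);
        // Walk up the hierarchy, because GetFields on the derived type
        // does not return private fields declared in base classes.
        for (Type t = type; t != null && t != typeof(object); t = t.BaseType)
        {
            FieldInfo[] fields = t.GetFields(
                BindingFlags.Instance | BindingFlags.Public |
                BindingFlags.NonPublic | BindingFlags.DeclaredOnly);
            foreach (FieldInfo field in fields)
                field.SetValue(copy, field.GetValue(source));
        }
        return copy;
    }
}
```

Calling ReflectionCloner.ShallowCopy on a DemoChild held in a DemoParent variable gives back a DemoChild, because the copy is created from GetType() rather than from the static type, so no per-class Clone override is needed.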
I don't think you are implementing ICloneable correctly here; it requires a Clone() method with no parameters. What I would recommend is something like:
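public class ParentObj : ICloneable
{
protected int myA;
public virtual object Clone()
{
var obj = new ParentObj();
CopyObject(this, obj);
return obj;
}
protected virtual void CopyObject(ParentObj source, ParentObj dest)
{
dest.myA = source.myA;
}
}
public class ChildObj : ParentObj
{
protected int myB;
public override object Clone()
{
var obj = new ChildObj();
CopyObject(this, obj);
return obj;
}
// an override must keep the base signature, so the parameters stay ParentObj
protected override void CopyObject(ParentObj source, ParentObj dest)
{
base.CopyObject(source, dest);
// Clone() above only ever passes ChildObj instances here
((ChildObj)dest).myB = ((ChildObj)source).myB;
}
}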
Note that CopyObject() is basically Object.MemberwiseClone(). Presumably you would be doing more than just copying values: you would also be cloning any members that are classes.
Try the following, which uses the "new" keyword:
public class Parent
{
private int _X;
public int X{ set{_X=value;} get{return _X;}}
public Parent copy()
{
return new Parent{X=this.X};
}
}
public class Child:Parent
{
private int _Y;
public int Y{ set{_Y=value;} get{return _Y;}}
public new Child copy()
{
return new Child{X=this.X,Y=this.Y};
}
}
You should use the MemberwiseClone method instead:
public class ParentObj : ICloneable
{
protected int myA;
public virtual Object Clone()
{
ParentObj newObj = this.MemberwiseClone() as ParentObj;
newObj.myA = this.myA; // not required, as value type (int) is automatically already duplicated.
return newObj;
}
}
public class ChildObj : ParentObj
{
protected int myB;
public override Object Clone()
{
ChildObj newObj = base.Clone() as ChildObj;
newObj.myB = this.myB; // not required, as value type (int) is automatically already duplicated
return newObj;
}
}