For various reasons, I need to implement a type caching mechanism in C#. Fortunately, the CLR provides Type.GUID to uniquely identify a type. Unfortunately, I can't find any way to look up a type based on this GUID. There's Type.GetTypeFromCLSID() but based on my understanding of the documentation (and experiments) that does something very, very different.
Is there any way to get a type based on its GUID short of looping through all the loaded types and comparing to their GUIDs?
EDIT: I forgot to mention that I would really like a "type fingerprint" of fixed width; that's why the GUID is so appealing to me. In the general case, of course, the fully qualified name of the type would work.
Why not use the designated property for that, i.e. AssemblyQualifiedName? This property is documented as "can be persisted and later used to load the Type".
The GUID is for COM interop.
This may just be a summary of answers already posted, but I don't think there is a way to do this without first building a map of Guid->Type.
We do this in our framework on initialization:
static TypeManager()
{
AppDomain.CurrentDomain.AssemblyLoad += (s, e) =>
{
_ScanAssembly(e.LoadedAssembly);
};
foreach (Assembly a in AppDomain.CurrentDomain.GetAssemblies())
{
_ScanAssembly(a);
}
}
private static void _ScanAssembly(Assembly a)
{
foreach (Type t in a.GetTypes())
{
//optional check to filter types (by interface or attribute, etc.)
//Add type to type map
}
}
Handling the AssemblyLoad event takes care of dynamically loaded assemblies.
From what I understand, Type.GUID uses the assembly version of the type as part of the Guid generation algorithm. This may lead to trouble if you increment your assembly version numbers. Using the GetDeterministicGuid method described in another answer would probably be advisable, depending on your application.
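The initialization code above leaves the "add type to type map" step open. A minimal sketch of one way to fill it in follows, assuming the map is keyed on Type.GUID; the class name GuidTypeMap and the method TryGetType are hypothetical names chosen for illustration, not part of the framework described in this answer.

```csharp
using System;
using System.Collections.Concurrent;
using System.Reflection;

// Hypothetical helper: maps Type.GUID values back to the loaded types.
public static class GuidTypeMap
{
    private static readonly ConcurrentDictionary<Guid, Type> Map =
        new ConcurrentDictionary<Guid, Type>();

    static GuidTypeMap()
    {
        // Subscribe first so assemblies loaded later are picked up too.
        AppDomain.CurrentDomain.AssemblyLoad += (s, e) => Scan(e.LoadedAssembly);
        foreach (Assembly a in AppDomain.CurrentDomain.GetAssemblies())
        {
            Scan(a);
        }
    }

    private static void Scan(Assembly assembly)
    {
        Type[] types;
        try
        {
            types = assembly.GetTypes();
        }
        catch (ReflectionTypeLoadException ex)
        {
            // Some types failed to load; keep the ones that did (failed entries are null).
            types = ex.Types;
        }

        foreach (Type t in types)
        {
            if (t != null)
            {
                // TryAdd keeps the first type seen for a given GUID.
                Map.TryAdd(t.GUID, t);
            }
        }
    }

    // Returns true and sets 'type' when a scanned type has the given GUID.
    public static bool TryGetType(Guid guid, out Type type)
    {
        return Map.TryGetValue(guid, out type);
    }
}
```

A lookup is then a single dictionary read, for example GuidTypeMap.TryGetType(typeof(string).GUID, out Type found).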
Don't loop to compare. Populate a HashSet<Type> and use the Contains method.
HashSet<Type> types = new HashSet<Type>();
... //populate
if (types.Contains(someObject.GetType()))
    //do something
This will certainly give you a fixed size entry, since all of them will be object references (instances of Type essentially being factory objects).
What about (from Generating Deterministic GUIDs):
private Guid GetDeterministicGuid(string input)
{
    // Use an MD5 hash to get a 16-byte hash of the string.
    // UTF-8 is used instead of Encoding.Default, whose result depends on the platform.
    using (MD5 md5 = MD5.Create())
    {
        byte[] inputBytes = Encoding.UTF8.GetBytes(input);
        byte[] hashBytes = md5.ComputeHash(inputBytes);
        // Generate a GUID from the hash.
        return new Guid(hashBytes);
    }
}
And throw in typeof(T).AssemblyQualifiedName as the input. You could store this data in a Dictionary<string, Guid> collection (or, for the reverse lookup, a Dictionary<Guid, string>).
This way you'll always have the same GUID for a given type (warning: collisions are possible).
If you are in control of these classes I would recommend:
public interface ICachable
{
Guid ClassId { get; }
}
public class Person : ICachable
{
public Guid ClassId
{
get { return new Guid("DF9DD4A9-1396-4ddb-98D4-F8F143692C45"); }
}
}
You can generate your GUIDs using Visual Studio, Tools->Create Guid.
The Mono documentation reports that a module has a Metadata heap of guids.
Perhaps Cecil might help you look up a type based on its GUID? I'm not sure, though: there is a GuidHeap class, but it seems to be generating the GUIDs. Perhaps that is enough for your cache to work?
I would use typeof(T).GUID as the key to find the instance in the cache dictionary:
private Dictionary<Guid, object> cacheDictionary = new Dictionary<Guid, object>();
and I would have a generic method that looks up the instance for the requested class in the dictionary, creating it on first use:
public T Getclass<T>() where T : new()
{
    Guid key = typeof(T).GUID;
    object cached;
    if (!cacheDictionary.TryGetValue(key, out cached))
    {
        // Not cached yet: create the instance and remember it
        cached = new T();
        cacheDictionary.Add(key, cached);
    }
    return (T)cached;
}
and I would use a singleton pattern for the cache,
and the call would be something like the code below:
var cachedObject = Cache.Instance.Getclass<MyClass>();
I want a unique ID (preferably static, without computation) for each class implementation, but not for each instance. The most obvious way to do this is to hardcode a value in the class, but keeping the values unique becomes a task for a human and isn't ideal.
abstract class Base
{
    public abstract int GetID();
}
class Foo : Base
{
    public override int GetID() => 10;
}
class Bar : Base
{
    public override int GetID() => 20;
}
Foo foo1 = new Foo();
Foo foo2 = new Foo();
Bar bar = new Bar();
foo1.GetID() == foo2.GetID(); // true
foo1.GetID() != bar.GetID();  // true
The class name would be an obvious unique identifier, but I need an int (or fixed length bytes). I pack the entire object into bytes, and use the id to know what class it is when I unpack it at the other end.
Hashing the class name every time I call GetID() seems needlessly process heavy just to get an ID number.
I could also make an enum as a lookup, but again I need to populate the enum manually.
EDIT: People have been asking important questions, so I'll put the info here.
Needs to be unique per class, not per instance (this is why the identified duplicate question doesn't answer this one).
ID value needs to be persistent between runs.
Value needs to be fixed length bytes or int. Variable length strings such as class name are not acceptable.
Needs to reduce CPU load wherever possible (caching results or using assembly based metadata instead of doing a hash each time).
Ideally, the ID can be retrieved from a static function. This means I can make a static lookup function that matches ID to class.
The number of different classes that need an ID isn't that big (<100), so collisions aren't a major concern.
EDIT2:
Some more colour since people are skeptical that this is really needed. I'm open to a different approach.
I'm writing some networking code for a game, and it's broken down into message objects. Each different message type is a class that inherits from MessageBase and adds its own fields, which will be sent.
The MessageBase class has a method for packing itself into bytes, and it sticks a message identifier (the class ID) on the front. When it comes to unpacking it at the other end, I use the identifier to know how to unpack the bytes. This results in some easy to pack/unpack messages and very little overhead (few bytes for ID, then just class property values).
Currently I hard code an ID number in the classes, but it doesn't seem like the best way of doing things.
EDIT3: Here is my code after implementing the accepted answer.
public class MessageBase
{
    public MessageID id { get { return GetID(); } }
    private MessageID cacheId;
    private MessageID GetID()
    {
        // Check if cacheId hasn't been initialised
        // (MessageID is assumed to be a class wrapping a uint, with value equality)
        if (cacheId == null)
        {
            // Hash the class name
            using (MD5 md5 = MD5.Create())
            {
                byte[] md5Bytes = md5.ComputeHash(Encoding.UTF8.GetBytes(GetType().AssemblyQualifiedName));
                // Convert the first four bytes into a uint32, create the MessageID from it and store it in the cache
                cacheId = new MessageID(BitConverter.ToUInt32(md5Bytes, 0));
            }
        }
        // Return the cached id
        return cacheId;
    }
}
public class Protocol
{
    private Dictionary<Type, MessageID> messageTypeToId = new Dictionary<Type, MessageID>();
    private Dictionary<MessageID, Type> idToMessageType = new Dictionary<MessageID, Type>();
    private Dictionary<MessageID, Action<MessageBase>> handlers = new Dictionary<MessageID, Action<MessageBase>>();
    public Protocol()
    {
        // Create a list of all classes in this namespace that are a subclass of MessageBase
        IEnumerable<Type> messageClasses = from t in Assembly.GetExecutingAssembly().GetTypes()
                                           where t.Namespace == GetType().Namespace && t.IsSubclassOf(typeof(MessageBase))
                                           select t;
        // Iterate through the list of message classes, and store their type and id in the dicts
        foreach (Type messageClass in messageClasses)
        {
            // id is an instance property, so create an instance to read it (assumes a parameterless constructor)
            MessageID id = ((MessageBase)Activator.CreateInstance(messageClass)).id;
            messageTypeToId[messageClass] = id;
            idToMessageType[id] = messageClass;
        }
    }
}
Given that you can get a Type by calling GetType on the instance, you can easily cache the results. That reduces the problem to working out how to generate an ID for each type. You'd then call something like:
int id = typeIdentifierCache.GetIdentifier(foo1.GetType());
... or make GetIdentifier accept object and it can call GetType(), leaving you with
int id = typeIdentifierCache.GetIdentifier(foo1);
At that point, the detail is all in the type identifier cache.
A simple option would be to take a hash (e.g. SHA-256, for stability and making it very unlikely that you'll encounter collisions) of the fully-qualified type name. To prove that you have no collisions, you could easily write a unit test that runs over all the type names in the assembly and hashes them, then checks there are no duplicates. (Even that might be overkill, given the nature of SHA-256.)
This is all assuming that the types are in a single assembly. If you need to cope with multiple assemblies, you may want to hash the assembly-qualified name instead.
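The type identifier cache described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the answerer's implementation: the class name TypeIdentifierCache, the choice to truncate the SHA-256 hash to its first four bytes to obtain an int, and the FindCollisions helper are all hypothetical. Truncating to 32 bits makes collisions more likely than with the full hash, which is why the duplicate check is included.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical cache: one stable int per type, computed once from the full type name.
public static class TypeIdentifierCache
{
    private static readonly ConcurrentDictionary<Type, int> Cache =
        new ConcurrentDictionary<Type, int>();

    public static int GetIdentifier(Type type)
    {
        return Cache.GetOrAdd(type, t => Compute(t));
    }

    private static int Compute(Type type)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] h = sha.ComputeHash(Encoding.UTF8.GetBytes(type.FullName));
            // Assemble the int by hand (big-endian) so it does not depend on machine byte order.
            return (h[0] << 24) | (h[1] << 16) | (h[2] << 8) | h[3];
        }
    }

    // Returns groups of distinct types that share an identifier; empty when there are none.
    public static List<Type[]> FindCollisions(IEnumerable<Type> types)
    {
        return types.Distinct()
                    .GroupBy(t => GetIdentifier(t))
                    .Where(g => g.Count() > 1)
                    .Select(g => g.ToArray())
                    .ToList();
    }
}
```

A unit test in the spirit of the paragraph above would pass the assembly's message types to FindCollisions and assert that the result is empty.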
Here is one suggestion. I have used a SHA-256 byte array, which is guaranteed to be a fixed size and astronomically unlikely to have a collision. That may well be overkill; you can easily substitute something smaller. You could also use the AssemblyQualifiedName rather than FullName if you need to worry about version differences or the same class name in multiple assemblies.
Firstly, here are all my usings:
using System;
using System.Collections.Concurrent;
using System.Text;
using System.Security.Cryptography;
Next, a static cached type hasher object to remember the mapping between your types and the resulting byte arrays. You don't need the Console.WriteLines below, they are just there to demonstrate that you are not computing it over and over again.
public static class TypeHasher
{
    private static readonly ConcurrentDictionary<Type, byte[]> cache = new ConcurrentDictionary<Type, byte[]>();
    public static byte[] GetHash(Type type)
    {
        byte[] result;
        if (!cache.TryGetValue(type, out result))
        {
            Console.WriteLine("Computing Hash for {0}", type.FullName);
            // The hasher is disposable; release it once the hash is computed
            using (SHA256 sha = SHA256.Create())
            {
                result = sha.ComputeHash(Encoding.UTF8.GetBytes(type.FullName));
            }
            cache.TryAdd(type, result);
        }
        else
        {
            // Not actually required, but shows that hashing is only done once per type
            Console.WriteLine("Using cached Hash for {0}", type.FullName);
        }
        return result;
    }
}
Next, an extension method on object so that you can ask for anything's id. Of course if you have a more suitable base class, it doesn't need to go on object per se.
public static class IdExtension
{
public static byte[] GetId(this object obj)
{
return TypeHasher.GetHash(obj.GetType());
}
}
Next, here are some random classes
public class A
{
}
public class ChildOfA : A
{
}
public class B
{
}
And finally, here is everything put together.
public class Program
{
public static void Main()
{
A a1 = new A();
A a2 = new A();
B b1 = new B();
ChildOfA coa = new ChildOfA();
Console.WriteLine("a1 hash={0}", Convert.ToBase64String(a1.GetId()));
Console.WriteLine("b1 hash={0}", Convert.ToBase64String(b1.GetId()));
Console.WriteLine("a2 hash={0}", Convert.ToBase64String(a2.GetId()));
Console.WriteLine("coa hash={0}", Convert.ToBase64String(coa.GetId()));
}
}
Here is the console output
Computing Hash for A
a1 hash=VZrq0IJk1XldOQlxjN0Fq9SVcuhP5VWQ7vMaiKCP3/0=
Computing Hash for B
b1 hash=335w5QIVRPSDS77mSp43if68S+gUcN9inK1t2wMyClw=
Using cached Hash for A
a2 hash=VZrq0IJk1XldOQlxjN0Fq9SVcuhP5VWQ7vMaiKCP3/0=
Computing Hash for ChildOfA
coa hash=wSEbCG22Dyp/o/j1/9mIbUZTbZ82dcRkav4olILyZs4=
On the other side, you would use reflection to iterate all of the types in your library and store a reverse dictionary of hash to type.
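The reverse dictionary mentioned above might look something like this sketch. The names TypeHashLookup, KeyFor, Register and TryResolve are hypothetical. One assumption worth calling out: a byte[] used directly as a dictionary key is compared by reference, so the sketch keys on the Base64 text of the hash instead.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Security.Cryptography;
using System.Text;

// Hypothetical receiving-side lookup: hash bytes -> Type.
public static class TypeHashLookup
{
    // byte[] keys compare by reference, so the Base64 text of the hash is used as the key.
    private static readonly Dictionary<string, Type> ByHash = new Dictionary<string, Type>();

    // Same hashing scheme as the sender: SHA-256 over the UTF-8 bytes of FullName.
    public static byte[] HashFor(Type type)
    {
        using (SHA256 sha = SHA256.Create())
        {
            return sha.ComputeHash(Encoding.UTF8.GetBytes(type.FullName));
        }
    }

    // Registers every type in the given assembly under its hash.
    public static void Register(Assembly assembly)
    {
        foreach (Type t in assembly.GetTypes())
        {
            if (t.FullName != null)
            {
                ByHash[Convert.ToBase64String(HashFor(t))] = t;
            }
        }
    }

    // Returns true and sets 'type' when a registered type has this hash.
    public static bool TryResolve(byte[] hash, out Type type)
    {
        return ByHash.TryGetValue(Convert.ToBase64String(hash), out type);
    }
}
```

After calling Register once at startup, the unpacking code can read the 32 hash bytes from the front of a message and call TryResolve to find the class to instantiate.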
I have not seen you answer whether the same value needs to persist between different runs, but if all you need is a unique ID for a class, then use the simple built-in GetHashCode method:
class BaseClass
{
public int ClassId() => GetType().GetHashCode();
}
If you are worried about the performance of multiple calls to GetHashCode(), then first, don't: that is a ridiculous micro-optimization. But if you insist, store the result.
GetHashCode() is fast, that is its entire purpose, as a fast way to compare values in a hash.
EDIT:
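After doing some tests, the same hash code was returned between different runs using this method. I did not test after altering the classes, though, and I am not aware of exactly how a Type is hashed. The documentation for Object.GetHashCode does not promise stable values across runs, so treat this as an observation rather than a guarantee.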
Edit: all answers below (as at 19th Dec '16) are useful in making a decision. I accepted the most thorough answer to my question; but in the end chose to simply hash the file.
I am caching objects and using the assembly version as part of the key to invalidate the cached objects every time the build changes. This is inefficient because the actual classes of the cached objects rarely change and remain valid across builds.
How can I instead use a hash of the specific class signature (basically all properties) for the key, such that it only changes when the class itself changes?
I can think of a somewhat complicated way using reflection, but I wonder if there is a simple trick I'm missing or any compile time mechanism.
Thanks!
E.g. Signature of Foo --> #ABCD
public class Foo {
public string Bar {get; set;}
}
New signature of Foo (property type changed) --> #WXYZ
public class Foo {
public char[] Bar {get; set;}
}
As others have pointed out, it is dangerous to do something like that because a signature doesn't define the logic behind it. That being said:
This is an extensible approach:
The method basically uses reflection to crawl through all properties of your type.
It then gets some specific values of those properties and calls ToString() on them.
Those values are appended to a string and GetHashCode() will be used on that string.
private int GetTypeHash<T>()
{
var propertiesToCheck = typeof(T).GetProperties();
if(propertiesToCheck == null || propertiesToCheck.Length == 0)
return 0;
StringBuilder sb = new StringBuilder();
foreach(var propertyToCheck in propertiesToCheck)
{
//Some simple things that could change:
sb.Append((int)propertyToCheck.Attributes);
sb.Append(propertyToCheck.CanRead);
sb.Append(propertyToCheck.CanWrite);
sb.Append(propertyToCheck.IsSpecialName);
sb.Append(propertyToCheck.Name);
sb.Append(propertyToCheck.PropertyType.AssemblyQualifiedName);
//It might be an index property
var indexParams = propertyToCheck.GetIndexParameters();
if(indexParams != null && indexParams.Length != 0)
{
sb.Append(indexParams.Length);
}
//It might have custom attributes
var customAttributes = propertyToCheck.CustomAttributes;
if(customAttributes != null)
{
foreach(var cusAttr in customAttributes)
{
sb.Append(cusAttr.AttributeType.AssemblyQualifiedName);
}
}
}
return sb.ToString().GetHashCode();
}
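One caveat with the method above is that string.GetHashCode() is not guaranteed to return the same value in different processes or runtime versions, which matters for a cache key that has to survive across builds. The following is a sketch of the same reflection idea with a SHA-256 digest instead; the name SignatureHasher is hypothetical, and it only considers public property names and types, which is an assumption about what counts as the "signature".

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: a process-independent fingerprint of a type's public properties.
public static class SignatureHasher
{
    public static string Compute(Type type)
    {
        // One "Name:Type" token per public property, sorted ordinally so the
        // result does not depend on the order reflection returns them in.
        var tokens = type.GetProperties()
                         .Select(p => p.Name + ":" + p.PropertyType)
                         .OrderBy(s => s, StringComparer.Ordinal);
        string signature = string.Join(";", tokens);

        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(signature));
            // Hex string without separators: 64 characters for SHA-256.
            return BitConverter.ToString(hash).Replace("-", "");
        }
    }
}
```

With this, SignatureHasher.Compute(typeof(Foo)) stays the same until a public property of Foo is added, removed, renamed or retyped. Changes to fields, methods or property bodies are not detected.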
You can hash the whole class file and use that as a key. When the file changes, the hash will change and that will meet your need
You can use the public properties of the class and generate a hash based on the name and type of each property:
int ComputeTypeHash<T>()
{
return typeof(T).GetProperties()
.SelectMany(p => new[] { p.Name.GetHashCode(), p.PropertyType.GetHashCode() })
.Aggregate(17, (h, x) => unchecked(h * 23 + x));
}
ComputeTypeHash<Foo_v1>().Dump(); // 1946663838
ComputeTypeHash<Foo_v2>().Dump(); // 1946663838
ComputeTypeHash<Foo_v3>().Dump(); // 1985957629
public class Foo_v1
{
public string Bar { get; set; }
}
public class Foo_v2
{
public string Bar { get; set; }
}
public class Foo_v3
{
public char[] Bar { get; set; }
}
Doing something like this is dangerous as you (or someone else) could be introducing logic into the properties themselves at some point. It's also possible that the properties make internal calls to other methods that do change (among other things). You won't be detecting changes that go beyond the signature so you are leaving the door open to disaster.
If the group of classes you refer to rarely changes, consider moving them out of the main assembly and into their own one, or even breaking them down into more than one assembly if that makes sense. That way their assemblies will not change versions and there will be no cache refresh.
In his answer to "For which scenarios is protobuf-net not appropriate?", Marc mentions:
jagged arrays / nested lists without intermediate types aren't OK - you can shim this by introducing an intermediate type in the middle
I'm hoping this suggests there is a way to do it without changing my underlying code, maybe using a surrogate?
Has anybody found a good approach to serializing/deserializing a nested/jagged array?
At the current time, it would require (as the message suggests) changes to your model. However, in principle this is something that the library could do entirely in its own imagination; that is simply code that I haven't written / tested yet. So it depends how soon you need it... I can take a look at it, but I can't guarantee any particular timescale.
A solution might be to serialize an intermediate type, and use a getter/setter to hide it from the rest of your code.
Example:
List<double[]> _nestedArray ; // The nested array I would like to serialize.
[ProtoMember(1)]
private List<ProtobufArray<double>> _nestedArrayForProtoBuf // Never used elsewhere
{
get
{
if (_nestedArray == null) // ( _nestedArray == null || _nestedArray.Count == 0 ) if the default constructor instantiates it
return null;
return _nestedArray.Select(p => new ProtobufArray<double>(p)).ToList();
}
set
{
_nestedArray = value.Select(p => p.MyArray).ToList();
}
}
[ProtoContract]
public class ProtobufArray<T> // The intermediate type
{
[ProtoMember(1)]
public T[] MyArray;
public ProtobufArray()
{ }
public ProtobufArray(T[] array)
{
MyArray = array;
}
}
To give an idea of my requirement, consider these classes -
class A { }
class B {
String m_sName;
public String Name {
get { return m_sName; }
set { m_sName = value; }
}
int m_iVal;
public int Val {
get { return m_iVal; }
set { m_iVal = value; }
}
A m_objA;
public A AObject {
get { return m_objA; }
set { m_objA = value; }
}
}
Now, I need to identify the classes of the objects passed to a function
void MyFunc(object obj) {
    Type type = obj.GetType();
    foreach (PropertyInfo pi in type.GetProperties()) {
        Type propertyType = pi.PropertyType;
        if (propertyType.IsClass) { //I need objects only
            if (!propertyType.IsGenericType && propertyType.FullName.ToLower() != "system.string") {
                object _obj = pi.GetValue(obj, null);
                //do something
            }
        }
    }
}
I don't like this piece of code -
if (!propertyType.IsGenericType && propertyType.FullName.ToLower() != "system.string") {
because then I have to filter out classes like System.Int16, System.Int32, System.Boolean and so on.
Is there an elegant way through which I can find out if the object is of a class defined by me and not of system provided basic classes?
One possible approach would be to use the Type.Assembly property and filter out anything that is not declared in one of your assemblies. The drawback of this approach is that you need to know all your assemblies at execution time, which might be hard in certain (not as common) scenarios.
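A minimal sketch of the assembly-based filter, assuming you can register your own assemblies at startup. The names UserTypeFilter, Register and IsUserDefined are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical filter: a type counts as "user defined" when its assembly was registered.
public static class UserTypeFilter
{
    private static readonly HashSet<Assembly> OwnAssemblies = new HashSet<Assembly>();

    // Call once per assembly that contains your own classes.
    public static void Register(Assembly assembly)
    {
        OwnAssemblies.Add(assembly);
    }

    // True when the type is declared in one of the registered assemblies.
    public static bool IsUserDefined(Type type)
    {
        return OwnAssemblies.Contains(type.Assembly);
    }
}
```

Inside the property loop, the string comparison would then be replaced by a call such as UserTypeFilter.IsUserDefined(pi.PropertyType), after registering, for example, typeof(A).Assembly.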
There isn't really a reliable way. One thing that comes to mind is to look at the assembly in which the given type is defined (type.Assembly) and compare this against a list of known assemblies.
As far as I know, there is no way to tell whether a class is from the BCL or is a user-defined class, but maybe you could just cache some assembly information from some well-known framework DLLs.
You could cycle through all the classes in mscorlib.dll, put them into a List, and then check your class names against that list.
You could have a look at the PublicKeyToken part of the type's AssemblyQualifiedName (it is also part of the full name of the type's Assembly). But you would have to gather up the different tokens used by the framework for different versions of the runtime and compare against those.
The easiest way, if you have the possibility, is to mark your own classes with an attribute that you can check for (instead of checking for generics and the name of the type).
I've got a cheap and quick solution that might work:
if( type.IsClass && ! type.IsSealed )
The System.String object is a class but it is also sealed against inheritance. This works as long as you aren't using sealed classes in your code.
In .NET, how do I fetch an object's name in the declaring type? For example...
public static void Main()
{
Information dataInformation = new Information();
}
public class Information
{
//Constructor
public Information()
{
//Can I fetch name of object i.e. "dataInformation" declared in Main function
//I want to set the object's Name property = dataInformation here, because it is the name used in declaring that object.
}
public string Name { get; set; }
}
As far as the CLR goes, there's not really a way to determine an object's name. That sort of information is stored (to some extent) in the debugging information and the assembly, but it's not used at runtime. Regardless, the object you're referring to is just a bunch of bytes in memory. It could have multiple references to it with multiple names, so even if you could get the names of all the variables referencing the object, it would be impossible to programmatically determine which one you're looking to use.
Long story short: you can't do that.
That is the variable name, not the object name. It also poses the question: what is the name here:
Information foo, bar;
foo = bar = new Information();
You can't do this for constructors etc; in limited scenarios it is possible to get a variable name via Expression, if you really want:
public static void Main()
{
Information dataInformation = new Information();
Write(() => dataInformation);
}
static void Write<T>(Expression<Func<T>> expression)
{
MemberExpression me = expression.Body as MemberExpression;
if (me == null) throw new NotSupportedException();
Console.WriteLine(me.Member.Name);
}
Note that this relies on the capture implementation, etc - and is generally cheeky.
I don't think this is possible.
But in the first place, why do you need something like this?
In my experience, if you need something weird from a compiler or a language that it does not offer, then (most often) it means that there is something wrong with the approach or the logic.
Please reconsider why you are trying to achieve this.