How is Reflection implemented in C#? - c#

I got curious as to where Type.GetType() is implemented, so I took a peek at the assembly and noticed Type.GetType() calls base.GetType() and since Type inherits from MemberInfo I took a look and it is defined as _MemberInfo.GetType() which returns this.GetType(). Since I cannot find the actual code that shows how C# can get type information I would like to know:
How does the CLR get Type and MemberInfo from objects at Runtime?

The ACTUAL source for .NET Framework 2.0 is available on the internet (for educational purposes) here: http://www.microsoft.com/en-us/download/details.aspx?id=4917
This is the C# Language implementation. You can use 7zip to unpack it. You will find the reflection namespace here (relatively):
.\sscli20\clr\src\bcl\system\reflection
I am digging for the specific implementation you are asking about, but this is a good start.
UPDATE: Sorry, but I think its a dead end. Type.GetType() calls to the base implementation which comes from System.Object. If you inspect that codefile (.\sscli20\clr\src\bcl\system\object.cs) you will find the method is extern (see code below). Further inspect could uncover the implementation, but its not in the BCL. I suspect it will be in C++ code somewhere.
// Returns a Type object which represent this object instance.
//
[MethodImplAttribute(MethodImplOptions.InternalCall)]
public extern Type GetType();
UPDATE (AGAIN): I dug deeper and found the answer in the implementation of the CLR virtual machine itself. (Its in C++).
The first piece of puzzle is here:
\sscli20\clr\src\vm\ecall.cpp
Here we see the code that maps the external call to an C++ function.
FCFuncStart(gObjectFuncs)
FCIntrinsic("GetType", ObjectNative::GetClass, CORINFO_INTRINSIC_Object_GetType)
FCFuncElement("InternalGetHashCode", ObjectNative::GetHashCode)
FCFuncElement("InternalEquals", ObjectNative::Equals)
FCFuncElement("MemberwiseClone", ObjectNative::Clone)
FCFuncEnd()
Now, we need to go find ObjectNative::GetClass ... which is here:
\sscli20\clr\src\vm\comobject.cpp
and here is the implementation of GetType:
FCIMPL1(Object*, ObjectNative::GetClass, Object* pThis)
{
CONTRACTL
{
THROWS;
SO_TOLERANT;
DISABLED(GC_TRIGGERS); // FCallCheck calls ForbidenGC now
INJECT_FAULT(FCThrow(kOutOfMemoryException););
SO_TOLERANT;
MODE_COOPERATIVE;
}
CONTRACTL_END;
OBJECTREF objRef = ObjectToOBJECTREF(pThis);
OBJECTREF refType = NULL;
TypeHandle typeHandle = TypeHandle();
if (objRef == NULL)
FCThrow(kNullReferenceException);
typeHandle = objRef->GetTypeHandle();
if (typeHandle.IsUnsharedMT())
refType = typeHandle.AsMethodTable()->GetManagedClassObjectIfExists();
else
refType = typeHandle.GetManagedClassObjectIfExists();
if (refType != NULL)
return OBJECTREFToObject(refType);
HELPER_METHOD_FRAME_BEGIN_RET_ATTRIB_2(Frame::FRAME_ATTR_RETURNOBJ, objRef, refType);
if (!objRef->IsThunking())
refType = typeHandle.GetManagedClassObject();
else
refType = CRemotingServices::GetClass(objRef);
HELPER_METHOD_FRAME_END();
return OBJECTREFToObject(refType);
}
FCIMPLEND
One last thing, the implementation of GetTypeHandle along with some other supporting functions can be found in here:
\sscli20\clr\src\vm\object.cpp

The most significant parts of reflection are implemented as part of the CLI itself. As such, you could look at either the MS CLI reference source (aka "Rotor"), or the mono source. But: it will mostly be C/C++. The public API implementation details (MethodInfo, Type etc) may be C#.

It might not answer you question directly. However, here is a little outline of how managed code knows everything about types.
Whenever you compile code the compiler analyzes/parses the source files and collects information it encounters. For example take a look at class below.
class A
{
public int Prop1 {get; private set;}
protected bool Met2(float input) {return true;}
}
The compiler can see that this is an internal class with two members. Member one is a property of type int with private setter. Member 2 is a protected method with name Met2 and type boolean that takes float input (input name is 'input'). So, it has all this information.
It stores this information in the assembly. There are a couple of tables. For example classes (types) all leave in one table, methods live in another table. Think in turms of SQL tables, though they are definitely are not.
When a user (developer) wants to know information about a type it calls GetType method. This method relies on objects hidden field - type object pointer. This object is basically a pointer to a class table. Each class table will have a pointer to the first method in methods table. Each method record will have a pointer to the first parameter in the parameters table.
PS: this mechamism is key to making .NET assemblies more secure. You cannot replace pointers to methods. It will break the signature of the assebmly.
JIT compilation relies heavily on this tables as well

As #GlennFerrieLive points out, the call to GetType is an InternalCall which means the implementation is within the CLR itself and not in any of the BCL.
My understanding is that the internal CLR method takes the runtime type information from the this pointer, which basically amounts to the name of the type. It then look up the complete type information from the metadata present in all loaded assemblies (presumably, in the current appdomain), which is what makes reflection rather expensive. The metadata area is basically a database of all the types and members present in the assembly and it constructs an instance of Type or Method|Property|FieldInfo from this data.

Related

In C# and ECMA-CIL, can a struct-instantiated generic be implemented using boxing?

ECMA-CIL allows generic instances to actually yield a different implementation of the generic definition when instantiated. The instantiation can be specialized based on the chosen generic arguments.
Is there any case where a generic may behave differently if instantiated by a struct instead of an object reference? This is a question regarding semantics; I am not talking about performance.
In other words, could a naive implementation of ECMA-CIL decide to implement struct-instantiated generics as boxed values (as in Java)?
I read ECMA-CIL, but I'm still not sure about this. Any feedback is more than appreciated. Although I'm particularly interested in what happens at the bytecode level, an answer from the C# language perspective is also valuable.
Here's a simulated boxed value type in C#:
public class Boxed<T> where T : struct
{
public T Value; // do not assign to!
public Boxed(T value) => Value = value;
}
This is the best we can do at .NET level, since there is no native way to make a boxed value type reference (C++/CLI uses a tagged object to specify that). This is also more or less equivalent to System.Runtime.CompilerServices.StrongBox<T>.
Is there any case where a generic may behave differently if instantiated by a struct instead of an object reference? This is a question regarding semantics; I am not talking about performance.
Of course, using Boxed<T> means that default(Boxed<T>) is null for example. Here is a situation where this could be an issue:
public static class GlobalVariable<T>
{
public T Value;
}
If .NET actually implemented GlobalVariable<int> using GlobalVariable<Boxed<int>>, the Value field would contain null instead of 0. A conforming CLI implementation would have to implicitly use Value = new Boxed<T>(default(T)); when the type is instantiated, or modify ldfld (or ldflda) to create such an instance there if there is null (wreaking havoc for threading and readonly).
Another issue is copy semantics of value types. Each opcode like ldloc or similar assumes that a value type would be copied and mutating the value of the target should not affect the source. For example:
public static class GlobalVariable<T>
{
public T Value;
public delegate void Mutator(ref T value);
public static T MutateCopy(Mutator mutator)
{
var copy = Value;
mutator(ref copy);
return copy;
}
}
Calling something like GlobalVariable<Boxed<ValueTuple<int>>>.MutateCopy would pose an issue (if the runtime actually translated mutator(ref copy) to something like mutator(ref copy.Value) in order to call the delegate) since it would access the same instance. The runtime would have to "clone" Value as an object during the assignment in order to fix it.
Summed up, a value type type argument *could* be implemented using boxing, but without any additional special treatment, you get the same issues you would get in Java when using wrapper classes. It could work without mutable value types or strict null conversion checks (and potentially without byref parameters), but anything more than that will require additional (and complicated) changes to the implementation or languages. For an example of such a thing, see "unboxed" reference types in C++/CLI (i.e. reference types without the hat ^, which have value semantics).

Datatype of Methods

Question
How does a delegate store a reference to a function? The source code appears to refer to it as an Object, and the manner in which it invokes the method seems redacted from the source code. Can anyone explain how C# is handling this?
Original Post
It seems I'm constantly fighting the abstractions C# imposes on its programmers. One that's been irking me is the obfuscation of Functions/Methods. As I understand it, all methods are in fact anonymous methods assigned to properties of a class. This is the reason why no function is prefixed by a datatype. For example...
void foo() { ... }
... would be written in Javascript as...
Function foo = function():void { ... };
In my experience, Anonymous functions are typically bad form, but here it's replete throughout the language standard. Because you cannot define a function with its datatype (and apparently the implication/handling is assumed by the compiler), how does one store a reference to a method if the type is never declared?
I'm trying very hard to avoid Delegates and its variants (Action & Func), both because...
it is another abstraction from what's actually happening
the unnecessary overhead required to instantiate these classes (which in turn carry their own pointers to the methods being called).
Looking at the source code for the Delegate.cs, it appears to refer to the reference of a function as simply Object (see lines 23-25).
If these really are objects, how are we calling them? According to the delegate.cs trail, it dead-ends on the following path:
Delegate.cs:DynamicInvoke() > DynamicInvokeImpl() > methodinfo.cs:UnsafeInvoke() > UnsafeInvokeInternal() > RuntimeMethodHandle.InvokeMethod() > runtimehandles.cs:InvokeMethod()
internal extern static object InvokeMethod(object target, object[] arguments, Signature sig, bool constructor);
This really doesn't explain how its invoked if indeed the method is an object. It feels as though this is not code at all, and the actual code called has been redacted from source repository.
Your help is appreciated.
Response to Previous Comments
#Amy: I gave an example immediately after that statement to explain what I meant. If a function were prefixed by a datatype, you could write a true anonymous function, and store it as a property to an Object such as:
private Dictionary<string, Function> ops = new Dictionary<string, Function> {
{"foo", int (int a, int b) { return a + b } }
};
As it stands, C# doesn't allow you to write true anonymous functions, and walls that functionality off behind Delegates and Lambda expressions.
#500 Internal server error: I already explained what I was trying to do. I even bolded it. You assume there's any ulterior motive here; I'm simply trying to understand how C# stores a reference to a method. I even provided links to the source code so that others could read the code for themselves and help answer the question.
#Dialecticus: Obviously if I already found the typical answer on Google, the only other place to find the answer I'm looking for would be here. I realize this is outside the knowledge of most C# developers, and that's why I've provided the source code links. You don't have to reply if you don't know the answer.
While I'm not fully understanding your insights about "true anonymous functions", "not prefixed by a data type" etc, I can explain you how applications written in C# call methods.
First of all, there is no such a thing "function" in C#. Each and every executable entity in C# is in fact a method, that means, it belongs to a class. Even if you define lambdas or anonymous functions like this:
collection.Where(item => item > 0);
the C# compiler creates a compiler-generated class behind the scenes and puts the lambda body return item > 0 into a compiler-generated method.
So assuming you have this code:
class Example
{
public static void StaticMethod() { }
public void InstanceMethod() { }
public Action Property { get; } = () => { };
}
static class Program
{
static void Main()
{
Example.StaticMethod();
var ex = new Example();
ex.InstanceMethod();
ex.Property();
}
}
The C# compiler will create an IL code out of that. The IL code is not executable right away, it needs to be run in a virtual machine.
The IL code will contain a class Example with two methods (actually, four - a default constructor and the property getter method will be automatically generated) and a compiler-generated class containing a method whose body is the body of the lambda expression.
The IL code of Main will look like this (simplified):
call void Example::StaticMethod()
newobj instance void Example::.ctor()
callvirt instance void Example::InstanceMethod()
callvirt instance class [mscorlib]System.Action Example::get_Prop()
callvirt instance void [mscorlib]System.Action::Invoke()
Notice those call and callvirt instructions: these are method calls.
To actually execute the called methods, their IL code needs to be compiled into machine code (CPU instructions). This occurs in the virtual machine called .NET Runtime. There are several of them like .NET Framework, .NET Core, Mono etc.
A .NET Runtime contains a JIT (just-in-time) compiler. It converts the IL code to the actually executable code during the execution of your program.
When the .NET Runtime first encounters the IL code "call method StaticMethod from class Example", it first looks in the internal cache of already compiled methods. When there are no matches (which means this is the first call of that method), the Runtime asks the JIT compiler to create such a compiled-and-ready-to-run method using the IL code. The IL code is converted into a sequence of CPU operations and stored in the process' memory. A pointer to that compiled code is stored in the cache for future reuse.
This all will happen behind the call or callvirt IL instructions (again, simplified).
Once this happened, the Runtime is ready to execute the method. The CPU gets the compiled code's first operation address as the next operation to execute and goes on until the code returns. Then, the Runtime takes over again and proceeds with next IL instructions.
The DynamicInvoke method of the delegates does the same thing: it instructs the Runtime to call a method (after some additional arguments checks etc). The "dead end" you mention RuntimeMethodHandle.InvokeMethod is an intrinsic call to the Runtime directly. The parameters of this method are:
object target - the object on which the delegate invokes the instance method (this parameter).
object[] arguments - the arguments to pass to the method.
Signature sig - the actual method to call, Signature is an internal class that provides the connection between the managed IL code and native executable code.
bool constructor - true if this is a constructor call.
So in summary, methods are not represented as objects in C# (while you of course can have a delegate instance that is an object, but it doesn't represent the executable method, it rather provides an invokable reference to it).
Methods are called by the Runtime, the JIT compiler makes the methods executable.
You cannot define a global "function" outside of classes in C#. You could get a direct native pointer to the compiled (jitted) method code and probably even call it manually by directly manipulating own process' memory. But why?
You clearly misunderstand main differences between script languages, C/C++ and C#.
I guess the main difficulty is that there is no such thing as a function in C#. At all.
C#7 introduced the new feature "a local function", but that is not what a function in JS is.
All pieces of code are methods.
That name is intentionally different from function or a procedure to emphasize the fact that all executable code in C# belongs to a class.
Anonymous methods and lambdas are just a syntax sugar.
A compiler will generate a real method in the same (or a nested) class, where the method with anonymous method declaration belongs to.
This simple article explains it. You can take the examples, compile them and check the generated IL code yourself.
So all the methods (anonymous or not) do belong to a class. It's impossible to answer your updated question, besides saying It does not store a reference to a function, as there is no such thing in C#.
How does one store a reference to a method?
Depending on what you mean by reference, it can be either
An instance of MethodInfo class, used to reference reflection information for a method,
RuntimeMethodHandle (obtainable via RuntimeMethodInfo.MethodHandle) stores a real memory pointer to a JITed method code
A delegate, that is very different from just a memory pointer, but logically could be used to "pass a method reference to another method" .
I believe you are looking for the MethodInfo option, it has a MethodInfo.Invoke method which is very much alike Function..apply function in JS. You have already seen in the Delegate source code how that class is used.
If by "reference" you mean the C-style function pointer, it is in RuntimeMethodHandle struct. You should never use it without solid understanding how a particular .Net platform implementation and a C# compiler work.
Hopefully it clarifies things a bit.
A delegate is simply a pointer(memory location to jump to) to a method with the specified parameters and return type. Any Method that matches the signature(Parameters and return type) is eligible to fulfill the role, irrespective of the defined object. Anonymous simply means the delegate is not named.
Most times the type is implied(if it is not you will get a compiler error):
C# is a strongly typed language. That means every expression (including delegates) MUST have a return type(including void) as well as strongly typed parameters(if any). Generics were created to permit explicit types to be used within general contexts, such as Lists.
To put it another way, delegates are the type-safe managed version of C++ callbacks.
Delegates are helpful in eliminating switch statements by allowing the code to jump to the proper handler without testing any conditions.
A delegate is similar to a Closure in Javascript terminology.
In your response to Amy, you are attempting to equate a loosely typed language like JS, and a strongly typed language C#. In C# it is not possible to pass an arbitrary(loosely-typed) function anywhere. Lambdas and delegates are the only way to guarantee type safety.
I would recommend trying F#, if you are looking to pass functions around.
EDIT:
If you are trying to mimic the behavior of Javascipt, I would try looking at using inheritance through Interfaces. I can mimic multiple inheritance, and be type safe at the same time. But, be aware that it cannot fully supplant Javascript's dependency injection model.
As you probably found out C# doesn't have the concept of a function as in your JavaScript example.
C# is a statically typed language and the only way you can use function pointers is by using the built in types (Func,Action) or custom delegates.(I'm talking about safe,strongly typed pointers)
Javascript is a dynamic language that's why you can do what you describe
If you are willing to lose type safety, you can use the "dynamic" features of C# or refection to achieve what you want like in the following examples (Don't do this,use Func/Action)
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
namespace ConsoleApp1
{
class Program
{
private static Dictionary<string, Func<int, int, int>> FuncOps = new Dictionary<string, Func<int, int, int>>
{
{"add", (a, b) => a + b},
{"subtract", (a, b) => a - b}
};
//There are no anonymous delegates
//private static Dictionary<string, delegate> DelecateOps = new Dictionary<string, delegate>
//{
// {"add", delegate {} }
//};
private static Dictionary<string, dynamic> DynamicOps = new Dictionary<string, dynamic>
{
{"add", new Func<int, int, int>((a, b) => a + b)},
{"subtract", new Func<int, int, int>((a, b) => a - b)},
{"inverse", new Func<int, int>((a) => -a )} //Can't do this with Func
};
private static Dictionary<string, MethodInfo> ReflectionOps = new Dictionary<string, MethodInfo>
{
{"abs", typeof(Math).GetMethods().Single(m => m.Name == "Abs" && m.ReturnParameter.ParameterType == typeof(int))}
};
static void Main(string[] args)
{
Console.WriteLine(FuncOps["add"](3, 2));//5
Console.WriteLine(FuncOps["subtract"](3, 2));//1
Console.WriteLine(DynamicOps["add"](3, 2));//5
Console.WriteLine(DynamicOps["subtract"](3, 2));//1
Console.WriteLine(DynamicOps["inverse"](3));//-3
Console.WriteLine(ReflectionOps["abs"].Invoke(null, new object[] { -1 }));//1
Console.ReadLine();
}
}
}
one more example that you shouldn't use
delegate object CustomFunc(params object[] paramaters);
private static Dictionary<string, CustomFunc> CustomParamsOps = new Dictionary<string, CustomFunc>
{
{"add", parameters => (int) parameters[0] + (int) parameters[1]},
{"subtract", parameters => (int) parameters[0] - (int) parameters[1]},
{"inverse", parameters => -((int) parameters[0])}
};
Console.WriteLine(CustomParamsOps["add"](3, 2)); //5
Console.WriteLine(CustomParamsOps["subtract"](3, 2)); //1
Console.WriteLine(CustomParamsOps["inverse"](3)); //-3
I will provide a really short and simplified answer compared to the others. Everything in C# (classes, variables, properties, structs, etc) has a backed with tons of things your programs can hook into. This network of backend stuff slightly lowers the speed of C# when compared to "deeper" languages like C++, but also gives programmers a lot more tools to work with and makes the language easier to use. In this backend is included things like "garbage collection," which is a feature that automatically deletes objects from memory when there are no variables left that reference them. Speaking of reference, the whole system of passing objects by reference, which is default in C#, is also managed in the backend. In C#, Delegates are possible because of features in this backend that allow for something called "reflection."
From Wikipedia:
Reflection is the ability of a computer program to examine,
introspect, and modify its own structure and behavior at runtime.
So when C# compiles and it finds a Delegate, it is just going to make a function, and then store a reflective reference to that function in the variable, allowing you to pass it around and do all sorts of cool stuff with it. You aren't actually storing the function itself in the variable though, you are storing a reference, which is kinda like an address that points you to where the function is stored in RAM.

Marshalling Microsoft.VisualBasic.Collection to COM

I have a C# class that has a property(name is List) of type Microsoft.VisualBasic.Collection. We need to exposed this property to COM. For that, I was writing an interface for my class such that the property would marshaled as of UnmanagedType.IDispatch.
Something like this:
[DispId(0x68030000)]
Collection List { [DispId(0x68030000)] [return: MarshalAs(UnmanagedType.IDispatch)] get; }
This piece of code was earlier in VB and was being used by C++ as tye VT_DISPATCH. However, while building the C# library, I get the following error:
C:\Program Files
(x86)\MSBuild\14.0\bin\Microsoft.Common.CurrentVersion.targets(4335,5):
error MSB3212: The assembly "Name.dll" could not be converted to a
type library. Type library exporter encountered an error while
processing 'Namespace.InterfaceName.get_List(#0), ProjectName'. Error:
Error loading type library/DLL.
I read through few of the posts online which suggested that such errors might cause because of repetitive GUIDs. But that is not the case. I tried with multiple GUIDs. I don't feel this is an issue with any other attribute that I had set on my Interface since, I am able to marshal other properties and function calls ( some of them use primitive types and others use custom classes) successfully.
This is how it is consumed in the consuming C++ application:
LPDISPATCH result;
InvokeHelper(0x68030000, DISPATCH_PROPERTYGET, VT_DISPATCH, (void*)&result, NULL);
return result;
This has become a go live issue with the client and I really do not have much time to continue investigation since this is due tomorrow.
Appreciate any help!
I don't think Microsoft.VisualBasic.Collection is COM-visible. Therefore, you cannot use this type as a return value or parameter in a COM class or interface. However, ICollection (which Microsoft.VisualBasic.Collection implements) is COM visible. If that would suit your purposes, use that as the type of your property rather than Microsoft.VisualBasic.Collection.

Are the placeholders of Generics compiled as an actual data type? [duplicate]

I had thought that Generics in C# were implemented such that a new class/method/what-have-you was generated, either at run-time or compile-time, when a new generic type was used, similar to C++ templates (which I've never actually looked into and I very well could be wrong, about which I'd gladly accept correction).
But in my coding I came up with an exact counterexample:
static class Program {
static void Main()
{
Test testVar = new Test();
GenericTest<Test> genericTest = new GenericTest<Test>();
int gen = genericTest.Get(testVar);
RegularTest regTest = new RegularTest();
int reg = regTest.Get(testVar);
if (gen == ((object)testVar).GetHashCode())
{
Console.WriteLine("Got Object's hashcode from GenericTest!");
}
if (reg == testVar.GetHashCode())
{
Console.WriteLine("Got Test's hashcode from RegularTest!");
}
}
class Test
{
public new int GetHashCode()
{
return 0;
}
}
class GenericTest<T>
{
public int Get(T obj)
{
return obj.GetHashCode();
}
}
class RegularTest
{
public int Get(Test obj)
{
return obj.GetHashCode();
}
}
}
Both of those console lines print.
I know that the actual reason this happens is that the virtual call to Object.GetHashCode() doesn't resolve to Test.GetHashCode() because the method in Test is marked as new rather than override. Therefore, I know if I used "override" rather than "new" on Test.GetHashCode() then the return of 0 would polymorphically override the method GetHashCode in object and this wouldn't be true, but according to my (previous) understanding of C# generics it wouldn't have mattered because every instance of T would have been replaced with Test, and thus the method call would have statically (or at generic resolution time) been resolved to the "new" method.
So my question is this: How are generics implemented in C#? I don't know CIL bytecode, but I do know Java bytecode so I understand how Object-oriented CLI languages work at a low level. Feel free to explain at that level.
As an aside, I thought C# generics were implemented that way because everyone always calls the generic system in C# "True Generics," compared to the type-erasure system of Java.
In GenericTest<T>.Get(T), the C# compiler has already picked that object.GetHashCode should be called (virtually). There's no way this will resolve to the "new" GetHashCode method at runtime (which will have its own slot in the method-table, rather than overriding the slot for object.GetHashCode).
From Eric Lippert's What's the difference, part one: Generics are not templates, the issue is explained (the setup used is slightly different, but the lessons translate well to your scenario):
This illustrates that generics in C# are not like templates in C++.
You can think of templates as a fancy-pants search-and-replace
mechanism.[...] That’s not how generic types work; generic types are,
well, generic. We do the overload resolution once and bake in the
result. [...] The IL we’ve generated for the generic type already has
the method its going to call picked out. The jitter does not say
“well, I happen to know that if we asked the C# compiler to execute
right now with this additional information then it would have picked a
different overload. Let me rewrite the generated code to ignore the
code that the C# compiler originally generated...” The jitter knows
nothing about the rules of C#.
And a workaround for your desired semantics:
Now, if you do want overload resolution to be re-executed at runtime based on the runtime types of
the arguments, we can do that for you; that’s what the new “dynamic”
feature does in C# 4.0. Just replace “object” with “dynamic” and when
you make a call involving that object, we’ll run the overload
resolution algorithm at runtime and dynamically spit code that calls
the method that the compiler would have picked, had it known all the
runtime types at compile time.

Make type's instances non-storable

Is there a way to mark a type (or even better, an interface) so that no instances of it can be stored in a field (in a similar way to TypedReference and ArgIterator)?
In the same way, is there a way to prevent instances from being passed through anonymous methods and -- In general -- To mimic the behavior of the two types above?
Can this be done through ILDasm or more generally through IL editing? Since UnconstrainedMelody achieves normally unobtainable results through binary editing of a compiled assembly, maybe there's a way to "mark" certain types (or even better, abstract ones or marker interfaces) through the same approach.
I doubt it’s hardcoded in the compiler because the documentation for the error CS0610 states:
There are some types that cannot be used as fields or properties. These types include...
Which in my opinion hints that the set of types like those can be extended -- But I could be wrong.
I've searched a bit on SO and while I understand that throwing a compiler error programmatically can't be done, I could find no source stating that certain "special" types' behaviors couldn't be replicated.
Even if the question is mostly academic, there could be some usages for an answer. For example, it could be useful sometimes to be sure that a certain object's lifetime is constrained to the method block which creates it.
EDIT: RuntimeArgumentHandle is one more (unmentioned) non-storable type.
EDIT 2: If it can be of any use, it seems that the CLR treats those types in a different way as well, if not only the compiler (still assuming that the types are in no way different from others). The following program, for example, will throw a TypeLoadException regarding TypedReference*. I've adapted it to make it shorter but you can work around it all you want. Changing the pointer's type to, say, void* will not throw the exception.
using System;
unsafe static class Program
{
static TypedReference* _tr;
static void Main(string[] args)
{
_tr = (TypedReference*) IntPtr.Zero;
}
}
Okay. This isn't quite a complete analysis, but I suspect it's a sufficient one for the purposes of determining whether you can do this, even by twiddling IL - which, so far as I can tell, you can't.
I also, on looking at the decompiled version with dotPeek, couldn't see anything special about that particular type/those particular types in there, attribute-wise or otherwise:
namespace System
{
/// <summary>
/// Describes objects that contain both a managed pointer to a location and a runtime representation of the type that may be stored at that location.
/// </summary>
/// <filterpriority>2</filterpriority>
[ComVisible(true)]
[CLSCompliant(false)]
public struct TypedReference
{
So, that done, I tried creating such a class using System.Reflection.Emit:
namespace NonStorableTest
{
//public class Invalid
//{
// public TypedReference i;
//}
class Program
{
static void Main(string[] args)
{
AssemblyBuilder asmBuilder = AppDomain.CurrentDomain.DefineDynamicAssembly(new AssemblyName("EmitNonStorable"),
AssemblyBuilderAccess.RunAndSave);
ModuleBuilder moduleBuilder = asmBuilder.DefineDynamicModule("EmitNonStorable", "EmitNonStorable.dll");
TypeBuilder invalidBuilder = moduleBuilder.DefineType("EmitNonStorable.Invalid",
TypeAttributes.Class | TypeAttributes.Public);
ConstructorBuilder constructorBuilder = invalidBuilder.DefineDefaultConstructor(MethodAttributes.Public);
FieldBuilder fieldI = invalidBuilder.DefineField("i", typeof (TypedReference), FieldAttributes.Public);
invalidBuilder.CreateType();
asmBuilder.Save("EmitNonStorable.dll");
Console.ReadLine();
}
}
}
That, when you run it, throws a TypeLoadException, the stack trace of which points you to System.Reflection.Emit.TypeBuilder.TermCreateClass. So then I went after that with the decompiler, which gave me this:
[SuppressUnmanagedCodeSecurity]
[SecurityCritical]
[DllImport("QCall", CharSet = CharSet.Unicode)]
private static void TermCreateClass(RuntimeModule module, int tk, ObjectHandleOnStack type);
Pointing into the unmanaged parts of the CLR. At this point, not to be defeated, I dug into the shared sources for the reference version of the CLR. I won't go through all the tracing through that I did, to avoid bloating this answer beyond all reasonable use, but eventually, you end up in \clr\src\vm\class.cpp, in the MethodTableBuilder::SetupMethodTable2 function (which also appears to set up field descriptors), where you find these lines:
// Mark the special types that have embeded stack poitners in them
if (strcmp(name, "ArgIterator") == 0 || strcmp(name, "RuntimeArgumentHandle") == 0)
pClass->SetContainsStackPtr();
and
if (pMT->GetInternalCorElementType() == ELEMENT_TYPE_TYPEDBYREF)
pClass->SetContainsStackPtr();
This latter relating to information found in \src\inc\cortypeinfo.h, thus:
// This describes information about the COM+ primitive types
// TYPEINFO(enumName, className, size, gcType, isArray,isPrim, isFloat,isModifier)
[...]
TYPEINFO(ELEMENT_TYPE_TYPEDBYREF, "System", "TypedReference",2*sizeof(void*), TYPE_GC_BYREF, false, false, false, false)
(It's actually the only ELEMENT_TYPE_TYPEDBYREF type in that list.)
ContainsStackPtr() is then in turn used elsewhere in various places to prevent those particular types from being used, including in fields - from \src\vm\class.cp, MethodTableBuilder::InitializeFieldDescs():
// If it is an illegal type, say so
if (pByValueClass->ContainsStackPtr())
{
BuildMethodTableThrowException(COR_E_BADIMAGEFORMAT, IDS_CLASSLOAD_BAD_FIELD, mdTokenNil);
}
Anyway: to cut a long, long, long story short, it would seem to be the case that which types are non-storable in this way is effectively hard-coded into the CLR, and thus that if you want to change the list or provide an IL means to flag up types as non-storable, you're pretty much going to have to take Mono or the shared-source CLR and spin off your own version.
I couldn't find anything in the IL that indicated those types are in any way special. I do know that the compiler has many, many special cases, such as transforming int/Int32 into the internal type int32, or many things pertaining to the Nullable struct. I highly suspect these types are also special cases.
A possible solution would be Roslyn, which I expect would let you create such a constraint.

Categories