I've seen this code, which was used to show the reference value of an object:
static void Main(string[] args)
{
    string s1 = "ab";
    string s2 = "a" + "b";
    string s3 = new StringBuilder("a").Append("b").ToString();
    Console.WriteLine(GetMemoryAddress(s1));
    Console.WriteLine(GetMemoryAddress(s2));
    Console.WriteLine(GetMemoryAddress(s3));
}
static IntPtr GetMemoryAddress(object s1)
{
    unsafe
    {
        TypedReference tr = __makeref(s1);
        IntPtr ptr = **(IntPtr**)(&tr);
        return ptr;
    }
}
Result (as expected):
(I know that string interning kicks in here, but that's not the question).
Question:
Although it seems that it does do the job, is using __makeref the right way of getting the reference value in C#?
Or are there any situations in which this would fail?
Although it seems that it does do the job, is using __makeref the right way of getting the reference value in C#?
There is no "right way" of doing this in C# - it isn't something you're meant to try to do. In terms of what it is doing: this essentially relies on the internal layout of TypedReference and a type coercion. It'll work (as long as TypedReference doesn't change internally - for example, if the order of the Type and Value fields changes, it breaks), but... it is nasty.
There is a more direct approach; in IL, you can convert from a managed pointer to an unmanaged pointer silently. Which means you can do something nasty like:
unsafe delegate void* RefToPointer(object obj);

static RefToPointer GetRef { get; } = MakeGetRef();

static RefToPointer MakeGetRef()
{
    var dm = new DynamicMethod("evil", typeof(void*), new[] { typeof(object) });
    var il = dm.GetILGenerator();
    il.Emit(OpCodes.Ldarg_0); // load the object reference...
    il.Emit(OpCodes.Ret);     // ...and return it, reinterpreted as void*
    return (RefToPointer)dm.CreateDelegate(typeof(RefToPointer));
}
and now you can just do:
var ptr = new IntPtr(GetRef(o));
Console.WriteLine(ptr);
This is horrible, and you should never do it - and of course the GC can move things while you're not looking (or even while you are looking), but... it works.
Whether ref-emit is "better" than undocumented and unsupported language features like __makeref and type-coercion: is a matter of some debate. Hopefully purely academic debate!
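If what you actually need is an address that stays valid for longer than a single statement, the only supported route I know of is to pin the object first and use GCHandle.AddrOfPinnedObject. A minimal sketch of my own (note that this gives the address of the object's data, not its header, and only blittable objects such as primitive arrays and strings can be pinned):

using System;
using System.Runtime.InteropServices;

class PinnedAddressDemo
{
    static void Main()
    {
        // Pin the object so the GC cannot move it while we hold the address.
        byte[] data = { 1, 2, 3 };
        GCHandle handle = GCHandle.Alloc(data, GCHandleType.Pinned);
        try
        {
            IntPtr address = handle.AddrOfPinnedObject(); // address of the array data
            Console.WriteLine(address);
        }
        finally
        {
            handle.Free(); // always release the pin
        }
    }
}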
Related
I've finally understood the usage of the TypedReference.MakeTypedReference method, but why are its arguments so limited? The underlying private InternalMakeTypedReference(void* result, object target, IntPtr[] flds, RuntimeType lastFieldType) can do a lot more than MakeTypedReference, which requires the field array to be non-empty and the field types to be non-primitive.
I've written some sample code that shows what it is fully capable of:
// InternalMakeTypedReference is a private static method on TypedReference
private const BindingFlags flags = BindingFlags.NonPublic | BindingFlags.Static;
private static readonly MethodInfo InternalMakeTypedReferenceMethod = typeof(TypedReference).GetMethod("InternalMakeTypedReference", flags);
private static readonly Type InternalMakeTypedReferenceDelegateType = ReflectionTools.NewCustomDelegateType(InternalMakeTypedReferenceMethod.ReturnType, InternalMakeTypedReferenceMethod.GetParameters().Select(p => p.ParameterType).ToArray());
private static readonly Delegate InternalMakeTypedReference = Delegate.CreateDelegate(InternalMakeTypedReferenceDelegateType, InternalMakeTypedReferenceMethod);
public static unsafe void MakeTypedReference([Out] TypedReference* result, object target, params FieldInfo[] fields)
{
    IntPtr ptr = (IntPtr)result;
    IntPtr[] flds = new IntPtr[fields.Length];
    Type lastType = target.GetType();

    for (int i = 0; i < fields.Length; i++)
    {
        var field = fields[i];
        if (field.IsStatic)
        {
            throw new ArgumentException("Field cannot be static.", "fields");
        }
        flds[i] = field.FieldHandle.Value;
        lastType = field.FieldType;
    }

    InternalMakeTypedReference.DynamicInvoke(ptr, target, flds, lastType);
}
Unfortunately, actually calling it needs more hacks: it can't be invoked via MethodInfo.Invoke, and one parameter is of type RuntimeType, so the delegate type has to be generated dynamically (a DynamicMethod could also be used).
Now what can this do? It can access any field (whether of class, struct, or even primitive type) of any object, without limitations. Moreover, it can create a reference to a boxed value type:
object a = 98;
TypedReference tr;
InteropTools.MakeTypedReference(&tr, a);
Console.WriteLine(__refvalue(tr, int)); //98
__refvalue(tr, int) = 1;
Console.WriteLine(a); //1
So why have the developers seemingly senselessly decided to disallow this kind of usage, while this is obviously useful?
Blame Plato and his darn "theory of types"...
It is inherent in the nature of any ref (managed pointer) reference--including the new C# 7 ref local and ref return features--and, as you observe, TypedReference, that you can use it for both reading and writing to the target. Isn't that the whole point?
Now, because the CTS can't rule out either of those possibilities, strong typing requires that the Type of every ref be constrained from both above and below in the type hierarchy.
More formally, the Type is constrained to be the intersection of the polymorphic covariance and contravariance for which it would otherwise be eligible. Obviously, the result of this intersection collapses to a single Type, itself, which is henceforth invariant.
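To make that concrete, here is a small sketch of my own (not from the original answer) showing why a ref has to be invariant: covariance would break writes, contravariance would break reads, so only the exact type remains.

class Animal { }
class Cat : Animal { }

class RefInvarianceDemo
{
    static void Main()
    {
        Cat cat = new Cat();
        Animal animal = new Animal();

        // Covariance would break writes:
        //   ref Animal a = ref cat;   // (not allowed)
        //   a = new Animal();         // would store a non-Cat in a Cat variable

        // Contravariance would break reads:
        //   ref Cat c = ref animal;   // (not allowed)
        //   Cat x = c;                // would read a plain Animal out as a Cat

        // Only the exact type is allowed - the ref is invariant:
        ref Cat ok = ref cat;   // C# 7 ref local
        ok = new Cat();         // writing through the ref stays type-safe
    }
}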
So why have the developers seemingly senselessly decided to disallow this kind of usage, while this is obviously useful?
Because we don't need it if we have fields.
What you are doing here in a very complicated way is basically the following:
((Int32)a).m_value = 1;
Of course, in pure C# we cannot do this, because an assignment like ((Point)p).X = 1 fails with CS0445: Cannot modify the result of an unboxing conversion.
Not to mention that Int32.m_value is of type int, which is the Int32 struct again. You cannot create such a value type in C#: CS0523: Struct member causes a cycle in the struct layout.
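For reference, a minimal sketch of my own (not from the answer) of the two compiler errors mentioned above; the offending lines are left commented out so the snippet compiles:

struct Point
{
    public int X;

    // A struct cannot contain an instance field of its own type:
    // Point inner;   // CS0523: Struct member causes a cycle in the struct layout
}

class UnboxingDemo
{
    static void Main()
    {
        object p = new Point();

        // Unboxing produces a temporary copy, so assigning to it is rejected:
        // ((Point)p).X = 1;   // CS0445: Cannot modify the result of an unboxing conversion
    }
}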
MakeTypedReference actually returns a TypedReference for a FieldInfo. A slightly cleaner version of the unrestricted variant could be:
// If target contains desiredField, then returns it as a TypedReference;
// otherwise, returns a reference to the last field.
private static unsafe void MakeTypedReference(TypedReference* result, object target, FieldInfo desiredField = null)
{
    var flds = new List<IntPtr>();
    Type lastType = target.GetType();

    foreach (FieldInfo f in target.GetType().GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic))
    {
        flds.Add(f.FieldHandle.Value);
        lastType = f.FieldType;
        if (f == desiredField)
            break;
    }

    InternalMakeTypedReference.DynamicInvoke((IntPtr)result, target, flds.ToArray(), lastType);
}
So if target is an int, it returns a reference to the m_value field, which is the int value itself.
But if you are dealing with a FieldInfo anyway, it is much simpler to use its SetValue for the same effect:
object a = 98;
FieldInfo int32mValue = typeof(int).GetTypeInfo().GetDeclaredField("m_value");
int32mValue.SetValue(a, 1);
Console.WriteLine(a); // 1
If you really want to use a TypedReference (without the reflection API), then you can use it directly on the object and then access the boxed value via this reference. All you need to know is the memory layout of a managed object reference:
object a = 98;

// pinning is required to prevent the GC from relocating the object during the pointer operations
var objectPinned = GCHandle.Alloc(a, GCHandleType.Pinned);
try
{
    TypedReference objRef = __makeref(a);

    // objRef.Value -> object -> boxed content
    int* rawContent = (int*)*(IntPtr*)*(IntPtr*)&objRef;

    // A managed object reference points to the type handle
    // (that is, another pointer to the method table), which is
    // followed by the first field.
    if (IntPtr.Size == 4)
        rawContent[1] = 1;
    else
        rawContent[2] = 1;
}
finally
{
    objectPinned.Free();
}

Console.WriteLine(a); // 1
But in practice this is only a bit faster than the FieldInfo.SetValue version, mainly because of the cost of pinning the object.
This one's really an offshoot of this question, but I think it deserves its own answer.
According to section 15.13 of the ECMA-334 (on the using statement, below referred to as resource-acquisition):
Local variables declared in a resource-acquisition are read-only, and shall include an initializer. A compile-time error occurs if the embedded statement attempts to modify these local variables (via assignment or the ++ and -- operators) or pass them as ref or out parameters.
This seems to explain why the code below is illegal.
struct Mutable : IDisposable
{
    public int Field;
    public void SetField(int value) { Field = value; }
    public void Dispose() { }
}

using (var m = new Mutable())
{
    // This results in a compiler error.
    m.Field = 10;
}
But what about this?
using (var e = new Mutable())
{
    // This is doing exactly the same thing, but it compiles and runs just fine.
    e.SetField(10);
}
Is the above snippet undefined and/or illegal in C#? If it's legal, what is the relationship between this code and the excerpt from the spec above? If it's illegal, why does it work? Is there some subtle loophole that permits it, or is the fact that it works attributable only to mere luck (so that one shouldn't ever rely on the functionality of such seemingly harmless-looking code)?
I would read the standard in such a way that
using (var m = new Mutable())
{
    m = new Mutable();
}
is forbidden - for reasons that seem obvious.
Why it is not allowed for the struct Mutable beats me, because for a class the same code is legal and compiles fine (a reference type, I know).
Also, I do not see why changing the contents of the value type endangers the resource acquisition. Someone care to explain?
Maybe someone doing the syntax checking just misread the standard ;-)
Mario
I suspect the reason it compiles and runs is that SetField(int) is a function call, not an assignment or ref or out parameter call. The compiler has no way of knowing (in general) whether SetField(int) is going to mutate the variable or not.
This appears completely legal according to the spec.
And consider the alternatives. Static analysis to determine whether a given function call is going to mutate a value is clearly cost prohibitive in the C# compiler. The spec is designed to avoid that situation in all cases.
The other alternative would be for C# to not allow any method calls on value type variables declared in a using statement. That might not be a bad idea, since implementing IDisposable on a struct is just asking for trouble anyway. But when the C# language was first developed, I think they had high hopes for using structs in lots of interesting ways (as the GetEnumerator() example that you originally used demonstrates).
To sum it up
struct Mutable : IDisposable
{
    public int Field;
    public void SetField(int value) { Field = value; }
    public void Dispose() { }
}

class Program
{
    protected static readonly Mutable xxx = new Mutable();

    static void Main(string[] args)
    {
        // not allowed by the compiler
        //xxx.Field = 10;
        xxx.SetField(10);

        // prints out 0 !!!! <--- I do think that this is pretty bad
        System.Console.Out.WriteLine(xxx.Field);

        using (var m = new Mutable())
        {
            // This results in a compiler error.
            //m.Field = 10;
            m.SetField(10);

            // This prints out 10 !!!
            System.Console.Out.WriteLine(m.Field);
        }

        System.Console.In.ReadLine();
    }
}
So, in contrast to what I wrote above, I would recommend NOT using a function to modify a struct within a using block. This seems to work, but may stop working in the future.
Mario
This behavior is undefined. In The C# Programming Language (the annotated C# 4.0 specification), at the end of section 7.6.4 (Member Access), Peter Sestoft states:
The two bulleted points stating "if the field is readonly...then the result is a value" have a slightly surprising effect when the field has a struct type, and that struct type has a mutable field (not a recommended combination--see other annotations on this point).
He provides an example. I created my own example which displays more detail below.
Then, he goes on to say:
Somewhat strangely, if instead s were a local variable of struct type declared in a using statement, which also has the effect of making s immutable, then s.SetX() updates s.x as expected.
Here we see one of the authors acknowledge that this behavior is inconsistent. Per section 7.6.4, readonly fields are treated as values and do not change (copies change). Because section 8.13 tells us using statements treat resources as read-only:
the resource variable is read-only in the embedded statement,
resources in using statements should behave like readonly fields. Per the rules of 7.6.4 we should be dealing with a value not a variable. But surprisingly, the original value of the resource does change as demonstrated in this example:
// Sections relate to the C# 4.0 spec
class Test
{
    readonly S readonlyS = new S();

    static void Main()
    {
        Test test = new Test();

        // valid: we are changing the value of a copy of readonlyS, per the rules defined in 7.6.4
        test.readonlyS.SetX();
        Console.WriteLine(test.readonlyS.x); // outputs 0, because readonlyS is a value, not a variable
        //test.readonlyS.x = 0; // invalid

        using (S s = new S())
        {
            s.SetX(); // valid, changes the original value
            Console.WriteLine(s.x); // Surprisingly... outputs 2. Although s is supposed to be read-only like a readonly field, the behavior diverges.
            //s.x = 0; // invalid
        }
    }
}

struct S : IDisposable
{
    public int x;

    public void SetX()
    {
        x = 2;
    }

    public void Dispose()
    {
    }
}
The situation is bizarre. Bottom line: avoid creating readonly fields of mutable struct types.
I have a problem using a class made of structures.
Here's the basic definition:
using System;

struct Real
{
    public double real;

    public Real(double real)
    {
        this.real = real;
    }
}

class Record
{
    public Real r;

    public Record(double r)
    {
        this.r = new Real(r);
    }

    public void Test(double origval, double newval)
    {
        if (this.r.real == newval)
            Console.WriteLine("r = newval-test passed\n");
        else if (this.r.real == origval)
            Console.WriteLine("r = origval-test failed\n");
        else
            Console.WriteLine("r = neither-test failed\n");
    }
}
When I create a non-dynamic (static?) Record, setting the Real works.
When I create a dynamic Record, setting the real doesn't work.
When I create a dynamic Record, replacing the real works.
And here's the test program
class Program
{
    static void Main(string[] args)
    {
        double origval = 8.0;
        double newval = 5.0;

        // THIS WORKS - create fixed-type Record, change value, test
        Record record1 = new Record(origval);
        record1.r.real = newval; // change value
        record1.Test(origval, newval);

        // THIS DOESN'T WORK - the change of value is not making any change!
        dynamic dynrecord2 = new Record(origval);
        dynrecord2.r.real = newval; // change value
        dynrecord2.Test(origval, newval);

        // THIS WORKS - create dynamic Record, copy the value out, change the copy, copy it back in
        dynamic dynrecord3 = new Record(origval);
        dynamic r = dynrecord3.r; // copy out value
        r.real = newval;          // change copy
        dynrecord3.r = r;         // copy in modified value
        dynrecord3.Test(origval, newval);
    }
}
And here's the output:
r = newval-test passed
r = origval-test failed
r = newval-test passed
When I change the struct Real to class Real, all three cases work.
So what's going on?
Thanks,
Max
dynamic is really a fancy word for object as far as the core CLI is concerned, so you are mutating a boxed copy. This is prone to craziness. Mutating a struct in the first place is really, really prone to error. I would simply make the struct immutable - otherwise you are going to get this over and over.
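The same boxed-copy effect can be seen without dynamic at all, using a plain object variable; a minimal sketch of my own (not the asker's types):

struct Counter
{
    public int Value;
}

class BoxingCopyDemo
{
    static void Main()
    {
        Counter c = new Counter();

        object boxed = c;   // boxing copies the struct into its own heap object
        c.Value = 42;       // modifies the original, not the box

        System.Console.WriteLine(((Counter)boxed).Value); // 0 - the boxed copy never changed
    }
}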
I dug a little deeper into this problem. Here's an answer from Mads Torgersen of Microsoft.
From Mads:
This is a little unfortunate but by design. In
dynrecord2.r.real = newval; // change value
The value of dynrecord2.r gets boxed, which means copied into its own heap object. That copy is the one getting modified, not the original that you subsequently test.
This is a consequence of the very “local” way in which C# dynamic works. Think about a statement like the above – there are two fundamental ways that we could attack that:
1) Realize at compile time that something dynamic is going on, and essentially move the whole statement to be bound at runtime
2) Bind individual operations at runtime when their constituents are dynamic, returning something dynamic that may in turn cause things to be bound at runtime
In C# we went with the latter, which is nicely compositional, and makes it easy to describe dynamic in terms of the type system, but has some drawbacks – such as boxing of resulting value types for instance.
So what you are seeing is a result of this design choice.
I took another look at the MSIL. It essentially takes
dynrecord2.r.real = newval;
and turns it into:
Real temp = dynrecord2.r;
temp.real = newval;
If dynrecord2.r is a class, it just copies the handle so the change affects the internal field. If dynrecord2.r is a struct, a copy is made, and the change doesn't affect the original.
I'll leave it up to the reader to decide if this is a bug or a feature.
Max
Make your struct immutable and you won't have problems.
struct Real
{
    private double real;

    // a property cannot share the name of its enclosing type (CS0542),
    // so the getter is exposed under a different name here
    public double Value { get { return real; } }

    public Real(double real)
    {
        this.real = real;
    }
}
Mutable structs can be useful in native interop or some high performance scenarios, but then you better know what you're doing.
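With the immutable version, "changing" the value means assigning a whole new struct, which also avoids the hidden boxed copy from the question. A sketch, assuming the Record class from the question and the renamed Value property above:

dynamic dynrecord = new Record(8.0);
dynrecord.r = new Real(5.0);          // replace the whole struct; nothing is mutated in place
Console.WriteLine(dynrecord.r.Value); // 5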
I find myself doing the following a lot, and I don't know whether there are any side effects or not, but consider the following in a WinForms C# app.
(Please excuse any errors, as I am typing the code in, not copy-pasting anything.)
int a = 1;
int b = 2;
int c = 3;
this.Invoke((MethodInvoker)delegate()
{
    int lol = a + b + c;
});
Is there anything wrong with that? Or should I be doing it the long way? >_<
int a = 1;
int b = 2;
int c = 3;
TrippleIntDelegate ffs = new TrippleIntDelegate(delegate(int a_, int b_, int c_)
{
    int lol = a_ + b_ + c_;
});
this.Invoke(ffs, a, b, c); // the values are passed in as arguments
The difference being that the parameters are passed in instead of using the local variables - some pretty sweet .NET magic. I think I looked at it in Reflector once and it created an entirely new class to hold those variables.
So does it matter? Can I be lazy?
Edit: Note that I obviously don't care about the return value; otherwise I'd have to use my own typed delegate, although I could still use the local variables without passing them in!
The way you use it, it doesn't really make a difference. However, in the first case, your anonymous method is capturing the variables, which can have pretty big side effects if you don't know what you're doing. For instance:
// No capture: the delegate works on its own parameter
int a = 1;
Action<int> action = delegate(int x)
{
    x = 42; // assigns to the parameter, not to the local a
};
action(a);
Console.WriteLine(a); // 1

// Capture of local variable a:
int a = 1;
Action action = delegate()
{
    a = 42; // captured local variable a
};
action();
Console.WriteLine(a); // 42
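As the asker noticed in Reflector, the compiler implements capture by hoisting the captured local into a generated class (a "display class") that both the outer method and the delegate share. Roughly, and with made-up names:

// Approximately what the compiler generates for the capturing version above;
// the real type has an unspeakable name like '<>c__DisplayClass1'.
class DisplayClass
{
    public int a;                 // the captured local lives here

    public void AnonymousMethod() // the body of the anonymous delegate
    {
        a = 42;
    }
}

// ...and the calling code becomes roughly:
//   var closure = new DisplayClass();
//   closure.a = 1;
//   Action action = closure.AnonymousMethod;
//   action();
//   Console.WriteLine(closure.a); // 42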
There's nothing wrong with passing in local variables as long as you understand that you're getting deferred execution. If you write this:
int a = 1;
int b = 2;
int c = 3;
Action action = () => Console.WriteLine(a + b + c);
c = 10;
action(); // Or Invoke(action), etc.
The output of this will be 13, not 6. I suppose this would be the counterpart to what Thomas said; if you read locals in a delegate, it will use whatever values the variables hold when the action is actually executed, not when it is declared. This can produce some interesting results if the variables hold reference types and you invoke the delegate asynchronously.
Other than that, there are lots of good reasons to pass local variables into a delegate; among other things, it can be used to simplify threading code. It's perfectly fine to do as long as you don't get sloppy with it.
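A classic instance of this deferred-execution surprise is capturing a loop variable: all of the delegates share the single variable, so they all see its final value. A sketch of my own, not from the answer above:

var actions = new List<Action>();
for (int i = 0; i < 3; i++)
{
    actions.Add(() => Console.WriteLine(i)); // captures the variable i itself, not its current value
}
foreach (var action in actions)
    action();                                // prints 3, 3, 3

// Copying into a fresh local per iteration restores the expected behavior:
var fixedActions = new List<Action>();
for (int i = 0; i < 3; i++)
{
    int copy = i;                                    // a new variable each iteration
    fixedActions.Add(() => Console.WriteLine(copy)); // prints 0, 1, 2
}
foreach (var action in fixedActions)
    action();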
Well, all of the other answers seem to ignore the multi-threading context and the issues that arise in that case. If you are indeed using this from WinForms, your first example could throw exceptions. Depending on the actual data you are trying to reference from your delegate, the thread that code is actually invoked on may or may not have the right to access the data you close around.
On the other hand, your second example actually passes the data via parameters. That allows the Invoke method to properly marshal data across thread boundaries and avoid those nasty threading issues. If you are calling Invoke from, say, a background worker, then you should use something like your second example (although I would opt for the Action<T, ...> and Func<T, ...> delegates whenever possible rather than creating new ones).
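For the parameter-passing style you don't actually need a custom delegate type; Control.Invoke accepts the arguments separately, so a built-in Action<,,> works. A sketch along the lines of the asker's second example:

int a = 1;
int b = 2;
int c = 3;

// The values travel as Invoke arguments and arrive as delegate parameters,
// instead of being captured from the enclosing method.
this.Invoke((Action<int, int, int>)((x, y, z) =>
{
    int lol = x + y + z;
}), a, b, c);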
From a style perspective, I'd choose the parameter-passing variant. It expresses the intent much more clearly to pass args instead of taking ambient state of any sort (and it also makes things easier to test). I mean, you could do this:
public Int32 Add()
{
    return this.Number1 + this.Number2;
}
but it's neither testable nor clear. The signature taking parameters makes it much clearer to others what the method is doing... it's adding two numbers: not an arbitrary set of numbers or whatever.
I regularly do this with parameters like collections, which are reference types anyway and so don't need to be explicitly 'returned':
public List<string> AddNames(List<string> names)
{
    names.Add("kevin");
    return names;
}
Even though the names collection is passed by reference and thus does not need to be explicitly returned, to me it is much clearer that the method takes the list, adds to it, and then returns it back. In this case there is no technical reason to write the signature this way, but, to me, there are good reasons as far as clarity and therefore maintainability are concerned.
I am dealing with a set of native functions that return data through dynamically-allocated arrays. The functions take a reference pointer as input, then point it to the resulting array.
For example:
typedef struct result
{
    //..Some Members..//
} result;

extern int WINAPI getInfo(result**);
After the call, 'result' points to a null-terminated array of result*.
I want to create a managed list from this unmanaged array. I can do the following:
struct Result
{
    //..The Same Members..//
}

public static unsafe List<Result> getManagedResultList(Result** unmanagedArray)
{
    List<Result> resultList = new List<Result>();
    while (*unmanagedArray != null)
    {
        resultList.Add(**unmanagedArray);
        ++unmanagedArray;
    }
    return resultList;
}
This works, but it will be tedious and ugly to reimplement for every type of struct that I'll have to deal with (~35). I'd like a solution that is generic over the type of struct in the array. To that end, I tried:
public static unsafe List<T> unmanagedArrToList<T>(T** unmanagedArray)
{
    List<T> result = new List<T>();
    while (*unmanagedArray != null)
    {
        result.Add(**unmanagedArray);
        ++unmanagedArray;
    }
    return result;
}
But that won't compile because you cannot "take the address of, get the size of, or declare a pointer to a managed type('T')".
I also tried to do this without using unsafe code, but I ran into the problem that Marshal.Copy() needs to know the size of the unmanaged array. I could only determine this using unsafe code, so there seemed to be no benefit to using Marshal.Copy() in this case.
What am I missing? Could someone suggest a generic approach to this problem?
You can make a reasonable assumption that the size and representation of all pointers are the same (I'm not sure the C# spec guarantees this, but in practice you'll find it to be the case). So you can treat your T** as IntPtr*. Also, I don't see how Marshal.Copy would help you here, since it only has overloads for built-in types. So:
public static unsafe List<T> unmanagedArrToList<T>(IntPtr* p)
{
    List<T> result = new List<T>();
    for (; *p != IntPtr.Zero; ++p)
    {
        T item = (T)Marshal.PtrToStructure(*p, typeof(T));
        result.Add(item);
    }
    return result;
}
Of course you'll need an explicit cast to IntPtr* whenever you call this, but at least there's no code duplication otherwise.
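Call-site usage would then look something like this; GetRawResults is a hypothetical wrapper around your native getInfo call, shown only to illustrate the cast:

unsafe
{
    Result** raw = GetRawResults();  // hypothetical helper wrapping the native getInfo call
    List<Result> managed = unmanagedArrToList<Result>((IntPtr*)raw); // explicit cast to IntPtr*
}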
You said:
Marshal.Copy() needs to know the size of the unmanaged array. I could only determine this using unsafe code
It seems that you're missing Marshal.SizeOf().
From what you've mentioned in the post, that may be enough to solve your problem. (Also, the parameter of your function may need to be Object** instead of T**.)
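For what it's worth, here is a sketch of how Marshal.SizeOf pairs with Marshal.PtrToStructure to walk an unmanaged buffer without unsafe code. It assumes a contiguous array of structs plus an element count, which is a different native layout from the null-terminated pointer array in the question:

using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

static class MarshalHelper
{
    // Reads 'count' consecutive T structs starting at 'buffer'.
    public static List<T> ReadContiguousArray<T>(IntPtr buffer, int count) where T : struct
    {
        var list = new List<T>(count);
        int size = Marshal.SizeOf(typeof(T)); // size of one unmanaged element

        for (int i = 0; i < count; i++)
        {
            IntPtr element = new IntPtr(buffer.ToInt64() + (long)i * size);
            list.Add((T)Marshal.PtrToStructure(element, typeof(T)));
        }
        return list;
    }
}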