what's the implication of void**? - c#

When I develop in COM, I always see (void**) type conversion as below.
QueryInterface(/* [in] */ REFIID riid,/* [out] */ void** ppInterface)
What is its exact meaning?
IMHO, it tells the compiler not to enforce type checking, since the type pointed to by ppInterface is not known to the client code at compile time.
Thanks~~~
Update 1
I understand it this way:
void* p implies AnyType* p
void ** pp implies pointer to AnyType*
Update 2
If void**pp means "pointer to void*", then what checks does the compiler do when it sees it?

A void ** is a pointer to a void *. This can be used to pass the address of a void * variable that will be used as an output parameter - eg:
void alloc_two(int n, void **a, void **b)
{
*a = malloc(n * 100);
*b = malloc(n * 200);
}
/* ... */
void *x;
void *y;
alloc_two(10, &x, &y);

The reasons why COM uses void** with QueryInterface are somewhat special. (See below.)
Generally, void** simply means a pointer to void*, and it can be used for out parameters, i.e. parameters that indicate a place where a function can return a value to. Your comment /* [out] */ indicates that the location pointed to by ppvInterface will be written to.
"Why can parameters with a pointer type be used as out parameters?", you ask? Remember that you can change two things with a pointer variable:
1. You can change the pointer itself, such that it points to another object. (ptr = ...)
2. You can modify the pointed-to object. (*ptr = ...)
Pointers are passed to a function by value, i.e. the function gets its own local copy of the original pointer that was passed to it. This means you can change the pointer parameter inside the function (1) without affecting the original pointer, since only the local copy is modified. However, you can change the pointed-to object (2) and this will be visible outside of the function, because the copy has the same value as the original pointer and thus references the same object.
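The two mechanisms can be sketched in a few lines of C++. The function names here are made up for illustration; the last one shows why an out parameter for a pointer ends up with two stars.

```cpp
#include <cassert>

// Mechanism (1): only the local copy of the pointer is changed.
// The caller's pointer still points wherever it pointed before.
void retarget_copy(int* p, int* other) {
    p = other;  // invisible to the caller
    (void)p;    // silence "set but unused" warnings
}

// Mechanism (2): the pointed-to object is changed, which the caller sees.
void write_through(int* p, int value) {
    assert(p != nullptr);
    *p = value;
}

// Mechanism (2) applied to a pointer variable gives an "out" pointer:
// the function writes a new address into the caller's pointer.
void retarget_caller(int** pp, int* other) {
    assert(pp != nullptr);
    *pp = other;
}
```

Calling retarget_copy(p, &b) leaves the caller's p alone, while retarget_caller(&p, &b) changes it. The second form has the same shape as the void** parameter of QueryInterface.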
Now, about COM specifically:
A pointer to an interface (specified by riid) will be returned in the variable referenced by ppvInterface. QueryInterface achieves this via mechanism (2) mentioned above.
With void**, one * is required to allow mechanism (2); the other * reflects the fact that QueryInterface does not return a newly created object (IUnknown), but an already existing one: In order to avoid duplication of that object, a pointer to that object (IUnknown*) is returned.
If you're asking why ppvInterface has type void** and not IUnknown**, which would seem more reasonable type-safety-wise (since all interfaces must derive from IUnknown), then read the following argument taken from the book Essential COM by Don Box, p. 60 (chapter Type Coercion and IUnknown):
One additional subtlety related to QueryInterface concerns its second parameter, which is of type void **. It is very ironic that QueryInterface, the underpinning of the COM type system, has a fairly type-unsafe prototype in C++ [...]
IPug *pPug = 0;
hr = punk->QueryInterface(IID_IPug, (void**)&pPug);
Unfortunately, the following looks equally correct to the C++ compiler:
IPug *pPug = 0;
hr = punk->QueryInterface(IID_ICat, (void**)&pPug);
This more subtle variation also compiles correctly:
IPug *pPug = 0;
hr = punk->QueryInterface(IID_ICat, (void**)pPug);
Given that the rules of inheritance do not apply to pointers, this alternative definition of QueryInterface does not alleviate the problem:
HRESULT QueryInterface(REFIID riid, IUnknown** ppv);
The same limitation applies to references as to pointers as well. The following alternative definition is arguably more convenient for clients to use:
HRESULT QueryInterface(const IID& riid, void* ppv);
[...] Unfortunately, this solution does not reduce the number of errors [...] and, by eliminating the need for a cast, removes a visual indicator that C++ type safety might be in jeopardy. Given the desired semantics of QueryInterface, the argument types Microsoft chose are reasonable, if not type safe or elegant. [...]
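To make the mismatch problem concrete, here is a small sketch that does not use real COM at all. Unknown, Pug, Cat, PugCat, kId and query_as are all hypothetical stand-ins invented for this example. The idea is that a template wrapper can derive the interface id from the pointer type, so the id and the pointer cannot disagree. The Windows SDK's IID_PPV_ARGS macro is built on a similar idea.

```cpp
#include <cassert>

// Hypothetical stand-in for IUnknown; this is not the real COM interface.
struct Unknown {
    virtual ~Unknown() = default;
    // Same shape as QueryInterface: an id in, an untyped pointer out.
    // Returns true on success and sets *out to nullptr on failure.
    virtual bool query(int id, void** out) = 0;
};

struct Pug : Unknown { enum { kId = 1 }; virtual int bark() = 0; };
struct Cat : Unknown { enum { kId = 2 }; virtual int meow() = 0; };

struct PugCat : Pug, Cat {
    bool query(int id, void** out) override {
        assert(out != nullptr);
        if (id == Pug::kId) { *out = static_cast<Pug*>(this); return true; }
        if (id == Cat::kId) { *out = static_cast<Cat*>(this); return true; }
        *out = nullptr;
        return false;
    }
    int bark() override { return 1; }
    int meow() override { return 2; }
};

// The id comes from T itself, so asking for Cat while passing a Pug**
// is no longer expressible, and the caller never writes a cast.
template <typename T>
bool query_as(Unknown* obj, T** out) {
    assert(obj != nullptr && out != nullptr);
    void* raw = nullptr;
    bool ok = obj->query(T::kId, &raw);
    *out = static_cast<T*>(raw);
    return ok;
}
```

The raw query(Cat::kId, (void**)&pPug) call would still compile, just as in the book's example. The wrapper only helps callers who go through it.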

It is just a pointer to void*.
Eg:
Something* foo;
Bar((void**)&foo);
// now foo points to something meaningful
Edit: A possible implementation in C#.
struct Foo { }
static Foo foo = new Foo();
unsafe static void Main(string[] args)
{
Foo* foo;
Bar((void**)&foo);
}
static unsafe void Bar(void** v)
{
fixed (Foo* f = &foo)
{
*v = f;
}
}

Passing by void * also ensures that the pointed-to object cannot be deleted or tampered with (accidentally).
"This implies that an object cannot be deleted using a pointer of type void* because there are no objects of type void."

It's a pointer to the interface pointer you request using this call. Obviously you can request all sorts of interfaces, so it has to be a void pointer. If the interface doesn't exist, the pointer is set to NULL.
edit: Detailed information to be found here: http://msdn.microsoft.com/en-us/library/ms682521(VS.85).aspx

It allows the API to specify that a pointer may be used as an [in-out] parameter in future, but for now, the pointer is unused. (NULL is usually the required value.)
When returning one of many possible types, with no common supertype (such as with QueryInterface), returning a void* is really the only option, and as this needs to be passed as an [out] parameter a pointer to that type (void**) is needed.

not to enforce type validation
Indeed, void* and void** are there to allow the use of different pointer types, which can be converted to match the function's parameter type (implicitly for void*, with an explicit cast for void**).

Pointer to pointer of unknown interface that can be provided.

Instead of using pointers to pointers, try using a reference to a pointer. It's a bit more C++ than using **.
e.g.
void Initialise(MyType *&pType) // reference to a pointer; "&*" would be an ill-formed pointer to a reference
{
pType = new MyType();
}

Related

Marshal out parameter that is a reference type

I'm trying to get a C# class (not struct!) from a C++ function back (using out parameter). This is the C# side:
[StructLayout(LayoutKind.Sequential)]
public class OutClass
{
public int X;
}
// This interface function will be mapped to the C++ function
void MarshalOutClass(out OutClass outClass);
The C++ part looks like this
struct OutClass
{
int x = 0;
};
extern "C" MARSHAL_TESTS_API void MarshalOutClass(OutClass** out);
The function mapping works via a custom mechanism and is not part of this question. All marshaling attributes are treated normally.
In C#, we usually declare out arguments inline, like:
MarshalOutClass(out var outClass);
DoSomething(outClass);
Since C# reference types are marshaled by pointer, I figured I would have to add another pointer for ref or out parameters. So I use OutClass** on the C++ side.
I assume that C# translates the
MarshalOutClass(out var outClass);
part to roughly
OutClass outClass = default(OutClass);
MarshalOutClass(ref outClass);
which is a reference to null, or on C++ side: a pointer to a nullptr. This wouldn't be a problem for C# value types (aka struct) because their default is a default constructed instance.
This means, I'd have to manually create an instance of my object on the C++ side and marshal it back to C#.
void MarshalOutClass(OutClass** out)
{
auto ptr = static_cast<OutClass*>(CoTaskMemAlloc(sizeof(OutClass)));
*ptr = OutClass{};
*out = ptr;
}
The code seems to work, but I'm not sure if I'm leaking memory with this or if the marshaler is going to take care of it properly. I don't do any cleanup on the C# side.
This brings me to the following questions:
is my assumption on how ref and out are translated to C++ correct?
is CoTaskMemAlloc the correct function here?
do I have to perform any additional memory management related tasks (on either side)?
is the overall approach correct here? What should I do differently?
I know that static_cast<OutClass*>(CoTaskMemAlloc(sizeof(OutClass))) is a bit sketchy here, but let's assume OutClass is always a trivial type.

concept and datatype of 'this' pointer in c++ and c#

I am a bit unclear about the concept of this pointer.
I know the this pointer in C++ is a hidden pointer that is used to refer to the object a member function was invoked on. But I want to know whether this pointer has a datatype. For example, int *p; states that p is a pointer to an integer. Similarly, what does the this pointer point to? (I mean: where is the 'this' pointer declared or written, where does it exist, and how is it written there?)
Second question is in the context of C#
The question is -- what is 'this' in the context of C# if it is not a pointer in C# (I found out it is not a pointer in C# when i tried using the -> arrow operator with 'this' keyword).
And again, what is the datatype of 'this' in C# (I know the answer to the above question relating to c++ would answer it, but I wanted to know is it different in C# )
P.S : I am an amateur programmer, and in learning stage, so apologies for using terms like "how is it written there" instead of the proper technical terminologies in asking the question.
I want to know if there is a datatype for this pointer
The type of the hidden pointer this is always a pointer to the class in which it is used, with the const/volatile qualifiers of the member function applied to the pointee. For example, below
struct Foo {
void bar1() {
cout << this << endl;
}
void bar2() const {
cout << this << endl;
}
};
the type of this inside bar1() is Foo*; inside bar2() const it is const Foo*.
Similarly this pointer points to what?
It is a pointer to the current instance. For example, below
struct Foo {
void bar() {
cout << this << endl;
}
} foo;
the pointers this inside bar() and the expression &foo point to the same object.
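Both claims can be checked by the compiler itself. This is a minimal sketch (the member function names are made up) that uses static_assert to pin down the type of this and returns the address so it can be compared with &foo.

```cpp
#include <cassert>
#include <type_traits>

struct Foo {
    // In a non-const member function, `this` has type Foo*.
    const void* address() {
        static_assert(std::is_same<decltype(this), Foo*>::value,
                      "this should be Foo* here");
        return this;
    }
    // In a const member function, the qualifier lands on the pointee.
    const void* const_address() const {
        static_assert(std::is_same<decltype(this), const Foo*>::value,
                      "this should be const Foo* here");
        return this;
    }
};
```

For an object Foo foo, both foo.address() and foo.const_address() compare equal to &foo.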
what is this in the context of C# if it is not a pointer?
C# has no pointers*, but its concept of object references is reasonably similar. In C#, this represents a reference to the current object. The type of this is the type of the class inside which it is referenced.
Note that the runtime type of the object pointed to by this may be different - for example, it could be a subclass. However, the static type of the pointer this matches the type of the class inside which the pointer this is referenced.
* unless you venture into unsafe context, anyway.
In C++ the this pointer is an implicit parameter passed to all nonstatic member functions (passed to them as a hidden argument). Have a look here for more explanations: http://www.tutorialspoint.com/cplusplus/cpp_this_pointer.htm and here: http://www.geeksforgeeks.org/this-pointer-in-c/, http://msdn.microsoft.com/en-us/library/y0dddwwd.aspx.
In C# the "this" is a reserved keyword which is a reference to the object for which the method is running. As you know in C# you don't use pointers, but you use references to objects.
Here's IMHO a useful reference, that describes the problem pretty fine: MSDN
Yes, this pointer in C++ has a type.
class A
{
public:
void func() {}
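};
Inside func, the type of this is:
A *        // in non-const member functions
const A *  // in const member functions
this is a prvalue, so it cannot be assigned to, which is why it is often described informally as A * const.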
this is most certainly a pointer in C++, in class Foo it will be of the type Foo *, unless you're in a volatile or const method - in that case those 2 modifiers will propagate to the this type.
Similarly in C# this is just a magical reference variable of the current object's type.

VARIANT datatype of C++ into C#

What is equivalent of the VARIANT datatype of C++ in C#?
I have code in C++ which uses the VARIANT datatype. How can I convert that code in C#?
Well, there are actually two variants in C++: boost::variant and COM variant. The solution follows more or less the same idea, but the former is more complex. I expect you mean to use the latter.
Let me first start by telling that this is something you just shouldn't use if possible. That said, this is how you do it :-)
Variants and interop
Variants are sometimes used in interop or if you need the byte representation to be the same.
If you're dealing with interop, make sure to check out the VariantWrapper class on MSDN and make it work like that.
Variants and porting considerations
Variants are mostly used in APIs, and usually like this:
void Foo(SomeEnum operation, Variant data);
The reason it's done like this in C++ is because there is no base object class and you therefore need something like this. The easiest way to port this is to change the signature to:
void Foo(SomeEnum operation, object data);
However, if you're porting anyway, you also seriously want to consider these two, since they are resolved at compile-time and can save you the big 'switch' that usually follows in method Foo:
void SomeOperation(int data);
void SomeOperation(float data);
// etc
Variants and byte consistency
In rare cases you need to manipulate the bytes themselves.
Essentially the variant is just a big union of value types wrapped in a single value type (struct). In C++, you can allocate a value type on the heap because a struct is the same as a class (well, sort of). How the value type is used matters somewhat, but more on that later.
Union simply means you are going to overlap all the data in memory. Notice how I explicitly noted value type above; for variants this is basically what it's all about. This also gives us a way to test it, namely by writing one field of the struct and reading another.
The way to do this in C# is to use the StructLayout attribute in a value type, which basically works as follows:
[StructLayout(LayoutKind.Explicit)]
public struct Variant
{
[FieldOffset(0)]
public int Integer;
[FieldOffset(0)]
public float Float;
[FieldOffset(0)]
public double Double;
[FieldOffset(0)]
public byte Byte;
// etc
}
// Check if it works - shouldn't print 0.
public class VariantTest
{
static void Main(string[] args)
{
Variant v = new Variant() { Integer = 2 };
Console.WriteLine("{0}", v.Float);
Console.ReadLine();
}
}
C++ variants can also be stored on the heap as I noted earlier. If you do this, you probably still want the memory layout to be the same. The way to do this is to box the Variant struct we built earlier by simply casting it to object.
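For comparison, the overlap idea can be sketched on the C++ side as well. MiniVariant, Kind, make_int and bits_as_float are hypothetical names for this example; this is not the real COM VARIANT layout. It assumes a 32-bit float, which the static_assert checks.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::int32_t),
              "this sketch assumes a 32-bit float");

// A hypothetical miniature variant: one tag plus overlapping storage.
enum class Kind : std::uint16_t { Int32, Float, Double, Byte };

struct MiniVariant {
    Kind kind;
    union {
        std::int32_t i32;
        float f32;
        double f64;
        std::uint8_t u8;
    } data;  // every member starts at the same address
};

MiniVariant make_int(std::int32_t v) {
    MiniVariant mv;
    mv.kind = Kind::Int32;
    mv.data.i32 = v;
    return mv;
}

// Reinterpret the stored bytes as a float. std::memcpy is used because
// reading an inactive union member directly is undefined behaviour in C++.
float bits_as_float(const MiniVariant& mv) {
    float f;
    std::memcpy(&f, &mv.data, sizeof f);
    return f;
}
```

As with the C# test, storing the integer 2 and reading the same bytes back as a float gives a tiny non-zero value rather than 0, which shows that the fields share storage.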
This is a tricky question.
From C# 4, you can use dynamic to indicate that the type is known at run-time.
As I understand it, however, C++ requires the type to be known at compile time. You might therefore consider using object, but object in C# is a concrete, existing type.
For the concept of multi-type, single value (AKA polymorphism) of VARIANT, you would not need to find a corresponding type in C#; just define your classes and interfaces. You can always reference an object through an interface which its class implements.
If you are porting the code and only want a syntax you can use on the left-hand side of a declaration, with the type still known at compile time, then use var.
When .NET implements a COM interface, just use VARIANT* instead.
Then bypass marshalling on the .NET receiving side by using an IntPtr type to receive the pointer.
public class ComVariant
{
[StructLayout(LayoutKind.Sequential)]
public struct Variant
{
public ushort vt;
public ushort wReserved1;
public ushort wReserved2;
public ushort wReserved3;
public Int32 data01;
public Int32 data02;
}
private Variant _variant;
private IntPtr _variantPtr;
public ComVariant(int variantPtr) : this(new IntPtr(variantPtr))
{
}
public ComVariant(IntPtr variantPtr)
{
_variant = (Variant)Marshal.PtrToStructure(variantPtr, typeof(Variant));
_variantPtr = variantPtr;
}
public VarEnum Vt
{
get
{
return (VarEnum)_variant.vt;
}
set
{
_variant.vt = (ushort)value;
}
}
public object Object
{
get
{
return Marshal.GetObjectForNativeVariant(_variantPtr);
}
}
}
then if you are accessing a VT_UNKNOWN pointing to a COM interface object instance, just
var variant = new ComVariant(variantPtr);
var stream = variant.Object as IStream; // will not be null if type is correct
var obj = variant.Object as IObj; // in general...
will do the trick, but take care not to hand ownership of a newly allocated VARIANT to the .NET implementation without deallocating it somewhere...
For more complex code you might read this article which also talks about memory management.
Let's take a step back. Sooner or later, we want the actual data in the VARIANT. A VARIANT is just a holder for meaningful data. Suppose we converted the VARIANT to some sort of Object in C# that had the variant type and some raw buffer under the .NET hood (e.g. .NET strings can expose the raw buffer). At that point, the VARIANT type would need to be determined from the object and the raw data converted or cast to the data type specified by the variant, and then create a new Object e.g. string/int/etc. from the raw data.
So, rather than worry about passing the VARIANT to C#, look at the variant data type and convert it in C++ to the actual data type and pass that to C#.
For example, if the VARIANT type is VT_INT, then get the int from the variant and use something like:
VARIANT var; // assumed to have been filled in elsewhere, with var.vt == VT_INT
System::Int32^ returnInt = gcnew System::Int32(var.intVal);
returnInt can be returned as an Out parameter from a C++ function in a C++ dll that can be called from C#. The C++ dll needs to use the /clr option.
The function would look like:
void ThisFunctionReturnsAnInt([System::Runtime::InteropServices::Out] System::Int32^ % returnIntValue)
{
VARIANT var; // assumed to have been filled in elsewhere, with var.vt == VT_INT
returnIntValue = gcnew System::Int32(var.intVal); // assign the out parameter
}
Can use similar approach for other data types. It's only natural, a VARIANT of VT_INT is really just like an int, it's not as if there's some major conversion going on, you're just taking the actual value out of the VARIANT at the time when you're interested in it as you would if you were passing a straight integer value from C++ to C#. You would still need to do the gcnew anyway.
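The "look at the tag, then hand over the concrete value" step can be sketched in plain C++ without /clr. TaggedValue, Tag and the try_get_* functions are hypothetical stand-ins for this example, not the real VARIANT API.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a VARIANT: a type tag plus the payload.
enum class Tag { Int, Real, Text };

struct TaggedValue {
    Tag tag;
    int int_val;
    double real_val;
    const char* text_val;
};

// Each extractor checks the tag and hands back the concrete value through
// an out parameter, so the caller never has to deal with the holder type.
bool try_get_int(const TaggedValue& v, int* out) {
    assert(out != nullptr);
    if (v.tag != Tag::Int) return false;
    *out = v.int_val;
    return true;
}

bool try_get_text(const TaggedValue& v, std::string* out) {
    assert(out != nullptr);
    if (v.tag != Tag::Text || v.text_val == nullptr) return false;
    *out = v.text_val;
    return true;
}
```

A caller would try the extractor matching the type it expects and treat a false return as "the holder contains something else".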

Why are function pointers not considered object oriented?

In the C# language specifications it explicitly states:
"Delegates are similar to the concept of function pointers found in some other languages, but unlike function pointers, delegates are object-oriented and type-safe."
I understand delegates need to be a little more flexible than pointers because .NET moves memory around. That's the only difference I'm aware of, but I am not sure how this would turn a delegate into an OO concept...?
What makes a function pointer not object oriented? Are pointers and function pointers equivalent?
Well, Wikipedia says that "object oriented" means using "features such as data abstraction, encapsulation, messaging, modularity, polymorphism, and inheritance." Lacking a better definition, let's go with that.
Function pointers don't contain data, they don't encapsulate implementation details, they neither send nor receive messages, they are not modular, they are not typically used in a polymorphic manner (though I suppose they could in theory be covariant and contravariant in their return and formal parameter types, as delegates now are in C# 4) and they do not participate in an inheritance hierarchy. They are not self-describing; you can't ask a function pointer for its type because it doesn't have one.
By contrast, delegates capture data -- they hold on to the receiver. They support messaging in the sense that you can "message" a delegate by calling its ToString or GetType or Invoke or BeginInvoke methods to tell it to do something, and it "messages" you back with the result. Delegate types can be restricted to certain accessibility domains if you choose to do so. They are self-describing objects that have metadata and at runtime know their own type. They can be combined with other delegates. They can be used polymorphically as System.MulticastDelegate or System.Delegate, the types from which they inherit. And they can be used polymorphically in the sense that in C# 4 delegate types may be covariant and contravariant in their return and parameter types.
I believe it is because, when you hold a delegate to a member method, the OO framework "knows" you are holding a reference to the owning object, whereas with function pointers, first of all the function isn't necessarily a member method, and second of all, if the function is a member method, the OO framework doesn't know it has to prevent the owning object from being freed.
Function pointers are just memory addresses.
Delegates are objects that have methods and properties:
-BeginInvoke
-DynamicInvoke
-Invoke
-Method
-Target
etc.
I'll explain with C++ examples because it's a language where this problem is present (and solved another way).
A mere function pointer just holds the address of a function, nothing else.
Consider the function
void f(int x) { return; }
Now, a simple function pointer is declared and assigned like this:
void (*fptr)(int) = &f;
And you can use it simply:
fptr(5); // calls f(5)
However, in an object oriented language we usually deal with member functions, not free functions. And this is where things get nasty. Consider the following class:
class C { public: void g(int x) { return; } };
Declaring a pointer to the member function C::g is done like this:
void (C::*gptr)(int) = &C::g;
The reason why we need a different syntax is that member functions have a hidden this parameter, thus their signature is different.
For the same reason, calling them is problematic. That this parameter needs a value, which means you need to have an instance. Calling a pointer to a member function is done like this:
C c;
(c.*gptr)(5); // calls c.g(5);
Aside from the weird syntax, the real problem with this is that you need to pass the object together with your function pointer when you really just want to pass around one thing.
The obvious idea is to encapsulate the two, and that's what a delegate is. This is why a delegate is considered more OOP. I have no idea why it is considered more type-safe (maybe because you can cast function pointers to void*).
BTW the C++ solution in C++0x is adopted from Boost. It is called std::function and std::bind and works like this:
std::function<void (int)> d = std::bind(&C::g, &c, std::placeholders::_1);
d(5); // calls c.g(5);
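Put together as something that compiles, the contrast looks roughly like this. Counter, AddFn, call_with_object and bind_add are made-up names for the sketch.

```cpp
#include <cassert>
#include <functional>

struct Counter {
    int total = 0;
    void add(int x) { total += x; }
};

// A pointer to member function names the function but not the object.
using AddFn = void (Counter::*)(int);

void call_with_object(Counter& c, AddFn fn, int x) {
    assert(fn != nullptr);
    (c.*fn)(x);  // the object has to travel alongside the pointer
}

// std::function stores the object and the function together, which is
// the role a delegate plays in C#. The Counter must outlive the result.
std::function<void(int)> bind_add(Counter& c) {
    return std::bind(&Counter::add, &c, std::placeholders::_1);
}
```

After auto d = bind_add(c); a call d(3) adds 3 to c.total without the caller having to mention c again.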
A plain function pointer has no knowledge of any instance unless you pass one in explicitly; it can only point to a free function or a static member function. A delegate, on the other hand, can refer to an instance method of a class, and the correct instance of the object will be used when the delegate is invoked.
Suppose one wants to design a general purpose anyprintf method which can behave as any of fprintf, sprintf, or cprintf [console printf with color support]. One approach would be to have it accept a function that accepts a void* and a char, along with a void*, a format string, and a va_list; it should then for each character of output call the passed-in function, passing it the supplied pointer and the character to be output.
Given such a function (called vsanyprintf below), one could implement vfprintf, vsprintf, and vcprintf [ignoring their return values for simplicity] via:
void fprint_function(void* data, char ch) { fputc(ch, (FILE*)data); }
void sprint_function(void* data, char ch) { char **p = (char**)data; *((*p)++) = ch; }
void cprint_function(void* data, char ch) { cputchar(ch); }
void vfprintf(FILE *f, const char *fmt, va_list vp)
{
vsanyprintf(fprint_function, (void*)f, fmt, vp);
}
void vsprintf(char *st, const char *fmt, va_list vp)
{
vsanyprintf(sprint_function, (void*)&st, fmt, vp); /* st is advanced through the char** */
*st = '\0'; /* terminate the output string */
}
void vcprintf(const char *fmt, va_list vp)
{
vsanyprintf(cprint_function, (void*)0, fmt, vp);
}
Effectively, the combination of the function pointer and void* behaves like a method. Unfortunately, there's no way for the compiler to ensure that the data which is passed in the void* will be of the form expected by the supplied function. C++ and other object-oriented languages add compile-time validation of such type consistency.
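A stripped-down, compilable version of the same pattern is sketched below. It leaves out format handling entirely and just pushes the characters of a string through the sink. CharSink, emit_text, buffer_sink and count_sink are hypothetical names for this example.

```cpp
#include <cassert>
#include <cstddef>

// A sink is a plain function pointer plus an untyped context pointer.
typedef void (*CharSink)(void* context, char ch);

// Sends every character of `text` to the sink. This stands in for the
// formatting loop; real format-specifier handling is omitted.
void emit_text(CharSink sink, void* context, const char* text) {
    assert(sink != nullptr && text != nullptr);
    for (const char* p = text; *p != '\0'; ++p) {
        sink(context, *p);
    }
}

// Context is a char** cursor into a caller-supplied buffer (no bounds check).
void buffer_sink(void* context, char ch) {
    char** cursor = static_cast<char**>(context);
    *(*cursor)++ = ch;
}

// Context is a std::size_t counter; the character itself is ignored.
void count_sink(void* context, char) {
    ++*static_cast<std::size_t*>(context);
}
```

Nothing stops a caller from writing emit_text(buffer_sink, &some_counter, "abc"). That compiles and then misbehaves at run time, which is the missing type check described above.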

Are ref and out in C# the same a pointers in C++?

I just made a Swap routine in C# like this:
static void Swap(ref int x, ref int y)
{
int temp = x;
x = y;
y = temp;
}
It does the same thing that this C++ code does:
void swap(int *d1, int *d2)
{
int temp=*d1;
*d1=*d2;
*d2=temp;
}
So are the ref and out keywords like pointers for C# without using unsafe code?
They're more limited. You can say ++ on a pointer, but not on a ref or out.
EDIT Some confusion in the comments, so to be absolutely clear: the point here is to compare with the capabilities of pointers. You can't perform the same operation as ptr++ on a ref/out, i.e. make it address an adjacent location in memory. It's true (but irrelevant here) that you can perform the equivalent of (*ptr)++, but that would be to compare it with the capabilities of values, not pointers.
It's a safe bet that they are internally just pointers, because the stack doesn't get moved and C# is carefully organised so that ref and out always refer to an active region of the stack.
EDIT To be absolutely clear again (if it wasn't already clear from the example below), the point here is not that ref/out can only point to the stack. It's that when it points to the stack, it is guaranteed by the language rules not to become a dangling pointer. This guarantee is necessary (and relevant/interesting here) because the stack just discards information in accordance with method call exits, with no checks to ensure that any referrers still exist.
Conversely when ref/out refers to objects in the GC heap it's no surprise that those objects are able to be kept alive as long as necessary: the GC heap is designed precisely for the purpose of retaining objects for any length of time required by their referrers, and provides pinning (see example below) to support situations where the object must not be moved by GC compacting.
If you ever play with interop in unsafe code, you will find that ref is very closely related to pointers. For example, if a COM interface is declared like this:
HRESULT Write(BYTE *pBuffer, UINT size);
The interop assembly will turn it into this:
void Write(ref byte pBuffer, uint size);
And you can do this to call it (I believe the COM interop stuff takes care of pinning the array):
byte[] b = new byte[1000];
obj.Write(ref b[0], (uint)b.Length); // Length is an int, the parameter is a uint
In other words, ref to the first byte gets you access to all of it; it's apparently a pointer to the first byte.
Reference parameters in C# can be used to replace one use of pointers, yes. But not all.
Another common use for pointers is as a means for iterating over an array. Out/ref parameters can not do that, so no, they are not "the same as pointers".
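On the C++ side the difference looks like this. It is a small sketch with made-up function names: the pointer version advances the pointer itself, while the reference version can only touch the value it refers to.

```cpp
#include <cassert>
#include <cstddef>

// Pointer arithmetic: the pointer itself is advanced to the next element.
int sum_with_pointer(const int* first, std::size_t count) {
    assert(first != nullptr || count == 0);
    int total = 0;
    for (const int* p = first; p != first + count; ++p) {
        total += *p;
    }
    return total;
}

// A reference cannot be advanced; ++ acts on the referred-to value,
// which is the (*ptr)++ case rather than the ptr++ case.
void bump(int& value) {
    ++value;
}
```

bump(data[0]) changes only data[0]. There is no way to make the reference move on to data[1], which is the limitation a C# ref parameter shares.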
ref and out are only used with function arguments to signify that the argument is to be passed by reference instead of value. In this sense, yes, they are somewhat like pointers in C++ (more like references actually). Read more about it in this article.
The nice thing about using out is that you're guaranteed that the item will be assigned a value -- you will get a compile error if not.
Actually, I'd compare them to C++ references rather than pointers. Pointers, in C++ and C, are a more general concept, and references will do what you want.
All of these are undoubtedly pointers under the covers, of course.
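For reference, here is the swap from the question written both ways in C++. This is just a sketch with made-up names (swap_ptr and swap_ref); the reference version is the one that reads like the C# ref version.

```cpp
#include <cassert>

// Pointer version: the caller passes addresses and the callee dereferences.
void swap_ptr(int* a, int* b) {
    assert(a != nullptr && b != nullptr);
    int temp = *a;
    *a = *b;
    *b = temp;
}

// Reference version: no address-of at the call site, no null to check,
// and the body uses the parameters as if they were the variables.
void swap_ref(int& a, int& b) {
    int temp = a;
    a = b;
    b = temp;
}
```

The call sites differ accordingly: swap_ptr(&x, &y) versus swap_ref(x, y).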
While comparisons are in the eye of the beholder...I say no. 'ref' changes the calling convention but not the type of the parameters. In your C++ example, d1 and d2 are of type int*. In C# they are still Int32's, they just happen to be passed by reference instead of by value.
By the way, your C++ code doesn't really swap its inputs in the traditional sense. Generalizing it like so:
template<typename T>
void swap(T *d1, T *d2)
{
T temp = *d1;
*d1 = *d2;
*d2 = temp;
}
...won't work unless all types T have copy constructors, and even then will be much less efficient than swapping pointers.
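The two kinds of swap being contrasted can be sketched as follows. The names swap_values and swap_targets are made up, and the use of std::move assumes C++11 or later; it is an addition not mentioned above that lets movable types avoid full copies.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Swaps the objects themselves. With std::move, types that have move
// constructors avoid deep copies; otherwise this falls back to copying.
template <typename T>
void swap_values(T& a, T& b) {
    T temp = std::move(a);
    a = std::move(b);
    b = std::move(temp);
}

// Swaps only which object each pointer refers to; the objects stay put,
// so the cost does not depend on how large or complex T is.
template <typename T>
void swap_targets(T*& a, T*& b) {
    T* temp = a;
    a = b;
    b = temp;
}
```

swap_targets changes what the two pointers refer to and leaves the objects untouched, whereas swap_values changes the contents of the objects that every existing pointer or reference sees.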
The short answer is Yes (similar functionality, but not exactly the same mechanism).
As a side note, if you use FxCop to analyse your code, using out and ref will result in a "Microsoft.Design" error of "CA1045:DoNotPassTypesByReference."
