I am learning the basics of C++, coming from the .NET world (C#).
One topic I found interesting was the const keyword and its usage with pointers (const pointer/pointer to const).
I'd like to know if there's any C# language equivalent of the const pointer/pointer to const that C++ has?
(I know C# doesn't have pointers; I am considering references to be the pointer-like types in C#.)
Also, out of interest, if there's no such equivalent, what were the decisions behind not including such a feature?
There is no direct equivalent to passing references as 'const' in C#, but there are alternative ways to accomplish its purpose. The most common way to do this is to make your reference class either completely immutable (once constructed, its state should never change) or pass it as an immutable public interface. The latter is the closest to the intention of the 'const' parameter contract (I'm giving you a reference to something so you can use it, but I'm asking you not to change it). A poorly-behaved client could 'cast away' the public interface to a mutable form, of course, but it still makes the intention clear. You can 'cast away' const in C++ as well, though this is rarely a good idea.
One other thing in C++ is that you would often prefer to pass as const when you knew that the lifetime of the reference you were passing was limited in scope. C++ often follows the pattern where objects are created and destroyed on the stack within method scope, so any references to those objects should not be persisted outside that scope (since using them after they fall out of scope could cause really nasty stack corruption crashes.) A const reference should not be mutated, so it's a strong hint that storing it somewhere to reference later would be a bad idea. A method with const parameters is promising that it's safe to pass these scoped references. Since C# never allows storing references to objects on the stack (outside of parameters), this is less of a concern.
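The read-only interface idea can be sketched in C++ terms as follows. This is only an illustration under assumptions: IReadOnlyAccount, Account and describe are hypothetical names invented for the example, not types from any library.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical read-only view: exposes queries only.
class IReadOnlyAccount {
public:
    virtual ~IReadOnlyAccount() = default;
    virtual const std::string& owner() const = 0;
    virtual long balance() const = 0;
};

// The mutable type implements the read-only view and adds a mutator.
class Account : public IReadOnlyAccount {
public:
    Account(std::string owner, long balance)
        : owner_(std::move(owner)), balance_(balance) {}

    const std::string& owner() const override { return owner_; }
    long balance() const override { return balance_; }
    void deposit(long amount) { balance_ += amount; }

private:
    std::string owner_;
    long balance_;
};

// The callee only sees the read-only view, so deposit() is not reachable
// without a deliberate downcast.
inline std::string describe(const IReadOnlyAccount& account) {
    return account.owner() + ": " + std::to_string(account.balance());
}
```

The owner of the Account keeps full access, while code that receives only the IReadOnlyAccount view is being told, by the type, that it is not expected to change anything.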
The concept of constant objects (i.e. readonly) in C# (or Java for that matter) corresponds approximately to object *const in C++, i.e. a constant pointer to a non-constant object.
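For readers coming from C#, the two C++ forms can be sketched side by side. Counter and both function names are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Hypothetical type used only for illustration.
struct Counter {
    int value = 0;
};

// Pointer to const: the pointee cannot be modified through p,
// but p itself may be re-pointed.
inline int read_through_pointer_to_const(const Counter* p, const Counter* other) {
    // p->value = 1;   // would not compile: the pointee is const
    p = other;         // fine: the pointer itself is not const
    return p->value;
}

// Const pointer: p always points at the same object, but that object
// can still be mutated. This is the shape that a C# readonly field of
// class type resembles.
inline int mutate_through_const_pointer(Counter* const p) {
    // p = nullptr;    // would not compile: the pointer is const
    p->value += 1;     // fine: the pointee is not const
    return p->value;
}
```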
There are several reasons for it. For one, specifying const correctly and making it useful in the language is quite hard. Taking C++ as an example, you have to define lots of methods twice with only small changes to the signature, there's const_cast, the fact that const is only applied shallowly, etc.
So C# went for the easy solution to make the language simpler - D went the other way with transitive const correctness, etc. as I understand it (never written a single line in D, so take that with a grain of salt).
The usual solution in C#/Java is to have immutable classes, possibly using a builder pattern, or a simple wrapper that prohibits changes (e.g. all the unmodifiable collections that wrap another collection and throw exceptions for the mutating methods like add).
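The wrapper approach can be sketched like this. It is written in C++ only to keep one language for the examples; UnmodifiableInts and add_is_rejected are hypothetical names, loosely modelled on the unmodifiable collection idea rather than on any real library class.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical wrapper: reads are forwarded to the wrapped vector,
// mutation attempts throw.
class UnmodifiableInts {
public:
    explicit UnmodifiableInts(std::vector<int>& backing) : backing_(backing) {}

    std::size_t size() const { return backing_.size(); }
    int at(std::size_t i) const { return backing_.at(i); }

    // Mutating operation: rejected at run time, not at compile time.
    void add(int) { throw std::logic_error("collection is read-only"); }

private:
    std::vector<int>& backing_;
};

// Helper for the demo: reports whether add() was rejected.
inline bool add_is_rejected(UnmodifiableInts& view) {
    try {
        view.add(42);
    } catch (const std::logic_error&) {
        return true;
    }
    return false;
}
```

Note that the wrapper is a view, not a snapshot: whoever still holds the underlying collection can change it, and the change is visible through the wrapper.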
8,000,000 years later, but C# 7.2 uses the "in" keyword which is sort of like const. It tells the compiler to pass a struct or primitive variable by reference (like ref or out) but the method WILL NOT modify the variable.
public void DoSomething(in int variable){
//Whatever
}
is functionally equivalent to C++'s
void Foo::DoSomething(const int& variable){
//Whatever
}
or arguably even
void Foo::DoSomething(int const * const variable){
//Whatever
}
The main reason for doing this in C#, according to MSDN, is to tell the compiler that it can pass the variable by reference since it won't be modified. This allows for potentially better performance when passing large structs.
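The C++ side of that comparison can be sketched with a reference to const. BigValue and the two helper functions are hypothetical, and the array size is arbitrary; it is only there to make a copy noticeably expensive.

```cpp
#include <cassert>

// Hypothetical "large" struct.
struct BigValue {
    double samples[256];
};

// Passed by reference to const: no copy is made and the callee
// cannot modify the caller's object through this parameter.
inline double first_sample(const BigValue& v) {
    // v.samples[0] = 0.0;   // would not compile
    return v.samples[0];
}

// Reports whether the parameter refers to the caller's own object,
// i.e. that nothing was copied.
inline bool is_same_object(const BigValue& v, const BigValue* caller_address) {
    return &v == caller_address;
}
```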
Regarding constant pointers, see this answer: Difference between const. pointer and reference?
In other words, a reference in C++ is very similar to a const pointer for most applications. However, a reference in C# is closer to a pointer in C++ regarding how they can be used.
Reference objects passed as arguments to methods in C# can be reassigned (say, from object instance A to object instance B). However, B is then reachable only within the method scope and becomes eligible for garbage collection once the method returns (since the pointer itself is passed by value), and the caller's reference is always to A. In this sense, you can freely pass around references in C# and know that they cannot be made to point to different objects (unless the ref/out keywords are used).
C# example -
class Foo
{
//Stateful field
public int x;
//Constructor
public Foo()
{
x = 6;
}
}
public class Program
{
public static void Main()
{
var foo = new Foo();
foo.x = 8;
VarTestField(foo);
Console.WriteLine(foo.x);
RefTestField(ref foo);
Console.WriteLine(foo.x);
}
//Object passed by reference, pointer passed by value
static void VarTestField(Foo whatever){
whatever = new Foo();
}
//Object passed by reference, pointer passed by reference
static void RefTestField(ref Foo whatever){
whatever = new Foo();
}
}
Output:
8
6
So no, you cannot declare a constant pointer in C#. Still, using proven design patterns, algorithms, and OOP fundamentals wisely, along with the built in syntax of the language, you can achieve the desired behavior.
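For comparison, roughly the same experiment can be written in C++. This sketch assumes std::shared_ptr as a stand-in for a C# reference (both keep the object alive while something refers to it); the function names are hypothetical.

```cpp
#include <cassert>
#include <memory>

struct Foo {
    int x = 6;
};

// The smart pointer is copied: re-pointing the copy does not affect
// the caller's pointer, like a plain C# parameter of class type.
inline void reseat_by_value(std::shared_ptr<Foo> whatever) {
    whatever = std::make_shared<Foo>();
}

// The pointer itself is passed by reference, like C# `ref Foo`:
// the caller's pointer now refers to the new object.
inline void reseat_by_reference(std::shared_ptr<Foo>& whatever) {
    whatever = std::make_shared<Foo>();
}
```

Declaring the caller's variable as const std::shared_ptr<Foo> would make it a constant pointer in the C++ sense, and passing it to reseat_by_reference would then fail to compile.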
I'm aware that Pinnable<T> is an internal class used by the methods in the new Unsafe class, and it's not meant to be used anywhere else other than in that class. This question is not about something practical, but it's just to understand why it's been designed like this and to learn a bit more about the language and its various "tricks" like this one.
As a recap, the Pinnable<T> class is defined here, and it looks like this:
[StructLayout(LayoutKind.Sequential)]
internal sealed class Pinnable<T>
{
public T Data;
}
And it's mainly used in the Span<T>.DangerousCreate method, here:
public static Span<T> DangerousCreate(object obj, ref T objectData, int length)
{
Pinnable<T> pinnable = Unsafe.As<Pinnable<T>>(obj);
IntPtr byteOffset = Unsafe.ByteOffset<T>(ref pinnable.Data, ref objectData);
return new Span<T>(pinnable, byteOffset, length);
}
The reason for Pinnable<T> being that it's used to keep track of the original object, in case the Span<T> instance was created by one (instead of a native pointer).
Given that the reference type doesn't matter when pinning a reference (fixing both a ref T and Unsafe.As<T, byte>(ref T) works the same), is there a specific reason why the Pinnable<T> class was made generic? The original design in DotNetCross here in fact had a Pinnable class with just a single byte field, and it worked just the same. Is there any reason why using a generic class in this case would be an advantage, other than avoiding the need to cast the reference type when writing/reading/returning it?
Is there any other way, other than this unsafe-cast done with Unsafe.As, to get a reference to an object (I mean a reference to the object contents, otherwise it'd be the same as any variable of a class type)? I mean, any way to get a reference (which should basically have the same address of the actual object variable in the first place, right?) to an object without having to pass through some custom defined secondary class.
First of all, the Struct in [StructLayout(LayoutKind.Sequential)] doesn't mean that it is only valid for structs, it means the layout of the actual structure of the fields in memory, be it in a class or in a value type. This controls the actual runtime layout of the data, not just how the type would marshal to unmanaged code. The Sequential is important because without it, the runtime is pretty much free to store the memory however it sees fit, which means that Data may have some padding before it.
From what I understand about the implementation, the reason for Pinnable is to allow creating an instance of Span to a memory that may be moved by the GC, without having to pin the object first. If you don't use actual pointers and just references, nothing at all will need to be pinned.
I have noticed that it was introduced in a commit with a description saying it made Span more "portable" (a bold word for something that does a lot of unsafe things). I can't think of any other reason than something related to alignment for why it is generic. I suppose representing a T in terms of an offset from another T is better than as an offset from a byte. It may happen that the type of the first field may play a role in its actual address, even if the type was marked with LayoutKind.Sequential.
A reference to an object is different from an interior reference to an object (a reference to its data). It is implementation defined, but in .NET Framework, an instance of any class (or a boxed value type) starts with a header consisting of a sync block (for lock) and a pointer to the method table, a.k.a. the type of the object. On 32-bit, the header is 8 bytes, but the actual pointer points to the pointer to the method table (for performance reasons, getting the type happens more often than locking an object).
One way, though not a portable one, of getting the pointer to the start of the data is therefore to cast the object reference to a pointer and add 4 bytes to it (on 32-bit). That is where the first field should start.
Another way I can think of is utilising GCHandle.AddrOfPinnedObject. It is commonly used for accessing array or string data, but it works for other objects:
[StructLayout(LayoutKind.Sequential)]
class Obj
{
public int A;
}
var obj = new Obj();
var gc = GCHandle.Alloc(obj, GCHandleType.Pinned);
IntPtr interior = gc.AddrOfPinnedObject();
Marshal.WriteInt32(interior, 0, 16);
Console.WriteLine(obj.A);
I think this actually is quite portable, but still needs to pin the object (there is InternalAddrOfPinnedObject defined in GCHandle, but even if that doesn't check whether the handle is actually pinned, the returned value may not be valid if it was used on a non-pinned object).
Still, the technique Span uses seems like the most portable way of doing that, since a lot of the underlying work is done in pure CIL (like reference arithmetics).
I write C++ accessors to class members as
SomeClass const& x() const { return m_x; }
It seems that the only protection of this sort in C# is to define a property with a private (or undefined) set. But this protects only against assignment, not against manipulation of the SomeClass object's state.
Side note: C++ allows m_x to be deleted through a const pointer. IMHO this is a simply amazing oversight by the standards bodies.
Now, with C# 7.2, you can use ref readonly for the same purpose. You can check more about that here. Check the third point.
const in C++ doesn't protect against anything, you can cast it away without any issues.
And while C# doesn't have its equivalent, you can (and usually should) create real immutable classes. This puts the burden of constness on the object returned where it belongs, and there's nothing you can do to "cast it away" (barring reflection).
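What "a real immutable class" means can be sketched briefly. The example is in C++ to match the accessor in the question; Point and with_x are hypothetical names, and the same shape carries over to C# with readonly fields.

```cpp
#include <cassert>

// Hypothetical immutable value type: state is fixed at construction,
// and "modification" returns a new object instead.
class Point {
public:
    Point(int x, int y) : x_(x), y_(y) {}

    int x() const { return x_; }
    int y() const { return y_; }

    // Returns a modified copy; the original is left untouched.
    Point with_x(int new_x) const { return Point(new_x, y_); }

private:
    const int x_;
    const int y_;
};
```

Because no member function changes state, handing out a reference to such an object is safe no matter what the receiver does with it.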
In the C# language specifications it explicitly states:
Delegates are similar to the concept
of function pointers found in some
other languages, but unlike function
pointers, delegates are
object-oriented and type-safe.
I understand delegates need to be a little more flexible than pointers because .NET moves memory around. That's the only difference I'm aware of, but I am not sure how this would turn a delegate into an OO concept...?
What makes a function pointer not object oriented? Are pointers and function pointers equivalent?
Well, Wikipedia says that "object oriented" means using "features such as data abstraction, encapsulation, messaging, modularity, polymorphism, and inheritance." Lacking a better definition, let's go with that.
Function pointers don't contain data, they don't encapsulate implementation details, they neither send nor receive messages, they are not modular, they are not typically used in a polymorphic manner (though I suppose they could in theory be covariant and contravariant in their return and formal parameter types, as delegates now are in C# 4) and they do not participate in an inheritance hierarchy. They are not self-describing; you can't ask a function pointer for its type because it doesn't have one.
By contrast, delegates capture data -- they hold on to the receiver. They support messaging in the sense that you can "message" a delegate by calling its ToString or GetType or Invoke or BeginInvoke methods to tell it to do something, and it "messages" you back with the result. Delegate types can be restricted to certain accessibility domains if you choose to do so. They are self-describing objects that have metadata and at runtime know their own type. They can be combined with other delegates. They can be used polymorphically as System.MulticastDelegate or System.Delegate, the types from which they inherit. And they can be used polymorphically in the sense that in C# 4 delegate types may be covariant and contravariant in their return and parameter types.
I believe it is because, when you hold a delegate to a member method, the OO framework "knows" you are holding a reference to the owning object, whereas with function pointers, first of all the function isn't necessarily a member method, and second of all, if the function is a member method, the OO framework doesn't know it has to prevent the owning object from being freed.
Function pointers are just memory addresses.
Delegates are objects that have methods and properties:
-BeginInvoke
-DynamicInvoke
-Invoke
-Method
-Target
etc.
I'll explain with C++ examples because it's a language where this problem is present (and solved another way).
A mere function pointer just holds the address of a function, nothing else.
Consider the function
void f(int x) { return; }
Now, a simple function pointer is declared and assigned like this:
void (*fptr)(int) = &f;
And you can use it simply:
fptr(5); // calls f(5)
However, in an object oriented language we usually deal with member functions, not free functions. And this is where things get nasty. Consider the following class:
class C { public: void g(int x) { return; } };
Declaring a function pointer to C::g is done like this:
void (C::*gptr)(int) = &C::g;
The reason why we need a different syntax is that member functions have a hidden this parameter, thus their signature is different.
For the same reason, calling them is problematic. That this parameter needs a value, which means you need to have an instance. Calling a pointer to a member function is done like this:
C c;
(c.*gptr)(5); // calls c.g(5);
Aside from the weird syntax, the real problem with this is that you need to pass the object together with your function pointer when you really just want to pass around one thing.
The obvious idea is to encapsulate the two, and that's what a delegate is. This is why a delegate is considered more OOP. I have no idea why it is considered more type-safe (maybe because you can cast function pointers to void*).
BTW the C++ solution in C++0x is adopted from Boost. It is called std::function and std::bind and works like this:
std::function<void (int)> d = std::bind(&C::g, &c, std::placeholders::_1);
d(5); // calls c.g(5);
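Putting the three variants next to each other, as a minimal sketch: the struct C here records its argument so the call can be observed, and twice, call_free, call_member and bind_g are hypothetical helpers added for the illustration.

```cpp
#include <cassert>
#include <functional>

// Hypothetical class for illustration; g stores its argument.
struct C {
    int last = 0;
    void g(int x) { last = x; }
};

inline int twice(int x) { return 2 * x; }

// Plain function pointer: just an address, no object attached.
inline int call_free(int (*fptr)(int), int arg) {
    return fptr(arg);
}

// Pointer to member function: the object must be supplied separately.
inline void call_member(C& c, void (C::*gptr)(int), int arg) {
    (c.*gptr)(arg);
}

// std::function bundling object and member function, roughly like a
// delegate. The callable holds a pointer to c, so c must outlive it.
inline std::function<void(int)> bind_g(C& c) {
    return std::bind(&C::g, &c, std::placeholders::_1);
}
```

Unlike a .NET delegate, the bound callable does nothing to keep c alive; that lifetime guarantee is one of the things the garbage-collected delegate adds.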
A function pointer can have no knowledge of the instance it belongs to unless you pass it in explicitly: a plain function pointer can only point to a free function or a static member. A delegate, on the other hand, can refer to a regular instance member of the class, and the correct instance of the object will be used when the delegate is invoked.
Suppose one wants to design a general purpose anyprintf method which can behave as either fprintf, sprintf, cprintf [console printf with color support]. One approach would be to have it accept a function that accepts a void* and a char along with a void* and a va_list; it should then for each character of output call the passed-in function, passing it the supplied pointer and the character to be output.
Given such a function, one could implement vfprintf, vsprintf and vcprintf [ignoring their return values for simplicity] via:
// Each callback receives the opaque context pointer and one output character.
void fprint_function(void* data, char ch) { fputc(ch, (FILE*)data); }
void sprint_function(void* data, char ch) { char** p = (char**)data; *((*p)++) = ch; }
void cprint_function(void* data, char ch) { cputchar(ch); }
void vfprintf(FILE *f, const char *fmt, va_list vp)
{
vsanyprintf(fprint_function, (void*)f, fmt, vp);
}
void vsprintf(char *st, const char *fmt, va_list vp)
{
// Pass the address of st so the callback can advance the write position.
vsanyprintf(sprint_function, (void*)&st, fmt, vp);
}
void vcprintf(const char *fmt, va_list vp)
{
vsanyprintf(cprint_function, (void*)0, fmt, vp);
}
Effectively, the combination of the function pointer and void* behaves as a method. Unfortunately, there's no way for the compiler to ensure that the data which is passed in the void* will be of the form expected by the supplied function. C++ and other object-oriented languages add in compile-time validation of such type consistency.
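The difference can be sketched in a few lines. Everything here is hypothetical and simplified: emit_text stands in for the formatting routine and merely forwards characters, with no format processing at all.

```cpp
#include <cassert>
#include <string>

// Callback type in the C style: a function pointer that receives an
// untyped context pointer plus one character.
using CharSink = void (*)(void* data, char ch);

// Hypothetical stand-in for the formatting routine.
inline void emit_text(CharSink sink, void* data, const char* text) {
    for (const char* p = text; *p != '\0'; ++p) {
        sink(data, *p);
    }
}

// Sink that appends to a std::string. The cast is unchecked: passing
// any other kind of pointer as data would be undefined behaviour.
inline void string_sink(void* data, char ch) {
    static_cast<std::string*>(data)->push_back(ch);
}

// Type-safe alternative: the sink object carries its own state, so
// there is no void* for the caller to get wrong.
struct StringAppender {
    std::string out;
    void operator()(char ch) { out.push_back(ch); }
};

template <typename Sink>
void emit_text_typed(Sink& sink, const char* text) {
    for (const char* p = text; *p != '\0'; ++p) {
        sink(*p);
    }
}
```

In the first form nothing stops a caller from pairing string_sink with a FILE*; in the second form function and data travel together as one object, which is essentially what a delegate provides.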
I just made a Swap routine in C# like this:
static void Swap(ref int x, ref int y)
{
int temp = x;
x = y;
y = temp;
}
It does the same thing that this C++ code does:
void swap(int *d1, int *d2)
{
int temp=*d1;
*d1=*d2;
*d2=temp;
}
So are the ref and out keywords like pointers for C# without using unsafe code?
They're more limited. You can say ++ on a pointer, but not on a ref or out.
EDIT Some confusion in the comments, so to be absolutely clear: the point here is to compare with the capabilities of pointers. You can't perform the same operation as ptr++ on a ref/out, i.e. make it address an adjacent location in memory. It's true (but irrelevant here) that you can perform the equivalent of (*ptr)++, but that would be to compare it with the capabilities of values, not pointers.
It's a safe bet that they are internally just pointers, because the stack doesn't get moved and C# is carefully organised so that ref and out always refer to an active region of the stack.
EDIT To be absolutely clear again (if it wasn't already clear from the example below), the point here is not that ref/out can only point to the stack. It's that when it points to the stack, it is guaranteed by the language rules not to become a dangling pointer. This guarantee is necessary (and relevant/interesting here) because the stack just discards information in accordance with method call exits, with no checks to ensure that any referrers still exist.
Conversely when ref/out refers to objects in the GC heap it's no surprise that those objects are able to be kept alive as long as necessary: the GC heap is designed precisely for the purpose of retaining objects for any length of time required by their referrers, and provides pinning (see example below) to support situations where the object must not be moved by GC compacting.
If you ever play with interop in unsafe code, you will find that ref is very closely related to pointers. For example, if a COM interface is declared like this:
HRESULT Write(BYTE *pBuffer, UINT size);
The interop assembly will turn it into this:
void Write(ref byte pBuffer, uint size);
And you can do this to call it (I believe the COM interop stuff takes care of pinning the array):
byte[] b = new byte[1000];
obj.Write(ref b[0], (uint)b.Length);
In other words, ref to the first byte gets you access to all of it; it's apparently a pointer to the first byte.
Reference parameters in C# can be used to replace one use of pointers, yes. But not all.
Another common use for pointers is as a means for iterating over an array. Out/ref parameters can not do that, so no, they are not "the same as pointers".
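The distinction between moving a pointer and incrementing through a reference can be made concrete with a small C++ sketch; both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Pointer arithmetic: ++p moves the pointer to the next element,
// which is how a pointer walks an array.
inline int sum_with_pointer(const int* p, std::size_t count) {
    int total = 0;
    for (const int* end = p + count; p != end; ++p) {
        total += *p;
    }
    return total;
}

// A reference cannot be re-aimed; ++r increments the referenced value,
// which corresponds to (*ptr)++ rather than ptr++.
inline void increment_through_reference(int& r) {
    ++r;
}
```

A C# ref or out parameter behaves like the second function: it gives access to one variable, with no way to step to its neighbour.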
ref and out are only used with function arguments to signify that the argument is to be passed by reference instead of value. In this sense, yes, they are somewhat like pointers in C++ (more like references actually). Read more about it in this article.
The nice thing about using out is that you're guaranteed that the item will be assigned a value -- you will get a compile error if not.
Actually, I'd compare them to C++ references rather than pointers. Pointers, in C++ and C, are a more general concept, and references will do what you want.
All of these are undoubtedly pointers under the covers, of course.
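For comparison, here is the swap written with C++ references instead of pointers (swap_refs is a hypothetical name). The call site and body then look much closer to the C# ref version.

```cpp
#include <cassert>

// C++ reference parameters: the caller passes the variables themselves,
// with no explicit address-of or dereference.
inline void swap_refs(int& x, int& y) {
    int temp = x;
    x = y;
    y = temp;
}
```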
While comparisons are in the eye of the beholder...I say no. 'ref' changes the calling convention but not the type of the parameters. In your C++ example, d1 and d2 are of type int*. In C# they are still Int32's, they just happen to be passed by reference instead of by value.
By the way, your C++ code doesn't really swap its inputs in the traditional sense. Generalizing it like so:
template<typename T>
void swap(T *d1, T *d2)
{
T temp = *d1;
*d1 = *d2;
*d2 = temp;
}
...won't work unless all types T have copy constructors, and even then will be much less efficient than swapping pointers.
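Since C++11, the copying concern can be reduced with move semantics. The following is a sketch of that alternative, not part of the original answer; swap_by_move is a hypothetical name, and std::swap already does essentially this.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Swap that moves instead of copying: for types such as std::string
// this typically exchanges internal buffers rather than duplicating
// their contents.
template <typename T>
void swap_by_move(T& a, T& b) {
    T temp = std::move(a);
    a = std::move(b);
    b = std::move(temp);
}
```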
The short answer is Yes (similar functionality, but not exactly the same mechanism).
As a side note, if you use FxCop to analyse your code, using out and ref will result in a "Microsoft.Design" error of "CA1045:DoNotPassTypesByReference."
Is this function declaration in C#:
void foo(string mystring)
the same as this one in C:
void foo(char *)
i.e. In C#, does the called function receive a pointer behind the scenes?
In this specific instance, it is more like:
void foo(const char *);
.NET strings are immutable and passed by reference. More generally, a C# method receives a pointer or reference to an object behind the scenes.
There are pointers behind the scenes in C#, though they are more like C++'s smart pointers, so the raw pointers are encapsulated. A char* isn't really the same as System.String since a pointer to a char usually means the start of a character array, and a C# string is an object with a length field and a character array. The pointer points to the outer structure which points into something like a wchar_t array, so there's some indirection with a C# string and wider characters for Unicode support.
No. In C# (and all other .NET languages) the String is a first-class data type. It is not simply an array of characters. You can convert back and forth between them, but they do not behave the same. There are a number of string manipulation methods (like "Substring()" and "StartsWith") that are available to the String class, which don't apply to arrays in general, which an array of characters is simply an instance of.
Essentially, yes. In C#, string (actually System.String) is a reference type, so when foo() is called, it receives a pointer to the string in the heap.
For value types (int, double, etc.), the function receives a copy of the value. For other objects, it's a reference pointing to the original object.
Strings are special because they are immutable. Technically it means it will pass the reference, but in practice it will behave pretty much like a value type.
You can force value types to pass a reference by using the ref keyword:
public void Foo(ref int value) { value = 12; }
public void Bar()
{
int val = 3;
Foo(ref val);
// val == 12
}
No, in C# a string is Unicode.
In C# it is not called a pointer, but a reference.
If you mean - will the method be allowed to access the contents of the character space, the answer is yes.
Yes, because a string is of dynamic size, so there must be heap memory behind the scenes
However they are NOT the same.
In C the pointer points to a string that may also be used elsewhere, so changing it will affect those other places.
Anything that is not a "value type" (value types being essentially structs, enums, booleans, and the built-in numeric types) will be passed "by reference", which is arguably the same as the C/C++ mechanism of passing by reference or pointer. Syntactically and semantically it is essentially identical to C/C++ passing by reference.
Note, however, that in C# strings are immutable, so even though it is passed by reference you can't edit the string without creating a new one.
Also note that you can't pass an argument as "const" in C#, regardless whether it is a value type or a reference type.
While those are indeed equivalent in a semantic sense (i.e. the code is doing something with a string), C#, like Java, keeps pointers completely out of its everyday use, relegating them to areas such as transitions to native OS functions - even then, there are framework classes which wrap those up nicely, such as SafeFileHandle.
Long story short, don't go out of your way thinking of pointers in C#.
As far as I know, all classes in C# (not sure about the others) are reference types.