What is equivalent of the VARIANT datatype of C++ in C#?
I have code in C++ which uses the VARIANT datatype. How can I convert that code to C#?
Well, there are actually two variants in C++: boost::variant and the COM VARIANT. The solution follows more or less the same idea, but the former is more complex. I expect you mean the latter.
Let me start by saying that this is something you just shouldn't use if possible. That said, this is how you do it :-)
Variants and interop
Variants are sometimes used in interop, or if you need the byte representation to be the same.
If you're dealing with interop, make sure to check out the VariantWrapper class on MSDN and make it work like that.
Variants and porting considerations
Variants are mostly used in APIs, and usually like this:
void Foo(SomeEnum operation, Variant data);
The reason it's done like this in C++ is because there is no base object class and you therefore need something like this. The easiest way to port this is to change the signature to:
void Foo(SomeEnum operation, object data);
However, if you're porting anyway, you also seriously want to consider these two, since they are resolved at compile-time and can save you the big 'switch' that usually follows in method Foo:
void SomeOperation(int data);
void SomeOperation(float data);
// etc
Variants and byte consistency
In rare cases you need to manipulate the bytes themselves.
Essentially the variant is just a big union of value types wrapped in a single value type (struct). In C++, you can allocate a value type on the heap because a struct is the same as a class (well, sort of). How the value type is used matters a little, but more on that later.
Union simply means you are going to overlap all the data in memory. Notice how I explicitly wrote value type above; for variants this is basically what it's all about. This also gives us a way to test it, namely by checking another value in the struct.
The way to do this in C# is to use the StructLayout attribute in a value type, which basically works as follows:
using System;
using System.Runtime.InteropServices;

// All fields share offset 0, so they overlap in memory like a C++ union.
[StructLayout(LayoutKind.Explicit)]
public struct Variant
{
    [FieldOffset(0)]
    public int Integer;
    [FieldOffset(0)]
    public float Float;
    [FieldOffset(0)]
    public double Double;
    [FieldOffset(0)]
    public byte Byte;
    // etc
}

// Check if it works - shouldn't print 0.
public class VariantTest
{
    static void Main(string[] args)
    {
        Variant v = new Variant() { Integer = 2 };
        // Reads the same bytes back, reinterpreted as a float.
        Console.WriteLine("{0}", v.Float);
        Console.ReadLine();
    }
}
C++ variants can also be stored on the heap, as I noted earlier. If you do this, you probably still want the memory signature to be the same. The way to do this is to box the Variant struct we built earlier by simply casting it to object.
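For comparison, the "union of value types" idea described above can be sketched on the C++ side as a tag plus a union. This is only an illustration, not the layout of the real COM VARIANT: the names TinyVariant, Kind, as_double and raw_bits are made up for this example, and it assumes a 32-bit IEEE 754 float.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes a 32-bit float");

// Hypothetical tag values; a real COM VARIANT uses the VT_* constants instead.
enum class Kind : std::uint16_t { Integer, Float };

// A tag plus a union whose members all start at the same address.
struct TinyVariant {
    Kind kind;
    union {
        std::int32_t integer;
        float        real;
    } data;
};

// Builds a variant holding an integer.
TinyVariant make_int(std::int32_t i) {
    TinyVariant v;
    v.kind = Kind::Integer;
    v.data.integer = i;
    return v;
}

// Builds a variant holding a float.
TinyVariant make_float(float f) {
    TinyVariant v;
    v.kind = Kind::Float;
    v.data.real = f;
    return v;
}

// Reads the stored value as a double, dispatching on the tag.
double as_double(const TinyVariant& v) {
    switch (v.kind) {
        case Kind::Integer: return static_cast<double>(v.data.integer);
        case Kind::Float:   return static_cast<double>(v.data.real);
    }
    assert(false && "unknown tag");
    return 0.0;
}

// Copies the payload bytes out with memcpy, so no inactive union member is read.
std::uint32_t raw_bits(const TinyVariant& v) {
    std::uint32_t bits = 0;
    std::memcpy(&bits, &v.data, sizeof bits);
    return bits;
}
```

The tag is what lets the reader pick the right member; raw_bits shows that the members really do share the same bytes.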
This is a tricky question.
From C# 4, you can use dynamic to indicate that the type is known at run-time.
By my personal understanding, however, C++ requires the type to be known at compile time. Thus you might consider using object, but object in C# is a concrete, existing type.
For the concept of multi-type, single value (AKA polymorphism) of VARIANT, you would not need to find a corresponding type in C#; just define your classes and interfaces. You can always reference an object through an interface that its class implements.
If you are porting the code and want a syntax that you can simply use on the left-hand side, where the type is known at compile time, then use var.
When .NET implements a COM interface, just use VARIANT* instead.
Then bypass marshalling on the .NET receiving side by using a IntPtr type to receive the pointer.
using System;
using System.Runtime.InteropServices;

public class ComVariant
{
    // Mirrors the first 16 bytes of a native VARIANT.
    [StructLayout(LayoutKind.Sequential)]
    public struct Variant
    {
        public ushort vt;
        public ushort wReserved1;
        public ushort wReserved2;
        public ushort wReserved3;
        public Int32 data01;
        public Int32 data02;
    }

    private Variant _variant;
    private IntPtr _variantPtr;

    public ComVariant(int variantPtr) : this(new IntPtr(variantPtr))
    {
    }

    public ComVariant(IntPtr variantPtr)
    {
        // Copy the header so the type tag can be inspected from managed code.
        _variant = (Variant)Marshal.PtrToStructure(variantPtr, typeof(Variant));
        _variantPtr = variantPtr;
    }

    public VarEnum Vt
    {
        get
        {
            return (VarEnum)_variant.vt;
        }
        set
        {
            _variant.vt = (ushort)value;
        }
    }

    public object Object
    {
        get
        {
            // Let the runtime convert the native VARIANT into a managed object.
            return Marshal.GetObjectForNativeVariant(_variantPtr);
        }
    }
}
Then, if you are accessing a VT_UNKNOWN pointing to a COM interface object instance, just
var variant = new ComVariant(variantPtr);
var stream = variant.Object as IStream; // will not be null if type is correct
var obj = variant.Object as IObj; // in general...
will do the trick, but be careful not to hand ownership of a newly allocated VARIANT to the .NET implementation without deallocating it somewhere...
For more complex code you might read this article which also talks about memory management.
Let's take a step back. Sooner or later, we want the actual data in the VARIANT. A VARIANT is just a holder for meaningful data. Suppose we converted the VARIANT to some sort of Object in C# that had the variant type and some raw buffer under the .NET hood (e.g. .NET strings can expose the raw buffer). At that point, the VARIANT type would need to be determined from the object and the raw data converted or cast to the data type specified by the variant, and then create a new Object e.g. string/int/etc. from the raw data.
So, rather than worry about passing the VARIANT to C#, look at the variant data type and convert it in C++ to the actual data type and pass that to C#.
For example, if the VARIANT type is VT_INT, then get the int from the variant with something like:
VARIANT var; // assumed to have been filled in elsewhere, with vt == VT_INT
Int32^ returnInt = gcnew Int32(var.intVal);
returnInt can be returned as an out parameter from a C++ function in a C++ DLL that can be called from C#. The C++ DLL needs to be compiled with the /clr option.
The function would look like:
void ThisFunctionReturnsAnInt([Runtime::InteropServices::Out] Int32^ % returnIntValue)
{
VARIANT var; // assumed to have been filled in elsewhere, with vt == VT_INT
returnIntValue = gcnew Int32(var.intVal); // hand the boxed value back to the caller
}
A similar approach can be used for other data types. It's only natural: a VARIANT of VT_INT is really just like an int. It's not as if there's some major conversion going on; you're just taking the actual value out of the VARIANT at the time when you're interested in it, as you would if you were passing a straight integer value from C++ to C#. You would still need to do the gcnew anyway.
In the C# language specification it explicitly states:
Delegates are similar to the concept of function pointers found in some other languages, but unlike function pointers, delegates are object-oriented and type-safe.
I understand delegates need to be a little more flexible than pointers because .NET moves memory around. That's the only difference I'm aware of, but I am not sure how this would turn a delegate into an OO concept...?
What makes a function pointer not object oriented? Are pointers and function pointers equivalent?
Well, Wikipedia says that "object oriented" means using "features such as data abstraction, encapsulation, messaging, modularity, polymorphism, and inheritance." Lacking a better definition, let's go with that.
Function pointers don't contain data, they don't encapsulate implementation details, they neither send nor receive messages, they are not modular, they are not typically used in a polymorphic manner (though I suppose they could in theory be covariant and contravariant in their return and formal parameter types, as delegates now are in C# 4) and they do not participate in an inheritance hierarchy. They are not self-describing; you can't ask a function pointer for its type because it doesn't have one.
By contrast, delegates capture data -- they hold on to the receiver. They support messaging in the sense that you can "message" a delegate by calling its ToString or GetType or Invoke or BeginInvoke methods to tell it to do something, and it "messages" you back with the result. Delegate types can be restricted to certain accessibility domains if you choose to do so. They are self-describing objects that have metadata and at runtime know their own type. They can be combined with other delegates. They can be used polymorphically as System.MulticastDelegate or System.Delegate, the types from which they inherit. And they can be used polymorphically in the sense that in C# 4 delegate types may be covariant and contravariant in their return and parameter types.
I believe it is because, when you hold a delegate to a member method, the OO framework "knows" you are holding a reference to the owning object, whereas with function pointers, first of all the function isn't necessarily a member method, and second of all, if the function is a member method, the OO framework doesn't know it has to prevent the owning object from being freed.
Function pointers are just memory addresses.
Delegates are objects that have methods and properties:
-BeginInvoke
-DynamicInvoke
-Invoke
-Method
-Target
etc.
I'll explain with C++ examples because it's a language where this problem is present (and solved another way).
A mere function pointer just holds the address of a function, nothing else.
Consider the function
void f(int x) { return; }
Now, a simple function pointer is declared and assigned like this:
void (*fptr)(int) = &f;
And you can use it simply:
fptr(5); // calls f(5)
However, in an object oriented language we usually deal with member functions, not free functions. And this is where things get nasty. Consider the following class:
class C { public: void g(int x) { return; } };
Declaring a pointer to the member function C::g is done like this:
void (C::*gptr)(int) = &C::g;
The reason why we need a different syntax is that member functions have a hidden this parameter, thus their signature is different.
For the same reason, calling them is problematic. That this parameter needs a value, which means you need to have an instance. Calling a pointer to a member function is done like this:
C c;
(c.*gptr)(5); // calls c.g(5);
Aside from the weird syntax, the real problem with this is that you need to pass the object together with your function pointer when you really just want to pass around one thing.
The obvious idea is to encapsulate the two, and that's what a delegate is. This is why a delegate is considered more OOP. I have no idea why it is considered more type-safe (maybe because you can cast function pointers to void*).
BTW the C++ solution in C++0x is adopted from Boost. It consists of std::function and std::bind and works like this:
std::function<void (int)> d = std::bind(&C::g, &c, std::placeholders::_1);
d(5); // calls c.g(5);
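Putting the three variants side by side may help. The following is a minimal sketch, assuming a made-up Counter class and helper names (twice, call_plain, call_member, make_delegate) that are not part of the original answer:

```cpp
#include <cassert>
#include <functional>

// Hypothetical receiver type that accumulates the values it is given.
class Counter {
public:
    void add(int x) { total += x; }
    int total = 0;
};

// A free function needs no receiver, so a plain function pointer is enough.
int twice(int x) { return 2 * x; }

// Calls a plain function pointer.
int call_plain(int (*fn)(int), int arg) {
    assert(fn != nullptr);
    return fn(arg);
}

// Calls a pointer to member: the object and the pointer travel separately.
void call_member(Counter& obj, void (Counter::*fn)(int), int arg) {
    (obj.*fn)(arg);
}

// Bundles receiver and member function into one callable, delegate-style.
// The returned callable refers to obj, so obj must outlive it.
std::function<void(int)> make_delegate(Counter& obj) {
    return std::bind(&Counter::add, &obj, std::placeholders::_1);
}
```

With call_member the caller has to carry two things around; with make_delegate there is only one, which is the point the answer makes about delegates. Unlike a .NET delegate, though, the std::function here does nothing to keep obj alive.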
A function pointer can have no knowledge of the instance it belongs to unless you pass it in explicitly - all function pointers are to static members. A delegate, on the other hand, can be a regular member of the class, and the correct instance of the object will be used when the delegate is invoked.
Suppose one wants to design a general purpose anyprintf method (called vsanyprintf below) which can behave as either fprintf, sprintf, or cprintf [console printf with color support]. One approach would be to have it accept a function that takes a void* and a char, along with a void*, a format string, and a va_list; it should then, for each character of output, call the passed-in function, passing it the supplied pointer and the character to be output.
Given such a function, one could implement vfprintf, vsprintf, and vcprintf [ignoring their return values for simplicity] via:
void fprint_function(void* data, char ch) { fputc(ch, (FILE*)data); }
void sprint_function(void* data, char ch) { char **p = (char**)data; *((*p)++) = ch; }
void cprint_function(void* data, char ch) { cputchar(ch); }
void vfprintf(FILE *f, const char *fmt, va_list vp)
{
vsanyprintf(fprint_function, (void*)f, fmt, vp);
}
void vsprintf(char *st, const char *fmt, va_list vp)
{
vsanyprintf(sprint_function, (void*)&st, fmt, vp);
*st = '\0'; /* st was advanced past the last character written */
}
void vcprintf(const char *fmt, va_list vp)
{
vsanyprintf(cprint_function, (void*)0, fmt, vp);
}
Effectively, the combination of the function pointer and void* behaves as a method. Unfortunately, there's no way for the compiler to ensure that the data which is passed in the void* will be of the form expected by the supplied function. C++ and other object-oriented languages add in compile-time validation of such type consistency.
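The "function pointer plus void*" pairing can be tried out in isolation. Below is a small sketch with the formatting logic left out; emit_text is a hypothetical stand-in for the formatter, and append_to_string and count_chars are made-up sinks:

```cpp
#include <cassert>
#include <string>

// Signature shared by every sink: an opaque context pointer plus one character.
using PutChar = void (*)(void* context, char ch);

// Hypothetical stand-in for the formatter: forwards each character of text to the sink.
void emit_text(PutChar put, void* context, const char* text) {
    assert(put != nullptr && text != nullptr);
    for (const char* p = text; *p != '\0'; ++p) {
        put(context, *p);
    }
}

// Sink that appends to a std::string; it trusts that context really is a std::string*.
void append_to_string(void* context, char ch) {
    static_cast<std::string*>(context)->push_back(ch);
}

// Sink that counts characters; it trusts that context really is an int*.
void count_chars(void* context, char) {
    ++*static_cast<int*>(context);
}
```

Nothing stops a caller from passing count_chars together with a std::string*, which is exactly the missing type check the paragraph above describes.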
A simple question, but I haven't found a definitive answer on Stack Overflow.
struct MyStruct { int x, y, z; }
MyStruct GetMyStruct() => new MyStruct();
static void Main()
{
var x = GetMyStruct(); // can boxing/unboxing ever occur?
}
Is a C# struct (value type) always copied to the stack when returned from a function, no matter how large it might be? The reason I'm unsure is that for some instruction sets other than MSIL (such as x86), a return value usually needs to fit into a processor register, and the stack is not directly involved.
If so, is it the call site that pre-allocates space on the CLR stack for the (expected) value return type?
[edit: Summary of replies] For the intent of the original question, the answer is no; the CLR will never (silently) box a struct just for the purpose of transferring it as a return value.
It is very much an implementation detail of the JIT compiler. In general, if the struct is small enough and has simple members, then it gets returned in CPU registers. If it gets too big, then the calling code reserves enough space on the stack and passes a pointer to that space as an extra hidden argument.
It will never be boxed, unless the return type of the method is object of course.
Fwiw: this is also the reason that the debugger cannot display the return value of the function in the Autos window. Painful sometimes. But the debugger doesn't get enough metadata from the JIT compiler to know exactly where to find the value. Edit: fixed in VS2013.
A struct is boxed whenever you want to treat it as an object, so if you call Func and assign the result to object it will be boxed.
E.g. doing this
object o = Func();
will yield the following IL
L_0000: call valuetype TestApp.foo TestApp.Program::Func()
L_0005: box TestApp.foo
L_000a: stloc.0
which shows that the return value is boxed, because we assign it to a reference of the type object.
If you assign it to a variable of type Foo it isn't boxed and thus it is copied and the value is stored on the stack.
Also, boxing wouldn't really help you here since it would involve creating an object to represent the value of the struct, and the values are effectively copied during the boxing operation.
When I develop in COM, I always see (void**) type conversion as below.
QueryInterface(/* [in] */ REFIID riid,/* [out] */ void** ppInterface)
What's exact meaning of it?
IMHO, it tells the compiler not to enforce type validation, since the type which is pointed by the ppInterface is not known to the client code at compile time.
Thanks~~~
Update 1
I understand it this way:
void* p implies AnyType* p
void ** pp implies pointer to AnyType*
Update 2
If void**pp means "pointer to void*", then what checks does the compiler do when it sees it?
A void ** is a pointer to a void *. This can be used to pass the address of a void * variable that will be used as an output parameter - eg:
void alloc_two(int n, void **a, void **b)
{
*a = malloc(n * 100);
*b = malloc(n * 200);
}
/* ... */
void *x;
void *y;
alloc_two(10, &x, &y);
The reason why COM uses void** with QueryInterface is somewhat special. (See below.)
Generally, void** simply means a pointer to void*, and it can be used for out parameters, ie. parameters that indicate a place where a function can return a value to. Your comment /* [out] */ indicates that the location pointed to by ppvInterface will be written to.
"Why can parameters with a pointer type be used as out parameters?", you ask? Remember that you can change two things with a pointer variable:
1. You can change the pointer itself, such that it points to another object. (ptr = ...)
2. You can modify the pointed-to object. (*ptr = ...)
Pointers are passed to a function by value, ie. the function gets its own local copy of the original pointer that was passed to it. This means you can change the pointer parameter inside the function (1) without affecting the original pointer, since only the local copy is modified. However, you can change the pointed-to object (2) and this will be visible outside of the function, because the copy has the same value as the original pointer and thus references the same object.
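The two mechanisms can be compared directly in a few lines. This is a minimal sketch; the globals first and second and the three function names are invented for illustration:

```cpp
#include <cassert>

int first = 1;
int second = 2;

// Mechanism (1): reassigns only the local copy of the pointer; the caller sees nothing.
void retarget_copy(int* p) {
    p = &second;
    (void)p;  // silence "set but not used" warnings
}

// Mechanism (2): writes through the pointer; the caller's pointer variable is updated.
void retarget_through(int** pp) {
    assert(pp != nullptr);
    *pp = &second;
}

// Same idea with void**, as in QueryInterface: the callee stores an untyped pointer.
void retarget_untyped(void** ppv) {
    assert(ppv != nullptr);
    *ppv = &second;
}
```

In the void** version the caller has to cast the result back to the right type, and the compiler cannot check that cast.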
Now, about COM specifically:
A pointer to an interface (specified by riid) will be returned in the variable referenced by ppvInterface. QueryInterface achieves this via mechanism (2) mentioned above.
With void**, one * is required to allow mechanism (2); the other * reflects the fact that QueryInterface does not return a newly created object (IUnknown), but an already existing one: In order to avoid duplication of that object, a pointer to that object (IUnknown*) is returned.
If you're asking why ppvInterface has type void** and not IUnknown**, which would seem more reasonable type-safety-wise (since all interfaces must derive from IUnknown), then read the following argument taken from the book Essential COM by Don Box, p. 60 (chapter Type Coercion and IUnknown):
One additional subtlety related to QueryInterface concerns its second parameter, which is of type void **. It is very ironic that QueryInterface, the underpinning of the COM type system, has a fairly type-unsafe prototype in C++ [...]
IPug *pPug = 0;
hr = punk->QueryInterface(IID_IPug, (void**)&pPug);
Unfortunately, the following looks equally correct to the C++ compiler:
IPug *pPug = 0;
hr = punk->QueryInterface(IID_ICat, (void**)&pPug);
This more subtle variation also compiles correctly:
IPug *pPug = 0;
hr = punk->QueryInterface(IID_ICat, (void**)pPug);
Given that the rules of inheritance do not apply to pointers, this alternative definition of QueryInterface does not alleviate the problem:
HRESULT QueryInterface(REFIID riid, IUnknown** ppv);
The same limitation applies to references as to pointers as well. The following alternative definition is arguably more convenient for clients to use:
HRESULT QueryInterface(const IID& riid, void* ppv);
[...] Unfortunately, this solution does not reduce the number of errors [...] and, by eliminating the need for a cast, removes a visual indicator that C++ type safety might be in jeopardy. Given the desired semantics of QueryInterface, the argument types Microsoft chose are reasonable, if not type safe or elegant. [...]
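One common way to narrow the gap the quote describes is a small typed wrapper that derives the interface identifier from the pointer type, so the two cannot disagree. The following is only a sketch with stand-in types: InterfaceId, Pug, Cat, Animal, query and query_as are all hypothetical, and real COM uses GUIDs and IUnknown-derived interfaces instead.

```cpp
#include <cassert>

// Hypothetical interface identifiers; real COM uses GUIDs.
enum class InterfaceId { Pug, Cat };

// Hypothetical interfaces, each carrying its own identifier.
struct Pug { static constexpr InterfaceId id = InterfaceId::Pug; int barks = 0; };
struct Cat { static constexpr InterfaceId id = InterfaceId::Cat; int purrs = 0; };

// Hypothetical object that exposes both interfaces.
struct Animal {
    Pug pug;
    Cat cat;

    // Untyped query in the style of QueryInterface: the compiler sees no
    // connection between the id and the type behind the out pointer.
    bool query(InterfaceId id, void** out) {
        assert(out != nullptr);
        switch (id) {
            case InterfaceId::Pug: *out = &pug; return true;
            case InterfaceId::Cat: *out = &cat; return true;
        }
        *out = nullptr;
        return false;
    }

    // Typed wrapper: the id is taken from T, so id and pointer type always match.
    template <typename T>
    bool query_as(T** out) {
        assert(out != nullptr);
        void* raw = nullptr;
        bool ok = query(T::id, &raw);
        *out = static_cast<T*>(raw);
        return ok;
    }
};
```

The cast still exists, but it lives in one place inside query_as instead of at every call site.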
It is just a pointer to void*.
Eg:
Something* foo;
Bar((void**)&foo);
// now foo points to something meaningful
Edit: A possible implementation in C#.
struct Foo { }
static Foo foo = new Foo();
unsafe static void Main(string[] args)
{
Foo* foo;
Bar((void**)&foo);
}
static unsafe void Bar(void** v)
{
fixed (Foo* f = &foo)
{
*v = f;
}
}
Passing by void * also ensures that the pointed-to object cannot be (accidentally) deleted or tampered with.
"This implies that an object cannot be deleted using a pointer of type void* because there are no objects of type void."
It's a pointer to the interface pointer you request using this call. Obviously you can request all sorts of interfaces, so it has to be a void pointer. If the interface doesn't exist, the pointer is set to NULL.
edit: Detailed information to be found here: http://msdn.microsoft.com/en-us/library/ms682521(VS.85).aspx
It allows the API to specify that a pointer may be used as an [in-out] parameter in future, but for now, the pointer is unused. (NULL is usually the required value.)
When returning one of many possible types, with no common supertype (such as with QueryInterface), returning a void* is really the only option, and as this needs to be passed as an [out] parameter a pointer to that type (void**) is needed.
not to enforce type validation
Indeed, void* or void** are there to allow the use of different types of pointers, which can be converted to void* to fit the function's parameter type.
Pointer to pointer of unknown interface that can be provided.
Instead of using pointers to pointers, try using a reference to a pointer. It's a bit more C++ than using **.
e.g.
void Initialise(MyType *&pType)
{
pType = new MyType();
}
Someone asked me the other day when they should use the parameter keyword out instead of ref. While I (I think) understand the difference between the ref and out keywords (that has been asked before) and the best explanation seems to be that ref == in and out, what are some (hypothetical or code) examples where I should always use out and not ref.
Since ref is more general, why do you ever want to use out? Is it just syntactic sugar?
You should use out unless you need ref.
It makes a big difference when the data needs to be marshalled e.g. to another process, which can be costly. So you want to avoid marshalling the initial value when the method doesn't make use of it.
Beyond that, it also shows the reader of the declaration or the call whether the initial value is relevant (and potentially preserved), or thrown away.
As a minor difference, an out parameter need not be initialized.
Example for out:
string a, b;
person.GetBothNames(out a, out b);
where GetBothNames is a method to retrieve two values atomically, the method won't change behavior whatever a and b are. If the call goes to a server in Hawaii, copying the initial values from here to Hawaii is a waste of bandwidth. A similar snippet using ref:
string a = String.Empty, b = String.Empty;
person.GetBothNames(ref a, ref b);
could confuse readers, because it looks like the initial values of a and b are relevant (though the method name would indicate they are not).
Example for ref:
string name = textbox.Text;
bool didModify = validator.SuggestValidName(ref name);
Here the initial value is relevant to the method.
Use out to denote that the parameter is not being used, only set. This helps the caller understand that you're always initializing the parameter.
Also, ref and out are not just for value types. They also let you reset the object that a reference type is referencing from within a method.
You're correct in that, semantically, ref provides both "in" and "out" functionality, whereas out only provides "out" functionality. There are some things to consider:
out requires that the method accepting the parameter MUST, at some point before returning, assign a value to the variable. You find this pattern in some of the key/value data storage classes like Dictionary<K,V>, where you have functions like TryGetValue. This function takes an out parameter that holds what the value will be if retrieved. It wouldn't make sense for the caller to pass a value into this function, so out is used to guarantee that some value will be in the variable after the call, even if it isn't "real" data (in the case of TryGetValue where the key isn't present).
out and ref parameters are marshaled differently when dealing with interop code
Also, as an aside, it's important to note that while reference types and value types differ in the nature of their value, every variable in your application points to a location of memory that holds a value, even for reference types. It just happens that, with reference types, the value contained in that location of memory is another memory location. When you pass values to a function (or do any other variable assignment), the value of that variable is copied into the other variable. For value types, that means that the entire content of the type is copied. For reference types, that means that the memory location is copied. Either way, it does create a copy of the data contained in the variable. The only real relevance that this holds deals with assignment semantics; when assigning a variable or passing by value (the default), when a new assignment is made to the original (or new) variable, it does not affect the other variable. In the case of reference types, yes, changes made to the instance are available on both sides, but that's because the actual variable is just a pointer to another memory location; the content of the variable--the memory location--didn't actually change.
Passing with the ref keyword says that both the original variable and the function parameter will actually point to the same memory location. This, again, affects only assignment semantics. If a new value is assigned to one of the variables, then because the other points to the same memory location the new value will be reflected on the other side.
It depends on the compile context (See Example below).
out and ref both denote variable passing by reference, yet ref requires the variable to be initialized before being passed, which can be an important difference in the context of marshaling (interop: UnmanagedToManagedTransition or vice versa).
MSDN warns:
Do not confuse the concept of passing by reference with the concept of reference types. The two concepts are not the same. A method parameter can be modified by ref regardless of whether it is a value type or a reference type. There is no boxing of a value type when it is passed by reference.
From the official MSDN Docs:
out:
The out keyword causes arguments to be passed by reference. This is similar to the ref keyword, except that ref requires that the variable be initialized before being passed.
ref:
The ref keyword causes an argument to be passed by reference, not by value. The effect of passing by reference is that any change to the parameter in the method is reflected in the underlying argument variable in the calling method. The value of a reference parameter is always the same as the value of the underlying argument variable.
We can verify that the out and ref are indeed the same when the argument gets assigned:
CIL Example:
Consider the following example
static class outRefTest{
public static int myfunc(int x){x=0; return x; }
public static void myfuncOut(out int x){x=0;}
public static void myfuncRef(ref int x){x=0;}
public static void myfuncRefEmpty(ref int x){}
// Define other methods and classes here
}
in CIL, the instructions of myfuncOut and myfuncRef are identical as expected.
outRefTest.myfunc:
IL_0000: nop
IL_0001: ldc.i4.0
IL_0002: starg.s 00
IL_0004: ldarg.0
IL_0005: stloc.0
IL_0006: br.s IL_0008
IL_0008: ldloc.0
IL_0009: ret
outRefTest.myfuncOut:
IL_0000: nop
IL_0001: ldarg.0
IL_0002: ldc.i4.0
IL_0003: stind.i4
IL_0004: ret
outRefTest.myfuncRef:
IL_0000: nop
IL_0001: ldarg.0
IL_0002: ldc.i4.0
IL_0003: stind.i4
IL_0004: ret
outRefTest.myfuncRefEmpty:
IL_0000: nop
IL_0001: ret
nop: no operation, ldloc: load local, stloc: store local, ldarg: load argument, br.s: branch to target....
(See: List of CIL instructions )
Below are some notes which I pulled from this CodeProject article on C# out vs ref.
out should be used only when we are expecting multiple outputs from a function or a method. Returning a structure can also be a good option for the same purpose.
ref and out are keywords which dictate how data is passed from caller to callee and vice versa.
With ref, data passes two ways: from caller to callee and vice versa.
With out, data passes only one way, from callee to caller. In this case, if the caller tries to send data to the callee, it will be overlooked / rejected.
If you are a visual person then please see this YouTube video which demonstrates the difference practically https://www.youtube.com/watch?v=lYdcY5zulXA
You need to use ref if you plan to read and write to the parameter. You need to use out if you only plan to write. In effect, out is for when you'd need more than one return value, or when you don't want to use the normal return mechanism for output (but this should be rare).
There are language mechanics that assist these use cases. Ref parameters must have been initialized before they are passed to a method (putting emphasis on the fact that they are read-write), and out parameters cannot be read before they are assigned a value, and are guaranteed to have been written to at the end of the method (putting emphasis on the fact that they are write-only). Contravening these principles results in a compile-time error.
int x;
Foo(ref x); // error: x is uninitialized
void Bar(out int x) {} // error: x was not written to
For instance, int.TryParse returns a bool and accepts an out int parameter:
int value;
if (int.TryParse(numericString, out value))
{
/* numericString was parsed into value, now do stuff */
}
else
{
/* numericString couldn't be parsed */
}
This is a clear example of a situation where you need to output two values: the numeric result and whether the conversion was successful or not. The authors of the CLR decided to opt for out here since they don't care about what the int could have been before.
For ref, you can look at Interlocked.Increment:
int x = 4;
Interlocked.Increment(ref x);
Interlocked.Increment atomically increments the value of x. Since you need to read x to increment it, this is a situation where ref is more appropriate. You totally care about what x was before it was passed to Increment.
Since C# 7.0, it is even possible to declare a variable inline in an out argument, adding even more emphasis on their output-only nature:
if (int.TryParse(numericString, out int value))
{
// 'value' was declared in the `if` condition and holds the parsed number
}
else
{
// conversion didn't work; 'value' is still in scope here and holds 0
}
How to use in or out or ref in C#?
All three keywords pass arguments by reference, but with different restrictions.
in arguments cannot be modified by the called method.
ref arguments may be modified.
ref arguments must be initialized by the caller before the call; they can be read and updated in the method.
out arguments must be modified by the called method.
out arguments must be initialized in the called method.
Variables passed as in arguments must be initialized before being passed in a method call. However, the called method may not assign a value or modify the argument.
You can't use the in, ref, and out keywords for the following kinds of methods:
Async methods, which you define by using the async modifier.
Iterator methods, which include a yield return or yield break statement.
Still feel the need for a good summary, this is what I came up with.
Summary,
When we are inside the function, this is how we specify the variable data access control,
in = R
out = must W before R
ref = R+W
Explanation,
in
Function may only READ that variable.
out
Variable need not be initialised first because
the function MUST WRITE to it before READ.
ref
Function may READ/WRITE to that variable.
Why is it named as such?
Focusing on where data gets modified,
in
Data must only be set before entering (in) function.
out
Data must only be set before leaving (out) function.
ref
Data must be set before entering (in) function.
Data may be set before leaving (out) function.
out is a more constrained version of ref.
In a method body, you need to assign to all out parameters before leaving the method.
Also, any value assigned to an out argument before the call is ignored, whereas ref requires the argument to be assigned.
So out allows you to do:
int a, b, c = foo(out a, out b);
where ref would require a and b to be assigned.
How it sounds:
out = only initialize/fill a parameter (the parameter may be unassigned), then hand it back out
ref = reference, standard parameter (maybe with a value), but the function can modify it.
You can use the out contextual keyword in two contexts: as a parameter modifier, or in generic type parameter declarations in interfaces and delegates. This topic discusses the parameter modifier; see the separate topic for information on generic type parameter declarations.
The out keyword causes arguments to be passed by reference. This is like the ref keyword, except that ref requires that the variable be initialized before it is passed. To use an out parameter, both the method definition and the calling method must explicitly use the out keyword. For example:
C#
class OutExample
{
static void Method(out int i)
{
i = 44;
}
static void Main()
{
int value;
Method(out value);
// value is now 44
}
}
Although variables passed as out arguments do not have to be initialized before being passed, the called method is required to assign a value before the method returns.
Although the ref and out keywords cause different run-time behavior, they are not considered part of the method signature at compile time. Therefore, methods cannot be overloaded if the only difference is that one method takes a ref argument and the other takes an out argument. The following code, for example, will not compile:
C#
class CS0663_Example
{
// Compiler error CS0663: "Cannot define overloaded
// methods that differ only on ref and out".
public void SampleMethod(out int i) { }
public void SampleMethod(ref int i) { }
}
Overloading can be done, however, if one method takes a ref or out argument and the other uses neither, like this:
C#
class OutOverloadExample
{
public void SampleMethod(int i) { }
public void SampleMethod(out int i) { i = 5; }
}
Properties are not variables and therefore cannot be passed as out parameters.
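A common way around the property restriction is to pass a local variable and copy it into the property afterwards. The sketch below assumes C# 7 or later for the inline out declaration; Counter, Fill and Build are hypothetical names used only for illustration.

```csharp
using System;

class Counter
{
    public int Value { get; set; }
}

static class PropertyOutDemo
{
    public static void Fill(out int i)
    {
        i = 44;
    }

    public static Counter Build()
    {
        var counter = new Counter();
        // Fill(out counter.Value); // would not compile: a property is not a variable
        Fill(out int temp);         // pass a local variable instead
        counter.Value = temp;       // then copy the result into the property
        return counter;
    }

    static void Main()
    {
        Console.WriteLine(Build().Value); // 44
    }
}
```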
For information about passing arrays, see Passing Arrays Using ref and out (C# Programming Guide).
You can't use the ref and out keywords for the following kinds of methods:
Async methods, which you define by using the async modifier.
Iterator methods, which include a yield return or yield break statement.
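Because async methods cannot declare ref or out parameters, one possible alternative is to return the extra values as part of the result, for example in a tuple. This is just a sketch under assumptions: TryLoadAsync is a hypothetical method, Task.Delay stands in for real asynchronous work, and the code needs C# 7.1 or later (value tuples and async Main).

```csharp
using System;
using System.Threading.Tasks;

static class AsyncResultDemo
{
    // static async Task<bool> TryLoadAsync(out int count) { ... } // would not compile

    // Instead, return both values together as a tuple.
    public static async Task<(bool Ok, int Count)> TryLoadAsync()
    {
        await Task.Delay(10); // stand-in for real asynchronous work
        return (true, 44);
    }

    static async Task Main()
    {
        var (ok, count) = await TryLoadAsync();
        Console.WriteLine($"{ok} {count}"); // True 44
    }
}
```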
Example
Declaring a method with out parameters is useful when you want the method to return multiple values. The following example uses out to return three variables with a single method call. Note that the third argument is assigned null. This enables methods to return values optionally.
C#
class OutReturnExample
{
static void Method(out int i, out string s1, out string s2)
{
i = 44;
s1 = "I've been returned";
s2 = null;
}
static void Main()
{
int value;
string str1, str2;
Method(out value, out str1, out str2);
// value is now 44
// str1 is now "I've been returned"
// str2 is now null
}
}
Just to clarify OP's comment that the use of ref and out is a "reference to a value type or struct declared outside the method", which has already been established as incorrect.
Consider passing a StringBuilder, which is a reference type, first without ref:
private void Nullify(StringBuilder sb, string message)
{
sb.Append(message);
sb = null;
}
// -- snip --
StringBuilder sb = new StringBuilder();
string message = "Hi Guy";
Nullify(sb, message);
System.Console.WriteLine(sb.ToString());
// Output
// Hi Guy
As opposed to this, with ref:
private void Nullify(ref StringBuilder sb, string message)
{
sb.Append(message);
sb = null;
}
// -- snip --
StringBuilder sb = new StringBuilder();
string message = "Hi Guy";
Nullify(ref sb, message);
System.Console.WriteLine(sb.ToString());
// Output
// NullReferenceException
Basically, both ref and out are used for passing objects/values between methods.
The out keyword causes arguments to be passed by reference. This is like the ref keyword, except that ref requires that the variable be initialized before it is passed.
out : Argument is not initialized and it must be initialized in the method
ref : Argument is already initialized and it can be read and updated in the method.
What is the use of “ref” for reference types?
You can change the given reference to a different instance.
Did you know?
Although the ref and out keywords cause different run-time behavior, they are not considered part of the method signature at compile time. Therefore, methods cannot be overloaded if the only difference is that one method takes a ref argument and the other takes an out argument.
You can't use the ref and out keywords for the following kinds of methods:
Async methods, which you define by using the async modifier.
Iterator methods, which include a yield return or yield break statement.
Properties are not variables and therefore cannot be passed as out parameters.
An argument passed as ref must be initialized before it is passed to the method, whereas an out argument need not be initialized before it is passed.
Why would you ever want to use out?
To let others know that the variable will be initialized when it returns from the called method!
As mentioned above:
"for an out parameter, the called method is required to assign a value before the method returns."
example:
Car car;
SetUpCar(out car);
car.drive(); // You know car is initialized.
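The same guarantee is what makes the familiar Try pattern work. Here is a small sketch using the built-in int.TryParse; ParseOrDefault is a hypothetical helper added for illustration, and the inline out declaration assumes C# 7 or later.

```csharp
using System;

static class TryPatternDemo
{
    public static int ParseOrDefault(string text, int fallback)
    {
        // int.TryParse always assigns its out argument:
        // the parsed value on success, 0 on failure.
        return int.TryParse(text, out int parsed) ? parsed : fallback;
    }

    static void Main()
    {
        Console.WriteLine(ParseOrDefault("42", -1));   // 42
        Console.WriteLine(ParseOrDefault("oops", -1)); // -1
    }
}
```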
Extra notes regarding C# 7:
In C# 7 there's no need to predeclare variables that are passed as out arguments. So code like this:
public void PrintCoordinates(Point p)
{
int x, y; // have to "predeclare"
p.GetCoordinates(out x, out y);
WriteLine($"({x}, {y})");
}
Can be written like this:
public void PrintCoordinates(Point p)
{
p.GetCoordinates(out int x, out int y);
WriteLine($"({x}, {y})");
}
Source: What's new in C# 7.
It should be noted that in is also a valid parameter modifier as of C# 7.2:
The in parameter modifier is available in C# 7.2 and later. Previous versions generate compiler error CS8107 ("Feature 'readonly references' is not available in C# 7.0. Please use language version 7.2 or greater.") To configure the compiler language version, see Select the C# language version.
...
The in keyword causes arguments to be passed by reference. It makes the formal parameter an alias for the argument, which must be a variable. In other words, any operation on the parameter is made on the argument. It is like the ref or out keywords, except that in arguments cannot be modified by the called method. Whereas ref arguments may be modified, out arguments must be modified by the called method, and those modifications are observable in the calling context.
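As a rough illustration of the quoted description, the sketch below passes a struct by read-only reference. Point3D and LengthSquared are hypothetical names chosen for this example, and both the in modifier and readonly struct assume C# 7.2 or later.

```csharp
using System;

readonly struct Point3D
{
    public Point3D(double x, double y, double z) { X = x; Y = y; Z = z; }
    public double X { get; }
    public double Y { get; }
    public double Z { get; }
}

static class InDemo
{
    // The struct is passed by reference, but the method cannot modify it.
    public static double LengthSquared(in Point3D p)
    {
        // p = new Point3D(0, 0, 0); // would not compile: cannot assign to an in parameter
        return p.X * p.X + p.Y * p.Y + p.Z * p.Z;
    }

    static void Main()
    {
        var p = new Point3D(1, 2, 2);
        Console.WriteLine(LengthSquared(in p)); // 9
        Console.WriteLine(LengthSquared(p));    // 9 (the in keyword is optional at the call site)
    }
}
```

Declaring the struct as readonly is a design choice here: it lets the compiler avoid defensive copies when members are accessed through the in parameter.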