How to create a struct for Span<byte> - c#

I am trying to create a Struct of the Type Span<byte> but I get a compiler error for this line:
public Span<byte> X { get; set; }
It says:
Field or auto-implemented property cannot be of type Span<byte> unless it is an instance member of a ref struct.
I am not sure what this means or how I can create a struct of Span<byte>. I know structs are very fast so this is the reason I try to use both Span and Structs together.
Thank you!
public struct SpanStruct
{
public Span<byte> X { get; set; }
public SpanStruct(Span<byte> x)
{
X = x;
}
}
void function1()
{
var list = new SpanStruct[1000];
for (int i = 0; i < 1000; i++)
{
Span<byte> span = new Span<byte>();
list[i] = new SpanStruct(span);
}
}

Span is a special type that is designed to live on the stack. You cannot put it inside of a class because that could make what it references not outlive the heap reference. For instance you can use a Span to refer to something that is created via stackalloc.
Typically Span is only used as a local variable to refer to a piece of data in the most abstract way possible. There are instances where it makes sense to store them but in a lot of those cases Memory is better because while it removes the ability to refer to things like stackalloc you likely don't want that ability if you are storing things in classes.
If you need to create a struct that contains a Span denote it with public ref struct SpanStruct but note that means it too can not be put in a class.

It is the language that restricts that we cannot use it as field or property because Span<T> can only live on the stack. When should we use it?
It can be only used as following:
Method Parameter
A return type of a method
A local variable
See the following post
From Microsoft docs:
Span is a ref struct that is allocated on the stack rather than on the managed heap. Ref struct types have a number of restrictions to ensure that they cannot be promoted to the managed heap, including that they can't be boxed

Span is a ref struct that is allocated on the stack rather than on the managed heap. Ref struct types have a number of restrictions to ensure that they cannot be promoted to the managed heap, including that they can't be boxed, they can't be assigned to variables of type Object, dynamic or to any interface type, they can't be fields in a reference type, and they can't be used across await and yield boundaries. In addition, calls to two methods, Equals(Object) and GetHashCode, throw a NotSupportedException.
This is from Microsoft documentation.
You're trying to use Span<T> as a field/property within a reference type - class SpanStruct - which violates this.

Related

Why is the Pinnable<T> class in C# 7.2 defined the way it is?

I'm aware that Pinnable<T> is an internal class used by the methods in the new Unsafe class, and it's not meant to be used anywhere else other than in that class. This question is not about something practical, but it's just to understand why it's been designed like this and to learn a bit more about the language and its various "tricks" like this one.
As a recap, the Pinnable<T> class is defined here, and it looks like this:
[StructLayout(LayoutKind.Sequential)]
internal sealed class Pinnable<T>
{
public T Data;
}
And it's mainly used in the Span<T>.DangerousCreate method, here:
public static Span<T> DangerousCreate(object obj, ref T objectData, int length)
{
Pinnable<T> pinnable = Unsafe.As<Pinnable<T>>(obj);
IntPtr byteOffset = Unsafe.ByteOffset<T>(ref pinnable.Data, ref objectData);
return new Span<T>(pinnable, byteOffset, length);
}
The reason for Pinnable<T> being that it's used to keep track of the original object, in case the Span<T> instance was created by one (instead of a native pointer).
Given that reference type doesn't matter when pinning a reference (fixing both a ref T and Unsafe.As<T, byte>(ref T) works the same), is there a specific reason why the Pinnable<T> class was made generic? The original design in DotNetCross here in fact had a Pinnable class with just a single byte field, and it worked just the same. Is there any reason why using a generic class in this case would be an advantage, other than avoiding to cast the reference time when writing/reading/returning it?
Is there any other way, other than this unsafe-cast done with Unsafe.As, to get a reference to an object (I mean a reference to the object contents, otherwise it'd be the same as any variable of a class type)? I mean, any way to get a reference (which should basically have the same address of the actual object variable in the first place, right?) to an object without having to pass through some custom defined secondary class.
First of all, the Struct in [StructLayout(LayoutKind.Sequential)] doesn't mean that it is only valid for structs, it means the layout of the actual structure of the fields in memory, be it in a class or in a value type. This controls the actual runtime layout of the data, not just how the type would marshal to unmanaged code. The Sequential is important because without it, the runtime is pretty much free to store the memory however it sees fit, which means that Data may have some padding before it.
From what I understand about the implementation, the reason for Pinnable is to allow creating an instance of Span to a memory that may be moved by the GC, without having to pin the object first. If you don't use actual pointers and just references, nothing at all will need to be pinned.
I have noticed that it was introduced in a commit with a description saying it made Span more "portable" (a bold word for something that does a lot of unsafe things). I can't think of any other reason than something related to alignment for why it is generic. I suppose representing a T in terms of an offset from another T is better than as an offset from a byte. It may happen that the type of the first field may play a role in its actual address, even if the type was marked with LayoutKind.Sequential.
A reference to an object is different from an interior reference to an object (a reference to its data). It is implementation defined, but in .NET Framework, an instance of any class (or a boxed value type) starts with a header consisting of a sync block (for lock) and a pointer to the method table, a.k.a. the type of the object. On 32-bit, the header is 8 bytes, but the actual pointer points to the pointer to the method table (for performance reasons, getting the type happens more often than locking an object).
One but not portable way of getting the pointer to the start of the data is therefore casting the object reference to a pointer and adding 4 bytes to it. There the first field should start.
Another way I can think of is utilising GCHandle.AddrOfPinnedObject. It is commonly used for accessing array or string data, but it works for other objects:
[StructLayout(LayoutKind.Sequential)]
class Obj
{
public int A;
}
var obj = new Obj();
var gc = GCHandle.Alloc(obj, GCHandleType.Pinned);
IntPtr interior = gc.AddrOfPinnedObject();
Marshal.WriteInt32(interior, 0, 16);
Console.WriteLine(obj.A);
I think this actually is quite portable, but still needs to pin the object (there is InternalAddrOfPinnedObject defined in GCHandle, but even if that doesn't check whether the handle is actually pinned, the returned value may not be valid if it was used on a non-pinned object).
Still, the technique Span uses seems like the most portable way of doing that, since a lot of the underlying work is done in pure CIL (like reference arithmetics).

Obtain non-explicit field offset

I have the following class:
[StructLayout(LayoutKind.Sequential)]
class Class
{
public int Field1;
public byte Field2;
public short? Field3;
public bool Field4;
}
How can I get the byte offset of Field4 starting from the start of the class data (or object header)?
To illustrate:
Class cls = new Class();
fixed(int* ptr1 = &cls.Field1) //first field
fixed(bool* ptr2 = &cls.Field4) //requested field
{
Console.WriteLine((byte*)ptr2-(byte*)ptr1);
}
The resulting offset is, in this case, 5, because the runtime actually moves Field3 to the end of the type (and pads it), probably because it its type is generic. I know there is Marshal.OffsetOf, but it returns unmanaged offset, not managed.
How can I retrieve this offset from a FieldInfo instance? Is there any .NET method used for that, or do I have to write my own, taking all the exceptions into account (type size, padding, explicit offsets, etc.)?
Offset of a field within a class or struct in .NET 4.7.2:
public static int GetFieldOffset(this FieldInfo fi) =>
GetFieldOffset(fi.FieldHandle);
public static int GetFieldOffset(RuntimeFieldHandle h) =>
Marshal.ReadInt32(h.Value + (4 + IntPtr.Size)) & 0xFFFFFF;
These return the byte offset of a field within a class or struct, relative to the layout of some respective managed instance at runtime. This works for all StructLayout modes, and for both value- and reference-types (including generics, reference-containing, or otherwise non-blittable). The offset value is zero-based relative to the beginning of the user-defined content or 'data body' of the struct or class only, and doesn't include any header, prefix, or other pad bytes.
Discussion
Since struct types have no header, the returned integer offset value can used directly via pointer arithmetic, and System.Runtime.CompilerServices.Unsafe if necessary (not shown here). Reference-type objects, on the other hand, have a header which has to be skipped-over in order to reference the desired field. This object header is usually a single IntPtr, which means IntPtr.Size needs to be added to the the offset value. It is also necessary to dereference the GC ("garbage collection") handle to obtain the object's address in the first place.
With these considerations, we can synthesize a tracking reference to the interior of a GC object at runtime by combining the field offset (obtained via the method shown above) with an instance of the class (e.g. an Object handle).
The following method, which is only meaningful for class (and not struct) types, demonstrates the technique. For simplicity, it uses ref-return and the System.Runtime.CompilerServices.Unsafe libary. Error checking, such as asserting fi.DeclaringType.IsSubclassOf(obj.GetType()) for example, is also elided for simplicity.
/// <summary>
/// Returns a managed reference ("interior pointer") to the value or instance of type 'U'
/// stored in the field indicated by 'fi' within managed object instance 'obj'
/// </summary>
public static unsafe ref U RefFieldValue<U>(Object obj, FieldInfo fi)
{
var pobj = Unsafe.As<Object, IntPtr>(ref obj);
pobj += IntPtr.Size + GetFieldOffset(fi.FieldHandle);
return ref Unsafe.AsRef<U>(pobj.ToPointer());
}
This method returns a managed "tracking" pointer into the interior of the garbage-collected object instance obj.[see comment] It can be used to arbitrarily read or write the field, so this one function replaces the traditional pair of separate getter/setter functions. Although the returned pointer cannot be stored in the GC heap and thus has a lifetime limited to the scope of the current stack frame (i.e., and below), it is very cheap to obtain at any time by simply calling the function again.
Note that this generic method is only parameterized with <U>, the type of the fetched pointed-at value, and not for the type ("<T>", perhaps) of the containing class (the same applies for the IL version below). It's because the bare-bones simplicity of this technique doesn't require it. We already know that the containing instance has to be a reference (class) type, so at runtime it will present via a reference handle to a GC object with object header, and those facts alone are sufficient here; nothing further needs to be known about putative type "T".
It's a matter of opinion whether adding vacuous <T, … >, which would allow us to indicate the where T: class constraint, would improve the look or feel of the example above. It certainly wouldn't hurt anything; I believe the JIT is smart enough to not generate additional generic method instantiations for generic arguments that have no effect. But since doing so seems chatty (other than for stating the constraint), I opted for the minimalism of strict necessity here.
In my own use, rather than passing a FieldInfo or its respective FieldHandle every time, what I actually retain are the various integer offset values for the fields of interest as returned from GetFieldOffset, since these are also invariant at runtime, once obtained. This eliminates the extra step (of calling GetFieldOffset) each time the pointer is fetched. In fact, since I am able to include IL code in my projects, here is the exact code that I use for the function above. As with the C# just shown, it trivially synthesizes a managed pointer from a containing GC-object obj, plus a (retained) integer offset offs within it.
// Returns a managed 'ByRef' pointer to the (struct or reference-type) instance of type U
// stored in the field at byte offset 'offs' within reference type instance 'obj'
.method public static !!U& RefFieldValue<U>(object obj, int32 offs) aggressiveinlining
{
ldarg obj
ldarg offs
sizeof object
add
add
ret
}
So even if you are not able to directly incorporate this IL, showing it here, I think, nicely illustrates the extremely low runtime overhead and alluring simplicity, in general, of this technique.
Example usage
class MyClass { public byte b_bar; public String s0, s1; public int iFoo; }
The first demonstration gets the integer offset of reference-typed field s1 within an instance of MyClass, and then uses it to get and set the field value.
var fi = typeof(MyClass).GetField("s1");
// note that we can get a field offset without actually
// having any instance of 'MyClass'
var offs = GetFieldOffset(fi);
// i.e., later...
var mc = new MyClass();
RefFieldValue<String>(mc, offs) = "moo-maa"; // field "setter"
// note: method call used as l-value, on the left-hand side of '=' assignment!
RefFieldValue<String>(mc, offs) += "!!"; // in-situ access
Console.WriteLine(mc.s1); // --> moo-maa!! (in the original)
// can be used as a non-ref "getter" for by-value access
var _ = RefFieldValue<String>(mc, offs) + "%%"; // 'mc.s1' not affected
If this seems a bit cluttered, you can dramatically clean it up by retaining the managed pointer as ref local variable. As you know, this type of pointer is automatically adjusted--with interior offset preserved--whenever the GC moves the containing object. This means that it will remain valid even as you continue accessing the field unawares. In exchange for allowing this capability, the CLR requires that the ref local variable itself not be allowed to escape its stack frame, which in this case is enforced by the C# compiler.
// demonstrate using 'RuntimeFieldHandle', and accessing a value-type
// field (int) this time
var h = typeof(MyClass).GetField(nameof(mc.iFoo)).FieldHandle;
// later... (still using 'mc' instance created above)
// acquire managed pointer to 'mc.iFoo'
ref int i = ref RefFieldValue<int>(mc, h);
i = 21; // directly affects 'mc.iFoo'
Console.WriteLine(mc.iFoo == 21); // --> true
i <<= 1; // operates directly on 'mc.iFoo'
Console.WriteLine(mc.iFoo == 42); // --> true
// any/all 'ref' uses of 'i' just affect 'mc.iFoo' directly:
Interlocked.CompareExchange(ref i, 34, 42); // 'mc.iFoo' (and 'i' also): 42 -> 34
Summary
The usage examples focused on using the technique with a class object, but as noted, the GetFieldOffset method shown here works perfectly fine with struct as well. Just be sure not to use the RefFieldValue method with value-types, since that code includes adjusting for an expected object header. For that simpler case, just use System.Runtime.CompilerServicesUnsafe.AddByteOffset for your address arithmetic instead.
Needless to say, this technique might seem a bit radical to some. I'll just note that it has worked flawlessly for me for many years, specifically on .NET Framework 4.7.2, and including 32- and 64-bit mode, debug vs. release, plus whichever various JIT optimization settings I've tried.
With some tricks around TypedReference.MakeTypedReference, it is possible to obtain the reference to the field, and to the start of the object's data, then just subtract. The method can be found in SharpUtils.

User Defined Types (UDT) in C#?

I know in VB you can define a UDT like:
Public Type Buffer1
ProductCode As String
SerialNumber As String
Date As Date
End Type
Is it possible to create a User Defined Type in C#? I know that they are used to implement data structures. I have researched and cannot seem to find anything.
The .NET equivalent of the VB6 Type is a struct with public fields. If what you need is the equivalent of the VB6 type, you should the ignore people who are telling you not to use a mutable struct.
Some people believe all data types should behave like class objects, and since structs with exposed fields behave in ways that class objects cannot they are evil.
In reality, there are times when the equivalent of a VB6 Type can be very useful, and any attempt to achieve such functionality with anything other than an exposed-field or otherwise "mutable" struct will be more cumbersome and yield less performant code than would using a type whose natural semantics precisely match what one is trying to achieve. Note that struct fields should generally be exposed as fields rather than properties, since the whole purpose of a VB6 Type is to serve as an aggregation of data types, rather than an ecapsulation of protected state.
There is, alas, one problem with .NET structure types, which is that unless you declare a structure as an unsafe type, only usable within unsafe code (and not usable in contexts which require being able to run in a Partial Trust environment), arrays within structures cannot achieve the semantics of arrays nor fixed-length strings in a VB6 Type. C# will allow one to declare fixed-sized arrays within a struct using the fixed keyword, but as noted only in conjunction with unsafe. Any other array declarations within a struct will cause a struct to hold an array reference rather than an array. Thus:
struct MyStruct {int n; int[] Arr; }
MyStruct a,b;
a.Arr = new int[4];
a.Arr[0] = 3;
a.n = 1;
b=a; // Copies the value of n, and the array reference in Arr, but not the array contents
b.n = 2;
b.Arr[1] = 2;
After the above code runs, b has its own copy of variable n, so the write to b.n will not affect a.n. Field b.Arr, however, holds a reference to the same array as a.Arr. An assignment directly to field Arr of b could make it point to a different array from a.Arr, but the assignment to b.Arr[1] doesn't actually write to field Arr but instead writes to the array object identified by that field, which is the same object identified by field a.Arr.
In some cases, you can replace what in VB6 would be a fixed-sized array with a simple list of discrete variables. Icky, but workable. Alternatively, if the restrictions associated with fixed arrays aren't a problem, you could use those. Beyond that, there isn't any good way to include arrays within a structure.
C# have classes and structs:
public class Buffer1
{
string ProductCode;
string SerialNumber;
DateTime Date;
}
But if you don't know classes, you must previously read some articles or books about C# (e.g. CLR via C#)
A "User Defined Data Type" from Visual Basic is called a Structure in VB.Net, and a struct in C#.
public struct Buffer1
{
public string ProductCode;
public string SerialNumber;
public DateTime Date;
}
Note that unlike the VB6 Type, which always makes all fields public, fields in all .NET types, including structures, default to private.

VARIANT datatype of C++ into C#

What is equivalent of the VARIANT datatype of C++ in C#?
I have code in C++ which uses the VARIANT datatype. How can I convert that code in C#?
Well, there are actually two variant's in C++: boost::variant and COM variant. The solution follows more or less the same idea, but the former is more complex. I expect you mean to use the latter.
Let me first start by telling that this is something you just shouldn't use if possible. That said, this is how you do it :-)
Variants and interop
Variants are sometimes used in interop of if you need the byte representation to be the same.
If you're dealing with interop, make sure to check out the VariantWrapper class on MSDN and make it work like that.
Variants and porting considerations
Variants are mostly used in APIs, and usually like this:
void Foo(SomeEnum operation, Variant data);
The reason it's done like this in C++ is because there is no base object class and you therefore need something like this. The easiest way to port this is to change the signature to:
void Foo(SomeEnum operation, object data);
However, if you're porting anyway, you also seriously want to consider these two, since they are resolved at compile-time and can save you the big 'switch' that usually follows in method Foo:
void SomeOperation(int data);
void SomeOperation(float data);
// etc
Variants and byte consistency
In rare cases you need to manipulate the bytes themselves.
Essentially the variant is just a big union of value types wrapped in a single value type (struct). In C++, you can allocate a value type on the heap because a struct is the same as a class (well sort-of). How the value type is being used is just a bit important but more on that later.
Union simply means you are going to overlap all the data in memory. Notice how I explicitly noted value type above; for variant's this is basically what it's all about. This also gives us a way to test it - namely by checking another value in the struct.
The way to do this in C# is to use the StructLayout attribute in a value type, which basically works as follows:
[StructLayout(LayoutKind.Explicit)]
public struct Variant
{
[FieldOffset(0)]
public int Integer;
[FieldOffset(0)]
public float Float;
[FieldOffset(0)]
public double Double;
[FieldOffset(0)]
public byte Byte;
// etc
}
// Check if it works - shouldn't print 0.
public class VariantTest
{
static void Main(string[] args)
{
Variant v = new Variant() { Integer = 2 };
Console.WriteLine("{0}", v.Float);
Console.ReadLine();
}
}
C++ variant's can also be stored on the heap as I noted earlier. If you do this, you probably still want the memory signature to be the same. The way to do this is to box the Variant struct we build earlier by simply casing it to object.
This is a tricky question.
From C# 4, you can use dynamic to indicate that the type is known at run-time.
By my personal understanding, however, c++ requires the type known at compile time. Thus you might consider to use object, but object in C# is an existent type.
For the concept of multi-type, single value (AKA polymorphism) of VARIANT, you would not need to find a corresponding type in C#, just define your classes and interfaces. You can always reference an object as its interface which the class implements.
If you are porting the code, and to figure out a syntax that you can simply use in LHS and for the considering of the type is known at compile time, then use var.
When .NET implements a COM interface, just use VARIANT* instead.
Then bypass marshalling on the .NET receiving side by using a IntPtr type to receive the pointer.
public class ComVariant
{
[StructLayout(LayoutKind.Sequential)]
public struct Variant
{
public ushort vt;
public ushort wReserved1;
public ushort wReserved2;
public ushort wReserved3;
public Int32 data01;
public Int32 data02;
}
private Variant _variant;
private IntPtr _variantPtr;
public ComVariant(int variantPtr) : this(new IntPtr(variantPtr))
{
}
public ComVariant(IntPtr variantPtr)
{
_variant = (Variant)Marshal.PtrToStructure(variantPtr, typeof(Variant));
_variantPtr = variantPtr;
}
public VarEnum Vt
{
get
{
return (VarEnum)_variant.vt;
}
set
{
_variant.vt = (ushort)value;
}
}
public object Object
{
get
{
return Marshal.GetObjectForNativeVariant(_variantPtr);
}
}
}
then if you are accessing a VT_UNKNOWN pointing to a COM interface object instance, just
var variant = new ComVariant(variantPtr);
var stream = variant.Object as IStream; // will not be null if type is correct
var obj = variant.Object as IObj; // in general...
will do the trick, but pay attention not to use a newly allocated VARIANT and giving its ownership to the .NET implementation without deallocating it somewhere...
For more complex code you might read this article which also talks about memory management.
Let's take a step back. Sooner or later, we want the actual data in the VARIANT. A VARIANT is just a holder for meaningful data. Suppose we converted the VARIANT to some sort of Object in C# that had the variant type and some raw buffer under the .NET hood (e.g. .NET strings can expose the raw buffer). At that point, the VARIANT type would need to be determined from the object and the raw data converted or cast to the data type specified by the variant, and then create a new Object e.g. string/int/etc. from the raw data.
So, rather than worry about passing the VARIANT to C#, look at the variant data type and convert it in C++ to the actual data type and pass that to C#.
For example, if the VARIANT type is VT_INT, then get the int from the variant and can use something like:
VARIANT var;
Int^ returnInt = gcnew Int(var.intVal);
returnInt can be returned as an Out parameter from a C++ function in a C++ dll that can be called from C#. The C++ dll needs to use /clr option.
Function would look like:-
void ThisFunctionReturnsAnInt(Runtime::InteropServices::OutAttribute Int^ % returnIntValue)
{
VARIANT var;
Int^ returnInt = gcnew Int(var.intVal);
}
Can use similar approach for other data types. It's only natural, a VARIANT of VT_INT is really just like an int, it's not as if there's some major conversion going on, you're just taking the actual value out of the VARIANT at the time when you're interested in it as you would if you were passing a straight integer value from C++ to C#. You would still need to do the gcnew anyway.

change struct in method

How can I change struct in external method ?
public void ChangeStruct (MyStruct myStruct) {
myStruct.field1 = 10;
return;
}
When I pass struct to ChangeStruct method after that method I would like myStruct to be changed.
You need to pass a reference to the struct instead of a copy using the ref keyword :
public void ChangeStruct (ref MyStruct myStruct)
{
myStruct.field1 = 10;
}
ChangeStruct(ref someStruct);
Your current code create a full bit-for-bit copy of the struct before entering the method and it's this copy that you are modifying, the ref keyword force the caller to pass a reference (managed pointer) to the structure instead of the copy.
You can use the ref keyword to observe changes to structs, but in the grand scheme, you will be in a world of less hurt if you simply use a class.
For an idea on when to use or not use structs, you might consult this link. A quick snippet that you may find helpful:
Do not define a structure unless the type has all of the following characteristics:
It logically represents a single value, similar to primitive types
(integer, double, and so on).
It has an instance size smaller than 16
bytes.
It is immutable.
It will not have to be boxed frequently.
Structs are value types, you must use the ref keyword to prevent copying. Using ref and out is not recommended, see When is using the C# ref keyword ever a good idea?.

Categories