Creating a bitfield class that points to arbitrary data- is this safe?


Context
I am creating a Bitfield class that is responsible for providing access to a contiguous set of bits in a UInt32. The source data is not managed by the Bitfield, but instead another object. In practice, the same object that owns the source data will also own any Bitfield instances that point to it, so the pointer lifetime will never exceed that of the source data. All parameters passed to the Bitfield constructor are determined at runtime. My current approach is as follows:
public class Bitfield
{
private int offset;
private uint mask;
unsafe private uint* data;
unsafe public Bitfield(uint* data, int msb, int lsb)
{
this.data = data;
mask = (uint)(((1UL << (msb + 1)) - 1) ^ ((1UL << lsb) - 1));
offset = lsb;
}
unsafe public void Set(uint value) => *data = ((value << offset) & mask) | (*data & ~mask);
unsafe public uint Get() => (*data & mask) >> offset;
}
In the using application, a Bitfield might be employed as below:
class Program
{
static uint sourceData = 0xDEADBEEF;
unsafe static void Main(string[] args)
{
Bitfield high;
Bitfield low;
fixed (uint* data = &sourceData)
{
high = new Bitfield(data, 31, 16);
low = new Bitfield(data, 15, 0);
}
Console.WriteLine($"DEAD ?= {high.Get():X}");
Console.WriteLine($"BEEF ?= {low.Get():X}");
high.Set(0xFEED);
Console.WriteLine($"FEEDBEEF ?= {sourceData:X}");
Console.ReadKey();
}
}
Main question
Is this a sound approach or should I seek a different strategy?
Other considerations
I read that the Garbage Collector may rearrange memory, hence the fixed body. When this happens, I worry that the pointer will not be updated to match sourceData's new location. Therefore, high and low will operate on invalid data, rendering my approach dangerous. Can someone confirm this? I could pass the source data to the Get/Set methods with ref and achieve the same result as the pointer, but then the caller must keep track of which source data to pass to which Bitfields (This will vary by owning object at runtime).
Side questions:
Is there perhaps a Reflection construct that would work similarly? (I don't know much about Reflection.)
Why does the Garbage Collector rearrange memory? Is it to combat fragmentation?

I agree that the approach is not sound, and can lead to data corruption and/or invalid memory access. It seems that my dream of operating on arbitrary data would require some feature that ensures the referenced memory can't be moved without a pointer to it noticing. Unfortunately, neither BitVector32 nor BitArray quite meet this original intent either, but they did start me thinking.
Upon reviewing the use cases, I have settled on a solution that changes how I refer to the source data. Like the other suggestions, it doesn't meet the original intent. However, the source data will always be an IList<uint>, and never a lone value. With this in mind, I created a class that encapsulates the source data and manages a dictionary of field descriptors. It then provides access to both individual fields and the source data array. In this way, the fields remain flexible and can point to different data at runtime using a reference and an index instead of a pointer.
// This is now just a descriptor of the field parameters
public struct Bitfield
{
public int index;
public int offset;
public uint mask;
public Bitfield(int index, int msb, int lsb)
{
this.index = index;
mask = (uint)(((1UL << (msb + 1)) - 1) ^ ((1UL << lsb) - 1));
offset = lsb;
}
}
// The data is now encapsulated in its own class
public class DataArray
{
public IList<uint> data;
private Dictionary<string, Bitfield> fields = new Dictionary<string, Bitfield>();
public DataArray(IList<uint> sourceData)
{
// I don't care if this is a copy or reference assignment as long as
// I use DataArray.data to access the array from now on
data = sourceData;
}
public void AddField(string name, int index, int msb, int lsb)
{
fields[name] = new Bitfield(index, msb, lsb);
}
public uint Get(string name)
{
uint result = 0;
if(fields.TryGetValue(name, out Bitfield field))
{
result = (data[field.index] & field.mask) >> field.offset;
}
else
{
throw new KeyNotFoundException($"Unknown field name: {name}");
}
return result;
}
public void Set(string name, uint value)
{
if(fields.TryGetValue(name, out Bitfield field))
{
data[field.index] = ((value << field.offset) & field.mask) | (data[field.index] & ~field.mask);
}
else
{
throw new KeyNotFoundException($"Unknown field name: {name}");
}
}
}
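For comparison, here is a minimal sketch of a variant that stays closer to the original per-field object while still avoiding pointers. Each field keeps a reference to the list plus an index, so the garbage collector can move the data freely. The class name BoundBitfield and the argument checks are assumptions added for illustration, not part of the code above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical name: a bitfield bound to one element of an IList<uint>.
public sealed class BoundBitfield
{
    private readonly IList<uint> data;
    private readonly int index;
    private readonly int offset;
    private readonly uint mask;

    public BoundBitfield(IList<uint> data, int index, int msb, int lsb)
    {
        if (data == null) throw new ArgumentNullException(nameof(data));
        if (index < 0 || index >= data.Count) throw new ArgumentOutOfRangeException(nameof(index));
        if (lsb < 0 || msb > 31 || lsb > msb) throw new ArgumentOutOfRangeException(nameof(msb));

        this.data = data;
        this.index = index;
        // Same mask formula as the original Bitfield: bits lsb..msb set.
        mask = (uint)(((1UL << (msb + 1)) - 1) ^ ((1UL << lsb) - 1));
        offset = lsb;
    }

    // Reads the field from the referenced element.
    public uint Get() => (data[index] & mask) >> offset;

    // Writes the field, leaving the other bits of the element untouched.
    public void Set(uint value) =>
        data[index] = ((value << offset) & mask) | (data[index] & ~mask);
}
```

With a one-element array holding 0xDEADBEEF, a field over bits 31..16 reads as 0xDEAD, and setting it to 0xFEED leaves the element at 0xFEEDBEEF.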

Related

Convert Bitfield to array

I have a uint called Forced that contains 32 bits.
I do stuff like:
if((Forced & 512) != 0)
doStuff();
What I am looking to do is put forced into an array which would then turn into:
if(ForcedArray[(int)Math.Log(512,2)])
doStuff();
Is there a convenient way in .NET to do this? What would be a convenient way to convert a bitfield to an array?

You could write an extension method for this:
public static class UIntExtensions
{
public static bool IsBitSet(this uint i, int bitNumber)
{
// Parentheses are required: != binds tighter than &.
return (i & (1u << bitNumber)) != 0;
}
}
Or, if you want to do this the C#6 way:
public static class UIntExtensions
{
public static bool IsBitSet(this uint i, int bitNumber) => (i & (1 << bitNumber)) != 0;
}
Which is pretty easy to use from code:
if(Forced.IsBitSet((int)Math.Log(512,2)))
doStuff();
Obviously, a few checks for having a bit number >= 0 or <= 31 need to be added, but you get the idea.
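The range check mentioned above could look roughly like this. It is a sketch only: the class and method names (CheckedUIntExtensions, IsBitSetChecked) are made up so they do not clash with the extension shown earlier.

```csharp
using System;

public static class CheckedUIntExtensions
{
    // Returns true if the given bit (0 = least significant) is set.
    // Throws instead of silently wrapping the shift count for values outside 0..31.
    public static bool IsBitSetChecked(this uint value, int bitNumber)
    {
        if (bitNumber < 0 || bitNumber > 31)
            throw new ArgumentOutOfRangeException(nameof(bitNumber), "Bit number must be between 0 and 31.");

        return (value & (1u << bitNumber)) != 0;
    }
}
```

The check matters because C# masks the shift count of a 32-bit operand to its low five bits, so a shift by 32 would quietly test bit 0.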

Using a bit shift to access the bits of an integer, Forced & (1 << bitNumber), sounds like a good approach (a nice function wrapping the access is shown in Ron Beyer's answer).
Most readers of the code will be puzzled by such a transformation of a compact single-word field into a more complicated data structure like an array. Please consider avoiding it unless there are other reasons (an external API constraint such as JSON serialization) or a significant readability gain.
As an intermediate approach, you can create a small wrapper structure that holds the integer value and additionally exposes indexed access to each bit (preferably immutable).
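A wrapper along those lines might be sketched as follows. The name BitFlags and its members are hypothetical, and the struct is kept immutable by exposing only a getter.

```csharp
using System;

// Hypothetical immutable wrapper around a 32-bit flag word.
public struct BitFlags
{
    private readonly uint value;

    public BitFlags(uint value)
    {
        this.value = value;
    }

    // The raw integer, for code that still wants the compact form.
    public uint Value
    {
        get { return value; }
    }

    // Indexed read access to a single bit (0 = least significant).
    public bool this[int bit]
    {
        get
        {
            if (bit < 0 || bit > 31)
                throw new ArgumentOutOfRangeException(nameof(bit));
            return (value & (1u << bit)) != 0;
        }
    }
}
```

Call sites then read like array access, for example `new BitFlags(Forced)[9]`, without allocating an array.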
If you really want an array, a basic for loop or LINQ can be used to transform each bit into a boolean. For example, for a single integer (you may need to adjust the order depending on which bit you need first; this one puts the lowest bit first):
var array = Enumerable.Range(0, 32)
.Select(bitNumber => (Forced & (1 << bitNumber)) !=0)
.ToArray();

public static class UIntExtensions
{
public static byte[] GetBitArray(this uint v)
{
var r = new byte[32];
for (var i = 0; i < 32; ++i)
{
// Store the lowest bit, then shift the next bit into place.
r[i] = (byte)(v & 1);
v = v >> 1;
}
return r;
}
}

interop with nim return Struct Array containing a string /char* member

I am interoperating with a Nim DLL from C#, and I can call and execute the code below.
If I add another function (proc) that calls GetPacks() and echoes each element's buffer, I can see the output in the C# console correctly.
However, I cannot transfer the data as it is. I have tried everything but could not accomplish the task.
proc GetPacksPtrNim(parSze: int, PackArrINOUT: var DataPackArr){.stdcall,exportc,dynlib.} =
PackArrINOUT.newSeq(parSze)
var dummyStr = "abcdefghij"
for i, curDataPack in PackArrINOUT.mpairs:
dummyStr[9] = char(i + int8'0')
curDataPack = DataPack(buffer:dummyStr, intVal: uint32 i)
type
DataPackArr = seq[DataPack]
DataPack = object
buffer: string
intVal: uint32
When I do the same in C/C++, the type I use on the C# side is either an IntPtr or a char*, which happily holds the returned buffer member.
EXPORT_API void __cdecl c_returnDataPack(unsigned int size, dataPack** DpArr)
{
unsigned int dumln, Index;dataPack* CurDp = {NULL};
char dummy[STRMAX];
*DpArr = (dataPack*)malloc( size * sizeof( dataPack ));
CurDp = *DpArr;
strncpy(dummy, "abcdefgHij", STRMAX);
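/* Use the string length, not the buffer size, so the last visible character is the one replaced. */
dumln = (unsigned int)strlen(dummy);
for ( Index = 0; Index < size; Index++,CurDp++)
{
CurDp->IVal = Index;
dummy[dumln-1] = '0' + Index % (126 - '0');
/* Room for the characters plus the terminating zero. */
CurDp->Sval = (char*) calloc (dumln + 1, sizeof(char));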
strcpy(CurDp->Sval, dummy);
}
}
c# signature for c code above
[DllImport(@"cdllI.dll", CallingConvention = CallingConvention.Cdecl), SuppressUnmanagedCodeSecurity]
private static extern uint c_returnDataPack(uint x, DataPackg.TestC** tcdparr);
C# Struct
public unsafe static class DataPackg
{
[StructLayout(LayoutKind.Sequential)]
public struct TestC
{
public uint Id;
public IntPtr StrVal;
}
}
finally calling the function like so:
public static unsafe List<DataPackg.TestC> PopulateLstPackC(int ArrL)
{
DataPackg.TestC* PackUArrOut;
List<DataPackg.TestC> RtLstPackU = new List<DataPackg.TestC>(ArrL);
c_returnDataPack((uint)ArrL, &PackUArrOut);
DataPackg.TestC* CurrentPack = PackUArrOut;
for (int i = 0; i < ArrL; i++, CurrentPack++)
{
RtLstPackU.Add(new DataPackg.TestC() { StrVal = CurrentPack->StrVal, Id = CurrentPack->Id });
}
//Console.WriteLine("Res={0}", Marshal.PtrToStringAnsi((IntPtr)RtLstPackU[1].StrVal));//new string(RtLstPackU[0].StrVal));
return RtLstPackU;
}
How can I produce something similar to the C code above from Nim?
It doesn't have to be the same code, only the same effect: in C# I want to be able to read the content of the string. For now, the int is readable but the string is not.
Edit:
this is what i tried to make things simple
struct array of int members
Update:
It seems that the problem has to do with my Nim settings on Windows.
I will update as soon as I discover what exactly is wrong.

The string type in Nim is not equivalent to the C's const char* type. Strings in Nim are represented as pointers, pointing into a heap-allocated chunk of memory, which has the following layout:
NI length; # the length of the stored string
NI capacity; # how much room do we have for growth
NIM_CHAR data[capacity]; # the actual string, zero-terminated
Please beware that these types are architecture-specific, and they are really an implementation detail of the compiler that can change in the future. NI is the architecture-default integer type and NIM_CHAR is usually equivalent to an 8-bit char, since Nim leans towards the use of UTF-8.
With this in mind, you have several options:
1) You can teach C# about this layout and access the string buffers at their correct location (the above caveats apply). An example implementation of this approach can be found here:
https://gist.github.com/zah/fe8f5956684abee6bec9
2) You can use a different type for the buffer field in your Nim code. Possible candidates are ptr char or the fixed size array[char]. The first one will require you to give up the automatic garbage collection and maintain a little bit of code for manual memory management. The second one will give up a little bit of space efficiency and it will put hard-limits on the size of these buffers.
EDIT:
Using cstring may also look tempting, but it's ultimately dangerous. When you assign a regular string to a cstring, the result will be a normal char * value, pointing to the data buffer of the Nim string described above. Since the Nim garbage collector handles properly interior pointers to allocated values, this will be safe as long as the cstring value is placed in a traced location like the stack. But when you place it inside an object, the cstring won't be traced and nothing prevents the GC from releasing the memory, which may create a dangling pointer in your C# code.
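To illustrate option 1 without relying on the linked gist, here is a rough sketch that reads a string through the layout described above (length, then capacity, then the characters). It assumes that layout exactly, which, as noted, is a compiler implementation detail and may not match your Nim version. NimStringReader and Read are hypothetical names.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

public static class NimStringReader
{
    // nimString is the pointer stored in the Nim object's string field.
    // Assumed layout: NI length; NI capacity; NIM_CHAR data[]; with NI pointer-sized.
    public static string Read(IntPtr nimString)
    {
        if (nimString == IntPtr.Zero)
            return string.Empty;

        // First pointer-sized integer: the length in bytes.
        int length = checked((int)Marshal.ReadIntPtr(nimString).ToInt64());

        // The characters start after the length and capacity fields.
        IntPtr chars = IntPtr.Add(nimString, 2 * IntPtr.Size);

        byte[] bytes = new byte[length];
        Marshal.Copy(chars, bytes, 0, length);
        return Encoding.UTF8.GetString(bytes);
    }
}
```

On the C# side the struct member would stay an IntPtr, as in the question, and be passed to Read. The GC caveat from the EDIT still applies: the Nim side must keep the string alive while C# reads it.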

Try to change your struct to:
public unsafe static class DataPackg
{
[StructLayout(LayoutKind.Sequential)]
public struct TestC
{
public uint Id;
[MarshalAs(UnmanagedType.LPStr)]
public String StrVal;
}
}

is it possible to have dynamic struct members dependent to another member in c#

in this case a binary file is written with the file format based on a struct
struct fileformat
{
struct mask
{
bool mem1present
bool mem2present
bool mem3present
//5 bits unused
}
//member only written in file if mem1present is true
byte mem1
//member only written in file if mem2present is true
byte mem2
//member only written in file if mem3present is true
byte mem3
}
is this possible to be implemented in c#

Sure - you have to implement the serialization yourself to some extent, but you can do that easily enough.
It's unclear what sort of serialization you're using - if you're using the "raw" binary serialization from .NET, you want to override GetObjectData to only add the relevant data on serialization, and then in the protected constructor taking a SerializationInfo and a StreamingContext, populate your struct from the same data in reverse. See this MSDN article for some details.
I don't know what happens if you're using XML serialization.
If you're writing your own serialization (i.e. you've got a method such as WriteToStream) then you can choose to represent it however you want, of course.
EDIT: It sounds like you've probably got an existing file format you need to read in, but you can define your own types. It's easy to have a class or struct with multiple members and possibly a mask to say what's set, although without knowing more it may not be the best design. While you can use explicit layout to make this efficient in memory, it's probably easiest just to have separate members:
struct Foo
{
// Bit-set to determine which fields are actually used
private readonly byte mask;
private readonly int value1;
private readonly int value2;
private readonly int value3;
public Foo(byte mask, int value1, int value2, int value3)
{
this.mask = mask;
this.value1 = value1;
this.value2 = value2;
this.value3 = value3;
}
}
Then somewhere (either in the data type or not), something like:
Foo ReadFoo(Stream stream)
{
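// Stream.ReadByte returns an int, or -1 at the end of the stream
int rawMask = stream.ReadByte();
if (rawMask < 0)
{
throw new EndOfStreamException();
}
byte mask = (byte)rawMask;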
int value1 = 0, value2 = 0, value3 = 0;
if ((mask & 1) == 1)
{
// However you do that, depending on your file format
value1 = ReadInt32FromStream(stream);
}
if ((mask & 2) == 2)
{
// However you do that, depending on your file format
value2 = ReadInt32FromStream(stream);
}
if ((mask & 4) == 4)
{
// However you do that, depending on your file format
value3 = ReadInt32FromStream(stream);
}
return new Foo(mask, value1, value2, value3);
}
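The writing side can mirror this. The sketch below assumes the same conventions as ReadFoo (a one-byte mask followed by only the values whose bits are set) and additionally assumes 32-bit little-endian integers, which is what BinaryWriter emits. MaskedRecordWriter is a hypothetical helper, not part of the code above.

```csharp
using System;
using System.IO;

public static class MaskedRecordWriter
{
    // Writes the mask byte, then only the values whose mask bit is set.
    public static void Write(Stream stream, byte mask, int value1, int value2, int value3)
    {
        if (stream == null) throw new ArgumentNullException(nameof(stream));

        // The writer is not disposed so the caller keeps ownership of the stream.
        var writer = new BinaryWriter(stream);
        writer.Write(mask);
        if ((mask & 1) != 0) writer.Write(value1);
        if ((mask & 2) != 0) writer.Write(value2);
        if ((mask & 4) != 0) writer.Write(value3);
        writer.Flush();
    }
}
```

A record with mask 5 therefore occupies nine bytes: the mask, value1 and value3, with value2 omitted entirely.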
By the way, I would seriously consider whether a struct is really the best approach here - consider using a class instead. I very rarely create my own structs.

Note: Your sample shows only the declaration of a nested struct type, not an instance of it.
From your question wording, you need an instance member.
struct fileformat
{
struct mask // type declaration only
{
bool mem1present
bool mem2present
bool mem3present
//5 bits unused
}
public mask mask; // <-- Member instance here
}
I apologize if I've misunderstood. Perhaps your struct was only to communicate the structure of the file to us?

How to pinvoke a variable-length array of structs from GetTokenInformation() safely for 32-bit and 64-bit? C#

I'm following the pinvoke code provided here but am slightly scared by the marshalling of the variable-length array as size=1 and then stepping through it by calculating an offset instead of indexing into an array. Isn't there a better way? And if not, how should I do this to make it safe for 32-bit and 64-bit?
[StructLayout(LayoutKind.Sequential)]
public struct SID_AND_ATTRIBUTES
{
public IntPtr Sid;
public uint Attributes;
}
[StructLayout(LayoutKind.Sequential)]
public struct TOKEN_GROUPS
{
public int GroupCount;
[MarshalAs(UnmanagedType.ByValArray, SizeConst = 1)]
public SID_AND_ATTRIBUTES[] Groups;
};
public void SomeMethod()
{
IntPtr tokenInformation;
// ...
string retVal = string.Empty;
TOKEN_GROUPS groups = (TOKEN_GROUPS)Marshal.PtrToStructure(tokenInformation, typeof(TOKEN_GROUPS));
int sidAndAttrSize = Marshal.SizeOf(new SID_AND_ATTRIBUTES());
for (int i = 0; i < groups.GroupCount; i++)
{
// *** Scary line here:
SID_AND_ATTRIBUTES sidAndAttributes = (SID_AND_ATTRIBUTES)Marshal.PtrToStructure(
new IntPtr(tokenInformation.ToInt64() + i * sidAndAttrSize + IntPtr.Size),
typeof(SID_AND_ATTRIBUTES));
// ...
}
I see here another approach of declaring the length of the array as much bigger than it's likely to be, but that seemed to have its own problems.
As a side question: When I step through the above code in the debugger I'm not able to evaluate tokenInformation.ToInt64() or ToInt32(). I get an ArgumentOutOfRangeException. But the line of code executes just fine!? What's going on here?

Instead of guessing what the offset is, it's generally better to use Marshal.OffsetOf(typeof(TOKEN_GROUPS), "Groups") to get the correct offset to the start of the array.
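As a sketch of how that could fit into the loop from the question, the helper below reads the count, asks the marshaller for the offset of Groups, and then steps through the elements. TokenGroupsReader and ReadGroups are hypothetical names; the struct declarations repeat the ones from the question so the example stands alone.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct SID_AND_ATTRIBUTES
{
    public IntPtr Sid;
    public uint Attributes;
}

[StructLayout(LayoutKind.Sequential)]
public struct TOKEN_GROUPS
{
    public int GroupCount;
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 1)]
    public SID_AND_ATTRIBUTES[] Groups;
}

public static class TokenGroupsReader
{
    // tokenInformation points at the buffer filled in by GetTokenInformation.
    public static SID_AND_ATTRIBUTES[] ReadGroups(IntPtr tokenInformation)
    {
        int count = Marshal.ReadInt32(tokenInformation);

        // Let the marshaller compute the offset (it accounts for alignment padding).
        long offset = Marshal.OffsetOf(typeof(TOKEN_GROUPS), "Groups").ToInt64();
        int stride = Marshal.SizeOf(typeof(SID_AND_ATTRIBUTES));

        var result = new SID_AND_ATTRIBUTES[count];
        long first = tokenInformation.ToInt64() + offset;
        for (int i = 0; i < count; i++)
        {
            IntPtr element = new IntPtr(first + (long)i * stride);
            result[i] = (SID_AND_ATTRIBUTES)Marshal.PtrToStructure(element, typeof(SID_AND_ATTRIBUTES));
        }
        return result;
    }
}
```

Because both the offset and the stride come from the marshaller, the same code should work unchanged in 32-bit and 64-bit processes.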

I think it looks okay -- as okay as any poking about in unmanaged land is, anyway.
However, I wonder why the start is tokenInformation.ToInt64() + IntPtr.Size and not tokenInformation.ToInt64() + 4 (as the GroupCount field type is an int and not IntPtr). Is this for packing/alignment of the structure or just something fishy? I do not know here.
Using tokenInformation.ToInt64() is important because ToInt32() will explode (OverflowException) on a 64-bit machine if the IntPtr value is larger than what an int can store. However, the CLR will handle a long just fine on both architectures and it doesn't change the actual value extracted from the IntPtr (and thus put back into the new IntPtr(...)).
Imagine this (untested) function as a convenience wrapper:
// unpacks an array of structures from unmanaged memory
// arr.Length is the number of items to unpack. don't overrun.
void PtrToStructureArray<T>(T[] arr, IntPtr start, int stride) {
long ptr = start.ToInt64();
for (int i = 0; i < arr.Length; i++, ptr += stride) {
arr[i] = (T)Marshal.PtrToStructure(new IntPtr(ptr), typeof(T));
}
}
var attributes = new SID_AND_ATTRIBUTES[groups.GroupCount];
PtrToStructureArray(attributes, new IntPtr(tokenInformation.ToInt64() + IntPtr.Size), sidAndAttrSize);
Happy coding.

How can I put an array inside a struct in C#?

C++ code:
struct tPacket
{
WORD word1;
WORD word2;
BYTE byte1;
BYTE byte2;
BYTE array123[8];
}
static char data[8192] = {0};
...
some code to fill up the array
...
tPacket * packet = (tPacket *)data;
We can't do that as easy in C#.
Please note there is an array in the C++ structure.
Alternatively, using this source file could do the job for us, but not if there is an array in the structure.

I'm unsure of exactly what you are asking. Are you trying to get an equivalent structure definition in C# for plain old C# usage or for interop (PInvoke) purposes? If it's for PInvoke, the following structure will work:
[System.Runtime.InteropServices.StructLayoutAttribute(System.Runtime.InteropServices.LayoutKind.Sequential)]
public struct tPacket {
/// WORD->unsigned short
public ushort word1;
/// WORD->unsigned short
public ushort word2;
/// BYTE->unsigned char
public byte byte1;
/// BYTE->unsigned char
public byte byte2;
/// BYTE[8]
[System.Runtime.InteropServices.MarshalAsAttribute(System.Runtime.InteropServices.UnmanagedType.ByValArray, SizeConst=8, ArraySubType=System.Runtime.InteropServices.UnmanagedType.I1)]
public byte[] array123;
}
If you are looking for a plain old C# structure that has the same characteristics, it's unfortunately not possible to do with a struct. You cannot define an inline array of a constant size in a C# structure, nor can you force the array to be a specific size through an initializer.
There are two alternative options in the managed world.
Use a struct which has a create method that fills out the array
[System.Runtime.InteropServices.StructLayoutAttribute(System.Runtime.InteropServices.LayoutKind.Sequential)]
public struct tPacket {
public ushort word1;
public ushort word2;
public byte byte1;
public byte byte2;
public byte[] array123;
public static tPacket Create() {
return new tPacket() { array123 = new byte[8] };
}
}
Or alternatively use a class where you can initialize the array123 member variable directly.
EDIT: The OP wants to know how to convert a byte[] into a tPacket value.
Unfortunately there is no great way to do this in C#. C++ was awesome for this kind of task because it has a very weak type system, in that you could choose to view a stream of bytes as a particular structure (evil pointer casting).
This may be possible in C# unsafe code but I do not believe it is.
Essentially what you will have to do is manually parse out the bytes and assign them to the various values in the struct. Or write a native method which does the C style casting and PInvoke into that function.
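A manual parse of the layout from the question might look like the sketch below. It assumes the bytes arrive little-endian with no padding between fields (2 + 2 + 1 + 1 + 8 = 14 bytes), which should be confirmed against the sender. ParsedPacket and FromBytes are hypothetical names.

```csharp
using System;

public struct ParsedPacket
{
    public const int Size = 14; // 2 + 2 + 1 + 1 + 8 bytes, assuming no padding

    public ushort Word1;
    public ushort Word2;
    public byte Byte1;
    public byte Byte2;
    public byte[] Array123;

    // Builds a packet from 14 bytes of data starting at 'start'.
    public static ParsedPacket FromBytes(byte[] data, int start)
    {
        if (data == null) throw new ArgumentNullException(nameof(data));
        if (start < 0 || data.Length - start < Size)
            throw new ArgumentException("Not enough bytes for a packet.", nameof(data));

        var packet = new ParsedPacket();
        // Assemble the 16-bit words explicitly as little-endian.
        packet.Word1 = (ushort)(data[start] | (data[start + 1] << 8));
        packet.Word2 = (ushort)(data[start + 2] | (data[start + 3] << 8));
        packet.Byte1 = data[start + 4];
        packet.Byte2 = data[start + 5];
        packet.Array123 = new byte[8];
        Buffer.BlockCopy(data, start + 6, packet.Array123, 0, 8);
        return packet;
    }
}
```

Spelling out the byte order keeps the result independent of the host machine's endianness, at the cost of having to update the parser whenever the struct changes.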

I think what you are looking for (if you are using a similar structure definition like JaredPar posted) is something like this:
tPacket t = new tPacket();
byte[] buffer = new byte[Marshal.SizeOf(typeof(tPacket))];
socket.Receive(buffer, 0, buffer.Length, 0);
GCHandle pin = GCHandle.Alloc(buffer, GCHandleType.Pinned);
t = (tPacket)Marshal.PtrToStructure(pin.AddrOfPinnedObject(), typeof(tPacket));
pin.Free();
//do stuff with your new tPacket t

It can be done with unsafe code too, although it restricts the context under which your program can run, and, naturally, introduces the possibility of security flaws.
The advantage is that you cast directly from an array to the structure using pointers, and it's also maintenance-free if you are only going to add or remove fields from the struct. However, accessing the arrays requires using the fixed statement, as the GC can still move the struct around in memory when it's contained in an object.
Here's some modified code of an unsafe struct I used for interpreting UDP packets:
using System;
using System.Runtime.InteropServices;
[StructLayout(LayoutKind.Sequential)]
public unsafe struct UnsafePacket
{
int time;
short id0;
fixed float acc[3];
short id1;
fixed float mat[9];
public UnsafePacket(byte[] rawData)
{
if (rawData == null)
throw new ArgumentNullException("rawData");
if (sizeof(byte) * rawData.Length != sizeof(UnsafePacket))
throw new ArgumentException("rawData");
fixed (byte* ptr = &rawData[0])
{
this = *(UnsafePacket*)ptr;
}
}
public float GetAcc(int index)
{
if (index < 0 || index >= 3)
throw new ArgumentOutOfRangeException("index");
fixed (float* ptr = acc)
{
return ptr[index];
}
}
public float GetMat(int index)
{
if (index < 0 || index >= 9)
throw new ArgumentOutOfRangeException("index");
fixed (float* ptr = mat)
{
return ptr[index];
}
}
// etc. for other properties
}
For this kind of code it is extremely important to check that the length of the array perfectly matches the size of the struct, otherwise you'll open yourself up to some nasty buffer overflows. As the unsafe keyword has been applied to the whole struct, you don't need to mark each method or code block as a separate unsafe statement.

You can place what looks to the outside world like an array of fixed size within a safe structure by writing functions within the structure for access. For example, here is a fixed 4 by 4 double precision array within a safe structure:
public struct matrix4 // 4 by 4 matrix
{
//
// Here we will create a square matrix that can be written to and read from similar
// (but not identical to) using an array. Reading and writing into this structure
// is slower than using an array (due to nested switch blocks, where nest depth
// is the dimensionality of the array, or 2 in this case). A big advantage of this
// structure is that it operates within a safe context.
//
private double a00; private double a01; private double a02; private double a03;
private double a10; private double a11; private double a12; private double a13;
private double a20; private double a21; private double a22; private double a23;
private double a30; private double a31; private double a32; private double a33;
//
public void AssignAllZeros() // Zero out the square matrix
{ /* code */}
public double Determinant() // Common linear algebra function
{ /* code */}
public double Maximum() // Returns maximum value in matrix
{ /* code */}
public double Minimum() // Minimum value in matrix
{ /* code */}
public double Read(short row, short col) // Outside read access
{ /* code */}
public double Read(int row, int col) // Outside read access overload
{ /* code */}
public double Sum() // Sum of 16 double precision values
{
return a00 + a01 + a02 + a03 + a10 + a11 + a12 + a13 + a20 + a21 + a22 + a23 + a30 + a31 + a32 + a33;
}
public void Write(short row, short col, double doubleValue) // Write access to matrix
{ /* code */}
public void Write(int row, int col, double doubleValue) // Write access overload
{ /* code */}
}
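The accessor bodies are elided above. As a rough illustration of the nested-switch idea the comment describes, here is a reduced 2 by 2 version. Matrix2 is a hypothetical stand-in, not the original matrix4 code.

```csharp
using System;

public struct Matrix2 // 2 by 2 matrix, same technique on a smaller scale
{
    private double a00, a01;
    private double a10, a11;

    // Outer switch selects the row, inner switch selects the column.
    public double Read(int row, int col)
    {
        switch (row)
        {
            case 0:
                switch (col)
                {
                    case 0: return a00;
                    case 1: return a01;
                }
                break;
            case 1:
                switch (col)
                {
                    case 0: return a10;
                    case 1: return a11;
                }
                break;
        }
        throw new ArgumentOutOfRangeException(nameof(row), "row and col must each be 0 or 1");
    }

    public void Write(int row, int col, double value)
    {
        switch (row)
        {
            case 0:
                switch (col)
                {
                    case 0: a00 = value; return;
                    case 1: a01 = value; return;
                }
                break;
            case 1:
                switch (col)
                {
                    case 0: a10 = value; return;
                    case 1: a11 = value; return;
                }
                break;
        }
        throw new ArgumentOutOfRangeException(nameof(row), "row and col must each be 0 or 1");
    }
}
```

Any index outside the matrix falls through both switches and reaches the throw, so out-of-range access fails loudly instead of touching a neighbouring field.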
