If I recall correctly, in C you can write a statement along these lines:
char* str = "1234";
int nonStr = *((int*)str);
(I intentionally made the string 4 characters long so that, in the typical case, it has the same number of bytes as the integer.) This dereferences the memory where str is stored and gives you the value as if it were an integer (875770417, i.e. 0x34333231, on a little-endian machine with 4-byte ints).
Is there any way to do the same thing in C#? I know this is a low-level memory operation that is generally blissfully hidden from the C# programmer; I am only doing this as a teaching exercise.
You can do this using an unsafe context and the fixed statement:
static unsafe void Main(string[] args)
{
    string str = "1234";
    // Pin the string so the GC cannot move it while we hold a raw pointer.
    fixed (char* strPtr = str)
    {
        int* nonStr = (int*)strPtr;
        Console.WriteLine(*nonStr);
    }
}
prints
3276849
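The result differs from what the C snippet reads because a C# string stores UTF-16 code units, so the first four bytes cover only the characters '1' and '2'. A minimal sketch of the same reinterpretation without unsafe code, assuming a little-endian machine; the class and method names are made up for illustration:

```csharp
using System;
using System.Text;

public static class ReinterpretDemo
{
    // First four bytes of the UTF-16 (little-endian) encoding, read as an int.
    // This mirrors what the fixed/pointer version reads from a C# string.
    public static int FirstFourBytesUtf16(string s)
    {
        byte[] bytes = Encoding.Unicode.GetBytes(s);
        return BitConverter.ToInt32(bytes, 0);
    }

    // First four bytes of the ASCII encoding, read as an int.
    // This mirrors the C version, where each char occupies one byte.
    public static int FirstFourBytesAscii(string s)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(s);
        return BitConverter.ToInt32(bytes, 0);
    }
}
```

On a little-endian machine, FirstFourBytesUtf16("1234") gives 3276849 (0x00320031) and FirstFourBytesAscii("1234") gives 875770417 (0x34333231). BitConverter uses the machine's byte order, so the numbers would differ on a big-endian system.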
You're looking for the unsafe functionality of C#.
Here's one reference. A general search for "C# unsafe pointer dereference" turns up many results.
Would a union simulation be sufficient for this?
http://msdn.microsoft.com/en-us/library/acxa5b99.aspx
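A minimal sketch of what such a union simulation could look like, using StructLayout(LayoutKind.Explicit) so that an int and four bytes share the same storage. The type and member names are hypothetical, and the result assumes a little-endian machine:

```csharp
using System;
using System.Runtime.InteropServices;

// All fields overlap: the four bytes occupy the same memory as Value.
[StructLayout(LayoutKind.Explicit)]
public struct IntBytesUnion
{
    [FieldOffset(0)] public int Value;
    [FieldOffset(0)] public byte B0;
    [FieldOffset(1)] public byte B1;
    [FieldOffset(2)] public byte B2;
    [FieldOffset(3)] public byte B3;
}

public static class UnionDemo
{
    // Writes the first four characters as single bytes, then reads them back as an int.
    // Characters outside the ASCII range are truncated by the cast to byte.
    public static int FromFourAsciiChars(string s)
    {
        if (s == null || s.Length < 4)
            throw new ArgumentException("Need at least four characters.", nameof(s));

        var u = new IntBytesUnion();
        u.B0 = (byte)s[0];
        u.B1 = (byte)s[1];
        u.B2 = (byte)s[2];
        u.B3 = (byte)s[3];
        return u.Value;
    }
}
```

This needs no unsafe context, and on a little-endian machine FromFourAsciiChars("1234") yields 875770417, the value a C program with one-byte chars would read.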
I couldn't think of a better way to word the title of this question, so I can change that if recommended.
I'm writing a Space Invaders emulator as a learning project and I have encountered an issue when trying to test one of the op codes I implemented. Here is the test code:
public unsafe void InrBTest()
{
    Processor8080 processor = new Processor8080();
    Processor8080* processorPtr = &processor;
    byte testMemory = 0x03;
    processorPtr->pc = 0x04;
    processorPtr->b = 0x38;
    //processorPtr->c = 0xFE;
    processorPtr->memory = &testMemory;
    processorPtr->memory[processorPtr->pc] = 0x04;
    emulator.Emulate8080OpCode(processorPtr);
}
As you can see, there is nothing crazy going on here. All I'm doing is setting up a Processor8080 object, getting a pointer to it, and then setting some values to test.
The problem is the second-to-last line: processorPtr->memory[processorPtr->pc] = 0x04;. After this line executes, all of the values that I had previously set get completely changed and I get crazy values for the fields in the Processor8080 object -- even values that hadn't previously been set.
Also, the memory field in the Processor8080 struct gets set to null. I have several other tests that are set up this exact way, but none of them encounter this issue; they all work fine. I'll put the code for the Processor8080 struct below for reference, but I don't think the issue is there.
I'm fairly new to this kind of programming, so it wouldn't surprise me if I've just missed something somewhere. The most confusing thing to me is that other tests I've written don't do this at all.
I'll post one of the others below to show you an example. Thanks for any help.
Processor8080 struct:
public struct ConditionCodes
{
    [BitFieldLength(1)]
    public byte z;
    [BitFieldLength(1)]
    public byte p;
    [BitFieldLength(1)]
    public byte s;
    [BitFieldLength(1)]
    public byte cy;
    [BitFieldLength(1)]
    public byte ac;
    [BitFieldLength(3)]
    public byte pad;
}
public struct Processor8080
{
    public byte a, b, c, d, e, h, l; //registers
    public ushort sp, pc; //stack pointer, program counter
    public unsafe byte* memory; //RAM
    public byte intEnable;
    public ConditionCodes cc;
}
Example of a test that is working:
public void InxBTest()
{
    //Tests the INX B (0x03) instruction
    unsafe
    {
        Processor8080 processor = new Processor8080();
        Processor8080* processorPtr = &processor;
        byte testMemory = 0x03;
        processorPtr->pc = 0x03;
        processorPtr->b = 0x38;
        processorPtr->c = 0xFE;
        processorPtr->memory = &testMemory;
        processorPtr->memory[processorPtr->pc] = 0x03;
        emulator.Emulate8080OpCode(processorPtr);
    }
}
EDIT:
I have discovered that, as long as I don't set processorPtr->pc to a value greater than 0x03, there is no issue. I can't think of why that is, but if someone can shed some light, I'd greatly appreciate it.
It's a good old-fashioned buffer overflow error.
You're assigning the address of a single byte to processorPtr->memory:
processorPtr->memory = &testMemory;
Then, you're accessing that memory like an array:
processorPtr->memory[processorPtr->pc] = 0x04;
Problem is, here, processorPtr->pc is equal to 4, but the memory is only one byte long. Uh oh. Since it's memory on the stack, my guess is you're actually trampling over other values on the stack, and since your Processor8080 type is a struct, it lives on the stack, so you end up messing around with it.
If it appears to be working with values smaller than 4, it's probably because of padding or alignment. If your stack is 4-byte aligned, then you have to go at least 4 bytes to start touching other objects on it. Those are all suppositions, of course: you're dealing with undefined behavior here.
So, you're essentially accessing things outside the bounds of an array, but since it's unsafe code, you don't get a nice clean exception when doing it. Instead, you get weird side effects.
But anyway, working with pointers is a pain, and that's one example of it. I'd recommend you stick to managed memory.
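To illustrate the managed alternative, here is a minimal sketch in which the emulated RAM is a byte[] rather than a byte*. The class and method names are hypothetical and not part of the asker's emulator. With a managed array, the same out-of-range write raises IndexOutOfRangeException instead of silently overwriting the stack:

```csharp
using System;

// Hypothetical managed counterpart of the Processor8080 struct.
public sealed class ManagedProcessor8080
{
    public byte A, B, C, D, E, H, L;   // registers
    public ushort Sp, Pc;              // stack pointer, program counter
    public byte[] Memory;              // RAM, bounds-checked by the runtime

    public ManagedProcessor8080(int memorySize)
    {
        Memory = new byte[memorySize];
    }
}

public static class OverflowDemo
{
    // Returns true if the write succeeded, false if Pc points outside Memory.
    public static bool TryWriteAtPc(ManagedProcessor8080 cpu, byte value)
    {
        try
        {
            cpu.Memory[cpu.Pc] = value;
            return true;
        }
        catch (IndexOutOfRangeException)
        {
            return false;
        }
    }
}
```

With a one-byte memory and Pc set to 0x04, TryWriteAtPc returns false. With new ManagedProcessor8080(0x10000), which covers the full 16-bit address range, the same write succeeds.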
I've seen this code, which was used to show the reference value:
static void Main(string[] args)
{
    string s1 = "ab";
    string s2 = "a"+"b";
    string s3 = new StringBuilder("a").Append("b").ToString();
    Console.WriteLine(GetMemoryAddress(s1));
    Console.WriteLine(GetMemoryAddress(s2));
    Console.WriteLine(GetMemoryAddress(s3));
}
static IntPtr GetMemoryAddress(object s1)
{
    unsafe
    {
        TypedReference tr = __makeref(s1);
        IntPtr ptr = **(IntPtr**) (&tr);
        return ptr;
    }
}
Result (as expected):
(I know that string interning kicks in here, but that's not the question).
Question:
Although it seems to do the job, is using __makeref the right way to get the reference value in C#?
Or are there any situations in which it would fail?
Although it seems to do the job, is using __makeref the right way to get the reference value in C#?
There is no "right way" of doing this in C# - it isn't something you're meant to try and do, but: in terms of what it is doing - this is essentially relying on the internal layout of TypedReference and a type coercion; it'll work (as long as TypedReference doesn't change internally - for example the order of the Type and Value fields changes), but... it is nasty.
There is a more direct approach; in IL, you can convert from a managed pointer to an unmanaged pointer silently. Which means you can do something nasty like:
unsafe delegate void* RefToPointer(object obj);
static RefToPointer GetRef { get; } = MakeGetRef();
static RefToPointer MakeGetRef()
{
    var dm = new DynamicMethod("evil", typeof(void*), new[] { typeof(object) });
    var il = dm.GetILGenerator();
    il.Emit(OpCodes.Ldarg_0);
    il.Emit(OpCodes.Ret);
    return (RefToPointer)dm.CreateDelegate(typeof(RefToPointer));
}
and now you can just do:
var ptr = new IntPtr(GetRef(o));
Console.WriteLine(ptr);
This is horrible, and you should never do it - and of course the GC can move things while you're not looking (or even while you are looking), but... it works.
Whether ref-emit is "better" than undocumented and unsupported language features like __makeref and type coercion is a matter of some debate. Hopefully purely academic debate!
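For contrast, the supported way to get an address that the GC will not invalidate is to pin the object first. A minimal sketch using GCHandle with a byte array; the class and method names are made up for illustration, and this addresses the array's data rather than the object reference that the snippets above expose:

```csharp
using System;
using System.Runtime.InteropServices;

public static class PinDemo
{
    // Pins the array, reads its first element through the raw address, then unpins.
    // While the handle is allocated, the GC will not move the array.
    public static byte ReadFirstByteViaAddress(byte[] data)
    {
        if (data == null || data.Length == 0)
            throw new ArgumentException("Need a non-empty array.", nameof(data));

        GCHandle handle = GCHandle.Alloc(data, GCHandleType.Pinned);
        try
        {
            IntPtr address = handle.AddrOfPinnedObject(); // address of element 0
            return Marshal.ReadByte(address);
        }
        finally
        {
            handle.Free(); // always release the pin
        }
    }
}
```

The address is only meaningful between Alloc and Free; once the handle is released, the GC is free to relocate the array again.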
A C# book I am reading says that I cannot access unallocated memory, but that it is possible in an unsafe context. My question is: how can this be done?
I tried something like this:
static void Main(string[] args)
{
    unsafe
    {
        int c;
        Console.WriteLine(c);
    }
}
I enabled the "allow unsafe code" option in the project properties, but this code does not compile.
The unsafe keyword does not completely alter the language or compilation model; you still need to initialize any variables before using them. If you want to access "unallocated memory", you need to get a pointer to that memory. Here is an example:
unsafe void AccessMemory()
{
    const int address = 10000;
    byte[] array = new byte[0];
    fixed (byte* zero = array)
    {
        byte* p = zero + address;
    }
}
Here we get a pointer to the empty array, which gives the zero pointer. Then we offset that pointer by some amount (address), which results in a pointer to that memory address.
I need to marshal some nested structures in C# 4.0 into binary blobs to pass to a C++ framework.
I have so far had a lot of success using unsafe/fixed to handle fixed length arrays of primitive types. Now I need to handle a structure that contains nested fixed length arrays of other structures.
I was using complicated workarounds flattening the structures but then I came across an example of the MarshalAs attribute which looked like it could save me a great deal of problems.
Unfortunately whilst it gives me the correct amount of data it seems to also stop the fixed arrays from being marshalled properly, as the output of this program demonstrates. You can confirm the failure by putting a breakpoint on the last line and examining the memory at each pointer.
using System;
using System.Threading;
using System.Runtime.InteropServices;
namespace MarshalNested
{
    public unsafe struct a_struct_test1
    {
        public fixed sbyte a_string[3];
        public fixed sbyte some_data[12];
    }
    public struct a_struct_test2
    {
        [MarshalAs(UnmanagedType.ByValArray, SizeConst = 3)]
        public sbyte[] a_string;
        [MarshalAs(UnmanagedType.ByValArray, SizeConst = 4)]
        public a_nested[] some_data;
    }
    public unsafe struct a_struct_test3
    {
        public fixed sbyte a_string[3];
        [MarshalAs(UnmanagedType.ByValArray, SizeConst = 4)]
        public a_nested[] some_data;
    }
    public unsafe struct a_nested
    {
        public fixed sbyte a_notherstring[3];
    }
    class Program
    {
        static unsafe void Main(string[] args)
        {
            a_struct_test1 lStruct1 = new a_struct_test1();
            lStruct1.a_string[0] = (sbyte)'a';
            lStruct1.a_string[1] = (sbyte)'b';
            lStruct1.a_string[2] = (sbyte)'c';
            a_struct_test2 lStruct2 = new a_struct_test2();
            lStruct2.a_string = new sbyte[3];
            lStruct2.a_string[0] = (sbyte)'a';
            lStruct2.a_string[1] = (sbyte)'b';
            lStruct2.a_string[2] = (sbyte)'c';
            a_struct_test3 lStruct3 = new a_struct_test3();
            lStruct3.a_string[0] = (sbyte)'a';
            lStruct3.a_string[1] = (sbyte)'b';
            lStruct3.a_string[2] = (sbyte)'c';
            IntPtr lPtr1 = Marshal.AllocHGlobal(15);
            Marshal.StructureToPtr(lStruct1, lPtr1, false);
            IntPtr lPtr2 = Marshal.AllocHGlobal(15);
            Marshal.StructureToPtr(lStruct2, lPtr2, false);
            IntPtr lPtr3 = Marshal.AllocHGlobal(15);
            Marshal.StructureToPtr(lStruct3, lPtr3, false);
            string s1 = "";
            string s2 = "";
            string s3 = "";
            for (int x = 0; x < 3; x++)
            {
                s1 += (char) Marshal.ReadByte(lPtr1+x);
                s2 += (char) Marshal.ReadByte(lPtr2+x);
                s3 += (char) Marshal.ReadByte(lPtr3+x);
            }
            Console.WriteLine("Ptr1 (size " + Marshal.SizeOf(lStruct1) + ") says " + s1);
            Console.WriteLine("Ptr2 (size " + Marshal.SizeOf(lStruct2) + ") says " + s2);
            Console.WriteLine("Ptr3 (size " + Marshal.SizeOf(lStruct3) + ") says " + s3);
            Thread.Sleep(10000);
        }
    }
}
Output:
Ptr1 (size 15) says abc
Ptr2 (size 15) says abc
Ptr3 (size 15) says a
So for some reason it is only marshalling the first character of my fixed ANSI strings. Is there any way around this, or have I done something stupid unrelated to the marshalling?
This is a case of a missing diagnostic. Somebody should have spoken up and told you that your declaration is not supported, where that somebody is either the C# compiler, producing a compile error, or the CLR field marshaller, producing a runtime exception.
It's not like you can't get a diagnostic. You'll certainly get one when you actually start using the struct as intended:
a_struct_test3 lStruct3 = new a_struct_test3();
lStruct3.some_data = new a_nested[4];
lStruct3.some_data[0] = new a_nested();
lStruct3.some_data[0].a_notherstring[0] = (sbyte)'a'; // Eek!
Which elicits CS1666, "You cannot use fixed size buffers contained in unfixed expressions. Try using the fixed statement". Not that "try this" advice is all that helpful:
fixed (sbyte* p = &lStruct3.some_data[0].a_notherstring[0]) // Eek!
{
    *p = (sbyte)'a';
}
Exact same CS1666 error. Next thing you'd try is put an attribute on the fixed buffer:
public unsafe struct a_struct_test3 {
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 3)]
    public fixed sbyte a_string[3];
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 4)]
    public a_nested[] some_data;
}
//...
a_struct_test3 lStruct3 = new a_struct_test3();
lStruct3.some_data = new a_nested[4];
IntPtr lPtr3 = Marshal.AllocHGlobal(15);
Marshal.StructureToPtr(lStruct3, lPtr3, false); // Eek!
Keeps the C# compiler happy but now the CLR speaks up and you get a TypeLoadException at runtime: "Additional information: Cannot marshal field 'a_string' of type 'MarshalNested.a_struct_test3': Invalid managed/unmanaged type combination (this value type must be paired with Struct)."
So, in a nutshell you should have gotten either CS1666 or TypeLoadException on your original attempt as well. That did not happen because the C# compiler was not forced to look at the bad part, it only generates CS1666 on a statement that accesses the array. And it did not happen at runtime because the field marshaller in the CLR did not attempt to marshal the array because it is null. You can file a bug feedback report at connect.microsoft.com but I'd be greatly surprised if they won't close it with "by design".
In general, one obscure detail matters a great deal to the field marshaller in the CLR, the chunk of code that converts struct values and class objects from their managed layout to their unmanaged layout. It is poorly documented because Microsoft does not want to nail down the exact implementation details, mostly because they depend too much on the target architecture.
What matters a great deal is whether or not a value or object is blittable. It is blittable when the managed and unmanaged layouts are identical, which only happens when every member of the type has the exact same size and alignment in both layouts. That normally only happens when the fields are of a very simple value type (like byte or int) or a struct that itself is blittable. Notoriously not when it is bool; there are too many conflicting unmanaged bool types. A field of an array type is never blittable, because managed arrays don't look anything like C arrays: they have an object header and a Length member.
Having a blittable value or object is highly desirable, since it saves the field marshaller from having to create a copy. The native code gets a simple pointer to managed memory and all that is needed is to pin the memory. Very fast. It is also very dangerous: if the declaration does not match, then the native code can easily color outside the lines and corrupt the GC heap or stack frame. This is a very common reason for a program that uses pinvoke to bomb randomly with ExecutionEngineException, and it is excessively difficult to diagnose. Such a declaration really deserves the unsafe keyword, but the C# compiler does not insist on it. Nor can it, since compilers are not allowed to make any assumptions about managed object layout. You keep it safe by using Debug.Assert() on the return value of Marshal.SizeOf<T>; it must be an exact match with the value of sizeof(T) in a C program.
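The size check described above can be sketched as follows. SamplePacket and LayoutCheck are hypothetical names, and the expected size is assumed to come from sizeof on the matching C struct. This sketch throws instead of calling Debug.Assert so that the check also runs in release builds:

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical struct; assumed C counterpart: struct { uint8_t tag; int32_t value; }
[StructLayout(LayoutKind.Sequential)]
public struct SamplePacket
{
    public byte Tag;
    public int Value;
}

public static class LayoutCheck
{
    // Compares the marshalled size of T with the size reported by the native side.
    public static void EnsureSize<T>(int expectedNativeSize) where T : struct
    {
        int actual = Marshal.SizeOf<T>();
        if (actual != expectedNativeSize)
        {
            throw new InvalidOperationException(
                typeof(T).Name + ": marshalled size " + actual +
                " does not match native size " + expectedNativeSize);
        }
    }
}
```

With default packing, SamplePacket marshals to 8 bytes (one byte, three bytes of padding, then the 4-byte int), so LayoutCheck.EnsureSize<SamplePacket>(8) passes, while a native struct compiled with 1-byte packing (5 bytes) would be caught at startup.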
As noted, arrays are an obstacle to getting a blittable value or object. The fixed keyword is intended as a workaround for this. The CLR treats it like an opaque value type with no members, just a blob of bytes. No object header and no Length member, as close as you could get to a C array. And used in C# code like you'd use an array in a C program, you must use a pointer to address the array elements and check three times that you don't color outside of the lines. Sometimes you must use a fixed array, happens when you declare a union (overlapping fields) and you overlap an array with a value. Poison to the garbage collector, it can no longer figure out if the field stores an object root. Not detected by the C# compiler but reliably trips a TypeLoadException at runtime.
Long story short, use fixed only for a blittable type. Mixing fields of a fixed size buffer type with fields that must be marshaled cannot work. And isn't useful, the object or value gets copied anyway so you might as well use the friendly array type.
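A minimal sketch of the "friendly array type" route for nested structures, assuming every array is allocated with exactly the declared SizeConst before marshalling. The type and method names are hypothetical stand-ins for the ones in the question:

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct NestedText
{
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 3)]
    public sbyte[] Text;
}

[StructLayout(LayoutKind.Sequential)]
public struct OuterRecord
{
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 3)]
    public sbyte[] Name;
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 4)]
    public NestedText[] Items;
}

public static class NestedMarshalDemo
{
    // Builds a record whose arrays all have the declared lengths.
    public static OuterRecord MakeSample()
    {
        var record = new OuterRecord
        {
            Name = new sbyte[] { (sbyte)'a', (sbyte)'b', (sbyte)'c' },
            Items = new NestedText[4]
        };
        for (int i = 0; i < record.Items.Length; i++)
        {
            record.Items[i].Text = new sbyte[] { (sbyte)'x', (sbyte)'y', (sbyte)('0' + i) };
        }
        return record;
    }

    // Marshals the record into unmanaged memory and copies the bytes back out.
    public static byte[] ToBlob(OuterRecord record)
    {
        int size = Marshal.SizeOf<OuterRecord>();
        IntPtr buffer = Marshal.AllocHGlobal(size);
        try
        {
            Marshal.StructureToPtr(record, buffer, false);
            byte[] blob = new byte[size];
            Marshal.Copy(buffer, blob, 0, size);
            return blob;
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }
}
```

ToBlob(MakeSample()) produces a 15-byte blob: the three bytes of Name followed by four 3-byte nested entries. No fixed buffers are involved, so the field marshaller handles every member the same way.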
I am interoperating with a Nim DLL from C#, and I can call and execute the code below.
If I add another function (proc) that calls GetPacks() and echoes each element's buffer, I can see the output in the C# console correctly.
But I cannot transfer the data as it is. I have tried everything, but I could not accomplish the task.
proc GetPacksPtrNim(parSze: int, PackArrINOUT: var DataPackArr){.stdcall,exportc,dynlib.} =
  PackArrINOUT.newSeq(parSze)
  var dummyStr = "abcdefghij"
  for i, curDataPack in PackArrINOUT.mpairs:
    dummyStr[9] = char(i + int8'0')
    curDataPack = DataPack(buffer:dummyStr, intVal: uint32 i)
type
  DataPackArr = seq[DataPack]
  DataPack = object
    buffer: string
    intVal: uint32
When I do the same in C/C++, the type I am using is either an IntPtr or a char*, which is happy to contain the returned buffer member.
EXPORT_API void __cdecl c_returnDataPack(unsigned int size, dataPack** DpArr)
{
    unsigned int dumln, Index;
    dataPack* CurDp = {NULL};
    char dummy[STRMAX];
    *DpArr = (dataPack*)malloc(size * sizeof(dataPack));
    CurDp = *DpArr;
    strncpy(dummy, "abcdefgHij", STRMAX);
    dumln = sizeof(dummy);
    for (Index = 0; Index < size; Index++, CurDp++)
    {
        CurDp->IVal = Index;
        dummy[dumln-1] = '0' + Index % (126 - '0');
        CurDp->Sval = (char*) calloc(dumln, sizeof(dummy));
        strcpy(CurDp->Sval, dummy);
    }
}
C# signature for the C code above:
[DllImport(@"cdllI.dll", CallingConvention = CallingConvention.Cdecl), SuppressUnmanagedCodeSecurity]
private static extern uint c_returnDataPack(uint x, DataPackg.TestC** tcdparr);
C# Struct
public unsafe static class DataPackg
{
    [StructLayout(LayoutKind.Sequential)]
    public struct TestC
    {
        public uint Id;
        public IntPtr StrVal;
    }
}
finally calling the function like so:
public static unsafe List<DataPackg.TestC> PopulateLstPackC(int ArrL)
{
    DataPackg.TestC* PackUArrOut;
    List<DataPackg.TestC> RtLstPackU = new List<DataPackg.TestC>(ArrL);
    c_returnDataPack((uint)ArrL, &PackUArrOut);
    DataPackg.TestC* CurrentPack = PackUArrOut;
    for (int i = 0; i < ArrL; i++, CurrentPack++)
    {
        RtLstPackU.Add(new DataPackg.TestC() { StrVal = CurrentPack->StrVal, Id = CurrentPack->Id });
    }
    //Console.WriteLine("Res={0}", Marshal.PtrToStringAnsi((IntPtr)RtLstPackU[1].StrVal));//new string(RtLstPackU[0].StrVal));
    return RtLstPackU;
}
How can I produce code in Nim that is similar to the C code above?
It doesn't have to be the same code, just have the same effect: in C# I should be able to read the content of the string. For now, the int is readable but the string is not.
Edit:
This is what I tried, to make things simple:
struct array of int members
Update:
It seems that the problem has to do with my Nim settings on my Windows OS.
I will update as soon as I discover what exactly is wrong.
The string type in Nim is not equivalent to C's const char* type. Strings in Nim are represented as pointers into a heap-allocated chunk of memory, which has the following layout:
NI length; # the length of the stored string
NI capacity; # how much room do we have for growth
NIM_CHAR data[capacity]; # the actual string, zero-terminated
Please beware that these types are architecture-specific and are really an implementation detail of the compiler that can change in the future. NI is the architecture-default integer type and NIM_CHAR is usually equivalent to an 8-bit char, since Nim leans towards the use of UTF-8.
With this in mind, you have several options:
1) You can teach C# about this layout and access the string buffers at their correct location (the above caveats apply). An example implementation of this approach can be found here:
https://gist.github.com/zah/fe8f5956684abee6bec9
2) You can use a different type for the buffer field in your Nim code. Possible candidates are ptr char or the fixed-size array[char]. The first one will require you to give up automatic garbage collection and maintain a little bit of code for manual memory management. The second one will give up a little bit of space efficiency and will put hard limits on the size of these buffers.
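Option 1 could be sketched roughly as follows. This is not the code from the linked gist; it is a minimal illustration that assumes the layout shown above (two pointer-sized integers followed by the character data) and ASCII content, and the class and method names are made up:

```csharp
using System;
using System.Runtime.InteropServices;

public static class NimStringReader
{
    // Reads a string from a pointer to the assumed layout:
    //   NI length; NI capacity; NIM_CHAR data[...];
    // NI is assumed to be pointer-sized. This is an implementation detail
    // of the Nim compiler and may not hold for every version.
    public static string Read(IntPtr nimString)
    {
        if (nimString == IntPtr.Zero)
        {
            return null;
        }
        long length = Marshal.ReadIntPtr(nimString).ToInt64();
        IntPtr data = IntPtr.Add(nimString, 2 * IntPtr.Size); // skip length and capacity
        return Marshal.PtrToStringAnsi(data, checked((int)length));
    }
}
```

On the C# side, the buffer field of the struct would then be declared as an IntPtr and passed to NimStringReader.Read. The caveat about the layout being an implementation detail applies in full.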
EDIT:
Using cstring may also look tempting, but it's ultimately dangerous. When you assign a regular string to a cstring, the result will be a normal char * value, pointing to the data buffer of the Nim string described above. Since the Nim garbage collector properly handles interior pointers to allocated values, this will be safe as long as the cstring value is placed in a traced location like the stack. But when you place it inside an object, the cstring won't be traced and nothing prevents the GC from releasing the memory, which may create a dangling pointer in your C# code.
Try to change your struct to:
public unsafe static class DataPackg
{
    [StructLayout(LayoutKind.Sequential)]
    public struct TestC
    {
        public uint Id;
        [MarshalAs(UnmanagedType.LPStr)]
        public String StrVal;
    }
}