Prevent compiler/cpu instruction reordering c#

Prevent compiler/cpu instruction reordering c# - c#

I have an Int64 containing two Int32 like this:
[StructLayout(LayoutKind.Explicit)]
public struct PackedInt64
{
[FieldOffset(0)]
public Int64 All;
[FieldOffset(0)]
public Int32 First;
[FieldOffset(4)]
public Int32 Second;
}
Now I want constructors (for all, first and second). However the struct requires all fields to be assigned before the constructor is exited.
Consider the all constructor.
public PackedInt64(Int64 all)
{
this.First = 0;
this.Second = 0;
Thread.MemoryBarrier();
this.All = all;
}
I want to be absolutely sure that this.All is assigned last in the constructor so that half of the field or more isn't overwritten in case of some compiler optimization or instruction reordering in the cpu.
Is Thread.MemoryBarrier() sufficient? Is it the best option?

Yes, this is the correct and best way of preventing reordering.
By executing Thread.MemoryBarrier() in your sample code, the processor will never be allowed to reorder instructions in such a way that the access/modification to First or Second will occur after the access/modification to All. Since they both occupy the same address space, you don't have to worry about your later changes being overwritten by your earlier ones.
Note that Thread.MemoryBarrier() only works for the current executing thread -- it isn't a type of lock. However, given that this code is running in a constructor and no other thread can yet have access to this data, this should be perfectly fine. If you do need cross-thread guarantee of operations, however, you'll need to use a locking mechanism to guarantee exclusive access.
Note that you may not actually need this instruction on x86 based machines, but I would still recommend the code in case you run on another platform one day (such as IA64). See the below chart for what platforms will reorder memory post-save, rather than just post-load.

The MemoryBarrier will prevent re-ordering, but this code is still broken.
LayoutKind.Explicit and FieldOffsetAttribute are documented as affecting the memory layout of the object when it is passed to unmanaged code. It can be used to interop with a C union, but it cannot be used to emulate a C union.
Even if it currently acts the way you expect, on the platform you tested, there is no guarantee that it will continue to do so. The only guarantee made is in the context of interop with unmanaged code (that is, p/invoke, COM interop, or C++/CLI it-just-works).
If you want to read a subset of bytes in a portable future-proof manner, you'll have to use bitwise operations or a byte array and BitConverter. Even if the syntax isn't as nice.

Check the remarks section of the following link: http://msdn.microsoft.com/en-us/library/system.threading.thread.memorybarrier.aspx
It says MemoryBarrier() is required only on multiprocessor systems with weak memory ordering. So, this is a sufficient option but whether this is the best option or not depends upon the system you are using.

First, I'm aware this answer doesn't really solve the reordering problem, but negates it. By using unsafe code, you can avoid writing to First and Second completely.
public unsafe PackedInt64(long all) {
fixed (PackedInt64* ptr = &this)
*(long*) ptr = all;
}
It's not meant to be the most elegant solution and probably doesn't pass most company policies regarding managed code, but it should work.

Related

Clear stack variable memory

I have 8 uints which represent a security key like this:
uint firstParam = ...;
uint secondParam = ...;
uint thirdParam = ...;
uint etcParam = ...;
uint etcParam = ...;
They are allocated as local variables, inside of an UNSAFE method.
Those keys are very sensitive.
I was wondering do those locals on the stack get deleted when the method is over? Does the UNSAFE method have an affect on this? MSDN says that Unsafe code is automatically pinned in memory.
If they are not removed from memory, will assigning them all to 0 help at the end of the method, even though analyzers will say this has no effect?
So I tested zeroing out the variables. However, in x64 Release mode the zeroing is removed from the final product (checked using ILSpy)
Is there any way to stop this?
Here is the sample code (in x64 Release)
private static void Main(string[] args)
{
int num = new Random().Next(10, 100);
Console.WriteLine(num);
MethodThatDoesSomething(num);
num = 0; // This line is removed!
Console.ReadLine();
}
private static void MethodThatDoesSomething(int num)
{
Console.WriteLine(num);
}
The num = 0 statement is removed in x64 release.
I cannot use SecureString because I'm P/Invoking into a native method which takes the UInts as a paramter.
I'm P/Invoking into the unmanaged method AllocateAndInitializeSid, which takes 8 uints as parameters. What could I do in this scenerio?
I have tried adding
[MethodImpl(MethodImplOptions.NoInlining | MethodImplOptions.NoOptimization)]
to the sample code (above Main method), however, the num = 0 is STILL removed!

EDIT: after some reasoning I've come to correct this answer.
DO NOT use SecureString, as #Servy and #Alejandro point out in the comments, it is not considered really secure anymore and will give a misguided sense of security, probably leading to futhering unconsidered exposures.
I have striked the passages I'm not comfortable with anymore and, in their place, would recommend as follows.
To assign firstParam use:
firstParam = value ^ OBFUSCATION_MASK;
To read firstParam use (again):
firstParam ^ OBFUSCATION_MASK;
The ^ (bitwise XOR) operator is the inverse of itself, so applying it twice returns the original value. By reducing the time the value exists without obfuscation (for the CPU time is actually the number of machine code cycles), its exposure is also reduced. When the value is stored for long-term (say, 2-3 microseconds) it should always be obfuscated. For example:
private static uint firstParam; // use static so that the compiler cannot remove apparently "useless" assignments
public void f()
{
// somehow acquire the value (network? encrypted file? user input?)
firstParam = externalSourceFunctionNotInMyCode() ^ OBFUSCATION_MASK; // obfuscate immediately
}
Then, several microseconds later:
public void g()
{
// use the value
externalUsageFunctionNotInMyCode(firstParam ^ OBFUSCATION_MASK);
}
The two external[Source|Usage]FunctionNotInMyCode() are entry and exit points of the value. The important thing is that as long as the value is stored in my code it is never in the plain, it's always obfuscated. What happens before and after my code is not under our control and we must live with it. At some point values must enter and/or exit. Otherwise what program would that be?
One last note is about the OBFUSCATION_MASK. I would randomize it for every start of the application, but ensure that the entropy is high enough, that means that the count of 0 and 1 is maybe not fifty/fifty, but near it. I think RNGCryptoServiceProvider will suffice. If not, it's always possible to count the bits or compute the entropy:
private static readonly uint OBFUSCATION_MASK = cryptographicallyStrongRandomizer();
At that point it's relatively difficult to identify the sensitive values in the binary soup and maybe even irrelevant if the data was paged out to disk.
As always, security must be balanced with cost and efficiency (in this case, also readability and maintainability).
ORIGINAL ANSWER:
Even with pinned unmanaged memory you cannot be sure if the physical memory is paged out to the disk by the OS.
In fact, in nations where Internet Bars are very common, clients may use your program on a publicly accessible machine. An attacker may try and do as follows:
compromise a machine by running a process that occasionally allocates all the RAM available;
wait for other clients to use that machine and run a program with sensitive data (such as username and password);
once the rogue program exhausts all RAM, the OS will page out the virtual memory pages to disk;
after several hours of usage by other clients the attacker comes back to the machine to copy unused sectors and slack space to an external device;
his hope is that pagefile.sys changed sectors several times (this occurs through sector rotation and such, which may not be avoided by the OS and can depend on hardware/firmware/drivers);
he brings the external device to his dungeon and slowly but patiently analyze the gathered data, which is mainly binary gibberish, but may have slews of ASCII characters.
By analyzing the data with all the time in the world and no pressure at all, he may find those sectors to which pagefile.sys has been written several "writes" before. There, the content of the RAM and thus heap/stack of programs can be inspected.
If a program stored sensitive data in a string, this procedure would expose it.
Now, you're using uint not string, but the same principles still apply. To be sure to not expose any sensitive data, even if paged out to disk, you can use secure versions of types, such as SecureString.
The usage of uint somewhat protects you from ASCII scanning, but to be really sure you should never store sensitive data in unsafe variables, which means you should somehow convert the uint into a string representation and store it exclusively in a SecureString.
Hope that helps someone implementing secure apps.

In .NET, you can never be sure that variables are actually cleared from memory.
Since the CLR manages the memory, it's free to move them around, liberally leaving old copies behind, including if you purposely overwrite them with zeroes o other random values. A memory analyzer or a debugger may still be able to get them if it has enough privileges.
So what can you do about it?
Just terminating the method leaves the data behind in the stack, and they'll be eventually overwritten by something else, without any certainity of when (or if) it'll happen.
Manually overwriting it will help, provided the compiler doesn't optimize out the "useless" assignment (see this thread for details). This will be more likely to success if the variables are short-lived (before the GC had the chance to move them around), but you still have NO guarrantes that there won't be other copies in other places.
The next best thing you can do is to terminate the whole process immediately, preferably after overwritting them too. This way the memory returns to the OS, and it'll clear it before giving it away to another process. You're still at the mercy of kernel-mode analyzers, though, but now you've raised the bar significantly.

How to Span<T> and stackalloc to create a temporary small list

I was reading a description of some code written in C that gains speed due to allocating temporary arrays on the stack instead of the heap for use in very hot loops. (It was described as being similar to SBO optimization). The object in question is similar to a List<T> in that it's just an array with some basic convenience functionality on top. It allocates a small section of memory to use, and if the list is expanded past the size of the array, it allocates a new array on the heap, copies the data, and updates the pointer.
I would like to do the same thing in C#, but I'm not sure how to accomplish it as I want to keep this in a safe context so I can't use a pointer to update the data reference if its expanded, and Span<int> doesn't have an implicit cast to int[]. Specifically:
stackalloc memory is released on method exit, so I'm not sure if there's a simpler way to use a struct like this than giving it a Span field and assigning it after creating within the method using it.
How do I neatly switch between using backing fields of different types (Span and int[]) without changing the public-facing interface?

I managed to come up with a solution, not sure if it's the best implementation, but it seems to work. I also have a couple of alternatives.
Note: This is useful for increasing speed only when you have a function that needs to create a temporary array and is called very frequently. The ability to switch to a heap allocated object is just a fallback in case you overrun the buffer.
Option 1 - Using Span and stackalloc
If you're building to .NET Core 2.1 or later, .NET Standard 2.1 or later, or can use NuGet to use the System.Memory package, the solution is really simple.
Instead of a class, use a ref struct (this is necessary to have a Span<T> field, and neither can leave the method where they're declared. If you need a long-lived class, then there's no reason to try to allocate on the stack since you'll just have to move it to the heap anyway.)
public ref struct SmallList
{
private Span<int> data;
private int count;
//...
}
Then add in all your list functionality. Add(), Remove(), etc. In Add or any functions that might expand the list, add a check to make sure you don't overrun the span.
if (count == data.Length)
{
int[] newArray = new int[data.Length * 2]; //double the capacity
Array.Copy(data.ToArray(), 0, new_array, 0, cap);
data = new_array; //Implicit cast! Easy peasy!
}
Span<T> can be used to work with stack allocated memory, but it can also point to heap allocated memory. So if you can't guarantee your list will always be small enough to fit in the stack, the snippet above gives you a nice fallback that shouldn't happen frequently enough to cause noticeable problems. If it is, either increase the initial stack allocation size (within reason, don't overflow!), or use another solution like an array pool.
Using the struct just requires an extra line and a constructor that takes a span to assign to the data field. Not sure if there's a way to do it all in one shot, but it's easy enough:
Span<int> span = stackalloc int[32];
SmallList list = new SmallList(span);
And if you need to use it in a nested function (which was part of my issue) you just pass it in as a parameter instead of having the nested function return a list.
void DoStuff(SmallList results) { /* do stuff */ }
DoStuff(list);
//use results...
Option 2: ArrayPool
The System.Memory package also includes the ArrayPool class, which lets you store a pool of small arrays that your class/struct could take out without bothering the garbage collector. This has comparable speed depending on the use case. It also has the benefit that it would work for classes that have to live beyond a single method. It's also fairly easy to write your own if you can't use System.Memory.
Option 3: Pointers
You can do something like this with pointers and other unsafe code, but the question was technically asking about safe code. I just like my lists to be thorough.
Option 4: Without System.Memory
If, like me, you're using Unity / Mono, you can't use System.Memory and related features until at least 2021. Which leaves you to roll your own solution. An array pool is fairly straightforward to implement, and does the job of avoiding garbage allocations. A stack allocated array is a bit more complicated.
Luckily, someone has already done it, specifically with Unity in mind. The page linked is quite long, but includes both sample code demonstrating the concept and a code generation tool that can make a SmallBuffer class specific to your exact use case. The basic idea is to just create a struct with individual variables that you index as if they were an array.
Update: I tried both these solutions and the array pool was slightly faster (and a lot easier) than the SmallBuffer in my case, so remember to profile!

In C and C++ languages the developer defines in which memory an object is going to be instantiated: stack or heap.
In C# you it is determined by the author of the data type.
You can achieve your goal using Span and pointers. https://learn.microsoft.com/en-us/dotnet/api/system.span-1?view=netcore-3.1.
But I would not recommend you to do that, because your code is not safe. Meaning that CLR gives you all the responsibility to manage it, at least clean the memory, when you do not need such object anymore. Usually the C# developers come to such tricks, when they want to optimise really big data collections, which allocates a lot of memory in the heap.
If it is still what you are looking for - than, probably, C# is not the best option to use.
Even more, if you have a big collection and somehow you find the way how to put it in stack memory - you can easily face StackOverflowException.

C# - How to Bypass Error cs0212 Cheaply for Programmers and Computers?

I want to process many integers in a class, so I listed them into an int* array.
int*[] pp = new int*[]{&aaa,&bbb,&ccc};
However, the compiler declined the code above with the following EXCUSE:
> You can only take the address of an unfixed expression inside of a fixed statement initializer
I know I can change the code above to avoid this error; however, we need to consider ddd and eee will join the array in the future.
public enum E {
aaa,
bbb,
ccc,
_count
}
for(int i=0;i<(int)E._count;i++)
gg[(int)E.bbb]
　
Dictionary<string,int>ppp=new Dictionary<string,int>();
ppp["aaa"]=ppp.Count;
ppp["bbb"]=ppp.Count;
ppp["ccc"]=ppp.Count;
gg[ppp["bbb"]]
These solution works, but they make the code and the execution time longer.
I also expect a nonofficial patch to the compiler or a new nonofficial C# compiler, but I have not seen an available download for many years; it seems very difficult to have one for us.
Are there better ways so that
I do not need to count the count of the array ppp.
If the code becomes long, there are only several letters longer.
The execution time does not increase much.
To add ddd and eee into the array, there are only one or two
setences for each new member.

.NET runtime is a managed execution runtime which (among other things) provides garbage collection. .NET garbage collector (GC)
not only manages the allocation and release of memory, but also transparently moves the objects around the "managed heap", blocking
the rest of your code while doing it.
It also compacts (defragments) the memory by moving longer lived objects together, and even "promoting" them into different parts of the heap, called generations, to avoid checking their status too often.
There is a bunch of memory being copied all the time without your program even realizing it. Since garbage collection is an operation that can happen at any time during the execution of your program, any pointer-related
("unsafe") operations must be done within a small scope, by telling the runtime to "pin" the objects using the fixed keyword. This prevents the GC from moving them, but only for a while.
Using pointers and unsafe code in C# is not only less safe, but also not very idiomatic for managed languages in general. If coming from a C background, you may feel like at home with these constructs, but C# has a completely different philosophy: your job as a C# programmer should be to write reliable, readable and maintenable code, and only then think about squeezing a couple of CPU cycles for performance reasons. You can use pointers from time to time in small functions, doing some very specific, time-critical code. But even then it is your duty to profile before making such optimizations. Even the most experienced programmers often fail at predicting bottlenecks before profiling.
Finally, regarding your actual code:
I don't see why you think this:
int*[] pp = new int*[] {&aaa, &bbb, &ccc};
would be any more performant than this:
int[] pp = new int[] {aaa, bbb, ccc};
On a 32-bit machine, an int and a pointer are of the same size. On a 64-bit machine, a pointer is even bigger.
Consider replacing these plain ints with a class of your own which will provide some context and additional functionality/data to each of these values. Create a new question describing the actual problem you are trying to solve (you can also use Code Review for such questions) and you will benefit from much better suggestions.

Object instance valid only for the current method

Is it possible to create an object that can register whether the current thread leaves the method where it was created, or to check whether this has happened when a method on the instance gets called?
ScopeValid obj;
void Method1()
{
obj = new ScopeValid();
obj.Something();
}
void Method2()
{
Method1();
obj.Something(); //Exception
}
Can this technique be accomplished? I would like to develop a mechanism similar to TypedReference and ArgIterator, which can't "escape" the current method. These types are handled specially by the compiler, so I can't mimic this behavior exactly, but I hope it is possible to create at least a similar rule with the same results - disallow accessing the object if it has escaped the method where it was created.
Note that I can't use StackFrame and compare methods, because the object might escape and return to the same method.

Changing method behavior based upon the source of the call is a bad design choice.
Some example problems to consider with such a method include:
Testability - how would you test such a method?
Refactoring the calling code - What if the user of your code just does an end run around your error message that says you can't do that in a different method than it was created? "Okay, fine! I'll just do my bad thing in the same method, says the programmer."
If the user of your code breaks it, and it's their fault, let it break. Better to just document your code with something like:
IInvalidatable - Types which implement this member should be invalidated with Invalidate() when you are done working with this.
Ignoring the obvious point that this almost seems like is re-inventing IDisposible and using { } blocks (which have language support), if the user of your code doesn't use it right, it's not really your concern.
This is likely technically possible with AOP (I'm thinking PostSharp here), but it still depends on the user using your code correctly - they would have to have it in the build process, and failing to function if they aren't using a tool just because you're trying to make it easy on them is evil.
Another point - If you are just attempting to create an object which cannot be used outside of one method, and any attempted operation outside of the method would fail, just declare it a local inside the method.
Related: How to find out which assembly handled the request

Years laters, it seems this feature was finally added to C# 7.2: ref struct.
Another related language feature is the ability to declare a value type that must be stack allocated. In other words, these types can never be created on the heap as a member of another class. The primary motivation for this feature was Span and related structures. Span may contain a managed pointer as one of its members, the other being the length of the span. It's actually implemented a bit differently because C# doesn't support pointers to managed memory outside of an unsafe context. Any write that changes the pointer and the length is not atomic. That means a Span would be subject to out of range errors or other type safety violations were it not constrained to a single stack frame. In addition, putting a managed pointer on the GC heap typically crashes at JIT time.
This prevents the code from moving the value to the heap, which partly solves my original problem. I am not sure how returning a ref struct is constrained, though.

Pinning a delegate within a struct before passing to unmanaged code

I'm trying to use an unmanaged C dll for loading image data into a C# application. The library has a fairly simple interface where you pass in a struct that contains three callbacks, one to receive the size of the image, one that receives each row of the pixels and finally one called when the load is completed. Like this (C# managed definition):
[System.Runtime.InteropServices.StructLayoutAttribute(System.Runtime.InteropServices.LayoutKind.Sequential)]
public struct st_ImageProtocol
{
public st_ImageProtocol_done Done;
public st_ImageProtocol_setSize SetSize;
public st_ImageProtocol_sendLine SendLine;
}
The types starting st_ImageProtocol are delgates:
public delegate int st_ImageProtocol_sendLine(System.IntPtr localData, int rowNumber, System.IntPtr pixelData);
With the test file that I'm using the SetSize should get called once, then the SendLine will get called 200 times (once for each row of pixels in the image), finally the Done callback gets triggered. What actually happens is that the SendLine is called 19 times and then a AccessViolationException is thrown claiming that the library tried to access protected memory.
I have access to the code of the C library (though I can't change the functionality) and during the loop where it calls the SendLine method it does not allocate or free any new memory, so my assumption is that the delegate itself is the issue and I need to pin it before I pass it in (I have no code inside the delegate itself currently, besides a counter to see how often it gets called, so I doubt I'm breaking anything on the managed side). The problem is that I don't know how to do this; the method I've been using to declare the structs in unmanaged space doesn't work with delegates (Marshal.AllocHGlobal()) and I can't find any other suitable method. The delegates themselves are static fields in the Program class so they shouldn't be being garbage collected, but I guess the runtime could be moving them.
This blog entry by Chris Brumme says that delegates don't need to be pinned before being passed into unmanaged code:
Clearly the unmanaged function pointer must refer to a fixed address. It would be a disaster if the GC were relocating that! This leads many applications to create a pinning handle for the delegate. This is completely unnecessary. The unmanaged function pointer actually refers to a native code stub that we dynamically generate to perform the transition & marshaling. This stub exists in fixed memory outside of the GC heap.
But I don't know if this holds true when the delegate is part of a struct. It does imply that it is possible to manually pin them though, and I'm interested in how to do this or any better suggestions as to why a loop would run 19 times then suddenly fail.
Thanks.
Edited to answer Johan's questions...
The code that allocates the struct is as follows:
_sendLineFunc = new st_ImageProtocol_sendLine(protocolSendLineStub);
_imageProtocol = new st_ImageProtocol()
{
//Set some other properties...
SendLine = _sendLineFunc
};
int protocolSize = Marshal.SizeOf(_imageProtocol);
_imageProtocolPtr = Marshal.AllocHGlobal(protocolSize);
Marshal.StructureToPtr(_imageProtocol, _imageProtocolPtr, true);
Where the _sendLineFunc and the _imageProtocol variables are both static fields of the Program class. If I understand the internals of this correctly, that means that I'm passing an unmanaged pointer to a copy of the _imageProtocol variable into the C library, but that copy contains a reference to the static _sendLineFunc. This should mean that the copy isn't touched by the GC - since it is unmanaged - and the delegate won't be collected since it is still in scope (static).
The struct actually gets passed to the library as a return value from another callback, but as a pointer:
private static IntPtr beginCallback(IntPtr localData, en_ImageType imageType)
{
return _imageProtocolPtr;
}
Basically there is another struct type that holds the image filename and the function pointer to this callback, the library figures out what type of image is stored in the file and uses this callback to request the correct protocol struct for the given type. My filename struct is declared and managed in the same way as the protocol one above, so probably contains the same mistakes, but since this delegate is only called once and called quickly I haven't had any problems with it yet.
Edited to update
Thanks to everybody for their responses, but after spending another couple of days on the problem and making no progress I decided to shelve it. In case anyone is interested I was attempting write a tool for users of the Lightwave 3D rendering application and a nice feature would have been the ability to view all the different image formats that Lightwave supports (some of which are fairly exotic). I thought that the best way to do this would be to write a C# wrapper for the plugin architecture that Lightwave uses for image manipulation so I could use their code to actually load the files. Unfortunately after trying a number of the plugins against my solution I had a variety of errors that I couldn't understand or fix and my guess is that Lightwave doesn't call the methods on the plugins in a standard way, probably to improve the security of running external code (wild stab in the dark, I admit). For the time being I'm going to drop the image feature and if I do decide to reinstate it I'll approach it in a different way.
Thanks again, I learnt a lot through this process even though I didn't get the result I wanted.

I had a similar problem when registering a callback delegate (it would be called, then poof!). My problem was that the object with the method being delegated was getting GC'ed. I created the object in a more global place so as to keep it from being GC'ed.
If something like that doesn't work, here are some other things to look at:
As additional info, take a look at GetFunctionPointerForDelegate from the Marshal class. That is another way you could do this. Just make sure that the delegates are not GC'ed. Then, instead of delegates in your struct, declare them as IntPtr.
That may not solve the pinning, but take a look at fixed keyword, even though that may not work for you since you are dealing with a longer lifetime than for what that is typically used.
Finally, look at stackalloc for creating non-GC memory. These methods will require the use of unsafe, and might therefore put some other constraints on your Assemblies.

It would be interesting to know a little more:
How do you create the ImageProtocol struct? Is it a local variable or a class member or do you allocate it in unmanaged memory with Marshal.AllocHGlobal?
How is it sent to the C function? Directly as stack variable or as a pointer?
A really tricky problem! It feels like the delegate data is moved around by the GC which causes the access violation. The interesting thing is that the delegate data type is a reference data type, which stores its data on the GC heap. This data contains things like the address of the function to call (function pointer) but also a reference to the object that contains the function. This should mean that even though the actual function code is stored outside of the GC heap, the data that holds the function pointer is stored in the GC heap and can hence be moved by the GC. I thought about the problem a lot last night but haven't come up with a solution....

You don't say exactly how the callback is declared in the C library. Unless it is explictly declared __stdcall you'll slowly corrupt your stack. You'll see your method get called (probably with the parameters reversed) but at some point in the future the program will crash.
So far as I know there is no way around that, other than writing another callback function in C that sits between the C# code and the library that wants a __cdecl callback.

If the c function is a __cdecl function then you have to use the Attribut
[UnmanagedFunctionPointer(CallingConvention.Cdecl)]
before the delegate declaration.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.