How to run some code in memory? - C#

I have a compiler which compiles assembly language to machine language (in memory).
My project is in C# (.NET).
Is there any way to execute that memory on a thread?
Can DEP (Data Execution Prevention) prevent it?
byte[] a:
01010101 10111010 00111010 10101011 ...

The key is to put the executable code into a block of memory allocated with VirtualAlloc such that the buffer is marked as executable.
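// VirtualAlloc is a kernel32.dll P/Invoke; AllocationType and MemoryProtection are
// user-declared enums mirroring the Win32 constants (they are not built into .NET).
IntPtr pExecutableBuffer = VirtualAlloc(
    IntPtr.Zero,
    new IntPtr(byteCount),
    AllocationType.MEM_COMMIT | AllocationType.MEM_RESERVE,
    MemoryProtection.PAGE_EXECUTE_READWRITE);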
(Then use VirtualFree to clean up after yourself.)
This tells Windows that the memory holds executable code, so running it won't trigger a DEP violation.
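To make the idea above concrete, here is a minimal sketch of the whole round trip: allocate an executable buffer, copy the bytes in, and call them through a delegate. Everything beyond VirtualAlloc/VirtualFree is my own assumption for illustration: the class name InMemoryCode, the ReturnInt delegate, and the six-byte payload (mov eax, 42; ret), which takes no arguments so it behaves the same in 32-bit and 64-bit processes. It only does anything on Windows on x86/x64 and returns null elsewhere.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

public static class InMemoryCode
{
    // Win32 constants from the VirtualAlloc/VirtualFree documentation.
    private const uint MEM_COMMIT = 0x1000;
    private const uint MEM_RESERVE = 0x2000;
    private const uint MEM_RELEASE = 0x8000;
    private const uint PAGE_EXECUTE_READWRITE = 0x40;

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern IntPtr VirtualAlloc(
        IntPtr lpAddress, UIntPtr dwSize, uint flAllocationType, uint flProtect);

    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool VirtualFree(
        IntPtr lpAddress, UIntPtr dwSize, uint dwFreeType);

    // Hypothetical signature for the generated code: no arguments, int result.
    [UnmanagedFunctionPointer(CallingConvention.Cdecl)]
    private delegate int ReturnInt();

    // Illustrative payload only: mov eax, 42 ; ret
    private static readonly byte[] Code = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };

    // Returns the value produced by the machine code, or null when the
    // current platform cannot run this x86 payload through kernel32.
    public static int? Run()
    {
        Architecture arch = RuntimeInformation.ProcessArchitecture;
        bool x86Family = arch == Architecture.X86 || arch == Architecture.X64;
        if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows) || !x86Family)
            return null;

        IntPtr buffer = VirtualAlloc(
            IntPtr.Zero, (UIntPtr)(uint)Code.Length,
            MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
        if (buffer == IntPtr.Zero)
            throw new Win32Exception(Marshal.GetLastWin32Error());

        try
        {
            // Copy the bytes into the executable pages and call them.
            Marshal.Copy(Code, 0, buffer, Code.Length);
            ReturnInt fn = Marshal.GetDelegateForFunctionPointer<ReturnInt>(buffer);
            return fn();
        }
        finally
        {
            // With MEM_RELEASE the size argument must be zero.
            VirtualFree(buffer, UIntPtr.Zero, MEM_RELEASE);
        }
    }
}
```

To run it on its own thread, as the question asks, the call can be wrapped in something like new Thread(() => InMemoryCode.Run()).Start(). Real generated code with parameters would also need a delegate signature and calling convention that match what the compiler emits.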

I doubt there's a supported way. I don't know and haven't researched it, but here are some guesses:
The easiest way might be to launch it as a process: write it into a *.com file and then tell the O/S to run that executable.
Alternatively, pass the memory as a parameter to the CreateThread function (but you'll need to worry about the code having the right calling convention, expecting the specified parameters, preserving registers, and being in memory which is executable).
Another possibility is to write the opcodes into memory that you know is already going to be executed (e.g. overwrite existing code in a recently loaded DLL).

It's possible to execute bytes as code:
Inline x86 ASM in C#
It does require the use of unsafe code.
I thought that this was just a fun fact but useless in practice, but perhaps your application actually has a use for this :)

You can whitelist your application from the control panel
http://ask-leo.com/how_do_i_turn_off_data_execution_prevention_errors.html
I doubt you can whitelist it programmatically, and certainly not without admin access - that would defeat the purpose of this security feature.

Related

Is there a way to dynamically generate native x86 code and execute it from .NET?

I'd like to dynamically generate native unmanaged x86 code and then execute it super fast from managed .NET. I know I could emit some IL code and execute that, but I'm thinking the JITting would take too much time for me to benefit from any speed gain.
I need super fast code. I want to generate a function with the x86 opcodes in memory and pass it a fixed pointer to a memory block, so that it can make some really fast calculations on that block.
I'm just not sure how to call the native code from .NET. Remember, this should happen on the fly in memory, not by building a DLL. Speed is what really matters here. It's part of a genetic computation project.

C is the "portable assembler", and you can generate x86 opcodes directly (writing in assembler would be better). If you compiled the C code into an object file, you could link it into .NET as a library.
If you are trying to generate executable code dynamically (on the fly), you would need to allocate a byte array, copy the object code into the array, then take the starting address of the array (past the memory header), assign it to a function pointer, and call it.
However, antivirus software specifically looks for this behavior and would more than likely identify your code as a virus.
Also, your processor is designed to have "code" memory segments and "data" memory segments. Typically you cannot dynamically modify the code segment without causing a segfault, and you cannot execute out of a data segment without causing a segfault.
Also, your code would only run on a CISC processor (x86) and not on a RISC processor.
There is a lot to consider.
We used to do this in assembler in the old days on CISC systems.
Cheers,
David

The short answer is that you can't do that with C#.
You can do that with C++/CLI, by building a mixed-mode assembly that does your "super fast" calculations in native code. That way, you wouldn't need (presumably) to generate executable code "on-the-fly".
If for some reason you can't hard-code the calculation functions, then you will have to acquire executable memory from the operating system.
On Windows, you do that with the VirtualAlloc and VirtualProtect functions. Again, doing that from C# would require P/Invoke interop, which would most likely reverse your speed gains.

How does managed language ensure no segfault

As far as I know (please correct me if I'm wrong), managed languages (or at least C#) will not segfault, at least as long as you avoid unsafe code and don't deal with unmanaged memory directly. This is the opposite of unmanaged languages (or at least C++), where you can get a segfault just by glancing at the cat next to you for a second while coding.
The question: how do managed languages ensure this? Were their runtime libraries simply built and tested very carefully, or do they have some way to catch these segfaults and deal with them one way or another?
The motivation behind this question: I have a C# application that calls a native C++ library (I built both). When my C++ DLL segfaults, the whole application goes down (and some services with it), which is not good at all. I know that a segfault means something was done wrong and needs to be corrected. However, I at least want some mechanism to handle this problem when the buggy (possibly segfaulting) C++ DLL is running on a customer's machine.

They don't allow you to manually deallocate memory.
They don't let you read from or write to arbitrary memory addresses (C++ doesn't officially allow this either, but the language syntax makes it possible).
(As a special case of the above) They check on every array access that the index is within the bounds of the array.
To the best of my knowledge, they don't have undefined behavior (except, of course, when calling unsafe code).
"I want some mechanism to solve this problem when the buggy (may cause segfault) C++ DLL is working on the customer machine."
The problem is that even if you could let your program continue (I don't know whether Windows/C# offers any mechanism for this), it might no longer be in a valid state. Depending on what the error is and what kind of resources your program has access to, this could actually result in worse errors than a simple crash, including the destruction of user data.
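The bounds-checking point is easy to see in a few lines. This is only a small sketch (the BoundsDemo class and method name are made up for illustration): an out-of-range index produces a catchable managed exception instead of touching memory outside the array.

```csharp
using System;

public static class BoundsDemo
{
    // Reads one element past the end of a managed array and reports what happens.
    public static string ReadPastEnd()
    {
        int[] data = new int[4];
        try
        {
            int index = data.Length;          // one past the last valid index
            return data[index].ToString();    // the runtime checks the index here
        }
        catch (IndexOutOfRangeException ex)
        {
            // The access never reaches memory outside the array.
            return "caught: " + ex.GetType().Name;
        }
    }
}
```

The equivalent read in C++ is undefined behavior: it may return garbage, corrupt something, or segfault. None of this protection extends into a native DLL called through interop.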

Detecting memory access to a process

I'm trying to check whether an application tries to manipulate a particular process (for example, by hooking itself into it). I couldn't find a proper approach to accomplish this. Is it possible to compute a checksum over a running process? If not, how can I detect this situation?

Another process can't install hooks in your process directly. It can modify your memory, but to hook anything its code must be in your address space. This is usually done by injecting a DLL into your process at startup (injecting a DLL at runtime is harder). You can check for this by listing the DLLs loaded in your process and searching their code for calls to functions such as ReadProcessMemory, WriteProcessMemory, OpenProcess, and CallNextHookEx. To do that, get the address of each function (GetProcAddress) and search the code for that value (you can also match the surrounding asm call encoding to narrow down the results).
You can also compare your PE file on disk with its image in memory. If a DLL was injected at startup, the image copied into memory should differ from the file: after the last imported DLL, the debug symbols should have been overwritten with an additional DLL import. Note that this modification can be made to the file on disk just as well as in memory.
The best method, though probably not an easy one when you are using C#, is to obfuscate your code. I think this is a good method because nobody hooks something they don't understand: they won't know which hook to install or where. To obfuscate C# code well, however, you need to find good software for it, and it will probably not be cheap.

C# or C++ sandboxed assembly

I'm thinking of writing a program that involves including super fast assembly or, since it doesn't have to be human readable, machine code in C++ or C#. However, I also have other, possibly more troublesome requirements.
I would need to be able to:
Store machine code programs in normal variables / object instances, for example strings "40 9B 7F 5F ..." to edit and run them.
Have the programs able to output data. I saw an example where one had a pointer to an int that it could use.
Have the programs not able to output data anywhere else. For example to not be able to perform such actions as to delete files, view the system spec or change the state of the memory of the C++ or C# program they are contained within.
For example, it could be something like this:
machine n;
n = "40 9B 7F";
n[1] = "5F";
// 'n' is now "40 5F 7F"
unsigned short s = 2;
n.run(&s);
// while 'n' was running it may have changed 's' but would not have been able to
// change anything else anywhere on the system including in this C++ / C# program
According to the wiki link Michael Dorgan posted, "asm(std::string);" runs the string as assembler, and it's also easy to reference variables from the C++ part of the program. Editing a std::string is easy, and Alex has noted that I can ensure that the code is safe by not allowing unsafe commands.

Sandboxing native machine code is non-trivial. If you really want that, take a look at NaCl (Native Client) from Google, which implements a machine code sandbox for browsers.
What is more practical is to use .NET IL instead of machine code and run it in a sandboxed (or hosted) AppDomain. This comes much closer to what you want and is still fast thanks to dynamic JIT compilation to machine code.
An alternative is to use Windows' built-in rights management and spawn a new process with restricted rights. I've never done that, so I don't know whether you can reduce the target process's rights as much as you want. In any case, that would be a pure Win32 process just running machine code, so you lose any ability to use .NET in the sandboxed process.

If you want to include assembler in your C/C++ code, consider either inline assembly routines or compiling separate, full assembler files and linking them back in. Inline assembler syntax is kind of weird, but I believe it is probably the best choice for you from what I've read.
Wikipedia to the rescue for some samples:
Inline assembler examples

Update based on comments:
This is far from a trivial task. You have to implement a linker, assembler (to scan and sandbox) and loader.
I wonder what the use case is -- for my example I'll assume you want to have an assembly contest where people submit solutions to problems and you "test" them.
This is the best solution I can think of:
Have a hosting program that takes as input assembly language.
Invoke the assembler to compile and link the assembly program.
Create a protected virtual environment for the program to run in (how you do this depends on the platform) which runs as a user that has no rights to the system.
Capture the results
This solution allows you to leverage existing assemblers, loaders and security without having to re-implement them.
The best example code of dynamically loading, running and sandboxing C# code I know of is the terrarium game at http://terrarium2.codeplex.com/
However, you might consider something better suited to this job, like a scripting system. Lua comes to mind as a popular one. Using Lua users will only be able to perform the actions you allow. http://www.lua.org/

If you restrict the subset of supported instructions, you can do what you want more or less easily.
First, you have to parse and decode an input instruction to see if it's in the supported subset (most of parsing/decoding can be done just once). Then you need to execute it.
But before executing, there's one important thing to take care of. Based on the decoded details of the instruction and the state of the CPU registers, you have to calculate the memory addresses that the instruction is going to access as data (including on-stack locations) or transfer control to. If any of those are outside of the established limits, raise an alarm. Otherwise, if it's a control-transfer instruction (e.g. jmp, jz), you must additionally ensure that the address it passes control to is not only within the memory where all these instructions lie, but is also the address of one of those instructions and not an address inside any of them (e.g. 1 or 2 bytes from the beginning of an instruction that is 3 or more bytes long). Passing control anywhere else is a no-no. You do not want these instructions to pass control to any standard library functions either, because you won't be able to control execution there and they're not always safe when supplied with bogus or malicious inputs. Also, these instructions must not be able to modify themselves.
If all is clear, you can either emulate the instruction or more or less directly execute it (control passing instructions will likely have to be always emulated because you want to stop execution after every instruction). For the latter you can create a modifiable function containing these things:
1) Code to save CPU registers of the caller and load them with the state for the instruction being executed.
2) The instruction.
3) The reverse of step 1: code to save post-execution register state and restore the caller's register state.
You can try this approach.
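The control-transfer check described above (a jump may only land on the first byte of a known instruction) could be sketched like this. It assumes a decoder has already produced the length of each instruction in order; the BranchCheck class and its method names are hypothetical, and a real implementation would also need the data-address checks.

```csharp
using System;
using System.Collections.Generic;

public static class BranchCheck
{
    // Builds the set of offsets at which an instruction begins, given the
    // decoded length of each instruction in program order.
    public static HashSet<int> InstructionStarts(IEnumerable<int> lengths)
    {
        var starts = new HashSet<int>();
        int offset = 0;
        foreach (int length in lengths)
        {
            if (length <= 0)
                throw new ArgumentException("Instruction lengths must be positive.");
            starts.Add(offset);
            offset += length;
        }
        return starts;
    }

    // A branch target is acceptable only if it is exactly the start of an
    // instruction; offsets inside an instruction or past the end are rejected.
    public static bool IsValidTarget(HashSet<int> starts, int target)
    {
        return starts.Contains(target);
    }
}
```

For a program whose instructions are 1, 3 and 2 bytes long, the valid targets are offsets 0, 1 and 4. Offset 2 falls inside the 3-byte instruction and offset 6 is past the end, so both are rejected.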

Call Unmanaged code from managed or spawn process

I have an unmanaged C++ exe that I could call from inside my C# code directly (have the C++ code that I could make a lib) or via spawning a process and grabbing the data from the OutputStream. What are the advantages/disadvantages of the options?

Since you have the source code of the C++ library, you can use C++/CLI to compile it into a mixed-mode DLL so that it is easy to use from the C# application.
The benefit of this approach is that it gives you the most flexibility in data flow (input to or output from the C++ module).
Running the C++ code out of process has one benefit, though: if your C++ code is not very robust, it keeps your main C# process stable, since a crash in the C++ code cannot take it down.

The big downside to scraping the OutputStream is the lack of data typing. I'd much rather do the work of exporting a few functions and reusing an existing library; but, that's really just a preference.
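For comparison, here is roughly what the out-of-process route looks like, as a sketch under assumptions: the OutOfProcess class, the idea that the C++ exe prints a single number, and the executable path you pass in are all hypothetical. It shows the typing problem directly: everything comes back as text and has to be parsed and validated by hand.

```csharp
using System;
using System.Diagnostics;
using System.Globalization;

public static class OutOfProcess
{
    // Starts the executable, waits for it to finish, and returns its stdout.
    public static string RunAndCapture(string exePath, string arguments)
    {
        var startInfo = new ProcessStartInfo(exePath, arguments)
        {
            RedirectStandardOutput = true,
            UseShellExecute = false,   // required for stream redirection
            CreateNoWindow = true
        };

        using (Process process = Process.Start(startInfo))
        {
            // Read before WaitForExit so a full pipe cannot block the child.
            string output = process.StandardOutput.ReadToEnd();
            process.WaitForExit();
            if (process.ExitCode != 0)
                throw new InvalidOperationException(
                    "Child process exited with code " + process.ExitCode);
            return output;
        }
    }

    // The child only gives us text; turning it back into a number is our job.
    public static bool TryParseResult(string output, out double value)
    {
        return double.TryParse(output.Trim(), NumberStyles.Float,
                               CultureInfo.InvariantCulture, out value);
    }
}
```

A crash in the child shows up here as a non-zero exit code rather than taking the C# process down, which is the isolation benefit other answers mention.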

Another disadvantage of spawning a process is that on Windows it is a very expensive (slow) operation. If you intend to call the C++ code quite often, this is worth considering.
An advantage can be that you're automatically more isolated from crashes in the C++ program.
Drop-in replacement of the C++ executable can be an advantage as well.
Furthermore, writing interop code can be a big hassle in C#. If it's a complicated interface and you decide to do interop, have a look at C++/CLI for the interop layer.

You're far better off taking a subset of the functions of the C++ executable and building it into a library. You'll keep type safety and you'll be able to better leverage Exception Handling (not to mention finer grain control of how you manage the calls into the functions in the library).
If you go with grabbing the data from the OutputStream of the executable, you're going to have no visibility into the processes of the executable, no real exception handling, and you're going to lose any type information you may have had.

The main disadvantage to being in process would be making sure you handle the managed/native interactions correctly.
1) The C++ code will probably depend on deterministic destruction for cleanup, freeing resources, etc. I say probably because this is common and good practice in C++.
In the managed code this means you have to be careful to dispose of your C++/CLI wrapper properly. If your code is used once, a using statement in C# will do this for you. If the object needs to live a while as a member, you'll find that the Dispose call needs to be chained the whole way through your application.
2) Another issue depends on how memory hungry your application is. The managed garbage collector can be lazy. It is guaranteed to kick in if a managed allocation needs more space than is available. However, the unmanaged allocator is not connected to it in any way. Therefore you need to manually inform the garbage collector that you are making unmanaged allocations so that it can take them into account. This is done using the GC.AddMemoryPressure method.
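Both points can be sketched together in one small wrapper. This is an illustration only: NativeBuffer is a made-up class, and Marshal.AllocHGlobal stands in for whatever the real C++ library allocates. The wrapper frees the native memory deterministically in Dispose and reports its size to the garbage collector in the meantime.

```csharp
using System;
using System.Runtime.InteropServices;

public sealed class NativeBuffer : IDisposable
{
    private IntPtr _pointer;
    private readonly long _size;

    public NativeBuffer(int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        _pointer = Marshal.AllocHGlobal(size);   // unmanaged allocation
        _size = size;
        GC.AddMemoryPressure(_size);             // tell the GC it exists
    }

    public bool IsDisposed
    {
        get { return _pointer == IntPtr.Zero; }
    }

    // Deterministic cleanup, the managed counterpart of a C++ destructor.
    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this);
    }

    // Safety net in case Dispose is never called.
    ~NativeBuffer()
    {
        Release();
    }

    private void Release()
    {
        if (_pointer == IntPtr.Zero) return;
        Marshal.FreeHGlobal(_pointer);
        GC.RemoveMemoryPressure(_size);          // must match AddMemoryPressure
        _pointer = IntPtr.Zero;
    }
}
```

Used once, it goes in a using statement: using (var buffer = new NativeBuffer(1024)) { ... }. Held as a member, the owning class has to implement IDisposable too and forward the call, which is the chaining mentioned above.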
The main disadvantages to being out of process are:
1) Speed.
2) Code overhead to manage the communication.
3) Code overhead to watch for one or other process dying when it is not expected to.
