Related
I am writing code to parse a complicated string in C++ and create a tree from it. I would like to use C# in Visual Studio 2017 to call a native c++ method that returns a vector of nodes.
A Node looks like:
class node
{
public:
std::vector<node> subnodes;
std::string name;
};
and the c++ function might look like:
class noderizer
{
public:
node getNodes(std::string str);
};
what is the most efficient (coding time with secondary consideration for speed) way to call the noderizer::getNodes(...) member and create a equivalent class for c#?
I am assuming that the best route is to create a duplicate class definition in c# and then copy marshal the native std::strings into managed Strings"
public class node
{
public string name = string.Empty;
List<node> integers = new List<node>();
}
It's not clear if this article contains the latest information for c++ interop, but a related article on wrapping c++ native classes for use in c# indicates that I could most likely just wrap noderizer native c++ as noderizer * m_Impl; and then call the getNodes member and copy over each parameter. Is this the correct methodology?
You could use pInvoke as per your first referenced article, but the objects you are returning seem quite complicated. I think you will have a heck of a time with data marshalling if you're not already well versed (and even then). pInvoke is great for simple calls to C libraries where data marshalling is basic, but it gets very complicated very fast. This is not good for your situation.
The second article is closer to where you want to look. That being said, you have to consider whether you want to be wrapping a managed class or just calling a function that takes a string and returns your tree in managed format (i.e. it makes a copy rather than wrap unmanaged data). By your post it seems like you just need the latter.
Your best option is to use C++/CLI. Check out this tutorial. I am confident that taking this approach will make light work of your task. I won't attempt to explain the subject as the article above does a very good job. In essence, you will be able write functions with both managed and unmanaged data types, where all the data marshalling is built into the environment. A simple cast will marshal the data for you behind the scenes. As a big bonus, debugging is great as you can step from C# to C++/CLI to C++ code and back.
In the article above, the author does describe how to wrap an unmanaged class, which you can well do, but your problem will still be with data conversion.
I would approach your problem in 4 steps:
Write a static function getNodes() in C++/CLI that you can call from C# as any regular static method, passing a string as an argument. The article above will help you with that. Also See Here when creating the C++/CLI project.
getNodes() will use your C++ code to create a tree in its unmanaged form.
Use getNodes() to convert the tree from unmanaged to managed form.
Return the result to the caller in C#.
your declaration will look something like this, where the Node is your managed calss, ref keyword signifies that the class is managed, and ^ is the managed version of *.
public ref class noderizer
{
public:
static Node^ getNodes(String ^mStr);
};
Here getNodes() calls your C++ function and does the data conversion. I don't think it will take you long to figure it out.
Once you get a hang of the syntax, I think you will find it quite intuitive to use if you are already familiar with C# and C++.
As for performance, unless you're making thousands of consecutive calls or need critical real time data, it should not be an issue. One thing I would say though, if you're concerned with performance, you should write the pure C++ code in a separate purely unmanaged dll. I can't find the article for the life of me, but I remember looking at some benchmarks of executing purely unmanaged long running code block that was compiled inside the C++/CLI dll vs calling the exact same code that was compiled inside its own purely unmanaged dll. If I recall correctly, the separate dll was something like 3x faster.
I've searched good and Stack Overflow but couldn't find an answer to what I was looking for. Is there anyway to hook the call of Python functions from within C++/C#? Capture the function call plus it parameters?
Edit with an example:
def a(pOne):
//do stuff
def b():
a("a")
So on the call to 'a' I want C++ to hook that function (assuming the function name is known) and execute a C++ function/event on the call to 'a' with the parameters passed to the C++ function/event being the value of what was passed in for 'pOne'.
Thanks
There a couple of different options for C++ depending on what you want to do.
The best way to do what you want to do is to 'extend' Python.
My Favorite way to do this is Boost.Python
You can also use the raw C-API though I wouldn't recommend this.
For some examples of embedding and extending with Boost.Python you can look at this project I have been working on.
If you just want to write some C++ code and make it callable from Python, you do that by Extending and Embedding the Python Interpreter. (Actually, just Extending; ignore the other half.)
Basically, if you write a new module foo, anyone can import foo and call foo.a("a"), and it doesn't matter if module foo is implemented as a Python file foo.py, or compiled from foo.cpp into a dynamic library foo.so.
For that, as Kyle C suggests, there are a number of higher-level ways to do this so you don't need to deal with the C API directly. (In addition to his suggestion of Boost.Python, I'd also suggest looking at Cython, which lets you write the code in an almost-Python language that can talk directly to your C++, while also exposing things directly to Python.)
However, that isn't what you asked for. What you want to be able to do is take some function that's defined in Python code, and hook it to do something different. For that, you really do need to read the Extending and Embedding documentation linked above. You're going to have to write an embedded interpreter, reproduce some of the behavior (exactly how much depends on where exactly you want to hook it), and make sure that wherever PyObject_Call or a similar function would have been called, it first checks whether that object is the a function and, if so, calls your hook code.
This is going to be pretty difficult. If you haven't written an embedded Python interpreter yet, go do that before you even think about how to hook it.
It's worth noting that it's probably much, much easier to do the hooking from within Python than from outside the interpreter. (If you've never heard of "monkeypatching", go google that.) And you can always make your Python hook call code from a module that you built in C++, or even call directly into a compiled .so file via ctypes.
Finally, if you want to hook some running interpreter instance at runtime, that's even more difficult. You obviously need to be able to do some kind of debug/trace/etc. attach and code insertion, and the details of that are entirely dependent on your platform. Then, you'll want to do the same thing you would have done in the previous hard version, except that you'll have to do it by intercepting calls to, e.g., PyObject_Call (from libpython.so/Python.dll/Python.framework/whatever) to go through your hooks first. If you don't already know how to hook SO calls on your platform, you need to learn that before you can even think about hooking Python code from outside.
Do you have reasons not to alter the in-python code? If not, why not write an extension module or a python ctypes call to your dll, and wrap a with a decorator that calls your hook?
import my_cpp_module
def hook_a_to_call_my_cpp_module(orig_a):
def replacement_a(pOne):
try:
my_cpp_module.my_cpp_function(pOne)
finally:
return orig_a(pOne)
#
return replacement_a
#hook_a_to_call_my_cpp_module
def a(pOne):
pass
you can even do this without touching the source file which has a, like this:
# in this other file you import the module that contains function a
import the_py_that_has_a
# next line just to avoid copy-pasting the above code from my answer here again
from the_decorator_from_the_previous_example import hook_a_to_call_my_cpp_module
the_py_that_has_a.a = hook_a_to_call_my_cpp_module(the_py_that_has_a.a)
If you have some reasons not to alter your py code at all, could you state them in a comment?
For me , the Pointer was one of the hardest concept in programming languages in C++. When I was learning C++, I spent tremendous amount of time learning it. However, Now I primarily work in projects that are entirely written in languages like C#, and VB.NET etc. As a matter fact, I have NOT touched C++ for almost 4 years. Even though, C# has pointer, but I have not encouter the situation where I must use pointer in C#. So my question is , what kinds of productivity can we obtain in C# by using pointer ? what are the situation where the uses of the pointer is must?
You're already using lots of pointers in C#, except that they don't look like pointers. Every time you do something with an instance of a class, that's a pointer. You're getting almost all the potential benefit already, without the hassle.
It is possible to use pointers more explicitly in C#, which is what most people mean by C# pointers, but I would think the occasions would be very rare. They may be useful to link to C libraries and the like, but other than that I don't see much use for them.
Personally, I've never had a need for using pointers in .NET, but if you're dealing with absolute performance critical code, you'd use pointers. If you look at the System.String class, you'll see that a lot of the methods that handle the string manipulation, use pointers. Also, when dealing with image processing, very often it's useful to use pointers. Now, one can definitely argue whether those sort of applications should be written in .NET in the first place (I think they should), but at least if you need to squeeze out that extra bit of speed, you can.
I use pointers in C# only in rare circumstances that mostly have to do with sending/receiving data, where you have to convert a byte array to a struct and vice-versa. Though even then, you don't have to deal with pointers directly typically.
In some cases, you can use pointers to improve performance, because with the Marshaller, sometimes you have to copy memory to access data, while with pointers, you can access it directly (think Bitmap.Lock()).
Personally, I've never needed to use a pointer in C#. If I need that kind of functionality, I write that code in C++/CLI, and call it from C#. If I need to pass pointers from C# to C++/CLI or vice-versa, I pass them around as an IntPtr and cast it to the type I need in C++/CLI.
In my opinion - if you're using pointers in C#, in 99% of cases, you're using the language wrong.
Edit:
The nice thing about C++/CLI is that you can mark individual classes for native-only compilation. I do a lot of image processing work which needs to happen very quickly; it uses a lot of pointer-based code. I generally have a managed C++/CLI object forward calls to a native C++ object where my processing takes place. I turn on optimizations for that native code and viola, I get a nice performance gain.
Granted, this only matters if the performance gain you get by executing native, optimized code can offset the overhead of managed to unmanaged transitions. In my case, it always does.
Is there such a thing as an x86 assembler that I can call through C#? I want to be able to pass x86 instructions as a string and get a byte array back. If one doesn't exist, how can I make my own?
To be clear - I don't want to call assembly code from C# - I just want to be able to assemble code from instructions and get the machine code in a byte array.
I'll be injecting this code (which will be generated on the fly) to inject into another process altogether.
As part of some early prototyping I did on a personal project, I wrote quite a bit of code to do something like this. It doesn't take strings -- x86 opcodes are methods on an X86Writer class. Its not documented at all, and has nowhere near complete coverage, but if it would be of interest, I would be willing to open-source it under the New BSD license.
UPDATE:
Ok, I've created that project -- Managed.X86
See this project:
https://github.com/ZenLulz/MemorySharp
This project wraps the FASM assembler, which is written in assembly and as a compiled as Microsoft coff object, wrapped by a C++ project, and then again wrapped in C#. This can do exactly what you want: given a string of x86/x64 assembly, this will produce the bytes needed.
If you require the opposite, there is a port of the Udis86 disassembler, fully ported to C#, here:
https://github.com/spazzarama/SharpDisasm
This will convert an array of bytes into the instruction strings for x86/x64
Take a look at Phoenix from Microsoft Research.
Cosmos also has some interesting support for generating x86 code:
http://www.gocosmos.org/blog/20080428.en.aspx
Not directly from C# you can't. However, you could potentially write your own wrapper class that uses an external assembler to compile code. So, you would potentially write the assembly out to a file, use the .NET Framework to spin up a new process that executes the assembler program, and then use System.IO to open up the generated file by the assembler to pull out the byte stream.
However, even if you do all that, I would be highly surprised if you don't then run into security issues. Injecting executable code into a completely different process is becoming less and less possible with each new OS. With Vista, I believe you would definitely get denied. And even in XP, I think you would get an access denied exception when trying to write into memory of another process.
Of course, that raises the question of why you are needing to do this. Surely there's got to be a better way :).
Take a look at this: CodeProject: Using unmanaged code and assembly in C#.
I think you would be best off writing a native Win32 dll. You can then write a function in assembler that is exported from the dll. You can then use C# to dynamically link to the dll.
This is not quite the same as passing in a string and returning a byte array. To do this you would need an x86 assembler component, or a wrapper around masm.exe.
i don't know if this is how it works but you could just shellexecute an external compiler then loading the object generated in your byte array.
I have a small to medium project that is in C++/CLI. I really hate the syntax extensions of C++/CLI and I would prefer to work in C#. Is there a tool that does a decent job of translating one to the other?
EDIT: When I said Managed c++ before I apparently meant c++/CLI
You can only translate Managed C++ code (and C++/CLI code) to C# if the C++ code is pure managed. If it is not -- i.e. if there is native code included in the sources -- tools like .NET Reflector won't be able to translate the code for you.
If you do have native C++ code mixed in, then I'd recommend trying to move the native code into a separate DLL, replace your calls to DLL functions by easily identifiable stub functions, compile your project as a pure .NET library, then use .NET reflector to de-compile into C# code. Then you can replace the calls to the stub functions by p-invoke calls to your native DLL.
Good luck! I feel for you!
.NET Managed C++ is like a train wreck. But have you looked into C++ CLI? I think Microsoft did a great job in this field to make C++ a first class .NET citizen.
http://msdn.microsoft.com/en-us/magazine/cc163852.aspx
I'm not sure if this will work, but try using .Net Reflector along with ReflectionEmitLanguage plug-in. The RelelectionEmitLanguage plug-in claims to convert your assembly to c# code.
It has to be done manually unfortunately, but if the code is mostly C++/CLI (not native C++) then it can actually be done pretty quickly. I managed to port around 250,000 lines of C++/CLI code into C# in less than a couple of months, and I don't even know C++ very well at all.
If preserving Git history is important, you might want to git mv your cpp file into a cs file, commit, then start porting. The reason for this is that Git will think your file is new if you modify it too much after renaming it.
This was my approach when porting large amounts of code (so that it wouldn't take forever):
Create another worktree / clone of the branch and keep it open at all times
This is extremely important as you will want to compare your C# to the old C++/CLI code
Rename cpp to cs, delete header file, commit
I chose to rename the cpp file since its git history is probably more important than the header file
Create namespace + class in cs file, add any base classes/interfaces (if abstract sealed, make static in C#)
Copy fields first, then constructors, then properties, and finally functions
Start replacing with Ctrl+H:
^ to empty
:: to .
-> to .
nullptr to null
for each to foreach
gcnew to new
L" to "
Turn on case sensitivity to avoid accidental renames (for example L"cool" should become "cool", not "coo"
Prefixes like ClassName:: to empty, so that MyClass::MyMethod becomes MyMethod
Go through the red code and port manually code that cannot be just replaced (e.g. some special C++ casts), unless you have some cool regex to do it fast
Once code compiles, go through it again, compare to C++/CLI line by line, check for errors, clean it up, move on.
If you encounter a dependency that needs to be ported, you could pause, port that, then come back. I did that, but it might not be so easy.
Properties were the most annoying to port, because I had to remove everything before and after the getters and setters. I could have maybe written a regex for it but didn't bother doing so.
Once the porting is done, it's very important that you go through the changes line by line, read the code, and compare with C++/CLI code and fix possible errors.
One problem with this approach is that you can introduce bugs in variable declarations, because in C++/CLI you can declare variables in 2 ways:
MyType^ variable; <- null
MyType variable; <- calls default constructor
In the latter case, you want to actually do MyType variable = new MyType(); but since you already removed all the ^ you have to just manually check and test which one is correct. You could of course just replace all ^'s manually, but for me it would have taken too long (plus laziness) so I just did it this way.
Other recommendations:
Have a dummy C++/CLI project and a tool like LinqPad or another C# project to test differences between C++/CLI and C# if you're unsure of a piece of ported code
Install Match Margin to help highlight similar code (helped me when porting WinForms code)
ReSharper! It helped with finding bugs and cleaning up the code a LOT. Truly worth the money.
Some gotchas that I encountered while porting:
Base classes can be called in C++/CLI like so: BaseClass->DoStuff, but in C# you would have to do base.DoStuff instead.
C++/CLI allows such statements: if (foo), but in C# this has to be explicit. In the case of integers, it would be if (foo != 0) or for objects if (foo != null).
Events in base classes can be invoked in C++/CLI, but in C# it's not possible. The solution is to create a method, like OnSomeEvent, in the base class, and inside that to invoke the event.
C++/CLI automatically generates null checks for event invocations, so in C# make sure to add an explicit null check: MyEvent?.Invoke(this, EventArgs.Empty);. Notice the question mark.
dynamic_cast is equivalent to as cast in C#, the rest can be direct casts ((int) something).
gcnew can be done without parentheses. In C# you must have them with new.
Pay attention to virtual override keywords in the header files, you can easily forget to mark the C# methods with override keyword.
Intefaces can have implementations! In this case, you might have to rethink the architecture a bit. One option is to pull the implementation into an abstract class and derive from it
Careful when replacing casts with Convert calls in C#
Convert.ToInt32 rounds to the narest int, but casting always rounds down, so in this case we should not use the converter.
Always try casting first, and if that doesn't work, use the Convert class.
Variables in C++/CLI can be re-declared in a local scope, but in C# you get naming conflicts. Code like this easily lead to hard to find bugs if not ported carefully.
Example: An event handler can take a parameter e, but also has a try-catch like catch (Exception e) which means there are 2 e variables.
Another example:
// number is 2
int number = 2;
for (int number = 0; number < 5; number++)
{
// number is now 0, and goes up to 4
}
// number is again 2!
The above code is illegal in C#, because there is a naming conflict. Find out exactly how the code works in C++ and port it with the exact same logic, and obviously use different variable names.
In C++/CLI, it's possible to just write throw; which would create a generic C++ exception SEHException. Just replace it with a proper exception.
Be careful when porting code that uses the reference % sign, that usually means that you will have to use ref or out keywords in C#.
Similarly, pay attention to pointers * and & references. You might have to write additional code to write changes back whereas in C++ you can just modify the data pointed to by the pointer.
It's possible to call methods on null object instances in C++/CLI. Yes seriously. So inside the function you could do If (this == null) { return; }.
Port this type of code carefully. You might have to create an extension method that wraps over this type of method in order to avoid breaking the code.
Check and make sure everything in the old project file vcxproj was ported correctly. Did you miss any embedded resources?
Careful when porting directives like #ifdef, the "if not" (#ifndef) looks awfully similar but can have disastrous consequences.
C++/CLI classes automatically implement IDisposable when adding a destructor, so in C# you'll need to either implement that interface or override the Dispose method if it's available in the base class.
Other tips:
If you need to call Win32 functions, just use P/Invoke instead of creating a C++/CLI wrapper
For complex native C++ code, better create a C++/CLI project with managed wrappers
Again, pay attention to pointers. I had forgotten to do Marshal.StructureToPtr in my P/Invoke code which wasn't necessary in the C++ version since we had the actual pointer and not a copy of its data.
I have surely missed some things, but hopefully these tips will be of some help to people who are demoralized by the amount of code that needs to be ported, especially in a short period of time :)
After porting is done, use VS/ReSharper to refactor and clean up the code. Not only is it nice for readability, which is my top priority when writing code, but it also forces you to interact with the code and possibly find bugs that you otherwise would have missed.
Oh and one final FYI that could save you headaches: If you create a C++/CLI wrapper that exposes the native C++ pointer, and need to use that pointer in an external C++/CLI assembly, you MUST make the native type public by using #pragma make_public or else you'll get linker errors:
// put this at the top of the wrapper class, after includes
#pragma make_public(SomeNamespace::NativeCppClass)
If you find a bug in the C++/CLI code, keep it. You want to port the code, not fix the code, so keep things in scope!
For those wondering, we got maybe around 10 regressions after the port. Half were mistakes because I was already on autopilot mode and didn't pay attention to what I was doing.
Happy porting!
Back ~2004 Microsoft did have a tool that would convert managed C++ to C++/CLI ... sort of. We ran it on a couple of projects, but to be honest the amount of work left cleaning up the project was no less than the amount of work it would have been to do the conversion by hand in the first place. I don't think the tool ever made it out into a public release though (maybe for this reason).
I don't know which version of Visual Studio you are using, but we have managed C++ code that will not compile with Visual Studio 2005/2008 using the /clr:oldSyntax switch and we still have a relic VS 2003 around for it.
I don't know of any way of going from C++ to C# in a useful way ... you could try round tripping it through reflector :)
Such projects are often done in c++/cli because C# isn't really an elegant option for the task. e.g. if you have to interface with some native C++ libraries, or do very high performance stuff in low level C. So just make sure whoever chose c++/cli didn't have a good reason to do it before doing the switch.
Having said that, I'm highly skeptical there's something that does what you ask, for the simple reason that not all C++/cli code is translatable to C# (and probably vice versa too).