I was converting a Managed c++ project to a C# one. The C++ project includes a constants C++ header file which is an external dependency present outside of the project.
In the newly created C# file, is there a way to include this C++ header file? I dont want to redefine these constants in a C# file as changes by clients will take place on the C++ header file.
It's not possible to include it.
You have 2 options really
duplicate it for the managed layer, and maintain it in synch with the C++ header.
read and parse it at runtime, and use reflection in the C# parts that require those symbols.
As noted by others, you can automate the first one.
If you have a huge amount of header files, you can take a look at SWIG: http://www.swig.org/
This will generate C# files from C/C++ header files.
For more info see also: http://www.swig.org/Doc2.0/SWIGDocumentation.html#CSharp
The results are quite impressive! But the naming is more C++ like, than C# style... but this was expected...
Unfortunately no, that isn't possible as the C# compiler doesn't understand what to do with .h files. Even it if did, it still illegal to have un-scoped variables declarations (constant or otherwise) in .NET.
You'll have to convert the file either by hand, or as mentioned in the comment by Joachim Pileborg, build a utility to auto-convert it to C# code for you.
Related
I'm working with Visual Studio C#, and I need to parse C header files to extract information only about the function declarations contained within. For each function I need the name, return type, and its parameters. If possible, I'd like the parameters in the order in which they appear in the function declaration.
I've seen stuff online about using visual studios tags, or Exhuberant Ctags, etc. But from what I gathered those aren't really options that let me perform the parse from my C# program with C# code (I may be mistaken?). I've also looked through all the other answers to related questions but they don't seem really apply to my situation (I may just be dumb).
If I could at least get all the lines of code that represent function declarations I'd have a good start and could hand-parse the rest myself.
Thanks in advance
To "parse" C (header) files in a deep sense and pick up the type information for function declarations, in practice you need:
a full preprocessor (including the pecaddillos added by the vendor, MS has some pretty odd stuff in their headers),
a full (syntax) parser/AST builder for the C dialect of interest (there's no such thing as "C"; there is what the vendor offers in this revision of the compiler)
a full symbol table construction (because typedefs are aliases for the actual types of interest)
Many people will suggest "write your own parser (for C)". Mostly those people haven't done this; its a lot more work to do this and get it right than they understand. If you don't start with a production-level machinery, you won't get through real C header files without fixing it all.
Just parsing plain C is hard; consider the problem of parsing the ambiguous phrase
T*X;
A classic parser cannot parse this without additional hackery.
You will also not be able to parse a C header file by itself, in general. You need to have the source code context (often including the compiler command line) in which it is included, or typedefs, preprocessor conditionals and macros in a specific header file will be undefined and therefore unexpandable into the valid C that the compiler normally sees.
You are better off getting pre-existing pre-tested machinery that will do this for you. Clang comes to mind as an option, although I'm not sure it handles the MS header files. GCC is kind of an option, but it really, really wants to be a compiler, not your local friendly C source code analysis tool, and again I'm unsure of its support for MS dialects of C. Our DMS Software Reengineering Toolkit has all of the above for various MS dialects of C.
Having chosen a tool that can actually parse such headers, you'll likely want to do something with the collected header information. You are vague about what you want to accomplish. Having mentioned C# and C in the same breath, there's a hint that you want to call C programs from C# code, and thus need to generate C# equivalent APIs for the C code. For this you will need machinery to manipulate the type information provided, and to build the "text" for the C# declarations. For this, you are likely to find that you need other supporting tooling to do that part, too. Here GCC is a complete non-starter; it will offer you no additional help. Clang and DMS are both designed to be libraries of custom-tool building machinery.
Of course, this may all be moot depending on how much header file text you want to handle; it if is just one header file, doing it manually is probably easiest. You suggest you are willing to do that ("could hand-parse..."). In that case, all you really need to do is to run the preprocessor and interpret the output. I beleive you can do with command line switches for GCC and Clang and even the MS compilers; I know DMS can do this. For easily avialable options here, see How do I see a C/C++ source file after preprocessing in Visual Studio?
Update 1:
I am wondering whether I can reference to a .lib file, but it seems that I cannot.
If this is true, and I have no source code of the C++ project, How can I use its methods?
Btw, I'm using FastCV library.
I come across a situation that I need to call C++ methods from C# code.
The C++ generated files structure:
lib
--libfastcv.lib
--vc120.pdb inc
--fastcv.h
--fastcv.inl
--stdint.h
I know how to call C++ methods from C# :
[DllImport("libfastcv.lib",CallingConvention=CallingConvention.Cdecl)]
public static extern <ReturnType> <MethodName>(<Parameters>);
But I think the .h and .inl files need to be included in my C# project as well.
So how to include them?
Thank you very much.
They don't. Instead, you need to build/use binary-compatible types in your own code, and use them. (And, you importing a method from dll, not from lib).
You dont have to do any includes. The DLLImport should be enough.
To see the Methods you can import you can use DependencyWalker or my favourite Tool CFF Explorer
I often used any WINAPI functions where i need some constants defined in headers. I always had to define them in my C# code too, theres no way to "import" them.
Ok, so I was wondering how one would go about creating a program, that creates a second program(Like how most compression programs can create self extracting self excutables, but that's not what I need).
Say I have 2 programs. Each one containing a class. The one program I would use to modify and fill the class with data. The second file would be a program that also had the class, but empty, and it's only purpose is to access this data in a specific way. I don't know, I'm thinking if the specific class were serialized and then "injected" into the second file. But how would one be able to do that? I've found modifying files that were already compiled fascinating, though I've never been able to make changes that didn't cause errors.
That's just a thought. I don't know what the solution would be, that's just something that crossed my mind.
I'd prefer some information in say c or c++ that's cross-platform. The only other language I'd accept is c#.
also
I'm not looking for 3-rd party library's, or things such as Boost. If anything a shove in the right direction could be all I need.
++also
I don't want to be using a compiler.
Jalf actually read what I wrote
That's exactly what I would like to know how to do. I think that's fairly obvious by what I asked above. I said nothing about compiling the files, or scripting.
QUOTE "I've found modifying files that were already compiled fascinating"
Please read and understand the question first before posting.
thanks.
Building an executable from scratch is hard. First, you'd need to generate machine code for what the program would do, and then you need to encapsulate such code in an executable file. That's overkill unless you want to write a compiler for a language.
These utilities that generate a self-extracting executable don't really make the executable from scratch. They have the executable pre-generated, and the data file is just appended to the end of it. Since the Windows executable format allows you to put data at the end of the file, caring only for the "real executable" part (the exe header tells how big it is - the rest is ignored).
For instance, try to generate two self-extracting zip, and do a binary diff on them. You'll see their first X KBytes are exactly the same, what changes is the rest, which is not an executable at all, it's just data. When the file is executed, it looks what is found at the end of the file (the data) and unzips it.
Take a look at the wikipedia entry, go to the external links section to dig deeper:
http://en.wikipedia.org/wiki/Portable_Executable
I only mentioned Windows here but the same principles apply to Linux. But don't expect to have cross-platform results, you'll have to re-implement it to each platform. I couldn't imagine something that's more platform-dependent than the executable file. Even if you use C# you'll have to generate the native stub, which is different if you're running on Windows (under .net) or Linux (under Mono).
Invoke a compiler with data generated by your program (write temp files to disk if necessary) and or stored on disk?
Or is the question about the details of writing the local executable format?
Unfortunately with compiled languages such as C, C++, Java, or C#, you won't be able to just ``run'' new code at runtime, like you can do in interpreted languages like PHP, Perl, and ECMAscript. The code has to be compiled first, and for that you will need a compiler. There's no getting around this.
If you need to duplicate the save/restore functionality between two separate EXEs, then your best bet is to create a static library shared between the two programs, or a DLL shared between the two programs. That way, you write that code once and it's able to be used by as many programs as you want.
On the other hand, if you're really running into a scenario like this, my main question is, What are you trying to accomplish with this? Even in languages that support things like eval(), self modifying code is usually some of the nastiest and bug-riddled stuff you're going to find. It's worse even than a program written completely with GOTOs. There are uses for self modifying code like this, but 99% of the time it's the wrong approach to take.
Hope that helps :)
I had the same problem and I think that this solves all problems.
You can put there whatever code and if correct it will produce at runtime second executable.
--ADD--
So in short you have some code which you can hard-code and store in the code of your 1st exe file or let outside it. Then you run it and you compile the aforementioned code. If eveything is ok you will get a second executable runtime- compiled. All this without any external lib!!
Ok, so I was wondering how one would
go about creating a program, that
creates a second program
You can look at CodeDom. Here is a tutorial
Have you considered embedding a scripting language such as Lua or Python into your app? This will give you the ability to dynamically generate and execute code at runtime.
From wikipedia:
Dynamic programming language is a term used broadly in computer science to describe a class of high-level programming languages that execute at runtime many common behaviors that other languages might perform during compilation, if at all. These behaviors could include extension of the program, by adding new code, by extending objects and definitions, or by modifying the type system, all during program execution. These behaviors can be emulated in nearly any language of sufficient complexity, but dynamic languages provide direct tools to make use of them.
Depending on what you call a program, Self-modifying code may do the trick.
Basically, you write code somewhere in memory as if it were plain data, and you call it.
Usually it's a bad idea, but it's quite fun.
For my programming languages class, I'm writing a research paper on some papers by some important people in the history of language design. One by CAR Hoare struck me as odd because it speaks against independent compilation techniques used in C and later C++ before C even became popular.
Since this is primarily an optimization to speed up compilation times, what is it about Java and C# that make them able to avoid reliance on independent compilation? Is it a compiler technique or are there elements of the language that facilitate this? And are there any other compiled languages that used these techniques before them?
Short answer: Java and C# don't avoid separate compilation; they make full use of it.
Where they differ is that they don't require the programmer to write a pair of separate header/implementation files when writing a reusable library. The user writes the definition of a class once, and the compiler extracts the information equivalent to the "header" from that single definition and includes it in the output file as "type metadata". So the output file (a .jar full of .class files in Java, or an .dll assembly in .NET-based languages) is a combination of binaries AND headers in a single package.
Then when another class is compiled and it depends on the first class, it can look at the metadata instead of having to find a separate include file.
It happens that they target a virtual machine rather than a specific chip architecture, but that's a separate issue; they could put x86 machine code in as the binary and still have the header-like metadata in the same file as well (this is in fact an option in .NET, albeit rarely used).
In C++ compilers it is common to try to speed up compilation by using "pre-compiled headers". The metadata in .NET .dll and .class files is much like a pre-compiled header - already parsed and indexed, ready for rapid look-ups.
The upshot is that in these modern languages, there is one way of doing modularization, and it has the characteristics of a perfectly organised and hand-optimised C++ modular build system - pretty nifty, speaking ASFAC++B.
IMO, one of the biggest factors here is that both java and .NET use intermediate languages; that means that the compiled unit (jar/assembly) contains, as a pre-requisite, a lot of expressive metadata about the types, methods, etc; meaning that it is already laid out conveniently for reference checking. The runtime still checks anyway, in case you are pulling a fast one ;-p
This isn't very far removed from the MIDL that underpins COM, although there the TLB is often a separate entity.
If I've misunderstood your meaning, please let me know...
You could consider a java .class file to be similar to a precompiled header file in C/C++. Essentially the .class file is the intermediate form that a C/C++ linker would need as well as all of the information contained in the header (Java just doesn't have a separate header).
Form your comment in another post:
"I'm basically meaning the idea in
C/C++ that each source file is its own
individual compilation unit. This
doesn't as much seem to be the case in
C# or Java."
In Java (I cannot speak for C#, but I assume it is the same) each source file is its own individual compilation unit. I am not sure why you would think it is not... perhaps we have different definitions of compilation unit?
It requires some language support (otherwise, C/C++ compilers would do it too)
In particular, it requires that the compiler generates self-contained modules, which expose metadata that other modules can reference to call into them.
.NET assemblies are a straightforward example. All the files in a project are compiled together, generating one dll. This dll can be queried by .NET to determine which types it contains, so that other assemblies can call functions defined in it.
And to make use of this, it must be legal in the language to reference other modules.
In C++, what defines the boundary of a module? The language specifies that the compiler only considers data in its current compilation unit (.cpp file + included headers). There is no mechanism for specifying "I'd like to call function Foo in module Bar, even though I don't have the prototype or anything for it at compile-time". The only mechanism you have for sharing type information between files is with #includes.
There is a proposal to add a module system to C++, but it won't be in C++0x. Last I saw, the plan was to consider it for a TR1 after 0x is out.
(It's worth mentioning that the #include system in C/C++ was originally used because it'd speed up compilation. Back in the 70's, it allowed the compiler to process the code in a simple linear scan. It didn't have to build syntax trees or other such "advanced" features. Today, the tables have turned and it's become a huge bottleneck, both in terms of usability and compilation speed.)
The object files generated by a C/C++ are ment to be read only by the linker, not by the compiler.
As to other languages: IIRC Turbo Pascal had "units" which you could use without having any source code. I think the point is to create metadata along with compiled code which can then be used by the compiler to figure out the interface to the module (i.e. signatures of functions, class layout etc.)
One problem with C/C++ which prevents just replacing #include with some kind of #import is also the preprocessor, which can completely change the meaning/syntax etc of included/imported modules. This would be very difficult (if not impossible) with a Java-like module system.
Is there such a thing as an x86 assembler that I can call through C#? I want to be able to pass x86 instructions as a string and get a byte array back. If one doesn't exist, how can I make my own?
To be clear - I don't want to call assembly code from C# - I just want to be able to assemble code from instructions and get the machine code in a byte array.
I'll be injecting this code (which will be generated on the fly) to inject into another process altogether.
As part of some early prototyping I did on a personal project, I wrote quite a bit of code to do something like this. It doesn't take strings -- x86 opcodes are methods on an X86Writer class. Its not documented at all, and has nowhere near complete coverage, but if it would be of interest, I would be willing to open-source it under the New BSD license.
UPDATE:
Ok, I've created that project -- Managed.X86
See this project:
https://github.com/ZenLulz/MemorySharp
This project wraps the FASM assembler, which is written in assembly and as a compiled as Microsoft coff object, wrapped by a C++ project, and then again wrapped in C#. This can do exactly what you want: given a string of x86/x64 assembly, this will produce the bytes needed.
If you require the opposite, there is a port of the Udis86 disassembler, fully ported to C#, here:
https://github.com/spazzarama/SharpDisasm
This will convert an array of bytes into the instruction strings for x86/x64
Take a look at Phoenix from Microsoft Research.
Cosmos also has some interesting support for generating x86 code:
http://www.gocosmos.org/blog/20080428.en.aspx
Not directly from C# you can't. However, you could potentially write your own wrapper class that uses an external assembler to compile code. So, you would potentially write the assembly out to a file, use the .NET Framework to spin up a new process that executes the assembler program, and then use System.IO to open up the generated file by the assembler to pull out the byte stream.
However, even if you do all that, I would be highly surprised if you don't then run into security issues. Injecting executable code into a completely different process is becoming less and less possible with each new OS. With Vista, I believe you would definitely get denied. And even in XP, I think you would get an access denied exception when trying to write into memory of another process.
Of course, that raises the question of why you are needing to do this. Surely there's got to be a better way :).
Take a look at this: CodeProject: Using unmanaged code and assembly in C#.
I think you would be best off writing a native Win32 dll. You can then write a function in assembler that is exported from the dll. You can then use C# to dynamically link to the dll.
This is not quite the same as passing in a string and returning a byte array. To do this you would need an x86 assembler component, or a wrapper around masm.exe.
i don't know if this is how it works but you could just shellexecute an external compiler then loading the object generated in your byte array.