In C++, one can use this directive:
#define IDENTIFIER NAME
e.g. #define MY_NAME "Gideon"
Is this similarly possible in C#?
No. #define can only be used to define flags to be tested with #if (and then only at the start of a file).
Use a constant string instead:
const string MY_NAME = "Gideon";
Those are completely different things. In C++, the preprocessor basically replaces the token MY_NAME with the text assigned to it, "Gideon".
Something similar happens in C# for a constant expression, but in C++ you can also define complete function-like macros under a name such as MY_NAME, which is not possible in C#.
I am not saying you should do it or that it is going to work as you expect, but there is nothing stopping you from trying to use a C preprocessor (e.g. GNU cpp) on your code.
I'm trying to use a DLL generated by ikvmc from a jar file compiled from Scala code (yeah my day is THAT great). The Scala compiler seems to generate identifiers containing dollar signs for operator overloads, and IKVM uses those in the generated DLL (I can see it in Reflector). The problem is, dollar signs are illegal in C# code, and so I can't reference those methods.
Any way to work around this problem?
You should be able to access the funky methods using reflection. Not a nice solution, but at least it should work. Depending on the structure of the API in the DLL it may be feasible to create a wrapper around the methods to localise the reflection code. Then from the rest of your code just call the nice wrapper.
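The wrapper idea can be sketched as follows. This is written in Java only for illustration, where '$' happens to be legal in identifiers; `ScalaLikeOps` and `OpsWrapper` are hypothetical names standing in for the generated class and your own wrapper. In C# the analogous calls would be `Type.GetMethod` and `MethodInfo.Invoke`, with the method name passed as a string so the dollar sign never appears as an identifier in your source.

```java
import java.lang.reflect.Method;

// Hypothetical stand-in for a class generated from Scala code.
final class ScalaLikeOps {
    public static int $plus(int a, int b) {
        return a + b;
    }
}

// Wrapper that localises the reflection code behind a normally named method.
final class OpsWrapper {
    static int plus(int a, int b) {
        try {
            // The awkward name only ever appears as a string.
            Method m = ScalaLikeOps.class.getMethod("$plus", int.class, int.class);
            return (Integer) m.invoke(null, a, b); // null target: static method
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not call $plus", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(OpsWrapper.plus(2, 3));
    }
}
```

The rest of the code then calls `OpsWrapper.plus` and never deals with reflection directly.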
The alternative would be to hack on the IL in the target DLL and change the identifiers. Or do some post-build IL-hacking on your own code.
Perhaps you can teach IKVM to rename these identifiers such that they have no dollar sign? I'm not super familiar, but a quick search pointed me at these:
http://weblog.ikvm.net/default.aspx?date=2005-05-02
What is the format of the Remap XML file for IKVM?
String and complex data types in Map.xml for IKVM!
Good Hunting
Write synonyms for those methods:
def +(a:A,b:A) = a + b
val plus = + _
I fear that you will have to use Reflection in order to access those members. Escaping simply doesn't work in your case.
But for those of you who are interested in the escaping mechanics, I've written an explanation.
In C# you can use the @-sign in order to escape keywords and use them as identifiers. However, this does not help to escape invalid characters:
bool @bool = false;
There is a way to write identifiers differently by using a Unicode escape sequence:
int i\u0064; // '\u0064' == 'd'
id = 5;
Yes this works. However, even with this trick you can still not use the $-sign in an identifier. Trying...
int i\u0024; // '\u0024' == '$'
... gives the compiler error "Unexpected character '\u0024'". The identifier must still be a valid identifier! The C# compiler probably resolves the escape sequence in a kind of pre-processing step and treats the resulting identifier as if it had been entered normally.
So what is this escaping good for? Maybe it can help you, if someone uses a foreign language character that is not on your keyboard.
int \u00E4; // German a-Umlaut
ä = 5;
In VB.NET 2008, I used the following statement:
MyKeyChr = ChrW(e.KeyCode)
Now I want to convert the above statement into C#.
Any Ideas?
The quick-and-dirty equivalent of ChrW in C# is simply casting the value to char:
char MyKeyChr = (char)e.KeyCode;
The longer and more expressive version is to use one of the conversion classes instead, like System.Text.ASCIIEncoding.
Or you could even use the actual VB.NET function in C# by importing the Microsoft.VisualBasic namespace. This is really only necessary if you're relying on some of the special checks performed by the ChrW method under the hood, ones you probably shouldn't be counting on anyway. That code would look something like this:
char MyKeyChr = Microsoft.VisualBasic.Strings.ChrW(e.KeyCode);
However, that's not guaranteed to produce exactly what you want in this case (and neither was the original code). Not all the values in the Keys enumeration are ASCII values, so not all of them can be directly converted to a character. In particular, casting Keys.NumPad1 et al. to char would not produce the correct value.
Looks like the C# equivalent would be
var MyKeyChr = char.ConvertFromUtf32((int) e.KeyCode)
However, e.KeyCode does not contain a Unicode codepoint, so this conversion is meaningless.
The most literal way to translate the code is to use the VB.Net runtime function from C#
MyKeyChr = Microsoft.VisualBasic.Strings.ChrW(e.KeyCode);
If you'd like to avoid a dependency on the VB.Net runtime though you can use this trimmed down version
MyKeyChr = Convert.ToChar((int) (e.KeyCode & 0xffff));
The C# equivalent of ChrW(&H[YourCharCode]) is Strings.ChrW(0x[YourCharCode])
You can use https://converter.telerik.com/ to convert between VB & C#.
This worked for me to convert VB:
e.KeyChar = Microsoft.VisualBasic.ChrW(13)
To C#:
e.KeyChar == Convert.ToChar(13)
E.g:
isValidCppIdentifier("_foo") // returns true
isValidCppIdentifier("9bar") // returns false
isValidCppIdentifier("var'") // returns false
I wrote some quick code but it fails:
my regex is "[a-zA-Z_$][a-zA-Z0-9_$]*"
and I simply do regex.IsMatch(inputString).
Thanks..
It should work with some added anchoring:
"^[a-zA-Z_][a-zA-Z0-9_]*$"
If you really need to support ludicrous identifiers using Unicode, feel free to read one of the various versions of the standard and add all the ranges into your regexp (for example, pages 713 and 714 of http://www-d0.fnal.gov/~dladams/cxx_standard.pdf)
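To see why the anchoring matters, here is a minimal sketch in Java, where `Matcher.find()` searches for a match anywhere in the input, much like the unanchored check in the question. The class name `IdentifierCheck` is made up for the example, and the `$` inside the question's character classes is left out, as in the anchored pattern above.

```java
import java.util.regex.Pattern;

final class IdentifierCheck {
    // Unanchored: find() succeeds if ANY substring looks like an identifier.
    static final Pattern UNANCHORED = Pattern.compile("[a-zA-Z_][a-zA-Z0-9_]*");
    // Anchored: the whole input has to be an identifier.
    static final Pattern ANCHORED = Pattern.compile("^[a-zA-Z_][a-zA-Z0-9_]*$");

    static boolean isValidCppIdentifier(String s) {
        return ANCHORED.matcher(s).find();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"_foo", "9bar", "var'"}) {
            System.out.println(s
                    + " unanchored=" + UNANCHORED.matcher(s).find()
                    + " anchored=" + isValidCppIdentifier(s));
        }
    }
}
```

Without the anchors, "9bar" is accepted because "bar" matches, and "var'" because "var" matches. Note that this check also accepts C++ keywords such as "class", which are not valid identifiers.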
Matti's answer will work to sanitize identifiers before inserting into C++ code, but won't handle C++ code as input very well. It will be annoying to separate things like L"wchar_t string", where L is not an identifier. And there's Unicode.
Clang, Apple's compiler which is built on a philosophy of modularity, provides a set of tokenizer functions. It looks like you would want clang_createTranslationUnitFromSourceFile and clang_tokenize.
I didn't check to see if it handles \Uxxxx or anything. Can't make any kind of guarantees. Last time I used LLVM was five years ago and it wasn't the greatest experience… but not the worst either.
On the other hand, GCC certainly has it, although you have to figure out how to use cpp_lex_direct.
I need to parse and split C and C++ functions into the main components (return type, function name/class and method, parameters, etc).
I'm working from either headers or a list where the signatures take the form:
public: void __thiscall myClass::method(int, class myOtherClass * )
I have the following regex, which works for most functions:
(?<expo>public\:|protected\:|private\:) (?<ret>(const )*(void|int|unsigned int|long|unsigned long|float|double|(class .*)|(enum .*))) (?<decl>__thiscall|__cdecl|__stdcall|__fastcall|__clrcall) (?<ns>.*)\:\:(?<class>(.*)((<.*>)*))\:\:(?<method>(.*)((<.*>)*))\((?<params>((.*(<.*>)?)(,)?)*)\)
There are a few functions that it doesn't like to parse, but appear to match the pattern. I'm not worried about matching functions that aren't members of a class at the moment (can handle that later). The expression is used in a C# program, so the <label>s are for easily retrieving the groups.
I'm wondering if there is a standard regex to parse all functions, or how to improve mine to handle the odd exceptions?
C++ is notoriously hard to parse; it is impossible to write a regex that catches all cases. For example, there can be an unlimited number of nested parentheses, which shows that even this subset of the C++ language is not regular.
But it seems that you're going for practicality, not theoretical correctness. Just keep improving your regex until it catches the cases it needs to catch, and try to make it as stringent as possible so you don't get any false matches.
Without knowing the "odd exceptions" that it doesn't catch, it's hard to say how to improve the regex.
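One practical way around the nesting problem is to let the regex capture the whole parameter list and then split it with a small depth counter rather than with more regex. The following is only a sketch in Java under that assumption; `ParamSplitter` and `splitTopLevel` are made-up names, and it will miscount inputs where '<' or '>' are not brackets (for example `operator<`).

```java
import java.util.ArrayList;
import java.util.List;

final class ParamSplitter {
    // Splits a parameter list at commas that are not inside (...) or <...>.
    static List<String> splitTopLevel(String params) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0; // current nesting level of parentheses and angle brackets
        for (char c : params.toCharArray()) {
            if (c == '(' || c == '<') {
                depth++;
            } else if (c == ')' || c == '>') {
                depth--;
            }
            if (c == ',' && depth == 0) {
                parts.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        String last = current.toString().trim();
        if (!last.isEmpty()) {
            parts.add(last);
        }
        return parts;
    }

    public static void main(String[] args) {
        String params = "int, class std::map<int, float> *, void (*)(int, int)";
        for (String p : splitTopLevel(params)) {
            System.out.println("[" + p + "]");
        }
    }
}
```

A counter like this handles arbitrary nesting depth, which is exactly what a single regular expression cannot do.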
Take a look at Boost.Spirit, it is a boost library that allows the implementation of recursive descent parsers using only C++ code and no preprocessors. You have to specify a BNF Grammar, and then pass a string for it to parse. You can even generate an Abstract-Syntax Tree (AST), which is useful to process the parsed data.
The BNF specification for a list of integers or words separated by whitespace might look like:
using spirit::alpha_p;
using spirit::digit_p;
using spirit::anychar_p;
using spirit::end_p;
using spirit::space_p;
// Inside the definition...
integer = +digit_p; // One or more digits.
word = +alpha_p; // One or more letters.
token = integer | word; // An integer or a word.
token_list = token >> *(+space_p >> token); // A token, followed by 0 or more whitespace-separated tokens.
For more information refer to the documentation, the library is a bit complex at the beginning, but then it gets easier to use (and more powerful).
No. Even function prototypes can have arbitrary levels of nesting, so cannot be expressed with a single regular expression.
If you really are restricting yourself to things very close to your example (exactly 2 arguments, etc.), then could you provide an example of something that doesn't match?
In C#, if you want a String to be taken literally, i.e. ignore escape characters, you can use:
string myString = @"sadasd/asdaljsdl";
However there is no equivalent in Java. Is there any reason Java has not included something similar?
Edit:
After reviewing some answers and thinking about it, what I'm really asking is:
Is there any compelling argument against adding this syntax to Java? Some negative to it, that I'm just not seeing?
Java has always struck me as a minimalist language - I would imagine that since verbatim strings are not a necessity (like properties for instance) they were not included.
For instance, in C# there are many quick ways to do things like properties:
public int Foo { get; set; }
and verbatim strings:
String bar = @"some
string";
Java tends to avoid as much syntax-sugar as possible. If you want getters and setters for a field you must do this:
private int foo;
public int getFoo() { return this.foo; }
public void setFoo(int foo) { this.foo = foo; }
and strings must be escaped:
String bar = "some\nstring";
I think it is because in a lot of ways C# and Java have different design goals. C# is rapidly developed with many features being constantly added but most of which tend to be syntax sugar. Java on the other hand is about simplicity and ease of understanding. A lot of the reasons that Java was created in the first place were reactions against C++'s complexity of syntax.
I find "why" questions funny. C# is a newer language, and tries to improve on what are seen as shortcomings in other languages such as Java. The simple answer to the "why" question is that the Java standard does not define the @ prefix the way C# does.
As others have said, the main case where you want to avoid escaping characters is regexes. In that case use:
Pattern.quote()
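A small sketch of what `Pattern.quote()` buys you; the class name `QuoteDemo` and the helper `matchesLiterally` are made up for the example. It turns an arbitrary string into a pattern that matches that string literally, so no manual escaping of metacharacters is needed.

```java
import java.util.regex.Pattern;

final class QuoteDemo {
    // True if 'input' is exactly 'literal', with no regex metacharacters interpreted.
    static boolean matchesLiterally(String literal, String input) {
        return Pattern.matches(Pattern.quote(literal), input);
    }

    public static void main(String[] args) {
        String literal = "1+1=2";
        // Used as a raw pattern, '+' is a quantifier, so the text does not match itself.
        System.out.println(Pattern.matches(literal, "1+1=2"));
        // Quoted, every character is taken literally.
        System.out.println(matchesLiterally(literal, "1+1=2"));
    }
}
```

This only helps when the whole string is meant literally; it does not replace verbatim strings for writing actual regex syntax such as \d without doubled backslashes.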
I think one of the reasons is that regular expressions (which are a major reason for these kinds of String literals) were not part of the Java platform until Java 1.4 (if I remember correctly). There simply wasn't so much of a need for this when the language was defined.
Java (unfortunately) doesn't have anything like this, but Groovy does:
assert '''hello,
world''' == 'hello,\nworld'
//triple-quotes for multi-line strings, adds '\n' regardless of host system
assert 'hello, \
world' == 'hello, world' //backslash joins lines within string
I really liked this feature of C# back when I did some .NET work. It was especially helpful for cut and pasted SQL queries.
I am not sure about the why, but you can do it by escaping the escape character. Since all escape sequences start with a backslash, by inserting a double backslash you can effectively cancel the escape. E.g. "\now" will produce a newline followed by the letters "ow", but "\\now" will produce "\now".
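The effect of doubling the backslash is easy to check; a minimal sketch, with `BackslashDemo` as a made-up class name:

```java
final class BackslashDemo {
    // "\n" is a single newline character, so this is newline + 'o' + 'w'.
    static final String WITH_ESCAPE = "\now";
    // "\\" is a single literal backslash, so this is '\' + 'n' + 'o' + 'w'.
    static final String DOUBLED = "\\now";

    public static void main(String[] args) {
        System.out.println(WITH_ESCAPE.length()); // 3 characters
        System.out.println(DOUBLED.length());     // 4 characters
        System.out.println(DOUBLED);              // shows the backslash literally
    }
}
```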
I think this question is like: "Why is Java not indentation-sensitive like Python?"
The syntax mentioned is sugar, but it is redundant (superfluous).
You should find your IDE handles the problem for you.
If you are in the middle of a String and copy-paste raw text into it, it should escape the text for you.
Perl has a wider variety of ways to write String literals, and I sometimes wish Java supported these as well. ;)