In VB.NET 2008, I used the following statement:
MyKeyChr = ChrW(e.KeyCode)
Now I want to convert the above statement into C#.
Any ideas?
The quick-and-dirty equivalent of ChrW in C# is simply casting the value to char:
char MyKeyChr = (char)e.KeyCode;
The longer and more expressive version is to use one of the conversion classes instead, like System.Text.ASCIIEncoding.
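For example, a rough sketch of the encoding-based route (this assumes the key code is a plain ASCII value, which, as noted below, is not true for every key):
// Decode the key code's numeric value as an ASCII byte (only meaningful for codes 0-127).
byte[] keyBytes = new byte[] { (byte)e.KeyCode };
char myKeyChr = System.Text.Encoding.ASCII.GetChars(keyBytes)[0];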
Or you could even use the actual VB.NET function in C# by importing the Microsoft.VisualBasic namespace. This is really only necessary if you're relying on some of the special checks performed by the ChrW method under the hood, ones you probably shouldn't be counting on anyway. That code would look something like this:
char MyKeyChr = Microsoft.VisualBasic.Strings.ChrW(e.KeyCode);
However, that's not guaranteed to produce exactly what you want in this case (and neither was the original code). Not all the values in the Keys enumeration are ASCII values, so not all of them can be directly converted to a character. In particular, casting Keys.NumPad1 et al. to char would not produce the correct value.
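For instance (values from the WinForms Keys enumeration; the digit-row keys reuse their ASCII codes, the numeric-keypad keys do not):
char fromDigitRow = (char)Keys.D1;     // Keys.D1 is 0x31, so this happens to be '1'
char fromNumPad = (char)Keys.NumPad1;  // Keys.NumPad1 is 0x61, so this is 'a', not '1'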
Looks like the C# equivalent would be
var MyKeyChr = char.ConvertFromUtf32((int)e.KeyCode);
However, e.KeyCode does not contain a Unicode codepoint, so this conversion is meaningless.
The most literal way to translate the code is to use the VB.Net runtime function from C#
MyKeyChr = Microsoft.VisualBasic.Strings.ChrW(e.KeyCode);
If you'd like to avoid a dependency on the VB.Net runtime though you can use this trimmed down version
MyKeyChr = Convert.ToChar((int)e.KeyCode & 0xffff);
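For context, here is a rough sketch of where that line typically sits, assuming a WinForms KeyDown handler (the handler and control names are just illustrative):
private void textBox1_KeyDown(object sender, KeyEventArgs e)
{
    // Take the key code's numeric value (masked to 16 bits) and convert it to a char.
    char myKeyChr = Convert.ToChar((int)e.KeyCode & 0xffff);
    // ... use myKeyChr here ...
}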
The C# equivalent of ChrW(&H[YourCharCode]) is Strings.ChrW(0x[YourCharCode])
You can use https://converter.telerik.com/ to convert between VB & C#.
This worked for me to convert VB:
e.KeyChar = Microsoft.VisualBasic.ChrW(13)
To C#:
e.KeyChar == Convert.ToChar(13)
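In context, both lines usually sit inside a KeyPress handler that checks for the Enter key; a hedged sketch (the handler name is illustrative):
private void textBox1_KeyPress(object sender, KeyPressEventArgs e)
{
    // 13 is the carriage return sent by the Enter key.
    if (e.KeyChar == Convert.ToChar(13))   // VB: If e.KeyChar = ChrW(13) Then ...
    {
        // handle Enter
    }
}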
Is it possible to convert a string to ordinal upper or lower case, similar to the invariant conversions?
string upperInvariant = "ß".ToUpperInvariant();
string lowerInvariant = "ß".ToLowerInvariant();
bool invariant = upperInvariant == lowerInvariant; // true
string upperOrdinal = "ß".ToUpperOrdinal(); // SS
string lowerOrdinal = "ß".ToLowerOrdinal(); // ss
bool ordinal = upperOrdinal == lowerOrdinal; // false
How to implement ToUpperOrdinal and ToLowerOrdinal?
Edit:
How to get the ordinal string representation? Likewise, how to get the invariant string representation? Maybe that's not possible, as in the above case it might be ambiguous, at least for the ordinal representation.
Edit2:
string.Equals("ß", "ss", StringComparison.InvariantCultureIgnoreCase); // true
but
"ß".ToLowerInvariant() == "ss"; // false
I don't believe this functionality exists in the .NET Framework or .NET Core. The closest thing is string.Normalize(), but it is missing the case fold option that you need to successfully pull this off.
This functionality exists in the ICU project (which is available in C/Java). The functionality you are after is the unorm2.h file in C or the Normalizer2 class in Java. Example usage in Java and related test.
There are 2 implementations of Normalizer2 that I am aware of that have been ported to C#:
icu-dotnet (a C# wrapper library for ICU4C)
ICU4N (a fully managed port of ICU4J)
Full Disclosure: I am a maintainer of ICU4N.
From MSDN:
The StringComparer returned by the OrdinalIgnoreCase property treats the characters in the strings to compare as if they were converted to uppercase using the conventions of the invariant culture, and then performs a simple byte comparison that is independent of language.
But I'm guessing doing that won't achieve what you want, since simply doing "ß".ToUpperInvariant() won't give you a string that is ordinally equivalent to "ss". There must be some magic in the String.Equals method that handles the special case of why "ss" equals 'ß'.
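A quick sketch of the behaviour being described ("ß" is U+00DF; results as observed on .NET):
string upper = "ß".ToUpperInvariant();   // still "ß", no expansion to "SS"
bool ordinalEqual = string.Equals("ß", "ss", StringComparison.OrdinalIgnoreCase);            // false
bool invariantEqual = string.Equals("ß", "ss", StringComparison.InvariantCultureIgnoreCase); // true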
If you're only worried about German text then this answer might help.
In the latest edition of the JavaSpecialists newsletter, the author mentions a piece of code that is un-compilable in Java:
public class A1 {
Character aChar = '\u000d';
}
Try to compile it, and you will get an error such as:
A1.java:2: illegal line end in character literal
Character aChar = '\u000d';
^
Why does an equivalent piece of C# code not show such a problem?
public class CharacterFixture
{
char aChar = '\u000d';
}
Am I missing anything?
EDIT: My original intention with this question was to ask how the C# compiler got Unicode file parsing correct (if it did), and why Java should still stick with the incorrect (if it is incorrect) parsing.
EDIT: Also, I want my original question title to be restored. Why such heavy editing? I strongly suspect it modified my intentions.
Java's compiler translates \uxxxx escape sequences as one of the very first steps, even before the tokenizer gets a crack at the code. By the time it actually starts tokenizing, there are no \uxxxx sequences anymore; they're already turned into the chars they represent, so to the compiler your Java example looks the same as if you'd actually typed a carriage return in there somehow. It does this in order to provide a way to use Unicode within the source, regardless of the source file's encoding. Even ASCII text can still fully represent Unicode chars if necessary (at the cost of readability), and since it's done so early, you can have them almost anywhere in the code. (You could say \u0063\u006c\u0061\u0073\u0073\u0020\u0053\u0074\u0075\u0066\u0066\u0020\u007b\u007d, and the compiler would read it as class Stuff {}, if you wanted to be annoying or torture yourself.)
C# doesn't do that. \uxxxx is translated later, with the rest of the program, and is only valid in certain types of tokens (namely, identifiers and string/char literals). This means it can't be used in certain places where it can be used in Java. cl\u0061ss is not a keyword, for example.
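A small C# illustration of that difference (the commented-out line is the one that would not compile):
char cr = '\u000d';        // escape inside a char literal: fine in C#
int c\u0061t = 1;          // escape inside an identifier ("cat"): also fine
// cl\u0061ss Broken { }   // not a keyword: "cl\u0061ss" is just an unknown identifier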
I'm trying to use a DLL generated by ikvmc from a jar file compiled from Scala code (yeah my day is THAT great). The Scala compiler seems to generate identifiers containing dollar signs for operator overloads, and IKVM uses those in the generated DLL (I can see it in Reflector). The problem is, dollar signs are illegal in C# code, and so I can't reference those methods.
Any way to work around this problem?
You should be able to access the funky methods using reflection. Not a nice solution, but at least it should work. Depending on the structure of the API in the DLL it may be feasible to create a wrapper around the methods to localise the reflection code. Then from the rest of your code just call the nice wrapper.
The alternative would be to hack on the IL in the target DLL and change the identifiers. Or do some post-build IL-hacking on your own code.
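A minimal sketch of such a reflection wrapper; the method name "$plus" (Scala's mangling of +) and the overall shape are assumptions for illustration, not taken from the actual DLL:
using System;
using System.Reflection;

static class ScalaOperatorWrapper
{
    // Invokes a method whose name contains '$' (illegal in C# source) via reflection.
    public static object Plus(object left, object right)
    {
        MethodInfo plus = left.GetType().GetMethod(
            "$plus", BindingFlags.Public | BindingFlags.Instance);
        return plus.Invoke(left, new object[] { right });
    }
}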
Perhaps you can teach IKVM to rename these identifiers so that they have no dollar sign? I'm not super familiar with it, but a quick search pointed me at these:
http://weblog.ikvm.net/default.aspx?date=2005-05-02
What is the format of the Remap XML file for IKVM?
String and complex data types in Map.xml for IKVM!
Good Hunting
Write synonyms for those methods:
def +(a:A,b:A) = a + b
val plus = + _
I fear that you will have to use Reflection in order to access those members. Escaping simply doesn't work in your case.
But for those of you who are interested in the escaping mechanics, I've written an explanation.
In C# you can use the @ sign to escape keywords and use them as identifiers. However, this does not help to escape invalid characters:
bool @bool = false;
There is a way to write identifiers differently by using a Unicode escape sequence:
int i\u0064; // '\u0064' == 'd'
id = 5;
Yes, this works. However, even with this trick you still cannot use the $ sign in an identifier. Trying...
int i\u0024; // '\u0024' == '$'
... gives the compiler error "Unexpected character '\u0024'". The identifier must still be a valid identifier! The C# compiler probably resolves the escape sequence in a kind of pre-processing step and treats the resulting identifier as if it had been entered normally.
So what is this escaping good for? Maybe it can help if someone uses a foreign-language character that is not on your keyboard:
int \u00E4; // German a-Umlaut
ä = 5;
I'm working with strings that could contain surrogate Unicode characters (non-BMP, 4 bytes per character).
When I use the "\Uxxxxxxxx" format to specify a surrogate character in F#, for some characters it gives a different result than in C#. For example:
C#:
string s = "\U0001D11E";
bool c = Char.IsSurrogate(s, 0);
Console.WriteLine(String.Format("Length: {0}, is surrogate: {1}", s.Length, c));
Gives: Length: 2, is surrogate: True
F#:
let s = "\U0001D11E"
let c = Char.IsSurrogate(s, 0)
printf "Length: %d, is surrogate: %b" s.Length c
Gives: Length: 2, is surrogate: false
Note: Some surrogate characters work in F# ("\U0010011", "\U00100011"), but some of them don't.
Q: Is this a bug in F#? How can I handle surrogate Unicode characters in strings with F#? (Does F# have a different format, or is the only way to use Char.ConvertFromUtf32 0x1D11E?)
Update:
In F#, s.ToCharArray() gives [| 0xD800; 0xDF41 |]; in C#, { 0xD834, 0xDD1E }.
This is a known bug in the F# compiler that shipped with VS2010 (and SP1); the fix appears in the VS11 bits, so if you have the VS11 Beta and use the F# 3.0 compiler, you'll see this behave as expected.
(If the other answers/comments here don't provide you with a suitable workaround in the meantime, let me know.)
That obviously means that F# makes a mistake while parsing some string literals. That is proven by the fact that the character you've mentioned is non-BMP, and in UTF-16 it should be represented as a pair of surrogates.
Surrogates are 16-bit code units in the range 0xD800-0xDFFF, and neither of the chars in the produced string falls in that range.
But the processing of surrogates doesn't change, as the framework (what's under the hood) is the same. So you already have the answer in your question: if you need string literals with non-BMP characters in your code, just use Char.ConvertFromUtf32 instead of the \UXXXXXXXX notation. All the rest of the processing will be the same as always.
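For reference, the workaround looks like this (shown in C# like the rest of this page; the same Char.ConvertFromUtf32 call works from F#, since it is the same framework method):
// Builds the surrogate pair for U+1D11E (MUSICAL SYMBOL G CLEF) at run time.
string s = char.ConvertFromUtf32(0x1D11E);   // "\uD834\uDD1E", Length == 2
bool isSurrogate = char.IsSurrogate(s, 0);   // true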
It seems to me that this is something connected with different forms of normalization.
Both in C# and in F# s.IsNormalized() returns true
But in C#
s.ToCharArray() gives us {55348, 56606} //0xD834, 0xDD1E
and in F#
s.ToCharArray() gives us {65533, 57422} //0xFFFD, 0xE04E
And as you probably know System.Char.IsSurrogate is implemented in the following way:
public static bool IsSurrogate(char c)
{
return (c >= HIGH_SURROGATE_START && c <= LOW_SURROGATE_END);
}
where
HIGH_SURROGATE_START = 0x00d800;
LOW_SURROGATE_END = 0x00dfff;
So in C# the first char (55348) falls within the surrogate range, while in F# the first char (65533) is greater than LOW_SURROGATE_END and falls outside it.
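A quick check that mirrors this reasoning (the surrogate range is 0xD800 through 0xDFFF):
bool fromCSharpLiteral = char.IsSurrogate((char)0xD834);   // true: inside the range
bool fromFSharpLiteral = char.IsSurrogate((char)0xFFFD);   // false: outside the range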
I hope this helps.
I can't seem to find the answer to this question.
It seems like I should be able to go from a number to a character in C# by simply doing something along the lines of (char)MyInt to duplicate the behaviour of VB's Chr() function; however, this is not the case:
In VBScript on an ASP page, if my code says this:
Response.Write(Chr(139))
It outputs this:
‹ (character code 8249)
As opposed to this:
(character code 139)
I'm missing something somewhere with the encoding, but I can't find it. What encoding is Chr() using?
Chr() uses the system default encoding, I believe - so it's roughly equivalent to:
byte[] bytes = new byte[] { 139 };
char c = Encoding.Default.GetString(bytes)[0];
On my box (Windows CP1252 as the default) that does indeed give Unicode 8249.
If you want to call something that has exactly the behaviour of VB's Chr from C#, then, why not simply call it rather than trying to deduce its behaviour?
Just put a "using Microsoft.VisualBasic;" at the top of your C# program, add the VB runtime DLL to your references, and go to town.
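That route would look roughly like this (after adding a reference to Microsoft.VisualBasic.dll; Chr maps bytes 128-255 through the current default code page, which is what reproduces the VBScript behaviour):
using Microsoft.VisualBasic;

char c = Strings.Chr(139);   // '‹' (8249) on a machine whose default code page is CP1252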
If you cast an int to a char, you will get the character with the Unicode character code that was in the integer. The char data type is just a 16 bit UTF-16 character code.
To get the equivalent of the VBScript chr() function in .NET you would need something like:
string s = Encoding.Default.GetString(new byte[]{ 139 });
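Putting the two approaches side by side (the results shown assume the classic Windows default of code page 1252, as in the question; note that on .NET Core, Encoding.Default is UTF-8 instead):
using System;
using System.Text;

class ChrComparison
{
    static void Main()
    {
        char direct = (char)139;                                              // U+008B, a control character
        string viaCodePage = Encoding.Default.GetString(new byte[] { 139 });  // "‹" on CP1252
        Console.WriteLine((int)direct);            // 139
        Console.WriteLine((int)viaCodePage[0]);    // 8249
    }
}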