Dollar ($) character used in resource.GetObject - c#

I just spotted this line in an old Windows Forms app (created by the designer):
this.Icon = ((System.Drawing.Icon)(resources.GetObject("$this.Icon")));
It seems to be using ResourceManager.GetObject() to get the icon. My question is about the significance of the $ prefixing this. There is no mention of the dollar symbol in the docs.
Does the dollar have a special meaning (reflection, possibly?), or is it merely to do with the implementation of GetObject()?
In addition, where is the icon actually stored?

This is a pretty standard trick used in .NET; the compiler uses it too when it needs to generate a name for an auto-generated class or field. Using a character like $ ensures that there can never be a name collision with an identifier in the program.
There isn't much chance of that when you only program in C#, where this is a keyword. But there certainly is in other languages. You could, for example, create a VB.NET Winforms project, drop a button on the form and name it "this". When you localize the form, the button's Text property resource appears as:
<data name="this.Text" xml:space="preserve">
<value>Button1</value>
</data>
That would be a name collision with the form's Text property resource if it didn't put the $ in front of it. Not until you program in a language that permits $ in an identifier name anyway. None of the standard VS languages do.
Yet another detail is that you'll have trouble referencing that button from C# code. There's an escape hatch for that as well: you can use @this in your code. The @ prefix makes sure that the compiler doesn't recognize it as a keyword but as a plain identifier.
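As a small illustration of that escape hatch, here is a minimal sketch of C#'s verbatim identifier prefix @. The class and method names are hypothetical and exist only for this example; no WinForms control is involved.

```csharp
public static class VerbatimIdentifierDemo
{
    // "this" is a C# keyword, so a bare identifier named this would not compile.
    // Prefixing it with @ makes the compiler treat it as an ordinary identifier.
    public static string Describe()
    {
        string @this = "a plain identifier";
        return @this;
    }
}
```

The @ is not part of the name itself: other languages would see the identifier simply as this.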

The WinForms designer actually dumps quite a bit of hidden items like this into the resource file (.resx) associated with each form, mostly to support internationalization (though other designer metadata is there as well). While text and icons may be obvious, even layout information can be there. German words, for instance, can be pretty long, so when internationalizing the form you may actually need to change label widths.
The $ I would assume is a way to make sure the designer-added resources don't conflict with user resources.

It turns out that the icon is stored within the corresponding .resx file and was actually defined with the dollar prefix $this.Icon. This would imply that the dollar doesn't have a special meaning at all.
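To see that the dollar is just part of the key, a minimal sketch can write two resources whose names differ only by the $ prefix and read them back. This uses an in-memory binary .resources blob rather than a designer-generated .resx file, and the class name, method name and captions are assumptions made for the example.

```csharp
using System.IO;
using System.Resources;

public static class DollarKeyDemo
{
    // Writes two string resources into an in-memory .resources blob,
    // then looks one of them up by its exact key.
    public static string RoundTrip(string key)
    {
        byte[] blob;
        using (var buffer = new MemoryStream())
        {
            using (var writer = new ResourceWriter(buffer))
            {
                writer.AddResource("$this.Text", "Form caption");   // the form itself
                writer.AddResource("this.Text", "Button caption");  // a control named "this"
                writer.Generate();
            }
            // ToArray still works after the writer has closed the stream.
            blob = buffer.ToArray();
        }

        using (var resources = new ResourceSet(new MemoryStream(blob)))
        {
            return resources.GetString(key);
        }
    }
}
```

Both keys coexist and each returns its own value, which is consistent with the $ being a naming convention rather than syntax that GetObject() interprets.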


C#'s StringInfo and TextElementEnumerator can't recognize graphemes properly

In C#, the StringInfo and TextElementEnumerator classes provide methods and properties for working with text elements.
The documentation defines a text element as follows:
The .NET Framework defines a text element as a unit of text that is
displayed as a single character, that is, a grapheme. A text element
can be any of the following:
Yes, it says a text element is a grapheme in .NET. I also tested with some Unicode characters myself, and it really seemed true until I tested one Korean letter, '가'.
As we all know, some Unicode characters consist of multiple code points. We may also face code point sequences, and that's the reason I'm using StringInfo and TextElementEnumerator instead of a simple String.
StringInfo and TextElementEnumerator correctly detected surrogate pairs. And "\u0061\u0308", a Unicode character which consists of multiple code points, was recognized as one text element, just as expected. But "\u1100\u1161" was not recognized as one text element.
"\u1100" is a leading letter "ㄱ", and "\u1161" is a vowel letter "ㅏ". They can be individual characters and shown to users just as I write them here. But if they are used together, they are rendered as one character "가" instead of "ㄱㅏ".
There are two ways to represent the Korean character "가":
Using a single code point, U+AC00, from the Hangul Syllables block.
Using two code points, U+1100 and U+1161, from the Hangul Jamo block.
Most of the time the former is used. The latter is rarely used; to be honest, I can't imagine when it's used at all.
Anyway, the first one is just one precomposed letter and the second is a sequence of Lead and Vowel which is treated as one character. When rendered they look exactly the same, and the two are canonically equivalent.
Also the following line returns true in C# :
"\u1100\u1161".Normalize() == "\uAC00"
I wonder why Normalize() works just fine here when C# doesn't think they are one complete text element.
I thought it had something to do with my .NET version, but that turns out not to be the case. It happens in Mono too.
I tested this with ICU as well, and it correctly treated "\u1100\u1161" as one grapheme!
I initially thought StringInfo and TextElementEnumerator could eliminate the need for ICU4C in some simple cases, so I'm very disappointed now.
Here's my question:
Am I doing something wrong here?
or
Is a text element in .NET not a user-perceived character, unlike in ICU?
The basic issue here is that per the Korean standard KS X 1026, the two jamos ㄱ and ㅏ are distinct from their combined form 가. In fact, this exact example is used in the official standard (see section 6.2).
Long story short, Microsoft attempted to follow the standard but other operating systems and applications don't necessarily do so. Hence you can get "malformed" content from other software / platforms that appears to be parsed incorrectly on Windows / in .NET, even though it is parsed "correctly" on those platforms.
You will either need to ensure your data is correctly formed in the first place (unlikely, given that the de-facto standard is to completely ignore the official standard) or you will need to use ICU (or a similar library) to deal with these cases.
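Building on the question's observation that Normalize() turns "\u1100\u1161" into "\uAC00", one possible workaround is to normalize to NFC before splitting into text elements. The sketch below is only that: the class and method names are hypothetical, normalization changes the string's code points, and it only helps for jamo sequences that have a precomposed syllable. The count returned for the unnormalized jamo sequence depends on the runtime, so it is deliberately not stated here.

```csharp
using System.Globalization;
using System.Text;

public static class TextElementDemo
{
    // Counts text elements as StringInfo / TextElementEnumerator see them.
    public static int CountTextElements(string s)
    {
        int count = 0;
        TextElementEnumerator elements = StringInfo.GetTextElementEnumerator(s);
        while (elements.MoveNext())
        {
            count++;
        }
        return count;
    }

    // Normalizes to NFC first, so a lead + vowel jamo pair that has a
    // precomposed syllable becomes a single code point before counting.
    public static int CountAfterNfc(string s)
    {
        return CountTextElements(s.Normalize(NormalizationForm.FormC));
    }
}
```

Calling CountAfterNfc("\u1100\u1161") counts the single precomposed syllable U+AC00. For anything beyond such simple cases, ICU (or a similar library) remains the more dependable route, as the answer says.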

Does C# remove the characters • from a string?

I have this C# code used to populate a label on the screen of a phone. Note that it's not HTML source being used here.
c1Label.Text = "To select cards for your deck you can one of a number of options &#10; &#x2022;
and this XAML
<local:JustifiedLabel x:Name="c1Label" Text= "To select cards for your deck you can one of a number of options &#10; &#x2022;
The former shows &#10; as part of the text, but the XAML version works fine and shows this as a line feed followed by a bullet.
This is to be expected. The two languages (C# and XML) have different rules, especially regarding which characters are "special" and how they have to be escaped when you want to use them anyway. In the C# string
"&#10; &#x2022;"
the characters are exactly those letters, since they have no special meaning to the C# compiler. In XML they are numeric character references, an escape mechanism for including arbitrary characters.
Conversely, in C# the following
"\n \u2022"
represents a line feed and a bullet. But in XML it's just the exact characters as written.
You can construct endless examples like this with almost any two different languages. Yes, this means you cannot just copy text from one language and expect it to represent the same string in another language. If you're transforming one language into another, it's easy to handle programmatically; when you're copying stuff around manually, you just have to live with this and adapt accordingly.
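As a sketch of handling this programmatically, System.Net.WebUtility.HtmlDecode resolves numeric character references in a C# string. Note that it implements HTML decoding rather than being an XML parser, so it also decodes named entities such as &amp;amp;. The class and method names below are hypothetical.

```csharp
using System.Net;

public static class CharacterReferenceDemo
{
    // Turns references such as &#10; (line feed) and &#x2022; (bullet)
    // into the characters they stand for.
    public static string DecodeReferences(string xmlStyleText)
    {
        return WebUtility.HtmlDecode(xmlStyleText);
    }
}
```

With this helper, DecodeReferences("&#10; &#x2022;") yields the same string as the C# literal "\n \u2022". When the text is written by hand in C# source, using the C# escapes directly is simpler.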

Scintilla.NET regular expression based syntax highlighting

Is it possible to use regular expressions to define syntax highlighting in Scintilla? And if so, how?
I have a custom language to process, which cannot be described in simple terms of keywords and delimiters. The meaning of particular structures in this language depends only on their position relative to keywords. I have a regular expression based parser for this format; all I need is to apply the rules defined by regular expressions as text styles.
I mean that if something matches regex1, it should have style1. Is that possible? How?
If not, can I set styles for manually selected ranges? That is, can I assign a style number to a specified character range in the editor, and how?
Is it possible to define Scintilla styles in code rather than in an XML file?
EDIT:
OK, I've found a way.
foreach (Match m in Patterns.Keyword0.Matches(Encoding.ASCII.GetString(e.RawText)))
e.GetRange(m.Index, m.Index + m.Length).SetStyle(1);
The problem is the RawText property. It's a byte buffer of UTF-8 encoded text. The Text property contains nice UTF-16 text, but the GetRange method accepts a byte offset, not a character offset. If I convert on each TextChanged event, I lose almost all the speed advantage of using Scintilla.
Of course the easiest way would be to change the internal encoding to UTF-16, but when I do that, I get an exception saying this encoding is not supported. The only one supported seems to be UTF-8, which is ridiculously hard (and slow) to process.
I'm hitting a wall here.
The key to this is to set the lexer to SCLEX_CONTAINER and then handle the SCN_STYLENEEDED notification. This means you only ever have to process the text that actually needs styling.
There are several guides linked at the top of the Scintilla Documentation that detail various aspects of implementing custom lexers, so I won't bother repeating any of that here.
As for performance: I've written custom Scintilla lexers in Python that decode the UTF-8 text when styling and have never noticed any significant issues, so I'd be amazed if you couldn't at least match that using C#.
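For the byte-offset problem raised in the question, one possible approach is to run the regular expression on the UTF-16 text and translate each match into UTF-8 byte offsets in a single forward pass. The following is a minimal sketch that uses only the .NET base class library; Utf8RangeMapper and MatchByteRanges are hypothetical names, no Scintilla API is called, and it assumes the pattern is not compiled with RegexOptions.RightToLeft (so matches arrive in ascending order) and that no match splits a surrogate pair.

```csharp
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

public static class Utf8RangeMapper
{
    // Returns the UTF-8 byte range (start inclusive, end exclusive) of every match.
    public static List<(int Start, int End)> MatchByteRanges(Regex pattern, string text)
    {
        var ranges = new List<(int Start, int End)>();
        char[] chars = text.ToCharArray();
        int charPos = 0; // UTF-16 position already accounted for
        int bytePos = 0; // UTF-8 byte offset corresponding to charPos

        foreach (Match m in pattern.Matches(text))
        {
            // Bytes taken by the text between the previous match and this one.
            bytePos += Encoding.UTF8.GetByteCount(chars, charPos, m.Index - charPos);
            // Bytes taken by the match itself.
            int byteLength = Encoding.UTF8.GetByteCount(chars, m.Index, m.Length);
            ranges.Add((bytePos, bytePos + byteLength));

            bytePos += byteLength;
            charPos = m.Index + m.Length;
        }
        return ranges;
    }
}
```

Each resulting pair could then be handed to the e.GetRange(start, end).SetStyle(...) call shown in the question. Because every character is measured once, the conversion cost stays linear in the length of the text being styled, and with SCN_STYLENEEDED that is only the part that actually needs styling.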

Why does this happen with ToolStripMenuItems?

When I add a ToolStripMenuItem to a form, set RightToLeft to Yes, and have a quote at the end of the text, why does it place the quote at the front of the text?
ToolStripMenuItem1.Text = "Name \"Text\"";
ToolStripMenuItem1.RightToLeft = System.Windows.Forms.RightToLeft.Yes;
Displays as: "Name "Text
Edit: This also happens with single quotes.
By setting RightToLeft to Yes, you are asking the Windows text rendering engine to apply the text layout rules used in languages that use a right-to-left order, such as Arabic and Hebrew. Those rules are pretty subtle, especially because English phrases in those languages are not uncommon. It is not going to render "txeT emaN" as it normally does with Arabic or Hebrew glyphs; that doesn't make sense to anybody. It needs to identify sentences or phrases and reverse those. Quotes are special: they delineate a phrase.
Long story short, you are abusing a feature to get alignment that was really meant to do something far more involved. Don't use it for that.
EDIT:
If you just want to change the text alignment, then there is always ToolStripItem.Alignment or ToolStripItem.Padding to try.
The feature you are using is meant for localization and can support mixed content.
The way you use that feature seems more like an abuse, since Windows needs to make sense of "mixed content", of which you provide an extreme case (no Arabic at all).
I always find it hard to be sure that such behaviour, though unexpected and unintuitive, is really a bug.
Rendering logic for BiDi text is rather complex. For example, an exclamation mark somewhere in text that should be rendered right-to-left can lead to reversing the direction, depending on the implementation.
For some insight see http://www.unicode.org/reports/tr9/
It seems .NET is not fully compliant with the Unicode BiDi algorithm. There is even a library that tries to implement it; see http://sourceforge.net/projects/nbidi/

How do I display only the class name in doxygen class diagrams?

Using doxygen and graphviz with my C# project, I can generate class diagrams in the documentation pages. These diagrams have the full class names and namespaces in them, e.g.
Acme.MyProduct.MyClasses.MyClass
Is it possible to configure doxygen to cut this down a bit to just the class name?
MyClass
The fully qualified paths make even simple diagrams rather wide and unwieldy. I'd like to minimize the need for horizontal scrolling.
I suspect that you've already solved this, as the question is a year old, but an answer might be useful for anyone else searching for this (as I just did). You can use the HIDE_SCOPE_NAMES option. Setting it to YES (or checking it in the doxywizard GUI) will hide namespaces. From my doxygen file:
# If the HIDE_SCOPE_NAMES tag is set to NO (the default) then Doxygen
# will show members with their full class and namespace scopes in the
# documentation. If set to YES the scope will be hidden.
HIDE_SCOPE_NAMES = YES
HIDE_SCOPE_NAMES works great, but it only hides the scope in the class diagram, not in the caller/callee graphs for each method.
To reduce those diagrams to a readable width, you can rename the scope using the input filter. This will not remove the namespace, but it will shorten it.
For example, to rename the namespace "COMPANY_NAMESPACE" to "sf", use:
# The INPUT_FILTER tag can be used to specify a program that doxygen should
# invoke to filter for each input file. Doxygen will invoke the filter program
# by executing (via popen()) the command <filter> <input-file>, where <filter>
# is the value of the INPUT_FILTER tag, and <input-file> is the name of an
# input file. Doxygen will then use the output that the filter program writes
# to standard output. If FILTER_PATTERNS is specified, this tag will be
# ignored.
INPUT_FILTER = "sed 's,COMPANY_NAMESPACE,sf,'"
