Localization alternative to Resx file

Why I don't want to use Resx files:
I am looking for an alternative for resx files to offer multilanguage support for my project, due to the following reasons:
I don't like having to specify a "messageId" when writing messages. It is more effort and it interrupts my flow: I can't see what the log message would actually say, and I need to open another tab to edit the message.
Sometimes I use inline expressions because I don't want to create new variables for very simple steps (e.g. Log.Info("Iterated {i+1} times");). Using variables or doing simple calculations inline sometimes makes the code clearer than adding extra lines.
What I could imagine instead:
An external application that crawls a compiled exe for all strings and lets you choose which strings should be translated and which should be ignored. It could then create an XML or JSON file for all languages as well. It would replace all strings with a hash/id so that a lookup for strings in all languages is still possible.
Am I the only one who is not happy with the commonly used Resx / centralized string db solution? Am I missing reasons why this wouldn't be a good idea?

One reason for relying on established approaches instead of implementing your own format is translation. It really depends on how your resources are translated: if it is done by volunteers with a technical background who don't mind working in a plain text editor, then you are free to come up with your own resource format. If on the other hand you send out your resources to professional translators who are not very technical and who prefer to work in a translation environment with integrated terminology management, translation memory, spelling and quality checks etc. it is quite likely that this environment will not be able to handle your homemade resource format.
Since I already mentioned professional translation environments: some of these tools rely on IDs to figure out which strings are old and which are new. If you use the approach where the text is the ID, every fixed typo in your source language creates a new string that needs to be translated - and paid for. With stable IDs, by contrast, the translator sees that the source text for an existing string has changed, can have a look at the change, notice that a typo has been fixed, decide that the translation is still OK and sign the string off, without extra translation cost.
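To illustrate the ID problem, here is a minimal sketch that derives an ID from the source text, as the question proposes. The class name TextIdSketch and the choice of a truncated SHA-256 hash are assumptions made for illustration only; the question does not say how its hash/id would be computed.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: derives a string ID from the source text itself.
public static class TextIdSketch
{
    public static string IdFor(string sourceText)
    {
        using (var sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(sourceText));
            // Keep the first 4 bytes as 8 hex digits (an arbitrary choice for this sketch).
            return BitConverter.ToString(hash, 0, 4).Replace("-", "");
        }
    }
}
```

Calling IdFor("Iterated {0} tmes") and IdFor("Iterated {0} times") gives two unrelated IDs, so a tool that tracks strings by ID would treat the corrected text as a brand-new string.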
By the way, if you want good localizations for strings like Log.Info("Iterated {i+1} times"); you have to find some way of dealing with plural forms correctly. Some languages have different grammatical rules for different numbers (see the Unicode Language Plural Rules for an overview). Just because something is easy to do in code does not mean that it is easy to localize, I'm afraid.
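As a rough sketch of what plural handling involves, the following picks a message form by plural category. The names PluralSketch, Select and Iterated are hypothetical, the rules are an integer-only simplification of the CLDR rules for English and Polish (assuming n is not negative), and the Polish strings are only sample translations.

```csharp
using System;

public enum PluralCategory { One, Few, Many, Other }

// Hypothetical helper, not part of any library.
public static class PluralSketch
{
    // Integer-only simplification of the CLDR plural rules for "en" and "pl".
    public static PluralCategory Select(string language, int n)
    {
        switch (language)
        {
            case "en":
                return n == 1 ? PluralCategory.One : PluralCategory.Other;
            case "pl":
                if (n == 1) return PluralCategory.One;
                int mod10 = n % 10, mod100 = n % 100;
                bool few = mod10 >= 2 && mod10 <= 4 && !(mod100 >= 12 && mod100 <= 14);
                return few ? PluralCategory.Few : PluralCategory.Many;
            default:
                return PluralCategory.Other;
        }
    }

    // Each language supplies one complete format string per category it uses.
    public static string Iterated(string language, int n)
    {
        PluralCategory category = Select(language, n);
        string format;
        if (language == "pl")
        {
            format = category == PluralCategory.One ? "Wykonano {0} iterację"
                   : category == PluralCategory.Few ? "Wykonano {0} iteracje"
                   : "Wykonano {0} iteracji";
        }
        else
        {
            format = category == PluralCategory.One ? "Iterated {0} time" : "Iterated {0} times";
        }
        return string.Format(format, n);
    }
}
```

The point is that the translator needs to supply a whole sentence per plural category, which an inline expression such as {i+1} inside a single string cannot express.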
To sum this up: if you want to create your own resource format, talk with your translators. Ask them which formats they can handle. Think about translation-related limitations that come with your format: for example, are there any characters that the translators should not use because they break your strings? Apostrophes and quotes are prime candidates here because they are often used as string delimiters in resource files, or < and & if you decide to go the XML way. Think about a conversion to XLIFF and back: most translation environments can handle XLIFF.

Related

How to exports strings from a WPF application code for internationalization?

I'm trying to modify an existing C# application for internationalization. The process for WPF has some documentation here and seems reasonably transparent, as I can continue to develop normally, run msbuild from time to time and check that everything holds. However, while going through the sample project, I realized that it won't cover strings defined in code. In my case, most of them are used for logging and could more or less easily be exported with regexes. This seems a bit hazardous as well, as I'm not certain the center will hold if I try to extract strings from C# source with regexes. I guess that I could wrap every string in a translation function that performs the lookup in resources.
I'm not sure how to proceed from there. I'll have a bunch of strings that I could dump in a resx file and another set of strings extracted from the baml files internationalized in another way. Since I'm expecting each method to bring their own complications, I'd rather deal with only half of those complications if possible.
Is there any way to have either method work for both cases? I'd honestly prefer the second one since it makes more sense to me, but I guess I could live with generating a gazillion Uids and only using 5% - 10% of them.

I develop multi-language check-in kiosks for one of the world's busiest international airports (either #1 or #3, depending on how you define it), and in my experience the best solution for this in WPF apps is custom markup extensions. First, you can use regular language as your key, which means all of your XAML can be written in whatever language is most convenient for your developers. Second, you can add custom namespaces to the XAML namespaces, which helps keep your XAML tidy. Third, it's very easy to write utilities to extract your XAML extensions and collate them into Excel spreadsheets (say) for your translators, then incorporate the translations themselves back into your application. Finally, the translation tables themselves can be easily switched at runtime, allowing you to change your language on the fly.
Put all this together and all your XAML looks like this:
<TextBlock Text="{Translate 'Text to be translated appears here'}" />
And of course it's easy to control which text goes through your translation engine and which text doesn't, by simply controlling exactly where you use your Translate markup extension.
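The answer does not show the extension itself. As a hedged sketch, the lookup that such a Translate extension could delegate to might look like the following. TranslationTable, CurrentLanguage, Add and Translate are hypothetical names, and the WPF-specific part (a class deriving from MarkupExtension whose ProvideValue calls Translate) is deliberately left out so that the sketch has no WPF dependency.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical translation table keyed by the source-language text.
public static class TranslationTable
{
    private static readonly Dictionary<string, Dictionary<string, string>> Tables =
        new Dictionary<string, Dictionary<string, string>>();

    // Switching this at runtime changes which table is consulted.
    public static string CurrentLanguage { get; set; } = "en";

    public static void Add(string language, string key, string translation)
    {
        if (!Tables.TryGetValue(language, out var table))
        {
            table = new Dictionary<string, string>();
            Tables[language] = table;
        }
        table[key] = translation;
    }

    // Falls back to the key itself, so untranslated text still shows up readable.
    public static string Translate(string key)
    {
        if (Tables.TryGetValue(CurrentLanguage, out var table) &&
            table.TryGetValue(key, out var translation))
        {
            return translation;
        }
        return key;
    }
}
```

Because the key is the source text, a missing translation degrades to the developer's language instead of an empty label. The caveat about text-as-ID raised in the first answer on this page applies here too.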

C# Resources: display resource string names instead of localized values

We have a rather large project that uses Resources for internationalization (following this guide: ASP.NET MVC 2 Localization complete guide, using things like data attributes, and so on), and we have run into the need to translate the resource files. At the beginning of the project I chose to have a lot of small resource files, one for each view, viewmodel, controller, ... So I ended up with hundreds of resources. During translation (which is done by our partners using the ResXManager tool) we ran into trouble identifying the context of each string (where it is displayed), which is needed to find the correct form of the translation.
So I was asked to make a variant of the application that displays the keys (string names) instead of the localized values. E.g. given a resource string TBL_NAME that is used somewhere in a view as #ResX.TBL_NAME and translated into English as "Name", I would like this special variant to show "TBL_NAME", so the translator can see the context, i.e. where exactly this string is used.
Ideally this would not be a special build of the application but rather another "language" available to translators, so they can switch between English and this "unlocalized" language.
I'm looking for some easy ways of doing this. So far I have thought of these approaches:
Override ResourceManager.GetString - cannot use, because we make heavy use of the generated Designer classes to access strings, and so far I haven't found a way to change the ResourceManager they create (see this answer). Did I miss something?
Create resources for some unused language that contain name/value pairs such as TBL_NAME/TBL_NAME - viable, but very exhausting since we have hundreds of resources. Also, adding a new resource would require us to remember to add the same string name to this unused-language resource as well, which doubles the work of adding a single string to the application.
At the moment it seems to me that this task cannot be solved with resources and our current approach, so I decided to ask here (I'm aware it is more of a discussion than a question), hoping someone can give me a hint about another approach to this problem.
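For reference, the first approach can be sketched as below. KeyEchoResourceManager and EchoKeys are hypothetical names. The sketch only shows the override itself; it assumes there is some way to hand this manager to the code that reads the strings, which is exactly the part the generated Designer classes do not offer out of the box.

```csharp
using System.Globalization;
using System.Resources;

// Hypothetical wrapper: returns the resource name instead of its value when EchoKeys is set.
public class KeyEchoResourceManager : ResourceManager
{
    private readonly ResourceManager inner; // the real manager; may be null in this sketch

    public bool EchoKeys { get; set; }

    // Uses the protected parameterless base constructor; lookups are delegated to 'inner'.
    public KeyEchoResourceManager(ResourceManager inner)
    {
        this.inner = inner;
    }

    public override string GetString(string name)
    {
        return GetString(name, null);
    }

    public override string GetString(string name, CultureInfo culture)
    {
        if (EchoKeys || inner == null)
        {
            return name; // show "TBL_NAME" rather than "Name"
        }
        return inner.GetString(name, culture);
    }
}
```

With EchoKeys switched on for translators, a lookup of TBL_NAME returns the literal text TBL_NAME, which gives them the context without maintaining a second set of resource files.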

My preferred option would be to give the translators an environment where they can see what they are translating. Rigi requires a bit of setup (basically you need to add an additional UI language), but once you have done that translators can work within the live website - or in a test instance, which is what we did.
They can also work in screenshots, which is convenient when translators would have to access admin or other role specific pages but you do not want to bother giving them all kinds of user rights. These screenshots can be generated as part of automated UI tests or during manual UI testing.
I am afraid I can't say anything about the cost of the solution, but our translators are really happy with it. I am not sure if this is what you are looking for since you asked for an easy solution, but it definitely solves the issue of giving translators the context they need to do their job - better than displaying resource IDs.

Translation and localization issue

Does Microsoft's implementation of the C# runtime offer some localization mechanism to translate common strings like Overflow, Stack overflow, Underflow, etc.?
See the code below - it's part of Mono, and Mono itself has a Locale.GetText routine for making such translations.
// Added to avoid possible integer overflow.
if (inputOffset > inputBuffer.Length - inputCount)
    throw new ArgumentException("inputOffset" +
        Locale.GetText("Overflow"));
Now - how is it done in Microsoft's version of the runtime, and how can I use it, for example, to get the localized equivalent of Overflow without adding resource files?

.NET provides a framework that makes it easy to localize your content (ResourceManager) and while it internally maintains some translations for its own purpose (for example DateTime.ToString gives you a textual representation for the date/time that is locally appropriate, which includes the translated month and day names), it does not provide you with any ready-made translations, be they common strings or not. It could hardly do this reliably anyway, as there is a plethora of human languages out there and words can have different translations depending on context etc.
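A small sketch of the DateTime case mentioned above: the month name comes from the culture passed as the format provider. CultureDateSketch and LongMonthName are hypothetical names used only for this example.

```csharp
using System;
using System.Globalization;

public static class CultureDateSketch
{
    // "MMMM" is the custom format specifier for the full month name.
    public static string LongMonthName(DateTime date, CultureInfo culture)
    {
        return date.ToString("MMMM", culture);
    }
}
```

Passing CultureInfo.InvariantCulture yields the English month name. Passing a culture such as CultureInfo.GetCultureInfo("fr-FR") should yield the French one, provided culture data for it is available on the machine.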
In your example, I would say that you are OK with untranslated exception messages. Although Microsoft recommends that you localize exception descriptions, and they do localize their own (at least for major languages), this advice seems ill-thought, as it's not only a waste of effort to translate all this text that users probably should never see, but it can also make debugging a nightmare.

Yes, it does and it's a terrible idea. It makes debugging so much harder.

without adding resource files
What do you have against resource files? Resources are the prescribed way to provide localized and localizable strings, images, and other data for a .NET app or assembly.
Note that single word substitution as shown in your example code will result in poor quality translations. Different languages have different sentence structure and word order which your single word substitution won't accommodate. Non-English languages often involve genders for nouns and declension of words to properly reflect their role and number in a phrase. Single word substitution fails miserably at this.
Your non-English customers will most likely prefer that you not butcher their language by attempting to partially translate text a word here and a word there. If you're going to go to the trouble of supporting localizable messages, do it right and allow the entire string to be translated so that word ordering and declension can be done properly by translators. In cases where the content is variable, make the format string a resource so that the translator can set off the variable data using the conventions of the language.
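To make the last point concrete, here is a minimal sketch in which the whole sentence is a format string that would come from resources. FormatResourceSketch and Render are hypothetical names, and the format strings in the usage note stand in for translated resource entries.

```csharp
using System;
using System.Globalization;

public static class FormatResourceSketch
{
    // 'formatFromResources' is the complete, translatable sentence with numbered placeholders.
    public static string Render(string formatFromResources, string fileName, int count)
    {
        return string.Format(CultureInfo.InvariantCulture, formatFromResources, fileName, count);
    }
}
```

Because the placeholders are numbered, one language can use "{1} errors in {0}" while another uses "In {0}: {1}", and the code that supplies the values stays the same.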

Using only 1 resource file instead of 1 resource file per form/other strings

We are localizing our forms and strings in a project and are having a problem; Visual Studio creates a resource file for each form when setting Localizable to true.
It's nothing more than a minor nuisance having to send all of the resource files to translators, but is it possible to get VS to use a global resources file instead?
Thanks!

Like Yoda would say, possible it is.
You will have to dynamically translate the dialogs when they are loaded. I did this on several projects and I would say it's much better than having localized resource files.

As others already said, it is possible to use a global resource file manually. I believe that it is actually more problematic and less maintainable, but still possible.
Now on to why MS decided on one resource file per form. From an internationalization point of view, this solution is better. On one hand, it gives translators one important thing: context. On the other hand, projects typically grow, and it is really unlikely that you will make changes to all forms at once. And you know what? Depending on your deal with the translation vendor, you can usually spend less on localization if only a few percent of the strings change. That is because they can use Translation Memory (TM) software.
With one global resource file, there is usually no context and no way to reasonably use TM. The result is that translations are less accurate and take longer (one needs to actually read large blocks of text to make sure everything is correctly translated).
By the way, you do not need to send out individual resource files. Instead you can use some kind of translation kit generator (or translation manager software) to create something useful for translators (for example a translation-memory-friendly file). Sadly, I cannot give you the names of such tools (although I know that there are a few of them), since my employer uses a custom system for that and I haven't had a chance to work with other tools.

WinForms doesn't support automatically generating a global resource file for forms in VS.
You must assign the strings yourself. For example:
Add a Resources.resx with a Resources.designer.cs to your project.
Define your strings in Resources.resx.
In your form.cs code, assign the strings in the constructor, like:
Label1.Text = Resources.Label1Text;

Prepare for multi language capability. Have I missed a trick?

My asp.net web app is currently being developed and I want to handle any language input by the user. This input will then be displayed to other users on the site.
So far I have done the following:
Put this in the head - <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
Saved inputs in NVARCHAR fields
Do I need to do anything else? Do I need any other meta tags (content-language, etc)?

Also think of a way to localize your UI, either via resources or with an appropriate support in your database. If the users are expected to generate non-English content, they will definitely appreciate seeing UI in their native language.

You should remember not to make assumptions that are not valid in general.
A fairly common assumption that is wrong is that (str.ToUpper().ToLower() == str) for any string str. A more subtle assumption is that the concept of "upper" and "lower" case even makes sense for any given language.
Another frequent problematic assumption is that a single char in the input is always an actual character from user's perspective. This is wrong - even setting things such as surrogate pairs aside, there are also combining characters. You either have to normalize your strings (and even that isn't 100% foolproof), or just avoid dealing with individual chars.
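A minimal sketch of the difference between char count and user-perceived characters, using StringInfo from System.Globalization. TextElementSketch and UserPerceivedLength are hypothetical names.

```csharp
using System;
using System.Globalization;

public static class TextElementSketch
{
    // Counts text elements (roughly, what a user sees as one character)
    // rather than UTF-16 code units.
    public static int UserPerceivedLength(string s)
    {
        return new StringInfo(s).LengthInTextElements;
    }
}
```

For "e\u0301" (the letter e followed by a combining acute accent), Length is 2 while UserPerceivedLength is 1. The same holds for a single character outside the Basic Multilingual Plane such as "\U0001F600", which is stored as a surrogate pair.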
If you want to deal with more than just plain text input displayed verbatim - i.e. full, proper localization - you'll also have to handle number, date, currency etc formats correctly; and, for example, do not assume that decimal separator is a dot.
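The decimal separator point can be sketched as follows. NumberFormatSketch is a hypothetical name, and CommaDecimal is a hand-built NumberFormatInfo that stands in for a culture using a comma as the decimal separator; in real code you would normally pass the user's CultureInfo instead.

```csharp
using System;
using System.Globalization;

public static class NumberFormatSketch
{
    // Stand-in for a comma-decimal culture (an assumption for this sketch).
    public static readonly NumberFormatInfo CommaDecimal = new NumberFormatInfo
    {
        NumberDecimalSeparator = ",",
        NumberGroupSeparator = "."
    };

    // Always parse with an explicit provider instead of relying on the thread culture.
    public static decimal Parse(string text, IFormatProvider provider)
    {
        return decimal.Parse(text, NumberStyles.Number, provider);
    }

    // "N1" formats with group separators and one decimal place.
    public static string Format(decimal value, IFormatProvider provider)
    {
        return value.ToString("N1", provider);
    }
}
```

With CommaDecimal, the input "1,5" parses as one and a half and 1234.5 formats as "1.234,5". With CultureInfo.InvariantCulture the comma is treated as a group separator, so the same input is not read as one and a half, and 1234.5 formats as "1,234.5".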
My best general advice would be to just go and read Michael Kaplan's blog, Microsoft's local guru on localization and related issues. Look for categories (tags) such as "Collation/Casing", "Encoding/Codepages" and "Int'l Programming". There's a lot of stuff there, and most of it is either directly relevant to your question, or interesting, or both. If, after reading a couple of his blog posts, you start thinking that maybe hiring a localization expert just to point out potential non-obvious problems in that area is a good idea, then you're probably right :)

Whenever you use String.Format, pass the client's culture as the format provider. FxCop can help you find these places.
Keep string constants out of your .cs code.
Place images (that can contain culture-specific text) into skin files or resources.

The browsers determine the charset in the following order:
Content-Type http header (value example: "text/html; charset=utf-8")
XML declaration
meta attribute
You should check that the web server does not send conflicting content type information in headers.
Make sure you save the files in UTF-8.
