Why store internationalization "words" in separate (XML) files?

I've been reading up a bit on how people do internationalization. The common consensus seems to be to save those strings in a separate file (usually XML) and load it when necessary.
I'm wondering why not just store those strings in a database instead? Isn't that much better?
Btw, my app is a website.

The most important thing is to store your string tables outside of your compilation units so that incorporating updated translations does not require a rebuild. This allows for new or updated translations to be incorporated at a later point without too much hassle.
Of course, those string tables could be stored anywhere. If you want to put them in a database, knock yourself out. As long as your application can reach them and your translation staff know how to deliver them into the right place, it doesn't make a difference.

Serious internationalization is always a big project with a lot of parts and players. As @zerkms alludes to, the translation task is very often an offline activity done by individuals and teams around the world.
So it makes sense to have a clean work product that a translation team can produce (the translated XML file).
Once the file is translated, it is up to you how you handle the translations within your software. It is common to keep them in memory, since you often need to substitute variables into the placeholders.
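To make the "keep them in memory and substitute variables" part concrete, here is a minimal sketch. The `StringTable` class and the XML layout (`<strings>` with `<string key="...">` children) are assumptions made up for illustration; a real translation deliverable may look quite different.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical in-memory string table; the XML layout is an assumption.
public sealed class StringTable
{
    private readonly Dictionary<string, string> entries;

    private StringTable(Dictionary<string, string> entries)
    {
        this.entries = entries;
    }

    // Parses the translated XML once and keeps the result in memory.
    public static StringTable Parse(string xml)
    {
        var entries = XDocument.Parse(xml)
            .Root
            .Elements("string")
            .ToDictionary(e => (string)e.Attribute("key"), e => e.Value);
        return new StringTable(entries);
    }

    // Looks up the template and substitutes arguments into {0}, {1}, ...
    // Falls back to the key itself when no translation exists.
    public string Format(string key, params object[] args)
    {
        string template;
        if (!entries.TryGetValue(key, out template))
        {
            return key;
        }
        return string.Format(template, args);
    }
}
```

With a table parsed from `<strings><string key="Greeting">Hallo, {0}!</string></strings>`, calling `Format("Greeting", "Anna")` gives "Hallo, Anna!". Because the file is parsed only once, swapping in an updated translation means replacing the file, not rebuilding the application.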

If you do store them in the database, you will introduce the additional overhead of querying the database every time your "locale" switches.
This is the reason for resource bundles. You package them along with the source code, but you don't have to change code to add support for new languages.
You could also subclass the ResourceBundle class yourself and implement JDBC support so the locale-specific strings are stored in the database.
http://java.sun.com/developer/technicalArticles/Intl/ResourceBundles/

Related

C# Resources: display resource string names instead of localized values

We have a rather large project that uses Resources for internationalization (following this guide: ASP.NET MVC 2 Localization complete guide, using things like data attributes, and so on), and we have run into the need to translate the resource files. At the beginning of the project I chose to have a lot of small resource files, one for each view, viewmodel, controller, and so on, so I ended up with hundreds of resources. During translation (which is done by our partners using the ResXManager tool) we have trouble identifying the context of a string (where it is displayed), which is needed to pick the form of translation that makes sense on screen.
So I was asked to make a variant ("mutation") of the application that displays the keys (string names) instead of the localized values. E.g. given a resource string TBL_NAME used somewhere in a view as @ResX.TBL_NAME and translated into English as "Name", I would like this variant to show "TBL_NAME", so the translator can see the context, i.e. exactly where the string is used.
Ideally this would not be a special build of the application but rather another "language" available to translators, so they can switch between English and this "unlocalized" language.
I'm looking for some easy ways of doing this. So far I have thought of these approaches:
Override ResourceManager.GetString - cannot use, because we access strings through the generated Designer classes all over the place and so far I haven't found a way to change the ResourceManager they create (see this answer). Did I miss something?
Create resources for some unused language containing name/value pairs such as TBL_NAME/TBL_NAME - viable, but very tedious since we have hundreds of resources. Adding a new resource would also require us to remember to add the same string name to this unused-language resource, so adding a single string to the application means twice the work.
At the moment it seems to me that this cannot be solved with resources and our current approach, so I decided to ask here (I'm aware it is more of a discussion than a question), hoping someone can give me a hint about another approach.
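For reference, the first approach (overriding GetString) could look roughly like the sketch below. `KeyEchoResourceManager` and the choice of "translator" culture are assumptions for illustration only. The sketch does not solve the wiring problem described above: the generated Designer classes keep their ResourceManager in a private static field, so getting this wrapper in there would need something extra (reflection, or a modified code generator), which is not shown and not tested here.

```csharp
using System;
using System.Globalization;
using System.Resources;

// Hypothetical wrapper: returns the resource key itself for one designated
// "translator" culture and defers to the real ResourceManager otherwise.
public sealed class KeyEchoResourceManager : ResourceManager
{
    private readonly ResourceManager inner;
    private readonly string keyCultureName;

    public KeyEchoResourceManager(ResourceManager inner, string keyCultureName)
    {
        this.inner = inner;
        this.keyCultureName = keyCultureName;
    }

    public override string GetString(string name)
    {
        return GetString(name, null);
    }

    public override string GetString(string name, CultureInfo culture)
    {
        // A null culture means "use the current UI culture", as in the base class.
        CultureInfo effective = culture ?? CultureInfo.CurrentUICulture;
        if (string.Equals(effective.Name, keyCultureName, StringComparison.OrdinalIgnoreCase))
        {
            return name; // show "TBL_NAME" instead of "Name"
        }
        return inner.GetString(name, effective);
    }
}
```

The point of the design is that the key-echo mode is just another culture, so translators could switch to it the same way they switch to English, and no duplicate resource files have to be maintained.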

My preferred option would be to give the translators an environment where they can see what they are translating. Rigi requires a bit of setup (basically you need to add an additional UI language), but once you have done that translators can work within the live website - or in a test instance, which is what we did.
They can also work in screenshots, which is convenient when translators would have to access admin or other role specific pages but you do not want to bother giving them all kinds of user rights. These screenshots can be generated as part of automated UI tests or during manual UI testing.
I am afraid I can't say anything about the cost of the solution, but our translators are really happy with it. I am not sure if this is what you are looking for since you asked for an easy solution, but it definitely solves the issue of giving translators the context they need to do their job - better than displaying resource IDs.

Pre-process MVC Razor File For Multi-Lingual Language Strings?

In my application we have multi-lingual language strings which are stored in custom tables, as the user can edit, delete, import new languages etc. via a UI.
Currently, at the beginning of each request, I fetch all the language strings (from our database) for the currently selected language and put them in a dictionary.
I then have an Html helper extension method which I use in the Razor views (see below). It looks in the dictionary I built at the beginning of the request and pulls out the correct string based on the key supplied to the helper.
Html.LanguageString("MyLanguage.KeyHere")
Now this works fine. However, as the application gets bigger, we are getting more and more language strings. It's not an issue right now, as it's still very fast with only around 200 strings to get.
But this also means I'm getting all of them, even if a page has, say, only one on it. Ideally I'd like a way of processing the LanguageString("") calls beforehand and running a query that gets just the ones needed at the beginning of the request. Or maybe my own LINQ-based language that can be processed to produce a more efficient call.
I'm looking for some advice on how to do this, as I'd like the application to be as efficient as possible. Any advice, help, or tips are gratefully received. Thanks.

I'd suggest caching language strings at the application level rather than fetching them for every request. For example, you can maintain a static dictionary and invalidate the cache only when the user changes these strings. This will make your application more responsive and save you from implementing the (imho) more complex and not necessarily more efficient technique of loading this data on demand.
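A rough sketch of such a static cache, under some assumptions: `LanguageStringCache` is a made-up name, and the `loadFromDatabase` delegate stands in for whatever query currently fetches all strings of one language.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical application-wide cache of language strings, one table per language.
public static class LanguageStringCache
{
    private static readonly ConcurrentDictionary<string, IReadOnlyDictionary<string, string>> Cache =
        new ConcurrentDictionary<string, IReadOnlyDictionary<string, string>>();

    // Returns the cached table for a language, loading it on first use.
    // Under a race the loader may run more than once; only one result is kept.
    public static IReadOnlyDictionary<string, string> Get(
        string language,
        Func<string, IReadOnlyDictionary<string, string>> loadFromDatabase)
    {
        return Cache.GetOrAdd(language, loadFromDatabase);
    }

    // Call this from the editing UI after strings are changed, deleted or imported.
    public static void Invalidate(string language)
    {
        IReadOnlyDictionary<string, string> removed;
        Cache.TryRemove(language, out removed);
    }
}
```

The Html helper would then read from `LanguageStringCache.Get(...)` instead of a per-request dictionary, so the database is only hit on the first request after a start or after an edit.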
As a side note I'd add the following: it's usually a good practice to address these kinds of problems when they arise (rather than fixing something that is not broken) and focus on more important things. I totally agree that performance implications of a given solution must always be taken into consideration, I'm just saying that premature optimizations are not always a good idea.

Using only 1 resource file instead of 1 resource file per form/other strings

We are localizing our forms and strings in a project and are having a problem; Visual Studio creates a resource file for each form when setting Localizable to true.
It's nothing more than a minor nuisance having to send all of the resource files to translators, but is it possible to get VS to use a global resources file instead?
Thanks!

Like Yoda would say, possible it is.
You will have to dynamically translate the dialogs when they are loaded. I did this on several projects and I would say it's much better than having localized resource files.

As others have already said, it is possible to use a global resource file manually. I believe that it is actually more problematic and less maintainable, but it is possible.
Now on to why MS decided on one resource file per form. From an internationalization point of view, this solution is better. For one thing, it gives translators something important: context. For another, projects typically grow, and it is really unlikely that you will change all forms at once. Depending on your deal with the translation vendor, you can usually spend less on localization if only a few percent of the strings change, because the vendor can use Translation Memory (TM) software.
With one global resource file, there is usually no context and no way to reasonably use TM. The result is that translations are less accurate and take longer (one needs to actually read large blocks of text to make sure everything is correctly translated).
By the way, you do not need to send out individual resource files. Instead you can use some kind of translation kit generator (or translation manager software) to create something useful for translators (for example a translation-memory-friendly file). Sadly, I cannot give you the names of such tools (although I know that there are a few of them), since my employer uses a custom system for that and I haven't had a chance to work with other tools.

WinForms doesn't support automatically generating a global resource file for forms in VS.
You must assign the strings yourself. For example:
Add a Resources.resx with Resources.designer.cs to your project.
Define your strings in Resources.resx.
In your form.cs code, assign the strings in the constructor like this:
Label1.Text = Resources.Label1Text;

Should I store localization content in the application state

I am developing my first multilingual C# site and everything is going ok except for one crucial aspect. I'm not 100% sure what the best option is for storing strings (typically single words) that will be translated by code from my code behind pages.
On the front end of the site I am going to use ASP.NET resource files for the wording on the pages. This part is fine. However, this site will make XML calls and the XML responses are only ever in English. I have been given an Excel sheet with all the words that will be returned by the XML, broken into the different languages, but I'm not sure how best to store/access this information. There are roughly 80 words x 7 languages.
I am thinking about creating a dictionary object for each language that is created by my global.asax file at application run time and just keeping it stored in memory. The plus side for doing this is that the dictionary object will only have to be created once (until IIS restarts) and can be accessed by any user without needing to be rebuilt but the downside is that I have 7 dictionary objects constantly stored in memory. The server is a Win 2008 64bit with 4GB of RAM so should I even be concerned with memory taken up by using this method?
What do you guys think would be the best way to store/retrieve different language words that would be used by all users?
Thanks for your input.
Rich

From what you say, you are looking at 560 words which need to differ based on locale. This is a drop in the ocean. The resource file method which you have contemplated is fit for purpose and I would recommend using them. They integrate with controls so you will be making the most from them.
If it did trouble you, you could put them in a sliding cache (a sliding expiration of 20 minutes, for example), but I do not see anything wrong with your choice in this solution.
OMO
Cheers,
Andrew
P.S. Have a read through this to see how you can find and bind values in different resource files to controls and literals, and use them programmatically.
http://msdn.microsoft.com/en-us/magazine/cc163566.aspx

As long as you are aware of the impact of doing so then yes, storing this data in memory would be fine (as long as you have enough of it). Once you know what is appropriate for the current user, tossing it into memory is fine. You might look at something like MemCached Win32 or Velocity, though, to offload the storage to another app server. Use this even on your local application for the time being; that way, when it is time to push this to another server or grow your app, you have a clear separation of concerns defined at your caching layer. And keep in mind that the more languages you support, the more stuff you are storing in memory. Keep an eye on the amount of data being stored in memory on your lone app server, as this could become overwhelming in time. Also, make sure that the keys you are using are specific to the language. Otherwise you might find that you are serving a menu in German to an English user.
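To illustrate the point about language-specific keys, here is a small sketch. `ResponseWordTranslator` is a hypothetical name; the idea is just that the language is part of every key, so a German entry can never be returned for an English user by accident.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical lookup for translating words from English-only XML responses.
public sealed class ResponseWordTranslator
{
    // The key combines language and English word, so entries for
    // different languages can never be confused with each other.
    private readonly Dictionary<Tuple<string, string>, string> words =
        new Dictionary<Tuple<string, string>, string>();

    public void Add(string language, string englishWord, string translation)
    {
        words[Tuple.Create(language, englishWord)] = translation;
    }

    // Falls back to the English word when no translation is stored.
    public string Translate(string language, string englishWord)
    {
        string translation;
        return words.TryGetValue(Tuple.Create(language, englishWord), out translation)
            ? translation
            : englishWord;
    }
}
```

At roughly 80 words x 7 languages this holds a few hundred short strings, so one instance built at application start and shared by all users should be negligible in memory terms.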

Internationalization in the database

Do you guys think internationalization should only be done at the code level, or should it be done in the database as well?
Are there any design patterns or accepted practices for internationalizing databases?
If you think it's a code problem, then do you just use resource files to look up a key returned from the database?
Thanks

Internationalization extends to the database insofar as your columns must be able to hold the data: NText, NChar, NVarchar instead of Text, Char, Varchar.
As far as your UI goes, for non-changing labels, resource files are a good method to use.

If you mean making your app support multiple languages at the UI level, then there are more possibilities. If the labels never change except when you release a new version, then resource files embedded in the executable or assembly itself are your best bet, since they are faster. If, on the other hand, your labels need to be adjusted at runtime by the users, then storing the translations in the database is a good choice. As for the code itself and the names of tables and fields in the database, we keep them in English as much as possible, since English is the de facto standard for IT people.

It depends a lot on what you are storing in your database. Taking examples from a recent project I was on, a film title that is entered at a client site and only visible to that client is fair game to store as-is in the database. A projector error, on the other hand, can be viewed by the client as well as by network operations centers that might be in different countries, so it should be stored as an error code (plus supporting data, like lamp hours and the title of the movie being shown) which can be translated at the GUI level depending on the language setting of the viewer.
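A minimal sketch of that split, with invented names throughout (`ProjectorError`, `LAMP_FAIL`, and the message templates are all hypothetical): the stored record carries only the code and its supporting data, and the text is produced when a viewer looks at it.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical record as it might be stored: a code plus supporting data, no display text.
public sealed class ProjectorError
{
    public string Code { get; set; }
    public int LampHours { get; set; }
    public string FilmTitle { get; set; }
}

public static class ProjectorErrorFormatter
{
    // Invented templates per viewer language, keyed by error code.
    private static readonly Dictionary<string, Dictionary<string, string>> Templates =
        new Dictionary<string, Dictionary<string, string>>
        {
            { "en", new Dictionary<string, string> { { "LAMP_FAIL", "Lamp failed after {0} hours while showing \"{1}\"." } } },
            { "de", new Dictionary<string, string> { { "LAMP_FAIL", "Lampe nach {0} Stunden ausgefallen, während \"{1}\" lief." } } }
        };

    // Renders the stored error in the viewer's language, falling back to
    // English templates and finally to the bare error code.
    public static string Describe(ProjectorError error, string viewerLanguage)
    {
        Dictionary<string, string> table;
        if (!Templates.TryGetValue(viewerLanguage, out table))
        {
            table = Templates["en"];
        }
        string template;
        if (!table.TryGetValue(error.Code, out template))
        {
            return error.Code;
        }
        return string.Format(CultureInfo.InvariantCulture, template, error.LampHours, error.FilmTitle);
    }
}
```

Note that the film title is passed through untranslated, which matches the distinction above: user-entered data stays as entered, and only the system-generated message follows the viewer's language.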

@hova covers the technicalities, but something you might want to consider is how to support a system that is showing a language you don't understand.
One way to cope with this is to have English as the default language, and a user setting that switches into a different language. That way your support users can log in and see the system in a natural way (assuming English as their first language), and your actual users can see the system in their first language. IMO, the data should always be 'natural' - in the language of the users.
Which raises another interesting point - should your system allow multiple languages for cross-border installations? In my experience, for user interface yes, but for data, no. To take a trivial example of address formatting, a letter to a French third party from a Swiss system should still have a Swiss-format address instead of a French one, as it has to go through the Swiss postal system first.

If your customers are Japanese and want to see their names in Kanji and Katakana (and sometimes in most formal Gaiji), you've got to store them as Unicode. No way around that.

Even things like addresses are very different between the US and Japan. One schema won't cut it for both.
