We have a rather large project that uses Resources for internationalization (following this guide: ASP.NET MVC 2 Localization complete guide, using things like data attributes, and so on), and we have run into the need to translate the resource files. At the beginning of the project I chose to have a lot of small resource files, one for each view, viewmodel, controller, and so on, so I ended up with hundreds of resource files. During translation (which is done by our partners using the ResXManager tool) we ran into trouble identifying the context of each string: where it is displayed, so that the translator can pick the form that makes sense on screen.
So I was asked to make a variant of the application that displays not the localized values but the keys (the string names). E.g. given a resource string TBL_NAME that is used somewhere in a view as #ResX.TBL_NAME and translated into English as "Name", I would like this special variant to show "TBL_NAME", so the translator can see the context, i.e. exactly where this string is used.
Ideally this would not be a special build of the application but rather another "language" of the application available to translators, so they can switch between English and this "unlocalized" language.
I'm looking for some easy ways of doing this. So far I have thought of these approaches:
Override ResourceManager.GetString - cannot use, because we rely heavily on the generated Designer classes to access strings, and so far I haven't found a way to change the ResourceManager they create (see this answer). Did I miss something?
Create resources for some unused language, containing name/value pairs such as TBL_NAME/TBL_NAME - viable, but very exhausting since we have hundreds of resource files. Also, adding a new resource file would require us to remember to add the matching resource file for this unused language with exactly the same string names. You would also have twice as much work when adding a single string to the application.
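For illustration only, the first approach (overriding GetString) could look roughly like the sketch below. The class name KeyEchoResourceManager and the idea of reserving one otherwise unused culture for translators are assumptions, not something from the guide. The sketch also does not address the obstacle described above, namely getting the generated Designer classes to use such a manager.

```csharp
using System;
using System.Globalization;
using System.Resources;

// Hypothetical helper: returns the resource key itself whenever the
// effective UI culture is the culture reserved for translators.
public sealed class KeyEchoResourceManager : ResourceManager
{
    private readonly CultureInfo _keyCulture;

    public KeyEchoResourceManager(Type resourceSource, CultureInfo keyCulture)
        : base(resourceSource)
    {
        _keyCulture = keyCulture;
    }

    public override string GetString(string name)
    {
        return GetString(name, null);
    }

    public override string GetString(string name, CultureInfo culture)
    {
        // Generated Designer classes pass null unless their Culture
        // property was set; null means "use the current UI culture".
        CultureInfo effective = culture ?? CultureInfo.CurrentUICulture;
        if (effective.Name == _keyCulture.Name)
        {
            return name; // show "TBL_NAME" instead of "Name"
        }
        return base.GetString(name, effective);
    }
}
```

With a manager like this, switching the UI culture to the reserved one would show keys, and every other culture would behave as before.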
At the moment it seems to me that this task is impossible to solve with resources and our current approach, so I decided to ask here (I'm aware it is more of a discussion than a question), hoping someone can give me a hint about another approach to this problem.
My preferred option would be to give the translators an environment where they can see what they are translating. Rigi requires a bit of setup (basically you need to add an additional UI language), but once you have done that translators can work within the live website - or in a test instance, which is what we did.
They can also work in screenshots, which is convenient when translators would have to access admin or other role specific pages but you do not want to bother giving them all kinds of user rights. These screenshots can be generated as part of automated UI tests or during manual UI testing.
I am afraid I can't say anything about the cost of the solution, but our translators are really happy with it. I am not sure if this is what you are looking for since you asked for an easy solution, but it definitely solves the issue of giving translators the context they need to do their job - better than displaying resource IDs.
I'm working on a website that will be deployed internationally. It is a very big site, but for the sake of simplicity, all we're concerned about here is my Error.aspx with its C# code-behind. I'd like to make this custom error page as dynamic as possible. There's at least a handful of languages we need to read this page in right now, and more to come. This page needs to work independently and without a database to reference.
I'd like to have some text, and have the appropriate translation appear based on the language appropriate for that domain... e.g. ".com" = English, ".ca/fr" = French, ".mx" = Spanish... you get the idea.
What's the best way to do this?
I've looked into translation APIs, but the decent ones have a cost threshold. While one might look really helpful, this is pretty standard error message text that's unlikely to change, so a dynamic translator seems like overkill. It might help with scalability, but it costs extra money indefinitely, and it only saves effort compared with hard-coding on the handful of occasions when we add another language/country/domain.
The other idea I had was to simply hard-code it in the C#: parse Request.Url to get the domain, and write an ever-growing switch statement that assigns the appropriate text. (As an aside, I'm also trying to find a better way to do this: is the country code available from either the request object or the server?) This way would be independent and precise, and the only concrete drawback would be the cost of adding new languages, or of changing every string (probably not that many, at least at first) if the content of the error message needed to be adjusted. But this feels like bad practice.
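As a rough sketch of the domain-based idea, the switch statement could be replaced by a lookup table. The class and method names, the choice of English for plain ".ca", and the fallback to English are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

public static class ErrorPageLanguage
{
    // Hypothetical mapping from top-level domain to language code.
    private static readonly Dictionary<string, string> LanguageByTld =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "com", "en" },
            { "ca", "en" }, // assumption: plain .ca is English
            { "mx", "es" },
        };

    public static string Resolve(Uri url)
    {
        string host = url.Host;
        string tld = host.Substring(host.LastIndexOf('.') + 1);
        string path = url.AbsolutePath;

        // Special case from the question: ".ca/fr" means French.
        bool frenchPath =
            path.Equals("/fr", StringComparison.OrdinalIgnoreCase) ||
            path.StartsWith("/fr/", StringComparison.OrdinalIgnoreCase);
        if (tld.Equals("ca", StringComparison.OrdinalIgnoreCase) && frenchPath)
        {
            return "fr";
        }

        string language;
        return LanguageByTld.TryGetValue(tld, out language) ? language : "en";
    }
}
```

In the code-behind this might be called as ErrorPageLanguage.Resolve(Request.Url). Adding a country then means adding one table entry rather than another switch case.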
I've been researching this for a day now, but I haven't found any alternatives to these 2 options. What are the best practices for handling small amounts of text for translation, without the use of a CMS?
There is an easy built-in way to handle localization in ASP.NET Web Forms. It uses the Language Preference settings in the client's browser to select the language. Posting the steps of setting it up would be redundant since there's lots of information on this subject available online. Here is a good tutorial.
EDIT:
It might be a good idea to read up on resource (.resx) files. They are the standard .NET mechanism for handling different languages (referred to as localization), and they are what ASP.NET uses in the background when creating a local resource for a server control.
I apologize in advance for the generic nature of my question, but I was unable to find any helpful advice from people trying to do the same thing as me on the web. Let me describe my scenario:
I am giving end users/designers of a website the ability to customize their views by storing the views (using Razor) in the database. I have all of this working, but my question is the following: from a security standpoint, how can I ensure and enforce that unwanted code doesn't get executed in the user-defined view? There are two basic approaches that I think will work conceptually, but I am not sure which one is more feasible.
Option 1: Create a validation method in the administration tool that allows the user to input the view code. This would need to either take a whitelist or blacklist approach to what is allowable or not.
Option 2: Prevent unwanted code from being able to execute when rendering of the view occurs.
As a quick example of something that would need to be blocked, we wouldn't want to allow access to read or write files, access any data access functions, or even access configuration settings, etc. in the web.config. There will likely be a decently-sized list of things that probably shouldn't be allowable, but I'll need to sit down and try to think of as many security-related concerns as possible.
My question then is, which method would be the best bet? Also, can any direction be provided on how to go about either? I thought I might be able to make a trust-level based change, which would be Option 2, but I couldn't find any way to make that work on a per-view basis (the administration code is allowed to execute whatever it wants). I'm thinking Option 1 will end up being the best bet and I'll have to check the input for certain framework functions that shouldn't be allowed. Does anyone have any experience doing anything like what I'm trying to do? ANY feedback is much appreciated!
This would be extremely difficult.
You could run the template through the Razor preprocessor, then use Roslyn (still in early beta) to parse the generated file, look through all method calls (and constructors), and return an error if it calls something you don't like.
I strongly recommend that you use a whitelist for that, since the .NET Framework is big enough that you are bound to overlook something in a blacklist.
However, I would instead recommend that you not use Razor at all and instead use a templating engine that does not allow real C# code.
Is it bad practice to have a string like
"name=Gina;position=HouseMatriarch;id=1234" to hold state data in an application? I know that I could just as well have a struct, class or hashtable to hold this info.
Is it acceptable practice to hold delimited key/value pairs in a database field, just for use where the data type is not known at design time?
Thanks... I am just trying to ensure that I maintain good design practices.
Yes, holding your data in a string like "name=Gina;position=HouseMatriarch;id=1234" is very bad practice. This data should be stored in structs or objects, because it is hard to access, validate and process data held in a string. It will take you much more time to write the code that parses your string to get at your data than to just use the language structures of C#.
I would also advise against storing key/value pairs in database fields if the other option is just adding columns for those fields. If you don't know the type of your data at design time, you are probably not doing the design right. How will you be able to build an application when you don't know what data types your fields will have to hold? Or perhaps you should elaborate on the context of the application to make the intent clearer. It is not all black and white :-)
Well, if your application only passes this data between other systems, I don't see any problems in treating it as a string. However, if you need to use the data, I would definitely introduce the necessary types to handle that.
I think you will find your application easier to maintain if you make a struct or class to hold the data and then add a custom property to return (and set) the string you have been using. The getter takes the fields and formats them into the string that you are already using, and the setter does the reverse (takes the string and fills the fields). This way you maintain maximum compatibility with your old algorithms.
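A minimal sketch of that idea, assuming the three fields from the question's sample string. The class name StateData and the property name Legacy are made up for the example.

```csharp
using System;

public sealed class StateData
{
    public string Name { get; set; }
    public string Position { get; set; }
    public int Id { get; set; }

    // Hypothetical compatibility property: reads and writes the old
    // "key=value;key=value" form so existing code keeps working.
    public string Legacy
    {
        get { return "name=" + Name + ";position=" + Position + ";id=" + Id; }
        set
        {
            string[] pairs = value.Split(
                new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
            foreach (string pair in pairs)
            {
                int eq = pair.IndexOf('=');
                if (eq < 0) continue; // skip fragments without a key
                string key = pair.Substring(0, eq).Trim();
                string val = pair.Substring(eq + 1).Trim();
                switch (key)
                {
                    case "name": Name = val; break;
                    case "position": Position = val; break;
                    case "id": Id = int.Parse(val); break;
                }
            }
        }
    }
}
```

New code works with Name, Position and Id directly. Only the places that still expect the old string touch Legacy. Note that this sketch does not escape delimiter characters inside values.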
Well one immediate problem with that approach is embedded escape chars. Given your example what would happen if the user entered their name as follows:
Pet;er
or
Pe=;ter
or
pe;Name=Yeoi;
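To make the failure concrete, here is a small sketch of the kind of naive parser such a format tends to get. NaiveParser is a hypothetical name, and the parsing strategy is an assumption about how the string would typically be read.

```csharp
using System;
using System.Collections.Generic;

public static class NaiveParser
{
    // Splits on ';' and '=' with no notion of escaping.
    public static Dictionary<string, string> Parse(string data)
    {
        var result = new Dictionary<string, string>();
        foreach (string pair in data.Split(';'))
        {
            string[] parts = pair.Split('=');
            if (parts.Length == 2)
            {
                result[parts[0]] = parts[1];
            }
            // Anything that is not exactly key=value is silently dropped.
        }
        return result;
    }
}
```

Parsing "name=Pet;er;id=1234" with this yields "Pet" for the name: the "er" fragment has no "=" and is silently discarded, so the data is corrupted without any error.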
I am not sure what state data it is you are trying to hold, and without any context it's hard to make valid suggestions. Perhaps a first step would be to replace this with a key value pair, at least that negates the problem mentioned above and means you don't have to parse strings regularly.
I try not to keep data in any string-based format. But I have encountered several situations in which it was not possible to know in advance what the structure of the data would be (e.g. the customer/end-user could dynamically add fields).
In contrast to your approach, we decided to store the data in XML, e.g. in your case this would be something like this:
<user id="1234">
  <name>Gina</name>
  <position>HouseMatriarch</position>
</user>
This gives you the following advantages:
The classes to work with the data (read/write) are already available in the framework (e.g. XmlDocument or XML serialization)
You can easily exchange the data with other systems (if/when required)
You can store the data in a file
You can store the data in a database column (xml data type). You can even query that column when using SQL Server (although I'd try to avoid storing data in XML that has to be queried)
Using XML allows you to add additional fields to your data at any time
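As a small illustration of the first point, the sketch below writes and reads such a fragment. It uses LINQ to XML (XElement) rather than the XmlDocument or XML serialization mentioned above, and the class and method names are made up for the example.

```csharp
using System;
using System.Xml.Linq;

public static class ExtendedProperties
{
    // Builds the <user> fragment; XElement escapes special characters.
    public static string Build(int id, string name, string position)
    {
        var user = new XElement("user",
            new XAttribute("id", id),
            new XElement("name", name),
            new XElement("position", position));
        return user.ToString(SaveOptions.DisableFormatting);
    }

    // Reads one field back; returns null if the element is missing.
    public static string ReadName(string xml)
    {
        return (string)XElement.Parse(xml).Element("name");
    }
}
```

Because the framework handles escaping, a name such as "Pe=;ter" or one containing "<" survives the round trip unchanged.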
Update: I'm not sure why my answer was downvoted that much - maybe it is because of the bad example. Therefore I'd like to make it clear: I would not use XML for properties such as an ID/primary key of a user, or for standard properties like "name", "email", etc. But for "extended/dynamic" properties (as described above) I still think this is an easy and elegant solution.
If you want to store structured data in a string I think you should use a standard notation such as JSON.
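A minimal sketch of the JSON route, assuming System.Text.Json is available (it ships with .NET Core 3.0 and later). The Person and PersonJson names are hypothetical.

```csharp
using System;
using System.Text.Json;

public sealed class Person
{
    public string Name { get; set; }
    public string Position { get; set; }
    public int Id { get; set; }
}

public static class PersonJson
{
    // Serializes the object to a JSON string for storage or transport.
    public static string ToJson(Person person)
    {
        return JsonSerializer.Serialize(person);
    }

    // Restores the object; the serializer handles all escaping.
    public static Person FromJson(string json)
    {
        return JsonSerializer.Deserialize<Person>(json);
    }
}
```

The data still travels as a single string, but quoting and escaping follow a documented standard instead of a home-grown convention.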
It's bad practice because of the amount of effort you have to go to, to construct the strings and parse them later. There are other more robust ways of serialising data for passing between systems.
For core business data, suitably designed classes will be far simpler to maintain, and with all the properties strongly typed, you'll know early on when you mis-type a property name.
As for key-value pairs, I'd say they're sometimes OK, sometimes not. If there are a lot of possible values, but not a lot of actually owned values, then it can be perfectly all right to use KVPs. Sebastian Dietz's alternative of having a separate column for each field would result in a lot of empty fields in that case. It would also mean extra work altering the table every time you needed a new field.
None of the answers has mentioned normalization yet, so I thought I would. When database fields are involved, one of the key principles of normalization is that each field in a table only represents one thing. Delimited fields violate that principle.
One of the guys at Red Gate Software posted this article along those lines that you may find useful.
Well, it just means that it is less searchable and indexable than a hashtable would be. It would also require a bit of processing to get it into a state where it could be easily used by other bits of code. For example, a bit of code that queries the id in that data would be something horrible such as:
// Fragile: assumes the id always starts at character 37 and is 4 characters long
if (dataStringThing.Substring(37, 4) == SomeIdInStringFormat)
So yes, in most cases it is bad practice. However, there are other cases: this might be a default format that you need to retain the data in, or performance might mean that you should only parse it as and when required. In those cases it may not be a bad thing.
If you have reasons to keep it in that format, I would suggest transforming it into a class that separates the fields, with a ToString() implementation on that class that restores the original format if you also need that. If performance is a concern, modify this object so that it only parses the source into the fields the first time those fields are accessed.
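The parse-on-first-access variant could be sketched as follows, using Lazy<T> for the deferred parsing. The class name, the read-only design and the IsParsed property are assumptions made to keep the example short.

```csharp
using System;
using System.Collections.Generic;

public sealed class LazyStateData
{
    private readonly string _raw;
    private readonly Lazy<Dictionary<string, string>> _fields;

    public LazyStateData(string raw)
    {
        _raw = raw;
        // Parsing is deferred until a field is first read.
        _fields = new Lazy<Dictionary<string, string>>(() => ParseFields(raw));
    }

    public bool IsParsed { get { return _fields.IsValueCreated; } }
    public string Name { get { return _fields.Value["name"]; } }
    public int Id { get { return int.Parse(_fields.Value["id"]); } }

    // Restores the original format without ever parsing it.
    public override string ToString() { return _raw; }

    private static Dictionary<string, string> ParseFields(string raw)
    {
        var fields = new Dictionary<string, string>();
        foreach (string pair in raw.Split(';'))
        {
            int eq = pair.IndexOf('=');
            if (eq > 0)
            {
                fields[pair.Substring(0, eq).Trim()] = pair.Substring(eq + 1).Trim();
            }
        }
        return fields;
    }
}
```

Objects that are only passed along and written back out never pay the parsing cost. The first read of Name or Id triggers it once.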
To reiterate, nothing in isolation is necessarily a bad practice. Bad practices are context-dependent.
It (hopefully) obviously shouldn't be a normal choice. But there are cases where it's useful or necessary.
I can't think of any cases that wouldn't involve it being part of a communications protocol with some external service (e.g. a database connection string), so you're probably stuck with the format.
If you have a choice in the format (perhaps you are writing both sides of a system which can only communicate using strings), then at least choose something structured and well known. Examples have been given elsewhere, but the prime ones are naturally going to be XML or JSON. CSV or some other delimited format may be useful in very simple cases (such as the database connection string), but pay special attention to escaping delimiter characters, as the "Bobby Tables" joke (already referenced in another comment) nicely illustrates; search for it if you are not familiar with that one.
Your mention of a database suggests that this may be where the focus is. Are you trying to serialise application objects? (there are other ways of doing that). As another poster said, this may be a sign of a design that needs rethinking. But if you do need to store unknown datatypes in a DB, then XML may be an appropriate choice - especially if your DB supports XML fields. It's a bit of a minefield, though, so make sure you are familiar with how they work first.
I think it is not that bad when you are using a StringList for managing your string.
Especially when the structure of e.g. a configuration string (or configuration database field) must be flexible.
But normally you should not do this, because of these disadvantages.
It all depends on what you're trying to accomplish.
If you need a hierarchical data format or lots of fields that preserve data type, then no... a parsed string is a bad idea.
However, if you just need to transmit a string across a service and byte-conservation is important, then a Tag-Data pair may be exactly what you need.
If you do use a parsed string, it's important to be able to get at the data inside and quickly manage it. If you want an example TDP class, I posted one today to my website:
http://www.jerryandcheryl.net/jspot/2009/01/tag-data-pairs.html
I hope that helps.
I suggest considering these usage factors.
If you are processing the data within your own code, then you can use whatever data structures you wish. However, you may have issues developing your own implementation of a complex data structure, so consider using a pre-built one instead. Many come with whatever programming platform you may be using, while many more are documented in various books, articles, and discussions both printed and online. If you properly isolate your work from others, then you can safely do whatever you want.
On the other hand, if you need to share that data with others, then most careful consideration should be given. If you must share the data with an API, or via a storage mechanism (database, file, etc.), or via some transport (sockets, HTTP, etc.), then you should be thinking of others first and foremost. If you wish success and respect from your efforts, then you need to pay attention to standards and conventions and cost. Thankfully, practically any such use that you can imagine has been done before, so you can leverage others' efforts.
In a database, consider how others (and yourself) will be inserting, updating, deleting, and selecting the data. For example, using XML in a database makes all these steps unnecessarily hard and expensive compared to the alternatives. Pay attention to database normalization--learn it if you are not familiar already.
If you are dealing with text, pay attention to character encodings and make them explicit.
If there is an existing standard or convention for what you are doing, honor it. If there is a compelling reason to deviate, then accept the burden of justifying it, explaining it, and making it easy for others to accommodate your choices.
If you control both sides of a communication/transport medium, feel free to optimize. If you don't, err on the side of interoperability. Remember that a primary difference between the two scenarios is the level of self-description embedded with the data: interoperability has lots, optimization drops it based on shared assumptions. Text-rich data is more understandable, but binary is faster.
Think about your audience.