JavaScript Internationalization - the Good, the Bad, and the Ugly
Written by Adam Asnes

Sunday, 30 March 2008

Given JavaScript's status as the de facto browser client scripting language, and given the international nature of the Internet, it was inevitable that JavaScript and internationalization (i18n) would eventually cross paths. At Lingoport (www.lingoport.com), we see a good deal of JavaScript in the client code that we internationalize. While JavaScript is not completely without international capabilities, it does have its share of challenges and faults. This article briefly discusses what to expect of JavaScript in an international web application: what works (the good), what to watch out for (the bad), and what to avoid (the ugly).

The Good - Unicode

Probably the best news about JavaScript and i18n is that it supports Unicode. This means you should never have to worry about character corruption, provided you make sure that your JavaScript is actually using it.

If a JavaScript script block is embedded in an HTML file, it will automatically assume the character encoding of the enclosing page. Thus, if you have defined your HTML character set as UTF-8 you have done all you need to do. If your JavaScript is included as a separate .js file, you can add a charset attribute to your script tag to specify the character encoding of the included file. For example, a JavaScript file called functions.js that is encoded in UTF-8 would be included like this:

<script src="functions.js" type="text/javascript" charset="UTF-8"></script>

You can also include Unicode characters in any JavaScript, regardless of the file's encoding, by using Unicode escape sequences (\u followed by four hexadecimal digits that give the character's Unicode value, most significant digit first). For example, you could define a string with a smiley face character like this:

var smiley = "\u263A"; // smiley face -> :)

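JavaScript is even smart enough to report the length of Unicode strings in terms of characters and not bytes. For example, smiley.length would return 1. (Strictly speaking, length counts 16-bit UTF-16 code units, so a character outside the Basic Multilingual Plane counts as 2.)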
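As a small illustration of the escape syntax and the length property, here is a sketch that uses only core language features. Apart from smiley, the variable names are invented for this example. The last string shows a character outside the Basic Multilingual Plane, which has to be written as two \u escapes (a surrogate pair) and therefore reports a length of 2.

```javascript
var smiley = "\u263A";      // WHITE SMILING FACE: one UTF-16 code unit
var eAcute = "\u00E9";      // LATIN SMALL LETTER E WITH ACUTE
var clef = "\uD834\uDD1E";  // MUSICAL SYMBOL G CLEF (U+1D11E): surrogate pair

var smileyLength = smiley.length;  // 1
var word = "caf" + eAcute;         // escapes mix freely with plain text
var wordLength = word.length;      // 4
var clefLength = clef.length;      // 2, not 1
```

For the characters most applications deal with, counting by length matches what a user would call the number of characters; the surrogate-pair case mostly matters when truncating or validating strings.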

The Bad - Strings

One of the more annoying issues with JavaScript and i18n is dealing with embedded strings. As with any other programming language, embedded strings in an application's code make it difficult if not impossible to localize. Unfortunately, JavaScript does not have the concept of a resource file, and strings that will be generated by JavaScript must be defined in the code.

The easiest approach to deal with this issue is to define your JavaScript strings dynamically in server-side code (Java/JSP, ASPX, PHP, etc.). The following example defines some string resources in a JavaScript script block at the top of a JSP page:


<script language="JavaScript">

. . .

var RES_WELCOME = "<%= getString("RES_WELCOME", currentLocale) %>";

var RES_CURRENT_LOCALE_NAME = "<%= currentLocale.getDisplayName(currentLocale) %>";

var RES_SELECT_LANGUAGE = "<%= getString("RES_SELECT_LANGUAGE", currentLocale) %>";

var RES_GOODBYE = "<%= getString("RES_GOODBYE", currentLocale) %>";

. . .

</script>


Assuming the currentLocale object is set to English (US), the resulting block should look like this:


<script language="JavaScript">

. . .

var RES_WELCOME = "Welcome";

var RES_CURRENT_LOCALE_NAME = "English (US)";

var RES_SELECT_LANGUAGE = "Select a language:";

var RES_GOODBYE = "Goodbye";

. . .

</script>


When currentLocale is set to German (Germany) it should change to this:


<script language="JavaScript">

. . .

var RES_WELCOME = "Willkommen";

var RES_CURRENT_LOCALE_NAME = "Deutsch (Deutschland)";

var RES_SELECT_LANGUAGE = "Wählen Sie eine Sprache:";

var RES_GOODBYE = "Auf Wiedersehen";

. . .

</script>


For French (France):


<script language="JavaScript">

. . .

var RES_WELCOME = "Bienvenue";

var RES_CURRENT_LOCALE_NAME = "Français (France)";

var RES_SELECT_LANGUAGE = "Choisir une langue :";

var RES_GOODBYE = "Au revoir";

. . .

</script>


You get the idea.

There are a couple of things to keep in mind with this approach. First, any strings that are embedded in the files, whether JSP/ASPX/PHP/etc. or JavaScript .js files, must be externalized; that is, the strings should be moved into the string resource block as demonstrated above and replaced in the code with their variable names. Second, the JavaScript string resource block should be defined before any other embedded blocks or .js file includes that make use of these externalized strings. For example, the resource block should be defined before the following function is called:


<script language="JavaScript">

. . .

function checkLanguageSelect() {

if (languageSelect.selectedIndex == 0) {

alert(RES_SELECT_LANGUAGE);

}

. . .

}

</script>


Note that this simple example doesn't deal with more sophisticated functionality such as locale fallback, but this basic approach solves the simpler string resource-related issues common in JavaScript.
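For readers wondering what locale fallback might look like on the client, here is a minimal sketch. It assumes the server emits one resource table per locale instead of flat variables; the RESOURCES object, DEFAULT_LOCALE, the "de-AT" entry, and this client-side getString function are all hypothetical names and data invented for illustration, not part of the JSP example above.

```javascript
// Hypothetical per-locale resource tables. In practice the server would
// generate these, much like the RES_ variables in the JSP example.
var RESOURCES = {
  "en": { RES_WELCOME: "Welcome", RES_GOODBYE: "Goodbye" },
  "de": { RES_WELCOME: "Willkommen", RES_GOODBYE: "Auf Wiedersehen" },
  "de-AT": { RES_WELCOME: "Servus" }
};
var DEFAULT_LOCALE = "en";

// Look up a key by trying the full locale ("de-AT"), then its language
// ("de"), then the default locale.
function getString(key, locale) {
  var candidates = [locale];
  var dash = locale.indexOf("-");
  if (dash > 0) {
    candidates.push(locale.substring(0, dash));
  }
  candidates.push(DEFAULT_LOCALE);

  for (var i = 0; i < candidates.length; i++) {
    var table = RESOURCES[candidates[i]];
    if (table && table.hasOwnProperty(key)) {
      return table[key];
    }
  }
  return key; // last resort: return the key so the gap is easy to spot
}
```

With these tables, getString("RES_GOODBYE", "de-AT") falls back to the "de" entry, and a request for an unsupported locale such as "fr-FR" falls back to English. Whether fallback belongs on the client or the server is a design choice; doing it on the server keeps the page smaller.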


The Ugly - Language, Dates/Times

When it comes to language, JavaScript knows enough to be dangerous. That is, it knows what the browser's default language is (it's defined in navigator.language for Netscape-descended browsers such as Firefox and in navigator.browserLanguage for Internet Explorer). On my English (US) system these get reported as "en-us" or "en_US." It is tempting to treat this information as a reliable indication of the user's preferred language, and in many cases it will be, but it doesn't allow for the possibility that a user prefers a language other than the browser default.
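One way to keep the browser default from overriding the user is to treat it only as a fallback. The sketch below assumes the application already stores an explicit user choice somewhere (for example, in a profile on the server); the function name detectLanguage and its parameters are invented for this example. The nav parameter stands in for the browser's navigator object.

```javascript
// nav stands in for the browser's navigator object; passing it in keeps
// the function usable outside a browser. userPreference is an explicit
// choice made by the user, if any.
function detectLanguage(nav, userPreference) {
  var raw = userPreference || nav.language || nav.browserLanguage || "en";

  // Normalize "en_US" and "en-us" to "en-US". This is a simplification
  // that handles only a language code with an optional region code.
  var parts = String(raw).replace("_", "-").split("-");
  var lang = parts[0].toLowerCase();
  return parts.length > 1 ? lang + "-" + parts[1].toUpperCase() : lang;
}
```

In a page this might be called as detectLanguage(navigator, savedChoice), where savedChoice is whatever the user selected earlier, or undefined if they have not chosen yet.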

On a related note, there are a small number of "locale-specific" methods in JavaScript (such as toLocaleString, toLocaleDateString, and toLocaleTimeString on Date objects), which deal with the presentation of dates and times as strings, but these always produce a single format for the browser's default locale. This also applies to the ability to parse date and time strings; they will only be parsed correctly if the strings are formatted according to the conventions of the browser's default locale.

Although these features provide some minimal language support, it is actually best to ignore them and instead rely on the server to provide this functionality as much as possible.
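To make the contrast concrete, here is a small sketch. The first call uses one of the locale-sensitive Date methods, whose output depends on the environment it runs in. The helper toIsoDate is a hypothetical function written for this example; it builds a locale-neutral year-month-day string that could be sent to the server, which would then do the locale-aware formatting.

```javascript
var date = new Date(2008, 2, 30, 14, 5, 0); // 30 March 2008, 14:05 local time

// Locale-sensitive: the format depends on the environment's default locale,
// so the result differs from one user's machine to another.
var localized = date.toLocaleDateString();

// Locale-neutral: build a fixed numeric form from the individual fields
// and leave locale-specific formatting to the server.
function toIsoDate(d) {
  function pad(n) {
    return n < 10 ? "0" + n : String(n);
  }
  return d.getFullYear() + "-" + pad(d.getMonth() + 1) + "-" + pad(d.getDate());
}

var neutral = toIsoDate(date); // "2008-03-30"
```

The neutral string is the same for every user, so the server can parse it reliably and return text formatted for whatever locale the user has chosen.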

With the advent of AJAX, a higher level of i18n functionality becomes possible because of the ability to interact with the server in a more seamless fashion. Using AJAX to achieve this higher functionality in JavaScript will be discussed in a future article.

Article Source: http://www.ArticleBlast.com

About The Author:

Adam Asnes founded Lingoport in 2001 after seeing firsthand that the niche for software globalization engineering products and services was underserved in the localization industry. As Lingoport's President and CEO, he focuses on sales and marketing alliances while maintaining oversight of the company's internationalization services engineering and Globalyzer product development. Adam is a frequent speaker and columnist on globalization technology as it affects businesses expanding their worldwide reach.
