Having a web app with users in more languages than just English is a good problem to have. But where do you start with internationalisation? In Part I of this article Steve Ellis breaks it all down so your app doesn’t get lost in translation.

Building web apps can be a lot of fun - particularly if people like and use them. If you dig through your stats the chances are that people from all over the world are visiting you and they speak lots of different languages. By only serving your app or website in English you’re assuming that everyone can speak it, or at least understand it well enough to get by. Just as learning to speak a new language can open the door to new cultures and ways of thinking, having a website that speaks multiple languages allows you to engage people who might otherwise have left looking for an alternative.

This was the situation we found ourselves in back in March just after we launched Diarised. We had people not only visiting us from all over the world but blogging about us in many different languages. We thought it would be a nice idea to get Diarised working in a few.

The process of getting your website ready for multiple languages is known as internationalisation. Internationalisation goes beyond getting the website working in other languages and covers aspects such as dates, times and currency. In this two-part article we’re going to look at how you go about preparing a website for internationalisation. Then we’ll look at solutions to some of the real-world problems that can arise, such as dynamic data and plurals.

Associative Arrays

As a first stab at internationalisation we might try using associative arrays. We could store our translated text using the English text as the key, like so:


  //set up our language arrays

  $english["welcome to example.com"] = "welcome to example.com";
  $english["have a nice day"] = "have a nice day";

  $spanish["welcome to example.com"] = "Bienvenidos a example.com";
  $spanish["have a nice day"] = "Ten un día bueno";
  ...

  /*
  *  We'll set the locale based on a variable in the query string
  */

  switch($_GET["lang"] ?? ""){      //a missing lang parameter is treated as empty and falls through to the default
    case "es":		            //es turns the page into Spanish
      $messages = $spanish;
      break;

    default:			    //everything else defaults back to English
      $messages = $english;
      break;
  }
  ...

  echo $messages["welcome to example.com"];

Adding new languages should then be a case of adding more arrays and updating the switch statement. Perfect! Well, not quite. There are a few issues with this approach:

  1. Dynamic data. What if you need to insert someone’s username into a sentence? Breaking the sentence down into lots of little phrases will only make your translator’s life difficult and will probably lead to inaccurate translations.
  2. You’re assuming your translator will be able to understand PHP array syntax and avoid breaking everything. What happens when they decide to stick something in quotation marks? The last thing you want to do after receiving a translation is to start debugging syntax errors.
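  3. This approach won’t fail gracefully. What happens if you accidentally type echo $messages["wlcome to example.com"]? By missing off the “e” the lookup finds nothing, so instead of falling back to the English text the page outputs an empty string (and PHP raises an undefined index notice).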
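
The third problem can be softened, though not solved, with a small lookup helper. The sketch below is an illustration only: the function name translate is hypothetical and isn't something we used on Diarised. It returns the English key itself when a translation is missing, rather than printing nothing.

```php
<?php
// Hypothetical helper (an assumption for illustration): look up a message
// and fall back to the English key when no translation exists.
function translate(array $messages, string $key): string
{
    // array_key_exists avoids the undefined index notice a direct lookup raises
    return array_key_exists($key, $messages) ? $messages[$key] : $key;
}

$spanish = array(
    "welcome to example.com" => "Bienvenidos a example.com",
);

echo translate($spanish, "welcome to example.com"), "\n"; // Bienvenidos a example.com
echo translate($spanish, "wlcome to example.com"), "\n";  // typo: falls back to the key itself
```

This only hides the typo from your visitors. The first two problems, dynamic data and translators editing PHP, are still there.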

Clearly this method has issues. What we really want is a way to allow non-techies to translate our text for us and a way for us to simply “plug” the translation back into the website.

Introducing Gettext

After ruling out associative arrays for Diarised’s translations we decided to use gettext. gettext is the GNU internationalisation library and it provides an excellent way of separating code from content. If you’re hosting on Linux there’s a good chance it will already be installed on your server.

So, how does it work?

The official Gettext website gives a highly detailed and fairly confusing overview but the gist is:

  1. You pass every piece of text you need translated through a function called gettext
  2. Once everything has been marked up you run the xgettext command to create a PO (Portable Object) file. This is a plain text file containing the source text and a place for the translated text
  3. You send this to your translator to open with a PO editor
  4. Once your translator has filled out the translations they send the PO file back to you
  5. You compile the PO file into an MO (Machine Object) file that gettext can read
  6. You set the locale of your site to a language (usually through the query string) and sit back and admire your website in a totally different language

Step 1 is just a case of the following:


  <p>welcome to example.com</p>

becomes


  <p><?php echo gettext("welcome to example.com"); ?></p>

Marking up your code to use gettext is probably the most irritating step but fortunately you only have to do this once. It will then work for as many languages as you like. The next stage is to get this information into our translation file.

How to create a PO file

The first step is to set up the directory structure:

  1. In your webroot folder create a folder called locale
  2. Inside that create a folder for each language you plan on supporting and use the language code as the folder name, e.g. for Spanish use es_ES or if you wanted say a localised Argentine version you could have es_AR
  3. Inside each of those folders create a new one called LC_MESSAGES. This is where we will keep our translation files
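
If you'd rather not create the folders by hand, the three steps above can be sketched in PHP. This is a minimal sketch under assumptions: the function name createLocaleFolders is hypothetical, and the example writes to a temporary folder rather than your real webroot.

```php
<?php
// Sketch: create locale/<code>/LC_MESSAGES for each locale code.
// The webroot path and the list of codes are assumptions for illustration.
function createLocaleFolders(string $webroot, array $codes): array
{
    $created = array();
    foreach ($codes as $code) {
        $path = $webroot . "/locale/" . $code . "/LC_MESSAGES";
        // third argument true = create the parent folders as needed
        if (!is_dir($path) && !mkdir($path, 0755, true)) {
            throw new RuntimeException("Could not create $path");
        }
        $created[] = $path;
    }
    return $created;
}

// Example: Spanish (Spain) and Spanish (Argentina) under a temporary folder
$webroot = sys_get_temp_dir() . "/i18n_demo_" . uniqid();
$folders = createLocaleFolders($webroot, array("es_ES", "es_AR"));
print_r($folders);
```

Running it twice is harmless, because folders that already exist are skipped.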

To create the PO file you’ll need to use the command line but don’t worry we’re here to hold your hand. Fire up a terminal window, connect to your webserver and be brave.
The command we need is called xgettext. It scans a script looking for calls to gettext, then grabs the text you’re passing and puts it into a PO file. For example:


  # xgettext -o messages.po *.php

This will search every PHP page in the current working directory and stick the results in messages.po. Once this is done open the file with a plain text editor and make the following change:


  "Content-Type: text/plain; charset=CHARSET\n"

becomes


  "Content-Type: text/plain; charset=utf-8\n"

Now save it and send it to your translator.
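
If you regenerate the PO file often, the charset edit to the header can be scripted. Here's a minimal sketch, assuming the generated header still contains xgettext's charset=CHARSET placeholder. The function name setPoCharset is hypothetical.

```php
<?php
// Hypothetical helper: swap xgettext's CHARSET placeholder for a real charset.
function setPoCharset(string $poText, string $charset = "utf-8"): string
{
    return str_replace("charset=CHARSET", "charset=" . $charset, $poText);
}

// Single-quoted so the \n stays as the two literal characters found in a PO file
$header = '"Content-Type: text/plain; charset=CHARSET\n"';
echo setPoCharset($header), "\n";
```

To apply it to a real file you could read messages.po with file_get_contents, pass the text through the helper and write it back with file_put_contents.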

Although a PO file is a plain text file its contents aren’t particularly friendly. Luckily there are a few pieces of software that make editing PO files a bit easier (especially for your translator). For Windows there’s Poedit, and for the Mac there’s LocFactory Editor. Both are free and will make your translator’s life much easier.

When your translator sends the PO file back, stick it into the appropriate LC_MESSAGES folder created earlier and open a terminal window so we can compile it.

In your command window go to the LC_MESSAGES folder with messages.po and issue the following:


  # msgfmt messages.po

Assuming there were no errors this will churn out a file called messages.mo. This is the compiled file gettext will actually read, so any changes to your PO file will require you to redo this step to make the changes live. Now all we need to do is tell gettext which language we want our text in.
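
Because every change to a PO file means recompiling, it can help to generate the msgfmt commands for all of your locales in one go. The sketch below only builds the command strings. It assumes the locale/<code>/LC_MESSAGES layout described earlier, and the function name buildMsgfmtCommands is hypothetical. The -o option tells msgfmt where to write the MO file.

```php
<?php
// Sketch: build one msgfmt command per locale folder.
// The folder layout follows the article; the function name is hypothetical.
function buildMsgfmtCommands(string $localeRoot, array $codes): array
{
    $commands = array();
    foreach ($codes as $code) {
        $dir = $localeRoot . "/" . $code . "/LC_MESSAGES";
        // -o names the output file; escapeshellarg guards against odd paths
        $commands[] = "msgfmt -o " . escapeshellarg($dir . "/messages.mo")
            . " " . escapeshellarg($dir . "/messages.po");
    }
    return $commands;
}

foreach (buildMsgfmtCommands("locale", array("es_ES", "es_AR")) as $command) {
    echo $command, "\n"; // pass each one to exec() to actually compile
}
```

Keeping the command building separate from running it makes it easy to check what will be executed before anything touches your files.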

Binding a locale

This step can be done via a few lines of PHP near the start of your script:


  $locale = $_GET["locale"];

  putenv("LC_ALL=$locale");
  setlocale(LC_ALL, $locale);

  bindtextdomain("messages", "locale/");	//binds the messages domain to the locale folder
  bind_textdomain_codeset("messages","UTF-8"); 	//ensures text returned is utf-8, quite often this is iso-8859-1 by default
  textdomain("messages");	//sets the domain name, this means gettext will be looking for a file called messages.mo

$locale will need to be set to the locale that you want the website to appear in. To begin with it’s simplest to set this via the query string as shown above. One of the advantages of gettext is that if it can’t find a translation for the locale you pick it simply returns the original text you passed in, which in our case is English. So when someone sticks locale=tlh in the query string expecting a Klingon version they’ll just get English.
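
Since the locale comes straight from the query string, it's worth checking it against the locales you actually support before handing it to putenv and setlocale. Here's a minimal sketch: the function name pickLocale, the list of supported locales and the en_GB default are all assumptions for illustration.

```php
<?php
// Hypothetical helper: only accept locales we actually ship translations for.
function pickLocale(array $query, array $supported, string $default): string
{
    $requested = (isset($query["locale"]) && is_string($query["locale"]))
        ? $query["locale"]
        : "";
    // strict comparison so only exact, known locale codes get through
    return in_array($requested, $supported, true) ? $requested : $default;
}

$supported = array("en_GB", "es_ES", "es_AR"); // assumed list for illustration

echo pickLocale(array("locale" => "es_ES"), $supported, "en_GB"), "\n"; // es_ES
echo pickLocale(array("locale" => "xx_XX"), $supported, "en_GB"), "\n"; // en_GB
echo pickLocale(array(), $supported, "en_GB"), "\n";                    // en_GB
```

In a real page you would pass $_GET as the first argument and use the result as $locale.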

That’s it for part one. At this point you should be able to at least make a start on preparing your websites for internationalisation. In part two we’re going to look at some of the real-world issues you’re likely to run into and tell you what we did to solve them on Diarised.

24 Responses to “Give your web app international appeal”

  1. Pierre says

    Thanks for sharing this!

  2. Made Media Ltd » Give your web app international appeal says

    […] Give your web app international appeal is the first in a two-part series offering fairly technical guidance on how to internationalise web apps generally. It’s a good read if you’re into that kind of thing. Watch out for part two coming soon. […]

  3. German Rumm says

    You could use the gettext() alias to make the transition less irritating. You can type

    echo _('Welcome to example.com');

    instead of

    echo gettext('Welcome to example.com');

    since _() is an alias of gettext().

  4. Soenke Ruempler says

    We also wrote a tutorial about PHP and gettext which goes a bit more into detail: http://blog.northclick.de/archives/20

  5. Jonathan Snook says

    Just a quick note that setlocale on Windows takes a numerical value. The PHP documentation has more details.

  6. Merengue says

    Looking forward to the next article. By the way, the Spanish translation $spanish["have a nice day"] = "Ten un día bueno"; is incorrect. The correct one is $spanish["have a nice day"] = "Tenga un buen día";

  7. Ben Griffiths says

    Very interesting - im in the midst of creating a web app myself - this is a perfect addition to build in now, rather than later :)

  8. samuele says

    yes but my first language is italian and not english?
    i can’t generate po files from a web with italian as default lang?

  9. Frank Sattler says

    Very interesting article - thanks for sharing this!
    One thing I would add as a native non-English speaker: as tempting as it might be - do not, under any circumstances, try to localise your site using Babelfish or similar without having access to a native speaker, or at least a very comprehensive knowledge of the language you want to provide.
    I’ve seen a number of sites that started off in one language, and the authors added incomprehensible translations for another half dozen, and they’re not pretty.
    The only thing the people behind these sites have achieved is to end up being a laughing stock and turning off users at the same time.
    It’s far better to have a well-written English site than one that’s available in a dozen languages which no speaker of these languages can read because they can’t understand it…

  10. Darren Stuart says

    You can do this in asp.net 2.0 without writing a line of code.

    good read mind.

  11. Saumendra swain says

    Thanks for sharing such interesting and informative knowledge.

  12. Prestito says

    I prefer to use a language file in order to set a different language like as a lots of cms do that.

  13. Chandra says

    Thanks for the article. I am actually planning to launch a multilingual site and this article has some great tips for me.

  14. loptar says

    good to know there are more other way than a language translation files. may be i will try it. :-)

  15. Mike Lisn says

    How to create a PO file… - Thanks! very helpfull for me. Btw. article is very good.

  16. Stefan says

    I was very much thinking about solution for internationalization in the past. However Gettext looks interesting, there is too much hassle with coding. Finally I decided to create individual templates for each language and stick with it. I have all system messages in template. I just pick proper template for each language and thats it. It’s much better optimized now. The issue of internationalization is much more complex than translation of texts. There can be also cultural differences so individual templates can be great help.

  17. Vitamin Features » Give your web app international appeal, Part II says

    […] Print Format […]

  18. John Manoogian III says

    will part 2 of this article cover using place-holders in your gettext strings to account for moveable tokens in translation? none of the PHP gettext tutorials online present a straightforward solution for strings with multiple keyword substitutions that doesn’t break compatibility with xgettext. the solutions i’ve seen posted all require you to manually fix your strings in the PO file.

    looking forward to version 2!

  19. CP says

    Thanks for sharing this! It´s good to know, that I have a choice or better an alternative to language translation files!

  20. Simon Jensen says

    Thank you so much for this … This has save soooooo much time for me! And it works like a charm.

  21. I am just a programmer » Give your web app international appeal says

    […] read more | digg story […]

  22. T’as le bonjour de Jean Rat ! » Blog Archive » La mini revue de web du Dimanche 18/11/2007 says

    […] Pour bien internationaliser son application web http://www.thinkvitamin.com/features/webapps/give-your-web-app-international-appeal […]

  23. Vitamin - Give Your Web App International Appeal | GreatSo.com says

    […] Building a web application can be a lot of fun. However, there is a problem if we are having a web app with users in more languages than just English. Where do you start with internationalisation? Steve Ellis published a great article about How To Give Your Web Application an International Appeal which breaks into two parts. In Part I Steve Ellis covered the basics of internationalisation. In Part II he takes us around the world with several real life examples, like tricky plurals, dynamic data, localising images, video and audio, and more. […]
