I18N & L10N
I18N is short for “internationalization”, and L10N is short for “localization”. These typically refer to the process of making your software application available in different languages. If you want users all around the world to use your application in their native language, it’s important to use the proper workflow and best practices to achieve a multilingual app!
This page is a collection of some notes on how to make your web app support multiple languages, including tooling, best practices for implementation, and the general workflow for how to do this.
The Basic Idea
The pursuit of a localized, multilingual software application is not a new thing. Almost since software has existed, developers have been trying to make their applications multilingual. In other words, there is a lot of history, well-established best practices, old tools and time-tested knowledge around this problem.
The basic idea of how to translate your app can be summed up like so:
- A user interface has strings in it; these are “UI strings”
- Use a tool to extract UI strings from your source code into a different format
- Let translators translate the UI strings somewhere else, away from your source code
- Your application then uses the translated UI strings (translations) for the locale of the current user at run time to create a localized user interface.
Here’s a really confusing ASCII diagram:
┌────────────────┐
│                │
│  source code   ├──────┐
│                │      │ extract UI strings
└────────────────┘      ▼
           ┌──────────────────────────┐
           │                          │
           │   translation platform   │
           │                          │
           └────────────┬─────────────┘
                        │ translators translate the UI strings
                        ▼
                   Translations
            (english, chinese, german)
                        │
┌────────────────┐      ▼
│  application   │   render the proper
│  at runtime    ├─► set of Translations
└────────────────┘
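The last step of the workflow, picking the right translation at run time, can be sketched roughly as follows. This is a minimal illustration, not how any particular library does it: the `catalogs` map, the `translate` function and the sample strings are all made up for this example. Real catalogs would be loaded from the files the translators produce.

```typescript
// Hypothetical in-memory catalogs: locale -> (source UI string -> translation).
const catalogs = new Map<string, Map<string, string>>([
  ["de", new Map([["Save changes", "Änderungen speichern"]])],
  ["zh", new Map([["Save changes", "保存更改"]])],
]);

// Look up a UI string for the current user's locale. If the locale is
// unknown, or the string has no (non-empty) translation yet, fall back
// to the original source string so the UI never shows a blank label.
function translate(locale: string, uiString: string): string {
  const translated = catalogs.get(locale)?.get(uiString);
  return translated ? translated : uiString;
}

const germanLabel = translate("de", "Save changes"); // "Änderungen speichern"
const frenchLabel = translate("fr", "Save changes"); // "Save changes" (no French catalog)
```

Falling back to the source string is a design choice: a partially translated app stays usable while translators catch up.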
As mentioned, translating your app is an ancient art form. One of the oldest, most tried-and-true tools for managing L10N is GNU gettext. Most Unix systems even come with this tool installed. If you’re on a Mac, you can go directly to a terminal and try it out.
Gettext basically works like this:
- gettext scans your codebase and extracts UI strings
- gettext puts the extracted UI strings in .pot files. The t there stands for “template”
- Each UI string, in its original readable form in your application, is now a MSGID, i.e., a message ID. The UI string in the original language you wrote your application in becomes a key or unique ID.
- Translations go into .po files, which are the same as the .pot file except populated with translations.
- Each language you want to translate into has its own .po file
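To make the file format concrete, here is a rough sketch of what a small .po file looks like and how its msgid/msgstr pairs map to a lookup table. The file contents, the source path in the `#:` comments and the `parseSimplePo` function are invented for illustration. The reader below is deliberately simplified: it only understands single-line entries and ignores headers, escapes, contexts and plurals, so a real project should use a proper gettext library instead.

```typescript
// A tiny, hypothetical de.po file. An empty msgstr means "not translated yet".
const dePo = `
#: src/components/Toolbar.tsx:12
msgid "Save changes"
msgstr "Änderungen speichern"

#: src/components/Toolbar.tsx:18
msgid "Cancel"
msgstr ""
`;

// Simplified reader: collects single-line msgid/msgstr pairs into a Map.
// Comment lines (starting with "#") and blank lines are skipped implicitly
// because they match neither pattern.
function parseSimplePo(source: string): Map<string, string> {
  const entries = new Map<string, string>();
  let currentId: string | null = null;
  for (const rawLine of source.split("\n")) {
    const line = rawLine.trim();
    const idMatch = /^msgid "(.*)"$/.exec(line);
    const strMatch = /^msgstr "(.*)"$/.exec(line);
    if (idMatch) {
      currentId = idMatch[1] ?? "";
    } else if (strMatch && currentId !== null) {
      entries.set(currentId, strMatch[1] ?? "");
      currentId = null;
    }
  }
  return entries;
}

const deEntries = parseSimplePo(dePo);
// deEntries.get("Save changes") is "Änderungen speichern"
// deEntries.get("Cancel") is "" (still untranslated)
```

Note how the readable English string itself is the key, which is exactly the MSGID idea described above.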
While the original GNU gettext command might not be a good fit for your project, the .po/.pot format is an excellent way of managing translations, with features supporting lots of different edge cases of L10N, such as differing plural forms in different languages.
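The plural-forms edge case is worth a quick illustration. In gettext, each language declares how many plural forms it has and a rule that maps a number to the index of the form to use. English has two forms, while Russian has three. The sketch below hand-translates those two rules into TypeScript. The `PluralLocale` type, the `pluralize` function and the `{n}` placeholder syntax are assumptions made for this example, not part of any gettext implementation.

```typescript
// One language's plural setup: a rule mapping n to a form index, plus the
// translated forms in index order (msgstr[0], msgstr[1], ... in a .po file).
interface PluralLocale {
  rule: (n: number) => number;
  forms: string[];
}

// English rule, as written in a .po header: nplurals=2; plural=(n != 1);
const english: PluralLocale = {
  rule: (n) => (n !== 1 ? 1 : 0),
  forms: ["{n} file", "{n} files"],
};

// Russian rule: nplurals=3, with separate forms for 1, 2-4 and everything
// else, except that 11-14 always take the third form.
const russian: PluralLocale = {
  rule: (n) =>
    n % 10 === 1 && n % 100 !== 11
      ? 0
      : n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20)
        ? 1
        : 2,
  forms: ["{n} файл", "{n} файла", "{n} файлов"],
};

const pluralLocales = new Map<string, PluralLocale>([
  ["en", english],
  ["ru", russian],
]);

// Pick the right plural form for n, falling back to English for unknown locales.
function pluralize(locale: string, n: number): string {
  const { rule, forms } = pluralLocales.get(locale) ?? english;
  const template = forms[rule(n)] ?? forms[0] ?? "";
  return template.replace("{n}", String(n));
}

const oneFile = pluralize("ru", 1); // "1 файл"
const threeFiles = pluralize("ru", 3); // "3 файла"
const elevenFiles = pluralize("ru", 11); // "11 файлов"
```

A naive `n === 1 ? singular : plural` check in application code would get Russian wrong, which is why the rule lives with the translations rather than in the source code.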
In short: You can follow the gettext spec using an implementation in your language of choice to better fit your toolchain.
Some things to keep in mind when writing your to-be-translated codebase:
- Don’t hardcode numbers or units into UI strings
- Don’t break up a sentence into many smaller strings
- Don’t concatenate many UI strings
- Use readable UI strings in the source code, not a key-like identifier for each string
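The advice about not splitting or concatenating strings is easier to see with an example. Word order differs between languages, so translators need the whole sentence with placeholders for the dynamic parts. The sketch below assumes a made-up `{placeholder}` syntax and a hypothetical `interpolate` helper. Libraries such as ttag provide their own mechanisms for this.

```typescript
// Fill {placeholders} in a (translated) template with runtime values.
// Unknown placeholders are left untouched so mistakes stay visible.
function interpolate(
  template: string,
  values: Record<string, string | number>,
): string {
  return template.replace(/\{(\w+)\}/g, (match: string, key: string) =>
    Object.prototype.hasOwnProperty.call(values, key)
      ? String(values[key])
      : match,
  );
}

// Avoid: concatenation bakes English word order into the code, and the
// fragments " uploaded " and " photos" are nearly impossible to translate
// on their own.
//   userName + " uploaded " + count + " photos"

// Prefer: one full sentence per UI string. The translator is free to move
// the placeholders, as the German version does with the verb.
const englishTemplate = "{user} uploaded {count} photos";
const germanTemplate = "{user} hat {count} Fotos hochgeladen";

const englishSentence = interpolate(englishTemplate, { user: "Ana", count: 3 });
// "Ana uploaded 3 photos"
const germanSentence = interpolate(germanTemplate, { user: "Ana", count: 3 });
// "Ana hat 3 Fotos hochgeladen"
```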
Pipe your raw UI strings here, then let translators translate them.
- Crowdin - not sure if it supports gettext, but it is used by some popular open source projects
- ttag - a JS implementation of gettext. Really awesome.
- gettext-parser - a bit lower-level library for parsing .po files
- react i18next - not based on gettext
- Format JS - not based on gettext