Do all of your users speak the same language? If so, are you simply ignoring or excluding users who speak different languages? Whether you are expanding an existing application into a new market, or you are launching a greenfield product, you may need to support multiple languages. But what does that mean?
Is it enough to translate the words? You can take a French car, translate all indicators to English, and drive it in the United Kingdom. But the steering wheel will still be on the wrong side of the car!
We are mostly talking about the "izations": Internationalization, localization, and globalization. These are often abbreviated to I18n, L10n, and G11n, respectively. While specific definitions vary, we will use the following definitions for our purposes, starting with defining Locale.
- Locale: the information needed to identify a user's language, cultural region, and other preferences for the UI. This is often just a language and country code. For example: en-GB, en-US, ja-JP.
- Internationalization: enabling an application or platform to be used in different languages and cultures. This is the enablement step, and it is required before the first non-native language can be supported. In the UI, this includes runtime support for switching out headings and icons based on the identified locale. We will discuss many additional considerations below.
- Localization: enabling an application to be used in a specific language and culture. This typically involves translations, formatting of dates and numbers, and currency conversion. Grammar rules, language direction, and culturally dependent images are also addressed with localization.
- Globalization: an umbrella term for Internationalization and Localization.
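To make the locale and internationalization definitions concrete, here is a minimal sketch of runtime string switching. The `MESSAGES` catalog, the `translate` helper, and the fallback order are assumptions made for illustration, and the parser only handles simple two-part tags such as en-GB, not every form a real locale tag can take.

```python
from typing import NamedTuple, Optional


class Locale(NamedTuple):
    language: str          # e.g. "en"
    region: Optional[str]  # e.g. "GB"; None when only a language is given


def parse_locale(tag: str) -> Locale:
    """Split a simple 'language-REGION' tag such as 'en-GB'."""
    parts = tag.replace("_", "-").split("-")
    language = parts[0].lower()
    region = parts[1].upper() if len(parts) > 1 else None
    return Locale(language, region)


# Hypothetical catalog: UI strings keyed by locale tag, then by message key.
MESSAGES = {
    "en": {"greeting": "Hello", "trash": "Trash"},
    "en-GB": {"trash": "Bin"},
    "ja": {"greeting": "こんにちは"},
}


def translate(key: str, tag: str, default_language: str = "en") -> str:
    """Look up a string, falling back from 'en-GB' to 'en' to the default."""
    locale = parse_locale(tag)
    candidates = []
    if locale.region:
        candidates.append(f"{locale.language}-{locale.region}")
    candidates.append(locale.language)
    candidates.append(default_language)
    for candidate in candidates:
        text = MESSAGES.get(candidate, {}).get(key)
        if text is not None:
            return text
    return key  # last resort: show the key so the gap is visible
```

In this sketch, internationalization is the lookup machinery, while localization is the act of filling in a catalog entry for a specific locale.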
The specific requirements for applications and platforms will vary widely. But it can be useful to think through the challenges with a framework. We can start with a basic 3-tiered application with a UI, API and database. The UI represents things that the user directly interacts with, such as screens, buttons, labels, etc. The API performs business logic and retrieves data. The database represents the data that is stored at rest, whether this is a file system, a traditional relational database, a graph database, or even a third party API.
Starting with the UI, we should consider:
- Anything with words
  - Text content
  - Text based menus
  - Help text
- Icons and other visual indicators
Anything that is displayed to the user as text needs special consideration. Menus, buttons and headings can often be approached with low-level translations. Help text and other lengthy descriptions to aid the user should be translated with context in mind, rather than a word-by-word translation.
Icons can be an opportunity to target multiple locales with the same resource. By selecting a more generic concept, specific localization efforts can sometimes be avoided. Properly chosen, one icon can work in many regions. For example, the icon for edit is often just a pencil.
How does the application identify the locale? Is this detected and defaulted? Can the user make a selection? How many languages are supported within each geographical area? A user might be French, but prefer to use the application in English. The translations might no longer apply, but date formatting and currency conversions would still need to be handled.
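One way to answer these questions is to keep the display language separate from the region used for formatting. The sketch below is only an illustration: `SUPPORTED_LANGUAGES`, `UserLocale`, and `resolve_locale` are hypothetical names, and the priority order (saved user choice, then detection, then default) is an assumption rather than a rule.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

SUPPORTED_LANGUAGES = ("en", "fr", "ja")  # hypothetical list of translated languages
DEFAULT_LANGUAGE = "en"


@dataclass(frozen=True)
class UserLocale:
    language: str  # drives translations
    region: str    # drives date, number and currency formatting


def resolve_locale(
    saved_language: Optional[str],
    detected_tags: Sequence[str],
    region: str,
) -> UserLocale:
    """Pick a UI language: explicit user choice, then detection, then default."""
    if saved_language in SUPPORTED_LANGUAGES:
        return UserLocale(saved_language, region)
    for tag in detected_tags:
        language = tag.split("-")[0].lower()
        if language in SUPPORTED_LANGUAGES:
            return UserLocale(language, region)
    return UserLocale(DEFAULT_LANGUAGE, region)
```

For the French user who prefers English, `resolve_locale("en", ["fr-FR"], "FR")` yields English text with French regional formatting.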
When handling data, we need to decide where to apply the locale transformation, and how the data is stored. If we are generating a report for a user, that report could look different based on dates, decimal separators, currencies, language direction, and even sorting based on something that is translatable. Essentially, reports are another form of UI and usually require the same globalization treatment as your interactive UI. The reporting logic would need to be extended to consider locale. Multiple API calls may need to be updated to accept this locale information.
Some data needs to be adjusted based on the user’s locale. Dates and times are adjusted based on time zones. Monetary values may need to be converted to local currencies. There are two main concerns with these types of data:
- Conversion - adjusting the value of the information based on the locale. With dates and times, this involves time zone adjustments. With money, it is converting from one currency to the equivalent value in another.
- Formatting - adjusting how the information is displayed. With dates, this involves the ordering of the months, days and years. With currency, the specific symbols used can change, as well as locations of periods (.) and commas (,).
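The difference between the two concerns can be sketched with monetary values. The separator table, the helper names, and the exchange rate used in the usage note are all made up for illustration; a real application would more likely rely on a maintained locale data source than on a hand-written table.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical, hand-written rules: (thousands separator, decimal separator).
SEPARATORS = {
    "en-US": (",", "."),
    "de-DE": (".", ","),
}


def convert(amount: Decimal, rate: Decimal) -> Decimal:
    """Conversion: change the value itself using an exchange rate."""
    return (amount * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def format_number(amount: Decimal, locale_tag: str) -> str:
    """Formatting: change only how the same value is written."""
    thousands, decimal_point = SEPARATORS[locale_tag]
    text = f"{amount:,.2f}"  # US-style separators as the starting point
    # Replace both separators in a single pass so they do not collide.
    return text.translate(str.maketrans({",": thousands, ".": decimal_point}))
```

Here `format_number(Decimal("1234.5"), "de-DE")` writes the same value as `1.234,50`, while `convert(Decimal("100"), Decimal("0.9"))` produces a different value altogether.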
Let’s consider dates and times. It is common to store date and time information in UTC, and only convert to the appropriate time zone in the UI. In that case, the API would not need the locale information for date and time handling; it is acting as a pass-through from the data storage to the UI. Dates and times are somewhat of a special case, because we have the ability to store them as an absolute time value, then convert later. In all cases, we still need to handle date formatting, which is independent of time zone conversions.
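A small sketch of that split, assuming a value stored in UTC and a user in Japan. The fixed offset and the `DATE_PATTERNS` table are simplifications chosen for illustration: Japan has no daylight saving time, but most zones need real time zone rules rather than a constant offset.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset used only to keep the sketch dependency-free.
JST = timezone(timedelta(hours=9), "JST")

# Hypothetical date patterns per locale (strftime syntax).
DATE_PATTERNS = {
    "en-US": "%m/%d/%Y",
    "en-GB": "%d/%m/%Y",
    "ja-JP": "%Y/%m/%d",
}

# Stored value: an absolute instant in UTC.
stored = datetime(2024, 3, 4, 22, 30, tzinfo=timezone.utc)

# Conversion: the same instant, but a different wall-clock time and calendar day.
local = stored.astimezone(JST)


def format_date(value: datetime, locale_tag: str) -> str:
    """Formatting: independent of whichever time zone the value is in."""
    return value.strftime(DATE_PATTERNS[locale_tag])
```

The converted value falls on March 5 in Japan even though the stored value is March 4 in UTC, and the ordering of day, month, and year is decided separately by the pattern.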
In these date examples, the locale transformation is applied in the UI. For the reporting example, the locale transformation is applied at a lower level in the API, or even in the database query. API internationalization depends on the data storage strategy, and the location of display transformations for the user.
Database (Data at rest)
The same data concept can take vastly different forms depending on locale. Phone numbers and addresses fall into this category. Consider postal codes. In the US they are 5 digits, unless you also include the additional 4-digit geographic segment. In Canada, they are six alphanumeric characters that alternate between letters and digits, with additional restrictions. Other countries have different restrictions and formatting, all for postal code. A thorough review of UI validation, API parameter checking and database constraints is a must. I18n requires consideration of data storage and validation throughout the technology stack.
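As a rough illustration of country-aware validation, the sketch below checks postal codes against per-country patterns. The patterns are deliberately simplified (the Canadian one ignores the restrictions on which letters may appear), and `is_valid_postal_code` is a hypothetical helper, not a complete validator.

```python
import re

# Simplified patterns for illustration; real postal rules have more exceptions.
POSTAL_CODE_PATTERNS = {
    "US": re.compile(r"\d{5}(-\d{4})?"),           # 12345 or 12345-6789
    "CA": re.compile(r"[A-Z]\d[A-Z] ?\d[A-Z]\d"),  # A1A 1A1, letter limits omitted
}


def is_valid_postal_code(code: str, country: str) -> bool:
    """Validate against the country's pattern; unknown countries are not rejected."""
    pattern = POSTAL_CODE_PATTERNS.get(country)
    if pattern is None:
        return bool(code.strip())  # no rule known: only require a non-empty value
    return pattern.fullmatch(code.strip().upper()) is not None
```

Accepting any non-empty value for unknown countries is a design choice in this sketch: it avoids blocking users in markets whose rules have not been added yet. The same rules would need to agree with any API checks and database constraints.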
One rule of thumb is that dates and times should be stored in the database in a consistent time zone (typically UTC). Likewise, it can be useful to have a master currency for monetary values. Even this suggestion may not always be ideal. The application might only have a context for local currencies. If that information is entered and viewed only within specific markets, it might not make sense to first convert it to your chosen master currency for storage. Conversion rates can change based on market fluctuations and accounting rules. These changes would need to be communicated to the user. An understanding of locale based workflows is needed in order to deliver a well balanced solution.
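If a master currency is used, one possible approach is to keep the entered amount alongside the converted one, together with the rate that was applied, so that later rate changes can be explained to the user. The types below, the rate in the usage note, and the choice of USD as master currency are illustrative assumptions only.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Money:
    amount: Decimal
    currency: str  # currency code such as "USD" or "EUR"


@dataclass(frozen=True)
class ConvertedMoney:
    original: Money   # what the user entered, kept unchanged
    converted: Money  # value in the master currency
    rate: Decimal     # rate that was applied
    rate_date: date   # when that rate was valid


def to_master(
    value: Money, rate: Decimal, rate_date: date, master: str = "USD"
) -> ConvertedMoney:
    """Convert to the master currency while preserving the original entry."""
    converted = Money((value.amount * rate).quantize(Decimal("0.01")), master)
    return ConvertedMoney(value, converted, rate, rate_date)
```

For example, `to_master(Money(Decimal("10.00"), "EUR"), Decimal("1.10"), date(2024, 3, 4))` keeps the 10.00 EUR entry and records an 11.00 USD equivalent with the made-up rate of 1.10.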
For other pieces of data, the application might be at its core a content management system (CMS). In this case, the user input from one locale will be consumed in different languages and regions. This could add the complexity of exposing data management in multiple languages to the end user. For example, a product catalog for a global storefront will need names, descriptions and details in multiple languages. This application may need to manage localization of data within the application, outside of the software development process. This becomes a user workflow to support localization.
There are multiple approaches here that might fit your needs. In some cases, localized data might be stored in parallel columns within a database table. The API layer might contain the logic to pull the appropriate columns based on the user's locale. Sometimes a more sophisticated lookup approach is used that enables an end-user to dynamically add new locales.
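The lookup approach might be sketched as follows, using an in-memory SQLite database. The table names, columns, sample rows, and the en-US fallback are assumptions for illustration. Compared with parallel columns, a new locale here is a new row rather than a schema change.

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product (id INTEGER PRIMARY KEY, sku TEXT NOT NULL);
    -- One row per product per locale, so new locales need no schema change.
    CREATE TABLE product_text (
        product_id INTEGER NOT NULL REFERENCES product(id),
        locale     TEXT    NOT NULL,
        name       TEXT    NOT NULL,
        PRIMARY KEY (product_id, locale)
    );
    INSERT INTO product VALUES (1, 'BIKE-01');
    INSERT INTO product_text VALUES (1, 'en-US', 'Bicycle'), (1, 'ja-JP', '自転車');
    """
)


def product_name(product_id: int, locale: str, fallback: str = "en-US") -> Optional[str]:
    """Return the name for the requested locale, else the fallback locale."""
    row = conn.execute(
        """
        SELECT name FROM product_text
        WHERE product_id = ? AND locale IN (?, ?)
        ORDER BY locale = ? DESC  -- prefer the requested locale over the fallback
        LIMIT 1
        """,
        (product_id, locale, fallback, locale),
    ).fetchone()
    return row[0] if row else None
```

A request for a locale with no row, such as fr-FR, falls back to the en-US name instead of failing.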
Adding support for a new user group can seem daunting. We can reduce the complexity by adopting a similar process that we might already use for regular features. Indeed, locale support is itself a set of features.
Yes, this does add overhead. The balance will be specific to your application or platform. Consider UI layout changes based on language direction. Many languages are read from left-to-right. English, of course, has a left-to-right direction. Arabic and Hebrew are right-to-left. Some languages can be written top-to-bottom, but these can also be flexible in their direction. In order to truly support global languages, the UI may need to consider layout changes based on locale. If, however, your user base is always one of two Latin-based languages, you may only need to support left-to-right languages.
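A minimal sketch of deriving layout direction from the locale is shown below. The set of right-to-left language codes is partial and illustrative, `text_direction` is a hypothetical helper, and vertical writing is not handled.

```python
# Partial, illustrative set of right-to-left language codes.
RTL_LANGUAGES = {"ar", "he", "fa", "ur"}


def text_direction(locale_tag: str) -> str:
    """Return 'rtl' or 'ltr' for a tag such as 'ar-EG' or 'en-GB'."""
    language = locale_tag.split("-")[0].lower()
    return "rtl" if language in RTL_LANGUAGES else "ltr"
```

The UI could use the returned value to mirror its layout, and an application limited to left-to-right languages can skip this step entirely.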
We can be well on our way by adopting an internal culture of consideration and verification. As designers and developers, consider different languages and regions for every feature we build. Are capabilities lacking in the core system that prevent locale specific displays? As part of QA, verify that features respond appropriately to all supported locales.
Necessary process steps to support localization:
- Design for locales
  - Add locale considerations into feature design and definition.
- Identify translation authors
  - These authors will need the workflow context in order to appropriately translate concepts, not just words. Ideally, they would participate in feature definition.
  - Watch out for decisions that are based on language fluency alone and ignore dialects and cultural norms. Here are 4 ways to ask “What is your name?” in Spanish, the first two informal and the last two formal:
    - ¿Cómo te llamas?
    - ¿Cuál es tu nombre?
    - ¿Cómo se llama?
    - ¿Cuál es su nombre?
- Define verification processes.
  - During testing, who will be responsible for locale specific testing?
  - Will this be a post-QA step centered on specific locales?
  - Or, will each feature have a separate validation step that checks-off each supported locale?
  - Some combination of both?
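Part of the verification step can be automated. As a rough sketch, the check below reports which message keys are missing for each supported locale. The `CATALOGS` data and the choice of English as the reference locale are assumptions for illustration, and a check like this complements review by a fluent speaker rather than replacing it.

```python
from typing import Dict, List

# Hypothetical catalogs; "en" is treated as the reference locale.
CATALOGS: Dict[str, Dict[str, str]] = {
    "en": {"save": "Save", "cancel": "Cancel"},
    "es": {"save": "Guardar", "cancel": "Cancelar"},
    "ja": {"save": "保存"},
}


def missing_translations(
    catalogs: Dict[str, Dict[str, str]], reference: str = "en"
) -> Dict[str, List[str]]:
    """Map each locale to the reference keys it does not translate yet."""
    expected = set(catalogs[reference])
    report: Dict[str, List[str]] = {}
    for locale, messages in catalogs.items():
        missing = sorted(expected - set(messages))
        if missing:
            report[locale] = missing
    return report
```

Run as part of a build or a per-feature checklist, an empty report means every supported locale has at least some text for every key.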
Where are you on the spectrum of adoption? Has I18n been a core tenet of your business from the start? Or, are you expanding a local product internationally?
Does your application have CMS tendencies? Does the user input itself need to exist simultaneously in multiple locales?
Do your user workflows cross locales? Does your system need to translate concepts at runtime from one locale to another? Or, does your system simply need the ability to support multiple (but mostly independent) locales?
Enablement != support. I18n is the (largely) one-time cost of enabling your platform to support multiple languages and regions. Sometimes this is baked into the development process from the start. Other times it is a specific effort within an organization for an application or a suite of applications. In either case, the ongoing cost for localization will still need to be handled. Specific languages will need to be supported, and new features will need to properly handle all supported languages.