Given
the constant competitive pressure on executives to expedite
product time-to-market, many developers are given tight
deadlines to deliver functional software. This software
is often geared for localization once the source language
version is ready for release.
With these pressures in mind, developers
should follow a few basic principles while writing
software so that localization goes smoothly
and time-to-market requirements are met for all the required
languages, not just the source.
Here are 12 do’s and don’ts that all developers
should read and apply in their work.
DO EXTERNALIZE MESSAGES IN MESSAGE
CATALOGS, RESOURCE FILES, AND CONFIGURATION FILES
Messages are textual objects that are translatable components.
These catalogs or files, such as Java resource bundle message
files or Microsoft resource files, are installed in a locale-specific
location or named with a locale-specific suffix.
This practice will facilitate the localization process,
since localizers can work on these resource bundles without
the need to modify source code. It will also permit the
use of a single source code for all languages, where only
the resource bundles will have different language flavors.
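As a minimal sketch of this practice in Python: the message IDs, the locale codes, and the helper name get_message are all hypothetical, and the catalogs are inline dictionaries only to keep the example self-contained. In a real product each catalog would live in its own locale-specific file.

```python
# Hypothetical message catalogs, one per locale. In a real product each
# dictionary would be loaded from a locale-specific resource file.
CATALOGS = {
    "en": {"file.open": "Open", "file.save": "Save"},
    "fr": {"file.open": "Ouvrir", "file.save": "Enregistrer"},
}

DEFAULT_LOCALE = "en"


def get_message(key: str, locale: str) -> str:
    """Look up a message by ID, falling back to the source language."""
    catalog = CATALOGS.get(locale, {})
    if key in catalog:
        return catalog[key]
    return CATALOGS[DEFAULT_LOCALE][key]


print(get_message("file.save", "fr"))  # translated catalog entry
print(get_message("file.save", "de"))  # no German catalog: source text
```

The calling code only ever refers to message IDs, so adding a language means adding a catalog, not touching the source.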
DON’T INTERNATIONALIZE FIXED TEXTUAL OBJECTS
These are objects that should not be translated, such as
comments, commands, and configuration settings. Only externalize
the strings needing translation. If these objects appear
in resource or configuration files, they should be marked
"NOT_FOR_TRANSLATION."
Here are some examples of fixed textual objects:
- User names, group names, and passwords
- System or host names
- Names of terminals (/dev/tty*), printers,
and special devices
- Shell variables and environment variable
names
- Message queues, semaphores, and shared
memory labels
- UNIX commands and command line options
(e.g., ls -l is still ls -l in all locales)
- Commands such as /usr/bin/dos2unix and
/usr/ccs/bin/gprof
- Commands that are XPG4-compliant (in
/usr/xpg4/bin/vi) and have equivalent non-XPG4 commands;
the non-XPG4 commands are not fully internationalized.
For example, /usr/bin/vi does not process non-EUC codesets,
but /usr/xpg4/bin/vi is fully internationalized and can
process characters in any locale.
- Some GUI textual components, such as
keyboard mnemonics and keyboard accelerators
DO ALLOW FOR TEXT EXPANSION IN MESSAGES
(ESPECIALLY FOR GUI ITEMS).
Here are some Microsoft translations into
German:
- bundle -> Einzelvorgangsbündel
- Link -> Verknüpfung
- Login -> Anmeldung
- Update -> Aktualisierung
- Undo -> Rückgängig (machen)
- Geschäftsaktivitätsüberwachung replaces
the acronym BAM (Business Activity Monitoring)
Apply the following expansion rules when
possible.
When the source text is:
- 0 - 10 characters: The expansion required
is from 101 - 200%.
- 11 - 20 characters: 81 - 100%
- 21 - 30 characters: 61 - 80%
- 31 - 50 characters: 41 - 60%
- 51 - 70 characters: 31 - 40%
- Over 70 characters: 30%
But keep the string length well below your
limit (usually 254 characters) to account for the extra
characters needed.
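The expansion rules above can be turned into a small helper for sizing buffers or GUI fields. This is only a sketch: the function names are hypothetical, and taking the upper end of each range is a conservative assumption, not something the rules themselves prescribe.

```python
# Upper end of the expansion range for each source-length band listed
# above, as (maximum source length, expansion in percent).
EXPANSION_BANDS = [(10, 200), (20, 100), (30, 80), (50, 60), (70, 40)]
OVER_70_EXPANSION = 30


def expansion_percent(source_length: int) -> int:
    """Return the assumed worst-case expansion for a source string length."""
    for max_length, percent in EXPANSION_BANDS:
        if source_length <= max_length:
            return percent
    return OVER_70_EXPANSION


def space_to_reserve(source: str) -> int:
    """Characters to reserve for a translation of source, rounded up."""
    length = len(source)
    extra = (length * expansion_percent(length) + 99) // 100
    return length + extra


print(space_to_reserve("Update"))  # 6 characters plus 200% -> 18
```

With this estimate, the 6-character "Update" gets 18 characters of room, enough for the 14-character German "Aktualisierung" from the list above.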
Try to place the labels above the controls,
not beside them. The expansion of a label can increase the
width of the form more than the expected resolution, which
will force horizontal scroll bars or cause truncation. This
also simplifies localizing applications into bidirectional
languages (right-to-left [RTL] languages that also contain
left-to-right [LTR] text, such as Arabic and Hebrew).
DON’T USE VARIABLES WHEN YOU CAN AVOID
THEM
Variables create questions in the translator’s
mind as to the gender of the term to substitute, making
it difficult to correctly translate the sentences that
incorporate them. If variables are to be used, offer a list
of replacements. Also allow for gender and plural variations
in the translation of the sentences that incorporate the
variable.
For example:
<% If err = 400 Then
errtext = "server"
Else
errtext = "connection"
End If %>
<P> The <%=errtext%> is currently unavailable </P>
While this displays grammatically correct
sentences in English, the translation in French will be
problematic. In French, the word "server" (serveur) is masculine,
while the word "connection" (connexion) is feminine. Because the
article "the" must agree with the gender of the noun (le or la),
the translator cannot pick one correct article for a sentence
whose noun is substituted at run time.
The code should be instead:
<% If err = 400 Then %>
<P> The server is currently unavailable </P>
<% Else %>
<P> The connection is currently unavailable </P>
<% End If %>
At the same time and for similar
reasons, don’t use composite strings. A composite string
is an error message or other text that is dynamically generated
from partial sentence segments and presented to the user
in full sentence form. Use complete sentences instead, even
at the expense of repeating segments. This will ensure the
accuracy of the translation, regardless of gender, plurality,
conjugation, or sentence structure.
Also, avoid using identical placeholders
when a string contains multiple variables, since
sentence structure changes between languages.
For example,
"Total %s, %s of %s" (as in Total
5, 1 of 5) might read "5 of 1, Total 5" in the
translated text if the translator reorders the segments.
Instead, use numbered placeholders (e.g.,
"Total %1, %2 of %3").
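A short Python sketch of the same idea follows. Python's str.format uses {0}, {1}, {2} where the example above writes %1, %2, %3; the REORDERED string is a hypothetical stand-in for a translation that changes the word order, and render is an illustrative helper name.

```python
# Hypothetical catalog entries. The translator is free to reorder the
# numbered placeholders without any change to the calling code.
SOURCE = "Total {0}, {1} of {2}"
REORDERED = "{1} of {2}, Total {0}"  # stands in for a translated string


def render(template: str, total: int, current: int, count: int) -> str:
    """Fill the numbered placeholders; argument order never changes."""
    return template.format(total, current, count)


print(render(SOURCE, 5, 1, 5))     # Total 5, 1 of 5
print(render(REORDERED, 5, 1, 5))  # 1 of 5, Total 5
```

Because each value is tied to a number rather than to its position in the string, the reordered sentence still shows the right value in the right place.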
DO PERFORM PSEUDO-TRANSLATION
Pseudo-translation is the process of replacing or adding
characters to your software strings to detect character
encoding issues and hard-coded text remaining in the source
files.
Here’s an example of a few strings from a C resource file,
with their respective pseudo-translations in Japanese:
IDS_TITLE_OPEN_SKIN "Select Device"
IDS_TITLE_OPEN_SKIN "Slct Dvc"
IDS_MY_OPEN "&Open"
IDS_MY_OPEN "&Opn"
In these strings, Japanese characters replace the vowels
in all English words. After compilation, testers can easily
detect corrupt characters (junk characters replacing the
Japanese characters) or strings that remain fully in English
(source strings still embedded in the code).
[Figure: English original dialog.]
[Figure: The dialog after pseudo-translation into French and insertion of a
pre- and post-text character (_). Note that the post-text character
is truncated at the end of many strings.]
[Figure: The pseudo-translated dialog after resizing. Note the correct
resizing, which makes the post-text character (_) visible at the end of
each string.]
In the previous example, text is pseudo-translated into
French and shown before and after the resizing of the dialog
box. Sound pseudo-translation techniques also extend each
string by a chosen percentage (15% in the above example) and
add fixed pre- and post-delimiters to help identify the
beginning and end of each string.
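The technique described above can be sketched in a few lines of Python. The accented-vowel table, the "~" padding character, and the function name pseudo_translate are assumptions made for illustration; a production tool would also need to skip format placeholders and other non-translatable tokens.

```python
# Assumed substitution table: each ASCII vowel becomes an accented letter,
# so any string that still shows plain vowels was never externalized.
VOWEL_MAP = str.maketrans("aeiouAEIOU", "àéîöûÀÉÎÖÛ")


def pseudo_translate(text: str, expansion_percent: int = 15,
                     delimiter: str = "_") -> str:
    """Swap vowels, pad the string, and wrap it in fixed delimiters."""
    swapped = text.translate(VOWEL_MAP)
    # Round the padding up so even short strings grow by one character.
    pad_length = (len(text) * expansion_percent + 99) // 100
    return delimiter + swapped + "~" * pad_length + delimiter


title = pseudo_translate("Select Device")
button = pseudo_translate("&Open")
```

A tester who sees the leading delimiter but not the trailing one knows the string was truncated; a string with no delimiters at all is still hard-coded.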
DON’T USE IF CONDITIONS OR RELY ON A SORT ORDER
IN YOUR CODE TO EVALUATE A STRING VALUE.
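For example, avoid (IF Gender = "Male" THEN): the test fails as soon
as the displayed value is translated, and sort order also changes from
one language to another. Always depend on enumerations or unique IDs.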
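As a hedged illustration in Python, the sketch below keeps the program logic on an enumeration and treats the displayed labels as translatable data. The Gender enum, the LABELS table, and avatar_file are hypothetical names invented for this example.

```python
from enum import Enum


class Gender(Enum):
    MALE = 1
    FEMALE = 2


# Display labels are translatable; the enum members are not.
LABELS = {
    "en": {Gender.MALE: "Male", Gender.FEMALE: "Female"},
    "fr": {Gender.MALE: "Homme", Gender.FEMALE: "Femme"},
}


def avatar_file(gender: Gender) -> str:
    """Branch on the stable ID, never on the displayed text."""
    if gender is Gender.MALE:
        return "avatar_m.png"
    return "avatar_f.png"


print(LABELS["fr"][Gender.MALE], avatar_file(Gender.MALE))
```

The condition behaves identically in every locale because it never looks at the label the user sees.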
DO USE UNICODE FUNCTIONS AND METHODS TO SUPPORT ALL
SCRIPTS
Applications that store and retrieve text data need to
accept and display the characters from any given language.
Using Unicode encoding solves the problem of unsupported
character sets and the display of junk characters.
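To make the point concrete, here is a small Python sketch that checks whether sample strings survive a given encoding. The sample strings and the helper name can_encode are illustrative assumptions; latin-1 stands in for any single-byte legacy code page.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


samples = ["Verknüpfung", "مرحبا", "日本語"]

# UTF-8 can represent every sample; the legacy code page cannot.
utf8_ok = [can_encode(text, "utf-8") for text in samples]
latin1_ok = [can_encode(text, "latin-1") for text in samples]

# A round trip through UTF-8 returns exactly the original text.
round_trips = [text.encode("utf-8").decode("utf-8") == text
               for text in samples]
```

Only the German sample fits the single-byte code page, while all three survive a UTF-8 round trip unchanged.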
DON’T INSERT HARD CARRIAGE RETURNS IN THE MIDDLE
OF SENTENCES
Translation memory tools key off hard returns and assume
that the sentence has ended. Inserting them in the middle
of a sentence leads to incomplete sentences in the translation
database and corrupts the sentence structure in the target
language files. Instead, replace hard returns with soft
returns (or better yet, use a break tag of some sort, such
as <BR>).
Also be aware that sentence structure, as well as the length of
sentence parts, changes between languages, so
additional breaks may be needed in target languages.
DO CHOOSE YOUR THIRD-PARTY SOFTWARE PROVIDER CAREFULLY
Insist they support Unicode and comply with the above practices.
Often problems are encountered with third-party software,
and the fact that you don’t have control over their
code to fix the problems makes the localization tasks particularly
difficult.
Third-party tools are often already localized.
If so, this will help save on your localization costs. Ask
your software provider to give you access to the localized
files and glossaries. If they are not localized, pseudo-translation
is a good technique for quickly testing
for obvious issues.
DON’T USE TEXT IN ICONS AND BITMAPS
The translated text may be too long to fit.
Also, avoid using symbols with cultural connotations and
locale-specific idioms.
In general, create graphics in a vector format, using a tool
like Adobe Illustrator, Macromedia Freehand, or Corel Draw.
If a bitmap format is required, keep the text on a separate
layer, as Adobe Photoshop allows.
Completely "flat" bitmap file formats like GIF, JPEG, or
PNG are harder to localize, since replacing the source text
with the target text may distort the graphic.
The basic gray background behind "QA Completed"
enables easy substitution for translated text. However the
purple QA will be problematic to substitute with AQ for
French or QS for German.
DO USE LONG DATES OR MONTH ABBREVIATIONS
INSTEAD OF NUMBERS WHEN IDENTIFYING DATES
Month vs. day orders in different parts
of the world vary (e.g., mm/dd/yy in the US; dd/mm/yy in
Europe).
Also, the use of AM and PM, as in the above
source dialog box, will be problematic in much of Europe, where
a 24-hour format is used instead of two 12-hour periods
(AM and PM).
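The ambiguity is easy to demonstrate in Python. In this sketch the sample date, the MONTH_ABBR table, and the helper name unambiguous are assumptions; in a real product the month abbreviations would themselves come from the translated message catalog.

```python
from datetime import datetime

NUMERIC = "03/04/05"

# The same digits give two different dates depending on the convention.
as_us = datetime.strptime(NUMERIC, "%m/%d/%y")        # March 4, 2005
as_european = datetime.strptime(NUMERIC, "%d/%m/%y")  # 3 April 2005

# English abbreviations for illustration only.
MONTH_ABBR = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def unambiguous(moment: datetime) -> str:
    """Format a date with a month abbreviation instead of a number."""
    return f"{moment.day} {MONTH_ABBR[moment.month - 1]} {moment.year}"


print(unambiguous(as_us))        # 4 Mar 2005
print(unambiguous(as_european))  # 3 Apr 2005

# A 24-hour clock avoids the AM/PM markers altogether.
print(datetime(2005, 3, 4, 15, 30).strftime("%H:%M"))  # 15:30
```

Once the month is spelled or abbreviated, a reader in any region interprets the date the same way.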
DON’T ALPHABETICALLY SORT STRINGS IN
STRING TABLES AND RESOURCE BUNDLES
Strings that belong to the same dialog or feature usually sit together
in a string table, and sorting them alphabetically destroys that grouping.
Try to offer as much context as you can with the externalized
strings. This will help the translator better adapt the
translation to that context. If context is non-existent,
run-time QA will take much longer to correct the translations.
For example: "Update" could be the action (to update) or
the software itself. "Check" in financial software could
be the action (noun or verb) or the payment instrument.
"Email" could be a verb or a noun.
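One way to carry that context along, sketched in Python: each string keeps a note for the translator, and the export preserves the original grouping rather than sorting. The Entry type, the keys, the notes, and the export format are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    key: str       # unique ID used by the code
    source: str    # source-language text
    context: str   # note for the translator


# Entries stay in the order of the user interface, not alphabetical order.
ENTRIES = [
    Entry("menu.help.update", "Update",
          "Menu command (verb): check for a newer version"),
    Entry("dialog.about.update", "Update",
          "Noun: the downloaded software update"),
]


def export_for_translation(entries: list) -> str:
    """Emit each string preceded by its context note, order preserved."""
    lines = []
    for entry in entries:
        lines.append(f"# {entry.context}")
        lines.append(f'{entry.key} = "{entry.source}"')
    return "\n".join(lines)


print(export_for_translation(ENTRIES))
```

Both entries share the source text "Update", yet the translator can tell from the note which one is the verb and which one is the noun.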
CONCLUSION
Keep in mind that once you localize your product, the amount
of work that will be needed for debugging and fixing any
problems has the tendency to be multiplied by the number
of languages that you support. Following these principles
will not only expedite localization, but more importantly
reduce testing, rework, and quality assurance time and costs
- ultimately allowing your company to meet the strict time-to-market
requirements expected by your executives.
Don’t short-circuit your localization activities by side-stepping
these issues. If you need help, consider involving experts.
ABOUT THE AUTHOR
Micheline Freij is Operations
Director at GlobalVision International, Inc. www.globalvis.com,
a Software Localization and Translation specialist. She
is trilingual and holds a BS double-major in Computer Science
and Math from RI College. Ms. Freij has worked for 15 years
in the software and localization industries. She has traveled
to and lived in many countries.
She can be reached at Micheline@globalvis.com.
ClientSide News Magazine - www.clientsidenews.com