
The Guide to Translation and Localization: Engineering and Computer-aided Tools









Chapter 6: Engineering and Computer-aided Tools

Roles of the Engineering Group

True localization is a multi-discipline activity that includes linguistics, formatting, engineering, and quality assurance. Engineering is an integral component of this service and is one of the services offered by a localization provider that differentiates simple translation from comprehensive localization.

Localization engineers are involved at every stage of the localization process. Often they consult on internationalization matters before the materials are even developed. Once source files are created, the engineers' analysis of them provides vital information for project planning and budget estimation. Linguists rely on engineers to extract text strings from source content and prepare marked-up files to facilitate translation. They also manage the ensuing translation memory and use tools such as Trados, Catalyst, and Multilizer to improve consistency and lower costs. Prior to delivery, engineers may perform the functional testing of the localized products.

At Lingo Systems, members of our engineering group closely interact with the other production departments to provide further support. For our formatting group, they import and export text from desktop publishing applications. For our QA department, they perform functional testing of technical projects such as user interfaces, websites, and help systems. And they are always available for a quick game of pool over lunch.

First Things First: Internationalization

Many companies develop their products with only a U.S. customer in mind. When these domestic products are slated for distribution to foreign markets, the process of localization often reveals limitations in the product design. Internationalization is the process of engineering a product so that it can be localized for export to any country.

Often, internationalization is quite simple. For example, some languages use more characters and take up more space than others. A properly internationalized source file will leave room for text expansion. Another common internationalization step is to resize an 8 1/2" x 11" document to European A4 paper size.
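As a rough illustration of the text expansion point, the sketch below compares the length of a source string with its translation and checks it against a field width. The sample strings and the six-character limit are hypothetical, and character count is only a proxy for rendered width, which also depends on the font.

```python
def expansion_ratio(source: str, target: str) -> float:
    """Length of the translated string relative to the source string."""
    return len(target) / len(source)


def fits(target: str, max_chars: int) -> bool:
    """True if the translated string fits a field of max_chars characters."""
    return len(target) <= max_chars


# Hypothetical button label: English "Save" (4 chars), German "Speichern" (9 chars).
ratio = expansion_ratio("Save", "Speichern")
print(f"expansion: x{ratio:.2f}")          # expansion: x2.25
print(fits("Speichern", max_chars=6))      # False: the button must grow
```

Short strings such as button labels tend to expand far more than running text, which is one reason fixed-width controls cause trouble.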

In addition to considering overall design and layout, the internationalization process focuses on, but is not limited to, the following points:


Cedric Vezinet

Director of Engineering

After 10 years in this industry, I still find every day as exciting as the very first one on October 9th, 1996 when I started my Lingo career as a French linguist. By looking at my pictures in the previous versions of the guide, one can tell that I have lost a lot of hair over the years but I have not lost the motivation.

1) Does the design account for cultural differences in various metrics such as currency, units of measure, date format, phone numbers, and addresses?

2) Are all the localizable strings isolated from variables and other code for easy extraction?

3) Are unique strings re-used in different contexts throughout the product?

4) Is the product free of embedded and concatenated strings?

5) Is the interface designed for dynamic layout so that it can accommodate text expansion?

6) Do automated lists take into account any sorting order differences in the target locale?

To avoid internationalization surprises, involve your localization provider during the product design stage so that localization requirements can be taken into consideration during development. If this is not done, your localization vendor will likely have to perform some product internationalization prior to beginning localization. This may not only compromise timelines, but may also have an adverse effect on your budget.

Encoding: Pick Your Poison

A major question to address when you begin the internationalization process is: can or will your application use some flavor of Unicode as its encoding format? Before Unicode was invented, there were dozens of different encoding systems. No single one contained enough characters to represent every possible language. For example, the European Union alone required several different encodings just to cover its languages. Even for a single language like English, no single encoding was adequate for all the letters, punctuation, and technical symbols in common use.

To add to the challenge, many of these encoding systems also conflict with one another. That is, two encodings will use the same numeric assignment for two different characters, or use different numeric assignments for the same character. Computers (especially servers) must be able to support many different encodings - but it still may not be enough. Whenever data is passed between different encodings or platforms, it runs the risk of being corrupted.
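The conflict described above is easy to reproduce. In this minimal sketch, the same byte is read under two legacy code pages (Latin-1 and Windows-1251, chosen here only as examples), and UTF-8 data is then misread as Latin-1 to show the kind of corruption that occurs when data crosses encodings.

```python
# One byte, two legacy encodings, two different characters.
raw = bytes([0xE9])
as_western = raw.decode("latin-1")    # 'é' (Western European)
as_cyrillic = raw.decode("cp1251")    # 'й' (Cyrillic)

# Data written as UTF-8 but read back as Latin-1 comes out corrupted.
corrupted = "café".encode("utf-8").decode("latin-1")   # 'cafÃ©'

print(as_western, as_cyrillic, corrupted)
```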

Unicode eliminates most of these problems. It is well established, works on all platforms, and supports many more characters than most of us have ever heard of or will ever use. Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language. It also allows data to be transported between many different systems without corruption. Due to the natural progression of technology, there are several Unicode encoding forms: UTF-7, UTF-8, UTF-16, and UTF-32, with UTF-16 and UTF-32 each available in big-endian (BE) and little-endian (LE) byte orders. In general, UTF-8 is the most common on the Web. UTF-16, UTF-16LE, and UTF-16BE are mostly used by Java and Windows. UTF-32, UTF-32LE, and UTF-32BE are mostly used by various UNIX systems. Fortunately, the conversions between all of them are algorithmic and quick to implement.
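To make the distinction concrete, here is a small sketch that encodes one character (the euro sign, U+20AC, picked arbitrarily) in several Unicode encoding forms. The code point stays the same, only the bytes differ, and each form converts back without loss. The helper name `encode_all` is ours.

```python
ENCODINGS = ("utf-8", "utf-16-le", "utf-16-be", "utf-32-be")


def encode_all(text: str) -> dict[str, str]:
    """Return the hex bytes of text in each encoding form, checking the round trip."""
    result = {}
    for enc in ENCODINGS:
        data = text.encode(enc)
        assert data.decode(enc) == text   # conversion is lossless
        result[enc] = data.hex(" ")
    return result


for enc, hex_bytes in encode_all("\u20ac").items():
    print(f"{enc:10} {hex_bytes}")
# utf-8      e2 82 ac
# utf-16-le  ac 20
# utf-16-be  20 ac
# utf-32-be  00 00 20 ac
```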

Pick Early, Test Often

Many internationalization issues can be identified early in the development process by performing internationalization testing of the source material. Machine translation (MT) technology is often used for this purpose since it can generate pseudo-translated content that has the look and characteristics of translated material without a costly investment in translation. MT is based on advanced computational linguistic analysis and, because it is inexpensive, can quickly generate large volumes of translated content for testing purposes. Such testing can help pinpoint issues in the localization project before they become major headaches. For instance, a pseudo-translation can identify variables in the software that should not be translated, allowing you to isolate them prior to actual linguistic work.
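Pseudo-translation does not strictly require an MT engine. The sketch below uses plain character substitution instead, which is a simpler stand-in technique rather than the MT-based approach described above: it accents the vowels, pads the string to simulate expansion, and leaves placeholders untouched. The placeholder patterns (`%s`, `%d`, `{name}`) and the 30% padding are assumptions for illustration.

```python
import re

# Map plain vowels to accented ones so untranslated strings stand out.
ACCENTED = str.maketrans("aeiouAEIOU", "àéîöüÀÉÎÖÜ")
# Assumed placeholder styles: printf-like %s / %d and {named} fields.
PLACEHOLDER = re.compile(r"(%[sd]|\{[^{}]*\})")


def pseudo_translate(text: str, expansion_pct: int = 30) -> str:
    """Return a pseudo-translated copy of text with placeholders preserved."""
    parts = PLACEHOLDER.split(text)
    # Because the pattern has a capturing group, placeholders sit at odd indexes.
    out = [p if i % 2 else p.translate(ACCENTED) for i, p in enumerate(parts)]
    padding = "~" * (len(text) * expansion_pct // 100)
    return "[" + "".join(out) + padding + "]"


print(pseudo_translate("Open {file}"))   # [Öpén {file}~~~]
```

The brackets make truncation visible on screen: if the closing bracket is missing in the running product, the field is too small.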

It is important to note that MT has many drawbacks when it comes to actual translation (e.g., it requires the use of constrained vocabulary or it may not convey complex or abstract concepts), but it is a very valuable tool in the internationalization phase.

Internationalization is not a service commonly offered by localization companies, as it requires highly skilled and specialized personnel with a very strong understanding of the platforms and development environments being used. Moreover, a well-executed internationalization review will not necessarily rid your files of all potential localization headaches - but it will reduce them to a manageable level and avoid the introduction of additional defects during the localization process.

In general, the difference between a successful project and one plagued by problems is a direct function of the amount of interaction between the client's and the vendor's engineering departments in the early stages of the project. Internationalization evaluation and testing is a very cost-effective way to ensure that your product is ready for localization - especially when measured against the delays and costs associated with trying to resolve these issues during the localization process.

On Our Way: Localization Begins

Once all internationalization issues have been addressed, the localization process can begin on a good foundation. For the engineering group, this usually means preparing the source files for translation. How this is done varies depending on the type of materials. The four main categories are: documentation localization, help localization, UI localization, and web localization.

Page by Page: Documentation

Want to see us pull a rabbit out of our hat? Well, perhaps that's a stretch, but this is where the magic starts. Imagine you have just purchased the latest and greatest Widget. The first thing you do is read the manual, right? (C'mon, work with us here.) Now, imagine lifting all the English out of that manual, crunching it all up and then carefully unfolding it to reveal a brand new language. It's a bit of a strange notion, but that's pretty much how document localization works. In the simplest terms, documentation engineering is the process of importing and exporting text from a desktop publishing application.


Chris van Grunsven

Senior Localization Engineer

In my many years here, I have become the "Keeper of Useless Knowledge," like: How to count to 31 with one hand, 1,023 with two hands. How to convert Word 6.0 RTFs to Word 2.0 RTFs, with a text editor so they will work on Windows 3.11. Where the copies of Swedish Windows 3.0, Japanese Word 6.0, and the DOS 4.0 user's manuals are shelved. And, if you can't find the holiday decorations, I know where they are, too.

Since most translators work within Microsoft Word using Computer-Aided Translation (CAT) software, the source material (which can be in any medium) must be converted to an RTF file or TTX file while preserving the formatting of the original document in order for the linguist to be able to work with it. This is done by using different tagged text formats (codes) to isolate the formatting from the translatable text. By protecting the formatting, the translators can then use their CAT tools and focus exclusively on what needs to be translated without being confused by formatting codes, which can be very numerous (especially in the case of Quark documentation).

The vast majority of documentation is developed using Adobe InDesign, Adobe FrameMaker, QuarkXPress, or Adobe PageMaker. As you can see, Adobe Systems Inc. has quite a few different writing tools, but as time goes by, they seem to be moving toward one versatile application that will address all documentation development needs. In January 2004, Adobe began transitioning its PageMaker users to InDesign. If InDesign continues to gain popularity in the technical writing community, it will make the localization process much easier. InDesign is a terrific application for localization. It offers full Unicode support and is well-suited for cross-platform work. It also allows for XML integration with content management systems. We must admit, however, that there are quite a few deficiencies in the INX (InDesign Exchange) and Tagged Text formats, which make the localization process a little bit tricky. Fortunately, we always have a bit of engineering magic up our sleeves to solve the problem.

Regardless of the application used to develop the materials, when the RTF or TTX files come back from translation they head straight to Engineering. With a wave of a wand and some feverish keyboard tapping, engineers pour the localized text back into the source documents and hand them off to the DTP department where they are polished to perfection.

Stop the Presses: Help File Localization

As a means of disseminating information, print documentation is quickly losing ground to interactive help systems. Well-structured online help provides users with incredible search capability, allowing them to find more information in less time than with conventional print documentation. Many help users say this leads to a richer experience. We could not agree more.

Help systems are not only getting bigger, they are getting smarter. Perhaps most importantly, however, they are becoming easier and less costly (if not downright cheap) to create. Single-source publishing tools such as AuthorIT, ArborText, WebWorks, or RoboHelp are now able to import previously generated Word or FrameMaker documents and then leverage them to create interactive help systems. Let's note here that localization-savvy applications such as AuthorIT, which offer built-in localization support, make the translation process a walk in the park. As more companies discover these benefits, this trend will only accelerate. The main help formats we see being used are WinHelp, HTML Help, WebHelp, JavaHelp, Oracle Help, and the relatively new FlashHelp. Even though all these formats have their own specific uses, when it comes to localizing help systems, the approach is similar.

Interface This: Software Localization

An engineering group really shines during the localization of software. We take on all comers: any flavor of Windows, Mac OS, UNIX, Linux, Palm OS, Symbian, mainframe, and Java based applications. And we will take any variety: web-based, server-based, or client-based.

For some programming language and platform combinations, software localization requires a process not unlike the one used for documentation. The localization engineer extracts the text from the application and then creates a tagged RTF or TTX file for the translator that protects the underlying codes. When it comes to protecting the codes, TTX is by far the better choice. In other cases, the localization engineer uses a proprietary tool or off-the-shelf application like Catalyst or Multilizer that allows the translator to work directly on compiled files and executables. All things being equal, however, it is more common and easier to work in the resource (RC) files or properties files to minimize the amount of preparation work and reduce the potential for defects being introduced during the localization process.
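As a simplified picture of working directly in properties files, the sketch below pulls the key/value pairs out of properties-style text so that only the values go to the translator. It is a minimal reader written for illustration: it handles comments and `key=value` lines only, not the escapes, `:` separators, or line continuations that real Java properties files allow. The sample keys are hypothetical.

```python
def parse_properties(text: str) -> dict[str, str]:
    """Extract key/value pairs from simple properties-style text."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#!":      # skip blank lines and comments
            continue
        key, sep, value = line.partition("=")
        if sep:                              # ignore lines without '='
            entries[key.strip()] = value.strip()
    return entries


sample = """
# Hypothetical UI strings
save.label=Save
open.label = Open
"""
print(parse_properties(sample))   # {'save.label': 'Save', 'open.label': 'Open'}
```

Keeping the keys untouched and translating only the values is what allows the translated file to drop back into the build with little preparation work.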

Whatever method is used, one thing is sure: the continuing evolution of Unicode technology and the greater understanding of the needs of the international market has made localization engineering much easier. The latest OS editions from Microsoft and Apple are perfect illustrations.

The combination of Windows XP and Office 2003 is a must-have when dealing with multiple languages in your day-to-day operations. It is now possible to easily generate text files in many encodings for the most widely used languages on any Western operating system. The manipulation of Eastern languages, double-byte, and even right-to-left languages has been made much easier, too. We previously had to navigate from one native operating system to the next just to manipulate localized files. Much of this tussle has now disappeared and native operating systems are only used for online functional testing of the final localized product.

Also widely used and indispensable is Apple's Mac OS X, especially its latest "Tiger" release. It is not only a great system for localization but, in our opinion, the most localization-friendly operating system on the market. With just a simple drag of the mouse, users are able to switch the language of the UI and/or the system!

No matter what the platform, the best way to make your UI localization-friendly is to externalize all localizable strings (as in Java's properties files). Whenever possible, design your UI so that most of the strings are located in well-formatted files where the variables are followed by the string and the interface is dynamically laid out. Another important rule is to avoid string concatenation.
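Why concatenation hurts becomes clear once word order changes between languages. In this sketch, the string tables stand in for externalized resource files, and the keys, locales, and sample sentences are hypothetical. Each translation is a complete sentence with named placeholders, so the translator is free to reorder them.

```python
# Hypothetical string tables standing in for externalized resource files.
STRINGS = {
    "en": {"files_deleted": "{count} files were deleted from {folder}."},
    "de": {"files_deleted": "Aus dem Ordner {folder} wurden {count} Dateien gelöscht."},
}


def localize(locale: str, key: str, **values) -> str:
    """Look up a complete sentence and fill in its named placeholders."""
    return STRINGS[locale][key].format(**values)


print(localize("en", "files_deleted", count=3, folder="Temp"))
print(localize("de", "files_deleted", count=3, folder="Temp"))

# By contrast, str(count) + " files were deleted from " + folder hard-codes
# English word order and leaves the translator with sentence fragments.
```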

Going World Wide: Web Localization


Mike van Grunsven

Senior Localization Engineer

Unlike my brother who is the "Keeper of Useless Knowledge" here at Lingo, I like to think my knowledge is useful. Yellow and blue make Grun, 2 + 2 = H, hieroglyph, etc...

User interfaces are increasingly web-based because they are easier to maintain and offer more support than client-based applications. In most cases, both web-based applications and commercial websites have a database such as Oracle, SQL Server, MySQL, or Access as the back end. Fortunately, no matter what the type of database, the same process and tools (e.g., Multilizer) are used for localization.

From an engineering perspective, the most important step in localizing a database is to use well-defined spec sheets listing the tables and the fields requiring localization. It also helps to have the database designed in such a way as to facilitate either field or table localization. From there, the only other hurdle could be string length limitations, but these are easily managed with tools such as Multilizer.
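A length check of this kind can be sketched in a few lines. This is not how Multilizer does it. It is only an illustration, and the spec sheet, field names, and limits below are hypothetical.

```python
# Hypothetical spec sheet: "table.field" -> maximum length allowed by the schema.
SPEC = {"products.name": 12, "products.tagline": 40}


def over_limit(translations: dict[str, str], spec: dict[str, int]) -> list[str]:
    """Return the fields whose translated text exceeds the length in the spec."""
    return [field for field, text in translations.items() if len(text) > spec[field]]


german = {
    "products.name": "Kabelloser Kopfhörer",        # 20 characters, limit is 12
    "products.tagline": "Klang ohne Kabel",         # 16 characters, limit is 40
}
print(over_limit(german, SPEC))   # ['products.name']
```

Flagging these before reinsertion gives the linguist a chance to shorten the text instead of letting the database truncate it.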

As with many things in life, however, what is good for the goose may not be as good for the gander. Websites built with dynamic content are usually very localization-friendly for engineers. In most cases, it is relatively easy for us to extract the text strings from the underlying database. Unfortunately, once the text has been extracted, it is not so friendly for the linguists who translate the strings.

Rather than working with a complete document, all the translator sees are random, out-of-context strings - a difficult challenge for even the most skilled professional. It is therefore a good idea to use a description field in your database to give some guidance to the linguists. With proper instructions, the engineer will be able to include non-translatable fields in the translation packages that are provided to the translator.
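Here is a minimal sketch of a translation package that carries the description field along with each string. It uses an in-memory SQLite database, and the table name, column names, and sample rows are hypothetical. The description travels as a non-translatable note beside the source text.

```python
import sqlite3

# Hypothetical schema: each localizable string has a description for context.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ui_strings (id INTEGER PRIMARY KEY, text TEXT, description TEXT)"
)
conn.executemany(
    "INSERT INTO ui_strings (text, description) VALUES (?, ?)",
    [
        ("Open", "Button label: opens the selected file"),
        ("Open", "Status text: the store is open for business"),
    ],
)


def build_package(db: sqlite3.Connection) -> list[dict]:
    """Pair each source string with its context note and an empty target."""
    rows = db.execute("SELECT id, text, description FROM ui_strings ORDER BY id")
    return [{"id": i, "source": t, "note": d, "target": ""} for i, t, d in rows]


for entry in build_package(conn):
    print(entry)
```

The two "Open" strings are identical out of context, yet they would be translated differently in many languages, and the note is what tells them apart.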

The most compelling advantage of a database-backed website is the downstream benefits. Updates (including localization maintenance) become very easy and very cheap. As changes are made to the site, the new and modified strings are extracted, translated, and then reinserted. In many cases, localization delivery can even be automated using a translation portal such as Lingo Systems' "LingoNet."
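The update cycle can be pictured as a simple comparison between two extractions. In this sketch, the string IDs and texts are hypothetical, and only new or modified strings are sent out for translation.

```python
def changed_strings(previous: dict[str, str], current: dict[str, str]) -> dict[str, str]:
    """Return the strings that are new or modified since the last extraction."""
    return {key: text for key, text in current.items() if previous.get(key) != text}


last_release = {"home.title": "Welcome", "home.cta": "Buy now"}
this_release = {"home.title": "Welcome", "home.cta": "Order now", "home.footer": "Contact us"}

print(changed_strings(last_release, this_release))
# {'home.cta': 'Order now', 'home.footer': 'Contact us'}
```

Unchanged strings keep their existing translations, which is where the cost savings come from.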

There can be other challenges to localizing a website besides the database component. For example, using multiple programming languages can create parsing difficulty when generating RTF files for the linguists. The most common programming languages found on most websites are PHP, JSP, Perl, ASP, ColdFusion, and JavaScript.

Last, but far from least, working on the graphical assets of a website can be difficult when source materials (Photoshop, Illustrator, or CorelDraw files) are not available. With their omnipresent gradient backgrounds and obscure fonts, nothing is worse than being asked to recreate localized versions of these elements. This invariably requires design expertise from (and budget for) our DTP department.

Repeat after us: It is always a good idea to keep the source files in a safe place and to isolate localizable layers. This useful feature is offered by most, if not all, image and graphic editing software such as Photoshop, Fireworks, and Illustrator.

Talking the Talk: Terminology Management and Translation Memory

The last function that a localization engineer performs may be the most important. Terminology management, including the creation and maintenance of translation memories (TMs), has a huge effect on both quality and consistency. It may also be the single most important factor in reducing localization costs.

TMs are a must-have for any localization project. Some localization firms assign the task of terminology control to the project manager. At Lingo Systems, we believe in using the right person for the job and have no doubt that when it comes to managing hundreds of multilingual translation memories this person is the localization engineer. An inaccurate or corrupted TM (whether it is a linguistic corruption or an encoding corruption) can reduce leveraging, adding to the cost of the project and ultimately hurting the quality of the translation.

There are several players in the CAT tool market. Since the acquisition of Trados by SDL on June 20, 2005, the largest share belongs to SDL TRADOS. A few other smaller, more specialized players like DejaVu and TermStar are also worth mentioning. The principle behind each of these products is the same: the translator uses the tool interactively within a word processor to automatically retrieve existing translations from a database (Translation Memory). For localization engineers, it does not really matter which tool is used since most, if not all, of them are TMX compliant, meaning that the TM content can be exchanged between CAT tools through an XML-based export file (TMX file). All of them also offer fuzzy matching, which gives the translator close matches to a localizable sentence thereby speeding up the translation.
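To give a feel for fuzzy matching, the sketch below scores a new sentence against a tiny translation memory using Python's standard `difflib`. This is a stand-in: commercial CAT tools use their own matching algorithms and scores. The TM entries and the 75% threshold are assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical English-to-French translation memory.
TM = {
    "Click the Save button.": "Cliquez sur le bouton Enregistrer.",
    "Enter your password.": "Saisissez votre mot de passe.",
}


def best_match(segment: str, tm: dict[str, str], threshold: float = 0.75):
    """Return (score, source, target) for the closest TM entry, or None."""
    best = None
    for source, target in tm.items():
        score = SequenceMatcher(None, segment, source).ratio()
        if score >= threshold and (best is None or score > best[0]):
            best = (score, source, target)
    return best


match = best_match("Click the Cancel button.", TM)
if match:
    score, source, target = match
    print(f"{score:.0%} match: {source!r} -> {target!r}")
```

The translator then edits the retrieved translation rather than starting from scratch.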

Another linguistic tool that is often integral to the development and maintenance of an effective TM is glossary management. In the world of localization, glossaries represent a list of key terms and definitions that the translator will need to properly localize the source materials. Many of the TM tools include a glossary management module to facilitate the compilation of a glossary and the subsequent translation of the key terms whenever (and wherever) they appear. These modules, such as MultiTerm from SDL TRADOS, run in the background as the translation is being done in a word processing application. They then flag for the linguist any term that is located in the MultiTerm glossary, minimizing the time a linguist typically needs to go back and forth between reference materials and applications.
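The flagging behavior can be approximated as follows. This is not MultiTerm's API, only a minimal sketch with a hypothetical English-to-French glossary and whole-word, case-insensitive matching.

```python
import re

# Hypothetical glossary: source term -> approved translation.
GLOSSARY = {
    "translation memory": "mémoire de traduction",
    "fuzzy match": "correspondance partielle",
}


def flag_terms(segment: str, glossary: dict[str, str]) -> list[tuple[str, str]]:
    """Return the glossary terms found in segment with their approved translations."""
    hits = []
    for term, translation in glossary.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, segment, re.IGNORECASE):
            hits.append((term, translation))
    return hits


print(flag_terms("Update the Translation Memory after review.", GLOSSARY))
# [('translation memory', 'mémoire de traduction')]
```

A production tool would also handle plurals and inflected forms, which plain string matching does not.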

The newest glossary management tools are so customizable that they even allow the user to add multimedia content to the term definition. The possibilities are infinite. The next generation of translation tools even allows localization vendors to share their glossaries as well as their translation memories over the web (SDL Trados TM Server and MultiTerm Online are good examples), which greatly facilitates the interaction between the localization company, the client, the linguists, and the in-country reviewer.

Wrapping Up

Let's speak plainly. Localization engineering is not rocket science, but in our estimation, it comes close. As you prepare for a localization project, be sure to leave a seat at the table for an engineer. From the initial internationalization planning, through actual translation and implementation stages, and ongoing translation memory development and maintenance, an engineer will be directly involved. We may be biased, but we believe that a top-notch engineering department can anticipate the potential issues you may face well ahead of time: from nagging technical oddities to esoteric cultural differences. So, as you plan your next project, keep us engineers in mind. The extra time you invest up front will pay off in terms of reduced timelines and cost savings in the long run.
