
Making Reuse Intelligent: Improving Enterprise Information Quality Management








Reuse has become a buzzword in technical communication and localization. For one thing, businesses want to be sure that they write the same information only once. They also want to avoid translating the same information repeatedly simply because it is expressed using different words or a different word order. But in a distributed writing environment where disparate groups are contributing to huge content repositories, how can you make sure that content is created only once? This article looks at the role technology can play in promoting content reuse. In particular, innovations in linguistic technology are making it possible for companies to take a systematic approach to this challenge.

Topic-based reuse

We can think about reuse under two broad headings: topic-based reuse and linguistic reuse. Companies already recognize the tremendous benefits of DITA, the Darwin Information Typing Architecture. It is an emerging trend in XML and the leading technology infrastructure for topic-based reuse.

Through a process often called “chunking,” DITA helps create recyclable, transferable units from extensive documents by breaking them into smaller topics. It provides a structure that eliminates the need for user-defined DTDs, while letting users create customized topic extensions for their own needs. In essence, DITA provides a framework for breaking often enormous documents into manageable packages.

But the key advantage of DITA involves reuse. Imagine five different products with a power supply that has to be connected in a standard way. DITA helps create a single topic to describe the setup process instead of five. In this example, it eliminates four of the five topics, or 80% of the content that companies previously had to manage: edit, maintain, and translate.
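The 80% figure follows directly from the numbers in the example. As a minimal sketch (the function name `reuse_savings` is hypothetical and not part of DITA), the arithmetic looks like this:

```python
def reuse_savings(products: int, shared_topics: int = 1) -> float:
    """Fraction of topics eliminated when one shared topic replaces per-product copies."""
    before = products * shared_topics  # one copy of each topic per product
    after = shared_topics              # a single reusable copy of each topic
    return (before - after) / before


# Five products that share one power-supply setup topic.
print(f"{reuse_savings(5):.0%}")  # prints "80%"
```

The saving grows with the number of products that share a topic, which is why the effect is most visible in large product families.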

The current state of linguistic reuse

Although more and more organizations recognize that topic-based reuse is good for them, sentence-level (or sentence-fragment level) reuse remains a relatively unexplored territory. Yet reusing linguistic segments ensures consistency across documents and makes localization more cost-effective by eliminating the need for retranslation. Remember - translation memory systems do not work at the topic level, but at the sentence or segment level. Working on this level is therefore the key to controlling translation costs.

Most of the current solutions to this challenge rely on “fuzzy matching” algorithms. These algorithms measure the similarity between two character strings (sentences or sentence fragments). On a superficial level, fuzzy matching seems like a useful solution to the problem. But the reality is quite different. Fuzzy matching works in translating, but is far less suited to writing environments.

Consider this example:

WARNING: Switch power off only
when the fan has stopped.

Fuzzy matching offers the following potential suggestions, which have different or even opposite meanings:

WARNING: Switch power on only when
the fan has stopped.

-and-

Switch power off before the fan has
stopped.

There are similar problems for sentences with variables.

For the example:

Operating temperature must not
exceed 45 degrees Celsius.

fuzzy matching offers:

Operating temperature should not
exceed 50 degrees Celsius.

-and-

Operating temperature must not
exceed 65 degrees Celsius.

and so on.

In terms of usability, authors might have to wade through tens of suggestions for a single input, or thousands for a single document, which not only discourages writers from using the tool, but also increases the risk that they will introduce inaccurate information.
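The behavior described in this section can be reproduced with any character-based similarity measure. The sketch below uses `difflib.SequenceMatcher` from the Python standard library as a stand-in; it is an assumption for illustration only, since the article does not say which algorithm commercial fuzzy-matching tools use.

```python
from difflib import SequenceMatcher


def fuzzy_score(a: str, b: str) -> float:
    """Character-level similarity between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


source = "WARNING: Switch power off only when the fan has stopped."
candidates = [
    "WARNING: Switch power on only when the fan has stopped.",
    "Switch power off before the fan has stopped.",
]

# Both candidates score as close matches although their meaning differs.
for candidate in candidates:
    print(f"{fuzzy_score(source, candidate):.2f}  {candidate}")
```

Because the score only counts shared characters, a sentence that differs by a single safety-critical word ("off" versus "on") ranks as a near-perfect match.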

Matching Meaning

Up until now, technology has not met the challenges of applying reuse at the sentence level. The technologies that are available have delivered few tangible results for authoring and editing. The tools currently available do not address the single most important aspect of linguistic reuse: matching sentences or sentence fragments in terms of meaning. At the same time, the tools have often proven unwieldy or unusable in practice.

Acrolinx has recently introduced a new Intelligent Reuse component for its Information Quality software that meets these two challenges, combining meaning-based reuse with usability.

Consider the following examples:

  • Follow this link to find out more
  • To find out more, follow this link
  • Click here to find out more
  • For more information, please go here

These segments are simply different ways of saying the same thing, but translating them individually increases costs. Tools based on fuzzy matching are not useful here because the words and word sequences are too different. However, Intelligent Reuse identifies the similarity in meaning so that authors do not have to write different sentences to express the same thing.

Behind the scenes, a technology based on Artificial Intelligence extracts sentences from a translation memory or content management system. It groups sentences with similar meanings into so-called “micro-clusters.” The previous example is one such cluster. The following sentences are drawn from a cluster of approximately 25 sentences:

End Date must be greater than or equal to
Start Date.
End date must be equal to or later than
the start date..
End date should be greater than start date.
The start date cannot be later than the
end date.
Start date must be before end date!
The start date must be on or before the
end date.
Your end date must be after your start
date.
Your start date must be before your end
date.

The end date must be later than or the
same as the start date.
The actual end date must be on or after
the actual start date.
You cannot enter an “End Date” that is
before your “Start Date.”
Please enter an end date that is later
than the start date.
Please enter an End Date that is later
than or the same as the Start
End Time must be later than the Start Time.
Please enter a start date that is before
the end date.

Typically, content repositories or translation memories contain many segments that are redundant or of questionable quality. Based on initial experiences with the tool in business settings, Intelligent Reuse reduces redundancy in content by 15-35%. It also filters the micro-clusters for quality, checking for spelling and grammar, corporate style, and terminology. Compare the first and second sentences in the micro-cluster. “Start Date” is capitalized in the first but not in the second, and the second ends with a double period. Juxtaposing the sentences in this way enables users to detect issues that typical spellcheckers might not catch. Intelligent Reuse provides spellchecking and quality assurance on a sentence level, rather than a word level, with an overhead similar to regular spellchecking.
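Two of the issues mentioned above, inconsistent capitalization and doubled end punctuation, are easy to surface once the sentences of a cluster sit side by side. The following is a minimal sketch of such checks; the function names are hypothetical, and the article does not describe how Intelligent Reuse implements its quality filters.

```python
import re
from collections import Counter

# First three sentences of the micro-cluster shown above.
cluster = [
    "End Date must be greater than or equal to Start Date.",
    "End date must be equal to or later than the start date..",
    "End date should be greater than start date.",
]


def has_repeated_end_punctuation(sentence: str) -> bool:
    """True if the sentence ends with two or more punctuation marks."""
    return bool(re.search(r"[.!?]{2,}$", sentence.strip()))


def casing_variants(sentences: list[str], term: str) -> Counter:
    """Count each distinct capitalization of `term` across the sentences."""
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    return Counter(m.group(0) for s in sentences for m in pattern.finditer(s))


print([has_repeated_end_punctuation(s) for s in cluster])
print(casing_variants(cluster, "start date"))
```

A real style checker would need exceptions, for example for an intentional ellipsis, so this simple rule would flag some legitimate sentences.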

After checking for quality, Intelligent Reuse chooses a representative sentence, a “winner” in terms of representativeness and quality. For this cluster, Intelligent Reuse chooses the following sentence, which is highlighted in a web-based interface:

Please enter an end date that
is later than the start date.

At this point, linguistic administrators can accept the suggested representative sentence, choose another one, or even move sentences from one cluster to another using the interface. This validation process is a key aspect of quality assurance because it helps administrators choose only correct sentences. Once a representative sentence has been chosen, the administrator activates its cluster for document checking.
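The article does not say how the representative sentence is selected. One plausible approach, sketched here purely as an assumption, is to discard sentences that fail a quality check and then pick the remaining sentence that is on average closest to all others in the cluster. The sketch uses character similarity from `difflib` as a crude stand-in for a meaning-based measure, and `has_quality_issue` is a hypothetical placeholder for the real checks.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Crude stand-in for a meaning-based similarity measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def has_quality_issue(sentence: str) -> bool:
    """Hypothetical quality check: flag a doubled final period."""
    return sentence.rstrip().endswith("..")


def pick_representative(cluster: list[str]) -> str:
    """Return the clean sentence with the highest average similarity to the rest."""
    if len(cluster) == 1:
        return cluster[0]
    # Fall back to the whole cluster if every sentence has an issue.
    candidates = [s for s in cluster if not has_quality_issue(s)] or cluster

    def average_similarity(s: str) -> float:
        others = [t for t in cluster if t is not s]
        return sum(similarity(s, t) for t in others) / len(others)

    return max(candidates, key=average_similarity)


cluster = [
    "End date must be equal to or later than the start date..",
    "Please enter an end date that is later than the start date.",
    "Your end date must be after your start date.",
    "The start date must be on or before the end date.",
]
print(pick_representative(cluster))
```

Whatever the selection method, the administrator's review remains the deciding step, since an automatic choice can only be a suggestion.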

Putting Intelligent Reuse into practice

From the perspective of writers, the tool now functions exactly like a spellchecker. For any sentence that approximates a representative sentence in meaning, writers receive a single standard sentence as a suggestion. Intelligent Reuse provides suggestions for sentences already stored in a content repository. But what is truly new about the tool is that it makes suggestions for newly authored sentences that match a representative sentence in meaning. For the preceding example, a writer comes up with the sentence:

The start date must precede
the end date.

Intelligent Reuse would suggest the validated representative sentence:

Please enter an end date that
is later than the start date.

It does so even though the new sentence is not part of the original micro-cluster. In addition, the tool does not detract from productivity because it makes only one high-quality suggestion for any input.

This reuse is always intelligent because the suggestions match in meaning, not proximity of letters or words. The system can understand numbers and units and other complex entities. If we turn back to our previous example for fuzzy matching:

Operating temperature must not exceed 45
degrees Celsius.
Operating temperature should not exceed 50
degrees Celsius.
Operating temperature must not exceed 65
degrees Celsius.

Intelligent Reuse recognizes that the temperature variables in the sentences make a difference. It would place these sentences in the same micro-cluster, but recognize the temperature values as variables. Let us say that the linguistic administrator validates the first sentence of the cluster. If an author writes:

The operating temperature should not
exceed 80 degrees Celsius.

Intelligent Reuse suggests:

Operating temperature must not exceed 80
degrees Celsius.

In other words, it offers the validated representative sentence, but preserves the value (80 degrees) that the author typed. More than translation cost or usability is at issue in this case, since the difference in operating temperature affects product safety.

While this new technology has the potential to cut costs significantly in the translation and localization cycles, one of its most promising fields of application concerns text authored by non-native speakers. Intelligent Reuse helps non-native speakers meet the challenge of formulating text in a foreign language by offering them a representative sentence that has already been checked for quality and validated. As more and more companies employ non-native speakers to author their technical documentation, Intelligent Reuse could offer enormous benefits.

Finally, Intelligent Reuse extends beyond technical documentation to software strings, where developers confront significant issues in deciding whether a suitable message is already available. Here, Intelligent Reuse represents a novel approach to a problem where there are currently few solutions. For acrolinx, Intelligent Reuse forms part of a holistic view of enterprise information quality management.

Its initial results have been immensely promising, in terms of decreasing redundancy, improving quality, increasing productivity, and cutting costs. For the first time, information developers can implement a linguistic-based reuse strategy that makes sense.

About acrolinx

acrolinx is the market leader in quality assurance tools for professional information developers. These tools help companies worldwide to maintain their corporate image, address compliance issues, improve quality, and control document production and localization costs. Its flagship product, acrocheck™, is used internationally by thousands of customers in a variety of industries, including software, automotive, life sciences, and aerospace. acrocheck has been deployed at global enterprises like SAP, Symantec, SAS, Philips, Siemens, Motorola, and Bosch.

acrolinx maintains its headquarters in Berlin, Germany with a sales and support subsidiary in North America.




ClientSide News Magazine - www.clientsidenews.com






